AndesAIRE™ (Andes AI Runs Everywhere) is a family of highly efficient solutions designed for edge and end-point inference. It comprises the first generation of AI/ML hardware accelerator intellectual property (IP), the AndesAIRE™ AnDLA™ I350 (Andes Deep Learning Accelerator); the neural network software tools and runtimes, the AndesAIRE™ NN SDK; and the highly optimized, handcrafted NN compute libraries for the RISC-V DSP/SIMD P-extension (draft) and the RISC-V Vector extension v1.0, the AndesAIRE™ NN Library. The main features and benefits of AndesAIRE™ are as follows:
General-purpose compute flexibility provided by AndesCore™ CPU
In the AndesAIRE™ subsystem, the structured and computationally intensive parts of AI workloads are calculated efficiently in the AnDLA™ IP, while less structured calculations, such as nonlinear functions, are handled by the RISC-V DSP/SIMD or Vector instruction extensions. AndesCore™ RISC-V DSP/SIMD (D23, D25F, D45) and RISC-V Vector (NX27V, AX45MPV) processors provide general-purpose compute flexibility and power efficiency for diversified AI applications and segments. As shown in the figure above, in a conceptual AI SoC the AnDLA™ I350 can be paired with a compute acceleration processor and/or a host processor; such an AI SoC can target a wide range of applications.
Highly scalable AnDLA™ I350
The AndesAIRE™ AnDLA™ I350 provides industry-leading efficiency, low power consumption, and small area, making it ideal for a wide range of edge inference applications, including smart IoT devices, smart cameras, smart home appliances, and robots. To meet varying computing-resource requirements, multiple AnDLA™ I350 instances can be instantiated in an SoC, with task deployment orchestrated by the CPU to address scalable AI/ML workloads. For general applications, AndesCore™ RISC-V DSP/SIMD cores (D23, D25F, D45) paired with the AnDLA™ I350 are estimated to provide 10 MOPS to 1 GOPS of computing power, while RISC-V Vector cores (NX27V, AX45MPV) paired with the AnDLA™ I350 are estimated to provide 1 GOPS to 100 TOPS.
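The multi-instance scaling described above can be pictured as the host CPU partitioning a workload and spreading the pieces across the instantiated accelerator units. The following is a minimal illustrative sketch, not Andes SDK code; the function name `schedule_tasks` and the round-robin policy are assumptions chosen for clarity.

```python
# Hypothetical sketch: a host CPU distributes inference tasks across
# N instantiated AnDLA(TM) I350 units in round-robin order.

def schedule_tasks(tasks, num_andla_units):
    """Assign each task to one of the AnDLA instances, cycling through
    the units so the workload is spread evenly."""
    schedule = {unit: [] for unit in range(num_andla_units)}
    for i, task in enumerate(tasks):
        schedule[i % num_andla_units].append(task)
    return schedule

# Four inference tiles spread over two AnDLA units:
print(schedule_tasks(["tile0", "tile1", "tile2", "tile3"], 2))
# {0: ['tile0', 'tile2'], 1: ['tile1', 'tile3']}
```

A real scheduler would also account for per-unit load and data locality; round-robin is shown only to make the orchestration role of the CPU concrete.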
Acceleration for most modern NN models
AndesAIRE™ mainly handles voice, audio, image, and video AI workloads. The AndesAIRE™ NNPilot™ neural network optimization tool suite runs on a PC to analyze an NN model and optimize it for execution on either the AnDLA™ or the CPU. Operators are assigned by performance priority: if the AnDLA™ can handle an operation, it is used first; otherwise the operation is automatically assigned to the NN Library for processing on the CPU. As of 2024, Andes has optimized over 800 functions for neural network processing targeting the RISC-V P (DSP/SIMD) and RVV (Vector) extensions, meeting the AI acceleration needs of most customers.
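The performance-priority assignment above can be sketched as a simple per-operator dispatch rule. This is a conceptual illustration only: the supported-operator set and the names `ANDLA_SUPPORTED_OPS` and `assign_backends` are assumptions for the sketch, not the actual NNPilot™ interface.

```python
# Hypothetical sketch of performance-priority operator assignment:
# operators the accelerator supports go to the AnDLA(TM); the rest
# fall back to the NN Library running on the CPU.

ANDLA_SUPPORTED_OPS = {"conv2d", "depthwise_conv2d", "fully_connected",
                       "max_pool", "avg_pool", "add", "concat"}

def assign_backends(model_ops):
    """Map each operator to 'andla' when the accelerator supports it,
    otherwise to 'cpu_nn_lib' (the CPU fallback)."""
    return [(op, "andla" if op in ANDLA_SUPPORTED_OPS else "cpu_nn_lib")
            for op in model_ops]

# Example: a small CNN ending in softmax; softmax is a nonlinear
# function, so it falls back to the CPU's NN Library.
plan = assign_backends(["conv2d", "max_pool", "fully_connected", "softmax"])
# plan[-1] == ("softmax", "cpu_nn_lib")
```

This mirrors the partitioning described earlier: structured, compute-heavy layers run on the accelerator while nonlinear functions execute on the RISC-V DSP/SIMD or Vector cores.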
In addition, to keep pace with rapidly advancing AI technologies, Andes' vision is an extensible AI subsystem that seamlessly integrates the AndesAIRE™ AnDLA™, AndesCore™ RISC-V CPUs, and Andes Custom Extension™ (ACE). ACE plays a key role in efficient data movement between the CPU and the AnDLA™, significantly reducing memory bandwidth and power consumption while increasing hardware utilization. ACE can further accelerate processing through customized instructions for domain-specific tasks such as data pre- and post-processing. Beyond the extensibility of the hardware IPs, Andes is committed to the continuous advancement of the AndesAIRE™ NN SDK and AndesAIRE™ NN Library so that mass-production SoCs can adapt to future AI algorithms. Andes has added more than one hundred compute library APIs per year since 2021 and will keep optimizing and adding new functions to the NN SDK and NN Library.
A high-performance, efficient deep learning accelerator for edge and end-point inference
A comprehensive set of software tools and runtimes for end-to-end development and deployment