AndesCore™ A25MP Multicore

32-bit Multiprocessors with Level-2 Cache-Coherence

AndesCore™ A25MP Overview

  • Symmetric multiprocessing up to 4 cores
  • Level-2 cache and cache coherence support
  • AndeStar™ V5 Instruction Set Architecture (ISA), compliant with the RISC-V IMACFDN extensions, plus Andes performance/functionality extensions
  • Floating point extensions
  • Bit-manipulation extensions
  • DSP/SIMD ISA to boost the performance of voice, audio, image and signal processing
  • Separately licensable Andes Custom Extension™ (ACE) for customized acceleration
  • 32-bit, 5-stage pipeline CPU architecture
  • 16/32-bit mixable instruction format for higher code density
  • Branch prediction to speed up control code
  • Return Address Stack (RAS) to speed up procedure returns
  • Memory Management Unit (MMU) and Physical Memory Protection (PMP)
  • Flexibly configurable Platform-Level Interrupt Controller (PLIC) to support a wide range of system event scenarios
  • Enhanced vectored interrupt handling for real-time performance
  • Advanced CoDense™ technology to reduce program code size

The AndesCore™ A25MP 32-bit multicore CPU IP is based on the AndeStar™ V5 architecture. It supports the RISC-V standard ‘IMAC-FD’ extensions, the bit-manipulation ‘B’ instructions, the Andes-contributed DSP/SIMD ‘P’ extension (draft), the user-level interrupt ‘N’ extension, and Andes performance/functionality enhancements such as instructions for faster memory accesses and faster branch handling, plus Andes Custom Extension™ (ACE) for adding user-defined instructions. It features an MMU for Linux-based applications, branch prediction for efficient branch execution, and level-1 instruction/data caches and local memories for low-latency accesses.

The A25MP symmetric multiprocessor supports up to 4 cores and a level-2 cache controller with instruction and data prefetch. The Andes Coherence Unit (ACU) manages level-1 cache coherence, including I/O coherence for cacheless bus managers, and keeps duplicated L1 tags to screen snoop queries against allocated lines. Other A25MP features include ECC for soft-error protection of level-1/2 memories, a Platform-Level Interrupt Controller (PLIC) with enhancements for vectored dispatch and priority-based preemption, CoDense™ for code size reduction, StackSafe™ for software quality improvement, and QuickNap™, PowerBrake, and WFI for power management.
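To illustrate the kind of bookkeeping a coherence unit performs, the following is a minimal sketch of textbook MESI state transitions for a single cache line. It is not the ACU's actual implementation: the names `State`, `local_access`, `snoop` and `simulate` are hypothetical, and details such as write-back timing, the duplicated L1 tags and I/O coherence are not modeled.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def local_access(state, is_write, others_have_copy):
    """Next state of the line in the core that issues the access."""
    if is_write:
        # A write always ends in Modified; other copies get invalidated.
        return State.M
    if state is State.I:
        # Read miss: Shared if another cache holds the line, else Exclusive.
        return State.S if others_have_copy else State.E
    return state  # read hit keeps the current state


def snoop(state, remote_is_write):
    """Next state of the line in a core that observes a remote access."""
    if state is State.I or remote_is_write:
        return State.I
    # Remote read: M, E and S all end up Shared (M would write back first).
    return State.S


def simulate(ops, n_cores=2):
    """Apply (core, is_write) accesses to one line, starting all-Invalid.

    For simplicity snoop() runs on every access; on plain hits it leaves
    the other cores' states unchanged.
    """
    states = [State.I] * n_cores
    for core, is_write in ops:
        others = any(s is not State.I
                     for i, s in enumerate(states) if i != core)
        new = local_access(states[core], is_write, others)
        states = [new if i == core else snoop(s, is_write)
                  for i, s in enumerate(states)]
    return states


# Core 0 reads, core 1 reads, then core 1 writes the same line.
print(simulate([(0, False), (1, False), (1, True)]))
```

In this model the first read leaves core 0 in Exclusive, the second read moves both cores to Shared, and the write leaves core 1 in Modified with core 0 Invalid.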

Applications

  • High performance solid state drives
  • Advanced Driver-Assistance Systems
  • Network communications

Block Diagram

Development Tools

  • AndeSight™ Integrated Development Environment
  • COPILOT: Custom-OPtimized Instruction deveLOpment Tool for ACE
  • ICE debugging hardware 

Key Features and Performance

AndeStar™ V5 Architecture

Key Features and Benefits

Feature: RISC-V RV32IMACFDBP instructions
Benefits:
  • State-of-the-art ISA from the latest developments in computer architecture
  • Industry-standard, open architecture

Feature: RISC-V P-extension (draft) DSP/SIMD instructions with versatile operations
Benefit: Boosts the performance of voice, audio, image and signal processing

Feature: RISC-V single- and double-precision floating-point instructions
Benefit: Accelerates high-precision arithmetic

Feature: RISC-V bit-manipulation instructions, including the Zba, Zbb, Zbc and Zbs extensions
Benefit: Benefits code with bitwise operations

Feature: Andes Extended Instructions
Benefit: Andes-exclusive performance and functionality enhancements

Feature: RISC-V N-extension, user-level interrupt
Benefit: Supports user-level trap handling

Feature: Andes Custom Extension™ (ACE) option to create customized instructions for software acceleration
Benefits:
  • Adds customized instruction extensions to facilitate Domain-Specific Architecture/Acceleration (DSA)
  • Boosts application performance significantly while maintaining programmability
  • Powerful constructs are available to define high-level instructions
  • ACE design is based on Verilog and C, languages familiar to designers
  • The COPILOT tool automatically generates the extended CPU and software toolchain
  • No processor pipeline expertise is required to design ACE instructions

Feature: 16/32-bit mixable instruction format
Benefit: Higher code density

Feature: 32 general-purpose registers
Benefit: Better code size and performance

Feature: Machine (M), User (U) and Supervisor (S) privilege levels
Benefit: Supports Linux and other advanced operating systems with protection between kernel and user programs
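As a small illustration of how the extension letters in an ISA string such as RV32IMACFDBP relate to software-visible state, the sketch below maps letters to bit positions using the standard RISC-V `misa` convention (bit 0 for 'A' through bit 25 for 'Z'). The helper names are hypothetical, and whether the A25MP reports each of these bits in `misa` is an assumption to be checked against Andes documentation.

```python
def misa_extension_bits(letters):
    """Return a mask with one bit set per extension letter (A=bit 0 ... Z=bit 25)."""
    mask = 0
    for ch in letters.upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"not an extension letter: {ch!r}")
        mask |= 1 << (ord(ch) - ord("A"))
    return mask


def has_extension(misa, letter):
    """True if the bit for `letter` is set in a misa-style value."""
    return bool((misa >> (ord(letter.upper()) - ord("A"))) & 1)


# Letters taken from the ISA string listed above.
mask = misa_extension_bits("IMACFDBP")
print(hex(mask))
print(has_extension(mask, "P"), has_extension(mask, "V"))
```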

CPU Core

Key Features and Benefits

Feature: 3.57 CoreMark/MHz, 1.98 DMIPS/MHz*
Benefit: Superior performance-per-MHz

Feature: 5-stage pipeline, with a full cycle reserved for critical SRAM accesses
Benefit: Superior performance efficiency while allowing high clock speeds

Feature: Extensive branch prediction features
  • Branch Target Buffer (BTB): 32, 64, 128 or 256-entry
  • Branch History Table (BHT): 256-entry, with 8-bit branch history
  • Return Address Stack (RAS): 4-entry
Benefits:
  • Branch Target Buffer and Branch History Table speed up control code
  • Return Address Stack speeds up procedure returns

Feature: Memory Management Unit
  • Sv32 virtual-memory system
  • 4/8-entry fully associative ITLB/DTLB
  • 32/64/128-entry 4-way set-associative shared TLB
  • Hardware page table walker
Benefits:
  • Virtual memory support for full address space and easy code/data sharing
  • Support for full-featured operating systems such as Linux
  • Protection of supervisor and user privilege
  • Hardware for fast address translation

Feature: Physical Memory Protection (PMP), 16 regions
Benefit: Basic read/write/execute memory protection at minimum cost

Feature: Performance monitors
Benefit: Program code performance tuning

Feature: StackSafe™ hardware stack protection
Benefits:
  • Easy identification of stack size threshold during development
  • Hardware error detection of stack overflow and underflow at runtime

Feature: Multiplier options
  • Fast multiplier: pipelined, 2-cycle
  • Small multipliers: producing 1, 2, 4, or 8 bits per cycle
Benefit: Choice between speed and area according to the application's requirements

Feature: PowerBrake technology
Benefit: Performance throttling to digitally reduce power consumption

* AndeSight v500, best performance; DMIPS/MHz follows Dhrystone’s no-inline ground rules.
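For the Sv32 virtual-memory support listed under the Memory Management Unit, the sketch below shows how a 32-bit virtual address is split for the two-level page table walk defined by the RISC-V privileged specification (10-bit VPN[1], 10-bit VPN[0], 12-bit page offset, 4-byte page table entries). The function names and the example table base address are hypothetical, and the sketch says nothing about how the A25MP's TLBs or hardware walker are organized internally.

```python
PAGE_OFFSET_BITS = 12   # 4 KiB pages
VPN_BITS = 10           # bits per page-table level in Sv32
PTE_SIZE = 4            # bytes per Sv32 page table entry


def split_sv32(va):
    """Split a 32-bit virtual address into (VPN[1], VPN[0], page offset)."""
    if not 0 <= va < (1 << 32):
        raise ValueError("Sv32 virtual addresses are 32 bits wide")
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
    vpn0 = (va >> PAGE_OFFSET_BITS) & ((1 << VPN_BITS) - 1)
    vpn1 = (va >> (PAGE_OFFSET_BITS + VPN_BITS)) & ((1 << VPN_BITS) - 1)
    return vpn1, vpn0, offset


def pte_address(table_base, vpn):
    """Physical address of the PTE selected by `vpn` in the table at `table_base`."""
    return table_base + vpn * PTE_SIZE


vpn1, vpn0, offset = split_sv32(0xC0123ABC)
# 0x80000000 is an arbitrary example root table address, not an A25MP value.
print(hex(vpn1), hex(vpn0), hex(offset), hex(pte_address(0x80000000, vpn1)))
```

A walker reads the first-level entry at `pte_address(root, vpn1)`; if that entry points to a second-level table, it then reads `pte_address(next_table, vpn0)` and appends the page offset to the resulting physical page number.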

Memory Subsystems

Platform-Level Interrupt Controller (PLIC)

Key Features and Benefits

Feature: Implements RISC-V PLIC specification
  • Up to 1023 PLIC interrupt sources
  • Up to 255 PLIC interrupt priority levels
  • Up to 16 PLIC interrupt targets
Benefit: Allows individual interrupts to be serviced and prioritized without sharing

Feature: Enhanced interrupt features
  • Vectored interrupt dispatch
  • Priority-based preemption
  • Selectable edge trigger or level trigger
Benefits:
  • Faster interrupt handling for real-time applications
  • Complete hardware preemption support for faster response
  • Flexible interrupt source interface for simpler SoC design
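Because the PLIC is described as implementing the RISC-V PLIC specification, the register offsets for a given source and target can be sketched from that specification's standard memory map. The helper names below are hypothetical, a PLIC "target" is treated as a specification "context", and the Andes-specific controls for vectored dispatch and preemption are not covered; the actual A25MP register layout should be confirmed in Andes documentation.

```python
# Offsets from the PLIC base address, per the standard RISC-V PLIC memory map.
PENDING_BASE = 0x001000
ENABLE_BASE = 0x002000
ENABLE_STRIDE = 0x80       # bytes of enable bits per target
CONTEXT_BASE = 0x200000
CONTEXT_STRIDE = 0x1000    # bytes of threshold/claim registers per target

MAX_SOURCES = 1023         # limits quoted in the feature list above
MAX_TARGETS = 16


def _check(source=None, target=None):
    if source is not None and not 1 <= source <= MAX_SOURCES:
        raise ValueError("interrupt source must be in 1..1023 (0 is reserved)")
    if target is not None and not 0 <= target < MAX_TARGETS:
        raise ValueError("target must be in 0..15")


def priority_offset(source):
    """Offset of the 32-bit priority register for `source`."""
    _check(source=source)
    return 4 * source


def pending_word(source):
    """(offset, bit) of the pending bit for `source`."""
    _check(source=source)
    return PENDING_BASE + 4 * (source // 32), source % 32


def enable_word(target, source):
    """(offset, bit) of the enable bit for `source` on `target`."""
    _check(source=source, target=target)
    return ENABLE_BASE + ENABLE_STRIDE * target + 4 * (source // 32), source % 32


def threshold_offset(target):
    """Offset of the priority threshold register for `target`."""
    _check(target=target)
    return CONTEXT_BASE + CONTEXT_STRIDE * target


def claim_complete_offset(target):
    """Offset of the claim/complete register for `target`."""
    return threshold_offset(target) + 4


print(hex(priority_offset(33)), enable_word(1, 33), hex(claim_complete_offset(1)))
```

A handler for a given target would typically read the claim/complete register to obtain the source ID, service the device, then write the same ID back to signal completion.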

Debug Support

Key Features and Benefits

Feature: Implements RISC-V debug specifications
Benefit: Supported by industry debug tool suppliers

Feature: JTAG Debug Port
Benefit: Industry-standard support

Feature: Embedded Debug Module with up to 8 triggers
Benefit: Flexible configurations to trade off gate count against debugging capabilities

Feature: Exception redirection support
Benefit: Enters the debugger on selected exceptions without using breakpoints
Key Features and Benefits (Memory Subsystems)

Feature: Level-1 I-Cache & D-Cache
  • Size: 8KB to 64KB
  • Set associativity: direct-mapped, 2-way or 4-way
Benefits:
  • Accelerates accesses to slow memories
  • Flexible cache configurations

Feature: ILM & DLM
  • Size: 4KB to 16MB
  • SRAM or AHB-Lite interface support
  • Bus manager accesses through the local memory access port
Benefits:
  • Deterministic and efficient program execution
  • Flexible size selection to fit diverse needs

Feature: Soft-error protection: ECC or parity for I-Cache and D-Cache, and for ILM and DLM with SRAM interface
Benefit: Code and data integrity protection

Feature: Level-2 cache
  • Size: 128KB to 2MB
  • 32-byte cache line size, 16-way, with prefetch
  • 2 tag banks, 8 data banks with bank interleaving
  • Configurable memory access cycles to match SRAM timing
  • ECC soft-error protection
Benefits:
  • Accelerates multicore performance with level-2 memory
  • Flexible selections to meet performance and timing requirements

Feature: Multicore Cache Coherence
  • Supports up to 4 cores with the MESI cache coherence protocol, managed by the ACU (Andes Coherence Unit)
  • Supports I/O coherence for cacheless bus managers through a 64-bit AXI subordinate port
Benefits:
  • Multicore boosts performance significantly for computation-intensive tasks
  • Cache coherence simplifies design for products with several CPUs and cacheless I/O

Feature: Bus manager port: AXI with 64- or 128-bit data, 32- to 64-bit address
Benefit: User-selectable bus interface for optimal efficiency

Feature: Bus subordinate port: AHB with 64-bit data, for ILM/DLM accesses
Benefit: Efficient data transfer between CPU and SoC managers

Feature: Core/bus clock ratio of N:1
Benefit: Simplified SoC integration
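The level-2 cache figures above (128KB to 2MB, 32-byte lines, 16-way) determine how many sets the cache has and how many address bits select a byte within a line and a set. The sketch below works through that arithmetic; `cache_geometry` is a hypothetical helper, plain power-of-two indexing is an assumption, and the tag/data bank interleaving is not modeled.

```python
def cache_geometry(size_bytes, line_bytes, ways):
    """Return (sets, offset_bits, index_bits) for a set-associative cache."""
    for name, value in (("size", size_bytes), ("line", line_bytes), ("ways", ways)):
        if value <= 0 or value & (value - 1):
            raise ValueError(f"{name} must be a positive power of two")
    sets = size_bytes // (line_bytes * ways)
    if sets == 0:
        raise ValueError("cache too small for this line size and associativity")
    offset_bits = line_bytes.bit_length() - 1   # log2(line size)
    index_bits = sets.bit_length() - 1          # log2(number of sets)
    return sets, offset_bits, index_bits


KB = 1024
MB = 1024 * KB

# Smallest and largest level-2 configurations from the list above.
for size in (128 * KB, 2 * MB):
    sets, offset_bits, index_bits = cache_geometry(size, line_bytes=32, ways=16)
    print(f"{size // KB}KB: {sets} sets, {offset_bits} offset bits, {index_bits} index bits")
```

Under these assumptions the smallest configuration has 256 sets and the largest has 4096, so the set index grows from 8 to 12 bits while the 5-bit line offset stays fixed.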

Product Package

A25MP with 1, 2 or 4 Processor(s) and AE350 Platform