AndesCore™ D10 Overview
- More than 130 DSP extension instructions
- Caches for fast code and data accesses
- Local Memories for deterministic code and data accesses
- IEEE754-compliant FPU coprocessor
- Memory Protection Unit (MPU) for secure RTOS
- Memory Management Unit (MMU) for Linux
The D1088 is a 5-stage pipeline integer processor with an integrated DSP offering more than 130 DSP SIMD (single instruction, multiple data) instructions. It targets the real-time processing requirements of power-constrained multimedia applications. On a 90nm low-power process, the D1088 delivers 588 DMIPS, 134 percent higher than competing offerings. Measured with the popular Whetstone floating-point benchmark, the D1088 achieves 92 percent better performance. When running the popular, comprehensive DSP libraries (over 200), the D1088 is 116 percent faster with half the code size. Even with these advantages, the D1088 still has a slightly smaller die area and lower power per MHz. The D1088's optimized DSP libraries and C/C++ compiler make algorithm programming easier.
Tightly integrated integer and DSP processor architectures are not new, but most were designed for applications where power was less of a constraint than it is today. The D1088 was designed with this new reality in mind. It contains functionality to enhance efficiency and reduce code size. For example, to significantly boost computational performance in matrix, filtering, Fourier transform, and statistics functions, it can execute 4-way 8-bit or 2-way 16-bit SIMD instructions with the latency of a single instruction, in addition to its other 8-bit and 16-bit SIMD instructions. For multimedia applications, the D1088 also supports 64-bit mixed add, subtract, and multiply computation.
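To make the lane semantics of 4-way 8-bit and 2-way 16-bit SIMD concrete, the following is a minimal portable C model of a packed add. It is only a sketch: the function names (`add8x4`, `add16x2`, `add8x4_ref`, `simd_model_selftest`) are hypothetical and are not D1088 intrinsics or instruction mnemonics, and wrap-around (non-saturating) arithmetic per lane is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: add four packed 8-bit lanes with wrap-around.
 * The low 7 bits of each lane are added first, then the top bit of each
 * lane is fixed up with XOR so carries never cross a lane boundary. */
uint32_t add8x4(uint32_t a, uint32_t b)
{
    uint32_t low = (a & 0x7F7F7F7Fu) + (b & 0x7F7F7F7Fu);
    return low ^ ((a ^ b) & 0x80808080u);
}

/* Hypothetical model: add two packed 16-bit lanes with wrap-around. */
uint32_t add16x2(uint32_t a, uint32_t b)
{
    uint32_t low = (a & 0x7FFF7FFFu) + (b & 0x7FFF7FFFu);
    return low ^ ((a ^ b) & 0x80008000u);
}

/* Scalar reference: one lane at a time, as a core without SIMD would do. */
uint32_t add8x4_ref(uint32_t a, uint32_t b)
{
    uint32_t r = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t x = (a >> (8 * lane)) & 0xFFu;
        uint32_t y = (b >> (8 * lane)) & 0xFFu;
        r |= ((x + y) & 0xFFu) << (8 * lane);
    }
    return r;
}

/* Self-check: the packed model must agree with the scalar reference. */
void simd_model_selftest(void)
{
    assert(add8x4(0x01FF7F80u, 0x01010180u) == add8x4_ref(0x01FF7F80u, 0x01010180u));
    assert(add8x4(0xFFFFFFFFu, 0x01010101u) == 0x00000000u);
}
```

The scalar reference needs four iterations for what the packed form expresses as one operation on a 32-bit register, which is the kind of saving the 4-way and 2-way SIMD instructions are meant to provide in hardware.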
For voice applications, the D1088 offers left shift, right shift with rounding, most-significant-word 32×32 multiply, and specially designed 32-bit instructions that replace lengthy 64-bit computations. To reduce code size and increase efficiency, the D1088 provides a Zero Overhead Loop instruction to offload loop branching. To enhance parallel computational capacity, the D1088 provides SIMD left and right shift, minimum, maximum, and absolute value, in addition to traditional SIMD instructions such as add, subtract, and multiply.
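As an illustration of the "lengthy 64-bit computation" that a most-significant-word multiply or a rounding right shift can replace, here is a minimal sketch in portable C. The function names (`mul_msw`, `mul_msw_round`, `shift_right_round`) are hypothetical and do not correspond to documented D1088 instructions or intrinsics. The code also assumes that right-shifting a negative signed value is an arithmetic shift, which holds on mainstream compilers but is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: upper 32 bits of a signed 32x32 -> 64-bit product.
 * Assumes >> on a negative int64_t is an arithmetic shift. */
int32_t mul_msw(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    return (int32_t)(product >> 32);
}

/* Same as mul_msw, but adds half an LSB of the result before truncating,
 * so the upper word is rounded instead of floored. */
int32_t mul_msw_round(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    return (int32_t)((product + ((int64_t)1 << 31)) >> 32);
}

/* Hypothetical model: right shift by n bits with rounding (add 2^(n-1)
 * first). The sum is formed in 64 bits so it cannot overflow. */
int32_t shift_right_round(int32_t x, unsigned n)
{
    assert(n >= 1 && n <= 31);
    return (int32_t)(((int64_t)x + ((int64_t)1 << (n - 1))) >> n);
}
```

For example, multiplying two fixed-point values of `0x40000000` with `mul_msw` yields `0x10000000`. Each helper widens to 64 bits and shifts back, which is the multi-step sequence a single dedicated 32-bit instruction is meant to avoid.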
Development Tools
- AndeSight™ Integrated Development Environment
- AICE JTAG/SDP debugger hardware
Key Features and Performance
AndeStar™ V3 Architecture
Key Features | Benefits |
---|---|
21st-century RISC-like instruction set | Better performance with modern compilers |
16/32-bit mixable opcode format | Smaller code size |
16 or 32 general-purpose registers | Trade-off between core size and performance requirements |
All-C Embedded Programming | Faster SW development and easier maintenance |
Shadow stack pointer | Efficiency and protection with a dedicated kernel stack pointer |
Hardware divider | More performance |
Aligned and unaligned load/store multiple word instructions and post-increment load/store memory accesses | Better program code size and performance |
Direct support of up to 32 interrupts with programmable priority levels | Quick identification of interrupt sources and fast assignment of service routines |
4G address space | Full range address space |
Memory-mapped I/O | Easy to program and compiler-friendly |
CPU Core
Key Features | Benefits |
---|---|
| Superior performance-per-MHz |
5-stage pipeline | Superior performance-efficiency, while allowing for high speeds |
DSP extension instructions
| Better performance for branches |
Extensive branch prediction (BTB and RAS) | Better performance for branches |
Hardware stack protection | Stack size determination and runtime overflow error detection |
Processor state bus | Simplified SoC design and debugging |
Performance monitors | Program code performance tuning |
Memory Management Unit | |
Memory Protection Unit | Basic read/write/execute memory protection with minimum cost |
Fast multipliers (1 cycle) | More performance |
Extensive clock gating and logic gating | Lower power |
N:1 core/bus clock ratios | Simplified SoC integration |
Low-latency vectored interrupt | Faster context switch for real-time applications |
Completion of most operations in 1 cycle; single-cycle capable for Local Memory and AHB bus accesses | Better performance-efficiency |
PowerBrake technology | Peak power consumption reduction |
Coprocessor interface | Supports the Andes FPU and other customer-designed coprocessor units |
* BSP v4.2.0, DMIPS/MHz without no-inline option, best performance
Memory Subsystems
Key Features | Benefits |
---|---|
I & D Cache | Higher performance for large program size |
Optional External Instruction and Data Local Memory | Higher efficiency for program execution |
BIU supports 32-bit AHB/2AHB/AHB-lite/APB/AXI | User-selectable bus interface for optimal efficiency |
Debug Support
Key Features | Benefits |
---|---|
2-wire Serial Debug Port or 5-wire JTAG Debug Port | Low-cost 2-wire support and industry-standard 5-wire support |
Embedded Debug Module (EDM) | |
Performance
Process | 90LP | 28HPM |
---|---|---|
Frequency (MHz) | 50 | 50 |
Dynamic power (µW/MHz) | 32.5 | 7.98 |
Area (mm²) | 0.16 | 0.029 |
* Base configuration, RVt library; power consumption at typical process corner, Vdd (90LP: 1.2V, 40LP: 1.1V, 28HPM: 0.9V), 25°C
Process | 40LP | 28HPM |
---|---|---|
Frequency (MHz) | 492 | 804 |
Dynamic power (µW/MHz) | 9.3 | 9.2 |
Area (mm²) | 0.11 | 0.06 |
* Base configuration, LVt library; frequency at slow process corner (40LP: 0.99V, 28HPM: 0.81V), 125°C, without I/O constraint; power consumption at typical process corner, Vdd (40LP: 1.1V, 28HPM: 0.9V), 25°C
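The per-MHz power figures in the tables can be turned into a rough dynamic power estimate by multiplying by a clock frequency. The helper below is a minimal sketch of that arithmetic; the function name `dynamic_power_mw` is hypothetical. Because the tables quote power at the typical corner and maximum frequency at the slow corner, the product is only an approximate indication, not a specified operating point.

```c
#include <assert.h>

/* Hypothetical helper: estimate dynamic power in mW from a per-MHz
 * figure (uW/MHz) and a clock frequency (MHz).
 * uW/MHz * MHz = uW, and 1000 uW = 1 mW. */
double dynamic_power_mw(double uw_per_mhz, double freq_mhz)
{
    assert(uw_per_mhz >= 0.0 && freq_mhz >= 0.0);
    return uw_per_mhz * freq_mhz / 1000.0;
}
```

With the 40LP figures from the table (9.3 µW/MHz at 492 MHz), `dynamic_power_mw(9.3, 492.0)` gives roughly 4.58 mW. With the 28HPM figures (9.2 µW/MHz at 804 MHz), it gives roughly 7.40 mW.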