Real-time signal processing is both I/O-intensive and computationally intensive. In addition to high-speed math units and single-cycle execution of all instructions, including multiply-accumulates (MACs), SHARC Processors are designed for maximum I/O and memory access bandwidth. This balance of core speed, memory integration, and I/O bandwidth delivers the sustained performance that real-time applications require.
Benchmarks matter because they show how a particular DSP performs in the context of an application. For the timing benchmarks below, the smaller the number, the faster the algorithm executes. A DSP that completes a task faster can perform more tasks in a given amount of time. Cycle time, clock speed, or MIPS alone cannot give an accurate indication of a processor's true performance, so it is important to analyze algorithm benchmarks, not only clock speed and cycle time.
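To illustrate how a faster benchmark translates into more work per unit of time, the following minimal sketch estimates how many FIR taps fit into one sample period. The per-tap times are taken from the FIR Filter row of the table below; the 48 kHz sample rate and the helper name `max_fir_taps` are assumptions chosen for illustration, and the estimate ignores I/O, loop setup, and any other overhead.

```python
def max_fir_taps(sample_rate_hz: float, ns_per_tap: float) -> int:
    """Upper bound on the FIR taps that can be computed within one sample period."""
    sample_period_ns = 1e9 / sample_rate_hz
    return int(sample_period_ns // ns_per_tap)


# Per-tap times from the benchmark table (100 MHz and 450 MHz columns).
# The 48 kHz sample rate is an illustrative assumption, not a figure from the table.
for label, ns_per_tap in [("100 MHz", 5.0), ("450 MHz", 1.11)]:
    print(f"{label}: up to {max_fir_taps(48_000, ns_per_tap)} taps per sample")
```

Under these assumptions the bound grows from 4166 taps to 18768 taps per sample, which is the practical meaning of a smaller benchmark number.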
Metric | ADSP-21160N / ADSP-21161N (SIMD) | ADSP-21261 (SIMD) | ADSP-21262 / ADSP-21266 (SIMD) | ADSP-21371 / ADSP-21375 (SIMD) | ADSP-21364 / ADSP-21365 (SIMD) | ADSP-21368 / ADSP-21369 (SIMD) | ADSP-2146x (SIMD) |
---|---|---|---|---|---|---|---|
Clock Speed | 100 MHz | 150 MHz | 200 MHz | 266 MHz | 333 MHz | 400 MHz | 450 MHz |
Instruction Cycle Time | 10 ns | 6.67 ns | 5 ns | 3.75 ns | 3 ns | 2.5 ns | 2.22 ns |
MFLOPS Sustained | 400 MFLOPS | 600 MFLOPS | 800 MFLOPS | 1064 MFLOPS | 1332 MFLOPS | 1600 MFLOPS | 1800 MFLOPS |
MFLOPS Peak | 600 MFLOPS | 900 MFLOPS | 1200 MFLOPS | 1596 MFLOPS | 1998 MFLOPS | 2400 MFLOPS | 2700 MFLOPS |
1024 Point Complex FFT (Radix 4, with bit reversal) | 92 µs | 61.3 µs | 46 µs | 34.5 µs | 28 µs | 23 µs | 20.44 µs |
FIR Filter (per tap) | 5 ns | 3.3 ns | 2.5 ns | 1.88 ns | 1.5 ns | 1.25 ns | 1.11 ns |
IIR Filter (per biquad) | 20 ns | 13.3 ns | 10 ns | 7.5 ns | 6 ns | 5 ns | 4.43 ns |
Matrix Multiply (pipelined) [3x3] * [3x1] | 45 ns | 30 ns | 22.5 ns | 16.91 ns | 13.5 ns | 11.25 ns | 10.00 ns |
Matrix Multiply (pipelined) [4x4] * [4x1] | 80 ns | 53.3 ns | 40 ns | 30.07 ns | 24 ns | 20 ns | 17.78 ns |
Divide (y/x) | 30 ns | 20 ns | 15 ns | 11.27 ns | 9 ns | 7.5 ns | 6.67 ns |
Inverse Square Root | 45 ns | 30 ns | 22.5 ns | 16.91 ns | 13.5 ns | 11.25 ns | 10.00 ns |
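
The timing rows in the table can be read as cycle counts multiplied by the instruction cycle time. The sketch below divides each 100 MHz figure by the 10 ns cycle time to recover an implied cycle count, then rescales it to 450 MHz. The function names `to_cycles` and `predict_ns` are hypothetical, and the assumption that the cycle count stays constant across clock rates is inferred from the table rather than stated by it.

```python
# Values copied from the 100 MHz column of the table (10 ns instruction cycle).
CYCLE_NS_100MHZ = 10.0
BENCHMARKS_NS_100MHZ = {
    "1024-point complex FFT": 92_000.0,  # 92 µs expressed in ns
    "FIR filter (per tap)": 5.0,
    "IIR filter (per biquad)": 20.0,
    "Matrix multiply [3x3] * [3x1]": 45.0,
    "Matrix multiply [4x4] * [4x1]": 80.0,
    "Divide (y/x)": 30.0,
    "Inverse square root": 45.0,
}


def to_cycles(time_ns: float, cycle_ns: float) -> float:
    """Convert an execution time into instruction cycles."""
    return time_ns / cycle_ns


def predict_ns(cycle_count: float, clock_mhz: float) -> float:
    """Predict execution time at another clock, assuming an unchanged cycle count."""
    return cycle_count * 1000.0 / clock_mhz


for name, time_ns in BENCHMARKS_NS_100MHZ.items():
    n = to_cycles(time_ns, CYCLE_NS_100MHZ)
    print(f"{name}: {n:g} cycles, about {predict_ns(n, 450):.2f} ns at 450 MHz")
```

With these inputs the FFT works out to 9200 cycles and the divide to 3 cycles, and the rescaled times agree with the 450 MHz column to within 0.01 ns (for example, 4.44 ns computed versus 4.43 ns listed for the IIR biquad). A fractional result such as 0.5 cycles per FIR tap simply means that more than one tap is completed per cycle in SIMD mode.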