You can realize significant improvements in the performance of signal
processing functions in wireless systems. How? By taking advantage of
the flexibility of FPGA fabric and the embedded DSP blocks in current
FPGA architectures for operations that can benefit from parallelism.
Common examples of such operations in wireless applications include
FIR filtering, fast Fourier transforms (FFTs), digital down- and
up-conversion, and forward error correction (FEC) blocks.
Offloading operations that require high-speed parallel processing
onto the FPGA, and leaving operations that require high-speed serial
processing on the processor, optimizes overall system performance and
cost while lowering system requirements.
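To illustrate why an operation such as FIR filtering suits the FPGA, the following minimal Python sketch computes each filter output as a set of independent multiplications followed by a sum. It is not taken from any vendor tool flow; the function names `fir_output` and `fir_filter` and the tap values are illustrative assumptions. In an FPGA, each product could be mapped to its own DSP block and computed in the same clock cycle, whereas a processor with a single multiply-accumulate unit would step through the taps one after another.

```python
def fir_output(taps, window):
    """Compute one FIR output sample from the newest len(taps) input samples.

    window[0] is the newest sample, window[1] the one before it, and so on.
    """
    if len(taps) != len(window):
        raise ValueError("taps and window must have the same length")
    # Each product depends only on its own tap and sample, so all of them
    # could be computed at once by parallel hardware multipliers.
    products = [h * x for h, x in zip(taps, window)]
    # In hardware the sum could be formed by an adder tree.
    return sum(products)


def fir_filter(taps, samples):
    """Filter a whole sequence, assuming zeros before the first sample."""
    history = [0] * len(taps)
    out = []
    for s in samples:
        history = [s] + history[:-1]  # shift register, newest sample first
        out.append(fir_output(taps, history))
    return out


# Hypothetical 4-tap moving-sum filter applied to a short ramp.
print(fir_filter([1, 1, 1, 1], [1, 2, 3, 4, 5]))  # prints [1, 3, 6, 10, 14]
```

The list comprehension in `fir_output` is the part that parallelizes: the number of products grows with the number of taps, but none of them depends on another.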
Partitioning
The FPGA can be used with a digital signal processor (DSP),
serving either as an independent pre-processor (or sometimes
post-processor) device, or as a co-processor. In a pre-processing
architecture, the FPGA sits directly in the data path and is
responsible for processing the signals to the point where they can be
efficiently and cost-effectively handed off to a DSP for
further lower-rate processing.
Figure 1: In co-processing architectures, the FPGA sits alongside the DSP,
which offloads specific algorithmic functions to the FPGA to be
processed at significantly higher speeds than is possible in a DSP
alone.
In co-processing architectures, the FPGA sits alongside the DSP,
which offloads specific algorithmic functions to the FPGA to be
processed at significantly higher speeds than is possible in a DSP
alone. The results are passed back to the DSP or sent to
other devices for further processing, transmission or storage (Figure 1 above).
Timing margins
The choice of pre-processing, post-processing or co-processing is often
governed by the timing margins needed to move data between the
processor and the FPGA, and by how those transfers affect overall
latency.
Although a co-processing solution is the topology designers most often
consider, primarily because the DSP is in more direct control of the
data hand-off process, it may not always be the best overall strategy.
Figure 2: An LTE example of co-processing data-transfer latency issues.
Consider, for example, the latest specifications for 3GPP Long Term
Evolution (LTE), in which the transmission time interval has been
reduced to 1ms, down from 2ms for HSDPA and 10ms for W-CDMA. This
essentially requires that data be processed from the receiver through
to the output of the media access control (MAC) layer in less than
1,000 microseconds.
Figure 2 above shows that using a serial RapidIO port on the DSP
running at 3.125Gbit/s, with 8b/10b encoding and a 200-bit overhead for
the Turbo decode function, results in a DSP-to-FPGA transfer delay of
230µs. Taking into account other expected delays, the Turbo codec
performance required to meet these system timings is a very demanding
75.8Mbit/s for 50 users.
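The link arithmetic behind a transfer delay of this kind can be sketched as follows. This is a simplified illustration only: the function names and the 100,000-bit block size are assumptions, the 200-bit overhead is treated as a single per-transfer cost, and the model ignores packetization and protocol effects, so it is not expected to reproduce the 230µs figure. The one firm relationship it encodes is that 8b/10b encoding carries 8 data bits in every 10 line bits, so a 3.125Gbit/s lane delivers 2.5Gbit/s of payload.

```python
def payload_rate_bps(line_rate_bps, data_bits=8, coded_bits=10):
    """Usable bit rate of a line-coded serial link (8b/10b by default)."""
    return line_rate_bps * data_bits / coded_bits


def transfer_time_us(payload_bits, overhead_bits, line_rate_bps):
    """Time in microseconds to move payload plus overhead across the link."""
    total_bits = payload_bits + overhead_bits
    return total_bits / payload_rate_bps(line_rate_bps) * 1e6


LINE_RATE_BPS = 3.125e9    # serial RapidIO lane rate quoted in the text
OVERHEAD_BITS = 200        # overhead quoted in the text, assumed per transfer
BLOCK_BITS = 100_000       # hypothetical block size, not from the article

print(payload_rate_bps(LINE_RATE_BPS))   # usable rate after 8b/10b coding
print(transfer_time_us(BLOCK_BITS, OVERHEAD_BITS, LINE_RATE_BPS))
```

Because the transfer time scales linearly with the number of bits moved, every block that has to cross the DSP-to-FPGA link subtracts directly from the time left for decoding.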
Using an FPGA to process the Turbo codecs as a largely independent
post-processor not only removes DSP latency but also saves time because
there's no need to transfer the data at a high bandwidth between the
DSP and FPGA.
This reduces the required throughput of the Turbo decoder to
47Mbit/s, a decrease that allows more cost-effective devices to be used
and reduces system power dissipation.
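The effect of removing the transfer delay can be sketched with a simple time-budget calculation. The block size and decode window below are hypothetical values chosen for illustration, not numbers read from Figure 2, and the function name `required_throughput_mbps` is likewise an assumption; only the 230µs transfer delay comes from the text. The sketch shows the general relationship: any time spent moving data shrinks the window left for decoding, which raises the throughput the decoder must sustain.

```python
def required_throughput_mbps(block_bits, window_us, transfer_us=0.0):
    """Decoder throughput (Mbit/s) needed to finish block_bits in the time left.

    window_us   -- time allotted to decoding plus any data hand-off
    transfer_us -- time spent moving data between the DSP and the FPGA
    """
    decode_us = window_us - transfer_us
    if decode_us <= 0:
        raise ValueError("no time left for decoding")
    return block_bits / decode_us  # bits per microsecond equals Mbit/s


BLOCK_BITS = 30_000   # hypothetical amount of data to decode per interval
WINDOW_US = 600.0     # hypothetical share of the 1ms budget for this stage

# Co-processing: the 230µs DSP-to-FPGA transfer eats into the window.
co_processing = required_throughput_mbps(BLOCK_BITS, WINDOW_US, transfer_us=230.0)
# Post-processing: the FPGA already holds the data, so no transfer is needed.
post_processing = required_throughput_mbps(BLOCK_BITS, WINDOW_US)

print(co_processing, post_processing)
```

With these assumed inputs the post-processing arrangement needs a noticeably slower decoder than the co-processing one, which is the same direction of change as the drop from 75.8Mbit/s to 47Mbit/s described above.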
Another consideration is whether to use soft or hard embedded
processor intellectual property (IP) on the FPGA to offload some of the
system processing tasks, which in turn offers the possibility of
further reductions in cost, power and footprint.