Part 1 explains how Atmel used the Target tools to develop a DSP optimized for C programming.
In the first part of this two-part series, we outlined the main reasons we designed the mAgic DSP architecture with C programmability in mind. In this article we'll explore how to get the most out of the mAgic architecture by using C programming techniques that maximize the efficiency of the mAgic C compiler. As an example, we'll take unoptimized C code for a common DSP kernel (the polynomial approximation of the sine function) and go step by step through C optimizations that bring it to 100% of the theoretical efficiency.
Specifically, our function will calculate the sine of an array of angles, similar to the function found in the mAgic optimized DSP library:
Step 1: the initial code
Below is the initial unoptimized code. Note that some code has been left out as it isn't relevant in the context of the article:
We start by calculating the theoretical efficiency of the function: what is the minimum number of cycles per loop iteration we can expect to obtain? This can be found by counting the operations contained in the loop. In our case there are 19 multiplications, 12 additions, 1 comparison and 2 I/O operations. The mAgic DSP is able to perform 1 scalar multiplication, 1 scalar addition or comparison, and 2 scalar I/O operations per cycle, so the multiplications need at least 19 cycles, the additions and the comparison 13 cycles, and the I/O operations 1 cycle. The multiplier is the bottleneck, and the minimum number of cycles for this loop is 19.
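The counting argument above can be written down as a small helper. This is only a sketch in plain C: the `LoopOps` structure and the `min_cycles` function are hypothetical names introduced for illustration, and the per-cycle capacities (1 multiplication, 1 addition or comparison, 2 I/O operations) are the ones stated in the text.

```c
/* Operation counts for one loop iteration (hypothetical helper type). */
typedef struct {
    int mults;     /* scalar multiplications */
    int adds;      /* scalar additions */
    int compares;  /* scalar comparisons (share the adder slot) */
    int io_ops;    /* scalar I/O operations */
} LoopOps;

int max_int(int a, int b) { return a > b ? a : b; }

/* Lower bound on cycles per iteration: the busiest unit sets the limit.
   Assumed capacities per cycle: 1 multiply, 1 add-or-compare, 2 I/O. */
int min_cycles(LoopOps ops) {
    int mul_cycles = ops.mults;
    int alu_cycles = ops.adds + ops.compares;
    int io_cycles  = (ops.io_ops + 1) / 2;   /* round up to whole cycles */
    return max_int(mul_cycles, max_int(alu_cycles, io_cycles));
}
```

Calling `min_cycles` with the counts for our loop (19, 12, 1, 2) gives 19, the figure used in the rest of the article.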
If we compile this code as is, the loop executes in 75 cycles. That is an efficiency of only about 25% (19 cycles of theoretical minimum against 75 actual cycles). Obviously we can do much better.
First, we apply Software Pipelining (SP). Without going into the details, which are outside the scope of this article, SP is a technique in which several iterations of the loop execute in parallel, allowing us to make maximum use of the processor resources. The mAgic C compiler can perform SP of loops automatically, but not in this case. What's wrong with our code? The problem is that it contains an 'if' construct. The compiler can't apply SP to loops that contain control constructs like 'if', 'goto' or function calls.
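To give a rough idea of what SP does, here is a hand-written sketch in plain C. It is not the output of the mAgic C compiler and not the sine kernel; `scale_plain` and `scale_pipelined` are hypothetical functions that multiply an array by a constant. In the pipelined version, each pass through the loop body works on three different iterations at once: it stores result i-2, multiplies element i-1 and loads element i. On a machine with parallel units, those three independent operations could be issued in the same cycle.

```c
/* Plain loop: load, multiply and store of one element happen in sequence. */
void scale_plain(const float *in, float *out, int n, float k) {
    for (int i = 0; i < n; i++) {
        float x = in[i];   /* stage 1: load */
        float y = x * k;   /* stage 2: multiply */
        out[i] = y;        /* stage 3: store */
    }
}

/* Hand-pipelined loop: the body overlaps three consecutive iterations. */
void scale_pipelined(const float *in, float *out, int n, float k) {
    if (n < 2) {                 /* too short to pipeline */
        scale_plain(in, out, n, k);
        return;
    }
    /* Prologue: fill the pipeline. */
    float x = in[0];             /* load element 0 */
    float y = x * k;             /* multiply element 0 */
    x = in[1];                   /* load element 1 */
    /* Kernel: three stages from three different iterations. */
    for (int i = 2; i < n; i++) {
        out[i - 2] = y;          /* store result i-2 */
        y = x * k;               /* multiply element i-1 */
        x = in[i];               /* load element i */
    }
    /* Epilogue: drain the pipeline. */
    out[n - 2] = y;
    y = x * k;
    out[n - 1] = y;
}
```

Both functions produce the same output. The point of the sketch is only the shape of the loop: a prologue, a kernel whose operations do not depend on each other, and an epilogue. A branch inside the loop body makes this kind of rearrangement much harder to do automatically.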
Step 2: eliminating the control construct
How can we remove the 'if' construct? A simple 'if' can be replaced with the intrinsic function _SEL. The _SEL function takes as inputs two variables and a Boolean condition stored in the special variable _mreg_FLACKCOND, and returns one of the two variables according to the condition. So the 'if' construct
can be replaced with the following code:
The _SEL intrinsic function will return roundFl if the _mreg_FLACKCOND condition is true or roundMh if the condition is false, thus reproducing the 'if' construct behavior.
Of course, a normal function call is also a control construct and would prevent SP, but _SEL is not a normal function: it's an intrinsic function, a special function that the compiler maps onto a particular mAgic operation, as explained in the previous article.
Note that when using the 'if' construct we had 3 operations: sum, compare, and subtract (plus, of course, the 'if' statement itself). Now, using _SEL, we have 4 operations. Despite the increased number of operations, we don't have to change our theoretical limit of 19 cycles, because the _SEL operation can be performed in parallel with the multiplications. In this example, using the _SEL intrinsic function reduces the number of cycles required to execute the loop from 75 to 71. While this is not a huge saving, the step is necessary to allow further optimizations.
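The same transformation can be sketched in portable C. The code below is not the code from the sine function and does not use the real intrinsic: `select_f` is a hypothetical plain-C stand-in that takes the condition as an ordinary argument, whereas _SEL, as described above, reads it from _mreg_FLACKCOND. The example, which wraps a value from [-P, P) into [0, P), is chosen only because it has the same shape: sum, compare and conditional subtract in the first version, and sum, subtract, compare and select in the second.

```c
/* Stand-in for a select operation: returns a if cond is nonzero, else b.
   Hypothetical helper, not the mAgic _SEL intrinsic. */
float select_f(int cond, float a, float b) {
    return cond ? a : b;
}

/* Version with a control construct: 3 operations plus the 'if'. */
float wrap_if(float x, float period) {
    float r = x + period;        /* sum */
    if (r >= period) {           /* compare */
        r = r - period;          /* subtract, only when needed */
    }
    return r;
}

/* Version without a control construct: 4 operations, no branch.
   Both candidate results are always computed, then one is selected. */
float wrap_sel(float x, float period) {
    float lo = x + period;       /* sum */
    float hi = lo - period;      /* subtract, always executed */
    int cond = lo < period;      /* compare */
    return select_f(cond, lo, hi);   /* select */
}
```

For any input in [-P, P), `wrap_if` and `wrap_sel` return the same value. The second form does slightly more work, but its loop body is straight-line code, which is what a software pipeliner needs.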
As a note to the reader, the mAgic C compiler can typically handle simple 'if' constructs directly through a compile option. We chose to detail the conversion here because these constructs are the most frequent reason for automatic SP to fail. Other techniques, like predicated execution, can be used to convert more complex 'if' constructs.