It has been 30 years since the 8048/8051 microcontrollers appeared on the market and changed the world's view of what an embedded processor should look like. Over the years, subsequent processors tended to follow that basic architecture, adding improvements at each step to keep pace with increasingly demanding applications. As long-lived and important as that original architecture has been, it is now time to embrace a new architecture, one designed from the ground up to handle the applications of the 21st century. In this article, we discuss the sea changes in architecture design that are being driven by demands for higher operating speeds and lower power dissipation.
20th century applications
The typical application from the 8051 era used an 8-bit microcontroller (a bit banger) that could read sensors, do a small amount of data processing, and then drive some I/O lines, probably parallel, to send characters to a display or record a data byte onto tape or some other data logging device. Additional I/O lines could scan a simple keyboard or set of switches. The whole thing could be driven within time constraints by an on-chip real-time clock that provided precise timing references to synchronize data transfers and perform other time-driven tasks.
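As a rough illustration of the keyboard-scanning job described above, the following sketch simulates a bit-banged scan of a key matrix in Python. The 4x4 matrix size, the function names, and the `make_fake_port` stand-in for a hardware input port are all assumptions made for this example; a real 8051-class part would do the same thing with a handful of port reads and writes.

```python
# Hypothetical 4x4 key matrix scan, simulated in software.
# read_columns stands in for "drive one row line, then read the input port".

def scan_matrix(read_columns, n_rows=4, n_cols=4):
    """Return the set of (row, col) positions of pressed keys."""
    pressed = set()
    for row in range(n_rows):
        # Select one row and read the column lines back as a bit mask.
        col_bits = read_columns(row)
        for col in range(n_cols):
            if col_bits & (1 << col):
                pressed.add((row, col))
    return pressed


def make_fake_port(keys_down):
    """Build a stand-in for the column input port from a set of pressed keys."""
    def read_columns(row):
        bits = 0
        for (r, c) in keys_down:
            if r == row:
                bits |= 1 << c
        return bits
    return read_columns


port = make_fake_port({(0, 1), (2, 3)})
print(sorted(scan_matrix(port)))  # [(0, 1), (2, 3)]
```

The loop does no real data processing. It only toggles and samples lines, which is why such modest processors were adequate for the job.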
These applications used only a small amount of memory, perhaps 64 to 256 bytes of RAM, and most of that was integrated on the chip. Although provisions were made to access external memory as well, this was initially a primitive interface consisting of just an address and data bus, and it relied on the processor to move data in and out of external memory under software control.
Thus the emphasis was on controlling I/O within tight time constraints, with very little actual data manipulation done by the processor chip. That was fortunate, because the processor had extremely limited data processing capability anyway and ran at clock rates of only a few megahertz. As limited as these chips were, they were sufficient to control countless simple applications ranging from wall thermostats to simple home automation systems. In fact, at this moment I'm typing this paper on a recently introduced laptop that uses a derivative of that original 8048 chip for the sole purpose of reading keyboard clicks.
Over time, more advanced processors were introduced, in both 16- and 32-bit versions, with much faster and more sophisticated external memory interfaces using DMA controller circuitry. Still, the basic idea has remained the same: one or two processors on a chip, reading data from input lines, sending data to output lines, and wiggling I/O control pins as appropriate... all to the metronome of an external reference real-time clock.
Consumer electronics drives the 21st century
But now the nature of the applications has changed dramatically. In addition to the traditional real-time bit banging, a new dimension of processing capability has been added: the processing of algorithms. Today the high-volume applications are multimedia consumer applications that range from tiny MP3 music players to cell phones with video capability. Moreover, the long-awaited avalanche of high-definition television has begun, and along with those televisions, consumers suddenly perceive the need for home networks that move video and music from room to room.
Multimedia capability
All new consumer applications have digital data at their heart, and that implies extensive digital signal processing in any device that displays or plays that data. Multimedia applications make heavy use of mathematical algorithms such as Fast Fourier Transforms (FFTs) and discrete cosine transforms (DCTs). The high bandwidth needed to serve multimedia applications means that 21st century processors must have dedicated circuitry for processing those algorithms. But at the same time, none of the earlier requirements for general purpose I/O and real-time clocks have gone away. New chips must handle both!
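To give a feel for the arithmetic involved, here is a minimal sketch of a direct, unnormalized DCT-II written in plain Python. The function name `dct_ii` and the multiply-accumulate counter are illustrative choices for this example, and production codecs use much faster factored algorithms. The point is only that even a short transform is a dense run of multiply-accumulate operations.

```python
import math

def dct_ii(samples):
    """Unnormalized DCT-II: X[k] = sum over n of x[n] * cos(pi/N * (n + 0.5) * k).

    Returns the coefficients and the number of multiply-accumulates performed.
    """
    n_points = len(samples)
    result = []
    mac_count = 0
    for k in range(n_points):
        acc = 0.0
        for n, x in enumerate(samples):
            acc += x * math.cos(math.pi / n_points * (n + 0.5) * k)
            mac_count += 1
        result.append(acc)
    return result, mac_count


# A constant input has all of its energy in the k = 0 coefficient.
coeffs, macs = dct_ii([1.0] * 8)
print(round(coeffs[0], 6), macs)  # 8.0 64
```

An 8-point transform computed this way costs 64 multiply-accumulates, and the count grows with the square of the block length. A processor asked to do this across an entire audio or video stream in real time benefits from hardware that completes a multiply-accumulate every cycle.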
Bitstream orientation
Whereas earlier processors viewed external memory as the source and destination of application data, modern processors must be able to operate on high-speed bitstreams arriving from the Internet, over USB and 1394 cables, and from cable and satellite television services. The USB 2.0 interface, now nearly ubiquitous on consumer products such as cameras, MP3 players, and even cell phones, runs at up to 480 Mbit/sec. The 1394 interface is commonly used in video applications and comes in 200/400/800 Mbit/sec rates. Gigabit Ethernet, with higher data rates still, is beginning to appear in homes. Today's processors have to deal with these data rates, all of which are staggeringly fast by 20th century standards.
To make matters worse, the new High Definition Audio-Video Network Alliance (HANA) standard for home networking assumes up to four 1394 bitstreams that may reach 800 Mbit/sec. An MPEG-4 bitstream can carry multiple bitstreams for audio and video plus optional additional streams for things like subtitles and still images. In many cases, a processor may need to handle four or five of these high-speed bitstreams at once.
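A quick back-of-the-envelope calculation shows what these numbers mean for the processor. The sketch below reads the HANA figure as four streams of up to 800 Mbit/sec each, which is an assumption since the figure could also be read as a shared total. The 200 MHz core clock is likewise a hypothetical value picked only to make the arithmetic concrete.

```python
def aggregate_load(stream_rates_mbit, cpu_clock_mhz):
    """Return (total Mbit/sec, total Mbyte/sec, CPU cycles available per byte)."""
    total_mbit = sum(stream_rates_mbit)
    total_mbyte = total_mbit / 8
    # MHz divided by Mbyte/sec gives cycles per byte (the 10^6 factors cancel).
    cycles_per_byte = cpu_clock_mhz / total_mbyte
    return total_mbit, total_mbyte, cycles_per_byte


# Assumed worst case: four streams at 800 Mbit/sec, hypothetical 200 MHz core.
total_mbit, total_mbyte, cycles_per_byte = aggregate_load([800] * 4, 200)
print(total_mbit, total_mbyte, cycles_per_byte)  # 3200 400.0 0.5
```

Under those assumptions the core would have half a clock cycle per incoming byte, so it could not even touch every byte in software. That is the practical argument for moving the data with dedicated hardware and leaving the core free for control and algorithm work.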
Fast external memory interface
The need to handle massive amounts of data applies to external memory as well. The days when the processor only needed to address a few hundred bytes of data are long over. For example, processors aimed at video applications must be able to handle 128 Mbytes of DDR SDRAM memory. While it is relatively easy to implement larger address ranges on a processor chip, the speed of these memory interfaces is now critically important. The large address space translates into many pins on the processor dedicated to the external memory interface. Because many of these applications involve multiple bitstreams, there must be an easy way to quickly switch the address bits on the memory interface. Most modern processors use full-blown DMA (direct memory access) controllers for this interface, typically three of them. Some even go the extra step of allowing indexed addressing in the controller. That's convenient, for instance, when the device is fetching multi-byte vectors from memory.
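The idea of indexed addressing in a DMA controller can be sketched in software. The model below is one plausible reading, in which the controller steps through memory as base plus index times stride and copies a fixed-length vector at each step. The `DmaDescriptor` fields and the `dma_gather` function are hypothetical names invented for this illustration and do not describe any particular chip.

```python
from dataclasses import dataclass

@dataclass
class DmaDescriptor:
    base: int        # start address in external memory
    stride: int      # bytes between the starts of consecutive vectors
    vector_len: int  # bytes copied per vector
    count: int       # number of vectors to fetch


def dma_gather(memory, desc):
    """Copy desc.count vectors out of memory without per-byte CPU address math."""
    out = bytearray()
    addr = desc.base
    for _ in range(desc.count):
        out += memory[addr:addr + desc.vector_len]
        addr += desc.stride  # the controller, not the CPU, advances the address
    return bytes(out)


memory = bytes(range(32))  # stand-in for external memory; each byte equals its address
desc = DmaDescriptor(base=4, stride=8, vector_len=3, count=3)
print(list(dma_gather(memory, desc)))  # [4, 5, 6, 12, 13, 14, 20, 21, 22]
```

In this model the processor fills in one descriptor and the controller does the rest. Giving each bitstream its own descriptor is one way to picture how several controllers can serve several streams without the core recomputing addresses.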
Low power dissipation
Many modern consumer devices are battery operated. The high processing load, combined with a display and sometimes even a disk drive, places a heavy load on the batteries in these devices. As a result, power is at a premium, and the processor itself must be capable of low-power operation to maximize battery life. Of course, low power does not normally go hand in hand with high processing speed, so this represents a serious design tradeoff.
21st century multicore processor architecture
Changes in the nature of applications clearly require corresponding changes in the processor architecture. For instance, the need for multimedia capability requires special high-speed arithmetic circuits, and the need for that high-speed processing has led chip designers to use multicore processors. This allows tasks that require real-time processing to run on one core while other, non-real-time tasks run on a second core.