An asynchronous circuit, or self-timed circuit, is a sequential digital logic circuit that is not governed by a clock circuit or global clock signal. Instead, it typically uses signals that indicate the completion of instructions and operations, specified by simple data transfer protocols. This type is contrasted with a synchronous circuit, in which changes to the signal values in the circuit are triggered by a train of repetitive pulses called a clock signal. Most digital devices today use synchronous circuits. However, asynchronous circuits have the potential to be faster, and may also have advantages in lower power consumption, lower electromagnetic interference, and better modularity in large systems. Asynchronous circuits are an active area of research in digital logic design.
Synchronous vs asynchronous logic
Digital logic circuits can be divided into combinational logic, in which the output signals depend only on the current input signals, and sequential logic, in which the output depends both on current input and the past history of inputs. In other words, sequential logic is combinational logic with memory. Virtually all practical digital devices require sequential logic. Sequential logic can be divided into two types, synchronous logic and asynchronous logic.
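The distinction can be illustrated with a minimal Python sketch. The names `majority` and `SRLatch` are hypothetical and chosen only for illustration; the latch is a behavioural model, not a gate-level one.

```python
def majority(a, b, c):
    """Combinational: the output depends only on the current inputs."""
    return int(a + b + c >= 2)


class SRLatch:
    """Sequential: the output also depends on stored state (past inputs)."""

    def __init__(self):
        self.q = 0  # the single bit of memory

    def step(self, set_, reset):
        if set_ and reset:
            raise ValueError("set and reset must not both be asserted")
        if set_:
            self.q = 1
        elif reset:
            self.q = 0
        # With both inputs low, the latch holds its previous value.
        return self.q
```

Applying the input pair (0, 0) to the latch returns 1 after a set and 0 after a reset, whereas `majority` always returns the same output for the same inputs.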
Theoretical foundation
The term asynchronous logic is used to describe a variety of design styles, which use different assumptions about circuit properties. These range from the bundled-data style, which uses "conventional" data processing elements with completion indicated by a locally generated delay, to delay-insensitive design, in which arbitrary delays through circuit elements can be accommodated. The latter style tends to yield circuits that are larger than bundled-data implementations, but that are insensitive to layout and parametric variations and are thus "correct by design".
Asynchronous logic is the logic required for the design of asynchronous digital systems. These function without a clock signal, so individual logic elements cannot be relied upon to have a discrete true/false state at any given time. Boolean logic is inadequate for this, so extensions are required. Karl Fant developed a theoretical treatment of this in his 2005 work Logically Determined Design, which used four-valued logic with null and intermediate being the additional values. This architecture is important because it is quasi-delay-insensitive. Scott Smith and Jia Di developed an ultra-low-power variation of Fant's Null Convention Logic that incorporates multi-threshold CMOS. This variation is termed Multi-threshold Null Convention Logic (MTNCL), or alternatively Sleep Convention Logic (SCL). Vadim Vasyukevich developed a different approach based upon a new logical operation, which he called venjunction. This takes into account not only the current value of an element, but also its history.
Petri nets are an attractive and powerful model for reasoning about asynchronous circuits. However, Petri nets have been criticized for their lack of physical realism (see Petri net: Subsequent models of concurrency). Since then, other models of concurrency that can model asynchronous circuits have been developed, including the Actor model and process calculi.
Benefits
A variety of advantages have been demonstrated by asynchronous circuits, including both quasi-delay-insensitive (QDI) circuits (generally agreed to be the most "pure" form of asynchronous logic that retains computational universality) and less pure forms of asynchronous circuitry which use timing constraints for higher performance and lower area and power:
Disadvantages
Communication protocols
There are several ways to create asynchronous communication channels. Usually, the sender signals the availability of data with a request, Req, and the receiver signals completion with an acknowledgement, Ack, indicating that it is able to process new requests; this exchange is called a handshake. The differences lie in the way these signals are coded.
Protocols
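There are two protocol families in asynchronous circuits, two-phase and four-phase, which differ in the way events are encoded. In a two-phase protocol, every transition on a wire (rising or falling) is an event; in a four-phase protocol, an event is a rising edge or level, and the wires return to zero before the next event.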
This basic distinction does not account for the wide variety of protocols. Events may encode requests and acknowledgements only, or they may also encode the data, which leads to the popular multi-wire encodings. Many other, less common protocols have been proposed, including using a single wire for request and acknowledgement, using several significant voltages, using only pulses, or balancing timings in order to remove the latches.
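As a rough illustration of the two families, the following Python sketch records the (Req, Ack) wire levels after each transition. It assumes the common conventions that a two-phase protocol treats every transition as an event and that a four-phase protocol returns both wires to zero after each transfer. The function names are hypothetical, and the model ignores real gate and wire delays.

```python
def four_phase_transfer(items):
    """Return-to-zero handshake: four transitions per transferred item."""
    req = ack = 0
    trace = []      # (req, ack) levels after each transition
    received = []
    for item in items:
        req = 1                      # 1. sender raises Req (data is valid)
        trace.append((req, ack))
        received.append(item)        #    receiver captures the data
        ack = 1                      # 2. receiver raises Ack
        trace.append((req, ack))
        req = 0                      # 3. sender lowers Req
        trace.append((req, ack))
        ack = 0                      # 4. receiver lowers Ack
        trace.append((req, ack))
    return received, trace


def two_phase_transfer(items):
    """Transition signalling: two transitions per transferred item."""
    req = ack = 0
    trace = []
    received = []
    for item in items:
        req ^= 1                     # sender toggles Req
        trace.append((req, ack))
        received.append(item)        # receiver captures the data
        ack ^= 1                     # receiver toggles Ack
        trace.append((req, ack))
    return received, trace
```

In this model the four-phase handshake needs twice as many wire transitions per item as the two-phase one, at the benefit of always starting each transfer from the same idle levels.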
Data encoding
There are several ways to encode data in asynchronous circuits. The most obvious encoding, similar to what can be found in synchronous circuits, is the bundled-data encoding, which uses one wire per bit of data and a separate request wire. Another common way to encode the data is to use multiple wires to encode a single digit: the value is determined by the wire on which the event occurs. This avoids some of the delay assumptions necessary with bundled-data encoding, since the request and the data are not separated anymore.
Bundled-data encoding
This is the same encoding as in synchronous circuits: it uses one wire per data bit. The request and the acknowledgement are sent on separate wires with various protocols. These circuits usually assume a bounded delay model, the completion signals being delayed long enough for the calculations to take place.
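The bounded-delay assumption can be sketched with a toy timing model in Python. The function name, the per-bit delays and the time units below are illustrative assumptions, not measurements of any real circuit.

```python
def sample_at_request(old_word, new_word, bit_delays, req_delay):
    """Word captured by the receiver when Req arrives after req_delay.

    Each data wire still carries its old value until its own delay has
    elapsed, so a bit is only correct if it settled before Req arrived.
    """
    return [
        new if delay <= req_delay else old
        for old, new, delay in zip(old_word, new_word, bit_delays)
    ]


def bundling_constraint_met(bit_delays, req_delay):
    """True if Req is delayed at least as long as the slowest data bit."""
    return req_delay >= max(bit_delays)
```

With bit delays of 1, 3 and 2 time units, a request delayed by 4 units captures the whole new word, while a request delayed by only 2 units captures a stale middle bit.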
Such circuits are often referred to as micropipelines, whether they use a two-phase or four-phase protocol, although the term was initially introduced for two-phase bundled-data circuits.
Multi-rail encoding
Here, the request is not sent on a dedicated wire: it is implicit in the arrival of a transition on one of the data wires. Any m-of-n encoding can be used, where a digit is represented by m transitions on n wires, and the reception of these transitions is equivalent to a request, with the advantage that this communication is delay-insensitive. Usually, a one-hot (1-of-n) encoding is preferred; such a code represents one digit in radix n.
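A minimal Python sketch of these codes follows. It models each wire as 1 once it has carried an event and 0 otherwise; the helper names are hypothetical.

```python
from math import comb


def one_hot_encode(digit, n):
    """1-of-n code: the event occurs on the wire whose index is the digit."""
    if not 0 <= digit < n:
        raise ValueError("digit must be in range(n)")
    return [1 if i == digit else 0 for i in range(n)]


def one_hot_decode(wires):
    """Recover the radix-n digit from a complete 1-of-n codeword."""
    if sum(wires) != 1:
        raise ValueError("not a complete 1-of-n codeword")
    return wires.index(1)


def is_complete(wires, m):
    """The receiver infers the request once exactly m wires have fired."""
    return sum(wires) == m


def codeword_count(m, n):
    """Number of distinct digits an m-of-n code can represent."""
    return comb(n, m)
```

Because no valid codeword is contained in another, the receiver can tell that a digit has fully arrived by counting events, regardless of the order in which the wires switch.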
Dual-rail encoding is by far the most common. It is mostly used with a four-phase protocol, a combination also called three-state encoding, since it has two valid states (10 and 01, after a transition) and a reset state (00). Another common encoding, which leads to a simpler implementation than one-hot two-phase dual-rail, is four-state encoding, or level-encoded dual-rail, which uses a data bit and a parity bit to achieve a two-phase protocol.
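Both encodings can be sketched in a few lines of Python. Two conventions here are assumptions made for illustration: that the state 10 stands for logical 1 and 01 for logical 0, and that the level-encoded pair is ordered (data, parity) and starts at (0, 0). All function names are hypothetical.

```python
SPACER = (0, 0)  # the reset state of three-state encoding


def dual_rail_four_phase(bits):
    """Three-state encoding: each bit is followed by a return to 00."""
    sequence = []
    for bit in bits:
        sequence.append((1, 0) if bit else (0, 1))
        sequence.append(SPACER)
    return sequence


def dual_rail_decode(sequence):
    """Recover the bits, skipping spacers and rejecting the unused 11."""
    bits = []
    for state in sequence:
        if state == (1, 1):
            raise ValueError("11 is not a valid dual-rail state")
        if state != SPACER:
            bits.append(1 if state == (1, 0) else 0)
    return bits


def ledr_send(bits, v=0, t=0):
    """Level-encoded dual-rail: v carries the data, t the parity.

    The phase v XOR t flips for every new bit, so exactly one of the
    two wires changes per transfer and no reset state is needed.
    """
    states = []
    for bit in bits:
        new_phase = (v ^ t) ^ 1
        v, t = bit, bit ^ new_phase
        states.append((v, t))
    return states


def ledr_receive(states, v=0, t=0):
    """Read the data wire each time the phase changes."""
    bits = []
    for new_v, new_t in states:
        if (new_v ^ new_t) == (v ^ t):
            raise ValueError("phase did not change: no new token")
        v, t = new_v, new_t
        bits.append(v)
    return bits
```

In this sketch, sending the bits 0, 1, 1, 0 with level encoding visits all four wire states once, which is where the name four-state encoding comes from.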
Asynchronous CPU
Asynchronous CPUs are one of several ideas for radically changing CPU design.
Unlike a conventional processor, a clockless processor (asynchronous CPU) has no central clock to coordinate the progress of data through the pipeline. Instead, stages of the CPU are coordinated using logic devices called "pipeline controls" or "FIFO sequencers." Basically, the pipeline controller clocks the next stage of logic when the existing stage is complete. In this way, a central clock is unnecessary. It may actually be even easier to implement high performance devices in asynchronous, as opposed to clocked, logic:
Asynchronous logic proponents believe these capabilities would have these benefits:
The biggest disadvantage of the clockless CPU is that most CPU design tools assume a clocked CPU (i.e., a synchronous circuit). Many tools "enforce synchronous design practices". Making a clockless CPU (designing an asynchronous circuit) involves modifying the design tools to handle clockless logic and doing extra testing to ensure the design avoids metastability problems. The group that designed the AMULET, for example, developed a tool called LARD to cope with the complex design of AMULET3.
Despite the difficulty of doing so, numerous asynchronous CPUs have been built, including:
The ILLIAC II was the first completely asynchronous, speed-independent processor design ever built; it was the most powerful computer at the time.
DEC PDP-16 Register Transfer Modules (ca. 1973) allowed the experimenter to construct asynchronous, 16-bit processing elements. Delays for each module were fixed and based on the module's worst-case timing.
The Caltech Asynchronous Microprocessor (1988) was the first asynchronous microprocessor. Caltech designed and manufactured the world's first fully quasi-delay-insensitive processor. During demonstrations, the researchers amazed viewers by loading a simple program which ran in a tight loop, pulsing one of the output lines after each instruction. This output line was connected to an oscilloscope. When a cup of hot coffee was placed on the chip, the pulse rate (the effective "clock rate") naturally slowed down to adapt to the worsening performance of the heated transistors. When liquid nitrogen was poured on the chip, the instruction rate shot up with no additional intervention. Additionally, at lower temperatures, the voltage supplied to the chip could be safely increased, which also improved the instruction rate, again with no additional configuration.
In 2004, Epson manufactured the world's first bendable microprocessor called ACT11, an 8-bit asynchronous chip. Synchronous flexible processors are slower, since bending the material on which a chip is fabricated causes wild and unpredictable variations in the delays of various transistors, for which worst-case scenarios must be assumed everywhere and everything must be clocked at worst-case speed. The processor is intended for use in smart cards, whose chips are currently limited in size to those small enough that they can remain perfectly rigid.
In 2014, IBM announced a SyNAPSE-developed chip that runs in an asynchronous manner, with one of the highest transistor counts of any chip ever produced. IBM's chip consumes orders of magnitude less power than traditional computing systems on pattern recognition benchmarks.