Dadda multiplier

The Dadda multiplier is a hardware multiplier design invented by computer scientist Luigi Dadda in 1965.[1] It is similar to the Wallace multiplier, but it is slightly faster (for all operand sizes) and requires fewer gates (for all but the smallest operand sizes).[2]

See also: lattice multiplication, a similar concept from decimal arithmetic.

Dadda and Wallace multipliers share the same three steps for two bit strings $w_{1}$ and $w_{2}$ of lengths $\ell _{1}$ and $\ell _{2}$ respectively:

1. Multiply (logical AND) each bit of $w_{1}$ by each bit of $w_{2}$, yielding $\ell _{1}\cdot \ell _{2}$ results, grouped by weight in columns.
2. Reduce the number of partial products by stages of full and half adders until at most two bits of each weight remain.
3. Add the two remaining rows with a conventional adder.

As with the Wallace multiplier, the multiplication products of the first step carry different weights reflecting the magnitude of the original bit values in the multiplication. For example, the product of bits $a_{n}b_{m}$ has weight $n+m$ .
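The first step and the weight rule can be illustrated with a minimal Python sketch. The function name `partial_product_columns` and the integer encoding of the operands are assumptions made for illustration; they do not come from the cited sources.

```python
def partial_product_columns(a, b, n1, n2):
    """Group the AND-ed bit products of a (n1 bits) and b (n2 bits) by weight."""
    columns = [[] for _ in range(n1 + n2 - 1)]
    for n in range(n1):
        for m in range(n2):
            bit = ((a >> n) & 1) & ((b >> m) & 1)  # a_n AND b_m
            columns[n + m].append(bit)             # this product has weight n + m
    return columns


columns = partial_product_columns(0b1011, 0b1101, 4, 4)
heights = [len(col) for col in columns]  # column heights, lowest weight first
# Summing every bit at its weight recovers the ordinary product.
value = sum(sum(col) << weight for weight, col in enumerate(columns))
print(heights)
print(value == 0b1011 * 0b1101)
```

For equal operand lengths the column heights rise from one to the operand length and then fall back to one, which is the triangular shape that the reduction step works on.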

Unlike Wallace multipliers that reduce as much as possible on each layer, Dadda multipliers attempt to minimize the number of gates used, as well as input/output delay. Because of this, Dadda multipliers have a less expensive reduction phase, but the final numbers may be a few bits longer, thus requiring slightly bigger adders.

Description

An example of a full-adder circuit.

To achieve a more efficient design, the structure of the reduction process is governed by slightly more complex rules than in Wallace multipliers.

The progression of the reduction is controlled by a maximum-height sequence $d_{j}$ , defined by:

$$d_{1}=2{\text{ and }}d_{j+1}=\lfloor 1.5\,d_{j}\rfloor .$$

This yields a sequence like so:

$$d_{1}=2,\ d_{2}=3,\ d_{3}=4,\ d_{4}=6,\ d_{5}=9,\ d_{6}=13,\ \ldots$$

The initial value of $j$ is chosen as the largest value such that $d_{j}<\min(n_{1},n_{2})$, where $n_{1}$ and $n_{2}$ are the numbers of bits in the input multiplicand and multiplier. The lesser of the two bit lengths is the maximum height of any column of weights after the first stage of multiplication. For each stage $j$ of the reduction, the goal of the algorithm is to reduce the height of each column so that it is less than or equal to $d_{j}$.
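The sequence and the choice of the starting stage can be sketched in a few lines of Python. The names `dadda_heights` and `starting_stage` are hypothetical helpers, and returning `(0, None)` when no reduction stage is needed is a convention assumed here.

```python
from itertools import islice


def dadda_heights():
    """Yield d_1, d_2, ... with d_1 = 2 and d_{j+1} = floor(1.5 * d_j)."""
    d = 2
    while True:
        yield d
        d = (3 * d) // 2  # integer form of floor(1.5 * d)


def starting_stage(n1, n2):
    """Return (j, d_j) for the largest j with d_j < min(n1, n2), else (0, None)."""
    limit = min(n1, n2)
    j, d_j = 0, None
    for index, d in enumerate(dadda_heights(), start=1):
        if d >= limit:
            break
        j, d_j = index, d
    return j, d_j


first_six = list(islice(dadda_heights(), 6))
print(first_six)             # [2, 3, 4, 6, 9, 13]
print(starting_stage(8, 8))  # (4, 6)
```

Integer arithmetic is used instead of `1.5 * d` so that the floor is exact for every term.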

For each stage from $j,\ldots ,1$, reduce each column starting at the lowest-weight column, $c_{0}$, according to these rules:

1. If $\operatorname {height} (c_{i})\leqslant d_{j}$, the column does not require reduction; move to column $c_{i+1}$.
2. If $\operatorname {height} (c_{i})=d_{j}+1$, add the top two elements in a half-adder, placing the result at the bottom of the column and the carry at the top of column $c_{i+1}$, then move to column $c_{i+1}$.
3. Otherwise, add the top three elements in a full-adder, placing the result at the bottom of the column and the carry at the top of column $c_{i+1}$, then restart $c_{i}$ at step 1.
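The rules above can be sketched at the level of column heights alone, without tracking individual bits. This is a minimal model under stated assumptions: `reduce_stage` and `dadda_stages` are hypothetical names, and a carry produced in a stage is counted in the height of the next column within that same stage.

```python
def reduce_stage(heights, d):
    """Apply one reduction stage to column heights (lowest weight first).

    Returns the new heights and the numbers of full and half adders used.
    """
    new_heights, full, half, carries_in = [], 0, 0, 0
    for height in heights:
        h = height + carries_in  # carries from the previous column count too
        carries_in = 0
        while h > d:             # first rule: stop once the column fits
            if h == d + 1:       # second rule: half adder, two bits become one
                half += 1
                h -= 1
            else:                # third rule: full adder, three bits become one
                full += 1
                h -= 2
            carries_in += 1      # every adder sends one carry to the next column
        new_heights.append(h)
    if carries_in:               # carries out of the top column open a new one
        new_heights.append(carries_in)
    return new_heights, full, half


def dadda_stages(n1, n2):
    """Return one (d_j, heights, full, half) tuple per stage, first stage first."""
    total = n1 + n2 - 1
    heights = [min(i + 1, n1, n2, total - i) for i in range(total)]
    targets, d = [], 2
    while d < min(n1, n2):       # keep every d_j below min(n1, n2)
        targets.append(d)
        d = 3 * d // 2
    stages = []
    for d in reversed(targets):  # run from the largest d_j down to d_1 = 2
        heights, full, half = reduce_stage(heights, d)
        stages.append((d, heights, full, half))
    return stages


for d, heights, full, half in dadda_stages(8, 8):
    print(d, heights, full, half)
```

For an 8 × 8 multiplier this model uses 35 full adders and 7 half adders in total and ends with every column at height two or less.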

Algorithm example

Example of Dadda reduction on 8 × 8 multiplier. Bits with lower weight are rightmost.

The following walk-through traces the reduction of an 8 × 8 multiplier stage by stage.

The initial stage is $j=4$, because $d_{4}=6$ is the largest value in the sequence that is less than 8.

Stage $j=4$ , $d_{4}=6$

$\operatorname {height} (c_{0}\cdots c_{5})$ are all less than or equal to six bits in height, so no changes are made
$\operatorname {height} (c_{6})=d_{4}+1=7$ , so a half-adder is applied, reducing it to six bits and adding its carry bit to $c_{7}$
$\operatorname {height} (c_{7})=9$ including the carry bit from $c_{6}$ , so we apply a full-adder and a half-adder to reduce it to six bits
$\operatorname {height} (c_{8})=9$ including two carry bits from $c_{7}$ , so we again apply a full-adder and a half-adder to reduce it to six bits
$\operatorname {height} (c_{9})=8$ including two carry bits from $c_{8}$ , so we apply a single full-adder and reduce it to six bits
$\operatorname {height} (c_{10}\cdots c_{14})$ are all less than or equal to six bits in height including carry bits, so no changes are made

Stage $j=3$ , $d_{3}=4$

$\operatorname {height} (c_{0}\cdots c_{3})$ are all less than or equal to four bits in height, so no changes are made
$\operatorname {height} (c_{4})=d_{3}+1=5$ , so a half-adder is applied, reducing it to four bits and adding its carry bit to $c_{5}$
$\operatorname {height} (c_{5})=7$ including the carry bit from $c_{4}$ , so we apply a full-adder and a half-adder to reduce it to four bits
$\operatorname {height} (c_{6}\cdots c_{10})=8$ including previous carry bits, so we apply two full-adders to each column to reduce it to four bits
$\operatorname {height} (c_{11})=6$ including previous carry bits, so we apply a full-adder to reduce it to four bits
$\operatorname {height} (c_{12}\cdots c_{14})$ are all less than or equal to four bits in height including carry bits, so no changes are made

Stage $j=2$ , $d_{2}=3$

$\operatorname {height} (c_{0}\cdots c_{2})$ are all less than or equal to three bits in height, so no changes are made
$\operatorname {height} (c_{3})=d_{2}+1=4$ , so a half-adder is applied, reducing it to three bits and adding its carry bit to $c_{4}$
$\operatorname {height} (c_{4}\cdots c_{12})=5$ including previous carry bits, so we apply one full-adder to each column to reduce it to three bits
$\operatorname {height} (c_{13}\cdots c_{14})$ are all less than or equal to three bits in height including carry bits, so no changes are made

Stage $j=1$ , $d_{1}=2$

$\operatorname {height} (c_{0}\cdots c_{1})$ are all less than or equal to two bits in height, so no changes are made
$\operatorname {height} (c_{2})=d_{1}+1=3$ , so a half-adder is applied, reducing it to two bits and adding its carry bit to $c_{3}$
$\operatorname {height} (c_{3}\cdots c_{13})=4$ including previous carry bits, so we apply one full-adder to each column to reduce it to two bits
$\operatorname {height} (c_{14})=2$ including the carry bit from $c_{13}$ , so no changes are made

Addition

The output of the last stage leaves 15 columns of height two or less which can be passed into a standard adder.
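All three steps can be combined into a small bit-level model that checks the result against ordinary multiplication. This is a functional sketch only: `dadda_multiply` is a hypothetical name, Python's integer addition stands in for the final adder, and the order in which bits are taken from a column is an assumption that does not model gate delay.

```python
def dadda_multiply(a, b, n1, n2):
    """Multiply a (n1 bits) by b (n2 bits) using Dadda-style column reduction."""
    # Step 1: AND every bit pair and file it under its weight n + m.
    cols = [[] for _ in range(n1 + n2)]  # one spare column for a top carry
    for n in range(n1):
        for m in range(n2):
            cols[n + m].append((a >> n) & (b >> m) & 1)

    # Maximum heights d_j below min(n1, n2), smallest first.
    targets, d = [], 2
    while d < min(n1, n2):
        targets.append(d)
        d = 3 * d // 2

    # Step 2: reduce stage by stage, lowest-weight column first.
    for d in reversed(targets):
        carries = []
        for i in range(len(cols)):
            col = cols[i] + carries      # carries from column i - 1 arrive here
            carries = []
            while len(col) > d:
                if len(col) == d + 1:    # half adder on two bits
                    x, y = col.pop(0), col.pop(0)
                    col.append(x ^ y)
                    carries.append(x & y)
                else:                    # full adder on three bits
                    x, y, z = col.pop(0), col.pop(0), col.pop(0)
                    col.append(x ^ y ^ z)
                    carries.append((x & y) | (x & z) | (y & z))
            cols[i] = col

    # Step 3: at most two bits per column remain; add the two rows.
    assert all(len(col) <= 2 for col in cols)
    row0 = sum(col[0] << i for i, col in enumerate(cols) if len(col) > 0)
    row1 = sum(col[1] << i for i, col in enumerate(cols) if len(col) > 1)
    return row0 + row1


print(dadda_multiply(200, 123, 8, 8))  # 24600
```

Each half or full adder preserves the weighted sum of the bits it touches, so the two rows passed to the final addition always sum to the product.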

References

[1] Dadda, Luigi (May 1965). "Some schemes for parallel multipliers". Alta Frequenza. 34 (5): 349–356.
[2] Townsend, Whitney J.; Swartzlander, Jr., Earl E.; Abraham, Jacob A. (December 2003) [2003-08-06]. "A Comparison of Dadda and Wallace Multiplier Delays" (PDF). SPIE Advanced Signal Processing Algorithms, Architectures, and Implementations XIII. The International Society. doi:10.1117/12.507012. Archived (PDF) from the original on 2018-07-16. Retrieved 2018-07-16.
