Title: Carry Lookahead Adder
1 332437 Lecture 22: Computer Arithmetic
- Carry Lookahead Adder
- Manchester Carry Chain
- Magnitude Comparators
- Universal Arithmetic Logic Unit
- Binary Multipliers
- Four-bit Decomposition
- Parallel
- Booth's Method
- Wallace Tree
- Wheeler's Division Method
- Summary
2 Material from:
- Principles of CMOS VLSI Design, by Neil Weste and Kamran Eshraghian, Addison-Wesley
- Computer Arithmetic, Volumes I and II, by Earl Swartzlander, Jr., IEEE Computer Society Press
3 Full Adder
4 Ripple Carries
- Cascade 64 full adders to get a 64-bit ripple carry adder
- Problem: Slow, since carries ripple from stage to stage
- If stage delay = 20 nsec, total delay = 64 X 20 = 1280 nsec
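The ripple behaviour above can be sketched in a few lines of Python. This is only an illustrative model, not the lecture's circuit: the function names `full_adder`, `ripple_carry_add` and `ripple_delay_ns` are made up for this sketch, and the delay model simply multiplies the stage count by the stage delay, as the slide does.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(x, y, width=64, cin=0):
    """Add two unsigned ints one bit at a time; each stage waits for the previous carry."""
    result, carry = 0, cin
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry


def ripple_delay_ns(width, stage_delay_ns):
    """Worst-case delay when the carry passes through every stage in series."""
    return width * stage_delay_ns


# Worst case: adding 1 to all ones makes the carry ripple through all 64 stages.
print(ripple_carry_add(2**64 - 1, 1))   # (0, 1)
print(ripple_delay_ns(64, 20))          # 1280
```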
5 Carry Look-Ahead Adder
- Invented by Weinberger
- Each stage produces:
- G = generate carry in this stage
- $G = A \cdot B$
- P = propagate carry through this stage
- $P = A \oplus B$
- $C_{OUT} = G + P \cdot C_{IN}$
6 Carry Equations
- 4 stage binary carry look-ahead adder
- A1-4, B1-4 are addends, C0 is input carry
- Outputs are Sum S1-4 and Carries C1-4
- Multiply out equations:
- $C_1 = G_1 + P_1 C_0$
- $C_2 = G_2 + P_2 C_1 = G_2 + P_2 G_1 + P_2 P_1 C_0$
- $C_3 = G_3 + P_3 C_2 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 C_0$
- $C_4 = G_4 + P_4 C_3 = G_4 + P_4 G_3 + P_4 P_3 G_2 + P_4 P_3 P_2 G_1 + P_4 P_3 P_2 P_1 C_0$
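As a sanity check on the multiplied-out carry equations, the sketch below evaluates them directly and compares them against the stage-by-stage recurrence COUT = G + P CIN for all 512 input combinations. It assumes G = A AND B and P = A XOR B; the function names are hypothetical and chosen only for this example.

```python
from itertools import product


def lookahead_carries(a, b, c0):
    """a, b: lists of 4 bits, index 0 holds stage 1. Returns [C1, C2, C3, C4]."""
    g1, g2, g3, g4 = [ai & bi for ai, bi in zip(a, b)]   # generate signals
    p1, p2, p3, p4 = [ai ^ bi for ai, bi in zip(a, b)]   # propagate signals
    c1 = g1 | (p1 & c0)
    c2 = g2 | (p2 & g1) | (p2 & p1 & c0)
    c3 = g3 | (p3 & g2) | (p3 & p2 & g1) | (p3 & p2 & p1 & c0)
    c4 = (g4 | (p4 & g3) | (p4 & p3 & g2) | (p4 & p3 & p2 & g1)
          | (p4 & p3 & p2 & p1 & c0))
    return [c1, c2, c3, c4]


def ripple_carries(a, b, c0):
    """Reference: apply COUT = G + P*CIN one stage at a time."""
    carries, c = [], c0
    for ai, bi in zip(a, b):
        c = (ai & bi) | ((ai ^ bi) & c)
        carries.append(c)
    return carries


def check_all():
    """Compare both formulations over every value of A, B and C0."""
    for bits in product((0, 1), repeat=9):
        a, b, c0 = list(bits[0:4]), list(bits[4:8]), bits[8]
        if lookahead_carries(a, b, c0) != ripple_carries(a, b, c0):
            return False
    return True


print(check_all())   # True
```

Each expanded carry depends only on the G, P and C0 inputs, which is why all four carries are available after a fixed number of gate delays.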
7 Carry Look-Ahead Adder
- Big speedup: At most 4 logic delays to get all carries
- For n full adder chips:
- $t_{total} = t_{XOR} + \log_4 n \times t_{LACG} + t_{FA}$
- $t_{XOR}$ = 10 nsec
- $t_{LACG}$ = carry lookahead stage delay = 21 nsec
- $t_{FA}$ = full adder delay = 15 nsec
- $t_{total}$ = 10 nsec + 1 X 21 nsec + 15 nsec = 46 nsec
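The delay formula can be evaluated with a small helper. This is a sketch under one assumption the slide does not state: when n is not a power of 4, the number of lookahead levels is rounded up. The names `lookahead_levels` and `cla_delay_ns` are made up for the example; the default delays are the ones quoted on the slide.

```python
def lookahead_levels(n):
    """Smallest k with 4**k >= n, i.e. log4(n) rounded up (integer arithmetic only)."""
    levels, span = 0, 1
    while span < n:
        span *= 4
        levels += 1
    return levels


def cla_delay_ns(n, t_xor=10, t_lacg=21, t_fa=15):
    """t_total = t_XOR + (lookahead levels) * t_LACG + t_FA, all in nsec."""
    return t_xor + lookahead_levels(n) * t_lacg + t_fa


print(cla_delay_ns(4))    # 46: one lookahead level, as in the slide's example
print(cla_delay_ns(16))   # 67: two lookahead levels
```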
8 Manchester Carry Chain
- Basic idea is to use ripple carries, but design
hardware so that they propagate as rapidly as
possible
9 Manchester Carry Chain
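10 Magnitude Comparators
- Compare 2 binary numbers and produce 3 signals that propagate like carries:
- $AltB_{out} = AltB_{in} \cdot \overline{(A_i \oplus B_i)} + \overline{A_i} \cdot B_i$
- $AgtB_{out} = AgtB_{in} \cdot \overline{(A_i \oplus B_i)} + A_i \cdot \overline{B_i}$
- $(A{=}B)_{out} = (A{=}B)_{in} \cdot \overline{(A_i \oplus B_i)}$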
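A bit-slice comparator of this kind can be modelled as follows. The sketch assumes the operators lost from the slide are the usual ones: each "in" signal passes through when Ai equals Bi (XNOR), and otherwise the current bit decides. It also assumes the chain runs from the least significant bit upward, so higher bits override lower ones. The function names are hypothetical.

```python
def compare_slice(a_i, b_i, lt_in, gt_in, eq_in):
    """One comparator bit slice: returns (AltBout, AgtBout, AeqBout)."""
    same = 1 - (a_i ^ b_i)                        # XNOR: 1 when the two bits match
    lt_out = (lt_in & same) | ((1 - a_i) & b_i)   # A < B so far
    gt_out = (gt_in & same) | (a_i & (1 - b_i))   # A > B so far
    eq_out = eq_in & same                         # A = B so far
    return lt_out, gt_out, eq_out


def compare(a, b, width=8):
    """Chain the slices from LSB to MSB, starting from 'equal so far'."""
    lt, gt, eq = 0, 0, 1
    for i in range(width):
        lt, gt, eq = compare_slice((a >> i) & 1, (b >> i) & 1, lt, gt, eq)
    return lt, gt, eq


print(compare(0b1010, 0b1100, width=4))   # (1, 0, 0): A < B
print(compare(9, 9, width=4))             # (0, 0, 1): A = B
```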
11 Magnitude Comparators
12 Comparator Time Delays
- Linear connection
- $t_c$ = individual comparator delay
- $t_{total} = m \times t_c$, for m comparators = 6 X 27 = 162 nsec (4-bit slices)
- Cascaded connection
- $t_{total} = (\log_4 m + 1) \times t_c$ = 2 X 27 = 54 nsec
13 Universal ALU -- 74AS181
14 Universal ALU
15 Binary Multipliers
- Extremely complex even for small word length
- Very hard to test
- Truth table expansion varies with word length
- Must use at least one carry propagating full
adder to sum up partial products
16 Multiplication Equations
- Multiplicand $A = A_h \times 2^4 + A_l$
- Multiplier $B = B_h \times 2^4 + B_l$
- Product $P = A \times B = A_h \times B_h \times 2^8 + (A_h \times B_l + A_l \times B_h) \times 2^4 + (A_l \times B_l)$
- Must sum the 4 partial products
- Since A and B are 8 bits, product is 16 bits
- Need 5 full adders to do this
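The four-bit decomposition can be checked numerically. In this sketch `mul4x4` is a hypothetical stand-in for one 4X4-bit multiplier block, and the shifts and `+` operators stand in for the wiring and adders that combine the four 8-bit partial products.

```python
def mul4x4(a, b):
    """Stand-in for one 4X4-bit multiplier block producing an 8-bit product."""
    assert 0 <= a < 16 and 0 <= b < 16
    return a * b


def mul8x8(a, b):
    """8X8-bit multiply built from four 4X4 products, weighted by 2^8, 2^4 and 1."""
    ah, al = a >> 4, a & 0xF      # A = Ah * 2^4 + Al
    bh, bl = b >> 4, b & 0xF      # B = Bh * 2^4 + Bl
    hh = mul4x4(ah, bh)
    hl = mul4x4(ah, bl)
    lh = mul4x4(al, bh)
    ll = mul4x4(al, bl)
    return (hh << 8) + ((hl + lh) << 4) + ll


print(mul8x8(0xB7, 0x5C) == 0xB7 * 0x5C)   # True
print(mul8x8(255, 255) < 2**16)            # True: the product fits in 16 bits
```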
17 4X4 bit Binary Multiplier
- 74LS274: Produces 8-bit product
18 Example
19 Combining Multipliers
20 Pseudo or Carry Save Adder
- Advantage: only the final stage of partial-product addition needs to propagate carries
21 Carry Save Addition (CSA)
- A full adder sums 3 inputs and produces 2 outputs
- Carry output has twice the weight of sum output
- N full adders in parallel are called a carry save adder
- Produce N sums and N carry outs
22 CSA Application
- Use k-2 stages of CSAs
- Keep result in carry-save redundant form
- Final CPA computes actual result
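A word-level sketch of carry-save addition is shown below. It treats Python integers as bit vectors, so each bitwise operation acts like N full adders working in parallel; the final `+` stands in for the carry-propagate adder (CPA). The function names are made up for this example, and k is taken to be the number of operands being summed.

```python
def carry_save_add(x, y, z):
    """3 words in, 2 words out: no carry moves from one bit position to the next."""
    s = x ^ y ^ z                               # per-bit sum outputs
    c = ((x & y) | (x & z) | (y & z)) << 1      # per-bit carries, twice the weight
    return s, c


def add_operands(operands):
    """Sum k >= 3 operands with k-2 CSAs, then one carry-propagate addition."""
    s, c = carry_save_add(*operands[:3])
    csa_count = 1
    for w in operands[3:]:
        s, c = carry_save_add(s, c, w)          # result stays in redundant (s, c) form
        csa_count += 1
    return s + c, csa_count                     # '+' models the final CPA


print(add_operands([3, 5, 7, 9, 11]))   # (35, 3): five operands, three CSAs
```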
31 Multiplication
- Example
- M x N-bit multiplication
- Produce N M-bit partial products
- Sum these to produce (M+N)-bit product
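The partial-product view of multiplication can be sketched as below. It assumes unsigned operands, with an M-bit multiplicand y and an N-bit multiplier x; the function names are hypothetical, and Python's `sum` stands in for the adder hardware.

```python
def partial_products(y, x, n_bits):
    """One partial product per multiplier bit: (x_i AND Y), shifted left by i."""
    return [(y if (x >> i) & 1 else 0) << i for i in range(n_bits)]


def multiply(y, x, m_bits, n_bits):
    """Sum the N partial products; the result always fits in M+N bits."""
    assert 0 <= y < (1 << m_bits) and 0 <= x < (1 << n_bits)
    result = sum(partial_products(y, x, n_bits))
    assert result < (1 << (m_bits + n_bits))
    return result


print(partial_products(13, 11, 4))   # [13, 26, 0, 104]
print(multiply(13, 11, 4, 4))        # 143
```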
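32 General Form
- Multiplicand: $Y = (y_{M-1}, y_{M-2}, \ldots, y_1, y_0)$
- Multiplier: $X = (x_{N-1}, x_{N-2}, \ldots, x_1, x_0)$
- Product: $P = \left(\sum_{j=0}^{M-1} y_j 2^j\right) \left(\sum_{i=0}^{N-1} x_i 2^i\right) = \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} x_i y_j 2^{i+j}$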
33 16X16 Mult. Dot Diagram
- Each dot represents a bit
34 Parallel Binary Multiplier
35 One-Bit Multiplier Cell
36 Rectangular Array
- Squash array to fit rectangular floorplan
37 Fewer Partial Products
- Array multiplier requires N partial products
- If we looked at groups of r multiplier bits, we could form N/r partial products
- Faster and smaller?
- Called radix-$2^r$ encoding
- Ex: r = 2: look at pairs of bits
- Form partial products of 0, Y, 2Y, 3Y
- First three are easy, but 3Y requires an adder
38 Booth Encoding
- Instead of 3Y, try -Y, then increment next partial product to add 4Y
- Similarly, for 2Y, try -2Y + 4Y in next partial product
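One way to see the recoding is to compute the radix-4 Booth digits directly. The sketch below uses the standard rule digit = x(2i) + x(2i-1) - 2 x(2i+1), which yields only the multiples -2Y, -Y, 0, Y and 2Y. It assumes an unsigned multiplier, so one extra digit is generated to absorb the final "+4Y" correction; the function names are made up for this example.

```python
def booth_radix4_digits(x, n_bits):
    """Recode an unsigned n-bit multiplier into radix-4 digits in {-2, -1, 0, 1, 2}."""
    digits = []
    prev = 0                                # bit to the right of the pair (0 at the start)
    for i in range(0, n_bits + 2, 2):       # extra pair handles the last correction
        low = (x >> i) & 1
        high = (x >> (i + 1)) & 1
        digits.append(low + prev - 2 * high)
        prev = high                         # a set high bit adds 1 to the next digit
    return digits


def booth_multiply(y, x, n_bits):
    """Sum the Booth partial products digit * Y * 4^i."""
    return sum(d * y * 4**i for i, d in enumerate(booth_radix4_digits(x, n_bits)))


print(booth_radix4_digits(3, 2))    # [-1, 1]: 3Y becomes -Y now, +4Y in the next digit
print(booth_radix4_digits(2, 2))    # [-2, 1]: 2Y becomes -2Y now, +4Y in the next digit
print(booth_multiply(13, 11, 4))    # 143
```

No digit ever asks for 3Y, so every partial product can be formed by shifting and negating Y.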
46 Booth Hardware
- Booth encoder generates control lines for each PP
- Booth selectors choose PP bits
- $X_i$ means add in Y
- $2X_i$ means add in 2Y
- M means negate partial product
47 Wallace and Dadda Multipliers
- Use pseudo-adder to avoid carry propagation ripple time
- Pseudo-adder does not internally propagate carries
- Instead produces one word of sum bit outputs and one word of output carries, shifted 1 bit left from the sum
- Has 3 input words and produces 2 output words
48 Wallace and Dadda Multipliers
- Approach: Generate summands simultaneously, then use Wallace tree to add them rapidly
- Problems: Wallace tree is highly irregular, and does not lead to structured design
- Dadda multiplier is slightly faster, and uses a little less hardware than Wallace tree
49 Wallace Tree Multiplication
- CSA is effectively a ones counter
- Called a (3,2) counter: converts 3 inputs into count encoded as 2 outputs
50 Dot Diagram of Array Mult.
51 Wallace Tree
- Sum partial products in parallel
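The parallel summation can be sketched at the word level: in each layer, every group of three words goes through its own carry-save adder at the same time, and leftover words pass straight to the next layer. This is a simplification, since a real Wallace tree works column by column on individual bits, and the names `csa` and `wallace_sum` are made up for the example.

```python
def csa(x, y, z):
    """(3,2) counter applied bitwise: 3 words in, sum word and shifted carry word out."""
    return x ^ y ^ z, ((x & y) | (x & z) | (y & z)) << 1


def wallace_sum(words):
    """Reduce the words to two using layers of parallel CSAs, then add once."""
    words = list(words)
    levels = 0
    while len(words) > 2:
        grouped = len(words) - len(words) % 3
        next_words = []
        for i in range(0, grouped, 3):
            next_words.extend(csa(*words[i:i + 3]))   # all CSAs in a layer are independent
        next_words.extend(words[grouped:])            # leftovers pass through unchanged
        words = next_words
        levels += 1
    return sum(words), levels                         # sum() models the final CPA


print(wallace_sum([1, 2, 3, 4, 5, 6, 7, 8]))   # (36, 4): 8 words need 4 CSA levels
```

The number of layers grows roughly logarithmically with the number of partial products, compared with k-2 CSA stages in series for a linear array.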
52 Example Wallace Tree
53 Original Wallace Tree
54 Division
- Method of repeated subtraction is very slow because of borrow propagation
- Solution: Rapidly approximate 1/divisor in hardware, then multiply
55 Wheeler's Division Method
- Given positive normalized fraction x
- e.g., .10110001
- and crude approximation p to 1/x
- Set $a_1 = p x$
- $b_1 = p$
- and iterate
- $a_{n+1} = a_n (2 - a_n)$, $b_{n+1} = b_n (2 - a_n)$
- Converges quadratically
- $a_n \to 1$, $b_n \to 1/x$
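The iteration is easy to try numerically. The sketch below uses Python floats in place of fixed-point hardware and assumes p is a rough guess at 1/x, so that a starts near 1; the function name and the starting guess p = 1.5 are made up for this example. Because 1 - a(2 - a) = (1 - a)^2, the distance of a from 1 is squared on every step, which is the quadratic convergence the slide refers to.

```python
def wheeler_reciprocal(x, p, iterations=3):
    """Iterate a <- a(2 - a), b <- b(2 - a), starting from a = p*x, b = p."""
    a, b = p * x, p
    for _ in range(iterations):
        factor = 2 - a
        a, b = a * factor, b * factor   # b/a stays equal to 1/x throughout
    return a, b


x = 0.69140625                          # .10110001 in binary
a, b = wheeler_reciprocal(x, p=1.5)     # 1.5 is a crude guess at 1/x
print(a, b, 1 / x)                      # a is close to 1, b is close to 1/x
```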
56 Wheeler's Division Method
- Use logic to inspect 1st 6 digits of x
- Generate $b_n \approx 1/x$ in 3 iterations
- Used lots of hardware: up to 10% of 2nd generation computer
57 Summary
- Carry Lookahead Adder
- Manchester Carry Chain
- Magnitude Comparators
- Universal Arithmetic Logic Unit
- Binary Multipliers
- Four-bit Decomposition
- Parallel
- Booth's Method
- Wallace Tree
- Wheeler's Division Method