Title: Reconfigurable Computing
1. Reconfigurable Computing (EN2911X, Fall07) Lab 2 presentations
Prof. Sherief Reda, Division of Engineering, Brown University
http://ic.engin.brown.edu
2. Runtimes by different teams
- 4.2 seconds
- 14 seconds
- 33 seconds
- 300 seconds
- 305 seconds
- 320 seconds
3. Palindrome Checker
4. Part I Verilog Module
    always @(posedge CLOCK_50)
    begin
        not_palindrome = 1'd0; len = 0; tmp = number; //reset
        for (i = 0; i < 9; i = i + 4'd1)
        begin
            if (tmp > 0)
            begin
                modulo = tmp % 4'd10;
                tmp = tmp / 10;
                vector[len % 9] = modulo;
                len = len + 1;
            end
        end
        th = (len >> 1);
        for (j = 0; j < th; j = j + 4'd1)
        begin
            tmp2 = (len-1) - j; tmp3 = vector[j]; tmp4 = vector[tmp2];
            if ( tmp3 != tmp4 )
                not_palindrome = 1'b1;
DECOMPOSE THE NUMBER IN DIGITS
Room for loop unrolling here..
5. Part I Verilog Module
    always @(posedge CLOCK_50)
    begin
        not_palindrome = 1'd0; len = 0; tmp = number; //reset
        for (i = 0; i < 9; i = i + 4'd1)
        begin
            if (tmp > 0)
            begin
                modulo = tmp % 4'd10;
                tmp = tmp / 10;
                vector[len % 9] = modulo;
                len = len + 1;
            end
        end
        th = (len >> 1);
        for (j = 0; j < th; j = j + 4'd1)
        begin
            tmp2 = (len-1) - j; tmp3 = vector[j]; tmp4 = vector[tmp2];
            if ( tmp3 != tmp4 )
                not_palindrome = 1'b1;
COMPARE THE DIGITS STORED INTO THE VECTOR
loop unrolling, again..
6. Optimized Verilog Code
- Do loop unrolling to compare digits
    if (digits[0] != digits[3] ||
        digits[1] != digits[2])
        not_palindrome = 1'd1; //reset
7. Unsolved things
- Our running time now depends on the way that we extract digits from the number
- Some ideas to improve?
- Using a shift register
- Using non-blocking assignments
8. Palindrome Homework Summary
- ENGN2911X
- Aaron Mandle
- Bryant Mairs
9. Setup
- Two-cycle fixed length custom instruction
- Operates on 20 numbers at a time
- Returns total palindromes in that 20-number block
10. Process
- Combinational conversion from binary to BCD
- Check number of digits
- Compare digits based on length
- Total up number of valid palindromes
11. Binary to BCD Conversion
- Built using blocks of conditional add-3 modules and shifts
- Add-3 modules
- 4-bit input
- Adds 3 if input was 5 or greater
- Based on adding 6 to numbers > 9
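The add-3-and-shift scheme above can be modelled in software. The following is a minimal C sketch of that idea (often called shift-and-add-3 or double dabble), not the team's Verilog. The function name `bin2bcd_model` and the packed return format (4 bits per decimal digit, least significant digit in the low nibble) are assumptions made for illustration.

```c
#include <stdint.h>

/* Software model of a shift-and-add-3 binary-to-BCD converter.
 * Returns the decimal digits of n packed 4 bits per digit
 * (assumed format: digit 0 in bits 3..0, digit 1 in bits 7..4, ...). */
uint64_t bin2bcd_model(uint32_t n)
{
    uint64_t bcd = 0;

    for (int bit = 31; bit >= 0; bit--) {
        /* Conditional add-3: any digit that is 5 or greater gets +3
         * before the shift, which equals adding 6 after doubling. */
        for (int d = 0; d < 10; d++) {
            if (((bcd >> (4 * d)) & 0xF) >= 5)
                bcd += (uint64_t)3 << (4 * d);
        }
        /* Shift left by one and bring in the next input bit. */
        bcd = (bcd << 1) | ((n >> bit) & 1u);
    }
    return bcd;
}
```

Because each nibble holds one decimal digit, the result reads like the decimal number when printed in hexadecimal: `bin2bcd_model(12321)` yields `0x12321`. In hardware the two loops would be fully unrolled into a combinational network of add-3 blocks.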
12.
    module checkPalindrome(data, result);

        input [31:0] data;
        output [31:0] result;

        wire [3:0] digits [10:0];
        wire [3:0] digCount;

        bin2bcd(digits[9], digits[8], digits[7], digits[6], digits[5],
                digits[4], digits[3], digits[2], digits[1], digits[0], data);

        assign digCount = digits[9] != 0 ? 10 :
                          digits[8] != 0 ? 9 :
                          digits[7] != 0 ? 8 :
                          digits[6] != 0 ? 7 :
                          digits[5] != 0 ? 6 :
                          digits[4] != 0 ? 5 :
                          digits[3] != 0 ? 4 :
                          digits[2] != 0 ? 3 :
13. Yossi
14. For all solutions
- Finding the length of the decimal representation (# digits) by
    typedef unsigned long UINT;
    inline UINT GetMSDFIndx(UINT n)
    {
        return
            (n >= 100000000 ? 8 :
            (n >= 10000000 ? 7 :
            (n >= 1000000 ? 6 :
            (n >= 100000 ? 5 :
            (n >= 10000 ? 4 :
            (n >= 1000 ? 3 :
            (n >= 100 ? 2 :
            (n >= 10 ? 1 : 0))))))));
    }
15. Software Only Solutions
- Times
- On laptop (Intel 2333 MHz): 8 secs.
- On NIOS (100 MHz): 3500 secs.
- Inherently sequential
- Early false detection: quit the computation if we find two digits that do not match.
- ⇒ Brings down the expected number of divide operations to less than 2.2
16. Software Only Solutions
- Observations: 1. Detect whether the MSD is a given number without division
- MSD test: d is the MSD of number n of length L if and only if $d \cdot 10^{L-1} \le n < (d+1) \cdot 10^{L-1}$. E.g. $4 \cdot 10^3 \le 4765 < 5 \cdot 10^3$
- 2. Cut out the MSD: $4665 - 4 \cdot 10^3 = 665$ and continue.
- Algorithm: find one LSD after another, compare with MSDs, quit early if not a palindrome.
- Runs in 8 seconds on laptop
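The MSD test and the early exit can be sketched in C as follows. This is an illustrative reading of the algorithm on the slide, not the presenter's code. The names `POW10`, `decimal_length`, and `is_palindrome_msd_test` are hypothetical, and 64-bit intermediates are an assumption made to keep the products $(d+1) \cdot 10^{L-1}$ from overflowing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Powers of ten used by the MSD range test (hypothetical helper table). */
static const uint64_t POW10[10] = {
    1, 10, 100, 1000, 10000, 100000,
    1000000, 10000000, 100000000, 1000000000
};

/* Number of decimal digits of n, found by comparisons only. */
static int decimal_length(uint32_t n)
{
    int len = 1;
    while (len < 10 && n >= POW10[len])
        len++;
    return len;
}

/* Peel off one LSD at a time, check that the MSD equals it using the
 * range test d*10^(L-1) <= n < (d+1)*10^(L-1), and quit on a mismatch. */
bool is_palindrome_msd_test(uint32_t n)
{
    int len = decimal_length(n);
    uint64_t rest = n;

    while (len > 1) {
        uint64_t lsd = rest % 10;        /* least significant digit */
        uint64_t p = POW10[len - 1];     /* weight of the MSD */

        if (rest < lsd * p || rest >= (lsd + 1) * p)
            return false;                /* early false detection */

        rest -= lsd * p;                 /* cut out the MSD */
        rest /= 10;                      /* drop the LSD */
        len -= 2;
    }
    return true;
}
```

For example, `is_palindrome_msd_test(12321)` returns true, while `is_palindrome_msd_test(1021)` fails on the second pair of digits. Tracking `len` explicitly keeps inner zeros (as in 1001) from being mistaken for a shorter number.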
17. Software Only Solutions
- On NIOS, division is really expensive
- Division-free algorithm: Don't test the MSD, find it with binary search
18. Software Only Solutions
- On NIOS, division is really expensive
- Algorithm
- Start from left
- Find half of the digits
- Compute the palindrome whose left half matches these digits
- Compare to the tested number
- Lose the early false detection, but still better than division.
- Runs in 3500 secs on NIOS 100 MHz.
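One way to read the division-free algorithm is sketched below in C. It is a hedged illustration under assumed details: the leading digit is found by binary search over 0..9 using only multiplies and comparisons, and the mirrored palindrome is accumulated as the left half is discovered. All names (`POW10`, `decimal_length`, `leading_digit`, `is_palindrome_division_free`) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

static const uint64_t POW10[10] = {
    1, 10, 100, 1000, 10000, 100000,
    1000000, 10000000, 100000000, 1000000000
};

/* Number of decimal digits of n, found by comparisons only. */
static int decimal_length(uint32_t n)
{
    int len = 1;
    while (len < 10 && n >= POW10[len])
        len++;
    return len;
}

/* Largest d in 0..9 with d * p <= rest, found by binary search.
 * Assumes rest < 10 * p. The halving is a shift, not a division. */
static uint64_t leading_digit(uint64_t rest, uint64_t p)
{
    uint64_t lo = 0, hi = 9;
    while (lo < hi) {
        uint64_t mid = (lo + hi + 1) >> 1;
        if (mid * p <= rest)
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}

/* Find the left half of the digits, build the palindrome that has this
 * left half, and compare it with the tested number. */
bool is_palindrome_division_free(uint32_t n)
{
    int len = decimal_length(n);
    int half = (len + 1) >> 1;           /* includes the middle digit */
    uint64_t rest = n;
    uint64_t candidate = 0;

    for (int k = 0; k < half; k++) {
        int hi_pos = len - 1 - k;        /* position counted from the right */
        uint64_t d = leading_digit(rest, POW10[hi_pos]);

        rest -= d * POW10[hi_pos];
        candidate += d * POW10[hi_pos];
        if (hi_pos != k)                 /* mirror, except the middle digit */
            candidate += d * POW10[k];
    }
    return candidate == n;
}
```

For 12345 the left half is 1, 2, 3, the candidate is 12321, and the comparison fails. The loop always runs over half the digits, which matches the slide's remark that early false detection is lost.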
19. Using the Hardware
- A general trick to divide by a constant without using division. Based on a trick I read in Hacker's Delight of how to divide by 3.
- Demonstrate on divide by 10
- Given: number $n < 2^{30}$
- Needed: $\lfloor n/10 \rfloor$
- Algorithm: Multiply n by $(2^{31}+2)/10$ = 0xCCCCCCD, and then shift right 31 positions.
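20. Division Free divide by 10
- Algorithm: Multiply $n < 2^{30}$ by $(2^{31}+2)/10$ = 0xCCCCCCD, and then shift right 31 positions.
- Proof: The above algorithm outputs
$$\left\lfloor n \cdot \frac{2^{31}+2}{10} \cdot \frac{1}{2^{31}} \right\rfloor = \left\lfloor \frac{n}{10} + \frac{2n}{10 \cdot 2^{31}} \right\rfloor = \left\lfloor \frac{n}{10} \right\rfloor$$
The last step holds because $n < 2^{30}$ implies $2n < 2^{31}$, so $\frac{2n}{10 \cdot 2^{31}} < \frac{1}{10}$, while $\lfloor n/10 \rfloor \le n/10 \le \lfloor n/10 \rfloor + 9/10$.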
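The multiply-and-shift step can be written directly in C. This is a small sketch of the technique described above. The function name `div10` and the use of a 64-bit intermediate product are assumptions, and the precondition $n < 2^{30}$ from the slide is checked with an assertion.

```c
#include <assert.h>
#include <stdint.h>

/* floor(n / 10) without a divide instruction, valid for n < 2^30:
 * multiply by (2^31 + 2) / 10 = 0xCCCCCCD, then shift right by 31. */
uint32_t div10(uint32_t n)
{
    assert(n < (1u << 30));              /* range required by the proof */
    return (uint32_t)(((uint64_t)n * 0xCCCCCCDu) >> 31);
}
```

For instance, `div10(4765)` gives 476, and the remainder can then be recovered as `4765 - 10 * 476` with one more multiply and a subtract.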
21. Divide by Constant
- Similarly, to divide n by a constant C, we need to find P and R such that
- $2^P + R \equiv 0 \pmod{C}$.
- $R \cdot n < 2^P$
- And then multiply n by $(2^P + R)/C$, and shift right P positions.
- Found the constants for all powers of 10 needed. Algorithm worst register-to-register delay: 25 ns.
- Run time: 33 secs.
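The search for P and R can be automated. The sketch below is one possible way to do it, not the presenter's procedure. `DivConst`, `find_div_const`, and `div_by_const` are hypothetical names, the search is capped at P < 34 so that the product fits in 64 bits for n < 2^30, and the one-time search itself is allowed to use ordinary division.

```c
#include <stdbool.h>
#include <stdint.h>

/* Multiplier and shift that replace a division by a constant C. */
typedef struct {
    uint64_t multiplier;   /* (2^P + R) / C */
    unsigned shift;        /* P */
} DivConst;

/* Find the smallest P < 34 such that R = (-2^P) mod C satisfies
 * R * n_max < 2^P. Assumes c >= 1 and n_max < 2^30.
 * Returns false if no such P exists in the searched range. */
bool find_div_const(uint32_t c, uint32_t n_max, DivConst *out)
{
    for (unsigned p = 1; p < 34; p++) {
        uint64_t two_p = 1ull << p;
        uint64_t r = (c - two_p % c) % c;   /* makes 2^P + R divisible by C */

        if (r * n_max < two_p) {
            out->multiplier = (two_p + r) / c;
            out->shift = p;
            return true;
        }
    }
    return false;
}

/* floor(n / C) using only a multiply and a shift; valid for n <= n_max. */
uint32_t div_by_const(uint32_t n, DivConst k)
{
    return (uint32_t)(((uint64_t)n * k.multiplier) >> k.shift);
}
```

With C = 10 and n_max = 2^30 - 1, this search returns P = 31 and the multiplier 0xCCCCCCD, matching the divide-by-10 constants given earlier. Larger constants may need a wider product than this sketch allows, in which case `find_div_const` reports failure instead of returning unsafe values.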
22. EN2911X Lab 2 Palindromes
- Brian Reggiannini and Chris Erway
23. Checking a palindrome
- All combinational logic!
- Step 1: Convert 30-bit integer to 37-bit binary-coded decimal (BCD) format
- Step 2: Detect the length of decimal number
- Step 3: Compare pairs of digits with XOR
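Steps 2 and 3 can be modelled in C on a packed BCD word. The sketch below is a software illustration only, not the team's hardware description. It assumes the BCD value arrives with 4 bits per digit and the least significant digit in the low nibble, and the names `bcd_digit`, `bcd_length`, and `bcd_is_palindrome` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Digit i of a packed BCD word (assumed layout: digit 0 in bits 3..0). */
static unsigned bcd_digit(uint64_t bcd, int i)
{
    return (unsigned)((bcd >> (4 * i)) & 0xF);
}

/* Step 2: the length is the position of the highest non-zero digit
 * plus one; zero itself counts as one digit. */
int bcd_length(uint64_t bcd)
{
    for (int i = 9; i > 0; i--) {
        if (bcd_digit(bcd, i) != 0)
            return i + 1;
    }
    return 1;
}

/* Step 3: XOR each mirrored pair of digits and OR the results;
 * the number is a palindrome when no pair differs. */
bool bcd_is_palindrome(uint64_t bcd)
{
    int len = bcd_length(bcd);
    unsigned diff = 0;

    for (int i = 0; i < len / 2; i++)
        diff |= bcd_digit(bcd, i) ^ bcd_digit(bcd, len - 1 - i);
    return diff == 0;
}
```

Since packed BCD prints as its decimal value in hexadecimal, `bcd_is_palindrome(0x12321)` checks the number 12321. In combinational logic the loop would become a fixed set of XOR gates selected by the detected length.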
24. Binary to BCD converter
25. Binary to BCD converter
26. Integration with Nios II
- Worst-case propagation delay: 43 ns, 5 cycles
- Don't want to wait! Use 32-bit PIO interface
- Array of 25 palindrome-checking units
- Write out 32-bit start value
- Read back the total number of palindromes found (from next 25)
- While Nios is waiting: increment loop counter
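The write-then-read protocol can be illustrated with a host-side C model. This is a sketch under stated assumptions, not the actual Nios II program: the PIO is simulated by the hypothetical functions `pio_write_start` and `pio_read_count`, the 25 hardware units are replaced by a plain software check, and the range length is assumed to be a multiple of 25.

```c
#include <stdbool.h>
#include <stdint.h>

#define UNITS 25u                      /* palindrome checkers in the array */

static uint32_t pio_start_value;       /* simulated PIO output register */

/* Software stand-in for one hardware checker. */
static bool sw_is_palindrome(uint32_t n)
{
    uint64_t rev = 0;
    for (uint32_t t = n; t != 0; t /= 10)
        rev = rev * 10 + t % 10;
    return rev == n;
}

/* Simulated PIO write: latch the start value of the next block. */
static void pio_write_start(uint32_t start)
{
    pio_start_value = start;
}

/* Simulated PIO read: palindromes among start .. start + 24. */
static uint32_t pio_read_count(void)
{
    uint32_t count = 0;
    for (uint32_t i = 0; i < UNITS; i++)
        count += sw_is_palindrome(pio_start_value + i) ? 1u : 0u;
    return count;
}

/* Count palindromes in [first, limit); (limit - first) is assumed to be
 * a multiple of UNITS. On the real system the loop bookkeeping overlaps
 * the hardware's propagation delay between the write and the read. */
uint32_t count_palindromes(uint32_t first, uint32_t limit)
{
    uint32_t total = 0;
    for (uint32_t start = first; start < limit; start += UNITS) {
        pio_write_start(start);
        total += pio_read_count();
    }
    return total;
}
```

As a sanity check of the model, `count_palindromes(0, 100)` returns 19: the ten single-digit numbers plus 11, 22, ..., 99.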
27. Nios Software
28. Results
- Original C program: 49.59 s/billion
- Unoptimized Nios C program: 7842 s/100 million
- Final result: 4.2 s/billion (420000036 cycles @ 100 MHz)
- Total logic elements: 23,039 / 33,216 (69%)