Title: Kris Gaj
1Kris Gaj
- Research and teaching interests
- cryptography
- computer arithmetic
- VLSI design and testing
- Contact
- Science Technology II, room 223
- kgaj_at_gmu.edu
- (703) 993-1575
Office hours Monday, 600-700 PM
Tuesday, Thursday, 730-830 PM,
and by appointment
2ECE 645
Part of
MS in CpE
Digital Systems Design pre-approved
course Other concentration areas elective
course
MS in EE
Certificate in VLSI Design/Manufacturing
PhD in ECE
PhD in IT
3Spring 2007 Enrollment as of January 22, 2006
PhD in CS 1
Non-Degree 4
MS in CpE 7
MS in EE 4
4Recommended degree concentration
My general area of interest is
I want to specialize primarily in
CAD tools Design Automation Hardware
description languages FPGAs Reconfigurable
computing Computer arithmetic Front-end
ASIC design (algorithmic downto gate
level) Back-end ASIC design (transistor downto
device level) Analog Mixed Circuit
Design VLSI Fabrication Micro- and
Nanoelectronics Semiconductor Devices
MS CpE Digital Systems Design
VLSI Digital Systems Design ASICs
FPGAs VHDL/Verilog CAD Tools Reconfigurable
Computing Microelectronics VLSI
Fabrication Nanoelectronics
MS EE Microelectronics
5Computer Arithmetic
VLSI Test Concepts
VLSI Design Automation
Introduction to VHDL
algorithmic
ECE 645
register-transfer
ECE 545
ECE 682
ECE 681
CpE core
gate
ECE 586
ECE 587
transistor
Digital Integrated Circuits
ECE 699
ECE 745
AnalogIntegrated Circuits
layout
MixedSignals VLSI
ECE684
ECE 584
EE core
ECE 699
devices
Semiconductor Device Fundamentals
MOS Device Electronics
ULSIMicroelectronics
Nano- electronics
6- MS CpE DIGITAL SYSTEMS DESIGN
- Concentration advisors Kris Gaj, Ken Hintz,
David Hwang - 1. ECE 545 Introduction to VHDL K. Gaj, D.
Hwang, project, VHDL, Aldec/Synplicity/Xilinx and
Synopsys Design Analyzer/PrimeTime - 2. ECE 645 Computer Arithmetic HW and SW
Implementation K. Gaj, project, VHDL,
Aldec/Synplicity/Xilinx and Synopsys Design
Analyzer/PrimeTime - 3. ECE 681 VLSI Design Automation T. Storey,
project/lab, back-end design with Synopsys tools - 4. ECE 586 Digital Integrated Circuits D.
Ioannou
7Prerequisites
ECE 545 Introduction to VHDL
or
Permission of the instructor, granted assuming
that you know
VHDL or Verilog,
High level programming language (preferably C)
8Course web page
ECE web page ? Courses ? Course web pages ? ECE
645
http//teal.gmu.edu/courses/ECE645/index.htm
9Computer Arithmetic
Lecture
Project
Project 1 20 Project 2 30
Homework 10 Midterm exam 1 (in class)
20 Midterm exam 2 (take-home)
20
10Advanced digital circuit design course covering
Efficient
- addition and subtraction
- multiplication
- division and modular reduction
- exponentiation
- Elements
- of the Galois
- field GF(2n)
- polynomial base
Integers unsigned and signed
Real numbers
- fixed point
- single and double precision
- floating point
11Lecture topics (1)
INTRODUCTION
1. Applications of computer arithmetic algorithms
2. Number representation
- Unsigned Integers
- Signed Integers
- Fixed-point real numbers
- Floating-point real numbers
- Elements of the Galois Field GF(2n)
12ADDITION AND SUBTRACTION
1. Basic addition, subtraction, and counting 2.
Carry-lookahead, carry-select, and hybrid
adders 3. Adders based on Parallel Prefix
Networks
13MULTIOPERAND ADDITION
1. Carry-save adders 2. Wallace and Dadda
Trees 3. Adding multiple signed numbers
14MULTIPLICATION
1. Tree and array multipliers 2. Sequential
multipliers 3. Multiplication of signed numbers
and squaring
15DIVISION
- Basic restoring and non-restoring
- sequential dividers
- 2. SRT and high-radix dividers
- 3. Array dividers
16LONG INTEGER ARITHMETIC
- Modular Exponentiation
- 2. Multi-Precision Arithmetic in Software
17FLOATING POINT AND GALOIS FIELD ARITHMETIC
- Floating-point units
- 2. Galois Field GF(2n) units
18Similar courses at other universities
- University of California, Santa Barbara, Behrooz
Parhami, - ECE252B Computer Arithmetic.
- University of Massachusetts, Amherst, Israel
Koren, - ECE666 Digital Computer Arithmetic
- Lehigh University, Michael Schulte,
- ECE496 High-Speed Computer Arithmetic.
- Worcester Polytechnic Institute, Berk Sunar,
- EE-579 V Computer Arithmetic Circuits.
- Stanford University, Michael Flynn,
- EE486 Advanced Computer Arithmetic.
- University of California, Davies, Vojin
Oklobdzija, - ECE278 Computer Arithmetic for Digital
Implementation.
19New in this course
- real-life project based on VHDL or Verilog HDL
- operations in the Galois Field
- (with application in cryptography
- and communications)
20Possible topics for a Scholarly Paper or
Research Project for the CpE EE students
Advanced Computer Arithmetic
Square root Exponential and logarithmic
functions Trigonometric functions Hyperbolic
functions Fault-Tolerant Arithmetic Low-Power
Arithmetic High-Throughput Arithmetic
21Literature (1)
Required textbook Behrooz Parhami, Computer
Arithmetic Algorithms and Hardware Design,
Oxford University Press, 2000.
Recommended textbooks
Milos D. Ercegovac and Tomas Lang Digital
Arithmetic, Morgan Kaufmann Publishers,
2004. Isreal Koren, Computer Arithmetic
Algorithms, 2nd edition, A. K. Peters, Natick,
MA, 2002.
22Literature (2)
VHDL books (used in ECE 545 in Fall 2005)
1. Sundar Rajan, Essential VHDL RTL Synthesis
Done Right, S G Publishing, 1998. 2.
Volnei A. Pedroni, Circuit Design with VHDL,
The MIT Press, 2004.
23Literature (3)
Supplementary books
- E. E. Swartzlander, Jr., Computer Arithmetic,
- vols. I and II, IEEE Computer Society
Press, 1990. - 2. Alfred J. Menezes, Paul C. van Oorschot,
- and Scott A. Vanstone,
- Handbook of Applied Cryptology,
- Chapter 14, Efficient Implementation,
- CRC Press, Inc., 1998.
- 3. Christof Paar, Efficient VLSI Architectures
for Bit - Parallel Computation in Galois Fields,
- VDI Verlag, 1994.
24Literature (3)
Proceedings of conferences ARITH -
International Symposium on Computer Arithmetic
ASIL - Asilomar Conference on Signals, Systems,
and Computers ICCD - International Conference
on Computer Design CHES - Workshop on
Cryptographic Hardware and
Embedded Systems
Journals and periodicals IEEE Transactions on
Computers, in particular special issues on
computer arithmetic 8/70, 6/73, 7/77,
4/83, 8/90, 8/92, 8/94, 7/00, 3/05. IEEE
Transactions on Circuits and Systems IEEE
Transactions on Very Large Scale Integration
IEE Proceedings Computer and Digital Techniques
Journal of VLSI Signal Processing
25Homework
- reading assignments (main textbook articles)
- analysis of hardware and software algorithms
- and implementations
- design of small hardware units using VHDL or
Verilog
Optional assignments
Possibility of trading analysis vs. design
vs. coding
26Midterm exams
Exam 1 - 2 hrs 30 minutes, in class
multiple choice short problems Exam 2 48
hrs, take-home conceptual
questions, analysis and design of
arithmetic units using VHDL or
Verilog HDL
Practice exams on the web
Tentative days of exams
Exam 1 - Monday, March 26 Exam 2 -
Saturday-Sunday, May 5-6
27Project (1)
Project I (20 of grade)
Design and comparative analysis of fast adders
(several hundred bits long)
- Optimization criteria
- minimum latency
- maximum throughput
- minimum area
- minimum product latency area
- maximum ratio throughput/area
- scalability
Similar for all students
Done individually
Final report due Monday, March 19
28Project (2)
Project II (30 of grade)
Long unsigned or signed integers
- Fast
- multiplication
- squaring
- division
- modular reduction, or
- modular exponentiation
or
Floating-point numbers
- Fast
- addition or
- multiplication
29Project II (rules)
- Real life application
- Requirements derived from the analysis of the
application - Typically both hardware and software design
- Several project topics proposed on the web
- You can choose project topic by yourself
- Can be done in a group of 1-3 students
Written report oral presentation Monday, May 14
30Project II (rules)
- Every team works on a slightly different problem
- Project topics should be more complex for larger
teams
- Cooperation (but not exchange of codes)
- between teams is encouraged
31Project
Hardware
Software
High level language (C preferred)
VHDL (or Verilog) code
Latency and/or throughput
Execution time
Area
Memory requirements
Scalability
Scalability
32Degrees of freedom and possible trade-offs
speed
area
ECE 645
power
testability
ECE 682
ECE 586, 681
33Degrees of freedom and possible trade-offs
speed
latency
area
throughput
34Timing parameters
definition
units
pipelining
time point?point
ns
delay
ns
bad
latency
time input?output
throughput
Mbits/s
good
output bits/time unit
rising edge ?rising edge of clock
ns
good
clock period
1
MHz
clock frequency
good
clock period
35Project technologies
semi-custom Application Specific Integrated
Circuits
and Field Programmable Gate
Arrays
36Levels of design description
Algorithmic level
Level of description most suitable for synthesis
Register Transfer Level
Logic (gate) level
Circuit (transistor) level
Physical (layout) level
37Register Transfer Logic (RTL) Design Description
Registers
Combinational Logic
Combinational Logic
Clock
38CAD software available at GMU (1)
VHDL simulators
- Aldec Active-HDL (under Windows)
- available in the FPGA Lab, ST II, room 203
- student edition can be purchased on an
individual - basis (59.95 SH)
- ModelSim Xilinx Edition III (under Windows)
- available in the FPGA Lab, ST II, room 203
- limited version available for free for
individual use - at home as a part of Xilinx WebPACK
39CAD software available at GMU (2)
Tools used for logic synthesis
FPGA synthesis
- Synplicity Synplify Pro (under Windows)
- Xilinx XST (under Windows)
- available in the FPGA Lab, ST II, room 203
- available for free as a part of WebPACK
ASIC synthesis
- Synopsys Design Compiler and PrimeTime (under
Unix)
- available from all PCs in the ECE educational
labs - using an X-terminal emulator
- available remotely from home using a fast
Internet - connection
40CAD software available at GMU (3)
Tools used for implementation (mapping, placing
routing) in the FPGA technology
- Xilinx ISE (under Windows)
- available in the FPGA Lab, ST II, room 203
- Xilinx WebPACK (under Windows)
- limited version available for free for
individual use - at home as a part of Xilinx WebPACK
41How to learn VHDL for synthesis by yourself?
- Lecture slides for ECE 545 from Fall 2005
- Sundar Rajan, Essential VHDL RTL Synthesis Done
Right, - S G Publishing, 1998.
- Volnei A. Pedroni, Circuit Design with VHDL,
- The MIT Press, 2004.
- Individual or small-group hands-on sessions with
the TA - Practice, Practice, Practice!!!
42Testbench
Non-synthesizable
testbench
Synthesizable
design entity
. . . .
Architecture N
Architecture 2
Architecture 1
43Hardware Design Verification
Testbench
actual results
HDL Design (VHDL or Verilog)
?
Representative Inputs
Reference Model (C or MAGMA )
expected results
44Primary applications (1)
Execution units of general purpose microprocessors
Integer units
Floating point units
Integers (8, 16, 32, 64 bits)
Real numbers (32, 64 bits)
45Primary applications (2)
Digital signal and digital image processing
e.g., digital filters Discrete
Fourier Transform Discrete Hilbert
Transform
General purpose DSP processors
Specialized circuits
Real or complex numbers (fixed-point or floating
point)
46Primary applications (3)
Coding
Error detection codes Error correcting codes
Elements of the Galois fields GF(2n)
(4-64 bits)
47Secret-key (Symmetric) Cryptosystems
key of Alice and Bob - KAB
key of Alice and Bob - KAB
Network
Decryption
Encryption
Bob
Alice
48Primary applications (4)
Cryptography
Secret key cryptography
IDEA, RC6, Mars
Twofish, Rijndael
Elements of the Galois field GF(2n)
(4, 8 bits)
Integers (16, 32 bits)
49Main operations
Auxiliary operations
2 x SQR32, 2 x ROL32
XOR, ADD/SUB32
RC6
MARS
XOR, ADD/SUB32
MUL32, 2 x ROL32, S-box 9x32
XOR ADD32
Twofish
96 S-box 4x4, 24 MUL GF(28)
Rijndael
16 S-box 8x8 24 MUL GF(28)
XOR
8 x 32 S-box 4x4
Serpent
XOR
50Public Key (Asymmetric) Cryptosystems
Private key of Bob - kB
Public key of Bob - KB
Network
Decryption
Encryption
Bob
Alice
51RSA as a trap-door one-way function
PUBLIC KEY
C f(M) Me mod N
M
C
M f-1(C) Cd mod N
PRIVATE KEY
N P ? Q
P, Q - large prime numbers
e ? d ? 1 mod ((P-1)(Q-1))
52RSA keys
PUBLIC KEY
PRIVATE KEY
e, N
d, P, Q
N P ? Q
P, Q - large prime numbers
e ? d ? 1 mod ((P-1)(Q-1))
53Primary applications (5)
Cryptography
Public key cryptography
RSA, DSA, Diffie-Hellman
Elliptic Curve Cryptosystems
Long integers (1000-16,000 bits)
Elements of the Galois field GF(2n)
(150-500 bits)
54Primary applications (5)
Cipher Breaking
Public key cryptography
RSA PUBLIC KEY
RSA PRIVATE KEY
e, N
d, P, Q
N P ? Q
P, Q
e ? d ? 1 mod ((P-1)(Q-1))
55Estimation of RSA Security Inc. regarding the
number and memory of PCs necessary to break
RSA-1024
Attack time 1 year Single machine PC, 500
MHz, 170 GB RAM Number of machines 342,000,000
56Factoring 1024-bit RSA keysusing Number Field
Sieve (NFS)
Polynomial Selection
Relation
Collection
Norm Factoring/Cofactoring
200 bit
350 bit
ECM p-1 methodPollard rho
Ramakrishna Bachimanchi
Sieving
numbers
Sashisu Bajracharya
Linear Algebra
Square Root
57Comparison among technologies
Microprocessors
ASICs
FPGAs
SRC
COPACOBANA
58FPGAs vs. Microprocessors
Spartan3s5000
Virtex2v6000
Pentium4 2.8GHz
10.8x
11.3x
869
857
7.9x
8.4x
637
635
10.8x
435
7.8x
315
80
76
40
ECM
p-1
rho
59Rho in an ASIC 130 nm
Global Memory
Local Memory
60ASIC 130 nm vs. Virtex II 6000 rho (24 units)
19.80 mm
51x
Area of Virtex II 6000 (estimation by R.J. Lim
Fong, MS Thesis, VPI, 2004)
19.68 mm
2.7 mm
2.82 mm
Area of an ASIC with equivalent functionality
61Number of computations per second using the same
chip area
130 nm ASIC library
101x
Virtex2v6000 FPGA
88,405
50x
21,739
869
435
rho
ECM
62Cofactorization Unit
interesting Computer Arithmetic project
63Famous computer arithmetic bugs and flaws
64Learn to deal with approximations
- In digital arithmetic one has to come to grips
with approximation and questions like - When is approximation good enough
- What margin of error is acceptable
- Be aware of the applications you are designing
the arithmetic circuit or program for - Analyze the implications of your approximation
65Calculators
u
v 21/1024 1.000 677 131
1.000 677 131
10 times
y (((v2)2))2 1.999 999 983
x (((u2)2))2 1.999 999 963
10 times
10 times
x u1024 1.999 999 973
y v1024 1.999 999 994
Hidden digits in the internal representation of
numbers Different algorithms give slightly
different results Very good accuracy
66Consequences of bad approximations
Example Failure of Patriot Missile (1991 Feb.
25) Source http//www.math.psu.edu/dna/455.f96/dis
asters.html American Patriot Missile battery in
Dharan, Saudi Arabia, failed to intercept
incoming Iraqi Scud missile The Scud struck an
American Army barracks, killing 28 Cause, per
GAO/IMTEC-92-26 report software problem
(inaccurate calculation of the time since boot)
Specifics of the problem time in tenths of
second as measured by the systems internal clock
was multiplied by 1/10 to get the time in
seconds. Internal registers were 24 bits wide
1/10 0.0001 1001 1001 1001 1001 100 (chopped to
24 b) Error _at_ 0.1100 1100 223 _at_ 9.5 108
Error in 100-hr operation period _at_ 9.5 108
100 60 60 10 0.34 s Distance traveled by
Scud (0.34 s) (1676 m/s) _at_ 570 m This put the
Scud outside the Patriots range gate
Ironically, the fact that the bad time
calculation had been improved in some (but not
all) code parts contributed to the problem, since
it meant that inaccuracies did not cancel out
67Consequences of bad approximations
Example Explosion of Ariane Rocket (1996 June
4) Source http//www.math.psu.edu/dna/455.f96/disa
sters.html Unmanned Ariane 5 rocket launched by
the European Space Agency veered off its flight
path, broke up, and exploded only 30 seconds
after lift-off (altitude of 3700 m) The 500
million rocket (with cargo) was on its 1st voyage
after a decade of development costing 7 billion
Cause software error in the inertial reference
system Specifics of the problem a 64 bit
floating point number relating to the horizontal
velocity of the rocket was being converted to a
16 bit signed integer An SRI software exception
arose during conversion because the 64-bit
floating point number had a value greater than
what could be represented by a 16-bit signed
integer (max 32 767)
68Pentium bug (1)
October 1994
Thomas Nicely, Lynchburg Collage, Virginia finds
an error in his computer calculations, and
traces it back to the Pentium processor
November 7, 1994
First press announcement, Electronic Engineering
Times
Late 1994
Tim Coe, Vitesse Semiconductor presents an
example with the worst-case error
c 4 195 835/3 145 727
Pentium 1.333 739 06... Correct result
1.333 820 44...
69Pentium bug (2)
Intel admits subtle flaw
November 30, 1994
Intels white paper about the bug and its
possible consequences
Intel - average spreadsheet user affected once in
27,000 years IBM - average spreadsheet user
affected once every 24 days
Replacements based on customer needs
December 20, 1994
Announcement of no-question-asked replacements
70Pentium bug (3)
Error traced back to the look-up table used
by the radix-4 SRT division algorithm
2048 cells, 1066 non-zero values -2, -1, 1, 2
5 non-zero values not downloaded
correctly to the lookup table due to an error in
the C script
71(No Transcript)