Kris Gaj

About This Presentation

Title:

Kris Gaj

Description:

Research and teaching interests: cryptography, computer arithmetic, VLSI design and testing. Contact: Science & Technology II, room 223; kgaj_at_gmu.edu, kgaj01_at_yahoo.com

Slides: 69

Provided by: tealGmuEd

Transcript and Presenter's Notes

1
Kris Gaj

Research and teaching interests
cryptography
computer arithmetic
VLSI design and testing
Contact
Science & Technology II, room 223
kgaj_at_gmu.edu, kgaj01_at_yahoo.com,
(703) 993-1575

Office hours: Monday, 7:30-8:30 PM; Thursday, 7:30-8:30 PM
2
ECE 645
Part of
MS in CpE
Digital Systems Design: required course
Other concentration areas: elective course
MS in EE
Certificate in VLSI Design/Manufacturing
PhD in ECE
PhD in IT
3
Spring 2006 Enrollment as of January 23, 2006
BS in CpE 1
NDG 1
PhD in IT 1
PhD in ECE 1
MS in CpE 7
MS in ISA 1
MS in EE 6
4

DIGITAL SYSTEMS DESIGN
Concentration advisor: Ken Hintz
1. ECE 545 Introduction to VHDL - K. Gaj, K. Hintz; project, VHDL, Aldec/Synplicity/Xilinx and ModelSim/Synopsys
2. ECE 645 Computer Arithmetic: HW and SW Implementation - K. Gaj; project, VHDL, Aldec/Synplicity/Xilinx and ModelSim/Synopsys
3. ECE 586 Digital Integrated Circuits - D. Ioannou
4. ECE 681 VLSI Design Automation - T. Storey; project/lab, back-end design with Synopsys tools

5
Courses and design levels
Design levels: algorithmic, register-transfer, gate, transistor, layout, devices
Courses:
ECE 645 Computer Arithmetic
ECE 545 Introduction to VHDL
ECE 681 VLSI Design Automation
ECE 586 Digital Integrated Circuits
ECE 584 Semiconductor Device Fundamentals
ECE 684 MOS Device Electronics
6
Prerequisites
ECE 545 Introduction to VHDL
or
Permission of the instructor, granted assuming
that you know
VHDL or Verilog,
High level programming language (preferably C)
7
Course web page
ECE web page → Courses → Course web pages → ECE 645
http://teal.gmu.edu/courses/ECE645/index.htm
8
Computer Arithmetic
Lecture
Project
Project 1: 20%
Project 2: 30%
Homework: 15%
Midterm exam 1 (in class): 20%
Midterm exam 2 (take-home): 15%
9
Advanced digital circuit design course covering
Efficient
addition and subtraction
multiplication
division and modular reduction
exponentiation
of
Integers: unsigned and signed
Real numbers: fixed point; single and double precision floating point
Elements of the Galois field GF(2^n): polynomial base

10
Lecture topics (1)
INTRODUCTION
1. Applications of computer arithmetic algorithms
2. Number representation

Unsigned Integers
Signed Integers
Fixed-point real numbers
Floating-point real numbers
Elements of the Galois Field GF(2^n)

11
ADDITION AND SUBTRACTION
1. Basic addition, subtraction, and counting
2. Carry-lookahead, carry-select, and hybrid adders
3. Adders based on Parallel Prefix Networks
12
MULTIOPERAND ADDITION
1. Carry-save adders
2. Wallace and Dadda Trees
3. Adding multiple signed numbers
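
To make the carry-save idea from this topic list concrete, here is a minimal software model, assuming non-negative Python integers stand in for bit vectors. The function names `carry_save_add` and `multioperand_add` are illustrative only and do not come from the course material. Each 3:2 step reduces three operands to a sum word and a carry word without propagating carries; a single ordinary addition is needed only at the very end.

```python
def carry_save_add(x, y, z):
    """One 3:2 compression step: x + y + z == s + c, with no carry propagation."""
    s = x ^ y ^ z                                # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1       # majority bits, shifted into place
    return s, c


def multioperand_add(operands):
    """Reduce a list of non-negative integers with 3:2 steps, then add once."""
    ops = list(operands)
    while len(ops) > 2:
        x, y, z = ops.pop(), ops.pop(), ops.pop()
        ops.extend(carry_save_add(x, y, z))
    return sum(ops)                              # the final carry-propagate addition
```

In hardware the 3:2 steps would be arranged as a tree (for example Wallace or Dadda) rather than the sequential loop used in this sketch.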
13
MULTIPLICATION
1. Tree and array multipliers
2. Sequential multipliers
3. Multiplication of signed numbers and squaring
14
DIVISION

1. Basic restoring and non-restoring sequential dividers
2. SRT and high-radix dividers
3. Array dividers
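
As a rough illustration of the first item, the following sketch models a restoring sequential divider for unsigned operands, one quotient bit per iteration. The function name `restoring_divide` and the `n_bits` parameter are assumptions made for this example; a hardware version would use a shift register and a subtractor instead of Python integers.

```python
def restoring_divide(dividend, divisor, n_bits):
    """Unsigned restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    if not 0 <= dividend < (1 << n_bits):
        raise ValueError("dividend does not fit in n_bits")
    remainder = 0
    quotient = 0
    for i in range(n_bits - 1, -1, -1):
        # Shift the partial remainder left and bring in the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        trial = remainder - divisor
        if trial >= 0:
            remainder = trial          # subtraction succeeded: quotient bit is 1
            quotient |= 1 << i
        # Otherwise the old remainder is kept ("restored"): quotient bit is 0.
    return quotient, remainder
```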

15
FLOATING POINT AND GALOIS FIELD ARITHMETIC

1. Floating-point units
2. Galois Field GF(2^n) units

16
Similar courses at other universities

University of California, Santa Barbara, Behrooz Parhami, ECE252B Computer Arithmetic.
University of Massachusetts, Amherst, Israel Koren, ECE666 Digital Computer Arithmetic.
Lehigh University, Michael Schulte, ECE496 High-Speed Computer Arithmetic.
Worcester Polytechnic Institute, Berk Sunar, EE-579 V Computer Arithmetic Circuits.
Stanford University, Michael Flynn, EE486 Advanced Computer Arithmetic.
University of California, Davis, Vojin Oklobdzija, ECE278 Computer Arithmetic for Digital Implementation.

17
New in this course

real-life project based on VHDL or Verilog HDL
operations in the Galois Field
(with the application in cryptography
and communications)

18
Possible topics for a Scholarly Paper or Research Project for the CpE and EE students
Advanced Computer Arithmetic
Square root
Exponential and logarithmic functions
Trigonometric functions
Hyperbolic functions
Fault-Tolerant Arithmetic
Low-Power Arithmetic
High-Throughput Arithmetic
19
Three Curriculum Options
All options: 2 core courses + 4 required courses
MS Thesis Option: 2 elective courses + ECE 799 Master's Thesis (6 cr. hrs)
Research Project Option: 3 elective courses + ECE 798 Research Project + Scholarly paper
Scholarly Paper Option: 4 elective courses + Scholarly paper
20
Literature (1)
Required textbook:
Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Designs, Oxford University Press, 2000.
Recommended textbooks:
Milos D. Ercegovac and Tomas Lang, Digital Arithmetic, Morgan Kaufmann Publishers, 2004.
Israel Koren, Computer Arithmetic Algorithms, 2nd edition, A. K. Peters, Natick, MA, 2002.
21
Literature (2)
VHDL books (used in ECE 545 in Fall 2005)
1. Sundar Rajan, Essential VHDL: RTL Synthesis Done Right, S & G Publishing, 1998.
2. Volnei A. Pedroni, Circuit Design with VHDL, The MIT Press, 2004.
22
Literature (3)
Supplementary books

1. E. E. Swartzlander, Jr., Computer Arithmetic, vols. I and II, IEEE Computer Society Press, 1990.
2. Alfred J. Menezes, Paul C. van Oorschot, and Scott A. Vanstone, Handbook of Applied Cryptography, Chapter 14, Efficient Implementation, CRC Press, Inc., 1998.
3. Christof Paar, Efficient VLSI Architectures for Bit Parallel Computation in Galois Fields, VDI Verlag, 1994.

23
Literature (4)
Proceedings of conferences:
ARITH - International Symposium on Computer Arithmetic
ASIL - Asilomar Conference on Signals, Systems, and Computers
ICCD - International Conference on Computer Design
CHES - Workshop on Cryptographic Hardware and Embedded Systems
Journals and periodicals:
IEEE Transactions on Computers, in particular special issues on computer arithmetic: 8/70, 6/73, 7/77, 4/83, 8/90, 8/92, 8/94
IEEE Transactions on Circuits and Systems
IEEE Transactions on Very Large Scale Integration
IEE Proceedings: Computer and Digital Techniques
Journal of VLSI Signal Processing
24
Homework

reading assignments (main textbook, articles)
analysis of hardware and software algorithms
and implementations
design of small hardware units using VHDL or
Verilog

Optional assignments
Possibility of trading analysis vs. design
vs. coding
25
Midterm exams
Exam 1 - 2 hrs 30 minutes, in class: multiple choice + short problems
Exam 2 - 48 hrs, take-home: analysis and design of arithmetic units using VHDL or Verilog HDL
Practice exams on the web
Tentative days of exams:
Exam 1 - Monday, March 27
Exam 2 - Saturday-Sunday, May 6-7
26
Project (1)
Project I (20% of grade)
Design and comparative analysis of fast adders
(several hundred bits long)

Optimization criteria
minimum latency
maximum throughput
minimum area
minimum product latency × area
maximum ratio throughput/area
scalability

Similar for all students
Done individually
Final report due Monday, March 20
27
Project (2)
Project II (30% of grade)
Long unsigned or signed integers

Fast
multiplication
squaring
division
modular reduction, or
modular exponentiation

or
Floating-point numbers

Fast
addition or
multiplication

28
Project II (rules)

Real life application
Requirements derived from the analysis of the
application
Typically both hardware and software design
Several project topics proposed on the web
You can choose project topic by yourself
Can be done in a group of 1-3 students

Written report + oral presentation: Monday, May 15
29
Project II (rules)

Every team works on a slightly different problem

Project topics should be more complex for larger
teams

Cooperation (but not exchange of code)
between teams is encouraged

30
Project
Hardware: VHDL (or Verilog) code; latency and/or throughput; area; scalability
Software: high level language (C preferred); execution time; memory requirements; scalability
31
Degrees of freedom and possible trade-offs
[Figure: trade-offs among speed, area, power, and testability, annotated with the courses ECE 645, ECE 682, and ECE 586, 681]
32
Degrees of freedom and possible trade-offs
speed: latency, throughput
area
33
Timing parameters
Columns: parameter; definition; units; effect of pipelining
delay: time point→point; ns
latency: time input→output; ns; bad
throughput: output bits/time unit; Mbits/s; good
clock period: rising edge→rising edge of clock; ns; good
clock frequency: 1/clock period; MHz; good
34
Project technologies
semi-custom Application Specific Integrated Circuits
and Field Programmable Gate Arrays
35
Levels of design description
Algorithmic level
Register Transfer Level (level of description most suitable for synthesis)
Logic (gate) level
Circuit (transistor) level
Physical (layout) level
36
Register Transfer Logic (RTL) Design Description
[Figure: block diagram with Registers, Combinational Logic blocks, and Clock]
37
RTL Block Synthesis
Estimated Area
Estimated Timing
Simplified design flow
38
VHDL Design Styles
behavioral (algorithmic)
structural
Components and interconnects
Concurrent statements
Sequential statements

Registers
State machines
Test benches

Subset most suitable for use in this course
39
CAD software available at GMU (1)
VHDL simulators

Aldec Active-HDL (under Windows)

available in the FPGA Lab, ST II, room 203

student edition can be purchased on an individual basis ($59.95 + S&H)

ModelSim (under Unix)

available from all PCs in the ECE educational labs using an X-terminal emulator
available remotely from home using a fast Internet connection

40
CAD software available at GMU (2)
Tools used for logic synthesis
FPGA synthesis

Synplicity Synplify Pro (under Windows)

Xilinx XST (under Windows)

available in the FPGA Lab, ST II, room 203

ASIC synthesis

Synopsys Design Compiler (under Unix)

available from all PCs in the ECE educational labs using an X-terminal emulator
available remotely from home using a fast Internet connection

41
CAD software available at GMU (3)
Tools used for implementation (mapping, placing & routing) in the FPGA technology

Xilinx ISE (under Windows)

available in the FPGA Lab, ST II, room 203

42
How to learn VHDL for synthesis by yourself?

Lecture slides for ECE 545 from Fall 2005
Sundar Rajan, Essential VHDL: RTL Synthesis Done Right, S & G Publishing, 1998.
Volnei A. Pedroni, Circuit Design with VHDL, The MIT Press, 2004.
Individual or small-group hands-on sessions with
the TA
Practice, Practice, Practice!!!

43
Testbench
[Figure: a non-synthesizable testbench and a synthesizable design entity with Architecture 1, Architecture 2, ..., Architecture N]
44
Design Environment
[Figure: Test Vectors (Inputs); HDL Design (VHDL or Verilog); Reference Model (C); Actual Results vs. Expected Results Comparison]
45
Primary applications (1)
Execution units of general purpose microprocessors
Integer units
Floating point units
Integers (8, 16, 32, 64 bits)
Real numbers (32, 64 bits)
46
Primary applications (2)
Digital signal and digital image processing
e.g., digital filters, Discrete Fourier Transform, Discrete Hilbert Transform
General purpose DSP processors
Specialized circuits
Real numbers (fixed-point or floating point)
47
Primary applications (3)
Coding
Error detection codes
Error correcting codes
Elements of the Galois fields GF(2^n) (4-64 bits)
48
Secret-key (Symmetric) Cryptosystems
[Figure: Alice and Bob communicate over a network; encryption and decryption both use the shared key of Alice and Bob, K_AB]
49
Primary applications (4)
Cryptography
Secret key cryptography: IDEA, RC6, MARS, Twofish, Rijndael
Elements of the Galois field GF(2^n) (4, 8 bits)
Integers (16, 32 bits)
50
Cipher: main operations; auxiliary operations
RC6: 2 x SQR32, 2 x ROL32; XOR, ADD/SUB32
MARS: MUL32, 2 x ROL32, S-box 9x32; XOR, ADD/SUB32
Twofish: 96 S-box 4x4, 24 MUL GF(2^8); XOR, ADD32
Rijndael: 16 S-box 8x8, 24 MUL GF(2^8); XOR
Serpent: 8 x 32 S-box 4x4; XOR
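
The GF(2^8) multiplications listed for Twofish and Rijndael can be sketched in software as shift-and-XOR with polynomial reduction. The sketch below assumes the polynomial-basis representation and, as its default, the Rijndael reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B); Twofish uses different polynomials, so the `poly` argument would have to change there. The name `gf256_mul` is illustrative.

```python
def gf256_mul(a, b, poly=0x11B):
    """Multiply two elements of GF(2^8) in polynomial basis.

    poly is the degree-8 reduction polynomial; 0x11B is assumed here
    because it is the one used by Rijndael.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a        # addition in GF(2^n) is XOR
        a <<= 1                # multiply a by x
        if a & 0x100:
            a ^= poly          # reduce modulo the field polynomial
        b >>= 1
    return result
```

The two test values below are the worked examples from the AES specification (FIPS-197): {57}·{83} = {c1} and {57}·{13} = {fe}.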
51
Public Key (Asymmetric) Cryptosystems
[Figure: Alice and Bob communicate over a network; encryption uses the public key of Bob, K_B, and decryption uses the private key of Bob, k_B]
52
RSA as a trap-door one-way function
PUBLIC KEY: $C = f(M) = M^e \bmod N$
PRIVATE KEY: $M = f^{-1}(C) = C^d \bmod N$
$N = P \cdot Q$
P, Q - large prime numbers
$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$
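
A toy numeric instance may help to check the relations above. The values P = 61, Q = 53, and e = 17 are a small textbook example chosen for this sketch, not numbers from the slides, and are far too small for real use. The sketch relies on Python 3.8 or later for the modular inverse via `pow(e, -1, phi)`.

```python
# Toy RSA parameters (assumed for illustration only; real keys use large primes).
P, Q = 61, 53
N = P * Q                      # public modulus
phi = (P - 1) * (Q - 1)
e = 17                         # public exponent, coprime to phi
d = pow(e, -1, phi)            # private exponent: e * d = 1 mod (P-1)(Q-1)


def encrypt(m):
    """C = M^e mod N, using the public key (e, N)."""
    return pow(m, e, N)


def decrypt(c):
    """M = C^d mod N, using the private exponent d."""
    return pow(c, d, N)
```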
53
RSA keys
PUBLIC KEY: e, N
PRIVATE KEY: d, P, Q
$N = P \cdot Q$
P, Q - large prime numbers
$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$
54
Primary applications (5)
Cryptography
Public key cryptography: RSA, DSS, Diffie-Hellman, Elliptic Curve Cryptosystems
Long integers (1000-2000 bits)
Elements of the Galois field GF(2^n) (150-250 bits)
55
Topic 1
$C = A \cdot B \bmod 2^{32}$, $C = A^2 \bmod 2^{32}$
Function: 32-bit unsigned multiplication and squaring modulo $2^{32}$
Application: modern secret-key ciphers, candidates for the new Advanced Encryption Standard (AES): MARS developed by IBM, RC6 developed at MIT
Environment: hardware, software for 8-bit processors
Optimization

maximum throughput
minimum latency
minimum area
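
For the software side of this topic, one possible approach on an 8-bit processor is schoolbook multiplication over byte-sized limbs, skipping every partial product that only affects bits above 2^32. The sketch below models that in Python; the name `mul32_bytewise` and the little-endian limb layout are assumptions of this example, not requirements from the slides.

```python
def mul32_bytewise(a, b):
    """Compute (a * b) mod 2^32 using only 8-bit limbs and 16-bit intermediates."""
    a_bytes = [(a >> (8 * i)) & 0xFF for i in range(4)]   # little-endian limbs
    b_bytes = [(b >> (8 * i)) & 0xFF for i in range(4)]
    acc = [0] * 4
    for i in range(4):
        carry = 0
        # Only limb pairs with i + j < 4 contribute below 2^32.
        for j in range(4 - i):
            t = acc[i + j] + a_bytes[i] * b_bytes[j] + carry   # fits in 16 bits
            acc[i + j] = t & 0xFF
            carry = t >> 8
        # The carry out of the top byte is discarded (reduction modulo 2^32).
    return sum(byte << (8 * k) for k, byte in enumerate(acc))
```

Squaring is the special case `mul32_bytewise(a, a)`; a dedicated squarer could reuse the symmetric partial products, which this sketch does not attempt.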

56
Topic 2
$C = \sum_{i=1}^{256} A_i \cdot B_i$
Function: 64-bit signed multiplier-accumulator (MAC) accumulating at least 256 partial products
Application: digital filters
Environment: hardware, software for a general purpose DSP or microprocessor
Optimization:
Hardware - maximum throughput, limited area
Software - minimum execution time, limited memory
57
Topic 3
$C = A \cdot B$, $C = A / B$
Function: multiplication of two 64-bit signed numbers; division of a 128-bit number by a 64-bit number
Application: general purpose microprocessor
Environment: hardware, software for a 64-bit processor without multiplication and division built in
Optimization:
Hardware - minimum latency, maximum throughput, limited area
Software - minimum execution time, limited memory
58
Topic 4
$C = A^E \bmod N$
Function: modular exponentiation $C = M^E \bmod N$; M, N arbitrary 768-bit numbers, $E = 2^{16}+1$
Application: modern public-key ciphers: RSA, Diffie-Hellman, Elliptic Curve Cryptosystems
Environment: hardware, software for 32-bit or 8-bit processors
Optimization:
Hardware - minimum latency, limited area
Software - minimum execution time, limited memory
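
One common way to organize the modular exponentiation in this topic is the left-to-right binary (square-and-multiply) method, sketched below. It is offered as one possible approach, not as the method prescribed by the course; the function name `mod_exp` is illustrative. With E = 2^16 + 1 only two exponent bits are set, so the loop is dominated by squarings.

```python
def mod_exp(base, exponent, modulus):
    """Left-to-right square-and-multiply: returns base**exponent mod modulus."""
    result = 1 % modulus
    base %= modulus
    for bit in bin(exponent)[2:]:          # scan exponent bits, most significant first
        result = (result * result) % modulus
        if bit == "1":
            result = (result * base) % modulus
    return result


E = 2**16 + 1   # the exponent named in the topic description
```

In a hardware design each modular multiplication would itself be a multi-cycle unit; Python's arbitrary-precision integers hide that cost here.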
59
Topic 5
$Z = X + Y$, $Z = X \cdot Y$
Function: floating point addition and multiplication according to ANSI/IEEE 754
Application: general purpose microprocessor or digital signal processor
Environment: hardware, software for a 32-bit processor without floating point operations
Optimization:
Hardware - minimum latency, maximum throughput, limited area
Software - minimum execution time, limited memory
60
Famous computer arithmetic bugs and flaws
61
Learn to deal with approximations

In digital arithmetic one has to come to grips with approximation and questions like:
When is approximation good enough?
What margin of error is acceptable?
Be aware of the applications you are designing the arithmetic circuit or program for.
Analyze the implications of your approximation.

62
Calculators
$u = \sqrt{\sqrt{\cdots\sqrt{2}}}$ (10 times) = 1.000 677 131
$v = 2^{1/1024}$ = 1.000 677 131
$x = (((u^2)^2)\cdots)^2$ (10 times) = 1.999 999 963
$x' = u^{1024}$ = 1.999 999 973
$y = (((v^2)^2)\cdots)^2$ (10 times) = 1.999 999 983
$y' = v^{1024}$ = 1.999 999 994
Hidden digits in the internal representation of numbers
Different algorithms give slightly different results
Very good accuracy
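
The calculator experiment can be imitated with Python's `decimal` module by limiting the working precision to a fixed number of significant digits. This is only a rough stand-in for a real calculator, whose hidden digits and rounding rules are unknown here; the function name `root_then_square` and the 10-digit default are assumptions of this sketch.

```python
from decimal import Decimal, localcontext


def root_then_square(digits=10, steps=10):
    """Take the square root of 2 `steps` times, then square the result `steps` times,
    rounding every intermediate value to `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        u = Decimal(2)
        for _ in range(steps):
            u = u.sqrt()
        x = u
        for _ in range(steps):
            x = x * x
        return u, x
```

Calling `root_then_square(10)` and then `root_then_square(13)` shows how a few extra internal digits bring the squared-back value closer to 2.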
63
Consequences of bad approximations
Example: Failure of Patriot Missile (1991 Feb. 25)
Source: http://www.math.psu.edu/dna/455.f96/disasters.html
American Patriot Missile battery in Dhahran, Saudi Arabia, failed to intercept incoming Iraqi Scud missile
The Scud struck an American Army barracks, killing 28
Cause, per GAO/IMTEC-92-26 report: software problem (inaccurate calculation of the time since boot)
Specifics of the problem: time in tenths of second as measured by the system's internal clock was multiplied by 1/10 to get the time in seconds
Internal registers were 24 bits wide
1/10 = 0.0001 1001 1001 1001 1001 100 (chopped to 24 b)
Error $\approx 0.1100\,1100_2 \times 2^{-23} \approx 9.5 \times 10^{-8}$
Error in 100-hr operation period $\approx 9.5 \times 10^{-8} \times 100 \times 60 \times 60 \times 10 = 0.34$ s
Distance traveled by Scud = (0.34 s) × (1676 m/s) ≈ 570 m
This put the Scud outside the Patriot's range gate
Ironically, the fact that the bad time calculation had been improved in some (but not all) code parts contributed to the problem, since it meant that inaccuracies did not cancel out
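
The arithmetic on this slide can be reproduced exactly with rational numbers. The sketch below assumes, as the slide does, that 1/10 was chopped to 23 fractional bits and that the clock ticked ten times per second for 100 hours; the variable names are illustrative.

```python
from fractions import Fraction

FRACTION_BITS = 23                                   # fractional bits kept after chopping
chopped = (1 << FRACTION_BITS) // 10                 # integer part of 0.1 * 2^23
stored_tenth = Fraction(chopped, 1 << FRACTION_BITS)

error_per_tick = Fraction(1, 10) - stored_tenth      # about 9.5e-8
ticks = 100 * 60 * 60 * 10                           # tenths of a second in 100 hours
clock_drift_s = float(error_per_tick * ticks)        # about 0.34 s
miss_distance_m = clock_drift_s * 1676               # Scud speed in m/s, from the slide
```

Without rounding the drift to 0.34 s first, the distance comes out near 575 m, slightly above the slide's rounded figure of 570 m.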
64
Consequences of bad approximations
Example: Explosion of Ariane Rocket (1996 June 4)
Source: http://www.math.psu.edu/dna/455.f96/disasters.html
Unmanned Ariane 5 rocket launched by the European Space Agency veered off its flight path, broke up, and exploded only 30 seconds after lift-off (altitude of 3700 m)
The $500 million rocket (with cargo) was on its 1st voyage after a decade of development costing $7 billion
Cause: software error in the inertial reference system
Specifics of the problem: a 64-bit floating point number relating to the horizontal velocity of the rocket was being converted to a 16-bit signed integer
An SRI software exception arose during conversion because the 64-bit floating point number had a value greater than what could be represented by a 16-bit signed integer (max 32 767)
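
The kind of conversion described here can be illustrated with two small helpers: one that rejects out-of-range values and one that saturates. This is purely an illustration of range checking in Python; it is not the flight software, and the names `to_int16_checked` and `to_int16_saturating` are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value):
    """Truncate a float toward zero and raise if it does not fit in 16 signed bits."""
    n = int(value)
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in a 16-bit signed integer")
    return n


def to_int16_saturating(value):
    """Truncate a float toward zero and clamp it to the 16-bit signed range."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))
```

Which behavior is appropriate depends on the application: an exception forces the caller to handle the case, while saturation keeps running with a bounded but inexact value.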
65
Pentium bug (1)
October 1994
Thomas Nicely, Lynchburg College, Virginia, finds an error in his computer calculations and traces it back to the Pentium processor
November 7, 1994
First press announcement, Electronic Engineering
Times
Late 1994
Tim Coe, Vitesse Semiconductor presents an
example with the worst-case error
c = 4 195 835 / 3 145 727
Pentium: 1.333 739 06...
Correct result: 1.333 820 44...
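
The size of the worst-case error quoted on this slide can be checked with exact rational arithmetic. The flawed value is taken from the slide as a literal, since a correctly working divider cannot reproduce it; the variable names are illustrative.

```python
from fractions import Fraction

correct = Fraction(4195835, 3145727)          # exact quotient
pentium_reported = Fraction("1.33373906")     # flawed result, as quoted on the slide

# Relative error of the flawed result with respect to the exact quotient.
relative_error = float((correct - pentium_reported) / correct)
```

The relative error is about 6.1e-5, so the two results already differ in the fifth significant digit, far above the rounding error expected from double-precision division.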
66
Pentium bug (2)
Intel admits subtle flaw
November 30, 1994
Intel's white paper about the bug and its possible consequences
Intel - average spreadsheet user affected once in 27,000 years
IBM - average spreadsheet user affected once every 24 days
Replacements based on customer needs
December 20, 1994
Announcement of no-questions-asked replacements
67
Pentium bug (3)
Error traced back to the look-up table used by the radix-4 SRT division algorithm
2048 cells, 1066 non-zero values: {-2, -1, 1, 2}
5 non-zero values not downloaded correctly to the lookup table due to an error in the C script