FPGA Implementation of Advanced Encryption standards

Description:

Efficient Implementation of Rijndael Encryption in Reconfigurable Hardware: ... Composite field - Affine Transformation. Linear transformation Translation ... – PowerPoint PPT presentation

Slides: 44
Provided by: srih1

Transcript and Presenter's Notes

Title: FPGA Implementation of Advanced Encryption standards


1
FPGA Implementation of Advanced Encryption
standards
  • Srihari Sridharan
  • October 22nd 2007

2
Efficient Implementation of Rijndael Encryption
in Reconfigurable Hardware: Improvements and
Design Tradeoffs
  • François-Xavier Standaert, Gaël Rouvroy,
    Jean-Jacques Quisquater, and Jean-Didier Legat
  • CHES 2003, Springer-Verlag Berlin Heidelberg

3
OUTLINE
  • Performance evaluation of the AES algorithm
  • Effective FPGA implementation
  • Heuristics to evaluate hardware efficiency
  • Arrive at the optimum throughput/area efficiency
  • Optimum throughput 18.5 Gbps, area 542
    slices, 10 RAM blocks

4
Hardware Description
5
Hardware Description
  • Xilinx Virtex-E
  • 32448 slices
  • 64896 LUTs and flip-flops
  • 208 RAM blocks
  • Synthesis: Synopsys
  • Circuit modeling: VHDL

6
Hardware Description
  • 2 slices per CLB
  • Slice: 2 logic cells (LCs)
  • LC: one 4-input LUT, a storage element, and additional logic
  • Storage element: latch or edge-triggered D flip-flop
  • Additional logic: MUXF5 and MUXF6 multiplexers
  • Arithmetic logic: carry (CY) logic, XOR, AND

7
Evaluation Parameters
  • Two types of performance evaluation parameters
  • In terms of performance
  • Throughput: bits processed per second
  • Area: slices
  • Their ratio (throughput/area) is an evaluation parameter
  • In terms of resources
  • Number of LUTs
  • Number of registers
  • Their ratio is an evaluation parameter
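
The ratios above can be sketched in a few lines of Python. The function names and every number in the usage note are hypothetical placeholders chosen for illustration; they are not figures from the slides or from the paper.

```python
def throughput_mbps(block_bits, clock_mhz, cycles_per_block):
    """Bits processed per second, expressed in Mbps.

    cycles_per_block is the number of clock cycles between two
    consecutive output blocks (1 for a fully pipelined design).
    """
    return block_bits * clock_mhz / cycles_per_block


def throughput_per_slice(throughput, slices):
    """Performance metric: throughput divided by area in slices."""
    return throughput / slices


def lut_register_ratio(luts, registers):
    """Resource metric: number of LUTs divided by number of registers."""
    return luts / registers
```

For example, with an assumed 100 MHz clock, a 128-bit block delivered every 10 cycles gives `throughput_mbps(128, 100.0, 10)`, and dividing that by an assumed slice count gives the throughput/area figure used to compare designs.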

8
Encryption Block
9
Plain Text - Block Ciphers
  • Input: 128-bit blocks
  • State is transformed
  • $s[r,c] = in[r + 4c]$
  • $out[r + 4c] = s[r,c]$
  • $0 \le r < 4$, $0 \le c < Nb$ ($Nb = 4$)
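
The input/output mapping above follows the column-major state layout of the AES specification. A minimal Python sketch, with function names that are assumptions made for this example:

```python
NB = 4  # number of state columns for AES


def bytes_to_state(block):
    """Copy a 16-byte block into the state: s[r][c] = in[r + 4c]."""
    if len(block) != 4 * NB:
        raise ValueError("AES operates on 128-bit (16-byte) blocks")
    return [[block[r + 4 * c] for c in range(NB)] for r in range(4)]


def state_to_bytes(state):
    """Copy the state back out: out[r + 4c] = s[r][c]."""
    out = bytearray(4 * NB)
    for r in range(4):
        for c in range(NB):
            out[r + 4 * c] = state[r][c]
    return bytes(out)
```

Each group of four consecutive input bytes becomes one column of the state, so `state_to_bytes(bytes_to_state(block))` returns the original block.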

10
Implementation
  • Two types of optimization
  • Algorithmic
      • S-box
          • Multiplexer model
          • RAM based
          • Composite field
      • MixColumns
          • MixColumns transform
          • Mixadd transform
  • Architectural
      • Loop unrolling
      • Pipelining
      • Sub-pipelining

11
SBOX - Mux Model
  • Sbox Table

12
Mux Model - Background
  • An N-input Boolean function G(x) is represented by
  • In AES
  • Which is bit representation
  • Implemented as

13
Mux Model
  • MUX Model

14
Mux Model
  • Realization on FPGA
  • LUT based
  • 4-input, 4-output lookup
  • Four 4-input, 1-output LUTs
  • Coupled through a 4:1 mux
  • 4:1 mux realized with three 2:1 muxes
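
The idea of building a wide lookup from 4-input LUTs and a tree of 2:1 multiplexers can be illustrated in software. The sketch below is an assumption-laden model, not the slide's circuit: it feeds the low nibble of the S-box input to sixteen 4-input LUTs per output bit and uses the high nibble as the select lines of a mux tree. The S-box itself is regenerated from its definition (field inverse plus affine step) so that the block is self-contained; all function names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return r


def sbox_entry(x):
    """AES S-box value: multiplicative inverse followed by the affine step."""
    inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
    s = inv
    for k in range(1, 5):
        s ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
    return s ^ 0x63


SBOX = [sbox_entry(x) for x in range(256)]

# LUTS[bit][hi] is one 4-input, 1-output LUT (a 16-entry truth table).
LUTS = [[[(SBOX[(hi << 4) | lo] >> bit) & 1 for lo in range(16)]
         for hi in range(16)] for bit in range(8)]


def mux2(sel, d0, d1):
    """2:1 multiplexer."""
    return d1 if sel else d0


def mux_tree(select_bits, data):
    """Reduce the data inputs with layers of 2:1 muxes, LSB select first."""
    for s in select_bits:
        data = [mux2(s, data[i], data[i + 1]) for i in range(0, len(data), 2)]
    return data[0]


def sbox_mux(x):
    """Evaluate the S-box through the LUT-plus-multiplexer model."""
    lo, hi = x & 0xF, x >> 4
    sel = [(hi >> i) & 1 for i in range(4)]
    out = 0
    for bit in range(8):
        lut_outputs = [lut[lo] for lut in LUTS[bit]]
        out |= mux_tree(sel, lut_outputs) << bit
    return out
```

Checking `sbox_mux(x) == SBOX[x]` for every byte confirms that the decomposition reproduces the table. How the input bits are split between LUT inputs and select lines on the real device is a design choice that this model does not try to capture.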

15
Mux Model - Implementation
16
Mux Model - Analysis
  • 1-bit output
  • Repeated 16 times and looped 16 times
  • Critical path: LUT4, MUXF5, MUXF6
  • Two-level pipelining
  • 12 clock pulses

18
SBOX - RAM Based
  • Lookup type
  • BRAM: two single-port 256 x 8-bit memories
  • Write enable of the RAM held low
  • Data input held low
  • Implemented as a ROM
  • 1 clock cycle
  • Design
  • S-box: 16 x 16 x 8 = 2048 bits = 2 Kbits
  • 16 S-boxes for each state
  • 1 BRAM = two 2-Kbit RAMs
  • Hence 8 BRAMs required
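
The sizing arithmetic on this slide can be written out as a short Python sketch. The constant and function names are assumptions introduced for the example; the numbers are the ones stated in the bullets.

```python
import math

SBOX_ROWS = 16          # S-box table is 16 x 16 entries
SBOX_COLS = 16
ENTRY_BITS = 8          # each entry is one byte
SBOXES_PER_STATE = 16   # one lookup per state byte
SBOXES_PER_BRAM = 2     # one BRAM used as two 2-Kbit memories


def sbox_size_bits():
    """Storage needed for a single S-box table."""
    return SBOX_ROWS * SBOX_COLS * ENTRY_BITS


def brams_needed(sboxes=SBOXES_PER_STATE, per_bram=SBOXES_PER_BRAM):
    """Block RAMs needed to look up every state byte in parallel."""
    return math.ceil(sboxes / per_bram)
```

With the slide's figures, `sbox_size_bits()` is 2048 and `brams_needed()` is 8.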

20
Composite field - Math Basics
  • Byte representation in the Galois field $GF(2^8)$
  • E.g. 01100011 is $x^6 + x^5 + x + 1$.
  • Addition: modulo-2 arithmetic (XOR; subtraction is the same operation)
  • Multiplication: polynomial multiplication modulo the
    irreducible polynomial (degree 8)
    $m(x) = x^8 + x^4 + x^3 + x + 1$
  • Multiplicative inverse
  • $b(x)a(x) + m(x)c(x) = 1$
  • $b^{-1}(x) = a(x) \bmod m(x)$ because
  • $a(x) \cdot b(x) \bmod m(x) = 1$
  • E.g. $3m \equiv 1 \pmod{11}$, so $3^{-1} \equiv m \pmod{11}$ (here $m = 4$)
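
The field arithmetic above can be tried out with a minimal Python sketch. The function names are assumptions for this example, and the inverse is computed by raising to the power 254, which is one of several possible methods and not necessarily the one used in the presentation.

```python
M_POLY = 0x11B  # m(x) = x^8 + x^4 + x^3 + x + 1


def gf_add(a, b):
    """Addition in GF(2^8) is bitwise XOR."""
    return a ^ b


def gf_mul(a, b):
    """Polynomial multiplication of two bytes modulo m(x)."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= M_POLY
    return r


def gf_inv(a):
    """Multiplicative inverse as a^254 (0 is mapped to 0 by convention)."""
    result, base, e = 1, a, 254
    while e:
        if e & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        e >>= 1
    return result
```

The integer analogy on the slide works the same way: `pow(3, -1, 11)` in Python 3.8 or later returns 4, because 3 times 4 leaves remainder 1 modulo 11.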

21
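Composite model equations - Multiplicative Inverse
  • $GF(2^8) \cong GF((2^4)^2)$
  • Element with coefficients in $GF(2^4)$: $a_1 x + a_0$
  • Inverse $b_1 x + b_0$ given by
  • $x$ satisfies $x^2 + x + \lambda = 0$
  • $b_0 = (a_0 + a_1)\,\Delta^{-1}$
  • $b_1 = a_1\,\Delta^{-1}$
  • $\Delta = a_0 (a_0 + a_1) + \lambda a_1^2$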
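
The inversion formulas can be checked numerically with a minimal Python sketch. Several choices here are assumptions made only for illustration: GF(2^4) is built with the polynomial x^4 + x + 1, the constant in x^2 + x + lambda is found by search so that the quadratic has no root in GF(2^4), and the basis change between this representation and the standard AES byte representation is omitted.

```python
def gf16_mul(a, b):
    """Multiply in GF(2^4) modulo x^4 + x + 1 (assumed field polynomial)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return r


def gf16_inv(a):
    """Inverse in GF(2^4) as a^14 (0 is mapped to 0)."""
    r = 1
    for _ in range(14):
        r = gf16_mul(r, a)
    return r


# Pick a constant for which x^2 + x + LAMBDA has no root in GF(2^4).
LAMBDA = next(lam for lam in range(1, 16)
              if all(gf16_mul(t, t) ^ t ^ lam for t in range(16)))


def comp_mul(a, b):
    """Multiply (a1*x + a0)(b1*x + b0) using x^2 = x + LAMBDA."""
    a1, a0 = a
    b1, b0 = b
    hh = gf16_mul(a1, b1)
    hi = hh ^ gf16_mul(a1, b0) ^ gf16_mul(a0, b1)
    lo = gf16_mul(a0, b0) ^ gf16_mul(LAMBDA, hh)
    return (hi, lo)


def comp_inv(a):
    """Inverse of a1*x + a0: b1 = a1/D, b0 = (a0 + a1)/D."""
    a1, a0 = a
    delta = gf16_mul(a0, a0 ^ a1) ^ gf16_mul(LAMBDA, gf16_mul(a1, a1))
    d_inv = gf16_inv(delta)
    return (gf16_mul(a1, d_inv), gf16_mul(a0 ^ a1, d_inv))
```

The appeal for hardware is that only one inversion in the 4-bit field is needed, plus a handful of 4-bit multiplications and XORs, instead of an inversion on the full byte.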

22
Composite field - Affine Transformation
  • Affine transformation = linear transformation + translation
  • Linear transformation: rotations, scaling, shear
  • Translation: shift
  • In AES
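
For reference, the affine step of the AES S-box as defined in FIPS-197 is a fixed bit-level linear map followed by XOR with the constant 0x63. A minimal Python sketch follows; the function names are assumptions, and the rotation form used here is one common way to write the same map.

```python
AFFINE_CONSTANT = 0x63  # translation part of the transformation


def rotl8(b, k):
    """Rotate a byte left by k bit positions."""
    return ((b << k) | (b >> (8 - k))) & 0xFF


def aes_affine(b):
    """Linear part (XOR of four rotations with b) plus the constant."""
    linear = b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4)
    return linear ^ AFFINE_CONSTANT
```

The S-box value of a byte is `aes_affine` applied to its multiplicative inverse. For instance, the inverse of 0x53 is 0xCA, and `aes_affine(0xCA)` gives the S-box entry for 0x53.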

23
Composite field - implementation
25
Mixcolumns transform - Background
  • Four-term polynomials
  • Coefficients are bytes
  • $M(x) = x^4 + 1$
  • Product defined as $a(x) \otimes b(x) = d(x)$

26
Mixcolumns transform - Equations
  • Solution
  • Multiplication of a $GF(2^8)$ polynomial by $x$ =
    multiplication by {02}: left shift plus
    conditional XOR (based on the MSB)
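
Multiplication by {02} can be sketched in Python as follows. The name `xtime` follows the AES specification; the XOR constant 0x1B is the low byte of the AES field polynomial.

```python
def xtime(a):
    """Multiply a byte by {02} in GF(2^8).

    Shift left by one bit; if the bit shifted out (the old MSB) was 1,
    reduce by XORing with 0x1B.
    """
    shifted = (a << 1) & 0xFF
    if a & 0x80:
        shifted ^= 0x1B
    return shifted
```

Repeated application gives multiplication by higher powers of x, so `xtime(xtime(a))` is multiplication by {04}.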

27
Mixcolumns transform - Implementation
28
Mixcolumns transform - Implementation
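  • To implement
  • $\{03\} \cdot a_1 = (\{02\} + \{01\}) \cdot a_1 = \{02\} \cdot a_1 + a_1$
  • Hence we have
  • 2 multiplications by $x$ (for $a_0$ and $a_1$)
  • A 5-input XOR addition: the two products above plus $a_1$, $a_2$, $a_3$
  • Two-level pipelined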
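
The decomposition above (two multiplications by x and one five-input XOR per output byte) can be sketched in Python. The function names are assumptions; the column used in the usage note is the MixColumns example from the FIPS-197 cipher walkthrough.

```python
def xtime(a):
    """Multiply a byte by {02} in GF(2^8)."""
    shifted = (a << 1) & 0xFF
    return shifted ^ 0x1B if a & 0x80 else shifted


def mix_byte(a0, a1, a2, a3):
    """{02}*a0 + {03}*a1 + a2 + a3, with {03}*a1 = {02}*a1 + a1."""
    return xtime(a0) ^ xtime(a1) ^ a1 ^ a2 ^ a3


def mix_column(col):
    """Apply MixColumns to one 4-byte column by rotating the inputs."""
    return [mix_byte(col[i], col[(i + 1) % 4], col[(i + 2) % 4], col[(i + 3) % 4])
            for i in range(4)]
```

Applied to the column d4 bf 5d 30, `mix_column` returns 04 66 81 e5.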

29
Mixcolumns transform - Implementation
31
Mixadd transform - Principle
  • $X(a_0)$ and $X(a_1)$ are mostly shift operations
  • In both bytes the XOR is applied to only 3 bits
  • So these three bits are added separately
  • Now pipelined
  • Combined with key addition
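
The observation that only three bits need a real XOR can be made concrete with a minimal Python sketch. The function names are assumptions, and folding the round-key byte into the same XOR is shown only as one plausible reading of "combined with key addition", not as the exact circuit from the paper.

```python
def xtime_reference(a):
    """Multiply by {02}: shift left, then reduce modulo the AES polynomial."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B
    return a & 0xFF


def xtime_wired(a):
    """Same result, written the way hardware sees it.

    The shift is pure wiring. After the shift, bit 0 is always 0, so it
    simply takes the value of the old MSB. Only bits 1, 3 and 4 need an
    XOR gate with the old MSB.
    """
    msb = a >> 7
    shifted = (a << 1) & 0xFF
    correction = (msb << 4) | (msb << 3) | (msb << 1)
    return (shifted ^ correction) | msb


def mix_add_byte(a0, a1, a2, a3, key_byte):
    """One MixColumns output byte with the round-key byte XORed in."""
    return xtime_wired(a0) ^ xtime_wired(a1) ^ a1 ^ a2 ^ a3 ^ key_byte
```

Because everything in the expression is an XOR, the key byte can join the same XOR network at no extra logic depth in this model.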

32
Mixadd transform -Implementation
34
Unrolled Architecture
  • 10 AES rounds unrolled
  • Lots of hardware
  • Area is increased
  • Throughput is Increased

36
Pipelined Architecture - I
  • Only one round at a time
  • Hardware reduced
  • Throughput reduced
  • Area reduced

37
Pipelined Architecture - II
  • All 10 rounds taken inside loop
  • Loss of mixadd combination
  • Additional Mux
  • Good choice in ASIC

38
Heuristic optimization
39
Results
  • Pipelined -I architecture
  • Unrolled Architecture

40
Results Contd
  • Comparison of RAM/unrolled, RAM/pipelined,
    Mux/pipelined, and composite/pipelined designs

41
Summary
  • http://www.cs.bc.edu/straubin/cs381-05/blockciphers/rijndael_ingles2004.swf

42
Conclusion
  • Algorithmic and architectural design tradeoffs
    were evaluated
  • Optimum design principle found through heuristics
  • Throughput: 1563 Mbps
  • Performance (throughput/area): 0.69

43
Phase 2 preview
  • Implement S-box: RAM based
  • Implement MixColumns: MixColumns transform
  • Implement AddRoundKey: direct XOR
  • Implement ShiftRows: simple cyclic shift
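
The last two steps in this list are simple enough to sketch directly. The Python below is a software illustration with assumed function names; the state is a list of four rows of four bytes.

```python
def shift_rows(state):
    """Cyclically shift row r of the state left by r byte positions."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def add_round_key(state, round_key):
    """XOR every state byte with the matching round-key byte."""
    return [[s ^ k for s, k in zip(s_row, k_row)]
            for s_row, k_row in zip(state, round_key)]
```

In hardware ShiftRows costs no logic at all, since it is only a permutation of wires, and AddRoundKey is one XOR per state bit.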