A Hardware/Software Co-design Approach for Face Recognition

1
A Hardware/Software Co-design Approach for Face
Recognition
  • By Shawki Areibi
  • University of Guelph
  • School of Engineering
  • Engineering Systems Computing
  • Guelph, Ontario, Canada

2
Outline
  • Introduction
  • Face Recognition
  • Applications
  • Background
  • Face Recognition Methodologies
  • Artificial Neural Networks
  • H/S Co-Design Approach
  • MicroBlaze Embedded System (software)
  • MicroBlaze with dedicated hardware module (H/S)
  • Results
  • Conclusions and Future Work

3
Introduction
  • Face Recognition: given still or video images of
    a scene, identify or verify one or more persons
    in the scene using a stored database of faces

[Figure: a target image is compared against a database of four images (Image 1 to Image 4); Image 3 matches the target image.]
4
Typical Applications of Face Recognition
  • Entertainment: video games, virtual reality,
    human-robot interaction
  • Smart Cards: driver's licenses, immigration,
    national ID, passports, voter registration,
    welfare fraud
  • Information Security: Internet access, medical
    records, secure trading terminals
  • Law Enforcement and Surveillance: advanced video
    surveillance, shoplifting, suspect tracking and
    investigation

5
Face Recognition Methodologies
  • Statistical Methods
      • Template matching: compares the image with a
        single template using a distance metric
      • Projection-based methods (Principal Component
        Analysis)
  • ANN-based approach
      • Geometrical local-feature-based ANN (fractal
        codes, i.e. eyes, nose, eyebrows)
      • Holistic-based ANN (applying backpropagation (BP)
        on the intensity values of the face image)
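
As a rough illustration of template matching with a distance metric, here is a minimal sketch. The slide does not say which metric is used; the sum of absolute differences (SAD) and the names `sad_distance`, `matches_template`, and `max_distance` are assumptions made for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Sum of absolute differences (SAD) between an image and a template,
 * both stored as n grayscale pixels. SAD is an assumed metric. */
static unsigned long sad_distance(const unsigned char *image,
                                  const unsigned char *tmpl, size_t n) {
    unsigned long total = 0;
    for (size_t p = 0; p < n; p++) {
        total += (unsigned long)abs((int)image[p] - (int)tmpl[p]);
    }
    return total;
}

/* Accept the image as a match when its distance to the single template
 * is within max_distance (a hypothetical tuning parameter). */
static int matches_template(const unsigned char *image,
                            const unsigned char *tmpl,
                            size_t n, unsigned long max_distance) {
    return sad_distance(image, tmpl, n) <= max_distance;
}
```

A smaller distance means a closer match; the acceptance threshold would have to be tuned on real data.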

6
Artificial Neural Networks
[Figure: an artificial neuron, with inputs, activation f(net_i), and output]
An Artificial Neural Network (ANN)
(a set of processing elements (PEs) with
adjustable strength (weights))
[Figure: a three-layer perceptron structure, with inputs and output]
7
Functionality of ANNs
  • Training
      • Present data to the ANN
      • ANN computes an output
      • Compare computed output with desired
        output
      • Modify ANN weights to reduce error
  • Testing
      • Present new data to the ANN
      • ANN computes an output based on its training

[Figure: adaptive system block diagram. The input feeds the adaptive system; its output is compared with the desired output to give an error (cost), and the training algorithm changes the system parameters.]
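
The training steps (present data, compute an output, compare with the desired output, modify the weights) can be sketched with a single threshold neuron. This is a simplified stand-in that uses the classic perceptron learning rule, not the backpropagation network used later in the presentation; the names `Neuron`, `Sample`, `neuron_output`, and `train` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One threshold neuron: bias weight w[0] plus two input weights. */
typedef struct { int w[3]; } Neuron;

/* One training sample: two inputs and the desired output. */
typedef struct { int x1, x2, desired; } Sample;

/* Testing step: compute the output for new data (step activation). */
static int neuron_output(const Neuron *n, int x1, int x2) {
    int net = n->w[0] + n->w[1] * x1 + n->w[2] * x2;
    return net > 0 ? 1 : 0;
}

/* Training: repeat over the samples until an epoch has no errors
 * or max_epochs is reached. Returns the number of epochs run. */
static int train(Neuron *n, const Sample *samples, size_t count,
                 int max_epochs) {
    assert(n != NULL && samples != NULL && max_epochs > 0);
    int epoch = 0;
    while (epoch < max_epochs) {
        int errors = 0;
        for (size_t s = 0; s < count; s++) {
            /* Present data and compute an output. */
            int out = neuron_output(n, samples[s].x1, samples[s].x2);
            /* Compare computed output with desired output. */
            int err = samples[s].desired - out;
            /* Modify weights to reduce the error. */
            if (err != 0) {
                n->w[0] += err;
                n->w[1] += err * samples[s].x1;
                n->w[2] += err * samples[s].x2;
                errors++;
            }
        }
        epoch++;
        if (errors == 0) break;
    }
    return epoch;
}
```

Trained on a linearly separable function such as logical OR, the loop stops once every sample is classified correctly; `neuron_output` then plays the role of the testing phase.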

8
ANN Typical Applications
  • Function Approximation: process modeling, process
    control, data modeling, machine diagnostics
  • Time Series Prediction: financial forecasting,
    bankruptcy prediction, sales forecasting, dynamic
    system modeling
  • Data Mining: clustering, data visualization, data
    extraction
  • Classification: medical diagnosis, target
    recognition, character recognition, face
    recognition, speech recognition, fraud detection,
    etc.

9
Solving face recognition by ANN
[Figure: three-layer network fed from the training image list, with the following notation]
  • a_h: input node value
  • b_i, e_i: computed hidden node value and computed
    hidden node error
  • c_j, d_j: computed output node value and computed
    output node error
  • Target output
  • v_hi: adjustable weights between input layer and
    hidden layer
  • w_ij: adjustable weights between hidden layer and
    output layer
10
Methodology
TO MAP ANNs ONTO A PLATFORM WHERE PERFORMANCE AND
FLEXIBILITY CAN BE BALANCED
Slow training
Lack of clear methodology to determine the
network topology
The function is hard to change once designed
TARGETING FPGAs
Time-consuming to get a hardware implementation
Results
11
Xilinx Multimedia Board
  • Xilinx Virtex-II XC2V2000 FPGA
  • 512×36-bit 130 MHz ZBT (Zero Bus Turnaround) RAM
  • 16M Flash memory and RS232 port
  • push buttons
  • SVGA output
  • Onboard network connection, 10/100 Ethernet
  • Audio CODEC compliant with AC97 and stereo
    amplifier with 18-bit sigma-delta A/Ds and D/As
  • Supports a single channel of real time PAL or
    NTSC video input
  • Headphone and microphone

12
MicroBlaze
  • Full Harvard, RISC pipelined architecture with
    32-bit data and instruction word
  • Supports Virtex, Virtex-E, Virtex-II, Virtex-II
    Pro, Spartan-II, and Spartan-IIE devices
  • 102 Dhrystone MIPS (D-MIPS) on Virtex-II Pro
    device at 150 MHz
  • Minimum logic requirement: 900 logic cells
  • 32-bit pipelined RISC architecture
  • 32 × 32-bit general purpose registers
  • Supports Local Memory Bus (LMB) for fast access to
    on-chip BRAMs
  • Supports IBM CoreConnect On-chip Peripheral Bus
    (OPB) for accessing peripherals
  • Processor peripherals compatible with PowerPC on
    Virtex-II Pro
  • Complete hardware and software development tool
    and debug solution

13
A Pure MicroBlaze System
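[Figure: block diagram of the pure MicroBlaze system. The MicroBlaze core (mblaze) connects to on-chip block RAM (bram) through lmb_bram_cntlr over the instruction (i_lmb) and data (d_lmb) local memory buses. Peripherals on the OPB bus: myjtaguart, mytimer (OPB Timer/Counter), myuart (to HyperTerminal), and mygpio (to switches).]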
14
A Pure MicroBlaze System
Face images are available at
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-8/faceimages/faces/
(Professor Tom M. Mitchell's Machine Learning
Course, Carnegie Mellon University)
  • 20 individuals, each with 32 images varying in
    expression, the direction they face, and whether
    or not their eyes are open
  • In total, 624 grayscale images in PGM format
  • Each image has a resolution of 120×128 pixels
  • Each pixel is described by a grayscale
    intensity value between 0 (black) and 255 (white)

15
A Pure MicroBlaze System
    Start_timer();
    Randomization();
    while (i < epoch_num) {
        for (j = 0; j < number of training images; j++) {
            /* Initialize input nodes to the values of image j */
            /* Set target value for image j */
            forward();   // implemented in C
            backward();  // implemented in C
            update();    // implemented in C
        }
        i++;
    }
    Stop_timer();
    // Testing process
    /* Initialize input nodes to the values of a chosen testing image */
    forward();
    /* Print and interpret the result */

    /* Start each weight at a random value in (-0.5, 0.5) */
    void Randomization() {
        for (h = 0; h < NODE_NUM_I; h++)
            for (i = 0; i < NODE_NUM_H; i++)
                if (rand() % 2 - 1)
                    v[h][i] = rand() % 5000 / 10000.0;
                else
                    v[h][i] = -(rand() % 5000) / 10000.0;
        ...
    }
16
A Pure MicroBlaze System
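    /* a[h]: input nodes, b[i]: hidden nodes, c[j]: output nodes, ck[j]: target outputs.
       lambda (sigmoid gain) and eta (learning rate) are assumed names:
       the original symbols were lost in the transcript. */

    void forward() {
        for (i = 0; i < NODE_NUM_H; i++) {
            temp = 0.0;
            for (h = 0; h < NODE_NUM_I; h++)
                temp = temp + a[h] * v[h][i];
            b[i] = 1.0 / (1.0 + exp(-lambda * (threh[i] + temp)));
        }
        for (j = 0; j < NODE_NUM_O; j++) {
            temp = 0.0;
            for (i = 0; i < NODE_NUM_H; i++)
                temp = temp + b[i] * w[i][j];
            c[j] = 1.0 / (1.0 + exp(-lambda * (threo[j] + temp)));
        }
    }

    void backward() {
        /* Output node errors */
        for (j = 0; j < NODE_NUM_O; j++)
            d[j] = lambda * (1 - c[j]) * c[j] * (ck[j] - c[j]);
        /* Hidden node errors, propagated back through w */
        for (i = 0; i < NODE_NUM_H; i++) {
            temp = 0.0;
            for (j = 0; j < NODE_NUM_O; j++)
                temp = temp + lambda * (1 - b[i]) * b[i] * w[i][j] * d[j];
            e[i] = temp;
        }
    }

    void update() {
        for (j = 0; j < NODE_NUM_O; j++) {
            for (i = 0; i < NODE_NUM_H; i++)
                w[i][j] = w[i][j] + eta * b[i] * d[j];
            threo[j] = threo[j] + eta * d[j];
        }
        for (i = 0; i < NODE_NUM_H; i++) {
            for (h = 0; h < NODE_NUM_I; h++)
                v[h][i] = v[h][i] + eta * a[h] * e[i];
            threh[i] = threh[i] + eta * e[i];
        }
    }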
17
A Pure MicroBlaze System
18
A Pure MicroBlaze System
  • FPGA usage

Result on pure MicroBlaze system
  • Profiling

Prof. Tom Mitchell's C code
19
A H/S System
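[Figure: block diagram of the H/S system. The MicroBlaze core (mblaze) connects to on-chip block RAM (bram) through lmb_bram_cntlr over the instruction (i_lmb) and data (d_lmb) local memory buses. Peripherals on the OPB bus: myjtaguart (OPB JTAG_UART), mytimer (OPB Timer/Counter), OPB Block RAM with its OPB Block RAM Controller, myuart (OPB UART Lite, to HyperTerminal), and mygpio (OPB GPIO, to switches). The links FSL0 and FSL1 connect MicroBlaze to myhum, the Hardware Update Module (HUM).]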
20
A H/S System
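[Figure: interface signals of the Hardware Update Module]
  • FSL0 (slave side): FSL0_S_Clk, FSL0_S_Data,
    FSL0_S_Control, FSL0_S_Read, FSL0_S_Exists
  • FSL1 (master side): FSL1_M_Clk, FSL1_M_Data,
    FSL1_M_Control, FSL1_M_Write, FSL1_M_Full
  • SYS_CLK, Counter1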
21
A H/S System
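    void update() {
        for (i = 0; i < NODE_NUM_H; i++) {
            /* Send the four operands of each weight update over FSL0 */
            for (j = 0; j < NODE_NUM_O; j++) {
                microblaze_nbwrite_datafsl(w[i][j], 0);
                microblaze_nbwrite_datafsl(eta, 0);   /* eta: learning rate (assumed name) */
                microblaze_nbwrite_datafsl(b[i], 0);
                microblaze_nbwrite_datafsl(d[j], 0);
            }
            /* Read the updated weights back over FSL1 */
            for (j = 0; j < NODE_NUM_O; j++)
                microblaze_nbread_datafsl(w[i][j], 1);
        }
    }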
  • From MicroBlaze to FSL0 and from FSL1 to
    MicroBlaze

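  • To update weight $w_{ij}$: send old $w_{ij}$, $\eta$, $b_i$, $d_j$
    New $w_{ij}$ = old $w_{ij} + \eta\, b_i\, d_j$
  • To update threshold $\mathrm{threo}_j$: send old $\mathrm{threo}_j$, $\eta$, $1$, $d_j$
    New $\mathrm{threo}_j$ = old $\mathrm{threo}_j + \eta\, d_j$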
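
The two update rules share one datapath: the threshold update is the weight update with the third operand fixed to 1. A minimal software reference model, assuming single-precision floats (the slide does not state the number format) and using the hypothetical names `hum_update`, `update_weight`, and `update_threshold`:

```c
/* Reference model of one hardware update: new = old + eta * b * d. */
static float hum_update(float old_value, float eta, float b, float d) {
    return old_value + eta * b * d;
}

/* Weight update: w_ij <- w_ij + eta * b_i * d_j. */
static float update_weight(float w, float eta, float b, float d) {
    return hum_update(w, eta, b, d);
}

/* Threshold update: same datapath with the third operand fixed to 1. */
static float update_threshold(float threo, float eta, float d) {
    return hum_update(threo, eta, 1.0f, d);
}
```

A model like this is useful mainly for checking the values read back from the hardware against a software computation.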
22
Hardware Update Module (HUM)
  • Finite State Machine
      • Input signals
          • Ready_cal (from counter1)
          • Ready_out (from UPDATE_UNITs)
          • Done_out (from counter2)
      • Output signals
          • Start_cal (to start UPDATE_UNITs)
          • Start_out (to start counter2)
  • UPDATE UNIT

[Figure: UPDATE UNIT block with inputs w_ij, η, b_i, d_j, and READY, and outputs RESULT and DONE]
[Figure: state diagram with states Waiting (00), Calculating (10), and Sending (01), and transitions labelled 1XX, X1X, and XX1]
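
The state machine can be modelled in software as below. This sketch rests on an assumed reading of the state diagram: the three-bit transition labels are taken to be (Ready_cal, Ready_out, Done_out), the two-bit state codes are taken to be the outputs (Start_cal, Start_out), and the states are taken to cycle Waiting, Calculating, Sending. The type and function names are hypothetical.

```c
#include <stdbool.h>

typedef enum { WAITING, CALCULATING, SENDING } HumState;

typedef struct {
    bool start_cal;  /* starts the update units */
    bool start_out;  /* starts counter2 */
} HumOutputs;

/* Next state from the current state and the three input signals.
 * Assumed mapping: 1XX leaves Waiting, X1X leaves Calculating,
 * XX1 leaves Sending. */
static HumState hum_next_state(HumState s, bool ready_cal,
                               bool ready_out, bool done_out) {
    switch (s) {
    case WAITING:     return ready_cal ? CALCULATING : WAITING;
    case CALCULATING: return ready_out ? SENDING : CALCULATING;
    case SENDING:     return done_out ? WAITING : SENDING;
    }
    return WAITING;
}

/* Outputs depend only on the state: Waiting 00, Calculating 10,
 * Sending 01. */
static HumOutputs hum_outputs(HumState s) {
    HumOutputs o = { s == CALCULATING, s == SENDING };
    return o;
}
```

Each call to `hum_next_state` corresponds to one clock edge; inputs that are "don't care" (X) for the current state are ignored.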
23
A H/S System
  • FPGA usage
  • Profiling
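  • Amdahl's law

$$\text{Speedup}_{\text{overall}} = \frac{\text{Execution time}_{\text{old}}}{\text{Execution time}_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}} = 1.69$$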
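
Amdahl's law is easy to evaluate for candidate designs. A minimal sketch follows; the function name `amdahl_speedup` is hypothetical, and the slide does not give the fraction or enhanced speedup behind its 1.69 figure, so none is assumed here.

```c
#include <assert.h>

/* Overall speedup when a fraction of the execution time is sped up
 * by the given factor (Amdahl's law). */
static double amdahl_speedup(double fraction_enhanced,
                             double speedup_enhanced) {
    assert(fraction_enhanced >= 0.0 && fraction_enhanced <= 1.0);
    assert(speedup_enhanced > 0.0);
    return 1.0 / ((1.0 - fraction_enhanced)
                  + fraction_enhanced / speedup_enhanced);
}
```

With purely illustrative numbers, accelerating half of the run time by a factor of 2 gives `amdahl_speedup(0.5, 2.0)`, about 1.33: the part left in software bounds the overall gain.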
24
Conclusion and Future Work
Conclusion
  • A pure software implementation gives you
    flexibility
  • A pure hardware implementation gives you
    performance
  • A H/S system balances flexibility and performance

Future Work
  • Improvement to hardware implementation of adder,
    multiplier and sigmoid function
  • More MicroBlazes with dedicated hardware

25
Thank you!
The presentation and code are available
at http://www.uoguelph.ca/sareibi
Email: sareibi_at_uoguelph.ca
26
A H/S System
  • From FSL0 to HUM

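[Figure: timing diagram with CLK, FSL0_S_Exists, FSL0_S_Read, and data words D0 to D15. Each word read from FSL0 is stored in its own register: D0 in Register P00, D1 in Register P01, and so on up to D15 in Register P33.]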
27
Backpropagation algorithm
A H/S Co-design Approach
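  • Error
  • $b_i = f\left(\sum_h v_{hi}\, a_h + \theta_i\right)$, $c_j = f\left(\sum_i w_{ij}\, b_i + \theta_j\right)$, where
  • $f(x) = (1 + e^{-x})^{-1}$.
  • $E = \frac{1}{2} \sum_j \left(c_j^k - c_j\right)^2$, where $c_j^k$ is the target output
  • From hidden to output layer
  • $\Delta w_{ij} = -\eta \dfrac{\partial E}{\partial w_{ij}} = \eta \left(c_j^k - c_j\right) f'(c_j)\, b_i = \eta\, c_j (1 - c_j)\left(c_j^k - c_j\right) b_i = \eta\, d_j\, b_i$
  • $\Delta \theta_j = \eta\, d_j$
  • From input to hidden layer
  • $\Delta v_{hi} = \eta\, a_h\, e_i$, $e_i = b_i (1 - b_i) \sum_j w_{ij}\, d_j$
  • $\Delta \theta_i = \eta\, e_i$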
28
A Single Virtex-II FPGA Slice
Appendix
  • 4-input LUT
  • 16-bit distributed SelectRAM
  • 16-bit shift register
  • D FlipFlop

29
OPB Timer/Counter
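[Figure: OPB Timer/Counter block diagram. Two timers sit on the OPB bus, each with a control/status register (TCSR0, TCSR1), a load register (TLR0, TLR1), and a counter register (TCR0, TCR1). Inputs: Capture Trig0, Capture Trig1. Outputs: GenerateOut0, GenerateOut1, TC_Interrupt.]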
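    void Start_timer() {
        XIo_Out32(XPAR_OPB_TIMER_TLR0, 0x00000000);   /* load value: count from 0 */
        XIo_Out32(XPAR_OPB_TIMER_TCSR0, 0x00000020);  /* load TLR0 into the counter */
        XIo_Out32(XPAR_OPB_TIMER_TCSR0, 0x00000080);  /* enable the timer */
    }

    void Stop_timer() {
        cycles = XIo_In32(XPAR_OPB_TIMER_TCR0);       /* read the elapsed cycle count */
        XIo_Out32(XPAR_OPB_TIMER_TCSR0, 0x00000000);  /* stop the timer */
    }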
30
Introduction
A biological neuron system
The Dendrites (Greek dendr/o, "tree") of a neuron
are its many short, branching fibers extending
from the cell body, or soma. These fibers increase
the surface area available for receiving incoming
information.
The Synapse (Greek syn, "union, association") is
the point of connection between two neurons or
between a neuron and a muscle or gland.
Electrochemical communication between neurons
takes place at these junctions. The synapse
consists of three elements: 1) the presynaptic
membrane, which is formed by the terminal button
of an axon; 2) the postsynaptic membrane, which is
composed of a segment of dendrite or cell body;
and 3) the space between these two structures,
which is called the synaptic cleft. Some cells in
the nervous system have as many as two hundred
thousand synaptic connections.
The Axon is a single fiber that carries information
away from the soma to the synaptic sites of other
neurons (dendrites and somas), muscles, or
glands.
31
Background
Floating-point format
Arithmetic formats for implementing MLP-BP
32
Background
Fixed-point format
33
Background
Precision and Range
If we only have 4 bits (?2) to represent a
positive real number
[Figure: number lines of the values representable in 4 bits, from code 0 0 0 0 upward. Floating-point (exponent and mantissa fields): axis from 0 to 7. Fixed-point: axis from 0 to 3.75.]
  • Floating-point format has a large dynamic range but
    varying precision
  • Fixed-point format has a limited range but
    constant precision
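
The fixed-point side of this comparison can be checked numerically. The sketch below assumes an unsigned 4-bit format with 2 integer bits and 2 fractional bits, a split chosen because it reproduces the 3.75 maximum on the slide; the slide itself does not state the split, and the names are hypothetical.

```c
#include <assert.h>

#define TOTAL_BITS 4
#define FRAC_BITS  2  /* assumption: 2 integer bits + 2 fractional bits */

/* Real value of an unsigned fixed-point code: code / 2^FRAC_BITS. */
static double fixed_to_real(unsigned code) {
    assert(code < (1u << TOTAL_BITS));
    return (double)code / (double)(1u << FRAC_BITS);
}

/* Largest representable value (all bits set). */
static double fixed_max(void) {
    return fixed_to_real((1u << TOTAL_BITS) - 1u);
}

/* Spacing between adjacent codes: the same everywhere in the range. */
static double fixed_step(void) {
    return 1.0 / (double)(1u << FRAC_BITS);
}
```

Under this assumption the 16 codes cover 0 to 3.75 in uniform steps of 0.25, which is the "limited range, constant precision" behaviour described in the bullets.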

34
RC implementations
Floating-point adder
[Figure: floating-point adder datapath. Inputs: OP1, OP2, READY, EXCEPTION_IN. Stages: PARAMETERIZED_COMPARATOR, SWAP, SHIFT_ADJUST, ADD_SUB, CORRECTION. Internal signals: s1, e1, f1, s2, e2, f2, s, e, f, clear, enable, exception. Outputs: RESULT, DONE, EXCEPTION_OUT.]
Worked example: $3.15 \times 10^{3} + 2.14 \times 10^{5}$
  • Re-arrangement of input operands (SWAP):
    $2.14 \times 10^{5} + 3.15 \times 10^{3}$
  • Pre-shift for mantissa alignment (SHIFT_ADJUST):
    $2.14 \times 10^{5} + 0.0315 \times 10^{5}$
  • Mantissa addition/subtraction (ADD_SUB):
    $2.1715 \times 10^{5}$
  • Post-shift of mantissa and increment of exponent
    for result correction (CORRECTION):
    $2.17 \times 10^{5}$
Pavle Belanovic, "Library of Parameterized Hardware
Modules for Floating-Point Arithmetic with An
Example Application," M.S. Thesis, Dept. of
Electrical and Computer Engineering,
Northeastern University, June 2002
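
The four adder stages can be followed in a toy decimal model. This sketch is not the cited hardware library, which works on binary formats: it uses a 3-digit decimal mantissa, handles only positive operands and addition, and drops shifted-out digits before adding rather than rounding afterwards. `DecFloat` and `dec_add` are hypothetical names.

```c
#include <assert.h>

/* Toy decimal float: value = mant * 10^exp, with mant in 100..999. */
typedef struct { int mant; int exp; } DecFloat;

static DecFloat dec_add(DecFloat x, DecFloat y) {
    assert(x.mant >= 100 && x.mant <= 999);
    assert(y.mant >= 100 && y.mant <= 999);

    /* SWAP: make x the operand with the larger exponent. */
    if (y.exp > x.exp) { DecFloat t = x; x = y; y = t; }

    /* SHIFT_ADJUST: shift the smaller operand's mantissa right until
     * the exponents line up (digits shifted out are dropped). */
    for (int shift = x.exp - y.exp; shift > 0 && y.mant != 0; shift--) {
        y.mant /= 10;
    }

    /* ADD_SUB: add the aligned mantissas. */
    DecFloat r = { x.mant + y.mant, x.exp };

    /* CORRECTION: on mantissa overflow, shift right and bump exponent. */
    if (r.mant > 999) { r.mant /= 10; r.exp += 1; }
    return r;
}
```

In this representation the slide's operands are 315 × 10^1 and 214 × 10^3, and `dec_add` returns 217 × 10^3, which is 2.17 × 10^5 as in the worked example.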
35
RC implementations
Floating-point multiplier
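[Figure: floating-point multiplier datapath. Inputs: OP1, OP2, READY, EXCEPTION_IN. Internal signals: s1, e1, f1, s2, e2, f2, cout, exp_bit1, and a BIAS-1 term in the exponent path. Outputs: RESULT, DONE, EXCEPTION_OUT.]
Worked example: $3.15 \times 10^{3} \times 2.14 \times 10^{5} = 6.741 \times 10^{8}$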
Pavle Belanovic, "Library of Parameterized Hardware
Modules for Floating-Point Arithmetic with An
Example Application," M.S. Thesis, Dept. of
Electrical and Computer Engineering,
Northeastern University, June 2002
36
RC implementations
Fixed-point adder
[Figure: fixed-point adder with two m-bit inputs, OP1 and OP2, and an (m+1)-bit output SUM(m) ... SUM(0). Two implementations are shown, both with a carry-in of 0: a ripple carry adder, where each bit position passes its carry to the next, and a carry lookahead adder, where the carries come from CLA logic.]
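
A bit-level software model of the ripple carry adder makes the structure concrete: one full adder per bit position, with the carry passed from bit k to bit k+1 and the last carry becoming SUM(m). The function name `ripple_carry_add` and the 32-bit limit on m are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Add two m-bit operands one bit at a time and return the
 * (m+1)-bit sum. Bits above position m-1 in the operands are ignored. */
static uint64_t ripple_carry_add(uint32_t op1, uint32_t op2, unsigned m) {
    assert(m >= 1 && m <= 32);
    uint64_t sum = 0;
    unsigned carry = 0;                        /* carry-in of bit 0 is 0 */
    for (unsigned k = 0; k < m; k++) {
        unsigned a = (op1 >> k) & 1u;
        unsigned b = (op2 >> k) & 1u;
        unsigned s = a ^ b ^ carry;            /* full-adder sum bit */
        carry = (a & b) | (carry & (a ^ b));   /* full-adder carry out */
        sum |= (uint64_t)s << k;
    }
    sum |= (uint64_t)carry << m;               /* SUM(m) is the last carry */
    return sum;
}
```

The loop mirrors the hardware's critical path: bit k cannot settle before the carry from bit k-1 arrives, which is the delay a carry lookahead adder is designed to shorten.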
37
RC implementations
Fixed-point multiplier
[Figure: fixed-point multiplier. Inputs: OP1 and OP2 (m bits each), READY, EXCEPTION_IN. An unsigned multiplier takes operands A and B and produces a 2×m-bit PRODUCT. Outputs: RESULT, DONE, EXCEPTION_OUT.]
38
RC implementations
Fixed-point unsigned multiplier
[Figure: fixed-point unsigned multiplier. Inputs: A, B, READY. An extender widens an operand to double the length of the input; a shifter shifts the LSB out at each cycle while start/stop is 1; a controller drives start/stop; the result appears on PRODUCT.]
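
The extender and shifter blocks suggest a sequential shift-and-add scheme. The following software model is written under that assumption (the slide does not spell out the datapath), with the hypothetical name `shift_add_multiply` and operands of up to 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two m-bit unsigned operands in m cycles and return the
 * 2*m-bit product. */
static uint64_t shift_add_multiply(uint32_t a, uint32_t b, unsigned m) {
    assert(m >= 1 && m <= 32);
    uint32_t mask = (m == 32) ? 0xFFFFFFFFu : ((1u << m) - 1u);
    uint64_t multiplicand = a & mask;  /* extended to double length */
    uint32_t multiplier = b & mask;
    uint64_t product = 0;
    for (unsigned cycle = 0; cycle < m; cycle++) {
        if (multiplier & 1u) {
            product += multiplicand;   /* add when the current LSB is 1 */
        }
        multiplier >>= 1;              /* shift the LSB out */
        multiplicand <<= 1;            /* next bit carries twice the weight */
    }
    return product;
}
```

One loop iteration corresponds to one clock cycle, so an m-bit multiplication takes m cycles in this scheme.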
39
Results and Discussion
Tested formats and implementation details
40
Results and Discussion
Comparison of space requirements for various formats
41
Results and Discussion
Comparison of space requirements for various formats
Targeting Spartan-IIE FPGA
Targeting Virtex-II FPGA
42
Results and Discussion
A Pure Hardware XOR ANN
XOR can't be separated by a line! Not classifiable by
a single perceptron
[Figure: the four XOR input points plotted on two axes with ticks at 0 and 1]
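
Adding a hidden layer removes the limitation. The sketch below uses hand-picked integer weights and step activations to show that two hidden units suffice for XOR; it is an illustration only, not the trained network or the hardware implementation from the presentation, and the names `step_fn` and `xor_net` are hypothetical.

```c
/* Step activation: fires when the net input is positive. */
static int step_fn(int net) {
    return net > 0 ? 1 : 0;
}

/* Two-layer network with hand-picked weights. Hidden unit 1 computes
 * OR, hidden unit 2 computes AND, and the output fires when OR is
 * true and AND is false, which is XOR. */
static int xor_net(int x1, int x2) {
    int h_or  = step_fn(2 * x1 + 2 * x2 - 1);
    int h_and = step_fn(2 * x1 + 2 * x2 - 3);
    return step_fn(2 * h_or - 2 * h_and - 1);
}
```

Each hidden unit draws one line in the input plane; the output unit combines the two half-planes, which a single perceptron cannot do.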
43
Contributions
A H/S Co-design Approach
  • Investigation of different arithmetic formats for
    implementing ANNs
  • Construction of a pure MicroBlaze system for the
    face recognition problem using an ANN technique
  • Construction of a MicroBlaze system with dedicated
    hardware (H/S) for the face recognition
    problem using an ANN technique
  • Submission of a journal paper to the Canadian
    Journal of Electrical and Computer Engineering
    (CJECE)