Design Review - PowerPoint PPT Presentation

1
Design Review
  • Scooby Doo gang
  • Jonathan Hsieh
  • Annie Pettengill
  • Jim Hollifield
  • Jeff Barbieri
  • Matt Silverstein

2
Design Goals
  • Mystery Machine Requirements
  • Correctness / Proficiency
  • Compliance with the external interface / protocol
  • Support an interface for a human to play
  • Asynchronous "Decide now" button

3
Other Goals
  • Priorities
  • Speedup over the baseline pure-software 68HC11
    implementation.
  • Hardware functional units
  • software optimization
  • Interesting architecture
  • Opening / closing book (download/upload in play
    configurations)
  • Hardware HCI (software HCI optional)

4
Function call dependence
[Figure: call graph. Nodes: main, init, think, search, quiesce, sort_pv, gen, gen_caps, in_check, eval, makemove, takeback. Several calls are grouped under "for" loops.]
5
More call dependences
[Figure: call graph. Nodes: in_check, attack, eval, eval_light_pawn, eval_light_king, eval dark pawn, eval dark king, makemove, takeback, can_castle, gen, gen_push, gen_promote. Some calls are grouped under a "for" loop.]
6
Read / Write access analysis
  • Eval
  • never writes to the board structure (but reads it
    many times).
  • In_check / attack
  • many reads. Returns a boolean, so the result could be
    an array of bits that is looked up to see whether a
    square is being attacked
  • Gen
  • generates lists of values.
  • Variable run times

7
Quantify parameters
  • A program on sun machines
  • Compiles code with special hooks
  • graphically displays call info and run time info
    for profiling programs.
  • The idea -- Amdahl's law -- speeding up the slowest
    parts gives the most speedup
  • slowest parts move to hardware!

8
Quantify Results
  • After doing about 20 moves these functions take
    the most time (not including print and scanf).

9
Summed run-time analysis
  • In_check -> attack
  • 55% of program run time!
  • Straightforward for hardware
  • Eval -> eval_*
  • 25% of program run time!
  • Straightforward for hardware.
  • Gen
  • 15% of program run time!

10
Conclusion
  • Optimize in_check, eval, and gen by placing in
    hardware
  • This is most effective if the board is held in FPGA
    registers -- try to figure out whether the FPGA can
    be used as memory for the processor.
  • Keep recursion on processor.

11
Hardware software partition
  • Serial interface
  • Can access anything in memory that the CPU can (in
    simulation)
  • SW / CPU
  • Memory structure allows for recursion / dynamic
    structures
  • Compiler can handle that
  • Recursion cannot really happen in parallel (?)
  • Should be able to access RAM as well as FPGA
    registers using ...
  • Memory
  • Move histories
  • Recursion stacks
  • FPGA
  • Many parallel executions happening
  • High-speed custom implementations
  • Good for static structures and constants
  • Simple for read-only functions if things are read
    from registers (always execute!)
12
Implementation plan
13
Implementation Plan
  • Design hierarchy
  • HW/SW split
  • HW subsystems / goals
  • SW goals
  • Physical design
  • HW/SW interface
  • Memory access architecture
  • FPGA/HC11/mem interaction.

14
Implementation Plan
  • Handle all recursion on hc11 -- compiler and
    assembler code best for memory structures (trees,
    hash tables, etc.)
  • Software analysis shows that three function trees
    (in_check/attack, eval, and gen) take the majority
    of the algorithm's time.

15
Shared memory architecture
[Block diagram: FPGA, HC11, and shared memory, with clock inputs labeled clk and pseudo clock.]
16
Architecture features
  • Observation
  • Memory (12 ns -> 83 MHz) is nearly as fast as the
    maximum FPGA speed (100 MHz).
  • about 10x faster than the 8 MHz HC11.
  • Zoinks!
  • Clock set at high FPGA clock speed
  • HC11 clock = pseudo clock: a function in the
    FPGA that slows the FPGA clock to something in the
    HC11 range.
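
A minimal Python sketch of the pseudo clock idea, assuming the divider is a free-running counter clocked by the FPGA clock, with one counter bit driving the HC11. The 64 MHz FPGA clock and all function names here are illustrative assumptions, not values from the design.

```python
def divided_clock(counter: int, bit: int) -> int:
    """Level of the divided clock: bit `bit` of a free-running counter.

    Bit k of a counter that increments once per FPGA clock cycle
    toggles at f_fpga / 2**(k + 1).
    """
    return (counter >> bit) & 1


def divided_frequency(fpga_hz: float, bit: int) -> float:
    """Frequency seen on counter bit `bit`."""
    return fpga_hz / 2 ** (bit + 1)


def pseudo_clock_trace(fpga_cycles: int, bit: int) -> list:
    """Pseudo clock level for each of the first `fpga_cycles` FPGA cycles."""
    return [divided_clock(n, bit) for n in range(fpga_cycles)]


# Assumed numbers: a 64 MHz FPGA clock divided by 8 lands at 8 MHz.
FPGA_HZ = 64_000_000
print(divided_frequency(FPGA_HZ, bit=2))  # 8000000.0
print(pseudo_clock_trace(8, bit=2))       # [0, 0, 0, 0, 1, 1, 1, 1]
```

Picking a different counter bit changes the divide ratio by a power of two, which is why a counter is a convenient way to land in the HC11 range.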

17
Clocking Diagram
[Timing diagram: FPGA clk, FPGA clk x2, and FPGA clk x4 (HC11 pseudo clk), shown against a 3-bit counter for clk that counts 000 through 111.]
18
Clocking Diagram
[Timing diagram: FPGA clk against FPGA clk x8 (HC11 pseudo clk). The memory read/write slots within the pseudo clock period are labeled FPGA, HC11, FPGA, FPGA, FPGA, FPGA; the memory bus is labeled FPGA mem, HC11 mem, FPGA mem.]
19
Pseudo clock possibilities
  • Could allow for Processor to access memory as if
    it were the only thing using it.
  • While the Processor is waiting for next clock
    tick, and done with memory, FPGA can R/W memory.
  • FPGA can run and calculate information
    concurrently with the HC11!
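
One way to picture the sharing described above is a fixed time-slot schedule. The sketch below is only an assumption about how slots might be assigned: eight FPGA clock cycles per pseudo clock period, with a single slot reserved for the HC11. The slot count, the slot index, and the names are hypothetical.

```python
SLOTS_PER_PSEUDO_CYCLE = 8  # assumed: FPGA cycles per HC11 pseudo clock period
HC11_SLOT = 1               # assumed: the one slot reserved for the HC11


def bus_owner(fpga_cycle: int) -> str:
    """Return which device may read/write memory in this FPGA cycle."""
    slot = fpga_cycle % SLOTS_PER_PSEUDO_CYCLE
    return "HC11" if slot == HC11_SLOT else "FPGA"


def schedule(n_cycles: int) -> list:
    """Bus owner for each of the first `n_cycles` FPGA cycles."""
    return [bus_owner(c) for c in range(n_cycles)]


one_period = schedule(SLOTS_PER_PSEUDO_CYCLE)
print(one_period)
print("FPGA slots per period:", one_period.count("FPGA"))
```

Under these assumptions the HC11 sees memory as if it were the only user, while the FPGA gets the other seven slots of every period.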

20
FPGA Hardware Units
[Block diagram: HC11, pseudo clk, memory bus controller, mem, chess board registers, chess piece registers, HCI, eval unit, attack/check unit, gen unit.]
21
FPGA/Memory organization
  • Specific addresses would contain specific
    information all the time.
  • Board representation address
  • current eval score
  • in-check map
  • next generated moves
  • Addresses can be proxied by the FPGA so that FPGA
    registers act like memory to the HC11!
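
The proxying idea can be sketched as a simple address decode. This is a software model only; the base address, window size, and class name are hypothetical and chosen for illustration.

```python
class SharedBus:
    """Toy model of a bus where one address window maps to FPGA registers."""

    FPGA_BASE = 0x4000  # assumed start of the FPGA register window
    FPGA_SIZE = 0x0100  # assumed window size in bytes

    def __init__(self):
        self.ram = {}                            # sparse model of external RAM
        self.fpga_regs = [0] * self.FPGA_SIZE    # registers inside the FPGA

    def _in_fpga_window(self, addr: int) -> bool:
        return self.FPGA_BASE <= addr < self.FPGA_BASE + self.FPGA_SIZE

    def read(self, addr: int) -> int:
        # The HC11 issues an ordinary read; the decode picks the target.
        if self._in_fpga_window(addr):
            return self.fpga_regs[addr - self.FPGA_BASE]
        return self.ram.get(addr, 0)

    def write(self, addr: int, value: int) -> None:
        if self._in_fpga_window(addr):
            self.fpga_regs[addr - self.FPGA_BASE] = value & 0xFF
        else:
            self.ram[addr] = value & 0xFF


bus = SharedBus()
bus.write(0x4000, 5)   # lands in an FPGA register
bus.write(0x0010, 7)   # lands in RAM
print(bus.read(0x4000), bus.read(0x0010))  # 5 7
```

From the processor's side both accesses look identical, which is the property the slide is after.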

22
Performance prediction
  • Baseline: 1
  • Best case based on profiling (assuming
    hyper-idealized HW)
  • 55% -> 0 attack
  • 25% -> 0 eval.
  • 15% -> 0 gen.
  • HW accelerated -> 0.05 x baseline!
  • 20x speedup.
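
The arithmetic behind this prediction is Amdahl's law. The sketch below reproduces it from the profile fractions on the slides; the "other" entry (5%) is simply what is left after attack, eval, and gen, and the function names are illustrative.

```python
def remaining_fraction(profile: dict, accelerated: set,
                       speedup: float = float("inf")) -> float:
    """Fraction of baseline run time left after speeding up some functions.

    `speedup` is the factor applied to each accelerated function;
    infinity models the hyper-idealized case where they cost nothing.
    """
    total = 0.0
    for name, fraction in profile.items():
        total += fraction / speedup if name in accelerated else fraction
    return total


def overall_speedup(profile: dict, accelerated: set,
                    speedup: float = float("inf")) -> float:
    return 1.0 / remaining_fraction(profile, accelerated, speedup)


profile = {"attack": 0.55, "eval": 0.25, "gen": 0.15, "other": 0.05}
in_hw = {"attack", "eval", "gen"}

print(f"{remaining_fraction(profile, in_hw):.2f}")   # 0.05
print(f"{overall_speedup(profile, in_hw):.1f}x")     # 20.0x
```

Passing a finite `speedup` shows how quickly the ideal figure erodes once the hardware units take non-zero time.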

23
Attack/in_check
  • Annie Pettengill

24
In_check
  • Input: the color of the side to test for check
  • Outputs true if in check, otherwise outputs false

25
in_check
  • In_check looks at each of the 64 squares for the
    king of the color passed in to the function
  • It then calls attack on that square and color
  • If we used a piece implementation (versus board),
    the for loop and if statement would collapse
    into a single call of the attack function
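
A rough sketch of the two approaches, assuming a board stored as two 64-entry arrays and a piece implementation that stores each king's square directly. The attack function is passed in as a stub, and all names and encodings are illustrative rather than taken from the project's code.

```python
LIGHT, DARK = 0, 1
KING, EMPTY = "K", "."


def in_check_board(side, piece, color, attack):
    """Board implementation: scan up to 64 squares to find the king."""
    for sq in range(64):
        if piece[sq] == KING and color[sq] == side:
            return attack(sq, side ^ 1)
    return False


def in_check_pieces(side, king_square, attack):
    """Piece implementation: the king's square is known, so one call suffices."""
    return attack(king_square[side], side ^ 1)


# Tiny setup: light king on square 4, dark king on square 60.
piece = [EMPTY] * 64
color = [None] * 64
piece[4], color[4] = KING, LIGHT
piece[60], color[60] = KING, DARK
king_square = {LIGHT: 4, DARK: 60}


def stub_attack(sq, by_side):
    """Placeholder: pretend dark attacks square 4 and nothing else."""
    return sq == 4 and by_side == DARK


print(in_check_board(LIGHT, piece, color, stub_attack))   # True
print(in_check_pieces(DARK, king_square, stub_attack))    # False
```

Both versions give the same answer; the piece version just skips the search.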

26
In_check
[Figure: board implementation (loop runs 64 times) compared with piece implementation.]
27
Attack
  • Inputs: the square the piece is on and the color s
    of the other side
  • Outputs true if the square is being attacked by
    color s, and false if it is not

28
Details about Attack
  • The pawn is looked at separately because the way
    it moves is different from the way it attacks
  • The moves as organized now are different for
    black and white pawns
  • The different pieces are evaluated for every
    direction to see whether they can actually move
    there and whether they can slide

29
Bigger Picture Attack Tables
  • Construct two chess boards, one for white pieces
    and one for black
  • Instead of a piece, each square would contain
    true or false depending on whether the square is
    being attacked by pieces of that color
30
So.
  • Every time a player makes a move, the attack
    function on the FPGA rebuilds the table
  • Can tailor the attack function for specific
    pieces in specific squares using a combination of
    board and piece implementation

31
Implementation of Attack Tables
32
Advantages
  • Avoid lots of useless searching: you know exactly
    where each piece is with the piece implementation
  • If running attack on one square, why not on 128
    squares in parallel? Or perhaps use a piece
    implementation of the attack table and only run
    it on 32 squares.
  • Acts as a lookup table for other functions

33
Perimeter versus Internal
  • Use special tailoring for perimeter squares that
    only checks the pertinent cases; use a special
    numbering scheme for the location of pieces in
    the piece implementation
  • Internal squares stay the same

34
Eval
  • Matthew Silverstein

35
Eval() Function Inputs
  • What side the current move is for
  • Light or Dark
  • Board Configuration
  • Algorithm uses two 64-entry arrays to represent
    the board
  • Which pieces are where (piece64 structure)
  • What color the pieces are (color64 structure)
  • The hardware overhead can be cut by using an
    array that is indexed by piece, not by position

36
Eval() Function Outputs
  • Score
  • an integer value
  • based on the present configuration of the board
  • Calibrated for whether the current player is Light
    or Dark

37
Eval
  • The function call breaks down into three main
    subsections
  • Initialization
  • Takes the current board configuration and sets
    all of the internal registers to an appropriate
    value
  • Sets up the pawn_rank, pawn_mat, and pawn_count
    structures

38
  • Bonus / Penalty assess
  • For each square calculates either a bonus or
    penalty based upon relative benefit of certain
    pieces being on that square
  • Sums the results for each square and provides a
    Light_score and a Dark_score
  • Calculate score
  • Combines the Light and Dark scores to provide a
    single return value for the function, based on
    whether it is presently Light's or Dark's turn.
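
A minimal sketch of the three stages, assuming the final combination is "own score minus opponent score" (the slides do not state the exact formula). The piece encoding, the per-square value function, and the numbers are hypothetical.

```python
LIGHT, DARK = 0, 1


def eval_position(pieces, side, square_value):
    """pieces maps square -> (color, kind); returns a score from `side`'s view."""
    # Initialization: one running total per color.
    score = {LIGHT: 0, DARK: 0}

    # Bonus / penalty assess: add each occupied square's contribution.
    for sq, (color, kind) in pieces.items():
        score[color] += square_value(color, kind, sq)

    # Calculate score: assumed to be the difference, seen from the mover's side.
    if side == LIGHT:
        return score[LIGHT] - score[DARK]
    return score[DARK] - score[LIGHT]


def toy_value(color, kind, sq):
    """Illustrative values only: flat material, no positional terms."""
    return {"P": 100, "N": 300, "R": 500}[kind]


pieces = {8: (LIGHT, "P"), 1: (LIGHT, "N"), 48: (DARK, "P"), 56: (DARK, "R")}
print(eval_position(pieces, LIGHT, toy_value))  # -200
print(eval_position(pieces, DARK, toy_value))   # 200
```

In hardware the loop body becomes one block per square (or per piece) feeding an adder, so the per-square contributions can all be computed at once.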

39
Eval structure
[Block diagram: Init, board registers, bonus and penalty cases, Light or Dark, Calculate.]
40
Bonus / Penalty Assess structure
[Block diagram: one bonus/penalty block for each input square feeds an adder, which adds the values generated at each block to produce Score_light.]
41
Eval_pawn
  • Inputs
  • Square to calculate penalty for
  • Pawn_rank structure
  • Pawn_count structure
  • Based on the inputs there is a possibility of
    assessing up to four different penalties

42
Eval_pawn Penalties, Bonuses
  • Penalty A if there's a pawn behind this one
  • Penalty B if there are no friendly pawns
    adjacent to the current pawn
  • Penalty C if the pawn is not isolated
  • Bonus D if the pawn is passed

43
Eval_pawn structure
[Block diagram: square, pawn_rank, and pawn_count feed the pawn penalty control logic, which drives four muxes. Each mux selects between one of Penalty A, B, C, D and 0, and an adder sums the mux outputs.]
44
Eval_king
  • Inputs (same as eval_pawn)
  • Square to calculate penalty for
  • Pawn_rank structure
  • Pawn_count structure
  • The function returns a penalty value that is
    adjusted depending on how well shielded the king
    is by its own pawns

45
Eval_king Penalties
  • The File A, B, C, F, G, and H Penalties
  • These penalties are assessed when there is no
    pawn in the file, one row away from the king.
  • The magnitude of the penalty depends on the
    distance, in rows, of the pawn from the king
  • The pawn attack Penalty
  • This penalty is assessed if the enemy's pawns
    have advanced too far down the board towards the
    king

46
Eval_king structure
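[Block diagram: pawn_count and pawn_rank feed control logic. Two adders sum the queen-side inputs (File A penalty, File B penalty, File C penalty, Pawn Approach) and the king-side inputs (File F penalty, File G penalty, File H penalty, Pawn Approach); a mux selects between these and No penalty.]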
47
Bonus Penalty Assess Structure
  • Switching from a position to a piece
    representation of the board
  • No longer need to repeat mux 64 times
  • Adder now has 16 inputs, one for each piece (vs.
    64 inputs).
  • Knight, Bishop, and Rook are still straight table
    lookups
  • Pawn_eval is repeated 8 times
  • Still better than 64 times

48
Gen
  • Jim Hollifield

49
gen() function
  • Searches through all 64 spaces
  • Skips empty spaces and opponent pieces
  • Creates all possible moves for each friendly
    piece
  • Pushes them (with helper function gen_push()) onto
    move_stack

50
Possible Moves
  • Basic Pawn Moves
  • Move forward 1 or 2 spaces
  • Take Left or Right
  • Non-pawn Piece (N, B, R, Q, K) moves
  • B, R, Q can slide (move more than one space), but
    stop when another piece is blocking the path
  • Castle (King or Queen side)
  • En Passant
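
The non-pawn rule above (step in each direction, keep going only if the piece slides, stop at a blocker) can be sketched as follows. Coordinates are (row, column), the board is a dictionary of occupied squares, and every name here is illustrative. Castling, en passant, and pawns are left out.

```python
ORTHOGONAL = [(1, 0), (-1, 0), (0, 1), (0, -1)]
DIAGONAL = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
KNIGHT = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

DIRECTIONS = {"N": KNIGHT, "B": DIAGONAL, "R": ORTHOGONAL,
              "Q": ORTHOGONAL + DIAGONAL, "K": ORTHOGONAL + DIAGONAL}
SLIDES = {"N": False, "B": True, "R": True, "Q": True, "K": False}


def gen_piece_moves(pieces, frm):
    """pieces maps (row, col) -> (color, kind); returns a list of (from, to)."""
    color, kind = pieces[frm]
    moves = []
    for dr, dc in DIRECTIONS[kind]:
        r, c = frm[0] + dr, frm[1] + dc
        while 0 <= r < 8 and 0 <= c < 8:
            target = pieces.get((r, c))
            if target is not None:
                if target[0] != color:       # enemy piece: capture it
                    moves.append((frm, (r, c)))
                break                        # any piece blocks the path
            moves.append((frm, (r, c)))      # empty square: quiet move
            if not SLIDES[kind]:
                break                        # N and K take a single step
            r, c = r + dr, c + dc
    return moves


pieces = {(0, 0): ("light", "R"), (0, 2): ("light", "P"), (3, 0): ("dark", "P")}
print(gen_piece_moves(pieces, (0, 0)))
```

With the rook on (0, 0), the friendly pawn limits the row to one square and the enemy pawn ends the column with a capture, giving four moves in total.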

51
For Pawns
[Flowchart: pawn move generation. Light pawns: take left, take right, move forward 1 space. Dark pawns: same as Light, but reversed (move backward 1 space).]
52
For Non-Pawn Pieces
[Flowchart: non-pawn move generation. Decisions: Is space empty? Is piece friendly? Does piece slide? Actions: take piece; move to next square in current direction; move to next direction (edge of board); move to next piece (no more directions).]
53
Other Functions
  • gen_caps()
  • Same as gen(), except only checks for capture
    moves
  • called by quiesce()
  • gen_push()
  • Pushes moves from gen and gen_caps onto
    move_stack
  • gen_promote()
  • Pushes pawn promote move onto move_stack
  • One move for each possible piece (Q, B, R, or N)
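
The move_stack these helpers push onto can be pictured as one flat list shared by all plies, with a gen_begin and gen_end index per ply. The sketch below assumes each ply's region starts where the previous ply's region ended; that layout, the class name, and the method names are assumptions for illustration.

```python
class MoveStack:
    """Flat move list with per-ply [gen_begin, gen_end) regions."""

    def __init__(self, max_ply: int = 32):
        self.moves = []
        self.gen_begin = [0] * max_ply
        self.gen_end = [0] * max_ply

    def generate(self, ply: int, new_moves: list) -> None:
        """Store the moves generated at `ply`, discarding deeper plies."""
        start = self.gen_begin[ply]
        del self.moves[start:]              # drop stale moves from this ply down
        self.moves.extend(new_moves)        # what gen_push() would append
        self.gen_end[ply] = len(self.moves)
        if ply + 1 < len(self.gen_begin):
            self.gen_begin[ply + 1] = self.gen_end[ply]

    def moves_at(self, ply: int) -> list:
        return self.moves[self.gen_begin[ply]:self.gen_end[ply]]


stack = MoveStack()
stack.generate(0, ["e2e4", "d2d4", "g1f3"])
stack.generate(1, ["e7e5", "c7c5"])
print(stack.moves_at(0))  # ['e2e4', 'd2d4', 'g1f3']
print(stack.moves_at(1))  # ['e7e5', 'c7c5']
```

Because only two indices per ply are needed, the structure suits a hardware pusher that writes sequentially into shared memory.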

54
move_stack
[Diagram: the move_stack shown before ply 0, after ply 0 (before ply 1), and after ply 1 (before ply 2). Each ply's moves (move 0, move 1, move 2, ..., move n) sit between that ply's gen_begin and gen_end pointers.]
55
HW/SW Breakdown
  • FPGA puts moves into stack structure
  • Currently done by gen(), gen_caps(), gen_push(),
    and gen_promote() functions
  • HC11 sorts stack structure
  • Currently done by sort() and sort_pv() functions

56
gen() Hardware
[Block diagram: move generator (FSM), pieces, board, pusher, moves, and the stack (in shared memory), with current_ply, gen_begin, and gen_end.]
57
Generator FSM
[State diagram: Generator FSM. Starting from Reset, the FSM has one group of states per piece type and ends at DONE.
Pawn: P move 1, P move 2, P take Left, P take Right.
Knight: N move 0 through N move 7.
Bishop: B move Forward Left, Forward Right, Back Left, Back Right.
Rook: R move Forward, Back, Left, Right.
Queen: Q move Forward, Back, Left, Right, Forward Left, Forward Right, Back Left, Back Right.
King: K move Forward, Back, Left, Right, Forward Left, Forward Right, Back Left, Back Right.
Special: Castle K side, Castle Q side, En Passant Left, En Passant Right.
Transitions carry repeat counts (x2, x7, x8). Legend: moves skipped by gen_caps(); moves checked by gen_promote().]
58
HCI
  • Jeff Barbieri

59
Possible Ideas
  • Interface Design 1
  • 8x8 Bar LED board on left displaying the piece
    that is in each location (period is black/white)
  • DIP Switches to select the from/to for a move
  • Clock to make the move
  • Interface Design 2
  • 8x8 Bar LED board on left displaying the piece
    that is in each location (period is black/white)
  • Button in each square next to the Bar LED to
    select the from and then to for a move
  • Clock to make the move

60
Interface Design Idea 1
[Diagram: board with From and To DIP switches, a Make Move button, latches and other logic, and a ribbon cable connection.]
61
Interface Design Idea 2
[Diagram: board with a Make Move button and a ribbon cable connection.]
62
Other Ideas
  • Beep to signify illegal move
  • Touch-screen for the board
  • LEDs to signify which player's move it is
  • Alternate board layouts (many of these)

63
Considerations
  • Selection of parts
  • Numeric, Alphanumeric, LCD, etc. LEDs
  • Push buttons
  • Latches
  • Costs for parts
  • Number of FPGA pins needed
  • Time to wire-wrap board
  • for parts

64
Considerations
  • Feasibility of design
  • Design of board in relation to design of chess
    game

65
Summary
  • Many possibilities
  • Two basic likely designs
  • Lots of thought and planning needs to go into
    design before acquiring parts and building

66
Software optimizations
  • Jonathan Hsieh

67
Software optimizations
  • All that remains is
  • sort 2.49% -> currently an O() algorithm; can be
    changed to a heap (log n / constant)
  • makemove 1.15% -> these will be slower
  • takeback 0.88% -> these will be slower
  • quiesce 0.38% -> will probably go up
  • search 0.19%
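
A sketch of the heap idea for move ordering, using Python's standard heapq module. Building the heap is linear in the number of moves and each "next best move" costs O(log n). The move strings and scores are made up, and this is not the project's sort() or sort_pv() code.

```python
import heapq


def best_first(scored_moves):
    """Yield moves from highest score to lowest.

    heapq is a min-heap, so scores are negated. The index breaks ties
    so that equal scores come out in their original order.
    """
    heap = [(-score, index, move)
            for index, (score, move) in enumerate(scored_moves)]
    heapq.heapify(heap)                      # O(n) to build
    while heap:
        _, _, move = heapq.heappop(heap)     # O(log n) per move taken
        yield move


moves = [(10, "e2e4"), (50, "d1h5"), (30, "g1f3")]
print(list(best_first(moves)))  # ['d1h5', 'g1f3', 'e2e4']
```

Since a search with cutoffs often examines only the first few moves, paying O(log n) per move actually taken can beat fully sorting the list up front.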

68
Algorithm improvements
  • search
  • Killer Heuristic (search attack branches first;
    already implemented)
  • think on opponent's time. (multi
    threading/interrupts! Do it)
  • history heuristic. (built in already?)
  • tighter searches (could be implemented)
  • refutation tables (not sure what they are)
  • transposition tables (not sure what they are)

69
  • Eval
  • Pawn formation hash table (not needed, in hw)
  • King safety hash table. (not needed, in hw)
  • Jinkies!
  • can probably squeeze out another 20% reduction:
    0.05 -> 0.04.
  • Idealized speedup target 25x!

70
Integration plan
[Flow diagram of integration stages: Software/Profiling, Baseline Stats, HW/SW Partitioning, FPGA Attack, FPGA Eval, FPGA Gen, FPGA HCI, HW/SW Interface, SW modification / optimization, CoSim, Physical Design, Integration/Debugging, Optimizations.]
71
Division of labor
If it weren't for those meddling kids!
72
Jon
  • Group leader
  • prevent him from dropping the class
  • Software guru (algorithm, HC11, HiWare)
  • strong software background
  • hates wires
  • overall system design
  • some hw/sw partitioning experience (research).

73
Jim
  • move generation hardware considerations
  • when wiring requirements dried up, we moved from
    individual projects to pairs -- volunteered to
    put thought into and implement
  • Wirewrap Whiz (Physical Interfacing)/ FPGA
    interface Whiz
  • doesn't care what he does and didn't volunteer
    for a Verilog job at first

74
Jeff
  • Project management software
  • Experience with lots of MS software.
  • HCI Hardware
  • display, inputs, verilog, (beeper?) and (wiring?)

75
Matt
  • Verilog Whiz
  • eval function design
  • wanted it and did a good job with it.
  • Memory Master
  • got FPGA demo 0 mem -> FPGA -> mem proof working.

76
  • Annie
  • Soldering / Wire-wrap Queen
  • wire wrapped everything a lot faster than Jim
  • HC11 hardware interface
  • Got that portion of demo 0 to work.
  • Attack function design.

77
Demonstration Plan
  • Demo 1 (week of 10/4)
  • Stats on baseline chess algorithm.
  • HW/SW partitioning and interfacing method.
  • Details about HW sub systems
  • eval / attack / gen / hci
  • (Co)simulation of separate parts of HW/SW
    partitioning

78
  • Demo 2 (Week of 11/1)
  • Frozen Physical Hardware
  • Co Simulation working with hw/sw
  • Chess that works and communicates.
  • Preliminary stats on new design
  • Optimizing / Debugging process

79
  • Final Demo 11/29
  • Optimizations and speedup statistics.
  • PC interface GUI (depending on interface)

80
Demo 1 Work Schedule
  • 9/17 F. Demo 0 completion
  • 9/20 M. Internal design review
  • 9/22 W. HW/SW partitioning details
  • 9/23 R. Design review
  • 9/29 W. HW/SW interfacing resolved
  • 10/4 M. Verilog simulations for FPGA stuff.

81
Demo 2 Work Schedule
  • 10/11 M. HW/SW integration/Co-simulation
  • 10/18 M Physical Hardware frozen
  • 10/25 Algorithm Optimizations
  • 11/1 Clock speed optimizations

82
Final Demo
  • Final review ready. Add more bells and whistles
  • Have one month for unpredicted delays..