From Combinational to Sequential Circuits to Simple Processors

Provided by: VahidSiam
Learn more at: http://web.cecs.pdx.edu
1
From Combinational to Sequential Circuits to
Simple Processors
2
Reminder: Embedded Systems
3
Outline
  • Introduction
  • Combinational logic
  • Sequential logic
  • FSM design
  • Custom single-purpose processor design
  • RT-level custom single-purpose processor design

4
SYNTHESIS METHODOLOGIES
5
Increasing abstraction level in design
specification
  • Higher abstraction level is the focus of
    hardware/software design evolution
  • Description is smaller/easier to capture
  • E.g., one line of sequential program code can
    translate to 1000 gates
  • Many more possible implementations available
  • (a) Like a flashlight: the higher above the ground,
    the more ground is illuminated
  • Sequential program designs may differ in
    performance/transistor count by orders of
    magnitude
  • Logic-level designs may differ by only a power of 2
  • (b) Design process proceeds to lower abstraction
    levels, narrowing in on a single implementation

6
What is Synthesis?
  • Automatically converting a system's behavioral
    description to a structural implementation
  • Complex whole formed by parts
  • Structural implementation must optimize design
    metrics
  • Synthesis tools are more expensive and more complex
    than compilers
  • Cost: $100s to $10,000s
  • User controls 100s of synthesis options
  • Optimization is critical
  • Otherwise could use software
  • Optimizations differ for each user
  • Run time: hours, days

7
Gajski's Y-chart
  • Each axis represents a type of description
  • Behavioral
  • Defines outputs as a function of inputs
  • Algorithms but no implementation
  • Structural
  • Implements behavior by connecting components with
    known behavior
  • Physical
  • Gives size/locations of components and wires on
    chip/board
  • Synthesis converts behavior at a given level to
    structure at the same level or lower
  • E.g.,
  • FSM → gates, flip-flops (same level)
  • FSM → transistors (lower level)
  • FSM ↛ registers, FUs (higher level, not allowed)
  • FSM ↛ processors, memories (higher level, not allowed)

FU: functional unit; FSM: finite state machine
8
Example of Custom Processor
  • Processor
  • Digital circuit that performs computation tasks
  • Controller and datapath
  • General-purpose: variety of computation tasks
  • Single-purpose: one particular computation task
  • Custom single-purpose: non-standard task
  • A custom single-purpose processor may be
  • Fast, small, low power
  • But, high NRE, longer time-to-market, less
    flexible

9
CMOS transistor on silicon
  • Transistor
  • The basic electrical component in digital systems
  • Acts as an on/off switch
  • Voltage at gate controls whether current flows
    from source to drain
  • Don't confuse this gate with a logic gate

10
CMOS transistor implementations
  • Complementary Metal Oxide Semiconductor
  • We refer to logic levels
  • Typically 0 is 0V, 1 is 5V
  • Two basic CMOS types
  • nMOS conducts if gate = 1
  • pMOS conducts if gate = 0
  • Hence "complementary"
  • Basic gates
  • Inverter, NAND, NOR

11
Basic logic gates
F = x y (AND)
F = x ⊕ y (XOR)
F = x (Driver)
F = x + y (OR)
F = (x y)' (NAND)
F = x' (Inverter)
F = (x + y)' (NOR)
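The gates listed on this slide can be modeled as one-bit functions. The following is a minimal sketch for experimenting with them; the function names (gate_and, gate_nor, and so on) are illustrative and do not come from the slides.

```c
#include <stdio.h>

/* Each gate takes 1-bit inputs (0 or 1) and returns a 1-bit output.
   The function names are illustrative, not from the slides. */
int gate_driver(int x)        { return x & 1; }
int gate_inverter(int x)      { return !(x & 1); }
int gate_and(int x, int y)    { return (x & y) & 1; }
int gate_or(int x, int y)     { return (x | y) & 1; }
int gate_xor(int x, int y)    { return (x ^ y) & 1; }
int gate_nand(int x, int y)   { return !gate_and(x, y); }
int gate_nor(int x, int y)    { return !gate_or(x, y); }

/* Print the truth table of every two-input gate. */
void print_gate_truth_table(void)
{
    printf("x y | AND OR XOR NAND NOR\n");
    for (int x = 0; x <= 1; x++) {
        for (int y = 0; y <= 1; y++) {
            printf("%d %d |  %d   %d   %d    %d    %d\n",
                   x, y, gate_and(x, y), gate_or(x, y), gate_xor(x, y),
                   gate_nand(x, y), gate_nor(x, y));
        }
    }
}
```

Calling print_gate_truth_table() from a main function lists all four input combinations, which is a quick way to confirm that NAND and NOR are the complements of AND and OR.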
12
Combinational logic design
A) Problem description: y is 1 if a is 1, or
b and c are 1. z is 1 if b or c is 1, but not
both, or if all are 1.
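One way to go from this wording to logic equations is to enumerate the truth table and compare it against candidate equations. The sketch below does that; the candidate sum-of-products forms y = a + bc and z = ab + b'c + bc' are an assumption made here for illustration (the slide's own derivation is not in this transcript), and all function names are hypothetical.

```c
#include <stdio.h>

/* Outputs exactly as worded in the problem description. */
int y_spec(int a, int b, int c) { return a || (b && c); }
int z_spec(int a, int b, int c) { return (b != c) || (a && b && c); }

/* Candidate sum-of-products equations (assumed for this sketch):
   y = a + bc,  z = ab + b'c + bc' */
int y_sop(int a, int b, int c) { return a | (b & c); }
int z_sop(int a, int b, int c) { return (a & b) | (!b & c) | (b & !c); }

/* Prints the truth table and returns 1 if the candidate equations
   match the specification on all eight input combinations. */
int check_all_inputs(void)
{
    int ok = 1;
    printf("a b c | y z\n");
    for (int i = 0; i < 8; i++) {
        int a = (i >> 2) & 1, b = (i >> 1) & 1, c = i & 1;
        int y = y_spec(a, b, c), z = z_spec(a, b, c);
        printf("%d %d %d | %d %d\n", a, b, c, y, z);
        if (y != y_sop(a, b, c) || z != z_sop(a, b, c))
            ok = 0;
    }
    return ok;
}
```

With only three inputs, exhaustive comparison is cheap, so any candidate minimization can be checked the same way before gates are drawn.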
13
Combinational components
Students should be able to use all kinds of
combinational blocks in synthesis of various
problems
14
Levels of synthesis
  • Logic-level behavior to structural implementation
  • Logic equations and/or FSM to connected gates
  • Combinational logic synthesis
  • Two-level minimization (Sum of products/product
    of sums)
  • Best possible performance
  • Longest path: 2 gates (AND gate + OR gate / OR
    gate + AND gate)
  • Minimize size
  • Minimum cover
  • Minimum cover that is prime
  • Heuristics
  • Multilevel minimization
  • Trade performance for size
  • Pareto-optimal solution
  • Heuristics
  • FSM synthesis and Control Unit Synthesis
  • State minimization
  • State encoding
  • State decomposition
  • Special architectures

15
Minimum Cover
16
Two-level logic minimization
  • Represent logic function as sum of products (or
    product of sums)
  • AND gate for each product
  • OR gate for each sum
  • Gives best possible performance
  • At most 2 gate delays
  • Goal: minimize size
  • Minimum cover
  • Minimum # of AND gates (sum of products)
  • Minimum cover that is prime
  • Minimum # of inputs to each AND gate (sum of
    products)

17
Minimum cover
  • Minimum # of AND gates (sum of products)
  • Literal: variable or its complement
  • a or a', b or b', etc.
  • Minterm: product of literals
  • Each variable appears exactly once
  • abc'd', ab'cd, a'bcd, etc.
  • Implicant: product of literals
  • Each variable appears no more than once
  • abc'd', a'cd, etc.
  • Covers 1 or more minterms
  • a'cd covers a'bcd and a'b'cd
  • Cover: set of implicants that covers all minterms
    of function
  • Minimum cover: cover with minimum # of implicants
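The definitions above can be made concrete with a small bit-mask sketch. It uses the function F = abc'd' + a'cd + ab'cd from the K-map example in these slides; the Implicant struct, the helper names, and the bit assignment (a, b, c, d on bits 3 to 0) are assumptions made for illustration.

```c
/* Variables a, b, c, d map to bits 3, 2, 1, 0 of a 4-bit minterm index.
   An implicant is a product of literals: `mask` marks which variables
   appear, `value` gives their required polarity (1 = true, 0 = complemented).
   This encoding is an assumption of the sketch, not taken from the slides. */
typedef struct {
    unsigned mask;
    unsigned value;
} Implicant;

/* An implicant covers a minterm if the minterm agrees with it on every
   variable that appears in the implicant. */
int covers(Implicant imp, unsigned minterm)
{
    return (minterm & imp.mask) == imp.value;
}

/* Returns 1 if every minterm is covered by at least one implicant. */
int is_cover(const Implicant *imps, int n_imps,
             const unsigned *minterms, int n_minterms)
{
    for (int m = 0; m < n_minterms; m++) {
        int found = 0;
        for (int i = 0; i < n_imps; i++) {
            if (covers(imps[i], minterms[m]))
                found = 1;
        }
        if (!found)
            return 0;
    }
    return 1;
}

/* Example: F = abc'd' + a'cd + ab'cd */
const Implicant example_cover[3] = {
    {0xF, 0xC},  /* abc'd' : a=1 b=1 c=0 d=0        */
    {0xB, 0x3},  /* a'cd   : a=0 c=1 d=1, b absent  */
    {0xF, 0xB},  /* ab'cd  : a=1 b=0 c=1 d=1        */
};

/* Minterms of F: abc'd', a'bcd, a'b'cd, ab'cd */
const unsigned example_minterms[4] = {0xC, 0x7, 0x3, 0xB};
```

Here covers(example_cover[1], 0x7) and covers(example_cover[1], 0x3) both hold, which is the statement that a'cd covers a'bcd and a'b'cd, while is_cover confirms that the three implicants together cover all four minterms.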

18
Minimum cover K-map approach
  • Karnaugh map (K-map)
  • 1 represents minterm
  • Circle represents implicant
  • Minimum cover
  • Covering all 1s with min # of circles
  • Example: direct vs. min cover
  • Fewer gates
  • 4 vs. 5
  • Fewer transistors
  • 28 vs. 40

[Figure: K-map of the sum of products and K-map of the minimum cover]
Minimum cover: F = abc'd' + a'cd + ab'cd
Minimum cover implementation: two 4-input AND gates, one 3-input AND
gate, and one 3-input OR gate → 28 transistors
19
Minimum cover that is a prime cover
  • Minimum # of inputs to AND gates
  • Prime implicant
  • Implicant not covered by any other implicant
  • Max-sized circle in K-map
  • Minimum cover that is prime
  • Covering with min # of prime implicants
  • Min # of max-sized circles
  • Example: prime cover vs. min cover
  • Same # of gates
  • 4 vs. 4
  • Fewer transistors
  • 26 vs. 28

20
Minimum cover heuristics
  • K-maps give optimal solution every time
  • Functions with > 6 inputs too complicated
  • Use computer-based tabular method
  • Finds all prime implicants
  • Finds min cover that is prime
  • Also optimal solution every time
  • Problem: 2^n minterms for n inputs
  • 32 inputs = 4 billion minterms
  • Exponential complexity
  • Heuristic
  • Solution technique where optimal solution is not
    guaranteed
  • Hopefully comes close

21
Heuristics iterative improvement
  • Start with initial solution
  • i.e., original logic equation
  • Repeatedly make modifications toward better
    solution
  • Common modifications
  • Expand
  • Replace each nonprime implicant with a prime
    implicant covering it
  • Delete all implicants covered by new prime
    implicant
  • Reduce
  • Opposite of expand
  • Reshape
  • Expands one implicant while reducing another
  • Maintains total # of implicants
  • Irredundant
  • Selects min # of implicants that cover from
    existing implicants
  • Synthesis tools differ in modifications used and
    the order they are used

22
Multilevel logic minimization
  • Trade performance for size
  • Increase delay for lower # of gates
  • Gray area represents all possible solutions
  • Circle with X represents ideal solution
  • Generally not possible
  • 2-level gives best performance
  • Max delay = 2 gates
  • Solve for smallest size
  • Multilevel gives Pareto-optimal solution
  • Minimum delay for a given size
  • Minimum size for a given delay

[Figure: size vs. delay plot marking the 2-level minimization and
multi-level minimization solutions]
23
Example of logic factorization
  • Minimized 2-level logic function
  • F = adef + bdef + cdef + gh
  • Requires 5 gates with 18 total gate inputs
  • 4 ANDs and 1 OR
  • After algebraic manipulation
  • F = (a + b + c)def + gh
  • Requires only 4 gates with 11 total gate inputs
  • 2 ANDs and 2 ORs
  • Fewer inputs per gate
  • Assume each gate input = 2 transistors
  • Reduced by 14 transistors
  • 36 (18 × 2) down to 22 (11 × 2)
  • Sacrifices performance for size
  • Inputs a, b, and c now have 3-gate delay
  • Iterative improvement heuristic commonly used
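The factorization on this slide can be checked exhaustively, since the function has only eight inputs. The sketch below compares the two-level form adef + bdef + cdef + gh with the factored form (a + b + c)def + gh on all 256 input combinations and reproduces the transistor arithmetic; the function names are illustrative, and the 2-transistors-per-input figure is the slide's own assumption.

```c
/* Two-level form: F = adef + bdef + cdef + gh */
int f_two_level(int a, int b, int c, int d, int e, int f, int g, int h)
{
    return (a & d & e & f) | (b & d & e & f) | (c & d & e & f) | (g & h);
}

/* Factored multilevel form: F = (a + b + c)def + gh */
int f_factored(int a, int b, int c, int d, int e, int f, int g, int h)
{
    return ((a | b | c) & d & e & f) | (g & h);
}

/* Compares both forms on all 2^8 input combinations and returns the
   number of combinations on which they disagree. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (int i = 0; i < 256; i++) {
        int a = (i >> 7) & 1, b = (i >> 6) & 1, c = (i >> 5) & 1;
        int d = (i >> 4) & 1, e = (i >> 3) & 1, f = (i >> 2) & 1;
        int g = (i >> 1) & 1, h = i & 1;
        if (f_two_level(a, b, c, d, e, f, g, h) !=
            f_factored(a, b, c, d, e, f, g, h))
            mismatches++;
    }
    return mismatches;
}

/* Transistor estimate under the slide's assumption of 2 transistors
   per gate input. */
int transistors(int total_gate_inputs)
{
    return 2 * total_gate_inputs;
}
```

count_mismatches() returning 0 means the two forms compute the same function, and transistors(18) - transistors(11) gives the 14-transistor saving quoted above.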

24
Control automata
[Figure: three control automaton structures (Variant 1, Variant 2,
Variant 3) built from a counter, a register, a small FSM, and ROM or
similar logic; signal labels: inputs, address of outputs, page, outputs]
25
Control automata
[Figure: two further control automaton structures (Variant 4, Variant 6)
in which a small FSM drives the load/count control of a register/counter
("load new address") that addresses ROM or similar logic; signal labels:
inputs, outputs, address of outputs]
26
FSM synthesis
27
FSM synthesis
  • FSM to gates
  • State minimization
  • Reduce # of states
  • Identify and merge equivalent states
  • Outputs, next states same for all possible inputs
  • Tabular method gives exact solution
  • Table of all possible state pairs
  • If n states, n^2 table entries
  • Thus, heuristics used with large # of states
  • State encoding
  • Unique bit sequence for each state
  • If n states, log2(n) bits
  • n! possible encodings
  • Thus, heuristics common
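The growth figures for state encoding can be tried out numerically. The sketch below computes the minimum state-register width (log2(n) rounded up) and counts the possible assignments of distinct codes to states; when n is a power of two this count equals n!, matching the figure on the slide. The function names are hypothetical, and rounding up for non-power-of-two n is an assumption of the sketch.

```c
/* Minimum number of state-register bits for n states: log2(n), rounded up. */
unsigned min_state_bits(unsigned n_states)
{
    unsigned bits = 0;
    while ((1u << bits) < n_states)
        bits++;
    return bits;
}

/* Number of ways to assign distinct minimum-width codes to n states:
   codes * (codes - 1) * ... * (codes - n + 1), where codes = 2^bits.
   For n a power of two this is n!. Overflows 64 bits for large n, so
   this is only meant for small examples. */
unsigned long long count_encodings(unsigned n_states)
{
    unsigned long long codes = 1ull << min_state_bits(n_states);
    unsigned long long count = 1;
    for (unsigned i = 0; i < n_states; i++)
        count *= (codes - i);
    return count;
}
```

For example, a 13-state FSM needs min_state_bits(13) = 4 bits, and even a 4-state FSM already has count_encodings(4) = 24 possible encodings, which is why exhaustive search over encodings quickly becomes impractical.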

28
Sequential components
  • Shift register: Q = lsb; content shifted; I stored in msb
  • Register: Q = 0 if clear = 1; I if load = 1 and
    clock = 1; Q(previous) otherwise
  • Counter: Q = 0 if clear = 1; Q(prev) + 1 if count = 1 and
    clock = 1
  • Reversible shifter: shifts left and right
  • Reversible counter: counts up and down
  • Reading is an operation in most registers
    (generalized registers)
29
Sequential logic design
A) Problem description: You want to construct a
clock divider. Slow down your pre-existing clock
so that you output a 1 for every four clock cycles.
  • Given this implementation model
  • Sequential logic design quickly reduces to
    combinational logic design
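The reduction to combinational logic can be seen in a small software model of the clock divider: a 2-bit state register plus two purely combinational functions for the next state and the output. This is a sketch under assumptions, not the slide's circuit: it omits any enable input, raises the output in state 3, and uses hypothetical names throughout.

```c
/* 2-bit state register of the divider (states 0..3). */
typedef struct {
    unsigned state;
} ClockDivider;

/* Combinational logic: next state as a function of the current state. */
unsigned next_state(unsigned state)
{
    return (state + 1) & 0x3;
}

/* Combinational logic: output is 1 only in state 3. */
unsigned divider_output(unsigned state)
{
    return state == 3;
}

/* One rising clock edge: sample the output, then load the next state. */
unsigned clock_tick(ClockDivider *d)
{
    unsigned out = divider_output(d->state);
    d->state = next_state(d->state);
    return out;
}

/* Simulates n_cycles clock cycles from reset (state 0) and returns the
   number of cycles in which the output was 1. */
unsigned count_pulses(unsigned n_cycles)
{
    ClockDivider d = {0};
    unsigned pulses = 0;
    for (unsigned i = 0; i < n_cycles; i++)
        pulses += clock_tick(&d);
    return pulses;
}
```

Once the state register is fixed, the remaining design work is only next_state and divider_output, each a small truth table over two state bits.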

30
Sequential logic design (cont.)
31
Custom single-purpose processor basic model
32
Example greatest common divisor
33
Example greatest common divisor
  • First create algorithm
  • Convert algorithm to complex state machine
  • Known as FSMD: finite-state machine with datapath
  • Can use templates to perform such conversion

(b) Desired functionality

    0: int x, y;
    1: while (1) {
    2:   while (!go_i);
    3:   x = x_i;
    4:   y = y_i;
    5:   while (x != y) {
    6:     if (x < y)
    7:       y = y - x;
           else
    8:       x = x - y;
         }
    9:   d_o = x;
       }

(c) State diagram [figure not reproduced in this transcript]
State diagram templates
35
Creating the datapath
  • Create a register for any declared variable
  • Create a functional unit for each arithmetic
    operation
  • Connect the ports, registers and functional units
  • Based on reads and writes
  • Use multiplexors for multiple sources
  • Create a unique identifier for each datapath
    component control input and output

36
Creating the controller's FSM
  • Same structure as FSMD
  • Replace complex actions/conditions with datapath
    configurations

37
Splitting into a controller and datapath
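[Figure: controller FSM with input go_i. States, state encodings, and
control outputs recovered from the figure labels:]
  • 1 (0000): branches on 1 / !1
  • 2 (0001): branches on !go_i / !(!go_i)
  • 2-J (0010)
  • 3 (0011): x_sel = 0, x_ld = 1
  • 4 (0100): y_sel = 0, y_ld = 1
  • 5 (0101): branches on x_neq_y = 1 / x_neq_y = 0
  • 6 (0110): branches on x_lt_y = 1 / x_lt_y = 0
  • 7 (0111): y_sel = 1, y_ld = 1
  • 8 (1000): x_sel = 1, x_ld = 1
  • 6-J (1001)
  • 5-J (1010)
  • 9 (1011): d_ld = 1
  • 1-J (1100)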
38
Controller state table for the GCD example
39
Completing the GCD custom single-purpose
processor design
  • We finished the datapath
  • We have a state table for the next state and
    control logic
  • All that's left is combinational logic design
  • This is not an optimized design, but we see the
    basic steps

You may be asked in homework, exams, or
projects to optimize the design with respect to
a metric such as area, speed, power, or testability.
40
Example Bus Bridge Design
41
RT-level custom single-purpose processor design
Example Bus Bridge
  • We often start with a state machine
  • Rather than algorithm
  • Cycle timing often too central to functionality
  • Example
  • Bus bridge that converts 4-bit bus to 8-bit bus
  • Start with FSMD
  • Known as register-transfer (RT) level
  • Exercise: complete the design

42
RT-level custom single-purpose processor design
(cont)
Example Bus Bridge
43
Optimization in Synthesis
44
Optimizing single-purpose processors
  • Optimization is the task of making design metric
    values the best possible
  • Optimization opportunities
  • original program
  • FSMD
  • datapath
  • FSM

45
Optimizing the original program
  • Analyze program attributes and look for areas of
    possible improvement
  • number of computations
  • size of variable
  • time and space complexity
  • operations used
  • multiplication and division very expensive

46
Optimizing the original program (cont)
original program

    0: int x, y;
    1: while (1) {
    2:   while (!go_i);
    3:   x = x_i;
    4:   y = y_i;
    5:   while (x != y) {
    6:     if (x < y)
    7:       y = y - x;
           else
    8:       x = x - y;
         }
    9:   d_o = x;
       }

optimized program

    0: int x, y, r;
    1: while (1) {
    2:   while (!go_i);
         // x must be the larger number
    3:   if (x_i > y_i) {
    4:     x = x_i;
    5:     y = y_i;
         }
    6:   else {
    7:     x = y_i;
    8:     y = x_i;
         }
    9:   while (y != 0) {
   10:     r = x % y;
   11:     x = y;
   12:     y = r;
         }
   13:   d_o = x;
       }

Replace the subtraction operation(s) with a modulo
operation in order to speed up the program.
Original: GCD(42, 8) takes 9 iterations to complete the loop. x
and y values are evaluated as follows: (42, 8), (34, 8),
(26, 8), (18, 8), (10, 8), (2, 8), (2, 6), (2, 4),
(2, 2).
Optimized: GCD(42, 8) takes 3 iterations to complete the loop. x
and y values are evaluated as follows: (42, 8),
(8, 2), (2, 0).
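The iteration counts quoted for GCD(42, 8) can be reproduced with a short sketch that counts how many (x, y) pairs each loop condition evaluates, including the final pair that ends the loop. The function names and the evals out-parameter are illustrative, and the zero-input guard in the subtraction version is an addition of this sketch, since that loop does not terminate on a zero input.

```c
/* Subtraction-based GCD. *evals receives the number of (x, y) pairs
   tested by the loop condition (x != y), including the last one. */
unsigned gcd_subtract(unsigned x, unsigned y, unsigned *evals)
{
    unsigned n = 0;
    if (x == 0 || y == 0) {      /* guard added for this sketch */
        if (evals) *evals = 0;
        return x ? x : y;
    }
    for (;;) {
        n++;                     /* one evaluation of (x != y) */
        if (x == y)
            break;
        if (x < y)
            y = y - x;
        else
            x = x - y;
    }
    if (evals) *evals = n;
    return x;
}

/* Modulo-based GCD. *evals receives the number of (x, y) pairs
   tested by the loop condition (y != 0), including the last one. */
unsigned gcd_modulo(unsigned x_i, unsigned y_i, unsigned *evals)
{
    unsigned x, y, r, n = 0;
    /* x must be the larger number */
    if (x_i > y_i) { x = x_i; y = y_i; }
    else           { x = y_i; y = x_i; }
    for (;;) {
        n++;                     /* one evaluation of (y != 0) */
        if (y == 0)
            break;
        r = x % y;
        x = y;
        y = r;
    }
    if (evals) *evals = n;
    return x;
}
```

For inputs 42 and 8 both functions return 2, with 9 evaluated pairs for the subtraction version and 3 for the modulo version. The trade-off is that the modulo version needs a divider in the datapath, which is a more expensive functional unit than a subtractor.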
47
Optimizing the FSMD
  • Areas of possible improvements
  • merge states
  • states with constants on transitions can be
    eliminated, transition taken is already known
  • states with independent operations can be merged
  • separate states
  • states which require complex operations (a*b*c*d)
    can be broken into smaller states to reduce
    hardware size
  • scheduling

48
Optimizing the FSMD (cont.)
[Figure: original FSMD (states 1, 2, 2-J, 3, 4, 5, 6, 7, 8, 6-J, 5-J, 9,
1-J) next to the optimized FSMD (states 2, 3, 5, 7, 8, 9); both declare
int x, y]
  • Eliminate state 1: transitions have constant
    values
  • Merge state 2 and state 2-J: no loop operation in
    between them
  • Merge state 3 and state 4: assignment operations
    are independent of one another
  • Merge state 5 and state 6: transitions from
    state 6 can be done in state 5
  • Eliminate states 5-J and 6-J: transitions from each
    state can be done from state 7 and state 8,
    respectively
  • Eliminate state 1-J: transition from state 1-J
    can be done directly from state 9
  • Optimized FSMD: state 2 waits while !go_i; state 3
    performs x = x_i and y = y_i; state 5 branches on
    x < y, x > y, or !(x != y); state 7 performs
    y = y - x; state 8 performs x = x - y; state 9
    performs d_o = x
49
Optimizing the datapath
  • Sharing of functional units
  • one-to-one mapping, as done previously, is not
    necessary
  • if same operation occurs in different states,
    they can share a single functional unit
  • Multi-functional units
  • ALUs support a variety of operations, so one ALU
    can be shared among operations occurring in
    different states

50
Optimizing the FSM
  • State encoding
  • task of assigning a unique bit pattern to each
    state in an FSM
  • size of state register and combinational logic
    vary
  • can be treated as an ordering problem
  • State minimization
  • task of merging equivalent states into a single
    state
  • two states are equivalent if, for all possible
    input combinations, they generate the same
    outputs and transition to the same next state

51
Technology mapping
  • Library of gates available for implementation
  • Simple
  • only 2-input AND, OR gates
  • Complex
  • various-input AND, OR, NAND, NOR, etc. gates
  • Efficiently implemented meta-gates (e.g.,
    AND-OR-INVERT, MUX)
  • Final structure consists of the specified library's
    components only
  • If technology mapping integrated with logic
    synthesis
  • More efficient circuit
  • More complex problem
  • Heuristics required

52
Complexity impact on user
  • As complexity grows, heuristics used
  • Heuristics differ tremendously among synthesis
    tools
  • Computationally expensive
  • Higher quality results
  • Variable optimization effort settings
  • Long run times (hours, days)
  • Requires huge amounts of memory
  • Typically needs to run on servers, workstations
  • Fast heuristics
  • Lower quality results
  • Shorter run times (minutes, hours)
  • Smaller amount of memory required
  • Could run on PC
  • Super-linear-time (e.g., n^3) heuristics usually
    used
  • User can partition large systems to reduce run
    times/size
  • 100^3 > 50^3 + 50^3 (1,000,000 > 250,000)

53
Integrating logic design and physical design
  • Past
  • Gate delay much greater than wire delay
  • Thus, performance evaluated as # of levels of
    gates only
  • Today
  • Gate delay shrinking as feature size shrinking
  • Wire delay increasing
  • Performance evaluation needs wire length
  • Transistor placement (needed for wire length)
    domain of physical design
  • Thus, simultaneous logic synthesis and physical
    design required for efficient circuits

54
Embedded Systems Case Study
Elevator Controller
55
56
Elevator System
  • CRC cards are a well-known method for analyzing a
    system and developing an architecture.
  • CRC
  • Classes: logical groupings of data and
    functionality
  • Responsibilities: describe what the classes do
  • Collaborators: other classes with which a given
    class works
  • Elevator Control Classes
  • Elevator car, Passenger, Floor control, Car
    control, Car sensors, etc.
  • Architectural Classes
  • Car state, Floor control reader, Car control
    reader, Car control sender, Scheduler

57
F floors N hoistways
58
59
60
61
62
Physical Interfaces
63
64
Architecture
  • Computation and I/O occur at
  • Floor control panels/displays
  • Elevator cars
  • System controller
  • Panels controller: reads buttons and sends events
    to the system controller
  • Car controller: reads sensor inputs and sends them
    to the system controller

65
System Controller
  • Must take inputs from many sources
  • Must control cars to hard real-time deadlines
  • User interface, scheduling are soft deadlines
  • Testing
  • Build an elevator simulator using SystemC,
    Verilog, VHDL and FPGA
  • Simulate multiple elevators
  • Simulate real-time control demands

66
Homework
  • The simplest possible custom single-purpose
    processor
  • Design a processor to multiply two numbers. The
    initial data are in registers/counters A and B.
    The result should be in register/counter C.
  • You have only reversible counters (with reading)
    to be used in the data path.
  • The counters perform the following operations
  • Add one
  • Subtract one
  • Read new value
  • Invent the algorithm for multiplication. Use
    minimum number of counters
  • Design the reversible counter by hand using logic
    gates and D FFs.
  • Design the control unit
  • Design the data path
  • Draw the timing diagram of the whole system.
  • You can use VHDL or Verilog to help you, but I
    need your design by hand.

67
Summary
  • Custom single-purpose processors
  • Straightforward design techniques
  • Can be built to execute algorithms
  • Typically start with FSMD
  • CAD tools can be of great assistance

68
Questions to Exams (1)
  1. What are the main methods of Combinational logic
    design?
  2. What is a Mealy FSM (Finite State Machine)?
  3. What is a Moore state machine?
  4. Think about a robot controller as a sequential
    logic circuit. What are the blocks and their
    roles?
  5. What is the role of abstraction in FSM design? Give examples.
  6. Explain the concepts from Gajski's Y-chart in a
    custom single-purpose processor design.
  7. RT-level custom single-purpose processor design.
    Explain briefly all design stages from bottom of
    design hierarchy (layout) to the top (system
    design of a GCD processor as an example)
  8. List and explain logic gates.
  9. List and explain combinational blocks.
  10. List and explain sequential blocks.
  11. List and explain sensors to be used with embedded
    systems of FSM type.
  12. List and explain actuators to be used with such
    embedded systems.

69
Questions to Exams (2)
  • What are the main synthesis processes and CAD
    tools in Combinational logic design?
  • What are the methods to solve the covering
    problem?
  • Explain the concept of search and give examples.
  • Explain the concept of heuristic in search and
    give examples. SOP minimization can be very
    useful. Also ESOP.
  • Explain design tradeoffs and Pareto Optimization
    on one practical example.
  • Explain in detail on example the basic synthesis
    method for Mealy FSM from specification to a
    circuit from D type flip-flops (FFs) and logic
    gates.
  • Explain and illustrate how D, T and JK flip-flops
    work.
  • What is the difference between
  • Register with enable
  • Register without enable
  • Reversible register
  • Draw the schematic of the FSMD.
  • Explain Euclid's GCD algorithm on examples.
  • Without looking at the slides, convert the GCD
    algorithm to an FSMD.
  • How can we optimize GCD?
  • Apply these ideas to the Least Common Multiple
    algorithm and FSMD for two numbers.

70
Questions to Exams (3)
  1. The role of GO-TO commands in FSMD design. Are
    they good or bad? Give examples. The role of
    structured design of FSMD.
  2. How is the data path created from the FSMD? This is
    one of the main topics for this whole class. You have
    to know it well.
  3. How is the CU (Control Unit) created from the FSMD? This
    is one of the main topics for this whole class. You
    have to know it well.
  4. Compare state graph, state transition table and
    flow-chart. Why do we need all of them?
  5. In this class we are not optimizing combinational
    logic or FSMs too much. But if you have taken ECE
    572 or ECE 573 classes you know many methods to
    optimize on these levels. Can you give practical
    examples of these optimizations in GCD or other
    similar system?
  6. Complete the Bus bridge FSMD that converts
    4-bit bus to 8-bit bus and is given in these
    slides.
  7. Discuss Optimizing the single-purpose processors.
    Give examples. Explain levels of optimization,
    such as the original program, the FSMD, the data
    path, the CU, the register, the combinational
    logic, finally the technology mapping.
  8. Design the complete elevator system for a villa
    of a crazy millionaire artist from Hollywood.
    Cost does not count. You have to amaze his
    guests.

71
Sources
  • EECE 353-1
  • Real-Time Systems
  • T. John Koo
  • Embedded Computing Systems Laboratory
  • Institute for Software Integrated Systems
  • Department of Electrical Engineering and Computer
    Science
  • Vanderbilt University
  • 5306 Stevenson Center
  • January 16, 2006
  • john.koo_at_vanderbilt.edu

Slides from S. Mohammadi Vahid, Siamak Mohammadi
Givargis and Marwedel
72
What can we cover at Monday's meeting?
  1. Design of SOP circuits from KMaps. Prime
    implicants and Covering
  2. Design of POS circuits from KMaps. Prime
    implicates and Covering
  3. Design of ESOP circuits from KMaps. Algebraic
    rules for AND/EXOR logic.
  4. Design using NAND and NOR gates. De Morgan Rules.
  5. Factorization.
  6. Multiplexers.
  7. Iterative circuits and their types.
  8. Using State Machines to design one-directional
    iterative circuits
  9. Predicates
  10. Oracles
  11. SAT oracles
  12. Graph Coloring oracles and distributed processors
  13. SENDMOREMONEY problem and its oracle.
  14. The idea of Constraint Satisfaction and
    Distributed Software/hardware for it.