Title: Computer Organization and CPUs AMS I-2.1.2 Fall 2004
1Computer Organization and CPUs AMS I-2.1.2 Fall 2004
- Greg Phillips
- greg.phillips_at_rmc.ca
- Royal Military College of Canada
- Electrical and Computer Engineering
2Hardware module
- topics
- fundamentals of digital systems
- computer system organization and CPUs
- gate logic (lab)
- parallel and serial interconnects
- core memory
- secondary storage
- display and input devices
3Typical System Architecture
[Figure: typical system architecture diagram. Labels: core memory; this lecture; display and input devices; serial and parallel interconnects; ISA devices; secondary storage]
4What's a CPU?
- a central processing unit (CPU) is at the heart of all modern digital computers
- a CPU is a complex arrangement of combinational and sequential logic
  - combinational: outputs depend only on current inputs
  - sequential: has memory, is normally controlled by a clock
- with a simple function
  - fetch instruction → decode instruction → execute instruction
  - each instruction is a simple operation on binary symbols
  - e.g., add two numbers together, take the logical AND of two bit patterns, read from or write to memory, etc.
- almost all modern CPUs are implemented as microprocessors, that is, as single integrated circuits
5Major CPU components
- register section
  - contains registers (memory local to the CPU) necessary for CPU functioning, including programmer-visible registers and internal registers
- CPU interconnect
  - interconnects the registers and connects to memory
- arithmetic-logic unit (ALU)
  - contains circuitry to perform data manipulation operations (ADD, AND)
- control unit
  - provides appropriate signals to allow the CPU to step through its required operational states
6States and clocks
- CPUs operate as state machines
  - in each state they do exactly one thing
  - which state will be next is determined by the control unit and is based on previous state and register contents
- CPUs change state each time the clock ticks
  - CPU performance is largely (but not entirely) governed by clock speed
  - current CPU clocks are in the 2.5 GHz range
7Fetch, decode, execute
- fetch cycle
  - fetch an instruction from memory, store it in an instruction register
- decode cycle
  - the instruction register is fed into a (combinational) decoder to enable the relevant instruction logic
- execute cycle
  - execute the instruction
  - then fetch the next
CPU and CPU state diagrams from Computer Systems Organization and Architecture by John D. Carpinelli
8Specification of a very simple CPU
- suppose we want a CPU that
  - works with data values between 0 and 255
  - accesses up to 64 distinct data values
    - this memory stores programs and data
  - includes instructions to
    - add values together
    - perform a logical AND of values
    - increment a value
    - jump to a new program instruction
- design considerations
  - how many bits does each data value need?
  - how many bits do we need to encode memory addresses?
  - how many bits do we need to encode the instructions?
Practical Note: This CPU is too simple to be useful: it has no instruction that provides output! But it is complex enough for an example.
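The three design questions above come down to counting how many distinct codes each field must hold. A minimal sketch of that arithmetic follows; the helper name `bits_needed` is made up for illustration and is not part of the lecture.

```python
def bits_needed(distinct_codes: int) -> int:
    """Smallest number of bits that can represent `distinct_codes` different codes."""
    # (n - 1).bit_length() is an exact integer version of ceil(log2(n)).
    return max(1, (distinct_codes - 1).bit_length())

data_bits = bits_needed(256)     # data values 0..255
address_bits = bits_needed(64)   # 64 memory locations
opcode_bits = bits_needed(4)     # four instructions: add, AND, increment, jump

print(data_bits, address_bits, opcode_bits)
```

Under these assumptions an opcode plus one address takes 2 + 6 = 8 bits, so one instruction fits in a single 8-bit memory location.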
9Programmer-visible design
- user-accessible register
  - AC (8-bit accumulator) for math/logic functions
- system register
  - PC (6-bit program counter) represents the address of the next instruction to be executed
- instruction set (instruction bits numbered 7 6 5 4 3 2 1 0)
  - ADD  00aaaaaa  AC ← AC + M[aaaaaa]
  - AND  01aaaaaa  AC ← AC AND M[aaaaaa]
  - JMP  10aaaaaa  PC ← aaaaaa
  - INC  11xxxxxx  AC ← AC + 1
Practical Note: We can provide a crude form of output by connecting a light to each bit of the AC and PC registers; this is what the blinking lights on the face of (very) old computers were for.
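To make the instruction encoding concrete, here is a small instruction-level sketch of this CPU. It assumes the top two bits are the opcode and the low six bits the address, that ADD and INC wrap around modulo 256, and that the PC is incremented during fetch and wraps at 64; the slides do not state the overflow behaviour, so treat those details as assumptions. The names `encode` and `step` are illustrative only.

```python
ADD, AND, JMP, INC = 0b00, 0b01, 0b10, 0b11

def encode(opcode: int, address: int = 0) -> int:
    """Pack a 2-bit opcode and a 6-bit address into one 8-bit instruction."""
    return ((opcode & 0b11) << 6) | (address & 0b111111)

def step(ac: int, pc: int, memory: list) -> tuple:
    """Fetch, decode and execute one instruction; return the new (AC, PC)."""
    instruction = memory[pc]          # fetch
    pc = (pc + 1) & 0x3F              # assumed: PC advances during fetch
    opcode = instruction >> 6         # decode: bits 7..6
    address = instruction & 0x3F      # decode: bits 5..0
    if opcode == ADD:
        ac = (ac + memory[address]) & 0xFF   # assumed: wrap on overflow
    elif opcode == AND:
        ac = ac & memory[address]
    elif opcode == JMP:
        pc = address
    else:                             # INC ignores the address bits
        ac = (ac + 1) & 0xFF
    return ac, pc

memory = [0] * 64                     # 64 bytes holding program and data
memory[0] = encode(INC)
memory[1] = encode(ADD, 10)
memory[2] = encode(AND, 11)
memory[3] = encode(JMP, 0)
memory[10] = 5
memory[11] = 0b00000011

ac, pc = 0, 0
for _ in range(4):
    ac, pc = step(ac, pc, memory)
print(ac, pc)
```

The program increments AC to 1, adds 5 to get 6, ANDs with 3 to get 2, then jumps back to address 0.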
10Registers
- a register is just a small piece of memory local to the CPU
  - we'll come back to registers when we talk about core memory
- typically, registers have a load enable signal
  - when this signal is asserted (logic 1) at the moment of a clock pulse, the register's contents are set to the value of its current inputs
  - it remembers this input until the next load; in the meantime it outputs the remembered value
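The load-enable behaviour can be modelled in a few lines. This is a behavioural sketch only, not a gate-level design; the class name `Register` and the method `clock` are hypothetical.

```python
class Register:
    """A clocked register with a load-enable input."""

    def __init__(self, width: int):
        self.mask = (1 << width) - 1   # keep only `width` bits
        self.value = 0

    def clock(self, data_in: int, load: bool) -> int:
        """One clock pulse: capture data_in only if load is asserted."""
        if load:
            self.value = data_in & self.mask
        return self.value              # otherwise keep outputting the old value

ac = Register(8)
print(ac.clock(0x2A, load=True))    # loads 0x2A
print(ac.clock(0xFF, load=False))   # input ignored, still 0x2A
```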
11Major CPU components
- register section
  - contains registers (memory local to the CPU) necessary for CPU functioning, including AC and PC
- CPU interconnect
  - interconnects the registers and connects to memory
- arithmetic-logic unit (ALU)
  - contains circuitry to perform data manipulation operations (ADD, AND)
- control unit
  - provides appropriate signals to allow the CPU to step through its required operational states
12Initial Register and Interconnect Design
- AR: address register
  - holds address bits (operand) on the memory address bus
- DR: data register
  - stores memory data taken off the memory data bus
- IR: instruction register
  - holds the operator of the instruction
- 8-bit CPU interconnect bus
  - interconnects registers, also connects data bus to registers
- M: system memory (64 bytes)
- figure legend
  - 6-bit line (6 wires)
  - tri-state buffer: 0, 1, or disconnected (Z) (has disable input)
[Figure: M, AR, PC, DR, AC and IR connected by the memory address bus, the memory data bus and the 8-bit CPU interconnect bus]
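Tri-state buffers are what let several sources share one bus: at most one buffer is enabled at a time and the rest are disconnected (Z). The sketch below models that rule; the function name `bus_value` and the use of `None` to stand for Z are illustrative choices, not taken from the lecture.

```python
def bus_value(drivers):
    """Resolve a shared bus from (enabled, value) pairs, one per tri-state buffer.

    Returns the value of the single enabled driver, or None when every
    buffer is disabled (the bus floats at Z).
    """
    active = [value for enabled, value in drivers if enabled]
    if len(active) > 1:
        # Two enabled buffers would fight over the wires.
        raise ValueError("bus contention: more than one driver enabled")
    return active[0] if active else None

# Hypothetical snapshot: only the second source is enabled.
print(bus_value([(False, 7), (True, 42), (False, 9)]))
```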
13CPU State Design
Each bubble represents a state in which the CPU
is performing a specific function. The CPU
advances from state to state on each clock pulse,
under direction of the control unit. How many
clock pulses does each instruction take?
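Since the state diagram is not reproduced in this text, the following sketch assumes a plausible set of states: three fetch states shared by every instruction, two execute states each for ADD and AND, and one each for JMP and INC. The names FETCH1, ADD1 and AND1 appear elsewhere in these slides; the remaining names and the exact counts are assumptions, so check them against the actual diagram.

```python
# Assumed states; only FETCH1, ADD1 and AND1 are named in the slide text.
FETCH_STATES = ["FETCH1", "FETCH2", "FETCH3"]
EXECUTE_STATES = {
    "ADD": ["ADD1", "ADD2"],
    "AND": ["AND1", "AND2"],
    "JMP": ["JMP1"],
    "INC": ["INC1"],
}

def state_sequence(mnemonic: str) -> list:
    """States visited, one per clock pulse, for a single instruction."""
    return FETCH_STATES + EXECUTE_STATES[mnemonic]

def clock_pulses(mnemonic: str) -> int:
    return len(state_sequence(mnemonic))

for name in EXECUTE_STATES:
    print(name, clock_pulses(name), state_sequence(name))
```

Under these assumptions ADD and AND take five clock pulses, while JMP and INC take four.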
14Final Register and Interconnect Design
- Major changes
  - introduced ALU
  - removed some connections
  - added control signals
- ARLOAD, PCLOAD, DRLOAD, ACLOAD, IRLOAD
  - load the relevant register from the interconnect bus
- PCINC, ACINC
  - increment the PC and AC
- PCBUS, ARBUS, MEMBUS
  - enable the PC, AR, or memory data bus output on the interconnect bus
- READ
  - makes the memory subsystem put onto the data bus the contents of the memory location indicated by the address bus
- ALUSEL
  - selects ALU function (next slide)
[Figure: M, AR, PC, DR, AC, ALU and IR connected by the memory data bus, the memory address bus and the 8-bit CPU interconnect bus]
15Major CPU components
- register section
  - contains registers (memory local to the CPU) necessary for CPU functioning, including AC and PC
- CPU interconnect
  - interconnects the registers and connects to memory
- arithmetic-logic unit (ALU)
  - contains circuitry to perform data manipulation operations (ADD, AND)
- control unit
  - provides appropriate signals to allow the CPU to step through its required operational states
16ALU Design
[Figure: ALU design, with ALUSEL as the select input of the MUX]
MUX is an eight-bit multiplexer which selects one
of several (here, two) inputs based on the value
of S (here, ALUSEL) and passes the selected input
through to its output.
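A behavioural sketch of this ALU: both candidate results are computed and the multiplexer picks one. Which ALUSEL value selects which function, and which two operands feed the ALU, are not stated in this text; the sketch assumes 0 selects the sum and 1 selects the AND, with AC and the interconnect bus as operands.

```python
def mux2(select: int, input0: int, input1: int) -> int:
    """Two-input multiplexer: pass input0 when select is 0, input1 when it is 1."""
    return input1 if select else input0

def alu(ac: int, bus: int, alusel: int) -> int:
    """8-bit ALU: compute both results, then let the MUX choose (assumed mapping)."""
    sum_result = (ac + bus) & 0xFF   # adder output, carry out discarded
    and_result = ac & bus            # bitwise AND output
    return mux2(alusel, sum_result, and_result)

print(alu(200, 100, 0))          # addition with wrap-around
print(alu(0b1100, 0b1010, 1))    # bitwise AND
```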
17Major CPU components
- register section
  - contains registers (memory local to the CPU) necessary for CPU functioning, including AC and PC
- CPU interconnect
  - interconnects the registers and connects to memory
- arithmetic-logic unit (ALU)
  - contains circuitry to perform data manipulation operations (ADD, AND)
- control unit
  - provides appropriate signals to allow the CPU to step through its required operational states
18Generic control unit design
- counter keeps track of which state the CPU is in
  - LD loads the counter from the inputs, INC increments, CLR clears
  - how many bits does the counter need to be?
  - must perform a state assignment: encoding states as numbers
- the decoder generates a unique signal for each state
  - in each state sets one (distinct) output to 1, holds all others at 0
- the control logic generates the necessary control signals based on current state
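The counter and decoder described above can be sketched behaviourally as follows. The priority among CLR, LD and INC when more than one is asserted is an assumption made here for definiteness, and the names `StateCounter` and `decode` are hypothetical.

```python
class StateCounter:
    """State counter with load (LD), increment (INC) and clear (CLR) inputs."""

    def __init__(self, bits: int):
        self.mask = (1 << bits) - 1
        self.value = 0

    def clock(self, ld=False, inc=False, clr=False, data_in=0) -> int:
        # Assumed priority: CLR, then LD, then INC.
        if clr:
            self.value = 0
        elif ld:
            self.value = data_in & self.mask
        elif inc:
            self.value = (self.value + 1) & self.mask
        return self.value

def decode(value: int, bits: int) -> list:
    """Decoder: exactly one of 2**bits outputs is 1, selected by value."""
    return [1 if i == value else 0 for i in range(1 << bits)]

counter = StateCounter(bits=4)
counter.clock(inc=True)               # advance to state 1
counter.clock(ld=True, data_in=8)     # jump to state 8
print(decode(counter.value, 4))
```

A counter of n bits can distinguish 2**n states, so, for example, a design with nine states would need a 4-bit counter.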
19Recall CPU States
Each bubble represents a state in which the CPU
is performing a specific function. The CPU
advances from state to state on each clock pulse,
under direction of the control unit. How many
clock pulses does each instruction take?
20Final counter and decoder design
- in state assignment, the aim is to minimize the total amount of logic by choosing clever assignments
- counters simplify the logic since states usually advance: use INC
- for our CPU the state assignments shown here turn out to be appropriate
- see the online reference (ch. 6 of Carpinelli's Computer Systems Organization and Architecture) for rationale (if interested)
21Control logic design
In each case state outputs (from the decoder) are
ORed together and used as a control signal. E.g.,
if we're in any one of FETCH1, ADD1 or AND1,
assert DRLOAD (load the data register from the
bus).
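The OR-of-states idea can be sketched directly. The DRLOAD example below copies the three state names quoted on this slide; the full list of states is an assumption, since the state diagram is not reproduced in this text, and the helper names are hypothetical.

```python
# Assumed full state list; only FETCH1, ADD1 and AND1 are named on this slide.
ALL_STATES = ["FETCH1", "FETCH2", "FETCH3", "ADD1", "ADD2",
              "AND1", "AND2", "JMP1", "INC1"]

# States in which DRLOAD is asserted, as quoted in the slide's example.
DRLOAD_STATES = ("FETCH1", "ADD1", "AND1")

def decoder_outputs(current_state: str) -> dict:
    """One-hot decoder outputs: 1 for the current state, 0 for all others."""
    return {name: int(name == current_state) for name in ALL_STATES}

def or_states(outputs: dict, states) -> int:
    """A control signal is the OR of the decoder outputs for the listed states."""
    return int(any(outputs[name] for name in states))

for state in ("ADD1", "INC1"):
    print(state, "DRLOAD =", or_states(decoder_outputs(state), DRLOAD_STATES))
```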
22Major CPU components
- register section
  - contains registers (memory local to the CPU) necessary for CPU functioning, including AC and PC
- CPU interconnect
  - interconnects the registers and connects to memory
- arithmetic-logic unit (ALU)
  - contains circuitry to perform data manipulation operations (ADD, AND)
- control unit
  - provides appropriate signals to allow the CPU to step through its required operational states
Done! Now we have the complete design of a CPU that implements our original specification.
23Modern CPU (Pentium 4)
- essentially the same architecture as our Very Simple CPU, just more stuff
  - allows for superscalar operation
  - controller implemented as a µ-instruction (microinstruction) sequencer (computer within a computer) rather than hardwired combinational logic
  - includes a dedicated floating point arithmetic module
24Understanding CPU performance
- clock speed
  - faster clock → faster performance
- instruction set design
  - more useful work per instruction → faster performance
  - fewer states per instruction → faster performance
  - normally can't do both
  - reduced instruction set computing (RISC) emphasizes few states, simple instructions (e.g., IBM/Motorola PowerPC family)
  - complex instruction set computing (CISC) emphasizes instructions that do lots of useful work (e.g., Intel Pentium family)
- sophisticated tricks, like
  - pipelining: like an assembly line, several instructions in the pipe at a time
  - superscalar design: execute more than one instruction in parallel
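The trade-off between work per instruction and states per instruction can be put into a rough formula: run time is about (instructions executed) × (states per instruction) ÷ (clock rate). The sketch below ignores pipelining and superscalar effects, and every number in it is invented purely for illustration.

```python
def run_time_seconds(instructions: int, states_per_instruction: float,
                     clock_hz: float) -> float:
    """Rough run time: total clock ticks divided by ticks per second."""
    return instructions * states_per_instruction / clock_hz

CLOCK_HZ = 2.5e9  # both hypothetical designs run at the same clock rate

# Hypothetical design A: fewer, more powerful instructions, more states each.
time_a = run_time_seconds(1_000_000, 5, CLOCK_HZ)
# Hypothetical design B: twice as many simpler instructions, fewer states each.
time_b = run_time_seconds(2_000_000, 2, CLOCK_HZ)

print(time_a, time_b)
```

With these made-up figures, design B finishes first even though it executes twice as many instructions, because each instruction passes through fewer states.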
25Moore's Law
- an empirical observation of the semiconductor industry
  - made by Gordon Moore (Intel co-founder) in 1965
  - states that the complexity of integrated circuits at a given price point doubles every 18 months
  - observationally true over the last forty years
- implications
  - processing speed doubles every 18 months
  - memory density doubles every 18 months
  - the Intel P4 3 GHz CPU you buy today for $250 will cost
    - $125 in eighteen months
    - $62.50 in three years
    - $31.25 in four and a half years
    - $15.63 in six years
    - and free with your breakfast cereal 1.5 years later
  - by which time your $250 will buy you a 96 GHz CPU
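The halving and doubling arithmetic can be tabulated with a short script. It takes the slide's simplified reading literally (price halves and clock speed doubles every 18 months), which is an extrapolation for illustration and not a forecast; the function name is hypothetical.

```python
def moores_law_projection(price: float, clock_ghz: float, periods: int) -> list:
    """Rows of (years elapsed, price of today's CPU, clock speed the original price buys)."""
    rows = []
    for n in range(periods + 1):
        years = 1.5 * n                     # one period is 18 months
        rows.append((years, price / 2 ** n, clock_ghz * 2 ** n))
    return rows

for years, cost, ghz in moores_law_projection(250, 3, 5):
    print(f"{years:4.1f} years: today's CPU costs {cost:7.2f}, same money buys {ghz} GHz")
```

Five 18-month periods (7.5 years) give five doublings, so the 3 GHz starting point becomes 3 × 2^5 = 96 GHz.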
26Moore's Law in action
- Intel 4004 (1971): 2300 transistors, 12 mm² die, 10 µm features, 16 pins, 108 kHz clock, 0.06 MIPS
- Pentium 4 (2000): 42M transistors, 217 mm² die, 0.18 µm features, 478 pins, 1.5 GHz clock, 2660 MIPS, 830 MFLOPS
CPU images from Intel web site
27Relative measures
- P4 versus 4004
  - 18,300 times as many transistors
  - 18 times as much area
  - 1000 times as dense
  - features are 55 times narrower
  - 30 times as many pins
  - clock runs 13,900 times as fast
  - executes 44,300 times as many operations per second
  - operations do more
[Images: Intel Pentium 4, Intel 4004]
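These ratios can be recomputed from the figures quoted on the previous slide. The sketch below simply divides one set of numbers by the other; the dictionary keys are names chosen here for illustration.

```python
i4004 = {"transistors": 2300, "die_mm2": 12, "feature_um": 10,
         "pins": 16, "clock_hz": 108e3, "mips": 0.06}
p4 = {"transistors": 42e6, "die_mm2": 217, "feature_um": 0.18,
      "pins": 478, "clock_hz": 1.5e9, "mips": 2660}

# "Times as many" ratios: Pentium 4 figure divided by 4004 figure.
ratios = {key: p4[key] / i4004[key] for key in i4004}
# Feature size shrinks, so "times narrower" is the inverse ratio.
ratios["feature_um"] = i4004["feature_um"] / p4["feature_um"]
# Density gain: transistor ratio divided by area ratio.
density_gain = ratios["transistors"] / ratios["die_mm2"]

for key, value in ratios.items():
    print(f"{key}: {value:,.1f}x")
print(f"density: {density_gain:,.0f}x")
```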
28Next class
- Parallel and Serial Interconnects