Title: Computer Organization and CPUs AMS I-2.1.2 Fall 2005
1Computer Organization and CPUs AMS I-2.1.2 Fall 2005
- Greg Phillips
- greg.phillips_at_rmc.ca
- Royal Military College of Canada
- Electrical and Computer Engineering
2Hardware module
- topics
- fundamentals of digital systems
- computer system organization and CPUs
- gate logic (lab)
- parallel and serial interconnects
- core memory
- secondary storage
- display and input devices
3Typical System Architecture
[Figure: typical system architecture block diagram. Labels recoverable from the diagram: core memory; this lecture; display and input devices; serial and parallel interconnects; ISA devices; secondary storage.]
4What's a CPU?
- a central processing unit (CPU) is at the heart of all modern digital computers
- a CPU is a complex arrangement of combinational and sequential logic
- combinational: outputs depend only on current inputs
- sequential: has memory, is normally controlled by a clock
- with a simple function
- fetch instruction → decode instruction → execute instruction
- each instruction is a simple operation on binary symbols
- e.g., add two numbers together, take the logical AND of two bit patterns, read from or write to memory, etc.
- almost all modern CPUs are implemented as microprocessors, that is, as single integrated circuits
5Major CPU components
- register section
- contains registers (memory local to the CPU) necessary for CPU functioning, including programmer-visible registers and internal registers
- CPU interconnect
- interconnects the registers and connects to memory
- arithmetic-logic unit (ALU)
- contains circuitry to perform data manipulation operations (ADD, AND)
- control unit
- provides appropriate signals to allow the CPU to step through its required operational states
6States and clocks
- CPUs operate as state machines
- in each state they do exactly one thing
- which state will be next is determined by the control unit and is based on previous state and register contents
- CPUs change state each time the clock ticks
- CPU performance is largely (but not entirely) governed by clock speed
- current CPU clocks are in the 2.5 GHz range
7Fetch, decode, execute
- fetch cycle
- fetch an instruction from memory, store it in an instruction register
- decode cycle
- the instruction register is fed into a (combinational) decoder to enable the relevant instruction logic
- execute cycle
- execute the instruction
- then fetch the next
CPU and CPU state diagrams from Computer
Systems Organization and Architecture by John D.
Carpinelli
8Specification of a very simple CPU
- suppose we want a CPU that
- works with data values between 0 and 255
- accesses up to 64 distinct data values
- this memory stores programs and data
- includes instructions to
- add values together
- perform a logical AND of values
- increment a value
- jump to a new program instruction
- design considerations
- how many bits does each data value need?
- how many bits do we need to encode memory addresses?
- how many bits do we need to encode the instructions?
Practical Note: This CPU is too simple to be useful, since it has no instruction that provides output! But it is complex enough for an example.
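Each design consideration above comes down to counting how many bits are needed to tell a given number of values apart. A minimal sketch in Python; the helper name bits_needed is ours for illustration and does not come from the slides.

```python
def bits_needed(distinct_values: int) -> int:
    """Smallest number of bits that can encode `distinct_values` different codes."""
    if distinct_values < 1:
        raise ValueError("need at least one value to encode")
    return max(1, (distinct_values - 1).bit_length())

data_bits = bits_needed(256)    # data values 0..255
address_bits = bits_needed(64)  # 64 memory locations
opcode_bits = bits_needed(4)    # four operations: ADD, AND, INC, JMP

print(data_bits, address_bits, opcode_bits)  # 8 6 2
```

Note that the opcode and address widths add up to the data width (2 + 6 = 8), so one instruction fits in one memory location.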
9Programmer-visible design
- user-accessible register
- AC (8-bit accumulator) for math/logic functions
- system register
- PC (6-bit program counter) represents the address of the next instruction to be executed
- instruction set
- ADD 00aaaaaa AC ← AC + M[aaaaaa]
- AND 01aaaaaa AC ← AC AND M[aaaaaa]
- JMP 10aaaaaa PC ← aaaaaa
- INC 11xxxxxx AC ← AC + 1
Instruction bit positions: 7 6 5 4 3 2 1 0 (bits 7-6 select the operation, bits 5-0 hold the address aaaaaa)
Practical Note: We can provide a crude form of output by connecting a light to each bit of the AC and PC registers; this is what the blinking lights on the face of (very) old computers were for.
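The instruction format can be sketched as bit packing: the top two bits of each byte select the operation and the low six bits carry the address. The function names encode and decode below are illustrative, not part of the slides, and the sketch assumes the don't-care bits of INC are stored as zeros.

```python
OPCODES = {"ADD": 0b00, "AND": 0b01, "JMP": 0b10, "INC": 0b11}
MNEMONICS = {code: name for name, code in OPCODES.items()}

def encode(mnemonic: str, address: int = 0) -> int:
    """Pack a 2-bit opcode (bits 7-6) and a 6-bit address (bits 5-0) into one byte."""
    if not 0 <= address < 64:
        raise ValueError("address must fit in 6 bits")
    return (OPCODES[mnemonic] << 6) | address

def decode(instruction: int) -> tuple[str, int]:
    """Split an instruction byte back into (mnemonic, address)."""
    return MNEMONICS[(instruction >> 6) & 0b11], instruction & 0b111111

word = encode("JMP", 5)
print(f"{word:08b}", decode(word))  # 10000101 ('JMP', 5)
```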
10Registers
- a register is just a small piece of memory local to the CPU
- we'll come back to registers when we talk about core memory
- typically, registers have a load enable signal
- when this signal is asserted (logic 1) at the moment of a clock pulse, the register's contents are set to the value of its current inputs
- it remembers this input until the next load; in the meantime it outputs the remembered value
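The load-enable behaviour can be modelled in a few lines. This is a behavioural sketch only; the class name Register and the method name clock are our own, and real registers are built from flip-flops rather than software.

```python
class Register:
    """An n-bit register with a load-enable input (illustrative model)."""

    def __init__(self, width: int):
        self.mask = (1 << width) - 1
        self.value = 0  # remembered contents, always visible on the output

    def clock(self, data_in: int, load: bool) -> int:
        """Model one clock pulse: latch data_in only if load is asserted."""
        if load:
            self.value = data_in & self.mask
        return self.value

ac = Register(8)
ac.clock(0x2A, load=True)    # load asserted: AC now holds 0x2A
ac.clock(0xFF, load=False)   # load not asserted: the input is ignored
print(hex(ac.value))         # 0x2a
```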
11Major CPU components
- register section
- contains registers (memory local to the CPU) necessary for CPU functioning, including AC and PC
- CPU interconnect
- interconnects the registers and connects to memory
- arithmetic-logic unit (ALU)
- contains circuitry to perform data manipulation operations (ADD, AND)
- control unit
- provides appropriate signals to allow the CPU to step through its required operational states
12Register and Interconnect Design
- AR address register
- holds address bits (operand) on the memory address bus
- DR data register
- stores memory data taken off the memory data bus
- IR instruction register
- holds the operator (opcode) of the instruction
- 8-bit CPU interconnect bus
- interconnects registers, also connects data bus to registers
- M system memory (64 bytes)
[Figure: register and interconnect diagram. Labels: memory data bus, memory address bus, M, AR, PC, DR, AC, ALU, IR, 8-bit CPU interconnect bus. Legend: a 6-bit line stands for 6 wires; a tri-state buffer outputs 0, 1, or disconnected (Z) and has a disable input.]
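Tri-state buffers are what let several registers share one bus: at most one buffer is enabled at a time, and the rest present Z (disconnected). A rough sketch of that idea, using None to stand for Z; the function names are hypothetical and the conflict check is our own addition.

```python
from typing import Optional

Z = None  # high-impedance: the buffer is disconnected from the bus

def tri_state(value: int, enable: bool) -> Optional[int]:
    """Pass value through when enabled, otherwise present Z (disconnected)."""
    return value if enable else Z

def resolve_bus(drivers: list[Optional[int]]) -> Optional[int]:
    """Return the value on a shared bus given each buffer's output."""
    active = [d for d in drivers if d is not Z]
    if len(active) > 1:
        raise RuntimeError("bus conflict: more than one buffer enabled")
    return active[0] if active else Z

# PC and DR both connect to the bus, but only DR's buffer is enabled.
pc, dr = 0b000101, 0b10000101
print(resolve_bus([tri_state(pc, False), tri_state(dr, True)]))  # 133
```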
13CPU State Design
Each bubble represents a state in which the CPU
is performing a specific function. The CPU
advances from state to state on each clock pulse,
under direction of the control unit. How many
clock pulses does each instruction take?
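The state diagram itself is not reproduced in this text. The sketch below assumes the state sequence of the Very Simple CPU from the Carpinelli textbook credited earlier (three fetch states followed by one or two execute states). Treat the state names and register transfers as that assumption rather than as a transcription of the slide.

```python
def run(memory: list[int], cycles: int) -> dict:
    """Step a sketch of the Very Simple CPU, one state per clock pulse."""
    pc = ac = ar = dr = ir = 0
    state = "FETCH1"
    for _ in range(cycles):
        if state == "FETCH1":            # AR <- PC
            ar, state = pc, "FETCH2"
        elif state == "FETCH2":          # DR <- M[AR], PC <- PC + 1
            dr, pc, state = memory[ar], (pc + 1) % 64, "FETCH3"
        elif state == "FETCH3":          # IR <- DR[7..6], AR <- DR[5..0]
            ir, ar = dr >> 6, dr & 0x3F
            state = ("ADD1", "AND1", "JMP1", "INC1")[ir]
        elif state in ("ADD1", "AND1"):  # DR <- M[AR]
            dr, state = memory[ar], state[:3] + "2"
        elif state == "ADD2":            # AC <- AC + DR
            ac, state = (ac + dr) & 0xFF, "FETCH1"
        elif state == "AND2":            # AC <- AC AND DR
            ac, state = ac & dr, "FETCH1"
        elif state == "JMP1":            # PC <- DR[5..0]
            pc, state = dr & 0x3F, "FETCH1"
        elif state == "INC1":            # AC <- AC + 1
            ac, state = (ac + 1) & 0xFF, "FETCH1"
    return {"PC": pc, "AC": ac, "state": state}

# Program: INC; ADD 3; JMP 0; the data value 10 is stored at address 3.
program = [0b11000000, 0b00000011, 0b10000000, 10] + [0] * 60
print(run(program, 9))  # {'PC': 2, 'AC': 11, 'state': 'FETCH1'}
```

Under this assumed state sequence, ADD and AND each take five clock pulses and JMP and INC each take four, which is why INC followed by ADD completes in exactly nine pulses here.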
15ALU Design
[Figure: ALU design diagram with select input ALUSEL]
MUX is a multiplexer which selects one of two eight-bit inputs based on the value of ALUSEL and passes the selected input through to its output.
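The multiplexer idea can be sketched as follows. The text only says that the MUX picks one of two eight-bit inputs, so the sketch assumes those inputs are the outputs of an adder and of an AND block, and the choice of which ALUSEL value selects which input is likewise an assumption.

```python
def mux(sel: int, in0: int, in1: int) -> int:
    """Two-way multiplexer: pass in0 when sel is 0, in1 when sel is 1."""
    return in1 if sel else in0

def alu(ac: int, operand: int, alusel: int) -> int:
    """8-bit ALU sketch: both results are computed, ALUSEL picks one."""
    adder_out = (ac + operand) & 0xFF  # 8-bit sum, carry out discarded
    and_out = ac & operand             # bitwise AND
    return mux(alusel, adder_out, and_out)

print(alu(0b1100, 0b1010, alusel=0))  # 22
print(alu(0b1100, 0b1010, alusel=1))  # 8
```

As in hardware, both candidate results exist at the same time and the select line only decides which one reaches the output.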
17Modern CPU (Pentium 4)
- essentially the same architecture as our Very Simple CPU, just more stuff
- allows for superscalar operation
- controller implemented as a µ-instruction (microinstruction) sequencer (a computer within a computer) rather than hardwired combinational logic
- includes a dedicated floating point arithmetic module
18Understanding CPU performance
- clock speed
- faster clock means faster performance
- instruction set design
- more useful work per instruction means faster performance
- fewer states per instruction means faster performance
- normally can't do both
- reduced instruction set computing (RISC) emphasizes few states, simple instructions (e.g., IBM/Motorola PowerPC family)
- complex instruction set computing (CISC) emphasizes instructions that do lots of useful work (e.g., Intel Pentium family)
- sophisticated tricks, like
- pipelining: like an assembly line, several instructions in the pipe at a time
- superscalar design: execute more than one instruction in parallel
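The trade-off between work per instruction and states per instruction can be put into a simple formula: run time equals instructions executed, times states per instruction, divided by clock rate. The numbers below are hypothetical and chosen only to illustrate the arithmetic; they do not describe any real processor.

```python
def execution_time(instructions: int, states_per_instruction: float, clock_hz: float) -> float:
    """Seconds to run a program: total clock ticks divided by clock rate."""
    return instructions * states_per_instruction / clock_hz

# Hypothetical program: the RISC-style version needs more instructions,
# but each one takes fewer states; the clock rate is the same for both.
risc = execution_time(instructions=1_500_000, states_per_instruction=1.5, clock_hz=1e9)
cisc = execution_time(instructions=1_000_000, states_per_instruction=4.0, clock_hz=1e9)
print(f"RISC-style: {risc * 1e3:.2f} ms, CISC-style: {cisc * 1e3:.2f} ms")
```

With different made-up numbers the comparison can go the other way, which is the point: neither approach wins automatically.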
19Moore's Law
- an empirical observation of the semiconductor industry
- made by Gordon Moore (Intel co-founder) in 1965
- commonly stated as: the complexity of integrated circuits at a given price point doubles every 18 months
- observationally true over the last forty years
- implications at a given price point
- processing speed doubles every 18 months
- this is no longer completely true, although processing power is still roughly doubling
- memory density doubles every 18 months
- the Intel P4 3 GHz CPU you buy today for $250 will cost
- $125 in eighteen months, $62.50 in three years, $31.25 in four and a half years, $15.62 in six years, and free with your breakfast cereal 1.5 years later
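The price example is repeated halving, once per 18-month period. A short sketch of that arithmetic, assuming the price of a fixed part falls exactly in step with the doubling period (the function name and its parameters are ours):

```python
def price_after(initial_price: float, months: float, halving_period: float = 18.0) -> float:
    """Price of a fixed level of complexity after `months`, halving every period."""
    return initial_price * 0.5 ** (months / halving_period)

for months in (0, 18, 36, 54, 72):
    print(f"{months:2d} months: ${price_after(250, months):.2f}")
```

Halving $250 three times gives $31.25, and a fourth halving gives $15.625.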
20Moore's Law in action
- Intel 4004 (1971): 2300 transistors, 12 mm² die, 10 µm features, 16 pins, 108 kHz clock, 0.06 MIPS
- Pentium 4 (2000): 42M transistors, 217 mm² die, 0.18 µm features, 478 pins, 1.5 GHz clock, 2660 MIPS, 830 MFLOPS
CPU images from Intel web site
21Relative measures
- P4 versus 4004
- 18,300 times as many transistors
- 18 times as much area
- 1000 times as dense
- features are 55 times narrower
- 30 times as many pins
- clock runs 13,900 times as fast
- executes 44,300 times as many operations per second (2660 MIPS versus 0.06 MIPS)
- operations do more
[Images: Intel Pentium 4 and Intel 4004]
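The relative measures can be recomputed directly from the two specification lists. The dictionary keys below are our own labels; the values are the figures quoted for the 4004 and the Pentium 4.

```python
i4004 = {"transistors": 2300, "die_mm2": 12, "feature_um": 10,
         "pins": 16, "clock_hz": 108e3, "mips": 0.06}
p4 = {"transistors": 42e6, "die_mm2": 217, "feature_um": 0.18,
      "pins": 478, "clock_hz": 1.5e9, "mips": 2660}

ratios = {key: p4[key] / i4004[key] for key in i4004}
ratios["feature_um"] = 1 / ratios["feature_um"]  # features got narrower, so invert
ratios["density"] = ratios["transistors"] / ratios["die_mm2"]

for name, ratio in ratios.items():
    print(f"{name:12s} {ratio:10,.0f}x")
```

The MIPS ratio computes to roughly 44,300 and the density ratio to roughly 1,000.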
22Next class
- Gate logic lab
- Sawyer 5033, 1000 hrs Friday
- Parallel and Serial Interconnects