Title: Chapter 1 Background
1Chapter 1 Background
2Outline
- Instruction Set Principles
- The Simplified Instructional Computer (SIC)
- Traditional (CISC) Machines
- RISC Machines
3Instruction Set Principles
4Brief Introduction to ISA
- Instruction Set Architecture: a set of instructions
- Each instruction is directly executed by the CPU's hardware
- How is it represented?
- By a binary format, since the hardware understands only bits
- Concatenate together the binary encodings for instructions, registers, constants, and memory addresses
- Options: fixed or variable length formats
- Fixed: each instruction is encoded in the same size field (typically 1 word)
- Variable: half-word, whole-word, and multiple-word instructions are possible
5Example of Program Execution
- Commands (opcodes)
- 1: Load AC from memory
- 2: Store AC to memory
- 5: Add to AC from memory
- Example: add the contents of memory location 940 to the contents of memory location 941 and store the result at 941
[Figure: fetch and execution stages of the example program]
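The add example can be traced with a small simulator. This is only a minimal sketch under assumptions the slide does not state: 16-bit instructions with a 4-bit opcode and a 12-bit address, all numbers in hexadecimal, the program loaded at address 300, and initial values 3 and 2 at locations 940 and 941. The function name `run` is made up for illustration.

```python
# Sketch of the fetch/execute cycle for the three opcodes listed above.
# Assumed (not from the slide): 16-bit instructions, 4-bit opcode,
# 12-bit address, hexadecimal addresses, program loaded at 0x300.

LOAD, STORE, ADD = 0x1, 0x2, 0x5

def run(memory, pc, steps):
    """Execute `steps` instructions starting at `pc`; return the final AC."""
    ac = 0
    for _ in range(steps):
        ir = memory[pc]                       # fetch
        pc += 1
        opcode, addr = ir >> 12, ir & 0xFFF   # decode
        if opcode == LOAD:                    # execute
            ac = memory[addr]
        elif opcode == STORE:
            memory[addr] = ac
        elif opcode == ADD:
            ac = (ac + memory[addr]) & 0xFFFF
        else:
            raise ValueError(f"unknown opcode {opcode:#x}")
    return ac

memory = {
    0x300: 0x1940,  # load AC from 940
    0x301: 0x5941,  # add contents of 941 to AC
    0x302: 0x2941,  # store AC to 941
    0x940: 0x0003,  # assumed initial data
    0x941: 0x0002,  # assumed initial data
}
final_ac = run(memory, pc=0x300, steps=3)
```

Each loop iteration is one fetch stage followed by one execution stage, so the three-instruction program takes three passes.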
6Instruction Set Principles -- Classifying
Instruction Set Architecture
7Instruction Set Design
The instruction set influences everything
8Instruction Characteristics
- Usually a simple operation
- Which operation is identified by the op-code field
- But operations require operands: 0, 1, or 2
- To identify where they are, they must be addressed
- Address is to some piece of storage
- Typical storages are main memory, registers, or a stack
- 2 options: explicit or implicit addressing
- Implicit: the op-code implies the address of the operands
- ADD on a stack machine pops the top 2 elements of the stack, then pushes the result
- Explicit: the address is specified in some field of the instruction
- Note the potential for 3 addresses: 2 operands + the destination
9Classifying Instruction Set Architectures
10Operand Locations for Four ISA Classes
11C = A + B
- Stack
- Push A
- Push B
- Add
- Pop the top-2 values of the stack (A, B) and push the result value onto the stack
- Pop C
- Accumulator (AC)
- Load A
- Add B
- Add AC (A) with B and store the result into AC
- Store C
- Register (register-memory)
- Load R1, A
- Add R3, R1, B
- Store R3, C
- Register (load-store)
- Load R1, A
- Load R2, B
- Add R3, R1, R2
- Store R3, C
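The four code sequences can be mimicked in a few lines each. The sketch below is illustrative only: the function names and the dictionary-based "memory" and "registers" are hypothetical, and each statement is annotated with the instruction it stands for.

```python
# Sketch: C = A + B on four toy machine models (hypothetical helpers).

def stack_machine(mem):
    stack = []
    stack.append(mem["A"])                    # Push A
    stack.append(mem["B"])                    # Push B
    stack.append(stack.pop() + stack.pop())   # Add (operands are implicit)
    mem["C"] = stack.pop()                    # Pop C
    return mem["C"]

def accumulator_machine(mem):
    ac = mem["A"]                             # Load A
    ac = ac + mem["B"]                        # Add B (AC is implicit)
    mem["C"] = ac                             # Store C
    return mem["C"]

def register_memory_machine(mem):
    r = {}
    r["R1"] = mem["A"]                        # Load R1, A
    r["R3"] = r["R1"] + mem["B"]              # Add R3, R1, B
    mem["C"] = r["R3"]                        # Store R3, C
    return mem["C"]

def load_store_machine(mem):
    r = {}
    r["R1"] = mem["A"]                        # Load R1, A
    r["R2"] = mem["B"]                        # Load R2, B
    r["R3"] = r["R1"] + r["R2"]               # Add R3, R1, R2
    mem["C"] = r["R3"]                        # Store R3, C
    return mem["C"]

machines = (stack_machine, accumulator_machine,
            register_memory_machine, load_store_machine)
results = [machine({"A": 2, "B": 3}) for machine in machines]
```

All four models produce the same value of C; they differ in how many instructions are needed and in how many of them touch memory.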
12Pros and Cons of Stack, Accumulator, Register
Machine
13Modern Choice: Load-store Register (GPR) Architecture
- Reasons for choosing GPR (general-purpose registers) architecture
- Registers (stacks and accumulators) are faster than memory
- Registers are easier and more effective for a compiler to use
- (A*B) - (C*D) - (E*F)
- The multiplications may be evaluated in any order
- But on a stack machine, the expression must be evaluated left to right
- Registers can be used to hold variables
- Reduce memory traffic
- Speed up programs
- Improve code density (fewer bits are used to name a register)
14Characteristics Divide GPR Architectures
- Number of operands
- Three-operand: 1 result and 2 source operands
- Two-operand: 1 operand that is both source and result, and 1 source
- How many operands are memory addresses
- 0 to 3 (two sources + 1 result)
- Resulting classes: load-store (0 memory operands), register-memory (1), memory-memory (2 or 3)
15Pros and Cons of Three Most Common GPR Computers
16Instruction Set Principles -- Memory Addressing
17Typical Address Modes (I)
18Typical Address Modes (II)
19Operand Type and Size
- Specified by instruction (opcode) or by hardware tag
- Tagged machines are extinct
- Typical types, assuming a 32-bit word
- Character: byte, ASCII or EBCDIC (IBM), 4 per word
- Short integer: 2 bytes, 2's complement
- Integer: one word, 2's complement
- Float: one word, usually IEEE 754 these days
- Double precision float: 2 words, IEEE 754
- BCD or packed decimal: 4-bit values packed 8 per word
- Instructions will be needed for common conversions -- software can do the rare ones
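As an illustration of the packed-decimal entry, the sketch below packs eight decimal digits into one 32-bit word, 4 bits per digit. The helper names `pack_bcd` and `unpack_bcd` are hypothetical, and sign handling (which real packed-decimal formats usually include) is left out.

```python
def pack_bcd(digits):
    """Pack exactly 8 decimal digits (most significant first) into 32 bits."""
    if len(digits) != 8 or any(not 0 <= d <= 9 for d in digits):
        raise ValueError("need exactly 8 digits in the range 0-9")
    word = 0
    for d in digits:
        word = (word << 4) | d      # each digit occupies one 4-bit nibble
    return word

def unpack_bcd(word):
    """Recover the 8 digits from a 32-bit packed word."""
    return [(word >> shift) & 0xF for shift in range(28, -1, -4)]

word = pack_bcd([1, 9, 9, 8, 0, 4, 2, 7])
```

A convenient property of this layout is that the hexadecimal spelling of the word shows the decimal digits directly.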
20What Operations are Needed
- Arithmetic and Logical
- Integer arithmetic: ADD, SUB, MULT, DIV, SHIFT
- Logical operation: AND, OR, XOR, NOT
- Data Transfer: copy, load, store
- Control: branch, jump, call, return, trap
- System: OS and memory management
- We'll ignore these for now, but remember they are needed
- Floating Point
- Same as arithmetic but usually take bigger operands
- Decimal: if you go for it, what else do you need?
- Legacy from COBOL and the commercial application domain
- String: move, compare, search
- Graphics: pixel and vertex, compression/decompression operations
21Top 10 Instructions for 80x86
- load: 22%
- conditional branch: 20%
- compare: 16%
- store: 12%
- add: 8%
- and: 6%
- sub: 5%
- move register-register: 4%
- call: 1%
- return: 1%
- The most widely executed instructions are the simple operations of an instruction set
- The top-10 instructions for 80x86 account for 96% of instructions executed
- Make them fast, as they are the common case
22Control Instructions are a Big Deal
- Jumps: unconditional transfer
- Conditional Branches
- How is the condition code set? By a flag or as part of the instruction
- How is the target specified? How far away is it?
- Calls
- How is the target specified? How far away is it?
- Where is the return address kept?
- How are the arguments passed? Callee vs. caller save!
- Returns
- Where is the return address? How far away is it?
- How are the results passed?
23Branch Address Specification
- Known at compile time for unconditional and conditional branches, hence specified in the instruction
- As a register containing the target address
- As a PC-relative offset
- Consider word length addresses, registers, and instructions
- Full address desired? Then pick the register option.
- BUT: setup and effective address will take longer.
- If you can deal with a smaller offset, then PC-relative works
- PC-relative is also position independent, so simple linker duty
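The PC-relative option can be sketched as a range check plus an addition. This is a minimal illustration, not a description of any particular machine: the 16-bit offset width used in the usage lines is an assumption, and whether `pc` means the branch's own address or the address of the next instruction varies between architectures.

```python
def fits_signed(value, bits):
    """True if `value` fits in a two's-complement field of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

def pc_relative_offset(pc, target, bits):
    """Return the offset to encode, or None if the target is too far away."""
    offset = target - pc
    return offset if fits_signed(offset, bits) else None

def branch_target(pc, offset):
    """Effective branch address computed at run time."""
    return pc + offset

near = pc_relative_offset(0x1000, 0x1040, bits=16)
far = pc_relative_offset(0x1000, 0x20000, bits=16)
# Relocating the whole program by 0x5000 leaves the encoded offset unchanged.
relocated = pc_relative_offset(0x1000 + 0x5000, 0x1040 + 0x5000, bits=16)
```

The unchanged offset after relocation is what makes PC-relative code position independent; the out-of-range case is where a register holding the full address would be needed instead.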
24Returns and Indirect Jumps
- Branch target is not known at compile time
- Need a way to specify the target dynamically
- Use a register
25Condition Testing Options
26Instruction Set Principles -- Encoding an
Instruction Set
27Encoding the ISA
- Encode instructions into a binary representation for execution by CPU
- Can pick anything but
- Affects the size of code, so it should be tight
- Affects the CPU design, in particular the instruction decode
- So it may have a big influence on the CPI or cycle-time
- Must balance several competing forces
- Desire for lots of addressing modes and registers
- Desire to make average program size compact
- Desire to have instructions encoded into lengths that will be easy to handle in a pipelined implementation (multiple of bytes)
28Three Popular Encoding Choices
- Variable (compact code but difficult to decode)
- Primary opcode is fixed in size, but opcode modifiers may exist
- Opcode specifies number of arguments, each used as address fields
- Best when there are many addressing modes and operations
- Use as few bits as possible, but individual instructions can vary widely in length
- e.g. VAX: integer ADD versions vary between 3 and 19 bytes
- Fixed (easy to decode, but lengthy code)
- Every instruction looks the same; some fields may be interpreted differently
- Combine the operation and the addressing mode into the opcode
- e.g. all modern RISC machines
- Hybrid
- Set of fixed formats
- e.g. IBM 360 and Intel 80x86
Trade-off between size of program vs. ease of decoding
29Three Popular Encoding Choices (Cont.)
30An Example of Variable Encoding -- VAX
- addl3 r1, 737(r2), (r3): 32-bit integer add instruction with 3 operands → needs 6 bytes to represent it
- Opcode for addl3: 1 byte
- A VAX address specifier is 1 byte (4-bit addressing mode, 4-bit register)
- r1: 1 byte (register addressing mode + r1)
- 737(r2)
- 1 byte for address specifier (displacement addressing + r2)
- 2 bytes for displacement 737
- (r3): 1 byte for address specifier (register indirect + r3)
- Length of VAX instructions: 1 to 53 bytes
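The 6-byte total can be checked by adding up the pieces listed above. The sketch models only the three operand forms used in this example; the mode names and the `instruction_length` helper are hypothetical and do not cover the full VAX addressing-mode set.

```python
# Bytes contributed by each operand: one specifier byte plus any extra bytes.
SPECIFIER_BYTES = 1
EXTRA_BYTES = {
    "register": 0,            # r1
    "displacement_word": 2,   # 737(r2): 2-byte displacement, as in the example
    "register_indirect": 0,   # (r3)
}

def instruction_length(opcode_bytes, operand_modes):
    """Total encoded length in bytes for the modelled operand modes."""
    return opcode_bytes + sum(SPECIFIER_BYTES + EXTRA_BYTES[m]
                              for m in operand_modes)

addl3_length = instruction_length(
    1, ["register", "displacement_word", "register_indirect"])
```

Because the length depends on every operand's addressing mode, a decoder cannot know where the next instruction starts until it has examined each specifier, which is the decoding cost of variable encoding.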
31Short Summary: Encoding the Instruction Set
- Choice between variable and fixed instruction encoding
- Code size matters more than performance → variable encoding
- Performance matters more than code size → fixed encoding
32The Simplified Instructional Computer (SIC)
33Overview
- Designed to illustrate the most commonly encountered hardware features and concepts, while avoiding most of the idiosyncrasies that are often found in real machines
- Standard model and an XE ("extra equipment") version
- Upward compatible
34SIC Machine Architecture
- Memory
- Byte addressable
- 8-bit bytes
- 3-byte words
- Addressed by the lowest numbered byte
- A total of 32,768 ($2^{15}$) bytes
- Registers
- Five special-purpose registers
- Each 24 bits in length
- Data formats
- Integers: 24-bit binary numbers
- 2's complement
- Characters: 8-bit ASCII
- No floating-point hardware
- Instruction format (24 bits)
opcode (8 bits) | x (1 bit) | address (15 bits)
- x: flag bit for the indexed-addressing mode
35SIC Registers
36SIC Machine Instructions (Example)
- (Binary) 00000000 0 000000000011111
- Load the content of the word at memory address 31 into A
- (Hex) 00001F
- See Appendix A
37SIC Machine Architecture (Cont.)
- Addressing modes
- Two addressing modes, indicated by the x bit
- Direct (x = 0): target address = address
- Indexed (x = 1): target address = address + (X)
- (X): the contents of register X
- Example: 00801F
- Load the content of the word at memory address (X) + 31 into A
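The two hexadecimal examples (00001F and 00801F) can be reproduced by packing the three fields of the 24-bit format. This is a minimal sketch: the helper names are hypothetical, only the LDA opcode (00, as in the examples) is defined, and address overflow is not handled.

```python
LDA = 0x00  # opcode used in the examples above

def encode(opcode, address, indexed=False):
    """Pack opcode (8 bits), x (1 bit), and address (15 bits) into 24 bits."""
    if not 0 <= opcode <= 0xFF or not 0 <= address <= 0x7FFF:
        raise ValueError("field out of range")
    return (opcode << 16) | (int(indexed) << 15) | address

def decode(word):
    """Split a 24-bit instruction into (opcode, indexed, address)."""
    return (word >> 16) & 0xFF, bool((word >> 15) & 1), word & 0x7FFF

def target_address(word, x_register):
    """Direct: address. Indexed: address + (X)."""
    _, indexed, address = decode(word)
    return address + x_register if indexed else address

direct_hex = f"{encode(LDA, 31):06X}"
indexed_hex = f"{encode(LDA, 31, indexed=True):06X}"
```

With register X holding 6, for instance, the indexed form addresses word 37 while the direct form still addresses word 31.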
38SIC Machine Architecture (Cont.)
See Appendix A for a complete list of SIC
instructions
- Instruction set
- Load and store registers (LDA, LDX, STA, STX)
- Integer arithmetic operations (ADD, SUB, MUL, DIV)
- All arithmetic operations involve register A and a word in memory, with the result being left in the register
- COMP: compare the value in register A with a word in memory
- Set a condition code (CC) to indicate the result (<, =, >)
- Conditional jump instructions (JLT, JEQ, JGT) test the setting of CC and jump accordingly
- Subroutine linkage
- JSUB: jump to the subroutine, placing the return address in register L
- RSUB: return by jumping to the address contained in register L
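The compare-then-jump pattern and the JSUB/RSUB linkage can be modelled in a few lines. The sketch below is an illustration only: the function names and the dictionary of registers are hypothetical, and `regs["PC"]` is assumed to already point at the instruction after the JSUB when `jsub` is called.

```python
def comp(a, word):
    """COMP: return the condition code for register A versus a memory word."""
    if a < word:
        return "<"
    if a == word:
        return "="
    return ">"

def jump_taken(mnemonic, cc):
    """JLT, JEQ, and JGT jump only when CC matches their condition."""
    return {"JLT": "<", "JEQ": "=", "JGT": ">"}[mnemonic] == cc

def jsub(regs, target):
    """JSUB: save the return address in L, then jump to the subroutine."""
    regs["L"] = regs["PC"]
    regs["PC"] = target

def rsub(regs):
    """RSUB: return by jumping to the address held in L."""
    regs["PC"] = regs["L"]

cc = comp(3, 5)
regs = {"PC": 0x1003, "L": 0}
jsub(regs, 0x2000)
pc_in_subroutine = regs["PC"]
rsub(regs)
```

Since there is a single L register, a subroutine that itself calls JSUB would need to save L first; the sketch does not model that.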
39SIC Machine Architecture (Cont.)
- Input and output
- Perform I/O by transferring 1 byte at a time to or from the rightmost 8 bits of register A
- Each device is assigned a unique 8-bit code
- Three I/O instructions, each specifying the device code as an operand
- Test Device (TD): tests whether the addressed device is ready to send or receive a byte of data
- The condition code is set to indicate the result of this test (<: ready, =: not ready)
- Read Data (RD)
- Write Data (WD)
40SIC/XE Machine Architecture
- Memory
- Memory structure same as SIC
- Maximum 1 megabyte ($2^{20}$ bytes)
- Registers
- Additional registers
41SIC/XE Machine Architecture (Cont.)
- Data formats
- Additional 48-bit floating-point data type
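- Layout: s (1 bit) | exponent (11 bits) | fraction (36 bits)
- Fraction: a value between 0 and 1 (high-order bit must be 1)
- Exponent: unsigned binary number between 0 and 2047
- Value: $f \times 2^{e-1024}$, where f is the fraction, e is the exponent, and s is the sign bit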
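The 48-bit format can be decoded directly from its three fields (1 sign bit, 11 exponent bits, 36 fraction bits). This is a hedged sketch: the helper names are hypothetical, the sign convention (s = 1 means negative) and the treatment of an all-zero word as 0.0 are assumptions not spelled out on the slide, and the result is a Python float, so very long fractions lose precision.

```python
def encode_fields(sign, exponent, fraction_bits):
    """Assemble a 48-bit word from s (1), exponent (11), fraction (36)."""
    return (sign << 47) | (exponent << 36) | fraction_bits

def decode_float(bits):
    """Decode a 48-bit SIC/XE floating-point word given as an integer."""
    if not 0 <= bits < (1 << 48):
        raise ValueError("expected a 48-bit value")
    if bits == 0:
        return 0.0                                   # assumed encoding of zero
    sign = bits >> 47
    exponent = (bits >> 36) & 0x7FF
    fraction = (bits & ((1 << 36) - 1)) / (1 << 36)  # f, with 0 <= f < 1
    value = fraction * 2.0 ** (exponent - 1024)
    return -value if sign else value                 # assumed: s = 1 is negative

one = decode_float(encode_fields(0, 1025, 1 << 35))            # 0.5 * 2^1
minus_three_quarters = decode_float(encode_fields(1, 1024, 0b11 << 34))
```

In both usage lines the high-order fraction bit is 1, matching the normalization rule stated above.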
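42SIC/XE Machine Architecture (Cont.): Instruction Formats
Format 1 (1 byte): op (8 bits)
Format 2 (2 bytes): op (8 bits) | r1 (4 bits) | r2 (4 bits)
Format 3 (3 bytes): op (6 bits) | n | i | x | b | p | e | disp (12 bits), with e = 0; supports relative addressing
Format 4 (4 bytes): op (6 bits) | n | i | x | b | p | e | address (20 bits), with e = 1
Instructions of formats 1 and 2 do not refer to memory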
43SIC/XE Machine Architecture (Cont.)
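- Addressing modes
- Relative addressing modes for Format-3 instructions
- Base relative (b = 1, p = 0): target address = (B) + disp, 0 <= disp <= 4095
- Program-counter relative (b = 0, p = 1): target address = (PC) + disp, -2048 <= disp <= 2047
- Direct addressing
- b = p = 0 for Format 3 → disp is the target address
- b = p = 0 (normally) for Format 4 → address is the target address
- Indexed addressing
- x = 1 → (X) is added in the target address calculation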
44SIC/XE Machine Architecture (Cont.)
- Addressing modes (Cont.)
- i = 1 and n = 0 → the target address itself is used as the operand value (immediate addressing)
- i = 0 and n = 1 → the word at the location given by the target address is fetched; the value contained in this word is then taken as the address of the operand value (indirect addressing)
- i = n = 0 or i = n = 1 → the target address is taken as the location of the operand (simple addressing)
- SIC/XE → i = n = 1
- SIC → i = n = 0
- Indexing cannot be used with immediate or indirect addressing
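The flag rules can be combined into a single target-address calculation. The sketch below assumes the usual SIC/XE field layout (6-bit opcode, then the flags n, i, x, b, p, e, then a 12-bit disp for Format 3 or a 20-bit address for Format 4), treats the PC-relative disp as a signed 12-bit value, and assumes `pc` already points at the next instruction. The function name and the sample instruction words are made up for illustration.

```python
def target_address(instr, pc=0, base=0, x_reg=0, extended=False):
    """Return (mode, target address) for a Format-3 or Format-4 instruction.

    instr is the instruction as an integer (24 bits, or 32 if extended);
    base and x_reg are the contents of registers B and X.
    """
    field_bits = 20 if extended else 12
    flags = (instr >> field_bits) & 0x3F
    n, i, x, b, p, e = ((flags >> k) & 1 for k in range(5, -1, -1))
    field = instr & ((1 << field_bits) - 1)

    if n == 0 and i == 0:
        # Standard SIC instruction: b, p, e are part of the 15-bit address.
        mode, ta = "simple", instr & 0x7FFF
    else:
        if e != int(extended):
            raise ValueError("e flag does not match the instruction length")
        mode = {(0, 1): "immediate", (1, 0): "indirect",
                (1, 1): "simple"}[(n, i)]
        if extended:
            ta = field                     # Format 4: address is the target
        elif p:
            disp = field - 0x1000 if field & 0x800 else field  # signed disp
            ta = pc + disp                 # PC relative
        elif b:
            ta = base + field              # base relative, unsigned disp
        else:
            ta = field                     # direct
    if x:
        if mode != "simple":
            raise ValueError("indexing cannot be used with immediate or indirect")
        ta += x_reg
    return mode, ta

pc_rel = target_address(0x032600, pc=0x3000)
base_indexed = target_address(0x03C300, base=0x6000, x_reg=0x90)
indirect = target_address(0x022030, pc=0x3000)
immediate = target_address(0x010030)
sic_style = target_address(0x003600)
format4 = target_address(0x0310C303, extended=True)
```

For immediate mode the returned address is itself the operand value; for indirect mode it is the location of a word that holds the operand's address.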
45Examples of SIC/XE Instructions and Addressing
Modes
46SIC/XE Machine Architecture (Cont.)
- Instruction set
- Provides all the instructions available on the standard version
- Instructions to load and store the new registers (LDB, STB)
- Instructions to perform FP arithmetic operations (ADDF, SUBF)
- RMO (register move)
- Register-to-register arithmetic operations (ADDR, SUBR)
- Supervisor call instruction (SVC) for generating interrupts
- Input and output
- I/O channels can be used to perform I/O while the CPU is executing other instructions
- SIO, TIO, HIO: start, test, and halt the operation of I/O channels
47Data Movement for SIC
48Data Movement for SIC/XE
49Arithmetic for SIC
50Arithmetic for SIC/XE
51Looping and Indexing for SIC
52Looping and Indexing for SIC/XE
53Looping and Indexing for SIC
54Looping and Indexing for SIC/XE
55Input and Output for SIC
56Subroutine Call and Record Input for SIC
JSUB READ CALL READ SUBROUTINE
57Subroutine Call and Record Input for SIC/XE
58Traditional (CISC) Machines
- Complex Instruction Set Computers (CISC)
- Generally have a relatively large and complicated instruction set, several different instruction formats and lengths, and many different addressing modes
- Such a machine is complex to implement
- Examples
- VAX
- Pentium Pro
59RISC Machines
- Reduced Instruction Set Computers
- Standard, fixed instruction length (usually one machine word), and single-cycle execution of most instructions
- Memory access is usually done by load and store only
- All instructions except for load and store are register-to-register
- Typically a relatively large number of GPRs
- The number of machine instructions, instruction formats, and addressing modes is relatively small
- Simplify the design of processors
- Faster and less expensive processor development, greater reliability, and faster instruction execution times
- Examples: Sun UltraSPARC, PowerPC, Cray T3E