Title: Today's Agenda
1Today's Agenda
- Instruction Set Architecture: assembly language
- Instruction Set Encoding: machine language
- Things to remember
- HW9 is due immediately after the Thanksgiving break (on Monday, 26th November).
- HW 10.1 and 10.2 will be released this week.
- All the numbers in the assembly instructions in HW9 are in DECIMAL.
2Programming and CPUs
- Programs written in a high-level language like C must be compiled to produce an executable program.
- The result is a CPU-specific machine language program. This can be loaded into memory and executed by the processor.
- CS231 focuses on stuff below the dotted blue line, but machine language serves as the interface between hardware and software.
3High-level languages vs Low Level Languages
- High-level languages provide many useful programming constructs.
- For, while, and do loops
- If-then-else statements
- Functions and procedures for code abstraction
- Variables and arrays for storage
- Static and dynamic typechecking
- Garbage collection
- High-level languages are also relatively portable. Theoretically, you can write one program and compile it on many different processors.
- On the other hand, low-level languages provide none of those, but...
- they are the only thing the CPU understands.
- Hence the job of compilers, which convert a high-level language into low-level machine code.
4Assembly and machine languages
- Machine language instructions are sequences of bits in a specific order.
- To make things simpler, people typically use assembly language.
- We assign mnemonic names to operations and operands.
- There is (almost) a one-to-one correspondence between these mnemonics and machine instructions, so it is very easy to convert assembly programs to machine language.
- We'll use assembly code today to introduce the basic ideas, and switch to machine language next time.
- But always remember: assembly language is NOT machine language.
5Data manipulation instructions
- Data manipulation instructions correspond to ALU operations.
- For example, here is a possible addition instruction, and its equivalent using our register transfer notation
- ADD R0, R1, R2    R0 ← R1 + R2
- This is similar to a high-level programming statement like
- R0 = R1 + R2
- Here, all of the operands are registers.
6More data manipulation instructions
- Here are some other kinds of data manipulation instructions.
- NOT R0, R1        R0 ← R1'
- ADD R3, R3, #1    R3 ← R3 + 1
- SUB R1, R2, #5    R1 ← R2 - 5
- Some instructions, like the NOT, have only one source operand.
- In addition to register operands, constant operands like 1 and 5 are also possible. Constants are denoted with a hash mark in front.
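The register-transfer meaning of these instructions can be mimicked in a few lines of Python. This is only an illustrative sketch: the 16-bit register width, the `execute` and `operand` helpers, and the dictionary used as a register file are assumptions made for the example, not something the slides define.

```python
WORD_BITS = 16                      # assumed register width; the slides do not state one
MASK = (1 << WORD_BITS) - 1

def operand(regs, src):
    """Value of a source operand: '#n' is a constant, 'Rn' is a register."""
    if src.startswith("#"):
        return int(src[1:]) & MASK
    return regs[src]

def execute(regs, op, dest, *srcs):
    """Apply one data manipulation instruction to the register file."""
    vals = [operand(regs, s) for s in srcs]
    if op == "ADD":
        result = vals[0] + vals[1]
    elif op == "SUB":
        result = vals[0] - vals[1]
    elif op == "NOT":
        result = ~vals[0]
    else:
        raise ValueError(f"unsupported operation: {op}")
    regs[dest] = result & MASK      # keep the result within the register width

regs = {f"R{i}": 0 for i in range(8)}
regs["R1"] = 0x00FF
regs["R2"] = 12

execute(regs, "NOT", "R0", "R1")        # R0 <- R1'
execute(regs, "ADD", "R3", "R3", "#1")  # R3 <- R3 + 1
execute(regs, "SUB", "R1", "R2", "#5")  # R1 <- R2 - 5
```

After these three calls, R0 holds the bitwise complement of the old R1, R3 has been incremented, and R1 holds R2 - 5.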
7Relation to the Datapath and hence the
restrictions
- Recall that our ALU has direct access only to the register file.
- There are at most two source operands in each instruction, since our ALU has just two inputs.
- The two sources can be two registers, or one register and one constant.
- Instructions have just one destination operand, which must be a register.
- RAM contents must be copied to the registers before they can be used as ALU operands.
- Similarly, ALU results must go through the registers before they can be stored into memory.
- We rely on data movement instructions to transfer data between the RAM and the register file.
8Loading a register from RAM
- A load instruction copies data from a RAM address to one of the registers.
- LD R1, (R3)    R1 ← M[R3]
- Remember that in our datapath, the RAM address must come from one of the registers (in the example above, R3).
- The parentheses help show which register operand holds the memory address.
[Figure: datapath with the register file, the constant input, the MB and MD multiplexers, and the RAM]
9Storing a register to RAM
- A store instruction copies data from a register to an address in RAM.
- ST (R3), R1    M[R3] ← R1
- One register specifies the RAM address to write to (in the example above, R3).
- The other operand specifies the actual data to be stored into RAM (R1 above).
[Figure: datapath detail (constant input, MB and MD multiplexers)]
10Loading a register with a constant
- With our datapath, it's also possible to load a constant into the register file.
- LD R1, #0    R1 ← 0
- Our example ALU has a "transfer B" operation (FS = 10000) which lets us pass a constant up to the register file.
- This gives us an easy way to initialize registers.
[Figure: datapath with the register file, the constant input, the MB and MD multiplexers, and the RAM]
11Storing a constant to RAM
- And you can store a constant value directly to RAM too.
- ST (R3), #0    M[R3] ← 0
- This provides an easy way to initialize memory contents.
[Figure: datapath detail (constant input, MB and MD multiplexers)]
12The # and ( ) are important!
- We've seen several statements containing the # or ( ) symbols. These are ways of specifying different addressing modes.
- The addressing mode we use determines which data are actually used as operands.
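To make the difference between the addressing modes concrete, here is a small Python sketch that resolves an operand string to the value it denotes. The `fetch_operand` helper and the dictionaries standing in for the register file and RAM are hypothetical names chosen for this illustration.

```python
def fetch_operand(operand, regs, ram):
    """Resolve one assembly operand to the value it denotes."""
    if operand.startswith("#"):
        # Immediate: the operand is the constant itself.
        return int(operand[1:])
    if operand.startswith("("):
        # Register indirect: the register holds a RAM address.
        return ram[regs[operand.strip("()")]]
    # Register direct: the operand is the register's contents.
    return regs[operand]

regs = {"R0": 1000, "R1": 0}
ram = {1000: 42}

immediate = fetch_operand("#1000", regs, ram)   # the constant 1000
direct = fetch_operand("R0", regs, ram)         # contents of R0
indirect = fetch_operand("(R0)", regs, ram)     # contents of M[R0]
```

The same register name therefore yields different data depending on whether it is written bare or in parentheses.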
13A small example
- Here's an example register-transfer operation.
- M[1000] ← M[1000] + 1
- This is the assembly-language equivalent
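As a hedged illustration, one instruction sequence consistent with the datapath rules (addresses must come from a register, ALU operands must be registers or constants) is traced below in Python. The choice of R0 for the address and R3 for the data is an assumption, as is the initial memory value 41; the assembly appears only as comments.

```python
# Hypothetical register choice: R0 holds the address, R3 holds the data.
regs = {"R0": 0, "R3": 0}
ram = {1000: 41}                  # assumed initial contents of M[1000]

regs["R0"] = 1000                 # LD  R0, #1000    R0 <- 1000
regs["R3"] = ram[regs["R0"]]      # LD  R3, (R0)     R3 <- M[R0]
regs["R3"] = regs["R3"] + 1       # ADD R3, R3, #1   R3 <- R3 + 1
ram[regs["R0"]] = regs["R3"]      # ST  (R0), R3     M[R0] <- R3
```

A single register-transfer statement on memory thus expands to four instructions: load the address, load the data, operate, and store the result back.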
14A simple question
- Consider the operation X = (A - B) × (A + C) × (B - D). A, B, C and D are constants.
- Perform this operation using the minimum number of registers and without using memory (also assume you have ADD, SUB and MULT instructions).
- Solution
- We can implement this using 4 registers
- LD R1, #A
- LD R2, #B
- LD R3, #C
- LD R4, #D
- ADD R3, R3, R1
- SUB R1, R1, R2
- SUB R2, R2, R4
- MULT R1, R1, R2
- MULT R1, R1, R3
- Now you can store R1 in memory as X or perform further operations on it.
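The register usage in this solution can be checked by replaying it step by step. The sketch below mirrors each instruction with one Python assignment; the function name `compute_x` and the sample values are made up for the check.

```python
def compute_x(a, b, c, d):
    """Replay the four-register solution one instruction at a time."""
    r1, r2, r3, r4 = a, b, c, d   # LD R1..R4 with the constants A..D
    r3 = r3 + r1                  # ADD  R3, R3, R1   R3 = A + C
    r1 = r1 - r2                  # SUB  R1, R1, R2   R1 = A - B
    r2 = r2 - r4                  # SUB  R2, R2, R4   R2 = B - D
    r1 = r1 * r2                  # MULT R1, R1, R2   R1 = (A - B)(B - D)
    r1 = r1 * r3                  # MULT R1, R1, R3   R1 = (A - B)(B - D)(A + C)
    return r1

x = compute_x(7, 3, 2, 1)
```

The order matters: R3 must pick up A before R1 is overwritten with A - B, and R2 must still hold B when R1 is computed.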
15Jumps
- A jump instruction always changes the value of the PC.
- The operand specifies exactly how to change the PC.
- For simplicity, we often use labels to denote actual addresses.
- For example, a program can skip certain instructions.
      LD R1, #10
      LD R2, #3
      JMP L
K:    LD R1, #20       // These two instructions
      LD R2, #4        // would be skipped
L:    ADD R3, R3, R2
      ST (R1), R3
- You can also use jumps to repeat instructions.
      LD R1, #0
F:    ADD R1, R1, #1
      JMP F            // An infinite loop!
16Branches
- A branch instruction may change the PC, depending on whether a given condition is true.
      LD R1, #10
      LD R2, #3
      BZ R4, L         // Jump to L if R4 = 0
K:    LD R1, #20       // These instructions may be
      LD R2, #4        // skipped, depending on R4
L:    ADD R3, R3, R2
      ST (R1), R3
17Types of branches
- Branch conditions are often based on the ALU result.
- This is what the ALU status bits V, C, N and Z are used for. With them we can implement various branch instructions like the ones below.
- Other branch conditions (e.g., branch if greater, equal or less) can be derived from these, along with the right ALU operation.
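As a rough sketch of how a comparison can be derived from a subtraction plus a status-bit test, the Python below computes only the N and Z bits and ignores V and C. The 16-bit width and the helper names are assumptions; BZ and BNN appear in these slides, while BNZ and BN are assumed mnemonics added for symmetry.

```python
WORD_BITS = 16                       # assumed width
MASK = (1 << WORD_BITS) - 1
SIGN = 1 << (WORD_BITS - 1)

def sub_flags(a, b):
    """Return (result, N, Z) for a - b in two's complement; overflow is ignored."""
    result = (a - b) & MASK
    n = int(bool(result & SIGN))     # N: result is negative
    z = int(result == 0)             # Z: result is zero
    return result, n, z

def branch_taken(kind, n, z):
    """Decide whether a branch of the given kind is taken."""
    return {"BZ": z == 1, "BNZ": z == 0, "BN": n == 1, "BNN": n == 0}[kind]

# "Branch if R1 >= 5" as SUB R4, R1, #5 followed by BNN R4, target.
_, n, z = sub_flags(3, 5)
taken_for_3 = branch_taken("BNN", n, z)
_, n, z = sub_flags(5, 5)
taken_for_5 = branch_taken("BNN", n, z)
```

With R1 = 3 the difference is negative and the branch falls through; with R1 = 5 the difference is zero, so BNN is taken.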
18Translating the C if-then statement
- We can use branch instructions to translate high-level conditional statements into assembly code.
- Sometimes it's easier to invert the original condition. Here, we effectively changed the R1 < 0 test into R1 >= 0.
19Translating the C for loop
- Here is a translation of the for loop, using a
hypothetical BGT branch.
20Write the assembly code for the given C program
FOR:  for (R1 = 0; R1 < 5; R1++) {
        if (R2 > R3) {
L1:       R3 = R2 + 1;
        }
        else {
L2:       R2 = R3 + 1;
        }
      }
L3:   R2 = R2 + R3;
- How many registers do you need?
21The assembly code for the program given on the
previous page
      LD R1, #0
FOR:  SUB R4, R1, #5
      BNN R4, L3
      ADD R1, R1, #1
      SUB R4, R3, R2
      BNN R4, L2
L1:   ADD R3, R2, #1
      JMP FOR
L2:   ADD R2, R3, #1
      JMP FOR
L3:   ADD R2, R2, R3
- So we need a total of 4 registers.
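One way to gain confidence in this translation is to run it. The sketch below is a tiny interpreter covering only the five instructions used here, compared against a direct rendering of the loop. The tuple-based program format, the `run` and `reference` names, and the use of unbounded Python integers instead of fixed-width registers are all simplifying assumptions.

```python
def run(program, regs, max_steps=1000):
    """Interpret a list of (label, op, args) tuples and return the registers."""
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}

    def value(src):
        # '#n' is a constant, anything else names a register.
        return int(src[1:]) if src.startswith("#") else regs[src]

    pc = 0
    for _ in range(max_steps):
        if pc >= len(program):
            return regs
        _, op, args = program[pc]
        pc += 1
        if op == "LD":
            regs[args[0]] = value(args[1])
        elif op == "ADD":
            regs[args[0]] = value(args[1]) + value(args[2])
        elif op == "SUB":
            regs[args[0]] = value(args[1]) - value(args[2])
        elif op == "BNN":                 # branch if not negative
            if regs[args[0]] >= 0:
                pc = labels[args[1]]
        elif op == "JMP":
            pc = labels[args[0]]
        else:
            raise ValueError(f"unsupported operation: {op}")
    raise RuntimeError("step limit reached")

PROGRAM = [
    (None,  "LD",  ["R1", "#0"]),
    ("FOR", "SUB", ["R4", "R1", "#5"]),
    (None,  "BNN", ["R4", "L3"]),
    (None,  "ADD", ["R1", "R1", "#1"]),
    (None,  "SUB", ["R4", "R3", "R2"]),
    (None,  "BNN", ["R4", "L2"]),
    ("L1",  "ADD", ["R3", "R2", "#1"]),
    (None,  "JMP", ["FOR"]),
    ("L2",  "ADD", ["R2", "R3", "#1"]),
    (None,  "JMP", ["FOR"]),
    ("L3",  "ADD", ["R2", "R2", "R3"]),
]

def reference(r2, r3):
    """The same loop written directly; returns the final (R2, R3)."""
    for _ in range(5):
        if r2 > r3:
            r3 = r2 + 1
        else:
            r2 = r3 + 1
    return r2 + r3, r3

final = run(PROGRAM, {"R1": 0, "R2": 3, "R3": 10, "R4": 0})
```

Note that the assembly increments R1 at the top of the loop body rather than at the bottom; this is harmless here because the body never reads R1.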
22Summing up
- There are three main categories of instructions.
- Data manipulation operations, such as adding or shifting
- Data transfer operations to copy data between registers and RAM
- Control flow instructions to change the execution order
23Block diagram of a processor
- The control unit connects programs with the datapath.
- It converts program instructions into control words for the datapath, including signals WR, DA, AA, BA, MB, FS, MW, MD.
- It executes program instructions in the correct sequence.
- It generates the constant input for the datapath.
- The datapath also sends information back to the control unit. For instance, the ALU status bits V, C, N, Z can be inspected by branch instructions to alter a program's control flow.
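[Figure: block diagram. The program feeds the control unit, which sends control signals to the datapath; the datapath returns status signals to the control unit.]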
24A specific instruction set
- For now, let's stick with the three-address, register-to-register instruction set architecture introduced in the last lecture.
- Data manipulation instructions have one destination and up to two sources, which must be either registers or constants.
- We include dedicated load and store instructions to transfer data to and from memory.
25From assembly to machine language
- Next, we must define a machine language, or a binary representation of the assembly instructions that our processor supports.
- Our CPU includes three types of instructions, which have different operands and will need different representations.
- Register format instructions require two source registers.
- Immediate format instructions have one source register and one constant operand.
- Jump and branch format instructions need one source register and one constant address.
- Even though there are three different instruction formats, it is best to make their binary representations as similar as possible.
- This will make the control unit hardware simpler.
- We'll start by making all of our instructions 16 bits long.
26Register format
- An example register-format instruction
- ADD R1, R2, R3
- Our binary representation for these instructions will include
- A 7-bit opcode field, specifying the operation (e.g., ADD).
- A 3-bit destination register, DR.
- Two 3-bit source registers, SA and SB.
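The field packing can be sketched in Python as below. The opcode position (bits 15-9) is stated in these slides; placing DR in bits 8-6, SA in bits 5-3 and SB in bits 2-0 is an assumption inferred from the order in which the fields are listed. The SUB opcode 0000101 is taken from the worked answers later in the deck, and `encode_register_format` is a hypothetical helper name.

```python
def encode_register_format(opcode, dr, sa, sb):
    """Pack opcode (7 bits), DR, SA and SB (3 bits each) into a 16-bit word."""
    fields = (("opcode", opcode, 7), ("DR", dr, 3), ("SA", sa, 3), ("SB", sb, 3))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
    # Assumed layout: opcode 15-9, DR 8-6, SA 5-3, SB 2-0.
    return (opcode << 9) | (dr << 6) | (sa << 3) | sb

SUB_OPCODE = 0b0000101   # from the worked answers in these slides

# SUB R2, R2, R3
word = encode_register_format(SUB_OPCODE, dr=2, sa=2, sb=3)
bits = f"{word:016b}"
```

The range check matters because a register number above 7 would silently spill into the neighbouring field.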
27Immediate format
- An example immediate-format instruction
- ADD R1, R2, #3
- Immediate-format instructions will consist of
- A 7-bit instruction opcode.
- A 3-bit destination register, DR.
- A 3-bit source register, SA.
- A 3-bit constant operand, OP.
28Jump and branch format
- Two example jump and branch instructions
- BZ R3, -24
- JMP 18
- Jump and branch format instructions include
- A 7-bit instruction opcode.
- A 3-bit source register SA for branch conditions.
- A 6-bit address field, AD, for storing jump or branch offsets. So we can jump 31 addresses forward and 32 addresses backward.
- Our branch instructions support only one source register. Other types of branches can be simulated from these basic ones.
- The jumps made are relative to the current position.
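The 31-forward, 32-backward range follows from storing the offset as a 6-bit two's-complement number. A minimal Python sketch of that encoding is given below; the helper names are hypothetical, and splitting AD into a high half and a low half of three bits each is an assumption based on the later remark that AD is split around the SA field.

```python
def encode_offset(offset):
    """Encode a signed offset as a 6-bit two's-complement field."""
    if not -32 <= offset <= 31:
        raise ValueError(f"offset {offset} does not fit in 6 bits")
    return offset & 0x3F

def decode_offset(ad):
    """Recover the signed offset from a 6-bit field."""
    return ad - 64 if ad & 0x20 else ad

def split_ad(ad):
    """Split the 6-bit field into its high and low 3-bit halves (assumed layout)."""
    return ad >> 3, ad & 0b111

ad = encode_offset(-24)          # the offset used in BZ R3, -24
high, low = split_ad(ad)
```

For the offset -24 the field is 101000, so the high half is 101 and the low half is 000.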
29Instruction format uniformity
- Notice the similarities between the different instruction formats.
- The Opcode field always appears in the same position (bits 15-9).
- DR is in the same place for register and immediate instructions.
- The SA field also appears in the same position, even though this forced us to split AD into two parts for jumps and branches.
30Instruction formats and the datapath
- The instruction format and datapath are inter-related.
- Since register addresses (DR, SA and SB) are three bits each, this instruction set can only support eight registers.
- The constant operand (OP) is also three bits long. Its value will have to be sign-extended if the ALU supports wider inputs and outputs.
- It also means that a single jump can move at most 32 addresses. But that's okay, because we can chain several jumps together.
- Conversely, supporting more registers or larger constants would require us to increase the length of our machine language instructions.
31The usual beaten to death question
- Suppose you have a 32-bit instruction architecture with 16 registers. You should be able to handle 3 registers and one constant (with max magnitude 50) in an instruction. How many possible operations can you define using this ISA?
- Answer: Since there are 16 registers, you need 4 bits to address any register. A constant of magnitude 50 can be represented using 7 bits (remember 2's complement notation). So the total bits occupied by 3 registers and 1 constant = 3 × 4 + 7 = 19. Bits left for opcodes = 32 - 19 = 13. Hence the total number of opcodes possible = 2^13 = 8192.
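The same bit budget can be worked out programmatically, which helps when the numbers in the question change. The function below is an illustrative sketch with a made-up name; it assumes the constant must be able to hold both +m and -m in two's complement.

```python
def opcode_count(instr_bits, num_regs, reg_operands, max_magnitude):
    """Return (operand bits, opcode bits, number of opcodes)."""
    # Bits needed to name one of num_regs registers.
    reg_bits = (num_regs - 1).bit_length()
    # Magnitude bits plus one sign bit, so both +m and -m are representable.
    const_bits = max_magnitude.bit_length() + 1
    operand_bits = reg_operands * reg_bits + const_bits
    opcode_bits = instr_bits - operand_bits
    return operand_bits, opcode_bits, 2 ** opcode_bits

operand_bits, opcode_bits, opcodes = opcode_count(32, 16, 3, 50)
```

For the question above this gives 19 operand bits, 13 opcode bits and 8192 possible opcodes.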
32Organizing our instructions
- How can we select binary opcodes for each possible operation?
- In general, similar instructions should have similar opcodes. Again, this will lead to simpler control unit hardware.
- We can divide our instructions into eight different categories, each of which requires similar datapath control signals.
- To show the similarities within categories, we'll look at register-based ALU operations and memory write operations in detail.
33Register format ALU operations
- ADD R1, R2, R3
- All register format ALU operations need the same values for the following control signals
- MB = 0, because all operands come from the register file.
- MD = 0 and WR = 1, to save the ALU result back into a register.
- MW = 0 since RAM is not modified.
34Memory write operations
- ST (R0), R1
- All memory write operations need the same values for the following control signals
- MB = 0, because the data to write comes from the register file.
- MD = X and WR = 0, since none of the registers are changed.
- MW = 1, to update RAM.
35Selecting opcodes
- Instructions in each of these categories are similar, so it would be convenient if those instructions had similar opcodes.
- We'll assign opcodes so that all instructions in the same category will have the same first three opcode bits (bits 15-13 of the instruction).
- Next time we'll talk about the other instruction categories shown here.
36ALU and shift instructions
- What about the rest of the opcode bits?
- For ALU and shift operations, let's fill in bits 12-9 of the opcode with FS3-FS0 of the five-bit ALU function select code.
- For example, a register-based XOR instruction would have the opcode 0001100.
- The first three bits 000 indicate a register-based ALU instruction.
- 1100 denotes the ALU XOR function.
- An immediate shift left instruction would have the opcode 1011000.
- 101 indicates an immediate shift.
- 1000 denotes a shift left.
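Building an opcode from a category and a function select code is a simple shift-and-mask, sketched below in Python. The `make_opcode` helper is a hypothetical name; the category and FS values are the ones quoted in the two examples above.

```python
def make_opcode(category, fs):
    """Combine a 3-bit category with FS3-FS0 into a 7-bit opcode."""
    if not 0 <= category < 8:
        raise ValueError("category must fit in 3 bits")
    # Only the low four bits of the function select code are stored.
    return (category << 4) | (fs & 0b1111)

REGISTER_ALU = 0b000       # register-based ALU instruction
IMMEDIATE_SHIFT = 0b101    # immediate shift

xor_opcode = make_opcode(REGISTER_ALU, 0b1100)      # XOR function bits
shl_opcode = make_opcode(IMMEDIATE_SHIFT, 0b1000)   # shift left function bits
```

Because FS3-FS0 are copied straight into the opcode, the control unit can forward those bits to the ALU without any further decoding.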
37Branch instructions
- We'll implement branch instructions for the eight different conditions shown here.
- Bits 11-9 of the opcode field will indicate the type of branch. (We only need three bits to select one of eight branches, so opcode bit 12 won't be needed.)
- For example, the branch if zero instruction BZ would have the opcode 110x011.
- The first three bits 110 indicate a branch.
- 011 specifies branch if zero.
38Sample opcodes
- Here are some more examples of instructions and their corresponding opcodes in our instruction set.
- Several opcodes have unused bits.
- We only need three bits to distinguish eight types of branches.
- There is only one kind of jump and one kind of load instruction.
- These unused opcodes allow for future expansion of the instruction set. For instance, we might add new instructions or new addressing modes.
39Convert the following assembly program to machine
code
- Answers
- LD R1, #4         101 0000 001 xxx 100
- LD R3, (R2)       011 xxxx 011 010 xxx
- SUB R2, R2, R3    000 0101 010 010 011
- BNN R2, 2         110 x101 000 010 010
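Going the other way, a 16-bit word can be split back into its fields to check answers like these. The sketch below assumes the layout opcode 15-9, then three 3-bit fields in bits 8-6, 5-3 and 2-0, and it fills don't-care bits (x) with 0; `decode_fields` and `branch_offset_field` are hypothetical helper names.

```python
def decode_fields(word):
    """Split a 16-bit instruction into opcode and three 3-bit fields."""
    return {
        "opcode": (word >> 9) & 0x7F,
        "hi": (word >> 6) & 0b111,    # DR, or the high half of AD
        "SA": (word >> 3) & 0b111,
        "lo": word & 0b111,           # SB, OP, or the low half of AD
    }

def branch_offset_field(fields):
    """Reassemble the 6-bit AD field of a jump or branch."""
    return (fields["hi"] << 3) | fields["lo"]

sub = decode_fields(0b0000101_010_010_011)   # SUB R2, R2, R3
bnn = decode_fields(0b1100101_000_010_010)   # BNN R2, 2 with x taken as 0
ad = branch_offset_field(bnn)
```

For the branch, the two halves of AD sit on either side of SA and recombine to the offset 2.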
40Summary
- Today we defined a binary machine language for the instruction set from last week.
- Different instructions have different operands and formats, but keeping the formats uniform will help simplify our hardware.
- We also try to assign similar opcodes to similar instructions.
- The instruction encodings and datapath are closely related. For example, our opcodes include ALU selection codes, and the number of available registers is limited by the size of each instruction.
- This is just one example of how to define a machine language.