Title: William Stallings Computer Organization and Architecture 5th Edition
1 William Stallings Computer Organization and Architecture, 5th Edition
- Chapter 12
- Reduced Instruction Set Computers
2 Topics
- Major Advances in Computers
- Instruction Execution Characteristics
- Use of Large Register File
- Compiler-Based Register Optimization
- Reduced Instruction Set Architecture
- RISC Pipelining
- RISC vs. CISC Controversy
3 Major Advances in Computers (1)
- The family concept
- IBM System/360 1964
- DEC PDP-8
- Separates architecture from implementation
- Microprogrammed control unit
- Idea by Wilkes 1951
- Produced by IBM S/360 1964
- Cache memory
- IBM S/360 model 85 1969
4 Major Advances in Computers (2)
- Solid State RAM
- (See memory notes)
- Microprocessors
- Intel 4004 1971
- Pipelining
- Introduces parallelism into fetch execute cycle
- Multiple processors
5 The Next Step - RISC
- RISC: Reduced Instruction Set Computer
- Key features
- Large number of general purpose registers, or use of compiler technology to optimize register use
- Limited and simple instruction set
- Emphasis on optimising the instruction pipeline
6 Comparison of processors
7 Driving force for CISC (1)
- CISC: Complex Instruction Set Computer
- Why CISC?
- Software costs far exceed hardware costs
- Increasingly complex high level languages
- Semantic gap: the difference between operations provided in HLLs and those provided in the computer architecture
8 Driving force for CISC (2)
- Closing the semantic gap leads to
- Large instruction sets
- More addressing modes
- Hardware implementations of HLL statements
- e.g. CASE (switch) on VAX
9 Intention of CISC
- Ease compiler writing
- Improve execution efficiency
- Complex operations in microcode
- Support more complex HLLs
- A totally different approach: simpler architecture
10 Execution Characteristics
- The development of RISC was based on studies of instruction execution characteristics
- Operations performed
- Determine the functions to be performed by the processor and its interaction with memory
- Operands used (types and frequencies)
- Determine memory organization and addressing modes
- Execution sequencing
- Determines the control and pipeline organization
11 Execution Characteristics
- In the remainder of this section, we summarize the results of a number of studies of high-level-language programs. All of the results are based on dynamic measurements.
- Dynamic measurements are collected while the program executes, so each statement is counted every time it runs.
- Static measurements merely perform these counts on the source text of a program.
12 Operations
- Table 4.9 reveals
- Assignment statements predominate
- Movement of data is of high importance
- Preponderance of conditional statements (IF, LOOP)
- Sequence control is important
13 Relative Dynamic Frequency (%)
          Dynamic          Machine Instruction   Memory Reference
          Occurrence       (Weighted)            (Weighted)
          Pascal   C       Pascal   C            Pascal   C
Assign    45       38      13       13           14       15
Loop      5        3       42       32           33       26
Call      15       12      31       33           44       45
If        29       43      11       21           7        13
GoTo      -        3       -        -            -        -
Other     6        1       3        1            2        1
14 Operations
- Procedure call-return is very time consuming
- Some HLL instructions lead to many machine code operations
15 Operands
- Mainly local scalar variables
- Optimisation should concentrate on accessing local variables
Operand type (%)    Pascal   C     Average
Integer constant    16       23    20
Scalar variable     58       53    55
Array/structure     26       24    25
16 Procedure Calls
- Very time consuming
- Two aspects are significant for an efficient implementation
- Number of parameters passed
- Level of nesting
- Most programs do not do a lot of calls followed by lots of returns
- Most variables are local
17 Implications
- Making the instruction set architecture close to the HLL is not the most effective approach
- Best support is given by optimising the most used and most time-consuming features
18 Implications
- Generalizing from the work of a number of researchers, three elements emerge that, by and large, characterize RISC architectures.
- Large number of registers
- Operand referencing optimization: exploiting locality of references reduces memory references
- Careful design of pipelines
- Conditional branches and procedure calls
- Simplified (reduced) instruction set
19 Use of Large Register File
- From the analysis
- Large number of assignment statements
- Most accesses are to local scalars
- This implies heavy reliance on register storage
- Goal: minimize memory accesses
20 Approaches
- Software solution to maximize register usage
- Require the compiler to allocate registers to the variables used most in a given period of time
- Requires sophisticated program analysis
- Hardware solution
- Have more registers
- Thus more variables will be in registers
21 Registers for Local Variables
- Store local scalar variables in registers
- Reduces memory access
- Some problems
- Every procedure (function) call changes locality
- On every call, local variables must be saved to memory
- Parameters must be passed
- On return, results must be returned and the variables of the calling program must be restored
22 Register Windows
- Solution: register windows
- Organize the registers to realize this goal
- From the analysis
- Only a few parameters and local variables
- Limited range of depth of call
- Therefore
- Use multiple small sets of registers
- Calls switch to a different set of registers
- Returns switch back to a previously used set of registers
23 Overlapping Register Windows
- Three areas within a register set
- Parameter registers
- Local registers
- Temporary registers
24 Register Windows cont.
- Temporary registers from one set overlap parameter registers from the next
- Temporary registers at one level are physically the same as the parameter registers at the next lower level
- This allows parameter passing without moving data
25 Circular Buffer diagram
The actual organization of the register file is as a circular buffer of overlapping windows.
26 Operation of Circular Buffer
- When a call is made, a current window pointer (CWP) is moved to show the currently active register window
- If all windows are in use, an interrupt is generated and the oldest window (the one furthest back in the call nesting) is saved to memory (only its .in and .loc registers need to be saved)
- A saved window pointer indicates where the next saved window should be restored to
27 Operation of Circular Buffer (2)
- Studies show 8 windows are enough to handle up to 99% of calls/returns without save/restore
- E.g., Berkeley RISC uses 8 windows of 16 registers each
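The call/return behaviour described on the last two slides can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not a model of any real processor: the class `RegisterWindows`, its counters, and the helper `run_nested` are hypothetical names, every window is treated as usable, and the overlap between adjacent windows is ignored.

```python
class RegisterWindows:
    """Toy circular buffer of register windows (hypothetical model)."""

    def __init__(self, num_windows=8):
        self.num_windows = num_windows
        self.cwp = 0          # current window pointer
        self.resident = 1     # windows currently held in the register file
        self.saved = 0        # windows spilled to memory
        self.overflows = 0    # interrupts taken on call
        self.underflows = 0   # interrupts taken on return

    def call(self):
        if self.resident == self.num_windows:
            # All windows in use: save the oldest window to memory.
            self.overflows += 1
            self.saved += 1
            self.resident -= 1
        self.cwp = (self.cwp + 1) % self.num_windows
        self.resident += 1

    def ret(self):
        if self.resident == 1:
            if self.saved == 0:
                raise RuntimeError("return without a matching call")
            # Caller's window was spilled: restore it from memory.
            self.underflows += 1
            self.saved -= 1
            self.resident += 1
        self.cwp = (self.cwp - 1) % self.num_windows
        self.resident -= 1


def run_nested(depth, num_windows):
    """Make `depth` nested calls, then return from all of them."""
    windows = RegisterWindows(num_windows)
    for _ in range(depth):
        windows.call()
    for _ in range(depth):
        windows.ret()
    return windows.overflows, windows.underflows


print(run_nested(5, 8))    # shallow nesting fits in the register file
print(run_nested(10, 4))   # deep nesting forces saves and restores
```

In this model, memory traffic appears only when the nesting depth exceeds the number of windows, which is why a limited call depth makes a small number of windows sufficient.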
28 Global Variables - 2 Options
- Allocated by the compiler to memory
- Straightforward
- Inefficient for frequently accessed variables
- Have a set of registers for global variables
- e.g., registers 0 - 7 global
- 8 - 31 local to current window
- Increased hardware burden
- Compiler must decide which global variables should be assigned to registers
29 Registers v Cache
Large Register File                        Cache
All local scalars                          Recently used local scalars
Individual variables                       Blocks of memory
Compiler assigned global variables         Recently used global variables
Save/restore based on procedure nesting    Save/restore based on caching algorithm
Register addressing                        Memory addressing
32 Referencing a Scalar - Window Based Register File
[Figure: register file access using the window number and the virtual register number]
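How a window number and a virtual register number might select a physical register can be sketched as follows. The layout is an assumption chosen to match the figures quoted earlier (registers 0 - 7 global, 8 - 31 windowed, 8 windows advancing by 16 registers): 8 parameter, 8 local and 8 temporary registers per window. The function `physical_register` and all constants are hypothetical.

```python
NUM_GLOBALS = 8        # virtual registers 0-7 (assumed layout)
WINDOW_STEP = 16       # physical registers between successive windows
NUM_WINDOWS = 8
WINDOWED_TOTAL = NUM_WINDOWS * WINDOW_STEP   # 128 windowed physical registers


def physical_register(window, virtual):
    """Map (window number, virtual register 0-31) to a physical register index."""
    if not 0 <= virtual < 32:
        raise ValueError("virtual register out of range")
    if virtual < NUM_GLOBALS:
        return virtual                        # globals are shared by all windows
    offset = virtual - NUM_GLOBALS            # 0-7 parameter, 8-15 local, 16-23 temporary
    return NUM_GLOBALS + (window * WINDOW_STEP + offset) % WINDOWED_TOTAL


# Temporary registers of window 0 are the parameter registers of window 1.
print(physical_register(0, 24), physical_register(1, 8))
```

Because the window step (16) is smaller than the 24 windowed registers visible at once, the last 8 registers of one window coincide with the first 8 of the next, and the modulo makes the file wrap around as a circular buffer.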
33 Referencing a Scalar - Cache
[Figure: cache access for a scalar reference]
34 Compiler Based Register Optimization
- Assume small number of registers (16-32)
- HLL programs have no explicit references to registers
- The objective of the compiler is to keep the operands for as many computations as possible in registers rather than main memory, and to minimize load-and-store operations.
35 Compiler Based Register Optimization cont.
- Each quantity is assigned to a symbolic or virtual register
- Map (unlimited) symbolic registers to real registers
- Symbolic registers whose use does not overlap can share real registers
- If you run out of real registers, some variables use memory
36 Optimization
- The essence of the optimization task is to decide which quantities are to be assigned to registers
- The technique is known as graph coloring
- Used in RISC compilers
- Borrowed from the discipline of topology
37 Graph Coloring
- Given a graph of nodes and edges
- Assign a color to each node
- Adjacent nodes have different colors
- Use minimum number of colors
- Nodes are symbolic registers
38 Graph Coloring cont.
- Two registers that are live in the same program fragment are joined by an edge
- Try to color the graph with n colors, where n is the number of real registers
- Nodes that cannot be colored are placed in memory
39 Graph Coloring Approach
- Assume a program with six symbolic registers to be compiled into three actual registers
[Figure: parts (a) and (b)]
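The graph coloring idea can be sketched with six symbolic registers and three real registers. This is a simple greedy heuristic chosen for illustration, not necessarily what any particular RISC compiler does. The live ranges below are invented and are not the ones in the slide's figure; `build_interference` and `color_registers` are hypothetical helper names.

```python
def build_interference(live_ranges):
    """live_ranges maps a symbolic register to its (first, last) use, inclusive."""
    graph = {name: set() for name in live_ranges}
    names = list(live_ranges)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            a_start, a_end = live_ranges[a]
            b_start, b_end = live_ranges[b]
            if a_start <= b_end and b_start <= a_end:   # live at the same time
                graph[a].add(b)
                graph[b].add(a)
    return graph


def color_registers(graph, num_real):
    """Greedy coloring: colors are real register numbers 0..num_real-1."""
    assignment, spilled = {}, []
    # Visit the most constrained nodes first (highest degree).
    for node in sorted(graph, key=lambda n: len(graph[n]), reverse=True):
        used = {assignment[nb] for nb in graph[node] if nb in assignment}
        free = [r for r in range(num_real) if r not in used]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)          # no color left: keep it in memory
    return assignment, spilled


# Invented live ranges for six symbolic registers A-F.
live = {"A": (0, 3), "B": (1, 5), "C": (2, 4), "D": (3, 7), "E": (6, 9), "F": (8, 10)}
graph = build_interference(live)
assignment, spilled = color_registers(graph, num_real=3)
print(assignment, spilled)
```

Here A, B, C and D are all live at time 3, so with three real registers one of them cannot be colored and is placed in memory, while E and F reuse registers freed earlier.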
40 A Trade-Off
- A trade-off between a large register file and compiler-based register optimization
- With even simple register optimization, there is little benefit to the use of more than 64 registers
- With reasonably sophisticated register optimization techniques, there is only marginal performance improvement with more than 32 registers
- Studies show
- 64 registers are enough with simple register optimization
- 32 registers are enough with sophisticated register optimization
41 Reduced Instruction Set Architecture: Why CISC (1)?
- Why CISC?
- Ease compiler writing
- Improve execution efficiency
- Compiler simplification?
- Disputed
- Complex machine instructions are harder to exploit
- Optimization is more difficult
- e.g. minimizing code size, enhancing pipelining
42 Why CISC (2)?
- Smaller programs?
- Program takes up less memory
- But memory is now cheap
- Fewer instructions to be fetched, reducing page faults
- A program that is shorter in symbolic machine language may not occupy fewer bits
- More instructions require longer op-codes
- RISC designs tend to emphasize register references, which require fewer bits
43 Why CISC (1)?
- Code Size Relative to RISC I (11 C Programs)
Machine       Type   Relative code size
RISC I        RISC   1.0
VAX-11/780    CISC   0.8
M68000        CISC   0.9
Z8002         CISC   1.2
PDP-11/70     CISC   0.9
44 Why CISC (3)?
- Faster programs?
- More complex control unit
- Microprogram control store larger
- Thus simple instructions take longer to execute
- It is far from clear that CISC is the appropriate solution
45 RISC Characteristics
- One instruction per cycle
- Register to register operations
- Few, simple addressing modes
- Few, simple instruction formats
46 One Instruction Per Machine Cycle
- In a machine cycle
- Fetch two operands from registers
- Perform an ALU operation
- Store the result in a register
- There is little or no need for microcode
- Machine instructions can be hardwired
- Such instructions should execute faster than comparable machine instructions on other machines.
47 Register-to-Register Operations
- Most operations are register-to-register
- Only LOAD and STORE access memory
- Simplify instruction set and control unit
- A RISC may include only 1 or 2 ADD instructions
- VAX has 25 different ADD instructions
- Encourages the optimization of register use
- Frequently accessed operands remain in high-speed storage
48 Simple Addressing Modes
- Almost all RISC instructions use simple register addressing
- May include several additional modes
- Displacement and PC-relative
- Simplify instruction set and control unit
49 Simple Instruction Formats
- Only one or a few formats are used
- Instruction length is fixed and aligned on word boundaries
- A single instruction does not cross page boundaries
- Field locations, especially the opcode, are fixed
- Opcode decoding and register operand accessing can occur simultaneously
- Simplify control unit
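Why fixed field locations help can be sketched with a made-up 32-bit format. The field layout below (6-bit opcode, three 5-bit register fields) is an assumption for illustration only and is not taken from any specific RISC; `encode` and `decode` are hypothetical helpers.

```python
# Assumed layout: bits 31-26 opcode | 25-21 rd | 20-16 rs1 | 15-11 rs2 | 10-0 unused
OPCODE_SHIFT, OPCODE_MASK = 26, 0x3F
RD_SHIFT, RS1_SHIFT, RS2_SHIFT, REG_MASK = 21, 16, 11, 0x1F


def encode(opcode, rd, rs1, rs2):
    """Pack the fields into one fixed-length 32-bit word."""
    return (((opcode & OPCODE_MASK) << OPCODE_SHIFT)
            | ((rd & REG_MASK) << RD_SHIFT)
            | ((rs1 & REG_MASK) << RS1_SHIFT)
            | ((rs2 & REG_MASK) << RS2_SHIFT))


def decode(word):
    """Each field uses a constant shift and mask; none depends on the opcode."""
    return {
        "opcode": (word >> OPCODE_SHIFT) & OPCODE_MASK,
        "rd": (word >> RD_SHIFT) & REG_MASK,
        "rs1": (word >> RS1_SHIFT) & REG_MASK,
        "rs2": (word >> RS2_SHIFT) & REG_MASK,
    }


word = encode(opcode=5, rd=3, rs1=1, rs2=2)
print(hex(word), decode(word))
```

Since the register fields are found at the same bit positions in every instruction, hardware can start reading the register operands while the opcode is still being decoded.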
50 CISC v RISC
- Typical of a RISC
- A single instruction size (typically 4 bytes)
- A small number of data addressing modes (typically fewer than five)
- No indirect addressing
- No operations that combine load/store with arithmetic
- No more than one memory-addressed operand per instruction
- Does not support arbitrary alignment of data for load/store operations
51 CISC v RISC
- RISC designs may benefit from the inclusion of some CISC features
- Example: PowerPC
- CISC designs may benefit from the inclusion of some RISC features
- Example: Pentium
52 RISC Pipelining
- Most RISC instructions are register to register
- Two phases of execution
- I: Instruction fetch
- E: Execute (ALU operation with register input and output)
- For load and store, three phases
- I: Instruction fetch
- E: Execute (calculate memory address)
- D: Memory (register to memory or memory to register operation)
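The phase counts above give a baseline for execution time without any pipelining. The sketch below assumes a hypothetical five-instruction sequence (two loads, an add, a store and a branch) and one time unit per phase. Treating the branch as a two-phase instruction is also an assumption.

```python
# Phases per instruction class, as listed on the slide.
PHASES = {
    "reg": ("I", "E"),        # register-to-register: fetch, execute
    "mem": ("I", "E", "D"),   # load/store: fetch, address calculation, memory
}


def sequential_time(kinds):
    """Time units when each instruction finishes before the next one starts."""
    return sum(len(PHASES[kind]) for kind in kinds)


# Assumed sequence: load, load, add, store, branch.
program = ["mem", "mem", "reg", "mem", "reg"]
print(sequential_time(program))   # 3 + 3 + 2 + 3 + 2 time units
```

Pipelining aims to reduce this total by overlapping the I phase of one instruction with the E or D phase of the previous one.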
53 Effects of Pipelining
[Figure: timing diagrams, panels "Only one memory access per phase" and "Permitting two memory accesses per phase"]
54 Effects of Pipelining
[Figure: timing diagram, time axis 1-13]
55 Effects of Pipelining
- Only one memory access per phase
[Figure: timing diagram, time axis 1-10]
56 Effects of Pipelining
- Permitting two memory accesses per phase
[Figure: timing diagram, time axis 1-8]
57 Effects of Pipelining
[Figure: timing diagram, time axis 1-10]
58 Optimization of Pipelining
- Delayed branch
- Makes use of a branch that does not take effect until after execution of the following instruction. The instruction location immediately following the branch is referred to as the delay slot.
59 Normal and Delayed Branch
Address   Normal        Delayed       Optimized
100       LOAD X,A      LOAD X,A      LOAD X,A
101       ADD 1,A       ADD 1,A       JUMP 105
102       JUMP 105      JUMP 106      ADD 1,A
103       ADD A,B       NOOP          ADD A,B
104       SUB C,B       ADD A,B       SUB C,B
105       STORE A,Z     SUB C,B       STORE A,Z
106                     STORE A,Z
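The three columns of the table can be checked with a toy interpreter. This is a sketch under assumptions: `LOAD X,A` is read as loading memory location X into register A, `ADD 1,A` as adding 1 to A, and `STORE A,Z` as storing A to memory location Z. The function `run` and the tuple encoding of the programs are hypothetical.

```python
def run(program, memory, delayed):
    """Execute a toy program; returns (value stored in Z, instructions executed)."""
    regs = {"A": 0, "B": 0, "C": 0}
    pc = min(program)
    pending = None            # branch target waiting for the delay slot
    executed = 0
    while pc in program:
        op, *args = program[pc]
        executed += 1
        next_pc = pc + 1
        if pending is not None:
            # This instruction is in the delay slot: it runs, then the branch takes effect.
            next_pc, pending = pending, None
        if op == "LOAD":
            regs[args[1]] = memory[args[0]]
        elif op == "ADD":
            src = args[0]
            regs[args[1]] += src if isinstance(src, int) else regs[src]
        elif op == "SUB":
            regs[args[1]] -= regs[args[0]]
        elif op == "STORE":
            memory[args[1]] = regs[args[0]]
        elif op == "JUMP":
            if delayed:
                pending = args[0]
            else:
                next_pc = args[0]
        # NOOP falls through and does nothing.
        pc = next_pc
    return memory["Z"], executed


NORMAL = {100: ("LOAD", "X", "A"), 101: ("ADD", 1, "A"), 102: ("JUMP", 105),
          103: ("ADD", "A", "B"), 104: ("SUB", "C", "B"), 105: ("STORE", "A", "Z")}
DELAYED = {100: ("LOAD", "X", "A"), 101: ("ADD", 1, "A"), 102: ("JUMP", 106),
           103: ("NOOP",), 104: ("ADD", "A", "B"), 105: ("SUB", "C", "B"),
           106: ("STORE", "A", "Z")}
OPTIMIZED = {100: ("LOAD", "X", "A"), 101: ("JUMP", 105), 102: ("ADD", 1, "A"),
             103: ("ADD", "A", "B"), 104: ("SUB", "C", "B"), 105: ("STORE", "A", "Z")}

print(run(NORMAL, {"X": 7}, delayed=False))
print(run(DELAYED, {"X": 7}, delayed=True))
print(run(OPTIMIZED, {"X": 7}, delayed=True))
```

All three versions store the same value in Z. The delayed version spends one extra instruction on the NOOP, while the optimized version fills the delay slot with the useful `ADD 1,A` instead.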
60 Use of Delayed Branch
[Figure: pipeline timing diagrams, time axis 1-7]
64 Controversy
- Problems
- No pair of RISC and CISC machines that are directly comparable
- No definitive set of test programs
- Difficult to separate hardware effects from compiler effects
- Most comparisons done on toy rather than production machines
- Most commercial devices are a mixture of RISC and CISC characteristics