Title: Chapter 1: Fundamentals of Computer Design
1Chapter 1 Fundamentals of Computer Design
- What is computer architecture?
- Why study computer architecture?
- Performance
- What is performance: latency, throughput
- The performance equation
- Amdahl's law
- Measuring performance
- Cost
2What is Computer Architecture?
- Instruction set architecture
- Instructions visible to programmer
- e.g., SPARC V8 vs. V9, Intel IA32 vs. IA64
- Organization
- High-level aspects of the system
- e.g., how many functional units, size of the cache, pipeline organization
- e.g., Ultra II vs. Ultra III
- e.g., Pentium III vs. Pentium 4
- Implementation or hardware
- Logic design
- e.g., 1.8 GHz vs. 2.4 GHz Pentium 4
3Goals of the Computer Architect
- Depends on type of computer
- Supercomputer
- Server
- Desktop
- Embedded
5Goals of the Computer Architect
- Functional goals
- Meet application area demands
- Compatibility with previous systems
- Standards (e.g., IEEE floating point)
- Last through trends
- Performance
- Cost
- Power
- Energy
- Dependability
- Need to be familiar with design alternatives and
criteria for selecting among them
6Historical Trends
- Figure 1.1
- System performance is quadrupling every three years
- But this is not a law of nature!
7Why Study Computer Architecture?
- Technology changes fast and on different curves
- Applications change
- From scientific to personal computing, databases, graphics, multimedia, communications
- Compiler / hardware boundary shifts
- New languages (e.g., shift from assembly to high-level languages)
- For many apps, the exponential curve is not enough!
- Parallel computing
10Relationship to Prerequisites
- Prerequisite
- How to design a uniprocessor?
- This course
- How to design a uniprocessor WELL?
- Focus on performance
- Emphasis on Quantitative vs. Qualitative
- Common parallel architectures
- Be sure to check the handout for details on the prerequisites
11What is Performance?
- Two Metrics
- Latency (or response time or execution time)
- Time from start to finish of a task
- Throughput (or bandwidth)
- Rate of task completion
- Rate of task initiation
- 1 / (time between task completions)
14Performance (Cont.)
- Definition: X is n% faster than Y if

$$\frac{\text{Execution Time}_Y}{\text{Execution Time}_X} = 1 + \frac{n}{100}$$

- Example: X = 1 minute, Y = 2 minutes
- X is 100% faster than Y
- Example: Automobile assembly line starts one car per hour and holds 20 cars
- Latency = 20 hours
- Throughput = one car per hour
- Throughput > 1/Latency due to overlap
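The two examples on this slide can be checked with a few lines of Python. This is a minimal sketch; the function names `percent_faster` and `pipeline_metrics` are illustrative and not part of the course material, and the pipeline model assumes a steady state in which every stage is busy.

```python
def percent_faster(time_x: float, time_y: float) -> float:
    """Return n such that X is n% faster than Y.

    Uses the slide's definition: time_y / time_x = 1 + n/100.
    Both times must be in the same unit.
    """
    return (time_y / time_x - 1.0) * 100.0


def pipeline_metrics(start_interval_hours: float, items_in_flight: int):
    """Latency and throughput of a steady-state pipeline.

    Assumes one item is started every `start_interval_hours` and that
    `items_in_flight` items are being worked on at once.
    """
    latency = start_interval_hours * items_in_flight  # hours per item
    throughput = 1.0 / start_interval_hours           # items per hour
    return latency, throughput


# X takes 1 minute, Y takes 2 minutes.
print(percent_faster(1.0, 2.0))

# Assembly line: one car started per hour, 20 cars on the line.
latency, throughput = pipeline_metrics(1.0, 20)
print(latency, throughput, throughput > 1.0 / latency)
```

Because 20 cars overlap on the line, throughput (1 car/hour) is far higher than 1/latency (0.05 cars/hour).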
15Key Performance Equation
$$\text{CPU time} = \frac{\text{instructions}}{\text{program}} \times \frac{\text{cycles}}{\text{instruction}} \times \frac{\text{time}}{\text{cycle}}$$
- Instructions per program (path length)
- ISA and compiler
- Cycles per instruction (CPI)
- ISA and organization (e.g., cache misses)
- Time per cycle (clock time, cycle time)
- Organization and hardware
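As a rough illustration, CPU time is the product of instruction count, cycles per instruction, and time per cycle. The sketch below evaluates that product; the workload numbers (one billion instructions, CPI of 1.5) are made up for illustration, and `cpu_time` is a hypothetical helper name.

```python
def cpu_time(instructions: float, cpi: float, clock_hz: float) -> float:
    """CPU time in seconds = instructions x CPI x time per cycle."""
    cycle_time = 1.0 / clock_hz  # seconds per cycle
    return instructions * cpi * cycle_time


# Hypothetical program: 1e9 instructions at an average CPI of 1.5.
for clock_hz in (1.8e9, 2.4e9):
    seconds = cpu_time(1e9, 1.5, clock_hz)
    print(f"{clock_hz / 1e9:.1f} GHz: {seconds:.3f} s")
```

Each factor is influenced by a different layer: the ISA and compiler set the instruction count, the ISA and organization set CPI, and the organization and hardware set the cycle time, so an improvement in one factor only helps if it does not worsen the others.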
16Amdahl's Law
- (Or why the common case matters most)
- Let

$$\text{Speedup} = \frac{\text{old latency}}{\text{new latency}} = \frac{\text{new rate}}{\text{old rate}}$$

- Consider an enhancement x that speeds up fraction $f_x$ of a task by $S_x$
- Amdahl's law gives

$$\text{Speedup}_{\text{overall}} = \frac{[(1 - f_x) + f_x] \times \text{old latency}}{(1 - f_x) \times \text{old latency} + (f_x / S_x) \times \text{old latency}} = \frac{1}{(1 - f_x) + f_x / S_x}$$
17Amdahl's Law, cont.
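- Example: $f_x = 95\%$ and $S_x = 1.10$

$$\text{Speedup}_{\text{overall}} = \frac{1}{(1 - 0.95) + (0.95/1.10)} \approx 1.094$$

- Example: $f_x = 5\%$ and $S_x = 10$

$$\text{Speedup}_{\text{overall}} = \frac{1}{(1 - 0.05) + (0.05/10)} \approx 1.047$$

- Example: $f_x = 5\%$ and $S_x = \infty$

$$\text{Speedup}_{\text{overall}} = \frac{1}{(1 - 0.05) + (0.05/\infty)} \approx 1.052$$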
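The three examples can be reproduced numerically from Amdahl's law, overall speedup = 1 / ((1 - f) + f / S). This is a small sketch; `amdahl_speedup` is an illustrative name, and the infinite speedup is modelled with `math.inf`.

```python
import math


def amdahl_speedup(fraction: float, speedup: float) -> float:
    """Overall speedup when `fraction` of the task is sped up by `speedup`.

    Implements 1 / ((1 - f) + f / S). Passing math.inf for `speedup`
    makes the enhanced part take zero time.
    """
    return 1.0 / ((1.0 - fraction) + fraction / speedup)


examples = [(0.95, 1.10), (0.05, 10.0), (0.05, math.inf)]
for f, s in examples:
    print(f"f = {f:.2f}, S = {s}: overall = {amdahl_speedup(f, s):.4f}")
```

A modest 10% improvement to 95% of the task beats even an infinite improvement to 5% of it.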
18Amdahl's Law Corollary
- Since $S_x \to \infty$ implies

$$\text{Speedup}_{\text{overall}} = \frac{1}{(1 - f_x) + (f_x/\infty)} = \frac{1}{1 - f_x}$$

- For all real speedups

$$\text{Speedup}_{\text{overall}} < \frac{1}{1 - f_x}$$

- Or: make the common case fast
- An application?
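One way to apply the corollary is to turn the bound around: since the overall speedup can never reach 1 / (1 - f), a target overall speedup T requires f > 1 - 1/T no matter how large the local speedup is. The sketch below assumes that reading; the helper names and the sample targets are illustrative only.

```python
def speedup_upper_bound(fraction: float) -> float:
    """Limit of the overall speedup as the local speedup goes to infinity."""
    return 1.0 / (1.0 - fraction)


def min_fraction_for_target(target_speedup: float) -> float:
    """Fraction of the task that must be exceeded to make `target_speedup`
    possible at all, obtained by solving 1 / (1 - f) = target for f."""
    return 1.0 - 1.0 / target_speedup


for target in (2.0, 10.0, 100.0):
    f = min_fraction_for_target(target)
    print(f"overall speedup {target:g}x needs more than {f:.0%} of the task enhanced")

print(f"enhancing 5% of the task can never exceed {speedup_upper_bound(0.05):.4f}x")
```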
19Measuring Performance
- MIPS, MFLOPS don't mean much
- Benchmarks
- Real programs
- Representative of real workload
- Only way to characterize performance
- SPEC89 → SPEC92 → SPEC95 → SPEC CPU2000
- TPC
- Kernels
- "Representative" program fragments
- Often not representative of full applications
- EEMBC for embedded systems
- Toy benchmarks and synthetic benchmarks
- Don't mean much
20Cost
- Cost is very important in most real designs
- But usually hard to quantify for the architect
- Costs change over time
- Learning curve lowers manufacturing costs
- Technology improvements lower costs
- E.g., DRAM generation price falls by 10 to 30x over lifetime
- Figures 1.5 and 1.6
- Focus on IC costs: bigger price variable
21Integrated Circuit Cost
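$$\text{Cost of IC} = \frac{\text{Cost of Die} + \text{Cost of Testing} + \text{Cost of Packaging}}{\text{Final Test Yield}}$$

$$\text{Cost of Die} = \frac{\text{Cost of Wafer}}{\text{Dies per Wafer} \times \text{Die Yield}}$$

$$\text{Dies per Wafer} = \frac{\pi \times (\text{Wafer Diameter}/2)^2}{\text{Die Area}} - (\text{Correction factor for Edge Effects})$$

$$\text{Die Yield} = \text{Wafer Yield} \times \left(1 + \frac{\text{Defects per unit area} \times \text{Die Area}}{\alpha}\right)^{-\alpha}$$

$$\text{Cost of Die} = f\left(\text{Die Area}^{5}\right), \text{ assuming } \alpha = 4$$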
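The die-cost formulas chain together naturally in code. The sketch below is one possible rendering, not a definitive model: the edge-effect correction is assumed to take the textbook form π × diameter / sqrt(2 × die area), which the slide does not spell out, and the wafer cost, wafer size, and defect density are made-up numbers.

```python
import math


def dies_per_wafer(wafer_diameter_cm: float, die_area_cm2: float) -> float:
    """Whole-wafer area over die area, minus an edge-effect correction.

    The correction term pi * d / sqrt(2 * A) is an assumption; the slide
    only names a "correction factor for edge effects".
    """
    wafer_area = math.pi * (wafer_diameter_cm / 2.0) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2.0 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss


def die_yield(defects_per_cm2: float, die_area_cm2: float,
              wafer_yield: float = 1.0, alpha: float = 4.0) -> float:
    """Wafer yield x (1 + defects * area / alpha) ** (-alpha)."""
    return wafer_yield * (1.0 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


def die_cost(wafer_cost: float, wafer_diameter_cm: float,
             die_area_cm2: float, defects_per_cm2: float) -> float:
    """Cost of wafer divided by the number of good dies per wafer."""
    good_dies = (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
                 * die_yield(defects_per_cm2, die_area_cm2))
    return wafer_cost / good_dies


# Hypothetical inputs: 30 cm wafer costing 5000, 0.6 defects per cm^2.
for area in (0.5, 1.0, 2.0, 4.0):
    print(f"die area {area} cm^2 -> cost per die {die_cost(5000.0, 30.0, area, 0.6):.2f}")
```

Doubling the die area much more than doubles the cost per die, because fewer dies fit on the wafer and a smaller share of them work.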
22Cost Breakdown
- Component Cost
- Microprocessor, SRAM, DRAM
- Disk
- Power supplies and packaging
- Direct Costs
- Manufacturing (labor)
- Warranty
- Indirect Costs or Gross Margin
- Research and Development
- Sales and Marketing
- Profits and Taxes
23Price
- Only loosely related to cost
- Start with all component costs
- Add 10% to 30% for direct costs
- Add 10% to 80% gross margin (indirect costs)
- AVERAGE SELLING PRICE
- Add discounts and dealer profit
- LIST PRICE
- Note
- $1 increase in component can imply
- $1.21 to $2.34 increase in ASP
- $1.57 to $3.04 increase in list price (if discount = 30% of ASP)
- Component cost = 33% to 64% of list price
- R&D is often 4% to 12% of list price
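The figures in the note follow from compounding the markups. The sketch below assumes each step is a percentage added on top of the previous total (direct costs on component cost, gross margin on that sum, discount on ASP), which is the reading that reproduces the slide's numbers; `price_multipliers` is an illustrative name.

```python
def price_multipliers(direct: float, margin: float, discount: float):
    """Return (ASP, list price) per unit of component cost.

    Each argument is a fractional markup applied to the running total.
    """
    asp = (1.0 + direct) * (1.0 + margin)
    list_price = asp * (1.0 + discount)
    return asp, list_price


# Low end: 10% direct, 10% margin. High end: 30% direct, 80% margin.
for label, direct, margin in (("low", 0.10, 0.10), ("high", 0.30, 0.80)):
    asp, list_price = price_multipliers(direct, margin, 0.30)
    share = 1.0 / list_price  # component cost as a share of list price
    print(f"{label}: ASP x{asp:.2f}, list x{list_price:.2f}, component share {share:.0%}")
```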