Title: Comparing CPU Time
1Comparing CPU Time
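$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$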
- A 500 MHz Pentium III processor takes 2 ms to run a program with 200,000 instructions.
- A 300 MHz UltraSparc processor takes 1.8 ms to run the same program with 230,000 instructions.
- What is the CPI for each processor for this program?
- CPI = Cycles / Instruction Count = CPU time x Clock Rate / Instruction Count
$$\text{CPI}_{\text{Pentium}} = \frac{2 \times 10^{-3} \times 500 \times 10^{6}}{2 \times 10^{5}} = 5.00$$
$$\text{CPI}_{\text{SPARC}} = \frac{1.8 \times 10^{-3} \times 300 \times 10^{6}}{2.3 \times 10^{5}} = 2.35$$
- Which processor is faster and by how much?
- The UltraSparc is 2/1.8 = 1.11 times as fast, or 11% faster.
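The arithmetic above can be checked with a short Python sketch. The function name `cpi` and the variable names are illustrative choices, not part of the original slides.

```python
def cpi(cpu_time_s, clock_rate_hz, instruction_count):
    """CPI = CPU time x clock rate / instruction count."""
    return cpu_time_s * clock_rate_hz / instruction_count

# Values taken from the example: run time, clock rate, instruction count.
cpi_pentium = cpi(2e-3, 500e6, 200_000)
cpi_sparc = cpi(1.8e-3, 300e6, 230_000)

# Both machines run the same program, so speedup is the ratio of run times.
speedup = 2e-3 / 1.8e-3

print(f"CPI Pentium III: {cpi_pentium:.2f}")
print(f"CPI UltraSparc:  {cpi_sparc:.2f}")
print(f"UltraSparc speedup: {speedup:.2f}x")
```

Note that the UltraSparc wins on run time even though it executes more instructions at a lower clock rate, because its CPI is less than half that of the Pentium III on this program.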
2Cycles Per Instruction
Average Cycles per Instruction
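CPU Time = Cycle Time x Number of cycles
$$\text{CPU time} = \text{CycleTime} \times \sum_{i=1}^{n} \text{CPI}_i \times \text{IC}_i$$
Instruction Frequency
$$\text{CPI} = \sum_{i=1}^{n} \text{CPI}_i \times F_i \quad \text{where} \quad F_i = \frac{\text{IC}_i}{\text{Instruction Count}}$$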
- Invest resources where time is spent!
3Example Calculating CPI
Base Machine (Reg / Reg), Typical Mix

Op        Freq   Cycles   F_i x CPI_i   (% Time)
ALU       50%    1        .5            (33%)
Load      20%    2        .4            (27%)
Store     10%    2        .2            (13%)
Branch    20%    2        .4            (27%)
Total                     1.5
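As a minimal sketch, the weighted-sum formula for CPI can be applied to this instruction mix in Python. The dictionary layout and the function names `average_cpi` and `time_fractions` are assumptions made for illustration.

```python
# Instruction mix for the base machine: op -> (frequency, cycles).
mix = {
    "ALU": (0.50, 1),
    "Load": (0.20, 2),
    "Store": (0.10, 2),
    "Branch": (0.20, 2),
}

def average_cpi(mix):
    """CPI = sum over instruction classes of CPI_i * F_i."""
    return sum(freq * cycles for freq, cycles in mix.values())

def time_fractions(mix):
    """Fraction of execution time spent in each instruction class."""
    total = average_cpi(mix)
    return {op: freq * cycles / total for op, (freq, cycles) in mix.items()}

print(f"CPI = {average_cpi(mix):.2f}")
for op, frac in time_fractions(mix).items():
    print(f"{op:7s} {frac:.0%} of time")
```

The time fractions, not the raw frequencies, show where optimization effort pays off: ALU operations are half of all instructions but only a third of the execution time.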
4Example
- Add register / memory ALU operations
- One source operand in memory
- One source operand in register
- Cycle count of 2
- Branch cycle count to increase to 3.
- What fraction of the loads must be eliminated for
this to pay off, assuming the clock rate is not
affected?
- Base Machine (Reg / Reg), Typical Mix

Op        Freq   Cycles
ALU       50%    1
Load      20%    2
Store     10%    2
Branch    20%    2
5Example Solution
- Exec Time = Instr Cnt x CPI x Clock

Op        Freq   Cycles   Freq x Cycles
ALU       .50    1        .5
Load      .20    2        .4
Store     .10    2        .2
Branch    .20    2        .4
Reg/Mem
Total     1.00            1.5
6Example Solution
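- Exec Time = Instr Cnt x CPI x Clock

          Old                             New
Op        Freq   Cycles   Freq x Cycles   Freq     Cycles   Freq x Cycles
ALU       .50    1        .5              .5 - X   1        .5 - X
Load      .20    2        .4              .2 - X   2        .4 - 2X
Store     .10    2        .2              .1       2        .2
Branch    .20    2        .4              .2       3        .6
Reg/Mem                                   X        2        2X
Total     1.00            1.5             1 - X             (1.7 - X)/(1 - X)

$$\text{CPI}_{\text{New}} = \frac{\text{Cycles}_{\text{New}}}{\text{Instructions}_{\text{New}}} = \frac{1.7 - X}{1 - X}$$
CPI_New must be normalized to the new instruction frequency. X is the fraction of register-memory instructions. Each register-memory instruction replaces one ALU instruction and one load, so both of those frequencies drop by X and the instruction count falls to 1 - X.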
7Example Solution
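$$\text{Instr Cnt}_{\text{Old}} \times \text{CPI}_{\text{Old}} \times \text{Clock}_{\text{Old}} = \text{Instr Cnt}_{\text{New}} \times \text{CPI}_{\text{New}} \times \text{Clock}_{\text{New}}$$
$$1.00 \times 1.5 = (1 - X) \times \frac{1.7 - X}{1 - X}$$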
8Example Solution
- Exec Time = Instr Cnt x CPI x Clock
- With the clock unchanged, the factor (1 - X) cancels in 1.00 x 1.5 = (1 - X) x (1.7 - X)/(1 - X):
$$1.5 = 1.7 - X$$
$$0.2 = X$$
- Loads are only 20% of the base mix, so ALL loads must be eliminated for this to be a win!
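The break-even point can be explored numerically. The sketch below assumes, as the example does, that each register-memory instruction replaces one ALU instruction and one load, and that the clock rate is unchanged. The names `new_cycles` and `relative_time` are illustrative.

```python
OLD_CYCLES = 1.5  # base machine: 1.00 instructions x CPI of 1.5

def new_cycles(x):
    """Total cycles on the new machine, per original instruction.

    x is the fraction of original instructions turned into
    register-memory instructions (each replaces one ALU op and one load).
    """
    alu = (0.5 - x) * 1
    load = (0.2 - x) * 2
    store = 0.1 * 2
    branch = 0.2 * 3      # branches now take 3 cycles
    reg_mem = x * 2
    return alu + load + store + branch + reg_mem

def relative_time(x):
    """New execution time divided by old execution time (same clock)."""
    return new_cycles(x) / OLD_CYCLES

for x in (0.0, 0.05, 0.10, 0.15, 0.20):
    print(f"X = {x:.2f}: relative execution time = {relative_time(x):.3f}")
```

Since loads are only 20% of the mix, X cannot exceed 0.2, and the relative time reaches 1.0 only at that limit. For any smaller X the slower branches cost more than the eliminated loads save.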
9Integrated Circuits Costs
$$\text{IC cost} = \frac{\text{Die cost} + \text{Testing cost} + \text{Packaging cost}}{\text{Final test yield}}$$
- Final test yield: the fraction of packaged dies that pass the final testing stage.
10Integrated Circuits Costs
$$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$$
- Die yield: the fraction of good dies on a wafer, before packaging.
11Integrated Circuits Costs
$$\text{Dies per wafer} = \frac{\pi \times (\text{Wafer\_diam}/2)^2}{\text{Die\_Area}} - \frac{\pi \times \text{Wafer\_diam}}{\sqrt{2 \times \text{Die\_Area}}} - \text{Test dies}$$
12Integrated Circuits Costs
$$\text{Die yield} = \text{Wafer yield} \times \left(1 + \frac{\text{Defects\_per\_unit\_area} \times \text{Die\_Area}}{\alpha}\right)^{-\alpha}$$
- Defects per unit area: 0.6 to 1.2
- Fabrication complexity $\alpha$: about 3
- Die cost increases roughly as (die area)^4
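The cost formulas from these slides can be collected into a small Python sketch. The function names are illustrative, and the inputs in the usage lines (a 20 cm wafer, a 1 cm^2 die, a wafer cost of 1500) are hypothetical values chosen only to exercise the formulas, not data from the slides.

```python
import math

def dies_per_wafer(wafer_diam_cm, die_area_cm2, test_dies=0):
    """Wafer area / die area, minus dies lost at the edge, minus test dies."""
    whole = math.pi * (wafer_diam_cm / 2) ** 2 / die_area_cm2
    edge_loss = math.pi * wafer_diam_cm / math.sqrt(2 * die_area_cm2)
    return whole - edge_loss - test_dies

def die_yield(defects_per_cm2, die_area_cm2, wafer_yield=1.0, alpha=3.0):
    """Wafer yield x (1 + defects x area / alpha) ** -alpha."""
    return wafer_yield * (1 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)

def die_cost(wafer_cost, dies, yield_fraction):
    """Wafer cost spread over the good dies only."""
    return wafer_cost / (dies * yield_fraction)

def ic_cost(die, testing, packaging, final_test_yield):
    """(Die + testing + packaging cost) / final test yield."""
    return (die + testing + packaging) / final_test_yield

# Hypothetical inputs, for illustration only.
n = dies_per_wafer(wafer_diam_cm=20, die_area_cm2=1.0)
y = die_yield(defects_per_cm2=1.0, die_area_cm2=1.0)
print(f"dies per wafer: {n:.1f}, die yield: {y:.3f}")
print(f"die cost: {die_cost(1500, n, y):.2f}")
```

Growing the die hurts twice: fewer dies fit on the wafer and a smaller fraction of them are good, which is why die cost rises so steeply with area.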
13Real World Examples
Chip          Metal    Line width   Wafer      Defects   Area    Dies/   Yield   Die
              layers   (microns)    cost ($)   /cm2      (mm2)   wafer   (%)     cost ($)
386DX         2        0.90         900        1.0       43      360     71      4
486DX2        3        0.80         1200       1.0       81      181     54      12
PowerPC 601   4        0.80         1700       1.3       121     115     28      53
HP PA 7100    3        0.80         1300       1.0       196     66      27      73
DEC Alpha     3        0.70         1500       1.2       234     53      19      149
SuperSPARC    3        0.70         1700       1.6       256     48      13      272
Pentium       3        0.80         1500       1.5       296     40      9       417

- From "Estimating IC Manufacturing Costs," by Linley Gwennap, Microprocessor Report, August 2, 1993, p. 15.
- New products end up being much more expensive to manufacture.
14Other Costs
$$\text{Die test cost} = \frac{\text{Test cost} \times \text{Ave. test time}}{\text{Die yield}}$$
- Packaging cost: depends on pins, heat dissipation, appearance, ...

Chip          Die        Package                       Test &         Total
              cost ($)   pins    type    cost ($)      Assembly ($)   ($)
386DX         4          132     QFP     1             4              9
486DX2        12         168     PGA     11            12             35
PowerPC 601   53         304     QFP     3             21             77
HP PA 7100    73         504     PGA     35            16             124
DEC Alpha     149        431     PGA     30            23             202
SuperSPARC    272        293     PGA     20            34             326
Pentium       417        273     PGA     19            37             473
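A quick consistency check of the table: each total should be the sum of die cost, package cost, and test and assembly cost. The tuple layout below is an illustrative choice; the numbers are the table's own.

```python
# (chip, die cost, package cost, test & assembly cost, listed total)
rows = [
    ("386DX", 4, 1, 4, 9),
    ("486DX2", 12, 11, 12, 35),
    ("PowerPC 601", 53, 3, 21, 77),
    ("HP PA 7100", 73, 35, 16, 124),
    ("DEC Alpha", 149, 30, 23, 202),
    ("SuperSPARC", 272, 20, 34, 326),
    ("Pentium", 417, 19, 37, 473),
]

# Recompute each total and record the die's share of it.
computed = {chip: die + pkg + test for chip, die, pkg, test, _ in rows}
die_share = {chip: die / total for chip, die, _, _, total in rows}

for chip, die, pkg, test, total in rows:
    print(f"{chip:12s} total {computed[chip]:4d} (listed {total}), "
          f"die is {die_share[chip]:.0%} of total")
```

The die's share of the total grows with die size: it is under half for the 386DX but dominates for the largest chips.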
15Chip Prices (August 1993)
- Assume purchase of 10,000 units

Chip          Area    Mfg.       Price   Multiplier   Comment
              (mm2)   cost ($)   ($)
386DX         43      9          31      3.4          Intense competition
486DX2        81      35         245     7.0          No competition
PowerPC 601   121     77         280     3.6          Gain market share
DEC Alpha     234     202        1231    6.1          Recoup R&D
Pentium       296     473        965     2.0          Early in shipments
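The multiplier column is simply price divided by manufacturing cost, as the following sketch shows. The list layout is an illustrative choice; the figures are the table's own.

```python
# (chip, manufacturing cost, price) from the August 1993 table.
chips = [
    ("386DX", 9, 31),
    ("486DX2", 35, 245),
    ("PowerPC 601", 77, 280),
    ("DEC Alpha", 202, 1231),
    ("Pentium", 473, 965),
]

# Multiplier = price / manufacturing cost.
multipliers = {name: price / cost for name, cost, price in chips}

for name, m in multipliers.items():
    print(f"{name:12s} multiplier = {m:.1f}")
```

The spread in multipliers reflects market position rather than manufacturing cost, which is the point of the Comment column.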
16Power Dissipation
Source: Intel
- Lead processor power increases every generation
- Compactions provide higher performance at lower
power
17Workstation Costs
- DRAM: 50% to 55%
- Color monitor: 15% to 20%
- CPU board: 10% to 15%
- Hard disk: 8% to 10%
- CPU cabinet: 3% to 5%
- Video & other I/O: 3% to 7%
- Keyboard, mouse: 1% to 2%
18Volume vs. Cost
- Rule of thumb on applying learning curve to manufacturing: "When volume doubles, costs reduce 10%"
- A DEC View of Computer Engineering by C. G. Bell, J. C. Mudge, and J. E. McNamara, Digital Press, Bedford, MA, 1978.
- 40 MPPs @ 200 nodes = 8,000 nodes/year vs. 100,000 workstations/year
- $2^X = 100{,}000 / 8{,}000 \Rightarrow X = 3.6$
- Since doubling volume reduces cost by 10%, costs reduce to $(0.9)^{3.6} = 0.68$ of the original price (about 1/3 less expensive).
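The rule of thumb translates directly into code: count the doublings with a base-2 logarithm, then apply a 10% reduction per doubling. This is a sketch; `cost_factor` and its parameter names are illustrative.

```python
import math

def cost_factor(volume_ratio, reduction_per_doubling=0.10):
    """Relative cost after scaling volume by volume_ratio.

    Each doubling of volume multiplies cost by (1 - reduction_per_doubling).
    """
    doublings = math.log2(volume_ratio)
    return (1 - reduction_per_doubling) ** doublings

mpp_nodes = 40 * 200       # 40 MPPs at 200 nodes each
workstations = 100_000     # workstations per year

ratio = workstations / mpp_nodes
doublings = math.log2(ratio)
factor = cost_factor(ratio)

print(f"volume ratio = {ratio:.1f}, doublings = {doublings:.1f}")
print(f"cost factor = {factor:.2f}")
```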
19Volume vs. Cost PCs vs. Workstations
         1990         1992         1994         1997
PC       23,880,898   33,547,589   44,006,000   65,480,000
WS       407,624      584,544      679,320      978,585
Ratio    59           57           65           67

- $2^X = 65 \Rightarrow X = 6.0$ and $(0.9)^{6.0} = 0.53$
- PC costs are 47% less than workstation costs for whole market.
- Single company: 20% WS market vs. 10% PC market
- Ratio: 29, 29, 32, 33
- $2^X = 32 \Rightarrow X = 5.0$ and $(0.9)^{5.0} = 0.59$
- PCs cost 41% less than workstations for single company.
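The ratios and cost factors on this slide can be recomputed from the shipment figures. The sketch below uses the 1994 column for the cost comparison, as the slide does; the names and the dictionary layout are illustrative.

```python
import math

# Unit shipments per year: year -> (PCs, workstations).
shipments = {
    1990: (23_880_898, 407_624),
    1992: (33_547_589, 584_544),
    1994: (44_006_000, 679_320),
    1997: (65_480_000, 978_585),
}

def cost_factor(volume_ratio, reduction_per_doubling=0.10):
    """Relative cost after scaling volume by volume_ratio."""
    return (1 - reduction_per_doubling) ** math.log2(volume_ratio)

# Whole market: all PCs vs. all workstations.
market_ratio = {yr: pc / ws for yr, (pc, ws) in shipments.items()}
# Single company: 10% of the PC market vs. 20% of the WS market.
company_ratio = {yr: (0.10 * pc) / (0.20 * ws) for yr, (pc, ws) in shipments.items()}

market_factor = cost_factor(market_ratio[1994])
company_factor = cost_factor(company_ratio[1994])

print({yr: round(r) for yr, r in market_ratio.items()})
print({yr: round(r) for yr, r in company_ratio.items()})
print(f"whole market: {market_factor:.2f}, single company: {company_factor:.2f}")
```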
20Learning Curve
[Figure: learning curve. Surviving labels: production costs, volume, years, time to introduce new product.]
21High Margins on High-End Machines
- R&D considered return on investment
- Most companies spend 4% to 12% of income on R&D.
- Every $1 of R&D must generate $8 to $25 in sales
- High end machines need more for R&D
- Sell fewer high end machines
- Fewer to amortize R&D
- Much higher cost margins
- Cost of 1 MB Memory (January 1994)
- PC: $40 (Mac Quadra)
- WS: $42 (SS-10)
- Mainframe: $1920 (IBM 3090)
- Supercomputer: $600 (M90 DRAM), $1375 (C90 15 ns SRAM)
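The "$8 to $25 in sales" figure follows from the 4% to 12% R&D spending range: if a fraction f of income goes to R&D, each R&D dollar corresponds to 1/f dollars of income. The sketch below assumes "income" on the slide means sales revenue; the function name is illustrative.

```python
def sales_per_rd_dollar(rd_fraction):
    """Sales needed per R&D dollar when rd_fraction of sales goes to R&D."""
    return 1.0 / rd_fraction

low_spender = sales_per_rd_dollar(0.04)    # company spending 4% on R&D
high_spender = sales_per_rd_dollar(0.12)   # company spending 12% on R&D

print(f"4% on R&D:  ${low_spender:.0f} in sales per R&D dollar")
print(f"12% on R&D: ${high_spender:.0f} in sales per R&D dollar")
```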
22Cost/Performance: What is the Relationship of Cost to Price?
- Component Costs
- Direct Costs (add 25% to 40%): recurring costs: labor, purchasing, scrap, warranty
- Gross Margin (add 82% to 186%): nonrecurring costs: R&D, marketing, sales, equipment maintenance, rental, financing cost, pretax profits, taxes
- Average Discount to get List Price (add 33% to 66%): volume discounts and/or retailer markup
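[Figure: breakdown of list price. Surviving labels and shares, in extraction order: List Price; 25% to 40%; Avg. Selling Price; 34% to 39%; 6% to 8%; Direct Cost; 15% to 33%.]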
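The build-up from component cost to list price can be sketched as successive markups. This assumes each "add" percentage applies to the running total from the previous step, which the slide does not state explicitly; the function and parameter names are illustrative.

```python
def list_price(component_cost, direct_add, gross_margin_add, discount_add):
    """Apply the three markups in sequence (assumed to compound)."""
    with_direct = component_cost * (1 + direct_add)
    avg_selling_price = with_direct * (1 + gross_margin_add)
    return avg_selling_price * (1 + discount_add)

# Low and high ends of the ranges quoted above, per $1 of components.
low = list_price(1.0, 0.25, 0.82, 0.33)
high = list_price(1.0, 0.40, 1.86, 0.66)

print(f"list price per component dollar: {low:.2f} to {high:.2f}")
print(f"component share of list price: {1 / high:.0%} to {1 / low:.0%}")
```

Under that compounding assumption, components come out at roughly 15% to 33% of list price, which agrees with the last range among the figure labels.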
23Summary Price vs. Cost