1
Introduction to Hardware/Architecture
  • David A. Patterson

http://cs.berkeley.edu/patterson/talks
patterson@cs.berkeley.edu
EECS, University of California, Berkeley, CA 94720-1776
2
Technology Trends: Microprocessor Capacity
  • Alpha 21264: 15 million transistors
  • Pentium Pro: 5.5 million
  • PowerPC 620: 6.9 million
  • Alpha 21164: 9.3 million
  • Sparc Ultra: 5.2 million
  • 2X transistors/chip every 1.5 years, called "Moore's Law"
3
Technology Trends: Processor Performance
  • Processor performance increases 1.54X/yr
  • This yearly performance increase is mistakenly referred to as Moore's Law (which is about transistors/chip)
4
5 components of any Computer
  • Processor (active): Control ("brain") and Datapath ("brawn")
  • Memory (passive): where programs and data live when running
  • Devices: Input (Keyboard, Mouse)
  • Devices: Output (Display, Printer)
  • Devices: Disk, Network
5
Computer Technology => Dramatic Change
  • Processor
  • 2X in speed every 1.5 years; 1000X performance in last 15 years
  • Memory
  • DRAM capacity: 2X / 1.5 years; 1000X size in last 15 years
  • Cost per bit: improves about 25% per year
  • Disk
  • Capacity: > 2X in size every 1.5 years
  • Cost per bit: improves about 60% per year
  • 120X size in last decade
  • State-of-the-art PC when you graduate (1997-2001)
  • Processor clock speed: 1500 MegaHertz (1.5 GigaHertz)
  • Memory capacity: 500 MegaBytes (0.5 GigaBytes)
  • Disk capacity: 100 GigaBytes (0.1 TeraBytes)
  • New units! Mega => Giga, Giga => Tera
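The "2X every 1.5 years" and "1000X in 15 years" figures above are the same claim stated two ways, which a few lines of arithmetic can confirm. This is a minimal sketch; the function names are illustrative and not from the slides.

```python
def growth_factor(doubling_period_years: float, span_years: float) -> float:
    """Total growth over span_years for a quantity that doubles every doubling_period_years."""
    return 2.0 ** (span_years / doubling_period_years)


def annual_multiplier(doubling_period_years: float) -> float:
    """Equivalent per-year multiplier for the same doubling period."""
    return 2.0 ** (1.0 / doubling_period_years)


# Doubling every 1.5 years for 15 years is 10 doublings.
fifteen_year_growth = growth_factor(1.5, 15)
per_year = annual_multiplier(1.5)

print(f"15-year growth: {fifteen_year_growth:.0f}X")
print(f"per-year multiplier: {per_year:.2f}X")
```

Ten doublings give 1024X, which the slide rounds to 1000X; the per-year multiplier comes out a little under 1.6X.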

6
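Integrated Circuit Costs
$$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$$
(Figure: wafer showing dies and flaws)
Die cost goes roughly with the cube of the die area: a larger die means fewer dies per wafer, and yield gets worse as die area grows.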
7
Die Yield (1993 data)
Raw Dies Per Wafer
wafer diameter   die area (mm2):  100  144  196  256  324  400
6"/15cm                           139   90   62   44   32   23
8"/20cm                           265  177  124   90   68   52
10"/25cm                          431  290  206  153  116   90
die yield                         23%  19%  16%  12%  11%  10%
typical CMOS process: α = 2, wafer yield = 90%, defect density = 2/cm2, 4 test sites/wafer
Good Dies Per Wafer (Before Testing!)
6"/15cm                            31   16    9    5    3    2
8"/20cm                            59   32   19   11    7    5
10"/25cm                           96   53   32   20   13    9
typical cost of an 8", 4 metal layers, 0.5um CMOS wafer: $2000
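The slide does not say how the raw die counts were computed. One common textbook approximation (an assumption here, not taken from the slide) is wafer area divided by die area, minus an edge-loss term, minus the test sites. The sketch below applies it with the 4 test sites per wafer quoted above; the function name is illustrative.

```python
import math


def raw_dies_per_wafer(wafer_diameter_cm: float, die_area_mm2: float, test_sites: int = 4) -> int:
    """Approximate whole dies per wafer (assumed textbook formula, not from the slide)."""
    die_area_cm2 = die_area_mm2 / 100.0
    # Dies that would fit if the wafer were a perfect tiling of dies.
    wafer_area_term = math.pi * (wafer_diameter_cm / 2.0) ** 2 / die_area_cm2
    # Partial dies lost around the circular edge.
    edge_loss_term = math.pi * wafer_diameter_cm / math.sqrt(2.0 * die_area_cm2)
    return math.floor(wafer_area_term - edge_loss_term - test_sites)


die_areas_mm2 = (100, 144, 196, 256, 324, 400)
for diameter_cm in (15, 20, 25):
    counts = [raw_dies_per_wafer(diameter_cm, area) for area in die_areas_mm2]
    print(f"{diameter_cm} cm wafer: {counts}")
```

Under these assumptions the computed counts agree with the raw counts listed on the slide, which suggests (but does not prove) that a formula of this shape was behind the table.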
8
1993 Real World Examples
Chip          Metal   Line width  Wafer     Defects  Area   Dies/  Yield  Die
              layers  (um)        cost ($)  /cm2     (mm2)  wafer  (%)    cost ($)
386DX         2       0.90         900      1.0       43    360    71       4
486DX2        3       0.80        1200      1.0       81    181    54      12
PowerPC 601   4       0.80        1700      1.3      121    115    28      53
HP PA 7100    3       0.80        1300      1.0      196     66    27      73
DEC Alpha     3       0.70        1500      1.2      234     53    19     149
SuperSPARC    3       0.70        1700      1.6      256     48    13     272
Pentium       3       0.80        1500      1.5      296     40     9     417
From "Estimating IC Manufacturing Costs," by Linley Gwennap, Microprocessor Report, August 2, 1993, p. 15
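The die costs in this table follow from the earlier relation: die cost = wafer cost / (dies per wafer × die yield). A minimal sketch that recomputes each row from the table's own wafer cost, dies per wafer, and yield columns (the variable and function names are illustrative):

```python
def die_cost(wafer_cost: float, dies_per_wafer: int, die_yield: float) -> float:
    """Cost per good die: the wafer cost spread over the dies that actually work."""
    return wafer_cost / (dies_per_wafer * die_yield)


# (chip, wafer cost in $, dies per wafer, yield as a fraction, die cost listed on the slide)
chips = [
    ("386DX", 900, 360, 0.71, 4),
    ("486DX2", 1200, 181, 0.54, 12),
    ("PowerPC 601", 1700, 115, 0.28, 53),
    ("HP PA 7100", 1300, 66, 0.27, 73),
    ("DEC Alpha", 1500, 53, 0.19, 149),
    ("SuperSPARC", 1700, 48, 0.13, 272),
    ("Pentium", 1500, 40, 0.09, 417),
]

for name, wafer_cost, dies, die_yield, listed in chips:
    computed = die_cost(wafer_cost, dies, die_yield)
    print(f"{name:12s} computed ${computed:7.2f}   listed ${listed}")
```

Rounded to the nearest dollar, every computed value agrees with the listed die cost. The Pentium row shows the effect of area most clearly: 40 dies per wafer at 9% yield leaves fewer than 4 good dies to carry a $1500 wafer.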

9
Processor Trends / History
  • History of innovations that delivered 2X performance / 1.5 yr
  • Pipelining (helps seconds / clock, or clock rate)
  • Out-of-Order Execution (helps clocks / instruction)
  • Superscalar (helps clocks / instruction)
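The two ratios named above, seconds/clock and clocks/instruction, are factors of the usual execution-time product: time = instructions × clocks/instruction × seconds/clock. The sketch below uses made-up numbers (1 billion instructions, 2 clocks/instruction, 2 ns clock) purely for illustration; none of them come from the slides.

```python
def cpu_time_seconds(instructions: float, clocks_per_instruction: float, seconds_per_clock: float) -> float:
    """Execution time as the product of the three factors."""
    return instructions * clocks_per_instruction * seconds_per_clock


# Hypothetical baseline machine.
baseline = cpu_time_seconds(1e9, 2.0, 2e-9)

# Pipelining attacks seconds/clock: assume the clock period is halved.
faster_clock = cpu_time_seconds(1e9, 2.0, 1e-9)

# Out-of-order and superscalar attack clocks/instruction: assume it is halved.
lower_cpi = cpu_time_seconds(1e9, 1.0, 2e-9)

print(f"baseline:     {baseline:.1f} s")
print(f"faster clock: {faster_clock:.1f} s")
print(f"lower CPI:    {lower_cpi:.1f} s")
```

Halving either factor halves the run time, which is why the innovations listed above target different terms of the same product.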

10
Pipelining is Natural!
  • Laundry Example
  • Ann, Brian, Cathy, Dave each have one load of
    clothes to wash, dry, fold, and put away
  • Washer takes 30 minutes
  • Dryer takes 30 minutes
  • Folder takes 30 minutes
  • Stasher takes 30 minutes to put clothes into drawers

11
Sequential Laundry
(Figure: timeline from 6 PM to 2 AM; task order A, B, C, D; each load runs its four 30-minute stages before the next load starts)
  • Sequential laundry takes 8 hours for 4 loads

12
Pipelined Laundry: Start work ASAP
(Figure: timeline from 6 PM to 2 AM; task order A, B, C, D; loads overlap, spanning seven 30-minute slots)
  • Pipelined laundry takes 3.5 hours for 4 loads!
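The 8-hour and 3.5-hour totals for the laundry example can be checked directly. A minimal sketch, assuming every stage takes the same time and a new load can start as soon as the washer is free (function names are illustrative):

```python
def sequential_minutes(loads: int, stages: int, stage_minutes: int) -> int:
    """Each load finishes all of its stages before the next load starts."""
    return loads * stages * stage_minutes


def pipelined_minutes(loads: int, stages: int, stage_minutes: int) -> int:
    """The first load fills the pipeline; each later load adds one more stage time."""
    return (stages + loads - 1) * stage_minutes


# Four loads (A to D), four stages (wash, dry, fold, stash), 30 minutes per stage.
seq = sequential_minutes(4, 4, 30)
pipe = pipelined_minutes(4, 4, 30)

print(f"sequential: {seq / 60} hours")
print(f"pipelined:  {pipe / 60} hours")
print(f"speedup:    {seq / pipe:.2f}X")
```

Pipelining does not shorten any single load, which still takes 2 hours; it raises throughput. With only 4 loads the speedup is about 2.3X, and it approaches 4X (the number of stages) as the number of loads grows.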

13
Pipeline Hazard: Stall
(Figure: timeline from 6 PM to 2 AM; task order A through F)
  • A depends on D; stall since folder is tied up

14
Out-of-Order Laundry: Don't Wait
(Figure: timeline from 6 PM to 2 AM in 30-minute slots; task order A through F)
  • A depends on D; the rest continue; need more resources to allow out-of-order

15
Superscalar Laundry: Parallel per stage
(Figure: timeline from 6 PM to 2 AM; several loads occupy the same stage at once)
  • More resources; does the HW match the mix of parallel tasks?

16
Superscalar Laundry: Mismatch Mix
(Figure: timeline from 6 PM to 2 AM in 30-minute slots; loads labeled light clothing, dark clothing, light clothing)
  • Task mix underutilizes extra resources

17
State of the Art: Alpha 21264
  • 15M transistors
  • 2 64KB caches on chip; 16MB L2 cache off chip
  • Clock < 1.7 nsec, or > 600 MHz
  • 90 watts
  • Superscalar: fetches up to 6 instructions/clock cycle, retires up to 4 instructions/clock cycle
  • Execution: out-of-order

18
Other example: Sony Playstation 2
  • Emotion Engine: 6.2 GFLOPS, 75 million polygons per second (Microprocessor Report, 13:5)
  • Superscalar MIPS core + vector coprocessor + graphics/DRAM
  • Claim: "Toy Story" realism brought to games

19
The Goal: Illusion of large, fast, cheap memory
  • Fact: Large memories are slow, fast memories are small
  • How do we create a memory that is large, cheap and fast (most of the time)?
  • Hierarchy of Levels
  • Similar to Principle of Abstraction: hide details of multiple levels

20
Hierarchy Analogy: Term Paper
  • Working on paper in library at a desk
  • Option 1: Every time you need a book
  • Leave desk to go to shelves (or stacks)
  • Find the book
  • Bring one book back to desk
  • Read the section you are interested in
  • When done with section, leave desk and go to shelves carrying book
  • Put the book back on shelf
  • Return to desk to work
  • Next time you need a book, go to first step

21
Hierarchy Analogy: Library
  • Option 2: Every time you need a book
  • Leave some books on desk after fetching them
  • Only go to shelves when you need a new book
  • When you go to shelves, bring back related books in case you need them; sometimes you'll need to return books not used recently to make space for new books on desk
  • Return to desk to work
  • When done, replace books on shelves, carrying as many as you can per trip
  • Illusion: whole library on your desktop
  • Buzzword: "cache", from French for hidden treasure

22
Why Hierarchy works: Natural Locality
  • The Principle of Locality
  • Programs access a relatively small portion of the address space at any instant of time.
  • What programming constructs lead to Principle of Locality?

23
Memory Hierarchy: How Does it Work?
  • Temporal Locality (Locality in Time)
  • => Keep most recently accessed data items closer to the processor
  • Library Analogy: Recently read books are kept on desk
  • Block is unit of transfer (like book)
  • Spatial Locality (Locality in Space)
  • => Move blocks consisting of contiguous words to the upper levels
  • Library Analogy: Bring back nearby books on shelves when you fetch a book; hope that you might need them later for your paper
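Both kinds of locality can be seen in a toy simulation. The sketch below is not from the slides: it models the "desk" as a small cache of blocks with least-recently-used replacement (one possible policy, chosen here for simplicity) and replays a loop that walks a 64-word array four times. All names and sizes are illustrative.

```python
from collections import OrderedDict


def hit_rate(addresses, num_blocks, block_size):
    """Fraction of accesses found in a small LRU cache that holds num_blocks blocks."""
    cache = OrderedDict()  # block number -> None, ordered from least to most recently used
    hits = 0
    for address in addresses:
        block = address // block_size  # fetching one word brings in its whole block
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # mark as most recently used
        else:
            if len(cache) >= num_blocks:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = None
    return hits / len(addresses)


# A loop that sweeps a 64-word array four times.
trace = list(range(64)) * 4

no_blocks = hit_rate(trace, num_blocks=4, block_size=1)   # one word per block
spatial = hit_rate(trace, num_blocks=4, block_size=8)     # neighbors arrive together
both = hit_rate(trace, num_blocks=8, block_size=8)        # whole array fits on the "desk"

print(no_blocks, spatial, both)
```

With one-word blocks and a desk too small for the array, every access misses. Fetching 8-word blocks turns 7 of every 8 accesses into hits (spatial locality), and once the whole array fits, later sweeps hit every time (temporal locality).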

24
Memory Hierarchy Pyramid
  • Levels in memory hierarchy
(Figure: pyramid of levels, ending at Level n)
(data cannot be in level i unless it is also in level i+1)
25
Big Idea of Memory Hierarchy
  • Temporal locality: keep recently accessed data items closer to processor
  • Spatial locality: move contiguous words in memory to upper levels of hierarchy
  • Uses smaller and faster memory technologies close to the processor
  • Fast hit time in highest level of hierarchy
  • Cheap, slow memory furthest from processor
  • If hit rate is high enough, hierarchy has access time close to the highest (and fastest) level and size equal to the lowest (and largest) level
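The last point can be made concrete with the usual two-level average access time estimate: hit time + miss rate × miss penalty. The formula is the standard textbook form rather than something stated on the slide, and the 2 ns hit time and 100 ns miss penalty below are made-up values for illustration.

```python
def average_access_time_ns(hit_time_ns: float, hit_rate: float, miss_penalty_ns: float) -> float:
    """Average access time for a two-level hierarchy."""
    miss_rate = 1.0 - hit_rate
    return hit_time_ns + miss_rate * miss_penalty_ns


# Hypothetical fast level (2 ns) backed by a slow level (100 ns extra on a miss).
for rate in (0.50, 0.90, 0.99):
    print(f"hit rate {rate:.0%}: {average_access_time_ns(2.0, rate, 100.0):.1f} ns")
```

At a 99% hit rate the average is about 3 ns, close to the fast level alone, while at 50% it is 52 ns, which is why the hierarchy only pays off when locality keeps the hit rate high.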

26
Disk Description / History
(Figure: disk drive, labeled Platter, Track, Sector, Cylinder, Arm, Head, Track Buffer, Embed. Proc. (ECC, SCSI))
  • 1973: 1.7 Mbit/sq. in., 140 MBytes
  • 1979: 7.7 Mbit/sq. in., 2,300 MBytes
source: New York Times, 2/23/98, page C3, "Makers of disk drives crowd even more data into even smaller spaces"
27
Disk History
  • 1989: 63 Mbit/sq. in., 60,000 MBytes
  • 1997: 1450 Mbit/sq. in., 2300 MBytes (2.5" diameter)
  • 1997: 3090 Mbit/sq. in., 8100 MBytes (3.5" diameter)
  • 2000: 10,100 Mbit/sq. in., 25,000 MBytes
  • 2000: 11,000 Mbit/sq. in., 73,400 MBytes
source: N.Y. Times, 2/23/98, page C3
28
State of the Art: Ultrastar 72ZX
  • 73.4 GB, 3.5 inch disk
  • 2 cents/MB
  • 16 MB track buffer
  • 11 platters, 22 surfaces
  • 15,110 cylinders
  • 7 Gbit/sq. in. areal density
  • 17 watts (idle)
  • 0.1 ms controller time
  • 5.3 ms avg. seek (seek 1 track => 0.6 ms)
  • 3 ms 1/2 rotation
  • 37 to 22 MB/s to media
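The timing figures above combine into a rough per-request access time: average seek + half a rotation + transfer time + controller time. The sketch below uses the slide's numbers; the 8 KB request size and the 1 MB = 10^6 bytes convention are assumptions for illustration.

```python
def access_time_ms(seek_ms, half_rotation_ms, controller_ms, request_bytes, media_mb_per_s):
    """Rough time to service one random request, in milliseconds."""
    # Assumes 1 MB = 10**6 bytes for the media transfer rate.
    transfer_ms = request_bytes / (media_mb_per_s * 1e6) * 1e3
    return seek_ms + half_rotation_ms + transfer_ms + controller_ms


REQUEST_BYTES = 8 * 1024  # hypothetical 8 KB request, not from the slide

outer_tracks = access_time_ms(5.3, 3.0, 0.1, REQUEST_BYTES, 37.0)  # fastest media rate
inner_tracks = access_time_ms(5.3, 3.0, 0.1, REQUEST_BYTES, 22.0)  # slowest media rate

# Half a rotation in 3 ms implies a full rotation in 6 ms.
rotations_per_minute = 60_000 / (2 * 3.0)

print(f"outer tracks: {outer_tracks:.2f} ms")
print(f"inner tracks: {inner_tracks:.2f} ms")
print(f"implied spindle speed: {rotations_per_minute:.0f} RPM")
```

For a small request, seek and rotation dominate: the transfer itself takes well under half a millisecond at either media rate, so the total stays near 8.4 ms plus a fraction.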

source: www.ibm.com; www.pricewatch.com; 2/14/00
29
A glimpse into the future?
  • IBM microdrive for digital cameras
  • 340 Mbytes
  • Disk target in 5-7 years?

30
Questions?
  • Contact us if you're interested: email patterson@cs.berkeley.edu, http://iram.cs.berkeley.edu/