Fundamentals of Computer Design
Transcript and Presenter's Notes
1
Fundamentals of Computer Design
2
Outline
  • Performance Evolution
  • The Task of a Computer Designer
  • Technology and Computer Usage Trends
  • Cost and Trends in Cost
  • Measuring and Reporting Performance
  • Quantitative Principles of Computer Design

3
Computer Architecture Is
  • The attributes of a computing system as seen by
    the programmer, i.e., the conceptual structure
    and functional behavior, as distinct from the
    organization of the data flows and controls, the
    logic design, and the physical implementation.
    (Amdahl, Blaauw, and Brooks, 1964)

4
Computer Architecture's Changing Definition
  • 1950s to 1960s: Computer Architecture Course
  • Computer Arithmetic
  • 1970s to mid-1980s: Computer Architecture Course
  • Instruction Set Design, especially ISA
    appropriate for compilers
  • 1990s to 2000s: Computer Architecture Course
  • Design of CPU, memory system, I/O system,
    Multiprocessors

5
Performance Evolution
  • $1K today buys a gizmo better than $1M could buy
    in 1965.
  • 1970s
  • Mainframes dominated; performance improved
    25-30%/yr
  • Mostly due to improved architecture, with some
    technology aids
  • 1980s
  • VLSI microprocessor became the foundation
  • Technology improves at 35%/yr
  • Death of machine-language programming → an opportunity
  • Mostly with UNIX and C in mid-80s
  • Even most system programmers gave up assembly
    language
  • With this came the need for efficient compilers

6
Performance Evolution (Cont.)
  • 1980s (Cont.)
  • Compiler focus brought on the great CISC vs. RISC
    debate
  • With the exception of Intel, RISC won the
    argument
  • RISC performance improved by 50%/year initially
  • Of course, RISC is not as simple anymore, and the
    compiler is a key part of the game
  • It does not matter how fast your computer is if the
    compiler wastes most of it through an inability
    to generate efficient code
  • With the exploitation of instruction-level
    parallelism (pipelining, superscalar) and the use
    of caches, performance is further enhanced

CISC = Complex Instruction Set Computing. RISC = Reduced Instruction
Set Computing (or, as the joke goes, Relegate Important Stuff to the
Compiler).
7
Growth in Performance (Figure 1.1)
[Chart: growth in microprocessor performance over time; early gains
were technology driven, later gains mainly due to advanced
architecture ideas]
8
The Task of a Computer Designer
9
Aspects of Computer Design
  • Changing face of computing → different system
    design issues
  • Desktop computing
  • Servers
  • Embedded computers
  • Bottom line is that it is a complex game
  • Determine important attributes (perhaps a market
    issue)
  • Functional Requirement
  • THEN maximize performance
  • WHILE staying within the cost and power
    constraints
  • Classic conflicting constraint problem

10
A Summary of the Three Computing Classes and
Their System Characteristics
11
Functional Requirements
12
Functional Requirements (Cont.)
[Table: Functional Requirement vs. Typical Features Required or
Supported]
13
Aspects of Computer Design
[Diagram: the design space spans Software and Hardware; our focus is
Architecture and Implementation, where Implementation covers VLSI,
logic, power, and packaging]

14
The Task of a Computer Designer
15
The Task of a Computer Designer
Multiprocessors: Networks and Interconnections
[Diagram: processor (P) / memory (M) nodes attached through network
interfaces to an interconnection network (S = switch); design issues
include shared memory vs. message passing vs. data parallelism,
processor-memory-switch organization, topologies, routing, bandwidth,
latency, and reliability]
16
Optimizing the Design
  • Usually the functional requirements are set by
    the company/marketplace
  • Which design is optimal depends on the choice
    of metric
  • Cost minimized → simple design
  • Performance maximized → complex design or better
    technology
  • Time to market minimized → also favors simplicity
  • Oh, and you only get one shot
  • Requires heaps of simulation, and you must quantify
    everything
  • Inherent requirements for deep infrastructure and
    support
  • Plus you must predict the trends

17
Key trends that must always be tracked
  • Usage patterns and the market
  • Technology
  • Cost and performance

18
Technology and Computer Usage Trends
19
Usage Trends
  • Memory usage
  • Average program needs grow by 50% to 100%/year
  • Impact: add an address bit each year
    (instruction set)
  • Assembly language replaced by HLL
  • Increasingly important role of compilers
  • Compiler and architecture types MUST now work
    together
  • Whacked on pictures - even TV
  • Graphics and multimedia capability
  • Whacked on communications
  • I/O subsystems become a higher priority

20
Technology Trends
  • Integrated Circuits
  • Density increases at 35%/yr
  • Die size increases 10-20%/yr
  • Combination is a chip complexity growth rate of
    55%/yr
  • Transistor speed increase is similar, but signal
    propagation does not track this curve, so clock
    rates don't go up as fast
  • Semiconductor DRAM
  • Density quadruples every 3 years (approx. 60%/yr)
    in 4x steps
  • Cycle time decreases slowly - 33% in 10 years
  • Interface changes have improved bandwidth
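
As a sanity check, the 55%/yr complexity figure is just the density
and die-size rates compounding. A minimal Python sketch using the
rates quoted above:

    # Compound annual growth: ~35%/yr density and 10-20%/yr die size
    # multiply into chip complexity growth of roughly 50-60%/yr.
    density_growth = 1.35
    for die_growth in (1.10, 1.15, 1.20):
        complexity = density_growth * die_growth
        print(f"die size +{die_growth - 1:.0%}/yr -> "
              f"complexity +{complexity - 1:.0%}/yr")

    # DRAM: quadrupling every 3 years is 4**(1/3) per year, i.e. ~60%/yr
    print(f"DRAM density: +{4 ** (1 / 3) - 1:.0%}/yr")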

21
Technology Trends (Cont.)
  • Magnetic Disk
  • Currently density improves at 100%/yr
  • Access time has improved by 33% in 10 years
  • Network Technology
  • Depends both on the performance of switches and
    of the transmission system
  • 1-Gbit Ethernet became available about 5 years
    after 100-Mbit
  • Doubling in bandwidth every year
  • Scaling of transistor performance, wires, and
    power in ICs

22
Effects of Rapid Technology Trends
  • Consider today's product cycle:
  • Concept to production in 2 years
  • AND a market requirement of something new every
    6-12 months
  • Implications:
  • Pipelined design efforts using multiple design
    teams
  • Have to design for a complexity target that can't
    be implemented until the end of the cycle (design
    for the next technology)
  • Can't afford to miss the best technology, so you
    have to chase the trends

23
Cost, Price, and Their Trends
24
Cost
  • Clearly a marketplace issue -- profit as a
    function of volume
  • Let's focus on hardware costs
  • Factors impacting cost:
  • Learning curve: manufacturing costs decrease
    over time
  • Yield: the percentage of manufactured devices
    that survives the testing procedure
  • Volume is also a key factor in determining cost
  • Commodities are products that are sold by
    multiple vendors in large volumes and are
    essentially identical.

25
Learning Curve at Work
26
Integrated Circuit Costs
27
Remember This Comic?
28
Cost of an Integrated Circuit
  • The cost of a packaged integrated circuit is:

    Cost of IC = (Cost of die + Cost of testing die
                  + Cost of packaging and final test) / Final test yield

  • The cost of the die is:

    Cost of die = Cost of wafer / (Dies per wafer × Die yield)

  • The number of dies per wafer is:

    Dies per wafer = [π × (Wafer diameter / 2)²] / Die area
                     - [π × Wafer diameter] / sqrt(2 × Die area)

29
Cost of an Integrated Circuit
  • The fraction or percentage of good dies on a
    wafer (die yield) is:

    Die yield = Wafer yield
                × (1 + (Defects per unit area × Die area) / α)^(-α)

  • where α is a parameter that corresponds roughly to
    the number of masking levels, a measure of
    manufacturing complexity, critical to die yield
    (α = 4.0 is a good estimate).

Die cost goes roughly with (die area)^5
30
Example: Finding the Number of Dies
  • Find the number of dies per 30-cm wafer for a die
    that is 0.7 cm on a side.
  • Answer: The total die area is 0.49 cm². Thus

    Dies per wafer = [π × (30/2)²] / 0.49
                     - [π × 30] / sqrt(2 × 0.49)
                   ≈ 1442.6 - 95.2 ≈ 1347
31
Example: Finding the Die Yield
  • Find the die yield for dies that are 1 cm on a
    side and 0.7 cm on a side, assuming a defect
    density of 0.6 per cm².
  • Answer: The total die areas are 1 cm² and 0.49 cm².
  • For the larger die, the yield is

    Die yield = (1 + (0.6 × 1) / 4)^(-4) = 0.57

  • For the smaller die, it is

    Die yield = (1 + (0.6 × 0.49) / 4)^(-4) = 0.75
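
The formulas and both worked examples translate directly into code.
A minimal Python sketch; the $5000 wafer cost at the end is a
hypothetical value for illustration, not a figure from the slides:

    import math

    def dies_per_wafer(wafer_diameter_cm, die_area_cm2):
        # Usable wafer area over die area, minus a correction for
        # partial dies around the wafer's edge.
        return (math.pi * (wafer_diameter_cm / 2) ** 2 / die_area_cm2
                - math.pi * wafer_diameter_cm / math.sqrt(2 * die_area_cm2))

    def die_yield(die_area_cm2, defects_per_cm2=0.6, alpha=4.0, wafer_yield=1.0):
        # Die yield = Wafer yield * (1 + (defect density * die area)/alpha)^(-alpha)
        return wafer_yield * (1 + defects_per_cm2 * die_area_cm2 / alpha) ** -alpha

    def die_cost(wafer_cost, wafer_diameter_cm, die_area_cm2):
        # Cost of die = Cost of wafer / (Dies per wafer * Die yield)
        n = dies_per_wafer(wafer_diameter_cm, die_area_cm2)
        return wafer_cost / (n * die_yield(die_area_cm2))

    # Slide 30: 30-cm wafer, 0.7 cm x 0.7 cm die
    print(round(dies_per_wafer(30, 0.49)))                      # 1347
    # Slide 31: yields at 0.6 defects/cm^2 for 1 cm^2 and 0.49 cm^2 dies
    print(round(die_yield(1.0), 2), round(die_yield(0.49), 2))  # 0.57 0.75
    # Hypothetical $5000 wafer: cost per good 0.49 cm^2 die
    print(round(die_cost(5000, 30, 0.49), 2))                   # ~4.93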

32
Computer Designers and Chip Costs
  • The computer designer affects die size, and hence
    cost, both by what functions are included on or
    excluded from the die and by the number of I/O
    pins

33
Cost/Price
  • Component Costs
  • Direct Costs (add 10% to 30%): costs directly
    related to making a product
  • Labor, purchasing, scrap, warranty
  • Gross Margin (add 10% to 45%): the company's
    overhead that cannot be billed directly to one
    product
  • R&D, marketing, sales, equipment maintenance,
    rental, financing cost, pretax profits, taxes
  • Average Discount to get List Price (add 33% to
    66%):
  • Volume discounts and/or retailer markup
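
To see how the markups stack from component cost to list price, here
is a minimal sketch; the specific percentages and the $500 component
cost are illustrative values picked from the ranges above, not
numbers from the deck:

    def price_buildup(component_cost, direct=0.20, gross_margin=0.30, discount=0.50):
        # Each stage adds a percentage on top of the running subtotal
        # (slide ranges: direct 10-30%, gross margin 10-45%, discount 33-66%)
        direct_cost = component_cost * (1 + direct)
        avg_selling_price = direct_cost * (1 + gross_margin)
        return avg_selling_price * (1 + discount)   # list price

    # $500 of components -> 500 * 1.20 * 1.30 * 1.50 = $1170 list price
    print(price_buildup(500))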

34
Cost/Price Illustration
35
Cost/Price for Different Kinds of Systems
36
Measuring and Reporting Performance
37
Performance
  • 2 key aspects, and making 1 faster may slow the
    other:
  • Execution time (single task)
  • Throughput (multiple tasks)
  • Comparing performance:
  • Key measurement is time of real programs
  • MIPS? MFLOPS?
  • Performance = 1/execution time
  • If X is N times faster than Y:
    Execution time of Y / Execution time of X = N
  • Similar for throughput comparisons
  • Improved performance ⇔ decreasing execution time
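
The comparison rule in one line of code, with made-up execution times:

    def times_faster(time_x, time_y):
        # "X is N times faster than Y" means time(Y) / time(X) = N,
        # equivalently perf(X) / perf(Y) = N, since performance = 1/time
        return time_y / time_x

    # Hypothetical: X finishes in 10 s, Y in 15 s -> X is 1.5x faster
    print(times_faster(10.0, 15.0))   # 1.5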

38
Measuring Performance
  • Several kinds of time
  • Wall-clock time = response time, or elapsed time
  • Load, I/O delays, OS overhead
  • CPU time - time spent computing your program
  • Factors out time spent waiting for I/O delays
  • But includes the OS as well as your program
  • Hence: system CPU time, and user CPU time

39
OS Time
  • The Unix time command reports:
  • User CPU time
  • System CPU time
  • Total elapsed time
  • % of elapsed time that is user + system CPU time
  • Tells you, indirectly, how much time you spent waiting
  • BEWARE
  • OSes have a way of under-measuring themselves

90.7u 12.9s 2:39 65%
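
Checking that sample output: 2:39 of elapsed time is 159 seconds, of
which 90.7 s (user) plus 12.9 s (system) were CPU time:

    user, system, elapsed = 90.7, 12.9, 2 * 60 + 39   # seconds
    cpu_fraction = (user + system) / elapsed
    print(f"{cpu_fraction:.0%} CPU time")   # 65%; the other ~35% was waiting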
40
Choosing Programs to Evaluate Performance
  • Real applications: clearly the right choice
  • Porting and eliminating system-dependent
    activities
  • User burden -- knowing which of your programs you
    really care about
  • Modified (or scripted) applications
  • Enhance portability or focus on particular
    aspects of system performance
  • Kernels: small, key pieces of real programs
  • Best used to isolate performance of individual
    features, to explain the reasons for differences
    in performance of real programs
  • Livermore Loops and Linpack are examples
  • Not real programs however -- no user really uses
    them

41
Choosing Programs to Evaluate Performance (Cont.)
  • Toy benchmarks: quicksort, puzzle
  • Beginning programming assignments
  • Synthetic benchmarks
  • Try to match the average frequency of operations
    and operands of a large set of programs
  • No user really runs them -- not even pieces of
    real programs
  • They typically reside in cache and don't test
    memory performance
  • At the very least you must understand what the
    benchmark code is in order to understand what it
    might be measuring
  • Companies thrive or bust on benchmark performance
  • Hence they optimize for the benchmark
  • BEWARE ALWAYS!!

42
Benchmark Suites
  • SPEC (Standard Performance Evaluation
    Corporation)
  • http://www.spec.org
  • Desktop benchmarks:
  • CPU-intensive: SPEC CPU2000
  • Graphics-intensive: SPECviewperf
  • Server benchmarks:
  • CPU throughput-oriented: SPECrate
  • I/O activity: SPECSFS (NFS), SPECWeb
  • Transaction processing: TPC (Transaction
    Processing Performance Council)
  • Embedded benchmarks
  • EEMBC (EDN Embedded Microprocessor Benchmark
    Consortium)

43
Some PC Benchmarks
44
SPEC CPU2000 Benchmark Suites - Integer
45
SPEC CPU2000 Benchmark Suites - Floating Point
46
Reporting Performance Results
  • Claim: Spice takes X seconds on machine Y
  • Missing:
  • Spice version and input? What was the circuit?
  • Operational parameters - time step, duration
  • Compiler version and optimization settings
  • Machine configuration - disk, memory, etc.
  • Source code modifications or hand-generated
    assembly language
  • Reproducibility is a must
  • List everything another experimenter would need
    to duplicate the results

47
Benchmark Reporting
48
Other Problems
  • Let's assume we can get the test jig specified
    properly
  • See the following example
  • Which is better?
  • By how much?
  • Are the programs equally important?

49
Some Aggregate Job Mix Options
  • Arithmetic Mean - provides a simple average
  • Does not account for weight - all programs
    treated equally
  • Weighted arithmetic mean
  • Weight is the frequency of use
  • Better, but beware the dominant program's time
  • Depends on the reference machine

50
Weighted Arithmetic Mean
51
Normalized Time Metrics
  • Geometric Mean
  • Has the nice property that
  • Ratio of the means = Mean of the ratios
  • Consistent no matter which machine is the
    reference
  • Better than arithmetic means, but:
  • Doesn't form accurate prediction models; doesn't
    predict execution time
  • Still have to remain cautious (more drawbacks on
    pp. 37-39)
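
A small demonstration of why the reference machine matters for the
arithmetic mean but not the geometric mean; the execution times are
made up for illustration:

    import math

    # Hypothetical execution times (seconds) of two programs on A and B
    times_a = [1.0, 1000.0]
    times_b = [10.0, 100.0]

    def amean(xs):
        return sum(xs) / len(xs)

    def gmean(xs):
        return math.prod(xs) ** (1 / len(xs))

    # Execution times normalized to each machine in turn
    b_norm_a = [b / a for a, b in zip(times_a, times_b)]   # [10.0, 0.1]
    a_norm_b = [a / b for a, b in zip(times_a, times_b)]   # [0.1, 10.0]

    # The arithmetic mean contradicts itself: each machine looks ~5x
    # slower than the other, depending on which is the reference.
    print(amean(b_norm_a), amean(a_norm_b))   # 5.05 5.05

    # The geometric mean is consistent either way, and the ratio of
    # the means equals the mean of the ratios.
    print(gmean(b_norm_a), gmean(a_norm_b))   # 1.0 1.0
    print(gmean(times_b) / gmean(times_a))    # ~1.0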

52
Normalized Time Metrics
Arithmetic mean should not be used to average
normalized execution time
53
Quantitative Principles of Computer Design
54
Make the Common Case Fast
  • Most pervasive principle of design
  • Need to validate that it is common or uncommon
  • Often
  • Common cases are simpler than uncommon cases
  • e.g. exceptions like overflow, interrupts, ...
  • Truly simple is usually both cheap and fast -
    best of both worlds
  • Trick is to quantify the advantage of a proposed
    enhancement

55
Amdahl's Law
Quantification of the diminishing-returns
principle
  • Defines speedup gained from a particular feature
  • Depends on 2 factors
  • Fraction of original computation time that can
    take advantage of the enhancement - e.g. the
    commonality of the feature
  • Level of improvement gained by the feature
  • Amdahl's law

56
Amdahl's Law (Cont.)
Suppose that enhancement E accelerates a fraction
F of the task by a factor S, and the remainder
of the task is unaffected. Then:

    ExTime(new) = ExTime(old) × ((1 - F) + F / S)

    Speedup(E) = ExTime(old) / ExTime(new)
               = 1 / ((1 - F) + F / S)
57
Simple Example
Amdahl's Law says nothing about cost
  • Important application:
  • FPSQRT: 20%
  • FP instructions account for 50%
  • Other: 30%
  • Designers say it costs the same to speed up:
  • FPSQRT by 40x
  • FP by 2x
  • Other by 8x
  • Which one should you invest in?
  • Straightforward: plug in the numbers and compare,
    BUT what's your guess? (See the sketch below.)

58
And the Winner Is?
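
Plugging the previous slide's numbers into Amdahl's Law settles it;
a minimal sketch:

    def speedup(fraction, factor):
        # Amdahl's Law: overall speedup when `fraction` of the original
        # time is accelerated by `factor` and the rest is unaffected
        return 1 / ((1 - fraction) + fraction / factor)

    print(f"FPSQRT by 40x: {speedup(0.20, 40):.3f}")   # 1.242
    print(f"FP by 2x:      {speedup(0.50, 2):.3f}")    # 1.333
    print(f"Other by 8x:   {speedup(0.30, 8):.3f}")    # 1.356 <- winner

The unglamorous 8x on the 30% "other" fraction wins: a huge factor
applied to a small fraction of the time buys surprisingly little.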
59
Calculating CPU Performance
  • All commercial machines are synchronous
  • Implies there is a clock which ticks once per
    cycle
  • Hence there are 2 useful basic metrics
  • Clock Rate - today in MHz
  • Clock cycle time
  • Clock cycle time = 1 / clock rate
  • E.g., a 250 MHz clock rate corresponds to a 4 ns
    cycle time

60
Calculating CPU Performance (Cont.)
  • We tend to count instructions executed = IC
  • Note: looking at the object code is just a start
  • What we care about is the dynamic count - e.g.,
    don't forget loops, recursion, branches, etc.
  • CPI (Clocks Per Instruction) is a figure of merit

61
Calculating CPU Performance (Cont.)
  • 3 Focus Factors -- Cycle Time, CPI, IC
  • Sadly - they are interdependent and making one
    better often makes another worse (but small or
    predictable impacts)
  • Cycle time depends on HW technology and
    organization
  • CPI depends on organization (pipeline,
    caching...) and ISA
  • IC depends on ISA and compiler technology
  • Often CPIs are easier to deal with on a per
    instruction basis
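
The three factors combine in the basic CPU performance equation,
CPU time = IC × CPI × clock cycle time. A minimal sketch with
hypothetical numbers:

    def cpu_time(instruction_count, cpi, clock_rate_hz):
        # CPU time = IC * CPI * cycle time = IC * CPI / clock rate
        return instruction_count * cpi / clock_rate_hz

    # Hypothetical: 1 billion dynamic instructions, CPI 2.0, 250 MHz
    print(cpu_time(1_000_000_000, 2.0, 250e6))   # 8.0 seconds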

62
Simple Example
  • Suppose we have made the following measurements:
  • Frequency of FP operations (other than FPSQR)
    = 25%
  • Average CPI of FP operations = 4.0
  • Average CPI of other instructions = 1.33
  • Frequency of FPSQR = 2%
  • CPI of FPSQR = 20
  • Two design alternatives:
  • Reduce the CPI of FPSQR to 2
  • Reduce the average CPI of all FP operations to 2

63
And The Winner is
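
Working the example through, following the textbook's accounting (an
assumption worth flagging: the 2% FPSQR frequency is counted within
the 25% FP frequency, and the 4.0 average FP CPI already folds in
FPSQR's CPI of 20):

    # Instruction mix and CPIs from the previous slide
    f_fp, f_fpsqr, f_other = 0.25, 0.02, 0.75
    cpi_fp, cpi_fpsqr, cpi_other = 4.0, 20.0, 1.33

    cpi_original = f_fp * cpi_fp + f_other * cpi_other      # ~2.0

    # Alternative 1: reduce the CPI of FPSQR from 20 to 2
    cpi_alt1 = cpi_original - f_fpsqr * (cpi_fpsqr - 2.0)   # ~1.64

    # Alternative 2: reduce the average CPI of all FP operations to 2
    cpi_alt2 = f_fp * 2.0 + f_other * cpi_other             # ~1.5

    print(f"FPSQR option: speedup = {cpi_original / cpi_alt1:.2f}")   # 1.22
    print(f"FP option:    speedup = {cpi_original / cpi_alt2:.2f}")   # 1.33

Improving all FP operations wins, 1.33 vs. 1.22, even though FPSQR's
individual CPI drops far more.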