Computer Architecture - PowerPoint PPT Presentation

About This Presentation

Title:

Computer Architecture

Slides: 42

Provided by: oma2

Transcript and Presenter's Notes

1
Computer Architecture

The architecture of a computer is the interface
between the machine and the software
- Andris Padegs
IBM 360/370 Architect

2
Course Outline

Computer Architecture
Quarter: Autumn 2006-7
Instructor: Muhammad Jahangir Ikram
Office: Room 424
E-mail: jikram_at_lums.edu.pk
Office Hours: Monday and Wednesday, 3:00-4:30 pm

3
Course Outline (Contd..)

Description
This course focuses on the principles, practices
and issues in Computer Architecture, while
examining computer design tradeoffs both
qualitatively and quantitatively.
The course starts with a quick overview of
computer design fundamentals and instruction set
principles, the materials which the student has
already covered in the pre-requisite of this
course.
The following topics are covered in greater
detail:
Advanced Pipelining
Instruction-Level Parallelism and Compiler
Support
Memory-Hierarchy Design
SIMD, VLIW, Superscalar Architectures
Code Optimization and Compiler Issues

4
Course Outline (Contd..)

Text Book
Hennessy, J. L., and Patterson, D. A., Computer
Architecture: A Quantitative Approach, 2nd
Edition. Morgan Kaufmann, 1996.

5
Course Outline (Contd..)

Lectures: There will be two 75-minute lectures
per week, plus a 50-minute lecture / 100-minute lab.
Total sessions: 29. There will be four labs,
during weeks 2, 3, 4 and 5.

6
Course Outline (Contd..)

Grading
Quizzes + assignments: 17% + 3%
Laboratory: 10% (Attendance 3% + Lab Task 3% + HW 4%)
Midterm exam: 30%
Final exam: 40%

7
Schedule

Fundamentals of Computer Design (sessions 1, 2; sections 1.1-1.10)
Measuring and Reporting Performance
Quantitative Principles of Computer Design
Instruction Set Principles and Examples (sessions 3-5; sections 2.1-2.8)
Classifying Instruction Set Architectures
Memory Addressing
Operations in the Instruction Set
Encoding an Instruction Set
LAB 1: MIPS Instruction Format and Instruction Study (session 6)
Pipelining Overview (sessions 7-14; sections A.1 to A.10)
What Is Pipelining?
Single Cycle Computer Study (session 9)
The Major Hurdle of Pipelining: Pipeline Hazards
Data Hazards
LAB 2: Study of Pipelining (session 12)

8
Schedule

Control Hazards and Static Branch Prediction
LAB 3: Pipeline Studies and Control Hazards (session 15)
Scoreboarding
MIDTERM
ILP and Dynamic Exploitation (sessions 17-19; sections 3.1-3.5)
Static Branch Prediction
Tomasulo's Dynamic Scheduling
Dynamic Branch Prediction
Superscalar and VLIW Architectures
Advanced Pipelining and ILP (Contd.) (sessions 20-22; sections 3.6-3.10)
Taking Advantage of More ILP with Multiple Issue
P6 Architecture
Advanced Pipelining and ILP (Contd.) (sessions 23-25; sections 4.1, 4.7)
Compiler Support for Exploiting ILP
Hardware Support for Extracting More Parallelism
Putting It All Together: The PowerPC 620 and Itanium

9
Schedule

Memory-Hierarchy Design (sessions 26-29; sections 5.1-5.7)
The ABCs of Caches
Reducing Cache Misses
Reducing Cache Miss Penalty
Virtual Memory System
Computer I/O (session 30; sections 6.1 - ?)

10
Background

Emergence of the microprocessor in the
late 1970s
Roughly 35% performance growth per year
Important changes in the marketplace:
Virtual elimination of assembly language
programming reduced the need for object-code
compatibility
Creation of standardized, vendor-independent
operating systems, such as UNIX and Linux, lowered the
risk of bringing out a new architecture

11
Development of RISC

These changes led to the development of a new
set of architectures, called
RISC (Reduced Instruction Set Computer)
architectures
RISC relies on two performance techniques:
Instruction-level parallelism (pipelining)
Use of caches

12
Growth in microprocessor performance
13
Moore's Law
14
Technology Scaling
15
Scaling of Transistors

Feature size has shrunk from 3 microns in 1985 to
0.09 micron.
Reducing the feature size means a quadratic increase in
transistor count and better performance,
but also higher routing delays and poor performance of
long wires.
It also means more power consumption overall (even though
load capacitance per transistor is lower).
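
The "quadratic increase" can be checked with a little arithmetic. The sketch below assumes ideal scaling on a fixed die area, where each transistor's area shrinks with the square of the feature size; real processes deviate from this, and the function name is only illustrative.

```python
def ideal_density_gain(old_feature_um: float, new_feature_um: float) -> float:
    """Ideal transistor-count gain on a fixed die area.

    Assumption: transistor area scales with the square of the feature size,
    so the count grows with the square of the linear shrink factor.
    """
    linear_shrink = old_feature_um / new_feature_um
    return linear_shrink ** 2


# Feature sizes taken from the slide: 3 microns (1985) down to 0.09 micron.
gain = ideal_density_gain(3.0, 0.09)
print(f"linear shrink: {3.0 / 0.09:.1f}x")
print(f"ideal transistor-count gain: {gain:.0f}x")
```

A linear shrink of about 33x therefore allows roughly a thousand times more transistors in the same area, before wire delay and power limits are taken into account.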

16
The Itanium Processor
17
Intel microprocessor die
18
IC Cost Trends (Source IC Knowledge)
19
Measuring performance

Definition of time
Response time (elapsed time): the latency to
complete the task, including disk access,
input/output, operating system overhead, etc.
CPU time
User CPU time
Time spent in the program
System CPU time
Time spent by the operating system on behalf of the program
Unix time command
90.7u 12.9s 2:39 65%
User CPU time 90.7 s, system CPU time 12.9 s, elapsed time 2:39 (159 s)
CPU share of elapsed time: (90.7 + 12.9) / 159 = 65%
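
The percentage in the Unix time output can be reproduced directly from the three times on the slide. This is a minimal sketch; the helper name `cpu_fraction` is hypothetical.

```python
def cpu_fraction(user_s: float, system_s: float, elapsed_s: float) -> float:
    """Fraction of elapsed (wall-clock) time that was spent as CPU time."""
    return (user_s + system_s) / elapsed_s


# Values from the slide: 90.7 s user, 12.9 s system, 2 min 39 s elapsed.
elapsed = 2 * 60 + 39  # 159 seconds
frac = cpu_fraction(90.7, 12.9, elapsed)
print(f"CPU time {90.7 + 12.9:.1f} s of {elapsed} s elapsed ({frac:.0%})")
```

The remaining third of the elapsed time was spent waiting, for example on I/O or on other programs sharing the machine.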

20
What is a Benchmark?

A benchmark is "a standard of measurement or
evaluation" (Webster's II Dictionary).
A computer benchmark is typically a computer
program that performs a strictly defined set of
operations - a workload - and returns some form
of result - a metric - describing how the tested
computer performed.
Computer benchmark metrics usually measure
speed (how fast the workload was completed) or
throughput (how many workload units per unit time
were completed).
Running the same computer benchmark on multiple
computers allows a comparison to be made.
Source: Standard Performance Evaluation
Corporation
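
To make the workload / metric distinction concrete, here is a toy harness. It is only a sketch: the workload (a sum of squares), the function names and the run sizes are all made up for illustration and have nothing to do with any SPEC suite.

```python
import time


def workload(n: int) -> int:
    """A strictly defined set of operations: sum of squares below n."""
    return sum(i * i for i in range(n))


def run_benchmark(units: int, size: int) -> dict:
    """Run the workload `units` times and report a speed and a throughput metric."""
    start = time.perf_counter()
    for _ in range(units):
        workload(size)
    elapsed = time.perf_counter() - start
    return {
        "elapsed_s": elapsed,            # speed: how fast the workload completed
        "units_per_s": units / elapsed,  # throughput: workload units per unit time
    }


result = run_benchmark(units=20, size=10_000)
print(result)
```

Running the same script on two machines gives two sets of numbers that can be compared, which is the whole point of a benchmark.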

21
Programs to Evaluate Performance

Real Applications
Modified (or scripted) applications
Kernels
Toy benchmarks
Synthetic benchmarks

22
Programs to evaluate performance

Real Applications
Example: compilers for C, text-processing
software, etc.
Modified (or scripted) applications
CPU-oriented benchmarks; I/O may be removed to
minimize its impact on execution time

23
Programs to evaluate performance

Kernels
To isolate performance of individual features of
a machine.
Toy benchmarks
Produce a result that the user already knows
Synthetic benchmarks
Try to match the average frequency of operations
and operands of a large set of programs

24
Benchmark Suites

SPEC95, SPEC2000 (11 Integer, 14 FP), SPEC2006
(12 Integer, 17 FP)
C Compiler, Router, FEM
Desktop (CPU and Graphics Intensive)
Server (File Servers, Web Servers, Transaction
Processing)
Embedded (EEMBC)
34 Kernels

25
What is SPEC

SPEC is the Standard Performance Evaluation
Corporation. SPEC is a non-profit organization
whose members include computer hardware vendors,
software companies, universities, research
organizations, systems integrators, publishers
and consultants. SPEC's goal is to establish,
maintain and endorse a standardized set of
relevant benchmarks for computer systems.
Although no one set of tests can fully
characterize overall system performance, SPEC
believes that the user community benefits from
objective tests which can serve as a common
reference point.

26
What does a benchmark measure?

Compute-intensive benchmarks such as SPEC CPU2006 emphasize the performance of
the computer processor (CPU),
the memory architecture, and
the compilers.
SPEC CPU2006 contains two components that focus
on two different types of compute-intensive
performance:
The CINT2006 suite measures compute-intensive
integer performance, and
the CFP2006 suite measures compute-intensive
floating-point performance.
Source: Standard Performance Evaluation
Corporation

27
Reference Machine (Source: Standard Performance
Evaluation Corporation)

SPEC uses a historical Sun system, the "Ultra
Enterprise 2", which was introduced in 1997, as
the reference machine. The reference machine uses
a 296 MHz UltraSPARC II processor, as did the
reference machine for CPU2000. But the reference
machines for the two suites are not identical:
the CPU2006 reference machine has substantially
better caches, and the CPU2000 reference machine
could not have held enough memory to run CPU2006.
It takes about 12 days to do a rule-conforming
run of the base metrics for CINT2006 and CFP2006
on the CPU2006 reference machine. SPEC2000 now
takes less than a minute on the latest high-performance
machines.
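
The reference machine matters because SPEC CPU results are reported relative to it: each benchmark yields a ratio of reference time to measured time, and the ratios are summarized with a geometric mean. This is general background on SPEC rather than something stated on the slide. In the sketch below the times are hypothetical, not real SPEC results, and the function names are illustrative.

```python
import math


def spec_ratio(reference_s: float, measured_s: float) -> float:
    """Ratio of reference-machine time to measured time (higher is faster)."""
    return reference_s / measured_s


def geometric_mean(values: list) -> float:
    """n-th root of the product, computed via logarithms for stability."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical (reference, measured) run times in seconds for three benchmarks.
times = [(9000.0, 900.0), (8000.0, 400.0), (10000.0, 2000.0)]
ratios = [spec_ratio(ref, measured) for ref, measured in times]
score = geometric_mean(ratios)
print(ratios, score)
```

The geometric mean is used so that the comparison between two machines does not depend on which machine is chosen as the reference.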

28
Example Result for SPEC 2000 (Source: Standard
Performance Evaluation Corporation)
29
Example Result for SPEC 2000 (Source: Standard
Performance Evaluation Corporation)
30
Summarizing Performance
31
Amdahl's Law

The performance improvement to be gained from
using a faster mode of execution is limited by the
fraction of the time the faster mode can be used.
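
In its usual form, the law gives overall speedup = 1 / ((1 - f) + f / s), where f is the fraction of time the faster mode can be used and s is the speedup of that mode. The sketch below uses this standard formula; the 40% fraction is an arbitrary illustration, not a number from the slides.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only a fraction of execution time is accelerated."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Illustrative assumption: the faster mode applies to 40% of the execution time.
for s in (2, 10, 100, 1_000_000):
    print(f"enhancement {s:>9}x -> overall {amdahl_speedup(0.4, s):.4f}x")
```

However large the enhancement, the overall speedup stays below 1 / (1 - 0.4), about 1.67, which is why the law is also described as one of diminishing returns.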

32
Amdahl's Law: Law of Diminishing Returns
33
CPU performance Equations
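
The equations for this slide did not survive in the transcript. For reference, the standard textbook forms are given below; the notation (IC for instruction count, CPI_i for the cycles per instruction of instruction class i) is an assumption and may differ from the original slide.

```latex
\begin{align}
\text{CPU time} &= \text{CPU clock cycles} \times \text{Clock cycle time} \\
\text{CPI} &= \frac{\text{CPU clock cycles}}{\text{IC}} \\
\text{CPU time} &= \text{IC} \times \text{CPI} \times \text{Clock cycle time} \\
\text{CPI} &= \sum_{i} \text{CPI}_i \times \frac{\text{IC}_i}{\text{IC}}
\end{align}
```
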
34
Example

Frequency of FP operations = 25%
Average CPI of FP operations = 4.0
Average CPI of other instructions = 1.33
Frequency of FPSQR = 2%
CPI of FPSQR = 20
Assume two design alternatives: decrease the CPI of FPSQR to 2, or decrease the CPI of
all FP operations to 2.5.
Compare these two designs using the CPU
performance equations.

35
Example Solution
CPI for enhanced FPSQR
CPI for enhanced FP operation
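
The worked numbers for these two CPIs are missing from the transcript. The following sketch recomputes them from the values given in the example, assuming the instruction count and clock rate are the same for both designs so that speedup is simply the ratio of CPIs. Variable names are illustrative.

```python
# Values from the example slide.
freq_fp, cpi_fp = 0.25, 4.0          # all FP operations
cpi_other = 1.33                     # all other instructions
freq_fpsqr, cpi_fpsqr = 0.02, 20.0   # FPSQR alone

# Original CPI: weighted sum over the instruction classes.
cpi_original = freq_fp * cpi_fp + (1 - freq_fp) * cpi_other

# Design 1: CPI of FPSQR drops from 20 to 2; subtract the cycles saved.
cpi_new_fpsqr = cpi_original - freq_fpsqr * (cpi_fpsqr - 2.0)

# Design 2: CPI of all FP operations drops from 4.0 to 2.5.
cpi_new_fp = (1 - freq_fp) * cpi_other + freq_fp * 2.5

print(f"original CPI       : {cpi_original:.4f}")
print(f"enhanced FPSQR CPI : {cpi_new_fpsqr:.4f}"
      f" (speedup {cpi_original / cpi_new_fpsqr:.3f})")
print(f"enhanced FP CPI    : {cpi_new_fp:.4f}"
      f" (speedup {cpi_original / cpi_new_fp:.3f})")
```

With the original CPI rounded to 2.0, the two alternatives come out at about 1.64 and 1.625, so improving all FP operations is marginally better than improving FPSQR alone.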
36
Example Solution
37
Another Measure -- MIPS
38
Example: An Embedded Processor

120 MIPS for the single processor.
80 MIPS for the processor + co-processor combination
(that is how MIPS is measured for the combination).
I = number of integer instructions
F = number of floating-point instructions (8M)
Y = number of integer instructions to emulate one FP
instruction (50)
W = time for choice 1 (4 seconds)
B = time for choice 2
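
One way to read this example is sketched below. It relies on the usual definition MIPS = instruction count / (execution time x 10^6), and on an interpretation that the transcript does not spell out: choice 1 is taken to be the single processor emulating each FP instruction with Y integer instructions, and choice 2 the processor with co-processor executing FP instructions directly. Treat both the interpretation and the resulting values as assumptions.

```python
MIPS_SINGLE = 120.0   # choice 1 (assumed): processor alone, FP emulated in software
MIPS_COMBO = 80.0     # choice 2 (assumed): processor + co-processor
F = 8e6               # floating-point instructions in the program
Y = 50                # integer instructions needed to emulate one FP instruction
W = 4.0               # execution time for choice 1, in seconds

# Choice 1 executes I + F*Y instructions in W seconds at 120 MIPS.
executed_single = MIPS_SINGLE * 1e6 * W
I = executed_single - F * Y           # integer instructions in the program

# Choice 2 executes I + F instructions at 80 MIPS.
B = (I + F) / (MIPS_COMBO * 1e6)      # execution time for choice 2, in seconds

print(f"I = {I / 1e6:.0f}M integer instructions")
print(f"B = {B:.2f} s versus W = {W:.2f} s")
```

Under this reading the configuration with the lower MIPS rating finishes the program several times faster, which illustrates why MIPS alone can be a misleading measure.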