Lecture 3: Computer Architectures

CSE 574 Parallel Processing

1
Lecture 3: Computer Architectures
2
Basic Computer Architecture
  • Von Neumann Architecture

[Diagram: Von Neumann architecture. A processor containing a control unit (CU), ALU, and registers exchanges instructions and data with memory; input and output units connect to the system.]
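To make the stored-program idea concrete, here is a minimal sketch (not from the slides; the instruction set and layout are hypothetical) of the fetch-decode-execute cycle: a single memory holds both the instructions and the data they operate on.

```python
# Minimal hypothetical von Neumann machine: one memory holds both
# instructions and data; the control loop fetches, decodes, executes.
memory = [
    ("LOAD", 10),   # acc = mem[10]
    ("ADD", 11),    # acc += mem[11]
    ("STORE", 12),  # mem[12] = acc
    ("HALT", None),
    None, None, None, None, None, None,
    5, 7, 0,        # data region: mem[10], mem[11], mem[12]
]

acc, pc = 0, 0                      # accumulator register and program counter
while True:
    op, addr = memory[pc]           # fetch and decode
    pc += 1
    if op == "LOAD":                # execute
        acc = memory[addr]
    elif op == "ADD":
        acc += memory[addr]
    elif op == "STORE":
        memory[addr] = acc
    elif op == "HALT":
        break

print(memory[12])                   # prints 12 (5 + 7)
```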
3
Levels of Parallelism
  • Bit level parallelism
  • Within arithmetic logic circuits
  • Instruction level parallelism
  • Multiple instructions execute per clock cycle
  • Memory system parallelism
  • Overlap of memory operations with computation
  • Operating system parallelism
  • More than one processor
  • Multiple jobs run in parallel on SMP
  • Loop level
  • Procedure level

4
Levels of Parallelism
  • Bit Level Parallelism
  • Within arithmetic logic circuits
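As a hedged software illustration of bit-level parallelism (not from the slides): a 64-bit ALU adds all bit positions of its operands in one operation, whereas a 1-bit-wide datapath would have to ripple through the positions one at a time, as the full-adder chain below mimics.

```python
# Bit-level parallelism, illustrated in software: a word-wide ALU adds all
# bits at once, while a 1-bit datapath ripples through the bit positions
# one full-adder step at a time.
def ripple_add(a: int, b: int, width: int = 64) -> int:
    result, carry = 0, 0
    for i in range(width):              # one full-adder step per bit position
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        s = ai ^ bi ^ carry
        carry = (ai & bi) | (ai & carry) | (bi & carry)
        result |= s << i
    return result

a, b = 123_456_789, 987_654_321
assert ripple_add(a, b) == a + b        # same result, but a + b is a single
                                        # word-wide operation on the hardware
```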

5
Levels of Parallelism
  • Instruction Level Parallelism (ILP)
  • Multiple instructions execute per clock cycle
  • Pipelining (instruction and data pipelines)
  • Multiple Issue (VLIW)
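As a rough, hedged illustration of why pipelining helps: with an ideal k-stage pipeline and no stalls or hazards, n instructions finish in about k + (n - 1) cycles instead of n * k.

```python
# Back-of-the-envelope pipeline timing (assumes an ideal k-stage pipeline
# with no stalls or hazards): k + (n - 1) cycles versus n * k cycles.
def pipelined_cycles(n_instructions: int, k_stages: int) -> int:
    return k_stages + (n_instructions - 1)

def unpipelined_cycles(n_instructions: int, k_stages: int) -> int:
    return n_instructions * k_stages

n, k = 1000, 5
print(unpipelined_cycles(n, k))   # 5000 cycles
print(pipelined_cycles(n, k))     # 1004 cycles -> speedup approaches k
```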

6
Levels of Parallelism
  • Memory System Parallelism
  • Overlap of memory operations with computation

7
Levels of Parallelism
  • Operating System Parallelism
  • There is more than one processor
  • Multiple jobs run in parallel on SMP
  • Loop level
  • Procedure level
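A minimal sketch of loop-level parallelism on an SMP (the loop body is hypothetical): independent iterations are handed to worker processes that the operating system schedules across the processors.

```python
# Loop-level parallelism on an SMP (sketch): independent loop iterations
# are farmed out to worker processes scheduled across the cores by the OS.
from multiprocessing import Pool

def body(i):                         # one independent loop iteration
    return i * i

if __name__ == "__main__":
    with Pool() as pool:             # one worker per core by default
        results = pool.map(body, range(16))
    print(results)
```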

8
Flynn's Taxonomy
  • Single Instruction stream - Single Data stream
    (SISD)
  • Single Instruction stream - Multiple Data stream
    (SIMD)
  • Multiple Instruction stream - Single Data stream
    (MISD)
  • Multiple Instruction stream - Multiple Data
    stream (MIMD)

9
Single Instruction stream - Single Data stream
(SISD)
  • Von Neumann Architecture

[Diagram: SISD. A single processor (CU and ALU) fetches a single instruction stream and a single data stream from memory.]
10
Flynn's Taxonomy
  • Single Instruction stream - Single Data stream
    (SISD)
  • Single Instruction stream - Multiple Data stream
    (SIMD)
  • Multiple Instruction stream - Single Data stream
    (MISD)
  • Multiple Instruction stream - Multiple Data
    stream (MIMD)

11
Single Instruction stream - Multiple Data stream
(SIMD)
  • Instructions of the program are broadcast to more
    than one processor
  • Each processor executes the same instruction
    synchronously, but using different data
  • Used for applications that operate upon arrays of
    data

[Diagram: SIMD. One control unit (CU) broadcasts the instruction stream to several processing elements (PEs), each operating on its own data stream from memory.]
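As a hedged, software-level illustration of the SIMD idea (assuming NumPy is available): a whole-array expression applies the same operation to every element, and NumPy typically maps such operations onto the CPU's vector (SIMD) instructions internally.

```python
# SIMD-style data parallelism: one operation applied to many data elements.
import numpy as np

a = np.arange(1_000_000, dtype=np.float32)
b = np.ones(1_000_000, dtype=np.float32)

c = a * 2.0 + b          # the same "instruction" applied to every element
print(c[:5])             # [1. 3. 5. 7. 9.]
```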
12
Flynn's Taxonomy
  • Single Instruction stream - Single Data stream
    (SISD)
  • Single Instruction stream - Multiple Data stream
    (SIMD)
  • Multiple Instruction stream - Single Data stream
    (MISD)
  • Multiple Instruction stream - Multiple Data
    stream (MIMD)

13
Multiple Instruction stream - Multiple Data
stream (MIMD)
  • Each processor has a separate program
  • An instruction stream is generated for each
    program on each processor
  • Each instruction operates upon different data
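A minimal sketch of the MIMD idea using operating-system processes (the function names are hypothetical): two processes run different programs on different data, asynchronously and independently.

```python
# MIMD sketch: two processes run *different* programs on *different* data.
from multiprocessing import Process

def count_words(text):
    print("words:", len(text.split()))

def sum_numbers(numbers):
    print("sum:", sum(numbers))

if __name__ == "__main__":
    p1 = Process(target=count_words, args=("multiple instruction streams",))
    p2 = Process(target=sum_numbers, args=([1, 2, 3, 4],))
    p1.start(); p2.start()
    p1.join(); p2.join()
```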

14
Multiple Instruction stream - Multiple Data
stream (MIMD)
  • Shared memory
  • Distributed memory

15
Shared vs Distributed Memory
  • Distributed memory
  • Each processor has its own local memory
  • Message-passing is used to exchange data between
    processors
  • Shared memory
  • Single address space
  • All processes have access to the pool of shared
    memory

16
Distributed Memory
  • Processors cannot directly access another
    processor's memory
  • Each node has a network interface (NI) for
    communication and synchronization

[Diagram: Distributed memory. Each node pairs a processor (P) with its own local memory (M) and a network interface (NI); the nodes communicate over an interconnection network.]
17
Distributed Memory
  • Each processor executes different instructions
    asynchronously, using different data

[Diagram: Distributed-memory MIMD. Each node has its own CU, processing element (PE), and local memory (M), executing its own instruction and data streams; the nodes are connected by a network.]
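A minimal message-passing sketch (Python's multiprocessing.Pipe stands in for the network interface; real clusters typically use MPI): the worker has no access to the parent's address space, so data is exchanged with explicit send and receive operations.

```python
# Message passing between separate address spaces: the worker cannot read
# the parent's memory, so data travels over an explicit channel.
from multiprocessing import Process, Pipe

def worker(conn):
    data = conn.recv()            # receive a message from the other node
    conn.send(sum(data))          # send the partial result back
    conn.close()

if __name__ == "__main__":
    parent_end, child_end = Pipe()
    p = Process(target=worker, args=(child_end,))
    p.start()
    parent_end.send([1, 2, 3, 4, 5])   # explicit send ...
    print(parent_end.recv())           # ... and receive: prints 15
    p.join()
```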
18
Shared Memory
  • Each processor executes different instructions
    asynchronously, using different data

[Diagram: Shared-memory MIMD. Several CU/PE pairs fetch their own instruction and data streams from a single shared memory.]
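A minimal shared-memory sketch (using Python's multiprocessing shared objects as a stand-in for a shared address space): all workers update the same counter, and a lock synchronizes their accesses.

```python
# Shared-memory sketch: the workers all update the *same* counter in a
# single shared address space; a lock synchronizes the accesses.
from multiprocessing import Process, Value, Lock

def worker(counter, lock, n):
    for _ in range(n):
        with lock:                 # synchronize access to shared data
            counter.value += 1

if __name__ == "__main__":
    counter = Value("i", 0)        # one integer visible to all processes
    lock = Lock()
    procs = [Process(target=worker, args=(counter, lock, 10_000))
             for _ in range(4)]
    for p in procs: p.start()
    for p in procs: p.join()
    print(counter.value)           # 40000
```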
19
Shared Memory
  • Uniform memory access (UMA)
  • Each processor has uniform access to memory
    (symmetric multiprocessor - SMP)
  • Non-uniform memory access (NUMA)
  • Time for memory access depends on the location of
    data
  • Local access is faster than non-local access
  • Easier to scale than SMPs

[Diagrams: UMA - processors share memory over a common bus; NUMA - processors and memories are connected through a network, so access time depends on where the data resides.]
20
Distributed Shared Memory
  • Making the main memory of a cluster of computers
    look as if it is a single memory with a single
    address space
  • Shared memory programming techniques can be used

21
Multicore Systems
  • Many general purpose processors
  • GPU (Graphics Processor Unit)
  • GPGPU (General Purpose GPU)
  • Hybrid
  • The trend is:
  • Board composed of multiple manycore chips sharing
    memory
  • Rack composed of multiple boards
  • A room full of these racks

22
Distributed Systems
  • Clusters
  • Individual computers that are tightly coupled by
    software in a local environment to work together
    on single problems or on related problems
  • Grid
  • Many individual systems that are geographically
    distributed and loosely coupled by software to
    work together on single problems or on related
    problems