Computer Architecture - PowerPoint PPT Presentation

1
Computer Architecture
  • Chapter 8
  • Multiprocessors
  • Shared Memory Architectures
  • Prof. Jerry Breecher
  • CSCI 240
  • Fall 2003

2
Chapter Overview
  • We're going to do only one section from this
    chapter, the part related to how caches from
    multiple processors interact with each other.
  • 8.1 Introduction: the big picture
  • 8.3 Centralized Shared Memory Architectures

3
Introduction
The Big Picture Where are We Now?
  • 8.1 Introduction
  • 8.3 Centralized Shared Memory Architectures

The major issue is this: We've taken copies of
the contents of main memory and put them in
caches closer to the processors. But what
happens to those copies if someone else wants to
use the main memory data? How do we keep all
copies of the data in sync with each other?
4
The Multiprocessor Picture
(Diagram: example Pentium system organization, showing the
processor/memory bus, the PCI bus, and the I/O busses.)
5
Shared Memory Multiprocessor
(Diagram: processors connected through a chipset to memory
and to disk and other I/O.)
  • Memory centralized with Uniform Memory Access
    time (UMA) and bus interconnect, I/O
  • Examples: Sun Enterprise 6000, SGI Challenge,
    Intel SystemPro
6
Shared Memory Multiprocessor
  • Several processors share one address space
  • conceptually a shared memory
  • often implemented just like a multicomputer
  • address space distributed over private memories
  • Communication is implicit
  • read and write accesses to shared memory
    locations
  • Synchronization
  • via shared memory locations
  • spin waiting for non-zero
  • barriers

(Conceptual model: processors P connected by a network/bus
to memory M.)
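The synchronization idea on this slide, communication through shared locations plus "spin waiting for non-zero", can be sketched with two threads. This is an illustrative sketch, not from the slides; the names `shared`, `producer`, and `consumer` are made up, and Python's interpreter lock stands in for a coherent shared memory.

```python
import threading

# One thread writes a shared location and raises a flag; the other
# spin-waits for the flag to become non-zero, then reads the data.
shared = {"data": 0, "ready": 0}
result = []

def producer():
    shared["data"] = 42     # write to the shared memory location
    shared["ready"] = 1     # publish: flag becomes non-zero

def consumer():
    while shared["ready"] == 0:   # spin waiting for non-zero
        pass                      # busy-wait, no explicit message passing
    result.append(shared["data"])

t = threading.Thread(target=consumer)
t.start()
producer()
t.join()
print(result[0])  # prints 42
```

Note that communication here is implicit, exactly as the bullet says: the consumer never receives a message, it simply reads the locations the producer wrote.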
7
Message Passing Multicomputers
  • Computers (nodes) connected by a network
  • Fast network interface
  • Send, receive, barrier
  • Nodes no different from a regular PC or
    workstation
  • Cluster: conventional workstations or PCs with a
    fast network
  • cluster computing
  • Berkeley NOW
  • IBM SP2

8
Large-Scale MP Designs
  • Memory distributed with nonuniform memory access
    time (NUMA) and scalable interconnect
    (distributed memory)

(Diagram: access latencies of 1 cycle, 40 cycles, and 100
cycles across a low-latency, high-reliability interconnect.)
9
Shared Memory Architectures
8.1 Introduction 8.3 Centralized Shared
Memory Architectures
  • In this section we will examine the issues
    around:
  • Sharing one memory space among several
    processors.
  • Maintaining coherence among several copies of a
    data item.

10
The Problem of Cache Coherency
Shared Memory Architectures
(Diagram: three snapshots of a CPU, its cache, memory, and an
I/O device, with locations A and B.)
a) Cache and memory coherent: cache and memory both hold
   A = 100, B = 200.
b) Cache and memory incoherent for A: the CPU writes A = 550
   into the cache, but memory still holds A = 100, so I/O
   output of A gives the stale value 100.
c) Cache and memory incoherent for B: I/O inputs B = 440
   directly into memory, but the cache still holds B = 200.
11
Some Simple Definitions
Shared Memory Architectures
Mechanism: Write Back
How it works: Write modified data from cache to memory only
when necessary.
Performance: Good, because it doesn't tie up memory bandwidth.
Coherency issues: Various copies can contain different values.

Mechanism: Write Through
How it works: Write modified data from cache to memory
immediately.
Performance: Not so good - uses a lot of memory bandwidth.
Coherency issues: None - modified values are always written to
memory, so the data always matches.
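The bandwidth trade-off between the two mechanisms can be made concrete with a toy count of memory writes for a run of stores to one cached block. This is a sketch with made-up numbers, not a real cache simulator:

```python
# Count memory writes for repeated stores to one cached block
# under each policy from the table above.
def memory_writes(policy, stores_to_block):
    writes = 0
    dirty = False
    for _ in range(stores_to_block):
        if policy == "write-through":
            writes += 1      # every store goes straight to memory
        else:
            dirty = True     # write-back: just mark the line dirty
    if policy == "write-back" and dirty:
        writes += 1          # single write-back when the line is evicted
    return writes

print(memory_writes("write-through", 100))  # 100 memory writes
print(memory_writes("write-back", 100))     # 1 memory write (at eviction)
```

This is why write back "doesn't tie up memory bandwidth": a hundred stores cost one memory transaction instead of a hundred, at the price of the cache and memory temporarily disagreeing.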
12
What Does Coherency Mean?
Shared Memory Architectures
  • Informally
  • Any read must return the most recent write
  • Too strict and too difficult to implement
  • Better
  • Any write must eventually be seen by a read
  • All writes are seen in proper order
    (serialization)
  • Two rules to ensure this:
  • If P writes x and P1 reads it, P's write will be
    seen by P1 if the read and write are sufficiently
    far apart
  • Writes to a single location are serialized: seen
    in one order
  • Latest write will be seen
  • Otherwise could see writes in illogical order
    (could see older value after a newer value)

13
There are Different Types of Memory In The Cache
Shared Memory Architectures
Test_and_set(lock)
shared_data = xyz
Clear(lock)
  • What kinds of memory are there in the cache?

TYPE            Shared?    Writable?  How Kept Coherent
Code            Shared     No         No need.
Private Data    Exclusive  Yes        Write Back
Shared Data     Shared     Yes        Write Back
Interlock Data  Shared     Yes        Write Through
Write back gives good performance for private and shared
data. For interlock data you want write through, even though
it costs some performance, because writing through flushes
the cache and means the lock state is seen by the other
processors immediately.
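The Test_and_set / Clear sequence above is a spin lock. A hedged sketch of the same protocol, using a Python lock's non-blocking `acquire` as a stand-in for the atomic test-and-set instruction (the names `critical_section` and `counter` are illustrative):

```python
import threading

lock = threading.Lock()
counter = 0   # stands in for shared_data

def critical_section():
    global counter
    while not lock.acquire(blocking=False):  # test_and_set: spin until we win
        pass
    counter += 1                             # update shared_data under the lock
    lock.release()                           # clear(lock)

threads = [threading.Thread(target=critical_section) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # prints 8: every increment survived
```

Each processor spins on the lock word, which is why the slide keeps interlock data write through: a spinner must see the lock's new state as soon as the holder clears it.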
14
Potential HW Coherency Solutions
Shared Memory Architectures
  • Snooping Solution (Snoopy Bus)
  • Send all requests for data to all processors
  • Processors snoop to see if they have a copy and
    respond accordingly
  • Requires broadcast, since caching information is
    at processors
  • Works well with bus (natural broadcast medium)
  • Dominates for small scale machines (most of the
    market)
  • Directory-Based Schemes
  • Keep track of what is being shared in one
    centralized place
  • Distributed memory => distributed directory for
    scalability (avoids bottlenecks)
  • Send point-to-point requests to processors via
    network
  • Scales better than Snooping
  • Actually existed BEFORE Snooping-based schemes

15
An Example Snoopy Protocol, Maintained by Hardware
Shared Memory Architectures
  • Invalidation protocol, write-back cache
  • Each block of memory is in one state
  • Clean in all caches and up-to-date in memory
    (Shared)
  • OR Dirty in exactly one cache (Exclusive)
  • OR Not in any caches
  • Each cache block is in one state (track these)
  • Shared: block can be read
  • OR Exclusive: the cache has the only copy, it's
    writeable, and dirty
  • OR Invalid: block contains no data
  • Read misses cause all caches to snoop bus
  • Writes to clean line are treated as misses

16
Snoopy-Cache State Machine-I
Shared Memory Architectures
  • State machine for CPU requests, for each cache
    block (applies to write-back data)
  • Invalid -> Shared on CPU read: place read miss
    on bus
  • Invalid -> Exclusive on CPU write: place write
    miss on bus
  • Shared -> Shared on CPU read hit
  • Shared -> Shared on CPU read miss: place read
    miss on bus
  • Shared -> Exclusive on CPU write: place write
    miss on bus
  • Exclusive -> Exclusive on CPU read hit or CPU
    write hit
  • Exclusive -> Shared on CPU read miss: write back
    block, place read miss on bus
  • Exclusive -> Exclusive on CPU write miss: write
    back cache block, place write miss on bus
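One way to read the transitions above is as a function from the block's state and the CPU request to a next state and a bus action. The sketch below mirrors that reading; it is an assumption about the diagram's intent, not the full protocol of Appendix E.

```python
# Returns (next_state, bus_action, must_write_back) for a CPU
# request against a write-back cache block in the snoopy protocol.
def cpu_request(state, op, hit):
    if state == "Invalid":
        if op == "read":
            return ("Shared", "read miss on bus", False)
        return ("Exclusive", "write miss on bus", False)
    if state == "Shared":
        if op == "read" and hit:
            return ("Shared", None, False)        # read hit: no bus traffic
        if op == "read":
            return ("Shared", "read miss on bus", False)
        return ("Exclusive", "write miss on bus", False)  # write treated as miss
    # state == "Exclusive"
    if hit:
        return ("Exclusive", None, False)         # read or write hit
    if op == "read":                              # read miss: evict dirty block
        return ("Shared", "read miss on bus", True)
    return ("Exclusive", "write miss on bus", True)
```

For example, a write to a Shared block places a write miss on the bus even though the data is present, because the other caches must be told to invalidate their copies.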
17
Snoopy-Cache State Machine-II
Shared Memory Architectures
  • State machine for bus requests, for each cache
    block
  • Appendix E gives details of bus requests
  • Shared -> Invalid on a write miss for this block
  • Exclusive -> Invalid on a write miss for this
    block: write back block (abort memory access)
  • Exclusive -> Shared on a read miss for this
    block: write back block (abort memory access)
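The bus side can be sketched the same way: a snooping cache watches the bus and reacts to another processor's miss for a block it holds. Again this is a sketch of the diagram, not Appendix E's exact specification.

```python
# Returns (next_state, must_write_back) when this cache snoops
# another processor's miss for a block it currently holds.
def bus_request(state, bus_op):
    if state == "Shared" and bus_op == "write miss":
        return ("Invalid", False)     # another writer: drop our clean copy
    if state == "Exclusive" and bus_op == "write miss":
        return ("Invalid", True)      # write back dirty data, abort memory access
    if state == "Exclusive" and bus_op == "read miss":
        return ("Shared", True)       # supply dirty data, downgrade to Shared
    return (state, False)             # Invalid blocks, or a read miss on Shared
```

The "abort memory access" in the diagram is the Exclusive owner intervening: memory's copy is stale, so the owner supplies (and writes back) the data instead.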
18
Example
Shared Memory Architectures
(Table: columns for Processor 1, Processor 2, Bus, and
Memory; this is the cache for P1.)
Assumes the initial cache state is invalid and A1 and A2 map
to the same cache block, but A1 ≠ A2.
19
Example Step 1
Shared Memory Architectures
20
Example Step 2
Shared Memory Architectures
Assumes the initial cache state is invalid and A1 and A2 map
to the same cache block, but A1 ≠ A2.
21
Example Step 3
Shared Memory Architectures
Assumes the initial cache state is invalid and A1 and A2 map
to the same cache block, but A1 ≠ A2.
22
Example Step 4
Shared Memory Architectures
Assumes the initial cache state is invalid and A1 and A2 map
to the same cache block, but A1 ≠ A2.
23
Example Step 5
Shared Memory Architectures
Assumes the initial cache state is invalid and A1 and A2 map
to the same cache block, but A1 ≠ A2.
24
Summary
  • 8.1 Introduction: the big picture
  • 8.3 Centralized Shared Memory Architectures
  • We've looked at what happens to caches when we
    have multiple processors or devices looking at
    memory.