CS 2200 IO - PowerPoint PPT Presentation

Provided by: michaelt8

1
CS 2200 I/O
  • (Lectures based on the work of Jay Brockman,
    Sharon Hu, Randy Katz, Peter Kogge, Bill Leahy,
    Ken MacKenzie, Richard Murphy, and Michael
    Niemier)

2
What is it exactly?
  • To anyone in computer science or computer
    engineering, I/O probably has many different
    meanings
  • My research in computer architecture focuses on
    processor design
  • So I/O generally just involves a processor/memory
    interface
  • For a DRAM chip designer, I/O might involve:
  • A processor/memory interface
  • A memory/disk interface
  • For an OS designer, I/O might be:
  • An interrupt from a device, input from the user,
    etc.
  • Basically, it can mean lots of different things
  • In computer architecture, levels of the memory
    hierarchy beyond main memory are often ignored

3
Why study I/O?
  • We've talked a lot about the CPU time metric
  • (In fact, I've probably stressed it quite a bit!)
  • CPU time is important
  • for measuring how fast an instruction or program
    is actually executed
  • But what's perhaps more important is response
    time
  • The time between when the user types a command
    and when the results appear
  • This might be a better measure of performance
  • A brief study of I/O will help complete the
    picture of a general computer architecture or
    organization

4
A quick example
  • Response time is 10% longer than CPU time
  • (So, I/O overhead adds 10% to our execution time)
  • Can speed up the CPU by a factor of 10, but I/O
    overhead/time will stay the same
  • Amdahl's law!
  • Only a speedup of 5.5! ½ of CPU improvement is
    wasted
  • What if we make the CPU 100 times faster?
  • Speedup of 10! 90% of speedup wasted!
  • With CPU performance skyrocketing, if we don't
    improve I/O, all tasks will just become I/O
    bound
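
The two speedups above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name `overall_speedup` and the normalization of CPU time to 1.0 are assumptions made for illustration, not part of the lecture.

```python
def overall_speedup(cpu_time, io_time, cpu_speedup):
    """Speedup of total response time when only the CPU part gets faster."""
    old_total = cpu_time + io_time
    new_total = cpu_time / cpu_speedup + io_time
    return old_total / new_total


# Normalize CPU time to 1.0; I/O adds 10% on top, as on the slide.
cpu, io = 1.0, 0.1
for factor in (10, 100):
    s = overall_speedup(cpu, io, factor)
    print(f"CPU {factor}x faster -> overall speedup {s:.1f}")
```

The I/O term stays fixed in the denominator, so no CPU improvement can push the overall speedup past (1.0 + 0.1) / 0.1 = 11 in this example.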

5
Our Road Map
Processor
Memory Hierarchy
I/O Subsystem
Parallel Systems
Networking
6
Five Classic Components of a Computer System:
all computers since 1946
7
... and the software abstractions atop them!
operating systems, networking
Computation: Processes, Threads
Communication: I/O devices, the internet
Storage: Virtual Memory, Files
8
I/O Plan
  • I/O devices in general
  • magnetic disks in particular
  • networks in particular
  • Hardware interface issues
  • tradeoff of performance and convenience
  • dealing with external events
  • Software abstractions
  • example: filesystems
  • disk head scheduling
  • POSIX models all I/O as files; device drivers

9
I/O Types and Rates
  Device            Behavior  Partner  Data Rate (kb/s)
  Keyboard          I         Human    0.01
  Mouse             I         Human    0.02
  Voice Input       I         Human    0.02
  Scanner           I         Human    400
  Voice Output      O         Human    0.6
  Line Printer      O         Human    1
  Laser Printer     O         Human    200
  Graphics Display  O         Human    60,000
  Modem             IO        Machine  8
  Network           IO        Machine  6,000
  Floppy Disk       S         Machine  100
  Optical Disk      S         Machine  1,000
  Magnetic Tape     S         Machine  2,000
  Magnetic Disk     S         Machine  10,000

10
Mouse
"I got the idea for the mouse while attending a
talk at a computer conference. The speaker was so
boring that I started daydreaming and hit upon
the idea." - Doug Engelbart
  • Uses mechanical counters or optical devices to
    generate pulses which increment or decrement
    counters
  • Counter values determined by polling.

11
Magnetic Disks
  • Drums
  • Disks
  • Removable disk packs
  • Floppy disk
  • Invented for IBM Field Engineers
  • Contact
  • Slow speed

12
Magnetic Disks
  • Most common form of long term, rewriteable
    storage devices
  • Usually considered the lowest level of memory
    hierarchy
  • How does a magnetic disk work?
  • Collection of platters rotates on a spindle at
    some RPM
  • Platters are metal disks covered with magnetic
    recording material on both sides
  • Disk diameters can vary
  • Usually, wider disks are faster and narrower
    disks are cheaper
  • Disk surface divided into tracks which are
    divided into sectors
  • Sectors are the smallest unit that can be written

13
A disk, pictorially
  • When accessing data we read or write to a sector
  • All sectors are the same size; outer tracks are
    just less dense
  • To read or write, a moveable arm with a read/write
    head moves over each surface
  • Cylinder: all tracks under the arms at a given
    point on all surfaces
  • To read or write:
  • Disk controller moves the arm over the proper
    track (a seek)
  • The time to move is called the seek time
  • When sector found, data is transferred

14
Disk Terminology
Cylinder: track 'x' on all platters/surfaces
15
The speed of light? No.
  • Time required for a requested sector to
    rotate under the read/write head is called the
    rotation latency or rotational delay
  • Mechanical components: on the order of
    milliseconds
  • No longer moving at the speed of light like in
    our CPU!
  • Time required to actually write or read data is
    called the transfer time
  • (a function of block size, rotation speed,
    recording density on a track, and speed of the
    electronics connecting the disk to the computer)

16
Disk odds 'n' ends
  • Often transfer time is a very small portion of a
    full access
  • It's possible to use techniques (discussed in
    caches) to help reduce disk overhead. Any
    thoughts?
  • To help reduce complexity, there's usually
    additional HW called a disk controller
  • Disk controller helps manage disk accesses
  • but also adds more overhead: controller time
  • (Can also have a queuing delay)
  • (Time spent waiting for a disk to become free if
    it's already in use for another access)

17
Example average disk access time
  • What is the average time to read or write a
    512-byte sector for a typical disk?
  • The average seek time is given to be 9 ms
  • The transfer rate is 4 MB per second
  • The disk rotates at 7200 RPM
  • The controller overhead is 1 ms
  • The disk is currently idle before any requests
    are made (so there is no queuing delay)
  • Average disk access time = average seek time +
    average rotational delay + transfer time +
    controller overhead
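
A quick way to work the example is to plug the slide's numbers into the sum of the four terms. The sketch below assumes that the average rotational delay is half a revolution and that 1 MB means 10^6 bytes; the function name `avg_disk_access_ms` is made up for illustration.

```python
def avg_disk_access_ms(seek_ms, rpm, transfer_mb_per_s, sector_bytes, controller_ms):
    """Average access time in ms: seek + rotational delay + transfer + controller."""
    # Assumption: on average the sector is half a revolution away.
    rotation_ms = 0.5 * 60_000 / rpm
    # Assumption: 1 MB = 10**6 bytes.
    transfer_ms = sector_bytes / (transfer_mb_per_s * 1e6) * 1000
    return seek_ms + rotation_ms + transfer_ms + controller_ms


total = avg_disk_access_ms(seek_ms=9, rpm=7200, transfer_mb_per_s=4,
                           sector_bytes=512, controller_ms=1)
print(f"average access time: {total:.1f} ms")
```

With these inputs the seek and rotation terms account for about 13 of the roughly 14 ms, which is why transfer time is often a small part of a full access.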

18
Capacity trends and disks
  • Improvements in disk capacity are usually
    expressed as areal density

Cost for 1 GB of magnetic disk space has
decreased (and will decrease) almost exponentially
over time!
19
Magnetic Disks short overview
  • Hard disk
  • Higher speed (3600-7200 RPM)
  • Larger
  • Higher Density
  • Multiple platters
  • Performance
  • Seek time (8-20 ms or faster)
  • Rotational latency (4-8 ms)
  • Transfer rate (2-40 MB/sec)

20
Disk Latency
Disk Latency = Queuing Time + Controller Time +
Seek Time + Rotation Time + Transfer Time
Order of magnitude times for 4K byte transfers:
Seek: 8 ms or less
Rotate: 4.2 ms @ 7200 rpm
Transfer: 1 ms @ 7200 rpm
21
Technology Trends
Disk capacity now doubles every 18 months;
before 1990, every 36 months
  • Today: processing power doubles every 18 months
  • Today: memory size doubles every 18 months
    (4X/3yr)
  • Today: disk capacity doubles every 18 months
  • Disk positioning rate (seek + rotate) doubles
    every ten years!
The I/O GAP
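
To get a feel for the size of the gap, compare how much each quantity grows over the same ten years. This is a rough sketch using only the doubling periods quoted above; `growth_factor` is a hypothetical helper name.

```python
def growth_factor(years, doubling_period_years):
    """How many times larger a quantity gets if it doubles every doubling_period_years."""
    return 2 ** (years / doubling_period_years)


years = 10
capacity = growth_factor(years, 1.5)      # doubles every 18 months
positioning = growth_factor(years, 10)    # doubles every ten years
print(f"capacity grows ~{capacity:.0f}x, positioning rate grows {positioning:.0f}x")
print(f"gap after {years} years: ~{capacity / positioning:.0f}x")
```

Over a decade, capacity grows by roughly two orders of magnitude while seek plus rotate only doubles, so each access has to cover far more data per unit of positioning speed.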
22
Historical Perspective
  • 1956 IBM RAMAC to early 1970s Winchester
  • Developed for mainframes
  • Had proprietary interfaces
  • Steady shrink in form factor: 27 in. to 14 in.
  • 1970s developments
  • 5.25 inch floppy disk form factor (microcode into
    mainframe)
  • early emergence of industry standard disk
    interfaces
  • ST506, SASI, SMD, ESDI

23
Historical Perspective
  • Early 1980s
  • PCs and first generation workstations
  • Mid 1980s
  • Client/server computing
  • Centralized storage on file server
  • accelerates disk downsizing: 8 inch to 5.25 inch
  • Mass market disk drives become a reality
  • industry standards: SCSI, IPI, IDE
  • 5.25 inch drives for standalone PCs; end of
    proprietary interfaces

24
Disk History
Year  Data density (Mbit/sq. in.)  Capacity of unit shown (MBytes)
1973  1.7                          140
1979  7.7                          2,300
Source: New York Times, 2/23/98, page C3,
"Makers of disk drives crowd even more data into
even smaller spaces"
25
Historical Perspective
  • Late 1980s/Early 1990s
  • Laptops, notebooks, (palmtops)
  • 3.5 inch, 2.5 inch, (1.8 inch formfactors)
  • Form factor plus capacity drives the market, not
    so much performance
  • Recently, bandwidth improving at 40% per year
  • Challenged by DRAM, flash RAM in PCMCIA cards
  • still expensive
  • unattractive MBytes per cubic inch
  • Optical disk fails on performance (e.g., NeXT) but
    finds niche (CD ROM)

26
Disk History
Year  Data density (Mbit/sq. in.)  Capacity of unit shown (MBytes)
1989  63                           60,000
1997  1450                         2300
1997  3090                         8100
Source: New York Times, 2/23/98, page C3,
"Makers of disk drives crowd even more data into
even smaller spaces"
27
Magnetic Disks
illustration source unknown
28
Second Major Example Networks
  • Examples
  • System Area Networks (SP2): 100s of nodes, 25
    meters per link
  • Local Area Networks (Ethernet): 100s of nodes,
    1000 meters
  • Wide Area Network (ATM): 1000s of nodes,
    5,000,000 meters

a.k.a. end systems, hosts
a.k.a. network, communication subnet
Interconnection Network
29
ABCs of Networks
  • Starting point: send bits between 2 computers
  • Queue (FIFO) on each end
  • Information sent is called a message
  • Can send both ways (Full Duplex)
  • Rules for communication? A protocol
  • Inside a computer:
  • Loads/Stores: Request (Address) and Response (Data)
  • Need Request and Response signaling

30
Trivial Example
  • What is the format of the message?
  • Fixed? Number of bytes?

Field:  Request/Response  Address/Data
Width:  1 bit             32 bits
0: Please send data from Address
1: Packet contains data corresponding to request
  • Header/Trailer: information to deliver a message
  • Payload: data in message (1 word above)
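
The 33-bit format above is small enough to sketch directly. The code below is one possible encoding, assuming the 1-bit header sits above the 32-bit payload in a single integer; the names `pack` and `unpack` are hypothetical.

```python
REQUEST, RESPONSE = 0, 1   # header bit values from the slide


def pack(kind, word):
    """Build a 33-bit message: 1 header bit followed by a 32-bit payload."""
    assert kind in (REQUEST, RESPONSE)
    assert 0 <= word < 2**32
    return (kind << 32) | word


def unpack(message):
    """Split a 33-bit message back into (header bit, 32-bit payload)."""
    return message >> 32, message & 0xFFFFFFFF


request = pack(REQUEST, 0x1000)        # "please send data from address 0x1000"
reply = pack(RESPONSE, 0xCAFEF00D)     # "here is the data you asked for"
print(unpack(request), unpack(reply))
```

Here the payload is an address in one direction and data in the other; only the header bit tells the receiver which.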

31
Extensions
  • What if more than 2 computers want to
    communicate?
  • Need computer address field (destination) in
    packet
  • What if packet is garbled in transit?
  • Add error detection field in packet (e.g., CRC)
  • What if packet is lost?
  • More elaborate protocols to detect loss
  • What if multiple processes/machine?
  • Queue per process to provide protection
  • Simple questions such as these lead to elaborate
    protocols and packet formats => complexity
  • note: complexity often => slow

32
A Simple Example Revisited
  • What is the format of the packet?
  • Fixed? Number of bytes?

Field:  Code    Address/Data  CRC
Width:  2 bits  32 bits       4 bits
00: Request (please send data from Address)
01: Reply (packet contains data corresponding to
    request)
10: Acknowledge request
11: Acknowledge reply
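
The 38-bit packet can be sketched the same way, with a 4-bit check field computed over the code and address/data bits. The slide does not say which CRC is used, so the generator x^4 + x + 1 below is an assumption, as are the function names.

```python
POLY = 0b10011  # x^4 + x + 1: an assumed generator, the slide names none


def crc4(value, nbits):
    """Remainder of value * x^4 divided by POLY, using mod-2 arithmetic."""
    reg = value << 4
    for shift in range(nbits - 1, -1, -1):
        if reg & (1 << (shift + 4)):
            reg ^= POLY << shift      # cancel the current top bit
    return reg & 0xF


def make_packet(code, word):
    """Lay out the packet as code (2 bits) | address/data (32 bits) | CRC (4 bits)."""
    head = (code << 32) | word
    return (head << 4) | crc4(head, 34)


def check_packet(packet):
    """Recompute the CRC over the first 34 bits and compare with the trailer."""
    head, crc = packet >> 4, packet & 0xF
    return crc4(head, 34) == crc


reply = make_packet(0b01, 0xDEADBEEF)
garbled = reply ^ (1 << 10)           # flip one bit "in transit"
print(check_packet(reply), check_packet(garbled))
```

A receiver that sees a failed check can simply not send the matching acknowledge (code 10 or 11), which is how the sender would find out.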
33
Network Media
  • There are different ways to connect computers
    together
  • Can kind of think of it like a memory hierarchy
  • Different kinds of media vary in cost,
    performance, and reliability
  • There are several different kinds we'll consider
  • Twisted Pair
  • Coaxial Cable
  • Fiber Optics
  • Air
  • (first, see board for summary discussion)

34
Twisted pair media
  • Just a twisted pair of copper wires
  • Insulated, about 1mm thick
  • Data transfer speeds of
  • A few Mb/s over a few kilometers
  • 10s of Mb/s over shorter distances
  • Uses
  • Used lots in the telephone industry
  • OK for LANs because of reasonable data transfer
    rates

35
Coaxial (coax) cable
  • A picture of it is included below
  • Pretty complicated (and expensive) for a wire
  • But very good signal propagation properties
  • Good bandwidth
  • 10s of Mb/s over a kilometer
  • Good for LAN

36
Fiber optics
  • Replaces copper with plastic and electrons with
    light
  • Usually, 3 basic components:
  • Transmission medium: fiber optic cable
  • Light source: LED or laser diode
  • Light detector: photodiode
  • A simplex medium: data can only go in 1 direction
  • But goes really fast (many Gb/s) and far (100s of
    km)

37
Some comparisons
38
The bottom line
  • Bandwidth problems can be fixed
  • More money, more wires
  • Improving your latency is somewhat more
    difficult
  • After all, 299792.5 km/s is kinda fixed
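
A small calculation shows why bandwidth is the easier problem. The sketch below models one-way delivery time as propagation delay plus transmission time and ignores all other overheads; the 1000-bit message and the two link speeds are made-up inputs, and real signals in copper or fiber travel slower than the speed of light used here.

```python
C_KM_PER_S = 299_792.5   # speed of light, as quoted on the slide


def one_way_ms(distance_m, message_bits, bandwidth_bits_per_s):
    """Propagation delay plus transmission time, in milliseconds."""
    propagation_s = (distance_m / 1000) / C_KM_PER_S
    transmission_s = message_bits / bandwidth_bits_per_s
    return (propagation_s + transmission_s) * 1000


wan_m = 5_000_000        # the wide-area distance from the earlier network examples
for bandwidth in (10e6, 10e9):
    t = one_way_ms(wan_m, 1000, bandwidth)
    print(f"{bandwidth:.0e} b/s -> {t:.2f} ms")
```

Buying a thousand times more bandwidth changes the total by about a tenth of a millisecond; the rest is distance divided by the speed of light.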

39
I/O Device Summary
  • Disks and networks are very different, but
    consider these similarities:
  • Data handled in batches (sectors, messages)
  • Lots of waiting around for external events
  • Compatibility is important (more than
    performance)
  • Reliability is important (and requires work to
    achieve)
  • Slow devices are simple (and boring)
  • Fast devices may be substantially autonomous
  • graphics

40
I/O Hardware Interface Issues
41
I/O Hardware
  • Basic memory-map w/polling and/or interrupts
  • Project 2!
  • Advanced bus issues
  • Performance vs. compatibility -> multiple busses
  • Namespaces
  • Smart device controllers
  • Direct Memory Access (DMA)
  • Arbitration
  • Caching issues
  • I/O processors
  • the wheel of reincarnation
  • (see board for preliminary examples)

42
Basic I/O: devices as memory, à la project 2
43
Performance vs. Compatibility
  • Problem
  • Processor - memory is a performance-crucial path
    ... improve as often as possible!
  • I/O controllers made by many vendors ... change
    is expensive!

45
Multiple Busses
  • Cache Bus: e.g. 256b, 533MHz
  • Memory Bus: e.g. 64b, 533MHz
  • I/O Bus (e.g. PCI): e.g. 64b, 66MHz
  • Disk Drive Bus: e.g. SCSI 16b, 20MHz
  • Diagram components: Processor (with interrupts),
    Cache, bridge, Main Memory, I/O Controllers,
    Graphics, Disks, Network
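
The widths and clock rates above translate into very different peak bandwidths. The sketch below assumes one transfer per clock cycle, which is a simplification (real busses lose cycles to arbitration and addressing); `peak_mb_per_s` is a hypothetical helper.

```python
# (width in bits, clock in MHz), taken from the example figures on the slide
BUSES = {
    "Cache bus": (256, 533),
    "Memory bus": (64, 533),
    "I/O bus": (64, 66),
    "Disk drive bus (SCSI)": (16, 20),
}


def peak_mb_per_s(width_bits, clock_mhz):
    """Peak bandwidth in MB/s, assuming one full-width transfer per clock."""
    return width_bits / 8 * clock_mhz


for name, (width, clock) in BUSES.items():
    print(f"{name:22s} {peak_mb_per_s(width, clock):8.0f} MB/s")
```

Each step away from the processor trades bandwidth for stability: the slowest bus is the one with the most third-party devices hanging off it.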
46
Smart Device Controllers
47
Polling
  • Computer
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Controller

48
Polling
  • Computer
  • Busy bit set? No.
  • Set write bit in command register
  • Write a byte (or word) of data to Data-out
  • Set command ready bit in control register
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Controller
  • Controller clears busy bit
  • Sees command ready
  • Set busy bit

49
Polling
  • Computer
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? Yes.
  • Busy bit set? No.
  • Controller
  • Checks write bit
  • Reads data-out
  • Does I/O with device
  • Clears command ready bit
  • Clears error bit
  • Clears busy bit
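
The handshake on the last three slides can be simulated in a few lines. This is a toy model, not the project 2 hardware: the register names follow the slides, but the `Controller` class, its `step` method and the fixed `io_steps` delay are assumptions made so the example runs on its own.

```python
class Controller:
    """Toy device controller with the register bits named on the slides."""

    def __init__(self, io_steps=3):
        self.busy = False
        self.command_ready = False
        self.write = False
        self.error = False
        self.data_out = None
        self.device = []            # bytes that reached the simulated device
        self._io_steps = io_steps   # how many time steps one I/O takes
        self._remaining = 0

    def step(self):
        """Advance the controller by one time step."""
        if self.command_ready and not self.busy:
            self.busy = True        # sees command ready, sets busy bit
            self._remaining = self._io_steps
        elif self.busy:
            self._remaining -= 1
            if self._remaining == 0:
                if self.write:      # checks write bit, reads data-out
                    self.device.append(self.data_out)
                self.command_ready = False
                self.error = False
                self.busy = False


def polled_write(ctrl, byte):
    """Write one byte by polling; return how many times the busy bit was read."""
    polls = 0
    while True:                     # wait until the controller is idle
        polls += 1
        if not ctrl.busy:
            break
        ctrl.step()
    ctrl.write = True               # set write bit in command register
    ctrl.data_out = byte            # write the data to data-out
    ctrl.command_ready = True       # set command ready bit
    ctrl.step()                     # controller notices the command
    while True:                     # spin until the busy bit clears
        polls += 1
        if not ctrl.busy:
            return polls
        ctrl.step()


ctrl = Controller(io_steps=3)
print("polls:", polled_write(ctrl, 0x41), "device:", ctrl.device)
```

Every extra step the device takes costs the processor one more wasted read of the busy bit, which is the inefficiency the next slide points out.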

50
Polling
  • Appropriate when controller and device very fast
  • Very inefficient when controller mostly busy
  • Better solution...Interrupts

51
(Recall) Interrupt Mechanism Hardware
Address Bus
Processor
Data Bus
Int
Inta
Device 1
Device 2
If the processor decides to handle the interrupt,
it asserts the inta (interrupt acknowledge) line
52
(Recall) Interrupts/Exceptions/Traps protection
I/O (kernel) space
a loop
user space
PC (mem. addr.)
a system call
kernel space
an interrupt
time
53
(Recall) Process States
New
Ready
Running
Waiting
Terminated
A longer example using these states later on in
lecture
54
Interrupts
  • Used by I/O controllers to communicate to
    Processor
  • Also used by applications to communicate with OS
  • Software Interrupt or Trap
  • OS can now use same device registers as before

55
Interrupts
  • Processor
  • Initiate I/O
  • Context switch to something else
  • Receive interrupt, transfer to handler
  • Interrupt handler processes data, returns from
    interrupt
  • Resume processing of interrupted task
  • I/O Controller
  • Initiate I/O with physical device
  • Completion (good or bad)
  • Generate interrupt
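
The sequence above can be traced with a small time-step simulation. It is only a sketch: one loop iteration stands for one time step, the controller runs in parallel with the processor, and the function name and trace strings are invented for illustration.

```python
def simulate_interrupt_io(io_latency, other_work_units):
    """Return a trace of what the processor does at each time step."""
    trace = ["initiate I/O", "context switch"]
    remaining_io = io_latency
    interrupt_pending = False
    handled = False
    done_work = 0
    while done_work < other_work_units or not handled:
        # Processor side: service a pending interrupt first, else do other work.
        if interrupt_pending:
            trace.append("interrupt handler")
            interrupt_pending = False
            handled = True
        elif done_work < other_work_units:
            trace.append("other task")
            done_work += 1
        else:
            trace.append("idle")
        # Controller side: the device makes progress during the same step.
        if remaining_io > 0:
            remaining_io -= 1
            if remaining_io == 0:
                interrupt_pending = True   # completion: generate interrupt
    return trace


for step in simulate_interrupt_io(io_latency=3, other_work_units=5):
    print(step)
```

Unlike the polling loop, the processor spends the device's three time steps on another task and is only diverted for the single handler step.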

56
DMA
  • Preceding scheme is effective, but transfers of
    large blocks of data cause lots of interrupts
  • And use a sophisticated general-purpose
    processor for a very specialized function (moving
    data around)
  • Solution: add enough processing power to the device
    controller (and possibly bus controller) to allow
    direct transfer between device and memory.

57
DMA
N
Processor tells controller to make DMA
transfer. Assume disk to memory. (Request includes
N, the number of bytes)
58
DMA
N
Controller gets sector of data from disk.
59
DMA
N-1
Controller transfers one word to memory and
updates count. Checks for termination. If not...
60
DMA
N-2
Controller transfers one word to memory and
updates count. Checks for termination. If not...
61
DMA
N-3
Controller transfers one word to memory and
updates count. Checks for termination. If not...
62
DMA
N-4
Controller transfers one word to memory and
updates count. Checks for termination. If not...
63
DMA
N-5
Controller transfers one word to memory and
updates count. Checks for termination. If not...
64
DMA
0
Controller transfers one word to memory and
updates count. Checks for termination. If done...
65
DMA
Controller interrupts processor
66
DMA
Processor acknowledges interrupt
67
DMA
Controller sends interrupt vector
68
DMA
Processor can now have scheduler take
appropriate action (i.e. move process waiting
for I/O into ready queue, etc.)
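
The countdown in the preceding slides amounts to a simple copy loop inside the controller. The sketch below models memory as a `bytearray` and assumes a 4-byte word; `dma_transfer` and its parameters are hypothetical names, and the single interrupt at the end is represented by the function returning.

```python
def dma_transfer(sector, memory, start, word_size=4):
    """Copy sector into memory one word at a time; return the number of word transfers."""
    count = len(sector)              # N: bytes still to move
    offset = 0
    words_moved = 0
    while count > 0:                 # check for termination
        n = min(word_size, count)    # last transfer may be a partial word
        memory[start + offset:start + offset + n] = sector[offset:offset + n]
        offset += n
        count -= n                   # update the count
        words_moved += 1
    return words_moved               # done: controller would now interrupt once


memory = bytearray(16)
sector = b"ABCDEFGH"                 # data the controller fetched from disk
moves = dma_transfer(sector, memory, start=4)
print(moves, "word transfers, 1 interrupt;", bytes(memory))
```

The processor is involved twice, to start the transfer and to take the final interrupt, no matter how many words move in between.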
69
Arbitration
  • DMA implies multiple owners of the bus
  • must decide who owns the bus from cycle to cycle
  • Arbitration
  • Daisy chain
  • Centralized parallel arbitration
  • Distributed arbitration by self selection
  • Distributed arbitration by collision detection
  • (see board for detailed examples and pictures)

70
Daisy Chain
Simple but not fair and slow.
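
Why a daisy chain is unfair can be seen in a short sketch. It assumes the grant signal enters at device 0 and is passed down the chain until a requesting device absorbs it; the function name is made up for illustration.

```python
def daisy_chain_grant(requests):
    """requests[i] is True if device i wants the bus; device 0 is closest to the arbiter."""
    for position, wants_bus in enumerate(requests):
        if wants_bus:
            return position      # grant is absorbed here and never passed on
    return None                  # nobody asked for the bus


# Device 0 and device 2 both keep requesting for five cycles.
grants = [daisy_chain_grant([True, False, True]) for _ in range(5)]
print(grants)
```

As long as device 0 keeps requesting, device 2 never gets the bus, and the grant also has to ripple through every device ahead of the winner, which is the slow part.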
71
Centralized Parallel Arbitration
  • Requires central arbiter
  • Each device has separate line
  • Central arbiter may become bottleneck
  • Used in PCI bus

72
Distributed Arbitration by Self Selection
  • Each device sees all requestors
  • Priority scheme allows each to know if they get
    bus
  • Requires lots of request lines
  • Used by Apple NuBus (backplane)

73
Distributed Arbitration by Collision Detection
  • Devices independently request bus
  • Devices have ability to detect simultaneous
    requests or Collisions.
  • Upon collision a variety of schemes are used to
    select among requestors
  • Used by Ethernet

74
Caching Issues
  • What happens if the processor has a cached copy
    of data when a device does DMA?
  • Short answer is that there's a cache coherence
    problem: the DMA may change memory and the
    processor doesn't see the change. Two solutions:
  • Device driver (software) flushes cache before
    using DMA
  • Elaborate bus hardware maintains consistency by
    checking the cache on every external bus
    transaction
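
The stale-data problem and the software fix can be shown with a toy model. Everything here is an assumption for illustration: memory and cache are plain dictionaries, the cache never evicts, and `ToySystem` is not any real API.

```python
class ToySystem:
    """One memory word, one processor cache, and a DMA path that bypasses the cache."""

    def __init__(self):
        self.memory = {0x100: "old"}
        self.cache = {}

    def cpu_read(self, addr):
        if addr not in self.cache:          # miss: fill from memory
            self.cache[addr] = self.memory[addr]
        return self.cache[addr]             # hit: memory is not consulted

    def dma_write(self, addr, value):
        self.memory[addr] = value           # device writes memory directly

    def flush(self, addr):
        self.cache.pop(addr, None)          # driver drops the cached copy


stale = ToySystem()
stale.cpu_read(0x100)                       # processor caches "old"
stale.dma_write(0x100, "new")
print("without flush:", stale.cpu_read(0x100))

fixed = ToySystem()
fixed.cpu_read(0x100)
fixed.flush(0x100)                          # flush before using DMA
fixed.dma_write(0x100, "new")
print("with flush:   ", fixed.cpu_read(0x100))
```

The hardware alternative on the slide does the equivalent of `flush` automatically by checking the cache on every bus transaction.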

75
wheel of reincarnation
  • Start with simple devices
  • Add cute functionality
  • Add lots of functionality
  • Declare it to be a processor in its own right
  • Repeat...
  • Graphics community has been around this wheel a
    couple of times now.

76
Summary
  • Example Devices
  • often work in blocks
  • spend lots of time waiting
  • Bus Issues
  • memory map w/polling and/or interrupts (project
    2)
  • Performance vs. compatibility -> multiple busses
  • Namespaces
  • Smart device controllers
  • Direct Memory Access (DMA)
  • Arbitration
  • Caching issues
  • I/O processors
  • the wheel of reincarnation