Title: CS 2200 Lecture 18 IO 1
1CS 2200 Lecture 18: I/O (1)
- (Lectures based on the work of Jay Brockman,
Sharon Hu, Randy Katz, Peter Kogge, Bill Leahy,
Ken MacKenzie, Richard Murphy, and Michael
Niemier)
2What is it exactly?
- To anyone in computer science or computer
engineering, I/O probably has many different
meanings
- My research in computer architecture focuses on
processor design
- So I/O generally just involves a processor/memory
interface
- For a DRAM chip designer, I/O might involve
- A processor/memory interface
- A memory/disk interface
- For an OS designer, I/O might be
- An interrupt from a device, input from the user,
etc.
- Basically it can mean lots of different things
- In computer architecture, levels of memory
hierarchy beyond main memory are often ignored
3Why study I/O?
- We've talked a lot about the CPU time metric
- (In fact I've probably stressed it quite a bit!)
- CPU time is important
- for measuring how fast an instruction or program
is actually executed
- But what's perhaps more important is response
time
- The time between when the user types a command
and when the results appear
- This might be a better measure of performance
- A brief study of I/O will help complete the
picture of a general computer architecture or
organization
4A quick example
- The difference between CPU time and response time
is 10%
- (So, I/O overhead basically adds 10% to our
execution time before the user sees results)
- Can speed up CPU by a factor of 10, but I/O
overhead/time will stay the same
- With no changes in the I/O performance, Amdahl's
law states that
- We'll only get a speedup of 5.5! About ½ of the CPU
improvement is wasted
- What if we make the CPU 100 times faster?
- We'll only get a speedup of 10! 90% of the speedup
is wasted!
- With CPU performance skyrocketing, if we don't
improve I/O, tasks will become I/O bound
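The speedups quoted above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the "10" on the slide means that I/O overhead adds 10% on top of the original CPU time, and the function name `response_speedup` is made up for illustration.

```python
def response_speedup(cpu_speedup, io_fraction_of_cpu=0.10):
    """Overall response-time speedup when only the CPU part gets faster.

    Times are normalized so the original CPU time is 1.0. I/O overhead adds
    io_fraction_of_cpu on top of that and is assumed not to improve at all.
    """
    old_time = 1.0 + io_fraction_of_cpu
    new_time = 1.0 / cpu_speedup + io_fraction_of_cpu
    return old_time / new_time


if __name__ == "__main__":
    for factor in (10, 100):
        print(f"CPU {factor}x faster -> response {response_speedup(factor):.1f}x faster")
    # CPU 10x faster -> response 5.5x faster
    # CPU 100x faster -> response 10.0x faster
```

Under these assumptions the fixed I/O term dominates as the CPU gets faster, so the speedup can never exceed 1.1 / 0.1 = 11.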
5Our Road Map
Processor
Memory Hierarchy
I/O Subsystem
Parallel Systems
Networking
6Five Classic Components of a Computer System (all
computers since 1946)
7... and the software abstractions atop them!
operating systems, networking
Computation: Processes, Threads
Communication: I/O devices, the internet
Storage: Virtual Memory, Files
8I/O Plan
- I/O devices in general
- magnetic disks in particular
- networks in particular
- Hardware interface issues
- tradeoff of performance and convenience
- dealing with external events
- Software abstractions
- example: filesystems
- disk head scheduling
- POSIX models all I/O as files; device drivers
9I/O Types and Rates
Device            Behavior  Partner  Data rate (kb/s)
Keyboard          I         Human    0.01
Mouse             I         Human    0.02
Voice Input       I         Human    0.02
Scanner           I         Human    400
Voice Output      O         Human    0.6
Line Printer      O         Human    1
Laser Printer     O         Human    200
Graphics Display  O         Human    60,000
Modem             IO        Machine  8
Network           IO        Machine  6,000
Floppy Disk       S         Machine  100
Optical Disk      S         Machine  1,000
Magnetic Tape     S         Machine  2,000
Magnetic Disk     S         Machine  10,000
(Behavior: I = input, O = output, IO = input and output, S = storage)
10Mouse
"I got the idea for the mouse while attending a
talk at a computer conference. The speaker was so
boring that I started daydreaming and hit upon
the idea." (Doug Engelbart)
- Uses mechanical counters or optical devices to
generate pulses which increment or decrement
counters
- Counter values determined by polling.
11Magnetic Disks
- Drums
- Disks
- Removable disk packs
- Floppy disk
- Invented for IBM Field Engineers
- Contact
- Slow speed
12Magnetic Disks
- Most common form of long term, rewriteable
storage devices
- Usually considered the lowest level of memory
hierarchy
- How does a magnetic disk work?
- Collection of platters rotates on a spindle at
some RPM
- Platters are metal disks covered with magnetic
recording material on both sides
- Disk diameters can vary
- Usually: wider is faster, narrower is cheaper
- Disk surface divided into tracks which are
divided into sectors
- Sectors are the smallest unit that can be written
13A disk, pictorially
- When accessing data we read or write to a sector
- All sectors the same size, outer tracks just less
dense
- To read or write, moveable arm with read/write
head moves over each surface
- Cylinder: all tracks under the arms at a given
point on all surfaces
- To read or write
- Disk controller moves arm over proper track (a
seek)
- The time to move is called the seek time
- When sector found, data is transferred
14Disk Terminology
Cylinder: Track 'x' on all platters/surfaces
15The speed of light? No.
- Time required for a requested sector to
rotate under the read/write head is called the
rotation latency or rotational delay
- Involves mechanical components, on the order of
milliseconds
- i.e. we're no longer moving at the speed of light
like in our CPU!
- Time required to actually write or read data is
called the transfer time
- (a function of block size, rotation speed,
recording density on a track, and speed of the
electronics connecting the disk to the computer)
16Disk odds 'n' ends
- Often transfer time is a very small portion of a
full access
- It's possible to use techniques (discussed in
caches) to help reduce disk overhead. Any
thoughts?
- To help reduce complexity there's usually
additional HW called a disk controller
- Disk controller helps manage disk accesses
- but also adds more overhead: controller time
- (Can also have a queuing delay)
- (Time spent waiting for a disk to become free if
it's already in use for another access)
17Example average disk access time
- What is the average time to read or write a
512-byte sector for a typical disk?
- The average seek time is given to be 9 ms
- The transfer rate is 4 MB per second
- The disk rotates at 7200 RPM
- The controller overhead is 1 ms
- The disk is currently idle before any requests
are made (so there is no queuing delay)
- Average disk access time = average seek time +
average rotational delay + transfer time +
controller overhead
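One way to work this example through, as a sketch rather than the official solution: it assumes the average rotational delay is half a rotation and that 1 MB means 10^6 bytes. The function and parameter names are illustrative only.

```python
def average_access_time_ms(seek_ms, rpm, sector_bytes,
                           transfer_bytes_per_s, controller_ms):
    """Sum the four components of an access to an idle disk, in milliseconds."""
    ms_per_rotation = 60_000.0 / rpm
    # Assumption: on average the sector is half a rotation away from the head.
    rotational_delay_ms = 0.5 * ms_per_rotation
    transfer_ms = sector_bytes / transfer_bytes_per_s * 1000.0
    return seek_ms + rotational_delay_ms + transfer_ms + controller_ms


if __name__ == "__main__":
    total = average_access_time_ms(
        seek_ms=9.0,
        rpm=7200,
        sector_bytes=512,
        transfer_bytes_per_s=4_000_000,  # 4 MB/s, taking 1 MB = 10**6 bytes
        controller_ms=1.0,
    )
    print(f"{total:.1f} ms")  # 14.3 ms
```

Seek (9 ms) and rotation (about 4.2 ms) dominate; the transfer itself takes only about 0.13 ms.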
18Capacity trends and disks
- Capacity of disks usually referred to as areal
density
Cost for 1 GB of magnetic disk space has
decreased (and will keep decreasing) almost
exponentially over time!
19Magnetic Disks short overview
- Hard disk
- Higher speed (3600 - 7200 RPM)
- Larger
- Higher Density
- Multiple platters
- Performance
- Seek time (8-20 ms or faster)
- Rotational latency (4-8 ms)
- Transfer rate 2-40 MB/sec
20Disk Latency
Disk Latency = Queuing Time + Controller Time +
Seek Time + Rotation Time + Transfer Time
Order of magnitude times for 4K byte transfers:
Seek: 8 ms or less
Rotate: 4.2 ms @ 7200 rpm
Transfer: 1 ms @ 7200 rpm
21Technology Trends
Disk capacity now doubles every 18 months;
before 1990, every 36 months
Today: Processing power doubles every 18 months
Today: Memory size doubles every 18 months (4X/3yr)
Today: Disk capacity doubles every 18 months
Disk positioning rate (seek + rotate) doubles
every ten years!
The I/O GAP
22Historical Perspective
- 1956 IBM Ramac to early 1970s Winchester
- Developed for mainframes
- Had proprietary interfaces
- Steady shrink in form factor: 27 in. to 14 in.
- 1970s developments
- 5.25 inch floppy disk formfactor (microcode into
mainframe)
- early emergence of industry standard disk
interfaces
- ST506, SASI, SMD, ESDI
23Historical Perspective
- Early 1980s
- PCs and first generation workstations
- Mid 1980s
- Client/server computing
- Centralized storage on file server
- accelerates disk downsizing: 8 inch to 5.25 inch
- Mass market disk drives become a reality
- industry standards: SCSI, IPI, IDE
- 5.25 inch drives for standalone PCs; end of
proprietary interfaces
24Disk History
Data density (Mbit/sq. in.) and capacity of unit shown (MBytes):
1973: 1.7 Mbit/sq. in., 140 MBytes
1979: 7.7 Mbit/sq. in., 2,300 MBytes
source: New York Times, 2/23/98, page C3,
"Makers of disk drives crowd even more data into
even smaller spaces"
25Historical Perspective
- Late 1980s/Early 1990s
- Laptops, notebooks, (palmtops)
- 3.5 inch, 2.5 inch, (1.8 inch formfactors)
- Formfactor plus capacity drives market, not so
much performance
- Recently bandwidth improving at 40%/year
- Challenged by DRAM, flash RAM in PCMCIA cards
- still expensive
- unattractive MBytes per cubic inch
- Optical disk fails on performance (e.g., NEXT) but
finds niche (CD ROM)
26Disk History
1989: 63 Mbit/sq. in., 60,000 MBytes
1997: 1450 Mbit/sq. in., 2300 MBytes
1997: 3090 Mbit/sq. in., 8100 MBytes
source: New York Times, 2/23/98, page C3,
"Makers of disk drives crowd even more data into
even smaller spaces"
27Something cool
- This iPod mini
- A 4 GB disk in a 2 x 3.6 x 0.5 inch space
28Magnetic Disks
illustration source unknown
29Second Major Example Networks
- Examples
- System Area Networks (SP2): 100s nodes, 25
meters per link
- Local Area Networks (Ethernet): 100s nodes,
1000 meters
- Wide Area Network (ATM): 1000s nodes, 5,000,000
meters
a.k.a. end systems, hosts
a.k.a. network, communication subnet
Interconnection Network
30ABCs of Networks
- Starting Point: Send bits between 2 computers
- Queue (FIFO) on each end
- Information sent called a message
- Can send both ways (Full Duplex)
- Rules for communication? A protocol
- Inside a computer
- Loads/Stores: Request (Address) and Response (Data)
- Need Request and Response signaling
31Trivial Example
- What is the format of the message?
- Fixed? Number of bytes?
Message format: Request/Response (1 bit), Address/Data (32 bits)
0: Please send data from Address
1: Packet contains data corresponding to request
- Header/Trailer: information to deliver a message
- Payload: data in message (1 word above)
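The trivial format above can be sketched as a 33-bit integer. The slide does not say which end of the message the request/response bit sits at, so placing it as the most significant bit is an assumption, and the names `encode` and `decode` are illustrative.

```python
REQUEST = 0   # "Please send data from Address"
RESPONSE = 1  # "Packet contains data corresponding to request"


def encode(kind, word):
    """Pack a 1-bit message type and a 32-bit address/data word into 33 bits."""
    if kind not in (REQUEST, RESPONSE):
        raise ValueError("kind must be REQUEST or RESPONSE")
    if not 0 <= word < 2**32:
        raise ValueError("word must fit in 32 bits")
    return (kind << 32) | word  # assumed layout: type bit first, then payload


def decode(message):
    """Split a 33-bit message back into (kind, word)."""
    return message >> 32, message & 0xFFFFFFFF


if __name__ == "__main__":
    request = encode(REQUEST, 0x1000)       # ask for the word at address 0x1000
    reply = encode(RESPONSE, 0xDEADBEEF)    # the data word coming back
    print(decode(request), decode(reply))
```

A fixed-size format like this keeps both ends simple: the receiver always knows how many bits to expect before it starts decoding.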
32Extensions
- What if more than 2 computers want to
communicate?
- Need computer address field (destination) in
packet
- What if packet is garbled in transit?
- Add error detection field in packet (e.g., CRC)
- What if packet is lost?
- More elaborate protocols to detect loss
- What if multiple processes/machine?
- Queue per process to provide protection
- Simple questions such as these lead to elaborate
protocols and packet formats: complexity
- note: complexity is often slow
33A Simple Example Revisited
- What is the format of packet?
- Fixed? Number of bytes?
Packet format: Code (2 bits), Address/Data (32 bits), CRC (4 bits)
00: Request (Please send data from Address)
01: Reply (Packet contains data corresponding to request)
10: Acknowledge request
11: Acknowledge reply
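A sketch of the revisited packet with a 2-bit code, 32-bit payload and 4-bit CRC. The slides do not name a CRC polynomial, so x^4 + x + 1 is an assumed choice here, as is the field order (code, then payload, then CRC). All function names are illustrative.

```python
REQUEST, REPLY, ACK_REQUEST, ACK_REPLY = 0b00, 0b01, 0b10, 0b11

POLY = 0b10011   # x^4 + x + 1: an assumed generator, not given in the slides
BODY_BITS = 34   # 2-bit code + 32-bit address/data


def crc4(value, nbits):
    """Remainder of (value * x^4) divided by POLY, using XOR long division."""
    reg = value << 4
    for shift in range(nbits + 3, 3, -1):   # walk from the top bit down to bit 4
        if reg & (1 << shift):
            reg ^= POLY << (shift - 4)
    return reg & 0xF


def build_packet(code, payload):
    """Return a 38-bit packet: code | payload | crc."""
    if not 0 <= code < 4 or not 0 <= payload < 2**32:
        raise ValueError("code must fit in 2 bits and payload in 32 bits")
    body = (code << 32) | payload
    return (body << 4) | crc4(body, BODY_BITS)


def check_packet(packet):
    """True if the stored CRC matches the CRC recomputed over code + payload."""
    return crc4(packet >> 4, BODY_BITS) == (packet & 0xF)


if __name__ == "__main__":
    pkt = build_packet(REPLY, 0xCAFEF00D)
    print(check_packet(pkt))               # True
    print(check_packet(pkt ^ (1 << 10)))   # False: one bit garbled in transit
```

A 4-bit CRC only detects errors; recovering from them (or from lost packets) is what the acknowledge codes are for.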
34Network Media
- There are different ways to connect computers
together
- Can kind of think of it like a memory hierarchy
- Different kinds of media vary in cost,
performance, and reliability
- There are several different kinds we'll consider
- Twisted Pair
- Coaxial Cable
- Fiber Optics
- Air
- (first, see board for summary discussion)
35Twisted pair media
- Just a twisted pair of copper wires
- Insulated, about 1mm thick
- Twisted together to reduce electrical
interference
- Makes sure we don't turn it into an antenna!
- Data transfer speeds of
- A few Mb/s over a few kilometers, 10s of Mb/s over
shorter distances
- Uses
- Used lots in the telephone industry
- OK for LANs because of reasonable data transfer
rates
36Coaxial (coax) cable
- A picture of it is included below
- Consists of copper center surrounded by
insulator, a mesh, and a plastic coating
- Originally developed for cable companies to
transmit at a higher rate over a few kms
- Good bandwidth: 50 ohm coax cable can deliver 10
Mb/s over a kilometer
- Good for LAN
37Coax cable junctions
- It's harder to connect things to this media
however
- One method is the T-junction
- The typical way this is handled
- Cable cut in 2 and a connector is inserted that
reconnects the cable and adds a 3rd wire to the
computer
- But, if you add a new connector, you have to
split the network and therefore bring it down for
a short period of time
- Additional maintenance is a headache b/c any user
can disconnect the network
- Better: the vampire tap
- Drill a hole to terminate in the copper core
- Screw in connector: no cable cut, no network
down time
38Fiber optics
- Replaces copper with plastic and electrons with
photons
- Information is now transmitted via pulses of
light
- Usually, 3 basic components
- Transmission medium: fiber optic cable
- Light source: LED or laser diode
- Light detector: photodiode
- A simplex media: data can only go in 1 direction
- How it works
39Fiber optics how it really works
- Because light is bent/refracted at interfaces, it
can slowly spread out as it travels down the
diameter of a cable
- Unless, that is, we transfer a single wavelength of
light
- Then it'll travel in a straight line
- With this in mind, let's consider the 2 kinds of
fiber optic cable
- Multimode Fiber
- Allows light to be dispersed
- Uses inexpensive LEDs
- Useful for transmissions of about 2 km (600 Mb/s
in 1995)
- Single-mode Fiber
- A single-wavelength fiber
- Uses more expensive laser diodes as light sources
- Transmits Gb/s over 100s of kms: great for phone
companies!
40Fiber optics practical issues
- Single mode fiber is a better transmitter but
more difficult to attach connectors
- Also, less reliable, more expensive, can't bend
as much
- Usually in LAN, multimode is the weapon of
choice
- So, how do you connect fiber optics to a
computer?
- Passive Mode
- Taps are fused into the fiber and a photodiode
looks at passing light
- Electrical output passes to the computer
interface
- A failure cuts off just 1 computer
- Active Mode
- Really a break in the cable
- Light converted to electrical signals, sent to
computer, converted back to light, sent back down
cable
- Problem: tap failure causes net failure
- Advantage: light source refreshed, can go longer
distances
41Some comparisons
42The bottom line
- Bandwidth problems can be fixed with more money
for more wires
- Improving your latency is somewhat more difficult
to do
- After all, 299792.5 km/s is kinda fixed
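To put a number on that limit, here is a small sketch that computes the minimum one-way propagation delay over the link lengths quoted earlier for system area, local area and wide area networks. It assumes signals travel at the full 299792.5 km/s; real cables are slower, so these are lower bounds only.

```python
SPEED_OF_LIGHT_KM_S = 299_792.5  # the figure quoted on the slide


def min_one_way_latency_ms(distance_m):
    """Lower bound on propagation delay over distance_m meters, in ms."""
    distance_km = distance_m / 1000.0
    return distance_km / SPEED_OF_LIGHT_KM_S * 1000.0


if __name__ == "__main__":
    links = [
        ("System area network", 25),
        ("Local area network", 1_000),
        ("Wide area network", 5_000_000),
    ]
    for name, meters in links:
        print(f"{name}: at least {min_one_way_latency_ms(meters):.6f} ms one way")
```

No amount of extra wiring changes these numbers, which is why latency is the harder problem.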
43I/O Device Summary
- Disks/Networks very different but consider these
similarities
- Data handled in batches (sectors, messages)
- Lots of waiting around for external events
- Compatibility is important (more than
performance)
- Reliability is important (and requires work to
achieve)
- Slow devices are simple (and boring)
- Fast devices may be substantially autonomous
- graphics
44I/O Hardware Interface Issues
45I/O Hardware
- Basic memory-map w/polling and/or interrupts
- Project 2!
- Advanced bus issues
- Performance vs. compatibility - multiple busses
- Namespaces
- Smart device controllers
- Direct Memory Access (DMA)
- Arbitration
- Caching issues
- I/O processors
- the wheel of reincarnation
- (see board for preliminary examples)
46Basic I/O: devices as memory, a la project 2
47Performance vs. Compatibility
- Problem
- Processor - memory is a performance-crucial path
... improve as often as possible!
- I/O controllers made by many vendors ... change
is expensive!
49Multiple Busses
(Figure: bus hierarchy. Labels recovered from the diagram:)
Cache Bus, e.g. 256b, 533MHz (Processor to Cache)
Memory Bus, e.g. 64b, 533MHz (Processor to Main Memory)
bridge (Memory Bus to I/O Bus)
I/O Bus (e.g. PCI), e.g. 64b, 66MHz, with I/O Controllers for Graphics, Disk, and Network
Disk Drive Bus, e.g. SCSI 16b, 20MHz (I/O Controller to Disks)
interrupts (to the Processor)
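A rough comparison of the example buses in this figure, as a sketch: it assumes one transfer per clock cycle, so the results are peak rates rather than sustained throughput. The function name `peak_mb_per_s` is illustrative.

```python
def peak_mb_per_s(width_bits, clock_mhz):
    """Peak bandwidth in MB/s, assuming one full-width transfer per cycle."""
    return width_bits / 8 * clock_mhz


if __name__ == "__main__":
    buses = [
        ("Cache bus", 256, 533),
        ("Memory bus", 64, 533),
        ("I/O bus", 64, 66),
        ("Disk drive bus (SCSI)", 16, 20),
    ]
    for name, width, mhz in buses:
        print(f"{name}: {peak_mb_per_s(width, mhz):,.0f} MB/s peak")
```

Each step away from the processor drops the peak rate sharply, which matches the idea of using separate buses so that slow, compatibility-bound I/O does not hold back the processor-memory path.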
50Smart Device Controllers
51Polling
- Computer
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
52Polling
- Computer
- Busy bit set? No.
- Set write bit in command register
- Write a byte (or word) of data to Data-out
- Set command ready bit in control register
- Busy bit set? Yes.
- Busy bit set? Yes.
- Controller
- Controller clears busy bit
- Sees command ready
- Set busy bit
53Polling
- Computer
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? Yes.
- Busy bit set? No.
- Controller
- Checks write bit
- Reads data-out
- Does I/O with device
- Clears command ready bit
- Clears error bit
- Clears busy bit
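The polling handshake on the last three slides can be simulated in software. This is a toy sketch under stated assumptions: the `Controller` class, its `tick` method and the `io_ticks` delay are invented for illustration, and a real controller runs concurrently in hardware instead of being stepped by the host.

```python
class Controller:
    """Toy device controller with the status/command bits named on the slides."""

    def __init__(self, io_ticks=3):
        self.busy = False
        self.command_ready = False
        self.write = False
        self.error = False
        self.data_out = None
        self.device = []            # bytes "written" to the imaginary device
        self._io_ticks = io_ticks   # assumed time the device needs per byte
        self._remaining = 0

    def tick(self):
        """Advance the controller by one time step."""
        if self.command_ready and not self.busy:
            self.busy = True                    # sees command ready, sets busy
            self._remaining = self._io_ticks
        elif self.busy:
            self._remaining -= 1
            if self._remaining == 0:
                if self.write:                  # checks write bit, reads data-out
                    self.device.append(self.data_out)
                self.command_ready = False      # clears command ready,
                self.error = False              # error,
                self.busy = False               # and busy bits


def polled_write(ctrl, byte):
    """Write one byte by busy-waiting; returns how many times we polled."""
    polls = 0
    while ctrl.busy:                # wait for the controller to become idle
        polls += 1
        ctrl.tick()
    ctrl.write = True               # set write bit in command register
    ctrl.data_out = byte            # write the byte to data-out
    ctrl.command_ready = True       # set command ready bit
    ctrl.tick()                     # controller notices and sets busy
    while ctrl.busy:                # "Busy bit set? Yes." ... until "No."
        polls += 1
        ctrl.tick()
    return polls


if __name__ == "__main__":
    ctrl = Controller(io_ticks=3)
    wasted = sum(polled_write(ctrl, b) for b in b"hi")
    print(ctrl.device, "after", wasted, "polls")
```

Every poll in the loop is processor time spent doing nothing useful, which is the cost that grows when the device is slow.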
54Polling
- Appropriate when controller and device very fast
- Very inefficient when most of the time the
controller is busy
- Better solution... Interrupts
55(Recall) Interrupt Mechanism Hardware
Address Bus
Processor
Data Bus
Int
Inta
Device 1
Device 2
If the processor decides to handle the interrupt
it asserts the inta (interrupt acknowledge) line
56(Recall) Interrupts/Exceptions/Traps protection
I/O (kernel) space
a loop
user space
PC (mem. addr.)
a system call
kernel space
an interrupt
time
57(Recall) Process States
New
Terminated
Ready
Running
A longer example using these states later on in
lecture
Waiting
58 (recall) Convoy Effect
CPU
Ready Queue
IOB
IOB
IOB
CPUB
I/O
I/O Request
I/O Queue
CPU is running Compute bound jobs while I/O Bound
jobs wait.
59Interrupts
- Used by I/O controllers to communicate to
Processor
- Also used by applications to communicate with
operating system
- Software Interrupt or Trap
- OS can now use same device registers as before
60Interrupts
- Processor
- Initiate I/O
- Context switch to something else
- Receive interrupt: transfer to handler
- Interrupt handler processes data, returns from
interrupt
- Resume processing of interrupted task
- I/O Controller
- Initiate I/O with physical device
- Completion (good or bad)
- Generate interrupt
61DMA
- Preceding scheme effective but wasteful for large
blocks of data.
- Using sophisticated general-purpose processor for
very specialized function
- Solution: Add enough processing power to device
controller (and possibly bus controller) to allow
direct transfer between device and memory.
62DMA
N
Processor tells controller to make DMA
transfer. Assume disk to memory. (Includes N
number of bytes)
63DMA
N
Controller gets sector of data from disk.
64DMA
N-1
Controller transfers one word to memory and
updates count. Checks for termination. If not...
65DMA
N-2
Controller transfers one word to memory and
updates count. Checks for termination. If not...
66DMA
N-3
Controller transfers one word to memory and
updates count. Checks for termination. If not...
67DMA
N-4
Controller transfers one word to memory and
updates count. Checks for termination. If not...
68DMA
N-5
Controller transfers one word to memory and
updates count. Checks for termination. If not...
69DMA
0
Controller transfers one word to memory and
updates count. Checks for termination. If done...
70DMA
Controller interrupts processor
71DMA
Processor acknowledges interrupt
72DMA
Controller sends interrupt vector
73DMA
Processor can now have scheduler take
appropriate action (i.e. move process waiting
for I/O into ready queue, etc.)
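The DMA sequence walked through above can be sketched as a single function. Assumptions are labeled in the comments: the count is kept in words instead of bytes, memory is a plain Python list, and the interrupt is modeled as a callback named `on_interrupt`. None of these names come from the slides.

```python
def dma_transfer(disk_sector, memory, start_addr, n_words, on_interrupt):
    """Copy n_words from a disk sector into memory, then raise an interrupt.

    The processor only supplies the parameters; the loop below stands in for
    work the controller does on its own. Returns the final count (0 when done).
    """
    if n_words > len(disk_sector):
        raise ValueError("sector holds fewer words than requested")
    buffer = list(disk_sector[:n_words])    # controller gets the sector from disk
    count = n_words                         # assumed unit: words, not bytes
    offset = 0
    while count > 0:                        # checks for termination
        memory[start_addr + offset] = buffer[offset]   # one word to memory
        offset += 1
        count -= 1                          # updates count: N-1, N-2, ..., 0
    on_interrupt("dma_done")                # controller interrupts the processor
    return count


if __name__ == "__main__":
    memory = [0] * 8
    events = []
    dma_transfer([11, 22, 33, 44], memory, start_addr=2, n_words=4,
                 on_interrupt=events.append)
    print(memory, events)
```

The processor is involved only at the start and at the final interrupt, instead of once per word as with polling or per-word interrupts.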
74Arbitration
- DMA implies multiple owners of the bus
- must decide who owns the bus from cycle to cycle
- Arbitration
- Daisy chain
- Centralized parallel arbitration
- Distributed arbitration by self selection
- Distributed arbitration by collision detection
- (see board for detailed examples and pictures)
75Daisy Chain
Simple but not fair and slow.
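A minimal sketch of why a daisy chain is simple but unfair, assuming the grant signal enters at device 0 and is passed down the chain until a requesting device keeps it. The function name and the list-of-booleans representation are illustrative.

```python
def daisy_chain_grant(requests):
    """Return the index of the device that wins the bus, or None.

    requests[i] is True if device i wants the bus; device 0 is assumed to be
    closest to the arbiter, so it sees the grant signal first.
    """
    for position, wants_bus in enumerate(requests):
        if wants_bus:
            return position   # this device keeps the grant; it goes no further
    return None


if __name__ == "__main__":
    # Device 0 requests every cycle, so device 2 never gets the bus.
    for cycle in range(3):
        print("cycle", cycle, "winner:", daisy_chain_grant([True, False, True]))
```

The grant also has to ripple through every device in turn, which is where the "slow" part comes from.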
76Centralized Parallel Arbitration
- Requires central arbiter
- Each device has separate line
- Central arbiter may become bottleneck
- Used in PCI bus
77Distributed Arbitration by Self Selection
- Each device sees all requestors
- Priority scheme allows each to know if they get
bus
- Requires lots of request lines
- Used by Apple NuBus (backplane)
78Distributed Arbitration by Collision Detection
- Devices independently request bus
- Devices have ability to detect simultaneous
requests or Collisions.
- Upon collision a variety of schemes are used to
select among requestors
- Used by Ethernet
79Caching Issues
- What happens if the processor has a cached copy
of data when a device does DMA?
- short answer is that there's a cache
coherence problem: the DMA may change memory and
the processor doesn't see the change. Two
solutions:
- Device driver (software) flushes cache before
using DMA
- Elaborate bus hardware maintains consistency by
checking the cache on every external bus
transaction
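The stale-data problem and the software fix can be shown with a toy model. This is a sketch under simplifying assumptions: the cache is a dictionary that never holds dirty data, "flush" here just discards the cached copies, and the names `CachedCPU`, `load`, `flush` and `dma_write` are all invented for illustration.

```python
class CachedCPU:
    """Processor with a trivial cache in front of a shared memory list."""

    def __init__(self, memory):
        self.memory = memory
        self.cache = {}

    def load(self, addr):
        """Read through the cache; only a miss goes to memory."""
        if addr not in self.cache:
            self.cache[addr] = self.memory[addr]
        return self.cache[addr]

    def flush(self, addrs):
        """Driver-style fix: drop cached copies of the DMA buffer."""
        for addr in addrs:
            self.cache.pop(addr, None)


def dma_write(memory, addr, words):
    """Device writes straight into memory, bypassing the processor's cache."""
    memory[addr:addr + len(words)] = words


if __name__ == "__main__":
    memory = [0, 0, 0, 0]
    cpu = CachedCPU(memory)
    cpu.load(1)                      # processor now holds a cached copy
    dma_write(memory, 1, [7])
    print(cpu.load(1))               # 0: stale, the change is not seen
    cpu.flush([1])
    print(cpu.load(1))               # 7: the next load goes back to memory
```

The hardware alternative on the slide does the equivalent of `flush` automatically by checking the cache on every external bus transaction.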
80wheel of reincarnation
- Start with simple devices
- Add cute functionality
- Add lots of functionality
- Declare it to be a processor in its own right
- Repeat...
- Graphics community has been around this wheel a
couple of times now.
81Summary
- Example Devices
- often work in blocks
- spend lots of time waiting
- Bus Issues
- memory map w/polling and/or interrupts (project
2)
- Performance vs. compatibility - multiple busses
- Namespaces
- Smart device controllers
- Direct Memory Access (DMA)
- Arbitration
- Caching issues
- I/O processors
- the wheel of reincarnation