Title: CHAPTER 6 INPUT/OUTPUT PROGRAMMING
1CHAPTER 6 INPUT/OUTPUT PROGRAMMING
2Relative Speed
- I/O devices are sometimes mechanical devices (e.g., solenoids, relays) that take a long time to perform an action.
- The computer performs operations orders of magnitude faster than the I/O devices.
- Synchronization: The CPU must wait for the I/O device to finish each command before issuing the next.
3Asynchronous Events
- The events that determine when an input device has data available or when an output device needs data are independent of the CPU.
- Most I/O programming therefore requires "handshaking" between the CPU and the I/O device to coordinate the transfer so that data is transferred reliably.
4Time Behavior
5Three Strategies
- Polled Waiting Loops
- Interrupt-driven I/O
- Direct Memory Access (DMA)
6Selecting a Strategy
- Maximum Data Transfer Rate
- Worst-case Response Time (Latency)
- Cost (Hardware)
- Software Complexity
7
                        Direct Memory Access   Interrupt-Driven   Polled Waiting Loop
Maximum Transfer Rate   Fastest                Slowest            Moderate
Worst-Case Latency      Best                   Moderate           Unpredictable
Hardware Cost           Moderate               Low                Least
Software Complexity     Moderate               Moderate           Low
8Polled Waiting Loops
    BYTE8 Input(void)
    {
        while ((inportb(STATUS_PORT) & READY) == 0)
            ;   /* wait for new data to arrive */
        return inportb(DATA_PORT);
    }

    void Output(BYTE8 ch)
    {
        while ((inportb(STATUS_PORT) & READY) == 0)
            ;   /* wait for device to finish last command */
        outportb(DATA_PORT, ch);
    }
9Polled Serial Input
    _Serial_Input:
            MOV   DX,02FDh        ; DX <- Status Port Address
    SI1:    IN    AL,DX           ; Read Input Status Port
            TEST  AL,00000001B    ; Check the Ready Bit
            JZ    SI1             ; Continue to wait if not ready
            MOV   DX,02F8h        ; Else load DX with Data Port Address
            XOR   EAX,EAX         ; Pre-clear most significant bits of EAX
            IN    AL,DX           ; Read Data Port
            RET                   ; Return to caller with data in EAX
10Polled Waiting Loops
- Test device status in a waiting loop before transferring each data byte.
- Maximum data rate: Set by the time required to execute one iteration of the waiting loop plus the transfer.
- Latency: Time from device ready until the moment that the CPU transfers data. Unpredictable, because there is no guarantee when the program will arrive at the waiting loop.
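The behavior described above can be tried without real hardware. The following is a minimal sketch, assuming a simulated device: `sim_inportb`, `polls_until_ready`, `device_data`, and `status_polls` are hypothetical names invented for this illustration, and the port numbers and ready mask are the ones used in the polled serial input routine earlier in the chapter.

```c
#include <stdint.h>

typedef uint8_t BYTE8;

/* Port numbers and ready mask as used by the polled serial input routine. */
enum { STATUS_PORT = 0x2FD, DATA_PORT = 0x2F8, READY = 0x01 };

/* Simulated device state (hypothetical, for illustration only). */
int           polls_until_ready = 3;   /* device is busy for this many status reads */
BYTE8         device_data       = 'A'; /* byte the device will deliver */
unsigned long status_polls      = 0;   /* counts iterations of the waiting loop */

/* Stand-in for a real port read; it does not touch any hardware. */
BYTE8 sim_inportb(int port)
{
    if (port == STATUS_PORT) {
        status_polls++;
        if (polls_until_ready > 0) {
            polls_until_ready--;
            return 0;                  /* ready bit clear: still busy */
        }
        return READY;                  /* ready bit set: data available */
    }
    return device_data;                /* any other port is treated as the data port */
}

/* Polled waiting loop: test the status before transferring each byte. */
BYTE8 PolledInput(void)
{
    while ((sim_inportb(STATUS_PORT) & READY) == 0)
        ;                              /* wait for new data to arrive */
    return sim_inportb(DATA_PORT);
}
```

With the initial values shown, a single call to `PolledInput` reads the status port four times (three busy, one ready) before returning the data byte. The CPU does no other work during those iterations, which is the cost of this strategy.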
11Estimating Performance
- Performance is limited by memory bandwidth (bytes/second).
- Typical memory cycle time ≈ 60 ns = 60 × 10^-9 sec.
- Memory cycle time determines how fast instructions can be fetched.
- We ignore speedup due to cache, so the estimate is pessimistic.
12Memory and I/O Cycles
    _Serial_Input:                 Opcode  Immediate  Stack  I/O
                                   Bytes   Bytes      Bytes  Transfers
            MOV   DX,02FDh           1        2
    SI1:    IN    AL,DX              1                         1
            TEST  AL,00000001B       1        1
            JZ    SI1                1        1
            MOV   DX,02F8h           1        2
            XOR   EAX,EAX            1
            IN    AL,DX              1                         1
            RET                      1                  4
14 instruction bytes, 4 stack bytes, 2 I/O transfers
13Memory Cycles
- 14 instruction bytes ÷ 4 bytes/memory read = 4 memory cycles (minimum) = 240 ns.
- 4 stack bytes ÷ 4 bytes/memory read = 1 memory cycle (minimum) = 60 ns.
- Code and stack are in different parts of memory.
- Address alignment may increase cycle counts.
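The rounding used in these two estimates can be sketched as follows. This is only an illustration under the chapter's stated assumptions (4 bytes per memory read, 60 ns per memory cycle, perfect alignment); the function names `MemCycles` and `MemTimeNs` are invented here.

```c
/* Assumptions from the estimate: 4 bytes per memory read, 60 ns per cycle. */
enum { BYTES_PER_READ = 4, MEM_CYCLE_NS = 60 };

/* Minimum number of memory cycles needed to move nbytes (rounds up). */
unsigned MemCycles(unsigned nbytes)
{
    return (nbytes + BYTES_PER_READ - 1) / BYTES_PER_READ;
}

/* Minimum time, in nanoseconds, spent on those memory cycles. */
unsigned MemTimeNs(unsigned nbytes)
{
    return MemCycles(nbytes) * MEM_CYCLE_NS;
}
```

`MemTimeNs(14)` gives 240 and `MemTimeNs(4)` gives 60. Code and stack are counted separately because they live in different parts of memory and cannot share a read.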
14I/O Cycles
- Assume a 33 MHz PCI bus: 0.03 µs per I/O read or write = 30 ns. (Actual I/O transfers usually require multiple I/O bus cycles.)
15Maximum Data Rate (Polled Waiting Loop)
- Time for one iteration of waiting loop plus subsequent I/O transfer:
- memory cycles = 300 ns
- I/O cycles = 60 ns
- total time = 360 ns
- maximum data rate = 1 byte per 360 ns = 2.78 MB/Sec
Fastest serial I/O: 115,000 bps ≈ 10 KB/Sec.
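The estimate above can be reproduced with a few lines of C. This is a sketch, assuming the chapter's figures of 60 ns per memory cycle and 30 ns per I/O transfer; `PolledTimeNs` and `MaxRateMBps` are hypothetical helper names, and MB means 10^6 bytes.

```c
/* Cycle times assumed in the chapter's estimate. */
enum { MEM_CYCLE_NS = 60, IO_CYCLE_NS = 30 };

/* Time for one pass through the routine: memory cycles plus I/O transfers. */
unsigned PolledTimeNs(unsigned mem_cycles, unsigned io_transfers)
{
    return mem_cycles * MEM_CYCLE_NS + io_transfers * IO_CYCLE_NS;
}

/* One byte every time_ns nanoseconds, expressed in MB/s (10^6 bytes/s). */
double MaxRateMBps(unsigned time_ns)
{
    return 1000.0 / time_ns;
}
```

With 5 memory cycles (4 for code, 1 for stack) and 2 I/O transfers, `PolledTimeNs(5, 2)` is 360 and `MaxRateMBps(360)` is about 2.78, far above what the fastest serial line can deliver.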
16Interrupt-Driven I/O
Hardware interrupt request occurs: CPU finishes the current instruction and then initiates an interrupt response sequence.
Interrupt Response Sequence: CPU pushes flags and return address, disables interrupts, reads an interrupt type code from the requesting device, and transfers control to the corresponding Interrupt Service Routine.
Interrupt Service Routine:
1. Re-enable higher priority interrupts.
2. Preserve CPU registers.
3. Transfer data (also clears the interrupt request).
4. Re-enable lower priority interrupts.
5. Restore CPU registers.
6. Pop flags and return address and return to interrupted code.
Interrupt Complete: Interrupted code continues where it left off as if nothing happened.
17Getting Address of ISR
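[Figure: The 8-bit interrupt type code is multiplied by 8 to form an index into the Interrupt Descriptor Table (IDT), which resides in main memory. The IDTR register holds the 32-bit address (and length) of the IDT, and the selected IDT entry supplies the 32-bit physical address of the ISR.]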
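The index calculation in the figure can be written out as a small sketch. The function name `IdtEntryAddress` and the sample base address in the note below are hypothetical; the factor of 8 is the size of one IDT entry in bytes.

```c
#include <stdint.h>

enum { IDT_ENTRY_SIZE = 8 };   /* bytes per IDT entry */

/* Address of the IDT entry selected by an 8-bit interrupt type code.
   idt_base is the table address held in the IDTR register. */
uint32_t IdtEntryAddress(uint32_t idt_base, uint8_t type_code)
{
    return idt_base + (uint32_t)type_code * IDT_ENTRY_SIZE;
}
```

For an assumed table base of 0x1000, type code 0x0C selects the entry at 0x1060. Because the type code is 8 bits wide, the table never needs more than 256 entries (2048 bytes).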
18Hardware Response to Interrupt
19Interrupt-Driven Latency
- Time to finish longest instruction
- Time of hardware response
- Time in ISR until data is transferred.
20Time of Longest Instruction
- PUSHA stores the contents of 8 registers by pushing them onto the stack.
- Requires 1 memory cycle to fetch the 1-byte representation of the instruction and 8 memory cycles to write 32 bytes to memory.
- Total time = 9 memory cycles × 60 ns = 0.54 µs
21Hardware Response to Interrupt
- 29 bytes of data to be transferred:
- 12 stack bytes (3 mem cycles = 180 ns)
- 1 I/O byte (1 I/O cycle = 30 ns)
- 8 IDT bytes to CS:EIP (2 mem cycles = 120 ns)
- 8 GDT bytes to hidden part of CS (2 mem cycles = 120 ns)
- Total time = 450 ns = 0.45 µs
22Time from ISR Entry to Transfer
    _Serial_Input_ISR:                                      Instr.  Data   Stack  I/O
                                                            Bytes   Bytes  Bytes  Transfers
        STI                    ; Enable high prior. ints.     1
        PUSH EAX               ; Preserve contents            1              4
        PUSH EDX               ;   of EAX and EDX.            1              4
        MOV  DX,02F8h          ; Retrieve the data and        3
        IN   AL,DX             ;   clear the request.         1                      1
        MOV  [_serial_data],AL ; Save the data away.          5       1
        MOV  AL,00100000b      ; Send EOI command to          2
        OUT  20h,AL            ;   Prog. Interrupt Ctlr.      2                      1
        POP  EDX               ; Restore orig. contents       1              4
        POP  EAX               ;   of the registers.          1              4
        IRET                   ; Restore EIP and EFlags.      1              12
23Time from ISR Entry to Transfer
            Bytes   Cycles
    Code      7       2
    Data      0       0
    Stack     8       2
    Memory: 4 × 60 ns = 0.24 µs
    I/O:    1 × 30 ns = 0.03 µs
    Total:              0.27 µs
24Interrupt-Driven Latency
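The worst-case latency is the sum of the three components estimated on the preceding slides. A minimal sketch of that sum follows, assuming 60 ns memory cycles and 30 ns I/O cycles; the `Cost` type and the function names are invented for this illustration.

```c
/* Cycle times assumed in the chapter's estimate. */
enum { MEM_CYCLE_NS = 60, IO_CYCLE_NS = 30 };

/* Memory and I/O cycle counts for one phase of the interrupt response. */
typedef struct {
    unsigned mem_cycles;
    unsigned io_cycles;
} Cost;

/* Time for one phase, in nanoseconds. */
unsigned CostNs(Cost c)
{
    return c.mem_cycles * MEM_CYCLE_NS + c.io_cycles * IO_CYCLE_NS;
}

/* Worst-case interrupt latency: longest instruction, then hardware
   response, then the part of the ISR that runs before the transfer. */
unsigned InterruptLatencyNs(void)
{
    const Cost longest_instr = { 9, 0 };  /* PUSHA: 1 fetch + 8 stack writes */
    const Cost hw_response   = { 7, 1 };  /* 3 stack + 2 IDT + 2 GDT cycles, 1 type-code read */
    const Cost isr_to_xfer   = { 4, 1 };  /* 2 code + 2 stack cycles, 1 port read */

    return CostNs(longest_instr) + CostNs(hw_response) + CostNs(isr_to_xfer);
}
```

The three phases contribute 540 ns, 450 ns, and 270 ns, so `InterruptLatencyNs()` returns 1260, that is, 1.26 µs.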
25Interrupt-Driven Data Rate (1 / Time Per Transfer)
- Time to finish longest instruction
- Time of hardware response
- Time to execute entire ISR
26Total Time in ISR
    _Serial_Input_ISR:                                      Instr.  Data   Stack  I/O
                                                            Bytes   Bytes  Bytes  Transfers
        STI                    ; Enable high prior. ints.     1
        PUSH EAX               ; Preserve contents            1              4
        PUSH EDX               ;   of EAX and EDX.            1              4
        MOV  DX,02F8h          ; Retrieve the data and        3
        IN   AL,DX             ;   clear the request.         1                      1
        MOV  [_serial_data],AL ; Save the data away.          5       1
        MOV  AL,00100000b      ; Send EOI command to          2
        OUT  20h,AL            ;   Prog. Interrupt Ctlr.      2                      1
        POP  EDX               ; Restore orig. contents       1              4
        POP  EAX               ;   of the registers.          1              4
        IRET                   ; Restore EIP and EFlags.      1              12
27Total Time in ISR
            Bytes   Cycles
    Code     19       5
    Data      1       1
    Stack    28       7
    Memory: 13 × 60 ns = 0.78 µs
    I/O:     2 × 30 ns = 0.06 µs
    Total:               0.84 µs
28Total Time Per Byte
Time to Execute Longest Instruction (PUSHA): 0.54 µs
Hardware Response: 0.45 µs
Time to Execute the Entire Interrupt Service Routine: 0.84 µs
Total time per transfer = 0.54 + 0.45 + 0.84 = 1.83 µs
Maximum Data Rate = 1 byte per 1.83 µs = 0.546 MB/Sec
29Estimate Summary
                          Maximum Data Rate   Worst-Case Latency
    Polled Waiting Loop   2.78 MB/Sec         Unpredictable
    Interrupt-Driven      0.546 MB/Sec        1.26 µs
    DMA                   ?                   ?
30Programmable Interrupt Controller
31Programmable Interrupt Controller
- External device requests service by setting its IRQ line to 1.
- Request is forwarded to CPU if the corresponding mask bit is zero and no higher priority interrupt is in progress.
- If IF = 1, CPU sends interrupt acknowledge to PIC and reads the interrupt type code, causing PIC to set the in-service bit, which disables lower priority interrupts.
- Non-specific end of interrupt (EOI) at end of ISR clears the in-service bit.
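The four rules above can be modeled in software to see how they interact. This is a simplified sketch, not a description of any particular controller: the `PIC` structure, its field names, and all function names are invented here, and the model assumes a fixed priority order in which IRQ0 is the highest priority.

```c
#include <stdint.h>

/* Simplified model of an 8-input interrupt controller (hypothetical names). */
typedef struct {
    uint8_t irr;   /* pending requests: bit n set means IRQn is asserted */
    uint8_t imr;   /* mask bits: 1 blocks the request, 0 lets it through */
    uint8_t isr;   /* in-service bits */
} PIC;

/* Lowest-numbered set bit, assumed to be the highest priority; -1 if none. */
static int HighestPriority(uint8_t bits)
{
    for (int n = 0; n < 8; n++)
        if (bits & (1u << n))
            return n;
    return -1;
}

/* External device requests service by setting its IRQ line to 1. */
void PicRequest(PIC *pic, int irq)
{
    pic->irr |= (uint8_t)(1u << irq);
}

/* IRQ that would be forwarded to the CPU, or -1 if none qualifies. */
int PicPending(const PIC *pic)
{
    int req        = HighestPriority((uint8_t)(pic->irr & ~pic->imr));
    int in_service = HighestPriority(pic->isr);

    if (req < 0)
        return -1;                       /* nothing unmasked is pending */
    if (in_service >= 0 && in_service <= req)
        return -1;                       /* equal or higher priority in progress */
    return req;
}

/* Interrupt acknowledge: happens only if the CPU's IF flag is 1. */
int PicAcknowledge(PIC *pic, int cpu_if)
{
    int irq = cpu_if ? PicPending(pic) : -1;
    if (irq >= 0) {
        pic->isr |= (uint8_t)(1u << irq);    /* set in-service bit */
        pic->irr &= (uint8_t)~(1u << irq);   /* request has been accepted */
    }
    return irq;
}

/* Non-specific EOI: clear the highest-priority in-service bit. */
void PicEoi(PIC *pic)
{
    int irq = HighestPriority(pic->isr);
    if (irq >= 0)
        pic->isr &= (uint8_t)~(1u << irq);
}
```

In this model a request on IRQ6 that arrives while IRQ4 is in service stays pending until `PicEoi` clears the in-service bit, whereas a request on IRQ1 is forwarded immediately.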
32Buffering
[Figure: Circular queue q. The buffer q.bfr has 20 slots, indexed 0 to 19, and holds the received characters between the dequeue position (front) and the enqueue position (rear). q.count = 13.]
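A queue like the one in the figure could be implemented along the following lines. This is a sketch under assumptions: only `bfr` and `count` appear in the figure, so the `front` and `rear` field names, the `QSIZE` constant, and the exact signatures of `Enqueue` and `Dequeue` are guesses made for illustration.

```c
#include <stdbool.h>

#define QSIZE 20                 /* assumed from the 20-slot buffer in the figure */

typedef struct {
    unsigned char bfr[QSIZE];    /* storage for buffered bytes */
    int front;                   /* dequeue position */
    int rear;                    /* enqueue position */
    int count;                   /* bytes currently in the queue */
} QUEUE;

/* Add one byte at the rear; returns false (byte dropped) if the queue is full. */
bool Enqueue(QUEUE *q, unsigned char data)
{
    if (q->count == QSIZE)
        return false;
    q->bfr[q->rear] = data;
    q->rear = (q->rear + 1) % QSIZE;   /* wrap around at the end of the buffer */
    q->count++;
    return true;
}

/* Remove one byte from the front; returns false if the queue is empty. */
bool Dequeue(QUEUE *q, unsigned char *data)
{
    if (q->count == 0)
        return false;
    *data = q->bfr[q->front];
    q->front = (q->front + 1) % QSIZE;
    q->count--;
    return true;
}
```

An ISR would call `Enqueue` for each received byte while the main program calls `Dequeue` at its own pace. Since both sides update `count`, the main program would normally disable interrupts briefly around its call so that the two updates cannot interleave.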
33
            EXTERN  _q              ; QUEUE q
    _COM1InISR:
            STI                     ; re-enable higher priority interrupts
            PUSHA                   ; save general-purpose registers
            PUSH    DS              ; save segment registers
            PUSH    ES
            PUSH    FS
            PUSH    GS
            XOR     EAX,EAX         ; pre-clear upper bits of EAX
            MOV     DX,02F8h        ; 16-bit port address must go through DX
            IN      AL,DX           ; read COM1 data port
            PUSH    EAX             ; pass data to Q
            PUSH    _q              ; pass pointer to Q
            CALL    _Enqueue        ; Enqueue the data
            ADD     ESP,8           ; remove parameters
            MOV     AL,00100000B    ; re-enable lower
            OUT     20H,AL          ;   priority interrupts
            POP     GS              ; restore segment registers
            POP     FS
            POP     ES
            POP     DS
            POPA                    ; restore general-purpose registers
            IRET
34Direct Memory Access
- Requires additional hardware to control data transfers independent of CPU.
- Competes with CPU for control of the bus.
- Does not have to wait for the current instruction to complete, only the current bus operation.
- Latency is thus 1 memory cycle.
- Data rate is determined by memory speed.
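The last two points translate directly into numbers. The sketch below assumes the figures used earlier in the chapter (4 bytes per memory cycle, 60 ns per cycle) and that every memory cycle can carry a DMA transfer; the function names are hypothetical.

```c
/* Maximum DMA rate in MB/s (10^6 bytes/s) if every memory cycle moves data. */
double DmaMaxRateMBps(unsigned bytes_per_cycle, unsigned cycle_ns)
{
    return bytes_per_cycle * 1000.0 / cycle_ns;
}

/* Worst-case wait: the bus operation already in progress, one memory cycle. */
unsigned DmaWorstLatencyNs(unsigned cycle_ns)
{
    return cycle_ns;
}
```

`DmaMaxRateMBps(4, 60)` is about 66.7 and `DmaWorstLatencyNs(60)` is 60, that is, 0.06 µs. In practice the controller shares the bus with the CPU, so the sustained rate would be lower.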
35Estimate Summary
                          Maximum Data Rate   Worst-Case Latency
    Polled Waiting Loop   2.78 MB/Sec         Unpredictable
    Interrupt-Driven      0.546 MB/Sec        1.26 µs
    DMA                   66.7 MB/Sec         0.06 µs
36Single Buffering
Double Buffering