CS 3210 Fall 2005

Provided by: Phillip4

1
CS 3210 Fall 2005
  • Managing I/O Devices

2
Device Drivers
  • device: file ops, interrupt handler, data
  • various levels of kernel support
    • no support: e.g. X server
    • minimal: serial, parallel ports
    • extended: devices directly attached to I/O bus
      • e.g. hard disk, USB devices, etc.
  • buffering
    • buffering to smooth access (e.g. audio data)
    • buffering for reuse (caching)
  • registering device drivers
    • register_blkdev()
    • register_chrdev()
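
Registration boils down to claiming a slot in a table indexed by major number. The following is a minimal userspace sketch of that idea, loosely modelled on how register_chrdev() treats a major of 0 as "pick one for me". The names sim_register_chrdev, sim_file_ops, sim_device and MAX_MAJOR are hypothetical and not kernel API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MAJOR 255   /* assumption: size of the simulated table */

/* Hypothetical stand-in for the kernel's file operations table. */
struct sim_file_ops {
    int (*open)(void);
};

struct sim_device {
    const char *name;                  /* NULL means the slot is free */
    const struct sim_file_ops *fops;
};

static struct sim_device sim_chrdevs[MAX_MAJOR];

/*
 * Register a driver under `major`. If major is 0, pick a free slot,
 * searching from the top of the table downward, and return it.
 * Returns the chosen major (dynamic case), 0 on success (fixed case),
 * or -1 if the slot is busy or no slot is free.
 */
int sim_register_chrdev(unsigned int major, const char *name,
                        const struct sim_file_ops *fops)
{
    assert(name != NULL && fops != NULL);
    if (major == 0) {
        for (major = MAX_MAJOR - 1; major > 0; major--) {
            if (sim_chrdevs[major].name == NULL) {
                sim_chrdevs[major].name = name;
                sim_chrdevs[major].fops = fops;
                return (int)major;
            }
        }
        return -1;
    }
    if (major >= MAX_MAJOR || sim_chrdevs[major].name != NULL)
        return -1;
    sim_chrdevs[major].name = name;
    sim_chrdevs[major].fops = fops;
    return 0;
}
```

Registering only records the driver; nothing here touches hardware, which matches the point that registering and initializing are separate steps.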

3
Device Drivers (2)
  • initializing
    • registering and initializing are different
    • IRQs, DMA channels dynamically allocated
    • devices initialized on first (device) file open
  • monitoring I/O operations
    • polling: periodic checking
      • to be avoided; some drivers call schedule()
    • interrupts
      • e.g. foo_read(), foo_interrupt()

4
Block Device Drivers
  • hard disks
    • high (average) access time
      • rotational delay + seek time + read time
    • high sustained transfer rates
      • "adjacent" blocks: one seek required
  • block device subsystem
    • functionality common to most block device drivers
    • lots of capabilities; can be bypassed if desired
      • buffer/page cache
      • read-ahead logic
      • parallel device access
      • I/O schedulers (device "strategy" routines)

5
Tracking Block Device Drivers
  • devices initialized on open (if necessary)
    • no need to initialize if already opened
    • same device may have different file names
  • bdev_hashtable: hash of block devices in use
    • hash(major) -> block device descriptor
  • struct block_device
    • bd_dev, bd_count, bd_openers, bd_op, bd_sem
    • bd_hash, bd_inode (master), bd_inodes
  • master bd_inode
    • inode in special bdev filesystem (no disk
      representation)
    • master info shared by all disk inodes for same
      device

6
Initializing Block Device Driver
  • open file object: f_op contains device ops
    • open, release, llseek, read, write, mmap, fsync,
      ioctl
  • bd_acquire(inode)
    • already open (same name)? increment count
    • else look up in hash or add to hash
  • do_open(inode->i_bdev, filp)
    • initialize bd_op field from blkdevs table if
      needed
    • call bd_op->open()
    • increment bd_openers
  • bd_op->open() can do additional customization
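
The acquire-then-open sequence can be sketched in userspace as a hash lookup followed by a counted open. This is an illustration under stated assumptions, not the kernel's code: sim_bdev, sim_bd_acquire, sim_do_open, sim_null_open and HASH_SIZE are hypothetical names, and locking (bd_sem) is left out.

```c
#include <assert.h>
#include <stdlib.h>

#define HASH_SIZE 64   /* assumption: bucket count for the sketch */

struct sim_bdev {
    unsigned int dev;          /* device number (major/minor combined) */
    int count;                 /* references to this descriptor */
    int openers;               /* successful opens */
    int (*open)(void);         /* stands in for bd_op->open */
    struct sim_bdev *next;     /* hash chain */
};

static struct sim_bdev *sim_hash[HASH_SIZE];

/* Example driver open method that always succeeds. */
int sim_null_open(void)
{
    return 0;
}

/* Find the descriptor for dev, or create and hash a new one. */
struct sim_bdev *sim_bd_acquire(unsigned int dev)
{
    struct sim_bdev **bucket = &sim_hash[dev % HASH_SIZE];
    struct sim_bdev *b;

    for (b = *bucket; b != NULL; b = b->next) {
        if (b->dev == dev) {
            b->count++;        /* already known: just take a reference */
            return b;
        }
    }
    b = calloc(1, sizeof(*b));
    if (b == NULL)
        return NULL;
    b->dev = dev;
    b->count = 1;
    b->next = *bucket;
    *bucket = b;
    return b;
}

/* Fill in the open method if needed, call it, and count the opener. */
int sim_do_open(struct sim_bdev *b, int (*driver_open)(void))
{
    int err;

    assert(b != NULL && driver_open != NULL);
    if (b->open == NULL)
        b->open = driver_open;   /* "from blkdevs table" on the slide */
    err = b->open();
    if (err == 0)
        b->openers++;
    return err;
}
```

Two file names that resolve to the same device number end up sharing one descriptor, which is why the second acquire only bumps the count.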

7
Sectors, Blocks, Buffers
  • hard disk: platters, tracks, sectors, cylinders
  • sector: device unit of transfer (512 bytes)
    • hardsect_size[major][minor]
  • block: OS unit of transfer
    • multiple sectors, <= page size
    • driver may handle different block sizes (minor)
    • blksize_size[major][minor]
    • aside: blk_size[major][minor] = device size in
      blocks
  • buffer: memory region storing disk block
8
Buffer Heads
  • Buffer descriptor
  • Important fields
    • b_data (pointer to buffer)
    • b_blocknr, b_size, b_count, b_dev, b_rdev
    • b_state, b_flushtime, b_rsector, b_wait, b_inode
    • b_page (for page reads)
    • b_end_io (I/O completion callback)
  • b_dev, b_rdev: different for RAID devices

9
Block Device Driver Architecture Overview
  • kernel clusters, re-orders I/O requests to
    increase device throughput
  • I/O request queues (per physical device)
    • requests are delayed briefly to allow clustering
    • adjacent blocks can be read in a single access
    • requests can be serviced in seek order
  • high-level driver
    • checks cache, creates and enqueues request
  • low-level driver
    • dequeues request, talks to device; device raises
      an interrupt on completion

10
Request Descriptors
  • struct request
    • cmd, buffer, q, waiting, rq_status, rq_dev
    • sector, nr_sectors, current_nr_sectors
    • hard_sector, hard_nr_sectors
    • nr_segments, nr_hw_segments, bh, bhtail
  • request: multiple adjacent blocks
    • sector, nr_sectors, current_nr_sectors updated
      dynamically during servicing
    • segment: sequence of adjacent buffers (for DMA)
  • fixed number of request descriptors per queue
    • under heavy load, processes wait to enqueue
      requests

11
Request Queue Descriptors
  • struct request_queue (request_queue_t)
    • rq: request free lists; queue_head
    • various function pointers for
      • "elevator", merging (clustering), "plugging"
  • elevator: algorithm to optimize seek time
  • plugging: mechanism to delay requests for
    clustering
    • "plug": set timer for activation when first
      element is added to empty queue
    • "unplug": service requests, empty queue
  • ll_rw_block(rw, nr, bhs): creates I/O requests

12
Extending Request Queue
  • clustering a new request
    • same device, adjacent blocks (before or after)
    • same op (READ or WRITE)
    • doesn't exceed max sectors in a request
  • elevator algorithms
    • order requests by tracks to reduce head movement
      (seek)
    • idea: an elevator doesn't service requests FCFS
      • start at 1, request on 10, then request on 5
      • start heading up, pick up 5, then 10
    • lots of research on various policies
    • enhances throughput but at the cost of fairness
      • pathological cases cause starvation

13
Linux Elevators
  • Two built-in algorithms
    • ELEVATOR_NOOP
      • straight enqueue (FCFS)
    • ELEVATOR_LINUS
      • basic elevator with ageing
      • very old requests always serviced first

14
Low-level Request Handling
  • strategy routine: lowest-level interaction with
    device
  • naïve approach: for each request, issue, wait
    • heavily penalizes (random) process
  • better
    • issue request, terminate
    • issue next in bottom half of completion interrupt

15
Block I/O Operations
  • two fundamental units for I/O
  • block-based: 1 buffer, 1 block
    • e.g. superblock, inode access
    • bread()
      • getblk(): checks block cache
      • ll_rw_block()
      • wait_on_buffer()
    • bwrite()?
      • doesn't exist! just mark buffer dirty and let
        bdflush() write it out

16
Page-based I/O Operations
  • page-based
    • multiple blocks per page, possibly discontiguous
    • e.g. swapping, file-mapped data, all file
      reads/writes!
    • brw_page()
      • submit_bh() for each block in page
      • page operation completes when all blocks are read
        or written

17
Character Device Drivers
  • very simple compared to block devices
  • very little common kernel functionality
    • one exception: "line discipline"
      • backspace processing, etc.