Title: School of Computing Science
1- School of Computing Science
- Simon Fraser University
- CMPT 300 Operating Systems I
- Chapters 10, 11, 12: File System and Disk Scheduling
- Dr. Mohamed Hefeeda
2Objectives
- Understand how to store and manage information on secondary storage systems
- Understand file system
- Interface
- Structure
- Implementation
- Note: the file system is the most visible part of the OS to users
3Secondary Storage Systems
- Various storage media
- Magnetic disks
- Magnetic tapes
- Optical disks
- ...
- Each medium has different physical characteristics
- Storing bits on disks is different from storing them on CDs
- Yet, OS provides a uniform logical view of storage to users
- To efficiently store, locate, and retrieve data from a storage system, OS creates one or more file systems on it
4File System Challenges
- File systems involve two design problems
- File system interface: how the file system looks to users
- Define a file, file attributes, operations on files, and how files are organized into directories
- File system implementation: algorithms and data structures to map the logical file system onto physical devices
- Block allocation, free-space management, searching a directory, data caching, ...
5File System Layered Structure
Application Programs
Logical File System
- Interface: file and directory structure
- Maintains pointers to logical block addresses
File-organization Module
- Implementation: block allocation, ...
- Maps logical into physical addresses
Device Drivers
- Implementation: device-specific instructions
- Writes specific bit patterns to device controller
Storage Devices
6File System Interface File Concept
- From the user's perspective, a file is the smallest storage unit
- A file is a named collection of related information recorded on secondary storage
- Information stored in a file could be of various types
- Text, numeric data
- Binary data
- Source code
- Executable programs
- ...
7File Attributes
- Name: only information kept in human-readable form
- Identifier: unique tag (number) that identifies the file within the file system
- Type: needed for systems that support different types
- Location: pointer to file location on device
- Size: current file size
- Protection: controls who can do reading, writing, executing
- Time, date, and user identification: data for protection, security, and usage monitoring
- Information about files is kept in a directory, which is maintained on the disk as well
- Each file has an entry in the directory
8File Operations
- Create
- Write
- Read
- Reposition within file
- Delete
- Truncate
- More operations (e.g., copy) can be composed of these primitives
- To perform these operations, we open the file (details later)
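The primitive operations listed above can be tried from a user program. The following is a minimal sketch in Python, assuming a scratch directory is available; the function name `demo_file_operations` and the file name `demo.txt` are made up for illustration.

```python
import os
import tempfile


def demo_file_operations(directory):
    """Exercise create, write, reposition, read, truncate, and delete."""
    path = os.path.join(directory, "demo.txt")    # hypothetical file name
    with open(path, "w+b") as f:                  # create (and open) the file
        f.write(b"hello world")                   # write
        f.seek(6)                                 # reposition within file
        tail = f.read()                           # read from the new position
        f.truncate(5)                             # truncate to the first 5 bytes
        f.seek(0)
        head = f.read()
    os.remove(path)                               # delete
    return tail, head, os.path.exists(path)


with tempfile.TemporaryDirectory() as scratch:
    print(demo_file_operations(scratch))          # (b'world', b'hello', False)
```

Note that every operation except delete goes through the handle returned by `open`, which matches the last bullet above.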
9File System Interface Directory Concept
- Directory is a logical grouping of files
- A directory contains an entry for each file under it
- Some systems (UNIX) treat directories just as files
- In fact, UNIX treats everything as a file
- Operations on a directory
- Search for a file
- Create a file
- Delete a file
- List a directory
- Rename a file
- Traverse the file system
10Directory Structure
- Design the directory structure to achieve
- Efficiency: locating a file quickly
- Naming: convenient to users
- Two users can have same name for different files
- The same file can have several different names (aliases, links)
- Grouping: logical grouping of files by properties (e.g., all Java programs, all games, ...)
- Tree-structured directories are the most common
11Tree-Structured Directories
12Tree-Structured Directories (contd)
- Efficient searching
- Grouping capability
- Things get complicated when we start adding links
- Directory is no longer a tree → acyclic-graph structure
13Acyclic-Graph Directories
- When a file is deleted while some links still point to it → dangling pointers!
14Acyclic-Graph Directories (contd)
- Solution for dangling links in Unix
- Symbolic link
- Just leave the dangling pointer for the user to delete
- Try
- ln -s file.txt file_symLink.txt
- ls -l
- rm file.txt
- ls -l
- Hard link
- Keep a reference count on the file
- Only delete the physical file when all links to it are deleted
- Try
- ln file.txt file_link.txt
- ls -l
- rm file.txt
- ls -l
- Links may even create a cycle → creating a general graph
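The two shell experiments above can also be reproduced programmatically. This is a sketch only, assuming a POSIX-like file system that supports both link types; `link_demo` is an illustrative name, and the file names mirror the ones used in the commands.

```python
import os
import tempfile


def link_demo(directory):
    """Delete a file that has one hard link and one symbolic link."""
    target = os.path.join(directory, "file.txt")
    hard = os.path.join(directory, "file_link.txt")
    sym = os.path.join(directory, "file_symLink.txt")
    with open(target, "w") as f:
        f.write("data")
    os.link(target, hard)        # hard link: a second name for the same file
    os.symlink(target, sym)      # symbolic link: stores only the path
    count_before = os.stat(hard).st_nlink    # reference count kept by the FS
    os.remove(target)                        # like "rm file.txt"
    count_after = os.stat(hard).st_nlink
    with open(hard) as f:
        data_survives = f.read() == "data"   # data still reachable via hard link
    dangling = os.path.islink(sym) and not os.path.exists(sym)
    return count_before, count_after, data_survives, dangling


with tempfile.TemporaryDirectory() as scratch:
    print(link_demo(scratch))    # (2, 1, True, True)
```

The reference count drops from 2 to 1 and the data stays reachable through the hard link, while the symbolic link is left pointing at a name that no longer exists.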
15General Graph Directory
16General Graph Directory (contd)
- Suppose we are backing up the entire file system or searching for a file through the directory
- With links, we may visit the same subdirectory several times
- Very costly (remember directory is stored on disk)
- We may even loop forever if we have cycles!
- Solution? Simply
- Bypass links during directory traversal!
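As a rough illustration of bypassing links during traversal, the sketch below builds a small directory tree containing a symbolic link that points back to its own ancestor, then walks it without following links. The helper names `build_cyclic_tree` and `count_directories` are hypothetical, and a file system with symbolic-link support is assumed.

```python
import os
import tempfile


def build_cyclic_tree(base):
    """Create a/b plus a symbolic link a/b/loop that points back to a."""
    top = os.path.join(base, "a")
    os.makedirs(os.path.join(top, "b"))
    os.symlink(top, os.path.join(top, "b", "loop"))
    return top


def count_directories(root):
    """Count directories under root without descending into symbolic links."""
    count = 0
    for _dirpath, _dirnames, _filenames in os.walk(root, followlinks=False):
        count += 1
    return count


with tempfile.TemporaryDirectory() as scratch:
    print(count_directories(build_cyclic_tree(scratch)))   # 2
```

Only `a` and `a/b` are visited, so the traversal terminates even though the link creates a cycle.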
17File System Mounting
- A file system must be mounted before it can be accessed
- OS is given name of the device and a mount point
- OS checks device to make sure it has a valid file system
- Then, OS makes the new file system available
18Virtual File Systems
- Multiple file systems can be mounted at same time (typical)
- Disk: UFS (Unix), NTFS (Windows), ext2 (Linux), ext3, ...
- CD: ISO 9660
- File systems on other machines, e.g., Network File System (NFS)
- Each file system has its own file and directory structure, allocation methods, algorithms and data structures, ...
- To shield users from all these differences, OS implements a virtual file system (VFS) layer
- VFS provides a common interface (API) to all file systems
- E.g., applications use open(), read(), write(), without worrying about which file system(s) is (are) being used
19Virtual File System
20File System Implementation
- To implement a file system, we need
- On-disk structures, e.g.,
- directory structure, number of blocks, location of free blocks, boot information, ...
- (In addition to data blocks, of course)
- In-memory structures to
- improve performance (caching)
- manage file system
21On-disk Structures
- Boot block: information to boot the OS
- Volume control block: information about the volume (partition)
- number of blocks, block size, free block count, ...
- UFS calls it superblock
- NTFS stores it in the master file table (relational database)
- File control block (FCB): per file, details about the file, e.g.,
- size, location of data blocks, file permissions, ownership
- UFS calls it inode
- NTFS stores this info in the master file table
- Directory structure: how files are organized into directories
- UFS uses inodes
- NTFS stores this info in the master file table
22On-disk Structures
[Figure: on-disk layout showing the boot block, superblock, directory structure, file control blocks, data blocks of a file, and free blocks]
23In-memory Structures
- Mount table: info on each mounted volume (partition)
- Directory-structure cache: info on recently accessed directories
- System-wide open-file table: contains a copy of the FCB of each open file in the system
- And info on which process is currently using which file
- Per-process open-file table: contains an entry for each file opened by this process, which has
- a pointer to the corresponding entry in the system-wide open-file table
- and info regarding the usage of the file by this process, e.g., current file pointer, open mode (read, write), ...
24Opening a File
- Search the directory to find the file control block
- May need to bring (from disk) multiple directory blocks into memory, if they are not already cached
- Consider the case open("/dir1/dir2/dir3/file.txt")
- Create an entry in the per-process open-file table (PFT)
- Check whether the system-wide open-file table has an entry for this file
- If it does
- Increment its reference count
- Make the entry in PFT point to this entry
- If it does not
- Create a new entry, set its reference count to 1
- Make the entry in PFT point to the new entry
- Return a pointer (file descriptor) to the entry in PFT
- Successive file operations (read, write, ...) use the file descriptor
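The bookkeeping in the steps above can be modeled with two small tables. The sketch below is a toy model, not how any particular kernel does it: the class `OpenFileTables`, its field names, and the `close` method are assumptions added for illustration, and the directory search and FCB copy are left out.

```python
class OpenFileTables:
    """Toy model of the system-wide and per-process open-file tables."""

    def __init__(self):
        self.system_wide = {}    # path -> entry holding a reference count
        self.per_process = {}    # pid -> list of PFT entries (index = descriptor)

    def open(self, pid, path, mode="r"):
        entry = self.system_wide.get(path)
        if entry is None:                       # not open anywhere yet
            entry = {"path": path, "ref_count": 0}
            self.system_wide[path] = entry
        entry["ref_count"] += 1                 # becomes 1 for a new entry
        pft = self.per_process.setdefault(pid, [])
        pft.append({"sys_entry": entry, "offset": 0, "mode": mode})
        return len(pft) - 1                     # file descriptor = index into PFT

    def close(self, pid, fd):
        pft_entry = self.per_process[pid][fd]
        self.per_process[pid][fd] = None        # free the per-process slot
        entry = pft_entry["sys_entry"]
        entry["ref_count"] -= 1
        if entry["ref_count"] == 0:             # last user gone: drop the entry
            del self.system_wide[entry["path"]]


tables = OpenFileTables()
fd1 = tables.open(pid=1, path="/dir1/dir2/dir3/file.txt")
fd2 = tables.open(pid=2, path="/dir1/dir2/dir3/file.txt")
print(tables.system_wide["/dir1/dir2/dir3/file.txt"]["ref_count"])   # 2
```

Each process keeps its own offset and mode, while the shared entry is created once and counted.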
25Opening and Reading from a File
26Creating a File
- Allocate a new file control block (FCB)
- For faster file creation, FCBs are usually pre-allocated → find a free FCB
- Read relevant directory blocks into memory
- Update them to reflect the new file and write them back to disk
- Allocate free blocks for the data of the file
- How do we allocate free blocks to files? And
- How do we know where the free blocks are?
27Allocation Methods
- Problem: allocate free blocks to files
- Given: disks allow random access of blocks
- Objectives: efficient disk space utilization, and fast file access
- Three common allocation methods
- Contiguous
- Linked
- Indexed
28Contiguous Allocation
- Each file occupies a set of contiguous blocks
- Needs only start address (block #) and length (number of blocks)
- Mapping of logical address (LA), with block size = 512: LA / 512 gives quotient Q and remainder R
- Physical block = Q + start
- Offset within block = R
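The mapping for contiguous allocation is a single integer division. A minimal sketch, assuming 512-byte blocks as on the slide; the function name `contiguous_map` and the example numbers are made up.

```python
BLOCK_SIZE = 512  # bytes, as assumed on the slide


def contiguous_map(logical_address, start_block, length_blocks):
    """Map a byte offset in a file to (physical block, offset within block)."""
    q, r = divmod(logical_address, BLOCK_SIZE)   # quotient Q, remainder R
    if q >= length_blocks:
        raise ValueError("address lies beyond the end of the file")
    return start_block + q, r


# Hypothetical file: starts at block 20 and is 5 blocks long.
print(contiguous_map(1300, start_block=20, length_blocks=5))   # (22, 276)
```

No disk access is needed to compute the location, which is why random access is cheap with this method.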
29Contiguous Allocation (contd)
- Pros
- Simple
- Supports random access efficiently
- Minimal disk head seeks → fast
- Cons?
- External fragmentation
- Files may not be able to grow
30Linked Allocation
- Each file is a linked list of blocks
- Blocks could be anywhere
- Each block has a pointer to the next block
- Need start block and end block (to append to
file)
31Linked Allocation (contd)
- Mapping of logical addresses
- Assume pointer takes 1 byte, and block size is 512 bytes, so each block holds 511 bytes of data
- LA / 511 gives quotient Q and remainder R
- Physical block is at Qth location in the chain
- but, how do we get to it? Traverse the chain!
- Offset within block = R + 1
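The chain traversal can be sketched as follows. This assumes the 1-byte pointer sits at the start of each 512-byte block (which is why the offset is R + 1) and models the on-disk pointers with a plain dictionary; `linked_map`, `next_block`, and the block numbers are illustrative only.

```python
BLOCK_SIZE = 512                               # bytes per block
POINTER_SIZE = 1                               # bytes used by the next-block pointer
DATA_PER_BLOCK = BLOCK_SIZE - POINTER_SIZE     # 511 usable bytes per block


def linked_map(logical_address, start_block, next_block):
    """Follow the chain Q times, then skip the pointer inside the block."""
    q, r = divmod(logical_address, DATA_PER_BLOCK)
    block = start_block
    for _ in range(q):               # each hop would be one disk read
        block = next_block[block]
    return block, r + POINTER_SIZE


# Hypothetical chain of blocks: 9 -> 16 -> 1 -> 10 -> 25
chain = {9: 16, 16: 1, 1: 10, 10: 25, 25: None}
print(linked_map(1300, start_block=9, next_block=chain))   # (1, 279)
```

Reaching byte 1300 takes two hops here; in general the cost grows with Q, which is the random-access weakness of linked allocation.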
32Linked Allocation (contd)
- Pros
- No waste of space (except for pointers)
- Simple: need only start and end addresses
- Supports dynamic growing of files
- Cons
- No random access (or very costly to support)
- Reliability: if one block is corrupted, the chain is broken
33Indexed Allocation
- Bring all pointers together into an index block
34Indexed Allocation (cont'd)
- Mapping of logical addresses
- Q = displacement into index block
- R = offset within the block
- Pros
- Supports random access
- Supports dynamic growing of files
- No external fragmentation
- Cons
- Overhead of index blocks
- A file of one or a few data blocks needs an index block
- How do we choose the size of index blocks?
35Indexed Allocation (cont'd)
- First, consider a file with one index block
- Assume each pointer takes 4 bytes, and block size is 512 bytes
- What is the maximum file size supported?
- Index block may have up to 512/4 = 128 entries →
- max file size = 128 × 512 bytes = 64 KB
- Now how do we support larger files?
- Increase size of index blocks → waste space for small files
- Better solutions?
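The single-index-block arithmetic, together with the address mapping for indexed allocation, can be checked with a short sketch. The function names and the sample index block contents are hypothetical.

```python
def max_file_size_single_index(block_size, pointer_size):
    """Largest file addressable through one index block, in bytes."""
    entries = block_size // pointer_size     # pointers that fit in the index block
    return entries * block_size


def indexed_map(logical_address, index_block, block_size=512):
    """Q selects the entry in the index block; R is the offset in that block."""
    q, r = divmod(logical_address, block_size)
    return index_block[q], r


print(max_file_size_single_index(512, 4))          # 65536 bytes = 64 KB
print(indexed_map(1300, [9, 16, 1, 10, 25]))       # (1, 276)
```

Unlike the linked scheme, one lookup in the index block is enough to find any data block.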
36Indexed Allocation (contd)
- Linked index blocks
- Last word in index block points to another index block
- May need to traverse the index linked list (long access time)
- Multilevel index
- First-level index block points to a set of second-level index blocks which refer to data blocks
- Shorter access time but more space overhead
- Combined (used in Unix File System)
- Multilevel and linked
- Each file has an index block (inode), which contains
- Pointers that point to data blocks directly (for small files)
- Pointers that point to index blocks, which in turn may point to either data blocks or another level of index blocks
- UNIX supports up to three levels of index blocks
37Combined Scheme UNIX inode
Assume block size of 4 KB, 4-byte pointers, 12 direct entries, 1 single, 1 double, and 1 triple indirect. What is the max file size supported?
(12 + 1024 + 1024^2 + 1024^3) × 4 KB >> what the 32-bit file pointer can address (4 GB)!
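The inode calculation can be verified numerically. A small sketch under the slide's assumptions (4 KB blocks, 4-byte pointers, 12 direct entries, one single, one double, and one triple indirect pointer); the function name is made up.

```python
def unix_inode_max_file_size(block_size=4096, pointer_size=4, direct=12):
    """Bytes addressable via direct plus single/double/triple indirect pointers."""
    per_block = block_size // pointer_size          # 1024 pointers per index block
    blocks = direct + per_block + per_block**2 + per_block**3
    return blocks * block_size


size = unix_inode_max_file_size()
print(size)              # 4402345721856 bytes, roughly 4 TB
print(size > 2**32)      # True: far more than a 32-bit file pointer can address
```

The triple-indirect term dominates: it alone contributes 1024^3 blocks of 4 KB each.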
38How Do We Know Where Free Blocks Are?
- Bit map
- Every block has a bit: 0 = occupied, 1 = free
- 00011110 01100000 00001111 1..1
- Simple to implement
- Easy to find contiguous blocks
- Supported by hardware
- Single instruction to find offset of first bit with value 1 in a word (of 32 bits) → fast searching
- Disadvantages
- Bit map is stored on disk → slow to access
- Solution: cache it in memory
- Bit maps are not small for large disks → waste of space
- 40-GB disk with 1-KB blocks → 40 M blocks → 5-MB bitmap
- This makes it difficult to cache the entire bitmap
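Both the bitmap-size estimate and the search for the first free block are easy to sketch. The code below assumes bits are numbered from the most significant bit of each 32-bit word, matching the left-to-right bit string above; `bit_length` stands in for the hardware "find first set bit" instruction, and the function names are illustrative.

```python
def bitmap_size_bytes(disk_bytes, block_bytes):
    """One bit per block, rounded up to whole bytes."""
    blocks = disk_bytes // block_bytes
    return (blocks + 7) // 8


def first_free_block(words, word_bits=32):
    """Return the number of the first block whose bit is 1 (free), else None."""
    for i, word in enumerate(words):
        if word != 0:                                # skip fully occupied words
            offset = word_bits - word.bit_length()   # position of first 1 from the left
            return i * word_bits + offset
    return None


print(bitmap_size_bytes(40 * 2**30, 2**10))          # 5242880 bytes = 5 MB
# First word is fully occupied; second word is 00011110 01100000 00001111 11111111
print(first_free_block([0x00000000, 0x1E600FFF]))    # 35
```

Skipping all-zero words a word at a time is what makes the search fast in practice.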
39How Do We Know Where Free Blocks Are?
- Linked List
- No waste of disk space
- But, not easy to get contiguous space
40Disk Scheduling
- Processes issue disk read/write requests
- Kernel maps these requests to physical block addresses
- These requests are sent to disk controller
- Problem: if there are multiple outstanding requests (in a disk queue), which one should be serviced first?
- Objectives
- Fast disk access time
- High disk bandwidth (bytes/sec transferred between disk and memory)
- Fairness (maybe!)
- Before presenting scheduling algorithms, let us understand the structure and operation of magnetic disks
41Disk Physical Structure
- Several platters, each divided into circular tracks, which are subdivided into sectors
- Head moves horizontally from one track to another
- Disk rotates at high speed (60-200 times/sec)
- Tracks accessed at same head position make a cylinder
- Drive can be directly attached to computer via I/O bus (EIDE, ATA, SCSI), or it could be attached through the network (iSCSI)
42Disk Logical Structure
- Disk is viewed as a one-dimensional array of logical blocks
- The logical block is the smallest unit of transfer
- Block sector
- The array of blocks is mapped into sectors of the disk sequentially
- Block 0 is at the first sector of the first track on the outermost cylinder
- Mapping proceeds in order through that track,
- Then the rest of the tracks in that cylinder,
- Then through the rest of the cylinders from outermost to innermost
- Block address: <cylinder, track, sector>
43Disk Operation
- Accessing (reading/writing) a block
- Move the head to desired track (seek time)
- Wait for desired sector to rotate under the head (rotational latency time)
- Transfer the block to a local buffer, then to main memory (transfer time)
- We try to minimize the seek time, which is roughly proportional to the seek distance (distance moved by the head)
44Disk Scheduling Algorithms
- Several algorithms exist to schedule the servicing of disk I/O requests
- FCFS
- SSTF
- SCAN, C-SCAN
- LOOK, C-LOOK
- We illustrate them with a request queue (cylinders 0-199)
- 98, 183, 37, 122, 14, 124, 65, 67
- Assume initial head position at cylinder 53
45First Come First Served
- Find the total head movements to service the request queue 98, 183, 37, 122, 14, 124, 65, 67
- Let us work it out
- Total head movement: 640 cylinders
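The FCFS total can be worked out by summing the distance between consecutive requests. A minimal sketch using the queue and initial head position from the example; the function name is made up.

```python
def fcfs_head_movement(start, requests):
    """Total cylinders traveled when requests are served in arrival order."""
    total, position = 0, start
    for cylinder in requests:
        total += abs(cylinder - position)   # distance for this request
        position = cylinder
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs_head_movement(53, queue))        # 640
```

The large swings (for example 183 to 37 and 122 to 14) account for most of the total.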
46Shortest Seek Time First
- Select request with minimum seek time from
current head position
- Total head movement: 236 cylinders
- May cause starvation of some requests
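SSTF can be sketched as a greedy loop that always picks the pending request closest to the current head position. The function name is illustrative, and ties are broken by queue order, which is an assumption (the example queue has no ties).

```python
def sstf_schedule(start, requests):
    """Return (service order, total head movement) for shortest-seek-time-first."""
    pending = list(requests)
    position, total, order = start, 0, []
    while pending:
        nearest = min(pending, key=lambda c: abs(c - position))
        total += abs(nearest - position)
        position = nearest
        order.append(nearest)
        pending.remove(nearest)
    return order, total


order, total = sstf_schedule(53, [98, 183, 37, 122, 14, 124, 65, 67])
print(order)    # [65, 67, 37, 14, 98, 122, 124, 183]
print(total)    # 236
```

If new requests near the head kept arriving, a far request such as 183 could wait indefinitely, which is the starvation risk noted above.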
47SCAN
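- Disk arm starts at one end and moves toward the other end, servicing requests
- When it gets to the other end, movement is reversed
- Total head movement: 236 cylinders if the head first moves toward cylinder 0 (53 down to 0, then 0 up to 183)
- The often-quoted 208 cylinders assumes the arm reverses at the last request (cylinder 14), which is LOOK rather than SCAN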
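The total for SCAN depends on whether the arm travels all the way to the end of the disk before reversing. The sketch below assumes the head initially moves toward cylinder 0 (the direction is not stated in the example) and computes both variants; the function name and the `go_to_end` flag are made up.

```python
def scan_head_movement(start, requests, lowest=0, go_to_end=True):
    """Sweep toward lower cylinders first, then reverse and sweep upward.

    go_to_end=True  -> SCAN: travel to cylinder `lowest` before reversing.
    go_to_end=False -> LOOK: reverse at the lowest pending request.
    """
    lower = [c for c in requests if c <= start]
    upper = [c for c in requests if c > start]
    turn = lowest if go_to_end else min(lower + [start])
    total = start - turn                  # downward sweep
    if upper:
        total += max(upper) - turn        # upward sweep after reversing
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(scan_head_movement(53, queue, go_to_end=True))    # 236 (SCAN)
print(scan_head_movement(53, queue, go_to_end=False))   # 208 (LOOK)
```

Under these assumptions, a figure of 208 cylinders corresponds to reversing at cylinder 14, while going all the way to cylinder 0 gives 236.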
48Circular SCAN (C-SCAN)
- Provides a more uniform wait time than SCAN
- The head moves from one end to the other, servicing requests as it goes
- When it reaches the other end, however, it immediately returns to the beginning of the disk, without servicing any requests on the return trip
- Treats cylinders as a circular list that wraps around from the last cylinder to the first one
49C-SCAN (contd)
50C-LOOK
- Version of C-SCAN
- Arm only goes as far as the last request in each direction
- Then reverses direction immediately, without first going all the way to the end of the disk
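C-SCAN and C-LOOK can be compared on the same request queue. The sketch below assumes the head sweeps toward higher cylinders and that the return trip counts toward total head movement (conventions differ on this point); the function name and the `look` flag are illustrative.

```python
def circular_head_movement(start, requests, first=0, last=199, look=False):
    """Sweep upward, jump back to the low end, then sweep upward again.

    look=False -> C-SCAN: turn around at the disk edges (first, last).
    look=True  -> C-LOOK: turn around at the outermost pending requests.
    """
    upper = [c for c in requests if c >= start]
    lower = [c for c in requests if c < start]
    if not lower:                         # simplification: no wrap-around needed
        return max(upper) - start if upper else 0
    high = (max(upper) if upper else start) if look else last
    low = min(lower) if look else first
    # upward sweep + return trip + second upward sweep to the last low request
    return (high - start) + (high - low) + (max(lower) - low)


queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(circular_head_movement(53, queue))               # 382 (C-SCAN)
print(circular_head_movement(53, queue, look=True))    # 322 (C-LOOK)
```

C-LOOK saves the travel between the outermost requests and the disk edges (183 to 199 and 0 to 14 in this example).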
51Selecting a Disk-Scheduling Algorithm
- SSTF is common and has a natural appeal
- SCAN and C-SCAN perform better for systems that place a heavy load on the disk
- Performance depends on the number and types of requests
- Requests for disk service can be influenced by the file-allocation method
- The disk-scheduling algorithm should be written as a separate module, allowing it to be replaced with a different algorithm if necessary
- Either SSTF or LOOK is a reasonable choice for the default algorithm
52Summary
- File system interface: file and directory concepts
- Directory structure: tree and general graph
- Multiple file systems: common interface using Virtual FS
- File system implementation
- On-disk structures: directory structure, FCB, superblock, ...
- In-memory structures: caches, open-file tables
- Details of opening, closing, accessing, and creating files
- Block allocation: contiguous, linked, indexed
- Free-space management: bitmap, linked list
- Disk structure: cylinders, tracks, sectors, logical blocks
- Transfer time and positioning time (latency + seek)
- Disk scheduling: to minimize seek time (head movements)
- FCFS, SSTF, SCAN, LOOK