Title: File Structure
1 File Structure
- No structure: files that are a sequence of bytes
- Record structure: lines of text, fixed length,
variable length
- Complex record structures
- Formatted documents
- Relocatable load file
- Other examples
- Created by inserting appropriate control
characters
- Who decides the structure?
- Operating system
- Program
2 File Control Block (FCB)
Storage structure consisting of information about
a file
- Name: human-readable
- Identifier: unique number that identifies each file
- Type: most systems support different types
- Location: pointer to the file's location on the device
- Size: current file size
- Protection: access rights and owner
- Time, date, and user identification: data for
protection, security, and usage
- Where? The disk-resident directory structure
maintains information about files
3 File System Operations
A file is an abstract data type with well-defined
operations
- Create
- Write and Read
- Reposition within file (Seek)
- Delete or Truncate
- Open(Fi): load the file control block from the
directory structure into memory
- Close(Fi): update the file control block and
release resources
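The operations listed above can be walked through on a temporary file. This is only a sketch using the standard `RandomAccessFile` and `Files` classes; the class name `FileOpsDemo`, the file contents, and the returned summary string are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class FileOpsDemo {
    /** Runs each operation once and returns "<bytes read>:<length after truncate>". */
    static String run() {
        try {
            Path path = Files.createTempFile("fileops", ".txt");              // Create
            String result;
            try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw")) { // Open
                raf.write("hello world".getBytes(StandardCharsets.US_ASCII)); // Write
                raf.seek(6);                                                  // Reposition (Seek)
                byte[] buf = new byte[5];
                raf.readFully(buf);                                           // Read 5 bytes
                raf.setLength(5);                                             // Truncate to 5 bytes
                result = new String(buf, StandardCharsets.US_ASCII) + ":" + raf.length();
            }                                                                 // Close (end of try block)
            Files.delete(path);                                               // Delete
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Open and Close are the two operations that touch the file control block; here they are hidden inside the constructor and the try-with-resources block.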
4 Open File Information
- File pointer: pointer to the last read/write
location, per process that has the file open
- File-open count: number of opens, which allows removal from
the open-file list on the last close
- Disk location of the file and a cache of data
- Access rights: per-process access mode
information
- Locking information: mediates access to a file
- Mandatory: access can be denied
- Advisory: processes can inquire about lock status
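5 Java File Exclusive Lock
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public class ExclusiveLockExample {
    public static void main(String[] args) {
        // try-with-resources closes the file and its channel automatically
        try (RandomAccessFile raf = new RandomAccessFile("file.txt", "rw");
             FileChannel ch = raf.getChannel()) {
            // exclusively lock the first half (shared = false)
            FileLock exclusiveLock = ch.lock(0, raf.length() / 2, false);
            try {
                /* Now modify the data . . . */
            } finally {
                exclusiveLock.release(); // release lock
            }
        } catch (IOException ioe) {
            System.out.println("I didn't like that");
        }
    }
}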
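6 Java File Shared Lock
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public class SharedLockExample {
    public static void main(String[] args) {
        try (RandomAccessFile raf = new RandomAccessFile("file.txt", "rw");
             FileChannel ch = raf.getChannel()) {
            long len = raf.length();
            long start = len / 2 + 1;
            // share-lock the second half; the second argument is a size, not an end position
            FileLock sharedLock = ch.lock(start, Math.max(0, len - start), true);
            try {
                /* Now read the data . . . */
            } finally {
                sharedLock.release(); // release lock
            }
        } catch (IOException ioe) {
            System.err.println("I didn't like that");
        }
    }
}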
7 Direct and Sequential Access
- Sequential access: read next, write next, append,
reset, rewrite (no read after the last write)
- Direct access: read n, write n, seek n, read next,
write next, rewrite n
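The difference between "read next" and "read n" can be sketched with `RandomAccessFile`, where direct access amounts to a seek followed by a sequential read. The 4-byte block size, the class name `DirectAccessDemo`, and the sample data are assumptions chosen to keep the example small.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class DirectAccessDemo {
    static final int BLOCK_SIZE = 4; // hypothetical block size, tiny for illustration

    /** Sequential access: read the block at the current position and advance. */
    static String readNext(RandomAccessFile f) throws IOException {
        byte[] buf = new byte[BLOCK_SIZE];
        f.readFully(buf);
        return new String(buf, StandardCharsets.US_ASCII);
    }

    /** Direct access: position to block n, then read it. */
    static String readBlock(RandomAccessFile f, long n) throws IOException {
        f.seek(n * BLOCK_SIZE);
        return readNext(f);
    }

    /** Writes three blocks, then reads block 2 directly and blocks 0 and 1 sequentially. */
    static String demo() {
        try {
            Path path = Files.createTempFile("direct", ".dat");
            try (RandomAccessFile f = new RandomAccessFile(path.toFile(), "rw")) {
                f.write("AAAABBBBCCCC".getBytes(StandardCharsets.US_ASCII)); // write next, three times over
                String direct = readBlock(f, 2); // read n, with n = 2
                f.seek(0);                       // reset
                String first = readNext(f);      // read next
                String second = readNext(f);     // read next
                return direct + "," + first + "," + second;
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```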
8 Indexed Access
9 File System
Abstraction of a raw partition as collections
of files and directories
- Partition: logical or physical contiguous block
of secondary memory
- File control block (FCB): object defining a
file's attributes
- Directory/Folder: a collection of FCBs
- Boot control block: information needed to load the operating
system
- Partition control block: information about the
partition
10 File System Software Structure
- Virtual File System (VFS): wrapper between
applications and different file systems
- Uniform application view
Figure: Layered Approach (labels: File Types, Files/Folders, Read/Write)
11 Directory Structure
- Directory: a collection of nodes containing file
information
Figure: Typical File System Organization (labels: Directory, Files)
12 Directory Design
Note: a directory is another abstract data type
- Operations: Search, Create, Delete, List, Rename,
Traverse
- Design criteria
- Efficiency: locating a file quickly
- Naming: convenient to users, with unique fully
qualified path names
- Grouping: logical grouping of files by extension
or properties
- Access control
- Design decision: should subdirectories be
removed on a delete operation?
13 Single and Two Level Directories
- Single level
- Disadvantages: name conflicts, no sub-folders
- Two level
- Different users can have files with the same name
- Efficient searching but no sub-folders
14 Tree-Structured Directories
- Efficient searching, can group by sub-folders,
working directory, relative path names
- Problem to resolve: how should links work?
15 Acyclic-Graph Directories
- Problems when sharing directories and files
- Aliased names (links)
- Multiple link levels
- Dangling pointers
- Solutions
- Back pointers
- Lazy detection
- Follow link chains
- Remove data when the entry count reaches 0
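The last solution, removing data only when the entry count reaches 0, can be sketched as a reference count kept with each file. Everything here is hypothetical: the class `LinkCountDemo`, its `Node` type, and the flat name-to-node map stand in for real directory entries.

```java
import java.util.HashMap;
import java.util.Map;

final class LinkCountDemo {
    /** Hypothetical file node: data plus the number of directory entries naming it. */
    static final class Node {
        String data;
        int linkCount = 1;
        Node(String data) { this.data = data; }
    }

    private final Map<String, Node> entries = new HashMap<>();

    void create(String name, String data) {
        if (entries.containsKey(name)) throw new IllegalArgumentException("name exists: " + name);
        entries.put(name, new Node(data));
    }

    /** Adds an alias for an existing file and bumps its entry count. */
    void link(String existing, String alias) {
        Node node = entries.get(existing);
        if (node == null || entries.containsKey(alias)) {
            throw new IllegalArgumentException("bad link request");
        }
        node.linkCount++;
        entries.put(alias, node);
    }

    /** Removes one name; returns true only if the data itself was reclaimed. */
    boolean unlink(String name) {
        Node node = entries.remove(name);
        if (node == null) throw new IllegalArgumentException("no such entry: " + name);
        if (--node.linkCount == 0) {
            node.data = null; // last entry gone, so the data can be removed
            return true;
        }
        return false;
    }

    String read(String name) {
        Node node = entries.get(name);
        return node == null ? null : node.data;
    }
}
```

Because the data survives until the final unlink, deleting one name never leaves another name dangling.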
16 General Graph Directory
Issues: cycle detection algorithms, garbage
collection algorithms
17 Mount Points
- A file system can be mounted to enable remote
access.
- Top figure: two unmounted file systems
- Bottom figure: the top file system mounted over
the users directory of the bottom file system.
Original contents are hidden.
18 File Sharing
Files are shared by users locally, over
networks, and across grids
- Application implementations: File Transfer
Protocol (FTP), Network File System (NFS), remote
login (Telnet), Remote Method Invocation (RMI),
client-server model (Web services), distributed
naming (LDAP, DNS)
- Sharing protection: user and group
identifications and access codes
- Network or server failure modes: stateful or
stateless (NFS) systems
- Consistency for simultaneous access: reader/writer,
cache coherence, and transaction commit
algorithms
19 Access Control
- File owner/creator controls what can be done by
whom
- Types of access: Read, Write, Execute, Append,
Delete, List
- Mode of access: read, write, execute
- Three classes of users and examples of access
rights
- a) owner access: 7 => 1 1 1 (RWX)
- b) group access: 6 => 1 1 0 (RW)
- c) public access: 1 => 0 0 1 (X)
- System administrator creates group names and adds
lists of users to them.
- Owner defines access to a particular file (say
game) or subdirectory
Command to set access rights to a file (owner 7, group 6, public 1):
chmod 761 game
Command to attach a group G to a file:
chgrp G game
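Each octal digit in the chmod argument packs three permission bits, worth 4 for read, 2 for write, and 1 for execute. The helper below is a small sketch that expands a three-digit mode into the familiar rwx string; the class and method names are assumptions for illustration, not part of any system API.

```java
final class ModeBits {
    /** Expands a three-digit octal mode such as "761" into owner/group/public rwx flags. */
    static String toSymbolic(String octal) {
        if (!octal.matches("[0-7]{3}")) {
            throw new IllegalArgumentException("expected three octal digits: " + octal);
        }
        StringBuilder sb = new StringBuilder();
        for (char c : octal.toCharArray()) {
            int bits = c - '0';
            sb.append((bits & 4) != 0 ? 'r' : '-'); // read bit
            sb.append((bits & 2) != 0 ? 'w' : '-'); // write bit
            sb.append((bits & 1) != 0 ? 'x' : '-'); // execute bit
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSymbolic("761")); // prints rwxrw---x
    }
}
```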
20 File System Transient Data
(a) Opening a file (b) Reading a file
21 Directory Structure Alternatives
- List of file names and data block pointers
- simple to program
- time-consuming to execute
- O(n) search time
- Hashed with linear list
- O(1) directory search time
- collisions need to be resolved
- fixed hash table size
- Other alternatives
- Chained overflow lists to resolve collisions
- Sort the list of file names: O(lg n) find, O(n)
deletion
File list structure
22 Allocating Space for Files
- Contiguous allocation
- Each file occupies a set of contiguous blocks on
the disk.
- Simple: only the starting block and number of
blocks are required.
- Both random and sequential access are possible
- Wasteful of space (holes).
- Files cannot grow: adjacent space might already be
allocated
- Some systems allocate in groups of blocks
(extents or clusters). Files are linked lists of
these contiguous allocations.
Location of record R:
Block = start + (R * record size) / block size (integer division)
Offset = (R * record size) mod block size
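Read as block = start + (R * record size) / block size and offset = (R * record size) mod block size, the record-location rule for contiguous allocation is plain integer arithmetic. The sketch below assumes records are numbered from 0 and sizes are in bytes; the class and parameter names are hypothetical.

```java
final class ContiguousLocator {
    /** Disk block holding the first byte of record r, for a file starting at block start. */
    static long block(long start, long r, long recordSize, long blockSize) {
        return start + (r * recordSize) / blockSize; // integer division
    }

    /** Byte offset of record r within that block. */
    static long offset(long r, long recordSize, long blockSize) {
        return (r * recordSize) % blockSize;
    }

    public static void main(String[] args) {
        // Record 10 of 100-byte records, 512-byte blocks, file starting at block 100.
        System.out.println(block(100, 10, 100, 512) + " " + offset(10, 100, 512)); // prints 101 488
    }
}
```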
23 Linked Allocation of File Space
- Files are linked lists of blocks; blocks may be
anywhere
- Simple: the directory only has the starting address
- No external fragmentation
- No random access
- File-allocation table (FAT) is a separate index
at the start of a partition linking file system
block numbers. Caching reduces disk seeks (MS-DOS
and OS/2)
- Allocation: search the table for free entries
FAT
Location of record R: Block located by linked
list traversal, Offset = (R * record size) mod block size
Free block count
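With a FAT, finding the block that holds record R means following next-block links from the file's starting block, which is why the slide notes that caching the table matters. The sketch below models the FAT as an in-memory int array; the `END_OF_FILE` and `FREE` marker values, the class name, and the sample table are assumptions, not the on-disk MS-DOS format.

```java
final class FatDemo {
    static final int END_OF_FILE = -1; // hypothetical marker: last block of a file
    static final int FREE = -2;        // hypothetical marker: unallocated block

    /** Follows the chain from startBlock to the block containing record r. */
    static int blockOfRecord(int[] fat, int startBlock, long r, long recordSize, long blockSize) {
        long hops = (r * recordSize) / blockSize; // number of links to follow
        int block = startBlock;
        for (long i = 0; i < hops; i++) {
            block = fat[block];
            if (block == END_OF_FILE) {
                throw new IllegalArgumentException("record lies beyond end of file");
            }
        }
        return block;
    }

    /** Allocation: linear search for the first free entry, or -1 if the table is full. */
    static int firstFree(int[] fat) {
        for (int i = 0; i < fat.length; i++) {
            if (fat[i] == FREE) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        // One file occupying blocks 2 -> 5 -> 3; all other blocks free.
        int[] fat = {FREE, FREE, 5, END_OF_FILE, FREE, 3, FREE, FREE};
        System.out.println(blockOfRecord(fat, 2, 11, 100, 512)); // two hops from block 2
        System.out.println(firstFree(fat));
    }
}
```

The cost of a lookup grows with the record's position in the file, in contrast to the constant-time arithmetic of contiguous allocation.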
24 Indexed Allocation of File Space
- Index block contains block pointers.
- Index table must be maintained and is linked
- Random access possible
- Allows dynamic access without external
fragmentation
- Index table can be cached
Location of record R: Block located by index
table lookup, Offset = (R * record size) mod block size
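With an index block, the logical block number of record R becomes a subscript into the table of block pointers, so random access needs no chain traversal. This is a single-level sketch only; the class name and the sample index contents are assumptions for illustration.

```java
final class IndexedLocator {
    /** Looks up the disk block holding record r in a single-level index block. */
    static int blockOfRecord(int[] indexBlock, long r, long recordSize, long blockSize) {
        long logical = (r * recordSize) / blockSize; // logical block number within the file
        if (logical >= indexBlock.length) {
            throw new IllegalArgumentException("record lies beyond the indexed blocks");
        }
        return indexBlock[(int) logical];
    }

    /** Byte offset of record r within its block. */
    static long offset(long r, long recordSize, long blockSize) {
        return (r * recordSize) % blockSize;
    }

    public static void main(String[] args) {
        int[] indexBlock = {9, 16, 1, 10, 25}; // hypothetical pointers to scattered disk blocks
        System.out.println(blockOfRecord(indexBlock, 11, 100, 512) + " " + offset(11, 100, 512));
    }
}
```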
25 Multi-level Indexed Allocation of File Space
Figure: UNIX inode (4K bytes per block) with labels outer-index, index table, and file
26 Directory Structure
- Goals
- a) Convenient name space
- b) Quick to access and locate
- c) Ability to group related files
- Definitions: path (absolute, relative),
working directory
Single Level
Two Level: fails goal c
Tree Structured
27 Management of Free Space
- Bit vector (one bit per block, 0 = free)
- Extra space needed
- Example: block size = 4096 bytes, disk size = 1
gigabyte, space = 2^30 / (2^12 * 2^3) = 2^15 bytes = 32 KB
- Easy to find groups of contiguous blocks
- Linked list (free list)
- Cannot get contiguous space easily
- No waste of space
- Grouping: separate lists ordered by the number of
contiguous blocks contained.
- Counting: linked list contains block numbers and counts
of adjacent free blocks
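A bit vector makes both the space estimate and the search for contiguous free blocks easy to express. The sketch below uses `java.util.BitSet`, where a set bit means allocated and a clear bit means free, matching the slide's convention that 0 is free; the class and method names are assumptions.

```java
import java.util.BitSet;

final class FreeSpaceBitmap {
    private final BitSet allocated; // clear bit = free block
    private final int blockCount;

    FreeSpaceBitmap(int blockCount) {
        this.blockCount = blockCount;
        this.allocated = new BitSet(blockCount);
    }

    /** Bytes needed for the bitmap: one bit per block, eight bits per byte. */
    static long bitmapBytes(long diskBytes, long blockBytes) {
        return diskBytes / blockBytes / 8;
    }

    void allocate(int block) { allocated.set(block); }

    void free(int block) { allocated.clear(block); }

    /** First free block, or -1 if every block is allocated. */
    int firstFree() {
        int i = allocated.nextClearBit(0);
        return i < blockCount ? i : -1;
    }

    /** Start of the first run of at least length contiguous free blocks, or -1. */
    int firstFreeRun(int length) {
        int start = allocated.nextClearBit(0);
        while (start + length <= blockCount) {
            int nextUsed = allocated.nextSetBit(start); // -1 when no allocated block follows
            if (nextUsed == -1 || nextUsed - start >= length) return start;
            start = allocated.nextClearBit(nextUsed);
        }
        return -1;
    }
}
```

For the slide's example, `bitmapBytes(1L << 30, 4096)` evaluates to 32768 bytes, which is the 32 KB figure.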
28 Efficiency
- Efficiency dependent on
- allocation and access algorithms
- FCBs and directory content
- Caching
- Caching
- Buffer cache: caches disk blocks in a separate section
of memory
- Page cache: caches pages using virtual memory
techniques (memory-mapped I/O)
- Algorithm optimizations
- Use free-behind and read-ahead replacement for
sequential access
- Dedicate a section of memory as a virtual disk (RAM
disk).
Various Disk-Caching Locations
29 Unified and Non-unified Buffer Cache
Unified Buffer
- Buffer cache: holds recently used disk blocks
- Unified buffer cache: applications write
directly into the cache
- Non-unified buffer cache: applications write into the page
cache, and the operating system transfers the data to
the buffer cache. Extra copying needed
No Unified Buffer
30 Reliability
- Consistent backup procedures
- System programs perform full or incremental
backups
- Data recovery recovers lost data from the backup
device
- Consistency checking on reboot
- Inconsistent directories and data blocks repaired
if possible
- Log structured (or journaling)
- Write file system operations as a transaction to
a log.
- Transaction commits after the log write operations
complete
- A background task processes log transactions
- Asynchronously updates the file system
- Deletes the appropriate log records after the update
completes
- After a crash, the system finishes any partial
operations recorded in the log
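The journaling steps can be sketched with an in-memory queue standing in for the on-disk log. This is a deliberately simplified model under stated assumptions: each transaction is a single whole-file write, the class `JournalDemo` and its methods are hypothetical, and nothing here touches a real disk.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

final class JournalDemo {
    private final Map<String, String> fileSystem = new HashMap<>(); // stands in for on-disk state
    private final Deque<String[]> log = new ArrayDeque<>();         // committed, not yet applied

    /** The transaction counts as committed once its record is in the log. */
    void commit(String name, String contents) {
        log.addLast(new String[] {name, contents});
    }

    /** Background step: apply the oldest transaction, then delete its log record. */
    boolean applyNext() {
        String[] tx = log.peekFirst();
        if (tx == null) return false;
        fileSystem.put(tx[0], tx[1]); // update the file system first
        log.removeFirst();            // delete the record only after the update completes
        return true;
    }

    /** After a crash: replay whatever is still in the log, oldest first. */
    void recover() {
        while (applyNext()) {
            // each replay is a whole-file write, so repeating one is harmless
        }
    }

    int pending() { return log.size(); }

    String read(String name) { return fileSystem.get(name); }
}
```

Removing the log record only after the update means a crash between the two steps leaves the record in place, and recovery simply applies it again.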
31 The Sun Network File System (NFS)
Software specification for accessing remote files
across a LAN or WAN
- Implementation
- Solaris and SunOS operating systems on Sun
workstations
- Uses either the UDP/IP or TCP/IP protocol
- Specification
- Networked systems are independent and
heterogeneous
- Sharing of file systems is transparent to users
- Mount operations between clients and servers are
non-transparent
- The remote host IP address must be specified
- Remote directories are mounted over a local file
system directory
- Mounts hide the local directories and subdirectories
they are mounted over
- Remote file systems can mount over any local
directory.
- NFS uses RPC and an External Data Representation
(XDR) protocol
- Servers are stateless but maintain client lists
for server shutdowns
- Cascaded mounts
- File systems can be mounted over other mounted
file systems
- Users do not get access to remote cascaded mounts
32 NFS Mounting
Purpose: establish connections
- Mount operation
- usr/shared over usr/local
- User gains access to shared
- Loses access to local
- Cascaded mount operation
- usr/dir2 over usr/local/dir1
- Now dir2 hides dir1
Three independent file systems
- Mount operations require names of remote
directories and servers
- Clients
- Issue RPC calls to the server to request a mount
- Perform automatic boot-time mounts specified in
/etc/vfstab
- Servers authorize requests using /etc/dfs/dfstab,
which contains
- Local exportable file systems, names of machines
allowed to mount them
- Servers return file handles for further access
- The mount operation
- Changes the user's view but does not affect the
server side.
33 NFS Protocol
- NFS servers
- Use buffering and caching
- All operations are synchronous
- Use RPC calls
- 1-1 API with UNIX system calls (except open,
close)
- No concurrency control
- Layers
- Normal file system calls
- Virtual File System (VFS)
- Normal local file access
- RPCs to remote files
- File-system-specific logic
- Bottom level implements NFS protocol
34 NFS Path-Name Translation
- Performed by breaking the path into component
names and performing a separate NFS lookup call
for every pair of component name and directory
virtual node (vnode)
- To make lookup faster, a directory name lookup
cache on the client's side holds the vnodes for
remote directory names
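Component-by-component translation with a client-side name cache can be sketched as below. The `Vnode` class, the `remoteLookup` method (a stand-in for the NFS lookup RPC), and the call counter are all hypothetical; no real NFS client API is used.

```java
import java.util.HashMap;
import java.util.Map;

final class PathTranslation {
    /** Hypothetical stand-in for a vnode: a directory mapping names to children. */
    static final class Vnode {
        final Map<String, Vnode> children = new HashMap<>();
    }

    // Directory name lookup cache: (directory vnode, component name) -> vnode.
    private final Map<Vnode, Map<String, Vnode>> cache = new HashMap<>();
    int remoteLookups = 0; // counts simulated lookup RPCs

    /** Stand-in for one NFS lookup call on a (directory vnode, component name) pair. */
    private Vnode remoteLookup(Vnode dir, String name) {
        remoteLookups++;
        return dir.children.get(name);
    }

    /** Resolves path one component at a time; returns null if a component is missing. */
    Vnode resolve(Vnode root, String path) {
        Vnode current = root;
        for (String name : path.split("/")) {
            if (name.isEmpty()) continue; // skip leading or doubled slashes
            Map<String, Vnode> dirCache = cache.computeIfAbsent(current, d -> new HashMap<>());
            Vnode next = dirCache.get(name);
            if (next == null) {
                next = remoteLookup(current, name); // cache miss: ask the server
                if (next == null) return null;
                dirCache.put(name, next);
            }
            current = next;
        }
        return current;
    }
}
```

A second resolution of the same path is answered entirely from the cache, so the simulated RPC count does not grow.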