Title: File System Framework 1
1Chapter 8. File System Interface and Framework
- User Interface to Files
- File System
- Special Files
- Vnode/Vfs Architecture
- Operations on Files
2User Interface to Files
- Files
- Directories
- File Descriptors
- File System
3Files and Directories
- Files
- logically a container for data
- Operations: name, organize, control access
- Byte-stream access
- Directories
- tree-structured
- an entry for a file is called a hard link
- The name is not an attribute of the file
- Symbolic link
4Operations on Directories
- First introduced in 4BSD, now supported by SVR4
dirp = opendir(const char *filename);
direntp = readdir(dirp);
rewinddir(dirp);
status = closedir(dirp);
struct dirent {
    ino_t d_ino;                 /* inode number */
    char  d_name[NAME_MAX + 1];  /* null-terminated filename */
};
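The directory calls above can be exercised with a short user-level routine. This is a minimal sketch for a POSIX system; the function name count_entries is hypothetical and only illustrates the opendir/readdir/closedir sequence.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>

/* Count the entries in a directory, skipping "." and "..".
 * Returns -1 if the directory cannot be opened. */
int count_entries(const char *path)
{
    assert(path != NULL);

    DIR *dirp = opendir(path);
    if (dirp == NULL)
        return -1;

    int n = 0;
    struct dirent *direntp;
    while ((direntp = readdir(dirp)) != NULL) {
        /* d_name is the null-terminated filename of this entry */
        if (strcmp(direntp->d_name, ".") == 0 ||
            strcmp(direntp->d_name, "..") == 0)
            continue;
        n++;
    }
    closedir(dirp);
    return n;
}
```

Calling count_entries("/") on a typical system returns a positive count, while a nonexistent path yields -1.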
5File Attributes
- Attributes are stored not in the directory entry, but in an on-disk structure called the inode
- File type
- directories, FIFOs, symbolic links, special files
- Number of hard links to the file
- File size in bytes
- Device ID
6File Attributes (cont)
- inode number
- a single inode associated with each file
- unique identification of a file
- User and Group IDs of the owner of the file
- Timestamps
- last accessed
- last modified
- attributes last changed
7File Attributes (cont)
- Permissions and mode flags
- Permissions: read, write, execute
- Classes of users: owner, group, others
- Mode flags: suid, sgid, sticky
- The sticky bit requests the kernel to retain the program image in the swap area after execution terminates
8File Descriptors
- fd = open(path, oflag, mode)
- path: absolute or relative pathname
- oflag: read, write, read-write, append
- A file descriptor is a per-process object
- The process passes the file descriptor to I/O-related system calls
- The kernel uses the descriptor to quickly locate the open file object and other data structures associated with the open file
- Duplicating a descriptor
- dup( ), dup2( )
9File Descriptors (cont)
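[Figure: A file is opened twice. File descriptors fd1 and fd2 each refer to a separate open file object with its own offset; both open file objects refer to the same file.]
[Figure: A descriptor is duplicated. File descriptors fd1 and fd2 refer to the same open file object and share a single offset into the file.]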
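The difference between the two cases can be observed from user space. The sketch below assumes a POSIX system with a writable /tmp; the function name and the scratch file template are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Read 4 bytes through fd1, then report the offset seen through fd2.
 * If use_dup is nonzero, fd2 = dup(fd1); otherwise fd2 comes from a
 * second open() of the same file. Returns -1 on any error. */
long second_offset_after_read(int use_dup)
{
    char path[] = "/tmp/fdoffsetXXXXXX";    /* hypothetical scratch file */
    char buf[4];
    long result = -1;

    int fd1 = mkstemp(path);                /* creates and opens read-write */
    if (fd1 < 0)
        return -1;

    if (write(fd1, "abcdefgh", 8) == 8 && lseek(fd1, 0, SEEK_SET) == 0) {
        int fd2 = use_dup ? dup(fd1) : open(path, O_RDONLY);
        if (fd2 >= 0) {
            assert(fd2 != fd1);             /* two distinct descriptors */
            if (read(fd1, buf, sizeof buf) == 4)
                result = (long)lseek(fd2, 0, SEEK_CUR);
            close(fd2);
        }
    }
    close(fd1);
    unlink(path);
    return result;
}
```

With dup( ) the two descriptors share one open file object, so the read through fd1 moves the offset seen through fd2. With a second open( ), fd2 has its own open file object and its offset stays at 0.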
10File I/O
- File I/O
- e.g. nread = read(fd, buf, count)
- Scatter-Gather I/O
- readv: moves data from the file into multiple buffers in user space
- writev: composes a single request that collects the data from all the buffers
- e.g. nbytes = writev(fd, iov, iovcnt)
11Scatter-Gather I/O
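[Figure: A struct uio (uio_iovec, uio_iovcnt = 3, uio_offset, ...) refers to an array of struct iovec entries, each describing a buffer in the process address space; the data is transferred to or from the file on disk.]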
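A gather write can be sketched with the POSIX writev call. The function name write_two_parts is hypothetical; the example gathers a header and a body from two separate buffers into one request.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Write header followed by body with a single writev() request.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_two_parts(int fd, const char *header, const char *body)
{
    struct iovec iov[2];

    assert(header != NULL && body != NULL);

    /* Each iovec names one buffer in user space and its length. */
    iov[0].iov_base = (void *)header;
    iov[0].iov_len  = strlen(header);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = strlen(body);

    return writev(fd, iov, 2);              /* iovcnt = 2 */
}
```

The caller does not have to copy the two pieces into one contiguous buffer first, which is the main point of gather I/O.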
12File Systems
- root file system
- system root directory
- mounting
- s5fs, FFS: tracked in a mount table
- modern UNIX: a list of virtual file system (vfs) structures
- logical disks
- each file system is fully contained in a single logical disk
- one logical disk may contain only one file system
13Mounting File System
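[Figure: Two file systems. fs0 has a root directory / containing usr, sys, dev, etc, and bin. fs1 has a root directory / containing local, users, adm, and bin.]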
14Special Files
- UNIX file system
- generalization of the file abstraction to include all kinds of I/O-related objects
- FIFOs: mknod( ) system call
- Pipes: pipe( ) system call
- BSD uses sockets to implement a pipe
- SVR4 uses STREAMS
- Symbolic links (vs. hard links)
15Hard Links
- Limitations
- SVR3, 4.1BSD support hard links only
- may not span file systems
- creating hard links to directories is barred except to the superuser
- cycles in the directory tree may confuse utilities such as du or find
[Figure: hard link example (link1, file1).]
16Symbolic Links
- A special file that points to another file (the linked-to file)
- The data portion of the special file contains the pathname of the linked-to file
- The pathname traversal routine recognizes symbolic links and translates them to obtain the name of the linked-to file
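That the data of a symbolic link is simply a pathname can be checked with symlink and readlink. A minimal sketch for a POSIX system with a writable /tmp follows; the function name and scratch directory template are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Create a symbolic link to target in a scratch directory, read the
 * link's data back with readlink(), and store it in out.
 * Returns the length of the stored pathname, or -1 on error. */
long symlink_roundtrip(const char *target, char *out, size_t outsize)
{
    char dir[] = "/tmp/symdemoXXXXXX";      /* hypothetical scratch dir */
    char linkpath[64];
    long n = -1;

    assert(target != NULL && out != NULL && outsize > 0);

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(linkpath, sizeof linkpath, "%s/link", dir);

    if (symlink(target, linkpath) == 0) {
        ssize_t len = readlink(linkpath, out, outsize - 1);
        if (len >= 0) {
            out[len] = '\0';                /* readlink does not terminate */
            n = (long)len;
        }
        unlink(linkpath);
    }
    rmdir(dir);
    return n;
}
```

The target does not have to exist: the link only stores the pathname, and resolution happens later during pathname traversal.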
17Vnode/Vfs Architecture
- Objectives
- support several file systems
- Users are presented with a consistent view of the entire file tree and need not be aware of the differences in the on-disk representations of the subtrees
- support for sharing files over a network
- allow vendors to create their own file system types and add them to the kernel in a modular manner
18Overview of Vnode/Vfs
- Vnode (virtual node)
- represents a file in the kernel
- Vfs (virtual file system)
- represents a file system
- v_data, v_op fields in the vnode struct
- filled in when the vnode is initialized (open( ) or creat( ) system calls)
- e.g.
- #define VOP_CLOSE(vp, ...) (*((vp)->v_op->vop_close))(vp, ...)
19Vnode Abstraction
data fields (struct vnode): v_count, v_data, v_type, v_op, v_vfsmountedhere, ...
file-system-dependent private data
virtual functions (struct vnodeops): vop_open, vop_lookup, vop_read, vop_mkdir, vop_getattr, ...
file-system-dependent implementation of vnodeops functions
utility routines and macros: vn_open, VN_HOLD, vn_link, VN_RELE, ...
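The vnode abstraction is essentially a base class with virtual functions, written in C with function pointers. The user-space sketch below is not kernel code: the toy file systems, their return values, and the single-argument macros are assumptions made for illustration, and the structures are reduced to a few fields.

```c
#include <assert.h>
#include <stddef.h>

struct vnode;                               /* forward declaration */

/* Simplified operations vector; the real one has many more entries. */
struct vnodeops {
    int (*vop_open)(struct vnode *vp);
    int (*vop_close)(struct vnode *vp);
};

struct vnode {
    unsigned short   v_count;               /* reference count */
    struct vnodeops *v_op;                  /* operations vector */
    void            *v_data;                /* file-system-private data */
};

/* Dispatch macros in the style of VOP_CLOSE on the overview slide. */
#define VOP_OPEN(vp)  ((*((vp)->v_op->vop_open))(vp))
#define VOP_CLOSE(vp) ((*((vp)->v_op->vop_close))(vp))

/* Two toy "file systems"; the return values are arbitrary tags. */
static int toyufs_open(struct vnode *vp)  { (void)vp; return 1; }
static int toyufs_close(struct vnode *vp) { (void)vp; return 1; }
static int toynfs_open(struct vnode *vp)  { (void)vp; return 2; }
static int toynfs_close(struct vnode *vp) { (void)vp; return 2; }

static struct vnodeops toyufs_vnodeops = { toyufs_open, toyufs_close };
static struct vnodeops toynfs_vnodeops = { toynfs_open, toynfs_close };

/* File-system-independent code: it never names a specific file system. */
int open_and_close(struct vnode *vp)
{
    assert(vp != NULL && vp->v_op != NULL);
    int who = VOP_OPEN(vp);                 /* calls toyufs_open or toynfs_open */
    VOP_CLOSE(vp);
    return who;
}
```

The caller of open_and_close only sees struct vnode; which implementation runs is decided by the v_op pointer stored in the vnode when it was set up.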
20Vfs Abstraction
data fields (struct vfs): vfs_next, vfs_data, vfs_fstype, vfs_op, vfs_vnodecovered, ...
file-system-dependent private data
virtual functions (struct vfsops): vfs_mount, vfs_root, vfs_unmount, vfs_sync, vfs_statvfs, ...
file-system-dependent implementation of vfsops functions
21Vnode and Open Files
[Figure: A file descriptor refers to a struct file (open mode flags, vnode pointer, offset pointer, ...). The vnode pointer refers to a struct vnode, whose v_data and v_op fields refer to file-system-dependent objects.]
22Vnode Structure
struct vnode {
    u_short          v_flag;             /* V_ROOT, etc. */
    u_short          v_count;            /* reference count */
    struct vfs      *v_vfsmountedhere;   /* for mount points */
    struct vnodeops *v_op;               /* vnode operations vector */
    struct vfs      *v_vfsp;             /* file system to which it belongs */
    struct stdata   *v_stream;           /* pointer to associated stream */
    struct page     *v_page;             /* resident page list */
    enum vtype       v_type;             /* file type */
    dev_t            v_rdev;             /* device ID for device files */
    caddr_t          v_data;             /* pointer to private data structure */
    . . .
};
23Vfs Structure
struct vfs {
    struct vfs    *vfs_next;          /* next vfs in list */
    struct vfsops *vfs_op;            /* operations vector */
    struct vnode  *vfs_vnodecovered;  /* vnode mounted on */
    int            vfs_fstype;        /* file system type index */
    caddr_t        vfs_data;          /* private data */
    dev_t          vfs_dev;           /* device ID */
    . . .
};
24Vnode and Vfs objects
[Figure: rootvfs heads a list of struct vfs objects linked through vfs_next, one for the root file system and one for the mounted file system; each has a vfs_vnodecovered field. Three struct vnode objects are shown with their VROOT flag, v_vfsp, and v_vfsmountedhere fields: the vnode of /, the vnode of /usr, and the / vnode of the mounted file system.]
25File System-Dependent Objects
- vnodeops vector
struct vnodeops {
    int (*vop_open)();
    int (*vop_close)();
    int (*vop_read)();
    int (*vop_write)();
    int (*vop_ioctl)();
    int (*vop_getattr)();
    int (*vop_lookup)();
    int (*vop_rename)();
    . . .
};
26File System-Dependent Vnode
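[Figure: File-system-dependent vnodes. Each vnode is embedded in a file-system-specific object: i_vnode inside a struct inode, r_vnode inside a struct rnode. Each vnode's v_data and v_op fields refer to file-system-dependent objects; v_op refers to a struct vnodeops such as (ufs_open, ufs_close, ...) or (nfs_open, nfs_close, ...).]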
27File System-Dependent Vfs
- Vfs
struct vfsops {
    int (*vfs_mount)();
    int (*vfs_unmount)();
    int (*vfs_root)();
    int (*vfs_statvfs)();
    int (*vfs_sync)();
    . . .
};
28Vfs layer data structures
[Figure: Vfs layer data structures. rootvfs heads a list of struct vfs objects (vfs_data, vfs_next, vfs_op, ...). Each vfs_data field refers to a file-system-dependent data structure (struct ufs_vfsdata or struct mntinfo), and each vfs_op field refers to a struct vfsops such as (ufs_mount, ufs_unmount, ...) or (nfs_mount, nfs_unmount, ...).]
29Mounting a File System
- Mount in SVR4
- mount (spec, dir, flags, type, dataptr, datalen)
- Virtual file system switch
struct vfssw {
    char          *vsw_name;      /* file system type name */
    int          (*vsw_init)();   /* address of initialization routine */
    struct vfsops *vsw_vfsops;    /* vfs operations vector for this fs */
    . . .
} vfssw[];
30Mount Implementation
- 1. Uses lookuppn( ) to obtain the vnode of the mount point directory
- 2. Searches the vfssw table to find the entry matching the type name
- 3. Invokes the vsw_init operation
- calls a file-system-specific initialization routine that allocates data structures and resources needed to operate the file system
- 4. Allocates a new vfs structure, and
31Mount Implementation (cont)
- Adds the structure to the linked list headed by rootvfs
- Sets the vfs_op field to the vfsops vector in the switch entry
- Sets vfs_vnodecovered to the vnode of the mount point directory
- Sets v_vfsmountedhere of the covered directory's vnode to the vfs structure
- 5. Invokes the VFS_MOUNT operation of the vfs
- It performs the file-system-dependent processing of the mount call
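Steps 2 through 5 of the mount procedure can be modeled in user space. This is a toy sketch, not the SVR4 code: the structures are reduced to the fields named on the slides, the "toyfs" type and all toy_ functions are hypothetical, and step 1 (lookuppn) is assumed to have been done by the caller, who passes in the mount point vnode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct vnode;
struct vfs;

struct vfsops { int (*vfs_mount)(struct vfs *vfsp); };

struct vfs {
    struct vfs    *vfs_next;            /* next vfs in list */
    struct vfsops *vfs_op;              /* operations vector */
    struct vnode  *vfs_vnodecovered;    /* vnode mounted on */
    void          *vfs_data;            /* private data */
};

struct vnode { struct vfs *v_vfsmountedhere; };

struct vfssw {
    const char    *vsw_name;
    int          (*vsw_init)(void);
    struct vfsops *vsw_vfsops;
};

static int toy_private;                 /* stands in for per-fs private data */
static int toy_init(void) { return 0; }
static int toy_vfs_mount(struct vfs *vfsp)
{
    vfsp->vfs_data = &toy_private;      /* what VFS_MOUNT would set up */
    return 0;
}

static struct vfsops toy_vfsops = { toy_vfs_mount };
static struct vfssw vfssw[] = { { "toyfs", toy_init, &toy_vfsops } };
static struct vfs *rootvfs = NULL;      /* head of the vfs list */

int toy_mount(struct vnode *dir, const char *type)
{
    size_t i, n = sizeof vfssw / sizeof vfssw[0];
    struct vfs *vfsp, **pp;

    assert(dir != NULL && type != NULL);

    for (i = 0; i < n; i++)             /* step 2: search the switch */
        if (strcmp(vfssw[i].vsw_name, type) == 0)
            break;
    if (i == n)
        return -1;
    if (vfssw[i].vsw_init() != 0)       /* step 3 */
        return -1;

    vfsp = calloc(1, sizeof *vfsp);     /* step 4 (never freed in this toy) */
    if (vfsp == NULL)
        return -1;
    for (pp = &rootvfs; *pp != NULL; pp = &(*pp)->vfs_next)
        ;                               /* find the tail of the list */
    *pp = vfsp;
    vfsp->vfs_op = vfssw[i].vsw_vfsops;
    vfsp->vfs_vnodecovered = dir;
    dir->v_vfsmountedhere = vfsp;

    return vfsp->vfs_op->vfs_mount(vfsp);   /* step 5 */
}
```

After a successful call, the mount point vnode and the new vfs point at each other through v_vfsmountedhere and vfs_vnodecovered, which is the linkage pathname traversal later relies on.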
32VFS_MOUNT Processing
- 1. Verify permissions for the operation
- 2. Allocate and initialize the private data object of the file system
- 3. Store a pointer to it in the vfs_data field of the vfs object
- 4. Access the root directory of the file system and initialize its vnode in memory
- The kernel accesses the root of a mounted file system through the VFS_ROOT operation
33VFS_MOUNT Processing (cont)
- For local file systems
- implement VFS_MOUNT by reading in the file system metadata (such as the superblock for s5fs) from disk
- For distributed file systems
- send a remote mount request to the file server
34Operations on Files
- Pathname traversal
- Directory lookup cache
- VOP_LOOKUP operation
- Opening a file
- File I/O
- File Attributes
- User Credentials
35Pathname Traversal
- lookuppn( )
- a file-system-independent function
- translates a pathname and returns a pointer to the vnode of the desired file
- Tasks
- from the starting vnode, executes a loop, parsing one component of the pathname at a time
- For each iteration, performs the following
36Pathname Traversal (cont)
- 1. Make sure the vnode is that of a directory
- 2. Handle the special case in which the component is ".."
- 3. Invoke the VOP_LOOKUP operation on this vnode
- This results in a call to the lookup function of this specific file system (s5lookup( ), ufs_lookup( ), etc.)
- It searches the directory for the component and, if found, returns a pointer to the vnode of that file
- 4. If the new component is a mount point (v_vfsmountedhere != NULL)
- follow the pointer to the vfs object of the mounted file system and invoke its vfs_root operation to return the root vnode of that file system
37Pathname Traversal (cont)
- 5. If the new component is a symbolic link
- invoke its VOP_READLINK operation to translate the symbolic link. Append the rest of the pathname to the contents of the link and restart the iteration
- 6. Release the directory it just finished searching
- 7. Finally, go back to the top of the loop, and search for the next component in the directory represented by the new vnode
- 8. When no components are left, terminate the search
- If the search was successful, do not release the hold on the final vnode; return a pointer to this vnode to the caller
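The core of the traversal loop (steps 1, 3, 4, and 7) can be sketched as a user-space toy. This is not lookuppn itself: directories are modeled as small in-memory arrays, all names are hypothetical, and the handling of "..", symbolic links, and reference counts (steps 2, 5, 6, and 8) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAXENT 4

struct vfs;

struct vnode {
    int           is_dir;               /* stand-in for v_type == VDIR */
    struct vfs   *v_vfsmountedhere;     /* non-NULL if a fs is mounted here */
    const char   *names[MAXENT];        /* toy directory contents */
    struct vnode *children[MAXENT];
};

struct vfs { struct vnode *root; };     /* what VFS_ROOT would return */

/* Stand-in for VOP_LOOKUP: search one directory for one component. */
static struct vnode *toy_lookup(struct vnode *dvp, const char *name, size_t len)
{
    for (int i = 0; i < MAXENT && dvp->names[i] != NULL; i++)
        if (strlen(dvp->names[i]) == len &&
            strncmp(dvp->names[i], name, len) == 0)
            return dvp->children[i];
    return NULL;
}

/* Translate an absolute pathname to a vnode; returns NULL on failure. */
struct vnode *toy_lookuppn(struct vnode *rootdir, const char *path)
{
    struct vnode *vp = rootdir;

    assert(rootdir != NULL && path != NULL && path[0] == '/');

    while (*path != '\0') {
        while (*path == '/')                    /* skip separators */
            path++;
        if (*path == '\0')
            break;
        size_t len = strcspn(path, "/");        /* length of this component */

        if (!vp->is_dir)                        /* step 1 */
            return NULL;
        vp = toy_lookup(vp, path, len);         /* step 3 */
        if (vp == NULL)
            return NULL;
        while (vp->v_vfsmountedhere != NULL)    /* step 4: cross mount point */
            vp = vp->v_vfsmountedhere->root;
        path += len;                            /* step 7: next component */
    }
    return vp;
}
```

If a file system whose root contains bin is mounted on /usr, then looking up /usr/bin in this toy crosses from the covered vnode to the mounted root before searching for bin, which is the behavior step 4 describes.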
38VOP_LOOKUP Operation
- Interface to the file-system-specific function that looks up a filename component in a directory
- error = VOP_LOOKUP(vp, compname, &tvp, ...)
- This results in a call to the file-system-specific lookup( )
- lookup( ) for local file systems
- performs the search by iterating through the directory entries block by block
- if the directory contains a valid match for the component, the lookup function checks whether the vnode of the file is already in memory
- if the vnode is found in memory, increments its reference count and returns it to the caller
- if the vnode is not in memory, allocates and initializes a vnode
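The "already in memory?" check at the end of a local lookup can be sketched as follows. This is a toy model under stated assumptions: vnodes live in a small fixed table keyed by a hypothetical inode number field, and toy_vget and toy_vrele are made-up names, not kernel routines.

```c
#include <assert.h>
#include <stddef.h>

#define NVNODE 8

struct vnode {
    unsigned short v_count;     /* reference count; 0 means the slot is free */
    unsigned long  inum;        /* hypothetical key: inode number */
};

static struct vnode vnode_table[NVNODE];

/* Return a held vnode for inode inum, reusing an in-memory one if present.
 * Returns NULL if the table is full. */
struct vnode *toy_vget(unsigned long inum)
{
    struct vnode *free_slot = NULL;

    for (int i = 0; i < NVNODE; i++) {
        struct vnode *vp = &vnode_table[i];
        if (vp->v_count > 0 && vp->inum == inum) {
            vp->v_count++;              /* found in memory: take a reference */
            return vp;
        }
        if (vp->v_count == 0 && free_slot == NULL)
            free_slot = vp;             /* remember the first free slot */
    }
    if (free_slot == NULL)
        return NULL;
    free_slot->inum = inum;             /* not in memory: initialize a vnode */
    free_slot->v_count = 1;
    return free_slot;
}

/* Drop one reference. */
void toy_vrele(struct vnode *vp)
{
    assert(vp != NULL && vp->v_count > 0);
    vp->v_count--;
}
```

Two lookups of the same file return the same vnode with a reference count of 2, so every user of the file shares one in-memory object.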
39Opening a File
- open( ) system call
- Algorithm
- 1. Allocate a file descriptor
- 2. Allocate an open file object (struct file)
- 3. Call lookuppn( ) to traverse the pathname and return the vnode of the file
- 4. Check the permissions
40Opening a File (cont)
- 5. If the file does not exist, check if O_CREAT is specified
- 6. Invoke the VOP_OPEN operation of that vnode for file-system-dependent processing
- 7. Initialize the open file object
- store the vnode pointer and the open mode flags in it
- 8. Return the index of the file descriptor
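The bookkeeping side of the open algorithm (steps 1, 2, 7, and 8) can be sketched with a toy descriptor table. Everything here is a simplification: the table sizes, field names, and toy_ functions are assumptions, and steps 3 to 6 are taken as already done, so the caller passes in the vnode that lookuppn would have returned.

```c
#include <assert.h>
#include <stddef.h>

#define NOFILE 4

struct vnode { int v_count; };

struct file {
    int           f_flag;       /* open mode flags */
    struct vnode *f_vnode;      /* vnode pointer */
    long          f_offset;     /* current offset */
    int           in_use;
};

static struct file  file_pool[NOFILE];  /* system-wide open file objects */
static struct file *fd_table[NOFILE];   /* per-process in a real kernel */

/* Returns the new descriptor, or -1 if a table is full. */
int toy_open(struct vnode *vp, int oflag)
{
    int fd, i;

    assert(vp != NULL);

    for (fd = 0; fd < NOFILE && fd_table[fd] != NULL; fd++)
        ;                               /* step 1: lowest free descriptor */
    if (fd == NOFILE)
        return -1;
    for (i = 0; i < NOFILE && file_pool[i].in_use; i++)
        ;                               /* step 2: free open file object */
    if (i == NOFILE)
        return -1;

    file_pool[i].in_use   = 1;          /* step 7: initialize the object */
    file_pool[i].f_flag   = oflag;
    file_pool[i].f_vnode  = vp;
    file_pool[i].f_offset = 0;
    fd_table[fd] = &file_pool[i];
    return fd;                          /* step 8 */
}

/* Release a descriptor and its open file object. */
int toy_close(int fd)
{
    if (fd < 0 || fd >= NOFILE || fd_table[fd] == NULL)
        return -1;
    fd_table[fd]->in_use = 0;
    fd_table[fd] = NULL;
    return 0;
}
```

Because the lowest free slot is always chosen, closing descriptor 0 and opening again yields 0, matching the usual UNIX behavior.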
41File System Types in SVR4
- s5fs: original System V file system
- ufs: Berkeley Fast File System
- vxfs: Veritas journaling file system
- specfs: file system for device special files
- NFS: Network File System
- RFS: Remote File Sharing file system
- fifofs: file system for first-in, first-out files
- /proc: file system that represents each process as a file
- bfs: boot file system
42Drawbacks of SVR4 Implementation
- lookuppn( )
- translates the pathname one component at a time
- statelessness of the pathname lookup operation
- in 4.4BSD
- use a stateful model and an enhanced lookup
operation