Introduction to UNIX System Programming

1
Introduction to UNIX System Programming
  • By
  • Armin R. Mikler

2
Overview
  • Basic UNIX Commands
  • Files
  • Buffered vs. non-buffered I/O
  • Basic System Calls
  • Processes
  • What's a process anyway?
  • The fork() System Call
  • Coordinating Processes (wait, exit, etc.)
  • Inter-Process Communication
  • Pipes

3
Basic UNIX Commands
  • Login
  • username
  • password
  • => User Shell.
  • The User Shell is
  • The Command Interpreter
  • A running program
  • UNIX Commands are (often small) programs.
  • What else does the Shell do?
  • Basic Commands
  • who am i
  • pwd
  • who
  • what
  • ps ()
  • finger
  • ls
  • mkdir
  • rm (-i -f -r)
  • touch
  • cat (note there is no dog)
  • grep

4
more basic UNIX
  • Editors
  • emacs
  • vi
  • joe
  • sed
  • and others
  • Compilers/Interpreters
  • gcc
  • g++
  • perl
  • java
  • etc.
  • The UNIX Manual
  • Use the manual pages to get information about a
    specific command or system call.
  • The UNIX manual is divided into sections.
  • Careful!! The same name (e.g., of a system call)
    can (and does) appear in different sections in
    different contexts.
  • Use man -s i subject to refer to section i.

5
man pages
  • man -k keyword(s)
  • prints the header line of manual pages that
    contain the keyword(s)
  • apropos keyword(s)
  • same as man -k
  • Questions
  • Which manual section contains UNIX user commands?
  • Which manual section contains UNIX system calls?
  • What is the difference between commands and
    system calls?
  • Try xman, a manual page browser for X11.

6
Files
  • UNIX Input/Output operations are based on the
    concept of files.
  • Files are an abstraction of specific I/O devices.
  • A very small set of system calls provides the
    primitives that give direct access to the I/O
    facilities of the UNIX kernel.
  • Most I/O operations rely on the use of these
    primitives.
  • We must remember that the basic I/O primitives
    are system calls, executed by the kernel. What
    does that mean to us as programmers???

7
UNIX I/O Primitives
  • open - Opens a file for reading or writing, or
    creates an empty file
  • creat - Creates an empty file
  • close - Closes a previously opened file
  • read - Extracts information from a file
  • write - Places information into a file
  • lseek - Moves to a specific byte in the file
  • unlink - Removes a file
  • remove - Removes a file

8
A rudimentary example
  • #include <fcntl.h>   /* controls file attributes */
  • #include <unistd.h>  /* defines symbolic constants */
  • int main(void)
  • {
  •   int fd;          /* a file descriptor */
  •   ssize_t nread;   /* number of bytes read */
  •   char buf[1024];  /* data buffer */
  •   /* open the file "data" for reading */
  •   fd = open("data", O_RDONLY);
  •   /* read in the data */
  •   nread = read(fd, buf, 1024);
  •   /* close the file */
  •   close(fd);
  •   return 0;
  • }

9
Buffered vs unbuffered I/O
  • The system can execute in user mode or kernel
    mode!
  • Memory is divided into user space and kernel
    space!
  • What happens when we write to a file?
  • the write call forces a context switch to the
    system. What??
  • the system copies the specified number of bytes
    from user space into kernel space. (into mbufs)
  • the system wakes up the device driver to write
    these mbufs to the physical device (if the
    file-system is in synchronous mode).
  • the system selects a new process to run.
  • finally, control is returned to the process that
    executed the write call.
  • Discuss the effects on the performance of your
    program!

10
Un-buffered I/O
  • Every read and write is executed by the kernel.
  • Hence, every read and write will cause a context
    switch in order for the system routines to
    execute.
  • Why do we suffer performance loss?
  • How can we reduce the loss of performance?
  • => We could try to move as much data as possible
    with each system call.
  • How can we measure the performance?

11
Buffered I/O
  • explicit versus implicit buffering
  • explicit - collect as many bytes as you can
    before writing to file and read more than a
    single byte at a time.
  • However, use the basic UNIX I/O primitives
  • Careful!! Your program may behave differently on
    different systems.
  • Here, the programmer is explicitly controlling
    the buffer-size
  • implicit - use the Stream facility provided by
    <stdio.h>
  • FILE *fd, fopen, fprintf, fflush, fclose, ...
    etc.
  • a FILE structure contains a buffer (in user
    space) that is usually the size of the disk
    blocking factor (512 or 1024)

12
File Locking
  • Consider the following problem
  • Processes can obtain a unique integer by reading
    from a file. The file contains a single integer
    (at all times), which must be incremented by the
    process that executes a read. Since multiple
    processes can compete for the file (a unique
    integer), we must make sure that the file access
    is synchronized.
  • HOW??
  • What happens if we use buffered I/O ?

13
lockf()
  • lockf() is a C-Library function for locking
    records of a file. Its prototype is
  • int lockf( int fd, int func, long size)
  • func-parameters are
  • F_ULOCK = 0 (unlock a locked section)
  • F_LOCK = 1 (lock a section)
  • F_TLOCK = 2 (test and lock a section)
  • F_TEST = 3 (test a section for locks)
  • see the UNIX manual pages!!

14
  • If we rewind the file before locking AND use
    0L as the size parameter, the entire file is
    locked.
  • lseek(fd, 0L, 0) can be used to rewind the file
    (fd) to the beginning (0 is SEEK_SET).

15
flock()
  • flock() is a UNIX system call to apply or remove
    an advisory lock to an open file
  • The locking is only on an advisory basis (not
    absolute)
  • Prototype: int flock(int fd, int operation)
  • see manual pages

16
UNIX Processes
  • A program that has started is manifested in the
    context of a process.
  • A process in the system is represented by
  • Process Identification Elements
  • Process State Information
  • Process Control Information
  • User Stack
  • Private User Address Space, Programs and Data
  • Shared Address Space

17
Process Control Block
  • Process Identification, Process State Information,
    and Process Control Information constitute the
    PCB.
  • All Process State Information is stored in the
    Process Status Word (PSW).
  • All information needed by the OS to manage the
    process is contained in the PCB.
  • A UNIX process can be in a variety of states

18
States of a UNIX Process
  • User running - Process executes in user mode
  • Kernel running - Process executes in kernel mode
  • Ready to run, in memory - Process is waiting to be
    scheduled
  • Asleep, in memory - Process is waiting for an event
  • Ready to run, swapped - Process is ready to run but
    requires swapping in
  • Preempted - Process is returning from kernel to
    user mode but the system has scheduled another
    process instead
  • Created - Process is newly created and not ready
    to run
  • Zombie - Process no longer exists, but it leaves a
    record for its parent process to collect.
  • See Process State Diagram!!

19
Creating a new process
  • In UNIX, a new process is created by means of the
    fork() system call. The OS performs the
    following functions:
  • It allocates a slot in the process table for the
    new process
  • It assigns a unique ID to the new process
  • It makes a copy of process image of the parent
    (except shared memory)
  • It assigns the child process to the Ready to Run
    State
  • It returns the ID of the child to the parent
    process, and 0 to the child.
  • Note, the fork() call actually is called once but
    returns twice - namely in the parent and the
    child process.

20
Fork()
  • pid_t fork(void) is the prototype of the fork()
    call.
  • Remember that fork() returns twice
  • in the newly created (child) process with return
    value 0
  • in the calling process (parent) with return value
    pid of the new process.
  • A negative return value (-1) indicates that the
    call has failed
  • Different return values are the key for
    distinguishing parent process from child process!
  • The child process is an exact copy of the parent,
    yet it is only a copy, i.e., an identical but
    separate process image.

21
A fork() Example
  • #include <stdio.h>
  • #include <unistd.h>
  • int main(void)
  • {
  •   pid_t pid;  /* process id */
  •   printf("just one process before the fork()\n");
  •   pid = fork();
  •   if (pid == 0)
  •     printf("I am the child process\n");
  •   else if (pid > 0)
  •     printf("I am the parent process\n");
  •   else
  •     printf("DANGER Mr. Robinson - the fork() has failed\n");
  •   return 0;
  • }

22
Basic Process Coordination
  • The exit() call is used to terminate a process.
  • Its prototype is void exit(int status), where
    status is used as the return value of the
    process.
  • exit(i) can be used to announce success or
    failure to the parent process.
  • The wait() call is used to temporarily suspend
    the parent process until one of the child
    processes terminates.
  • The prototype is pid_t wait(int *status), where
    status is a pointer to an integer to which the
    child's status information is assigned.
  • wait() will return with a pid when any one of the
    children terminates or with -1 when no children
    exist.

23
more coordination
  • To wait for a particular child process to
    terminate, we can use the waitpid() call.
  • Prototype: pid_t waitpid(pid_t pid, int *status,
    int opt)
  • Sometimes we want to get information about the
    process or its parent.
  • getpid() returns the process id
  • getppid() returns the parent's process id
  • getuid() returns the user's id
  • use the manual pages for more id information.

24
Orphans and Zombies or MIAs
  • A child process whose parent has terminated is
    referred to as an orphan.
  • When a child exits while its parent is not
    currently executing a wait(), a zombie emerges.
  • A zombie is not really a process as it has
    terminated but the system retains an entry in the
    process table for the non-existing child process.
  • A zombie is put to rest when the parent finally
    executes a wait().
  • When a parent terminates, its orphans and zombies
    are adopted by the init process (process id 1) of
    the system.

25
Inter-Process Communication
  • In addition to synchronizing different processes,
    we may want to be able to communicate data
    between them.
  • Note that we are dealing with processes on the
    same machine. Hence, we can use shared memory
    segments to send messages between processes.
  • One way to establish a communication
    channel between processes with a parent-child
    relationship is through the concept of pipes.
  • We can use the pipe() system call to create a
    pipe.

26
UNIX Pipes
  • At the UNIX command level, we can use pipes to
    channel the output of one command into another
  • ls | wc
  • At the process level we use the pipe() system
    call.
  • prototype: int pipe(int filedes[2])
  • filedes[0] will be a file descriptor open for
    reading
  • filedes[1] will be a file descriptor open for
    writing
  • the return value of pipe() is -1 if it could not
    successfully open the file descriptors.
  • But how does this help to communicate between
    processes?

27
example
  • #include ....
  • int main(void)
  • {
  •   int p[2], pid;
  •   char buf[64];
  •   if (pipe(p) == -1) {
  •     perror("pipe call");
  •     exit(1);
  •   }
  •   /* at this point we have a pipe p with p[0]
    opened for reading and p[1] opened for writing -
    just like a file */
  •   write(p[1], "hi there", 9);
  •   read(p[0], buf, 9);
  •   printf("%s\n", buf);
  • }

28
A pipe to itself ?

[Figure: a single Process connected to both ends of the pipe via write() and read()]

29
Basic Inter-Process Communication
  • by
  • Armin R. Mikler

30
Overview
  • What is IPC ?
  • How can we achieve IPC?
  • The pipe at the shell level!
  • The pipe between processes!
  • The pipe() system call!
  • closing the pipe!
  • Programming with pipes.
  • size of a pipe
  • Non-blocking read() and write()
  • The select() system call
  • FIFOs - named Pipes
  • FIFOs vs. regular pipes
  • Steps for using a FIFO
  • mkfifo to make a FIFO
  • open the FIFO
  • Other IPC concepts
  • signals
  • shared memory
  • semaphores
  • sockets

31
What is IPC
  • Inter-Process Communication allows different
    processes to exchange information and synchronize
    their actions.
  • Why do processes have to synchronize their
    actions?
  • We need to distinguish how processes may be
    related
  • Parent / Child relationship i.e., the child
    process was created by the parent
  • Processes that are not related yet execute on the
    same host
  • Processes that are not related and execute on
    different hosts
  • Why do we have to make these distinctions?

32
some similarities
  • consider a program that consists of multiple
    functions.
  • how can we exchange information between the
    main() function and any of the other functions
    func()?
  • how do we produce side effects in func() that are
    visible in main()?
  • what do we need to do to guarantee that func()
    accesses the same variables as main()?
  • The trick is to either allow different functions
    to work with identical memory locations or to
    create a communication channel in the form of
    parameter lists or return values.

33
IPC between user processes on the same system
[Figure: two User Processes above the OS kernel, communicating through shared resources]

34
IPC between processes on different systems
[Figure labels: User Process, OS kernel, Network]

35
How do we achieve IPC
  • Processes need to use some facility that they
    have in common.
  • Both processes must speak the same
    IPC-language.
  • What facilities can two or more processes share
    when they reside on the same host?
  • Memory
  • File System Space
  • Communication Facilities
  • Common communication protocol provided by the OS
    (signals)

36
Inter-Process Communication using PIPES
  • In addition to synchronizing different processes,
    we may want to be able to communicate data
    between them.
  • For the time being, we are dealing with processes
    on the same machine. Hence, we can use shared memory
    segments to send messages between processes.
  • A pipe is a one-way communication channel which
    can be used to connect two related processes

37
Pipes contd
  • Unix provides a construct called pipe, a
    communication channel through which two processes
    can exchange information.
  • One way to establish a communication
    channel between processes with a parent-child
    relationship is through the concept of pipes.
  • Why do the processes need to be related?

38
UNIX Pipes contd
  • At the UNIX command level, we can use pipes to
    channel the output of one command into another
  • ls | wc
  • the shell actually creates child processes and
    uses exec() to execute the corresponding programs
    (i.e., ls and wc)
  • How does the shell implement the pipe-command
    i.e., ls | wc ??
  • How would you implement the ability to pipe??
    Discuss....

39
the pipe() system call
  • At the process level we use the pipe() system
    call.
  • prototype: int pipe(int filedes[2])
  • filedes[0] will be a file descriptor open for
    reading
  • filedes[1] will be a file descriptor open for
    writing
  • the return value of pipe() is -1 if it could not
    successfully open the file descriptors.
  • But how does this help to communicate between
    processes??

42
A channel between two processes
  • Remember parent/child relationship!
  • What does that mean?
  • the child was created by a fork() call that was
    executed by the parent.
  • the child process is an image of the parent
    process --> all the file descriptors that are
    opened by the parent are now available in the
    child.
  • The file descriptors refer to the same I/O
    entity, in this case a pipe.
  • The pipe is inherited by the child and may be
    passed on to the grand-children by the child
    process or other children by the parent.
  • This can easily lead to a chaotic conglomeration
    of pipes throughout our system of processes

43
The open pipe problem
[Figure: Parent Process and Child Process, each holding both the write() and the read() end of the pipe]

44
The fix
[Figure: the same diagram after the fix; surviving labels: Child Process, write(), read()]

45
closing the pipe
  • The file descriptors associated with a pipe can
    be closed with the close(fd) system call
  • Some Rules
  • A read() on a pipe will generally block until
    either data appears or all processes have closed
    the write file descriptor of the pipe!
  • Closing the write fd while other processes are
    writing to the pipe does not have any effect!
  • Closing the read fd while others are still
    reading will not have any effect!
  • Once all read fds are closed while others are
    still writing, a write to the pipe will return an
    error and the kernel sends the writer a SIGPIPE
    signal (Broken Pipe!!)
46
The size of a pipe
  • In most cases, we only transfer small amounts of
    data through a pipe - but for some
    applications we may want to send and receive
    large data blocks.
  • A valid question is: How much data will fit into
    a pipe ??
  • Why do we care? Remember - a write() will block
    until the requested number of bytes have been
    written.
  • The POSIX standard specifies a minimum of
    512 bytes (PIPE_BUF) that a single write() to a
    pipe transfers atomically!

47
read() and write()
  • Both read() and write() can block when
    used on a pipe (and other I/O streams)!
  • Not only may this be undesirable, it may also
    lead to deadlock!!
  • There are ways of avoiding the blocking on a
    particular fd.
  • use the fstat() system call
  • use the fcntl() system call
  • use the select() system call
  • These calls are rather complex as they combine a
    great deal of functionality and control a number
    of file parameters.

48
The fstat() system call
  • The prototype for the fstat() system call is
  • int fstat(int filedes, struct stat *buf)
  • to use fstat() you must include <sys/stat.h>,
    which defines the stat-structure.
  • fstat() can only be used with an open file!
    WHY??
  • When executed on a file descriptor, fstat() fills
    in the stat structure pointed to by the buf
    argument.
  • Among other things, fstat() fills in st_size
    information, which indicates the size of the file
    that filedes represents.
  • On systems that report st_size for pipes, we can
    use it to determine if data is available in the
    pipe for reading --> hence, implement
    non-blocking I/O. (POSIX leaves st_size
    unspecified for pipes, so this is not portable.)

49
The fcntl() system call
  • The fcntl() system call provides some control
    over already open files. fcntl() can be used to
    execute a function on a file descriptor.
  • The prototype is int fcntl(int fd, int cmd, ...)
    where
  • fd is the corresponding file descriptor
  • cmd is a pre-defined command (integer const)
  • ... are additional parameters that depend on
    what cmd is.
  • Two important commands are F_GETFL and F_SETFL
  • F_GETFL is used to instruct fcntl() to return the
    current status flags
  • F_SETFL instructs fcntl() to set the file
    status flags.

50
Using fcntl() to change to non-blocking I/O
  • We can use the fcntl() system call to change the
    blocking behavior of the read() and write()
  • Example
  • #include <fcntl.h>
  • ..
  • if (fcntl(filedes, F_SETFL, O_NONBLOCK) == -1)
  •   perror("fcntl");

51
The select() call
  • Suppose we are dealing with a server process that
    is supporting multiple clients concurrently. Each
    of the client processes communicates with the
    server via a pipe.
  • Further let us assume that the clients work
    completely asynchronously, that is, they issue
    requests to the server in any order.
  • How would you write a server that can handle this
    type of scenario? DISCUSS!!

52
select() contd
  • What exactly is the problem?
  • If we are using the standard read() call, it will
    block until data is available in the pipe.
  • If we start polling each of the pipes in
    sequence, the server may get stuck on the first
    pipe (first client), waiting for data.
  • Other clients may, however, have issued a request
    that could be processed instead.
  • The server should be able to examine each file
    descriptor associated with each pipe to determine
    if data is available.

53
Using the select() call
  • The prototype of select() is
  • int select(int nfds, fd_set *readset, fd_set
    *writeset, fd_set *errorset, struct timeval
    *timeout)
  • nfds tells select() how many file descriptors to
    examine: descriptors 0 through nfds-1, so it is
    the highest descriptor of interest plus one
  • readset, writeset, and errorset are bit maps
    (binary words) in which each bit represents a
    particular file descriptor.
  • timeout tells select() whether to block and wait
    and if waiting is required timeout explicitly
    specifies how long

54
fd_set and associated functions
  • Dealing with bit masks in C, C++, and UNIX makes
    programs less portable.
  • In addition, it is difficult to deal with
    individual bits.
  • Hence, the abstraction fd_set is available along
    with macros (functions on bit masks).
  • Available macros are
  • void FD_ZERO(fd_set *fdset) resets the bits in
    fdset to 0
  • void FD_SET(int fd, fd_set *fdset) sets the bit
    representing fd to 1
  • int FD_ISSET(int fd, fd_set *fdset) returns
    non-zero if the fd bit is set
  • void FD_CLR(int fd, fd_set *fdset) turns off the
    bit fd in fdset

55
a short example
  • #include .....
  • int fd1, fd2;
  • fd_set readset;
  • fd1 = open("file1", O_RDONLY);
  • fd2 = open("file2", O_RDONLY);
  • FD_ZERO(&readset);
  • FD_SET(fd1, &readset);
  • FD_SET(fd2, &readset);
  • select(5, &readset, NULL, NULL, NULL);
  • ..........

56
more select
  • The select() system call can be used on any file
    descriptor and is particularly important for
    network programming with sockets.
  • One important note: when select() returns, it
    modifies the bit mask according to the state of
    the file descriptors.
  • You should save a copy of your original bit mask
    if you execute the select() multiple times.