Introduction to UNIX System Programming

Slides: 57

Provided by: Sus7155

1
Introduction to UNIX System Programming

By
Armin R. Mikler

2
Overview

Basic UNIX Commands
Files
Buffered vs. non-buffered I/O
Basic System Calls
Processes
What's a process anyway?
The fork() System Call
Coordinating Processes (wait, exit, etc)
Inter-Process Communication
Pipes

3
Basic UNIX Commands

Login
username
password
=> User Shell.
The User Shell is
The Command Interpreter
A running program
UNIX Commands are (often small) programs.
What else does the Shell do?

Basic Commands
who am i
pwd
who
what
ps ()
finger
ls
mkdir
rm (-i -f -r)
touch
cat (note there is no dog)
grep

4
more basic UNIX

Editors
emacs
vi
joe
sed
and others
Compilers/Interpreters
gcc
g++
perl
java
etc.

The UNIX Manual
Use the manual pages to get information about a
specific command or system call.
The UNIX manual is divided into sections.
Careful!! The same name (e.g. write) can (and
does) appear in different sections with a
different context.
Use man -s i subject to refer to section i
(on many systems, man i subject).

5
man pages

man -k keyword(s)
prints the header line of manual pages that
contain the keyword(s)
apropos keyword(s)
same as man -k
Questions
Which manual section contains UNIX user commands?
Which manual section contains UNIX system calls?
What is the difference between commands and
system calls?
TRY xman, the manual pages for X11.

6
Files

UNIX Input/Output operations are based on the
concept of files.
Files are an abstraction of specific I/O devices.
A very small set of system calls provide the
primitives that give direct access to I/O
facilities of the UNIX kernel.
Most I/O operations rely on the use of these
primitives.
We must remember that the basic I/O primitives
are system calls, executed by the kernel. What
does that mean to us as programmers???

7
UNIX I/O Primitives

open    Opens a file for reading or writing, or
        creates an empty file
creat   Creates an empty file
close   Closes a previously opened file
read    Extracts information from a file
write   Places information into a file
lseek   Moves to a specific byte in the file
unlink  Removes a file
remove  Removes a file

8
A rudimentary example

#include <fcntl.h>   /* controls file attributes */
#include <unistd.h>  /* defines symbolic constants */

int main(void)
{
    int fd;          /* a file descriptor */
    ssize_t nread;   /* number of bytes read */
    char buf[1024];  /* data buffer */

    /* open the file "data" for reading */
    fd = open("data", O_RDONLY);
    if (fd == -1)
        return 1;    /* the file could not be opened */
    /* read in the data */
    nread = read(fd, buf, 1024);
    /* close the file */
    close(fd);
    return (nread == -1) ? 1 : 0;
}

9
Buffered vs unbuffered I/O

The system can execute in user mode or kernel
mode!
Memory is divided into user space and kernel
space!
What happens when we write to a file?
the write call forces a context switch to the
system. What??
the system copies the specified number of bytes
from user space into kernel space. (into mbufs)
the system wakes up the device driver to write
these mbufs to the physical device (if the
file-system is in synchronous mode).
the system selects a new process to run.
finally, control is returned to the process that
executed the write call.
Discuss the effects on the performance of your
program!

10
Un-buffered I/O

Every read and write is executed by the kernel.
Hence, every read and write will cause a context
switch in order for the system routines to
execute.
Why do we suffer performance loss?
How can we reduce the loss of performance?
=> We could try to move as much data as possible
with each system call.
How can we measure the performance?

11
Buffered I/O

Explicit versus implicit buffering
Explicit - collect as many bytes as you can
before writing to the file, and read more than a
single byte at a time.
However, use the basic UNIX I/O primitives.
Careful!! Your program may behave differently on
different systems.
Here, the programmer is explicitly controlling
the buffer size.
Implicit - use the stream facility provided by
<stdio.h>
FILE *fd, fopen, fprintf, fflush, fclose, etc.
A FILE structure contains a buffer (in user
space) that is usually the size of the disk
blocking factor (512 or 1024 bytes).

12
File Locking

Consider the following problem
Processes can obtain a unique integer by reading
from a file. The file contains a single integer
(at all times), which must be incremented by the
process that executes a read. Since multiple
processes can compete for the file (a unique
integer), we must make sure that the file access
is synchronized.
HOW??
What happens if we use buffered I/O ?

13
lockf()

lockf() is a C library function for locking
records of a file. Its prototype is
int lockf(int fd, int func, off_t size);
The func parameter is one of
F_ULOCK 0 (unlock a locked section)
F_LOCK 1 (lock a section)
F_TLOCK 2 (test and lock a section)
F_TEST 3 (test a section for locks)
See the UNIX manual pages!!

If we rewind the file before locking AND use a
size of 0L as the size parameter, the entire file
is locked.
lseek(fd, 0L, SEEK_SET) can be used to rewind the
file (fd) to the beginning.

15
flock()

flock() is a UNIX system call to apply or remove
an advisory lock on an open file.
The locking is only on an advisory basis (not
absolute): it only affects processes that also
use flock() on the file.
Prototype: int flock(int fd, int operation);
See the manual pages.

16
UNIX Processes

A program that has started is manifested in the
context of a process.
A process in the system is represented by:
Process Identification Elements
Process State Information
Process Control Information
User Stack
Private User Address Space, Programs and Data
Shared Address Space

17
Process Control Block

Process Information, Process State Information,
and Process Control Information constitute the
PCB.
All Process State Information is stored in the
Process Status Word (PSW).
All information needed by the OS to manage the
process is contained in the PCB.
A UNIX process can be in a variety of states

18
States of a UNIX Process

User running: process executes in user mode
Kernel running: process executes in kernel mode
Ready to run, in memory: process is waiting to be
scheduled
Asleep in memory: process is waiting for an event
Ready to run, swapped: process is ready to run but
requires swapping in
Preempted: process is returning from kernel to
user mode, but the system has scheduled another
process instead
Created: process is newly created and not ready
to run
Zombie: process no longer exists, but it leaves a
record for its parent process to collect.
See Process State Diagram!!

19
Creating a new process

In UNIX, a new process is created by means of the
fork() - system call. The OS performs the
following functions
It allocates a slot in the process table for the
new process
It assigns a unique ID to the new process
It makes a copy of process image of the parent
(except shared memory)
It assigns the child process to the Ready to Run
State
It returns the ID of the child to the parent
process, and 0 to the child.
Note, the fork() call actually is called once but
returns twice - namely in the parent and the
child process.

20
Fork()

pid_t fork(void) is the prototype of the fork()
call.
Remember that fork() returns twice
in the newly created (child) process with return
value 0
in the calling process (parent) with return value
pid of the new process.
A negative return value (-1) indicates that the
call has failed
Different return values are the key for
distinguishing parent process from child process!
The child process is an exact copy of the parent,
yet, it is a copy i.e. an identical but separate
process image.

21
A fork() Example

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    pid_t pid;  /* process id */

    printf("just one process before the fork()\n");
    pid = fork();
    if (pid == 0)
        printf("I am the child process\n");
    else if (pid > 0)
        printf("I am the parent process\n");
    else
        printf("DANGER Mr. Robinson - the fork() has failed\n");
    return 0;
}

22
Basic Process Coordination

The exit() call is used to terminate a process.
Its prototype is void exit(int status), where
status is used as the return value of the
process.
exit(i) can be used to announce success or
failure to the parent process.
The wait() call is used to temporarily suspend
the parent process until one of its child
processes terminates.
The prototype is pid_t wait(int *status), where
status is a pointer to an integer to which the
child's status information is assigned.
wait() returns the pid of the child when any one
of the children terminates, or -1 when no
children exist.

23
more coordination

To wait for a particular child process to
terminate, we can use the waitpid() call.
Prototype: pid_t waitpid(pid_t pid, int *status,
int opt)
Sometimes we want to get information about the
process or its parent.
getpid() returns the process id
getppid() returns the parent's process id
getuid() returns the user's id
Use the manual pages for more id information.

24
Orphans and Zombies or MIAs

A child process whose parent has terminated is
referred to as an orphan.
When a child exits while its parent is not
currently executing a wait(), a zombie emerges.
A zombie is not really a process: it has
terminated, but the system retains an entry in the
process table for the no-longer-existing child.
A zombie is put to rest when the parent finally
executes a wait().
When a parent terminates, its orphans and zombies
are adopted by the init process (process id 1) of
the system.

25
Inter-Process Communication

In addition to synchronizing different processes,
we may want to be able to communicate data
between them.
Note that we are dealing with processes on the
same machine. Hence, we can use shared memory
segments to send messages between processes.
One of the ways to establish a communication
channel between processes with a parent-child
relationship is through the concept of pipes.
We can use the pipe() system call to create a
pipe.

26
UNIX Pipes

At the UNIX command level, we can use pipes to
channel the output of one command into another:
ls | wc
At the process level we use the pipe() system
call.
Prototype: int pipe(int filedes[2])
filedes[0] will be a file descriptor open for
reading
filedes[1] will be a file descriptor open for
writing
The return value of pipe() is -1 if it could not
successfully open the file descriptors.
But how does this help to communicate between
processes?

27
example

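#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    int p[2], pid;
    char buf[64];

    if (pipe(p) == -1) {
        perror("pipe call");
        exit(1);
    }
    /* at this point we have a pipe p with p[0]
       opened for reading and p[1] opened for writing -
       just like a file */

    write(p[1], "hi there", 9);   /* 8 characters plus '\0' */
    read(p[0], buf, 9);
    printf("%s\n", buf);
    return 0;
}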

28
A pipe to itself ?

[Figure: a single Process using write() and read() on its own pipe]
29
Basic Inter-Process Communication

by
Armin R. Mikler

30
Overview

What is IPC ?
How can we achieve IPC?
The pipe at the shell level!
The pipe between processes!
The pipe() system call!
closing the pipe!
Programming with pipes.
size of a pipe
Non-blocking read() and write()
The select() system call

FIFOs - named Pipes
FIFOs vs. regular pipes
Steps for using a FIFO
mkfifo to make a FIFO
open the FIFO
Other IPC concepts
signals
shared memory
semaphores
sockets

31
What is IPC

Inter-Process Communication allows different
processes to exchange information and synchronize
their actions.
Why do processes have to synchronize their
actions?
We need to distinguish how processes may be
related
Parent / Child relationship i.e., the child
process was created by the parent
Processes that are not related yet execute on the
same host
Processes that are not related and execute on
different hosts
Why do we have to make these distinctions?

32
some similarities

consider a program that consists of multiple
functions.
how can we exchange information between the
main() function and any of the other functions
func()?
how do we produce side effects in func() that are
visible in main()?
what do we need to do to guarantee that func()
accesses the same variables as main()?
The trick is to either allow different functions
to work with identical memory locations or to
create a communication channel in the form of
parameter lists or return values.

33
IPC between user processes on the same system
[Figure: two User Processes, the OS kernel, and shared resources]
34
IPC between processes on different systems
[Figure: User Process, OS kernel, and Network]
35
How do we achieve IPC

Processes need to use some facility that they
have in common.
Both processes must speak the same IPC language.
What facilities can two or more processes share
when they reside on the same host?
Memory
File System Space
Communication Facilities
Common communication protocol provided by the OS
(signals)

36
Inter-Process Communication using PIPES

In addition to synchronizing different processes,
we may want to be able to communicate data
between them.
For the time being, we are dealing with processes
on the same machine. Hence, we can use shared memory
segments to send messages between processes.
A pipe is a one-way communication channel which
can be used to connect two related processes

37
Pipes contd

Unix provides a construct called pipe, a
communication channel through which two processes
can exchange information.
One of the ways to establish a communication
channel between processes with a parent-child
relationship is through the concept of pipes.
Why do the processes need to be related?

38
UNIX Pipes contd

At the UNIX command level, we can use pipes to
channel the output of one command into another:
ls | wc
For each command, the shell creates a child
process and uses exec() to execute the
corresponding program (i.e., ls and wc).
How does the shell implement the pipe command,
i.e., ls | wc ??
How would you implement the ability to pipe??
Discuss....

42
A channel between two processes

Remember parent/child relationship!
What does that mean?
the child was created by a fork() call that was
executed by the parent.
the child process is an image of the parent
process ---> all the file descriptors that are
opened by the parent are now available in the
child.
The file descriptors refer to the same I/O
entity, in this case a pipe.
The pipe is inherited by the child and may be
passed on to the grand-children by the child
process or to other children by the parent.
This can easily lead to a chaotic conglomeration
of pipes throughout our system of processes.

43
The open pipe problem
[Figure: Parent Process and Child Process, each with write() and read() on the pipe]
44
The fix
[Figure: Child Process; write() and read() on the pipe]
45
closing the pipe

The file descriptors associated with a pipe can
be closed with the close(fd) system call.
Some Rules
A read() on a pipe will generally block until
either data appears or all processes have closed
the write file descriptor of the pipe! In the
latter case read() returns 0 (end-of-file).
Closing the write fd while other processes are
still writing to the pipe does not have any effect!
Closing the read fd while others are still
reading will not have any effect!
Closing the last read fd while others are still
writing will cause write() to fail with EPIPE,
and the kernel sends the writer a SIGPIPE signal
(Broken Pipe!!)

46
The size of a pipe

In most cases, we only transfer small amounts of
data through a pipe - but for some applications
we may want to send and receive large data blocks.
A valid question is: how much data will fit into
a pipe??
Why do we care? Remember - a write() will block
until the requested number of bytes has been
written.
The POSIX standard only guarantees a minimum of
512 bytes (PIPE_BUF, the largest write that is
guaranteed to be atomic)!

47
read() and write()

Both read() and write() can block when
used on a pipe (and other I/O streams)!
Not only may this be undesirable, it may also
lead to deadlock!!
There are ways of avoiding the blocking on a
particular fd.
use the fstat() system call
use the fcntl() system call
use the select() system call
These calls are rather complex as they combine a
great deal of functionality and control a number
of file parameters.

48
The fstat() system call

The prototype for the fstat() system call is
int fstat(int filedes, struct stat *buf)
To use fstat() you must include <sys/stat.h>,
which defines the stat structure.
fstat() can only be used with an open file!
WHY??
When executed on a file descriptor, fstat() fills
in the stat structure pointed to by the buf
argument.
Among other things, fstat() fills in st_size,
which indicates the size of the file that filedes
represents.
On systems that report the number of unread bytes
in st_size for a pipe, we can use st_size to
determine if data is available for reading ----
hence, implement non-blocking I/O. Careful!!
POSIX does not specify st_size for pipes, so this
is not portable.

49
The fcntl() system call

The fcntl() system call provides some control
over already open files. fcntl() can be used to
execute a function on a file descriptor.
The prototype is int fcntl(int fd, int cmd, ...)
where
fd is the corresponding file descriptor
cmd is a pre-defined command (integer constant)
... are additional parameters that depend on
what cmd is.
Two important commands are F_GETFL and F_SETFL
F_GETFL is used to instruct fcntl() to return the
current file status flags
F_SETFL instructs fcntl() to set the file
status flags.

50
Using fcntl() to change to non-blocking I/O

We can use the fcntl() system call to change the
blocking behavior of the read() and write()
Example
#include <fcntl.h>
...
if (fcntl(filedes, F_SETFL, O_NONBLOCK) == -1)
    perror("fcntl");

51
The select() call

Suppose we are dealing with a server process that
is supporting multiple clients concurrently. Each
of the client processes communicates with the
server via a pipe.
Further let us assume that the clients work
completely asynchronously, that is, they issue
requests to the server in any order.
How would you write a server that can handle this
type of scenario? DISCUSS!!

52
select() contd

What exactly is the problem?
If we are using the standard read() call, it will
block until data is available in the pipe.
If we start polling each of the pipes in
sequence, the server may get stuck on the first
pipe (first client), waiting for data.
Other clients may, however, have issued a request
that could be processed instead.
The server should be able to examine each file
descriptor associated with each pipe to determine
if data is available.

53
Using the select() call

The prototype of select() is
int select(int nfds, fd_set *readset, fd_set
*writeset, fd_set *errorset, struct timeval *timeout)
nfds tells select() how many file descriptors to
examine: descriptors 0 through nfds-1, so nfds
is the highest descriptor of interest plus one
readset, writeset, and errorset are bit maps
(binary words) in which each bit represents a
particular file descriptor.
timeout tells select() whether to block and wait
and, if waiting is required, timeout explicitly
specifies how long

54
fd_set and associated functions

Dealing with bit masks in C, C++, and UNIX makes
programs less portable.
In addition, it is difficult to deal with
individual bits.
Hence, the abstraction fd_set is available along
with macros (functions on bit masks).
Available macros are
void FD_ZERO(fd_set *fdset) resets the bits in
fdset to 0
void FD_SET(int fd, fd_set *fdset) sets the bit
representing fd to 1
int FD_ISSET(int fd, fd_set *fdset) returns nonzero
if the fd bit is set
void FD_CLR(int fd, fd_set *fdset) turns off the
bit fd in fdset

55
a short example

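#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

int main(void)
{
    int fd1, fd2;
    fd_set readset;

    fd1 = open("file1", O_RDONLY);
    fd2 = open("file2", O_RDONLY);
    FD_ZERO(&readset);
    FD_SET(fd1, &readset);
    FD_SET(fd2, &readset);
    /* 5 assumes fd1 and fd2 are descriptors 3 and 4;
       in general use the highest descriptor plus one */
    select(5, &readset, NULL, NULL, NULL);
    /* ... */
    return 0;
}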

56
more select

The select() system call can be used on any file
descriptor and is particularly important for
network programming with sockets.
One important note: when select() returns, it has
modified the bit mask according to the state of
the file descriptors.
You should save a copy of your original bit mask
if you execute the select() multiple times.
