Title: Homework 1 review
1Lecture 2
- Homework 1 review
- Review of Caching (chapter 6) for Lab 1
- Linking (chapter 7)
- All source code is posted at http://reed.cs.depaul.edu/lperkovic/csc374/lectures/lecture2/
2Review of caching for lab 1
- The following slides are a review of caching.
- You will need these ideas for lab 1.
3Intel Pentium Cache Hierarchy
Processor Chip
- Regs.
- L1 Data: 1 cycle latency, 16 KB, 4-way assoc, write-through, 32B lines
- L1 Instruction: 16 KB, 4-way, 32B lines
- L2 Unified: 128 KB--4 MB, 4-way assoc, write-back, write allocate, 32B lines
- Main Memory: up to 4 GB
4Cache Performance Metrics
- Miss Rate
- Fraction of memory references not found in cache (misses/references)
- Typical numbers
- 3-10% for L1
- can be quite small (e.g., < 1%) for L2, depending on size, etc.
- Hit Time
- Time to deliver a line in the cache to the processor (includes time to determine whether the line is in the cache)
- Typical numbers
- 1 clock cycle for L1
- 3-8 clock cycles for L2
- Miss Penalty
- Additional time required because of a miss
- Typically 25-100 cycles for main memory
5Writing Cache Friendly Code
- Repeated references to variables are good (temporal locality)
- Stride-1 reference patterns are good (spatial locality)
- Examples
- cold cache, 4-byte words, 4-word cache blocks

int sumarrayrows(int a[M][N])
{
    int i, j, sum = 0;
    for (i = 0; i < M; i++)
        for (j = 0; j < N; j++)
            sum += a[i][j];
    return sum;
}
Miss rate = 1/4 = 25%

int sumarraycols(int a[M][N])
{
    int i, j, sum = 0;
    for (j = 0; j < N; j++)
        for (i = 0; i < M; i++)
            sum += a[i][j];
    return sum;
}
Miss rate = 100%
6Matrix Multiplication Example
- Major Cache Effects to Consider
- Total cache size
- Exploit temporal locality and keep the working set small (e.g., by using blocking)
- Block size
- Exploit spatial locality
- Description
- Multiply N x N matrices
- O(N^3) total operations
- Accesses
- N reads per source element
- N values summed per destination
- but may be able to hold in register

/* ijk */
for (i=0; i<n; i++)
  for (j=0; j<n; j++) {
    sum = 0.0;
    for (k=0; k<n; k++)
      sum += a[i][k] * b[k][j];
    c[i][j] = sum;
  }
Variable sum held in register
7Miss Rate Analysis for Matrix Multiply
- Assume
- Line size 32B (big enough for 4 64-bit words)
- Matrix dimension (N) is very large
- Approximate 1/N as 0.0
- Cache is not even big enough to hold multiple rows
- Analysis Method
- Look at access pattern of inner loop
8Layout of C Arrays in Memory (review)
- C arrays allocated in row-major order
- each row in contiguous memory locations
- Stepping through columns in one row
- for (i = 0; i < N; i++)
- sum += a[0][i];
- accesses successive elements
- if block size (B) > 4 bytes, exploit spatial locality
- compulsory miss rate = 4 bytes / B
- Stepping through rows in one column
- for (i = 0; i < n; i++)
- sum += a[i][0];
- accesses distant elements
- no spatial locality!
- compulsory miss rate = 1 (i.e. 100%)
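A rough way to check the two miss rates above is to replay each access pattern against a tiny simulated cache. The sketch below assumes the 4-byte words and 4-word (16-byte) blocks used on the earlier slide. The direct-mapped organization, the 8-line cache size, and the name count_misses are assumptions made for illustration; they are not part of the lecture code.

```c
#include <assert.h>

#define WORD_BYTES   4   /* 4-byte words, as on the slide */
#define BLOCK_BYTES 16   /* 4-word cache blocks, as on the slide */
#define NUM_LINES    8   /* assumed: a tiny direct-mapped cache */

/* Count cache misses when summing an M x N int array stored in row-major
   order. by_rows != 0 walks a[i][j] with j innermost (stride 1);
   by_rows == 0 walks it with i innermost (stride N). */
long count_misses(int M, int N, int by_rows)
{
    long tags[NUM_LINES];
    long misses = 0;

    assert(M > 0 && N > 0);
    for (int s = 0; s < NUM_LINES; s++)
        tags[s] = -1;                       /* cold cache: every line empty */

    int outer = by_rows ? M : N;
    int inner = by_rows ? N : M;
    for (int o = 0; o < outer; o++) {
        for (int n = 0; n < inner; n++) {
            int i = by_rows ? o : n;
            int j = by_rows ? n : o;
            long addr  = ((long)i * N + j) * WORD_BYTES;  /* row-major offset */
            long block = addr / BLOCK_BYTES;
            int  set   = (int)(block % NUM_LINES);
            if (tags[set] != block) {       /* block not resident: a miss */
                misses++;
                tags[set] = block;
            }
        }
    }
    return misses;
}
```

Under these assumptions, a 64 x 64 array gives 1024 misses out of 4096 accesses (25%) in row order and 4096 misses (100%) in column order.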
9Matrix Multiplication (ijk)
/* ijk */
for (i=0; i<n; i++)
  for (j=0; j<n; j++) {
    sum = 0.0;
    for (k=0; k<n; k++)
      sum += a[i][k] * b[k][j];
    c[i][j] = sum;
  }
Inner loop: A (i,*) row-wise; B (*,j) column-wise; C (i,j) fixed
Misses per Inner Loop Iteration: A = 0.25, B = 1.0, C = 0.0
10Matrix Multiplication (jik)
/* jik */
for (j=0; j<n; j++)
  for (i=0; i<n; i++) {
    sum = 0.0;
    for (k=0; k<n; k++)
      sum += a[i][k] * b[k][j];
    c[i][j] = sum;
  }
Inner loop: A (i,*) row-wise; B (*,j) column-wise; C (i,j) fixed
Misses per Inner Loop Iteration: A = 0.25, B = 1.0, C = 0.0
11Matrix Multiplication (kij)
/* kij */
for (k=0; k<n; k++)
  for (i=0; i<n; i++) {
    r = a[i][k];
    for (j=0; j<n; j++)
      c[i][j] += r * b[k][j];
  }
Inner loop: A (i,k) fixed; B (k,*) row-wise; C (i,*) row-wise
Misses per Inner Loop Iteration: A = 0.0, B = 0.25, C = 0.25
12Matrix Multiplication (ikj)
/* ikj */
for (i=0; i<n; i++)
  for (k=0; k<n; k++) {
    r = a[i][k];
    for (j=0; j<n; j++)
      c[i][j] += r * b[k][j];
  }
Inner loop: A (i,k) fixed; B (k,*) row-wise; C (i,*) row-wise
Misses per Inner Loop Iteration: A = 0.0, B = 0.25, C = 0.25
13Matrix Multiplication (jki)
/* jki */
for (j=0; j<n; j++)
  for (k=0; k<n; k++) {
    r = b[k][j];
    for (i=0; i<n; i++)
      c[i][j] += a[i][k] * r;
  }
Inner loop: A (*,k) column-wise; B (k,j) fixed; C (*,j) column-wise
Misses per Inner Loop Iteration: A = 1.0, B = 0.0, C = 1.0
14Matrix Multiplication (kji)
/* kji */
for (k=0; k<n; k++)
  for (j=0; j<n; j++) {
    r = b[k][j];
    for (i=0; i<n; i++)
      c[i][j] += a[i][k] * r;
  }
Inner loop: A (*,k) column-wise; B (k,j) fixed; C (*,j) column-wise
Misses per Inner Loop Iteration: A = 1.0, B = 0.0, C = 1.0
15Summary of Matrix Multiplication
- ijk (& jik)
- 2 loads, 0 stores
- misses/iter = 1.25

for (i=0; i<n; i++)
  for (j=0; j<n; j++) {
    sum = 0.0;
    for (k=0; k<n; k++)
      sum += a[i][k] * b[k][j];
    c[i][j] = sum;
  }

- kij (& ikj)
- 2 loads, 1 store
- misses/iter = 0.5

for (k=0; k<n; k++)
  for (i=0; i<n; i++) {
    r = a[i][k];
    for (j=0; j<n; j++)
      c[i][j] += r * b[k][j];
  }

- jki (& kji)
- 2 loads, 1 store
- misses/iter = 2.0

for (j=0; j<n; j++)
  for (k=0; k<n; k++) {
    r = b[k][j];
    for (i=0; i<n; i++)
      c[i][j] += a[i][k] * r;
  }
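The three loop orders differ only in how they walk memory; they compute the same product. A minimal sketch that makes this checkable is shown here. The fixed size N = 4 and the function names mm_ijk, mm_kij, and mm_jki are assumptions for illustration. The kij and jki versions clear c first because they accumulate into it, a step the slide fragments leave implicit.

```c
#define N 4   /* assumed: small fixed size so results are easy to compare */

/* ijk: sum lives in a register; one store per element of c. */
void mm_ijk(double a[N][N], double b[N][N], double c[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
}

/* kij: inner loop scans rows of b and c (stride 1). */
void mm_kij(double a[N][N], double b[N][N], double c[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            c[i][j] = 0.0;              /* c is accumulated into below */
    for (int k = 0; k < N; k++)
        for (int i = 0; i < N; i++) {
            double r = a[i][k];
            for (int j = 0; j < N; j++)
                c[i][j] += r * b[k][j];
        }
}

/* jki: inner loop scans columns of a and c (stride N). */
void mm_jki(double a[N][N], double b[N][N], double c[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            c[i][j] = 0.0;              /* c is accumulated into below */
    for (int j = 0; j < N; j++)
        for (int k = 0; k < N; k++) {
            double r = b[k][j];
            for (int i = 0; i < N; i++)
                c[i][j] += a[i][k] * r;
        }
}
```

Timing the three functions on large matrices is one way to observe the effect of the miss counts listed above, though results depend on the machine.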
16Pentium Matrix Multiply Performance
- Miss rates are helpful but not perfect predictors.
- Code scheduling matters, too.
17Improving Temporal Locality by Blocking
- Example: Blocked matrix multiplication
- block (in this context) does not mean cache block.
- Instead, it means a sub-block within the matrix.
- Example: N = 8; sub-block size = 4

[ A11 A12 ]     [ B11 B12 ]     [ C11 C12 ]
[ A21 A22 ]  X  [ B21 B22 ]  =  [ C21 C22 ]

Key idea: Sub-blocks (i.e., Axy) can be treated just like scalars.

C11 = A11B11 + A12B21    C12 = A11B12 + A12B22
C21 = A21B11 + A22B21    C22 = A21B12 + A22B22
18Blocked Matrix Multiply (bijk)
for (jj=0; jj<n; jj+=bsize) {
  for (i=0; i<n; i++)
    for (j=jj; j < min(jj+bsize,n); j++)
      c[i][j] = 0.0;
  for (kk=0; kk<n; kk+=bsize) {
    for (i=0; i<n; i++) {
      for (j=jj; j < min(jj+bsize,n); j++) {
        sum = 0.0;
        for (k=kk; k < min(kk+bsize,n); k++)
          sum += a[i][k] * b[k][j];
        c[i][j] += sum;
      }
    }
  }
}
19Blocked Matrix Multiply Analysis
- Innermost loop pair multiplies a 1 X bsize sliver of A by a bsize X bsize block of B and accumulates into 1 X bsize sliver of C
- Loop over i steps through n row slivers of A & C, using same B
Innermost Loop Pair (figure)
- A: row sliver accessed bsize times
- B: block reused n times in succession
- C: update successive elements of sliver
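To make the bijk loop nest concrete, here is a self-contained sketch that can be checked against the plain ijk version. The fixed dimension N = 8 follows the slide's example. The helper min_int, the function names, and passing bsize as a parameter are assumptions for illustration.

```c
#include <assert.h>

#define N 8   /* matrix dimension from the slide's example */

static int min_int(int x, int y) { return x < y ? x : y; }

/* Reference: unblocked ijk multiply, c = a * b. */
void mm_ijk(double a[N][N], double b[N][N], double c[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
}

/* Blocked multiply in bijk order: for each bsize x bsize block of b,
   sweep all row slivers of a and accumulate into row slivers of c. */
void mm_bijk(double a[N][N], double b[N][N], double c[N][N], int bsize)
{
    assert(bsize > 0);
    for (int jj = 0; jj < N; jj += bsize) {
        /* Clear the column strip of c that this jj pass accumulates into. */
        for (int i = 0; i < N; i++)
            for (int j = jj; j < min_int(jj + bsize, N); j++)
                c[i][j] = 0.0;
        for (int kk = 0; kk < N; kk += bsize)
            for (int i = 0; i < N; i++)
                for (int j = jj; j < min_int(jj + bsize, N); j++) {
                    double sum = 0.0;
                    for (int k = kk; k < min_int(kk + bsize, N); k++)
                        sum += a[i][k] * b[k][j];
                    c[i][j] += sum;
                }
    }
}
```

The min_int bounds let bsize be a value that does not divide N evenly, such as 3.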
20Pentium Blocked Matrix Multiply Performance
- Blocking (bijk and bikj) improves performance by a factor of two over unblocked versions (ijk and jik)
- relatively insensitive to array size.
21Concluding Observations
- Programmer can optimize for cache performance
- How data structures are organized
- How data are accessed
- Nested loop structure
- Blocking is a general technique
- All systems favor cache friendly code
- Getting absolute optimum performance is very platform specific
- Cache sizes, line sizes, associativities, etc.
- Can get most of the advantage with generic code
- Keep working set reasonably small (temporal locality)
- Use small strides (spatial locality)
22Linking
- Linking
- Static linking
- Object files
- Static libraries
- Loading
- Dynamic linking of shared libraries
23Linker Puzzles
(each row shows two modules that are linked together)
int x; p1() {}              p1() {}
int x; p1() {}              int x; p2() {}
int x; int y; p1() {}       double x; p2() {}
int x=7; int y=5; p1() {}   double x; p2() {}
int x=7; p1() {}            int x; p2() {}
24A Simplistic Program Translation Scheme
m.c (ASCII source file) -> Translator -> p (binary executable object file: memory image on disk)
- Problems
- Efficiency: small change requires complete recompilation
- Modularity: hard to share common functions (e.g. printf)
- Solution
- Static linker (or linker)
25A Better Scheme Using a Linker
m.c -> Translators -> m.o
a.c -> Translators -> a.o
m.o, a.o: separately compiled relocatable object files
m.o + a.o -> Linker (ld) -> p
p: executable object file (contains code and data for all functions defined in m.c and a.c)
26Translating the Example Program
- Compiler driver coordinates all steps in the translation and linking process.
- Typically included with each compilation system (e.g., gcc)
- Invokes preprocessor (cpp), compiler (cc1), assembler (as), and linker (ld).
- Passes command line arguments to appropriate phases
- Example: create executable p from m.c and a.c

gcc -O2 -v -o p m.c a.c
cpp [args] m.c /tmp/cca07630.i
cc1 /tmp/cca07630.i m.c -O2 [args] -o /tmp/cca07630.s
as [args] -o /tmp/cca076301.o /tmp/cca07630.s
<similar process for a.c>
ld -o p [system obj files] /tmp/cca076301.o /tmp/cca076302.o
27What Does a Linker Do?
- Merges object files
- Merges multiple relocatable (.o) object files into a single executable object file that can be loaded and executed by the loader.
- Resolves external references
- As part of the merging process, resolves external references.
- External reference: reference to a symbol defined in another object file.
- Relocates symbols
- Relocates symbols from their relative locations in the .o files to new absolute positions in the executable.
- Updates all references to these symbols to reflect their new positions.
- References can be in either code or data
- code: a(); /* reference to symbol a */
- data: int *xp=&x; /* reference to symbol x */
28Why Linkers?
- Modularity
- Program can be written as a collection of smaller source files, rather than one monolithic mass.
- Can build libraries of common functions
- e.g., Math library, standard C library
- Efficiency
- Time
- Change one source file, compile, and then relink.
- No need to recompile other source files.
- Space
- Libraries of common functions can be aggregated into a single file...
- Yet executable files and running memory images contain only code for the functions they actually use.
29Executable and Linkable Format (ELF)
- Standard binary format for object files
- Derives from AT&T System V Unix
- Later adopted by BSD Unix variants and Linux
- One unified format for
- Relocatable object files (.o),
- Executable object files
- Shared object files (.so)
- Generic name ELF binaries
30ELF Object File Format
- ELF header
- Type (.o, exec, .so), machine, byte ordering, etc.
- Program header table
- Page size, virtual addresses of memory segments (sections), segment sizes.
- .text section
- Code
- .data section
- Initialized (static) data
- .bss section
- Uninitialized (static) data
- "Block Started by Symbol"
- "Better Save Space"
- Has section header but occupies no space
0
ELF header
Program header table (required for executables)
.text section
.data section
.bss section
.symtab
.rel.text
.rel.data
.debug
Section header table (required for relocatables)
31ELF Object File Format (cont)
- .symtab section
- Symbol table
- Procedure and static variable names
- Section names and locations
- .rel.text section
- Relocation info for .text section
- Addresses of instructions that will need to be modified in the executable
- Instructions for modifying.
- .rel.data section
- Relocation info for .data section
- Addresses of pointer data that will need to be modified in the merged executable
- .debug section
- Info for symbolic debugging (gcc -g)
32Example C Program
m.c
int e=7;

int main() {
  int r = a();
  exit(0);
}

a.c
extern int e;

int *ep=&e;
int x=15;
int y;

int a() {
  return *ep+x+y;
}
33Merging Relocatable Object Files into an
Executable Object File
Relocatable Object Files
- system code (.text), system data (.data)
- m.o: main() (.text); int e = 7 (.data)
- a.o: a() (.text); int *ep = &e, int x = 15 (.data); int y (.bss)
Executable Object File (from offset 0)
- headers
- .text: system code, main(), a(), more system code
- .data: system data, int e = 7, int *ep = &e, int x = 15
- .bss: uninitialized data
- .symtab, .debug
34Relocating Symbols and Resolving External
References
- Symbols are lexical entities that name functions and variables.
- Each symbol has a value (typically a memory address).
- Code consists of symbol definitions and references.
- References can be either local or external.

m.c
int e=7;

int main() {
  int r = a();
  exit(0);
}

a.c
extern int e;

int *ep=&e;
int x=15;
int y;

int a() {
  return *ep+x+y;
}
35m.o Relocation Info
m.c
int e=7;

int main() {
  int r = a();
  exit(0);
}

Disassembly of section .text:

00000000 <main>:
   0:  55                pushl  %ebp
   1:  89 e5             movl   %esp,%ebp
   3:  e8 fc ff ff ff    call   4 <main+0x4>
                         4: R_386_PC32  a
   8:  6a 00             pushl  $0x0
   a:  e8 fc ff ff ff    call   b <main+0xb>
                         b: R_386_PC32  exit
   f:  90                nop

Disassembly of section .data:

00000000 <e>:
   0:  07 00 00 00

source: objdump
36a.o Relocation Info (.text)
a.c
extern int e;

int *ep=&e;
int x=15;
int y;

int a() {
  return *ep+x+y;
}

Disassembly of section .text:

00000000 <a>:
   0:  55                pushl  %ebp
   1:  8b 15 00 00 00    movl   0x0,%edx
   6:  00
                         3: R_386_32  ep
   7:  a1 00 00 00 00    movl   0x0,%eax
                         8: R_386_32  x
   c:  89 e5             movl   %esp,%ebp
   e:  03 02             addl   (%edx),%eax
  10:  89 ec             movl   %ebp,%esp
  12:  03 05 00 00 00    addl   0x0,%eax
  17:  00
                         14: R_386_32  y
  18:  5d                popl   %ebp
  19:  c3                ret
37a.o Relocation Info (.data)
a.c
extern int e;

int *ep=&e;
int x=15;
int y;

int a() {
  return *ep+x+y;
}

Disassembly of section .data:

00000000 <ep>:
   0:  00 00 00 00
                         0: R_386_32  e
00000004 <x>:
   4:  0f 00 00 00
38Executable After Relocation and External
Reference Resolution (.text)
08048530 <main>:
 8048530:  55                pushl  %ebp
 8048531:  89 e5             movl   %esp,%ebp
 8048533:  e8 08 00 00 00    call   8048540 <a>
 8048538:  6a 00             pushl  $0x0
 804853a:  e8 35 ff ff ff    call   8048474 <_init+0x94>
 804853f:  90                nop

08048540 <a>:
 8048540:  55                pushl  %ebp
 8048541:  8b 15 1c a0 04    movl   0x804a01c,%edx
 8048546:  08
 8048547:  a1 20 a0 04 08    movl   0x804a020,%eax
 804854c:  89 e5             movl   %esp,%ebp
 804854e:  03 02             addl   (%edx),%eax
 8048550:  89 ec             movl   %ebp,%esp
 8048552:  03 05 d0 a3 04    addl   0x804a3d0,%eax
 8048557:  08
 8048558:  5d                popl   %ebp
 8048559:  c3                ret
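The bytes the linker wrote into the two call instructions can be reproduced by hand. The sketch below assumes the usual i386 rules: an R_386_PC32 entry stores S + A - P and an R_386_32 entry stores S + A, where S is the symbol's final address, A is the addend already sitting in the placeholder (fc ff ff ff is -4), and P is the address of the field being patched. The function names are made up for illustration.

```c
#include <stdint.h>

/* R_386_PC32: PC-relative reference, value = S + A - P.
   Arithmetic is done modulo 2^32, as the 32-bit field requires. */
uint32_t reloc_pc32(uint32_t sym_addr, int32_t addend, uint32_t place)
{
    return sym_addr + (uint32_t)addend - place;
}

/* R_386_32: absolute reference, value = S + A. */
uint32_t reloc_abs32(uint32_t sym_addr, int32_t addend)
{
    return sym_addr + (uint32_t)addend;
}
```

For the call to a, the field sits at 0x8048534 (one byte past the e8 opcode at 0x8048533), so reloc_pc32(0x8048540, -4, 0x8048534) gives 0x8, matching e8 08 00 00 00. For the call to exit, reloc_pc32(0x8048474, -4, 0x804853b) gives 0xffffff35, matching e8 35 ff ff ff.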
39Executable After Relocation and External
Reference Resolution(.data)
m.c
int e=7;

int main() {
  int r = a();
  exit(0);
}

a.c
extern int e;

int *ep=&e;
int x=15;
int y;

int a() {
  return *ep+x+y;
}

Disassembly of section .data:

0804a018 <e>:
 804a018:  07 00 00 00

0804a01c <ep>:
 804a01c:  18 a0 04 08

0804a020 <x>:
 804a020:  0f 00 00 00
40Strong and Weak Symbols
- Program symbols are either strong or weak
- strong: procedures and initialized globals
- weak: uninitialized globals

p1.c
int foo=5;    (strong)
p1() {}       (strong)

p2.c
int foo;      (weak)
p2() {}       (strong)
41Linker's Symbol Rules
- Rule 1. A strong symbol can only appear once.
- Rule 2. A weak symbol can be overridden by a strong symbol of the same name.
- references to the weak symbol resolve to the strong symbol.
- Rule 3. If there are multiple weak symbols, the linker can pick an arbitrary one.
42Linker Puzzles
int x; p1() {}              p1() {}
Link time error: two strong symbols (p1)

int x; p1() {}              int x; p2() {}
References to x will refer to the same uninitialized int. Is this what you really want?

int x; int y; p1() {}       double x; p2() {}
Writes to x in p2 might overwrite y! Evil!

int x=7; int y=5; p1() {}   double x; p2() {}
Writes to x in p2 will overwrite y! Nasty!

int x=7; p1() {}            int x; p2() {}
By Rule 2, references to x will refer to the same initialized variable.
43Packaging Commonly Used Functions
- How to package functions commonly used by programmers?
- Math, I/O, memory management, string manipulation, etc.
- Awkward, given the linker framework so far
- Option 1: Put all functions in a single source file
- Programmers link big object file into their programs
- Space and time inefficient
- Option 2: Put each function in a separate source file
- Programmers explicitly link appropriate binaries into their programs
- More efficient, but burdensome on the programmer
- Solution: static libraries (.a archive files)
- Concatenate related relocatable object files into a single file with an index (called an archive).
- Enhance linker so that it tries to resolve unresolved external references by looking for the symbols in one or more archives.
- If an archive member file resolves reference, link into executable.
44Static Libraries (archives)
p1.c -> Translator -> p1.o
p2.c -> Translator -> p2.o
libc.a: static library (archive) of relocatable object files concatenated into one file.
p1.o + p2.o + libc.a -> Linker (ld) -> p
p: executable object file (only contains code and data for libc functions that are called from p1.c and p2.c)
- Further improves modularity and efficiency by packaging commonly used functions, e.g., C standard library (libc), math library (libm)
- Linker selects only the .o files in the archive that are actually needed by the program.
45Creating Static Libraries
atoi.c, printf.c, random.c, ... -> Translator (one per source file) -> atoi.o, printf.o, random.o
Archiver (ar): ar rs libc.a atoi.o printf.o random.o
libc.a: C standard library
- Archiver allows incremental updates
- Recompile function that changes and replace .o file in archive.
46Commonly Used Libraries
- libc.a (the C standard library)
- 8 MB archive of 900 object files.
- I/O, memory allocation, signal handling, string handling, date and time, random numbers, integer math
- libm.a (the C math library)
- 1 MB archive of 226 object files.
- floating point math (sin, cos, tan, log, exp, sqrt, ...)

ar -t /usr/lib/libc.a | sort
fork.o
fprintf.o
fpu_control.o
fputc.o
freopen.o
fscanf.o
fseek.o
fstab.o

ar -t /usr/lib/libm.a | sort
e_acos.o
e_acosf.o
e_acosh.o
e_acoshf.o
e_acoshl.o
e_acosl.o
e_asin.o
e_asinf.o
e_asinl.o
47Using Static Libraries
- Linker's algorithm for resolving external references
- Scan .o files and .a files in the command line order.
- During the scan, keep a list of the current unresolved references.
- As each new .o or .a file obj is encountered, try to resolve each unresolved reference in the list against the symbols in obj.
- If any entries remain in the unresolved list at end of scan, then error.
- Problem
- Command line order matters!
- Moral: put libraries at the end of the command line.

bass> gcc -L. libtest.o -lmine
bass> gcc -L. -lmine libtest.o
libtest.o: In function `main':
libtest.o(.text+0x4): undefined reference to `libfun'
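The effect of command line order can be seen in a toy model of the scan. This is only a sketch of the idea, not how ld is implemented: each input is described by the symbols it defines and references, an archive member is pulled in only if it defines something currently unresolved, and the struct layout, limits, and function names are all assumptions for illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMS     16   /* assumed limit on tracked symbols */
#define MAX_PER_FILE  4   /* assumed limit on symbols per input */

/* One input on the command line: an object file or an archive member. */
struct input {
    const char *name;
    int is_archive_member;                 /* 1 if it comes from a .a file */
    const char *defines[MAX_PER_FILE];     /* unused slots are NULL */
    const char *references[MAX_PER_FILE];  /* unused slots are NULL */
};

struct symset { const char *names[MAX_SYMS]; int count; };

static int set_find(const struct symset *s, const char *name)
{
    for (int i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0)
            return i;
    return -1;
}

static void set_add(struct symset *s, const char *name)
{
    if (set_find(s, name) < 0) {
        assert(s->count < MAX_SYMS);
        s->names[s->count++] = name;
    }
}

static void set_remove(struct symset *s, const char *name)
{
    int i = set_find(s, name);
    if (i >= 0)
        s->names[i] = s->names[--s->count];
}

/* Scan inputs left to right; return how many references stay unresolved. */
int count_unresolved(const struct input *files, int nfiles)
{
    struct symset unresolved = {{0}, 0}, defined = {{0}, 0};

    for (int f = 0; f < nfiles; f++) {
        const struct input *in = &files[f];
        if (in->is_archive_member) {
            int needed = 0;
            for (int i = 0; i < MAX_PER_FILE && in->defines[i]; i++)
                if (set_find(&unresolved, in->defines[i]) >= 0)
                    needed = 1;
            if (!needed)
                continue;   /* nothing wants it yet, so it is skipped for good */
        }
        for (int i = 0; i < MAX_PER_FILE && in->defines[i]; i++) {
            set_add(&defined, in->defines[i]);
            set_remove(&unresolved, in->defines[i]);
        }
        for (int i = 0; i < MAX_PER_FILE && in->references[i]; i++)
            if (set_find(&defined, in->references[i]) < 0)
                set_add(&unresolved, in->references[i]);
    }
    return unresolved.count;
}
```

With libtest.o (defines main, references libfun) listed before the libmine.a member that defines libfun, nothing is left unresolved. With the archive listed first, the member is skipped and libfun stays undefined, which mirrors the error shown above.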
48Loading Executable Binaries
Executable object file for example program p (from offset 0)
- ELF header
- Program header table (required for executables)
- .text section
- .data section
- .bss section
- .symtab
- .rel.text
- .rel.data
- .debug
- Section header table (required for relocatables)
Process image (virtual addr)
- 0x080483e0: init and shared lib segments
- 0x08048494: .text segment (r/o)
- 0x0804a010: .data segment (initialized r/w)
- 0x0804a3b0: .bss segment (uninitialized r/w)
49Shared Libraries
- Static libraries have the following disadvantages
- Potential for duplicating lots of common code in the executable files on a filesystem.
- e.g., every C program needs the standard C library
- Potential for duplicating lots of code in the virtual memory space of many processes.
- Minor bug fixes of system libraries require each application to explicitly relink
- Solution
- Shared libraries (dynamic link libraries, DLLs) whose members are dynamically loaded into memory and linked into an application at run-time.
- Dynamic linking can occur when executable is first loaded and run.
- Common case for Linux, handled automatically by ld-linux.so.
- Dynamic linking can also occur after program has begun.
- In Linux, this is done explicitly by user with dlopen().
- Basis for High-Performance Web Servers.
- Shared library routines can be shared by multiple processes.
50Dynamically Linked Shared Libraries
m.c -> Translators (cc1, as) -> m.o
a.c -> Translators (cc1, as) -> a.o
libc.so: shared library of dynamically relocatable object files
m.o + a.o + libc.so -> Linker (ld) -> p
p: partially linked executable (on disk)
p + libc.so -> Loader/Dynamic Linker (ld-linux.so) -> P
P: fully linked executable p (in memory)
libc.so functions called by m.c and a.c are loaded, linked, and (potentially) shared among processes.
51The Complete Picture
m.c -> Translator -> m.o
a.c -> Translator -> a.o
m.o + a.o + libwhatever.a -> Static Linker (ld) -> p
p + libc.so + libm.so -> Loader/Dynamic Linker (ld-linux.so) -> p
52Exceptional Control Flow
- Exceptional Control Flow
- Exceptions
- Process context switches
- Creating and destroying processes
53Control Flow
- Computers do Only One Thing
- From startup to shutdown, a CPU simply reads and executes (interprets) a sequence of instructions, one at a time.
- This sequence is the system's physical control flow (or flow of control).
Physical control flow (over time): <startup> inst1 inst2 inst3 ... instn <shutdown>
54Altering the Control Flow
- Up to Now: two mechanisms for changing control flow
- Jumps and branches
- Call and return using the stack discipline.
- Both react to changes in program state.
- Insufficient for a useful system
- Difficult for the CPU to react to changes in system state.
- data arrives from a disk or a network adapter.
- Instruction divides by zero
- User hits ctl-c at the keyboard
- System needs mechanisms for exceptional control flow
55Exceptional Control Flow
- Mechanisms for exceptional control flow exist at all levels of a computer system.
- Low level Mechanism
- exceptions
- change in control flow in response to a system event (i.e., change in system state)
- Combination of hardware and OS software
- Higher Level Mechanisms
- Process context switch
- Signals
- Nonlocal jumps (setjmp/longjmp)
- Implemented by either
- OS software (context switch and signals).
- C language runtime library: nonlocal jumps.
56Exceptions
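- An exception is a transfer of control to the OS in response to some event (i.e., change in processor state)
Figure: an event occurs in the user process between the current and next instructions; the exception transfers control to the OS, the exception handler processes it, and an (optional) exception return resumes the user process.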
57Interrupt Vectors
- Each type of event has a unique exception number k
- Index into jump table (a.k.a., interrupt vector)
- Jump table entry k points to a function (exception handler).
- Handler k is called each time exception k occurs.
Figure: interrupt vector with entries 0, 1, 2, ..., n-1 (the exception numbers); entry k points to the code for exception handler k.
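The jump table idea can be imitated in user code with an array of function pointers. This is only an analogy: real exception dispatch is done by the hardware and the OS, and the three handlers, their return values, and the name dispatch are assumptions made for illustration.

```c
#include <assert.h>

#define NUM_EXCEPTIONS 3   /* assumed: a toy table with three entries */

typedef int (*handler_t)(void);

/* Each toy handler returns its own exception number so that the
   dispatch can be observed by the caller. */
static int handler0(void) { return 0; }
static int handler1(void) { return 1; }
static int handler2(void) { return 2; }

/* Entry k of the table points to the handler for exception k. */
static const handler_t jump_table[NUM_EXCEPTIONS] = {
    handler0, handler1, handler2
};

/* Index into the table with exception number k and call handler k. */
int dispatch(int k)
{
    assert(k >= 0 && k < NUM_EXCEPTIONS);
    return jump_table[k]();
}
```

Calling dispatch(2) runs handler2 through the table without any if or switch on k, which is the point of indexing by exception number.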
58Asynchronous Exceptions (Interrupts)
- Caused by events external to the processor
- Indicated by setting the processor's interrupt pin
- handler returns to next instruction.
- Examples
- I/O interrupts
- hitting ctl-c at the keyboard
- arrival of a packet from a network
- arrival of a data sector from a disk
- Hard reset interrupt
- hitting the reset button
- Soft reset interrupt
- hitting ctl-alt-delete on a PC
59Synchronous Exceptions
- Caused by events that occur as a result of executing an instruction
- Traps
- Intentional
- Examples: system calls, breakpoint traps, special instructions
- Returns control to next instruction
- Faults
- Unintentional but possibly recoverable
- Examples: page faults (recoverable), protection faults (unrecoverable).
- Either re-executes faulting (current) instruction or aborts.
- Aborts
- unintentional and unrecoverable
- Examples: parity error, machine check.
- Aborts current program
60Trap Example
- Opening a File
- User calls open(filename, options)
- Function open executes system call instruction int
- OS must find or create file, get it ready for reading or writing
- Returns integer file descriptor

0804d070 <__libc_open>:
 . . .
 804d082:  cd 80    int   $0x80
 804d084:  5b       pop   %ebx
 . . .

Figure: the user process executes int; the exception transfers control to the OS, which opens the file and returns; the user process resumes at pop.
61Fault Example 1
int a[1000];
main ()
{
    a[500] = 13;
}

- Memory Reference
- User writes to memory location
- That portion (page) of user's memory is currently on disk
- Page handler must load page into physical memory
- Returns to faulting instruction
- Successful on second try

 80483b7:  c7 05 10 9d 04 08 0d    movl   $0xd,0x8049d10
62Fault Example 2
int a[1000];
main ()
{
    a[5000] = 13;
}

- Memory Reference
- User writes to memory location
- Address is not valid
- Page handler detects invalid address
- Sends SIGSEGV signal to user process
- User process exits with "segmentation fault"

 80483b7:  c7 05 60 e3 04 08 0d    movl   $0xd,0x804e360

Figure: the movl in the user process raises a page fault; the OS detects the invalid address and signals the process.
63Processes
- Def: A process is an instance of a running program.
- One of the most profound ideas in computer science.
- Not the same as "program" or "processor"
- Process provides each program with two key abstractions
- Logical control flow
- Each program seems to have exclusive use of the CPU.
- Private address space
- Each program seems to have exclusive use of main memory.
- How are these illusions maintained?
- Process executions interleaved (multitasking)
- Address spaces managed by virtual memory system
64Logical Control Flows
Each process has its own logical control flow
Process A
Process B
Process C
Time
65Concurrent Processes
- Two processes run concurrently (are concurrent) if their flows overlap in time.
- Otherwise, they are sequential.
- Examples
- Concurrent: A & B, A & C
- Sequential: B & C
66User View of Concurrent Processes
- Control flows for concurrent processes are physically disjoint in time.
- However, we can think of concurrent processes as running in parallel with each other.
Process A
Process B
Process C
Time
67Context Switching
- Processes are managed by a shared chunk of OS code called the kernel
- Important: the kernel is not a separate process, but rather runs as part of some user process
- Control flow passes from one process to another via a context switch.
Figure (over time): Process A user code, kernel code (context switch), Process B user code, kernel code (context switch), Process A user code.
68Private Address Spaces
- Each process has its own private address space.
0xffffffff
    kernel virtual memory (code, data, heap, stack): memory invisible to user code
0xc0000000
    user stack (created at runtime); %esp (stack pointer)
    memory mapped region for shared libraries
0x40000000
    run-time heap (managed by malloc); brk marks its top
    read/write segment (.data, .bss): loaded from the executable file
    read-only segment (.init, .text, .rodata): loaded from the executable file
0x08048000
    unused
0
69System calls
- Unix systems provide many different types of system calls to be used by application programs when they need a service from the kernel
- Reading a file
- Creating a new process
- To get the complete list, type
- man syscalls
70fork: Creating new processes
- int fork(void)
- creates a new process (child process) that is identical to the calling process (parent process)
- returns 0 to the child process
- returns child's pid to the parent process

if (fork() == 0) {
    printf("hello from child\n");
} else {
    printf("hello from parent\n");
}

Fork is interesting (and often confusing) because it is called once but returns twice
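The two return values can be observed from a single function. In this sketch the child reports that it saw a return value of 0 by exiting with a marker status, and the parent uses the pid it received to collect that status. The function name and the marker value 42 are assumptions for illustration; waitpid is used here only to read the child's status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if fork() returned 0 in the child and the child's pid in the
   parent, and 0 if anything went wrong. */
int fork_returns_twice(void)
{
    pid_t pid = fork();

    if (pid < 0)
        return 0;               /* fork failed: no child was created */
    if (pid == 0) {
        /* Child: fork() returned 0 here. Exit at once with a marker
           status; _exit avoids flushing stdio buffers copied from the
           parent. */
        _exit(42);
    }
    /* Parent: pid is the child's process id. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return 0;
    return WIFEXITED(status) && WEXITSTATUS(status) == 42;
}
```

A call such as fork_returns_twice() from main returns 1 only in the original process, since the child never returns from the function.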
71Fork Example 1
- Key Points
- Parent and child both run same code
- Distinguish parent from child by return value from fork
- Start with same state, but each has private copy
- Including shared output file descriptor
- Relative ordering of their print statements undefined

void fork1()
{
    int x = 1;
    pid_t pid = fork();
    if (pid == 0) {
        printf("Child has x = %d\n", ++x);
    } else {
        printf("Parent has x = %d\n", --x);
    }
    printf("Bye from process %d with x = %d\n", getpid(), x);
}
72Fork Example 2
- Key Points
- Both parent and child can continue forking
void fork2()
{
    printf("L0\n");
    fork();
    printf("L1\n");
    fork();
    printf("Bye\n");
}
73Fork Example 3
- Key Points
- Both parent and child can continue forking
void fork3()
{
    printf("L0\n");
    fork();
    printf("L1\n");
    fork();
    printf("L2\n");
    fork();
    printf("Bye\n");
}
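Because every process that exists at a fork() call continues past it, the number of processes doubles with each unconditional fork: fork2 ends with 4 processes printing Bye and fork3 with 8. The sketch below counts them instead of printing. The pipe used for counting, the function name, and the limit of 10 levels are assumptions for illustration. It also reaps every child, so it assumes the caller has no other children.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork `levels` times in every process (as fork2 and fork3 do), then count
   how many processes reach the final step. Returns the count, or -1 if the
   pipe cannot be created. If a fork fails, the count is simply smaller. */
int count_bye_processes(int levels)
{
    int fds[2];
    pid_t root = getpid();
    char mark = 'B';

    assert(levels >= 0 && levels <= 10);
    if (pipe(fds) < 0)
        return -1;

    for (int i = 0; i < levels; i++)
        fork();                 /* parent and child both continue the loop */

    /* Every process reaches this point: it stands in for printf("Bye\n"). */
    ssize_t w = write(fds[1], &mark, 1);
    (void)w;

    /* Reap all children created by this process. */
    while (wait(NULL) > 0)
        ;
    if (getpid() != root)
        _exit(0);               /* descendants stop here */

    /* Only the original process is left; count one byte per process. */
    close(fds[1]);
    int count = 0;
    while (read(fds[0], &mark, 1) == 1)
        count++;
    close(fds[0]);
    return count;
}
```

Each descendant waits for its own children before exiting, so by the time the original process finishes waiting, every write end of the pipe is closed and the read loop sees end of file.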
74Fork Example 4
- Key Points
- Both parent and child can continue forking
void fork4()
{
    printf("L0\n");
    if (fork() != 0) {
        printf("L1\n");
        if (fork() != 0) {
            printf("L2\n");
            fork();
        }
    }
    printf("Bye\n");
}
75Fork Example 5
- Key Points
- Both parent and child can continue forking
void fork5()
{
    printf("L0\n");
    if (fork() == 0) {
        printf("L1\n");
        if (fork() == 0) {
            printf("L2\n");
            fork();
        }
    }
    printf("Bye\n");
}
76exit: Destroying Process
- void exit(int status)
- exits a process
- Normally return with status 0
- atexit() registers functions to be executed upon exit

void cleanup(void) {
    printf("cleaning up\n");
}

void fork6() {
    atexit(cleanup);
    fork();
    exit(0);
}
77Zombies
- Idea
- When process terminates, it still consumes system resources
- Various tables maintained by OS
- Called a "zombie"
- Living corpse, half alive and half dead
- Reaping
- Performed by parent on terminated child
- Parent is given exit status information
- Kernel discards process
- What if Parent Doesn't Reap?
- If any parent terminates without reaping a child, then child will be reaped by init process
- Only need explicit reaping for long-running processes
- E.g., shells and servers
78Zombie Example
void fork7()
{
    if (fork() == 0) {
        /* Child */
        printf("Terminating Child, PID = %d\n", getpid());
        exit(0);
    } else {
        printf("Running Parent, PID = %d\n", getpid());
        while (1)
            ; /* Infinite loop */
    }
}

linux> ./forks 7 &
[1] 6639
Running Parent, PID = 6639
Terminating Child, PID = 6640
linux> ps
  PID TTY          TIME CMD
 6585 ttyp9    00:00:00 tcsh
 6639 ttyp9    00:00:03 forks
 6640 ttyp9    00:00:00 forks <defunct>
 6641 ttyp9    00:00:00 ps
linux> kill 6639
[1]    Terminated
linux> ps
  PID TTY          TIME CMD
 6585 ttyp9    00:00:00 tcsh
 6642 ttyp9    00:00:00 ps

- ps shows child process as "defunct"
- Killing parent allows child to be reaped
79Nonterminating Child Example
void fork8()
{
    if (fork() == 0) {
        /* Child */
        printf("Running Child, PID = %d\n", getpid());
        while (1)
            ; /* Infinite loop */
    } else {
        printf("Terminating Parent, PID = %d\n", getpid());
        exit(0);
    }
}

linux> ./forks 8
Terminating Parent, PID = 6675
Running Child, PID = 6676
linux> ps
  PID TTY          TIME CMD
 6585 ttyp9    00:00:00 tcsh
 6676 ttyp9    00:00:06 forks
 6677 ttyp9    00:00:00 ps
linux> kill 6676
linux> ps
  PID TTY          TIME CMD
 6585 ttyp9    00:00:00 tcsh
 6678 ttyp9    00:00:00 ps

- Child process still active even though parent has terminated
- Must kill explicitly, or else will keep running indefinitely
80wait: Synchronizing with children
- int wait(int *child_status)
- suspends current process until one of its children terminates
- return value is the pid of the child process that terminated
- if child_status != NULL, then the object it points to will be set to a status indicating why the child process terminated
81wait: Synchronizing with children
void fork9()
{
    int child_status;

    if (fork() == 0) {
        printf("HC: hello from child\n");
    } else {
        printf("HP: hello from parent\n");
        wait(&child_status);
        printf("CT: child has terminated\n");
    }
    printf("Bye\n");
    exit(0);
}
82Wait Example
- If multiple children completed, will take in arbitrary order
- Can use macros WIFEXITED and WEXITSTATUS to get information about exit status

void fork10()
{
    pid_t pid[N];
    int i;
    int child_status;
    for (i = 0; i < N; i++)
        if ((pid[i] = fork()) == 0)
            exit(100+i); /* Child */
    for (i = 0; i < N; i++) {
        pid_t wpid = wait(&child_status);
        if (WIFEXITED(child_status))
            printf("Child %d terminated with exit status %d\n",
                   wpid, WEXITSTATUS(child_status));
        else
            printf("Child %d terminated abnormally\n", wpid);
    }
}
83Waitpid
- waitpid(pid, &status, options)
- Can wait for specific process
- Various options

void fork11()
{
    pid_t pid[N];
    int i;
    int child_status;
    for (i = 0; i < N; i++)
        if ((pid[i] = fork()) == 0)
            exit(100+i); /* Child */
    for (i = 0; i < N; i++) {
        pid_t wpid = waitpid(pid[i], &child_status, 0);
        if (WIFEXITED(child_status))
            printf("Child %d terminated with exit status %d\n",
                   wpid, WEXITSTATUS(child_status));
        else
            printf("Child %d terminated abnormally\n", wpid);
    }
}
84Wait/Waitpid Example Outputs
Using wait (fork10)
Child 3565 terminated with exit status 103
Child 3564 terminated with exit status 102
Child 3563 terminated with exit status 101
Child 3562 terminated with exit status 100
Child 3566 terminated with exit status 104

Using waitpid (fork11)
Child 3568 terminated with exit status 100
Child 3569 terminated with exit status 101
Child 3570 terminated with exit status 102
Child 3571 terminated with exit status 103
Child 3572 terminated with exit status 104
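The ordered output of fork11 follows from waiting on each pid in the order the children were created. The sketch below is a variant of fork11 that stores the statuses in an array instead of printing them, so the ordering can be checked. The name reap_in_order and the constant NCHILD are assumptions for illustration, and the children call _exit rather than exit so that stdio buffers copied from the parent are not flushed twice.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NCHILD 5   /* assumed: five children, matching the outputs above */

/* Create NCHILD children that exit with status 100+i, then reap them in
   creation order. statuses[i] receives child i's exit status, or -1 if it
   did not exit normally. Returns 0 on success and -1 if a fork fails (in
   which case earlier children are left for the caller to reap). */
int reap_in_order(int statuses[NCHILD])
{
    pid_t pid[NCHILD];

    assert(statuses != 0);
    for (int i = 0; i < NCHILD; i++) {
        pid[i] = fork();
        if (pid[i] < 0)
            return -1;
        if (pid[i] == 0)
            _exit(100 + i);     /* Child */
    }
    for (int i = 0; i < NCHILD; i++) {
        int child_status;
        /* Waiting on pid[i] specifically fixes the reaping order, no
           matter which child happens to terminate first. */
        pid_t wpid = waitpid(pid[i], &child_status, 0);
        if (wpid == pid[i] && WIFEXITED(child_status))
            statuses[i] = WEXITSTATUS(child_status);
        else
            statuses[i] = -1;
    }
    return 0;
}
```

Replacing the waitpid call with wait(&child_status) gives the fork10 behavior, where the statuses can arrive in any order.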
85Command line arguments in C/C++
- Command line arguments to C/C++ programs are passed through the argv array
- int main (int argc, char *argv[])
- argc is the number of command line arguments, including the name of the program or command being executed (passed as argv[0])
- Example 1: printArgs.c
- Example 2: printN.c
86exec: Running new programs
- int execl(char *path, char *arg0, char *arg1, ..., 0)
- loads and runs executable at path with args arg0, arg1, ...
- path is the complete path of an executable
- arg0 becomes the name of the process
- typically arg0 is either identical to path, or else it contains only the executable filename from path
- "real" arguments to the executable start with arg1, etc.
- list of args is terminated by a (char *)0 argument
- returns -1 if error, otherwise doesn't return!

main()
{
    if (fork() == 0) {
        execl("/usr/bin/cp", "cp", "foo", "bar", (char *)0);
    }
    wait(NULL);
    printf("copy completed\n");
    exit(0);
}
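The fork, execl, and wait pattern above can be wrapped in one function that also handles a failed execl. This sketch runs a command string through /bin/sh instead of cp so that it does not depend on files named foo and bar. The function name, the use of /bin/sh -c, and the conventional failure status 127 are assumptions for illustration.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run cmd with "/bin/sh -c cmd" in a child process and wait for it.
   Returns the child's exit status, or -1 if the fork or wait fails or the
   child did not exit normally. */
int run_shell_command(const char *cmd)
{
    assert(cmd != 0);
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: replace this program with the shell. The argument list
           ends with a (char *)0 terminator. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)0);
        _exit(127);             /* reached only if execl failed */
    }
    /* Parent: wait for that specific child and decode its status. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_shell_command("exit 3") returns 3, because the shell exits with that status. The _exit after execl matters: without it, a child whose exec failed would fall through and keep running the parent's code.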
87Running printArgs from a program
- Instead of running the printArgs program from the Unix shell, we can run it from a program, using execlp.
- Example 1: runls.c
- Example 2: prog.c
- Example 3: prog2.c
88Creating new processes in UNIX
- A process creates a new process that executes a given program or command as follows:
- Call fork( ) to create a new process
- Call exec( ) within the new process to execute the program or command
- Example: progExec.c
89Writing a Unix Shell
- Pseudo Code for a shell
- print a prompt.
- while( EOF is not signaled and an input line is read )
- create a child process (using fork)
- have the child process replace its program (this shell program) with the program specified in the input line. (using execlp)
- wait for the child to finish executing its program. (using wait)
- print a prompt
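The pseudo code above can be sketched in C as follows. This is a minimal illustration, not the lab solution: it splits each line on whitespace only (no quoting, pipes, or redirection), and it uses execvp rather than execlp because the arguments are collected in an array. The names run_line and shell_loop, the prompt string, and the size limits are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 16    /* assumed limit on words per command line */
#define MAX_LINE 256   /* assumed limit on line length */

/* Split line on whitespace (modifying it in place), run the command in a
   child, and wait for it. Returns the exit status, 0 for an empty line,
   or -1 on failure. */
int run_line(char *line)
{
    char *argv[MAX_ARGS + 1];
    int argc = 0;

    assert(line != NULL);
    for (char *tok = strtok(line, " \t\n");
         tok != NULL && argc < MAX_ARGS;
         tok = strtok(NULL, " \t\n"))
        argv[argc++] = tok;
    argv[argc] = NULL;
    if (argc == 0)
        return 0;

    pid_t pid = fork();             /* create a child process */
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);      /* child replaces its program */
        _exit(127);                 /* reached only if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)   /* wait for the child */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Prompt, read a line, run it, repeat until EOF. Returns the number of
   lines read. */
int shell_loop(FILE *in)
{
    char line[MAX_LINE];
    int lines = 0;

    assert(in != NULL);
    printf("> ");
    fflush(stdout);
    while (fgets(line, sizeof line, in) != NULL) {
        run_line(line);
        lines++;
        printf("> ");
        fflush(stdout);
    }
    return lines;
}
```

A main function that calls shell_loop(stdin) gives an interactive loop that ends on EOF (ctl-d at the keyboard).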