Computer Systems - PowerPoint PPT Presentation

Provided by: soft9
Transcript and Presenter's Notes

1
Computer Systems
  • Linking

2
GNU Compiler Collection
  • Compiler driver coordinates all steps in the
    translation and linking process.
  • Invokes preprocessor (cpp), compiler (cc1),
    assembler (as), and linker (ld).
  • Passes command line arguments to appropriate
    phases

bass> gcc -O2 -v -o p m.c a.c
cpp [args] m.c /tmp/cca07630.i
cc1 /tmp/cca07630.i m.c -O2 [args] -o /tmp/cca07630.s
as [args] -o /tmp/cca076301.o /tmp/cca07630.s
<similar process for a.c>
ld -o p [system obj files] /tmp/cca076301.o /tmp/cca076302.o
bass>

3
Linking
  • Static linking
  • Dynamic linking
  • Application loading and linking

4
What Does a Linker Do?
  • Merges object files
      • Merges multiple relocatable (.o) object files
        into a single executable object file.
  • Resolves external references
      • External reference: a reference to a symbol
        (function or global variable) defined in
        another object file.
  • Relocates symbols
      • Relocates symbols from their relative locations
        in the .o files to new absolute positions in the
        executable.
      • Updates all references to these symbols to
        reflect their new positions.
      • References can appear in code or in data:
          code: a();           /* reference to symbol a */
          data: int *xp = &x;  /* reference to symbol x */

5
Why Linkers?
  • Modularity
      • Program can be written as a collection of smaller
        source files, rather than one monolithic mass.
  • Efficiency
      • Time
          • Change one source file, compile, and then relink.
          • No need to recompile other source files.
      • Space
          • Libraries of common functions can be aggregated
          • Executable files and running memory images
            contain only code for the functions they actually
            use.

6
Executable and Linkable Format (ELF)
  • Standard binary format for object files
  • Derives from AT&T System V Unix
      • Later adopted by BSD Unix variants and Linux
  • One unified format for:
      • Relocatable object files (.o)
      • Executable object files
      • Shared object files (.so)
  • Generic name: ELF binaries
  • Tools: file, readelf -h
7
ELF Object File Format
  • ELF header
      • Magic number, type (.o, exec, .so), machine, byte
        ordering, etc.
  • .text section
      • Code
  • .data section
      • Initialized data (!static global var)
  • .bss section
      • Uninitialized data (!static global var)
  • .rodata section
      • Read-only data (const var)
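
A small sketch of how C definitions usually map onto these sections. All names here are hypothetical, and the placement in the comments is only the typical result: the exact section depends on the compiler, its version, and the flags used. Running objdump -t or readelf -S on the resulting .o file shows what a given toolchain actually did.

```c
#include <assert.h>

int initialized_global = 15;      /* typically placed in .data                   */
int uninitialized_global;         /* typically placed in .bss                    */
static int file_local_counter;    /* zero-initialized static: typically .bss     */
const int read_only_limit = 100;  /* typically placed in .rodata                 */

/* The machine instructions of every function go to .text. */
int sum_globals(void)
{
    /* Data placed in .bss is guaranteed to read as zero at program start. */
    assert(file_local_counter == 0);
    return initialized_global + uninitialized_global
         + file_local_counter + read_only_limit;
}
```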

8
Example C Program
m.c:
    int e = 7;

    int main() {
      int r = a();
      exit(0);
    }

a.c:
    extern int e;

    int *ep = &e;
    int x = 15;
    int y;

    int a() {
      return *ep + x + y;
    }

9
Merging Object Files into an Executable
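[Figure: the linker merges the sections of the relocatable object files into one executable object file.]

Relocatable Object Files:
  • system code (.text) and system data (.data)
  • m.o
      • .text: main()
      • .data: int e = 7
  • a.o
      • .text: a()
      • .data: int *ep = &e, int x = 15
      • .bss: int y

Executable Object File (starting at offset 0):
  • headers
  • .text: system code, main(), a(), more system code
  • .data: system data, int e = 7, int *ep = &e, int x = 15
  • .bss: uninitialized data
  • .symtab, .debug
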
10
Definitions and External References
  • Symbols are lexical entities that name functions
    and variables.
  • Each symbol has a value (typically a memory
    address).
  • Code consists of symbol definitions and
    references.
  • References can be either local or external.

m.c:
    int e = 7;              /* def of local symbol e */

    int main() {
      int r = a();
      exit(0);              /* ref to external symbol exit (defined in libc.so) */
    }

a.c:
    extern int e;

    int *ep = &e;
    int x = 15;
    int y;

    int a() {
      return *ep + x + y;
    }

11
m.o Relocation Info
m.c:
    int e = 7;

    int main() {
      int r = a();
      exit(0);
    }

Disassembly of main (source: gdb):
    0x80483e4 <main>:     push %ebp
    0x80483e5 <main+1>:   mov  %esp,%ebp
    0x80483e7 <main+3>:   sub  $0x8,%esp
    0x80483ea <main+6>:   call 0x80483fc <a>
    0x80483ef <main+11>:  add  $0xfffffff4,%esp
    0x80483f2 <main+14>:  push $0x0
    0x80483f4 <main+16>:  call 0x80482f0 <exit>

Disassembly of section .rel.text (source: objdump -r):
    RELOCATION RECORDS FOR [.text]:
    OFFSET   TYPE        VALUE
    00000007 R_386_PC32  a
    00000011 R_386_PC32  exit

Disassembly of section .text (source: objdump -d m.o):
    0: 55              push %ebp
    1: 89 e5           mov  %esp,%ebp
    3: 83 ec 08        sub  $0x8,%esp
    6: e8 fc ff ff ff  call 7 <main+0x7>
                       7: R_386_PC32 a

12
Executable m info
Disassembly of section .text (source: objdump -d m.o):
    0: 55              push %ebp
    1: 89 e5           mov  %esp,%ebp
    3: 83 ec 08        sub  $0x8,%esp
    6: e8 fc ff ff ff  call 7 <main+0x7>
                       7: R_386_PC32 a

  • The call operand fc ff ff ff (0xfffffffc = -4) is
    rewritten to 0d 00 00 00 (0xd = 13).
  • The call target 7 <main+0x7> is resolved to
    80483fc <a>.

Disassembly of section .text (source: objdump -d m):
    80483e4: 55              push %ebp
    80483e5: 89 e5           mov  %esp,%ebp
    80483e7: 83 ec 08        sub  $0x8,%esp
    80483ea: e8 0d 00 00 00  call 80483fc <a>

13
Relocation algorithm
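typedef struct {
    int offset;       /* offset of the reference to relocate */
    int symbol:24,    /* symbol the reference should point to */
        type:8;       /* relocation type */
} Elf32_Rel;

foreach section s {
    foreach relocation entry r {
        /* refptr points to the 4-byte reference at offset r.offset in s */
        if (r.type == R_386_PC32) {
            refaddr = ADDR(s) + r.offset;
            *refptr += ADDR(r.symbol) - refaddr;
        }
    }
}

Worked example for the call to a in main (0xfffffffc -> 0xd, 0x7 -> 0x80483fc):
    ADDR(s)        = ADDR(main) = 0x80483e4
    r.type         = R_386_PC32
    ADDR(r.symbol) = ADDR(a) = 0x80483fc
    r.offset       = 0x7
    *refptr        = 0xfffffffc (-4) before relocation
    refaddr        = 0x80483e4 + 0x7 = 0x80483eb   // address of the call operand
    *refptr        = 0xfffffffc + 0x80483fc - 0x80483eb = 0xd
    At run time: PC <- PC + 0xd = 0x80483ef + 0xd = 0x80483fc
    // PC already holds the address of the instruction after the call
    // (0x80483ef); the target is 13 bytes further on.
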
14
Loading Info
Program header entries for the two loadable segments:
    LOAD off    0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
         filesz 0x0000047c memsz 0x0000047c flags r-x
    LOAD off    0x0000047c vaddr 0x0804947c paddr 0x0804947c align 2**12
         filesz 0x00000114 memsz 0x00000130 flags rw-

[Figure: the executable object file is loaded into the process memory image.]

Executable object file (starting at offset 0):
  • ELF header
  • Program header table (required for executables)
  • .init, .text section, .fini
  • .data section
  • .bss section

Memory image (high to low addresses; labels shown in the figure: 0xc0000000, 0x40000000, 0x080495ac, 0x0804947c, 0x08048300, 0):
  • Run-time heap (created by malloc)
  • Read/write segment (.data, .bss), loaded from the
    executable file
  • Read-only segment (.init, .text, .rodata), loaded from
    the executable file
  • Unused

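
The arithmetic behind the LOAD entries can be sketched as below. The struct is a hypothetical, simplified stand-in for a program header entry, not the real ELF type. The bytes by which memsz exceeds filesz have no image in the file; the loader zero-fills them, which is where .bss ends up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of one LOAD entry. */
struct load_segment {
    uint32_t off;     /* offset of the segment in the file        */
    uint32_t vaddr;   /* virtual address the segment is mapped at */
    uint32_t filesz;  /* bytes taken from the file                */
    uint32_t memsz;   /* bytes occupied in memory                 */
};

/* Bytes that exist only in memory and are zero-filled at load time. */
uint32_t zero_fill_bytes(const struct load_segment *seg)
{
    assert(seg->memsz >= seg->filesz);
    return seg->memsz - seg->filesz;
}

/* First address past the end of the mapped segment. */
uint32_t segment_end(const struct load_segment *seg)
{
    return seg->vaddr + seg->memsz;
}
```

For the read/write segment listed above (vaddr 0x0804947c, filesz 0x114, memsz 0x130) this gives 0x1c zero-filled bytes and an end address of 0x080495ac, one of the address labels in the figure.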
15
How to Package Commonly Used Functions?
  • Awkward, given the linker framework so far
      • Option 1: Put all functions in a single source
        file
          • Programmers link big object file into their
            programs
          • Space and time inefficient
      • Option 2: Put each function in a separate source
        file
          • Programmers explicitly link appropriate binaries
            into their programs
          • More efficient, but burdensome on the programmer
  • Solution: static libraries (.a archive files)
      • Concatenate related relocatable object files into
        a single file with an index (called an archive).
      • Enhance linker so that it tries to resolve
        unresolved external references by looking for the
        symbols in one or more archives.

16
Static Libraries
  • Static libraries have the following
    disadvantages:
      • Potential for duplicating lots of common code in
        the executable files on a filesystem.
          • e.g., every C program needs the standard C
            library
      • Potential for duplicating lots of code in the
        virtual memory space of many processes.
      • Minor bug fixes of system libraries require each
        application to explicitly relink

17
Shared Libraries
  • Solution:
      • Shared libraries (dynamic link libraries, DLLs)
        whose members are dynamically loaded into memory
        and linked into an application at run-time.
  • Dynamic linking can occur when executable is
    first loaded and run.
      • Common case for Linux, handled automatically by
        ld-linux.so.
  • Dynamic linking can also occur after program has
    begun.
      • In Linux, this is done explicitly by user with
        dlopen().
      • Basis for High-Performance Web Servers.
  • Shared library routines can be shared by multiple
    processes.
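
A minimal sketch of explicit run-time linking with dlopen(), assuming a POSIX system that provides <dlfcn.h>. The helper name call_from_library is hypothetical, and it assumes the looked-up symbol really is a function taking and returning a double.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: load a shared library after the program has begun,
 * look up a function of type double(double) by name, and call it once.
 * Returns 0 on success, -1 if the library or the symbol cannot be found. */
int call_from_library(const char *lib, const char *name,
                      double arg, double *result)
{
    void *handle;
    double (*fn)(double);

    assert(lib != NULL && name != NULL && result != NULL);

    handle = dlopen(lib, RTLD_LAZY);
    if (handle == NULL)
        return -1;

    /* Idiom from the POSIX dlsym documentation for turning the returned
     * void * into a function pointer. */
    *(void **)(&fn) = dlsym(handle, name);
    if (fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *result = fn(arg);
    dlclose(handle);
    return 0;
}
```

On a Linux system with glibc, a call such as call_from_library("libm.so.6", "cos", 0.0, &r) would be expected to succeed; the library file name is an assumption about the platform. Older glibc versions need -ldl on the link line.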

18
Separate region for shared libraries
[Figure: Linux/x86 process memory image, from high to low addresses]
  • kernel virtual memory (memory invisible to user code)
  • stack (%esp points to the top of the stack)
  • memory mapped region for shared libraries
  • runtime heap (via malloc), bounded above by the brk ptr
  • uninitialized data (.bss)
  • initialized data (.data)
  • program text (.text)
  • forbidden
  • address 0

19
The Complete Picture
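[Figure: flow from source files to a running program]
  • m.c and a.c each pass through a Translator,
    producing m.o and a.o.
  • Static Linker (ld) combines m.o, a.o, and
    libwhatever.a into the executable p.
  • Loader/Dynamic Linker (ld-linux.so) combines p with
    libc.so and libm.so to produce the running program p.
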
20
Power Programmer
  • The order of linking is very important

21
Strong and Weak Symbols
  • Program symbols are either strong or weak
      • Strong: procedures and initialized globals
      • Weak: uninitialized globals

p1.c:
    int foo = 5;    /* strong */
    p1() { }        /* strong */

p2.c:
    int foo;        /* weak */
    p2() { }        /* strong */

22
Linker's Symbol Rules
  • Rule 1. A strong symbol can only appear once.
  • Rule 2. A weak symbol can be overridden by a
    strong symbol of the same name.
      • References to the weak symbol resolve to the
        strong symbol.
  • Rule 3. If there are multiple weak symbols, the
    linker can pick an arbitrary one.

23
Linker Puzzles
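  • File 1: int x; p1() {}
    File 2: p1() {}
    Link-time error: two strong symbols (p1).
  • File 1: int x; p1() {}
    File 2: int x; p2() {}
    References to x will refer to the same
    uninitialized int. Is this what you really want?
  • File 1: int x; int y; p1() {}
    File 2: double x; p2() {}
    Writes to x in p2 might overwrite y! Evil!
  • File 1: int x = 7; int y = 5; p1() {}
    File 2: double x; p2() {}
    Writes to x in p2 will overwrite y! Nasty!
  • File 1: int x = 7; p1() {}
    File 2: int x; p2() {}
    By Rule 2, references to x in both files resolve to
    the strong, initialized x.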
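
The "writes to x in p2 overwrite y" puzzles can be simulated inside a single file. This is only a model of the effect, not a demonstration of the linker itself: the struct is a hypothetical stand-in for a layout in which the linker has placed y immediately after x, and the function plays the role of p2 storing a double at the address of the 4-byte x. It assumes a 4-byte int and an 8-byte double.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for p1's globals, assuming the linker put y right after x. */
struct p1_globals {
    int x;
    int y;
};

/* Simulates p2 executing "x = value" while believing x is a double:
 * 8 bytes are written starting at the address of the 4-byte x. */
void write_x_as_double(struct p1_globals *g, double value)
{
    /* The simulation only makes sense if the double spans both ints. */
    assert(sizeof(double) == sizeof(struct p1_globals));
    memcpy(g, &value, sizeof value);
}
```

Starting from x = 7 and y = 5, a call such as write_x_as_double(&g, 0.1) changes both fields, even though the code in p1 never assigned to y.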
24
Assignment
  • Problem 7.7: Modify bar5.c to print the correct
    values.