Linking - PowerPoint PPT Presentation

About This Presentation
Title:

Linking

Description:

ASCII source file. binary executable object file (memory image on disk) 4. Linker (ld) ... Each relocatable object module has a symbol table ... – PowerPoint PPT presentation

Slides: 74
Provided by: binyu5
Category:
Tags: ascii | linking | table


Transcript and Presenter's Notes

1
Linking
2
Outline
  • What linking is and why it matters
  • Compiler driver
  • Static linking
  • Symbols and symbol tables
  • Symbol resolution
  • Relocation
  • Executable object files
  • Loading
  • Dynamic linking
  • Suggested reading: 7.1-7.11

3
Monolithic source file
  • Problems
  • Efficiency: a small change requires complete
    recompilation
  • Modularity: hard to share common functions (e.g.
    printf)

4
Separate Compilation
5
What is a linker
  • Linking is the process of
  • collecting and combining various pieces of code
    and data into a single executable file
  • Executable file
  • Can be loaded (copied) into memory and executed.

6
What is a linker
  • Linking can be performed
  • at compile time, when the source code is
    translated into machine code
  • at load time, when the program is loaded into
    memory and executed by the loader
  • at run time, by application programs.

7
Why learn about linking
  • Build large programs
  • Avoid missing modules, especially libraries
  • Avoid dangerous programming errors
  • e.g., defining the same global variable in
    multiple modules
  • Understand scoping rules
  • extern and static vs. auto and register

8
(No Transcript)
9
Example
  • Two functions
  • main() and swap()
  • Three global variables
  • buf and bufp0, which are initialized explicitly
  • bufp1, which is implicitly initialized to 0
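The slide that held the source code did not survive in the transcript. A minimal reconstruction, consistent with the bullets above and with the disassembly and data dump on the later relocation slides, might look like the sketch below. It merges the two source files into one for convenience, and the exact body of swap() is inferred rather than quoted.

```c
/* Sketch of the running example, merged into one file.
   In the deck, buf and main() live in main.c; the rest lives in swap.c. */

int buf[2] = {1, 2};      /* explicitly initialized global */

int *bufp0 = &buf[0];     /* explicitly initialized global */
int *bufp1;               /* uninitialized global: implicitly 0 */

void swap(void)
{
    int temp;

    bufp1 = &buf[1];
    temp = *bufp0;
    *bufp0 = *bufp1;
    *bufp1 = temp;
}

/* main() in main.c simply calls swap() and returns 0. */
```

In the two-file version, swap.c would declare buf with extern, which is exactly the kind of external reference the linker has to resolve.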

10
Compiler Drivers
  • Coordinates all steps in the translation and
    linking process
  • Typically included with each compilation system
    (e.g., gcc)
  • Invokes
  • preprocessor (cpp)
  • compiler (cc1)
  • assembler (as),
  • linker (ld).
  • Passes command line args to appropriate phases

11
Example
  • unix> gcc -O2 -g -o p main.c swap.c
  • cpp [args] main.c /tmp/main.i
  • cc1 /tmp/main.i main.c -O2 [args] -o /tmp/main.s
  • as [args] -o /tmp/main.o /tmp/main.s
  • <similar process for swap.c>
  • ld -o p [system obj files] /tmp/main.o
    /tmp/swap.o
  • unix>

12
Static linking
  • Input
  • A collection of relocatable object files and
    command line arguments
  • Output
  • Fully linked executable object file that can be
    loaded and run

13
Static linking
  • Object file
  • Various code and data sections
  • Instructions are in one section
  • Initialized global variables are in one section
  • Uninitialized global variables are in one section

14
Static linking
  • Symbol resolution
  • resolves external references.
  • external reference: a reference to a symbol
    defined in another object file

15
Static linking
  • Relocation
  • relocates symbols from their relative locations
    in the .o files to new absolute positions in the
    executable.
  • updates all references to these symbols to
    reflect their new positions.
  • references can be in either code or data
  • code: a(); /* ref to symbol a */
  • data: int *xp = &x; /* ref to symbol x */

16
Object files
  • Relocatable object file
  • Contains binary code and data in a form that can
    be combined with other relocatable object files
    to create an executable file
  • Executable object file
  • Contains binary code and data in a form that can
    be copied directly into memory and executed

17
Object files
  • Shared object file
  • A special type of relocatable object file that
    can be loaded into memory and linked dynamically,
    at either load time or run time

18
Executable and Linkable Format (ELF)
  • Standard binary format for object files
  • Derives from AT&T System V Unix
  • later adopted by BSD Unix variants and Linux
  • One unified format for relocatable object files
    (.o), executable object files, and shared object
    files (.so)
  • generic name: ELF binaries
  • Better support for shared libraries than old
    a.out formats.

19
ELF object file format
20
ELF object file format
  • ELF header
  • magic number, type (.o, exec, .so), machine, byte
    ordering, etc.
  • Section header table

21
ELF object file format
  • .text section
  • code
  • .data section
  • initialized (static) data
  • .bss section
  • uninitialized (static) data
  • the name stands for "Block Started by Symbol"
  • mnemonic: "Better Save Space"
  • has section header but occupies no space

22
ELF object file format
  • .symtab section
  • symbol table
  • procedure and static variable names
  • section names and locations
  • .rel.text section
  • relocation info for .text section
  • addresses of instructions that will need to be
    modified in the executable

23
ELF object file format
  • .rel.data section
  • relocation info for .data section
  • addresses of pointer data that will need to be
    modified in the merged executable
  • .debug section
  • debugging symbol table, local variables and
    typedefs, global variables, original C source
    file (gcc -g)

24
ELF object file format
  • .line
  • Mapping between line numbers in the original C
    source program and machine code instructions in
    the .text section.
  • .strtab
  • A string table for the symbol tables and for the
    section names.

25
Symbols
  • Three kinds of symbols
  • Defined global symbols
  • Referenced global symbols
  • Local symbols

26
Symbols
  • Defined global symbols
  • Defined by module m and can be referenced by
    other modules
  • Nonstatic C functions
  • Global variables that are defined without the C
    static attribute

27
Symbols
  • Referenced global symbols
  • Referenced by module m but defined by some other
    module
  • C functions and variables that are defined in
    other modules
  • Local symbols
  • Defined and referenced exclusively by module m.
  • C functions and global variables with static
    attribute
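As a rough illustration of the three kinds of symbols, the sketch below marks each declaration in a small module with the category it would fall into. All identifiers (hits, helper_calls, helper, measure, invocations) are made up for this example; strlen stands in for a symbol defined in another module.

```c
#include <stddef.h>
#include <string.h>

int hits = 0;                     /* defined global symbol: nonstatic global variable */

static int helper_calls = 0;      /* local symbol: static global, private to this module */

/* local symbol: static function */
static size_t helper(const char *s)
{
    helper_calls++;
    return strlen(s);             /* strlen: referenced global symbol, defined in libc
                                     (unless the compiler expands the call inline) */
}

/* defined global symbol: nonstatic function */
size_t measure(const char *s)
{
    static int invocations = 0;   /* local static variable: kept in .data or .bss */
    size_t n;                     /* local nonstatic variable: lives on the stack,
                                     no symbol table entry */

    invocations++;
    n = helper(s);
    hits = invocations;
    return n;
}
```

Running a tool such as readelf -s on the resulting object file is one way to check which of these names actually appear in .symtab and with which binding.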

28
Symbol Tables
  • Each relocatable object module has a symbol table
  • A symbol table contains information about the
    symbols that are defined and referenced by the
    module

29
Symbol Tables
  • Local nonstatic program variables
  • Are not included in the symbol table in .symtab
  • Local static procedure variables
  • Are not managed on the stack
  • Are allocated in .data or .bss

30
ELF object file format
31
Examples
  • int f() {
  •     static int x = 1;
  •     return x;
  • }
  • int g() {
  •     static int x = 1;
  •     return x;
  • }
  • The two variables get distinct local symbols,
    x.1 and x.2, both allocated in .data

32
Symbol Tables
  • Compiler exports symbols in .s file
  • Assembler builds symbol tables using exported
    symbols
  • An ELF symbol table is contained in .symtab
    section
  • Symbol table contains an array of entries

33
(No Transcript)
34
ELF Symbol Tables
  • typedef struct {
  •     int name;        /* string table offset */
  •     int value;       /* section offset, or VM address */
  •     int size;        /* object size in bytes */
  •     char type:4,     /* data, func, section, or src file name */
  •          binding:4;  /* local or global */
  •     char reserved;   /* unused */
  •     char section;    /* section header index, ABS, UNDEF, or COMMON */
  • } Elf_Symbol;
  • ABS, UNDEF, COMMON

35
ELF Symbol Tables
  • Num:  Value  Size  Type    Bind    Ot  Ndx  Name
  •   8:      0     8  OBJECT  GLOBAL   0    3  buf
  •   9:      0    17  FUNC    GLOBAL   0    1  main
  •  10:      0     0  NOTYPE  GLOBAL   0  UND  swap
  • Num:  Value  Size  Type    Bind    Ot  Ndx  Name
  •   8:      0     4  OBJECT  GLOBAL   0    3  bufp0
  •   9:      0     0  NOTYPE  GLOBAL   0  UND  buf
  •  10:      0    39  FUNC    GLOBAL   0    1  swap
  •  11:      4     4  OBJECT  GLOBAL   0  COM  bufp1
  • For the COMMON symbol bufp1, the Value column
    gives the alignment
36
Symbol Resolution
  • void foo(void);
  • int main() {
  •     foo();
  •     return 0;
  • }
  • unix> gcc -Wall -O2 -o linkerror linkerror.c
  • /tmp/ccSz5uti.o: In function `main':
  • /tmp/ccSz5uti.o(.text+0x7): undefined reference
    to `foo'
  • collect2: ld returned 1 exit status

37
Multiply Defined Global Symbols
  • Strong
  • Functions and initialized global variables
  • Weak
  • Uninitialized global variables
  • Rules
  • Multiple strong symbols are not allowed
  • Given a strong symbol and multiple weak symbols,
    choose the strong symbol
  • Given multiple weak symbols, choose any of the
    weak symbols
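The three rules can be sketched as a small decision function. This is only a model of the rules as stated on the slide, not how ld is implemented; the struct def type, the resolve() function, and the choice of the first weak definition for rule 3 are assumptions made for the example.

```c
#include <stddef.h>

enum strength { WEAK, STRONG };

struct def {
    const char *module;       /* hypothetical: file that defines the symbol */
    enum strength strength;
};

/* Applies the three rules to all definitions of one symbol name.
   Returns the index of the chosen definition, or -1 if the link must fail
   (more than one strong definition, or no definition at all). */
int resolve(const struct def *defs, size_t n)
{
    int strong = -1;
    int first_weak = -1;

    for (size_t i = 0; i < n; i++) {
        if (defs[i].strength == STRONG) {
            if (strong >= 0)
                return -1;            /* rule 1: multiple strong symbols */
            strong = (int)i;
        } else if (first_weak < 0) {
            first_weak = (int)i;      /* candidate for rule 3 */
        }
    }
    /* rule 2: a strong symbol beats any weak ones;
       rule 3: otherwise any weak symbol will do (here: the first) */
    return strong >= 0 ? strong : first_weak;
}
```

Note that rule 3 makes the outcome depend on an arbitrary choice, which is why two weak definitions with different types are so dangerous.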

38
Multiply Defined Global Symbols
39
Multiply Defined Global Symbols
  • /* foo3.c */
  • #include <stdio.h>
  • void f(void);
  • int x = 15213;
  • int main() {
  •     f();
  •     printf("x = %d\n", x);
  •     return 0;
  • }
  • /* bar3.c */
  • int x;
  • void f() {
  •     x = 15212;
  • }
40
Multiply Defined Global Symbols
  • /* foo4.c */
  • #include <stdio.h>
  • void f(void);
  • int x = 15213;
  • int main() {
  •     x = 15213;
  •     f();
  •     printf("x = %d\n", x);
  •     return 0;
  • }
  • /* bar4.c */
  • int x;
  • void f() {
  •     x = 15212;
  • }

41
Multiply Defined Global Symbols
  • /* foo5.c */
  • #include <stdio.h>
  • void f(void);
  • int x = 15213;
  • int y = 15212;
  • int main() {
  •     f();
  •     printf("x = 0x%x y = 0x%x \n",
  •            x, y);
  •     return 0;
  • }
  • /* bar5.c */
  • double x;
  • void f() {
  •     x = -0.0;
  • }
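What makes the foo5.c/bar5.c pair dangerous is that the linker keeps the strong 4-byte int x, while f() was compiled to store an 8-byte double at that address. The sketch below emulates the effect inside a single file, assuming a little-endian machine, 4-byte int, 8-byte IEEE 754 double, and that y is placed directly after x. The struct globals type and emulate_f() are stand-ins invented for this example.

```c
#include <string.h>

struct globals {
    int x;   /* stands in for the strong symbol x from foo5.c */
    int y;   /* assumed to sit directly after x in memory */
};

/* Emulates bar5.c's f(): stores a double at the address of x. */
void emulate_f(struct globals *g)
{
    double d = -0.0;

    if (sizeof *g != sizeof d)
        return;                     /* layout assumption does not hold */
    memcpy(g, &d, sizeof d);        /* 8 bytes: overwrites x and spills into y */
}
```

Under those assumptions, starting from x = 15213 and y = 15212, the call leaves x equal to 0x0 and y equal to 0x80000000, since -0.0 has only its sign bit set.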

42
Relocation
  • Relocation
  • Merge the input modules
  • Assign runtime address to each symbol
  • Two steps
  • Relocating sections and symbol definitions
  • Relocating symbol references within sections

43
Relocation
  • For each reference to an object with unknown
    location
  • Assembler generates a relocation entry
  • Relocation entries for code are placed in
    .rel.text
  • Relocation entries for data are placed in
    .rel.data

44
Relocation
  • Relocation Entry
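  • typedef struct {
  •     int offset;     /* offset of the reference to relocate */
  •     int symbol:24,  /* symbol the reference should point to */
  •         type:8;     /* relocation type */
  • } Elf32_Rel;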

45
Relocation
  • e8 fc ff ff ff    call 7 <main+0x7>    /* swap(); */
  • There is a relocation entry in .rel.text
  • offset symbol type
  • 7 swap R_386_PC32

46
Relocation
  • int *bufp0 = &buf[0];
  • 00000000 <bufp0>:
  • 0: 00 00 00 00
  • There is a relocation entry in .rel.data
  • offset symbol type
  • 0 buf R_386_32

47
Relocation
  • e8 fc ff ff ff    call 7 <main+0x7>    /* swap(); */
  • 7: R_386_PC32 swap    (relocation entry)
  • r.offset = 0x7
  • r.symbol = swap
  • r.type = R_386_PC32
  • ADDR(main) = ADDR(.text) = 0x80483b4
  • ADDR(swap) = 0x80483c8
  • refaddr = ADDR(main) + r.offset = 0x80483bb
  • ADDR(r.symbol) = ADDR(swap) = 0x80483c8
  • *refptr = (unsigned) (ADDR(r.symbol) + *refptr
    - refaddr)
  • = (unsigned) (0x80483c8 + (-4) - 0x80483bb)
  • = (unsigned) 0x9

48
Relocation
  • int *bufp0 = &buf[0];
  • 00000000 <bufp0>:
  • 0: 00 00 00 00
  • 0: R_386_32 buf    (relocation entry)
  • ADDR(r.symbol) = ADDR(buf) = 0x8049454
  • *refptr = (unsigned) (ADDR(r.symbol) + *refptr)
  • = (unsigned) (0x8049454)
  • 0804945c <bufp0>:
  • 804945c: 54 94 04 08

49
Relocation
  • foreach section s {
  •     foreach relocation entry r {
  •         refptr = s + r.offset;  /* ptr to
            reference to be relocated */
  •         /* relocate a PC-relative reference */
  •         if (r.type == R_386_PC32) {
  •             refaddr = ADDR(s) + r.offset;  /*
                ref's run-time address */
  •             *refptr = (unsigned) (ADDR(r.symbol)
                + *refptr - refaddr);
  •         }
  •         /* relocate an absolute reference */
  •         if (r.type == R_386_32)
  •             *refptr = (unsigned) (ADDR(r.symbol)
                + *refptr);
  •     }
  • }
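The pseudocode above can be turned into a small runnable sketch that patches a byte buffer. It handles a single section per call and assumes the symbol's run-time address has already been assigned; struct reloc, relocate(), and the little-endian helpers are names invented for this example, while the numeric values of the two relocation types follow the i386 ELF ABI.

```c
#include <stddef.h>
#include <stdint.h>

enum reloc_type { R_386_32 = 1, R_386_PC32 = 2 };

struct reloc {
    uint32_t offset;        /* r.offset: where in the section the reference sits */
    uint32_t symbol_addr;   /* ADDR(r.symbol), assigned earlier by the linker */
    enum reloc_type type;
};

/* IA32 is little-endian, so 32-bit fields are read and written that way. */
static uint32_t load32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static void store32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Applies every relocation entry to one section whose run-time
   address is section_addr. */
void relocate(uint8_t *section, uint32_t section_addr,
              const struct reloc *rels, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const struct reloc *r = &rels[i];
        uint8_t *refptr = section + r->offset;
        uint32_t addend = load32(refptr);       /* value stored by the assembler */

        if (r->type == R_386_PC32) {
            uint32_t refaddr = section_addr + r->offset;
            store32(refptr, r->symbol_addr + addend - refaddr);
        } else if (r->type == R_386_32) {
            store32(refptr, r->symbol_addr + addend);
        }
    }
}
```

Fed with the call to swap (offset 7, addend -4, section at 0x80483b4, swap at 0x80483c8), the patched field becomes 09 00 00 00; fed with the bufp0 entry (offset 0, buf at 0x8049454), it becomes 54 94 04 08, matching the worked examples.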

50
Relocation
  • 080483b4 <main>:
  • 80483b4: 55                push %ebp
  • 80483b5: 89 e5             mov %esp,%ebp
  • 80483b7: 83 ec 08          sub $0x8,%esp
  • 80483ba: e8 09 00 00 00    call 80483c8 <swap>
  • 80483bf: 31 c0             xor %eax,%eax
  • 80483c1: 89 ec             mov %ebp,%esp
  • 80483c3: 5d                pop %ebp
  • 80483c4: c3                ret
  • 80483c5: 90                nop
  • 80483c6: 90                nop
  • 80483c7: 90                nop

51
Relocation
  • 080483c8 <swap>:
  • 80483c8: 55                      push %ebp
  • 80483c9: 8b 15 5c 94 04 08       mov 0x804945c,%edx    /* get bufp0 */
  • 80483cf: a1 58 94 04 08          mov 0x8049458,%eax    /* get buf[1] */
  • 80483d4: 89 e5                   mov %esp,%ebp
  • 80483d6: c7 05 48 95 04 08 58    movl $0x8049458,0x8049548
  • 80483dd: 94 04 08                /* bufp1 = &buf[1] */
  • 80483e0: 89 ec                   mov %ebp,%esp
  • 80483e2: 8b 0a                   mov (%edx),%ecx
  • 80483e4: 89 02                   mov %eax,(%edx)
  • 80483e6: a1 48 95 04 08          mov 0x8049548,%eax
  • 80483eb: 89 08                   mov %ecx,(%eax)
  • 80483ed: 5d                      pop %ebp
  • 80483ee: c3                      ret

52
Relocation
  • 08049454 <buf>:
  • 8049454: 01 00 00 00 02 00 00 00
  • 0804945c <bufp0>:
  • 804945c: 54 94 04 08

53
ELF object file format
54
Executable Object Files
  • ELF header
  • Overall information
  • Entry point
  • .init section
  • A small function _init
  • Initialization
  • Program header table
  • page size, virtual addresses for memory segments
    (sections), segment sizes.

55
(No Transcript)
56
Loading
  • unix> ./p
  • Loader
  • Memory-resident operating system code
  • Invoked by calling the execve function
  • Copy the code and data in the executable object
    file from disk into memory
  • Jump to the entry point
  • Run the program

57
Loading
  • Startup code
  • Begins at the _start address, defined in crt1.o
  • Same for all C programs
  • 0x080480c0 <_start>:
  • call __libc_init_first
  • call _init
  • call atexit
  • call main
  • call _exit

58
Packaging commonly used functions
  • How to package functions commonly used by
    programmers?
  • math, I/O, memory management, string
    manipulation, etc.

59
Packaging commonly used functions
  • Awkward, given the linker framework so far
  • Option 1: put all functions in a single source
    file
  • programmers link big object file into their
    programs
  • space and time inefficient
  • Option 2: put each function in a separate source
    file
  • programmers explicitly link appropriate binaries
    into their programs
  • more efficient, but burdensome on the programmer

60
Packaging commonly used functions
  • Solution: static libraries (.a archive files)
  • concatenate related relocatable object files into
    a single file with an index (called an archive)
  • enhance linker so that it tries to resolve
    unresolved external references by looking for the
    symbols in one or more archives
  • If an archive member file resolves a reference,
    link it into the executable

61
Static libraries (archives)
62
Creating static libraries
63
Using static libraries
  • E
  • relocatable object files that will be merged to
    form the executable
  • U
  • Unresolved symbols
  • D
  • Symbols that have been defined in previous input
    files
  • Initially all are empty

64
Using static libraries
  • Scan .o files and .a files in the command line
    order.
  • When scanning an object file f,
  • Add f to E
  • Update U and D
  • When scanning an archive file f,
  • Try to resolve the symbols in U against the
    members of f
  • If member m resolves a symbol, m is added to E
  • Update U and D using m

65
Using static libraries
  • If any entries remain in the unresolved set U at
    the end of the scan, report an error
  • Problem
  • command line order matters!
  • Moral: put libraries at the end of the command
    line.
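The scan over E, U, and D can be modelled in a few dozen lines. This is a simplified sketch rather than a description of ld: E is only counted, duplicate definitions are ignored, the sets have small fixed capacities, and archive members are retried until U stops changing, which is one way to read "resolve U". All type and function names are invented for the example.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 16
#define MAX_PER_OBJ 4
#define MAX_MEMBERS 8

struct object {                          /* a .o file or an archive member */
    const char *name;
    const char *defines[MAX_PER_OBJ];    /* unused slots are NULL */
    const char *references[MAX_PER_OBJ]; /* unused slots are NULL */
};

struct input {                           /* one command-line argument */
    int is_archive;                      /* 0: single .o file, 1: .a archive */
    const struct object *members;
    size_t n_members;                    /* 1 for a .o file */
};

struct symset { const char *syms[MAX_SYMS]; size_t n; };

static int set_has(const struct symset *s, const char *sym)
{
    for (size_t i = 0; i < s->n; i++)
        if (strcmp(s->syms[i], sym) == 0) return 1;
    return 0;
}

static void set_add(struct symset *s, const char *sym)
{
    if (!set_has(s, sym) && s->n < MAX_SYMS) s->syms[s->n++] = sym;
}

static void set_del(struct symset *s, const char *sym)
{
    for (size_t i = 0; i < s->n; i++)
        if (strcmp(s->syms[i], sym) == 0) { s->syms[i] = s->syms[--s->n]; return; }
}

/* Add o to E (only counted here) and update U and D. */
static void take(const struct object *o, struct symset *U, struct symset *D,
                 size_t *e_count)
{
    (*e_count)++;
    for (size_t i = 0; i < MAX_PER_OBJ && o->defines[i]; i++) {
        set_add(D, o->defines[i]);
        set_del(U, o->defines[i]);
    }
    for (size_t i = 0; i < MAX_PER_OBJ && o->references[i]; i++)
        if (!set_has(D, o->references[i])) set_add(U, o->references[i]);
}

static int resolves(const struct object *m, const struct symset *U)
{
    for (size_t i = 0; i < MAX_PER_OBJ && m->defines[i]; i++)
        if (set_has(U, m->defines[i])) return 1;
    return 0;
}

/* Scans inputs left to right; returns how many symbols remain in U. */
size_t scan_inputs(const struct input *in, size_t n, size_t *e_count)
{
    struct symset U = {{0}, 0}, D = {{0}, 0};

    *e_count = 0;
    for (size_t i = 0; i < n; i++) {
        char taken[MAX_MEMBERS] = {0};
        int changed = 1;

        if (!in[i].is_archive) {
            take(&in[i].members[0], &U, &D, e_count);
            continue;
        }
        while (changed) {                /* retry members until U stops changing */
            changed = 0;
            for (size_t j = 0; j < in[i].n_members && j < MAX_MEMBERS; j++) {
                if (!taken[j] && resolves(&in[i].members[j], &U)) {
                    take(&in[i].members[j], &U, &D, e_count);
                    taken[j] = 1;
                    changed = 1;
                }
            }
        }
    }
    return U.n;
}
```

With an object file that references addvec followed by an archive that defines it, the scan ends with U empty; with the two inputs swapped, the archive is scanned while U is still empty, nothing is pulled in, and addvec stays unresolved.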

66
Shared libraries
  • Static libraries have the following
    disadvantages
  • potential for duplicating lots of common code in
    the executable files on a file system.
  • e.g., every C program needs the standard C
    library
  • potential for duplicating lots of code in the
    virtual memory space of many processes.
  • minor bug fixes of system libraries require each
    application to explicitly relink

67
Shared libraries: solution
  • shared libraries (dynamic link libraries, DLLs)
    whose members are
  • dynamically loaded into memory and
  • linked into an application at run-time

68
Shared libraries: solution
  • dynamic linking can occur when executable is
    first loaded and run.
  • common case for Linux, handled automatically by
    ld-linux.so
  • dynamic linking can also occur after program has
    begun.
  • in Linux, this is done explicitly by user with
    dlopen()
  • shared library routines can be shared by multiple
    processes.

69
Dynamic Linking
  • unix> gcc -shared -fPIC -o libvector.so addvec.c
    multvec.c
  • unix> gcc -o p2 main2.c ./libvector.so
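The deck does not show the library sources. As a rough idea of what addvec.c might contain, the sketch below implements an element-wise vector add with the four-argument signature that the later dlopen example calls through a function pointer. The body is an assumption; only the argument types are implied by the deck.

```c
/* addvec.c: hypothetical contents of one source file behind libvector.so */
void addvec(int *x, int *y, int *z, int n)
{
    int i;

    /* z = x + y, element by element */
    for (i = 0; i < n; i++)
        z[i] = x[i] + y[i];
}
```

Nothing in the source marks the function as shared; the -shared and -fPIC flags on the gcc command line are what produce position-independent code packaged as a shared object.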

70
Dynamically linked shared libraries
71
Dynamic Linking
  • #include <dlfcn.h>
  • void *dlopen(const char *filename, int flag);
  • returns ptr to handle if OK, NULL on error
  • void *dlsym(void *handle, char *symbol);
  • returns ptr to symbol if OK, NULL on error
  • int dlclose(void *handle);
  • returns 0 if OK, -1 on error
  • const char *dlerror(void);
  • returns error msg if previous call to
  • dlopen, dlsym, or dlclose failed,
  • NULL if previous call was OK

72
  • #include <stdio.h>
  • #include <stdlib.h>   /* for exit() */
  • #include <dlfcn.h>
  • int x[2] = {1, 2};
  • int y[2] = {3, 4};
  • int z[2];
  • int main() {
  •     void *handle;
  •     void (*addvec)(int *, int *, int *, int);
  •     char *error;
  •     /* dynamically load the shared library that
        contains addvec() */
  •     handle = dlopen("./libvector.so", RTLD_LAZY);
  •     if (!handle) {
  •         fprintf(stderr, "%s\n", dlerror());
  •         exit(1);
  •     }
73
  •     /* get a pointer to the addvec() function we
        just loaded */
  •     addvec = dlsym(handle, "addvec");
  •     if ((error = dlerror()) != NULL) {
  •         fprintf(stderr, "%s\n", error);
  •         exit(1);
  •     }
  •     /* Now we can call addvec() just like any other
        function */
  •     addvec(x, y, z, 2);
  •     printf("z = [%d, %d]\n", z[0], z[1]);
  •     /* unload the shared library */
  •     if (dlclose(handle) < 0) {
  •         fprintf(stderr, "%s\n", dlerror());
  •         exit(1);
  •     }
  •     return 0;
  • }