Title: Interfacing with ELF files
1Interfacing with ELF files
- An introduction to the Executable and Linkable
Format (ELF) binary file specification standard
2Overview of source translation
[Figure: the source-translation flow. User-created files (Makefile,
C/C++ source and header files, assembly source files, linker script
file) are processed by the make utility, preprocessor, compiler, and
assembler to produce object files. The archive utility collects
object files into library files. The linker and locator combines
object files and library files into a shared object file, a linkable
image file, or an executable image file, and also emits a link map
file.]
3Executable versus Linkable
Linkable File:
ELF Header
Program-Header Table (optional)
Section 1 Data
Section 2 Data
Section 3 Data
...
Section n Data
Section-Header Table

Executable File:
ELF Header
Program-Header Table
Segment 1 Data
Segment 2 Data
Segment 3 Data
...
Segment n Data
Section-Header Table (optional)
4Role of the Linker
[Figure: the linker combines the sections of several linkable files
(each one: ELF Header, Section 1 Data ... Section n Data,
Section-Header Table) into the segments of a single executable file
(ELF Header, Program-Header Table, Segment 1 Data ... Segment n Data).]
5ELF Header
e_ident[EI_NIDENT]
e_type
e_machine
e_version
e_entry
e_phoff
e_shoff
e_flags
e_ehsize
e_phentsize
e_phnum
e_shentsize
e_shnum
e_shstrndx
Fields that locate the Section-Header Table: e_shoff, e_shentsize,
e_shnum, e_shstrndx
Fields that locate the Program-Header Table: e_phoff, e_phentsize,
e_phnum
Program entry-point: e_entry
NOTE: The sizes of these fields, and their
arrangement, are slightly different for the ELF64
files that are produced by default on our x86_64
Linux workstations.
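As a rough illustration of the ELF32 header layout listed above, the following sketch declares the fields in order and checks whether a memory image begins with a 32-bit linkable (relocatable) header. The type name `Elf32Header` and the helper `is_elf32_linkable` are hypothetical names chosen for this example, and a little-endian host is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EI_NIDENT 16

/* ELF32 file header, field-for-field as listed on the slide. */
typedef struct {
    unsigned char e_ident[EI_NIDENT]; /* magic, class, data encoding, ... */
    uint16_t e_type;      /* 1 = relocatable (linkable), 2 = executable */
    uint16_t e_machine;   /* 3 = Intel 80386 */
    uint32_t e_version;
    uint32_t e_entry;     /* address of the entry-point */
    uint32_t e_phoff;     /* file offset of the Program-Header Table */
    uint32_t e_shoff;     /* file offset of the Section-Header Table */
    uint32_t e_flags;
    uint16_t e_ehsize;    /* size of this header (52 for ELF32) */
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;  /* index of the section-name string table */
} Elf32Header;

/* Hypothetical helper: returns 1 if the image begins with a 32-bit
 * relocatable (linkable) ELF header, else 0. Assumes a little-endian host. */
int is_elf32_linkable(const unsigned char *image, size_t length)
{
    Elf32Header hdr;
    assert(sizeof hdr == 52);               /* no padding expected */
    if (length < sizeof hdr) return 0;
    memcpy(&hdr, image, sizeof hdr);        /* avoids misaligned access */
    if (memcmp(hdr.e_ident, "\177ELF", 4) != 0) return 0;
    if (hdr.e_ident[4] != 1) return 0;      /* EI_CLASS: 1 = ELFCLASS32 */
    return hdr.e_type == 1;                 /* ET_REL */
}
```

On a Linux system the same layout is available as `Elf32_Ehdr` in `<elf.h>`; it is spelled out here only so the example does not depend on that header.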
6Section-Headers
sh_name
sh_type
sh_flags
sh_addr
sh_offset
sh_size
sh_link
sh_info
sh_addralign
sh_entsize
NOTE: These are for the ELF32 file-format.
7Program-Headers
p_type
p_offset
p_vaddr
p_paddr
p_filesz
p_memsz
p_flags
p_align
NOTE: These are for the ELF32 file-format.
8Official ELF documentation
- The official document that describes ELF
file-formats for both the linkable and the
executable files is available online on our
CS630 course website (see Resources)
- (Be aware that this document has been revised to
accommodate programs that will be run on
platforms which implement 64-bit addresses and
processor registers)
9Memory Physical vs. Virtual
Portions of physical memory are mapped by the
CPU into regions of each task's virtual
address-space
Virtual Address Spaces (4 GB)
Physical address space (4 GB)
10Linux Executable ELF files
- An Executable ELF32 file produced by the Linux
linker is configured to execute in a private
virtual address space, whereby every program
gets loaded at the identical virtual
memory-address (i.e., 0x08048000)
- We will soon study the x86 CPU's paging
mechanism which makes this possible (i.e., after
we have finished Project 1)
11Linux Linkable ELF files
- It is possible that some linkable ELF files are
self-contained (i.e., they may not need to be
linked with any other object-files, or with any
shared libraries)
- Our manydots.o is one such example
- So we can write our own system-code that can
execute the instructions contained in a
stand-alone linkable object-module, using the
CPU's segmented physical memory
12Our loadmap.cpp utility
- We created a tool that parses a linkable ELF
file, to identify each section's length, type,
and location within the object-module
- For those sections containing the text and
data for the program, we build
segment-descriptors, based on where the linkable
image-file will reside in physical memory
- Then we jump to the _start entry-point
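The parsing step described above might be sketched as follows. This is not the actual loadmap.cpp source; `get_section_header` and `section_base` are hypothetical helpers, and the fixed offsets 32, 46, and 48 are the positions of e_shoff, e_shentsize, and e_shnum within a 52-byte ELF32 header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ELF32 section-header entry (40 bytes), fields as on the slide. */
typedef struct {
    uint32_t sh_name, sh_type, sh_flags, sh_addr, sh_offset,
             sh_size, sh_link, sh_info, sh_addralign, sh_entsize;
} Elf32SectionHeader;

/* Read a little-endian value of 2 or 4 bytes from the image. */
static uint32_t read_le(const unsigned char *p, int nbytes)
{
    uint32_t v = 0;
    for (int i = nbytes - 1; i >= 0; i--) v = (v << 8) | p[i];
    return v;
}

/* Copy section-header number `index` out of a linkable ELF32 image.
 * Returns 0 on success, -1 if the index or the table is out of range. */
int get_section_header(const unsigned char *image, size_t length,
                       unsigned index, Elf32SectionHeader *out)
{
    assert(image != NULL && out != NULL);
    if (length < 52) return -1;
    uint32_t e_shoff     = read_le(image + 32, 4);
    uint32_t e_shentsize = read_le(image + 46, 2);
    uint32_t e_shnum     = read_le(image + 48, 2);
    if (index >= e_shnum || e_shentsize < 40) return -1;
    size_t where = (size_t)e_shoff + (size_t)index * e_shentsize;
    if (where + 40 > length) return -1;
    const unsigned char *p = image + where;
    out->sh_name      = read_le(p +  0, 4);
    out->sh_type      = read_le(p +  4, 4);
    out->sh_flags     = read_le(p +  8, 4);
    out->sh_addr      = read_le(p + 12, 4);
    out->sh_offset    = read_le(p + 16, 4);
    out->sh_size      = read_le(p + 20, 4);
    out->sh_link      = read_le(p + 24, 4);
    out->sh_info      = read_le(p + 28, 4);
    out->sh_addralign = read_le(p + 32, 4);
    out->sh_entsize   = read_le(p + 36, 4);
    return 0;
}

/* Segment base for a section, given where the whole file sits in memory. */
uint32_t section_base(uint32_t load_address, const Elf32SectionHeader *sh)
{
    return load_address + sh->sh_offset;
}
```

For a section at file-offset 0x34 in an image loaded at 0x18000, `section_base` yields 0x18034, and the matching segment-limit would be `sh_size - 1`.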
13 32-bit versus 16-bit code
- Linux's compilers, and the 'as' assembler, can
produce object-files that are intended to reside
in 32-bit memory-segments (i.e., the D-bit in a
code-segment descriptor is set to 1)
- This affects the CPU's interpretation of all the
machine-instructions it subsequently fetches
- Our 'as' assembler can produce both 16-bit and
32-bit code (although its default is 64-bit code)
- We employ .code32 or .code16 directives
14Example as Listing
- .code32
- 0x0000 01 D8 add eax, ebx
- 0x0002 66 01 D8 add ax, bx
- 0x0005 90 nop
-
- .code16
- 0x0006 66 01 D8 add eax, ebx
- 0x0009 01 D8 add ax, bx
- 0x000B 90 nop
- .end
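The pattern in the listing above (the same two opcode bytes, with a 0x66 prefix whenever the operand size differs from the code-segment's default) can be sketched in C. The function `encode_add` is a hypothetical illustration that handles only this one register-to-register add; it is not a general encoder.

```c
#include <assert.h>
#include <stddef.h>

/* Encode the add of EBX/BX into EAX/AX (opcode 0x01, ModRM 0xD8).
 * code_bits    : default size of the code-segment (16 or 32, per its D-bit)
 * operand_bits : size of the registers named in the instruction (16 or 32)
 * Returns the number of bytes written to out[] (2 or 3). */
size_t encode_add(int code_bits, int operand_bits, unsigned char out[3])
{
    assert(code_bits == 16 || code_bits == 32);
    assert(operand_bits == 16 || operand_bits == 32);
    size_t n = 0;
    if (operand_bits != code_bits)
        out[n++] = 0x66;   /* operand-size override prefix */
    out[n++] = 0x01;       /* ADD r/m, r */
    out[n++] = 0xD8;       /* ModRM: mod=11, reg=EBX/BX, r/m=EAX/AX */
    return n;
}
```

Calling `encode_add(32, 16, buf)` produces 66 01 D8, matching the second line of the .code32 half of the listing, while `encode_add(16, 16, buf)` produces 01 D8, matching the .code16 half.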
15Demo-program
- We created a Linux program (linuxapp.s) that
invokes two system-calls (write and exit)
- We assembled it with the 'as' assembler:
as --32 linuxapp.s -o linuxapp.o
- This linkable ELF object-file linuxapp.o should
then be written to our hard-disk partition
(/dev/sda4) at sector 65, using the 'dd' utility:
dd if=linuxapp.o of=/dev/sda4 seek=65
- So it will get loaded into memory by cs630ipl
16Memory-Map
Both tryelf32.b and linuxapp.o will get
loaded into RAM from sectors 1..127 of
the hard disk's CS630 partition by our cs630ipl.b
program-loader; cs630ipl.b itself is read from
the CS630 disk-partition via the ROM-BIOS bootstrap

0x00018000  linuxapp.o image
0x00010000  tryelf32.b image
0x00007C00  BOOT-LOADER (cs630ipl.b)
0x00000400  ROM-BIOS DATA
0x00000000  IVT
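The two image addresses in the memory-map are consistent with a contiguous load of 512-byte sectors in which sector 1 lands at 0x00010000. The sketch below works through that arithmetic; `sector_load_address` is a hypothetical helper, and the contiguous-load behavior of cs630ipl.b is an assumption inferred from the map, not taken from its source.

```c
#include <assert.h>
#include <stdint.h>

enum { SECTOR_SIZE = 512, FIRST_SECTOR = 1, LAST_SECTOR = 127 };
#define LOAD_BASE 0x00010000u   /* where sector 1 lands, per the memory-map */

/* Physical address at which a given sector of the partition ends up,
 * assuming sectors 1..127 are copied contiguously starting at LOAD_BASE. */
uint32_t sector_load_address(unsigned sector)
{
    assert(sector >= FIRST_SECTOR && sector <= LAST_SECTOR);
    return LOAD_BASE + (uint32_t)(sector - FIRST_SECTOR) * SECTOR_SIZE;
}
```

Sector 65 therefore maps to 0x00010000 + 64 * 512 = 0x00018000, which is where the map places the linuxapp.o image.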
17Segment Descriptors
- We created 32-bit segment-descriptors for the
text and data sections of linuxapp.o (in a
Local Descriptor Table) with DPL=3
- For the .text section: offset in ELF file =
0x34, size = 0x24
- So its segment-descriptor is:
- .word 0x0023, 0x8034, 0xFA01, 0x0040
- (base-address = load-address + file-offset)
18Descriptors (continued)
- For the .data section:
- offset in ELF file = 0x58, size = 0x16
- So its segment-descriptor is:
- .word 0x0015, 0x8058, 0xF201, 0x0040
- (base-address = load-address + file-offset)
- For our ring3 stack (not part of ELF file):
- .word 0x0000, 0x0000, 0xF602, 0x00C0
- Note: It's an expand-down data-segment!
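To show how the four .word values of each descriptor are derived from a base-address, a limit, and the access and flag bits, here is a small packing routine. `make_descriptor` is a hypothetical helper written for this illustration; the bit layout follows the standard x86 segment-descriptor format.

```c
#include <assert.h>
#include <stdint.h>

/* Pack an 8-byte x86 segment-descriptor into four 16-bit words.
 *   access : P, DPL, S and type bits (e.g. 0xFA = present, DPL=3, code)
 *   flags  : high nibble of byte 6 (0x4 = 32-bit default size,
 *            0x8 = 4KB granularity) */
void make_descriptor(uint32_t base, uint32_t limit, uint8_t access,
                     uint8_t flags, uint16_t w[4])
{
    assert(limit <= 0xFFFFF);   /* the limit field is only 20 bits wide */
    assert(flags <= 0xF);
    w[0] = (uint16_t)(limit & 0xFFFF);                          /* limit 15..0 */
    w[1] = (uint16_t)(base & 0xFFFF);                           /* base 15..0  */
    w[2] = (uint16_t)((access << 8) | ((base >> 16) & 0xFF));   /* access, base 23..16 */
    w[3] = (uint16_t)(((base >> 24) << 8) | (flags << 4)
                      | ((limit >> 16) & 0xF));                 /* base 31..24, flags, limit 19..16 */
}
```

With a load-address of 0x18000, the .text section (file-offset 0x34, size 0x24) gives base 0x18034 and limit 0x23, so `make_descriptor(0x18034, 0x23, 0xFA, 0x4, w)` yields 0x0023, 0x8034, 0xFA01, 0x0040, the same words shown for the .text descriptor.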
19Expand-Down segments
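[Figure: two segments, each drawn with its base-address and segment
limit marked. In a Normal Expand-Up Data-Segment the valid offsets run
from the base-address up to the segment limit; in a Special
Expand-Down Data-Segment the valid offsets lie above the segment
limit.]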
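The difference between the two segment kinds comes down to which side of the limit is accessible. A minimal sketch of that check follows; `offset_is_valid` is a hypothetical helper, and a 32-bit (B=1) segment is assumed, so an expand-down segment extends up to offset 0xFFFFFFFF.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a byte at `offset` lies inside the segment, else 0.
 * `limit` is the effective byte-limit (after any 4KB-granularity scaling). */
int offset_is_valid(uint32_t offset, uint32_t limit, int expand_down)
{
    assert(expand_down == 0 || expand_down == 1);
    if (expand_down)
        return offset > limit;    /* valid range: limit+1 .. 0xFFFFFFFF */
    return offset <= limit;       /* valid range: 0 .. limit */
}
```

For example, an expand-down segment whose limit field is 0 with 4KB granularity has an effective limit of 0xFFF, so offset 0x1000 is valid but offset 0xFFF is not.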
20Task-State Segment
- Because any system-calls (via int 0x80) will
cause privilege-level transitions, we will need
to set up a Task-State Segment (to store a ring0
stack-pointer SS0:ESP0)
- theTSS: .long 0, 0, 0 # 3 longwords
- Its segment-descriptor goes into our GDT:
- .word 0x000B, theTSS, 0x8901, 0x0000
21Transition to Ring 3
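- Recall that we use lret to enter ring-3:
- pushw $userSS # selector for the ring3 stack-segment
- pushw $0 # offset for the ring3 stack-pointer
- pushw $userCS # selector for the ring3 code-segment
- pushw $0 # offset for the ring3 entry-point
- lret
- (NOTE: This assumes we are coming from a 16-bit
code-segment in protected-mode)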
22System-Call Dispatcher
- All system-calls get vectored through our IDT's
interrupt-gate number 0x80
- For linuxapp.o we only need to implement two
system-calls: exit and write
- But to simplify future enhancements, we use a
jump-table anyway (although for now it has a
few dummy entries, which can easily be
modified later on)
23System-Call ID-numbers
- System-call ID 0 (it will never be needed)
- System-call ID 1 is for exit (required)
- System-call ID 2 is for fork (deferred)
- System-call ID 3 is for read (deferred)
- System-call ID 4 is for write (required)
- System-call ID 5 is for open (deferred)
- System-call ID 6 is for close (deferred)
- (NOTE: over 300 system-calls exist in Linux)
24Defining our jump-table
- sys_call_table:
- .long do_nothing # for service 0
- .long do_exit # for service 1
- .long do_nothing # for service 2
- .long do_nothing # for service 3
- .long do_write # for service 4
- .equ NR_SYS_CALLS, ( . - sys_call_table)/4
25Setting up IDT Gate 0x80
- The Descriptor Privilege Level must be 3
- The Gate-Type should be 386 Trap-Gate
- The entry-point will be our isrSVC label
-
- # Interrupt Descriptor Table's entry for SuperVisor Call (int $0x80)
- mov $0x80, %ebx # table-entry array-index
- lea theIDT(, %ebx, 8), %di # descriptor offset-address
- movw $isrSVC, %ss:0(%di) # entry-point offset's loword
- movw $privCS, %ss:2(%di) # selector for code-segment
- movw $0xEF00, %ss:4(%di) # Gate-Type 386 Trap-Gate
- movw $0x0000, %ss:6(%di) # entry-point offset's hiword
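The value 0xEF00 stored in the gate's third word combines the Present bit, DPL=3, and the 386 Trap-Gate type. The sketch below builds all four words of a gate-descriptor to make that composition explicit; `make_gate` is a hypothetical helper, and the entry-point and selector values in the usage note are placeholders rather than values from tryelf32.s.

```c
#include <assert.h>
#include <stdint.h>

/* Build the four 16-bit words of an IDT gate-descriptor.
 *   type : 0xE = 386 interrupt-gate, 0xF = 386 trap-gate
 *   dpl  : least-privileged ring allowed to invoke the gate via `int` */
void make_gate(uint32_t entry, uint16_t selector, unsigned dpl,
               unsigned type, uint16_t w[4])
{
    assert(dpl <= 3 && type <= 0xF);
    w[0] = (uint16_t)(entry & 0xFFFF);                       /* entry-point loword */
    w[1] = selector;                                         /* code-segment selector */
    w[2] = (uint16_t)(0x8000 | (dpl << 13) | (type << 8));   /* P=1, DPL, type */
    w[3] = (uint16_t)(entry >> 16);                          /* entry-point hiword */
}
```

For instance, `make_gate(0x1234, 0x0008, 3, 0xF, w)` yields the words 0x1234, 0x0008, 0xEF00, 0x0000. With DPL below 3, an `int 0x80` executed in ring 3 would raise a General Protection exception instead of reaching the handler.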
26Using our jump-table
- isrSVC: # service-number is found in EAX
- cmp $NR_SYS_CALLS, %eax
- jb idok
- xor %eax, %eax
- idok: jmp *sys_call_table(, %eax, 4)
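The same bounds-check-then-index logic can be expressed in C with an array of function pointers. This is only an analogy to the assembly dispatcher above; the service bodies are placeholders, and the convention that `do_nothing` returns -1 is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*service_t)(int arg);

static int do_nothing(int arg) { (void)arg; return -1; }  /* dummy entry */
static int do_exit(int arg)    { (void)arg; return 0; }   /* placeholder body */
static int do_write(int arg)   { return arg; }            /* placeholder body */

/* One entry per system-call ID, mirroring the jump-table slide. */
static const service_t sys_call_table[] = {
    do_nothing,   /* service 0 */
    do_exit,      /* service 1 */
    do_nothing,   /* service 2 */
    do_nothing,   /* service 3 */
    do_write      /* service 4 */
};
enum { NR_SYS_CALLS = sizeof sys_call_table / sizeof sys_call_table[0] };

/* Out-of-range IDs are redirected to service 0, as in the cmp/jb/xor code. */
int dispatch(unsigned id, int arg)
{
    if (id >= NR_SYS_CALLS)
        id = 0;
    assert(sys_call_table[id] != NULL);
    return sys_call_table[id](arg);
}
```

Because the table size is computed from the array itself, adding a sixth entry later changes `NR_SYS_CALLS` automatically, just as the `.equ` expression does in the assembly version.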
27Our exit service
- When the application invokes the exit
system-call, our mini operating system should
leave protected-mode and return back to our
boot-loader program
- The exit-code parameter (in ebx) may just as
well be discarded (since this isn't yet a
multitasking operating-system)
28Our write service
- We only implement writing to the STDOUT device
(i.e., the video display console)
- For most characters in the user's buffer, we just
write the ASCII-code (and standard
display-attribute) directly to video memory at
the current cursor-location and advance the
cursor (scrolling the screen if needed)
- Special ASCII control-codes (\n, \r, \b)
are treated differently, as on a TTY device
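The write service's character handling could look roughly like the sketch below, which writes into an ordinary array standing in for video memory so it can run anywhere. The 80x25 geometry, the attribute byte 0x07, and the choice to make '\n' also return the cursor to column 0 are assumptions for this illustration, not details taken from tryelf32.s.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { COLS = 80, ROWS = 25, ATTR = 0x07 };   /* assumed text-mode geometry */

typedef struct {
    uint16_t cell[ROWS * COLS];   /* stand-in for video memory */
    int row, col;                 /* current cursor-location */
} Console;

/* If the cursor has moved past the last row, shift everything up one line. */
static void scroll_if_needed(Console *c)
{
    if (c->row < ROWS) return;
    memmove(c->cell, c->cell + COLS, (ROWS - 1) * COLS * sizeof c->cell[0]);
    for (int i = 0; i < COLS; i++)
        c->cell[(ROWS - 1) * COLS + i] = (uint16_t)((ATTR << 8) | ' ');
    c->row = ROWS - 1;
}

/* Write `len` bytes of `buf` to the console; returns the count written. */
int console_write(Console *c, const char *buf, int len)
{
    assert(c != NULL && buf != NULL && len >= 0);
    for (int i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)buf[i];
        switch (ch) {
        case '\n': c->row++; c->col = 0; break;       /* assumed: newline also returns */
        case '\r': c->col = 0; break;                 /* carriage-return */
        case '\b': if (c->col > 0) c->col--; break;   /* backspace */
        default:
            /* character in the low byte, display-attribute in the high byte */
            c->cell[c->row * COLS + c->col] = (uint16_t)((ATTR << 8) | ch);
            if (++c->col == COLS) { c->col = 0; c->row++; }
        }
        scroll_if_needed(c);
    }
    return len;
}
```

In the real service the array would be replaced by the text-mode frame buffer, and the hardware cursor would also need updating after each call.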
29In-Class Exercise
- The manydots.s demo (to be used with Project
1) uses the read system-call (in addition to
the write and exit services)
- However, you could still execute it, using our
tryelf32.s mini operating-system, by letting
the read service simply do nothing (or return
with some kind of hard-coded buffer-contents)
- You just need to modify the LDT descriptors so
they'll conform to ELF sections in manydots.o