Translating Code - PowerPoint PPT Presentation

About This Presentation
Title:

Translating Code

Slides: 55
Provided by: ise2
Category:
Tags: code | translating

Transcript and Presenter's Notes

1
Assembly Language
  • Chapter 7

2
Translating Code
  • Translation
  • Given source, produce target: Source -> Target.
  • Done when no processor can execute the source
    directly.
  • Results in generating an executable.
  • Can optimize and check semantics.
  • Examples: C, C++, Pascal.
  • Interpretation
  • Read each line and execute it.
  • Done when no processor can execute the source
    directly.
  • No executable generated.
  • Not much optimization possible, little checking.
  • Example: Lisp.

3
Assemblers and Compilers
  • Compiler
  • Source: High-level language (C, C++).
  • Target: Machine language, or its symbolic
    representation.
  • Assembler
  • Source: Assembly language, a symbolic
    representation of machine language.
  • Target: Machine language.

4
Assembly Language
  • Symbolic representation of a numeric machine
    language.
  • Why?
  • Easy to read and understand.
  • Easy to remember: ADD, MV rather than their
    hexadecimal numbers.
  • Performance: A good assembly language program can
    often beat an optimizing compiler.
  • Access to machine: A high-level language does not
    have complete access to the underlying machine.
  • One line of assembly corresponds to one machine
    instruction.

5
Sometimes only Assembly
  • Examples with limited resources
  • Smart cards
  • Embedded systems, cell phones, pagers, etc.
  • Characteristics
  • Limited power, memory, CPU cycles, etc.
  • No paging; code must fit in RAM.
  • No operating system either.

6
Performance
  • 10% of the code is executed 90% of the time. This
    code is called critical because it is the
    bottleneck in improving performance.
  • Rewrite critical parts in assembly and
    hand-optimize them; this is called tuning.
  • There are examples of tuning making programs
    20% to 50% faster with about 1/3 to 1/5 of the
    original code length.
  • Downside: Tedious and time-consuming.

7
Some Statistics

8
Format of Assembler Statements
  • Statements above the blank line compute
    N = I + J.
  • Statements below the blank line reserve memory.
  • Parts: Label, Opcode, Operands, Comments.

9
Labels
  • Symbolic names for memory addresses
  • Needed for branching.
  • Needed to access data storage.
  • Usually start in the first column and have a
    limited width (say, 8 columns).
  • Colons after labels
  • Motorola does not require them.
  • SPARC requires them.
  • Intel requires them for code labels, but not for
    data labels.

10
Opcode Field
  • Two kinds
  • Opcode
  • Command to the assembler (assembler directive)
  • Opcode
  • Symbolic representation of a machine opcode,
    e.g., MOV, LD.
  • Motorola has MOVE for memory <-> register
    movement of data.
  • SPARC uses LD (load) and ST (store).
  • Some instructions need more than one line.

11
Instructions Needing Two Lines
  • Example: SPARC has 32-bit or 44-bit addresses.
  • Instructions hold at most 22 bits of immediate
    data. How is the full address provided?
  • SETHI %HI(I),%R1
  • Zeros the upper 32 bits and lower 10 bits of the
    64-bit register R1 and places the upper 22 bits
    of the address of I in bits 10 to 31.
  • LD [%R1+%LO(I)],%R1
  • Adds R1 and the low-order 10 bits of the address
    of I (forming the address of I) and fetches that
    word from memory into R1.
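
The address split used by the two-instruction sequence above can be checked with a little arithmetic. The sketch below is Python rather than SPARC assembly. The helper names hi, lo, and sethi and the sample address are assumptions made for illustration, and only a 32-bit address held in a 64-bit register is modelled.

```python
def hi(address: int) -> int:
    """Upper 22 bits of a 32-bit address (the immediate carried by SETHI)."""
    return (address >> 10) & 0x3FFFFF


def lo(address: int) -> int:
    """Low-order 10 bits of the address (the offset used by LD)."""
    return address & 0x3FF


def sethi(imm22: int) -> int:
    """Put 22 bits into bits 10..31; bits 0..9 and the upper 32 bits stay zero."""
    return (imm22 & 0x3FFFFF) << 10


address_of_i = 0x12345678            # hypothetical address of variable I
r1 = sethi(hi(address_of_i))         # effect of SETHI on R1
effective = r1 + lo(address_of_i)    # address formed by the LD instruction
assert effective == address_of_i
```

Because the two fields do not overlap, adding them back together always reproduces the original 32-bit address.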

12
Addressing Granularity
  • Operands can be bytes, words, or long words. How
    is the granularity indicated?
  • Option 1: Use different register names.
  • EAX to move 32-bit items.
  • AX to move 16-bit items.
  • Option 2: Use opcode suffixes indicating length.
  • MOVE.L for long words.
  • MOVE.W for words.

13
Pseudoinstructions
  • Assembler directives = pseudoinstructions.
  • Directives for the assembler, not instructions.
  • Examples
  • SEGMENT starts a new segment.
  • EQU for symbolic expressions, e.g., BASE EQU 10;
    now BASE can be used instead of 10.
  • Storage allocation, e.g., TABLE DB 11, 23, 49
  • Allocates space for 3 bytes, initializes them to
    11, 23, 49 and sets TABLE to the address of the
    byte holding 11.
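
As a rough illustration of what the assembler does with these two directives, here is a minimal Python sketch. It is not a real assembler: the function name assemble_data, the whitespace-separated statement format, and the choice to start data at address 0 are all assumptions made for this example.

```python
def assemble_data(lines):
    """Handle two directives only: 'NAME EQU value' and 'NAME DB v1, v2, ...'."""
    symbols = {}            # symbol -> constant value (EQU) or address (DB)
    memory = bytearray()    # bytes reserved by DB, starting at address 0
    for line in lines:
        name, directive, rest = line.split(None, 2)
        if directive == "EQU":
            symbols[name] = int(rest)          # a name for a value, no storage
        elif directive == "DB":
            symbols[name] = len(memory)        # address of the first byte
            memory.extend(int(v) for v in rest.split(","))
        else:
            raise ValueError(f"unsupported directive: {directive}")
    return symbols, bytes(memory)


symbols, memory = assemble_data(["BASE EQU 10", "TABLE DB 11, 23, 49"])
```

EQU only records a value in the symbol table, while DB both reserves initialized bytes and binds the label to the address of the first one.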

14
Controlling Visibility of Symbols
  • Programs reside in many files. Need to refer to
    symbols in other files.
  • PUBLIC: Allows other files to refer to this
    definition.
  • EXTERN: Tells the assembler that this symbol is
    defined in another file.
  • INCLUDE: Inserts the entire contents of another
    file into this file.

15
Some Assembly Directives

16
Conditional Assembly
  • WORDSIZE EQU 16
  • IF WORDSIZE GT 16
  • WSIZE DW 32
  • ELSE
  • WSIZE DW 16
  • ENDIF
  • Can maintain one source for many machines
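
The effect of the conditional block above can be imitated with a small filter. This is a minimal Python sketch under stated assumptions: the function name conditional_assemble is hypothetical, only the GT comparison is supported, and conditionals may not be nested.

```python
def conditional_assemble(lines):
    """Return the statements that survive IF/ELSE/ENDIF selection."""
    symbols = {}        # values defined by EQU
    output = []         # statements passed on to the rest of the assembler
    emitting = True     # False while inside the branch that is not selected
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "IF":                  # form: IF <symbol> GT <number>
            emitting = symbols[tokens[1]] > int(tokens[3])
        elif tokens[0] == "ELSE":
            emitting = not emitting
        elif tokens[0] == "ENDIF":
            emitting = True
        elif emitting:
            if len(tokens) == 3 and tokens[1] == "EQU":
                symbols[tokens[0]] = int(tokens[2])
            else:
                output.append(line)
    return output


source = ["WORDSIZE EQU 16", "IF WORDSIZE GT 16", "WSIZE DW 32",
          "ELSE", "WSIZE DW 16", "ENDIF"]
selected = conditional_assemble(source)
```

Changing only the EQU line selects the other branch, which is how one source file can serve several machines.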

17
Macros
  • Programs often need the same sequence of
    instructions in several places.
  • Three ways
  • Write them all over again
  • Laborious and error-prone.
  • Write a procedure and call it when needed
  • Good for long sequences, but the call overhead
    can significantly slow down the code when the
    sequence is only a few lines long.
  • Macro definition
  • Give a name to a piece of text, possibly with
    some parameters.
  • Use it by stating the name, possibly with actual
    values for the parameters.

18
Example Macro Definition

19
More on Macros
  • Macro call: Using a macro name as an opcode.
  • Macro expansion: Replacing the name with the
    body.
  • Macro expansion happens during the assembly
    process, NOT during execution. Hence no stack is
    used.
  • The processor executes the same code with or
    without the macro. It is NOT a procedure call.

20
Comparison of Macros and Procedures

21
Macros with Parameters

22
Advanced Features of Macros
  • Macros within Macros
  • M1 MACRO
  • IF WORDSIZE GT 16
  • M2 MACRO
  • .
  • ENDM
  • ELSE
  • M3 MACRO
  • .
  • ENDM
  • ENDIF
  • ENDM
  • One of the problems
  • Label (address) duplication: a label inside a
    macro body is repeated on every expansion.
  • Solution: Ability to pass a label as a parameter.
  • Recursive calling
  • Need a method to pass a parameter from caller to
    callee.
  • Callee decreases the parameter to stop the
    recursion.

23
Implementing Macros
  • The assembler maintains a table of macro
    definitions with
  • Macro name.
  • Stored text of definition.
  • Parameters.
  • Parameters are written in an easily recognizable
    format.
  • During expansion, replace
  • Name with body text.
  • Formal parameters with actual parameters.
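
The table-and-substitute scheme described above can be sketched in a few lines of Python. This is an illustration, not how any particular assembler does it: the function names, the MACRO/ENDM statement layout, and the sample macro CHANGE with formals P1 and P2 are assumptions, and nested or recursive macros are not handled.

```python
import re


def substitute(body, formals, actuals):
    """Replace each formal parameter in the stored body with its actual."""
    binding = dict(zip(formals, actuals))
    if not binding:
        return list(body)
    pattern = re.compile(r"\b(" + "|".join(re.escape(f) for f in formals) + r")\b")
    return [pattern.sub(lambda m: binding.get(m.group(1), m.group(1)), text)
            for text in body]


def expand_macros(lines):
    macros = {}        # macro name -> (formal parameters, stored body text)
    output = []
    defining = None    # name of the macro whose body is being stored
    for line in lines:
        tokens = line.split(None, 2)
        if len(tokens) >= 2 and tokens[1] == "MACRO":
            defining = tokens[0]
            formals = tokens[2].replace(" ", "").split(",") if len(tokens) == 3 else []
            macros[defining] = (formals, [])
        elif tokens and tokens[0] == "ENDM":
            defining = None
        elif defining is not None:
            macros[defining][1].append(line)       # store text, do not assemble
        elif tokens and tokens[0] in macros:       # macro call: name used as opcode
            formals, body = macros[tokens[0]]
            parts = line.split(None, 1)
            actuals = parts[1].replace(" ", "").split(",") if len(parts) == 2 else []
            output.extend(substitute(body, formals, actuals))
        else:
            output.append(line)
    return output


source = ["CHANGE MACRO P1, P2", "MOV EAX,P1", "MOV EBX,P2",
          "MOV P2,EAX", "MOV P1,EBX", "ENDM", "CHANGE P, Q"]
expanded = expand_macros(source)
```

The output contains only ordinary statements, so the later passes of the assembler never see the macro at all.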

24
Assembly Process
  • Two-pass process
  • Pass one: Collects definitions of symbols,
    statement labels, etc.
  • Pass two: Re-reads the statements, replaces
    symbolic names with values, and translates to
    the target language.
  • Why two passes?
  • Need to solve the forward reference problem.
  • Need to find values of names that have not yet
    been defined.

25
Pass One
  • Pass one builds
  • Symbol table: Contains (symbol, value) pairs.
  • Pseudoinstruction table
  • Opcode table: Details of the opcodes used.
  • To assign a value to a symbol, the assembler must
    know the address the symbol will have during
    execution.
  • To do so it maintains
  • Instruction Location Counter (ILC) during
    assembly.
  • Starts at zero.
  • Increases by the instruction length each time a
    new instruction is processed.
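
A minimal Python sketch of pass one follows, showing how the ILC turns labels into symbol table values. The instruction lengths in OPCODE_LENGTHS are made up for the example, and the statement syntax (an optional "LABEL:" prefix and ";" comments) is an assumption.

```python
# Hypothetical opcode table: mnemonic -> instruction length in bytes.
OPCODE_LENGTHS = {"MOV": 5, "ADD": 2, "JMP": 5, "RET": 1}


def pass_one(lines):
    """Build the symbol table by tracking the instruction location counter."""
    symbol_table = {}
    ilc = 0                                          # starts at zero
    for line in lines:
        statement = line.split(";", 1)[0].strip()    # drop comments
        if not statement:
            continue
        if ":" in statement:                         # a label is defined here
            label, statement = statement.split(":", 1)
            symbol_table[label.strip()] = ilc
            statement = statement.strip()
        if statement:
            mnemonic = statement.split()[0]
            ilc += OPCODE_LENGTHS[mnemonic]          # advance by instruction length
    return symbol_table, ilc


program = [
    "START: MOV EAX,I",
    "       ADD EAX,EBX",
    "       JMP DONE      ; forward reference, resolved in pass two",
    "LOOP:  ADD EAX,EBX",
    "DONE:  RET",
]
symbol_table, program_length = pass_one(program)
```

The JMP refers to DONE before DONE is defined. Pass one simply skips over the operand and records DONE when it reaches the label, which is why a second pass is needed to fill in the value.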

26
Instruction Location Counter

27
Contents of Symbol table
  • Symbol.
  • Length of data field.
  • Relocation bits, i.e., does the symbol's value
    change if the program is loaded at a different
    address?
  • Recall immediate addressing and indirect
    addressing.
  • Visibility/security bits, i.e., should the
    procedure be accessible to other procedures?

28
Example Symbol Table

29
Contents of the Opcode Table
  • Symbolic opcode.
  • Operands.
  • Hexadecimal value of opcode.
  • Instruction length.
  • Type number indicating group.
  • Depends on address types (immediate vs direct),
    parameter types etc.
  • A 32-bit add is different from a 16-bit add.

30
Example Opcode Table

31
Pass One

32
Pass Two
  • Generates the object program from data collected
    in pass one.
  • Outputs information to be used by a linker.
  • Linker: Forms a single executable from object
    code produced at different times.
  • Errors detected during pass two are printed out
    and the translation procedure is stopped.
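
Pass two can be sketched in Python as a lookup step over a symbol table that pass one has already filled in. The opcode values, the (mnemonic, operand) statement format, and the names pass_two and AssemblyError are assumptions for illustration; real object code would be encoded as bytes, not tuples.

```python
# Hypothetical opcode values; real encodings depend on the machine.
OPCODES = {"JMP": 0xE9, "CALL": 0xE8, "RET": 0xC3}


class AssemblyError(Exception):
    """Raised to stop translation when pass two finds an error."""


def pass_two(statements, symbol_table):
    """Translate (mnemonic, operand) pairs, replacing symbols with values."""
    object_code = []
    for mnemonic, operand in statements:
        if mnemonic not in OPCODES:
            raise AssemblyError(f"unknown opcode {mnemonic}")
        if operand is None:
            object_code.append((OPCODES[mnemonic],))
        elif operand in symbol_table:
            object_code.append((OPCODES[mnemonic], symbol_table[operand]))
        else:
            raise AssemblyError(f"undefined symbol {operand}")
    return object_code


# The symbol table here stands in for the output of pass one.
code = pass_two([("JMP", "DONE"), ("RET", None)], {"DONE": 14})
```

An undefined symbol raises AssemblyError, which mirrors the rule that errors in pass two stop the translation.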

33
Pass Two

34
Implementing the Symbol Table
  • Associative memory
  • A set of pairs; given a key (symbol), it must
    produce the value.
  • Implemented as an array of pairs with proper
    operations.
  • insert(), delete(), getValue().
  • Binary search tree
  • Keep (symbol, value) pairs in a sorted binary
    tree.
  • In searching, compare with the key and go left or
    right depending on (key on node <, =, > key).
  • Hash tables
  • Values are attached to hash buckets.
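
As one possible shape for the hash table variant, here is a minimal Python sketch with explicit buckets. The class name, the bucket count, and the deliberately simple hash function (sum of character codes) are assumptions chosen to keep the example short.

```python
class HashSymbolTable:
    """Symbol table with (symbol, value) pairs chained in hash buckets."""

    def __init__(self, bucket_count=8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, symbol):
        # Simple hash: sum of character codes modulo the number of buckets.
        return self.buckets[sum(map(ord, symbol)) % len(self.buckets)]

    def insert(self, symbol, value):
        bucket = self._bucket(symbol)
        for i, (key, _) in enumerate(bucket):
            if key == symbol:                 # redefine an existing symbol
                bucket[i] = (symbol, value)
                return
        bucket.append((symbol, value))

    def get_value(self, symbol):
        for key, value in self._bucket(symbol):
            if key == symbol:
                return value
        raise KeyError(symbol)

    def delete(self, symbol):
        bucket = self._bucket(symbol)
        bucket[:] = [(k, v) for k, v in bucket if k != symbol]


table = HashSymbolTable()
table.insert("AB", 100)
table.insert("BA", 200)      # same character sum, so it shares a bucket with AB
```

Colliding symbols such as AB and BA share a bucket, and lookups scan only that bucket rather than the whole table.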

35
Hash Table

36
Linking and Loading
  • Translating sources into an executable involves
  • Compiling or assembling all source procedures
    into separate object modules.
  • On Unix these are .o files; on NT, .obj files.
  • Linking all object modules into one executable.
  • On Unix these have no specific extension
    (sometimes a.out); on NT, .exe.
  • The two-step process saves time: after a change,
    only the modified modules need to be
    retranslated before relinking.

37
Assembling and Linking Process

38
What the Linker has to do
  • Take separately compiled objects, put them in
    one linear address space, and adjust the
    addresses so that they are what the individual
    modules had in mind.
  • It must
  • Find externally defined addresses.
  • Translate addresses so that the modules can be
    loaded at any physical location.

39
Object Modules

41
Linker Activity
  • The linker solves the relocation problem and the
    external reference problem by
  • Constructing a table of all object modules and
    their lengths.
  • Assigning a starting address to each object
    module, based on this table.
  • Adding a relocation constant (the starting
    address of its module) to each memory reference.
  • Finding all instructions that reference other
    procedures and inserting the addresses in place.
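
The four linker steps above can be traced with a small Python sketch. The module layout used here (dictionaries with the keys name, length, entry_points, relocatable, and external), the load origin of 100, and the sample modules A and B are all assumptions made for illustration; a real object module is a binary file.

```python
def link(modules, load_origin=100):
    """Return module starting addresses and the patched memory references."""
    # Steps 1 and 2: table of modules and lengths -> starting addresses.
    start = {}
    next_free = load_origin
    for module in modules:
        start[module["name"]] = next_free
        next_free += module["length"]

    # Global symbols: every entry point, relocated by its module's start.
    global_symbols = {}
    for module in modules:
        for symbol, offset in module["entry_points"].items():
            global_symbols[symbol] = start[module["name"]] + offset

    patched = {}    # absolute location of a reference -> absolute target
    for module in modules:
        base = start[module["name"]]                 # relocation constant
        # Step 3: add the relocation constant to each local memory reference.
        for offset, local_target in module["relocatable"].items():
            patched[base + offset] = base + local_target
        # Step 4: insert addresses of externally defined procedures.
        for offset, symbol in module["external"].items():
            patched[base + offset] = global_symbols[symbol]
    return start, patched


modules = [
    {"name": "A", "length": 400, "entry_points": {"A_MAIN": 0},
     "relocatable": {0: 300}, "external": {200: "B_PROC"}},
    {"name": "B", "length": 600, "entry_points": {"B_PROC": 50},
     "relocatable": {100: 300}, "external": {}},
]
start, patched = link(modules)
```

Module A's reference at offset 200 ends up pointing at B_PROC's final address, which is known only after every module has been given a starting address.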

42
Structure of an Object Module I
  • Identification: Name and lengths of the different
    parts of the module.
  • Entry points: Symbols defined in this module and
    their values.
  • External.. List of externally defined symbols
    and which instructions use them.
  • Linker inserts these later on.
  • Machine instructions: Only this part is loaded
    into memory for execution.

43
Structure of an Object Module II
  • Relocation directory: Relocatable addresses are
    listed here. The linker has no way to guess
    them, so the assembler produces this part.
  • End of module: May contain error-checking
    information such as a checksum.

44
Binding Time and Dynamic Relocation
  • A process moves between queues and may be loaded
    into memory many times, possibly at different
    locations.
  • Its addresses need to change accordingly. Hence
    absolute addresses cannot be computed in advance.
  • Binding time: The time at which actual addresses
    are determined.

45
Possible Binding Time
  • When the module is written.
  • When the module is assembled.
  • When the program is linked.
  • When the program is loaded.
  • When the base register used for addressing is
    loaded.
  • When the instruction containing the address is
    executed.

46
Issues in Address Binding
  • If an instruction containing an address is moved
    after binding, the address becomes incorrect.
  • Address binding
  • When symbolic names are bound to virtual
    addresses.
  • When virtual addresses are bound to physical
    addresses.
  • The linker creates the binding of symbolic names
    to virtual addresses.
  • This holds whether or not memory is paged.

47
Three Methods to Relocate
  • Virtual memory and paging.
  • Only the page table needs to be updated when the
    program moves.
  • Relocation register: Points to the starting
    physical address of the program. The hardware
    automatically adds the relocation register to
    every memory address.
  • Relative addressing: All addresses are either
    absolute constants (e.g., for devices) or
    relative to the program counter (PC).
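
The relocation register method is the easiest of the three to model. The Python sketch below treats physical memory as a list. The names RelocatedProcess and move and the sample addresses are assumptions, and in real hardware the addition is done by the memory system, not by program code.

```python
class RelocatedProcess:
    """Every program address is offset by the relocation register."""

    def __init__(self, physical_memory, relocation_register):
        self.physical_memory = physical_memory
        self.relocation_register = relocation_register

    def read(self, program_address):
        # The add that the hardware performs on each memory reference.
        return self.physical_memory[self.relocation_register + program_address]


def move(process, length, new_start):
    """Copy the program elsewhere and update only the relocation register."""
    old_start = process.relocation_register
    block = process.physical_memory[old_start:old_start + length]
    process.physical_memory[new_start:new_start + length] = block
    process.relocation_register = new_start


memory = [0] * 32
memory[8:11] = [10, 20, 30]                 # program loaded at physical address 8
process = RelocatedProcess(memory, relocation_register=8)
before = process.read(1)                    # program address 1
move(process, length=3, new_start=16)
after = process.read(1)                     # same program address after the move
```

The program keeps using address 1 before and after the move, so none of the addresses inside the program have to be rewritten.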

48
Dynamic Linking
  • The linker described so far links all procedures
    statically at link time, whether or not they
    are ever called.
  • Some procedures (such as exception handlers) are
    rarely used. Hence executable size can be
    reduced if linking is postponed until the
    procedure is first called.
  • This is called dynamic linking.

49
MULTICS Style Dynamic Linking
  • Each object has a linkage segment containing the
    addresses of the procedures that may be called.
  • Procedure calls are translated into indirect
    references to addresses in this segment.
  • The compiler initially fills in an invalid
    address, so the first call causes a trap.
  • In turn, the dynamic linker finds the proper
    address, fills it in, and restarts the
    instruction.
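
The trap-and-patch sequence can be imitated in Python, with an exception standing in for the hardware trap. Everything here is a simplification under stated assumptions: the class names, the use of None as the invalid address, and the dictionary that plays the role of the library are illustrative, not a description of the actual MULTICS mechanism.

```python
INVALID = None    # stands in for the invalid address the compiler fills in


class LinkageTrap(Exception):
    """Plays the role of the trap caused by using an invalid address."""


class Program:
    def __init__(self, library):
        self.library = library        # procedure name -> procedure (hypothetical)
        self.linkage_segment = {}     # procedure name -> bound address or INVALID
        self.traps = 0

    def declare(self, name):
        self.linkage_segment[name] = INVALID      # done at compile time

    def _indirect_call(self, name, args):
        target = self.linkage_segment[name]
        if target is INVALID:
            raise LinkageTrap(name)
        return target(*args)

    def call(self, name, *args):
        try:
            return self._indirect_call(name, args)
        except LinkageTrap:
            self.traps += 1
            # Dynamic linker: find the procedure and fill in its address.
            self.linkage_segment[name] = self.library[name]
            return self._indirect_call(name, args)    # restart the instruction


program = Program({"add": lambda a, b: a + b})
program.declare("add")
first = program.call("add", 2, 3)     # traps, links, then runs
second = program.call("add", 4, 5)    # already linked, no trap
```

Only the first call pays the linking cost. Later calls go straight through the filled-in entry.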

50
Before the Call

51
After the Call

52
Dynamic Linking in Windows
  • DLLs: Special file format with procedures and/or
    data.
  • Library sharing is done through DLLs.
  • DLLs have no main, hence cannot run by
    themselves.
  • Implicit linking: The user program is statically
    linked through import library glue. The
    operating system examines addresses and provides
    address translation.
  • Explicit linking: The user program makes an
    explicit call at run time to bind to the
    library, then makes additional calls to the OS
    to get the addresses of the library procedures.

53
Using DLLs

54
Dynamic Linking in Unix
  • Shared library: Supports only implicit linking.
  • An archive file containing multiple data segments
    and procedures.
  • Has two parts
  • Host library: Statically linked to the
    executable.
  • Target library: Called at run time.