SYMBOL TABLES - PowerPoint PPT Presentation

Slides: 22
Provided by: ICS91
Transcript and Presenter's Notes

1
SYMBOL TABLES & CODE GENERATION FOR EXECUTABLES
2
SYMBOL TABLES
  • Compilers that produce an executable (or the
    representation of an executable in object module
    format),
  • as opposed to a program in an intermediate
    language,
  • need to make use of a symbol table. In fact, for
    optimization purposes, all compilers do.

3
  • The symbol table records information about the
    identifiers in the source program
  • such as their name, type, number of dimensions,
    space assignment, etc.

4
  • To illustrate the use of symbol tables, let's
    consider a simple compiler in which symbol_stack
    consists of integers, and the integer associated
    with an identifier on the stack is the index of
    the entry for that identifier in the symbol
    table.

5
  • Our symbol_stack entries will provide pointers to
    the entries in the symbol table, where the name of
    the identifier and the offset assigned to it in
    the data segment are stored.
  • Negative numbers will be employed on symbol_stack
    as codes to denote the registers AX, BX, etc.
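
The register convention above can be sketched in C. The slides only say that negative numbers denote registers (and later use -1 for AX); the other codes, the enum names, and the register_name helper below are hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register codes: only "-1 means AX" appears in the slides. */
enum { REG_AX = -1, REG_BX = -2, REG_CX = -3, REG_DX = -4 };

/* Returns the register name for a negative symbol_stack value,
   or NULL when the value is a symbol table index instead. */
const char *register_name(int stack_value)
{
    static const char *names[] = { "AX", "BX", "CX", "DX" };

    if (stack_value >= 0)
        return NULL;                 /* an index into the symbol table */
    assert(stack_value >= REG_DX);   /* this sketch knows four registers only */
    return names[-stack_value - 1];  /* -1 -> names[0], -2 -> names[1], ... */
}
```

With this encoding a single integer on symbol_stack is enough to tell a register operand from a memory operand.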

6
  • As identifiers are encountered in the source
    code, their names are packed onto an array that
    we will call id_stack, defined as
      char id_stack[1000];
  • Since strings in C all end in a 00h byte, it is
    only necessary to specify where on id_stack a
    name begins in order to retrieve it.
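
A minimal sketch of packing names onto id_stack follows. The id_top counter and the store_name function are hypothetical names, not taken from the slides, and overflow is only checked with an assertion.

```c
#include <assert.h>
#include <string.h>

char id_stack[1000];
int id_top = 0;   /* hypothetical: index of the next free byte on id_stack */

/* Copies name, including its terminating 00h byte, onto id_stack
   and returns the index at which the name begins. */
int store_name(const char *name)
{
    size_t len = strlen(name) + 1;             /* +1 for the terminator */
    int start = id_top;

    assert(id_top + len <= sizeof id_stack);   /* sketch: no overflow recovery */
    memcpy(&id_stack[start], name, len);
    id_top += (int)len;
    return start;
}
```

Storing "X1" and then "ZZ" would place them at indexes 0 and 3, and &id_stack[3] can be used directly as the C string "ZZ".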

7
  • The symbol table entry for a name does not
    contain the name itself, but instead a pointer to
    the beginning of the name on id_stack.
  • The reason for this is that, since the symbol
    table is an array of symbol table entries, we
    would otherwise have to provide space in
    each entry for the largest legal name size.

8
  • When an identifier is encountered in the source
    code, the compiler has to search the symbol table
    to find the entry, if any, for it.
  • Various methods have been investigated for making
    this process more efficient, such as the use of
    binary trees.

9
  • The method of choice, however, has been to derive
    a number called a hash code from an identifier, and
    then link all identifiers with the same hash code
    in a list, which we will refer to as a hash-list.

10
  • One method for evaluating a hash code is to add
    up the ASCII codes of the individual characters
    of the identifier
  • and then take, as the hash code, the remainder of
    this sum after division by a prime number, such
    as 127.

11
  • The following is sample code for this purpose
      int hash(char *name)
      {
        int hash_value = 0;
        int i = 0;
        while (name[i] != '\0') {
          hash_value += name[i];   /* add the ASCII code of each character */
          i++;
        }
        return (hash_value % 127);
      }
  • In this scheme there are 127 hash-lists
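
Because this hash only sums character codes, any two identifiers with the same letters in a different order land on the same hash-list. The sketch below repeats the hash function so that it compiles on its own; the same_hash_list helper and the sample names "ab" and "ba" are hypothetical.

```c
#include <assert.h>

#define HASH_LISTS 127

/* Sum of the character codes, reduced modulo 127 (as on this slide). */
int hash(const char *name)
{
    int hash_value = 0;

    assert(name != 0);
    for (int i = 0; name[i] != '\0'; i++)
        hash_value += name[i];
    return hash_value % HASH_LISTS;
}

/* Hypothetical helper: nonzero if both names belong to one hash-list. */
int same_hash_list(const char *a, const char *b)
{
    return hash(a) == hash(b);
}
```

For example, "ab" sums to 97 + 98 = 195, and 195 mod 127 = 68; "ba" gives the same sum, so both names share hash-list 68.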

12
  • A simple symbol table could be defined as
    follows
      typedef struct {
        int name_index;
        int offset;
        int hash_link;
      } symbol_table_entry;
      symbol_table_entry symbol_table[1000];

13
  • Here name_index is the pointer into id_stack where
    the name is stored,
  • offset is the offset in the data segment assigned
    to the identifier, and
  • hash_link is a pointer to the symbol table entry
    for the next identifier encountered, if any, with
    the same hash code

14
  • The entries at symbol_table[0] through
    symbol_table[126] are reserved for the heads of
    the 127 hash-lists.

15
  • For example, if X1 is the first identifier
    encountered in the source with hash-code (say)
    30, then an entry for it will be made at
    symbol_table[30].
  • If, later on, an identifier ZZ is encountered
    which also has hash-code 30, then an entry will
    be made for ZZ at the next free index >= 127 in
    symbol_table, and the hash-link in the entry for
    X1 will be changed from null to point instead to
    the entry for ZZ.

16
  • Within the rules section of the Lex definition
    file, the regular expression and associated code
    for an identifier may take a form such as the
    following
      {letter}({letter}|{digit}|"_")*   { yylval = find(yytext); return identifier; }
  • where the find function returns the index into
    the symbol_table of the entry for the identifier,
    creating an entry if one doesn't already exist

17
  • The find function begins as follows
      int find(char *name)
      {
        int j;
        j = hash(name);
  • and proceeds according to the flow diagram on
    the next slide
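
One plausible way to complete find, following the layout described on slides 14 and 15, is sketched below. It is not the author's flow diagram. The NONE marker for an empty head or a null hash-link, the init_symbol_table and fill_entry helpers, the counters, and the 2-byte spacing of offsets are all assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

#define HASH_LISTS 127
#define TABLE_SIZE 1000
#define NONE (-1)   /* hypothetical marker: empty head or null hash_link */

typedef struct {
    int name_index;
    int offset;
    int hash_link;
} symbol_table_entry;

symbol_table_entry symbol_table[TABLE_SIZE];
char id_stack[1000];
int id_top = 0;               /* next free byte on id_stack (hypothetical) */
int next_free = HASH_LISTS;   /* next free entry past the heads (hypothetical) */
int next_offset = 0;          /* next data-segment offset (hypothetical) */

int hash(const char *name)
{
    int hash_value = 0;
    for (int i = 0; name[i] != '\0'; i++)
        hash_value += name[i];
    return hash_value % HASH_LISTS;
}

/* Marks all 127 list heads as empty; call once before the first find. */
void init_symbol_table(void)
{
    for (int i = 0; i < HASH_LISTS; i++) {
        symbol_table[i].name_index = NONE;
        symbol_table[i].hash_link = NONE;
    }
}

/* Stores name on id_stack and initialises entry k for it. */
static void fill_entry(int k, const char *name)
{
    size_t len = strlen(name) + 1;

    assert(id_top + len <= sizeof id_stack);
    memcpy(&id_stack[id_top], name, len);
    symbol_table[k].name_index = id_top;
    symbol_table[k].offset = next_offset;
    symbol_table[k].hash_link = NONE;
    id_top += (int)len;
    next_offset += 2;   /* assumes every identifier is one 16-bit word */
}

/* Returns the index of the entry for name, creating it if needed. */
int find(const char *name)
{
    int j = hash(name);

    if (symbol_table[j].name_index == NONE) {   /* empty list: use the head */
        fill_entry(j, name);
        return j;
    }
    for (;;) {                                  /* walk the hash-list */
        if (strcmp(&id_stack[symbol_table[j].name_index], name) == 0)
            return j;                           /* already in the table */
        if (symbol_table[j].hash_link == NONE)
            break;                              /* j is the last entry */
        j = symbol_table[j].hash_link;
    }
    assert(next_free < TABLE_SIZE);
    fill_entry(next_free, name);
    symbol_table[j].hash_link = next_free;      /* link the new entry in */
    return next_free++;
}
```

After init_symbol_table(), find("ab") uses the head at index 68, and a later find("ba"), which has the same hash code, is placed at index 127 and linked from entry 68. Repeated calls with either name return the same index.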

18
  • Code Generation Using the Symbol Table
  • Let's consider the code required in our simple
    compiler within our Yacc definition file for
    addition.
  • To avoid complications, let's assume that the
    code for our arithmetic expressions requires the
    use of register AX only

19
  • So on symbol_stack, positive numbers are
    indexes of entries for identifiers in
    symbol_table, and (say) -1 is used as a code for
    AX
      expression : expression '+' term
                     { C code as described below }
  • The C code should check whether $1 and $3 are
    positive or negative, and generate appropriate
    object code for each of the 4 cases.

20
  • Case where $1 and $3 are both positive
  • Generate machine code corresponding to
      mov AX, symbol_table[$1].offset
      add AX, symbol_table[$3].offset
  • and set $$ = -1

21
  • Case where $1 is negative and $3 is positive
  • Generate machine code corresponding to
      add AX, symbol_table[$3].offset
  • and set $$ = -1
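
The two cases shown above can be sketched as a C helper that a Yacc action might call with $1 and $3. The gen_add name, its parameters, and the bracketed [offset] operand syntax are assumptions, and the helper emits assembly text rather than machine code to keep the example short. The remaining two cases are not covered in this transcript, so the sketch reports them as unhandled instead of guessing.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define REG_AX (-1)   /* code for register AX on symbol_stack */

typedef struct {
    int name_index;
    int offset;
    int hash_link;
} symbol_table_entry;

symbol_table_entry symbol_table[1000];

/* Emits assembly text for "left + right" into out.
   Returns 1 and sets *result to REG_AX for the two cases in the
   slides; returns 0 and emits nothing for the other two cases. */
int gen_add(int left, int right, char *out, size_t out_size, int *result)
{
    assert(out != NULL && out_size > 0 && result != NULL);
    out[0] = '\0';

    if (left > 0 && right > 0) {        /* both operands are identifiers */
        snprintf(out, out_size, "mov AX, [%d]\nadd AX, [%d]\n",
                 symbol_table[left].offset, symbol_table[right].offset);
        *result = REG_AX;
        return 1;
    }
    if (left < 0 && right > 0) {        /* left operand is already in AX */
        snprintf(out, out_size, "add AX, [%d]\n",
                 symbol_table[right].offset);
        *result = REG_AX;
        return 1;
    }
    return 0;                           /* cases not shown in the slides */
}
```

In the Yacc action, the value stored in *result would be assigned to $$, so the enclosing expression sees that its operand now lives in AX.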