Title: SYMBOL TABLES

1
SYMBOL TABLES: CODE GENERATION FOR EXECUTABLES
2
SYMBOL TABLES

Compilers that produce an executable (or the
representation of an executable in object module
format), as opposed to a program in an
intermediate language, need to make use of a
symbol table. In fact, for optimization purposes,
all compilers need one.

3

The symbol table records information about the
identifiers in the source program, such as their
name, type, number of dimensions, and space
assignment.

4

To illustrate the use of symbol tables, let's
consider a simple compiler in which symbol_stack
consists of integers, and the integer associated
with an identifier on the stack is the index of
the entry for that identifier in the symbol
table.

5

Our symbol_stack entries will provide pointers to
the entries in the symbol table, where the name of
the identifier and the offset assigned to it in
the data segment are recorded.
Negative numbers will be employed on symbol_stack
as codes to denote the registers AX, BX, etc.
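
The encoding just described can be sketched in C as follows. This is only an illustration: the slides say that negative numbers stand for registers, but the specific values REG_AX and REG_BX, the stack size, and the helper names push_symbol, is_register and describe_entry are assumptions made here.

```c
#include <stdio.h>

/* Hypothetical register codes: the slides only say that negative numbers
   denote registers; the particular values are assumptions. */
#define REG_AX (-1)
#define REG_BX (-2)

#define STACK_SIZE 100          /* assumed capacity */

int symbol_stack[STACK_SIZE];
int stack_top = 0;              /* number of entries currently on the stack */

/* Push a symbol_table index (>= 0) or a register code (< 0).
   Returns 0 on success, -1 if the stack is full. */
int push_symbol(int value)
{
    if (stack_top >= STACK_SIZE)
        return -1;
    symbol_stack[stack_top++] = value;
    return 0;
}

/* A negative entry denotes a register rather than an identifier. */
int is_register(int entry)
{
    return entry < 0;
}

/* Describe an entry, e.g. for debugging output. */
const char *describe_entry(int entry)
{
    if (entry == REG_AX) return "register AX";
    if (entry == REG_BX) return "register BX";
    if (entry < 0)       return "unknown register";
    return "identifier";
}
```

For instance, push_symbol(30) records the identifier whose entry is at index 30 of the symbol table, while push_symbol(REG_AX) records that a value currently lives in AX.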

6

As identifiers are encountered in the source
code, their names are packed onto an array, which
we will call id_stack, defined as
char id_stack[1000];
Since strings in C all end in a 00h byte, it is
only necessary to specify where on id_stack a
name begins in order to retrieve it.
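
A minimal sketch of packing names onto id_stack might look like this. The counter id_top and the helpers pack_name and name_at are hypothetical names introduced for illustration; the slides only define the array itself.

```c
#include <string.h>

#define ID_STACK_SIZE 1000

char id_stack[ID_STACK_SIZE];
int id_top = 0;   /* hypothetical: index of the next free byte on id_stack */

/* Copy name, including its terminating '\0', onto id_stack.
   Returns the index where the name begins, or -1 if it does not fit. */
int pack_name(const char *name)
{
    size_t len = strlen(name) + 1;          /* +1 for the 00h terminator */
    if (len > (size_t)(ID_STACK_SIZE - id_top))
        return -1;
    int start = id_top;
    memcpy(&id_stack[start], name, len);
    id_top += (int)len;
    return start;
}

/* Because each name ends in a 00h byte, its start index is enough
   to retrieve it as an ordinary C string. */
const char *name_at(int start)
{
    return &id_stack[start];
}
```

Packing "X1" and then "ZZ" into an empty id_stack places them at indexes 0 and 3, since "X1" occupies three bytes including its terminator.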

7

The symbol table entry for a name does not
contain the name itself, but instead a pointer to
the beginning of the name on id_stack.
The reason for this is that, since the symbol
table is an array of symbol table entries, we
would otherwise have to provide space in each
entry for the largest legal name size.

8

When an identifier is encountered in the source
code, the compiler has to search the symbol table
to find the entry, if any, for it.
Various methods have been investigated for making
this process more efficient, such as the use of
binary trees.

9

But the method of choice has been to derive a
number, called a hash code, from an identifier,
and then link all identifiers with the same hash
code in a list, which we will refer to as a
hash-list.

10

One method for evaluating a hash code is to add
up the ASCII codes of the individual characters
of the identifier
and then take, as the hash code, the remainder of
this sum after division by a prime number, such
as 127.

11

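The following is sample code for this purpose:
int hash(char *name)
{
    int hash_value = 0;
    int i = 0;
    /* add up the character codes of the name */
    while (name[i] != '\0') {
        hash_value += name[i];
        i++;
    }
    /* the remainder is in the range 0..126 */
    return (hash_value % 127);
}
In this scheme there are 127 hash-lists.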
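
As a worked illustration of the additive hash, the sketch below restates the function and adds a small helper, show_hash, which is a hypothetical name introduced here. It also makes one property of the scheme visible: names made of the same characters in a different order always land in the same hash-list.

```c
#include <stdio.h>

#define HASH_SIZE 127   /* the prime used on the slide */

/* Additive hash: sum of the character codes modulo 127. */
int hash(const char *name)
{
    int hash_value = 0;
    for (int i = 0; name[i] != '\0'; i++)
        hash_value += (unsigned char)name[i];
    return hash_value % HASH_SIZE;
}

/* Print the hash code of a name and return it for convenience. */
int show_hash(const char *name)
{
    int h = hash(name);
    printf("hash(\"%s\") = %d\n", name, h);
    return h;
}
```

For example, "ab" and "ba" both hash to 68, because 97 + 98 = 195 and 195 mod 127 = 68.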

12

A simple symbol table could be defined as
follows:
typedef struct {
    int name_index;
    int offset;
    int hash_link;
} symbol_table_entry;
symbol_table_entry symbol_table[1000];

13

Here name_index is the pointer into id_stack where
the name is stored,
offset is the offset in the data segment assigned
to the identifier, and
hash_link is a pointer to the symbol table entry
for the next identifier encountered, if any, with
the same hash code.

14

The entries at symbol_table[0] through
symbol_table[126] are reserved for the heads of
the 127 hash-lists.
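
One way to set up the reserved heads is sketched below. The slides do not say how an unused head or a null hash_link is represented, so the marker NO_ENTRY, the counter next_free and the function init_symbol_table are assumptions made for this illustration.

```c
#define HASH_SIZE  127
#define TABLE_SIZE 1000
#define NO_ENTRY   (-1)   /* hypothetical marker for "unused" and for a null link */

typedef struct {
    int name_index;
    int offset;
    int hash_link;
} symbol_table_entry;

symbol_table_entry symbol_table[TABLE_SIZE];
int next_free = HASH_SIZE;   /* hypothetical: first index past the reserved heads */

/* Mark all 127 hash-list heads as unoccupied with empty chains. */
void init_symbol_table(void)
{
    for (int i = 0; i < HASH_SIZE; i++) {
        symbol_table[i].name_index = NO_ENTRY;  /* head not yet occupied */
        symbol_table[i].offset = 0;
        symbol_table[i].hash_link = NO_ENTRY;   /* no further entries in this list */
    }
    next_free = HASH_SIZE;
}
```

An explicit marker is needed because a zero-initialized entry would have name_index 0, which is a valid position on id_stack.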

15

For example, if X1 is the first identifier
encountered in the source with hash code (say)
30, then an entry for it will be made at
symbol_table[30].
If, later on, an identifier ZZ is encountered
which also has hash code 30, then an entry will
be made for ZZ at the next free index >= 127 in
symbol_table, and the hash_link in the entry for
X1 will be changed from null to point instead to
the entry for ZZ.

16

Within the rules section of the Lex definition
file, the regular expression and associated code
for an identifier may take a form such as the
following:
{letter}({letter}|{digit}|_)*  { yylval = find(yytext); return identifier; }
where the find function returns the index into
symbol_table of the entry for the identifier,
creating an entry if one doesn't already exist.

17

The find function begins as follows:
int find(char *name)
{
    int j;
    j = hash(name);
and proceeds according to the flow diagram on
the next slide.
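
Since the flow diagram is not reproduced in this transcript, here is one possible sketch of a complete find, written from the description of the hash-lists given earlier. It is not the presenter's code. The marker NO_ENTRY, the counters id_top, next_free and next_offset, the helper fill_entry, and the assumption that every identifier is assigned a 2-byte slot in the data segment are all introduced for illustration.

```c
#include <string.h>

#define HASH_SIZE     127
#define TABLE_SIZE    1000
#define ID_STACK_SIZE 1000
#define NO_ENTRY      (-1)  /* assumed marker: unused head, null link, or failure */
#define WORD_SIZE     2     /* assumed: each identifier gets a 16-bit word */

typedef struct { int name_index; int offset; int hash_link; } symbol_table_entry;

symbol_table_entry symbol_table[TABLE_SIZE];
char id_stack[ID_STACK_SIZE];
int id_top = 0;              /* next free byte on id_stack */
int next_free = HASH_SIZE;   /* next free symbol_table index past the heads */
int next_offset = 0;         /* next free offset in the data segment */

int hash(const char *name)
{
    int hash_value = 0;
    for (int i = 0; name[i] != '\0'; i++)
        hash_value += (unsigned char)name[i];
    return hash_value % HASH_SIZE;
}

/* Must be called once before find: marks every head as unoccupied. */
void init_symbol_table(void)
{
    for (int i = 0; i < HASH_SIZE; i++) {
        symbol_table[i].name_index = NO_ENTRY;
        symbol_table[i].hash_link = NO_ENTRY;
    }
    id_top = 0;
    next_free = HASH_SIZE;
    next_offset = 0;
}

/* Pack name onto id_stack and fill symbol_table[index] for it.
   Returns index, or NO_ENTRY if id_stack is full. */
static int fill_entry(int index, const char *name)
{
    size_t len = strlen(name) + 1;
    if (len > (size_t)(ID_STACK_SIZE - id_top))
        return NO_ENTRY;
    memcpy(&id_stack[id_top], name, len);
    symbol_table[index].name_index = id_top;
    symbol_table[index].offset = next_offset;
    symbol_table[index].hash_link = NO_ENTRY;
    id_top += (int)len;
    next_offset += WORD_SIZE;
    return index;
}

/* Return the symbol_table index for name, creating an entry if needed.
   Returns NO_ENTRY when a table is full; a real compiler would
   report an error at that point. */
int find(const char *name)
{
    int j = hash(name);
    if (symbol_table[j].name_index == NO_ENTRY)      /* head still unused */
        return fill_entry(j, name);
    for (;;) {                                       /* walk the hash-list */
        if (strcmp(&id_stack[symbol_table[j].name_index], name) == 0)
            return j;                                /* already present */
        if (symbol_table[j].hash_link == NO_ENTRY)
            break;                                   /* j is the last entry */
        j = symbol_table[j].hash_link;
    }
    if (next_free >= TABLE_SIZE)
        return NO_ENTRY;
    int k = fill_entry(next_free, name);             /* append a new entry */
    if (k == NO_ENTRY)
        return NO_ENTRY;
    symbol_table[j].hash_link = k;                   /* link it to the list */
    next_free++;
    return k;
}
```

After init_symbol_table(), calling find("ab") creates the head entry at index 68, and a following find("ba"), which has the same hash code, creates an entry at index 127 and links it from entry 68. Repeated calls with either name return the existing index.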

18

Code Generation Using the Symbol Table
Let's consider the code required in our simple
compiler within our Yacc definition file for
addition.
To avoid complications, let's assume that the
code for our arithmetic expressions requires the
use of register AX only.

19

So on symbol_stack, positive numbers are
indexes of entries for identifiers in
symbol_table, and (say) -1 is used as a code for
AX.
expression : expression '+' term
                 { C code as described below }
The C code should check whether $1 and $3 are
positive or negative, and generate appropriate
object code for each of the 4 cases.

20

Case where $1 and $3 are both positive:
Generate machine code corresponding to
mov AX, symbol_table[$1].offset
add AX, symbol_table[$3].offset
and set $$ = -1.

21

Case where $1 is negative and $3 is positive:
Generate machine code corresponding to
add AX, symbol_table[$3].offset
and set $$ = -1.
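
The case analysis can be sketched as a C helper that a Yacc action might call, for example as $$ = gen_add($1, $3, buf, sizeof buf). This is an illustration under several assumptions: the function gen_add, the error value GEN_UNSUPPORTED and the textual form of the emitted instructions are hypothetical, and the sketch emits assembly text rather than machine code. Only the first two cases come from the slides. The third relies on addition being commutative, and the fourth is rejected because, with AX as the only register, both operands cannot be in a register at once. Index 0 is treated as an identifier here, since it is a valid symbol_table index.

```c
#include <stdio.h>

#define REG_AX          (-1)    /* code for AX, as on the slide */
#define GEN_UNSUPPORTED (-100)  /* hypothetical error value, not from the slides */

typedef struct {
    int name_index;
    int offset;
    int hash_link;
} symbol_table_entry;

symbol_table_entry symbol_table[1000];

/* Write assembly text for "left + right" into buf.
   left and right are symbol_stack values: an index into symbol_table
   (>= 0) or a register code (< 0).
   Returns the symbol_stack value for the result: REG_AX on success,
   GEN_UNSUPPORTED if both operands are registers. */
int gen_add(int left, int right, char *buf, size_t size)
{
    if (left >= 0 && right >= 0) {
        /* both operands are in memory: load one, add the other */
        snprintf(buf, size, "mov AX, [%d]\nadd AX, [%d]\n",
                 symbol_table[left].offset, symbol_table[right].offset);
    } else if (left < 0 && right >= 0) {
        /* left operand is already in AX */
        snprintf(buf, size, "add AX, [%d]\n", symbol_table[right].offset);
    } else if (left >= 0 && right < 0) {
        /* assumed case: right operand is in AX, and a + b equals b + a */
        snprintf(buf, size, "add AX, [%d]\n", symbol_table[left].offset);
    } else {
        /* assumed case: not expressible with a single register */
        if (size > 0)
            buf[0] = '\0';
        return GEN_UNSUPPORTED;
    }
    return REG_AX;   /* the sum is now held in AX */
}
```

Returning REG_AX in every supported case mirrors the slides: whatever the operands were, the result of the addition ends up in AX, so -1 is what goes back onto symbol_stack.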
