Title: Symbol Tables
1Symbol Tables
- Contents and Use
- Construction
- Implementing Scoping Rules
2Symbol Table Design
3Structure of a Compiler
(Diagram: the phases Scanner, Parser, Semantic Analyzer, Code Generator, and Optimizer, with the SYMBOL TABLE alongside them.)
4Role of Symbol Tables
- The symbol table is one of the most important data structures in a compiler.
- It is created during the front end to hold information on the program's objects.
- The IR refers to objects via pointers to the corresponding entries in the symbol table.
- This is done until near the end of compilation, so the symbol table is needed for the entire processing.
- Its contents may be modified as part of optimizations.
5Contents of Symbol Tables
- Symbol Tables save information on
- identifiers
- labels
- numerical values (constants)
- character strings
- Temporaries (compiler-generated variables)
- It is more efficient to store numerical values and character strings separately.
- There might be a separate label table.
Symbol tables may be created on a per-procedure
basis. There may be separate tables for global
data.
6Contents of Symbol Tables
- There are a variety of typical kinds of identifiers:
- Scalar variables, arrays and structures (records), procedures and functions, defined constants, temporaries
- Some kinds of information stored with identifiers:
- The name or value
- Data type
- Size and dimensionality information (for arrays)
- Other scoping and storage information
- Result types, parameters (formal arguments), prototypes
The information saved depends on the kind of object, so entries in the symbol table need a flexible format.
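One way to get such a flexible entry format is a fixed header plus a bag of kind-specific attributes. The sketch below is only an illustration: the names `SymbolEntry`, `kind`, `attrs`, and `element_count` are assumptions made for this example and do not come from the slides.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class SymbolEntry:
    """Hypothetical entry: a fixed header plus per-kind attributes."""
    name: str        # the name (or the value, for a constant)
    kind: str        # e.g. "scalar", "array", "function", "constant"
    data_type: str   # data type, or result type for a function
    attrs: Dict[str, Any] = field(default_factory=dict)  # kind-specific information


# An array needs dimension bounds; a function needs its formal parameters.
a = SymbolEntry("A", "array", "real", {"dims": [(1, 100)]})
f = SymbolEntry("area", "function", "real",
                {"params": [("w", "real"), ("h", "real")]})


def element_count(entry: SymbolEntry) -> int:
    """Number of elements of an array entry, from its (low, high) bounds."""
    count = 1
    for low, high in entry.attrs["dims"]:
        count *= high - low + 1
    return count


print(element_count(a))  # 100
```

Keeping the common fields in the header lets every phase read the name, kind, and type the same way, while only the code that understands a given kind looks inside `attrs`.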
7Type of Data in Symbol Table
- The entry for the array A below would require a complete description of its type, including the type of each of its elements.
- A: array (1..100) of mytype
- mytype is a record
- username: char string
- emailaddress: char string
- acctdetails: array (1..5) of integer
- usage: pointer to array (1..12) of reals
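A complete type description of this kind is naturally a small tree of type descriptors. The following is a minimal sketch under stated assumptions: the class names (`Basic`, `Array`, `Pointer`, `Record`) and the `describe` helper are invented for illustration, and the bounds 1..5 and 1..12 are a reading of the slide text rather than something it states unambiguously.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Basic:
    name: str                        # e.g. "integer", "real", "char string"


@dataclass
class Array:
    low: int
    high: int
    elem: "Type"                     # type of each element


@dataclass
class Pointer:
    target: "Type"


@dataclass
class Record:
    fields: List[Tuple[str, "Type"]]


Type = Union[Basic, Array, Pointer, Record]

mytype = Record([
    ("username", Basic("char string")),
    ("emailaddress", Basic("char string")),
    ("acctdetails", Array(1, 5, Basic("integer"))),
    ("usage", Pointer(Array(1, 12, Basic("real")))),   # bounds assumed
])
A = Array(1, 100, mytype)


def describe(t) -> str:
    """Walk the descriptor tree and print the full type."""
    if isinstance(t, Basic):
        return t.name
    if isinstance(t, Array):
        return f"array [{t.low}..{t.high}] of {describe(t.elem)}"
    if isinstance(t, Pointer):
        return f"pointer to {describe(t.target)}"
    fields = "; ".join(f"{n}: {describe(ft)}" for n, ft in t.fields)
    return f"record {fields} end"


print(describe(A))
```

The symbol table entry for A would then hold a reference to the root descriptor, so the element type is described once and shared.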
8Implementation of Symbol Tables
- Simple implementations, e.g. as an array, are too inefficient in practice.
- The most common approach is to use open hashing.
- The hash function is based on the identifier.
- A number of good functions are known.
- A symbol table entry usually points to another structure that saves the character strings (names of variables, character constants).
- This enables the same size for each entry.
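The points above can be sketched as follows, taking "open hashing" to mean an array of buckets where each bucket chains the entries that hash to it. This is an illustrative sketch, not a reference implementation: the class name, the table size 211, the multiplier 31, and the `name_index` field are all assumptions.

```python
class SymbolTable:
    """Sketch of open hashing: an array of buckets, each a chain of entries."""

    def __init__(self, size=211):
        self.buckets = [[] for _ in range(size)]
        self.strings = []            # separate pool that holds the names

    def _hash(self, name):
        h = 0
        for ch in name:
            h = (h * 31 + ord(ch)) % len(self.buckets)
        return h

    def lookup(self, name):
        """Return the entry for name, or None if it is not in the table."""
        for entry in self.buckets[self._hash(name)]:
            if self.strings[entry["name_index"]] == name:
                return entry
        return None

    def insert(self, name, **info):
        """Create the entry if needed, then add or update its information."""
        entry = self.lookup(name)
        if entry is None:
            self.strings.append(name)
            # The entry stores only an index into the string pool, so the
            # length of the name does not affect the size of the entry.
            entry = {"name_index": len(self.strings) - 1}
            self.buckets[self._hash(name)].append(entry)
        entry.update(info)
        return entry


table = SymbolTable()
table.insert("count", data_type="integer")
table.insert("rate", data_type="real")
print(table.lookup("rate")["data_type"])  # real
```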
9Role of Hash Function
- The symbol table is heavily accessed.
- Add or delete entries, modify or augment information associated with an entry, find an entry and enter a reference to it into the IR or another structure.
- A good choice of hash function is critical in practice.
- Goal: a reasonably uniform distribution of entries, with not too many unused indices.
- A large table is more efficient than a small table, but takes more space.
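To make the distribution goal concrete, the sketch below counts how many table indices are actually used by two hash functions on the same set of names. Both functions, the table size 61, and the generated names are assumptions chosen for illustration; the slides do not name a particular hash function.

```python
def length_hash(name, size):
    """Poor choice: many identifiers share the same length."""
    return len(name) % size


def poly_hash(name, size):
    """Polynomial hash over the characters (multiplier 31 is an assumption)."""
    h = 0
    for ch in name:
        h = (h * 31 + ord(ch)) % size
    return h


def used_buckets(hash_fn, names, size):
    """Count how many table indices receive at least one name."""
    return len({hash_fn(n, size) for n in names})


names = [f"var{i}" for i in range(50)]   # var0 .. var49
size = 61
print(used_buckets(length_hash, names, size),
      used_buckets(poly_hash, names, size))
```

With the length-based function all fifty names land in just two buckets (lengths 4 and 5), so lookups degrade to long chain searches while most of the table stays unused.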
10Scope
- A (syntactically defined) region within a program within which variables may be declared.
- E.g. procedure, statement block.
- Scoping units may be nested in some languages.
11Scope of Variable Declaration
- Static scope
- Scope is determined from nesting relationships in the source program, e.g. Pascal.
- Dynamic scope
- Scope is determined by run-time calling relationships, e.g. early dialects of Lisp.
12Lexical vs. Dynamic Scoping
This program calls show twice: once directly and once through small. What is the output?

    program whichone (input, output);
    var r : real;
    procedure show;
      begin write (r : 5 : 3) end;
    procedure small;
      var r : real;
      begin r := 0.125; show end;
    begin
      r := 0.25;
      show; small; writeln
    end.
With lexical scoping, it is 0.250 0.250.
With dynamic scoping, it is 0.250 0.125.
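The two answers can be reproduced with a small simulation of the program above. This is a sketch under assumptions, not how a Pascal compiler works internally: the function `run`, the explicit `stack` of frames, and the `static_parent` table are invented here to model the two lookup rules.

```python
def run(dynamic):
    """Simulate program whichone under lexical or dynamic scoping."""
    output = []
    # Each frame is (procedure name, local variables); the first is the program.
    stack = [("whichone", {"r": None})]
    # Static nesting: show and small are both declared directly in whichone.
    static_parent = {"show": "whichone", "small": "whichone", "whichone": None}

    def find(name):
        if dynamic:
            # Search the call stack from the most recent activation outward.
            for _, frame in reversed(stack):
                if name in frame:
                    return frame
        else:
            # Follow the static nesting chain of the running procedure.
            proc = stack[-1][0]
            while proc is not None:
                frame = next(f for p, f in reversed(stack) if p == proc)
                if name in frame:
                    return frame
                proc = static_parent[proc]
        raise NameError(name)

    def show():
        stack.append(("show", {}))
        output.append(f"{find('r')['r']:5.3f}")   # like write(r : 5 : 3)
        stack.pop()

    def small():
        stack.append(("small", {"r": None}))
        find("r")["r"] = 0.125                    # assigns small's own r
        show()
        stack.pop()

    find("r")["r"] = 0.25                         # main program body
    show()
    small()
    return " ".join(output)


print(run(dynamic=False))  # 0.250 0.250
print(run(dynamic=True))   # 0.250 0.125
```

Under lexical scoping, show always resolves r through its static parent whichone. Under dynamic scoping, the call from small finds small's r first on the call stack.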
15The Procedure as a Name Space
- Details of scoping rules are language dependent.
- C has global, static, local, and block scopes (Fortran-like).
- Blocks can be nested, procedures cannot.
- Scheme has global, procedure-wide, and nested scopes.
- Procedure scope (typically) contains formal parameters.
Lexical scoping can be determined at compile
time.
16Lifetime of Variable
- Lifetime is the period during execution of a program from when a given variable first becomes visible to when it is last visible.
- Also called extent.
- The lifetime of a global variable covers the entire execution, although the variable may be temporarily superseded (hidden) in some places.
- The lifetime of a local or automatic variable is usually one activation of the program unit within which it is declared.
- Some languages permit variables to be local to blocks also.
17Lifetime and Scope Examples
- In C, the scope of an automatic variable extends from the point where it is declared in a procedure or block to the end of that program unit.
- In PL/I, the scope encompasses the entire relevant program unit.
- In Pascal, a variable in the outermost scope is visible everywhere in the program except in a routine where another variable with the same name is declared (and in the routines that routine contains).
- Fortran has common blocks: static memory that is visible in all routines where it is declared. Its scope encompasses all those routines.
- Local static variables in C and SAVEd variables in Fortran have global lifetime but are only visible in certain regions, so their scope may be a file or a procedure.
- Dynamic variables have a lifetime that extends from their point of allocation to the point(s) of their destruction.
18Block Scoping of Variables
- Nested blocks
- If variable x is used in block B and declared in B, then that declaration is used.
- Else if x is declared in a block C surrounding block B, but not in any other block within C that surrounds B, use the declaration of x in block C.
Which x? Which y?
In general: at point p, which declaration of x is current? How can the compiler keep track of this?
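The nested-block rule amounts to searching outward from the innermost block until a declaration is found. A minimal sketch, assuming one table per block with a link to the enclosing block (the names `Scope`, `declare`, and `resolve` are hypothetical):

```python
class Scope:
    """One symbol table per block, linked to the table of the enclosing block."""

    def __init__(self, parent=None):
        self.parent = parent
        self.symbols = {}

    def declare(self, name, info):
        self.symbols[name] = info

    def resolve(self, name):
        """Return the innermost visible declaration of name, or None."""
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent          # try the surrounding block
        return None                       # undeclared


block_c = Scope()                         # outer block C
block_c.declare("x", "x declared in C")
block_c.declare("y", "y declared in C")
block_b = Scope(parent=block_c)           # block B nested inside C
block_b.declare("x", "x declared in B")

print(block_b.resolve("x"))  # x declared in B
print(block_b.resolve("y"))  # y declared in C
```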
19Scoping
- The compiler has to be sure that it is accessing the right variable.
- Care must be taken that the object code uses data only within its scope.
- Two variables with the same name and different scopes must both be entered into the symbol table.
- We will have to organize the symbol table to take account of the scope of data objects, and make sure that different variables with the same name are assigned distinct storage locations.
20Lexically-scoped Symbol Tables
- Many implementation schemes have been proposed.
- Maintain a separate symbol table for each scope.
- A tree of local symbol tables, with the active ones on a stack.
- If a new scope is entered, create a new symbol table.
- Or
- Number procedures (scoping units) during parsing.
- Compute these numbers when units are encountered.
- Save the scope as part of each entry.
- The idea can also be used for handling blocks if they are scoping units.
- A standard interface
- insert(name, level) creates a record for name at level
- lookup(name, level) returns a pointer or index
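The insert/lookup interface with the scope saved in each entry might look like the sketch below. It makes several assumptions that the slides do not state: `level` is treated as a nesting depth, scopes are entered and left in properly nested order, and `close_scope` is a hypothetical helper added so that records of a finished scope stop being visible.

```python
class LeveledSymbolTable:
    """Single table; every record carries the level at which it was declared."""

    def __init__(self):
        self.entries = {}   # name -> list of records, innermost level last

    def insert(self, name, level):
        """Create a record for name at level and return it."""
        record = {"name": name, "level": level}
        self.entries.setdefault(name, []).append(record)
        return record

    def lookup(self, name, level):
        """Return the innermost record for name visible at level, or None."""
        for record in reversed(self.entries.get(name, [])):
            if record["level"] <= level:
                return record
        return None

    def close_scope(self, level):
        """Hypothetical helper: drop all records declared at level."""
        for records in self.entries.values():
            while records and records[-1]["level"] == level:
                records.pop()


table = LeveledSymbolTable()
table.insert("r", 0)                      # global r
table.insert("r", 1)                      # r local to a level-1 procedure
print(table.lookup("r", 1)["level"])      # 1
table.close_scope(1)                      # leaving the procedure
print(table.lookup("r", 1)["level"])      # 0
```

Because both records for r coexist in the table, each can carry its own storage location, which keeps same-named variables distinct.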
21Use of Symbol Table
- The symbol table can be used to support layout of data in memory.
- Save the memory requirements for each object.
- Record the memory location of each object for code generation,
- as a register containing a base address plus an offset from that base address,
- or as the symbolic register that the variable is assigned to.
- While doing so, assign memory to variables in the order encountered,
- or sort on variable size (to ensure that as many variables as possible get small offsets),
- or sort to maximize alignment of individual objects on appropriate boundaries.
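The three layout strategies can be compared with a short sketch. It assumes that every object must be aligned to a multiple of its own size and that sizes are powers of two; the function `layout`, its `order` parameter, and the example variables are hypothetical.

```python
def layout(variables, order="declared"):
    """Assign offsets from a base address.

    variables is a list of (name, size in bytes). Returns (offsets, total size).
    Assumes each object is aligned to a multiple of its own size.
    """
    if order == "small_first":
        variables = sorted(variables, key=lambda v: v[1])
    elif order == "large_first":
        variables = sorted(variables, key=lambda v: v[1], reverse=True)
    offsets, offset = {}, 0
    for name, size in variables:
        offset = (offset + size - 1) // size * size   # round up to alignment
        offsets[name] = offset
        offset += size
    return offsets, offset


variables = [("c", 1), ("d", 8), ("i", 4), ("b", 1)]
for order in ("declared", "small_first", "large_first"):
    print(order, layout(variables, order))
```

For these four variables the declared order needs 21 bytes because of padding, small-first needs 16 and gives three of the four variables offsets below 8, and large-first needs only 14 since no padding is required.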
22Some Typical Fields in Symbol Table