Title: CS 3240: Languages and Computation
1CS 3240 Languages and Computation
2Personnel
- Instructor: Xiangmin (Jim) Jiao
- Email: jiao_at_cc.gatech.edu
- Office: CCB 253
- Office Hours:
- Tue. 2pm, Wed. 4pm
- Or by appointment
- TA: Lawrence Ibarria
- Email: redark_at_cc.gatech.edu
- Office Hours: Mon. 4pm, CCB Commons Area
3Required Textbooks
- Compiler Construction: Principles and Practice
by Kenneth C. Louden, PWS Publishing, 1997, ISBN
0534939724
- On reserve in library (available soon)
- Introduction to the Theory of Computation, Second
Edition by Michael Sipser, PWS Publishing, 2005,
ISBN 0534950973
- First edition may also be used
4Course Objectives
- Formal languages and compiler concepts
- Understand definitions of regular and
context-free languages and their corresponding
machines
- Understand their computational powers and
limitations
- Understand their applications in compilers
- Front-end of compiler
- Lexical analysis and parsing
- Theory of computation
- Understand Turing machines
- Understand decidability
5Grading
- Homework: 20%
- Mini-project: 25%
- Tests: 30%
- Final: 25%
- Homework will be due in class
- No late homework or assignments without prior
approval of instructor
- Homework should be concise, complete, and precise
- Tests will be in class. Closed book, closed
notes, but one-page cheat-sheet allowed.
6Collaboration Policy
- Students must write solutions to assignments
completely independently
- General discussions are allowed on assignments
among students, but names of collaborators must
be reported
7Resources
- Class webpage
- http://www.cc.gatech.edu/classes/AY2006/cs3240_summer/
- Newsgroup
- git.cc.class.cs3240
- on news.gatech.edu.
8Introduction to Compiler Concepts
9Compilers
- What is a compiler?
- A program that translates an executable program
from source language into target language
- Usually source language is a high-level language,
and target language is object (or machine) code
- Related to interpreters
- Why compilers?
- Programming in machine (or assembly) language is
tedious, error prone, and machine dependent
- Historical note: In 1954, IBM started developing
the FORTRAN language and its compiler
10Why study the theory of compilers?
- Besides the fact that it is required
- Prerequisite for developing advanced compilers,
which continues to be an active area as new computer
architectures emerge
- Useful for developing software tools that parse
computer code or strings
- E.g., editors, debuggers, interpreters,
preprocessors, ...
- Important to understand how compilers work in order to
program more effectively
11How Does a Compiler Work?
- Front End: Analysis of program syntax and
semantics
[Diagram labels: Start, Parser, Scanner (Request Token / Get Token), Semantic Action, Semantic Error, Checking, Intermediate Representation]
12Parts of Compilers
- Analysis (Front End). Focus of this class.
- 1. Lexical Analysis
- 2. Syntax Analysis
- 3. Semantic Analysis
- Synthesis (Back End)
- 4. Code Generation
- 5. Optimization
13The Big Picture
- Parsing: Translating code to rules of grammar.
Building representation of code.
- Scanning: Converting input text into stream of
known objects called tokens. Simplifies parsing
process.
- Grammar dictates syntactic rules of language,
i.e., how a legal sentence could be formed
- Lexical rules of language dictate how a legal word
is formed by concatenating symbols of the alphabet.
14Overall Operation
- Parser is in control of the overall operation
- Demands scanner to produce a token
- Scanner reads input file into token buffer and
forms a token (How?)
- Token is returned to parser
- Parser attempts to match the token (How?)
- Failure: Syntax Error!
- Success:
- Does nothing and returns to get next token, or
- Takes semantic action
15Overall Operation
- Semantic action: look up variable name
- If found: okay
- If not: put in symbol table
- If semantic checks succeed, do code-generation
(How?)
- Continue to get next token
- No more tokens? Done!
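The control flow on these two slides can be sketched as a short loop. This is a minimal sketch under stated assumptions, not the course's implementation: the names `scanner` and `run_parser`, the `(kind, text)` token tuples, and the flat list of expected token kinds (a stand-in for real grammar rules) are all hypothetical.

```python
def scanner(tokens):
    """Hypothetical scanner: hands over one (kind, text) token per request."""
    for tok in tokens:
        yield tok


def run_parser(tokens, expected_kinds):
    """The parser is in control: it demands tokens and matches them one by one.

    expected_kinds is a flat list of token kinds, used here only as a
    stand-in for grammar rules.
    """
    symbol_table = {}
    source = scanner(tokens)
    for expected in expected_kinds:
        kind, text = next(source, ("EOF", ""))   # demand a token from the scanner
        if kind != expected:                     # failure: syntax error
            raise SyntaxError(f"expected {expected}, got {kind}")
        if kind == "VAR":                        # success: take a semantic action
            # look up the name; if it is not found, put it in the symbol table
            symbol_table.setdefault(text, {"type": None})
        # otherwise: do nothing and go back for the next token
    if next(source, None) is not None:           # tokens left over after the match
        raise SyntaxError("unexpected tokens after end of input")
    return symbol_table                          # no more tokens: done
```

A call such as `run_parser([("MAIN", "main"), ("VAR", "a")], ["MAIN", "VAR"])` returns a table containing `a`; a mismatched kind raises `SyntaxError`.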
16Scanning/Tokenization
Input File
Token Buffer
- What does the Token Buffer contain?
- Token being identified
- Why a two-way street?
- Characters can be read
- and unread
- Termination of a token
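17Example
- Input: main()
- The token buffer fills one character at a time as the scanner reads: m, ma, mai, main
- The scanner then reads "(", which cannot be part of the word, so the token ends
- "(" is unread (returned to the input), leaving main in the token buffer
- Result: Keyword main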
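The read/unread behavior shown in this example can be sketched as follows. This is an illustrative sketch only: the class `Scanner`, its method names, and the `KEYWORDS` and `PUNCT` tables are assumptions, with token names borrowed from the grammar used later in the slides.

```python
# Hypothetical lookup tables; token names follow the grammar slides.
KEYWORDS = {"main": "MAIN", "int": "INT", "float": "FLOAT"}
PUNCT = {"(": "OPENPAR", ")": "CLOSEPAR", ",": "COMMA"}


class Scanner:
    """Minimal scanner with a token buffer and one-character unread."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def read(self):
        """Return the next character, or '' at end of input."""
        if self.pos >= len(self.text):
            return ""
        ch = self.text[self.pos]
        self.pos += 1
        return ch

    def unread(self):
        """Give the last character back so the next token can start with it."""
        self.pos -= 1

    def next_token(self):
        ch = self.read()
        while ch.isspace():              # skip blanks between tokens
            ch = self.read()
        if ch == "":
            return ("EOF", "")
        if ch.isalpha():
            buffer = ch                  # token buffer: the token being identified
            ch = self.read()
            while ch.isalnum():
                buffer += ch
                ch = self.read()
            if ch != "":
                self.unread()            # the terminating character is not ours
            return (KEYWORDS.get(buffer, "VAR"), buffer)
        return (PUNCT.get(ch, "UNKNOWN"), ch)
```

Scanning `main()` with this sketch yields `("MAIN", "main")`, then `("OPENPAR", "(")`, then `("CLOSEPAR", ")")`: the "(" that terminated the word is read once, unread, and then read again as its own token.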
23Parser
- Translating code to rules of a grammar
- Control the overall operation
- Demands scanner to produce a token
- Failure: Syntax Error!
- Success:
- Does nothing and returns to get next token, or
- Takes semantic action
24Grammar Rules
- <C-PROG> → MAIN OPENPAR <PARAMS> CLOSEPAR <MAIN-BODY>
- <PARAMS> → NULL
- <PARAMS> → VAR <VAR-LIST>
- <VAR-LIST> → , VAR <VAR-LIST>
- <VAR-LIST> → NULL
- <MAIN-BODY> → CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE
- <DECL-STMT> → <TYPE> VAR <VAR-LIST> ;
- <ASSIGN-STMT> → VAR = <EXPR> ;
- <EXPR> → VAR
- <EXPR> → VAR <OP> <EXPR>
- <OP> → +
- <OP> → -
- <TYPE> → INT
- <TYPE> → FLOAT
25Demo
- Input: main() { int a,b; a = b; }
- Parser to Scanner: "Please, get me the next token"
- Scanner reads characters into the token buffer one at a time: m, ma, mai, main
- Scanner reads "(", then unreads it because it ends the token
- Scanner returns to Parser: Token main
- Parser: "I recognize this"
35Parsing (Matching)
- Start matching using a rule
- When a match takes place at a certain position, move
further (get next token and repeat)
- If expansion needs to be done, choose the appropriate
rule (How to decide which rule to choose?)
- If no rule is found, declare an error
- If several rules are found, the grammar (set of
rules) is ambiguous
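One common way to carry out these steps is a recursive-descent parser that picks a rule by looking at the next token. The sketch below is an assumption-laden illustration, not the method the course prescribes: the class `Parser`, the helper `parses`, and the token names `SEMI`, `ASSIGN`, `PLUS`, and `MINUS` are hypothetical, and it assumes statements end with ";" and assignment uses "=". The two `<EXPR>` rules both start with VAR, so the sketch matches VAR first and only then decides whether an operator follows.

```python
class Parser:
    """Recursive-descent sketch: one method per nonterminal."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["EOF"]
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def match(self, kind):
        """Match one token or declare a syntax error."""
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, got {self.peek()}")
        self.pos += 1

    def c_prog(self):
        self.match("MAIN")
        self.match("OPENPAR")
        self.params()
        self.match("CLOSEPAR")
        self.main_body()
        self.match("EOF")

    def params(self):
        if self.peek() == "VAR":          # <PARAMS> -> VAR <VAR-LIST>
            self.match("VAR")
            self.var_list()
        # otherwise <PARAMS> -> NULL: consume nothing

    def var_list(self):
        if self.peek() == "COMMA":        # <VAR-LIST> -> , VAR <VAR-LIST>
            self.match("COMMA")
            self.match("VAR")
            self.var_list()
        # otherwise <VAR-LIST> -> NULL

    def main_body(self):
        self.match("CURLYOPEN")
        self.decl_stmt()
        self.assign_stmt()
        self.match("CURLYCLOSE")

    def decl_stmt(self):
        if self.peek() not in ("INT", "FLOAT"):   # no <TYPE> rule applies
            raise SyntaxError(f"expected a type, got {self.peek()}")
        self.match(self.peek())
        self.match("VAR")
        self.var_list()
        self.match("SEMI")

    def assign_stmt(self):
        self.match("VAR")
        self.match("ASSIGN")
        self.expr()
        self.match("SEMI")

    def expr(self):
        self.match("VAR")
        if self.peek() in ("PLUS", "MINUS"):      # <EXPR> -> VAR <OP> <EXPR>
            self.match(self.peek())
            self.expr()


def parses(tokens):
    """Return True if the token kinds form a valid program, else False."""
    try:
        Parser(tokens).c_prog()
        return True
    except SyntaxError:
        return False
```

With the token kinds for `main() { int a,b; a = b; }`, `parses` returns `True`; dropping one `SEMI` makes it return `False`.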
36Scanning and Parsing Combined
- Input: main() { int a,b; a = b; }
- Parser to Scanner: "Please, get me the next token"
- Token MAIN: Parser starts with <C-PROG> → MAIN OPENPAR <PARAMS> CLOSEPAR <MAIN-BODY> and matches MAIN
- Parser to Scanner: "Please, get me the next token"
- Token OPENPAR: matches OPENPAR in the <C-PROG> rule
- Token CLOSEPAR: Parser must expand <PARAMS>; CLOSEPAR cannot start VAR <VAR-LIST>, so it chooses <PARAMS> → NULL
- Token CLOSEPAR: back in the <C-PROG> rule, matches CLOSEPAR
43Scanning and Parsing Combined
- Input: main() { int a,b; a = b; }
- Token CURLYOPEN: Parser expands <MAIN-BODY> → CURLYOPEN <DECL-STMT> <ASSIGN-STMT> CURLYCLOSE and matches CURLYOPEN
- Token INT: Parser expands <DECL-STMT> → <TYPE> VAR <VAR-LIST> ; and then <TYPE> → INT, which matches INT
47Scanning and Parsing Combined
- Input: main() { int a,b; a = b; }
- Token VAR (a): matches VAR in <DECL-STMT> → <TYPE> VAR <VAR-LIST> ;
- Token COMMA (','): Parser expands <VAR-LIST> → , VAR <VAR-LIST> and matches the comma
- Token VAR (b): matches VAR in the <VAR-LIST> rule
- Token ';': Parser expands the remaining <VAR-LIST> with <VAR-LIST> → NULL
- Token ';': back in the <DECL-STMT> rule, matches ';' and completes the declaration
56Scanning and Parsing Combined
- Input: main() { int a,b; a = b; }
- Token VAR (a): Parser expands <ASSIGN-STMT> → VAR = <EXPR> ; and matches VAR
- Token '=': matches '=' in the <ASSIGN-STMT> rule
- Token VAR (b): Parser expands <EXPR> → VAR and matches VAR
- Token ';': back in the <ASSIGN-STMT> rule, matches ';'
- Token CURLYCLOSE: back in the <MAIN-BODY> rule, matches CURLYCLOSE; <C-PROG> is complete
64What Is Happening?
- During/after parsing?
- Tokens get gobbled
- Symbol tables
- Variables have attributes
- Declarations attach attributes to variables
- Semantic actions
- What are semantic actions?
- Semantic checks
65Symbol Table
- int a,b;
- Declares a and b
- Within current scope
- Type: integer
- Use of a and b is now legal
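A symbol table with scopes can be sketched as a stack of dictionaries. This is a minimal sketch; the class `SymbolTable` and its method names are hypothetical, and treating an undeclared name as an error is an assumption made for this illustration.

```python
class SymbolTable:
    """Stack of scopes; each scope maps a name to its attributes."""

    def __init__(self):
        self.scopes = [{}]                # start with one outermost scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_name):
        """Enter a declaration into the current scope."""
        current = self.scopes[-1]
        if name in current:
            raise NameError(f"{name} is already declared in this scope")
        current[name] = {"type": type_name, "scope": len(self.scopes) - 1}

    def lookup(self, name):
        """Search from the innermost scope outwards."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"{name} is used before it is declared")
```

After `declare("a", "int")` and `declare("b", "int")`, a `lookup("a")` succeeds and reports type `int`, while `lookup("c")` raises `NameError`.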
66Typical Semantic Actions
- Enter variable declaration into symbol table
- Look up variables in symbol table
- Do binding of looked-up variables (scoping rules,
etc.)
- Do type checking for compatibility
- Keep the semantic context of processing
- a + b + c → t1 = a + b
- t2 = t1 + c
Semantic Context
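The temporaries t1 and t2 in the example can be produced by a small left-to-right fold. This sketch assumes the operator in the slide's example is + (the extracted text does not show it), and the function name `three_address` is hypothetical.

```python
def three_address(operands, op="+"):
    """Fold a chain like a + b + c into temporaries, left to right.

    Returns the generated instructions and the name holding the final
    value, which is the semantic context passed on to later steps.
    """
    code = []
    left = operands[0]
    for n, right in enumerate(operands[1:], start=1):
        temp = f"t{n}"                          # fresh temporary for this step
        code.append(f"{temp} = {left} {op} {right}")
        left = temp                             # the result feeds the next step
    return code, left
```

For `three_address(["a", "b", "c"])` the instructions are `t1 = a + b` and `t2 = t1 + c`, and the final value lives in `t2`.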
67How Are Semantic Actions Called?
- Action symbols embedded in the grammar
- Each action symbol represents a semantic
procedure
- These procedures do things and/or return values
- Semantic procedures are called by parser at
appropriate places during parsing
- Semantic stack implements/stores semantic
records
68Semantic Actions
- <decl-stmt> → <type> put-type <var-list> do-decl
- <type> → int | float
- <var-list> → <var> add-decl , <var-list>
- <var-list> → <var> add-decl
- <var> → ID proc-decl
- put-type: puts given type on semantic stack
- proc-decl: builds decl record for var on stack
- add-decl: builds decl-chain
- do-decl: traverses chain on semantic stack
using backwards pointers, entering each var
into symbol table
[Figure: decl records for id1, id2, and id3 (fields: Name, Type, Scope) chained on the semantic stack together with the type; do-decl enters each of them into the symbol table]
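The four actions can be sketched with a Python list as the semantic stack. This is one possible reading of the slide, not a definitive implementation: the record layout (dictionaries with `kind`, `name`, `prev`) and the function signatures are assumptions.

```python
def put_type(stack, type_name):
    """put-type: push the declared type onto the semantic stack."""
    stack.append({"kind": "type", "type": type_name})


def proc_decl(stack, name):
    """proc-decl: build a decl record for one variable on the stack."""
    stack.append({"kind": "decl", "name": name, "prev": None})


def add_decl(stack):
    """add-decl: link the newest decl record to the chain below it."""
    record = stack.pop()
    if stack and stack[-1]["kind"] == "decl":
        record["prev"] = stack.pop()      # backwards pointer to earlier records
    stack.append(record)


def do_decl(stack, symbol_table, scope):
    """do-decl: walk the chain backwards and enter each variable."""
    record = stack.pop()                  # head of the decl chain
    type_name = stack.pop()["type"]       # the type pushed by put-type
    while record is not None:
        symbol_table[record["name"]] = {"type": type_name, "scope": scope}
        record = record["prev"]
```

For `int a, b` the parser would call `put_type`, then `proc_decl` and `add_decl` for `a`, the same pair for `b`, and finally `do_decl`, leaving the stack empty and both names in the symbol table with type `int`.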
69Semantic Actions
- What else can semantic actions do in addition to
storing and looking up names in a symbol table?
- Two types of semantic actions:
- Checking (binding, type compatibility, scoping,
etc.)
- Translation (generate temporary values, propagate
them to keep semantic context).
70Full Compiler Structure
- Most compilers have two passes
[Diagram labels: Start, Scanner, Parser, Semantic Action, Semantic Error, Code Generation, CODE]
71Summary
- Front end of compiler: scanner and parser
- Translation takes place in back end
- Scanner, parser and code generator are automated
- How? We will answer this question in this class