Title: PL/0 and the 655 Project
2 Specification of Syntax
- PL/0
- How the nesting of expression, term, and factor in PL/0 works together to generate code
- How the nesting of recognition routines has the effect of static scoping
- Project questions and answers
- UML Use Case Modeling
- General Problem of Describing Syntax
- Recursive Descent Parsing
- Attribute Grammars
- Describing the Meanings of Programs: Dynamic Semantics
3 Simple Syntax Processing Prerequisites
- 321, 560, 625 cover the basics of processing simple programs
- 655 aims to advance and unify that understanding
- Wirth's approach to describing syntax: graphs as flow charts for programming a parser (recursive descent)
- Textbook Chapters
- Following slides should be a review
4 Syntax, semantics, language
- Syntax - the form or structure of the expressions, statements, and program units
- Semantics - the meaning of the expressions, statements, and program units
- Sentence - string of characters over some alphabet (maybe what are usually words)
- Language - set of sentences
- Lexeme - lowest level syntactic unit of a language (e.g., *, sum, begin)
- Token - category of lexemes (e.g., identifier)
5 Language (following Wirth)
- L = L(T, N, P, S)
- Vocabulary T of terminal symbols
- Set N of non-terminal symbols (grammatical categories)
- Set P of productions (syntactical rules)
- Symbol S (from N) called the start symbol
- Language is the set of sequences of terminal symbols that can be generated, directly or indirectly, from S (that's his points 3 and 4)
6 Backus Normal Form (1959)
- Invented by John Backus to describe Algol 58
- BNF is equivalent to context-free grammars
- A metalanguage is a language used to describe another language.
- In BNF, abstractions are used to represent classes of syntactic structures; they act like syntactic variables (also called nonterminal symbols)
- e.g. <while_stmt> -> while <logic_expr> do <stmt>
- This is a rule; it describes the structure of a while statement
7 Syntax rules
- A rule has a left-hand side (LHS) and a right-hand side (RHS), and consists of terminal and non-terminal symbols
- A grammar is a finite nonempty set of rules
- An abstraction (or non-terminal symbol) can have more than one RHS
- <stmt> -> <single_stmt> | begin <stmt_list> end
- Syntactic lists are described in BNF using recursion
- <ident_list> -> ident | ident, <ident_list>
- A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)
8 An example grammar
- <program> -> <stmts>
- <stmts> -> <stmt> | <stmt> ; <stmts>
- <stmt> -> <var> = <expr>
- <var> -> a | b | c | d
- <expr> -> <term> + <term> | <term> - <term>
- <term> -> <var> | const
9 An example derivation
- <program> => <stmts>
- => <stmt>
- => <var> = <expr>
- => a = <expr>
- => a = <term> + <term>
- => a = <var> + <term>
- => a = b + <term>
- => a = b + const
10 Derivation explanation
- Every string of symbols in the derivation is a sentential form
- A sentence is a sentential form that has only terminal symbols
- A leftmost derivation is one in which the leftmost non-terminal in each sentential form is the one that is expanded
- A derivation may be neither leftmost nor rightmost
- Parse tree is a hierarchical representation of a derivation
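The leftmost derivation of the example sentence can be replayed mechanically. The following is a minimal Python sketch, not part of the original slides: the dictionary layout, the function names, and the idea of passing a list of alternative indices are all assumptions made for illustration.

```python
# Example grammar from the slides; the dict-of-alternatives layout is an assumption.
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["const"]],
}


def leftmost_derivation(start, choices):
    """Expand the leftmost non-terminal once per entry in choices.

    choices[k] is the index of the alternative (RHS) applied at step k.
    Returns every sentential form produced, beginning with the start symbol.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Locate the leftmost non-terminal in the current sentential form.
        pos = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form = form[:pos] + GRAMMAR[form[pos]][choice] + form[pos + 1:]
        steps.append(" ".join(form))
    return steps


def is_sentence(form_text):
    """A sentence is a sentential form containing only terminal symbols."""
    return all(sym not in GRAMMAR for sym in form_text.split())


# The eight rule applications that derive: a = b + const
STEPS = leftmost_derivation("<program>", [0, 0, 0, 0, 0, 0, 1, 1])
```

Printing STEPS one per line gives the sentential forms of the derivation in order; only the last one passes is_sentence.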
11 Parsing: another view
12 Static Semantics
- Other information about the language not specified with the BNF
- Identifier length
- Maximum integer value
- Other restrictions on your compiler
- Symbol table size
- Code array size
- Specify these in your description of your language processor
- Recognize the restrictions you've implied
13 Unstated Assumptions
- Input program read top-to-bottom, left-to-right, with no backtracking
- Things declared before they are used
- No redefining at same level
- Inner declarations hidden by nesting
- Inner can locally hide outer declarations
14 Ambiguity / Right Recursive
- A grammar is ambiguous iff (if and only if) it generates a sentential form that has two or more distinct parse trees
- If we use the parse tree to indicate precedence levels of the operators, we cannot have ambiguity
- Operator associativity can also be indicated by a grammar
- <expr> -> <expr> + <expr> | const (ambiguous)
- <expr> -> <expr> + const | const (unambiguous)
- Left recursive (left associative) (recursive descent will require right recursive)
15 Extended BNF (abbreviations)
- Optional parts are placed in brackets ([ ])
- <proc_call> -> ident [ ( <expr_list> ) ]
- Put alternative parts of RHSs in parentheses and separate them with vertical bars
- <term> -> <term> (+ | -) const
- Put repetitions (0 or more) in braces ({ })
- <ident> -> letter {letter | digit}
16 BNF / EBNF
- BNF
- <expr> -> <expr> + <term>
-         | <expr> - <term>
-         | <term>
- <term> -> <term> * <factor>
-         | <term> / <factor>
-         | <factor>
- EBNF
- <expr> -> <term> {(+ | -) <term>}
- <term> -> <factor> {(* | /) <factor>}
17 Syntax Graphs
- Put the terminals in circles or ellipses and put the non-terminals in rectangles
- Connect with lines with arrowheads
- e.g., Pascal type declarations
18 Wirth's Rules
- B1: Reduce system of syntax graphs to a few of reasonable size
- B2: Translate each graph to a procedure according to subsequent rules
- B3: Sequence of elements translates to
- begin T(S1); T(S2); ...; T(Sn) end or T(S1); T(S2); ...; T(Sn)
- procedure TSx() begin TS1(); getsym(); TS2(); getsym() end
19 <term> -> <factor> {(* | /) <factor>}
    { Pascal comment }
    begin
      factor;
      while sym in [times, slash] do
      begin
        mulop := sym; getsym; factor;
        gen_proper_op   { pseudocode: emit the multiply or divide instruction chosen by mulop }
      end
    end
20 <term> -> <factor> {(* | /) <factor>}
    void term() {
        factor();                          /* parse the first factor */
        while (next_token == aster_code ||
               next_token == slash_code) {
            lexical();                     /* get next token */
            factor();                      /* parse the next factor */
        }
    }
21 Recursive Descent Parsing
- Parsing - constructing a parse / derivation tree for a given input string
- Lexical analyzer is called by the parser
- A recursive descent parser traces out a parse tree in top-down order; it is a top-down parser
- Each non-terminal in the grammar has a subprogram associated with it; the subprogram parses all sentential forms that the nonterminal can generate
- The recursive descent parsing subprograms are built directly from the grammar rules
- Recursive descent parsers cannot be built from left-recursive grammars
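The one-subprogram-per-non-terminal idea can be tried out on the EBNF rules for expressions. This is a minimal Python sketch under stated assumptions: the Parser class, the tokenize helper, the factor rule (identifier, number, or parenthesized expression), and the postfix output list are all illustrative choices, not the PL/0 compiler itself.

```python
import re


def tokenize(text):
    # Numbers, identifiers, and single-character operators; whitespace is skipped.
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", text)


class Parser:
    """One method per non-terminal; each emits postfix code as it recognizes input."""

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0
        self.code = []  # emitted operands and operators, in postfix order

    @property
    def sym(self):
        # Current symbol, or None at end of input.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def getsym(self):
        self.pos += 1

    def expression(self):  # <expr> -> <term> {(+ | -) <term>}
        self.term()
        while self.sym in ("+", "-"):
            addop = self.sym
            self.getsym()
            self.term()
            self.code.append(addop)  # operator is emitted after both operands

    def term(self):  # <term> -> <factor> {(* | /) <factor>}
        self.factor()
        while self.sym in ("*", "/"):
            mulop = self.sym
            self.getsym()
            self.factor()
            self.code.append(mulop)

    def factor(self):  # assumed rule: ident | number | ( <expr> )
        if self.sym == "(":
            self.getsym()
            self.expression()
            if self.sym != ")":
                raise SyntaxError("expected )")
            self.getsym()
        elif self.sym is not None and re.fullmatch(r"\w+", self.sym):
            self.code.append(self.sym)
            self.getsym()
        else:
            raise SyntaxError(f"unexpected symbol {self.sym!r}")


def compile_expression(text):
    parser = Parser(text)
    parser.expression()
    if parser.sym is not None:
        raise SyntaxError(f"unexpected symbol {parser.sym!r}")
    return parser.code
```

Because term is called from inside expression, multiplication binds tighter than addition without any explicit precedence table, and the while loops make both operator classes left associative without left recursion.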
22 PL/0 Program Structure
- Initialize keyword arrays, operator symbols, mnemonics, and so forth
- Initialize variables controlling scanning (getting the individual characters), lexical analysis (forming tokens), and parsing
- Call the <block> recognizing routine
- Note that block ends with a call to listcode
- Call the virtual machine interpreter
- Machine code kept in an array between phases
- Need to add to the output capabilities of PL/0
23 Blocks and Static Scoping
- Blocks are different from sequences of statements or compound statements
- Blocks can include declarations
- Sort of like a single-use subprogram, used and defined here
- Where can blocks appear?
- Ada: almost anywhere a statement could be
- Pascal: only as bodies of procedures
- Java: inner classes
24 Data Specific to a Procedure
- To be able to return from call
- Program address of its call (return address)
- Address of data segment of caller
- Keep in data segment of procedure as
- RA (return address), DL (dynamic link)
- Location of variables
- Relative address only (since memory dynamic)
- Displacement off base address of appropriate data segment (locally B register or by descending chain of static links)
- What does static scoping mean here?
25 Example of Static Scoping
    void a
        local variable one
        void b
            local variable two
            void c
                local variable three
                // beginning of code for c
                reference one, two
                call b
                // end of c
            // beginning of code for b
            reference one, two
            call c
            // end of b
        // beginning of code for a
        call b
        // end of a
- a -> b -> c -> b
26 Example of Static Scoping
- In a, one is local
- In b, two is local
- In b, one is a single static level out
- In c, three is local
- In c, two is a single static level out
- In c, one is two static levels out
- Then c calls b
- In b, one is still a single static level out
27 Block Recognition Processing
- Block(level, symbolTableStartingIndex)
- Page 13, left
- <block> -> <const_decl> <var_decl> <proc_decl> <statement_body>
- <proc_decl> -> procedure <name> <block>
- Recognize inner block
- Block(currentLevel + 1, currentSymbolTableIndex)
- Jump around declarations
    tx0 := tx; table[tx0].adr := cx; gen(jmp, 0, 0);
    ...
    code[table[tx0].adr].a := cx;
    table[tx0].adr := cx;
    statement(); gen(opr, 0, 0);
    return
28 Symbol Table and Static Scope
- Variable declaration: storage allocated by incrementing DX (data index) by 1
- Initially DX is 3 to allocate space for the block mark (RA, DL, and SL)
- Symbol table (table)
- enter - enter object into table
- Nested in block, which determines static scoping
- Recursive calls make table act like a stack
- position - find identifier id in table
- Linear search backward
29 Blocks and Scoping
- Nesting of blocks determines scope
- Restoring symbol table pointers makes symbol table work like stack
- Inner definitions lost to outer contexts
- Idea: make symbol table work like a tree (one branch along a tree looks like a stack)
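How the symbol table behaves like a stack, and where the (level difference, displacement) address pairs come from, can be seen in a small model. This is a Python sketch under assumptions: the SymbolTable class, the block helper, and the demo function are hypothetical stand-ins for the Pascal routines enter, position, and block, and only variables are entered (the real compiler also enters constants and procedure names).

```python
class SymbolTable:
    def __init__(self):
        # Entries are (name, level, adr); the end of the list is the top of the stack.
        self.table = []

    def enter(self, name, level, adr):
        self.table.append((name, level, adr))

    def position(self, name):
        # Linear search backward, so the innermost visible declaration wins.
        for index in range(len(self.table) - 1, -1, -1):
            if self.table[index][0] == name:
                return index
        return -1

    def address(self, name, current_level):
        """Return (static level difference, displacement) for a variable reference."""
        index = self.position(name)
        if index < 0:
            raise NameError(name)
        _, level, adr = self.table[index]
        return (current_level - level, adr)


def block(table, level, variables, body):
    """Hypothetical stand-in for the block recognizing routine."""
    tx0 = len(table.table)   # remember the table index on entry
    dx = 3                   # first three slots hold the block mark
    for name in variables:
        table.enter(name, level, dx)
        dx += 1
    body()                   # nested blocks and statements are processed here
    del table.table[tx0:]    # restore the index: inner names are lost to outer contexts


def demo():
    """Model procedures a, b, c from the static scoping example."""
    table = SymbolTable()
    seen = {}

    def body_c():
        seen["c"] = [table.address(n, 2) for n in ("one", "two", "three")]

    def body_b():
        block(table, 2, ["three"], body_c)
        seen["b"] = [table.address(n, 1) for n in ("one", "two")]
        seen["three visible in b"] = table.position("three") >= 0

    def body_a():
        block(table, 1, ["two"], body_b)
        seen["a"] = [table.address("one", 0)]

    block(table, 0, ["one"], body_a)
    return seen
```

The level difference is computed at compile time from the nesting of the recognizing routines, so it stays the same no matter which procedure performs the call at run time.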
30 PL/0 Virtual Machine
- Section 5.10 (page 6 of handout)
- Stack machine: primary data store is stack
- push, pop, insert or retrieve from within
- Operations on top of stack (add, test, etc.)
- Program store: array named code
- Unchanged during interpretation
- I: instruction register
- P: program address register
- Data store: array named S (stack)
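33 Stack of PL/0 Machine (Fig. 5.7)
[Figure: stack of the PL/0 machine. Labels: block mark (DL, RA, SL); dynamic link; static link; data segments holding A local vars, B local vars, C local vars, B local vars; registers B (base) and T (top).]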
35 PL/0 Code Generation
- (page 7)
- Addresses are generated as pairs of numbers indicating the static level difference and the relative displacement within a data segment.
- But how does the compiler figure this out?
- PL/0 code
- Other question: how does PL/0 handle forward references?
36 PL/0 Machine Commands
- LIT - load numbers (literals) onto the stack
- LOD - fetch variable values to top of stack
- STO - store values at variable locations
- CAL - call a subprogram
- INT - allocate storage by incrementing stack pointer (T)
- JMP - transfer of control
- (new program address -> P)
- JPC - conditional transfer of control
- OPR - arithmetic and relational operators
37 More on PL/0 Code Generation
    fct = (lit, opr, lod, sto, cal, int, jmp, jpc);
    instruction = packed record
                    f: fct;          {function code}
                    l: 0 .. levmax;  {level}
                    a: 0 .. amax;    {displacement address}
                  end;

    procedure gen(x: fct; y, z: integer);
    begin
      code[cx].f := x; code[cx].l := y; code[cx].a := z;
      cx := cx + 1
    end;

    procedure listcode;
      var i: integer;
    begin {list code generated for this block}
      for i := cx0 to cx - 1 do
        writeln(i, mnemonic[code[i].f]: 5, code[i].l: 3, code[i].a: 5)
    end;
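The code array, gen, and listcode are simple enough to model directly, which also shows how a forward jump can be emitted before its target is known and filled in later. This is a Python sketch under assumptions: the CodeArray class, the patch method, and the listing format are illustrative and are not taken from the handout.

```python
from collections import namedtuple

# f: function code, l: level, a: displacement address (field names follow the slide)
Instruction = namedtuple("Instruction", "f l a")


class CodeArray:
    def __init__(self):
        self.code = []

    @property
    def cx(self):
        # Index of the next instruction to be generated.
        return len(self.code)

    def gen(self, f, l, a):
        """Append one instruction and return its index."""
        self.code.append(Instruction(f, l, a))
        return self.cx - 1

    def patch(self, index, a):
        # Fill in an address that was unknown when the instruction was generated.
        self.code[index] = self.code[index]._replace(a=a)

    def listcode(self, cx0=0):
        """Return one listing line per instruction from cx0 up to cx - 1."""
        return [
            f"{i} {ins.f:>5}{ins.l:3}{ins.a:5}"
            for i, ins in enumerate(self.code)
            if i >= cx0
        ]


program = CodeArray()
jump = program.gen("jmp", 0, 0)   # jump around declarations; target not known yet
program.gen("lit", 0, 7)          # stands in for the code of a nested procedure
program.patch(jump, program.cx)   # the statement part starts here
program.gen("int", 0, 3)
program.gen("opr", 0, 0)
```

Keeping the index returned by gen is what makes the later patch possible; the Pascal version stores that index in the symbol table entry instead.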
38 PL/0 Interpreter
    t := 0; b := 1; p := 0;              {initialize registers}
    s[1] := 0; s[2] := 0; s[3] := 0;     {initialize memory}
    repeat                               {instruction fetch loop}
      i := code[p]; p := p + 1;
      with i do
        case f of                        {decode instruction}
          lit: begin t := t + 1; s[t] := a end;
          opr: case a of
                 1: s[t] := -s[t];
                 2: begin t := t - 1; s[t] := s[t] + s[t+1] end;
               end;
          jmp: p := a;
          sto: begin s[base(l) + a] := s[t]; writeln(s[t]); t := t - 1 end;
          cal: begin                     {generate new block mark}
                 s[t+1] := base(l); s[t+2] := b; s[t+3] := p;
                 b := t + 1; p := a
               end
        end
    until p = 0                          {not a good way to end}
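The fetch-decode loop and the role of the static link can be exercised with a small model of the machine. This is a Python sketch under assumptions: instructions are plain (f, l, a) tuples, only a few opr codes are handled, values written by sto are collected in a list instead of printed, and the sample PROGRAM was assembled by hand for illustration.

```python
def interpret(code, stack_size=64):
    """Run PL/0-style instructions (f, l, a); return the values stored by sto."""
    s = [0] * stack_size   # data store; slot 0 is unused to mirror the 1-based Pascal array
    t, b, p = 0, 1, 0      # top of stack, base of current segment, program address
    stored = []

    def base(level):
        # Find the base of the segment `level` static levels out
        # by descending the chain of static links.
        b1 = b
        while level > 0:
            b1 = s[b1]
            level -= 1
        return b1

    while True:
        f, l, a = code[p]  # instruction fetch
        p += 1
        if f == "lit":
            t += 1
            s[t] = a
        elif f == "lod":
            t += 1
            s[t] = s[base(l) + a]
        elif f == "sto":
            s[base(l) + a] = s[t]
            stored.append(s[t])  # plays the role of writeln(s[t])
            t -= 1
        elif f == "cal":
            # New block mark: static link, dynamic link, return address.
            s[t + 1] = base(l)
            s[t + 2] = b
            s[t + 3] = p
            b = t + 1
            p = a
        elif f == "int":
            t += a
        elif f == "jmp":
            p = a
        elif f == "opr" and a == 0:   # return from procedure
            t = b - 1
            p = s[t + 3]
            b = s[t + 2]
        elif f == "opr" and a == 1:   # negate
            s[t] = -s[t]
        elif f == "opr" and a == 2:   # add
            t -= 1
            s[t] = s[t] + s[t + 1]
        else:
            raise ValueError(f"unsupported instruction {(f, l, a)}")
        if p == 0:                    # same termination test as the Pascal loop
            break
    return stored


# Hand-assembled example: main declares `one`; procedure b declares `two`.
PROGRAM = [
    ("jmp", 0, 9),   # 0: jump around the code of procedure b
    ("int", 0, 4),   # 1: b: block mark plus variable two
    ("lit", 0, 5),   # 2
    ("sto", 0, 3),   # 3: two := 5 (local)
    ("lod", 1, 3),   # 4: load one (one static level out)
    ("lod", 0, 3),   # 5: load two
    ("opr", 0, 2),   # 6: add
    ("sto", 1, 3),   # 7: one := one + two
    ("opr", 0, 0),   # 8: return
    ("int", 0, 4),   # 9: main: block mark plus variable one
    ("lit", 0, 2),   # 10
    ("sto", 0, 3),   # 11: one := 2
    ("cal", 0, 1),   # 12: call b
    ("opr", 0, 0),   # 13: return; restores p = 0 and ends the run
]
```

The final return in the main program reloads P from the initial block mark, which was set to 0, and that is the only thing that stops the loop.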
39 Project Virtual Machine
- Can use the design of the PL/0 one
- Operations in PL/0 are integer oriented; you probably want to add to this
- Can also use other machine designs
- Hybrid approach: compiles to intermediate form, then interprets that
- Direct interpretation possible if clearly proposed
- Idea: add output whenever computation done
- Idea: build some messages in that could be output with new opr instructions
40 Programs Writing Programs
- Compilers basically take programs in one language and write programs in another
- Compiler-compilers take grammars and write compilers (second level program writers)
- Translation of EBNF straightforward
- terminal -> specific recognizer
- non-terminal -> call to recognizing routine
- alternatives -> explicit initial characters
- loops -> graph following algorithms
- LEX, YACC, other compiler tools
- Add in other stuff to the language processor
41 Adding Predefined (Variable) Names
- procedure block has 2 parameters
- lev (the nesting level for the block)
- tx (starting index for the symbol table)
- The nested procedure enter is what puts symbols (variable names) into the symbol table
- Right-side page 14 of handout
- Initialize symbol table
- Make initial call of block with a non-zero table index
- Can initialize or do other things not normal in the user-visible input language
42 Adding New Operator
- Add to getsym to recognize new symbol
- Look in condition, expression, term, factor
- Is new operator parallel to one of those operators?
- Basically another option in code generation
- If not like existing operators, add new syntactic construct.
- New action: add to PL/0 machine
- Generate new instruction: gen(opr, 0, 14) (square)
- Implement new instruction functionality (page 14, left-side): 14: begin s[t] := s[t] * s[t] end
- Add it into list of mnemonics
43 Adding Built-in Function
- Design new indicator for symbol table
- Put function name in symbol table
- Parser will recognize as defined name (there will be no way for user to put in)
- In term:
    if sym = ident then
    begin
      i := position(id);
      case kind of
        constant: ...
        variable: ...
        procedure: ...
        built-in:
          begin
            getsym;                    {left paren}
            expression;
            getsym;                    {right paren}
            gen(opr, 0, new-thing)
          end
      end
    end
44 Adding Pre-defined Function
- Another approach
- Put entry into symbol table
- Make it a regular procedure
- Initialize the code array to represent the code that might have been generated
- Adding a new statement type
- Add new syntax into body of statement (page 12)
- Look at call as an example
- Syntactic sugar
45 How To Start on the Project
- Get your tokenizer working
- This is the getsym procedure of the Pascal version of PL/0 distributed in class
- Can also be done with classes in C++ and Java
- Read in sample programs in the language you're trying to compile and output the tokens (with some other information)
- Benefits
- Written some programs in your language
- Can leave the output statements for debugging
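A first tokenizer that reads a sample program and outputs its tokens can be very short. This is a minimal Python sketch under assumptions: the token categories, the regular expressions, and the tokens function are illustrative choices for a PL/0-like language, not a port of getsym.

```python
import re

# Reserved words of PL/0; identifiers matching these are reported as keywords.
KEYWORDS = {"begin", "call", "const", "do", "end", "if", "odd",
            "procedure", "then", "var", "while"}

TOKEN_PATTERN = re.compile(
    r"(?P<number>\d+)"
    r"|(?P<ident>[A-Za-z][A-Za-z0-9]*)"
    r"|(?P<symbol>:=|<=|>=|[-+*/()=#<>,;.])"
    r"|(?P<space>\s+)"
    r"|(?P<error>.)"
)


def tokens(source):
    """Return a list of (token category, lexeme) pairs for the source text."""
    result = []
    for match in TOKEN_PATTERN.finditer(source):
        kind = match.lastgroup
        lexeme = match.group()
        if kind == "space":
            continue  # whitespace separates lexemes but is not a token
        if kind == "error":
            raise SyntaxError(f"unexpected character {lexeme!r}")
        if kind == "ident" and lexeme.lower() in KEYWORDS:
            kind = "keyword"
        result.append((kind, lexeme))
    return result
```

For example, tokens("x := y + 12;") yields the pairs for x, :=, y, +, 12, and ; in that order, which is the kind of output worth keeping around for debugging the later phases.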
46 UML Use Case Modeling
- program actions from the user viewpoint, e.g., directions for the grader of how to execute your program
- begin developing different aspects of the program and planning its eventual actions as soon as possible
[Figure: use case diagram. Actor: user. Use cases: command line interpreter, compilation, execution.]
49 655 Project
- Unified Software Development Process (Rational)
- Unified Modeling Language (UML)
- Only to begin understanding, not required to use
- Unified Process
- Inception Phase: begin understanding the problem and what you might do
- Spiral Approach: try to have some partial version at each stage
- Project Report - proposal - introductory part
- Risk analysis: small steps rather than being overwhelmed, some small test programs
50 Project Activities
- Language and program processing
- Input (lexical) scanning
- Grammars and recursive descent parsing
- Compiling to a virtual machine
- Virtual machine interpreter
- Some other approaches to compiling
51 Project Options
- Proposals required as a first step; you may want an alternative language or alternative techniques.
- Individual proposal or work as a group of two.
- Many resources on the Internet; if you want to use them, propose to do something that goes beyond them in a significant way.
52 Project Stages
- Proposal (preliminary write-up) for project
- By e-mail to grader
- Simple parser for simple imperative language
- To exercise submit process
- Simple interpreter
- (step that doesn't have to be turned in)
- Final complete project
- significant write-up, electronic submission
- No Other program in Lisp/Scheme
53 Test Input
- Your language, your implementation, you know the features and restrictions, therefore
- You supply the test input
- Tell the grader what she should expect when running the tests and why you chose what you did (show off this or that feature, exercise an error message, clever program in your language)
54 655 Project Options
- Encourage you to make this into something you'll enjoy and be proud of
- Flexibility probably unusual
- Available resources (books, Internet, etc.)
- Acknowledge their use
- Do significant work of your own
- Many different backgrounds and interests
- My experience has been at detail levels after others have started and thought they were close to finished (the very large last 10%)
67 CIS 655 Project PL/0
- Niklaus Wirth, Algorithms + Data Structures = Programs, 1976, Prentice-Hall (ISBN 0-13-022418-9)
- PL/0: subset of Pascal
- Illustrated the way the Pascal P-code compiler was built
- http://www.cs.rochester.edu/u/www/courses/254/PLzero/guide/guide.html
- 655 Project (100 pts, 40% of total grade)
- Project proposal (10 pts for turning in, revisable)
- Parser
- Intermediate step (non-graded) (but 10 pts for turning in)
- Input in syntax of programming language you're building a compiler/interpreter for
- Some kind of output, maybe with XML markup
- Develop your own test cases
- Freedom to make the project into something you'll enjoy and be proud of
73 Other Project Questions
- An answer is probably in the PL/0 handout if you only knew where to look and what you were looking for
- OK to use LEX and YACC if you build a significant project on top of their use
- JLex
- http://www.cs.princeton.edu/~appel/modern/java/JLex/
- More Class Questions?
75 655 Su03 Project Grading
- Pick between possible project options
- Write proposals
- Revisable
- First part of final report
- Work singly or in pairs
- Communicate with grader
- Written reports required
- Demonstration (after grader has time to read report)
- Submit code to be run directly by grader
- Project approximately 40% of overall grade
80 Language definitions
- Who must use language definitions?
- Other language designers
- Implementors
- Programmers (the users of the language)
- Formal approaches to describing syntax
- Recognizers - used in compilers
- Generators - approach we'll study
81Backus Normal Form (1959)
- Invented by John Backus to describe Algol 58
- BNF is equivalent to context-free grammars
- A metalanguage is a language used to describe
another language. - In BNF, abstractions are used to represent
classes of syntactic structures--they act like
syntactic variables (also called nonterminal
symbols) - e.g. ltwhile_stmtgt -gt while ltlogic_exprgt do
ltstmtgt - This is a rule it describes the structure of a
while statement
82Syntax rules
- A rule has a left-hand side (LHS) and a
right-hand side (RHS), and consists of terminal
and non-terminal symbols - A grammar is a finite nonempty set of rules
- An abstraction (or non-terminal symbol) can have
more than one RHS - ltstmtgt -gt ltsingle_stmtgt begin ltstmt_listgt
end - Syntactic lists are described in BNF using
recursion - ltident_listgt -gt ident ident, ltident_listgt
- A derivation is a repeated application of rules,
starting with the start symbol and ending with a
sentence (all terminal symbols)
83An example grammar
- ltprogramgt -gt ltstmtsgt
- ltstmtsgt -gt ltstmtgt ltstmtgt ltstmtsgt
- ltstmtgt -gt ltvargt ltexprgt
- ltvargt -gt a b c d
- ltexprgt -gt lttermgt lttermgt lttermgt - lttermgt
- lttermgt -gt ltvargt const
84An example derivation
- ltprogramgt gt ltstmtsgt
- gt ltstmtgt
- gt ltvargt ltexprgt
- gt a ltexprgt
- gt a lttermgt lttermgt
- gt a ltvargt lttermgt
- gt a b lttermgt
- gt a b const
85Derivation explanation
- Every string of symbols in the derivation is a
sentential form - A sentence is a sentential form that has only
terminal symbols - A leftmost derivation is one in which the
leftmost non-terminal in each sentential form is
the one that is expanded - A derivation may be neither leftmost nor
rightmost - Parse tree is a hierarchical representation of a
derivation
86Ambiguity Right Recursive
- A grammar is ambiguous iff if and only if it
generates a sentential form that has two or more
distinct parse trees - If we use the parse tree to indicate precedence
levels of the operators, we cannot have ambiguity - Operator associativity can also be indicated by a
grammar - ltexprgt -gt ltexprgt ltexprgt const (ambiguous)
- ltexprgt -gt ltexprgt const const (unambiguous)
- Left recursive (left associative)(recursive
descent will require right recursive)
87Extended BNF (abbreviations)
- Optional parts are placed in brackets ()
- ltproc_callgt -gt ident ( ltexpr_listgt)
- Put alternative parts of RHSs in parentheses and
separate them with vertical bars - lttermgt -gt lttermgt ( -) const
- Put repetitions (0 or more) in braces ()
- ltidentgt -gt letter letter digit
88BNF / EBNF
- BNF
- ltexprgt -gt ltexprgt lttermgt
- ltexprgt - lttermgt
- lttermgt
- lttermgt -gt lttermgt ltfactorgt
- lttermgt / ltfactorgt
- ltfactorgt
- EBNF
- ltexprgt -gt lttermgt ( -) lttermgt
- lttermgt -gt ltfactorgt ( /) ltfactorgt
89Recursive Descent Parsing
- Parsing - constructing a parse / derivation tree
for a given input string - Lexical analyzer is called by the parser
- A recursive descent parser traces out a parse
tree in top-down order it is a top-down parser - Each non-terminal in the grammar has a subprogram
associated with it the subprogram parses all
sentential forms that the nonterminal can
generate - The recursive descent parsing subprograms are
built directly from the grammar rules - Recursive descent parsers cannot be built from
left-recursive grammars
90lttermgt -gt ltfactorgt ( /) ltfactorgt
- void term()
- factor() / parse the first factor/
- while (next_token ast_code
- next_token slash_code)
- lexical() / get next token /
- factor() / parse the next factor /
-
-
91Wirths Rules
- B1 Reduce system of syntax graphs to a few of
reasonable size - B2 Translate each graph to a procedure according
to subsequent rules - B3 Sequence of elements translates to
- begin T(S1) T(S2) T(Sn) endor T(S1)
T(S2) T(Sn) - procedure TSx()begin TS1()
getsym() TS2() getsym() end
92lttermgt -gt ltfactorgt ( /) ltfactorgt
- Pascal commentbegin factor while sym in
times, slash do begin mulop
sym getsym factor gen_proper_op end
end
93Blocks and static scoping
- Blocks are different than sequences of statements
or compound statements - Blocks can include declarations
- Sort of like a single use subprogram used and
defined here - Where can blocks appear?
- Ada almost anywhere a statement could be
- Pascal only as bodies of procedures
- Java inner classes
94Data specific to a procedure
- To be able to return from call
- Program address of its call (return address)
- Address of data segment of caller
- Keep in data segment of procedure as
- RA (return address) DL (dynamic link)
- Location of variables
- Relative address only (since memory dynamic)
- Displacement off base address of appropriate data
segment (locally B register or by descending
chain of static links) - What does static scoping mean here?
95Example of Static Scoping
void a {
  local variable one
  void b {
    local variable two
    void c {
      local variable three
      // beginning of code for c
      reference one, two
      call b
      // end of c
    }
    // beginning of code for b
    reference one, two
    call c
    // end of b
  }
  // beginning of code for a
  call b
  // end of a
}
- a -> b -> c -> b
96Example of static scoping
- In a, one is local
- In b, two is local
- In b, one is a single static level out
- In c, three is local
- In c, two is a single static level out
- In c, one is two static levels out
- Then c calls b
- In b, one is still a single static level out
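A minimal sketch of the call chain a -> b -> c -> b may help show why the static distance does not change. The Frame class, the base function, and the variable values are hypothetical names and data chosen for this illustration; only the static link / dynamic link distinction comes from the slides.

```python
class Frame:
    """One activation record: two links plus local variables."""
    def __init__(self, name, static_link, dynamic_link, local_vars):
        self.name = name
        self.static_link = static_link      # frame of the textually enclosing procedure
        self.dynamic_link = dynamic_link    # frame of the caller
        self.local_vars = local_vars


def base(frame, levels_out):
    """Follow the static chain the given number of levels."""
    for _ in range(levels_out):
        frame = frame.static_link
    return frame


# Call chain a -> b -> c -> b
a = Frame("a", None, None, {"one": 1})
b1 = Frame("b", a, a, {"two": 2})        # b called from a
c = Frame("c", b1, b1, {"three": 3})     # c called from b
b2 = Frame("b", a, c, {"two": 20})       # b called from c: caller is c, static parent is still a

print(base(c, 2).local_vars["one"])      # c reaches one two static levels out
print(base(b2, 1).local_vars["one"])     # second b still reaches one a single level out
```

In the second activation of b the dynamic link points at c, but the static link points at a, so one is found a single static level out regardless of who made the call.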
97Block recognition processing
- Block(level, symbolTableStartingIndex)
- Page 13, left
- <block> ::= <const_decl> <var_decl>
<proc_decl> <statement_body>
- <proc_decl> ::= procedure <name> <block>
- Recognize inner block
- Block(currentLevel + 1, currentSymbolTableIndex)
- Jump around declarations
- tx0 := tx; table[tx0].adr := cx; gen(jmp,0,0);
  ...
  code[table[tx0].adr].a := cx;
  table[tx0].adr := cx; statement(); gen(opr,0,0)
  { return }
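The "jump around declarations" step is a backpatch: the jump is emitted before its target is known and fixed up once the nested procedures have been compiled. A rough sketch follows, with compile_block and its arguments as hypothetical names; storage allocation and the symbol table are left out.

```python
code = []                                   # generated instructions as [f, l, a] lists


def gen(f, l, a):
    """Append one instruction and return its index."""
    code.append([f, l, a])
    return len(code) - 1


def compile_block(inner_procedure_bodies, body):
    """Emit code for a block whose nested procedures precede its own statements."""
    jump_index = gen("jmp", 0, 0)           # target not known yet
    for proc_body in inner_procedure_bodies:
        compile_block([], proc_body)        # nested procedure code comes first
    code[jump_index][2] = len(code)         # backpatch: jump to start of this body
    for instruction in body:
        gen(*instruction)
    gen("opr", 0, 0)                        # return


compile_block([[("lit", 0, 1)]], [("lit", 0, 2)])
for index, instruction in enumerate(code):
    print(index, instruction)
```

After the call, instruction 0 jumps to index 4, skipping the code of the nested procedure that occupies indices 1 to 3.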
98Symbol table and static scope
- Variable declaration: storage allocated by
incrementing DX (data index) by 1
- Initially DX is 3 to allocate space for the
block mark (RA, DL, and SL)
- Symbol table (table)
- enter: enter object into table
- Nested in block, which determines static scoping
- Recursive calls make table act like a stack
- position: find identifier id in table
- Linear search backward
99Blocks and Scoping
- Nesting of blocks determines scope
- Restoring symbol table pointers makes symbol
table work like stack
- Inner definitions lost to outer contexts
- Idea: make symbol table work like a tree (one
branch along a tree looks like a stack)
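A minimal sketch of a symbol table that behaves like a stack is given below. The class, the mark and release methods, and the -1 "not found" result are assumptions for illustration; only enter, position, and the backward linear search come from the slides.

```python
class SymbolTable:
    def __init__(self):
        self.entries = []                   # each entry: (name, kind, level, adr)

    def enter(self, name, kind, level, adr):
        """Enter an object at the top of the table."""
        self.entries.append((name, kind, level, adr))

    def position(self, name):
        """Linear search backward, so inner declarations hide outer ones."""
        for i in range(len(self.entries) - 1, -1, -1):
            if self.entries[i][0] == name:
                return i
        return -1                           # hypothetical "not found" marker

    def mark(self):
        """Remember the table top when a block is entered."""
        return len(self.entries)

    def release(self, mark):
        """Restore the table top when the block is left."""
        del self.entries[mark:]


table = SymbolTable()
table.enter("x", "variable", 0, 3)          # outer x
saved = table.mark()                        # entering an inner block
table.enter("x", "variable", 1, 3)          # inner x hides outer x
table.enter("y", "variable", 1, 4)
inner_x = table.position("x")
table.release(saved)                        # leaving the inner block
outer_x = table.position("x")
print(inner_x, outer_x, table.position("y"))
```

Restoring the saved top is what makes the inner definitions disappear from the outer context.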
100PL/0 Virtual Machine
- Section 5.10 (page 6 of handout)
- Stack machine: primary data store is stack
- push, pop, insert or retrieve from within
- Operations on top of stack (add, test, etc.)
- Program store: array named code
- Unchanged during interpretation
- I: instruction register
- P: program address register
- Data store: array named S (stack)
101Stack of PL/0 machine (Fig. 5.7)
[Figure: stack holding the data segments of A, B, C, and B (local vars), each with DL, RA, and SL; DynamicLink and StaticLink chains; registers B and T]
102Data specific to a procedure (again)
- To be able to return from call
- Program address of its call (return address)
- Address of data segment of caller
- Keep in data segment of procedure as
- RA (return address), DL (dynamic link)
- Location of variables
- Relative address only (since memory dynamic)
- Displacement off base address of appropriate data
segment (locally B register, or by descending
chain of static links)
- What does static scoping mean here?
103PL/0 code generation
- (page 7)
- Addresses are generated as pairs of numbers
indicating the static level difference and the
relative displacement within a data segment.
- But how does the compiler figure this out?
- PL/0 code
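One way to picture how the compiler arrives at the address pair: the symbol table records the level and displacement of each declaration, and the level difference is the current block level minus the declaring level. In this sketch the table layout and the function name address_pair are assumptions; the displacement 3 reflects DX starting at 3 for the first variable of each block.

```python
def address_pair(table, name, current_level):
    """Return (static level difference, displacement) for a variable reference."""
    for entry_name, level, adr in reversed(table):      # innermost declaration first
        if entry_name == name:
            return (current_level - level, adr)
    raise NameError("undeclared identifier: " + name)


# one declared in a (level 0), two in b (level 1), three in c (level 2);
# each is the first variable of its block, so each has displacement 3
table = [("one", 0, 3), ("two", 1, 3), ("three", 2, 3)]

print(address_pair(table, "one", 2))      # reference to one from the code of c
print(address_pair(table, "two", 2))      # reference to two from the code of c
print(address_pair(table, "three", 2))    # reference to three from the code of c
```

The pair depends only on where the reference and the declaration appear in the program text, not on the order of calls at run time.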
104PL/0 machine commands
- LIT: load numbers (literals) onto the stack
- LOD: fetch variable values to top of stack
- STO: store values at variable locations
- CAL: call a subprogram
- INT: allocate storage by incrementing stack
pointer (T)
- JMP: transfer of control
(new program address goes into P)
- JPC: conditional transfer of control
- OPR: arithmetic and relational operators
105More on PL/0 code generation
- fct = (lit, opr, lod, sto, cal, int, jmp, jpc)
- instruction = packed record
    f: fct;            { function code }
    l: 0 .. levmax;    { level }
    a: 0 .. amax       { displacement address }
  end
- procedure gen(x: fct; y, z: integer);
  begin
    code[cx].f := x; code[cx].l := y;
    code[cx].a := z; cx := cx + 1
  end
- procedure listcode; var i: integer;
  begin { list code generated for this block }
    for i := cx0 to cx-1 do
      writeln(i, mnemonic[code[i].f]:5,
              code[i].l:3, code[i].a:5)
  end
106PL/0 Interpreter
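t := 0; b := 1; p := 0;               { initialize registers }
s[1] := 0; s[2] := 0; s[3] := 0;      { initialize memory }
repeat                                { instruction fetch loop }
  i := code[p]; p := p + 1;
  with i do
  case f of                           { decode instruction }
    lit: begin t := t + 1; s[t] := a end;
    opr: case a of
           1: s[t] := -s[t];
           2: begin t := t - 1; s[t] := s[t] + s[t+1] end
         end;
    jmp: p := a;
    sto: begin s[base(l)+a] := s[t]; writeln(s[t]); t := t - 1 end;
    cal: begin                        { generate new block mark }
           s[t+1] := base(l); s[t+2] := b; s[t+3] := p;
           b := t + 1; p := a
         end
  end
until p = 0                           { not a good way to end }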
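The fetch and decode loop above can be tried out with a small runnable sketch. Instructions are (f, l, a) tuples; the lod, int, jpc, and return (opr 0) cases are filled in as assumptions because they are not on the slide, and values written by sto are collected in a list instead of being printed.

```python
def interpret(code, stack_size=100):
    """Run PL/0-style code and return the values written by sto."""
    s = [0] * stack_size        # data store; cell 0 is unused so addresses start at 1
    t, b, p = 0, 1, 0           # top of stack, base of current segment, program address
    stored = []

    def base(l):
        """Find the base address l static levels out by following static links."""
        b1 = b
        while l > 0:
            b1 = s[b1]
            l -= 1
        return b1

    while True:
        f, l, a = code[p]       # instruction fetch
        p += 1
        if f == "lit":
            t += 1
            s[t] = a
        elif f == "opr":
            if a == 0:          # return from block (assumed, not on the slide)
                t = b - 1
                p = s[t + 3]
                b = s[t + 2]
            elif a == 1:        # negate
                s[t] = -s[t]
            elif a == 2:        # add
                t -= 1
                s[t] = s[t] + s[t + 1]
            else:
                raise ValueError("opr %d not implemented in this sketch" % a)
        elif f == "lod":
            t += 1
            s[t] = s[base(l) + a]
        elif f == "sto":
            s[base(l) + a] = s[t]
            stored.append(s[t])
            t -= 1
        elif f == "cal":        # generate new block mark: SL, DL, RA
            s[t + 1] = base(l)
            s[t + 2] = b
            s[t + 3] = p
            b = t + 1
            p = a
        elif f == "int":
            t += a
        elif f == "jmp":
            p = a
        elif f == "jpc":
            if s[t] == 0:
                p = a
            t -= 1
        else:
            raise ValueError("unknown instruction %r" % (f,))
        if p == 0:              # same end condition as the slide
            return stored


# x := 2 + 3 in a main block with one variable at displacement 3
program = [("jmp", 0, 1), ("int", 0, 4), ("lit", 0, 2), ("lit", 0, 3),
           ("opr", 0, 2), ("sto", 0, 3), ("opr", 0, 0)]
print(interpret(program))       # [5]
```

The run ends when the main block returns to address 0, which is the "not a good way to end" condition: a jump to address 0 anywhere in the program would stop the machine too.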
107Project Virtual Machine
- Can use the design of the PL/0 one
- Operations in PL/0 are integer oriented; you
probably want to add to this
- Can also use other machine designs
- Hybrid approach: compiles to intermediate form,
then interprets that
- Direct interpretation possible if clearly
proposed
- Idea: add output whenever computation done
- Idea: build some messages in that could be output
with new opr instructions
108Programs writing programs
- Compilers basically take programs in one language
and write programs in another
- Compiler-compilers take grammars and write
compilers (second level program writers)
- Translation of EBNF straightforward
- terminal -> specific recognizer
- non-terminal -> call to recognizing routine
- alternatives -> explicit initial characters
- loops -> graph following algorithms
- LEX, YACC, other compiler tools
109Static Semantics
- Other information about the language not
specified with the BNF
- Identifier length
- Maximum integer value
- Other restrictions on your compiler
- Symbol table size
- Code array size
- Specify these in your description of your
language processor
- Recognize the restrictions you've implied
110Test input
- Your language, your implementation, you know the
features and restrictions, therefore
- You supply the test input
- Tell the grader what she should expect when
running the tests and why you chose what you did
(show off this or that feature, exercise an error
message, clever program in your language)
111Submit assignment goals
- keeping you on track
- checking out the submit process
- This is not a separately graded assignment, but
do do it within the next week or so.
- The second part of the assignment is about
checking out the submit process. I have set
things up like I've done before, so I expect the
following will work for most of you very easily,
but there will be a few problems, so get them out
of the way as soon as possible.
112Adding to PL/0
- Predefined variable names
- New operator
- Built-in function
- Pre-defined function
- New statement type
113Unstated Assumptions
- Input program read top-to-bottom, left-to-right,
with no backtracking
- Things declared before they are used
- No redefining at same level
- Inner declarations hidden by nesting
- Inner can locally hide outer declarations
114Predefined (Variable) Names
- procedure block has 2 parameters
- lev (the nesting level for the block)
- tx (starting index for the symbol table)
- The nested procedure enter is what puts symbols
(variable names) into the symbol table
- Right-side page 14 of handout
- Initialize symbol table
- Make initial call of block with non-zero table index
- Can initialize or do other things not normal in
the user visible input language
115New Operator
- Add to getsym to recognize new symbol
- Look in condition, expression, term, factor
- Is new operator parallel to one of those
operators?
- Basically another option in code generation
- If not like existing operators, add new syntactic
construct.
- New action: add to PL/0 machine
- Generate new instruction: gen(opr,0,14)
(square)
- Implement new instruction functionality (page 14,
left-side): 14: begin s[t] := s[t] * s[t] end
- Add it into list of mnemonics
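The two halves of the change, code generation and machine action, can be sketched together. The operation number 14 is the one used on the slide; the "sq" token, the function names, and the restriction to a single literal operand are assumptions for illustration.

```python
SQUARE = 14     # operation number from the slide; the "sq" token is hypothetical


def compile_factor(tokens):
    """Compile NUMBER optionally followed by 'sq' into stack-machine code."""
    code = [("lit", 0, tokens[0])]
    if len(tokens) > 1 and tokens[1] == "sq":
        code.append(("opr", 0, SQUARE))     # the new instruction
    return code


def execute(code):
    """Run lit and the new opr on a list used as the evaluation stack."""
    stack = []
    for f, l, a in code:
        if f == "lit":
            stack.append(a)
        elif f == "opr" and a == SQUARE:
            stack[-1] = stack[-1] * stack[-1]   # s[t] := s[t] * s[t]
        else:
            raise ValueError("instruction not implemented in this sketch")
    return stack


print(execute(compile_factor([7, "sq"])))   # [49]
```

Because square is unary, it replaces the top of the stack in place and leaves the stack pointer unchanged, unlike the binary operators.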
116Built-in Function
- Design new indicator for symbol table
- Put function name in symbol table
- Parser will recognize as defined name (there will
be no way for user to put in)
- In term:
  if sym = ident then
    i := position(id);
    case kind of
      constant:
      variable:
      procedure:
      built-in: begin
        getsym;        { left paren }
        expression;
        getsym;        { right paren }
        gen(opr, 0, new-thing)
      end
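A rough sketch of the built-in approach: the compiler enters the name into the symbol table before parsing starts, and the parser emits a new opr when it finds an identifier of that kind. The name abs, the operation number 15, the dictionary layout, and the restriction of the argument to a single number (instead of a full expression) are all assumptions for illustration.

```python
BUILTIN_ABS = 15    # hypothetical number for the new opr


def new_symbol_table():
    """The compiler, not the user, enters the built-in before parsing starts."""
    return [{"name": "abs", "kind": "builtin", "opr": BUILTIN_ABS}]


def position(table, name):
    """Linear search backward; -1 is a hypothetical 'not found' marker."""
    for i in range(len(table) - 1, -1, -1):
        if table[i]["name"] == name:
            return i
    return -1


def compile_call(table, tokens):
    """Compile NAME ( NUMBER ) where NAME must be a built-in function."""
    name, left_paren, argument, right_paren = tokens
    i = position(table, name)
    if i < 0 or table[i]["kind"] != "builtin":
        raise SyntaxError("not a built-in function: " + name)
    if left_paren != "(" or right_paren != ")":
        raise SyntaxError("parentheses expected")
    # argument code first, then the operation that consumes it
    return [("lit", 0, argument), ("opr", 0, table[i]["opr"])]


print(compile_call(new_symbol_table(), ["abs", "(", -5, ")"]))
```

No call or return is generated; the built-in costs one instruction, which is the main difference from making it a regular procedure.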
117Pre-defined function
- Another approach
- Put entry into symbol table
- Make it a regular procedure
- Initialize the code array to represent the code
that might have been generated
118New statement type
- Add new syntax into body of statement (page 12)
- Look at call as an example
121What is Computer Science?
- Lots of parts and specialties
- Core of computer science
- How programs developed
- How programs execute
- Programming and software engineering
- 655 (programming languages) is central
- Programming
- How would you illustrate basic programming?
- Really, how would you illustrate basic
programming?
- HTML for formatting text
- JavaScript for beginning programming
122What is 655 Prog. Languages About?
- Compare programming languages, but how?
- Vertical: Fortran, Cobol, Lisp, C, etc.
- Horizontal: loops, selection, subprograms
- Topics: object-oriented, event handling,
concurrency
- Processing: syntax, semantics, compiler basics
- Note: textbook and course have some of all of
these
- Fundamental programming language concepts (my
idea)
- Divide a program into pieces (e.g., subprograms,
types, threads, tasks, classes, packages,
modules, components)
- Controlled modification of the pieces (e.g.,
compilers, templates)
- Have pieces communicate (during development,
building, and execution; e.g., names, linkages,
parameters)
- Combine pieces into program (e.g., loaders,
building tools)
- Know the pieces will work together (e.g., design,
reviews, proofs)
- Decisions, decisions, decisions, . . .
123Goal of CIS 655
- New insight into the programming process and the
languages and concepts we use.
- Knowledge of where we are and why.
- Anticipate and prepare for future programming.
124CIS 655 Warning
- Adult Content Warning: F-word of programming
languages: FORTRAN; A-word: ALGOL
- LISP: Lots of Irritating Silly Parentheses
- Sesame Street's Cookie Monster: C is for
programming, that's good enough for me.
- COBOL: Committee Originated Boloney
- America Demanded Ada
- Java: Just Another Version of Ada
- C (C)
125Traditional Content of 655
- Languages you didn't see elsewhere (1 hr courses)
- Compiler basics (used to be separate course)
- History of computing (still some of that)
- Classic languages
- Lisp (Scheme), Algol (Pascal, Ada), Snobol, etc.
- Classic computing approaches
- Recursion
- Concurrency
- New topics
- Programming with interfaces
- Distributed computing, web applications, Web
Services
126Now 2005
- Programming languages have changed mainly because
of capabilities in execution environment
- Graphical User Interfaces
- Networking inside the execution environment
- Operating system concurrency in prog. language
- Classes (objects) as an extension of types
- Exception handling
- Security (probably coming soon)
- Some changes for program structuring
- Packages
- Generics and templates
- Proof conditions
- Components ?
127Programming Languages
- Teaching at Old Dominion University
- Usual course
- Lisp
- Snobol
- APL
- Algol / Pascal
- Summer 1979
- LISP
- MUMPS
- JOVIAL
- CMS-2
- Ada
- How I got started with Ada (then for 16 years)
128655 Summary Su05
- PL/0 compiler for traditional languages
- Lisp and its interpreter similarities
- XML/XSLT string matching (declarative)
- Patterns in programming
- High-level: constructor, factory, façade,
singleton
- Low-level: loop, selection, exit, exception
- Parameterized types: generics, templates
- Event-oriented programming, concurrency,
distributed
- Environments: Eclipse and its extensions
- Next generation programs (Oracle HTML DB)
129Plan and Expectations for Su05
- Classical languages and topics
- History, comparisons, Algol (and its influence)
- Basic control structures, data structures, types
- Binding, parameter passing
- Lisp and functional programming
- Compilation basics
- Instead of a separate course
- Object-oriented programming, UML
- Java, C, J2EE, .NET
- Concurrency, event driven programming
- Distributed Computing
- RMI, CORBA, RPC, SOAP (XML Protocol)
- XML, XSLT, XML Schemas, etc.
130Summer Quarter 2005 Plan
- Trying new things in a different order
- Implementation of programming languages
- Syntax of programming languages prerequisites
- Possible Project - PL/0 handout (understanding
large program)
- New style project assignment (presentations):
start early
- Traditional topics: middle part of quarter
- New stuff: OO, concurrency, event driven,
distributed computing platforms, Java, C
- Course Expectations
- Broad perspective on computing
- How programs are developed and executed
- Knowledge of alternatives in programming
- Information about traditional language processing
concepts
- Some specifics about programming languages
considered in the common knowledge domain of
computer scientists
- Prepare for 50 more years of programming
131Traditional 655 Programming Project(s)
- Hybrid compiler / interpreter for small language
with C/Java-style syntax
- Transform a high-level language to low-level form
- Reasonable use of tools and software encouraged
- Primarily individual, some 2-person groups
- Build on what you already know (e.g., 560)
- Project done in stages
- Proposals in class, demonstrations, documentation
- Develop your own appropriate tests
- Software to CIS computers using submit command
- Possible alternatives may be proposed
- No Perl implementations (use C or Java)
132How to Use the Textbook
- The textbook and web resources are very
informative and cost effective
- Read, question, summarize
- Learn from reading
- Work on it in this class
- Bring book to class on days suggested
- Relate the author's comments to class
- Formulate your own conclusions
- Test against discussions, experience, and
experiments
- Learning from reading is essential to being a
successful computer scientist and leader
133How 655 with Mathis Class Works
- Mathis usually talks a lot
- This term I want more student interaction
- Students should learn a lot
- I've tried to structure topics of interest
- Students investigate and share with class
- Goal: enable students to learn without the
teacher
- I want to help you in this direction, not force
it on you by default
- There will be a lot of change in the next 40
years
- Information on web site is frequently
updated (primary method of class communication)
134Do I have to come to class?
- I don't usually take attendance, but . . . I
notice who is here.
- Class attendance is more important than students
seem to realize.
- I think the topics I talk about are
important. Lots of new material not covered in
text.
- Class attendance is the most efficient way to
learn the information in the text and course; your
opportunity to ask questions.
- The projects and tests are discussed in class.
- Office hours and e-mail important, but not a
substitute for class attendance.
- I'll miss you.
135Relationship of 655 to Other Courses
- 560 (prerequisite)
- Classical system software and group projects
- 625 (prerequisite)
- 660 Operating Systems (when I have tau