PL/0 and the 655 Project - PowerPoint PPT Presentation

Provided by: bobma5
Transcript and Presenter's Notes

Title: PL/0 and the 655 Project


1
  • PL/0 and the 655 Project

2
CIS 655 Project PL/0
  • Niklaus Wirth, Algorithms + Data Structures =
    Programs, 1976, Prentice-Hall (ISBN
    0-13-022418-9)
  • PL/0: a subset of Pascal
  • Illustrates the way the Pascal P-code compiler
    was built
  • http://www.cs.rochester.edu/u/www/courses/254/PLzero/guide/guide.html
  • 655 Project (100 pts, 40% of total grade)
  • Project proposal (10 pts for turning in,
    revisable)
  • Parser
  • Intermediate step (non-graded) (but 10 pts for
    turning in)
  • Input in syntax of programming language you're
    building a compiler/interpreter for
  • Some kind of output, maybe with XML markup
  • Develop your own test cases
  • Freedom to make the project into something you'll
    enjoy and be proud of

3
Traditional 655 Programming Project(s)
  • Hybrid compiler / interpreter for small language
    with C/Java-style syntax
  • Transform a high-level language to low-level form
  • Reasonable use of tools and software encouraged
  • Primarily individual, some 2-person groups
  • Build on what you already know (e.g., 560)
  • Project done in stages
  • Proposals in class, demonstrations, documentation
  • Develop your own appropriate tests
  • Software to CIS computers using submit command
  • Possible alternatives may be proposed
  • No Perl implementations (use C or Java)

4
655 Project Ideas
  • Small language like C or Pascal or Basic
  • Mathis has used since 1971 in various forms
  • Lisp / Scheme interpreter
  • Very similar to other sections of 655
  • XML parser, XSLT processor
  • Experimental but of current interest
  • Project Tests
  • Responsible for writing test cases
  • Grader will review, not do
  • Illustrate all the features you're claiming
  • Illustrate all your error checking
  • Project you can describe with pride

5
655 Project Options
  • Encourage you to make this into something you'll
    enjoy and be proud of
  • Flexibility probably unusual
  • Available resources (books, Internet, etc.)
  • Acknowledge their use
  • Do significant work of your own
  • Many different backgrounds and interests
  • Proposals required as a first step; you may want
    an alternative language or alternative
    techniques
  • Target machine: PL/0 machine (easy to expand);
    Java Virtual Machine (JVM) has too many checks
  • Evolving write-up and software together

6
Project Basics
  • Language processor
  • Define the syntax rules for simple language you
    want to process (PL/0, C subset, Lisp)
  • Convert from high-level language to low-level
    directly machine executable version
  • Using PL/0 machine and interpreter is easiest
  • Documentation about use of your processor (use
    cases and user documentation)
  • Test cases to illustrate your project's
    capabilities (use cases to test cases and
    test-driven development)

7
Project Stages
  • Proposal (preliminary write-up) for project
  • By e-mail to grader
  • Simple parser for simple imperative language
  • To exercise submit process
  • Simple interpreter
  • (step that doesn't have to be turned in)
  • Final complete project
  • Significant write-up, electronic submission
  • No additional program in Lisp/Scheme

8
655 Project Unified Process
  • Unified Software Development Process (Rational)
  • Unified Modeling Language (UML)
  • Only to begin understanding, not required to use
  • Unified Process
  • Inception Phase: begin understanding the problem
    and what you might do
  • Spiral Approach: try to have some partial version
    at each stage
  • Project Report - proposal - introductory part
  • Risk analysis: small steps rather than being
    overwhelmed, some small test programs

9
Possible Approaches to Project
  • Do a good job with what you know
  • Use Resolve/C to implement compiler
  • Extension of what you did in CSE 321 and 560
  • Add skills described in this course
  • Use Pascal PL/0 example as a guide
  • Implement in Java as talked about in class
  • Investigate additional skills
  • Use Visual C or Lisp for different language
  • Define your own approach to the project
  • Project Step 1: Detailed Proposal
  • Test-Driven Development (based on use cases) for
    refinement
  • Refactoring: improving design of existing code

10
Test Input
  • It's your language, your implementation, and you
    know the features and restrictions; therefore:
  • You supply the test input (lots of tests)
  • Sample input programs/syntax in your language
  • Intermediate results of tokenizing
  • Intermediate results of code generation
  • Top-level execution
  • Tell the grader what he should expect when
    running the tests and why you chose what you did
    (show off this or that feature, exercise an error
    message, clever program in your language)
  • Illustrate the capabilities of your language and
    processor
  • Syntax error processing not required
  • Grader not testing your project, but evaluating
    if you adequately tested
  • Even if code is generated by JavaCC or done with
    pre-built classes (StringTokenizer for example)
  • Build test cases and automatically run them at
    some point in build

12
Why Study PL/0
  • Need to look at large programs
  • PL/0 is a classic
  • Understand how Pascal (and Algol and Ada) works
  • Local variables
  • Recursion
  • How recursive descent parsing works
  • How typical language features added
  • Code generation
  • Working of a computer interpreter/emulator
  • See how everything is brought together

13
PL/0 program structure
  • Code for body of procedure after declarations of
    subprograms (main code is at end of listing)
  • Initialize keyword arrays, operator symbols,
    mnemonics, and so forth
  • Initialize variables controlling scanning
    (getting the individual characters), lexical
    analysis (forming tokens), and parsing
  • Call the <block> recognizing routine
  • Note that block ends with a call to listcode
  • Call the virtual machine interpreter
  • Machine code kept in an array between phases
  • Need to add to the output capabilities of PL/0

14
Simple Syntax Processing Prerequisites
  • 321, 560, 625: basics of processing simple
    languages
  • 655 to advance and unify that understanding
  • (multi-char) symbols are syntax components
  • Low-level, read a character at a time and build
    symbols / tokens (PL/0 getsym, getch, low-level)
  • Higher-level implementation language might have
    string tokenizer in language (Java,
    StringTokenizer)
  • Compiler-generating tools: lex/yacc, JavaCC
  • Wirth's approach to describing syntax: graphs as
    flow charts for programming a recursive descent
    parser
  • Textbook Chapters 3 and 4

15
Specification of Syntax
  • PL/0
  • How the nesting of expression, term and factor in
    PL/0 work together and generate code
  • How the nesting of recognition routines has the
    effect of static scoping
  • Project questions and answers
  • UML Use Case Modeling
  • General Problem of Describing Syntax
  • Recursive Descent Parsing
  • Attribute Grammars
  • Describing the Meanings of Programs:
    Dynamic Semantics

16
Unstated Assumptions
  • Input program read top-to-bottom, left-to-right,
    with no backtracking
  • Things declared before they are used
  • No redefining at same level
  • Inner declarations hidden by nesting
  • Inner can locally hide outer declarations
  • Other information about the language not
    specified with the BNF
  • Identifier length
  • Maximum integer value
  • Other restrictions on your compiler
  • Symbol table size
  • Code array size
  • Specify these in your description of your
    language processor
  • Recognize the restrictions you've implied

17
Syntax, semantics, language
  • Syntax - the form or structure of the
    expressions, statements, and program units
  • Semantics - the meaning of the expressions,
    statements, and program units
  • Sentence - string of characters over some
    alphabet (maybe what are usually words)
  • Language - set of sentences
  • Lexeme - lowest level syntactic unit of a
    language (e.g., *, sum, begin)
  • Token - category of lexemes (e.g., identifier)

18
Language (following Wirth)
  • L = L(T, N, P, S)
  • Vocabulary T of terminal symbols
  • Set N of non-terminal symbols (grammatical
    categories)
  • Set P of productions (syntactical rules)
  • Symbol S (from N) called the start symbol
  • Language is the set of sequences of terminal
    symbols that can be generated (directly or
    indirectly) from S (that's his points 3 and 4)

19
Backus Normal Form (1959)
  • Invented by John Backus to describe Algol 58
  • BNF is equivalent to context-free grammars
  • A metalanguage is a language used to describe
    another language.
  • In BNF, abstractions are used to represent
    classes of syntactic structures--they act like
    syntactic variables (also called nonterminal
    symbols)
  • e.g. <while_stmt> -> while <logic_expr> do
    <stmt>
  • This is a rule; it describes the structure of a
    while statement

20
Syntax rules
  • A rule has a left-hand side (LHS) and a
    right-hand side (RHS), and consists of terminal
    and non-terminal symbols
  • A grammar is a finite nonempty set of rules
  • An abstraction (or non-terminal symbol) can have
    more than one RHS
  • <stmt> -> <single_stmt> | begin <stmt_list>
    end
  • Syntactic lists are described in BNF using
    recursion
  • <ident_list> -> ident | ident, <ident_list>
  • A derivation is a repeated application of rules,
    starting with the start symbol and ending with a
    sentence (all terminal symbols)

21
An example grammar
  • <program> -> <stmts>
  • <stmts> -> <stmt> | <stmt> ; <stmts>
  • <stmt> -> <var> = <expr>
  • <var> -> a | b | c | d
  • <expr> -> <term> + <term> | <term> - <term>
  • <term> -> <var> | const

22
An example derivation
  • <program> => <stmts>
  • => <stmt>
  • => <var> = <expr>
  • => a = <expr>
  • => a = <term> + <term>
  • => a = <var> + <term>
  • => a = b + <term>
  • => a = b + const
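
The derivation above can be replayed mechanically. The following is a minimal sketch in Python, not part of the original slides: the dictionary layout and the names `GRAMMAR`, `leftmost_step`, and `derive` are assumptions made for illustration. Each step expands the leftmost non-terminal with a chosen alternative.

```python
# Example grammar: each non-terminal maps to its list of alternatives.
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["const"]],
}


def leftmost_step(form, choice):
    """Expand the leftmost non-terminal of a sentential form."""
    for i, symbol in enumerate(form):
        if symbol in GRAMMAR:
            return form[:i] + GRAMMAR[symbol][choice] + form[i + 1:]
    raise ValueError("form is already a sentence")


def derive(choices):
    """Return every sentential form of a leftmost derivation."""
    form = ["<program>"]
    steps = [form]
    for choice in choices:
        form = leftmost_step(form, choice)
        steps.append(form)
    return steps


# Alternative picked at each of the eight steps (0-based index).
steps = derive([0, 0, 0, 0, 0, 0, 1, 1])
for form in steps:
    print(" ".join(form))
```

Every printed line is a sentential form; only the last one is a sentence, because it contains terminal symbols only.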

23
Derivation explanation
  • Every string of symbols in the derivation is a
    sentential form
  • A sentence is a sentential form that has only
    terminal symbols
  • A leftmost derivation is one in which the
    leftmost non-terminal in each sentential form is
    the one that is expanded
  • A derivation may be neither leftmost nor
    rightmost
  • Parse tree is a hierarchical representation of a
    derivation

24
Parsing another view

25
Static Semantics
  • Other information about the language not
    specified with the BNF
  • Identifier length
  • Maximum integer value
  • Other restrictions on your compiler
  • Symbol table size
  • Code array size
  • Specify these in your description of your
    language processor
  • Recognize the restrictions you've implied

26
Unstated Assumptions
  • Input program read top-to-bottom, left-to-right,
    with no backtracking
  • Things declared before they are used
  • No redefining at same level
  • Inner declarations hidden by nesting
  • Inner can locally hide outer declarations

27
Ambiguity Right Recursive
  • A grammar is ambiguous if and only if it
    generates a sentential form that has two or more
    distinct parse trees
  • If we use the parse tree to indicate precedence
    levels of the operators, we cannot have ambiguity
  • Operator associativity can also be indicated by a
    grammar
  • <expr> -> <expr> + <expr> | const (ambiguous)
  • <expr> -> <expr> + const | const (unambiguous)
  • Left recursive (left associative) (recursive
    descent will require right recursive)
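
To make the difference concrete, the sketch below counts parse trees for a string of n `const` operands joined by `+` under each of the two rules. It is an illustration only; the function names are assumptions, and the counting argument (pick which `+` sits at the root) is standard rather than taken from the slides.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def trees_ambiguous(n):
    """Parse trees for n operands under <expr> -> <expr> + <expr> | const."""
    if n == 1:
        return 1
    # Any '+' may be the root: i operands go left, n - i go right.
    return sum(trees_ambiguous(i) * trees_ambiguous(n - i) for i in range(1, n))


def trees_left_recursive(n):
    """Parse trees under <expr> -> <expr> + const | const."""
    # The right operand of the root must be a single const, so the
    # rightmost '+' is always the root and the shape is forced.
    return 1 if n == 1 else trees_left_recursive(n - 1)


for n in (1, 2, 3, 4):
    print(n, trees_ambiguous(n), trees_left_recursive(n))
```

For `const + const + const` the first rule admits two trees (one grouping to the left, one to the right), while the left-recursive rule admits exactly one, the left-associative tree.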

28
Extended BNF (abbreviations)
  • Optional parts are placed in brackets ([ ])
  • <proc_call> -> ident [ ( <expr_list> ) ]
  • Put alternative parts of RHSs in parentheses and
    separate them with vertical bars
  • <term> -> <term> (+ | -) const
  • Put repetitions (0 or more) in braces ({ })
  • <ident> -> letter {letter | digit}

29
BNF / EBNF
  • BNF
  • <expr> -> <expr> + <term>
  •         | <expr> - <term>
  •         | <term>
  • <term> -> <term> * <factor>
  •         | <term> / <factor>
  •         | <factor>
  • EBNF
  • <expr> -> <term> {(+ | -) <term>}
  • <term> -> <factor> {(* | /) <factor>}

30
Syntax Graphs
  • Put the terminals in circles or ellipses and put
    the non-terminals in rectangles
  • Connect with lines with arrowheads
  • e.g., Pascal type declarations

31
Wirth's Rules
  • B1: Reduce system of syntax graphs to a few of
    reasonable size (not consistent with modern Java)
  • B2: Translate each graph to a procedure according
    to subsequent rules
  • B3: Sequence of elements translates to
  • begin T(S1); T(S2); ...; T(Sn) end, or
    { T(S1); T(S2); ...; T(Sn) }
  • procedure TSx() begin TS1(); getsym(); TS2();
    getsym(); ... end

32
<term> -> <factor> {(* | /) <factor>}
  • { Pascal comment }
    begin
      factor;
      while sym in [times, slash] do
      begin
        mulop := sym; getsym; factor;
        gen_proper_op
      end
    end

33
<term> -> <factor> {(* | /) <factor>}
  • void term() {
  •   factor();  /* parse the first factor */
  •   while (next_token == aster_code ||
  •          next_token == slash_code) {
  •     lexical();  /* get next token */
  •     factor();   /* parse the next factor */
  •   }
  • }

34
Recursive Descent Parsing
  • Parsing - constructing a parse / derivation tree
    for a given input string
  • Lexical analyzer is called by the parser
  • A recursive descent parser traces out a parse
    tree in top-down order; it is a top-down parser
  • Each non-terminal in the grammar has a subprogram
    associated with it; the subprogram parses all
    sentential forms that the non-terminal can
    generate
  • The recursive descent parsing subprograms are
    built directly from the grammar rules
  • Recursive descent parsers cannot be built from
    left-recursive grammars
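
As a rough illustration of "one subprogram per non-terminal", here is a small Python sketch of a recursive descent parser for the EBNF expression rules shown earlier. It is not the course's code: the `<factor>` rule (name, number, or parenthesized expression), the pre-split token list, and the postfix output are all assumptions chosen to keep the example short.

```python
def parse_expr(tokens):
    """Parse a token list and return the operands/operators in postfix order.

    <expr>   -> <term> {(+ | -) <term>}
    <term>   -> <factor> {(* | /) <factor>}
    <factor> -> name | number | ( <expr> )      (assumed rule)
    """
    pos = 0
    out = []

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def factor():
        tok = peek()
        if tok == "(":
            advance()
            expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        elif tok is not None and tok.isalnum():
            out.append(tok)
            advance()
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        factor()
        while peek() in ("*", "/"):   # the {...} repetition becomes a loop
            op = peek()
            advance()
            factor()
            out.append(op)            # emit the operator after both operands

    def expr():
        term()
        while peek() in ("+", "-"):
            op = peek()
            advance()
            term()
            out.append(op)

    expr()
    if peek() is not None:
        raise SyntaxError(f"trailing input at {peek()!r}")
    return out


print(parse_expr("a + b * c".split()))
print(parse_expr("a - b - c".split()))
```

The loops replace the left recursion of the BNF version, yet operators are still emitted left to right, so `a - b - c` keeps its left-associative meaning.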

35
PL/0 Program Structure
  • Initialize keyword arrays, operator symbols,
    mnemonics, and so forth
  • Initialize variables controlling scanning
    (getting the individual characters), lexical
    analysis (forming tokens), and parsing
  • Call the <block> recognizing routine
  • Note that block ends with a call to listcode
  • Call the virtual machine interpreter
  • Machine code kept in an array between phases
  • Need to add to the output capabilities of PL/0

36
Blocks and Static Scoping
  • Blocks are different from sequences of statements
    or compound statements
  • Blocks can include declarations
  • Sort of like a single-use subprogram, used and
    defined here
  • Where can blocks appear?
  • Ada: almost anywhere a statement could be
  • Pascal: only as bodies of procedures
  • Java: inner classes

37
Data Specific to a Procedure
  • To be able to return from call
  • Program address of its call (return address)
  • Address of data segment of caller
  • Keep in data segment of procedure as
  • RA (return address), DL (dynamic link)
  • Location of variables
  • Relative address only (since memory dynamic)
  • Displacement off base address of appropriate data
    segment (locally B register or by descending
    chain of static links)
  • What does static scoping mean here?

38
Example of Static Scoping
  • void a {
      local variable one
      void b {
        local variable two
        void c {
          local variable three
          // beginning of code for c
          reference one, two
          call b
        } // end of c
        // beginning of code for b
        reference one, two
        call c
      } // end of b
      // beginning of code for a
      call b
    } // end of a
  • a -> b -> c -> b

39
Example of Static Scoping
  • In a, one is local
  • In b, two is local
  • In b, one is a single static level out
  • In c, three is local
  • In c, two is a single static level out
  • In c, one is double static levels out
  • Then c calls b
  • In b, one is still a single static level out

40
Block Recognition Processing
  • Block(level, symbolTableStartingIndex)
  • Page 13, left
  • <block> ::= <const_decl> <var_decl>
    <proc_decl> <statement_body>
  • <proc_decl> ::= procedure <name> <block>
  • Recognize inner block
  • Block(currentLevel+1, currentSymbolTableIndex)
  • Jump around declarations
  • tx0 := tx; table[tx0].adr := cx; gen(jmp,0,0);
    ...
    code[table[tx0].adr].a := cx;
    table[tx0].adr := cx;
    statement(); gen(opr,0,0)  {return}
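
The "jump around declarations" step is a backpatch: the jump is emitted before its target is known and fixed up once the nested procedures have been compiled. The Python sketch below illustrates only that idea. The `gen` helper, the list-of-lists code array, and `compile_block` taking ready-made instruction lists are simplifying assumptions, not the PL/0 source.

```python
code = []  # code array; each instruction is a mutable [f, l, a] list


def gen(f, l, a):
    """Append one instruction and return its index (the old value of cx)."""
    code.append([f, l, a])
    return len(code) - 1


def compile_block(nested_procedures, body):
    """Emit code for a block.

    nested_procedures: list of instruction lists, standing in for the code
    of procedures declared inside this block.
    body: instruction list standing in for the block's own statement.
    """
    jump_at = gen("jmp", 0, 0)        # target not known yet
    for procedure in nested_procedures:
        for instruction in procedure:
            gen(*instruction)
    code[jump_at][2] = len(code)      # backpatch: the body starts here
    for instruction in body:
        gen(*instruction)
    gen("opr", 0, 0)                  # return
    return jump_at


compile_block([[("lit", 0, 1), ("opr", 0, 0)]], [("lit", 0, 2)])
for index, instruction in enumerate(code):
    print(index, *instruction)
```

After the call, instruction 0 jumps to index 3, which is the first instruction of the block body, skipping the nested procedure's code.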

41
Symbol Table and Static Scope
  • Variable declaration: storage allocated by
    incrementing DX (data index) by 1
  • Initially DX is 3 to allocate space for the
    block mark (RA, DL, and SL)
  • Symbol table (table)
  • enter: enter object into table
  • Nested in block which determines static scoping
  • Recursive calls make table act like a stack
  • position: find identifier id in table
  • Linear search backward
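
A minimal Python sketch of a symbol table that behaves this way follows. The class, the `mark`/`release` helpers, and `address_pair` are hypothetical names introduced for illustration; the entries mirror the a / b / c example, with every first local variable at displacement 3, just past the block mark.

```python
class SymbolTable:
    """Linear table that acts like a stack as blocks open and close."""

    def __init__(self):
        self.entries = []  # each entry: (name, level, adr)

    def enter(self, name, level, adr):
        self.entries.append((name, level, adr))

    def position(self, name):
        """Search backward so the innermost visible declaration wins."""
        for index in range(len(self.entries) - 1, -1, -1):
            if self.entries[index][0] == name:
                return index
        return None

    def mark(self):
        """Remember the table size on entry to a block."""
        return len(self.entries)

    def release(self, mark):
        """Drop everything declared since mark (leaving the block)."""
        del self.entries[mark:]


def address_pair(table, name, current_level):
    """Return (static level difference, displacement) for a name."""
    index = table.position(name)
    if index is None:
        raise NameError(name)
    _, level, adr = table.entries[index]
    return current_level - level, adr


table = SymbolTable()
table.enter("one", 0, 3)      # declared in a
table.enter("two", 1, 3)      # declared in b
mark_c = table.mark()
table.enter("three", 2, 3)    # declared in c
in_c = [address_pair(table, n, 2) for n in ("three", "two", "one")]
table.release(mark_c)         # leaving c: its names are no longer visible
in_b = [address_pair(table, n, 1) for n in ("two", "one")]
print(in_c, in_b)
```

Inside c the level differences are 0, 1, and 2 for `three`, `two`, and `one`; once c's entries are released, `three` can no longer be found from b.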

42
Blocks and Scoping
  • Nesting of blocks determines scope
  • Restoring symbol table pointers makes symbol
    table work like stack
  • Inner definitions lost to outer contexts
  • Idea: make symbol table work like a tree (one
    branch along a tree looks like a stack)

43
PL/0 Virtual Machine
  • Section 5.10 (page 6 of handout)
  • Stack machine: primary data store is stack
  • push, pop, insert or retrieve from within
  • Operations on top of stack (add, test, etc.)
  • Program store: array named code
  • Unchanged during interpretation
  • I: instruction register
  • P: program address register
  • Data store: array named S (stack)

44
Example of Static Scoping (Repeat)
  • void a {
      local variable one
      void b {
        local variable two
        void c {
          local variable three
          // beginning of code for c
          reference one, two
          call b
        } // end of c
        // beginning of code for b
        reference one, two
        call c
      } // end of b
      // beginning of code for a
      call b
    } // end of a
  • a -> b -> c -> b

45
Example of Static Scoping (Repeat)
  • In a, one is local
  • In b, two is local
  • In b, one is a single static level out
  • In c, three is local
  • In c, two is a single static level out
  • In c, one is double static levels out
  • Then c calls b
  • In b, one is still a single static level out

46
Stack of PL/0 Machine (Fig. 5.7)
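[Figure: stack of the PL/0 machine. Labels: block mark (DL, RA, SL); Dynamic Link; Static Link; data segments for A local vars, B local vars, C local vars, B local vars; T marks the top of the stack.]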
47
Data Specific to a Procedure (again)
  • To be able to return from call
  • Program address of its call (return address)
  • Address of data segment of caller
  • Keep in data segment of procedure as
  • RA (return address), DL (dynamic link)
  • Location of variables
  • Relative address only (since memory dynamic)
  • Displacement off base address of appropriate data
    segment (locally B register or by descending
    chain of static links)
  • What does static scoping mean here?

48
Machine Definition
  • Java 5 enum example of machine operations
  • PL/0 virtual machine emulator
  • But how do high-level (programming language)
    structures relate to low-level (machine level)
    structures?
  • Control structures
  • Data structures
  • Program component re-combination

49
Compilation Mapping
  • Input: c = a + b
  • Output
  • Symbol table: c and corresponding location
  • save to later generate store
  • Symbol table: a and corresponding location
  • save to generate addition
  • Symbol table: b and corresponding location
  • end of statement, generate saved operations
  • Stack-oriented machine code: load a, load b,
    add, store c
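
The mapping above can be sketched in a few lines of Python. This is only an illustration for statements of the exact shape `target = left op right`; the `compile_assignment` name, the `addresses` dictionary standing in for the symbol table, and the instruction tuples are assumptions rather than the PL/0 instruction set.

```python
OPERATIONS = {"+": "add", "-": "sub"}


def compile_assignment(source, addresses):
    """Translate 'target = left op right' into stack-machine instructions."""
    target, assign, left, op, right = source.split()
    if assign != "=":
        raise SyntaxError("expected '='")
    pending_store = ("store", addresses[target])   # saved until end of statement
    code = [("load", addresses[left])]
    pending_operation = (OPERATIONS[op],)          # saved until right operand is loaded
    code.append(("load", addresses[right]))
    code.append(pending_operation)                 # end of expression
    code.append(pending_store)                     # end of statement
    return code


program = compile_assignment("c = a + b", {"a": 3, "b": 4, "c": 5})
print(program)
```

The store and the addition are both recognized before their operands are ready, so each is held back and emitted once the code that produces its operands has been generated.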

50
PL/0 Code Generation
  • (page 7 of handout)
  • Addresses are generated as pairs of numbers
    indicating the static level difference and the
    relative displacement within a data segment.
  • But how does the compiler figure this out?
  • PL/0 code
  • Other question: how does PL/0 handle forward
    references?

51
PL/0 Machine Commands
  • LIT - load numbers (literals) onto the stack
  • LOD - fetch variable values to top of stack
  • STO - store values at variable locations
  • CAL - call a subprogram
  • INT - allocate storage by incrementing stack
    pointer (T)
  • JMP - transfer of control
  • (new program address - P)
  • JPC - conditional transfer of control
  • OPR - arithmetic and relational operators

52
More on PL/0 Code Generation
  • fct = (lit, opr, lod, sto, cal, int, jmp, jpc);
  • instruction = packed record
      f: fct;          {function code}
      l: 0 .. levmax;  {level}
      a: 0 .. amax     {displacement address}
    end;
  • procedure gen(x: fct; y, z: integer);
    begin
      code[cx].f := x; code[cx].l := y;
      code[cx].a := z; cx := cx + 1
    end;
  • procedure listcode; var i: integer;
    begin  {list code generated for this block}
      for i := cx0 to cx-1 do
        writeln(i, mnemonic[code[i].f]:5,
                code[i].l:3, code[i].a:5)
    end;

53
PL/0 Interpreter
  • t := 0; b := 1; p := 0;  {initialize registers}
    s[1] := 0; s[2] := 0; s[3] := 0;  {initialize memory}
    repeat  {instruction fetch loop}
      i := code[p]; p := p + 1;
      with i do
      case f of  {decode instruction}
        lit: begin t := t + 1; s[t] := a end;
        opr: case a of
               1: s[t] := -s[t];
               2: begin t := t - 1;
                    s[t] := s[t] + s[t+1] end;
             end;
        jmp: p := a;
        sto: begin s[base(l)+a] := s[t];
               writeln(s[t]); t := t - 1 end;
        cal: begin  {generate new block mark}
               s[t+1] := base(l); s[t+2] := b;
               s[t+3] := p; b := t + 1; p := a
             end
      end
    until p = 0  {not a good way to end}
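
For readers who prefer to experiment outside Pascal, here is a compact Python sketch of the same fetch/decode loop. It is an approximation under stated assumptions: instructions are `(f, l, a)` tuples, only a handful of `opr` codes are implemented (0 return, 1 negate, 2 add, 4 multiply), stores are collected in a list instead of being printed, and the `run` function name and `stack_size` parameter are invented for this example.

```python
def run(code, stack_size=64):
    """Interpret (f, l, a) instructions; return the values written by sto."""
    s = [0] * stack_size   # data store; index 0 unused, like the 1-based array
    t, b, p = 0, 1, 0      # top of stack, base of current segment, program address
    stored = []

    def base(level):
        """Follow the static link chain `level` steps down from b."""
        b1 = b
        while level > 0:
            b1 = s[b1]
            level -= 1
        return b1

    while True:
        f, l, a = code[p]
        p += 1
        if f == "lit":
            t += 1
            s[t] = a
        elif f == "opr":
            if a == 0:                 # return from procedure
                t = b - 1
                p = s[t + 3]
                b = s[t + 2]
            elif a == 1:
                s[t] = -s[t]
            elif a == 2:
                t -= 1
                s[t] = s[t] + s[t + 1]
            elif a == 4:
                t -= 1
                s[t] = s[t] * s[t + 1]
            else:
                raise ValueError(f"opr {a} not implemented in this sketch")
        elif f == "lod":
            t += 1
            s[t] = s[base(l) + a]
        elif f == "sto":
            s[base(l) + a] = s[t]
            stored.append(s[t])        # the Pascal version prints here
            t -= 1
        elif f == "cal":               # build the new block mark: SL, DL, RA
            s[t + 1] = base(l)
            s[t + 2] = b
            s[t + 3] = p
            b = t + 1
            p = a
        elif f == "int":
            t += a
        elif f == "jmp":
            p = a
        elif f == "jpc":
            if s[t] == 0:
                p = a
            t -= 1
        else:
            raise ValueError(f"unknown instruction {f!r}")
        if p == 0:                     # same termination test as the original
            return stored


program = [
    ("jmp", 0, 5),   # 0: jump around the procedure
    ("int", 0, 3),   # 1: procedure: block mark only
    ("lit", 0, 7),   # 2
    ("sto", 1, 3),   # 3: store into the caller's variable, one level out
    ("opr", 0, 0),   # 4: return
    ("int", 0, 4),   # 5: main: block mark plus one variable
    ("cal", 0, 1),   # 6
    ("lod", 0, 3),   # 7
    ("lit", 0, 1),   # 8
    ("opr", 0, 2),   # 9: add
    ("sto", 0, 3),   # 10
    ("opr", 0, 0),   # 11: return from main sets p to 0
]
print(run(program))
```

The `sto 1 3` inside the procedure reaches the main block's variable by following one static link, which is what the level part of an address pair is for.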

54
Project Virtual Machine
  • Can use the design of the PL/0 one
  • Operations in PL/0 are integer oriented; you
    probably want to add to this
  • Can also use other machine designs
  • Hybrid approach: compiles to intermediate form,
    then interprets that
  • Direct interpretation possible if clearly
    proposed
  • Idea: add output whenever computation done
  • Idea: build some messages in that could be output
    with new opr instructions

55
Adding to PL/0
  • Predefined variable names
  • New operator
  • Built-in function
  • Pre-defined function
  • New statement type

56
Adding Predefined (Variable) Names
  • procedure block has 2 parameters
  • lev (the nesting level for the block)
  • tx (starting index for the symbol table)
  • The nested procedure enter is what puts symbols
    (variable names) into the symbol table
  • Right-side page 14 of handout
  • Initialize symbol table
  • Make the initial call of block with a non-zero
    table index
  • Can initialize or do other things not normal in
    the user visible input language

57
Adding New Operator
  • Add to getsym to recognize new symbol
  • Look in condition, expression, term, factor
  • Is new operator parallel to one of those
    operators?
  • Basically another option in code generation
  • If not like existing operators, add new syntactic
    construct.
  • New action: add to PL/0 machine
  • Generate new instruction: gen(opr,0,14)
    (square)
  • Implement new instruction functionality (page 14,
    left-side): 14: begin s[t] := s[t] * s[t] end
  • Add it into list of mnemonics

58
Adding Built-in Function
  • Design new indicator for symbol table
  • Put function name in symbol table
  • Parser will recognize as defined name (there will
    be no way for user to put in)
  • In term:
      if sym = ident then
        i := position(id);
        case kind of
          constant: ...
          variable: ...
          procedure: ...
          built-in:
            begin
              getsym;      {left paren}
              expression;
              getsym;      {right paren}
              gen(opr, 0, new-thing)
            end

59
Adding Pre-defined Function
  • Another approach
  • Put entry into symbol table
  • Make it a regular procedure
  • Initialize the code array to represent the code
    that might have been generated
  • Adding - New statement type
  • Add new syntax into body of statement (page 12)
  • Look at call as an example
  • Syntactic sugar

60
How To Start on the Project
  • Get your tokenizer working
  • This is the getsym procedure of the Pascal
    version of PL/0 distributed in class
  • Can also be done with classes in C and Java
  • Read in sample programs in the language you're
    trying to compile and output the tokens (with
    some other information)
  • Benefits
  • Written some programs in your language
  • Can leave the output statements for debugging
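
As a starting point, a tokenizer of this kind might look like the Python sketch below. It is one possible approach rather than the getsym procedure itself: the keyword set follows PL/0, while the symbol list, the `(kind, lexeme)` tuple output, and the names `KEYWORDS`, `TOKEN_RE`, and `tokenize` are assumptions made for illustration.

```python
import re

# PL/0 reserved words.
KEYWORDS = {
    "begin", "end", "if", "then", "while", "do",
    "call", "const", "var", "procedure", "odd",
}

# One named group per token category; multi-character symbols come first.
TOKEN_RE = re.compile(
    r"(?P<number>\d+)"
    r"|(?P<ident>[A-Za-z][A-Za-z0-9]*)"
    r"|(?P<symbol>:=|<=|>=|[-+*/()=<>#,;.])"
    r"|(?P<space>\s+)"
)


def tokenize(text):
    """Return a list of (kind, lexeme) pairs for the input text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        pos = match.end()
        kind = match.lastgroup
        lexeme = match.group()
        if kind == "space":
            continue                      # whitespace separates tokens only
        if kind == "ident" and lexeme in KEYWORDS:
            kind = "keyword"
        tokens.append((kind, lexeme))
    return tokens


for token in tokenize("while x <= 10 do x := x + 1;"):
    print(token)
```

Printing the pairs for each sample program gives the intermediate tokenizing output that the project's test cases are expected to show.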

61
Requirements Analysis
  • Part of Object-Oriented Analysis and Design (O-O
    A&D)
  • Collect potential requirements
  • Ask users (or think about) how users will use the
    system
  • For incremental development, rank normal and
    exceptional flows
  • Use case diagrams: UML (Unified Modeling
    Language) for talking
  • Use case documents for details
  • Non-functional (and other) requirements (security,
    background tracing, existing s/w)
  • Use Case to Use Your Processor
  • Command line approach
  • Graphical user interface
  • Use Cases -> Boundary Classes
  • Boundary class: façade pattern
  • Implementing those is successive refinement

62
UML Use Case Modeling
  • Program actions from the user viewpoint, e.g.,
    directions for the grader of how to execute your
    program
  • begin developing different aspects of the program
    and planning its eventual actions as soon as
    possible
  • Boundary class is a façade for interaction

[Use case diagram: actor "user"; system "command line interpreter"; use cases "compilation" and "execution"]
63
Software Development Steps
  • Narrative requirements from the users
  • Requirements analysis to be sure needs are well
    understood
  • Use cases: user perspective
  • Use case analysis from the development
    perspective
  • Use case analysis from a testing perspective
  • Identification of boundary classes in the design
  • Determination of business rules and logic
  • Design of supporting classes (behind façade)
  • Rest of the design and development process

64
  • PL/0 and the 655 Project
  • Lisp and then XML/XSLT ?