Title: Syntax and Semantics
Chapter 3
Issues of a programming language
- The language description must be concise yet understandable
- The language description must be understood by a diverse group of people
- The language itself must define both the structure and the meaning of its constructs, such as expressions, statements and program units
- Language users must be able to write programs by referring to the language's reference manual
Properties of programming languages
- Syntax: the form of the language's expressions, statements and program units.
- Semantics: the meaning of the language's expressions, statements and program units.
- Given a language L that uses an alphabet, Σ, of characters, what are the ways to define a programming language?
Ways to define programming languages (1)
- Language recognizers
  - Use a recognition device, R, to
    - read strings of characters from Σ and determine whether each string is in the language, L, or not
    - separate correct sentences from incorrect sentences
  - The syntax analysis part of a compiler is a language recognizer
Ways to define programming languages (2)
- Language generators
  - Used to generate the sentences of a language
  - Use a generator to determine the correct syntax of a statement by comparing the statement with the structure the generator produces
- Terms associated with programming languages
  - Sentences: the statements or strings of a language
  - Lexemes: the lowest-level syntactic units, which include identifiers, literals, operators and keywords
  - Tokens: categories of lexemes
- Example statement: index = 2 * count + 17;
- Lexemes and their tokens
  - "index": identifier
  - "=": equal_sign
  - "2": int_literal
  - "*": mult_op
  - "count": identifier
  - "+": plus_op
  - "17": int_literal
  - ";": semicolon
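The lexeme-to-token pairing above can be sketched as a small tokenizer. This is a minimal illustration, assuming the example statement is `index = 2 * count + 17;`. The token names come from the slide, while the regular expressions and the function name `tokenize` are assumptions made for this sketch.

```python
import re

# Token names follow the slide; the patterns are assumptions for illustration.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (lexeme, token) pairs for `source`."""
    pairs = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "skip":  # whitespace separates lexemes but is not a token
            pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:6} {token}")
```

Each pattern is a regular expression, which matches the later point that regular grammars are enough to describe tokens.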
- What is a grammar?
  - A grammar is a formal method for describing syntax
  - Two classes of grammar from Chomsky's hierarchy are used for programming languages
    - Regular grammars: used to describe tokens
    - Context-free grammars: used to describe the programming language itself as a whole
What is BNF?
- Acronym for Backus-Naur Form
- A formal notation for specifying programming language syntax, first used to describe ALGOL 58
- Developed by John Backus in 1959 and later amended by Peter Naur in 1960
- Nearly identical to Chomsky's context-free grammars
- A metalanguage for programming languages
- A metalanguage is a language used to describe another language.
- An example of a metalanguage is BNF
BNF Basics
- It is a metalanguage used to describe another language.
- It is a collection of rules, or productions, of the form
  - LHS → RHS
  - where LHS (left-hand side) is the nonterminal being defined and RHS (right-hand side) is its definition, which consists of tokens, lexemes or references to other rules.
- A rule is made up of terminal (T) and nonterminal (NT) symbols
- Example of a rule: <assign> → <var> = <expression>
BNF Basics (cont.)
- A nonterminal symbol can have two or more distinct definitions, given by rules of the grammar, whereas terminal symbols are the lexemes and tokens of the language and are not defined further.
- In the previous example, <assign>, <var> and <expression> are nonterminal symbols and = is a terminal symbol
BNF Basics (cont.)
- BNF is used to describe lists of similar constructs, the order in which different constructs must appear, nested structures, operator precedence, and operator associativity.
- BNF is a generative device for defining languages. This means that the sentences of a programming language defined in BNF are generated by applying the grammar's rules.
- Ex. Grammar 1
  - <program> → begin <stmt_list> end
  - <stmt_list> → <stmt> | <stmt> ; <stmt_list>
  - <stmt> → <var> = <expr>
  - <expr> → <var> + <var> | <var> - <var> | <var>
  - <var> → a | b | c
Anatomy of Grammar 1
- <program> → begin <stmt_list> end
  - START T NT T
- <stmt_list> → <stmt> | <stmt> ; <stmt_list>
  - NT NT NT T NT
- <stmt> → <var> = <expr>
  - NT NT T NT
- <expr> → <var> + <var> | <var> - <var> | <var>
  - NT NT T NT NT T NT NT
- <var> → a | b | c
  - NT T T T
- A program that uses Ex. Grammar 1
  - begin
  - a = b + c ;
  - b = c
  - end
- How do you derive the program above using Grammar 1 shown earlier?
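One way to check a derivation is to write a recognizer with one function per nonterminal. The sketch below assumes Grammar 1 reads `<program> → begin <stmt_list> end`, `<stmt_list> → <stmt> | <stmt> ; <stmt_list>`, `<stmt> → <var> = <expr>`, `<expr> → <var> + <var> | <var> - <var> | <var>` and `<var> → a | b | c`, and that lexemes are separated by whitespace. The function name `recognize` is hypothetical.

```python
def recognize(source):
    """Return True if `source` is a sentence of Grammar 1 (as assumed above)."""
    tokens = source.split()  # assumption: lexemes are separated by whitespace
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(lexeme):
        nonlocal pos
        if peek() != lexeme:
            raise SyntaxError(f"expected {lexeme!r}, found {peek()!r}")
        pos += 1

    def var():  # <var> -> a | b | c
        nonlocal pos
        if peek() not in ("a", "b", "c"):
            raise SyntaxError(f"expected a variable, found {peek()!r}")
        pos += 1

    def expr():  # <expr> -> <var> + <var> | <var> - <var> | <var>
        nonlocal pos
        var()
        if peek() in ("+", "-"):
            pos += 1
            var()

    def stmt():  # <stmt> -> <var> = <expr>
        var()
        expect("=")
        expr()

    def stmt_list():  # <stmt_list> -> <stmt> | <stmt> ; <stmt_list>
        stmt()
        if peek() == ";":
            expect(";")
            stmt_list()

    try:
        expect("begin")  # <program> -> begin <stmt_list> end
        stmt_list()
        expect("end")
        return pos == len(tokens)  # no trailing input allowed
    except SyntaxError:
        return False


print(recognize("begin a = b + c ; b = c end"))
print(recognize("begin a = b + end"))
```

The order in which the functions call each other mirrors a leftmost derivation: `<program>`, then `<stmt_list>`, then the first `<stmt>`, and so on.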
Grammar and Parse Trees
- Grammars naturally describe the hierarchical syntactic structure of sentences, i.e., their parse trees
- Ex. Grammar 2 for an assignment statement
  - <assign> → <id> = <expr>
  - <id> → A | B | C
  - <expr> → <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
Grammar and Parse Trees
- <assign> → <id> = <expr>
- <id> → A | B | C
- <expr> → <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
- Use Grammar 2 to parse the statements
  - A = B + A * C
  - A = B * A + C
- What can you say about operator precedence?
Ambiguous vs. Unambiguous Grammar
- Ambiguous grammar: a grammar that generates a sentence for which there are two or more distinct parse trees.
- Unambiguous grammar: a grammar in which every sentence it generates has exactly one parse tree.
Ambiguous Grammar
- Ex. Grammar 3 for an assignment statement
  - <assign> → <id> = <expr>
  - <id> → A | B | C
  - <expr> → <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id>
- Example statement: A = B + C * A
- Based on the above example, why is the grammar ambiguous?
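The ambiguity can be made concrete by enumerating every parse tree of the right-hand side of the statement. This is a brute-force sketch, assuming the expression rule is `<expr> → <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id>` with identifiers A, B and C. The function name `parse_trees` and the tuple encoding of trees are assumptions for illustration.

```python
def parse_trees(tokens):
    """Return every parse tree of `tokens` under the assumed <expr> rule of Grammar 3.

    A tree is an identifier, ("()", subtree), or (operator, left, right).
    """
    tokens = tuple(tokens)
    if len(tokens) == 1:
        return [tokens[0]] if tokens[0] in ("A", "B", "C") else []
    trees = []
    # <expr> -> ( <expr> )
    if len(tokens) >= 3 and tokens[0] == "(" and tokens[-1] == ")":
        trees += [("()", inner) for inner in parse_trees(tokens[1:-1])]
    # <expr> -> <expr> + <expr> | <expr> * <expr>: try every operator as the root
    for i, tok in enumerate(tokens):
        if tok in ("+", "*"):
            for left in parse_trees(tokens[:i]):
                for right in parse_trees(tokens[i + 1:]):
                    trees.append((tok, left, right))
    return trees


for tree in parse_trees("B + C * A".split()):
    print(tree)
```

For `B + C * A` the sketch finds two trees, one with `+` at the root and one with `*` at the root, which is exactly what makes the grammar ambiguous.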
Grammar 4: corrected version of Grammars 2 and 3
- <assign> → <id> = <expr>
- <id> → A | B | C
- <expr> → <expr> + <term> | <term>
- <term> → <term> * <factor> | <factor>
- <factor> → ( <expr> ) | <id>
- Example statement: A = B + C * A
- Is operator precedence enforced by Grammar 4 for this statement?
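A small recursive-descent parser shows how the extra nonterminals enforce precedence. The sketch assumes Grammar 4 reads `<expr> → <expr> + <term> | <term>`, `<term> → <term> * <factor> | <factor>` and `<factor> → ( <expr> ) | <id>`, with whitespace-separated lexemes. It returns a simplified tree of nested tuples rather than the full parse tree, and the name `parse_assign` is hypothetical.

```python
def parse_assign(source):
    """Parse '<id> = <expr>' under the assumed Grammar 4; return nested tuples."""
    tokens = source.split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def ident():  # <id> -> A | B | C
        tok = advance()
        if tok not in ("A", "B", "C"):
            raise SyntaxError(f"expected an identifier, found {tok!r}")
        return tok

    def factor():  # <factor> -> ( <expr> ) | <id>
        if peek() == "(":
            advance()
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        return ident()

    def term():  # <term> -> <term> * <factor> | <factor>
        node = factor()
        while peek() == "*":  # the loop stands in for left recursion, keeping left associativity
            advance()
            node = ("*", node, factor())
        return node

    def expr():  # <expr> -> <expr> + <term> | <term>
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = ident()
    if advance() != "=":
        raise SyntaxError("expected '='")
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing input {peek()!r}")
    return tree


print(parse_assign("A = B + C * A"))
```

Because `*` is only reachable through `<term>`, it always ends up lower in the tree than `+`, so it is evaluated first regardless of the order of the operators in the statement.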
Recursivity
- Left recursive
  - A grammar rule is left recursive if its LHS also appears at the left end of its RHS
  - Left-recursive rules specify left associativity
- Right recursive
  - A grammar rule is right recursive if its LHS also appears at the right end of its RHS
  - Right-recursive rules specify right associativity
Right-associative operator: the exponentiation operator
- <factor> → <exp> ** <factor> | <exp>
- <exp> → ( <expr> ) | id
- Example statement: a ** b ** c
- What does the parse tree look like?
- Is operator precedence enforced?
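The effect of right recursion can be sketched with a one-rule parser. It assumes the rule is `<factor> → <exp> ** <factor> | <exp>`, writes the operator as `**`, and limits `<exp>` to single identifiers for brevity. The function name `parse_factor` is hypothetical.

```python
def parse_factor(tokens):
    """Parse a list of lexemes under the assumed rule <factor> -> <exp> ** <factor> | <exp>.

    <exp> is restricted to a single identifier in this sketch.
    """
    if not tokens:
        raise SyntaxError("expected an operand")
    head, rest = tokens[0], tokens[1:]
    if rest and rest[0] == "**":
        # Right recursion: the remainder is parsed as a whole and becomes the right operand.
        return ("**", head, parse_factor(rest[1:]))
    return head


print(parse_factor("a ** b ** c".split()))
```

The rightmost `**` ends up deepest in the tree, so it is evaluated first. Python's own `**` operator groups the same way: `2 ** 3 ** 2` is evaluated as `2 ** (3 ** 2)`.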
Grammar Requirements
- It must be unambiguous
- It must enforce operator precedence
- It must enforce operator associativity
Extended BNF (EBNF)
- Extensions to BNF to improve readability and writability
- Extensions include
  - [ ] to mean optional
  - { } to mean zero or more repetitions
  - { }n, with n as the maximum number of repetitions
  - { }+ to mean one or more repetitions
  - ( | ) to mean multiple choice
Notational Symbols in BNF, EBNF
- < > used for nonterminals (BNF, EBNF)
- → rule definition (BNF, EBNF)
- | selection (BNF, EBNF)
- [ ] optional (EBNF)
- { } repetition (EBNF)
- ( ) used with | for selection (EBNF)
Attribute Grammar: informal definition
- An attribute grammar is an extension of a context-free grammar that can describe more of the structure of a programming language than is possible with BNF. It allows for the description of static semantics, such as variable-type compatibility.
Static Semantics
- Rules about the legal form of programs that cannot be described in BNF, or only with great difficulty. Typically, these are related to type constraints on variables. Static semantics rules are checked at compile time, not at execution time.
Examples of static semantics
- A floating-point value cannot be assigned to an integer-type variable.
- In some languages, such as Ada, if the end statement of a subprogram is followed by a name, the name must match the name of the subprogram.
Attribute Grammar: formal definition
- An attribute grammar is a context-free grammar with the following additional features
  - Associated with each grammar symbol are sets of inherited and synthesized attributes.
    - Synthesized attributes pass semantic information up the parse tree from a node's children
    - Inherited attributes pass semantic information down and across the parse tree from a node's parent and siblings.
  - Associated with each grammar rule is a set of attribute computation functions and, optionally, predicate functions over the attributes of the symbols in the rule.
Other definitions
- Fully attributed parse tree
  - A parse tree in which all the attribute values have been computed.
- Intrinsic attributes
  - Synthesized attributes of leaf nodes whose values are determined outside the parse tree. An example is the type of an identifier, taken from the symbol table.
Example 5: Attribute Grammar
- Syntax rule: <assign> → <var> = <expr>
  - Semantic rule: <expr>.expected_type ← <var>.actual_type
- Syntax rule: <expr> → <var>[2] + <var>[3]
  - Semantic rule: <expr>.actual_type ← if (<var>[2].actual_type = int) and (<var>[3].actual_type = int)
    - then int
    - else real
    - end if
  - Predicate: <expr>.actual_type == <expr>.expected_type
- Syntax rule: <expr> → <var>
  - Semantic rule: <expr>.actual_type ← <var>.actual_type
  - Predicate: <expr>.actual_type == <expr>.expected_type
- Syntax rule: <var> → A | B | C
  - Semantic rule: <var>.actual_type ← look-up(<var>.string)
Steps to building a fully attributed parse tree
- Step 1: From the given grammar, derive the parse tree using the grammar's production rules, as discussed previously.
- Step 2: Get the intrinsic attributes of the leaf nodes.
Steps to building a fully attributed parse tree (cont.)
- Step 3: Calculate the attributes of the intermediate nodes, starting from the leaf nodes and using the attribute computation functions, the predicate functions where present, and the intrinsic attribute values of the leaf nodes.
Sample Parse Tree for the Attribute Grammar of Example 5
- Use the attribute grammar of Example 5 and the steps outlined above to create a fully attributed parse tree of the statement
  - A = A + B
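The three steps can be sketched in code for a statement such as `A = A + B`. This is a minimal illustration, assuming the attribute grammar computes `expected_type` from the target variable and `actual_type` from the operands (int only if both operands are int, otherwise real). The symbol table contents and the names `look_up` and `attribute_assign` are assumptions, not taken from the slides.

```python
# Hypothetical symbol table; the declared types are assumptions for illustration.
SYMBOL_TABLE = {"A": "real", "B": "int", "C": "int"}


def look_up(name):
    """Intrinsic attribute: the actual_type of a <var> leaf comes from the symbol table."""
    return SYMBOL_TABLE[name]


def attribute_assign(target, operands):
    """Compute the attributes for 'target = operands[0]' or 'target = operands[0] + operands[1]'."""
    # Inherited attribute: <expr>.expected_type <- <var>.actual_type of the target
    expected_type = look_up(target)
    # Intrinsic attributes of the operand leaves
    operand_types = [look_up(name) for name in operands]
    if len(operands) == 2:
        # Synthesized attribute for <expr> -> <var>[2] + <var>[3]
        actual_type = "int" if operand_types == ["int", "int"] else "real"
    else:
        # Synthesized attribute for <expr> -> <var>
        actual_type = operand_types[0]
    return {
        "expected_type": expected_type,
        "actual_type": actual_type,
        "predicate_holds": actual_type == expected_type,
    }


print(attribute_assign("A", ["A", "B"]))
```

With the assumed symbol table, `A = A + B` satisfies the predicate, while `B = A + B` would not, because the expression has type real and the target has type int.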
Dynamic Semantics
- A description of what expressions, statements, and program units actually do when executed.
- Three methods to define a language's dynamic semantics
  - Operational Semantics
  - Axiomatic Semantics
  - Denotational Semantics
Operational Semantics
- Defines the meaning of a program written in some language by the changes that occur in the state of the computer that runs the program.
- As each statement is executed, examine the contents of the registers and memory cells
- Needs a low-level virtual computer
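The idea of describing meaning through state changes can be sketched with a very small virtual machine. The instruction set (`load`, `add`), the tuple format and the function name `run` are all hypothetical, chosen only to show a state being recorded after each step.

```python
def run(program, state=None):
    """Execute (op, target, operand) instructions and return the list of successive states.

    An operand that is a string names a memory cell; any other operand is a constant.
    """
    state = dict(state or {})
    trace = [dict(state)]  # the state before any instruction runs
    for op, target, operand in program:
        value = state[operand] if isinstance(operand, str) else operand
        if op == "load":  # target := value
            state[target] = value
        elif op == "add":  # target := target + value
            state[target] = state[target] + value
        else:
            raise ValueError(f"unknown instruction {op!r}")
        trace.append(dict(state))  # snapshot of memory after this step
    return trace


program = [
    ("load", "count", 3),
    ("load", "sum", 0),
    ("add", "sum", "count"),
    ("add", "sum", 1),
]
for step, snapshot in enumerate(run(program)):
    print(step, snapshot)
```

The meaning of the program is then the sequence of snapshots, which is what an operational description asks the reader to examine.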
Operational Semantics (cont.)
- Advantages
  - Provides an effective means of describing semantics as long as the description is kept simple and informal.
  - Intuitive and easy to implement.
- Disadvantages
  - Depends on algorithms, not mathematics, so the semantic descriptions are inexact.
  - Can lead to circular definitions.
  - Limited to simple language constructs.
Axiomatic Semantics
- Defines the meaning of a program through logical assertions placed before and after each statement, which are used to prove that the program executes correctly.
- An assertion is a logical expression that states constraints on the program's variables.
  - Precondition: the assertion before the statement
  - Postcondition: the assertion after the statement executes.
- The weakest precondition is the least restrictive precondition that will guarantee the validity of the postcondition.
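For an assignment statement, the weakest precondition is obtained by substituting the assigned expression for the variable in the postcondition. The sketch below illustrates that rule for the hypothetical statement `a = b + 1` with postcondition `a > 1`. The function name `wp_assign` and the use of dictionaries as program states are assumptions for illustration.

```python
def wp_assign(variable, expression, postcondition):
    """Weakest precondition of 'variable = expression' with respect to `postcondition`.

    `expression` and `postcondition` are functions of a state dictionary. The
    precondition holds in a state exactly when the postcondition holds in the
    state produced by the assignment.
    """
    def precondition(state):
        new_state = {**state, variable: expression(state)}
        return postcondition(new_state)
    return precondition


# Statement: a = b + 1     Postcondition: a > 1
pre = wp_assign("a", lambda s: s["b"] + 1, lambda s: s["a"] > 1)

for b in (-1, 0, 1, 2):
    print(b, pre({"b": b}))
```

Substituting by hand gives `b + 1 > 1`, that is `b > 0`, which agrees with the values printed by the sketch.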
Axiomatic Semantics (2)
- Advantages
  - Powerful tool for research into program correctness proofs.
  - Provides an excellent framework in which to reason about programs, both before and after development.
- Disadvantages
  - Limited in describing the meaning of programming languages.
  - Requires that an axiom or inference rule be defined for each statement type in the language. This has proved to be a difficult task.
Denotational Semantics
- Defines the meaning of a program by mapping each language entity in which the program is written to a mathematical object.
- The idea is that mathematical objects can be rigorously defined using classical mathematics. If a language entity can be mapped by some function to a mathematical object, then the meaning of that entity is defined by the object it maps to.
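A stock illustration of such a mapping, not taken from these slides, is a function that sends binary numerals (language entities) to non-negative integers (mathematical objects). The sketch assumes the syntax `<bin_num> → 0 | 1 | <bin_num> 0 | <bin_num> 1`, and the function name `m_bin` is hypothetical.

```python
def m_bin(numeral):
    """Map a binary numeral, given as a string of '0' and '1', to the integer it denotes.

    One case per assumed syntax rule:
      M('0') = 0
      M('1') = 1
      M(<bin_num> d) = 2 * M(<bin_num>) + M(d)   for a final digit d
    """
    if numeral in ("0", "1"):
        return int(numeral)
    if len(numeral) < 2 or numeral[-1] not in "01":
        raise ValueError(f"not a binary numeral: {numeral!r}")
    return 2 * m_bin(numeral[:-1]) + m_bin(numeral[-1])


print(m_bin("110"))
```

The function has exactly one case per grammar rule, which is the usual shape of a denotational definition: the meaning of a construct is built from the meanings of its parts.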
Denotational Semantics (2)
- Advantages
  - Provides a framework for thinking about programming in a highly rigorous way.
  - Can be used as an aid in language design.
  - Has seen limited use in the automatic generation of compilers.
  - Provides an excellent method of concisely describing a language.
- Disadvantages
  - Too complex to be useful to language users.
  - Difficult to create the mathematical objects and mapping functions.