Title: Describing Syntax and Semantics
1Describing Syntax and Semantics
- Jianhua Yang
- Department of Math and Computer Science
- Bennett College
2Outline
- The general problem of describing syntax
- Semantics
- Formal methods of describing syntax
- Attribute grammars
- Describing the meanings of programs: dynamic semantics
33.1 Introduction
- Describing a language
- Syntax: the form of its expressions, statements, and program units
- Semantics: the meaning of those expressions, statements, and program units
- Example: while (<boolean_expr>) <statements>
43.2 The general problem of describing syntax
- Lexeme: a basic unit from which a programming language is formed
- Token: a category of lexemes
- Example: index = 2 * count + 17;
  Lexemes    Tokens
  index      identifier
  =          equal_sign
  2          int_literal
  *          mult_op
  count      identifier
  +          plus_op
  17         int_literal
  ;          semicolon
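The split of a statement into lexemes and tokens can be sketched in a few lines of Python. This is only an illustration, assuming the example statement is `index = 2 * count + 17;`. The token names `identifier`, `equal_sign`, and `int_literal` come from the slide; `mult_op`, `plus_op`, and `semicolon` are assumed names for the remaining categories.

```python
import re

# Token categories: identifier, equal_sign and int_literal appear on the slide;
# mult_op, plus_op and semicolon are assumed names for this sketch.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (lexeme, token) pairs for the source string."""
    pairs = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "skip":      # whitespace separates lexemes
            pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:<8}{token}")
```

Each printed row pairs one lexeme with its token, so `index` and `count` are different lexemes that share the token `identifier`.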
51. Language recognizer
- Lexical analyzer (lexer)
- Syntax analyzer (Parser)
62. Language generator
- It is a device that can be used to generate the
sentences of a language
73.3 Formal methods of describing syntax
- 1. Backus-Naur Form and Context-Free Grammars
- 2. Extended Backus-Naur Form
81. BNF and Context-Free Grammars
- Regular grammars
- Context-free grammars
9Origins of BNF
- Algol 58 (John Backus)
- Algol 60 (Peter Naur)
10Fundamentals of BNF
- BNF is a metalanguage, which is used to describe another language.
- Example: <assign> → <var> = <expression>
- Rule (production): a definition of one abstraction
- Nonterminal: an abstraction in a BNF description
- Terminal: the lexemes and tokens of the rules
11Example of BNF
- Example 1, multiple definitions of one nonterminal:
  <if_stmt> → if <logic_expr> then <stmt>
            | if <logic_expr> then <stmt> else <stmt>
            | if <log_exp> <stmt>
- Example 2, recursive definition (used in place of an ellipsis):
  <ident_list> → identifier
               | identifier, <ident_list>
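The recursive rule for identifier lists maps directly onto a recursive function. The sketch below assumes the rule reads `<ident_list> → identifier | identifier , <ident_list>` and that the input has already been split into tokens; `is_ident_list` is a hypothetical helper name.

```python
def is_ident_list(tokens):
    """Recognize <ident_list> -> identifier | identifier , <ident_list>."""
    if not tokens or not tokens[0].isidentifier():
        return False
    if len(tokens) == 1:
        return True                       # first alternative: identifier
    # Second alternative: identifier , <ident_list>
    return tokens[1] == "," and is_ident_list(tokens[2:])


print(is_ident_list(["a", ",", "b", ",", "c"]))
print(is_ident_list(["a", ","]))
```

The function calls itself exactly where the rule mentions `<ident_list>` on its right side, which is how recursion stands in for "and so on".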
12Grammars and Derivations
- Start symbol: <program>
- A grammar for a small language
- <program> → begin <stmt_list> end
- <stmt_list> → <stmt> | <stmt> ; <stmt_list>
- <stmt> → <var> = <expression>
- <var> → A | B | C
- <expression> → <var> + <var> | <var> - <var> | <var>
- Example sentence: begin A = B + C; B = C end
13Grammars and Derivations
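- Derivation: a sentence-generation process that begins with the start symbol and repeatedly replaces a nonterminal using one of its rules
- Leftmost derivation: the replaced nonterminal is always the leftmost nonterminal in the current string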
14Grammars and Derivations
- Each of the strings in the derivation is called a sentential form
- Leftmost derivation example:
  <program> => begin <stmt_list> end
            => begin <stmt> ; <stmt_list> end
            => begin <var> = <expression> ; <stmt_list> end
            => begin A = <expression> ; <stmt_list> end
            => begin A = <var> + <var> ; <stmt_list> end
            => begin A = B + <var> ; <stmt_list> end
            => begin A = B + C ; <stmt_list> end
            => begin A = B + C ; <stmt> end
            => begin A = B + C ; <var> = <expression> end
            => begin A = B + C ; B = <expression> end
            => begin A = B + C ; B = <var> end
            => begin A = B + C ; B = C end
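A leftmost derivation can be replayed mechanically. The sketch below assumes the small-language grammar uses `=`, `+`, `-`, and `;` as its operator and separator terminals, and generates the sentence `begin A = B + C ; B = C end`. The dictionary layout and the `CHOICES` list (which alternative to apply at each step, 0-based) are illustrative assumptions.

```python
GRAMMAR = {
    "<program>": [["begin", "<stmt_list>", "end"]],
    "<stmt_list>": [["<stmt>"], ["<stmt>", ";", "<stmt_list>"]],
    "<stmt>": [["<var>", "=", "<expression>"]],
    "<var>": [["A"], ["B"], ["C"]],
    "<expression>": [["<var>", "+", "<var>"], ["<var>", "-", "<var>"], ["<var>"]],
}


def leftmost_derivation(start, choices):
    """Apply one rule alternative per step to the leftmost nonterminal."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Find the leftmost nonterminal in the current sentential form.
        index = next(i for i, symbol in enumerate(form) if symbol in GRAMMAR)
        form[index:index + 1] = GRAMMAR[form[index]][choice]
        steps.append(" ".join(form))
    return steps


# Alternatives that reproduce the derivation of: begin A = B + C ; B = C end
CHOICES = [0, 1, 0, 0, 0, 1, 2, 0, 0, 1, 2, 2]
steps = leftmost_derivation("<program>", CHOICES)
print(steps[0])
for sentential_form in steps[1:]:
    print("=>", sentential_form)
```

Every printed line is a sentential form; only the last one, which contains no nonterminals, is a sentence of the language.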
15Another derivation example
- Grammar
- <assign> → <id> = <expr>
- <id> → A | B | C
- <expr> → <id> + <expr> | <id> * <expr> | <id> | ( <expr> )
- Sentence: A = B * (A + C)
- Leftmost derivation:
  <assign> => <id> = <expr>
           => A = <expr>
           => A = <id> * <expr>
           => A = B * <expr>
           => A = B * ( <expr> )
           => A = B * ( <id> + <expr> )
           => A = B * ( A + <expr> )
           => A = B * ( A + <id> )
           => A = B * ( A + C )
16Parse tree
- Parse tree for A = B * (A + C), shown by indentation:
  <assign>
    <id>: A
    =
    <expr>
      <id>: B
      *
      <expr>
        (
        <expr>
          <id>: A
          +
          <expr>
            <id>: C
        )
17Ambiguity
- A grammar is ambiguous if it generates a sentential form that has two or more distinct parse trees
- Example: A = B + C * A
18Ambiguity
- Parse tree 1 for A = B + C * A (the multiplication is grouped first):
  <assign>
    <id>: A
    =
    <expr>
      <expr>
        <id>: B
      +
      <expr>
        <expr>
          <id>: C
        *
        <expr>
          <id>: A
19Ambiguity
- Parse tree 2 for A = B + C * A (the addition is grouped first):
  <assign>
    <id>: A
    =
    <expr>
      <expr>
        <expr>
          <id>: B
        +
        <expr>
          <id>: C
      *
      <expr>
        <id>: A
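Why two parse trees matter can be seen by evaluating both. The sketch below assumes the ambiguous sentence is `A = B + C * A` and encodes each tree for the right side as nested tuples; the values in `env` are hypothetical and chosen only for illustration.

```python
def evaluate(tree, env):
    """Evaluate a tree given as a variable name or an (op, left, right) tuple."""
    if isinstance(tree, str):
        return env[tree]
    op, left, right = tree
    a, b = evaluate(left, env), evaluate(right, env)
    return a + b if op == "+" else a * b


# The two groupings an ambiguous grammar allows for  B + C * A
tree_mul_first = ("+", "B", ("*", "C", "A"))   # B + (C * A)
tree_add_first = ("*", ("+", "B", "C"), "A")   # (B + C) * A

env = {"A": 2, "B": 3, "C": 4}   # hypothetical values
print(evaluate(tree_mul_first, env))  # 3 + (4 * 2)
print(evaluate(tree_add_first, env))  # (3 + 4) * 2
```

The same sentence yields different results depending on the tree, which is why a compiler that derives meaning from the parse tree needs an unambiguous grammar.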
20Operator Precedence
- An unambiguous grammar for expressions
- <assign> → <id> = <expr>
- <id> → A | B | C
- <expr> → <expr> + <term> | <term>
- <term> → <term> * <factor> | <factor>
- <factor> → ( <expr> ) | <id>
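One way to see how the layered rules encode precedence is a small hand-written parser with one function per nonterminal. This is a sketch, assuming the operators in the grammar are `+` and `*`. Because a recursive-descent parser cannot call itself on a left-recursive rule, each left-recursive rule is written here as a loop, which builds the same left-leaning tree.

```python
def parse_expr(tokens, pos=0):
    """<expr> -> <expr> + <term> | <term>, written as a loop over terms."""
    tree, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_term(tokens, pos + 1)
        tree = ("+", tree, right)          # left-associative grouping
    return tree, pos


def parse_term(tokens, pos):
    """<term> -> <term> * <factor> | <factor>."""
    tree, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "*":
        right, pos = parse_factor(tokens, pos + 1)
        tree = ("*", tree, right)
    return tree, pos


def parse_factor(tokens, pos):
    """<factor> -> ( <expr> ) | <id>, with <id> -> A | B | C."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == "(":
        tree, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return tree, pos + 1
    if tokens[pos] in ("A", "B", "C"):
        return tokens[pos], pos + 1
    raise SyntaxError(f"unexpected token {tokens[pos]!r}")


tree, _ = parse_expr("B + C * A".split())
print(tree)
```

Since `*` is handled one level below `+`, the multiplication always ends up lower in the tree, so only one parse tree is possible for `B + C * A`.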
21Associativity of Operators
- Does the grammar describe the associativity?
- Left-associative
- Right-associative
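- Does the associativity matter for computer addition?
- Associativity in mathematics vs. on a computer: mathematical addition is associative, but floating-point addition on a computer is not necessarily associative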
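A short experiment shows that the grouping of additions can matter on a computer. The sketch uses Python floats (IEEE 754 double precision); the particular values are chosen only to make the effect visible.

```python
big = 1e20

left_grouping = (big + -big) + 1.0    # (a + b) + c
right_grouping = big + (-big + 1.0)   # a + (b + c)

print(left_grouping)
print(right_grouping)
print(left_grouping == right_grouping)
```

In the second grouping the 1.0 is lost when it is added to a number of much larger magnitude, so the two results differ even though they are mathematically equal.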
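22Define associativity of an operator
- Left recursive: a rule has its LHS also appearing at the beginning of its RHS (specifies left associativity)
- Right recursive: a rule has its LHS also appearing at the end of its RHS (specifies right associativity)
- Example (right-associative exponentiation):
  <factor> → <exp> ** <factor> | <exp>
  <exp> → ( <expr> ) | id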
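A right-recursive rule produces a tree that leans to the right. The sketch below assumes the rule is `<factor> → <exp> ** <factor> | <exp>` and, to stay short, treats `<exp>` as a single identifier token; `parse_power` is a hypothetical name.

```python
def parse_power(tokens, pos=0):
    """<factor> -> <exp> ** <factor> | <exp>  (right recursive)."""
    base = tokens[pos]                     # <exp> simplified to one identifier
    if pos + 1 < len(tokens) and tokens[pos + 1] == "**":
        exponent, pos = parse_power(tokens, pos + 2)
        return ("**", base, exponent), pos
    return base, pos + 1


tree, _ = parse_power("A ** B ** C".split())
print(tree)

# Python's own ** operator is right-associative as well:
print(2 ** 3 ** 2 == 2 ** (3 ** 2))
```

The recursive call sits at the end of the rule, so `A ** B ** C` groups as `A ** (B ** C)`.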
232. EBNF
- Not a new form
- Overcomes a few minor inconveniences in BNF
- Does not enhance the descriptive power of BNF
- Three extensions
24First extension
- With brackets [ ]: an optional part of the RHS
  <selection> → if ( <expression> ) then <statement> [ else <statement> ]
25Second extension
- With braces { }: the enclosed part can be repeated zero or more times
  <ident_list> → <identifier> { , <identifier> }
26Third extension
- With parentheses ( ): a choice among alternatives separated by |
  <term> → <term> ( * | / | % ) <factor>
27Example
- BNF:
  <expr> → <expr> + <term> | <expr> - <term> | <term>
  <term> → <term> * <factor> | <term> / <factor> | <factor>
  <factor> → <exp> ** <factor> | <exp>
  <exp> → ( <expr> ) | id
- EBNF:
  <expr> → <term> { ( + | - ) <term> }
  <term> → <factor> { ( * | / ) <factor> }
  <factor> → <exp> { ** <exp> }
  <exp> → ( <expr> ) | id
283.4 Attribute grammar
- Is an extension to a context-free grammar
- Is a device used to describe more of the structure of a programming language than can be described with a context-free grammar
- Allows certain language rules to be conveniently described, such as type compatibility
291. Static semantics
- Used to describe type constraints.
- Checking these specifications can be done at compile time.
- Some static semantic rules are difficult or impossible to describe with BNF.
- Attribute grammars are a formal approach to both describing and checking the correctness of the static semantics rules of a program.
30Example
- Problem: in Java, a floating-point value cannot be assigned to an integer type variable, but the opposite is legal.
- This restriction can be described in BNF, but it requires additional nonterminals and rules. The grammar would become too large to be useful, and so would the parser built from it.
31Example
- Problem: all variables must be declared before they are referenced.
- It can be proved that this rule cannot be described by using BNF.
322. Basic concepts
- Attribute grammars = context-free grammars plus the following additions:
- Attributes
- Attribute computation functions
- Predicate functions
33Basic concepts
- Attributes: associated with grammar symbols; they can be assigned values.
- Attribute computation functions: also called semantic functions; associated with grammar rules.
- Predicate functions: state the static semantic rules of the language; associated with grammar rules.
343. Attribute grammars defined
- 1. Associated with each grammar symbol X is a set of attributes A(X).
- A(X) consists of two disjoint sets, S(X) and I(X).
- S(X): synthesized attributes, used to pass semantic information up a parse tree.
- I(X): inherited attributes, used to pass semantic information down and across a tree.
35Attribute grammars defined
- 2. Associated with each grammar rule is a set of semantic functions and a possibly empty set of predicate functions over the attributes of the symbols in the grammar rule.
- For a rule X0 → X1 ... Xn:
  S(X0) = f(A(X1), ..., A(Xn))
  I(Xj) = f(A(X0), ..., A(Xn)), for 1 ≤ j ≤ n
36Attribute grammars defined
- 3. A predicate function has the form of a Boolean expression on the union of the attribute set {A(X0), ..., A(Xn)} and a set of literal attribute values.
374. Intrinsic attributes
- Intrinsic attributes are synthesized attributes
of leaf nodes whose values are determined outside
the parse tree.
385. Examples of Attribute Grammars
- We will show how attribute grammars can be used to describe static semantics.
- Example: in the Ada language, there is a rule that the name at the end of an Ada procedure must match the procedure's name.
39Ada example with Attribute grammars
- Syntax rule: <proc_def> → procedure <proc_name>[1] <proc_body> end <proc_name>[2];
- Predicate: <proc_name>[1].string == <proc_name>[2].string
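The predicate on the two procedure names can be checked with a few lines of Python. This is a rough sketch, not an Ada parser: the regular expression is a hypothetical, simplified stand-in for the syntax rule, and `check_proc_def` is an assumed helper name. The comparison ignores case because Ada identifiers are case-insensitive.

```python
import re

# Hypothetical, simplified pattern for: procedure NAME ... end NAME;
PROC_PATTERN = re.compile(
    r"procedure\s+(?P<name1>\w+)\b.*\bend\s+(?P<name2>\w+)\s*;",
    re.DOTALL | re.IGNORECASE,
)


def check_proc_def(source):
    """Return True when the syntax matches and the predicate holds."""
    match = PROC_PATTERN.search(source)
    if match is None:
        return False                      # not syntactically a <proc_def>
    # Predicate: <proc_name>[1].string == <proc_name>[2].string
    return match.group("name1").lower() == match.group("name2").lower()


print(check_proc_def("procedure Swap is begin null; end Swap;"))
print(check_proc_def("procedure Swap is begin null; end Swop;"))
```

Both inputs satisfy the context-free syntax rule; only the predicate separates the legal procedure from the illegal one.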
40Another example
- The only variable names are A, B, C
- The right side of the assignments can either be a variable or an expression in the form of a variable added to another variable
- The variables can be one of two types: int or real
- The type of the expression when the operand types are not the same is always real. When they are the same, the expression type is that of the operands
- The type of the left side of the assignment must match the type of the right side
- Assignment is valid only if the LHS and the RHS have the same type.
41The syntax
- <assign> → <var> = <expr>
- <expr> → <var> + <var> | <var>
- <var> → A | B | C
42Semantic rules
- actual_type: a synthesized attribute associated with the nonterminals <var> and <expr>.
- expected_type: an inherited attribute associated with the nonterminal <expr>.
43An attribute grammar for simple assignment statements
- 1. Syntax rule: <assign> → <var> = <expr>
  Semantic rule: <expr>.expected_type ← <var>.actual_type
- 2. Syntax rule: <expr> → <var>[2] + <var>[3]
  Semantic rule: <expr>.actual_type ←
      if (<var>[2].actual_type = int) and (<var>[3].actual_type = int)
      then int
      else real
  Predicate: <expr>.actual_type == <expr>.expected_type
- 3. Syntax rule: <expr> → <var>
  Semantic rule: <expr>.actual_type ← <var>.actual_type
  Predicate: <expr>.actual_type == <expr>.expected_type
- 4. Syntax rule: <var> → A | B | C
  Semantic rule: <var>.actual_type ← look-up(<var>.string)
44A parse tree for this example
- Parse tree for A = A + B, shown by indentation:
  <assign>
    <var>: A
    =
    <expr>
      <var>[2]: A
      +
      <var>[3]: B
456. Computing attribute values
- Decorated parse tree for A = A + B (e_t = expected_type, a_t = actual_type):
  <assign>
    <var> (a_t): A
    =
    <expr> (e_t, a_t)
      <var>[2] (a_t): A
      +
      <var>[3] (a_t): B
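The order in which the attribute values are computed can be traced in code. The sketch below assumes the assignment grammar `<assign> → <var> = <expr>`, `<expr> → <var> + <var> | <var>`; the symbol table contents (A is real, B and C are int) and the function names are hypothetical.

```python
# Hypothetical symbol table standing in for look-up(<var>.string)
SYMBOL_TABLE = {"A": "real", "B": "int", "C": "int"}


def look_up(name):
    return SYMBOL_TABLE[name]


def check_assignment(target, operands):
    """Decorate the tree for  target = operand [+ operand].

    Returns (expected_type, actual_type, predicate_holds).
    """
    # Rule 4: intrinsic attribute of each <var> leaf.
    target_type = look_up(target)
    operand_types = [look_up(name) for name in operands]

    # Rule 1: inherited attribute, passed down from <assign> to <expr>.
    expected_type = target_type

    # Rules 2 and 3: synthesized attribute, passed up from the <var> nodes.
    if len(operand_types) == 1:
        actual_type = operand_types[0]
    elif all(t == "int" for t in operand_types):
        actual_type = "int"
    else:
        actual_type = "real"

    # Predicate: <expr>.actual_type == <expr>.expected_type
    return expected_type, actual_type, actual_type == expected_type


print(check_assignment("A", ["A", "B"]))   # A = A + B
print(check_assignment("B", ["A", "B"]))   # B = A + B
```

With this table, `A = A + B` satisfies the predicate because both types are real, while `B = A + B` violates it because a real expression is assigned to an int variable.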
463.5 summary
- BNF and context-free grammars are equivalent metalanguages for describing the syntax of programming languages.
- Attribute grammars
- Static semantics
- Dynamic semantics: operational, axiomatic, and denotational.