Title: EBNF: A Notation for Describing Syntax
1 EBNF: A Notation for Describing Syntax
- Languages and Syntax
- EBNF Descriptions and Rules
- More Examples of EBNF
- Syntax and Semantics
- EBNF Description of Sets
- Advanced EBNF (recursion)
2 Quote of the Day
- "When teaching a rapidly changing technology,
perspective is more important than content."
3 Why Study EBNF
- EBNF is a notation for formally describing
syntax: how to write symbols in a language. We
will use EBNF to describe the syntax of Java. But
there is a more compelling reason to begin our
study of programming with EBNF: it is a microcosm
of programming. There is a strong similarity
between the control forms of EBNF and the control
structures of Java: sequence, decision,
repetition, recursion, and the ability to name
descriptions. There is also a strong similarity
between the process of writing EBNF descriptions
and writing Java programs. Finally, studying EBNF
introduces a level of formality that will
continue throughout the semester.
4 Languages and Syntax
- EBNF: Extended Backus-Naur Form
- John Backus (IBM) invented a notation called BNF
- He used it to describe FORTRAN's syntax (1956)
- Peter Naur popularized BNF
- He used it to describe ALGOL's syntax (1958)
- Niklaus Wirth used an extended form of BNF
(called EBNF) to describe the syntax of his
Pascal programming language (1976)
- Noam Chomsky (MIT linguist and philosopher)
- Invented a hierarchy of notations for natural
languages
- 4 levels, 0-3, with 0 being the most powerful
- BNF is at level 2; programming languages are at
level 0
- Formal Languages and Computability is the study
of different families of notations and their power
5 EBNF Descriptions and Rules
- Each description is a list of rules
- Rule form: LHS ⇐ RHS (read ⇐ as "is defined as")
- Rule names (LHS) are italicized, hyphenated words
- Control forms in RHS
- Sequence: items appear left to right; order is
important
- Choice: alternatives separated by | (stroke);
exactly one item is chosen from the alternatives
- Option: optional item enclosed between [ and ]; it
can be included or discarded
- Repetition: repeatable item enclosed between { and };
it can be repeated 0 or more times
6 An EBNF Description of Integers
- A symbol (sequence of characters) is classified
legal by an EBNF rule if we can process all the
characters in the symbol when we reach the end of
the right hand side of the EBNF rule.
- digit ⇐ 0|1|2|3|4|5|6|7|8|9
- integer ⇐ [+|-]digit{digit}
- digit is defined as any of the alternatives 0
through 9
- integer is defined as a sequence of three items:
(1) an optional sign (if it is included, it must
be the alternative + or -), followed by (2) any
digit, followed by (3) a repetition of zero or
more digits.
- The integer RHS combines and illustrates all EBNF
control forms: sequence, option, alternative,
repetition.
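The rule integer ⇐ [+|-]digit{digit} (read ⇐ as "is defined as") can be sketched as a small hand-written recognizer. This is only an illustration in Python, not part of the original slides; the name is_integer is our own. Each step in the code corresponds to one control form in the rule.

```python
DIGITS = "0123456789"  # digit <= 0|1|2|3|4|5|6|7|8|9


def is_integer(symbol: str) -> bool:
    """Return True if symbol is legal under integer <= [+|-]digit{digit}."""
    i = 0
    # Option with a choice inside: include a sign (+ or -) or discard it.
    if i < len(symbol) and symbol[i] in "+-":
        i += 1
    # Sequence: exactly one digit must come next.
    if i >= len(symbol) or symbol[i] not in DIGITS:
        return False
    i += 1
    # Repetition: zero or more further digits.
    while i < len(symbol) and symbol[i] in DIGITS:
        i += 1
    # Legal only if all characters were processed at the end of the rule.
    return i == len(symbol)
```

A symbol is accepted only when the end of the symbol and the end of the rule are reached together, which matches the definition of "legal" given on this slide.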
7 Proofs in English
- Is the symbol 7 an integer? Yes, the proof:
- In the integer EBNF rule, start with the optional
sign: discard the option. Next in the sequence is
a digit: choose the 7 alternative. Next in the
sequence is a repetition: choose 0 repetitions.
End of symbol and integer reached.
- Is the symbol +127 an integer? Yes, the proof:
- In the integer EBNF rule, start with the optional
sign: include the option and choose the +
alternative. Next in the sequence is a digit:
choose the 1 alternative. Next in the sequence is
a repetition: choose 2 repetitions; choose the 2
alternative for the first and the 7
alternative for the second. End of symbol and
integer reached.
- Are the symbols 1,024 A5 15- 12 integers?
8 Tabular Proof
- Tabular proof replacement rules
- (1) Replace a name (LHS) by its definition (RHS)
- (2) Choose an alternative
- (3) Include or discard an option
- (4) Choose the number of repetitions
Status               Reason
integer              Given
[+|-]digit{digit}    Replace LHS by RHS (1)
[+]digit{digit}      Choose the + alternative (2)
+digit{digit}        Include option (3)
+1{digit}            Replace digit by 1 alternative (1&2)
+1digit digit        Choose two repetitions (4)
+12digit             Replace digit by 2 alternative (1&2)
+127                 Replace digit by 7 alternative (1&2)
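9 Graphical Proof
[Figure: derivation tree for +127. The root integer
branches to the sign +, a digit replaced by 1, and
a repetition of two digits replaced by 2 and 7.]
A graphical proof replaces multiple (equivalent)
tabular proofs, since the order of rule
application (which is unimportant) is often
absent in graphical proofs.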
10 Identical vs Equivalent Descriptions
- sign ⇐ +|-
- digit ⇐ 0|1|2|3|4|5|6|7|8|9
- integer ⇐ [sign]digit{digit}
- x ⇐ +|-
- y ⇐ 0|1|2|3|4|5|6|7|8|9
- z ⇐ [x]y{y}
- These two descriptions are not identical, but they
are equivalent. Although they use different EBNF
rule names (consistently), asking whether a
symbol is an integer is the same as asking
whether the symbol is a z.
11 Two Problematical Descriptions
- A simplified but equivalent definition of
integer?
- sign ⇐ +|-
- digit ⇐ 0|1|2|3|4|5|6|7|8|9
- integer ⇐ [sign]{digit}
- A good definition of integers with commas
(1,024)?
- sign ⇐ +|-
- comma-digit ⇐ 0|1|2|3|4|5|6|7|8|9|,
- comma-integer ⇐ [sign]comma-digit{comma-digit}
- Both definitions classify non-obvious symbols
as legal integer or comma-integer. Find such
symbols.
12 Syntax and Semantics
- Syntax: Form
- Semantics: Meaning
- Key Questions
- Can two different symbols have the same meaning?
- Can a symbol have many meanings (depending on
context)?
- Do the following symbols have the same meaning?
- 1 and +1, 000193 and 193
- 9.000 and 9.0
- Rich and rich
- EBNF specifies syntax, not semantics
- Semantics is supplied informally: English,
examples, ...
- Formal semantics is a research area in CS, AI,
Linguistics, ...
13 Structured Integers
- Allow non-adjacent embedded underscores to add a
special structure to a number
- 2_10_54
- 1_800_555_1212
- 1_000_000 (compared to 1000000: figure each
value fast)
- Define structured-integer
- digit ⇐ 0|1|2|3|4|5|6|7|8|9
- structured-integer ⇐ [sign]digit{[_]digit}
- Semantically, the underscore is ignored
- 1_2 has the same meaning as 12
- How can we fix the date problem: 12_5_1987 and
1_25_1987?
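The split between syntax (which symbols are legal) and semantics (what a legal symbol means) can be sketched in a few lines of Python. The sketch assumes the rule structured-integer ⇐ [sign]digit{[_]digit} with sign ⇐ +|-, and expresses it as a regular expression; the names is_structured_integer and meaning are ours, not from the slides.

```python
import re

# structured-integer <= [sign]digit{[_]digit}, written as a regular expression:
# an optional sign, one digit, then zero or more (optional underscore + digit).
STRUCTURED_INTEGER = re.compile(r"[+-]?[0-9](?:_?[0-9])*")


def is_structured_integer(symbol: str) -> bool:
    """Syntax: is the whole symbol legal under the rule?"""
    return STRUCTURED_INTEGER.fullmatch(symbol) is not None


def meaning(symbol: str) -> int:
    """Semantics: underscores are ignored, so 1_2 means the same as 12."""
    if not is_structured_integer(symbol):
        raise ValueError(f"not a structured-integer: {symbol!r}")
    return int(symbol.replace("_", ""))
```

Under this semantics meaning("12_5_1987") and meaning("1_25_1987") are the same number, which is exactly the date problem raised above: the structure that distinguishes the two symbols is lost once the underscores are ignored.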
14 Syntax Charts
[Figure: syntax charts for the four control forms:
Sequence (A B C D), Choice (A|B|C|D), Option ([A]),
and Repetition ({A}).]
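15 Syntax Charts for integer and digit
[Figure: syntax chart for digit (a choice among 0
through 9) and syntax chart for integer (an optional
+ or - sign, then a digit, then repeated digits).]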
16 A Syntax Chart with no other names
[Figure: syntax chart for integer with the digit
choices 0 through 9 drawn inline instead of
referring to the name digit.]
Which syntax chart for integer is simpler? The
previous one (because it is smaller) or this one
(because it doesn't need another name for
digit)?
17 Interesting Rules & Their Charts
18 Description of Sets
- Set syntax
- Sets start with ( and end with )
- Sets contain 0 or more integers
- A comma appears between every pair of integers
- integer-list ⇐ integer{,integer}
- integer-set ⇐ ([integer-list])
- Set semantics
- Order is unimportant
- (1,3,5) is equivalent to (5,1,3) and any other
permutation
- Duplicate elements are unimportant
- (1,3,5,1,3,3,5) is equivalent to (1,3,5)
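As a rough illustration of how the set syntax and the set semantics fit together, the sketch below checks a symbol against integer-set ⇐ ([integer-list]) with integer-list ⇐ integer{,integer}, then maps it to a Python frozenset, where order and duplicates do not matter. The function name set_meaning and the use of a regular expression are our own choices, not part of the slides.

```python
import re

INTEGER = r"[+-]?[0-9]+"  # integer <= [+|-]digit{digit}
# integer-list <= integer{,integer}
# integer-set  <= ([integer-list])
INTEGER_SET = re.compile(rf"\((?:{INTEGER}(?:,{INTEGER})*)?\)")


def set_meaning(symbol: str) -> frozenset:
    """Check the syntax, then return the set of integers the symbol denotes."""
    if INTEGER_SET.fullmatch(symbol) is None:
        raise ValueError(f"not an integer-set: {symbol!r}")
    body = symbol[1:-1]  # drop the surrounding parentheses
    if not body:  # the option was discarded: the empty set ()
        return frozenset()
    return frozenset(int(item) for item in body.split(","))
```

Two legal symbols are equivalent when set_meaning returns equal values, so set_meaning("(1,3,5)") == set_meaning("(5,1,3)") holds even though the symbols differ.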
19 Proof: (5,-2,11) is an integer-set
Status                 Reason
integer-set            Given
([integer-list])       Replace integer-set by its RHS
(integer-list)         Include option
(integer{,integer})    Replace integer-list by its RHS
(5{,integer})          Lemma: 5 is an integer
(5,integer,integer)    Choose two repetitions
(5,-2,integer)         Lemma: -2 is an integer
(5,-2,11)              Lemma: 11 is an integer
20 Description of Sets with Ranges
- Range syntax
- A range is a single integer or a pair separated
by ..
- integer-range ⇐ integer[..integer]
- integer-list ⇐ integer-range{,integer-range}
- integer-set ⇐ ([integer-list])
- Range semantics: X..Y
- X≤Y: all integers from X up to Y (inclusive)
- 1..5 is equivalent to 1,2,3,4,5; 5..5 is
equivalent just to 5
- X>Y: a null range; it contains no values
- (1..4,10,5..4,11..13) is equivalent to
(1,2,3,4,10,11,12,13)
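The range semantics can be sketched the same way. The code below assumes the rules integer-range ⇐ integer[..integer] and integer-list ⇐ integer-range{,integer-range}, and expands each range X..Y into the integers from X up to Y inclusive, with X>Y giving a null range. The name range_set_meaning is hypothetical and the regular expression is just one convenient way to check the syntax.

```python
import re

INTEGER = r"[+-]?[0-9]+"                     # integer <= [+|-]digit{digit}
RANGE = rf"{INTEGER}(?:\.\.{INTEGER})?"      # integer-range <= integer[..integer]
# integer-list <= integer-range{,integer-range}; integer-set <= ([integer-list])
RANGE_SET = re.compile(rf"\((?:{RANGE}(?:,{RANGE})*)?\)")


def range_set_meaning(symbol: str) -> frozenset:
    """Check the syntax, then expand every range into the values it denotes."""
    if RANGE_SET.fullmatch(symbol) is None:
        raise ValueError(f"not an integer-set: {symbol!r}")
    values = set()
    body = symbol[1:-1]
    if body:
        for item in body.split(","):
            low, _, high = item.partition("..")
            x = int(low)
            y = int(high) if high else x  # a single integer X acts like X..X
            # range(x, y + 1) is empty when x > y, which gives the null range.
            values.update(range(x, y + 1))
    return frozenset(values)
```

With this sketch, range_set_meaning("(1..4,10,5..4,11..13)") and range_set_meaning("(1,2,3,4,10,11,12,13)") return the same set, matching the equivalence stated on the slide.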
21 Recursive Descriptions
- A directly recursive EBNF rule has its LHS in its
RHS
- r1 ⇐ | Ar1
- We read this as "r1 is defined as the choice of
nothing or an A followed by an r1." The symbols
recognized as an r1 are of the form A^n, n ≥ 0.
- Proof that AAA is an r1
r1       Given
Ar1      Replace r1 by the second alternative in its RHS
AAr1     Replace r1 by the second alternative in its RHS
AAAr1    Replace r1 by the second alternative in its RHS
AAA      Replace r1 by the first (empty) alternative in its RHS
- This rule is equivalent to r1 ⇐ {A}
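A directly recursive rule translates naturally into a recursive function. The sketch below, with function names of our own choosing, writes the recursive rule r1 ⇐ | Ar1 and the repetition rule r1 ⇐ {A} as two separate recognizers so that their equivalence can be checked on examples.

```python
def is_r1_recursive(symbol: str) -> bool:
    """r1 <= | A r1 : nothing, or an A followed by an r1."""
    if symbol == "":
        return True  # first (empty) alternative
    # Second alternative: an A followed by an r1 (the recursive use of the rule).
    return symbol[0] == "A" and is_r1_recursive(symbol[1:])


def is_r1_repetition(symbol: str) -> bool:
    """r1 <= {A} : zero or more repetitions of A."""
    i = 0
    while i < len(symbol) and symbol[i] == "A":
        i += 1
    return i == len(symbol)
```

Both functions accept exactly the symbols of the form A^n, n ≥ 0, so they agree on every input; the recursive call plays the role that the loop plays in the repetition version.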
22 The Power of Recursion
- To recognize symbols of the form A^n B^n, n ≥ 0,
we cannot write r1 ⇐ {A}{B}, because nothing
constrains us from choosing different numbers of
repetitions of A and B: AAB
- The recursive rule r1 ⇐ | Ar1B works, because
each choice of the second alternative uses
exactly one A and one B.
- Proof that AAABBB is an r1
r1          Given
Ar1B        Replace r1 by the second alternative in its RHS
AAr1BB      Replace r1 by the second alternative in its RHS
AAAr1BBB    Replace r1 by the second alternative in its RHS
AAABBB      Replace r1 by the first (empty) alternative in its RHS
- Symbols of the form A^n B^n, n ≥ 0
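The recursive rule r1 ⇐ | Ar1B can be sketched as a recursive function that peels one A from the front and one B from the back on each use of the second alternative. The name is_anbn is ours, and this is only one simple way to code the rule.

```python
def is_anbn(symbol: str) -> bool:
    """r1 <= | A r1 B : nothing, or an A, then an r1, then a B."""
    if symbol == "":
        return True  # first (empty) alternative
    # Second alternative: uses exactly one A and one B around a smaller r1.
    return (
        len(symbol) >= 2
        and symbol[0] == "A"
        and symbol[-1] == "B"
        and is_anbn(symbol[1:-1])
    )
```

Because every recursive step consumes one A together with one B, the counts cannot drift apart, which is what the repetition-only rule {A}{B} fails to guarantee for a symbol such as AAB.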
23 Problems
- Read the EBNF Handout (all but Section 2.7)
- Study and understand the Review Questions
- 2 (page 10), 23 (page 12), 1 (page 16), 2
(page 18)
- Be prepared to discuss in class solutions to the
following Exercises (starting on page 23)
- 1, 2, 4, and especially 8
- See next slide for more problems
24 Problems (continued)
- Translate the following RHS of an EBNF rule into
its equivalent syntax chart. Then, classify each
of the examples below as legal or illegal
according to this rule (or its equivalent chart).
- ABAZ
- AZ BZ ABZ ABAZ
- ABABZ ABA AAAZ ABABBZ
- ABCZ
- BZ ABC ABBBZ ACCZ
- ABCZ ABCBCZ ABBCBBZ ABCZBCZ
- ABCZ
- AB ABC ABBBZ BBZ
- ABBCCZ ACCBBZ ACBBCZ ABCZBCZ