Title: Mariano Ceccato
1. The TXL Programming Language (2)
- Mariano Ceccato
- ITC-Irst
- Istituto per la ricerca Scientifica e Tecnologica
- ceccato_at_itc.it
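2. The three phases of TXL
Input text --(Parse)--> Parse tree --(Transform)--> Transformed parse tree --(Unparse)--> Output text
Example (figure): the input text "blue fish" is parsed into the tree [words [word blue] [words [word fish] [empty]]]; the transformation produces the tree [words [word marlin] [empty]], which is unparsed into the output text "marlin".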
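As a hedged illustration of the three phases, the following untested sketch is a complete TXL program for the blue fish example. The non-terminal names words and word follow the parse-tree labels of the figure; the grammar layout and the single main function are assumptions, not the original program. TXL parses the input according to [program], applies main to the parse tree, and unparses the result.

```txl
% Grammar (drives the Parse phase): the input is a possibly empty sequence of words.
define program
    [words]
end define

define words
        [word] [words]
    |   [empty]
end define

define word
    [id]
end define

% Transform phase: main is applied to the parse tree of the whole input.
% The pattern and the replacement are themselves parsed as [program].
function main
    replace [program]
        'blue 'fish
    by
        'marlin
end function
```

For an input consisting of the two words blue fish, main is intended to replace the whole tree by the tree for marlin. Any other input should be left unchanged, because a function only matches its entire scope.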
3. Anatomy of a TXL program
- Base grammar: defines the lexical forms (tokens or terminals) and the syntactic forms (non-terminals).
- Grammar overrides (optional): redefine non-terminals of the base grammar.
- Transformation rules: the ruleset defines the set of transformation rules and functions.
4. Anatomy of a TXL program
Example: Expr grammar
- Base grammar
- Grammar overrides
- Transformation rules
Outline:
    include "Expr.Grammar"
    redefine expr
        exp ( [number] , [number] )
    include "Expr-exp.Grammar"
    rule main
    rule one
    rule two
5. Specifying Lexical Forms
- Lexical forms specify how the input is partitioned into tokens.
- Predefined defaults include identifiers [id] (e.g. ABC, rt789), integer and floating-point numbers [number] (e.g. 123, 123.23, 3e22), and strings [stringlit] (e.g. "hi there").
- The tokens statement gives regular expressions for each class of token in the input language.
Example:
    tokens
        hexnumber  "0[xX][\dABCDEFabcdef]+"
    end tokens
6. Specifying Lexical Forms (contd)
    tokens
        name  "regular expression"
    end tokens
Regular expressions:
- Any single character that is not a metacharacter (such as [ ] ( )) and is not preceded by \ or # simply represents itself.
- Single-character patterns, e.g. \d (digit), \a (alphabetic character).
- Regular expression operators: [PQR] (any one of), (PQR) (sequence of), P* (zero or more), P+ (one or more), P? (optional).
7. Specifying Lexical Forms (contd)
    keys
        procedure repeat program
    end keys

    compounds
        >=  <=
    end compounds

    comments
        /* */
        //
    end comments
- The keys statement specifies that certain identifiers are to be treated as unique special symbols (keywords).
- The compounds statement specifies character sequences to be treated as a single terminal.
- The comments statement specifies the commenting conventions of the input language. By default, comments are ignored by TXL.
8. Specifying Syntactic Forms
- The general form of a non-terminal definition is:
      define name
              alternative1
          |   alternative2
          ...
          |   alternativeN
      end define
- Each alternative is any sequence of terminals and non-terminals (the latter enclosed in square brackets).
- The special type [program] describes the structure of the entire input.
9. Specifying Syntactic Forms (contd)
- Extended BNF-like sequence notation:
      [repeat X]   sequence of zero or more (X*); [repeat X+] requires at least one
      [list X]     comma-separated list
      [opt X]      optional (zero or one)
- The following two definitions are equivalent:
      define statements
              [statement]
          |   [statement] [statements]
      end define

      define statements
          [repeat statement+]
      end define
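To make the [list X] and [opt X] notation concrete, here is a small untested sketch of a grammar for a parenthesised argument list. All non-terminal names (argumentList, argument, defaultValue) are hypothetical and chosen only for illustration; the identity main function is there because a TXL program needs a main rule or function.

```txl
% Hypothetical grammar: accepts inputs such as  (a, b = 2, c)  or  ()
define program
    [argumentList]
end define

define argumentList
    '( [list argument] ')        % zero or more arguments separated by commas
end define

define argument
    [id] [opt defaultValue]      % the default value may be absent
end define

define defaultValue
    '= [number]
end define

% Identity transformation: the input is parsed and unparsed without changes.
function main
    replace [program]
        P [program]
    by
        P
end function
```

The quotes before (, ) and = are not strictly needed here; they only mark these symbols explicitly as literal terminals.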
10. Specifying Syntactic Forms (contd)
    keys
        procedure begin 'end int bool
    end keys

    define proc
        procedure [id] [formalParameters]
        begin
            [body]
        'end
    end define

    define formalParameters
            ( [list formalParameter] )
        |   [empty]
    end define

    define formalParameter
        [id] : [type]
    end define

    define type
        int | bool
    end define
11. Ambiguity
- TXL resolves ambiguities by choosing the first
alternative of each non-terminal that can match
the input.
Example: T-language
    define T
            [number]
        |   ( [T] )
        |   [T] [T]
    end define
(Figure: parse tree built from T nodes over the input 2 2 2.)
12. Transformation rules
- TXL has two kinds of transformation rules, rules and functions, which are distinguished by whether they transform only one occurrence of their pattern (functions) or many (rules).
- Rules search their scope for the first instance of their target type matching their pattern, transform it, and then reapply to the entire scope until no more matches are found.
- Functions do not search, but attempt to match only their entire scope to their pattern, transforming it if it matches.
13. Rules and functions
    function 2To42
        replace [number]
            2
        by
            42
    end function

    2               ---->  42
    3 2 6 2 78 4 2         (unchanged: the function does not search)

Rules search for the pattern!
    rule 2To42
        replace [number]
            2
        by
            42
    end rule

    2               ---->  42
    3 2 6 2 78 4 2  ---->  3 42 6 42 78 4 42
14. Searching functions
    function 2To42
        replace * [number]
            2
        by
            42
    end function

    2               ---->  42
    3 2 6 2 78 4 2  ---->  3 42 6 2 78 4 2
Note: only the first match is changed.
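The transformation above is shown without the grammar and main function needed to run it. The following untested sketch wraps a searching function in a complete program; the grammar and the names main, Numbers and firstTwoTo42 are assumptions made for illustration.

```txl
% Assumed input grammar: a sequence of numbers, e.g.  3 2 6 2 78 4 2
define program
    [repeat number]
end define

% main matches the whole input and applies the transformation to it.
function main
    replace [program]
        Numbers [repeat number]
    by
        Numbers [firstTwoTo42]
end function

% Searching function: 'replace *' searches the scope for the first
% embedded [number] equal to 2 and rewrites only that occurrence.
function firstTwoTo42
    replace * [number]
        2
    by
        42
end function
```

Declaring firstTwoTo42 with rule ... end rule and dropping the * would instead rewrite every 2 in the sequence. Keeping function without the * would leave a multi-number input unchanged, since the whole scope is not a single number 2.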
15. Syntax of rules and functions
Simplified and given in TXL:
    rule ruleid [repeat formalArgument]
        [repeat construct_deconstruct_where]
        replace [type]
            pattern
        [repeat construct_deconstruct_where]
        by
            replacement
    end rule
The same holds for functions, with function ... end function in place of rule ... end rule.
N.B. If the where-condition is false, the rule cannot be applied and the result is the input AST.
16. Built-in functions
    rule resolveAdd
        replace [expr]
            N1 [number] + N2 [number]
        by
            N1 [add N2]
    end rule

    function add
        ...
    end function

and

    rule resolveAdd
        replace [expr]
            N1 [number] + N2 [number]
        by
            N1 [+ N2]
    end rule

are equivalent: the built-in function [+] takes the place of the user-defined add.
17. Built-in functions (contd)
    rule sort
        replace [repeat number]
            N1 [number] N2 [number] Rest [repeat number]
        where
            N1 [> N2]
        by
            N2 N1 Rest
    end rule

    22 4 2 15 1  ---->  ...  ---->  1 2 4 15 22
18. Recursive functions
    function fact
        replace [number]
            n [number]
        construct nMinusOne [number]
            n [- 1]
        where
            n [> 1]
        construct factMinusOne [number]
            nMinusOne [fact]
        by
            n [* factMinusOne]
    end function
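A worked trace may help to see how the recursion in fact terminates. It assumes that the replacement multiplies n by factMinusOne (the operator is not legible in the slide text) and uses the N.B. of slide 15: when the where-condition fails, the function is not applied and its scope is returned unchanged. For n = 1 the condition n > 1 is false, so 1 [fact] is simply 1.

```latex
\[
\begin{aligned}
4\,[\mathit{fact}] &= 4 \times \bigl(3\,[\mathit{fact}]\bigr) \\
                   &= 4 \times 3 \times \bigl(2\,[\mathit{fact}]\bigr) \\
                   &= 4 \times 3 \times 2 \times \bigl(1\,[\mathit{fact}]\bigr) \\
                   &= 4 \times 3 \times 2 \times 1 = 24
\end{aligned}
\]
```

Under the same assumptions, 0 [fact] would return 0 rather than 1, because the function never applies to values that are not greater than 1.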
19. Using rule parameters
    rule resolveConstants
        replace [repeat statement]
            const C [id] = V [expr]
            RestOfScope [repeat statement]
        by
            RestOfScope [replaceByValue C V]
    end rule

    rule replaceByValue ConstName [id] Value [expr]
        replace [primary]
            ConstName
        by
            (Value)
    end rule

Example:
    const Pi = 3.14
    Area := r*r*Pi
  ---->
    Area := r*r*3.14
20. Exercises
- Implement the T-language (page 11).
- Implement Calculator.txl.
- Add the exponential, i.e. Exp(x, n), to the expr grammar. Compute the exponential:
  - syntactically, e.g. Exp(2, 3) ----> 2*2*2
  - semantically, by means of a recursive function that replaces Exp(x, n) with the correct value.
21. Homework
- Implement a simple version of a commands language, where commands can be:
  - assignments, i.e. id := expr
  - declarations, i.e. const id = number
- Implement some transformation rules (page 19) that replace identifiers in the assignments with their values.
Example:
    const Pi = 3.14
    Area := r*r*Pi
  ---->
    Area := r*r*3.14