Perl 6 Update - PGE and Pugs

About This Presentation
Title:

Perl 6 Update PGE and Pugs

Description:

Perl 6 Update - PGE and Pugs. Dr. Patrick R. Michaud. April 26, 2005. Rules and Grammars ... Pugs. Perl 6 compiler written in Haskell. Started by Autrijus Tang ...

1
Perl 6 Update - PGE and Pugs
  • Dr. Patrick R. Michaud
  • April 26, 2005

2
Rules and Grammars
  • Perl 6 completely redesigns the regular
    expression syntax
  • Regular expressions are now "rules"
  • Rules can call/embed other rules
  • Groups of rules can be combined into Grammars

3
Current events in Perl 6
  • Parrot 0.1.2 released
  • The Perl Foundation receives $25,000 for
    completion of Parrot milestones
  • New Parrot pumpking - Chip Salzenberg
  • New version of Parrot Grammar Engine (PGE / Perl
    6 rules) to be released this week
  • Pugs - Autrijus Tang
  • Perl 6 test suite

4
Pugs
  • Perl 6 compiler written in Haskell
  • Started by Autrijus Tang
  • Compiles directly to Haskell or to Parrot AST
  • Being used to develop Perl 6 tests and experiment
    with Perl 6 design
  • Available at http://pugscode.org
  • Discussion on perl6-compiler@perl.org mailing list

5
Perl 6 rules / Parrot Grammar Engine
  • The heart of the Perl 6 compiler is the
    Perl/Parrot Grammar Engine (PGE)
  • Implements the Perl 6 rules syntax, compiles to
    Parrot code
  • Perl 6 rules compiler currently written in C
  • Bootstrap to Perl 6

6
Steps to Perl 6 compiler
  • Finish PGE bootstrap in C
  • Parse p6 "rule" statements and grammars
  • Use p6 rules to define the Perl 6 grammar
  • P6 grammar can be used to generate Parrot
    abstract syntax trees from Perl 6 programs
  • Compile, (optimize), execute the abstract syntax
    tree to get working Perl 6 program
  • Use Perl 6 to rewrite the grammar engine in Perl
    6 (faster)

7
Current state of PGE
  • Handles concatenation, alternation, quantifiers,
    captures, subpatterns, subrules
  • Capture semantics redefined in Dec 2004, still
    not final
  • To be added next:
  • Character classes (note Unicode)
  • Patterns containing scalars, arrays, hashes

8
P6 rule syntax
  • Changes from Perl 5:
  • No more trailing /e, /x, /s options
  • [...] denotes non-capturing groups
  • ^ and $ are beginning/end of string
  • ^^ and $$ are beginning/end of line
  • . matches any character, including newline
  • \n and \N match newline/non-newline
  • # marks a comment (to end of line)
  • Quantifiers are *, +, ?, and **{m..n}
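
For readers who think in Perl 5 style regexes, the sketch below shows roughly which Python `re` flags reproduce some of the new defaults listed above. It is an analogy only: Python's `re` stands in for the older syntax, the sample text and variable names are made up for illustration, and none of this is PGE code.

```python
import re

text = "one\ntwo\nthree"

# In Perl 6 rules "." matches any character, newline included.
# Python's re needs the DOTALL flag to get the same behaviour.
dot_default = re.search(r"one.two", text)          # "." stops at the newline
dot_any = re.search(r"one.two", text, re.DOTALL)   # "." crosses the newline

# Line-level anchors: in Python these are ^ and $ under MULTILINE.
lines = re.findall(r"^\w+$", text, re.MULTILINE)

# Rules always ignore pattern whitespace and allow comments (no /x needed);
# re.VERBOSE is the closest Python equivalent.
up_to_three_digits = re.compile(r"""
    \d{1,3}   # a counted range quantifier, the m..n form
""", re.VERBOSE)
```

The point of the comparison is that behaviours Perl 5 and Python switch on with flags are simply the default in Perl 6 rules.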

9
Character classes
  • [aeiou] changed to <[aeiou]>
  • [0-9] now <[0..9]>
  • Properties defined as <prop>
  • Combine classes using +/- syntax
  • <<alpha>-[aeiou]>

10
Subrules
  • Patterns are now called "rules"
  • Analogous to subroutines and closures
  • Like {...}, /.../ compiles into a "rule"
    subroutine
  • P6 rule statement allows named rules:
  • rule ident / [<alpha>|_] \w* /
  • Named rules can be easily used in other rules:
  • m / <ident> \ (.*) /
  • rule expr / /

11
Interpolation
  • Variables no longer interpolate directly, thus
  • / $var /
  • matches the contents of $var literally, even if
    it contains rule metacharacters. (No \Q and \E)
  • To treat $var as a rule, use
  • / <$var> /
  • Interpolated arrays match as an alternation
  • / @cmds /
  • / [ @cmds[0] | @cmds[1] | @cmds[2] | ... ] /
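
As a rough comparison, the two interpolation behaviours on this slide (a scalar matched literally without \Q and \E, and an array matched as an alternation) can be imitated in Python, where the escaping has to be done by hand. The helper names `literal` and `alternation` and the sample data are hypothetical, chosen only for this sketch.

```python
import re

def literal(var):
    """Escape var so its contents match literally, metacharacters included."""
    return re.escape(var)

def alternation(items):
    """Build a non-capturing alternation of literally matched items."""
    return "(?:" + "|".join(re.escape(item) for item in items) + ")"

price = "$5.00 (USD)"
cmds = ["get", "put", "list"]

# Escaped: the "$", "." and parentheses are treated as plain characters.
literal_match = re.search(literal(price), "total: $5.00 (USD)")

# Unescaped: the same string is read as a pattern and no longer matches.
raw_match = re.search(price, "total: $5.00 (USD)")

cmd_match = re.fullmatch(alternation(cmds), "put")
```

In Perl 6 rules the escaped behaviour is the default, which is why the slide notes that \Q and \E are no longer needed.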

12
Interpolation, cont'd
  • Hashes match the keys of the hash, and the value
    of the hash is either
  • Executed if it is a closure
  • Treated as a subrule if it's a string or rule
    object
  • Succeeds if value is 1
  • Fails for any other value
  • Useful for parsed languages
  • rule expr / <term> [ %infixop <term> ]? /

13
  • Angle brackets <...> introduce various forms of metasyntax
  • A leading alphabetic character indicates a
    subrule or grammatical assertion
  • A leading ! negates the match

14
  • Leading ' matches a literal string
  • Leading " matches an interpolated string
  • Leading '[' or '-' are character classes
  • / /

15
  • Leading '(' indicates code assertion
  • / (\d**{1..3}) <( $1 < 256 )> /
  • (fail if $1 is not less than 256)
  • A $, @, or % indicates a variable subrule, where
    each value (or key) is a subrule to be matched
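
The code assertion on this slide (match one to three digits, then fail unless the number is less than 256) has no direct counterpart in Python's `re`, so the sketch below imitates it by hand. The function name `match_small_number` is hypothetical, and the loop over lengths is an assumption about how a failed assertion would make the greedy quantifier back off to a shorter match.

```python
import re

def match_small_number(text, pos=0):
    """Return 1 to 3 digits starting at pos whose value is below 256, else None."""
    # Try the longest digit run first, then shorter ones, imitating the
    # backtracking a failed assertion would trigger in a greedy quantifier.
    for length in (3, 2, 1):
        m = re.compile(r"\d{%d}" % length).match(text, pos)
        if m and int(m.group(0)) < 256:
            return m.group(0)
    return None
```

Under this reading, `match_small_number("300")` returns `"30"` rather than failing outright, because the shorter match satisfies the check. To reject such input entirely, the pattern would also need to forbid a following digit.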

16
A cool and somewhat scary example
  • %cmd{'\d+'} = { say "You entered a number" };
  • %cmd{'hello'} = { say "world" };
  • %cmd{'print \s+ (.*)'} = { say $1 };
  • %cmd{'exit'} = { exit() };
  • while =$IN {
  • /<%cmd>/ || say "Unrecognized command";
  • }
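
The same idea, a table whose keys are patterns and whose values are actions, can be sketched in Python. This is only an approximation of the slide's example: `make_dispatcher` is a hypothetical helper, the patterns are tried in insertion order against the whole line (an assumption, not a statement about how PGE orders hash keys), the actions return strings instead of printing, and the exit command is left out.

```python
import re

def make_dispatcher(commands):
    """Turn a {pattern: action} table into a function that handles one line."""
    compiled = [(re.compile(pattern), action) for pattern, action in commands.items()]

    def dispatch(line):
        for rule, action in compiled:
            m = rule.fullmatch(line.strip())
            if m:
                return action(m)   # the action receives the match object
        return "Unrecognized command"

    return dispatch

commands = {
    r"\d+": lambda m: "You entered a number",
    r"hello": lambda m: "world",
    r"print\s+(.*)": lambda m: m.group(1),
}
dispatch = make_dispatcher(commands)
```

Calling `dispatch("print hi there")` hands the captured text to the action, much as the slide's closure receives its first capture.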

17
Backtracking control
  • Single colons (:) skip previous atom
  • m/ \( <expr> [ , <expr> ]* : \) /
  • (if we don't find closing paren, no point in
    trying to match fewer <expr>s)
  • Two colons (::) break an alternation
  • m:w/ [ if :: <expr> <block>
  •      | for :: <list> <block>
  •      | loop :: <loop_controls>? <block> ] /
  • (once we've found "if", "for", or "loop", no
    point in trying the other branches of the
    alternation)

18
Backtracking control
  • Three colons (:::) fail the current rule
  • The <commit> assertion fails the entire match
    (including any rules that called the current
    rule)
  • The <cut> assertion matches successfully, removes
    the matched portion of the string up to the
    <cut>, and if backtracked over fails the match
    entirely
  • Useful for throwing away successfully processed
    input when matching from an input stream
  • Like, say, when writing a compiler :-)

19
Backslash
  • \L, \U, \Q, \E, \A, \z gone from rules
  • \n and \N match newline/not newline
  • \s matches any Unicode space
  • Backreferences are gone; use $1, $2, $3
    (non-interpolated)
  • Perl 6 allows defining custom backslash sequences
    for use in rules

20
Closures
  • Anything in curlies is executed as a Perl 6
    closure
  • / (\w+) { say "Got $1" } /

21
Capture semantics
  • Captures are different in Perl 6
  • The result of a match is a "match object"
  • If a match succeeds, the match object has:
  • Boolean value: true
  • Numeric value: 1 (except for global matches)
  • String value: the matched substring
  • Array component is matched subpatterns
  • Hash component is matched subrules
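
Python's `re.Match` offers a loose analogy to the match object described here: one result that can be read as a boolean, a string, a sequence of positional captures, or a mapping of named captures. The pattern and input below are invented for illustration, and Python has no counterpart to the numeric value.

```python
import re

m = re.search(r"(?P<ident>[A-Za-z_]\w*)\s*=\s*(\d+)", "x = 42;")

as_bool = bool(m)          # truthy when the match succeeded
as_string = m.group(0)     # the matched substring
positional = m.groups()    # captures in order (Python includes named groups here)
named = m.groupdict()      # captures keyed by name
```

One difference worth noting: in Python the captures are plain strings, whereas the slides describe each capture as a match object in its own right.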

22
Subpattern captures
  • Part of a rule in parentheses is a subpattern
  • Each subpattern produces its own match object
  • /Scooby (dooby) (doo)!/
  •         $1      $2
  • Quantified subpatterns produce arrays of match
    objects
  • /Scooby (\w+ \s+)* (doo)!/
  •         $1         $2
  • $1 is a (possibly empty) array of matches
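
For contrast, a Perl 5 style engine such as Python's `re` keeps only the last repetition of a quantified group, so recovering every repetition takes extra work. The sketch below shows both behaviours; the input line and variable names are made up for illustration.

```python
import re

line = "Scooby dooby dooby doo!"

# A quantified capturing group keeps only its final repetition.
last_only = re.match(r"Scooby (\w+ )*(doo)!", line)

# Capturing the whole repeated span and splitting it recovers each
# repetition, which approximates the array of matches described above.
spanned = re.match(r"Scooby ((?:\w+ )*)(doo)!", line)
repetitions = spanned.group(1).split()

# With zero repetitions the list is simply empty.
no_repeats = re.match(r"Scooby ((?:\w+ )*)(doo)!", "Scooby doo!").group(1).split()
```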

23
Non-capturing groups
  • Brackets do not capture, thus they don't result
    in a match object
  • /Scooby [ (\w+ \s+) (doo) ] !/
  •           $1        $2
  • Quantified brackets replace nested subpatterns
    with the last component matched
  • /Scooby [ (\w+ \s+) (doo) ]* !/
  •           $1        $2

24
Nested capturing subpatterns
  • Each capturing subpattern introduces a new
    lexical scope, with nested captures inside the
    new match object
  • /Scooby ( (\w+ \s+) (doo) ) !/
  •           $1[0]     $1[1]
  • ---------

25
Alternations
  • Alternations introduce a new lexical scope, thus
    subpatterns restart counting at zero for each
    alternative branch (unlike Perl 5)
  •           $1      $2
  • m/ Scooby (dooby) (doo)!
  •  | Yabba  (dabba) (doo) /
  •           $1      $2
  • This avoids lots of empty subpatterns when an
    alternation doesn't match.
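
The "empty subpatterns" problem is easy to see in an engine that numbers groups across the whole pattern, as Perl 5 and Python's `re` do. The sketch below uses the slide's two branches; the variable names are illustrative.

```python
import re

pattern = re.compile(r"Scooby (dooby) (doo)!|Yabba (dabba) (doo)")

m = pattern.search("Yabba dabba doo")

# Groups are numbered 1 to 4 across both branches, so the groups that
# belong to the branch that did not match come back empty (None).
all_groups = m.groups()

# Filtering out the empties gives the per-branch view the slide describes.
branch_groups = [g for g in all_groups if g is not None]
```

Here `all_groups` is `(None, None, 'dabba', 'doo')`, while `branch_groups` holds just the two captures from the branch that matched.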

26
Subrules
  • Subrules capture into a hash keyed by the name of
    the subrule
  • rule ident / [<alpha>|_] \w* /
  • rule num / \d+ /
  • m/ <ident> \ <num> /
  • places match objects into $<ident> and $<num>
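
Capturing by subrule name resembles named groups in Python, where small patterns can be defined once and embedded in a larger one. The sketch assumes an identifier rule (a letter or underscore followed by word characters), a number rule (one or more digits), and a single literal space between them; the constant names and the input are hypothetical.

```python
import re

# Reusable sub-patterns, standing in for named rules.
IDENT = r"[A-Za-z_]\w*"
NUM = r"\d+"

# Embed each sub-pattern under its own name so the result is keyed by name.
pair = re.compile(rf"(?P<ident>{IDENT}) (?P<num>{NUM})")

m = pair.match("width 80")
captures = m.groupdict()
```

Unlike Perl 6 subrules, these are plain string substitutions, so they cannot recurse or carry their own nested captures.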

27
Quantified subrules
  • Like subpatterns, quantified subrules produce
    arrays of matches
  • m:w / dir <file>* /
  • produces matches in $<file>[0], $<file>[1], etc.
  • Nested parens in a subrule capture to the
    subrule's match object

28
Named captures
  • Portions of a match can be captured directly into
    a match object without a subrule
  • m:w/ $<name>:=[\w+] , $<num>:=[\d+] /
  • captures the first sequence of alphanumerics
    into $<name>, and digits following the comma into
    $<num>.

29
Grammars
  • Rules can be packaged together into separate name
    spaces to form Grammars
  • grammar Perl6 {
  •   rule ident {...}
  •   rule term {...}
  •   rule expr {...}
  • }
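
To give a feel for rules packaged into one namespace, here is a minimal Python sketch in which a class plays the role of the grammar and its methods play the role of rules that call one another. The class `MiniGrammar` and the tiny language it parses (terms joined by + or -) are invented for illustration; the slides do not show the contents of the Perl 6 grammar.

```python
import re

class MiniGrammar:
    """Hypothetical grammar: expr := term (('+' | '-') term)* ; term := ident | number."""

    ident = re.compile(r"[A-Za-z_]\w*")
    number = re.compile(r"\d+")
    ws = re.compile(r"\s*")

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def _skip_ws(self):
        # Zero or more spaces always match, so .end() is always available.
        self.pos = self.ws.match(self.text, self.pos).end()

    def term(self):
        self._skip_ws()
        for name, rule in (("ident", self.ident), ("number", self.number)):
            m = rule.match(self.text, self.pos)
            if m:
                self.pos = m.end()
                return (name, m.group(0))
        raise SyntaxError("expected a term at position %d" % self.pos)

    def expr(self):
        # Left-associative chain of terms, built into a nested tuple tree.
        node = self.term()
        while True:
            self._skip_ws()
            if self.pos < len(self.text) and self.text[self.pos] in "+-":
                op = self.text[self.pos]
                self.pos += 1
                node = (op, node, self.term())
            else:
                return node

tree = MiniGrammar("a + 42 - b").expr()
```

The resulting nested tuples are a stand-in for a syntax tree: each rule returns a node, and rules that call other rules assemble larger nodes from smaller ones.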

30
:parsetree
  • The :parsetree flag to a rule causes the grammar
    engine to keep all information about a match.
  • Thus, one can do something like
  • $parse = ($source ~~ Perl6::program)
  • to get the entire parsetree for a program
    (including comments)

31
Questions?