YACC no more - PowerPoint PPT Presentation

1
YACC no more
Integrating parsers, interpreters and compilers
into your application
  • Sriram Srinivasan (Ram)

2
This is he
  • Sriram Srinivasan
  • One of the core engineers of the WebLogic app
    server
  • Wrote the first commercially available EJB
    implementation
  • Wrote the TP engine in WLS
  • Author of Advanced Perl Programming (O'Reilly)

3
Why this talk?
  • Quest for higher level programming patterns
  • More productive / faster / maintainable etc
  • Integrating compilers, parsers, interpreters into
    your application

4
Embeddable Parsers
Case Study: Configuration Data
  • JDK parsers for configuration data
  • java.util.Properties, XML, regex library
  • java.util.Properties
  • Limited to property=value format
  • Takes care of comments, multi-line values, quotes

# app server properties
connectionPoolName = testPool
numThreads = 10

Properties p = new Properties();
p.load(inputStream);
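
A minimal sketch of the java.util.Properties approach, assuming the configuration text is already available as a string. The class name PropertiesDemo and the helper load are illustrative and do not come from the talk; only Properties.load(Reader) and getProperty are JDK calls.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropertiesDemo {
    // Hypothetical helper: parse property text held in memory.
    public static Properties load(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            // A StringReader should not fail, but load() declares IOException.
            throw new UncheckedIOException(e);
        }
        return p;
    }

    public static void main(String[] args) {
        String config =
            "# app server properties\n" +
            "connectionPoolName = testPool\n" +
            "numThreads = 10\n";
        Properties p = load(config);
        // Values are always strings; numeric properties must be converted by hand.
        int numThreads = Integer.parseInt(p.getProperty("numThreads"));
        System.out.println(p.getProperty("connectionPoolName") + " / " + numThreads);
    }
}
```

Every value comes back as a string, which is the limitation the later slides address with embedded interpreters.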
5
XML parsers
  • Good for structured, hierarchical data
  • DOM (Document Object Model) parser
  • Converts an entire XML document into a
    corresponding tree of Nodes.
  • SAX (Simple API for XML)
  • Callback class extends DefaultHandler
  • Supplies methods for startDocument(),
    startElement(), endElement() etc.
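
A small sketch of the SAX callback style described above, using only the JDK's built-in parser. The class SaxDemo and the method elementNames are illustrative names, not part of the talk; the handler simply records each element name as startElement() fires.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxDemo {
    // Hypothetical helper: returns element names in document order.
    public static List<String> elementNames(String xml) {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                names.add(qName); // called once per opening tag
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<pool><name>testPool</name></pool>"));
    }
}
```

Unlike DOM, nothing is kept in memory except what the handler chooses to store.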

6
Adding code to data
  • Problem: We want to add macros and
    expressions to our properties.

numThreads = numProcessors
# Ensure that connection pool is smaller than thread pool.
connectionPoolSize = min(numThreads 2, 1)
  • This requires an expression evaluator

7
Embeddable interpreters
  • Plethora of free, high quality interpreters
    available
  • BeanShell (Java-like syntax)
  • Rhino (JavaScript)
  • Jython (Python in Java)
  • Kawa (Scheme in Java)
  • When embedded, flow of control passes easily from
    Java to the interpreter and back.
  • Command-line shell always included

8
BeanShell
  • Expressions identical to java
  • Types are inferred dynamically

add( a, b ) { return a + b; }
sum = add(1, 2);            // 3
str = add("Web", "Logic");  // "WebLogic"
9
Embedding BeanShell
import bsh.Interpreter;

Interpreter i = new Interpreter();
i.set("foo", 5);
i.eval("bar = foo*10");
System.out.println("bar = " + i.get("bar"));
  • Instead of writing code to parse the properties
    file, just eval it!
  • Comments should be //, not #
  • Each property definition line should end in ;

i.eval(new FileReader("config.properties"));
Integer n = (Integer) i.get("connectionPoolSize");
10
BeanShell features
  • Strict java expression syntax
  • no class declarations
  • Loose convenience syntax

b = new java.awt.Button();
b.label = "Yo";             // equivalent to b.setLabel("Yo")
h = new Hashtable();
h{"spud"} = "potato";
// Swing stuff
b = new JButton("My Button");
f = new JFrame("My Frame");
f.getContentPane().add(b, "Center");
f.pack();
f.show();
11
Rhino
  • Free ECMAScript interpreter from Mozilla
  • Slightly more cumbersome to embed than BeanShell
  • Contains bytecode compiler that can be called
    from within java
  • Closures
  • Regex support built-in. Good for text
    manipulation

12
Case study: Command pattern
  • Undo/Redo in an editor

function insertCommand(text) {
    this.pos = buf.pos;
    buf.insert(text);
    this.len = text.length;
    this.undo = function () {
        buf.moveTo(this.pos);
        buf.erase(this.len);
    };
    undoStack.push(this);
}

new insertCommand("foo");
undoStack.pop().undo();
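
The same undo idea can be sketched in Java, where a lambda plays the role of the JavaScript closure. This is an illustrative analog rather than the Rhino code from the slide: the class UndoDemo, its StringBuilder buffer, and the append-only insert are assumptions made to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class UndoDemo {
    private final StringBuilder buf = new StringBuilder();
    // Each entry is a closure that knows how to reverse one edit.
    private final Deque<Runnable> undoStack = new ArrayDeque<>();

    // Appends text and records how to remove it again.
    public void insert(String text) {
        final int pos = buf.length();
        final int len = text.length();
        buf.append(text);
        undoStack.push(() -> buf.delete(pos, pos + len));
    }

    // Reverses the most recent edit; returns false if there is nothing to undo.
    public boolean undo() {
        if (undoStack.isEmpty()) {
            return false;
        }
        undoStack.pop().run();
        return true;
    }

    public String text() {
        return buf.toString();
    }

    public static void main(String[] args) {
        UndoDemo d = new UndoDemo();
        d.insert("foo");
        d.insert("bar");
        d.undo();
        System.out.println(d.text()); // prints foo
    }
}
```

The captured pos and len take the place of this.pos and this.len in the script version.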
13
Python
  • Python (Java implementation is "Jython")
  • powerful high-level language
  • Compiles to bytecode.
  • True scripting language
  • Can extend java classes
  • Static compilation and standalone execution

14
More case studies
  • Embedded expressions
  • Spreadsheet formulae
  • Customizable GUIs
  • Macro facility, keyboard mapping
  • Remote agents
  • Monitoring
  • Performance through partial evaluation

15
Case Study: Remote Agents
  • Example: Test agents
  • Can upload script to each agent to launch
    processes, control them locally.
  • Jython is well-suited for this kind of task
  • Example: Scriptable IMAP mail server
  • "For all messages that match this regex, make a
    copy in this folder"

16
Case Study: Monitoring
  • SNMP model: obtain attributes from each node over
    the network, do the calculation centrally
  • Alternatively, upload script to each node, and
    let it return the result
  • Conserves network bandwidth
  • Can insert any kind of probe
  • Study application data structures
  • Application-specific profiling

17
Case Study: Performance
  • Partial evaluation can yield substantial
    performance benefits
  • Object-RDBMS adaptors
  • Code generator studies class and db schema
  • Omits unnecessary conversions, null checks
  • Vector dot product

dp = a[0]*b[0] + a[1]*b[1] + a[2]*b[2]
// But if 'a' is fixed = [16,0,4]
dp = (b[0] << 4) + (b[2] << 2)
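
A short Java sketch of this partial evaluation, assuming int vectors of length 3. The class and method names are illustrative. dotFixed is what a code generator could emit once it knows a = {16, 0, 4}: the zero term disappears and the multiplications by 16 and 4 become shifts.

```java
class DotProductDemo {
    // General dot product: one multiply and one add per element.
    public static int dot(int[] a, int[] b) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Specialized for the fixed vector a = {16, 0, 4}.
    // 16 * x == x << 4 and 4 * x == x << 2; the a[1] == 0 term is dropped.
    public static int dotFixed(int[] b) {
        return (b[0] << 4) + (b[2] << 2);
    }

    public static void main(String[] args) {
        int[] a = {16, 0, 4};
        int[] b = {3, 7, 5};
        System.out.println(dot(a, b));   // prints 68
        System.out.println(dotFixed(b)); // prints 68
    }
}
```

The parentheses matter in Java, because + binds more tightly than <<.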
18
Generating java
  • Moving from embedded interpreters to generating
    java source
  • Example: JSP
  • Convert template to java, compile and dynamically
    load
  • BEA/WebLogic's weblogic.dtdc
  • Converts XML DTD to a high performance SAX parser
    tuned to that DTD

19
Generating code with Doclets
  • javadoc is a general purpose parser
  • javadoc -doclet ListClass foo.java
  • ListClass.start() called with a hierarchy of Doc
    nodes

import com.sun.javadoc.*;

public class ListClass {
    public static boolean start(RootDoc root) {
        ClassDoc[] classes = root.classes();
        for (int i = 0; i < classes.length; i++) {
            System.out.println(classes[i]);
        }
        return true;
    }
}
  • Arbitrary tags can be introduced at any level

20
Case study: iContract
  • Pattern: doclet expressions converted to
    annotated Java code

/**
 * Ensure that argument is always > 0
 * @pre f > 0.0
 *
 * Ensure that the function produces the sqrt within a
 * @post Math.abs((return * return) - f) < 0.001
 */
public float sqrt(float f) { ... }
21
Case Study: EJBGen
/**
 * @ejbgen:entity
 *   ejb-name = AccountEJB-OneToMany
 *   data-source-name = demoPool
 *   table-name = Accounts
 */
abstract public class AccountBean implements EntityBean {
    /**
     * @ejbgen:cmp-field column = acct_id
     * @ejbgen:primkey-field
     * @ejbgen:remote-method transaction-attribute = Required
     */
    abstract public String getAccountId();
}
22
Generating bytecode
  • Example: WebLogic RMI adaptors
  • Sometimes, some facilities are available only in
    bytecode (goto's!)
  • Example: fast string matching
  • Given a search string, encode the state machine
    into bytecode
  • Worth it if the same pattern is going to be used
    many times
  • Virus scanners
  • Searching genome sequences

23
Example: String matching
  • Problem: match "10100"
  • Convert to a state machine
  • Each state encodes a successful prefix match

24
String matching (contd.)
  • If only goto were allowed in java
  • But, goto's are allowed in bytecode!

try {
    // buf is the buffer to be searched
    int i = -1;
    s0: i++; if (buf[i] != '1') goto s0;
    s1: i++; if (buf[i] != '0') goto s1;
    s2: i++; if (buf[i] != '1') goto s0;
    s3: i++; if (buf[i] != '0') goto s1;
    s4: i++; if (buf[i] != '0') goto s3;
    s5: i++; return i-5;
} catch (ArrayIndexOutOfBoundsException e) {
    return -1;
}
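
Since goto is not legal Java source, the same state machine can be approximated with a switch inside a loop. The sketch below is one possible rendering, not the generated bytecode the talk has in mind. The class name and method are illustrative, and unlike the slide's version it also resets to state 0 on characters other than '0' and '1'.

```java
class Match10100 {
    // Returns the index where "10100" first starts in buf, or -1 if absent.
    // state == number of pattern characters matched so far.
    public static int find(char[] buf) {
        int state = 0;
        for (int i = 0; i < buf.length; i++) {
            char c = buf[i];
            switch (state) {
                case 0: state = (c == '1') ? 1 : 0; break;
                case 1: state = (c == '0') ? 2 : (c == '1') ? 1 : 0; break;
                case 2: state = (c == '1') ? 3 : 0; break;
                case 3: state = (c == '0') ? 4 : (c == '1') ? 1 : 0; break;
                case 4: state = (c == '0') ? 5 : (c == '1') ? 3 : 0; break;
                default: break;
            }
            if (state == 5) {
                return i - 4; // i is the last matched character
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(find("1010100".toCharArray())); // prints 2
    }
}
```

The transition from state 4 back to state 3 on a '1' is what lets the matcher avoid re-reading input: "10101" already ends with the prefix "101".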
25
String matching (contd.)
  • Using an assembler like jasmin

    iconst_m1
    istore_1
S0:                   ; i++; if (a[i] != '1') goto S0
    iinc 1 1          ; i++
    aload_0           ; load a[i]
    iload_1
    caload
    bipush 49         ; load '1'
    if_icmpne S0      ; if .. goto S0
S1:                   ; i++; if (a[i] != '0') goto S1
    iinc 1 1
    aload_0
    iload_1
    caload
    bipush 48
    if_icmpne S1
26
Custom languages
  • Craft a language that fits the context you are
    working in
  • Avoid XML ugliness: SRML (Simple Rule Markup)
  • Instead of "if s.purchaseAmount > 100":

<simpleCondition className="ShoppingCart" objectVariable="s">
  <binaryExp operator="gt">
    <field name="purchaseAmount"/>
    <constant type="float" value="100"/>
  </binaryExp>
</simpleCondition>
27
Antlr: Introduction
  • Antlr: a generator of recursive descent parsers with
    configurable lookahead (LL(k))
  • Much, much simpler than lex/yacc
  • Yacc error messages are cryptic, tough for non-CS
    types to understand
  • Even the generated code is easy to understand
  • Includes tree building and recognition
  • No such facility in yacc
  • Lexer, parser and tree recognizer phases have
    similar syntax

28
Antlr
  • Example: hierarchical property list
  • A list consists of name-value pairs
  • Names are identifiers, values are numbers or
    lists

( a 200 b (c 10 d 20) )
29
Antlr (contd.)
class LispLexer extends Lexer;
ID  : ('a' .. 'z')+ ;
NUM : ('0' .. '9')+ ;
LP  : '(' ;
RP  : ')' ;

class LispParser extends Parser;
list          : LP (nameValuePair)* RP ;
nameValuePair : ID value ;
value         : NUM | list ;
30
Antlr (contd.)
  • Adding code, arguments, return values

nameValuePair returns [NVP ret=null]
    { Object v; }
    : t:ID v=value { ret = new NVP(t.getText(), v); }
    ;
value returns [Object ret=null]
    : t:NUM { ret = t.getText(); }
    | ret=list
    ;
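
To show the kind of recursive descent code such a grammar corresponds to, here is a hand-written Java sketch for the same property-list language. It is not ANTLR output: the class PropertyListParser, the use of a Map instead of NVP objects, and the conversion of numbers to Integer are all assumptions made for illustration. Each grammar rule becomes one method.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PropertyListParser {
    private final String src;
    private int pos = 0;

    private PropertyListParser(String src) { this.src = src; }

    public static Map<String, Object> parse(String src) {
        PropertyListParser p = new PropertyListParser(src);
        Map<String, Object> result = p.list();
        p.skipSpaces();
        if (p.pos != src.length()) throw p.error("end of input");
        return result;
    }

    // list : LP (nameValuePair)* RP
    private Map<String, Object> list() {
        expect('(');
        Map<String, Object> pairs = new LinkedHashMap<>();
        skipSpaces();
        while (pos < src.length() && src.charAt(pos) != ')') {
            String name = id();          // nameValuePair : ID value
            pairs.put(name, value());
            skipSpaces();
        }
        expect(')');
        return pairs;
    }

    // value : NUM | list  (one character of lookahead decides)
    private Object value() {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == '(') return list();
        return num();
    }

    private String id() {
        skipSpaces();
        int start = pos;
        while (pos < src.length() && src.charAt(pos) >= 'a' && src.charAt(pos) <= 'z') pos++;
        if (start == pos) throw error("identifier");
        return src.substring(start, pos);
    }

    private Integer num() {
        skipSpaces();
        int start = pos;
        while (pos < src.length() && src.charAt(pos) >= '0' && src.charAt(pos) <= '9') pos++;
        if (start == pos) throw error("number");
        return Integer.valueOf(src.substring(start, pos));
    }

    private void expect(char c) {
        skipSpaces();
        if (pos >= src.length() || src.charAt(pos) != c) throw error("'" + c + "'");
        pos++;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    private IllegalArgumentException error(String expected) {
        return new IllegalArgumentException("expected " + expected + " at position " + pos);
    }

    public static void main(String[] args) {
        System.out.println(parse("( a 200 b (c 10 d 20) )")); // prints {a=200, b={c=10, d=20}}
    }
}
```

Writing this by hand takes far more lines than the grammar, which is the argument for letting a tool generate it.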
31
Way out there
  • Configurable hardware
  • New circuits on the fly
  • Intentional programming
  • Code not represented as a stream of characters

32
Summary
  • Run-time evaluation gives you a lot of power
  • Other languages add features (e.g. closures) to
    java
  • Lots of simple, free, quality parsers,
    interpreters
  • Produce custom java source or byte code for
    performance
  • Roll your own domain-specific language with ANTLR
    or javacc.
  • Yacc No More.

33
References
  • Doclets
  • Doclet tools: www.doclet.com
  • EJBGen: www.beust.com, Cedric Beust
  • iContract: www.reliable-systems.com, Reto Kramer
  • Languages, interpreters
  • BeanShell: www.beanshell.org
  • Rhino: www.mozilla.org/rhino
  • Python: www.python.org, www.jython.org
  • ANTLR: www.antlr.org
  • More: flp.cs.tu-berlin.de/tolk/vmlanguages.html
  • SRML: xml.coverpages.org/srml.html

34
References (contd.)
  • Bytecode manipulation
  • Jasmin: mrl.nyu.edu/meyer/jasmin/
  • Jikes Bytecode Toolkit: www.alphaworks.ibm.com/tech/jikesbt
  • BCEL: bcel.sourceforge.net
  • "Rapid" - Reconfigurable hardware
  • www.cs.washington.edu/research
  • "The death of computer languages, the birth of
    intentional programming", Charles Simonyi
  • research.microsoft.com/scripts/pubs/trpub.asp
  • Microsoft tech report MSR-TR-95-52
  • Thinking in Patterns with Java, Bruce Eckel
  • www.mindview.net/Books/TIPatterns
