TweaXML - PowerPoint PPT Presentation

1
TweaXML
A language to manipulate and extract data from XML files
Kaushal Kumar (kk2457), Srinivasa Valluripalli (sv2232)
2
Contents
  • Overview and motivation
  • Language features
  • XML handling functionalities
  • Architectural Design
  • Tutorial (with example)
  • Lessons learned
  • Summary

3
Overview and Motivation
  • TweaXML is a language for parsing XML files, extracting
    data from them, and creating new CSV/TXT files in
    user-defined data formats.
  • XML is a universal format used to pass data between
    heterogeneous systems.
  • But parsing an XML file programmatically is not
    straightforward.
  • To parse an XML file:
      • First you need to learn a programming language
        (Java, for example).
      • Then you need to learn APIs such as the DOM parser
        and the SAX parser.
      • Using these APIs can be too complicated.
  • TweaXML provides a much simpler language for parsing
    XML files. Moreover, it provides a way to create output
    files containing the extracted data in user-defined
    formats.

4
Language Features
  • Carefully chosen set of keywords
  • Multiple types (int, string, node, file, array)
  • Several operators
      • Unary operators (including !)
      • Arithmetic operators (+, -, *, /)
      • Comparison operators (<, <=, >, >=, ==, !=)
      • Logical operators (AND, OR)
      • Node operators (getchild, getvalue)
      • File operators (open, create, print, close)
  • Inbuilt functions (add, subtract, multiply,
    divide, length)

5
Language Features (cont)
  • Various types of statements
      • Conditional statements (if, else)
      • Iterative statements (while)
      • Jump statements (return, continue, break)
      • I/O statements (open, create, print, close)
      • Inbuilt function calls (add, subtract, multiply,
        divide, length)

6
XML Handling functionalities
  • Open an XML file to read (open)
      • Returns the root node of the XML file
  • Get the child nodes of a node, using the XPath
    of the child nodes (getchild)
      • Returns an array of child nodes
  • Get the length of the child-nodes array (length)
  • Get the value of a node (getvalue)
      • Returns the value of the node in string format
  • Add the values of two nodes (add)
      • Implicit checks of data types
  • Subtract the values of two nodes (subtract)
  • Multiply the values of two nodes (multiply)
  • Divide the values of two nodes (divide)
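
The operations listed on this slide can be modelled roughly in Python with the standard library's xml.etree.ElementTree. This is only a sketch for illustration, not the TweaXML implementation: the helper names open_xml and getvalue-style wrappers are hypothetical, open_xml parses a string rather than a file, and the way add formats its result and reports a type mismatch is an assumption, since the slides only say that data types are checked implicitly.

```python
import xml.etree.ElementTree as ET


def open_xml(text):
    """Rough model of `open`: parse XML text and return the root node.
    (Assumption: parses a string here; TweaXML's `open` takes a file name.)"""
    return ET.fromstring(text)


def getchild(node, path):
    """Rough model of `getchild`: return a list of child nodes matching a path."""
    return node.findall(path)


def getvalue(node):
    """Rough model of `getvalue`: return the node's text as a string."""
    return node.text or ""


def add(a, b):
    """Rough model of `add` on two string values.
    Assumption: the implicit type check means both values must be numeric."""
    try:
        total = float(a) + float(b)
    except ValueError as exc:
        raise TypeError(f"cannot add non-numeric values {a!r} and {b!r}") from exc
    # Assumption: whole numbers are printed without a trailing ".0".
    return str(int(total)) if total.is_integer() else str(total)


root = open_xml(
    "<students><student><name>kaushal</name>"
    "<homework1>85</homework1><homework2>85</homework2>"
    "</student></students>"
)
students = getchild(root, "student")          # array of child nodes
first = students[0]
hw1 = getvalue(getchild(first, "homework1")[0])
hw2 = getvalue(getchild(first, "homework2")[0])
print(len(students), add(hw1, hw2))           # prints: 1 170
```

Note that getchild returns a list even when only one node matches, which mirrors the slide's statement that it returns an array of child nodes.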

7
File Handling functionalities
  • Create an output file to write to (create)
      • Returns a value of type file
  • Write to the file (print)
  • Close the output file once you are done (close)
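
As a rough analogy, the three output operations map onto ordinary file handling in Python. The sketch below is an illustration under assumptions, not TweaXML's runtime: the function name print_to is hypothetical (chosen to avoid shadowing Python's print), and the temporary directory exists only to keep the example self-contained.

```python
import os
import tempfile


def create(path):
    """Rough model of `create`: open a new output file and return its handle."""
    return open(path, "w", encoding="utf-8", newline="")


def print_to(out, text):
    """Rough model of `print`: write a string to the output file as-is."""
    out.write(text)


def close(out):
    """Rough model of `close`: flush and close the output file."""
    out.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "AvgMarks.csv")
    output = create(path)
    print_to(output, "kaushal")
    print_to(output, "\t")
    print_to(output, "82.5")
    print_to(output, "\n")
    close(output)
    # Read the file back to confirm what was written.
    with open(path, encoding="utf-8", newline="") as f:
        written = f.read()

print(repr(written))
```

Because print writes exactly the string it is given, separators such as tabs and newlines have to be written explicitly, which is what gives the user control over the output format.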

8
Architectural Design
Front end (TweaXMLLexer, TweaXMLParser)
Tree walker (TweaXmlWalker, TweaXmlCodeGen)
Back end (CodeGen.java)
Runtime libraries (Apache's DOM parser)
9
Tutorial - Example
(A TweaXML program to extract students' performance data
and create a CSV file with the average marks of each
student)
Input XML file (marks_data.xml)
<students>
  <student>
    <name>kaushal</name>
    <homework1>85</homework1>
    <homework2>85</homework2>
    <midterm>70</midterm>
    <final>90</final>
  </student>
  <student>
    <name>Srini</name>
    <homework1>80</homework1>
    <homework2>85</homework2>
    <midterm>87</midterm>
    <final>95</final>
  </student>
</students>
10
TweaXML program
start()
{
    file output;
    node rootNode;
    output = create "AvgMarks.csv";
    rootNode = open "marks_data.xml";
    node studentNodes;
    studentNodes = getchild rootNode "student";
    int len;
    len = length studentNodes;
    if (len > 0)
    {
        int j;
        j = 0;
        while (j < len)
        {
            node nameNode, homework1Node, homework2Node, midtermNode, finalNode;
            string name, homework1Marks, homework2Marks, midtermMarks, finalMarks;
            nameNode = getchild studentNodes[j] "name";
            homework1Node = getchild studentNodes[j] "homework1";
            homework2Node = getchild studentNodes[j] "homework2";
            midtermNode = getchild studentNodes[j] "midterm";
            finalNode = getchild studentNodes[j] "final";
11
            name = getvalue nameNode[0];
            homework1Marks = getvalue homework1Node[0];
            homework2Marks = getvalue homework2Node[0];
            midtermMarks = getvalue midtermNode[0];
            finalMarks = getvalue finalNode[0];
            string totalMarks;
            totalMarks = add homework1Marks homework2Marks;
            totalMarks = add totalMarks midtermMarks;
            totalMarks = add totalMarks finalMarks;
            string avgMarks;
            avgMarks = divide totalMarks "4";
            print output name;
            print output "\t";
            print output avgMarks;
            print output "\n";
            j = j + 1;
        }
    }
    close output;
}
12
Output
Output file (AvgMarks.csv)
kaushal	82.5
Srini	86.75
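
For comparison, the same extraction can be sketched in Python with xml.etree.ElementTree. This is not how TweaXML works internally (the slides describe a Java back end over Apache's DOM parser); it is a minimal sketch that re-computes the tutorial's averages so the arithmetic can be checked. The names MARKS_XML, FIELDS, and average_marks are hypothetical, and the XML is inlined as a string rather than read from marks_data.xml.

```python
import xml.etree.ElementTree as ET

# The tutorial's input data, inlined so the sketch is self-contained.
MARKS_XML = """<students>
  <student>
    <name>kaushal</name>
    <homework1>85</homework1><homework2>85</homework2>
    <midterm>70</midterm><final>90</final>
  </student>
  <student>
    <name>Srini</name>
    <homework1>80</homework1><homework2>85</homework2>
    <midterm>87</midterm><final>95</final>
  </student>
</students>"""

FIELDS = ("homework1", "homework2", "midterm", "final")


def average_marks(xml_text):
    """Return one 'name<TAB>average' line per student, as in the tutorial."""
    root = ET.fromstring(xml_text)
    lines = []
    for student in root.findall("student"):
        name = student.findtext("name")
        total = sum(float(student.findtext(field)) for field in FIELDS)
        # ':g' drops trailing zeros, so 82.5 stays "82.5".
        lines.append(f"{name}\t{total / len(FIELDS):g}\n")
    return "".join(lines)


print(average_marks(MARKS_XML), end="")
```

The totals are 85 + 85 + 70 + 90 = 330 and 80 + 85 + 87 + 95 = 347, and dividing each by 4 gives 82.5 and 86.75, matching the contents of AvgMarks.csv.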
13
Lessons Learned
  • Start early on the project
  • More functionality could have been added
  • More data types could have been provided
  • User-defined functions could have been added

14
Summary
  • TweaXML provides an easier way to deal with XML
    files.
  • Data can be extracted and written out in
    user-defined formats.
  • No need to learn APIs like the DOM parser and
    the SAX parser.
  • It's not perfect, but it's highly useful.
  • More functionality could have been provided
    given more time.