Title: Object-Oriented Parsing and Transformation
1Object-Oriented Parsing and Transformation
- Kenneth Baclawski, Northeastern University
- Scott A. DeLoach, Air Force Institute of Technology
- Mieczyslaw Kokar, Northeastern University
- Jeffrey Smith, Northeastern University/Sanders
2Why Formalize CASE Tools?
- Formal Methods
  - Provably correct software
  - Code generation
  - Specification refinement
  - Theorem proving
  - Specification and software composition
- CASE Tools
  - Uniform graphical interface
  - Modern SE methodologies
  - Reverse engineering
  - Large-scale development paradigm
3The Problem
- Refinement is the process of transforming one
specification to a more detailed specification.
- CASE tools commonly support OO Analysis and
Design, but refinement is still based on grammars
and parse trees.
4Proposed Solution
- We introduce a toolkit for OO refinement and
transformation.
- The toolkit also automates the generation of
grammars and parsers when it is necessary to use
linear (grammar-based) representations.
5Examples: Web Documents
- Web Documents.
  - An OO data model can be transformed in an
    automated way to an XML DTD.
  - An OO repository can be viewed as an XML document
    using a variety of panoramas.
  - The parser for the DTD can also be produced in an
    automated way.
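As a rough illustration of the first point, the following Python sketch maps a tiny OO data model to DTD element declarations. The ModelClass type, its fields, and the to_dtd function are hypothetical names chosen for this example and are not part of the toolkit. The sketch assumes that containment is always zero-or-more and that attribute names are unique across classes.

```python
from dataclasses import dataclass, field


@dataclass
class ModelClass:
    """Hypothetical stand-in for one class of an OO data model."""
    name: str
    attrs: list = field(default_factory=list)  # names of data-valued attributes
    parts: list = field(default_factory=list)  # names of contained classes


def to_dtd(classes):
    """Emit one <!ELEMENT> declaration per class and per attribute.

    Assumption: attribute names are unique across classes, since a DTD
    may declare each element type only once.
    """
    lines = []
    for c in classes:
        # Attributes become required child elements; parts become repeatable ones.
        children = list(c.attrs) + [f"{p}*" for p in c.parts]
        content = f"({', '.join(children)})" if children else "EMPTY"
        lines.append(f"<!ELEMENT {c.name} {content}>")
        for a in c.attrs:
            lines.append(f"<!ELEMENT {a} (#PCDATA)>")
    return "\n".join(lines)


model = [
    ModelClass("Course", attrs=["title"], parts=["Student"]),
    ModelClass("Student", attrs=["name"]),
]
print(to_dtd(model))
```

Note that the sketch already had to pick Course as the containing element, which is one of the choices a linear representation forces on an OO model.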
6Examples: Natural Language
- Traditional NLP techniques involve a pipeline
of linear scans of the text.
  - Lexical scanning produces tokens.
  - Tagging determines the part of speech of terms.
  - Parsing determines a tree structure.
  - Knowledge extraction maps the tree structure to a
    data model (usually a relational data model).
- OO transformation avoids the need for generating
and parsing intermediate linear representations.
7Example: UML Formalization
- Formal methods can provide a foundation for
specification and modeling.
- However, formal methods are regarded as difficult
to learn and to use.
- Combining a CASE tool with a formal methods
system would make formal methods more accessible
and usable.
8Theory-Based Object Model
UML Component: Meaning
- sort: collection of values
- class type: structure of object and response to stimuli
- class sort: all possible value representations of objects of the class
- abstract class: class with no direct instances
- concrete class: blueprint for instances
- attribute: function that returns data values/objects; an observable class characteristic
- object-valued attribute: class attribute whose sort is a set of objects
- method: function that modifies attribute values
- operation: function that does not modify attribute values
- axiom: class attribute value invariant or specification of a function's semantics
- state attribute: function mapping from class to state sort
- state sort: all possible states of an object
- state invariant: constraint on class attribute in a given state
- event: function that invokes methods, generates events and modifies state attributes
9Component Composition
An important feature of the theory-based object
model is the ability to compose components using
the colimit operation. The following diagram
illustrates the use of the colimit for
aggregation of account information for a customer
of a bank.
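[Diagram: colimit construction of the Bank specification.
Node labels: Integer; Set (three occurrences); Acct-Class; Cust-Acct;
Cust-Class; C (three occurrences); Bank.
Arrow labels:
E -> CA-Link, Set -> Cust-Acct
E -> Acct, Set -> Acct-Class
E -> Customer, Set -> Cust-Class
E -> Account, Set -> Accounts
E -> Customer, Set -> Customers]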
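To give a feel for what a colimit does, here is a minimal Python sketch of the simplest case: a pushout of finite symbol sets, in which two components are glued together along a shared part. It only handles symbol names, not axioms or sorts with structure, and it is not the operation as implemented in any particular formal methods system. The function name pushout, the tagging scheme, and the Account/Customer symbols are assumptions made for the illustration.

```python
def pushout(shared, spec_a, spec_b, f, g):
    """Glue spec_a and spec_b along shared.

    f and g map each shared symbol to its image in spec_a and spec_b.
    Returns the equivalence classes of the disjoint union, where the two
    images of every shared symbol are identified.
    """
    # Tag symbols with their origin so equal names stay distinct unless glued.
    parent = {("A", s): ("A", s) for s in spec_a}
    parent.update({("B", s): ("B", s) for s in spec_b})

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for s in shared:
        root_a, root_b = find(("A", f[s])), find(("B", g[s]))
        if root_a != root_b:
            parent[root_b] = root_a

    classes = {}
    for x in parent:
        classes.setdefault(find(x), set()).add(x)
    return sorted(sorted(c) for c in classes.values())


# Hypothetical components that both import a shared Integer sort.
account = {"Account", "Balance", "Integer"}
customer = {"Customer", "Age", "Integer"}
same = {"Integer": "Integer"}
for cls in pushout({"Integer"}, account, customer, same, same):
    print(cls)
```

In the result, Integer appears once (as a single class with two members), while the symbols that were not shared remain separate.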
10Grammars versus OO Models
- Expressing an OO model in terms of a grammar is
complex and awkward.
  - Many-to-many relationships require introducing
    artificial identifiers.
  - Object sharing in general requires identifiers.
  - A focal point must be chosen.
  - Web documents add the complexity of
    choosing document boundaries.
11Example
[Diagram: Student -- takes -- Course]
Student as focal point: a list of students, where each
student has the list of courses being taken by
that student. Is the course information replicated
for each student, or is an identifier used?
Where does the information about the course get
expressed?
Course as focal point: a list of courses, where each
course has the list of students who are taking
that course. Is the student information replicated
for each course, or is an identifier used? Where
does the information about the student get
expressed?
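The questions above can be made concrete with a small Python sketch. In the object model, a course taken by two students is simply one shared object. Writing the model out with Student as the focal point forces the introduction of artificial identifiers and a decision about where the course details go. The class names, the enroll helper, and the output format are all hypothetical choices for this illustration.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Course:
    title: str
    students: list = field(default_factory=list)


@dataclass(eq=False)
class Student:
    name: str
    courses: list = field(default_factory=list)


def enroll(student, course):
    """Record the many-to-many 'takes' link in both directions."""
    student.courses.append(course)
    course.students.append(student)


def student_focal(students):
    """Linearize with Student as focal point; courses become id references."""
    labels = {}   # id(course) -> artificial identifier such as "c1"
    courses = []  # courses in order of first appearance
    lines = []
    for s in students:
        refs = []
        for c in s.courses:
            if id(c) not in labels:
                labels[id(c)] = f"c{len(labels) + 1}"
                courses.append(c)
            refs.append(labels[id(c)])
        lines.append(f"student {s.name} takes {' '.join(refs)}")
    # One possible answer to "where does the course information go?":
    # after all students, once per course.
    for c in courses:
        lines.append(f"course {labels[id(c)]} {c.title}")
    return lines


alice, bob = Student("Alice"), Student("Bob")
logic, algebra = Course("Logic"), Course("Algebra")
enroll(alice, logic)
enroll(alice, algebra)
enroll(bob, logic)
print("\n".join(student_focal([alice, bob])))
```

Choosing Course as the focal point would need a mirror-image function with student identifiers, whereas the object graph itself has no preferred direction.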
12The Transformation Pipeline
- Refinement and transformation are usually
modularized into a series of steps.
- In the grammar-based approach, each step
communicates with the next using a linear
representation, which requires:
  - grammar
  - parser
  - symbol table
  - generator
13Pipeline Example
[Diagram: grammar-based pipeline. Box labels, in extracted order:
CASE Diagram; Export Format; Parse Tree; Intermediate Structure;
Parse Tree; Object Model Language; Object Model Structure;
Formal Methods Language; Parse Tree; Formal Methods System;
Programming Language; Parse Tree; Intermediate Structure;
Intermediate Code; Executable Code]
Most of the effort in constructing such a
pipeline is devoted to adapting to the needs of
the grammar-based intermediate representations.
14Simplifying the Pipeline
The nu toolkit was introduced to simplify the
transformational pipeline by specifying
transformations directly on the OO data
structures.
[Diagram: simplified pipeline. Box labels, in extracted order:
CASE Diagram; Intermediate Structure; Object Model Structure;
Formal Methods Structure; Programming Language Intermediate Structure]
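A minimal Python sketch of one such direct step is shown below: an object model structure is mapped to a formal-methods-style structure without emitting or parsing any text in between. It follows the reading of the theory-based object model in which a class gives rise to a class sort and each attribute to a function on that sort. The types ClassModel, Attribute, Operation and Spec, and the function class_to_spec, are hypothetical and do not reflect the actual nu interface.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    sort: str


@dataclass
class ClassModel:
    """Source side: a class in the object model structure."""
    name: str
    attributes: list = field(default_factory=list)


@dataclass
class Operation:
    """Target side: a function signature, domain -> codomain."""
    name: str
    domain: str
    codomain: str


@dataclass
class Spec:
    """Target side: a specification with sorts and operations."""
    name: str
    sorts: list = field(default_factory=list)
    operations: list = field(default_factory=list)


def class_to_spec(cls):
    """Transform a class directly into a specification structure."""
    spec = Spec(name=cls.name, sorts=[cls.name])  # the class sort
    for attr in cls.attributes:
        if attr.sort not in spec.sorts:
            spec.sorts.append(attr.sort)
        # Each attribute becomes a function from the class sort to its own sort.
        spec.operations.append(Operation(attr.name, cls.name, attr.sort))
    return spec


account = ClassModel(
    "Account",
    [Attribute("balance", "Integer"), Attribute("owner", "Customer")],
)
print(class_to_spec(account))
```

Both sides are ordinary objects, so the step needs no grammar, parser, symbol table, or generator; a textual form would only be produced at the end of the pipeline if a tool requires one.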
15Conclusion
- Grammar-based refinement requires a great deal of
unnecessary effort, which is only partly mitigated
by attribute grammars and support tools.
- Direct OO refinement and transformation is much
simpler and less error-prone.
- Unfortunately, this particular paradigm shift has
yet to occur in the refinement community.
16Future Directions
- Complete the formalization of UML.
- Develop nu into a full-featured system
for object-oriented refinement and
transformation.
- Apply formal methods (via CASE tools)
to component composition, reusable components
and self-adaptive systems.