Erlang%20refactoring%20with%20relational%20database - PowerPoint PPT Presentation

About This Presentation
Title:

Erlang%20refactoring%20with%20relational%20database

Description:

MySQL. Distel. Emacs. ODBC (?) Code , AST. Storing the code in database ... call the MySQL database directly from the Erlang node (a mysql module was ... – PowerPoint PPT presentation

Number of Views:43
Avg rating:3.0/5.0
Slides: 28
Provided by: Bub1
Category:

less

Transcript and Presenter's Notes

Title: Erlang%20refactoring%20with%20relational%20database


1
Erlang refactoring with relational database
  • Anikó Víg and Tamás Nagy
  • Supervisors
  • Zoltán Horváth and Simon Thompson
  • Project members László Lövei, Tamás Kozsik

2
Refactor
  • Refactoring is restructuring program code without
    altering its external behaviour
  • Aims restructuring of code, code quality
    improvement, coding conventions, optimization,
    migrating to new API
  • In general not available for functional
    languages, only HaRe (AST) and Clean refactoring
    (relational database)

3
Erlang
  • Functional programming language and runtime
    environment developed by Ericsson
  • Designed to build distributed, reliable, soft
    realtime concurrent systems (telecommunication)
  • Highly dynamic nature
  • Lightweight processes and message passing
  • Messages can be sent to a process ID or
    registered name,which are bound to function code
    at runtime.
  • Possibility of running dynamically created code
    (eval, hot code replacement).

4
Erlang
  • Function id-s are atoms, atoms may be constructed
    runtime (and passed as arguments to apply,spawn).
    The syntactical category of an atom may be a
    function or string, etc.
  • Side effects are restricted to message passing
    and built-in functions
  • Modules with explicit interface definitions and
    static export and import lists
  • No static type system
  • Variables are assigned a value only once in their
    life
  • Variables are not typed statically, they can have
    a value of any data type.

5
The refactor tool
Distel
Erlang node
MySQL
ODBC (?)
Emacs
Source in the database
Source code In Emacs
AST of the source
6
Code , AST
7
Storing the code in database
  • Every node in a tree has a unique identifier.
  • Every module has a unique module identifier.
  • Almost every syntax-tree node type has an own
    database table
  • The records in the tables contain the identifiers
    of the current nodes and their childrens ones.
  • We store the positions, node types and names in
    separate tables.

8
Semantic informations
  • We store the semantic informations in separate
    tables identical variables, function definitions
    and their calling expressions, scope of the
    nodes, hierarhy of scopes
  • AST semantic informations graph, not tree.
  • Searching and using a graph is more efficient
    with relational database as traversing the tree.
  • - The storing-recovering of the code and
    communication with the database is more expensive

9
Our own tables visibility of variables
10
Our own tables function calls
11
Our own tables visibility of scopes
12
Problems during the building up
  • The Erlang prepocessor substitutes the macro
    definitions using epp_dodger instead of epp
  • Every node has only the line number of position
    informations using erl_scan1 (modified by
    Huiqing) instead of erl_scan it can give back
    the column information too

13
Problems during the building up
  • In Erlang language there are many types of
    comments we have not only comments, but pre- and
    postcomments too, which can be list of comment
    nodes.
  • The erl_comment_scanfile collects the comments
    from the file, and we have to put them too with
    the correct position information into the
    database.
  • There is the same problem with the column
    infomation so we use erl_recomment1 instead of
    erl_recomment.

14
The algorithms
  • We use postorder traverse on the syntax tree to
    give identifiers and put the nodes into the
    correct table.
  • We use preorder traverse on the syntax tree to
    get the visibility information.
  • The other information come from the database
    (collected by separate processes), for example
    the information of the function callings.

15
Analysis for Refactoring Steps
  • Syntax analysis (AST)
  • Static semantics (Annotated AST or relational
    database) scope, visibility, binding structure,
    type information.
  • Side-condition analysis
  • Compensations
  • Dynamic function calls (apply, spawn, etc.)
  • Syntactical, semantical and library coverage

16
Rename variable
  • Definition Find every occurrence of the variable
    (i.e. the variables with the same name in the
    visibility range of the variable) and replace
    every occurrence with the new name.
  • Precondition The new variable name is not
    visible at any occurrence of the variable
  • Limited to one module (no global variables)

17
Rename function
  • Definition The refactoring rely on finding the
    definition and every place of call for a given
    function and substitute it with a new name.
  • Preconditions
  • No name clash in the current module (existing
    functions, import list)
  • No name clash in other modules, if the function
    is exported

18
Reorder arguments
  • Definition Change the order of arguments in the
    same way at the definition and every place of
    call for a given function.
  • Preconditions
  • No side effect of the parameters (just planned)
  • Tricky implicit function calls delete,
    create subtree (the same problem will be at the
    tuple arguments refactor step too)

19
Implicit function example
20
Tuple arguments
  • Definition Change the way of using some
    arguments at the definition and at every place
    of call for a given function by grouping some
    arguments into one tuple argument.
  • Preconditions
  • The given position must be within a formal
    argument of a function definition
  • The function must be a declared function, not a
    fun-expression
  • The given number must not be too large
  • No name clash if the arity is changing (not only
    in the current module if the function is exported)

21
Eliminate variable
  • Definition All instances of a variable are
    replaced with its bound value in that region
    where the variable is visible. The variable can
    be left out where its value is not used.
  • Preconditions
  • It has exactly one binding occurrence on the left
    hand side of a pattern matching expression, and
    not a part of a compound pattern.
  • The expression bound to the variable has no side
    effects.
  • Every variable of the expression is visible (that
    is, not shadowed) at every occurrence of the
    variable to be eliminated.

22
Eliminate variable cont.
  • Decide if an occurrence is needed (remove or
    replace). Remove if
  • Not at the end of block expression
  • Not at the end of clause body
  • Not at the end of the recieve expression action
  • Not at the end of try expression body, after
    branch and handler
  • Replicate subtree, because we need unique id-s in
    the new subtrees. The subtree can contain every
    node type, we had to implement the most of the
    syntax tool module for database representation.

23
Planned refactor steps (short term)
  • Merge subexpression duplicates All instances of
    the same subexpressions are stored in a variable
    that the user gives, then all instances of the
    original subexpression are changed to the
    variable.
  • Extract function An alternative of a function
    definition might contain a sequence of
    expressions which can be considered as a logical
    unit, hence a function definition can be created
    from it. The extracted function is lifted to the
    module level, and it is parameterised with the
    variables that the expressions depend on. The
    sequence of expressions is replaced with a
    function call expression.

24
Planned refactor steps (middle term)
  • Tuple to record
  • Specialisation of functions
  • Generalisation of functions
  • Fusion of functions
  • Modification of data structures

25
Fusion and comparison
  • We are working together with the University of
    Kent. They released the Wrangler Erlang
    refactorer in January 2007. Their approach is
    working with annotated abstract syntax trees
    without database. We plan to make a common
    version with more refactorings.
  • The two tools have only two common refactorings
    (rename variable, rename function). The two tools
    gave the same result on bigger testbases too.

26
Testing
  • We tested the tool with more than 200 little test
    cases, which are made to cover the possible
    branches of the refactorings. The tool gave the
    same result as the original result files.
  • We tested the tool on a real bigger codebase.
  • Our version was slower at the moment as the AAST
    approach at big multi module systems and huge
    source files.

27
Future work
  • We plan to make the fusion of the two tools at
    first just under a common umbrella and later
    with a common interface level
  • Compare the two approaches with more complicated
    refactorings (generalisation)
  • We plan to eliminate the ODBC connection and call
    the MySQL database directly from the Erlang node
    (a mysql module was released for Erlang in the
    middle of January)
  • Expand the number of the refactorings
Write a Comment
User Comments (0)
About PowerShow.com