Refactoring Functional Programs

1
Refactoring Functional Programs
  • Simon Thompson
  • with
  • Huiqing Li
  • Claus Reinke
  • www.cs.kent.ac.uk/projects/refactor-fp

2
Session 2
3
Overview
  • Review mini-project.
  • Implementation of HaRe.
  • Larger-scale examples.
  • Case study.

4
Mini-project feedback
  • Refactorings performed.
  • Refactorings and language features?
  • Machine support feasible? Useful?
  • Not-quite refactorings? Support possible here?

5
Examples
  • Argument permutations (NB partial application).
  • (Un)group arguments.
  • Slice function for a component of its result.
  • Error handling / exception handling.

6
More examples
  • Introduce type synonym, selectively.
  • Introduce branded type.
  • Modify the return type of a function from T to
    Maybe T, Either T S, [T].
  • Ditto for input types and modify variable names
    correspondingly.

7
Implementing HaRe
8
Proof of concept
  • To show proof of concept it is enough to:
  • build a stand-alone tool,
  • work with a subset of the language,
  • pretty print the results of refactorings.

9
or a useful tool?
  • Integrate with existing program development
    tools: the stand-alone program links to the editors
    emacs and vim; any other IDEs are also possible.
  • Work with the complete language: Haskell 98?
  • Preserve the formatting and comments in the
    refactored source code.
  • Allow users to extend and script the system.

10
The refactorings in HaRe
  • Rename
  • Delete
  • Lift / Demote
  • Introduce definition
  • Remove definition
  • Unfold
  • Generalise
  • Add / remove params

  • Move def between modules
  • Delete / add to exports
  • Clean imports
  • Make imports explicit
  • Data type to ADT
All these refactorings are module aware.
11
The Implementation of HaRe
Information gathering
Pre-condition checking
Program transformation
Program rendering
12
Information needed
  • Syntax: replace the function called sq, not the
    variable sq (parse tree).
  • Static semantics: replace this function sq, not
    all the sq functions (scope information).
  • Module information: what is the traffic between
    this module and its clients (call graph).
  • Type information: replace this identifier when it
    is used at this type (type annotations).

13
Infrastructure decisions
  • Build a tool that can interoperate with emacs,
    vim, yet act separately.
  • Leverage existing libraries for processing
    Haskell 98 and for tree transformation, with as few
    modifications as possible.
  • Be as portable as possible, in the Haskell space.
  • Abstract interface to compiler internals?

14
Haskell landscape (end 2002)
  • Parsers: many
  • Type checkers: few
  • Tree transformations: few
  • Difficulties:
  • Haskell 98 vs. Haskell extensions.
  • Libraries: proof of concept vs. distributable.
  • Source code regeneration.
  • Real project

15
Programatica
  • Project at OGI to build a Haskell system
  • with integral support for verification at
    various levels: assertion, testing, proof etc.
  • The Programatica project has built a Haskell
    front end in Haskell, supporting syntax, static,
    type and module analysis
  • freely available under BSD licence.

16
The Implementation of HaRe
Information gathering
Pre-condition checking
Program transformation
Program rendering
17
First steps: lifting and friends
  • Use the Haddock parser: full Haskell given in
    500 lines of data type definitions.
  • Work by hand over the Haskell syntax: 27 cases
    for expressions.
  • Code for finding free variables, for instance.

18
Finding free variables by hand
  • instance FreeVbls HsExp where
  •   freeVbls (HsVar v) = [v]
  •   freeVbls (HsApp f e)
  •     = freeVbls f ++ freeVbls e
  •   freeVbls (HsLambda ps e)
  •     = freeVbls e \\ concatMap paramNames ps
  •   freeVbls (HsCase exp cases)
  •     = freeVbls exp ++ concatMap freeVbls cases
  •   freeVbls (HsTuple _ es)
  •     = concatMap freeVbls es
  •   etc.
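
The hand-written traversal above can be tried out on a much smaller syntax. The following is a minimal sketch, assuming a hypothetical four-constructor `Exp` type in place of the Haddock `HsExp` type; the names `Exp`, `Var`, `App`, `Lambda` and `Tuple` are illustrative only and are not HaRe's.

```haskell
import Data.List (nub, (\\))

-- Hypothetical miniature expression type, standing in for HsExp.
data Exp
  = Var String
  | App Exp Exp
  | Lambda [String] Exp
  | Tuple [Exp]
  deriving Show

-- One equation per constructor: only the Lambda case does anything
-- interesting, the rest just collect results from the components.
freeVbls :: Exp -> [String]
freeVbls (Var v)       = [v]
freeVbls (App f e)     = nub (freeVbls f ++ freeVbls e)
freeVbls (Lambda ps e) = freeVbls e \\ ps
freeVbls (Tuple es)    = nub (concatMap freeVbls es)
```

For example, `freeVbls (Lambda ["x"] (App (Var "f") (Var "x")))` gives `["f"]`. Even in this small setting three of the four equations are routine, which is the boilerplate problem in miniature.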

19
This approach
  • Boilerplate code: 1000 lines for 100 lines of
    significant code.
  • Error prone: significant code lost in the noise.
  • Want to generate the boilerplate and the tree
    traversals:
  • DrIFT: Winstanley, Wallace
  • Strafunski: Lämmel and Visser

20
Strafunski
  • Strafunski allows a user to write general (read
    generic), type safe, tree traversing programs,
    with ad hoc behaviour at particular points.
  • Top-down / bottom-up, type preserving / unifying,
    full / stop / one.
21
Strafunski in use
  • Traverse the tree accumulating free variables
    from components, except in the case of lambda
    abstraction, local scopes, ...
  • Strafunski allows us to work within Haskell.
  • Other options? Generic Haskell,
    Template Haskell, AG, ...

22
Rename an identifier
  • rename :: (Term t) => PName -> HsName -> t -> Maybe t
  • rename oldName newName = applyTP worker
  •   where
  •     worker = full_tdTP (idTP `adhocTP` idSite)
  •     idSite :: PName -> Maybe PName
  •     idSite v@(PN name orig)
  •       | v == oldName
  •         = return (PN newName orig)
  •     idSite pn = return pn
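
The shape of this traversal (identity everywhere, ad hoc behaviour at name sites, applied top-down over the whole tree) can be imitated with `Data.Data` from base. This is a sketch under stated assumptions, not Strafunski and not HaRe's code: `everywhereTD` and `adhoc` are hand-rolled stand-ins for `full_tdTP` and `adhocTP`, and `Name` and `Exp` are hypothetical types. Unlike `PName`, the `Name` type here carries no binding origin, so the sketch renames every occurrence regardless of scope.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
import Data.Data (Data, Typeable, cast, gmapT)
import Data.Maybe (fromMaybe)

-- Hypothetical syntax tree; names are wrapped so they have their own type.
newtype Name = Name String deriving (Eq, Show, Data)

data Exp
  = Var Name
  | App Exp Exp
  | Lambda [Name] Exp
  deriving (Eq, Show, Data)

-- Apply a generic transformation at every node, top-down.
everywhereTD :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhereTD f x = gmapT (everywhereTD f) (f x)

-- Behave as the identity except at values of type b.
adhoc :: (Typeable a, Typeable b) => (b -> b) -> a -> a
adhoc f = fromMaybe id (cast f)

-- Rename every occurrence of old to new (no scope analysis).
renameAll :: Data t => Name -> Name -> t -> t
renameAll old new = everywhereTD (adhoc idSite)
  where
    idSite :: Name -> Name
    idSite n
      | n == old  = new
      | otherwise = n
```

The only type-specific code is `idSite`; the traversal itself is written once and works for any type with a `Data` instance.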

23
The coding effort
  • Transformations are straightforward in Strafunski;
  • the chore is implementing the conditions that
    guarantee the transformation preserves meaning.
  • This is where much of our code lies.

24
Move f from module A to B
  • Is f defined at the top-level of B?
  • Are the free variables in f accessible within
    module B?
  • Will the move require recursive modules?
  • Remove the definition of f from module A.
  • Add the definition to module B.
  • Modify the import/export lists in module A, B and
    the client modules of A and B if necessary.
  • Change uses of A.f to B.f or f in all affected
    modules.
  • Resolve ambiguity.

25
The Implementation of HaRe
Information gathering
Pre-condition checking
Program transformation
Program rendering
26
Program rendering example
  • -- This is an example
  • module Main where
  • sumSquares x y = sq x + sq y
  •   where sq :: Int->Int
  •         sq x = x ^ pow
  •         pow = 2 :: Int
  • main = sumSquares 10 20
  • Promote the definition of sq to top level.

27
Program rendering example
  • module Main where
  • sumSquares x y
  •   = sq pow x + sq pow y where pow = 2 :: Int
  • sq :: Int->Int->Int
  • sq pow x = x ^ pow
  • main = sumSquares 10 20
  • Using a pretty printer: comments lost and layout
    quite different.

28
Program rendering example
  • -- This is an example
  • module Main where
  • sumSquares x y = sq x + sq y
  •   where sq :: Int->Int
  •         sq x = x ^ pow
  •         pow = 2 :: Int
  • main = sumSquares 10 20
  • Promote the definition of sq to top level.

29
Program rendering example
  • -- This is an example
  • module Main where
  • sumSquares x y = sq pow x + sq pow y
  •   where pow = 2 :: Int
  • sq :: Int->Int->Int
  • sq pow x = x ^ pow
  • main = sumSquares 10 20
  • Layout and comments preserved.

30
Token stream and AST
  • White space and comments in the token stream.
  • Modification of the AST guides the modification
    of the token stream.
  • After a refactoring, the program source is
    extracted from the token stream not the AST.
  • Heuristics associate comments with program
    entities.
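
The idea of rendering from the token stream rather than the AST can be illustrated with a toy model. The sketch below assumes a hypothetical `Token` type in which white space and comments are ordinary tokens; it is not HaRe's token representation, and the AST-guided part is reduced to a plain rename over identifier tokens.

```haskell
-- Hypothetical token type: layout and comments are kept as tokens.
data Token
  = Ident String
  | Symbol String
  | White String     -- spaces and newlines, kept verbatim
  | Comment String   -- comment text, kept verbatim
  deriving (Eq, Show)

-- The source text is recovered by concatenating the token texts.
render :: [Token] -> String
render = concatMap text
  where
    text (Ident s)   = s
    text (Symbol s)  = s
    text (White s)   = s
    text (Comment s) = s

-- Rename an identifier in the token stream; every other token,
-- including white space and comments, is left untouched.
renameTok :: String -> String -> [Token] -> [Token]
renameTok old new = map step
  where
    step (Ident s) | s == old = Ident new
    step t                    = t
```

Because only the affected tokens change, the comment and the unusual spacing around `=` in an input such as `sq x  = x*x` survive the rename.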

31
Production tool
Programatica parser and type checker
Refactor using a Strafunski engine
Render code from the token stream and syntax tree.
32
Production tool (optimised)
Programatica parser and type checker
Refactor using a Strafunski engine
Render code from the token stream and syntax tree.
Pass lexical information to update the syntax
tree and so avoid reparsing
33
What have we learned?
  • Emerging Haskell libraries make it practical(?)
  • Efficiency and robustness:
  • type checking large systems,
  • linking,
  • editor script languages (vim, emacs).
  • Limitations of editor interactions.
  • Reflections on Haskell itself.

34
Refactoring
  • Refactoring comes in many forms:
  • micro refactoring as a part of program
    development,
  • major refactoring as a preliminary to revision,
  • dealing with legacy code,
  • as a part of debugging, understanding, ...

35
Reflections on Haskell
  • Cannot hide items in an export list (cf import).
  • Field names for prelude types?
  • Scoped class instances not supported.
  • Ambiguity vs. name clash.
  • Tab is a nightmare!
  • Correspondence principle fails

36
Correspondence
  • Operations on definitions and operations on
    expressions can be placed in one to one
    correspondence
  • (R.D.Tennent, 1980)

37
Correspondence
  • Definitions
  •   where
  •   f x y = e
  •   f x
  •     | g1 = e1
  •     | g2 = e2
  • Expressions
  •   let
  •   \x y -> e
  •   f x = if g1 then e1 else if g2 ...

38
Function clauses
  • f x
  •   | g1 = e1
  • f x
  •   | g2 = e2
  • Can fall through a function clause: no direct
    correspondence in the expression language.
  • f x = if g1 then e1 else if g2 ...
  • No clauses for anonymous functions: no reason to
    omit them.
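
A small runnable illustration of the fall-through behaviour, using a hypothetical function `classify` that is not taken from the slides: when every guard of a clause fails, matching continues with the next clause, whereas the expression form has to nest the alternatives explicitly.

```haskell
-- Definition by clauses: a clause whose guards all fail is abandoned
-- and the next clause is tried.
classify :: Int -> String
classify n
  | n < 0  = "negative"
classify 0 = "zero"
classify n
  | even n = "even"
classify _ = "odd"

-- Expression form: an anonymous function with nested conditionals.
classifyE :: Int -> String
classifyE = \n ->
  if n < 0 then "negative"
  else if n == 0 then "zero"
  else if even n then "even"
  else "odd"
```

`classify 0` fails the guard `n < 0` in the first clause and falls through to the second; the two definitions agree on all inputs.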

39
Work in progress
  • Fold against definitions: find duplicate code.
  • All, some or one? Effect on the interface:
  • f x = ... e ... e ...
  • Traditional program transformations
  • Short-cut fusion
  • Warm fusion

40
Where next?
  • Opening up to users: API or little language?
  • Link with other IDEs (and front ends?).
  • Detecting bad smells.
  • More useful refactorings supported by us.
  • Working without source code.

41
API
Refactorings
Refactoring utilities
Strafunski
Haskell
42
DSL
Combining forms
Refactorings
Refactoring utilities
Strafunski
Haskell
43
Larger-scale examples
  • More complex examples in the functional domain
    often link with data types.
  • Dawning realisation that some refactorings
    are pretty powerful.
  • Bidirectional: no right answer.

44
Algebraic or abstract type?
data Tr a = Leaf a | Node a (Tr a) (Tr a)
flatten :: Tr a -> [a]
flatten (Leaf x) = [x]
flatten (Node _ s t) = flatten s ++ flatten t
Interface: Tr, Leaf, Node
45
Algebraic or abstract type?
Interface: Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode
data Tr a = Leaf a | Node a (Tr a) (Tr a)
isLeaf = ...
isNode = ...
flatten :: Tr a -> [a]
flatten t
  | isLeaf t = [leaf t]
  | isNode t = flatten (left t) ++ flatten (right t)
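
A fuller sketch of the abstract version, with the elided definitions filled in one plausible way. The bodies of `isLeaf`, `isNode`, `leaf`, `left`, `right`, `mkLeaf` and `mkNode` are assumptions; the slide only names them. The export list is shown as a comment so that the fragment stays a single loadable file.

```haskell
-- Intended export list, hiding the constructors:
--   module Tr (Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode) where

data Tr a = Leaf a | Node a (Tr a) (Tr a)

-- Discriminators
isLeaf, isNode :: Tr a -> Bool
isLeaf (Leaf _) = True
isLeaf _        = False
isNode = not . isLeaf

-- Selectors (partial, like pattern matching on the wrong constructor)
leaf :: Tr a -> a
leaf (Leaf x) = x
leaf _        = error "leaf: not a Leaf"

left, right :: Tr a -> Tr a
left (Node _ l _)  = l
left _             = error "left: not a Node"
right (Node _ _ r) = r
right _            = error "right: not a Node"

-- Constructor functions
mkLeaf :: a -> Tr a
mkLeaf = Leaf

mkNode :: a -> Tr a -> Tr a -> Tr a
mkNode = Node

-- Client code, written against the interface only
flatten :: Tr a -> [a]
flatten t
  | isLeaf t  = [leaf t]
  | otherwise = flatten (left t) ++ flatten (right t)
```

Since `flatten` never mentions `Leaf` or `Node`, the representation of `Tr` can change without touching it.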
46
Algebraic or abstract type?
  • ← (towards the algebraic type)
  • Pattern matching syntax is more direct,
  • but can achieve a considerable amount with
    field names.
  • Other reasons? Simplicity (due to other
    refactoring steps?).
  • → (towards the abstract type)
  • Allows changes in the implementation type without
    affecting the client, e.g. might memoise.
  • Problematic with a primitive type as carrier.
  • Allows an invariant to be preserved.

47
Outside or inside?
Interface: Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode
data Tr a = Leaf a | Node a (Tr a) (Tr a)
isLeaf = ...
isNode = ...
flatten :: Tr a -> [a]
flatten t
  | isLeaf t = [leaf t]
  | isNode t = flatten (left t) ++ flatten (right t)
48
Outside or inside?
Interface: Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode, flatten
data Tr a = Leaf a | Node a (Tr a) (Tr a)
isLeaf = ...
isNode = ...
flatten t = ...
49
Outside or inside?
  • ← (towards outside)
  • If inside and the type is reimplemented, need to
    reimplement everything in the signature,
    including flatten.
  • The more outside the better, therefore.
  • → (towards inside)
  • If inside, can modify the implementation to
    memoise values of flatten, or to give a better
    implementation using the concrete type.
  • Layered types possible: put the utilities in a
    privileged zone.

50
Memoise flatten :: Tr a -> [a]
Before:
data Tree a
  = Leaf { val :: a }
  | Node { val :: a, left, right :: (Tree a) }
leaf = Leaf
node = Node
flatten (Leaf x) = [x]
flatten (Node x l r) = (x : (flatten l ++ flatten r))
After:
data Tree a
  = Leaf { val :: a, flatten :: [a] }
  | Node { val :: a, left, right :: (Tree a), flatten :: [a] }
leaf x = Leaf x [x]
node x l r = Node x l r (x : (flatten l ++ flatten r))
51
Memoise flatten
  • Invisible outside the implementation module, if
    tree type is already an ADT.
  • Field names in Haskell make it particularly
    straightforward.
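
The memoised version runs as it stands once type signatures are added. In this sketch the signatures for `leaf` and `node` are additions; the rest follows the slide. The flattened list is stored in a field when a tree is built, and `flatten` becomes the selector for that field.

```haskell
-- The flattened list is cached in the tree itself.
data Tree a
  = Leaf { val :: a, flatten :: [a] }
  | Node { val :: a, left, right :: Tree a, flatten :: [a] }

-- Smart constructors fill in the cached field; clients never supply it.
leaf :: a -> Tree a
leaf x = Leaf x [x]

node :: a -> Tree a -> Tree a -> Tree a
node x l r = Node x l r (x : (flatten l ++ flatten r))
```

Client code that builds trees with `leaf` and `node` and calls `flatten` is unchanged by the refactoring; thanks to laziness the cached list is only computed if it is demanded.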

52
Data type or existential type?
Data type version:
data Shape
  = Circle Float
  | Rect Float Float
area :: Shape -> Float
area (Circle r) = pi*r^2
area (Rect h w) = h*w
perim :: Shape -> Float
perim (Circle r) = 2*pi*r
perim (Rect h w) = 2*(h+w)

Existential type version:
data Shape
  = forall a. Sh a => Shape a
class Sh a where
  area :: a -> Float
  perim :: a -> Float
data Circle = Circle Float
instance Sh Circle where
  area (Circle r) = pi*r^2
  perim (Circle r) = 2*pi*r
data Rect = Rect Float Float
instance Sh Rect where
  area (Rect h w) = h*w
  perim (Rect h w) = 2*(h+w)
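
A runnable sketch of the existential version. It needs the `ExistentialQuantification` extension, so it is not Haskell 98. The `Sh Shape` instance and the function `totalArea` are additions for illustration and do not appear on the slide.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

class Sh a where
  area  :: a -> Float
  perim :: a -> Float

-- Any type in class Sh can be wrapped as a Shape.
data Shape = forall a. Sh a => Shape a

newtype Circle = Circle Float
data Rect = Rect Float Float

instance Sh Circle where
  area  (Circle r) = pi * r ^ (2 :: Int)
  perim (Circle r) = 2 * pi * r

instance Sh Rect where
  area  (Rect h w) = h * w
  perim (Rect h w) = 2 * (h + w)

-- Illustrative addition: a wrapped shape is itself a shape.
instance Sh Shape where
  area  (Shape s) = area s
  perim (Shape s) = perim s

-- Illustrative addition: heterogeneous lists of shapes.
totalArea :: [Shape] -> Float
totalArea = sum . map area
```

The trade-off is the usual one: with the data type it is easy to add a new function over all shapes, while with the class and existential wrapper it is easy to add a new kind of shape without editing existing code.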
53
Constructor or constructor?
Before:
data Expr
  = Epsilon | ....
  | Then Expr Expr
  | Star Expr
plus e = Then e (Star e)
After:
data Expr
  = Epsilon | ....
  | Then Expr Expr
  | Star Expr
  | Plus Expr
54
Monadification: expressions
data Expr
  = Lit Integer        -- Literal integer value
  | Vbl Var            -- Assignable variables
  | Add Expr Expr      -- Expression addition: e1+e2
  | Assign Var Expr    -- Assignment: x:=e
type Var = String
type Store = [(Var, Integer)]
lookup :: Store -> Var -> Integer
lookup st x = head [ i | (y,i) <- st, y==x ]
update :: Store -> Var -> Integer -> Store
update st x n = (x,n):st
55
Monadification: evaluation
Before:
eval :: Expr -> Store -> (Integer, Store)
eval (Lit n) st = (n,st)
eval (Vbl x) st = (lookup st x,st)
After:
evalST :: Expr -> State Store Integer
evalST (Lit n)
  = do return n
evalST (Vbl x)
  = do st <- get
       return (lookup st x)
56
Monadification: evaluation 2
Before:
eval :: Expr -> Store -> (Integer, Store)
eval (Add e1 e2) st = (v1+v2, st2)
  where
    (v1,st1) = eval e1 st
    (v2,st2) = eval e2 st1
eval (Assign x e) st = (v, update st' x v)
  where
    (v,st') = eval e st
After:
evalST :: Expr -> State Store Integer
evalST (Add e1 e2)
  = do v1 <- evalST e1
       v2 <- evalST e2
       return (v1+v2)
evalST (Assign x e)
  = do v <- evalST e
       st <- get
       put (update st x v)
       return v
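
Both evaluators can be run side by side. The sketch below defines its own minimal `State` type so that it depends only on base; the slides presumably intend a library state monad, so the local `State`, `get` and `put` are stand-ins. `lookup` reports an unbound variable explicitly rather than using `head`.

```haskell
import Prelude hiding (lookup)

-- Minimal state monad, defined locally (stand-in for a library one).
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad (State s) where
  State g >>= k = State $ \s -> let (a, s') = g s in runState (k a) s'

get :: State s s
get = State $ \s -> (s, s)

put :: s -> State s ()
put s = State $ \_ -> ((), s)

data Expr = Lit Integer | Vbl Var | Add Expr Expr | Assign Var Expr

type Var   = String
type Store = [(Var, Integer)]

lookup :: Store -> Var -> Integer
lookup st x = case [ i | (y, i) <- st, y == x ] of
  (i : _) -> i
  []      -> error ("unbound variable: " ++ x)

update :: Store -> Var -> Integer -> Store
update st x n = (x, n) : st

-- Explicit store passing
eval :: Expr -> Store -> (Integer, Store)
eval (Lit n) st      = (n, st)
eval (Vbl x) st      = (lookup st x, st)
eval (Add e1 e2) st  = (v1 + v2, st2)
  where (v1, st1) = eval e1 st
        (v2, st2) = eval e2 st1
eval (Assign x e) st = (v, update st' x v)
  where (v, st') = eval e st

-- Monadic version: the store is threaded by the monad
evalST :: Expr -> State Store Integer
evalST (Lit n)      = return n
evalST (Vbl x)      = do st <- get
                         return (lookup st x)
evalST (Add e1 e2)  = do v1 <- evalST e1
                         v2 <- evalST e2
                         return (v1 + v2)
evalST (Assign x e) = do v <- evalST e
                         st <- get
                         put (update st x v)
                         return v
```

For any expression `e` and store `st`, `eval e st` and `runState (evalST e) st` should give the same pair, which is the property a monadification refactoring has to preserve.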
57
Classes and instances
  • type Store = [(Var, Int)]
  • empty :: Store
  • empty = []
  • get :: Var -> Store -> Int
  • get v st = head [ i | (var,i) <- st, var==v ]
  • set :: Var -> Int -> Store -> Store
  • set v i = ((v,i):)

58
Classes and instances
  • type Store = [(Var, Int)]
  • empty :: Store
  • get :: Var -> Store -> Int
  • set :: Var -> Int -> Store -> Store
  • empty = []
  • get v st = head [ i | (var,i) <- st, var==v ]
  • set v i = ((v,i):)

59
Classes and instances
  • class Store a where
  •   empty :: a
  •   get :: Var -> a -> Int
  •   set :: Var -> Int -> a -> a
  • instance Store [(Var, Int)] where
  •   empty = []
  •   get v st = head [ i | (var,i) <- st, var==v ]
  •   set v i = ((v,i):)
  • Need newtype wrapper in Haskell 98.
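
A sketch of the class-based version with the newtype wrapper that Haskell 98 requires, since an instance cannot be declared directly for a list of pairs. The wrapper name `ListStore` is hypothetical, and `get` reports an unbound variable explicitly rather than using `head`.

```haskell
type Var = String

class Store a where
  empty :: a
  get   :: Var -> a -> Int
  set   :: Var -> Int -> a -> a

-- Hypothetical wrapper around the list-of-pairs representation.
newtype ListStore = ListStore [(Var, Int)]

instance Store ListStore where
  empty = ListStore []
  get v (ListStore st) =
    case [ i | (var, i) <- st, var == v ] of
      (i : _) -> i
      []      -> error ("unbound variable: " ++ v)
  set v i (ListStore st) = ListStore ((v, i) : st)
```

Clients written against the `Store` class work with any other instance too, for example one backed by a finite map.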

60
Not just programming
  • Paper or presentation
  • moving sections about; amalgamate sections; move
    inline code to a figure; animation; ...
  • Proof
  • introduce lemma; remove, amalgamate hypotheses; ...
  • Program
  • the topic of the lecture

61
Evolving the evidence
  • Dependable System Evolution is the software
    engineering grand challenge.
  • Systems built with evidence of their
    dependability.
  • But how to evolve the evidence with the system?
  • Refactoring proofs, test coverage data etc.

62
Understanding a program
  • Take a working semantic tableau system written by
    an anonymous 2nd year student
  • refactor to understand its behaviour.
  • Nine stages of unequal size.
  • Reflections afterwards.

63
An example tableau
?((A?C)?((A?B)?C))
64
v1: Name types
  • Built-in types
  • [Prop]
  • [[Prop]]
  • used for branches and tableaux respectively.
  • Modify by adding
  • type Branch = [Prop]
  • type Tableau = [Branch]
  • Change required throughout the program.
  • Simple edit, but be aware of the order of
    substitutions: avoid
  • type Branch = Branch

65
v2: Rename functions
  • Existing names
  • tableaux
  • removeBranch
  • remove
  • become
  • tableauMain
  • removeDuplicateBranches
  • removeBranchDuplicates
  • and add comments clarifying the (intended)
    behaviour.
  • Add test datum.
  • Discovered some edits undone in stage 1.
  • Use of the type checker to catch errors.
  • test will be useful later?

66
v3: Literate → normal script
  • Change from literate form
  • Comment
  • > tableauMain tab
  • >   ...
  • to
  • -- Comment
  • tableauMain tab
  •   ...
  • Editing easier: implicit assumption was that it
    was a normal script.
  • Could make the switch completely automatic?

67
v4: Modify function definitions
  • From explicit recursion
  • displayBranch
  •   :: [Prop] -> String
  • displayBranch [] = []
  • displayBranch (x:xs)
  •   = (show x) ++ "\n" ++
  •     displayBranch xs
  • to
  • displayBranch
  •   :: Branch -> String
  • displayBranch
  •   = concat . map (++"\n") . map show
  • Abstraction: move from explicit list
    representation to operations such as map and
    concat which could be over any collection type.
  • First time round added incorrect (but type
    correct) redefinition: only spotted at next
    stage.
  • Version control: un/redo etc.
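
The two definitions can be checked against each other. In this sketch the element type is generalised to any `Show` instance, since `Prop` is not defined in the slides, and the recursive version is renamed `displayBranchRec` so that both can live in one file.

```haskell
-- Explicit recursion over the list representation
displayBranchRec :: Show a => [a] -> String
displayBranchRec []       = []
displayBranchRec (x : xs) = show x ++ "\n" ++ displayBranchRec xs

-- Pipeline of library operations
displayBranch :: Show a => [a] -> String
displayBranch = concat . map (++ "\n") . map show
```

Both produce one line per element, each terminated by a newline, so the pipeline could also be written as `unlines . map show`. A quick comparison on sample data is the kind of test that would have caught the incorrect but type-correct redefinition mentioned above.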

68
v5: Algorithms and types (1)
  • removeBranchDup :: Branch -> Branch
  • removeBranchDup [] = []
  • removeBranchDup (x:xs)
  •   | x == findProp x xs =
    removeBranchDup xs
  •   | otherwise = x :
    removeBranchDup xs
  • findProp :: Prop -> Branch -> Prop
  • findProp z [] = FALSE
  • findProp z (x:xs)
  •   | z == x = x
  •   | otherwise = findProp z xs

69
v5: Algorithms and types (2)
  • removeBranchDup :: Branch -> Branch
  • removeBranchDup [] = []
  • removeBranchDup (x:xs)
  •   | findProp x xs =
    removeBranchDup xs
  •   | otherwise = x :
    removeBranchDup xs
  • findProp :: Prop -> Branch -> Bool
  • findProp z [] = False
  • findProp z (x:xs)
  •   | z == x = True
  •   | otherwise = findProp z xs

70
v5: Algorithms and types (3)
  • removeBranchDup :: Branch -> Branch
  • removeBranchDup = nub
  • findProp :: Prop -> Branch -> Bool
  • findProp = elem

71
v5: Algorithms and types (4)
  • removeBranchDup :: Branch -> Branch
  • removeBranchDup = nub
  • Fails the test! Two duplicate branches output,
    with different ordering of elements.
  • The algorithm used is the 'other' nub algorithm,
    nubVar:
  • nub [1,2,0,2,1] = [1,2,0]
  • nubVar [1,2,0,2,1] = [0,2,1]
  • Code using lists in a particular order to
    represent sets.
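
The difference between the two algorithms is easy to reproduce. In this sketch the definition of `nubVar` is reconstructed from the behaviour of `removeBranchDup` in the earlier stages, generalised to any `Eq` type: an element is dropped when it occurs again later, so the last occurrence of each element survives, whereas `nub` keeps the first.

```haskell
import Data.List (nub)

-- Keep the last occurrence of each element (nub keeps the first).
nubVar :: Eq a => [a] -> [a]
nubVar [] = []
nubVar (x : xs)
  | x `elem` xs = nubVar xs
  | otherwise   = x : nubVar xs
```

The two agree as sets but not as lists, and `nubVar` can be expressed as `reverse . nub . reverse`. Any code that depends on list order will notice the swap, which is how the test failure arose.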

72
v6: Library function to module
  • Add the definition
  • nubVar
  • to the module
  • ListAux.hs
  • and replace the definition by
  • import ListAux
  • Editing easier: implicit assumption was that it
    was a normal script.
  • Could make the switch completely automatic?

73
v7: Housekeeping
  • Renamings, including foo and bar and contra
    (becomes notContra).
  • An instance of filter,
  • looseEmptyLists
  • is defined using filter, and subsequently
    inlined.
  • Put auxiliary function into a where clause.
  • Generally cleans up the script for the next
    onslaught.

74
v8: Algorithm (1)
  • splitNotNot :: Branch -> Tableau
  • splitNotNot ps = combine (removeNotNot ps)
    (solveNotNot ps)
  • removeNotNot :: Branch -> Branch
  • removeNotNot [] = []
  • removeNotNot ((NOT (NOT _)):ps) = ps
  • removeNotNot (p:ps) = p : removeNotNot ps
  • solveNotNot :: Branch -> Tableau
  • solveNotNot [] = [[]]
  • solveNotNot ((NOT (NOT p)):_) = [[p]]
  • solveNotNot (_:ps) = solveNotNot ps

75
v8: Algorithm (2)
  • splitXXX, removeXXX, solveXXX for each of nine
    rules.
  • The algorithm applies rules in a prescribed
    order, using an integer value to pass information
    between functions.
  • Aim: generic versions of split, remove, solve.
  • Change order of rule application: effect on
    duplicates.
  • Add map sort to top level pipeline before
    duplicate removal.

76
v9: Replace lists by sets
  • Wholesale replacement of lists by a Set library.
  • map → mapSet
  • foldr → foldSet (careful!)
  • filter → filterSet
  • The library exposes the representation: pick,
    flatten.
  • Use with discretion: further refactoring
    possible.
  • Library needed to be augmented with
  • primRecSet :: (a -> Set a -> b -> b) -> b -> Set
    a -> b

77
v9: Replace lists by sets (2)
  • Drastic simplification: no explicit worries about
  • ordering (and equality), (removal of)
    duplicates.
  • Hard to test intermediate stages: type change is
    all or nothing;
  • work with dummy definitions and the type
    checker.
  • Further opportunities: why choose one rule from a
    set when could apply to all elements at once?
    Gets away from picking on one value (and breaking
    the set interface).

78
Conclusions of the case study
  • Heterogeneous process: some small, some large.
  • Are all these stages strictly refactorings? Some
    semantic changes always necessary too?
  • Importance of type checking for hand refactoring,
    and of testing when there are any semantic changes.
  • Undo, redo, reordering the refactorings: CVS.
  • In this case, directional: not always the case.

79
Teaching and learning design
  • Exciting prospect of using a refactoring tool as
    an integral part of an elementary programming
    course.
  • Learning a language: learn how you could modify
    the programs that you have written,
  • appreciate the design space, and
  • the features of the language.

80
Conclusions
  • Refactoring + functional programming: good fit.
  • Real benefit from using available libraries,
    with work.
  • Want to use the tool in building itself.
  • Much more to do than we have time for.