Title: Refactoring Functional Programs
1Refactoring Functional Programs
- Huiqing Li
- Claus Reinke
- Simon Thompson
- Computing Lab, University of Kent
- www.cs.kent.ac.uk/projects/refactor-fp/
2Refactoring
- Refactoring means changing the design of a program
- without changing its behaviour.
- Refactoring comes in many forms:
- micro refactoring as a part of program development,
- major refactoring as a preliminary to revision,
- as a part of debugging, ...
- As programmers, we do it all the time.
3Refactoring functional programs
- What is possible for functional programs?
- What is different about functional programs?
- Building a usable tool vs.
- building a tool that will be used.
- Reflection on language design.
- Experience, demonstration, next steps.
- Haskell as a medium, but wider applicability.
4Not just programming
- Paper or presentation
- moving sections about; amalgamate sections; move inline code to a figure; animation; ...
- Proof
- introduce lemma; remove, amalgamate hypotheses, ...
- Program
- the topic of the lecture
5Overview of the talk
- Example refactorings
- Refactoring functional programs
- Generalities
- Tooling: demo, rationale, design.
- Catalogue of refactorings
- Larger-scale examples and a case study
- Conclusions
6Rename
- f x y = ...
- Renaming findMaxVolume to f: the name may be too specific, if the function is a candidate for reuse.
- findMaxVolume x y = ...
- Renaming f to findMaxVolume: makes the specific purpose of the function clearer.
- Needs scope information: just change this f and not all fs (e.g. local definitions or variables).
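A minimal sketch of why renaming needs scope information. Apart from f and findMaxVolume, every name and function body here is a hypothetical illustration. The top-level f is renamed, while an unrelated local f must be left alone.

```haskell
-- Before: a top-level f and an unrelated local f share a name.
f :: Int -> Int -> Int
f x y = max x y

biggest :: Int
biggest = f 3 5            -- refers to the top-level f

area :: Int -> Int
area n = f n               -- refers to the LOCAL f defined below
  where
    f k = k * k

-- After renaming only the top-level f to findMaxVolume:
findMaxVolume :: Int -> Int -> Int
findMaxVolume x y = max x y

biggest' :: Int
biggest' = findMaxVolume 3 5   -- this use is updated

area' :: Int -> Int
area' n = f n                  -- the local f and its use are untouched
  where
    f k = k * k
```

A purely textual search and replace of f would also rewrite the local definition inside area, which is why the tool consults binding structure instead.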
7Lift / demote
- f x y = ... h ...
-   where
-   h = ...
- Demote: hide a function which is clearly subsidiary to f; clear up the namespace.
- f x y = ... (h y) ...
- h y = ...
- Lift: makes h accessible to the other functions in the module (and beyond?).
- Needs free variable information: which of the parameters of f is used in the definition of h? Need h not to be defined at the top level, ..., DMR (the monomorphism restriction).
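A small sketch of the two forms, with hypothetical function bodies chosen only for illustration. In the local form h uses the parameter y of its enclosing function freely; lifting has to turn that free variable into an explicit parameter.

```haskell
-- Demoted form: h is local and mentions the parameter y directly.
fLocal :: Int -> Int -> Int
fLocal x y = x + h
  where
    h = y * 2

-- Lifted form: h is now top level, so y is passed explicitly.
fLifted :: Int -> Int -> Int
fLifted x y = x + hTop y

hTop :: Int -> Int
hTop y = y * 2
```

The names fLocal, fLifted and hTop are used only so that both versions can live in one file; a tool would keep the original names.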
8Introduce and use a type defn
- f :: Int -> Char
- g :: Int -> Int
- ...
- Removing the synonym: reuse supported (a synonym is transparent, but can be misleading).
- type Length = Int
- f :: Length -> Char
- g :: Int -> Length
- Introducing the synonym: clearer specification of the purpose of f, g. (Morally) can only apply to lengths.
- Avoid name clashes. Problem with instance declarations (Haskell specific).
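A hedged sketch of what "transparent" means here. The bodies of f and g are hypothetical; only the signatures come from the slide. Because Length is just another name for Int, the type checker accepts any Int where a Length is expected.

```haskell
type Length = Int

f :: Length -> Char
f n = if n > 0 then '+' else '-'   -- hypothetical body

g :: Int -> Length
g = abs                            -- hypothetical body

-- The synonym documents intent but enforces nothing:
-- a plain Int is accepted as a Length.
demo :: Char
demo = f (42 :: Int)
```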
9Introduce and use branded type
- f :: Int -> Char
- g :: Int -> Int
- ...
- Removing the branded type: reuse supported, but lose the clarity of specification.
- data Length
-   = Length { length :: Int }
- f :: Length -> Char
- g :: Int -> Length
- Introducing the branded type: can only apply to lengths.
- Needs function call information: where are (these definitions of) f and g called?
- Change the calls of f and the call sites of g.
- Choice of data and newtype (Haskell specific).
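A minimal sketch of the branded version using newtype, one of the two choices mentioned above. The field name getLength, the function bodies and the two call sites are assumptions made for illustration (getLength is used instead of length to avoid clashing with the Prelude).

```haskell
newtype Length = Length { getLength :: Int }
  deriving (Eq, Show)

f :: Length -> Char
f (Length n) = if n > 0 then '+' else '-'   -- hypothetical body

g :: Int -> Length
g = Length . abs                            -- hypothetical body

-- Call sites must change: arguments of f are wrapped,
-- results of g are unwrapped before being used as Ints.
callF :: Char
callF = f (Length 42)

callG :: Int
callG = getLength (g (-3)) + 1
```

Passing a bare Int to f is now a type error, which is the point of the brand.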
10Lessons from the first examples
- Changes are not limited to a single point or even a single module: diffuse and bureaucratic,
- unlike traditional program transformation.
- Many refactorings are bidirectional:
- there is no single correct design.
11Refactoring functional programs
- Semantics: can articulate preconditions and
- verify transformations.
- Absence of side effects makes big changes predictable and verifiable, unlike OO.
- XP is second nature to a functional programmer.
- Language support: expressive type system, abstraction mechanisms, HOFs, ...
12Composing refactorings
- Interesting refactorings can be built from simple components,
- each of which looks trivial in its own right.
- A set of examples
- which we have implemented.
13Example program
- showAll :: Show a => [a] -> String
- showAll = table . map show
-   where
-   format :: [String] -> [String]
-   format [] = []
-   format [x] = [x]
-   format (x:xs) = (x ++ "\n") : format xs
-   table :: [String] -> String
-   table = concat . format
14Examples
- Lift definitions from local to global
- Demote a definition before lifting its container
- Lift a definition with dependencies
15Example 1
- showAll :: Show a => [a] -> String
- showAll = table . map show
-   where
-   format :: [String] -> [String]
-   format [] = []
-   format [x] = [x]
-   format (x:xs) = (x ++ "\n") : format xs
-   table :: [String] -> String
-   table = concat . format
16Example 1 lift
- showAll :: Show a => [a] -> String
- showAll = table . map show
-   where
-   format :: [String] -> [String]
-   format [] = []
-   format [x] = [x]
-   format (x:xs) = (x ++ "\n") : format xs
-   table :: [String] -> String
-   table = concat . format
17Example 1
- showAll :: Show a => [a] -> String
- showAll = table . map show
-   where
-   table :: [String] -> String
-   table = concat . format
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\n") : format xs
18Example 1 lift
- showAll :: Show a => [a] -> String
- showAll = table . map show
-   where
-   table :: [String] -> String
-   table = concat . format
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\n") : format xs
19Example 1
- showAll :: Show a => [a] -> String
- showAll = table . map show
- table :: [String] -> String
- table = concat . format
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\n") : format xs
20Example 2
- showAll :: Show a => [a] -> String
- showAll = table . map show
-   where
-   format :: [String] -> [String]
-   format [] = []
-   format [x] = [x]
-   format (x:xs) = (x ++ "\n") : format xs
-   table :: [String] -> String
-   table = concat . format
21Example 2 demote
- showAll :: Show a => [a] -> String
- showAll = table . map show
-   where
-   format :: [String] -> [String]
-   format [] = []
-   format [x] = [x]
-   format (x:xs) = (x ++ "\n") : format xs
-   table :: [String] -> String
-   table = concat . format
22Example 2
- showAll :: Show a => [a] -> String
- showAll = table . map show
-   where
-   table :: [String] -> String
-   table = concat . format
-     where
-     format :: [String] -> [String]
-     format [] = []
-     format [x] = [x]
-     format (x:xs) = (x ++ "\n") : format xs
23Example 2 lift
- showAll :: Show a => [a] -> String
- showAll = table . map show
-   where
-   table :: [String] -> String
-   table = concat . format
-     where
-     format :: [String] -> [String]
-     format [] = []
-     format [x] = [x]
-     format (x:xs) = (x ++ "\n") : format xs
24Example 2
- showAll :: Show a => [a] -> String
- showAll = table . map show
- table :: [String] -> String
- table = concat . format
-   where
-   format :: [String] -> [String]
-   format [] = []
-   format [x] = [x]
-   format (x:xs) = (x ++ "\n") : format xs
25Example 2 lift
- showAll :: Show a => [a] -> String
- showAll = table . map show
- table :: [String] -> String
- table = concat . format
-   where
-   format :: [String] -> [String]
-   format [] = []
-   format [x] = [x]
-   format (x:xs) = (x ++ "\n") : format xs
26Example 2
- showAll :: Show a => [a] -> String
- showAll = table . map show
- table :: [String] -> String
- table = concat . format
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\n") : format xs
27Example 3
- showAll :: Show a => [a] -> String
- showAll = table . map show
-   where
-   format :: [String] -> [String]
-   format [] = []
-   format [x] = [x]
-   format (x:xs) = (x ++ "\n") : format xs
-   table :: [String] -> String
-   table = concat . format
28Example 3 lift with dependencies
- showAll :: Show a => [a] -> String
- showAll = table . map show
-   where
-   format :: [String] -> [String]
-   format [] = []
-   format [x] = [x]
-   format (x:xs) = (x ++ "\n") : format xs
-   table :: [String] -> String
-   table = concat . format
29Example 3
- showAll :: Show a => [a] -> String
- showAll = table format . map show
-   where
-   format :: [String] -> [String]
-   format [] = []
-   format [x] = [x]
-   format (x:xs) = (x ++ "\n") : format xs
- table :: ([String] -> [String]) -> [String] -> String
- table format = concat . format
30Example 3 rename
- showAll :: Show a => [a] -> String
- showAll = table format . map show
-   where
-   format :: [String] -> [String]
-   format [] = []
-   format [x] = [x]
-   format (x:xs) = (x ++ "\n") : format xs
- table :: ([String] -> [String]) -> [String] -> String
- table format = concat . format
31Example 3
- showAll :: Show a => [a] -> String
- showAll = table format . map show
-   where
-   format :: [String] -> [String]
-   format [] = []
-   format [x] = [x]
-   format (x:xs) = (x ++ "\n") : format xs
- table :: ([String] -> [String]) -> [String] -> String
- table fmt = concat . fmt
32Example 3 lift
- showAll :: Show a => [a] -> String
- showAll = table format . map show
-   where
-   format :: [String] -> [String]
-   format [] = []
-   format [x] = [x]
-   format (x:xs) = (x ++ "\n") : format xs
- table :: ([String] -> [String]) -> [String] -> String
- table fmt = concat . fmt
33Example 3
- showAll :: Show a => [a] -> String
- showAll = table format . map show
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\n") : format xs
- table :: ([String] -> [String]) -> [String] -> String
- table fmt = concat . fmt
34Example 3 unfold/inline
- showAll :: Show a => [a] -> String
- showAll = table format . map show
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\n") : format xs
- table :: ([String] -> [String]) -> [String] -> String
- table fmt = concat . fmt
35Example 3
- showAll :: Show a => [a] -> String
- showAll = (concat . format) . map show
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\n") : format xs
- table :: ([String] -> [String]) -> [String] -> String
- table fmt = concat . fmt
36Example 3 delete
- showAll :: Show a => [a] -> String
- showAll = (concat . format) . map show
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\n") : format xs
- table :: ([String] -> [String]) -> [String] -> String
- table fmt = concat . fmt
37Example 3
- showAll :: Show a => [a] -> String
- showAll = (concat . format) . map show
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\n") : format xs
38Example 3 new definition
- showAll :: Show a => [a] -> String
- showAll = (concat . format) . map show
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\n") : format xs
- New definition: table
39Example 3
- showAll :: Show a => [a] -> String
- showAll = table . map show
- format :: [String] -> [String]
- format [] = []
- format [x] = [x]
- format (x:xs) = (x ++ "\n") : format xs
- table :: [String] -> String
- table = concat . format
40Beyond the text editor
- All the refactorings can in principle be implemented using a text editor, but this is
- tedious,
- error-prone,
- difficult to reverse, ...
- With machine support refactoring becomes
- low-cost: easy to do and to undo,
- reliable,
- a full part of the programmer's repertoire.
41Information needed
- Syntax: replace the function called sq, not the variable sq (parse tree).
- Static semantics: replace this function sq, not all the sq functions (scope information).
- Module information: what is the traffic between this module and its clients (call graph).
- Type information: replace this identifier when it is used at this type (type annotations).
42Machine support invaluable
- Current practice: editor + type checker (+ tests).
- Our project: automated support for a repertoire of refactorings,
- integrated into the existing development process: tools such as vim and emacs.
Demonstration of the tool, hosted in vim.
43Proof of concept
- To show proof of concept it is enough to
- build a stand-alone tool,
- work with a subset of the language,
- pretty print the refactored source code in a
standard format.
44... or a useful tool?
- To make a tool that will be used we must
- integrate with existing program development tools: the program editors emacs and vim; only add to their capabilities.
- work with the complete Haskell 98 language,
- preserve the formatting and comments in the refactored source code.
45Consequences
- To achieve this we chose to
- build a tool that can interoperate with emacs, vim, ... yet act separately.
- leverage existing libraries for processing Haskell 98, for tree transformation, ... yet
- modify them as little as possible.
- be as portable as possible, in the Haskell space.
46The Haskell background
- Libraries
- parser: many
- type checker: few
- tree transformations: few
- Difficulties
- Haskell 98 vs. Haskell extensions.
- Libraries: proof of concept vs. distributable.
- Source code regeneration.
- Real project
47First steps lifting and friends
- Use the Haddock parser: full Haskell given in 500 lines of data type definitions.
- Work by hand over the Haskell syntax: 27 cases for expressions ...
- Code for finding free variables, for instance ...
48Finding free variables 100 lines
- instance FreeVbls HsExp where
-   freeVbls (HsVar v) = [v]
-   freeVbls (HsApp f e)
-     = freeVbls f ++ freeVbls e
-   freeVbls (HsLambda ps e)
-     = freeVbls e \\ concatMap paramNames ps
-   freeVbls (HsCase exp cases)
-     = freeVbls exp ++ concatMap freeVbls cases
-   freeVbls (HsTuple _ es)
-     = concatMap freeVbls es
- etc.
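For readers who want to run something, here is a self-contained sketch of the same hand-written traversal over a toy expression type. The Exp type and its constructors are assumptions made for illustration; they are not the Haddock or Haskell 98 syntax types used in the project, and the lambda case uses filter rather than the list difference on the slide so that repeated occurrences are all removed.

```haskell
import Data.List (nub)

-- A toy expression type standing in for the real Haskell syntax.
data Exp
  = Var String
  | App Exp Exp
  | Lambda [String] Exp
  | Tuple [Exp]
  deriving Show

class FreeVbls a where
  freeVbls :: a -> [String]

-- One hand-written case per constructor: this is the boilerplate
-- that grows with the size of the syntax.
instance FreeVbls Exp where
  freeVbls (Var v)       = [v]
  freeVbls (App f e)     = nub (freeVbls f ++ freeVbls e)
  freeVbls (Lambda ps e) = filter (`notElem` ps) (freeVbls e)
  freeVbls (Tuple es)    = nub (concatMap freeVbls es)
```

Only the Var and Lambda cases do real work; the others just combine results from the children.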
49This approach
- Boilerplate code:
- 1000 lines for 100 lines of significant code.
- Error prone: significant code lost in the noise.
- Want to generate the boilerplate and the tree traversals ...
- DrIFT: Winstanley, Wallace
- Strafunski: Lämmel and Visser
50Strafunski
- Strafunski allows a user to write general (read generic) tree traversing programs ...
- ... with ad hoc behaviour at particular points.
- Traverse through the tree accumulating free variables from component parts, except in the case of lambda abstraction, local scopes, ...
- Strafunski allows us to work within Haskell; other options are under development.
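The idea of a generic traversal with ad hoc cases can be sketched with the Data.Data class from GHC's base library. This is not Strafunski's API, only an analogous technique, and the Exp type is a toy assumption. The default case collects results from all children generically; only variables and lambdas get special treatment.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, gmapQ)
import Data.Typeable (cast)
import Data.List (nub)

-- A toy expression type (an assumption, not the project's syntax tree).
data Exp
  = Var String
  | App Exp Exp
  | Lambda String Exp
  | Tuple [Exp]
  deriving (Show, Data)

-- Generic free-variable collection over any Data value.
freeVars :: Data a => a -> [String]
freeVars node =
  case cast node :: Maybe Exp of
    Just (Var v)      -> [v]                           -- ad hoc: a variable
    Just (Lambda x e) -> filter (/= x) (freeVars e)    -- ad hoc: a binder
    _                 -> nub (concat (gmapQ freeVars node))  -- generic default
```

Adding a constructor such as a case expression would need no new clause unless it binds variables, which is the saving over the hand-written version.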
51Production tool (version 0)
Programatica parser and type checker
Refactor using a Strafunski engine
Pretty print from the augmented
Programatica syntax tree
52Production tool (version 1)
Programatica parser and type checker
Refactor using a Strafunski engine
Pretty print from the augmented
Programatica syntax tree
Pass lexical information to update the syntax
tree and so avoid reparsing
53Experience so far
- We can do it but
- efficiency
- formalising static semantics
- change management (CVS etc.)
- user interface
- interface to other tools
- problems of getting code to work
- different systems working together
- clash of instances: a global problem
- Haskell in the large (e.g. 20 minute link time)
54Clarification
- Implementation yields clarification.
- The precise way to document refactorings
- and in particular their preconditions.
55Catalogue of refactorings
- name (a phrase)
- label (a word)
- description
- left-hand code
- right-hand code
- comments
- l to r
- r to l
- general
- primitive / composed
- cross-references
- internal
- external (Fowler)
- category (just one) or classifiers (keywords)
- language
- specific (Haskell, ML etc.)
- feature (lazy etc.)
- conditions
- left / right
- analysis required (e.g. names, types, semantic info.)
- which equivalence?
- version info
- date added
- revision number
56Preconditions
- It is possible to articulate precisely the preconditions for successful application of the refactorings.
- For example, in renaming, the existing binding structure must not be affected.
57Preconditions renaming f to g
- No binding for the new name may exist in the
same binding group.
59Preconditions renaming f to g
- No binding for the new name may intervene between
the binding of the old name and any of its uses
- as the renamed identifier would be captured by
the renaming.
61Preconditions renaming f to g
- Conversely, the binding to be renamed must not
intervene between bindings and uses of the new
name.
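A small sketch of the capture problem, with hypothetical definitions. The local f is to be renamed to g, but a use of the top-level g sits inside the scope of that local binding, so a naive rename silently changes which g is meant.

```haskell
g :: Int -> Int
g n = n * 2

-- Original: a local f alongside an independent use of the top-level g.
original :: Int -> Int
original x = f x + g x
  where
    f y = y + 1

-- Naive rename of the local f to g: the second summand is captured
-- by the renamed local binding, so the meaning changes.
captured :: Int -> Int
captured x = g x + g x
  where
    g y = y + 1
```

Both versions type check, so only a check on the binding structure can reject the rename.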
63Preconditions lifting
- Widening the scope of the binding must not capture independent uses of the name in the outer scope.
- There should be no existing definition of the name in the outer binding group (irrespective of whether or not it is used).
- The binding to be promoted must not make use of bindings in the inner scope. Instead, lambda lift over these: extra conditions apply ...
64Preconditions lifting
- Lambda lift over these: extra conditions apply.
- The binding must be a simple binding of a function or constant, not a pattern.
- Any argument must not be used polymorphically.
- f = ...
-   where
-   g (x:xs) = x
-   h = ... g test ...
-       ... show (g [1,2,3]) ...
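A minimal sketch of lifting by lambda lifting, with hypothetical names. The local scale uses k from the enclosing scope, so the lifted version takes k as an extra parameter and every use passes it explicitly.

```haskell
-- Before: scale is local and uses k from the enclosing definition.
scaleAll :: Int -> [Int] -> [Int]
scaleAll k xs = map scale xs
  where
    scale x = k * x

-- After: scale is lifted to the top level; k is now a parameter.
scaleAllLifted :: Int -> [Int] -> [Int]
scaleAllLifted k xs = map (scale k) xs

scale :: Int -> Int -> Int
scale k x = k * x
```

The polymorphism condition matters because a lambda-bound argument is monomorphic: if the lifted-over name were a function used at two different types, as g is in the fragment above, the lifted definition would no longer type check.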
65Crossing the refactoring Rubicon?
- Martin Fowler's Rubicon: implement extract definition; compare with other systems.
- This is in our ? version already ...
- ... and we're only 1/3 of the way into the project.
- Productivity of functional programming.
- Challenge of implementing larger refactorings.
66Larger-scale examples
- More complex examples in the functional domain often link with data types.
- Dawning realisation that some refactorings are pretty powerful.
- Bidirectional: no right answer.
67Algebraic or abstract type?
flatten :: Tr a -> [a]
flatten (Leaf x) = [x]
flatten (Node _ s t) = flatten s ++ flatten t
Tr, Leaf, Node
data Tr a = Leaf a | Node a (Tr a) (Tr a)
68Algebraic or abstract type?
Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode
flatten :: Tr a -> [a]
flatten t
  | isLeaf t = [leaf t]
  | isNode t = flatten (left t) ++ flatten (right t)
data Tr a = Leaf a | Node a (Tr a) (Tr a)
isLeaf = ...
isNode = ...
69Algebraic or abstract type?
- Towards the algebraic type:
- Pattern matching syntax is more direct ...
- ... but can achieve a considerable amount with field names.
- Other reasons? Simplicity (due to other refactoring steps?).
- Towards the abstract type:
- Allows changes in the implementation type without affecting the client: e.g. might memoise.
- Problematic with a primitive type as carrier.
- Allows an invariant to be preserved.
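A runnable sketch of the abstract-type version. The interface names come from the slides; the bodies of the discriminators, selectors and constructors are assumptions, and in a real module only the interface functions would be exported, not Leaf and Node.

```haskell
data Tr a = Leaf a | Node a (Tr a) (Tr a)

-- Discriminators
isLeaf, isNode :: Tr a -> Bool
isLeaf (Leaf _) = True
isLeaf _        = False
isNode          = not . isLeaf

-- Selectors (partial, as selectors over a sum type usually are)
leaf :: Tr a -> a
leaf (Leaf x) = x
leaf _        = error "leaf: not a Leaf"

left, right :: Tr a -> Tr a
left  (Node _ l _) = l
left  _            = error "left: not a Node"
right (Node _ _ r) = r
right _            = error "right: not a Node"

-- Constructor functions
mkLeaf :: a -> Tr a
mkLeaf = Leaf

mkNode :: a -> Tr a -> Tr a -> Tr a
mkNode = Node

-- Client code written only against the interface.
flatten :: Tr a -> [a]
flatten t
  | isLeaf t  = [leaf t]
  | otherwise = flatten (left t) ++ flatten (right t)
```

Because flatten never mentions Leaf or Node, the representation of Tr can change without touching it.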
70Outside or inside?
Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode
flatten :: Tr a -> [a]
flatten t
  | isLeaf t = [leaf t]
  | isNode t = flatten (left t) ++ flatten (right t)
data Tr a = Leaf a | Node a (Tr a) (Tr a)
isLeaf = ...
71Outside or inside?
Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode, flatten
data Tr a = Leaf a | Node a (Tr a) (Tr a)
isLeaf = ...
flatten = ...
72Outside or inside?
- Towards outside:
- If inside and the type is reimplemented, need to reimplement everything in the signature, including flatten.
- The more outside the better, therefore.
- Towards inside:
- If inside, can modify the implementation to memoise values of flatten, or to give a better implementation using the concrete type.
- Layered types possible: put the utilities in a privileged zone.
73Replace function by constructor
- data Expr = Star Expr
-   | Then Expr Expr
-   | ...
- plus e = Then e (Star e)
- Towards the function:
- plus is just syntactic sugar; reduce the number of cases in definitions.
- Character range is a better example.
- data Expr = Star Expr
-   | Plus Expr
-   | Then Expr Expr
-   | ...
- Towards the constructor:
- Can treat Plus differently, e.g.
- literals (Plus e)
-   = literals e
- but require each function over Expr to have a Plus clause.
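A sketch of both designs side by side. The Lit constructor, the primed names and the full definition of literals are assumptions added so the example runs; the slides show only Star, Then, Plus and one clause of literals.

```haskell
-- Design 1: plus is a function, so it is sugar for existing constructors.
data Expr = Lit Char | Star Expr | Then Expr Expr
  deriving (Eq, Show)

plus :: Expr -> Expr
plus e = Then e (Star e)

literals :: Expr -> [Char]
literals (Lit c)    = [c]
literals (Star e)   = literals e
literals (Then a b) = literals a ++ literals b

-- Design 2: Plus is a constructor, so every function needs a Plus clause,
-- but that clause can treat Plus in its own way.
data Expr' = Lit' Char | Star' Expr' | Plus' Expr' | Then' Expr' Expr'
  deriving (Eq, Show)

literals' :: Expr' -> [Char]
literals' (Lit' c)    = [c]
literals' (Star' e)   = literals' e
literals' (Plus' e)   = literals' e
literals' (Then' a b) = literals' a ++ literals' b
```

With the function, literals sees the expanded form and visits e twice; with the constructor, the Plus clause visits e once.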
74Other examples ...
- Modify the return type of a function from T to Maybe T, Either T T' or [T].
- Would be nice to have field names in Prelude types.
- Add an argument; (un)group arguments; reorder arguments.
- Move to monadic presentation: important case study.
- Flat or layered datatypes (Expr: add BinOp type).
- Various possibilities for error handling/exceptions.
- Tableau case study.
75Change of user interface
Refactor the existing text-based application
so that it can have textual or graphical
user interface.
76Changing functionality?
- The aim is not to change functionality
- or at least not required functionality.
- What level of behaviour is visible?
- May change incidental properties
- cf. legacy systems: preserve their essential properties but not their accidental ones.
77Other uses of refactoring
- Understand someone else's code ...
- ... make it your own.
- Understanding your own code.
- Preparing for major changes.
- etc.
78Teaching and learning design
- Exciting prospect of using a refactoring tool as an integral part of an elementary programming course.
- Learning a language: learn how you could modify the programs that you have written ...
- ... appreciate the design space, and
- the features of the language.
79Conclusions
- Refactoring + functional programming: good fit.
- Stresses the type system: generic traversal ...
- Practical tool: not "yet another type tweak".
- Leverage from available libraries, with work.
- We are eager to use the tool in building itself!
81Understanding semantic tableaux
- Take a working semantic tableau system written by an anonymous 2nd year student ...
- ... refactor to understand its behaviour.
- Nine stages of unequal size.
- Reflections afterwards.
- See www.cs.kent.ac.uk/projects/refactor-fp/
82An example tableau
?((A?C)?((A?B)?C))
83v1 Name types
- Built-in types
- [Prop]
- [[Prop]]
- used for branches and tableaux respectively.
- Modify by adding
- type Branch = [Prop]
- type Tableau = [Branch]
- Change required throughout the program.
- Simple edit, but be aware of the order of substitutions: avoid
- type Branch = [Branch]
84v2 Rename functions
- Existing names
- tableaux
- removeBranch
- remove
- become
- tableauMain
- removeDuplicateBranches
- removeBranchDuplicates
- and add comments clarifying the (intended) behaviour.
- Add test datum.
- Discovered some edits undone in stage 1.
- Use of the type checker to catch errors.
- test will be useful later?
85v3 Literate to normal script
- Change from literate form
- Comment
- > tableauMain tab
- > ...
- to
- -- Comment
- tableauMain tab
- ...
- Editing easier: implicit assumption was that it was a normal script.
- Could make the switch completely automatic?
86v4 Modify function definitions
- From explicit recursion
- displayBranch
-   :: [Prop] -> String
- displayBranch [] = []
- displayBranch (x:xs)
-   = (show x) ++ "\n" ++
-     displayBranch xs
- to
- displayBranch
-   :: Branch -> String
- displayBranch
-   = concat . map (++"\n") . map show
- More abstract: move somewhat away from the list representation to operations such as map and concat which could appear in the interface to any collection type.
- First time round added incorrect (but type correct) redefinition: only spotted at next stage.
- Undo, redo, merge, ... ?
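A sketch that puts both definitions in one file so they can be compared. The Prop type here is a hypothetical stand-in for the student's proposition type, and the name displayBranchRec is used only to keep the recursive version alongside the new one.

```haskell
-- Hypothetical stand-in for the tableau system's proposition type.
data Prop = Atom String | Not Prop
  deriving (Eq, Show)

type Branch = [Prop]

-- Explicit recursion, as in the original script.
displayBranchRec :: Branch -> String
displayBranchRec []     = []
displayBranchRec (x:xs) = show x ++ "\n" ++ displayBranchRec xs

-- Pipeline of library functions, as in the refactored script.
displayBranch :: Branch -> String
displayBranch = concat . map (++ "\n") . map show
```

Comparing the two on a test branch is the kind of check that would have caught the incorrect but type-correct redefinition mentioned above.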
87v5 Algorithms and types (1)
- removeBranchDup :: Branch -> Branch
- removeBranchDup [] = []
- removeBranchDup (x:xs)
-   | x == findProp x xs = removeBranchDup xs
-   | otherwise = x : removeBranchDup xs
- findProp :: Prop -> Branch -> Prop
- findProp z [] = FALSE
- findProp z (x:xs)
-   | z == x = x
-   | otherwise = findProp z xs
88v5 Algorithms and types (2)
- removeBranchDup :: Branch -> Branch
- removeBranchDup [] = []
- removeBranchDup (x:xs)
-   | findProp x xs = removeBranchDup xs
-   | otherwise = x : removeBranchDup xs
- findProp :: Prop -> Branch -> Bool
- findProp z [] = False
- findProp z (x:xs)
-   | z == x = True
-   | otherwise = findProp z xs
89v5 Algorithms and types (3)
- removeBranchDup :: Branch -> Branch
- removeBranchDup = nub
- findProp :: Prop -> Branch -> Bool
- findProp = elem
90v5 Algorithms and types (4)
- removeBranchDup :: Branch -> Branch
- removeBranchDup = nub
- Fails the test! Two duplicate branches output, with different ordering of elements.
- The algorithm used is the 'other' nub algorithm, nubVar:
- nub [1,2,0,2,1] = [1,2,0]
- nubVar [1,2,0,2,1] = [0,2,1]
- The code is dependent on using lists in a particular order to represent sets.
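A sketch of the difference. The definition of nubVar below is an assumption reconstructed from the shape of removeBranchDup: an element is dropped when it occurs again later, so the last occurrence of each element survives, whereas the library nub keeps the first.

```haskell
import Data.List (nub)

-- Keeps the LAST occurrence of each element (assumed definition).
nubVar :: Eq a => [a] -> [a]
nubVar []     = []
nubVar (x:xs)
  | x `elem` xs = nubVar xs
  | otherwise   = x : nubVar xs
```

On [1,2,0,2,1] the library nub gives [1,2,0] and this nubVar gives [0,2,1], matching the two results on the slide.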
91v6 Library function to module
- Add the definition
- nubVar
- to the module
- ListAux.hs
- and replace the definition by
- import ListAux
- Editing easier: implicit assumption was that it was a normal script.
- Could make the switch completely automatic?
92v7 Housekeeping
- Renamings: including foo and bar and contra (becomes notContra).
- An instance of filter,
- looseEmptyLists
- is defined using filter, and subsequently inlined.
- Put auxiliary function into a where clause.
- Generally cleans up the script for the next onslaught.
93v8 Algorithm (2)
- splitXXX, removeXXX, solveXXX
- are present for each of nine rules.
- The algorithm applies rules in a prescribed order, using an integer value to pass information between functions.
- Aim: generic versions of split, remove, solve.
- Have to change order of rule application ...
- ... which has a further effect on duplicates.
- Add map sort to top level pipeline prior to duplicate removal.
94v9 Replace lists by sets.
- Wholesale replacement of lists by a Set library.
- map becomes mapSet
- foldr becomes foldSet (careful!)
- filter becomes filterSet
- The library exposes the representation: pick, flatten.
- Use with discretion ... further refactoring possible.
- Library needed to be augmented with
- primRecSet :: (a -> Set a -> b -> b) -> b -> Set a -> b
95v9 Replace lists by sets (2)
- Drastic simplification: no need for explicit worries about
- ordering and its effect on equality,
- (removal of) duplicates.
- Difficult to test whilst in intermediate stages: the change in a type is all or nothing ...
- ... work with dummy definitions and the type checker.
- Further opportunities ...
- ... why choose one rule from a set when could apply to all elements at once? Gets away from picking on one value (and breaking the set interface).
96Conclusions of the case study
- Heterogeneous process: some small, some large.
- Are all these stages strictly refactorings: some semantic changes always necessary too?
- Importance of type checking for hand refactoring ... and testing when any semantic changes.
- Undo, redo, reordering the refactorings ... CVS.
- In this case, directional ... not always the case.