Title: Total Code Makeover
1Total Code Makeover
- Changing the Face of High Performance Code
- Brian Foote
- Tue., 7 June 2005
- University of Illinois at Urbana-Champaign
- http://www.laputan.org
- foote_at_cs.uiuc.edu
2A Scene from a Nightmare
3Big Ball of Mud
- The de-facto standard software architecture
- Why is the gap between what we preach and what we
practice so large?
4Crescit Eundo
5The Medusa
- An apparition so hideous that all who gaze upon
her are turned to stone
- "This thing you call language, though...most
remarkable. You depend on it for so very much,
but is any one of you really its master?"
- -- Spock/Kollos, Is There in Truth No Beauty?
6HPC Programming Languages and Objects
- FORTRAN (77, Carter era, James VI)
- Fortran 90 (Bush the Elder, George II)
- Fortran 2003 (Bush the Younger, George III)
- ANSI C
- C++
- Matlab and Friends
- Assembler?
7Refactoring
- Behavior Preserving code transformation
- Rename
- Extract method
- Verifies preconditions
- Reconciles all occurrences
- More reliable than cut n paste, or search and
replace
8Reasons to Refactor
- Make code more readable / comprehensible
- Improve structure
- Improve design
- Remove duplication
- Improve performance
9What is a Refactoring?
- A Behavior Preserving Program Transformation
- Source-to-Source
- Program Restructuring → Batch
- Refactoring → Incremental / Interactive
10From http://www.refactoring.com
- Refactoring is a disciplined technique for
restructuring an existing body of code, altering
its internal structure without changing its
external behavior. Its heart is a series of small
behavior preserving transformations. Each
transformation (called a 'refactoring') does
little, but a sequence of transformations can
produce a significant restructuring. Since each
refactoring is small, it's less likely to go
wrong. The system is also kept fully working
after each small refactoring, reducing the
chances that a system can get seriously broken
during the restructuring.
- --Martin Fowler
11Refactoring?
- So, why do we want to make changes to our
programs that don't do anything?
- Haven't programmers always been doing this?
12Legacy1.for
13Legacy 1 → Legacy 2
- There are three changes
- 1000 → 100
- Added whitespace
- Simplified the expression in the assignment to
buf()
- Only one is a refactoring (literally!)
- Simplify Expression or Apply Arithmetic
Identity
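The Legacy1.for and Legacy2.for listings are not reproduced in this text, so the expression below is a hypothetical stand-in rather than the original assignment to buf(). It sketches why Simplify Expression counts as a refactoring: both forms produce the same value for every input tried. Changing 1000 to 100, by contrast, changes what the program computes.

```fortran
program simplify_expression
  implicit none
  integer :: i, before, after

  do i = -5, 5
     ! Hypothetical "before" form, padded with arithmetic no-ops.
     before = (i * 2) / 2 + 0 * i
     ! "After" form, simplified by applying arithmetic identities.
     after = i
     ! A refactoring must leave the computed value unchanged.
     if (before /= after) stop 1
  end do

  print *, "both forms agree for every tested value"
end program simplify_expression
```

The check only covers small integers; a tool would have to reason about overflow and rounding before applying such an identity automatically.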
14Legacy2.for
15Legacy 1 → Legacy 3
- Changes
- Introduce / Add Local Variable
- Replace Manifest Constant with Variable
- Magic Numbers → Evil
- Tools sometimes treat a series of composable,
fine-grained refactorings as a single refactoring
16Legacy3.for
17Legacy 3 → Legacy 4
- We've introduced and initialized six more
variables, and replaced a number of constants
with references to these variables
- We've also changed the case of npts
- No new refactorings were introduced
18Legacy4.for
19Legacy 4 → Legacy 5
- We've converted several variables to parameters
- In old-school Fortran, we can use these to
declare array dimensions now
- Refactor to make code more general
- Refactor to make code more clear
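A minimal sketch of replacing magic numbers with named constants, assuming the names nchan, npts, and buf that appear later in the deck; the values and the loop body are invented for illustration. It is written in free form for brevity, whereas the fixed-form equivalent would use a separate PARAMETER statement.

```fortran
program named_constants
  implicit none
  ! Named constants replace literals scattered through the code.
  ! Fixed-form FORTRAN 77 would write: INTEGER NPTS / PARAMETER (NPTS=100)
  integer, parameter :: nchan = 4, npts = 100   ! illustrative values
  ! Because they are PARAMETERs, they may dimension an array.
  integer :: buf(nchan, npts)
  integer :: i, j

  do j = 1, npts
     do i = 1, nchan
        buf(i, j) = i + j
     end do
  end do

  print *, "shape of buf:", shape(buf)
  print *, "sum of buf:  ", sum(buf)
end program named_constants
```

Changing the problem size now means editing one declaration instead of hunting for every literal.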
20Legacy5.for
21Style vs. Substance
- I know what you are all thinking
- Who cares what the code does?
- How does it look?
- Are refactorings merely cosmetic?
- If so, is that a bad thing?
- If not, why not?
- Does style matter?
- How can I make my Fortran look fabulous?
- Coding Conventions / Style / Design / Structure /
Architecture / Clarity matter
- The machine, by and large, doesn't care about the
differences among these programs, but we do
22Legacy5.for → Legacy6.f95
- Labeled the INTRINSIC
- Coalesced PARAMETERS
- Introduced modern declarations
- Added initial values
- Converted old to new-style DO Loops
- and last but not least
- Converted from Fixed to Free Format
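The DO loop conversion in the list above can be sketched as follows. The loop body is hypothetical, since the Legacy5.for and Legacy6.f95 listings are not reproduced here; the point is only that the labeled form and the block form compute the same result.

```fortran
program do_loop_styles
  implicit none
  integer, parameter :: npts = 10   ! illustrative size
  integer :: i, old_total, new_total

  ! Old style: a statement label and CONTINUE terminate the loop.
  old_total = 0
  do 10 i = 1, npts
     old_total = old_total + i
10 continue

  ! New style: block DO ... END DO, no label needed.
  new_total = 0
  do i = 1, npts
     new_total = new_total + i
  end do

  ! The conversion is behavior preserving if the totals match.
  if (old_total /= new_total) stop 1
  print *, "both loops give", new_total
end program do_loop_styles
```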
23Legacy6.f95
24Legacy6.f95 → Legacy7.f95
- Introduced a MODULE Goodies
- Moved function IROUND into it
- A major modernization
25Legacy7.f95
26Legacy7.f95 → Legacy8.f95
- Extract Procedure Square
- What Martin calls the Refactoring Rubicon:
the real test of a refactoring tool's mettle
27Legacy8.f95
28Legacy8.f95 → Legacy9.f95
- Introduced IMPLICIT NONE
- Declared heretofore implicit variables explicitly
- Introduced INTENTS
29Legacy9.f95
30Legacy9.f95 → Legacy10.f95
- Extracted Print, Sum
- Changed Sum to use the INTRINSIC version
- Changed IROUND to PURE, ELEMENTAL (removed debug
statements)
- Is this a behavior preserving transformation?
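A sketch of what a PURE, ELEMENTAL IROUND in MODULE Goodies might look like. The body is an assumption (round to the nearest integer via NINT), as the original listing is not available. A PURE procedure may not perform external I/O, which is why debug PRINT statements have to go; whether their removal preserves behavior depends on whether that debug output counts as part of the program's behavior.

```fortran
module goodies
  implicit none
contains
  ! Hypothetical body: round to the nearest integer.
  ! ELEMENTAL lets the same function take a scalar or a whole array.
  pure elemental function iround(x) result(n)
    real, intent(in) :: x
    integer :: n
    n = nint(x)
  end function iround
end module goodies

program use_goodies
  use goodies
  implicit none
  real :: samples(4)
  integer :: rounded(4)

  samples = (/ 1.2, 2.7, -0.4, 3.5 /)   ! illustrative data
  rounded = iround(samples)             ! one elemental call, no explicit loop
  print *, rounded
  print *, sum(rounded)                 ! intrinsic SUM replaces a hand-written loop
end program use_goodies
```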
31Legacy10.f95
32Legacy10.f95 → Legacy11.f95
- Added Comments
- Renamed things
- What's in a name? Plenty
- Intention-revealing names have a dramatic impact
on readability and code comprehension
- Some advocate method extraction and renaming as a
superior alternative to comment cards (comments a
smell?)
33Legacy11.f95
INTEGER, INTENT(IN) :: nchan, npts, chan
buf(nchan,npts)
INTEGER :: i
DO i = 1, npts
34Legacy11.f95 → Legacy12.f95
- Refactoring to Objects
- Created a Waveform TYPE
- Changed the procedures in our MODULE to
type-bound procedures in TYPE (Waveform)
- Changed the calls to these procedures
accordingly
- Many existing refactoring tools have extensive
support for object-oriented program cultivation.
State-of-the-art Fortran tools should be no
different (once F2003 is here)
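A rough sketch of a Waveform TYPE with type-bound procedures, assuming a Fortran 2003 compiler. Only the type name and the nchan, npts, chan, and buf names come from the slides; the procedure names init and channel_sum, the module name, and the data are hypothetical.

```fortran
module waveform_mod
  implicit none
  private
  public :: waveform

  type :: waveform
     integer :: nchan = 0, npts = 0
     integer, allocatable :: buf(:,:)
   contains
     procedure :: init          ! hypothetical type-bound procedures
     procedure :: channel_sum
  end type waveform

contains

  subroutine init(self, nchan, npts)
    class(waveform), intent(inout) :: self
    integer, intent(in) :: nchan, npts
    integer :: i, j
    self%nchan = nchan
    self%npts = npts
    if (allocated(self%buf)) deallocate(self%buf)
    allocate(self%buf(nchan, npts))
    do j = 1, npts
       do i = 1, nchan
          self%buf(i, j) = i * j     ! illustrative data only
       end do
    end do
  end subroutine init

  function channel_sum(self, chan) result(total)
    class(waveform), intent(in) :: self
    integer, intent(in) :: chan
    integer :: total
    total = sum(self%buf(chan, :))
  end function channel_sum

end module waveform_mod

program use_waveform
  use waveform_mod
  implicit none
  type(waveform) :: w
  ! Before: call init(buf, nchan, npts); after: the call goes through the object.
  call w%init(2, 5)
  print *, w%channel_sum(2)
end program use_waveform
```

The data now travels with the procedures that operate on it, so callers no longer pass buf, nchan, and npts separately.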
35Legacy12.f95
INTEGER, INTENT(IN) :: nchan, npts, chan
buf(nchan,npts)
INTEGER :: i
DO i = 1, npts
36Legacy13.f95
37Photran: A Fortran Plugin For Eclipse
38Photran's Pedigree
- Opdyke's C++ Tool
- Refactory and the Refactoring Browser
- IBM Eclipse Innovation Grant
- IBM PERCS High Performance Computing Initiative
- NASA IBEAM Project
- CRefactory
39Status: November 2004
- Produced two MS Theses
- Brought aboard two PhD Candidates
- Repaired numerous parser bugs
- Greatly Enhanced Stability
- Implemented Outline Views
- Make / Run / Debug under Linux
- Polished the GUI
40Status: June 2005
- Hyperion Release (14 January 2005)
- Parsed Full IBEAM Codebase
- Support for g77, g95, debugging under Windows /
Linux
- Support for Lahey Fortran for Windows
- Intel Support
- Mac Support
- Eos (Eclipse 3.0.1) (4 February 2005)
- AST / DOM design in progress
41Why Refactor Fortran
- Make code easier to understand
- Make code easier to reuse
- Improve structure
- Banish duplication
- Make code your own
- Incrementally Cultivate the Code (XP)
- Refactor to Components
- Refactor to Patterns
- Refactor to Objects
42What would you pay for a tool like this?
43Refactoring IBEAM
- Separating IBEAM and PARAMESH (calling PARAMESH
as a LIBRARY)
- Problem: Data structures must be dynamically
allocated
- Problem: Sizes must be dynamic, not compile time
quantities
- Problem: Many fixed quantities are now runtime
parameters
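The move from compile time sizes to runtime sizes can be sketched in miniature as below. The names nblocks and work are hypothetical and do not correspond to anything in IBEAM or PARAMESH; the sketch only contrasts a PARAMETER-dimensioned array with an ALLOCATABLE one.

```fortran
program dynamic_sizes
  implicit none
  ! Before (static): the size is fixed when the program is compiled.
  !   integer, parameter :: nblocks = 8
  !   real :: work(nblocks)
  ! After (dynamic): the size is an ordinary variable set at run time.
  integer :: nblocks, status
  real, allocatable :: work(:)

  nblocks = 8          ! hypothetical value; could come from an input file
  allocate(work(nblocks), stat=status)
  if (status /= 0) stop 1

  work = 1.0
  print *, "allocated", size(work), "elements, total =", sum(work)
  deallocate(work)
end program dynamic_sizes
```

Every declaration that used the old PARAMETER as a dimension has to change along with it, which is part of why this separation is hard to do by hand.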
44Conjoined Twins
- Things that are hard to separate?
- Marriage, Omelets, Nations, Urban Decay
- Divorce, Secession, Gentrification, Renewal are
possible
45Dynamic vs. Static Allocation
46Dynamic vs. Static Allocation
47Parameter Object
48The Hippocratic Oath
- First, do no harm
- --Hippocrates? Galen?
- Good advice anyway
49Something Wonderful
- Tension between performance and flexibility is
high
- Moore and Amdahl are conspiring to help
- Computational Science → Science
- HP Computing → Computing
- HPC Pioneers will lead the way
- Pattern mining needs to begin now
50An Embarrassment of Riches?
- Cycles too Plentiful to burn
- Bandwidth too Cheap to meter
- Realtime Photorealism
51Overcoming Fear
- Safe Refactorings
- Reversible Refactorings
- Unit Tests
- Regression Tests
- Emergent Substructures
- Tangible Incremental Results
- Visible Results
52On Reading Legacy Code
- The final alternative is to copy the original
program. This copy is altered to meet the new
requirements. I call this approach
metastisization.
- Reading and fully comprehending such code is
among one of the most difficult intellectual
challenges one might have the misfortune to
encounter.
- --Designing to Facilitate Change, Foote 1988
53Things You Should Never Do
- They did it by making the single worst strategic
mistake that any software company can make:
They decided to rewrite the code from scratch.
- --Joel Spolsky
54From No Silver Bullet
- Some years ago, Harlan Mills proposed that any
software system should be grown by incremental
development. That is, the system first be made to
run, even though it does nothing useful except
call the proper set of dummy subprograms. Then,
bit by bit, it is fleshed out, with the
subprograms in turn being developed into actions
or calls to empty stubs in the level below.
- Nothing in the past decade has so radically
changed my own practice, and its effectiveness.
- One always has, at every stage, in the process, a
working system. I find that teams can grow much
more complex entities in four months than they
can build.
- -- From "No Silver Bullet", Brooks 1995
55PTP and Photran
56Things People Would Rather Write than Read
- Web Log Entries
- Poetry
- Code
- It's easier in general to work with one's own
work than reuse someone else's
- For example: Who'd like to get up here and finish
my talk?
57IBEAM and PARAMESH
58The Importance of Tools
59Refactoring Tools
- First developed at UIUC for Smalltalk
- Made popular by Fowler, Kerievsky
- Coming soon to an HPC near you
60Reconstruction / Demolition
- Atlanta's Fulton County Stadium was built in 1966
and razed in 1997.
- Two single purpose stadia, with sky boxes, are
replacing it
61Our Fortran Heritage
- The First Fortran Program was run on 20 September
1954
62Draining the Swamp
63The Fumigation Brigade
64Refactoring Engine
- Typically Tree to Tree Transformations at the AST
/ Parse Tree Level
- This means you need about 60% of a full compiler
(All but the back end)
- Source to AST must be a round trip conversion
- Pretty Printing (un-parsing) is necessary to
present refactored results
65From the Mythical Man-Month
- Lehman and Belady have studied the history of
successive releases in a large operating system.
- They find that the total number of modules
increases linearly with release number, but that
the number of modules affected increases
exponentially with release number.
- All repairs tend to destroy the structure, to
increase the entropy and disorder of the system.
- Less and less effort is spent fixing original
design flaws; more and more is spent on fixing
flaws introduced in earlier fixes.
- As time passes, the system becomes less and less
well ordered...
66Holes in a Failing Dike
- ...Systems program building is an entropy
decreasing process, hence inherently metastable.
- Program maintenance is an entropy increasing
process, and even its most skillful execution
only delays the subsidence of the system into
unfixable obsolescence...
- Maintenance, it would seem, is like fixing holes
in a failing dike. Eventually it fails, and must
be rebuilt. Only then are lessons learned during
its tenure exploited
67Fortran Refactorings
- IMPLICIT NONE
- Extract Derived Type
- Move procedure to module
- Many Larger Tasks can be seen as a series of
smaller refactorings
- Object-Oriented Refactorings will be on the table
soon
- You tell us!
68Bibliography
69The Olfactory Method
- Kent Beck May be Best Remembered as the Man Who
brought Scatology and Software Engineering
together
- If it stinks, change it!
- --Grandma Beck
- Code Smells are (not so) subtle indications a
piece of code is in need of attention and is a
likely candidate for refactoring
70Common Code Smells
- Procedure is too Long
- Name conveys no useful information
- Code is duplicated
71Fowler's Catalog
- Extract Method (Procedure)
- Add / Remove Parameter
- Introduce Parameter Object
- Rename Variable / Proc etc.
- Replace Global w/ Parameter
- Etc. Etc. Etc.
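As one illustration from the catalog above, Introduce Parameter Object might look like this in Fortran. The type grid_config, the procedure describe, and the values are hypothetical; the sketch only shows related arguments being bundled into one derived type.

```fortran
module grid_params
  implicit none
  ! Hypothetical parameter object: related arguments travel together.
  type :: grid_config
     integer :: nchan
     integer :: npts
  end type grid_config
contains
  ! Before the refactoring: subroutine describe(nchan, npts)
  subroutine describe(cfg)
    type(grid_config), intent(in) :: cfg
    print *, "channels:", cfg%nchan, " points:", cfg%npts
  end subroutine describe
end module grid_params

program use_params
  use grid_params
  implicit none
  type(grid_config) :: cfg
  cfg = grid_config(4, 100)   ! illustrative values
  call describe(cfg)
end program use_params
```

Adding a new setting later touches the type and the code that uses it, not every argument list along the call chain.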