Title: DMS Software Quality Enhancement via Automated Software Analysis, Modification and Generation
1 DMS: Software Quality Enhancement via Automated Software Analysis, Modification and Generation
- Ira D. Baxter
- Semantic Designs, Inc.
- www.semdesigns.com
- September 2002
2 Semantic Designs: Corporate Goal
- To enable our customers to produce and maintain timely, robust and economical software by providing world-class software engineering tools that use deep problem-domain knowledge and high degrees of automation.
3 Modern Software Engineering
- Large software systems in multiple languages
- 80% maintenance/enhancements
- Little accurate design documentation
- Largely manual effort
- How to:
  - Understand software structure?
  - Reorganize structure to enable change?
  - Make sweeping changes?
  - Make reliable changes?
4 DMS Software Reengineering Toolkit
- Customized, automated analysis, modification, porting or generation
  - Enables a wide variety of source-based SE tasks to be automated
- For the sources of large-scale software systems
  - Scalable to millions of source lines, tens of thousands of files
  - Parallel processing foundations to support scale
  - Handles many and mixed languages simultaneously
  - C, C++, Java, Ada, Fortran, SQL, XML, assembler, ...
- Generalized compiler technology, conveniently integrated
  - Parsing, analyzing, transforming, prettyprinting
- Enables practical customization for the desired automation task
  - Predefined support for standard computer languages
  - Huge infrastructure cost amortized over many tasks/customers
- Semantic Designs: supporting services and tools
  - Consulting and training for customers on DMS usage
  - Implementation of DMS customization
  - Selected SE tasks prepackaged: formatting, test coverage, ...
5 Changing the Economics of Software
- Present software is difficult to build and maintain
- Key difficulties
  - Manual process: slow, poor engineering scale
  - Absence of expert knowledge
    - General "how to" knowledge
    - Specific details about the structure of the artifact
- Tackle by integrating many research ideas, well engineered
  - Generalized compilers
  - Program transformations (DARPA-funded technology, 1970-1990!)
  - NIST/ATP-funded practical implementation (1995)
  - Capture expert knowledge as rules
- Benefits
  - Faster (automation)
  - Better (knowledge)
  - Radically cheaper (lowered engineering costs)
6 Manual Software Methods
[Diagram: the manual software engineering process]
- Input: Software System Sources (MSLOC); example: a in b; if a in c then p7 else p2
- Analyses of Software System: a in b, b in c => a in c
- Engineering Activity: carried out by Sr. Programmers/Engineers (Expensive!) using Editors and Compilers (Low level of automation!). Slow! Hard to repeat!
- Implicit Background Knowledge: Unknown, lost between iterations!
- Output: Revised/Generated Software System Sources; example: a in b; p7
7 Automation in Software Engineering
- Manual coding/analysis is expensive
  - Typical: $100K per engineer-year at 4 KSLOC per engineer-year, or about $25/SLOC
  - Presently not much automated help: very expensive to build
- Automation is possible
  - For problem domains with well-defined semantics
    - Computer languages, specification languages, ...
  - For well-defined tasks
    - Analysis: error detection, test support, documentation extraction, reverse engineering, ...
    - Modification: structure improvement, error handler insertion, API change, code porting, ...
    - Code generation: from specs, diagnostics, test cases, ...
- Using program transformation technology
  - Generalized compiler componentry
  - Researched by the community over the past 25 years
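8 Transformation Systems
Stepwise Semiautomatic Conversion of Specs to Code
[Diagram: requirements (Rqmts) lead to a specification (Spec, form fS). A Transform Engine applies transforms t1, t2, ..., tk-2, tk-1, tk, producing intermediate forms f1, ..., fk-1, fk and finally the program (Prog, form fG). A label "ci" also appears on the diagram.]
Example transforms: factoring, like-term combination, distributive law, unity multiplier, remove parentheses
Example derivation:
  (x-1)*y + 2*y
  -> (x*y - 1*y) + 2*y    (distributive law)
  -> x*y - y + 2*y        (unity multiplier, remove parentheses)
  -> x*y + y              (like-term combination)
  -> (x+1)*y              (factoring)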
9 DMS Concept
Automated! Repeatable! Scalable!
[Diagram: the same process as the manual one, with DMS carrying out the Engineering Activity and the background knowledge captured]
- Input: Software System Sources (MSLOC); example: a in b; if a in c then p7 else p2
- Analyses of Software System: a in b, b in c => a in c
- Output: Revised/Generated Software Sources; example: a in b; p7
- Captured! Background Knowledge for Software Analysis or Modification problem (KSLOC), from Semantic Designs or Consultants/Sr. Programmers/Engineers:
  - Language Definitions: Ada, SQL, Java
  - General Language Analyses: var=expression => var modified
  - General Language Transforms: if true then s else t endif -> s
  - DSL Language Analyses: x in y and y in z => x in z
  - DSL Application Context Knowledge: b in c
  - DSL Refinement Language Transforms: ?x in ?y => f=false; for (i) if ?x=?y[i] then f=true
10 DMS Impact on Quality and Process
- Quality
  - (Re)use of tested specification techniques
    - Avoids ad hoc descriptions
  - Reuse of abstract generative components (transforms)
    - Not code reuse, but rather implementation knowledge reuse
    - Specifications and implementation steps inspectable by others
  - Reuse of tested synthesis/modification methods
    - Mechanical, reliable construction of product
    - Easier recovery from errors: correct mistake, re-execute task
- Process
  - Reliable components --> avoid rework after changes
  - Mechanically repeatable implementation steps
  - Focus on knowledge acquisition rather than repeated coding events
11 Overview
- DMS Software Reengineering Toolkit
  - Defining notations (domains) for specs and legacy systems
  - Parsing and prettyprinting
  - Transformation mechanics
- Applications for Software Quality Improvement
  - C preprocessor conditional removal
  - Software Test Coverage
  - Cross Reference and Dead Code
  - Refactoring Java
  - Automatic Code Generation (XML Parsers)
  - Fast HTML generation using XSLT
  - Clone Detection/Removal
  - Porting application software to new languages
12 DMS Domain (Notation + Semantics) Parts
- External Form (what you can say: string or graphical)
- Internal Form (how DMS stores it)
- Parser (how to convert external form to internal form)
- PrettyPrinter (how to display the Internal Form)
- Semantics (what the Internal Form means)
- Optimizations (how to optimize in the domain)
- Refinements (how to transform IF to another IF)
- Analyzers (how to analyze in the domain)
- Attachments (procedures to enhance DMS efficiency)
13 DMS Domain for Java: Parser + Pretty Printer
nested_class_declaration = nested_class_modifiers class_header class_body ;
  <<PrettyPrinter>>: { V(H(nested_class_modifiers,class_header),class_body); }
class_header = 'class' IDENTIFIER ;
  <<PrettyPrinter>>: { H('class',IDENTIFIER); }
class_header = 'class' IDENTIFIER 'implements' name_list ;
  <<PrettyPrinter>>: { H('class',IDENTIFIER,'implements',name_list); }
class_header = 'class' IDENTIFIER 'extends' name ;
  <<PrettyPrinter>>: { H('class',IDENTIFIER,'extends',name); }
class_header = 'class' IDENTIFIER 'extends' name 'implements' name_list ;
  <<PrettyPrinter>>: { H('class',IDENTIFIER,'extends',name,'implements',name_list); }
class_body = '{' class_body_declarations '}' ;
  <<PrettyPrinter>>: { V(H('{',STRING(" "),class_body_declarations),'}'); }
nested_class_modifiers = nested_class_modifiers nested_class_modifier ;
  <<PrettyPrinter>>: { H(CH(nested_class_modifiers[1]),nested_class_modifier); }
... 300 more rules (COBOL is 3500!)
14 Parsing to Abstract Syntax Trees: A Program Representation Analyzable by Computers
- Use DMS grammar domain to define language syntax
  - DMS generates lexer/parser automatically
- Parser reads source file(s)
  - Captures comments
  - Carries out lexical conversions (e.g., FP text -> IEEE binary FP)
  - Builds Abstract Syntax Tree
  - Records position of every node (file, line, column)
- Present capability for the following domains
  - Specification: Spectrum, BNF, Rose Models
  - Technology: XML, IDL, SQL
  - Implementation: C/C++, COBOL, Java, Ada, VB6, Fortran, Verilog
15 A Simple Java Program
/* Fib.java */
public class NumberTheory
{ int Fib(int x)
  { if (x < 1)
      return 1; // base case
    else
      return Fib(x-1)+Fib(x-2);
  }
}
16 Abstract Syntax Tree (AST) for Fib Class: free of lexical properties (text shape) of program
[Diagram: AST of the NumberTheory class from Fib.java, drawn next to its source text. Node labels shown: Class Header, ID NumberTheory, Class Body, Method Declaration, Method Modifiers, Type INT, ID Fib, Parameters, Parameter, Type INT, ID x, Empty Throwlist, Block, Stmt Sequence, If Then Else, <, ID x, NUMBER 1, Return, NUMBER 1, Return, Function Call, Function Call, ...]
Not shown: file/line/column annotation on each node
17 PrettyPrinting = AntiParsing
- Conversion of AST back to text file
- Handles indentation, comments, literal formats, ...
- Uses DMS Box language to compose PP fragments
[Diagram: AST fragment If Then (< (ID x, NUMBER 1), Return (NUMBER 1)), with a box expression attached to each node:]
  If Then: V(H('if','(',condition,')'), I(then_stmt))
  <: H(expression1,'<',expression2)
  Return: H('return',expression,';')
Prettyprinted result:
  if (x<1)
    return 1;
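The box composition on this slide can be imitated in a few lines of Java. This is only a sketch under stated assumptions: the class `Boxes` and its methods `text`, `h`, `v` and `i` are hypothetical names invented here, a box is modelled as a plain list of text lines, and the real DMS Box language certainly handles spacing, comments and line breaking in ways this does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical mini box language; a box is simply a list of text lines. */
final class Boxes {
    private Boxes() {}

    /** Leaf box holding a single piece of text. */
    static List<String> text(String s) {
        return new ArrayList<>(Arrays.asList(s));
    }

    /** H: place boxes side by side by appending each box to the current last line. */
    @SafeVarargs
    static List<String> h(List<String>... boxes) {
        List<String> out = new ArrayList<>();
        out.add("");
        for (List<String> box : boxes) {
            if (box.isEmpty()) continue;
            int last = out.size() - 1;
            out.set(last, out.get(last) + box.get(0));
            out.addAll(box.subList(1, box.size()));
        }
        return out;
    }

    /** V: stack boxes vertically, one under the other. */
    @SafeVarargs
    static List<String> v(List<String>... boxes) {
        List<String> out = new ArrayList<>();
        for (List<String> box : boxes) out.addAll(box);
        return out;
    }

    /** I: indent every line of a box by two spaces. */
    static List<String> i(List<String> box) {
        List<String> out = new ArrayList<>();
        for (String line : box) out.add("  " + line);
        return out;
    }

    public static void main(String[] args) {
        // Boxes for the "<" node and the "return" node, then the enclosing "if".
        List<String> cond = h(text("x"), text("<"), text("1"));
        List<String> ret = h(text("return "), text("1"), text(";"));
        List<String> ifThen = v(h(text("if ("), cond, text(")")), i(ret));
        for (String line : ifThen) System.out.println(line);
    }
}
```

Each AST node contributes one small box expression, and the shape of the output text follows from how the boxes nest rather than from the original source layout.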
18 Winner, Obfuscated C Contest: The Maintenance Programmer's Nightmare
#include <math.h>
#include <sys/time.h>
#include <X11/Xlib.h>
#include <X11/keysym.h>
double L ,o ,P
,_dt,T,Z,D1,d,
s999,E,h
8,I,
J,K,w999,M,m,O
,n999,j33e-3,i
1E3,r,t, u,v ,W,S
74.5,l221,X7.26,
a,B,A32.2,c,
F,H int
N,q, C, y,p,U
Window z char f52
GC k main() Displaye
XOpenDisplay( 0) zRootWindow(e,0) for
(XSetForeground(e,kXCreateGC (e,z,0,0),BlackPixel
(e,0)) scanf("lflflf",y n,wy, ys)1 y
) XSelectInput(e,z XCreateSimpleWindow(e,z,0,0
,400,400, 0,0,WhitePixel(e,0) ),KeyPressMask)
for(XMapWindow(e,z) Tsin(O)) struct timeval
G 0,dt1e6 K cos(j) N1e4 M H_ ZDK
F_P rEK Wcos( O) mKW HKT OD_F/
Kd/KE_ B sin(j) aBTD-EW
XClearWindow(e,z) tTE DBW jd_D-_FE
PWEB-TD for (o(IDWE TB,Ed/K
BvB/KFD)_ plty ) Tpsi Ec-pw
Dnp-L KDm-BT-HE if(p nw pps
0K ltfabs(WTr-IE DP) fabs(Dt DZ T-a
E)gt K)N1e4 else qW/K 4E22e2 C 2E24e2/
K D N-1E4 XDrawLine(e ,z,k,N ,U,q,C) Nq
UC p L_ (Xt PMml) TXX llM
M XDrawString(e,z,k ,20,380,f,17) Dv/l15
i(B l-Mr -XZ)_ for( XPending(e) u
CS!N)
XEvent z XNextEvent(e ,z)
((NXLookupKeysym
(z.xkey,0))-IT?
N-LT?
UP-N? E
J u h) --(
DN -N? N-DT ?N
RT?u WhJ
) m15F/l
c(IM/ l,lH
IMaX)_ H
ArvX-Fl(
E.1X4.9/l,t
Tm/32-IT/24
)/S KFM(
h 1e4/l-(T
E5TE)/3e2
)/S-Xd-BA
a2.63 /ld
X( dl-T/S
(.19E a
.64J/1e3
)-M v A
Z)_ l
K
_ Wd
sprintf(f,
"5d 3d"
"7d",p l
/1.7,(C9E3
O57.3)0550,(int)i) dT(.45-14/l
X-a130-J
.14)_/125e2F_v P(T(47
I-m 52E94 D-t.38u.21E) /1e2W
179v)/2312
select(p0,0,0,0,G) v-(
WF-T(.63m-I.086mE19-D25-.11u
)/107e2)_ Dcos(o)
Esin(o)
19 Pretty Printing to Un-obfuscate
#include <math.h>
#include <sys/time.h>
#include <X11/Xlib.h>
#include <X11/keysym.h>
double L, o,
P, _ dt, T, Z, D 1, d, s999, E, h 8, I,
J, K, w999, M, m, O, n999, j 3.3e-2,
i 1e3, r, t, u, v, W, S 7.45e1, l
221, X 7.26, a, B, A 3.22e1, c, F, H int N,
q, C, y, p, U Window z char f52 GC
k main() Display e XOpenDisplay(0) z
RootWindow(e, 0) for (XSetForeground(e, k
XCreateGC(e, z, 0, 0), BlackPixel(e, 0))
scanf("lflflf", y n, w y, y s) 1
y) XSelectInput(e, z XCreateSimpleWindow(e,
z, 0, 0, 400, 400, 0, 0,
WhitePixel(e, 0)), KeyPressMask) for
(XMapWindow(e, z) T sin(O))
struct timeval G 0, dt 1e6 K
cos(j) N 1e4 M H _ Z
D K F _ P r E K
W cos(O) m K W H K T
O D _ F / K d / K E _ B
sin(j) a B T D - E W
XClearWindow(e, z) t T E D B W
j d _ D - _ F E P W E
B - T D for (o (I D W E T
B, E d / K B v B / K F D) _ p lt
y) T ps i E
c - pw D np - L K
D m - B T - H E if (pn wp
ps 0 K lt fabs(W T r - I E D P)
fabs(D t D Z T - a E) gt K)
N 1e4 else
q W / K 4e2 2e2 C 2e2
4e2 / K D N - 1e4
XDrawLine(e, z, k, N, U, q, C) N
q U C
p
L _ (X t P M m l) T
X X l l M M XDrawString(e, z, k,
20, 380, f, 17) D v / l 15 i
(B l - M r - X Z) _ for (
XPending(e) u CS ! N)
XEvent z XNextEvent(e, z)
((N XLookupKeysym( z.xkey, 0)) - IT ? N -
LT ? UP - N ? E J u h)
-- (DN - N ? N - DT ? N RT ? u W h
J) m 15 F / l c
(I M / l, l H I M a X) _
H A r v X - F l (E 1e-1 X 4.9 /
l, t T m / 32 - I T / 24) / S K F
M (h 1e4 / l - (T E 5 T E) / 3e2) /
S - X d - B A a 2.63 / l d
X (d l - T / S (1.9e-1 E a 6.4e-1
J / 1e3) - M v A Z) _ l K _
W d sprintf(f, "5d 3d"
"7d", p l / 1.7, (C 9e3 O 5.73e1)
0550, (int) i) d T (4.5e-1 - 14 / l
X - a 130 - J 1.4e-1) _ / 1.25e4 F _
v P (T (47 I - m 52 E 94 D
- t 3.8e-1 u 2.1e-1 E) / 1e2 W 179
v) / 2312 select(p 0, 0, 0, 0, G)
v - (W F - T (6.3e-1 m - I 8.6e-2 m
E 19 - D 25 - 1.1e-1 u) / 1.07e4) _
D cos(o) E sin(o)
20 PrettyPrint to Obfuscate: Consistent identifier scrambling
import javax.swing.JOptionPane public class
Program1 public static void O0(String l1)
String l10,O11 double l100 double
O101,l110,l111,O1000 String l1001 " " l1001
JOptionPane.showInputDialog("P\145se c\150o\040ne
of the \157ns" "\012" "t\145rate \145f
\164e") l100 Double.parseDouble(l1001) while
(l100 ! 6) if (l100 1) l10
JOptionPane.showInputDialog("\105n\164\145\162\040
t\150e\040v\141\154\165\145\040o\146\040r\141\144i
u\163") O101 Double.parseDouble(l10) l111
Math.PI O101 O101 JOptionPane.showMessageDial
og(null,"\124a\040of\040\164\150e Ci\145\040
\040"l111,"result",JOptionPane.INFORMATION_MESSAG
E) else if (l100 2) l10
JOptionPane.showInputDialog("Ent\145r\040\164\150e
\040v\141l\165e \157\146\040\154en\147ht") O11
JOptionPane.showInputDialog("\105\156\164\145\162
\164h\145 value of\040width") O1000
Double.parseDouble(l10) l110
Double.parseDouble(O11) l111 (l110 O1000) /
2 JOptionPane.showMessageDialog(null, Area
\157f\154e " l111, "re\163ult",
JOptionPane.INFORMATION_MESSAGE) else if (l100
3) l10 JOptionPane.showInputDialog("\105\15
6\164\145\162 \164h\145 value\040of
lengt\150") O1000 Double.parseDouble(l10) l111
O1000 O1000 JOptionPane.showMessageDialog(nu
ll, "\124he area \163\161uare " l111, "
re\163ul\164", JOptionPane.INFORMATION_MESSAGE)
else if (l100 4) l10 JOptionPane.showInput
Dialog("\105\156\164er t\150e\040v\141lu\145\040\1
57f len\147\164h") O11 JOptionPane.showInputDia
log("E\156\164er the\040v\141l\165e \157\146
width") O1000 Double.parseDouble(l10) l110
Double.parseDouble(O11) l111 l110
O1000 JOptionPane.showMessageDialog(null,
"are\141 \164h\145 rectangle " l111,
"\162\145\163\165\154t", JOptionPane.INFORMATION_M
ESSAGE) else if (l100 5) l10
JOptionPane.showInputDialog("E\156ter
t\150\145\040v\141l\165\145\040of
lengt\150") O1000 Double.parseDouble(l10) l111
6 O1000 JOptionPane.showMessageDialog(null,
"T\150e area \157f\040t\150e\040\143\165\142e "
l111, "r\145sult", JOptionPane.INFORMATION_MESSA
GE) l1001 JOptionPane.showInputDialog("P\154e
ase\040\040\160r\157\147r\141m") l100
Double.parseDouble(l1001) System.out.println("\
120r\157\147\162a\155 \164e\162mi\156a\164ed\012")
System.exit(0)
21 Optimization Transform for DMS Rewrite Rule Language
default base domain Java.          [callout: Domain Name]
rule merge-ifs(\condition1, \condition2, \then-statements)
  "if (\condition1) if (\condition2) \then-statements"          [callout: Domain Syntax]
  rewrites to
  "if (\condition1 && \condition2) \then-statements".
We'll see this idea again later.
22 DMS Transforms Work on ASTs, Not Text: Not fooled by any lexical properties of text!
To modify programs:
1) Define transforms
2) Parse program
3) Apply transforms
   a) match LHS pattern
   b) replace with RHS substitution
4) Prettyprint program
[Diagram: the left hand side tree If Then(\condition1, If Then(\condition2, \then statements)) rewrites to the right hand side tree If Then(\condition1 && \condition2, \then statements)]
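The match-and-replace step on trees can be sketched in Java as follows. The node classes and the `MergeIfs` name are hypothetical stand-ins invented for this illustration, not the DMS AST interface; conditions and statements are kept as opaque text leaves, and the sketch only handles an `if` without an `else`, which is the case the merge-ifs rule describes.

```java
/** Hypothetical minimal AST, just large enough to show the merge-ifs rewrite. */
final class MergeIfs {
    private MergeIfs() {}

    abstract static class Node {}

    /** Opaque leaf: a condition or a statement kept as text in this sketch. */
    static final class Leaf extends Node {
        final String text;
        Leaf(String text) { this.text = text; }
    }

    /** Conjunction node built by the right-hand side of the rule. */
    static final class And extends Node {
        final Node left;
        final Node right;
        And(Node left, Node right) { this.left = left; this.right = right; }
    }

    /** if (condition) thenPart, with no else branch. */
    static final class IfThen extends Node {
        final Node condition;
        final Node thenPart;
        IfThen(Node condition, Node thenPart) { this.condition = condition; this.thenPart = thenPart; }
    }

    /** Applies merge-ifs bottom-up: children are rewritten before their parent. */
    static Node rewrite(Node n) {
        if (!(n instanceof IfThen)) return n;
        IfThen outer = (IfThen) n;
        Node body = rewrite(outer.thenPart);
        if (body instanceof IfThen) {                      // LHS pattern matched
            IfThen inner = (IfThen) body;                  // bind \condition2, \then-statements
            return new IfThen(new And(outer.condition, inner.condition), inner.thenPart);
        }
        return new IfThen(outer.condition, body);
    }

    /** Tiny prettyprinter; a real one would parenthesize low-precedence operands. */
    static String print(Node n) {
        if (n instanceof Leaf) return ((Leaf) n).text;
        if (n instanceof And) {
            And a = (And) n;
            return print(a.left) + " && " + print(a.right);
        }
        IfThen i = (IfThen) n;
        return "if (" + print(i.condition) + ") " + print(i.thenPart);
    }

    public static void main(String[] args) {
        Node nested = new IfThen(new Leaf("a"), new IfThen(new Leaf("b"), new Leaf("s();")));
        System.out.println(print(nested));           // if (a) if (b) s();
        System.out.println(print(rewrite(nested)));  // if (a && b) s();
    }
}
```

Because the match is made against tree shape, the layout, comments and spacing of the source text play no part in whether the rule applies.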
23 Expertise: Number of Rules
- Mathematics
  - Novice (9th grade algebra): x+0 -> x
  - Amateur (HS Senior): sin^2(x)+cos^2(x) -> 1
  - Journeyman (Frosh Calculus): integrals
  - Craftsman (B.S. Math): Linear Algebra, Group Theory
  - Expert (Ph.D. Mathematics): Category Theory, Topology, ...
- DMS
  - Toy: several rules
  - Useful: 50 rules (simplification/optimization)
  - Powerful: 250 rules (testing, code generation)
  - Indispensable: 2000 rules (massive program translation)
24 DMS Toolkit: Generalized Compiler
- Underlying hypergraph representation: trees, graphs, ...
- Parsing/Prettyprinting
  - UNICODE lexer with binary conversions, lexical format/comment capture
  - GLR (context-free) parser with automatic tree builder
  - "Text Box" building language: reproduces comments!
- Analysis
  - Multipass attribute grammars
  - Generalized symbol table support: inheritance, overloading, ...
  - Next: generic control/data flow
- Transformation
  - Complete procedural AST interface => procedural transforms (+ analyzers)
  - Conditional source-to-source transforms w/ associative/commutative laws
  - Next: goal-directed metaprogramming
- Predefined Domains
  - Spec, Technology, and Legacy languages
  - Spectrum, .MDL; XML, SQL, IDL; C/C++, Java, COBOL, assembler!
25 Fundamental Issue: Scale
- Engineering: hard but straightforward
  - Ugly details: C preprocessors, etc.
- Reasoning/Analysis costs
  - "Incremental" knowledge capture => domains
  - Computers do symbolic computation slowly => parallel foundations
  - Future: rule compilers
- Legacy systems are huge
  - MSLOC; tens of thousands of files
  - Careful design of hypergraph to conserve space
- Arbitrary languages => robust parsing technology
  - Need for domain agility: fast domain/dialect definition
  - Use domain notations to define knowledge
- Applications use multiple languages
  - Reasoning and transforms must work with the mix
- Other scale issues
  - Software versions, large engineering teams, long-term transactions
26 Overview
- DMS Software Reengineering Toolkit
  - Defining notations (domains) for specs and legacy systems
  - Parsing and prettyprinting
  - Transformation mechanics
- Applications for Software Quality Improvement
  - C preprocessor conditional removal
  - Software Test Coverage
  - Cross Reference and Dead Code
  - Refactoring Java
  - Automatic Code Generation (XML Parsers)
  - Fast HTML generation using XSLT
  - Clone Detection/Removal
  - Porting application software to new languages
27 Useless Conditional Elimination
- Problem: too many configuration #IFs
  - Application runs on many platforms: WNT, SUN, VAX, ...
  - #IFs for retired platforms are still in the code
  - Too many to remove by hand, confusing to manage
  - Does the delivered system work with all combinations?
- Solution: use DMS to remove designated #IFs
  - Engineer names dead configuration variables (VAX)
  - DMS uses transforms to remove the #IFs
28 C simplifying transforms
rule simplify_and_false(e:expression): expression -> expression
  \e && 0 -> 0.
rule simplify_and_true(e:expression): expression -> expression
  \e && 1 -> \e.
rule simplify_or_true(e:expression): expression -> expression
  \e || 1 -> 1.
rule simplify_or_false(e:expression): expression -> expression
  \e || 0 -> \e.
rule pp_if_true_simplify(b:block): statement -> statement
  #if 1 \b #endif -> \b.
rule pp_if_false_simplify(b:block): statement -> statement
  #if 0 \b #endif -> .
rule pp_if_then_else_false_simplify(b1:block,b2:block): statement -> statement
  #if 0 \b1 #else \b2 #endif -> \b2.
29 C sample code simplified
Add rule for dead configuration variables:
rule VAX: expression -> expression
  VAX -> 0.
Before:
#IF VAX || UNIX
  syslog(logfile->file_descriptor,display output)
#ENDIF
...
#IF VAX
  sysclose(logfile->file_descriptor)
#ELSE
  fclose(logfile->file_descriptor)
#ENDIF
After:
#IF UNIX
  syslog(logfile->file_descriptor,display output)
#ENDIF
...
fclose(logfile->file_descriptor)
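The effect of the simplification rules on a configuration condition can be approximated by a small recursive simplifier. This is a sketch, not the DMS rewrite engine: `ConfigSimplifier` and its nested classes are hypothetical names, dead variables are replaced by the constant 0 as in the VAX rule, and the and/or rules are applied symmetrically to both operands, which is an assumption made here so that `VAX || UNIX` reduces to `UNIX`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical simplifier for preprocessor-style boolean conditions. */
final class ConfigSimplifier {
    private ConfigSimplifier() {}

    abstract static class Expr {}

    static final class Const extends Expr {
        final boolean value;
        Const(boolean value) { this.value = value; }
    }

    static final class Var extends Expr {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class And extends Expr {
        final Expr l;
        final Expr r;
        And(Expr l, Expr r) { this.l = l; this.r = r; }
    }

    static final class Or extends Expr {
        final Expr l;
        final Expr r;
        Or(Expr l, Expr r) { this.l = l; this.r = r; }
    }

    private static boolean is(Expr e, boolean v) {
        return e instanceof Const && ((Const) e).value == v;
    }

    /** Replaces dead variables by 0, then folds constants bottom-up. */
    static Expr simplify(Expr e, Set<String> dead) {
        if (e instanceof Var) {
            return dead.contains(((Var) e).name) ? new Const(false) : e;
        }
        if (e instanceof And) {
            Expr l = simplify(((And) e).l, dead);
            Expr r = simplify(((And) e).r, dead);
            if (is(l, false) || is(r, false)) return new Const(false); // e && 0 -> 0
            if (is(l, true)) return r;                                 // 1 && e -> e
            if (is(r, true)) return l;                                 // e && 1 -> e
            return new And(l, r);
        }
        if (e instanceof Or) {
            Expr l = simplify(((Or) e).l, dead);
            Expr r = simplify(((Or) e).r, dead);
            if (is(l, true) || is(r, true)) return new Const(true);    // e || 1 -> 1
            if (is(l, false)) return r;                                // 0 || e -> e
            if (is(r, false)) return l;                                // e || 0 -> e
            return new Or(l, r);
        }
        return e; // constants are already simplified
    }

    static String show(Expr e) {
        if (e instanceof Const) return ((Const) e).value ? "1" : "0";
        if (e instanceof Var) return ((Var) e).name;
        if (e instanceof And) return "(" + show(((And) e).l) + " && " + show(((And) e).r) + ")";
        return "(" + show(((Or) e).l) + " || " + show(((Or) e).r) + ")";
    }

    public static void main(String[] args) {
        Set<String> dead = new HashSet<>(Arrays.asList("VAX"));
        System.out.println(show(simplify(new Or(new Var("VAX"), new Var("UNIX")), dead))); // UNIX
        System.out.println(show(simplify(new Var("VAX"), dead)));                          // 0
    }
}
```

A condition that simplifies to the constant 0 or 1 is what lets the `#if 0` and `#if 1` rules then delete or unwrap the guarded block.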
30 Software Test Coverage
- Analysis of code executed by test cases
  - Non-executed code is likely to have flaws
- Key problem: tracking program control flow
  - Need a way to identify possible program parts
  - Capture executed status of parts via tests
  - Display execution status of program parts
- Secondary problem: exercising all parts
  - Exercising an individual part
  - Generation of tests from specifications
31 Test Coverage by Marking Visited Blocks
Original C program:
bool fibcached[1000];
int fibvalue[1000];
int fib(int i)
{ int t;
  switch (i)
  { case 0:
    case 1: return 1;
    default:
      if (fibcached[i])
        return fibvalue[i];
      else
      { t=fib(i-1);
        return t+fib(i-2);
      }
  }
}
Marking program:
bool fibcached[1000];
int fibvalue[1000];
int fib(int i)
{ int t;
  visited[1]=true;
  switch (i)
  { case 0: visited[2]=true;
    case 1: visited[3]=true;
      return 1;
    default: visited[4]=true;
      if (fibcached[i])
      { visited[5]=true;
        return fibvalue[i];
      }
      else
      { visited[6]=true;
        t=fib(i-1);
        return t+fib(i-2);
      }
  }
}
32 DMS transform(s) to mark program
default base domain C.
rule mark_function_entry(result:type, name:identifier, decls:declaration_list, stmts:statement_sequence)
  \result \name { \decls \stmts }
  rewrites to
  \result \name { \decls visited[\place\(\stmts\)]=true; \stmts }.
rule mark_if_then_else(condition:expression, tstmt:statement, estmt:statement)
  if (\condition) \tstmt else \estmt
  rewrites to
  if (\condition) { visited[\place\(\tstmt\)]=true; \tstmt }
  else { visited[\place\(\estmt\)]=true; \estmt }.
rule mark_while_loop(condition:expression, stmt:statement)
  while (\condition) \stmt
  rewrites to
  while (\condition) { visited[\place\(\stmt\)]=true; \stmt }.
rule mark_case_clause(e:expression, stmts:statements)
  case \e: \stmts
  rewrites to
  case \e: visited[\place\(\stmts\)]=true; \stmts.
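33 Test Coverage Tool Flow
[Diagram: Source Code and Visit-adding Transforms go into DMS (add marking code), which produces Decorated Code plus source line information for visited[i]. The Decorated Code is compiled and run against Test Data, producing the visited vector. The visited vector and the source line information feed Display Coverage.]
Note: incrementing visited rather than setting it to true changes this into a profiler tool!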
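The run-and-report half of this flow can be mimicked by hand in Java, using a hand-marked copy of the fib example. The class `CoverageDemo`, the counter array and `unvisited()` are names invented for this sketch, and the marks are inserted manually here, whereas the slides describe DMS inserting them by transform. Counters are used instead of booleans, following the note about turning the tool into a profiler.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-instrumented version of the fib example; marks 1..6 as on the marking slide. */
final class CoverageDemo {
    private CoverageDemo() {}

    static final int[] visited = new int[7];             // index 0 unused; counts, not flags
    static final boolean[] fibCached = new boolean[1000];
    static final int[] fibValue = new int[1000];

    static int fib(int i) {
        visited[1]++;
        switch (i) {
            case 0:
                visited[2]++;                             // falls through, as in the C original
            case 1:
                visited[3]++;
                return 1;
            default:
                visited[4]++;
                if (fibCached[i]) {
                    visited[5]++;
                    return fibValue[i];
                } else {
                    visited[6]++;
                    int t = fib(i - 1);
                    return t + fib(i - 2);
                }
        }
    }

    /** Marks that no test has reached yet. */
    static List<Integer> unvisited() {
        List<Integer> marks = new ArrayList<>();
        for (int m = 1; m < visited.length; m++) {
            if (visited[m] == 0) marks.add(m);
        }
        return marks;
    }

    public static void main(String[] args) {
        System.out.println("fib(3) = " + fib(3));
        System.out.println("unvisited marks: " + unvisited());
    }
}
```

Since nothing in the example ever sets an entry of the cache, mark 5 stays unvisited after this test, which is exactly the kind of unexercised code a coverage display is meant to expose.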
34 Display Tool: Application with 3500 Java classes
35 Cross Reference and Dead Code
- Harness name declarations per domain
  - Enables semantically complex transforms
  - Collect by attribute evaluation with symbol table
- Combine with PrettyPrinting for full XREF
  - Customize PrettyPrinter to produce HTML
  - Attribute evaluation collects use points
  - Add links from identifiers to defs/uses
  - Add other documentation features (simulate JavaDoc)
- Implement dead code detection/removal
  - FAA DO-178B: no dead code in commercial aircraft
  - Diagnose unreferenced definitions
  - Delete using simple transforms
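Diagnosing unreferenced definitions amounts to comparing the collected declarations with the collected use points. The sketch below assumes that both have already been gathered (in DMS, by attribute evaluation over the symbol table); `DeadCodeFinder` is a hypothetical name, and identifying declarations by bare name is a simplification that ignores scopes and overloading.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical dead-declaration report: declarations with no recorded use. */
final class DeadCodeFinder {
    private DeadCodeFinder() {}

    /**
     * @param declarations declared name mapped to its source position, in declaration order
     * @param usedNames    names that occur at some use point
     * @return one report line per declaration that is never used
     */
    static List<String> unreferenced(Map<String, String> declarations, Set<String> usedNames) {
        List<String> dead = new ArrayList<>();
        for (Map.Entry<String, String> d : declarations.entrySet()) {
            if (!usedNames.contains(d.getKey())) {
                dead.add(d.getKey() + " @ " + d.getValue());
            }
        }
        return dead;
    }

    public static void main(String[] args) {
        Map<String, String> decls = new LinkedHashMap<>();
        decls.put("m", "Line 3 Column 8");
        decls.put("m2", "Line 11 Column 10");
        decls.put("B", "Line 17 Column 1");
        Set<String> uses = new HashSet<>(Arrays.asList("m"));
        for (String line : unreferenced(decls, uses)) System.out.println(line);
    }
}
```

Deleting a reported declaration can remove the last use of something else, so in practice the analysis and the deleting transforms would be repeated until no new unreferenced definitions appear.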
36 JavaDoc + Full System XRef
37 Detecting Dead Code
SuperClass.java:
package edu.ksu.cis.test;
public class SuperClass {
  protected transient int yy = 1;
  void outer() { System.out.println("I'm outer"); new AB(); }
}
class AB {
  SuperClass s = new SuperClass();
  //AB ab = new AB();
  class SuperClass {
    void inner() { System.out.println("I'm inner"); }
  }
  void callmethod() { s.inner(); }
}
Test.java:
package edu.ksu.cis.test;
public class Test extends SuperClass {
  void m() { // called from main
    Test t = new Test();
    Test t2 = Test.this;
    t2.yy = 3;
  }
}

class A {
  void m2() {
    Test t = new Test();
    t.yy = 2;
  }
}

class B {}
Deactive code detection:
7 d/users/hzheng/JavaTestSuite/first/edu/ksu/cis/test/SuperClass.java
  outer -> MethodDeclaredInClassType @ Line 4 Column 10
  inner -> MethodDeclaredInClassType @ Line 10 Column 14
  s -> FieldDeclaredInClassType @ Line 7 Column 16
  AB -> TopLevelClassDeclaration @ Line 6 Column 1
  callmethod -> MethodDeclaredInClassType @ Line 12 Column 10
8 d/users/hzheng/JavaTestSuite/first/edu/ksu/cis/test/Test.java
  A -> TopLevelClassDeclaration @ Line 10 Column 1
  m2 -> MethodDeclaredInClassType @ Line 11 Column 10
  t -> LocalVariableDeclaredInBlock @ Line 4 Column 11
  B -> TopLevelClassDeclaration @ Line 17 Column 1
  t -> LocalVariableDeclaredInBlock @ Line 12 Column 10
38 Removing Dead Code I
Transform:
private rule remove_deactive_method_declaration_7(mm:method_modifiers, id:IDENTIFIER, p:parameters, b:brackets, bb:block_body): class_body_declaration -> class_body_declaration
  "\mm void \id \p \b \bb" -> ""
  with side-effect remove_entry_from_symbol_table(id)
  if is_deactive_declaration(id).
(define IsDeactiveDeclaration (lambda
RegistryMatchingCondition (
(SymbolUsesMapGetUses symbol_uses_map
(NodeUsesMapGetWrapperDeclaration node_uses_map
arguments1))) )lambda )define (define
RemoveEntryFromSymbolTable (action
RegistryRuleSideEffect (local
wrapper_declaration JavaSymbolTableWrapperDeclar
ation ( remove entry of arguments1
from symboltable' (
wrapper_declaration (NodeUsesMapGetWrapperDeclara
tion node_uses_map arguments1))
(ConsolePut (. l Remove symbol table entry for
node ')) (ASTPrintSourcePosition
OutputStreamStandardOutput arguments1)
(JavaSymbolTableRemoveOneEntryFromSymbolTable
symbol_table wrapper_declaration arguments1
removed_symbol_space) ) )local
)action )define
Procedural Helpers Using Symbol Table
39 Removing Dead Code II
SuperClass.java, before:
package edu.ksu.cis.test;
public class SuperClass {
  protected transient int yy = 1;
  void outer() { System.out.println("I'm outer"); new AB(); }
}
class AB {
  SuperClass s = new SuperClass();
  //AB ab = new AB();
  class SuperClass {
    void inner() { System.out.println("I'm inner"); }
  }
  void callmethod() { s.inner(); }
}
SuperClass.java, after:
package edu.ksu.cis.test;
public class SuperClass {
  protected transient int yy = 1;
}
Test.java, before:
package edu.ksu.cis.test;
public class Test extends SuperClass {
  void m() { // called from main
    Test t = new Test();
    Test t2 = Test.this;
    t2.yy = 3;
  }
}

class A {
  void m2() {
    Test t = new Test();
    t.yy = 2;
  }
}

class B {}
Test.java, after:
package edu.ksu.cis.test;
public class Test extends SuperClass {
  void m() { // called from main
    Test t2 = Test.this;
    t2.yy = 3;
  }
}
Tool output:
Remove symbol table entry for node Line 17 Column 1 File d/users/hzheng/JavaTestSuite/first/edu/ksu/cis/test/Test.java
Remove symbol table entry for node Line 12 Column 10 File d/users/hzheng/JavaTestSuite/first/edu/ksu/cis/test/Test.java
Remove symbol table entry for node Line 11 Column 10 File d/users/hzheng/JavaTestSuite/first/edu/ksu/cis/test/Test.java
Remove symbol table entry for node Line 10 Column 1 File d/users/hzheng/JavaTestSuite/first/edu/ksu/cis/test/Test.java
Remove symbol table entry for node Line 4 Column 11 File d/users/hzheng/JavaTestSuite/first/edu/ksu/cis/test/Test.java
Remove symbol table entry for node Line 12 Column 10 File d/users/hzheng/JavaTestSuite/first/edu/ksu/cis/test/SuperClass.java
Remove symbol table entry for node Line 10 Column 14 File d/users/hzheng/JavaTestSuite/first/edu/ksu/cis/test/SuperClass.java
Remove symbol table entry for node Line 7 Column 16 File d/users/hzheng/JavaTestSuite/first/edu/ksu/cis/test/SuperClass.java
Remove symbol table entry for node Line 6 Column 1 File d/users/hzheng/JavaTestSuite/first/edu/ksu/cis/test/SuperClass.java
Remove symbol table entry for node Line 4 Column 10 File d/users/hzheng/JavaTestSuite/first/edu/ksu/cis/test/SuperClass.java
40 Java O-O Refactoring
"... changing a software system in such a way that it does not alter the external behavior of the code, yet improves its internal structure" (Refactoring, Martin Fowler, Addison-Wesley, 1999)
Extract Method, p. 110
Before:
void printOwing(double amount) {
  printBanner();
  // print details
  System.out.println("Name " + _name);
  System.out.println("Amount " + amount);
}
After:
void printOwing(double amount) {
  printBanner();
  printDetails(amount);
}
void printDetails(double amount) {
  System.out.println("Name " + _name);
  System.out.println("Amount " + amount);
}
- Many useful refactorings: move-method, generalize-method, ...
- Fowler view: teach programmers, implement manually
- DMS view: mechanical transforms --> CAPTURE IN TOOL!
- Refactoring Object-Oriented Frameworks (Opdyke92)
- C++ refactoring (Tokuda98/99)
- http://st-www.cs.uiuc.edu/brant/RefactoringBrowser/ Brant98 (Rewrite Rule Editor)
41 Extract Method as a Mechanical Transform: Mix of DMS RSL pattern matching and procedure
default base domain Java.
rule ExtractMethod(...)
  "\parent_class_modifiers \class_header
     \some_declarations
     \method_modifiers \type \method_name \parameters \brackets \throw_list
       \leader_statements
       \extractable_statements
       \follower_statements
     \more_declarations": class_body          [callout: Java pattern]
  rewrites to
  "\parent_class_modifiers \class_header
     \some_declarations
     \extracted_method(\extractable_statements)          [callout: Computes new method]
     \method_modifiers \type \method_name \parameters \brackets \throw_list
       \leader_statements
       \new_methodcall(\extractable_statements)          [callout: Computes call to method]
       \follower_statements
     \more_declarations"
  if extractable(\leader_statements,\extractable_statements) -- no bad side effects
  -- Properties of computed new method:
  -- method name: not member of parameter_names(\class_header) or declared(\leader_statements)
  -- method throw list: exceptions_escaping(\extractable_statements)
  -- declarations: declared(\leader_statements) escapes(\extractable_statements)
  -- argument names: (parameter_names(\class_header) union declared(\leader_statements)) intersect reads(\extractable_statements)
  -- argument bindings: ...
42 OO Promise Inhibited by Scale
- Typical 21st-century application: IBM's San Francisco (ERP)
  - 2 million lines of Java
  - 9000 class definitions for ERP framework
- Scale: refactoring by manual methods is very difficult
  - Impractical to carry out thoroughly and reliably
  - Failure to refactor -> framework evolution frozen -> future ?
- Scalable re-engineering tools will become necessary
  - Transformation systems can provide that foundation
  - Can support huge libraries of refactoring operations (Fowler99)
  - Can support other useful transforms (optimizations, etc.)
43 XML Parser Generation
- XML enables Electronic Data Interchange
  - Neutral form for moving information between systems
- Problem: need System1-to-XML-to-System2 translators
  - For arbitrary data
  - Must be fast to support high-volume EDI
- Typical solution: use a standard XML -> DOM reader
  - Incomplete: doesn't solve the "to XML" problem
  - Slow: parse arbitrary XML, validate against DTD schema
  - Slow: procedural interface interpreting DTD for data access
  - Clumsy: DOM calls clutter application code
- Idea: generate DTD-specific XML parsers/generators
  - Produce application-target-language code specific to the DTD
  - Free validation; XML data in direct-access native data structures
  - Faster parsing/processing, easier application coding
44 Simple XML DTD for OrderForm
<?xml version='1.0' ?>
<!DOCTYPE orderform [
<!ELEMENT orderform (name,company,address,items) >
<!ELEMENT name ( firstname, lastname )>
<!ELEMENT firstname ( #PCDATA )>
<!ELEMENT lastname ( #PCDATA )>
<!ELEMENT company ( #PCDATA )>
<!ELEMENT address ( street, city, country )>
<!ELEMENT street ( #PCDATA )>
<!ELEMENT city ( #PCDATA )>
<!ELEMENT country ( zipcode | nation )>
<!ELEMENT zipcode ( #PCDATA )>
<!ELEMENT nation ( #PCDATA )>
<!ELEMENT items (item) >
<!ELEMENT item ( partnumber, quantity, unitprice)>
<!ELEMENT partnumber ( #PCDATA )>
<!ELEMENT quantity ( #PCDATA )>
<!ELEMENT unitprice ( #PCDATA )>
]>
Note: items is a list of item
Sample orderform:
<orderform>
<name>Wiley Coyote</name>
<company>Dinner, Inc.</company>
<address><street>1 Mesa Highway</street>
<city>Southwest Park</city>
<country><zipcode>98765</zipcode></country>
</address>
<items>
<item><partnumber>Rocket Skates</partnumber>
<quantity>2</quantity><price>29.95</price></item>
<item><partnumber>BirdSeed</partnumber>
<quantity>2000</quantity><price>.01</price></item>
</items>
</orderform>
45 Java Code Generation Plan
- For each DTD element
  - Produce an element class to hold data for that element
    - PCDATA for leaves
    - Class references for non-leaves
  - Produce an element-specific parsing procedure
  - Produce an element-specific unparsing procedure
  - Handle sequences as an array of element class references
  - Handle choices as a class reference + "which" alternative integer (1..n)
[Diagram: the declaration <!ELEMENT name ( field ) > generates a parsing procedure, name Parse() { while ... sequence[i]=field.Parse() ... }, and a class, class XML_Element_name implements Union { public Union[] sequence = new Union[...] }]
46 Sample Code Generation Transforms
pattern CLASS_SEQUENCE(nameNAME,sequencecp_seque
nce)class_body_declarations_at_Java
"\JavaClassName\(\name\) Parse() throws
XML.NotValidForDTD if (!XML.QueryOpeningTa
g(\JavaXMLOpenTagString\(\name\))) return null
\JavaClassName\(\name\) resultnew
\JavaClassName\(\name\)()
\SEQUENCE_FETCH\(\sequence\,\SEQUENCE_LENGTH\(\seq
uence\)\,sequence1\) XML.RequireClosingT
ag(\JavaXMLCloseTagString\(\name\))
return result class \JavaClassName\(\na
me\) extends XML_IO implements Union
\XML_PARSER_DECLARATIONS_FOR_CP_SEQUENCE\(\sequenc
e\,1\,\gtXML\NAME
\contentspec sequence1 \lt\NAME\) // produces
nested subclasses public Union
sequence new Union\SEQUENCE_LENGTH\(\sequence\
) public Generate()
XML.WriteOpeningTag(\JavaXMLOpenTagString\(\name\)
) for (i1ilt\SEQUENCE_LENGTH\(\
sequence\)i)
sequencei.Generate()
XML.WriteClosingTag(\JavaXMLOpenTagString\(\name\)
) ". rule
refine_ELEMENT_sequence_to_class(nameNAME,sequenc
ecp_sequence) elementdecl -gt
class_body_declarations "\markupdecl
lt!ELEMENT \name ( \sequence ) gt" -gt
CLASS_SEQUENCE(name,sequence).
[callouts: Generates Parser method; Generates class]
About 100 rules, 2000 lines for all transforms
47 Generated Java code for items
class XML_Element_items extends XML_IO implements Union
{ public Union[] sequence = new Union[1];
  XML_Element_items Parse() throws XML.NotValidForDTD
  { if (!XML.QueryOpeningTag("items")) return null;
    XML_Element_items result = new XML_Element_items();
    if ((sequence[1] = XML_Element_item.Parse()) == null)
      throw XML.NotValidForDTD;
    XML.RequireClosingTag("items");
    return result;
  }
}
class XML_Element_item extends XML_IO implements Union
{ public Union[] sequence = new Union[3];
  XML_Element_item Parse() throws XML.NotValidForDTD
  { if (!XML.QueryOpeningTag("item")) return null;
    XML_Element_item result = new XML_Element_item();
    if ((sequence[1] = XML_Element_partnumber.Parse()) == null)
      throw XML.NotValidForDTD;
    if ((sequence[2] = XML_Element_quantity.Parse()) == null)
      throw XML.NotValidForDTD;
    if ((sequence[3] = XML_Element_unitprice.Parse()) == null)
      throw XML.NotValidForDTD;
    XML.RequireClosingTag("item");
    return result;
  }
}
48Generated Java code for item content: partnumber, quantity, unitprice
class XML_Element_partnumber extends XML_IO implements Union
{ public String PCDATA;
  XML_Element_partnumber Parse() throws XML.NotValidForDTD
  { if (!XML.QueryOpeningTag("partnumber")) return null;
    XML_Element_partnumber result = new XML_Element_partnumber();
    result.PCDATA = XML.AcceptNonEmptyPCDATA();
    XML.RequireClosingTag("partnumber");
    return result;
  }
}
class XML_Element_quantity extends XML_IO implements Union
{ public String PCDATA;
  XML_Element_quantity Parse() throws XML.NotValidForDTD
  { if (!XML.QueryOpeningTag("quantity")) return null;
    XML_Element_quantity result = new XML_Element_quantity();
    result.PCDATA = XML.AcceptNonEmptyPCDATA();
    XML.RequireClosingTag("quantity");
    return result;
  }
}
class XML_Element_unitprice extends XML_IO implements Union
{ public String PCDATA;
  XML_Element_unitprice Parse() throws XML.NotValidForDTD
  { if (!XML.QueryOpeningTag("unitprice")) return null;
    XML_Element_unitprice result = new XML_Element_unitprice();
    result.PCDATA = XML.AcceptNonEmptyPCDATA();
    XML.RequireClosingTag("unitprice");
    return result;
  }
}
49Application Code to Print Orderform
Orderform orderform = new Orderform.Parse(); // exception thrown if invalid w.r.t. DTD
Print("Customer: "); Print(orderform.name.firstname.PCDATA);
Print(" "); Print(orderform.name.lastname.PCDATA);
Print("Company: "); Print(orderform.company.PCDATA); PrintNewline();
Print("Address: "); Print(orderform.address.street.PCDATA); PrintNewline();
Print(orderform.address.city.PCDATA); PrintNewline();
if (orderform.address.region.which == 1)
  Print(orderform.address.region.zipcode.PCDATA);
else
  Print(orderform.address.region.country.PCDATA);
PrintNewline();
Print("ITEMS Product Quantity Cost Extension");
float Total = 0;
for (item = 1; item <= length(orderform.items.item); item++)
{ PrintNumber(item); PrintTab();
  Print(orderform.items.sequence[item].partnumber.PCDATA); PrintTab();
  Print(orderform.items.sequence[item].quantity.PCDATA); PrintTab();
  Print(orderform.items.sequence[item].unitprice.PCDATA); PrintTab();
  float extension = Value(orderform.items[item].quantity.PCDATA)
                  * Value(orderform.items[item].unitprice.PCDATA);
  PrintNumber(extension); PrintNewline();
  Total += extension;
}
Print("Invoice total: "); PrintNumber(Total); PrintNewline();
Note direct access to XML data
50Preliminary performance
- From http://www.pankaj-k.net/xpb4j/docs/Measurements-May30/measurements-May30-2002.html#MeasurementProcess
- Comparison among DOM Parsers for XML parsing rate
[Chart annotations: 30ms; SDDTDtoJava; orderform.class 8kb; XML.class 24kb]
51Partial Evaluation (PE): Faster Computation using Context
- Example
- We want to repeatedly compute X+Y+4
- Somehow we find out X=7 all the time
- We can compute 7+Y+4
- Better, we can compute Y+11
- Partial Evaluation (implemented using transforms)
- Find a program fragment with variables
- Find a constant variable value (computation in constant context)
- Substitute that value into program fragment
- Simplify the result
- Spectacular examples
- JIT: PE(JavaInterpreter,JavaProgram) --> CompiledJava
- PE(XSLInterpreter,XSLProgram) --> CompiledXSL
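The substitute-then-simplify steps can be sketched for the sum example in a few lines of Python. This is an illustration only, limited to sums of variables and integer constants; the function names `partially_evaluate_sum` and `compile_sum` are hypothetical and are not part of DMS.

```python
def partially_evaluate_sum(terms, known):
    """Specialize a sum of terms against the variables whose values are known.

    terms: list of variable names (str) and integer constants.
    known: dict mapping some variable names to constant values.
    Returns (residual_variables, folded_constant).
    """
    constant = 0
    residual = []
    for term in terms:
        if isinstance(term, int):
            constant += term          # literal constant: fold it
        elif term in known:
            constant += known[term]   # substitute the known value, then fold
        else:
            residual.append(term)     # still unknown: stays in the program
    return residual, constant


def compile_sum(residual, constant):
    """Build the specialized function: remaining variables plus one constant."""
    def run(env):
        return sum(env[name] for name in residual) + constant
    return run


if __name__ == "__main__":
    residual, constant = partially_evaluate_sum(["X", "Y", 4], {"X": 7})
    print(residual, constant)                          # ['Y'] 11
    print(compile_sum(residual, constant)({"Y": 5}))   # 16
```

The specialized function does one addition per call instead of two, which is the whole payoff: work that depends only on the constant context is done once, ahead of time.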
52Cost of Running an HTML server using XSLT interpreter
- Big E-commerce site
- 1 million visits/month
- Each visit uses 100 clicks
- Each click displays 1 XML screen-page as HTML
- MS IE 5.5: 1 screen-page in .075 sec (time measured on 4-way SMP 200 MHz PPro)
- MSXML: core component of IE5.5
- 33300 x 100 x .075 sec --> 3 CPU days/elapsed day
- Servers have lots of other work to do!
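The CPU estimate can be checked with a short calculation. The 30-day month is an assumption of this sketch (the slide rounds the daily visit count to 33300); the other figures come from the bullets above.

```python
VISITS_PER_MONTH = 1_000_000
DAYS_PER_MONTH = 30            # assumption: the slide does not state the month length
CLICKS_PER_VISIT = 100
SECONDS_PER_PAGE = 0.075       # measured MS IE 5.5 time per screen-page
SECONDS_PER_DAY = 24 * 60 * 60

visits_per_day = VISITS_PER_MONTH / DAYS_PER_MONTH
cpu_seconds_per_day = visits_per_day * CLICKS_PER_VISIT * SECONDS_PER_PAGE
cpu_days_per_day = cpu_seconds_per_day / SECONDS_PER_DAY

if __name__ == "__main__":
    print(f"{visits_per_day:,.0f} visits/day")
    print(f"{cpu_seconds_per_day:,.0f} CPU seconds/day")
    print(f"{cpu_days_per_day:.2f} CPU days per elapsed day")
```

Under these assumptions the load is about 250,000 CPU seconds per day, or roughly 2.9 CPU days per elapsed day, which the slide rounds to 3.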
53Compile XML -> HTML for speed
- Read XSL driving XML->HTML conversion
- Compile XSL to C code using transforms
Why? Compiled code is typically 100x faster than interpreted code (MSXML)
- C structs instead of DOM nodes
- C code instead of XSL paths
- C code instead of JavaScript idioms
- Run compiled C code instead of IE5/MSXML
- Estimated runtime 306 seconds (vs. 3 CPU days for IE5)
- Other XSL processors may be faster than IE5
- Still probably interpretive --> slow
- Major savings in server CPU
- fewer servers (lower cost to maintain)
- fast response (better customer experience)
Backup Spreadsheet
54XSL Compiler using DMS
[Diagram: XSL input passes through Parse (read), Analyze, Generate/Transform and Pretty Print to produce C. The pipeline is driven by the Task Definition, XML Patterns, C Transforms, XML Language Definition and C Language Definition.]
55DMS Scale Application: Clone Removal
Original System with code clones: 1 M SLOC
... code block 1 ...
// sort array A
for (I=1; I<10; I++)
  for (j=I; j>1; j--)
    if (A[j]>A[j-1]) swap(A[I],A[j])
... code block 2 ...
for (I=1; I<2Q; I++)
  for (I1=I; I1>1; I1--)
    // exchange if less
    if ( K[I1] > K[I1-1] ) swap( K[I], K[I1] )
... code block 3 ...
// sort my data
for (z=1; z<1000; z++)
  for (j=z; j>1; j--)
    if (D[j]>D[j-1]) swap(D[z],D[j])
... code block 4 ...
Skeleton of detected clones
for (a=1; a<b; a++)
  for (c=a; c>1; c--)
    if (d[c]>d[c-1]) swap(d[a],d[c])
DeCloned System with automatic names: 10-20% reduction
#define Clone27(a,b,c,d)\
  for (a=1; a<b; a++)\
    for (c=a; c>1; c--)\
      if (d[c]>d[c-1])\
        swap(d[a],d[c])
... code block 1 ...
// sort array A
Clone27(I,10,j,A)
... code block 2 ...
Clone27(I,2Q,I1,K)
... code block 3 ...
// sort my data
Clone27(z,1000,j,D)
... code block 4 ...
56COBOL Clone Detection/Removal
- Find Clones by matching every program fragment (AST) to every other
- Expensive!
- California Community Colleges application
- Course Inventory Construction for each campus
- Parameters
- 77,000 SLOC ANSI COBOL 85 --> 774,645 AST nodes
- 128 second parse time, 40 minutes for clone detection
- 500 MB RAM, 6 CPUs
- Number of exact clone pairs: 78; near-miss pairs: 95
- Largest clone: 5 copies, 1017 lines each, 1 parameter!
- Number of cloned lines: 30727 --> can remove 15363
- SLOC reduction by removing clones: 19.7%
- Getting ready to try on 2M SLOC Java code
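As a rough illustration of tree-based clone detection, the sketch below finds statements in Python source whose syntax trees are identical once identifiers and literals are abstracted away. It swaps in hashing (bucketing by a normalized tree dump) to avoid the all-pairs comparison described above, and it uses Python's standard `ast` module instead of a COBOL front end. It is not the DMS algorithm, and it does not detect near-miss clones that differ in structure. The names `Normalize`, `find_clones` and `SAMPLE` are hypothetical.

```python
import ast
import copy
from collections import defaultdict


class Normalize(ast.NodeTransformer):
    """Replace variable names and literals with placeholders so that
    fragments differing only in those details compare equal."""

    def visit_Name(self, node):
        return ast.Name(id="_", ctx=node.ctx)

    def visit_Constant(self, node):
        return ast.Constant(value=0)


def find_clones(source, min_nodes=10):
    """Return lists of line numbers of statements with identical normalized trees.

    Statements with fewer than min_nodes tree nodes are ignored as too small.
    Nested statements inside a cloned statement are reported as well.
    """
    buckets = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.stmt):
            continue
        if sum(1 for _ in ast.walk(node)) < min_nodes:
            continue
        # Normalize a copy so the original tree keeps its names and line numbers.
        key = ast.dump(Normalize().visit(copy.deepcopy(node)))
        buckets[key].append(node.lineno)
    return [lines for lines in buckets.values() if len(lines) > 1]


SAMPLE = """
for i in range(10):
    if a[i] > a[i - 1]:
        swap(a, i)
x = 1
for k in range(17):
    if d[k] > d[k - 1]:
        swap(d, k)
"""

if __name__ == "__main__":
    for group in find_clones(SAMPLE):
        print("clone group at lines", group)
```

Bucketing by a normalized dump makes the cost roughly linear in the number of tree nodes, at the price of only finding clones that match exactly after normalization.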
57Sample COBOL Clones
Similarity .99178082191781

from 35179 to 35204 file example.cbl (Report by College-ID)
9700-OUTPUT-REPORT-TOTALS.
    MOVE CURRENT-COLLEGE-ID TO REPORT-CODE1 REPORT-CODE2 REPORT-CODE3.
    SET EDIT-ERROR-LITERAL-INDEX TO 1.
    SET DISTRICT-COUNT-ROW-INDEX TO 1.
    PERFORM 9710-OUTPUT-TOTALS1 UNTIL EDIT-ERROR-LITERAL-INDEX > 30.
    PERFORM 9720-OUTPUT-TOTALS2.
    PERFORM 9730-OUTPUT-TOTALS3.
9710-OUTPUT-TOTALS1.
    MOVE REPORT-SUM1 TO TOTALS-ID1.
    MOVE ELEMENT-NUMBER (EDIT-ERROR-LITERAL-INDEX) TO DED-NUMBER.
    MOVE DISTRICT-COUNT (DISTRICT-COUNT-ROW-INDEX 1) TO EXCEPT-COUNT.
    MOVE DISTRICT-COUNT (DISTRICT-COUNT-ROW-INDEX 2) TO UNKNOWN-COUNT.
    MOVE DISTRICT-COUNT (DISTRICT-COUNT-ROW-INDEX 3) TO REASON-COUNT.
    MOVE DISTRICT-COUNT (DISTRICT-COUNT-ROW-INDEX 4) TO GRP3-COUNT.
    WRITE REPORT-TOTALS-RECORD1.
    SET EDIT-ERROR-LITERAL-INDEX UP BY 1.
    SET DISTRICT-COUNT-ROW-INDEX UP BY 1.
------------------------------------
from 15368 to 15393 file example.cbl (Report by District-ID)
9700-OUTPUT-REPORT-TOTALS.
    MOVE HOLD-DISTRICT-ID TO REPORT-CODE1 REPORT-CODE2 REPORT-CODE3.
    SET EDIT-ERROR-LITERAL-INDEX TO 1.
    SET DISTRICT-COUNT-ROW-INDEX TO 1.
    PERFORM 9710-OUTPUT-TOTALS1 UNTIL EDIT-ERROR-LITERAL-INDEX > 17.
    PERFORM 9720-OUTPUT-TOTALS2.
    PERFORM 9730-OUTPUT-TOTALS3.
9710-OUTPUT-TOTALS1.
    MOVE REPORT-SUM1 TO TOTALS-ID1.
    MOVE ELEMENT-NUMBER (EDIT-ERROR-LITERAL-INDEX) TO DED-NUMBER.
    MOVE DISTRICT-COUNT (DISTRICT-COUNT-ROW-INDEX 1) TO EXCEPT-COUNT.
    MOVE DISTRICT-COUNT (DISTRICT-COUNT-ROW-INDEX 2) TO UNKNOWN-COUNT.
    MOVE DISTRICT-COUNT (DISTRICT-COUNT-ROW-INDEX 3) TO REASON-COUNT.
    MOVE DISTRICT-COUNT (DISTRICT-COUNT-ROW-INDEX 4) TO GRP3-COUNT.
    WRITE REPORT-TOTALS-RECORD1.
    SET EDIT-ERROR-LITERAL-INDEX UP BY 1.
    SET DISTRICT-COUNT-ROW-INDEX UP BY 1.
---------
Parameter bindings:
1 > CURRENT-COLLEGE-ID > HOLD-DISTRICT-ID
2 > 30 > 17
58The generated COPYLIB
9700-OUTPUT-REPORT-TOTALS .
    MOVE PARAMETER-1 TO REPORT-CODE1 REPORT-CODE2 REPORT-CODE3 .
    SET EDIT-ERROR-LITERAL-INDEX TO 1 .
    SET DISTRICT-COUNT-ROW-INDEX TO 1 .
    PERFORM 9710-OUTPUT-TOTALS1 UNTIL EDIT-ERROR-LITERAL-INDEX > PARAMETER-2 .
    PERFORM 9720-OUTPUT-TOTALS2 .
    PERFORM 9730-OUTPUT-TOTALS3 .
9710-OUTPUT-TOTALS1 .
    MOVE REPORT-SUM1 TO TOTALS-ID1 .
    MOVE ELEMENT-NUMBER ( EDIT-ERROR-LITERAL-INDEX ) TO DED-NUMBER .
    MOVE DISTRICT-COUNT ( DISTRICT-COUNT-ROW-INDEX 1 ) TO EXCEPT-COUNT .
    MOVE DISTRICT-COUNT ( DISTRICT-COUNT-ROW-INDEX 2 ) TO UNKNOWN-COUNT .
    MOVE DISTRICT-COUNT ( DISTRICT-COUNT-ROW-INDEX 3 ) TO REASON-COUNT .
    MOVE DISTRICT-COUNT ( DISTRICT-COUNT-ROW-INDEX 4 ) TO GRP3-COUNT .
    WRITE REPORT-TOTALS-RECORD1 .
    SET EDIT-ERROR-LITERAL-INDEX UP BY 1 .
    SET DISTRICT-COUNT-ROW-INDEX UP BY 1 .
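How two near-identical clones become one parameterized body can be sketched with a token-level comparison. This is a simplification: the slides describe matching on syntax trees, while the sketch below only handles clones with the same number of whitespace-separated tokens. The function name `parameterize` is hypothetical; the `PARAMETER-n` naming follows the generated COPYLIB.

```python
def parameterize(clone_a, clone_b):
    """Merge two clones into one template with PARAMETER-n placeholders.

    Returns (template_text, {parameter_name: (value_in_a, value_in_b)}).
    Assumption of this sketch: both clones have the same token count.
    """
    tokens_a, tokens_b = clone_a.split(), clone_b.split()
    if len(tokens_a) != len(tokens_b):
        raise ValueError("this sketch only handles clones with equal token counts")
    template = []
    names = {}  # (token_a, token_b) -> parameter name
    for a, b in zip(tokens_a, tokens_b):
        if a == b:
            template.append(a)
            continue
        # Reuse the parameter when the same pair of differing tokens recurs.
        if (a, b) not in names:
            names[(a, b)] = f"PARAMETER-{len(names) + 1}"
        template.append(names[(a, b)])
    bindings = {name: pair for pair, name in names.items()}
    return " ".join(template), bindings


if __name__ == "__main__":
    first = ("MOVE CURRENT-COLLEGE-ID TO REPORT-CODE1 . "
             "PERFORM 9710-OUTPUT-TOTALS1 UNTIL EDIT-ERROR-LITERAL-INDEX > 30 .")
    second = ("MOVE HOLD-DISTRICT-ID TO REPORT-CODE1 . "
              "PERFORM 9710-OUTPUT-TOTALS1 UNTIL EDIT-ERROR-LITERAL-INDEX > 17 .")
    text, params = parameterize(first, second)
    print(text)
    print(params)
```

Each clone site can then be replaced by a reference to the shared body plus its own column of the bindings, which is the role the COPY ... REPLACING statement plays in the COBOL case.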
59Source file change
Source with clone replaced by COPY:
    MOVE DISTRICT-INT-CNT (11) TO INT-CNT-OUT-B.
    WRITE PRINT-RECORD-2 FROM INTEGRITY-ERROR-B AFTER ADVANCING 2 LINES.
    MOVE DISTRICT-INT-CNT (12) TO INT-CNT-OUT-C.
    WRITE PRINT-RECORD-2 FROM INTEGRITY-ERROR-C AFTER ADVANCING 2 LINES.
    COPY CDR_clone10
        REPLACING PARAMETER-1 BY HOLD-DISTRICT-ID
                  PARAMETER-2 BY 17
9720-OUTPUT-TOTALS2.
    MOVE REPORT-SUM2 TO TOTALS-ID2.
    MOVE DISTRICT-INT-CNT (1) TO INTEGRITY-ERROR-COUNT.
    MOVE '01' TO INTEGRITY-ERROR-CODE.
    WRITE REPORT-TOTALS-RECORD2.
    MOVE REPORT-SUM2 TO TOTALS-ID2.
    MOVE DISTRICT-INT-CNT (2) TO INTEGRITY-ERROR-COUNT.
    MOVE '02' TO INTEGRITY-ERROR-CODE.
Original source:
    MOVE DISTRICT-INT-CNT (11) TO INT-CNT-OUT-B.
    WRITE PRINT-RECORD-2 FROM INTEGRITY-ERROR-B AFTER ADVANCING 2 LINES.
    MOVE DISTRICT-INT-CNT (12) TO INT-CNT-OUT-C.
    WRITE PRINT-RECORD-2 FROM INTEGRITY-ERROR-C AFTER ADVANCING 2 LINES.
9700-OUTPUT-REPORT-TOTALS.
    MOVE HOLD-DISTRICT-ID TO REPORT-CODE1, REPORT-CODE2, REPORT-CODE3.
    SET EDIT-ERROR-LITERAL-INDEX TO 1.
    SET DISTRICT-COUNT-ROW-INDEX TO 1.
    PERFORM 9710-OUTPUT-TOTALS1 UNTIL EDIT-ERROR-LITERAL-INDEX > 17.
    PERFORM 9720-OUTPUT-TOTALS2.
    PERFORM 9730-OUTPUT-TOTALS3.
9710-OUTPUT-TOTALS1.
    MOVE REPORT-SUM1 TO TOTALS-ID1.
    MOVE ELEMENT-NUMBER (EDIT-ERROR-LITERAL-INDEX) TO DED-NUMBER.
    MOVE DISTRICT-COUNT (DISTRICT-COUNT-ROW-INDEX, 1) TO EXCEPT-COUNT.
    MOVE DISTRICT-COUNT (DISTRICT-COUNT-ROW-INDEX, 2) TO UNKNOWN-COUNT.
    MOVE DISTRICT-COUNT (DISTRICT-COUNT-ROW-INDEX, 3) TO REASON-COUNT.
    MOVE DISTRICT-COUNT (DISTRICT-COUNT-ROW-INDEX, 4) TO GRP3-COUNT.
    WRITE REPORT-TOTALS-RECORD1.
    SET EDIT-ERROR-LITERAL-INDEX UP BY 1.
    SET DISTRICT-COUNT-ROW-INDEX UP BY 1.
9720-OUTPUT-TOTALS2.
    MOVE REPORT-SUM2 TO TOTALS-ID2.
    MOVE DISTRICT-INT-CNT (1) TO INTEGRITY-ERROR-COUNT.
    MOVE '01' TO INTEGRITY-ERROR-CODE.
    WRITE REPORT-TOTALS-RECORD2.
    MOVE REPORT-SUM2 TO TOTALS-ID2.
    MOVE DISTRICT-INT-CNT (2) TO INTEGRITY-ERROR-COUNT.
    MOVE '02' TO INTEGRITY-ERROR-CODE.
60Clone Detection/Removal Statistics
61Typical Porting Scenarios
- JOVIAL73 on MIL1750 -> C on PowerPC
- Military Avionics Weapons management
- COBOL74 + IDMS -> COBOL85 + SQL
- UNISYS 1100 retirement: must move data, too!
- K&R C + custom RTOS -> ANSI C + VxWorks
- Microprocessor modernization
- Clipper green screen -> Delphi GUI
- Legacy 3GL data processing language
- MODCOMP ASM -> C
- Defense Radar modernization: 12 computer languages!
- Verilog -> VHDL
- Reuse of Chip Design in new context
62How DMS handles Porting
- Accepts definitions of source, target and design languages
- Syntax, Semantics, Optimization Transforms and Analysis rules
- Accepts specifications of (porting) transformations
- Written in terms of the language syntax, conditioned by analyses
- Source-language idioms often map directly to Target-language idioms
- Transforms for Complex idioms/OS/Library calls abstracted to design languages, then refined to target languages
- Parses entire source system (thousands of files!)
- Applies Porting transformations, then Optimizing transforms
- PrettyPrints the results in compilable target language form
- Test Result using Application Regression Test
- Revise transforms and repeat till done
63A few DMS porting transforms: Jovial to C
default source domain Jovial
default target domain C
private rule refine_data_reference_dereference_NAME
    (n1:identifier@C, n2:identifier@C)
    : data_reference -> expression
    = "\n1\NAME @ \n2\NAME" -> "\n2->\n1".
private rule refine_for_loop_letter_2
    (lc:identifier@C, f1:expression@C, f2:expression@C, s:statement@C)
    : statement -> statement
    = "FOR \lc\loop_control \f1\formula BY \f2\formula \s\statement"
    -> "int \lc (\f1) for(\lc (\f2)) \s"
    if is_letter_identifier(lc).
Callouts: Domain Name; Pattern Variables; Source Domain Syntax; Target Domain Syntax
64Porting Transforms in Action: Jovial to C
JOVIAL Source
    FOR i j3 BY 2
        x@mydata x@mydataI
Translated C Result
    int i j3
    for (i2)
        mydata->x mydata->x i
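The idea of a small source-to-target rewrite rule can be sketched on a toy syntax tree. The sketch below is not DMS rule syntax: trees are plain Python tuples, the single rule mirrors the Jovial `field @ pointer` to C `pointer->field` mapping, and the names `rewrite`, `deref_rule` and `to_c` are hypothetical. The `add` and `assign` operators in the usage example are assumptions made for illustration.

```python
def rewrite(term, rules):
    """Rewrite bottom-up: children first, then try each rule at this node."""
    if isinstance(term, tuple):
        term = tuple(rewrite(child, rules) for child in term)
        for rule in rules:
            result = rule(term)
            if result is not None:
                return result
    return term


def deref_rule(term):
    """("at", field, pointer)  ->  ("arrow", pointer, field)"""
    if len(term) == 3 and term[0] == "at":
        return ("arrow", term[2], term[1])
    return None  # rule does not apply here


def to_c(term):
    """Pretty-print a rewritten tree as C text (binary operators only)."""
    if isinstance(term, str):
        return term
    op, left, right = term
    symbol = {"arrow": "->", "add": " + ", "assign": " = "}[op]
    return to_c(left) + symbol + to_c(right)


if __name__ == "__main__":
    jovial = ("assign", ("at", "x", "mydata"), ("add", ("at", "x", "mydata"), "i"))
    print(to_c(rewrite(jovial, [deref_rule])))   # mydata->x = mydata->x + i
```

A full translator is then a large set of such rules, each small and local, applied across the whole parsed system and followed by prettyprinting.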
Typically lots of small transforms for full
translation 1500 rules to translate full Jov