Title: Coevolutionary Automated Software Correction: A Proof of Concept
1. Coevolutionary Automated Software Correction: A Proof of Concept
Committee: Dr. Daniel Tauritz (Chair), Dr. Bruce McMillin, Dr. Thomas Weigert
- Master's Oral Defense
- September 8, 2008
- Josh Wilkerson
2. Motivation
- In 2002, the National Institute of Standards and Technology stated [9]:
- Software errors cost the U.S. economy $59.5 billion a year
- Approximately 0.6% of gross domestic product
- 30% of these costs could be removed by earlier, more effective software defect detection and an improved testing infrastructure
3. Problem Statement
- Software debugging:
- Test the software
- Locate the errors identified
- Correct the errors
- A time-consuming yet critical process
- Many publications on automating the testing process
- None that fully automate both the testing and correction phases
4. The System Envisioned
5. Most Related Work
- Paolo Tonella [14] and Stefan Wappler [6,15,16]
- Unit testing of object-oriented software
- Used evolutionary methods
- Focused only on testing; did nothing with correction
- Timo Mantere [7,8]
- Two-population testing system using genetic algorithms
- Optimized program parameters through evolution
- The more control the EA has over the program, the better the results
6. Technical Background
- Christopher Rosin [10,11] and John Cartlidge [1]
- Extensive analysis of coevolution
- Outline many potential problems that can occur during coevolution
- Koza [2,3,4,5]
- Popularized genetic programming in the 1990s
- Father of modern genetic programming
7. CASC Evolutionary Model
8. CASC Evolutionary Model
9. Parsing in the CASC System
The program population is based on the program to be corrected (the seed program).
10. Parsing in the CASC System: Step 1
- The ANTLR system is used to create parsing tools (only done once for each language)
- The parser created is based on a provided grammar (C)
- The resulting parser is dependent on the ANTLR libraries
11. Parsing in the CASC System: Step 2
- The system reads in the source code for the program to correct
- The code to evolve is extracted during preprocessing
12. Parsing in the CASC System: Step 3
- The preprocessed source code to evolve is provided to the parsing tools
13. Parsing in the CASC System: Step 4
- The parsing tools produce the Abstract Syntax Tree (AST) for the evolvable code
- The AST produced is heavily dependent on the ANTLR libraries
- These dependencies incur unnecessary computational cost
14. Parsing in the CASC System: Step 5
- The ANTLR AST is provided to the CASC AST translator
- The AST translator removes the ANTLR dependencies from the AST
- The result is a lightweight version of the AST
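The slides do not show the translator's data structures; as a hedged illustration, a library-independent ("lightweight") AST node might look like the following first-child/next-sibling sketch. All names here are hypothetical, not taken from the CASC code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch of a lightweight AST node: just the node type,
   token text, and child links -- no references to ANTLR objects. */
typedef struct ASTNode {
    int type;                  /* grammar rule or token type */
    char text[32];             /* token text, if any */
    struct ASTNode *child;     /* first child */
    struct ASTNode *sibling;   /* next sibling */
} ASTNode;

/* Allocate a node with no children. */
ASTNode *ast_new(int type, const char *text) {
    ASTNode *n = calloc(1, sizeof *n);
    n->type = type;
    strncpy(n->text, text, sizeof n->text - 1);
    return n;
}

/* Count the nodes in a (sub)tree. */
int ast_count(const ASTNode *n) {
    if (!n) return 0;
    return 1 + ast_count(n->child) + ast_count(n->sibling);
}
```

Stripping the tree down to this form is what removes the run-time dependency on the ANTLR libraries during evolution.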
15. Parsing in the CASC System: Step 6
- The lightweight AST is provided to the CASC coevolutionary system
- Copies of the AST are randomly modified
- This is the initial variation phase
16. CASC Evolutionary Model
17. CASC Evolutionary Model
18. CASC Evolutionary Model
19. CASC Evolutionary Model
- Reproduction
- Parents selected using tournament selection
- Uniform crossover with bias
- Subtrees directly below the program roots were used for crossover
- Mutation
- Each offspring has a chance to mutate
- Only specific nodes are considered for program mutation
- Genes to be mutated are altered based on a Gaussian distribution
20. CASC Evolutionary Model
21. CASC Evolutionary Model
22. CASC Evolutionary Model: Fitness Evaluation
- For each individual:
- Randomly select a set of (unique) opponents
- Check the hash table to retrieve results of repeat pairings
- Execute the program with the test case as input for each new pairing
- Apply the fitness function to the program output; store the fitness for the trial
- Set individual fitness as the average fitness across all trials
- Program compilation is performed as needed
- Program errors/time-outs result in an arbitrarily low fitness
- This is done in parallel, using the NIC-Cluster and MPI
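The averaging step above can be sketched minimally, assuming a fixed penalty score for trials whose program errored or timed out (the names and the -1.0 penalty value are illustrative, matching the fitness range described later in the experimental setup):

```c
#define ERROR_FITNESS (-1.0)  /* assumed penalty for failed trials */

/* Average the fitness scores of all trials for one individual; trials
   whose program errored or timed out contribute the penalty score. */
double individual_fitness(const double scores[], const int trial_ok[],
                          int num_trials) {
    double sum = 0.0;
    for (int i = 0; i < num_trials; i++)
        sum += trial_ok[i] ? scores[i] : ERROR_FITNESS;
    return sum / num_trials;
}
```

One failed trial thus drags the average down sharply, which is what makes crashing programs uncompetitive in selection.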
23. CASC Evolutionary Model
24. CASC Evolutionary Model
25. CASC Evolutionary Model
26. Experimental Setup
- Proof of concept
- Correction of an insertion sort implementation
- Test case: an unsorted data array
27. Experimental Setup
- Fitness function
- Scoring method:
- For each element x in the output data array:
- For each element a before x in the array, decrement the score if x < a, increment it otherwise
- For each element b after x in the array, decrement the score if x > b, increment it otherwise
- The score is normalized to fall between 0 and 1
- A fitness of -1 is assigned to programs with errors/time-outs
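The scoring rule above can be sketched directly; this is a hedged reconstruction from the slide, and names such as `sort_score` are illustrative:

```c
/* Score an output array by the slide's rule: for each element, +1 for
   every other element on the correct side of it, -1 otherwise; the raw
   total is then normalized to lie between 0 and 1. Assumes size >= 2. */
double sort_score(const int data[], int size) {
    int score = 0;
    for (int i = 0; i < size; i++) {
        for (int j = 0; j < i; j++)          /* elements before data[i] */
            score += (data[i] < data[j]) ? -1 : 1;
        for (int j = i + 1; j < size; j++)   /* elements after data[i] */
            score += (data[i] > data[j]) ? -1 : 1;
    }
    int max_score = size * (size - 1);       /* all comparisons correct */
    return (score + max_score) / (2.0 * max_score);
}
```

A fully sorted array scores 1.0 and a reverse-sorted array 0.0; the -1 assigned to crashing or timed-out programs lies below this range by construction.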
28. Experimental Setup
- Four seed programs used
- Each has one common error and one unique error (of varying severity)
- Four different configurations used
- Mutation rate: likelihood of an offspring being mutated
- Mutative proportion: amount of change a mutation incurs
29. Results
- A total of 16 experiments per full run
- High computational complexity and limited resources
- Five full runs were completed, totaling 80 experiments
30. Summary of Results
- Run three of both the program A and program B experiments found a solution in the initial population (these were omitted from the table)
- 20% of the experiments (16) reported success
31. Summary of Results
- 75% of the experiments reported fitness above 0.7
32. Summary of Results
- There was a high amount of variation in the experiment endpoints
- There are a large number of possible solutions for each seed program
33. Summary of Results
- The seed program D experiments were the toughest for the system
- The seeded error resulted in either a 0 or -1 fitness
- Experiments were either hit or miss
34. Discussion of False Positives
- A number of the programs returned by successful experiments still contain an error
- For example, this is the evolvable section from one returned solution:

    for(m = 0; m - 1 < SIZE - 1; m = m + 1)
    {
        for(n = m + 1; n > 0 && data[n] < data[n-1]; n = n - 1)
            Swap(data[n], data[n-1]);
    }

- When m is SIZE-1, n is initialized to SIZE (an invalid array index)
- Tough to catch
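The off-by-one can be confirmed mechanically. This minimal check reproduces just the outer-loop bounds from the snippet above (SIZE chosen arbitrarily):

```c
#define SIZE 8  /* arbitrary array length for the check */

/* Return the last value of m executed by the outer loop. The condition
   m - 1 < SIZE - 1 simplifies to m < SIZE, so it admits m == SIZE - 1,
   and the inner loop's starting index n = m + 1 then reaches SIZE --
   one element past the end of the array. */
int last_outer_m(void) {
    int m, last = -1;
    for (m = 0; m - 1 < SIZE - 1; m = m + 1)
        last = m;
    return last;
}
```

Because the out-of-bounds read usually returns harmless garbage rather than crashing, the fitness function never penalizes it, which is why such programs pass as solutions.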
35. Conclusion
- The goal: demonstrate a proof-of-concept coevolutionary system for integrated automated software testing and correction
- A prototype Coevolutionary Automated Software Correction (CASC) system was introduced
- 80 experiments were conducted
- 16 successes, with 75% of best-of-experiment fitnesses reporting over 0.7 (out of 1.0)
- These experiments indicate the validity of the CASC system concept
- Further work is required to determine scalability
- An article on this work has been submitted to IEEE TSE
36. Work in Progress and Future Work
- Evolve the complete parse tree
- Preliminary results using a GP evolutionary model are favorable
- Cut down on run-times:
- Add symmetric multiprocessing (server-client) functionality
- More efficient compilation
- Acquire additional computing resources (e.g., NSF TeraGrid)
- Investigate the potential benefits of co-optimization [12,13]
37. Work in Progress and Future Work
- Implement adaptive parameter control
- Investigate options for detecting errors such as false positives
- Parameter sensitivity analysis
38. References
- [1] J. P. Cartlidge. Rules of Engagement: Competitive Coevolutionary Dynamics in Computational Systems. PhD thesis, University of Leeds, 2004.
- [2] J. R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, 1992.
- [3] J. R. Koza. Genetic Programming II: Automatic Discovery of Reusable Programs. MIT Press, Cambridge, MA, 1994.
- [4] J. R. Koza. Genetic Programming III: Darwinian Invention and Problem Solving. Morgan Kaufmann, 1999.
- [5] J. R. Koza. Genetic Programming IV: Routine Human-Competitive Machine Intelligence. Kluwer Academic Publishers, 2003.
- [6] F. Lammermann and S. Wappler. Benefits of software measures for evolutionary white-box testing. In Proceedings of GECCO 2005 - the Genetic and Evolutionary Computation Conference, pages 1083-1084, Washington, DC, 2005. ACM Press.
39. References
- [7] T. Mantere and J. T. Alander. Developing and testing structural light vision software by co-evolutionary genetic algorithm. In QSSE 2002: The Proceedings of the Second ASERC Workshop on Quantitative and Soft Computing based Software Engineering, pages 31-37. Alberta Software Engineering Research Consortium (ASERC) and the Department of Electrical and Computer Engineering, University of Alberta, Feb. 2002.
- [8] T. Mantere and J. T. Alander. Testing digital halftoning software by generating test images and filters co-evolutionarily. In Proceedings of SPIE Vol. 5267: Intelligent Robots and Computer Vision XXI: Algorithms, Techniques, and Active Vision, pages 257-258. SPIE, Oct. 2003.
- [9] M. Newman. Software Errors Cost U.S. Economy $59.5 Billion Annually. NIST News Release, June 2002.
- [10] C. D. Rosin and R. K. Belew. Methods for competitive coevolution: Finding opponents worth beating. In L. Eshelman, editor, Proceedings of the Sixth International Conference on Genetic Algorithms, pages 373-380, San Francisco, CA, 1995. Morgan Kaufmann.
- [11] C. D. Rosin and R. K. Belew. New methods for competitive coevolution. Evolutionary Computation, 5(1):1-29, 1997.
40. References
- [12] T. Service. Co-optimization: A generalization of coevolution. Master's thesis, Missouri University of Science and Technology, 2008.
- [13] T. Service and D. Tauritz. Co-optimization algorithms. In Proceedings of GECCO 2008 - the Genetic and Evolutionary Computation Conference, pages 387-388, 2008.
- [14] P. Tonella. Evolutionary testing of classes. In Proceedings of the 2004 ACM SIGSOFT International Symposium on Software Testing and Analysis, pages 119-128, Boston, Massachusetts, 2004. ACM Press.
- [15] S. Wappler and F. Lammermann. Using evolutionary algorithms for the unit testing of object-oriented software. In Proceedings of GECCO 2005 - the Genetic and Evolutionary Computation Conference, pages 1053-1060, Washington, DC, 2005. ACM Press.
- [16] S. Wappler and J. Wegener. Evolutionary unit testing of object-oriented software using strongly-typed genetic programming. In Proceedings of GECCO 2006 - the Genetic and Evolutionary Computation Conference, pages 1925-1932, Seattle, Washington, 2006. ACM Press.
41. Questions?
42. Koza's GP Evolutionary Model
- Back to future work slide
43. Diversity in New Experiments