Title: Verification and Validation of Agent-based and Equation-based Simulations and Bioinformatics Computing
1Verification and Validation of Agent-based and
Equation-based Simulations and Bioinformatics
Computing: Identifying Transposable Elements in
the Aedes aegypti Genome
- Ryan C. Kennedy
- Department of Computer Science and Engineering
- University of Notre Dame
2Verification and Validation of Agent-based and
Equation-based Simulations
3Overview
- Introduction
- Motivation
- Concepts of Verification and Validation
- Research Objectives and Methods
- Case Study I
- An Agent-based Scientific Model
- Case Study II
- An Equation-based Economic Model
- Conclusion
- Future Work
4Motivation
- NSF Blue Ribbon Panel (February 2006)
- New theory and methods are needed for handling
stochastic models and for developing meaningful
and efficient approaches to the quantification of
uncertainties. As they stand now, verification,
validation, and uncertainty quantification are
challenging and necessary research areas that
must be actively pursued.
- Dr. Richard W. Amos
- Deputy to the Commanding General, U.S. Army
Aviation and Missile Command (AMCOM)
- Previously the Director of the System Simulation
and Development Directorate in the Aviation and
Missile Research, Development and Engineering
Center (AMRDEC)
- Verification and Validation
- 10-15% of total cost of model development, but
often overlooked in overall lifecycle
Oden, Simulation-Based Engineering Science:
Revolutionizing Engineering Science through
Simulation
5Model Verification & Validation (V&V)
- V&V
- Verification
- solve the model right
- Validation
- solve the right model
- Cost and value both influence confidence in the model
- Want optimal cost-effectiveness of V&V
Adapted from Sargent, Verification and
Validation of Simulation Models
6Verification and Validation Process
Adapted from Sargent, Verification and
Validation of Simulation Models, and Huang,
Agent-Based Scientific Simulation
7Applicable Verification and Validation Methods
Balci (Handbook of Simulation: Principles,
Methodology, Advances, Applications, and
Practice) lists more than 75 methods
8V&V Subjective Analysis
- Examples of V&V Techniques
- Face Validity
- Animation
- Graphical Representation
- Turing Test
- Internal Validity
- Tracing
- Black-Box Testing
9V&V Quantitative Analysis
- Examples of V&V Techniques
- Docking (Model-to-Model Comparison)
- Historical Data Validation
- Sensitivity Analysis/Parameter Variability
- Prediction Validation
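As one illustration of the sensitivity analysis / parameter variability technique listed above, the following is a minimal sketch of a one-at-a-time parameter sweep. The model (`toy_model`), its parameters, and the +10% perturbation size are all hypothetical stand-ins chosen for illustration; they are not taken from either case study.

```python
import math

def toy_model(rate, dt, steps, initial):
    """Hypothetical stand-in model: first-order decay of a quantity."""
    return initial * math.exp(-rate * dt * steps)

def one_at_a_time_sensitivity(model, baseline, deltas):
    """Perturb one parameter at a time; return the relative output change."""
    base_out = model(**baseline)
    results = {}
    for name, rel_change in deltas.items():
        perturbed = dict(baseline)
        perturbed[name] = baseline[name] * (1.0 + rel_change)
        results[name] = (model(**perturbed) - base_out) / base_out
    return results

# Assumed baseline values and perturbations, for illustration only.
baseline = {"rate": 0.1, "dt": 1.0, "steps": 10, "initial": 1000.0}
deltas = {"rate": 0.10, "initial": 0.10}

sensitivity = one_at_a_time_sensitivity(toy_model, baseline, deltas)
for name, change in sensitivity.items():
    print(f"{name}: {change:+.3%} output change for a +10% input change")
```

A parameter whose small perturbation produces a large output change is a candidate for closer validation, since errors in its value matter most.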
10What and How
- Research objective
- Perform V&V on distinct models and identify the
more cost-effective techniques
- How
- Two very different projects as case studies
- Evaluate and adapt the formalized V&V
techniques in industrial and system engineering
11Case Study I: An Agent-based Scientific Model
- NSF-funded interdisciplinary project
- Understanding the evolution and heterogeneous
structure of Natural Organic Matter (NOM)
- E-science example
- Chemists, biologists, ecologists, and computer
scientists
- Agent-based stochastic model
- Web-based simulation model
12Case Study I: NOM
- What is NOM?
- Heterogeneous mixture of molecules in terrestrial
and aquatic ecosystems
- Why study NOM?
- Plays a crucial role in the evolution of soils,
the transport of pollutants, and the global
carbon cycle
- Understanding NOM helps us better understand
natural ecosystems
- Hard to study in the laboratory
13Case Study I: The Conceptual Model I
- Agents
- A large number of molecules
- Heterogeneous properties
- Elemental composition
- Molecular weight
- Characteristic functional groups
- Behaviors
- Transport through soil pores (spatial mobility)
- Chemical reactions: first order and second order
- Sorption
14Case Study I: The Conceptual Model II
- Stochastic Model
- Individual behaviors and interactions are
stochastically determined by
- Internal attributes
- Molecular structure
- State (adsorbed, desorbed, reacted, etc.)
- External conditions
- Environment (pH, light intensity, etc.)
- Proximity to other molecules
- Length of time step, Δt
- Space
- 2D Grid Structure
- Emergent properties
- Distribution of molecular properties over time
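The conceptual model above can be sketched as a small agent-based simulation. This is a minimal illustration under stated assumptions, not the project's implementation: the `Molecule` fields, the random-walk transport rule, the wrap-around grid, and the conversion of a first-order rate constant k into a per-step probability p = 1 - exp(-k Δt) are all assumptions introduced here.

```python
import math
import random
from dataclasses import dataclass

@dataclass
class Molecule:
    x: int
    y: int
    weight: float            # molecular weight (illustrative attribute)
    state: str = "desorbed"  # e.g. adsorbed, desorbed, reacted

def reaction_probability(rate_constant, dt):
    """Assumed mapping from a first-order rate constant to a per-step probability."""
    return 1.0 - math.exp(-rate_constant * dt)

def step(molecules, width, height, rate_constant, dt, rng):
    """Advance every unreacted molecule by one time step on a 2D grid."""
    p_react = reaction_probability(rate_constant, dt)
    for m in molecules:
        if m.state == "reacted":
            continue
        # Transport: one random move to a neighboring cell (grid wraps around).
        dx, dy = rng.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
        m.x = (m.x + dx) % width
        m.y = (m.y + dy) % height
        # Reaction: stochastic first-order event.
        if rng.random() < p_react:
            m.state = "reacted"

rng = random.Random(42)  # fixed seed so runs are repeatable
molecules = [
    Molecule(rng.randrange(20), rng.randrange(20), weight=rng.uniform(100, 1000))
    for _ in range(500)
]
for _ in range(10):
    step(molecules, 20, 20, rate_constant=0.1, dt=1.0, rng=rng)

reacted = sum(m.state == "reacted" for m in molecules)
print(f"{reacted} of {len(molecules)} molecules reacted after 10 steps")
```

The count of reacted molecules over time is an example of an emergent, population-level property that arises from individual stochastic decisions.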
15Case Study I: Implementations
16Case Study I: Face Validity
17Case Study I: Internal Validity I
18Case Study I: Internal Validity II
19Case Study I: Docking I
- Compare the model with a validated one
- Compare the model with a non-validated one
- Different implementations
- Different programming languages
- Different packages
- Different modeling approaches
- Agent-based approach vs. Equation-based approach
- Powerful method
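Docking an agent-based implementation against an equation-based one can be sketched as follows. This is a toy example, assuming a simple first-order decay process rather than the NOM model itself; the function names, parameter values, and the choice of maximum relative difference as the comparison metric are assumptions made for illustration.

```python
import math
import random

def agent_based_decay(n0, k, dt, steps, seed):
    """Stochastic version: each remaining agent reacts with probability p per step."""
    rng = random.Random(seed)
    p = 1.0 - math.exp(-k * dt)
    remaining = n0
    series = [remaining]
    for _ in range(steps):
        reacted = sum(1 for _ in range(remaining) if rng.random() < p)
        remaining -= reacted
        series.append(remaining)
    return series

def equation_based_decay(n0, k, dt, steps):
    """Deterministic version: analytic solution N(t) = N0 * exp(-k t)."""
    return [n0 * math.exp(-k * dt * i) for i in range(steps + 1)]

def max_relative_difference(a, b):
    """Largest pointwise relative gap between two equally long series."""
    return max(abs(x - y) / y for x, y in zip(a, b))

n0, k, dt, steps = 10000, 0.05, 1.0, 20
runs = [agent_based_decay(n0, k, dt, steps, seed) for seed in range(10)]
mean_abm = [sum(col) / len(col) for col in zip(*runs)]  # average over replications
ebm = equation_based_decay(n0, k, dt, steps)

diff = max_relative_difference(mean_abm, ebm)
print(f"max relative difference between approaches: {diff:.4f}")
```

Averaging several stochastic replications before comparing reduces the chance that random variation in a single run is mistaken for a discrepancy between the two models.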
20Case Study I: Docking II
21Case Study I: Docking III
22Case Study I: Docking IV
23Case Study I: Docking V
24Case Study II: An Economic Model
- Interdisciplinary project
- Initially written in Matlab within the Department of
Finance
- Converted to C by computer scientists
- Equation-based system
- Concerned with identifying ideal economic
variables, such as debt, money growth, and tax
rate
25Case Study II: The Conceptual Model
- Equation-based system
- Nonlinear projection methods used to solve Ramsey
problems in a stochastic money economy
- Goal is to generate the best social welfare for a
given economy
- Motivation
26Case Study II: Face Verification
27Case Study II: Tracing
it 44, af 3.7496e-08, rc 0, timer 11.1, l 0.1382704496, m -0.0092286139, t 0.1881024991, h 0.3093668925 cc1 0.4861695543, cc2 0.6212795130, rl 1.0092221442
it 45, af 2.64653e-08, rc 0, timer 11.0, l 0.1382704643, m -0.0092286175, t 0.1881024947, h 0.3093668931 cc1 0.4861695553, cc2 0.6212795120, rl 1.0092221442

it 44 af 0.00144839 rc 0 l 0.138359 m -0.00936025 t 0.188252 h 0.309338 cc1 0.486205 cc2 0.621244 rl -0.65888
it 45 af 0.00144784 rc 0 l 0.138401 m -0.00937062 t 0.188239 h 0.30934 cc1 0.486208 cc2 0.621241 rl -0.665511
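Traces like the two above can be compared mechanically rather than by eye. The sketch below parses the iteration-45 line of each trace and flags variables whose values differ by more than a tolerance. The parser, the function names, and the tolerance of 1e-3 are assumptions introduced here; which implementation produced which trace is not stated on the slide, so the traces are simply labeled A and B.

```python
def parse_trace_line(line):
    """Turn 'key value key value ...' (commas optional) into a dict of floats."""
    tokens = line.replace(",", " ").split()
    return {key: float(value) for key, value in zip(tokens[0::2], tokens[1::2])}

def compare_traces(a, b, tolerance):
    """For each variable present in both traces, report (abs difference, within tolerance)."""
    report = {}
    for key in a:
        if key in b and key != "it":  # skip the iteration counter itself
            difference = abs(a[key] - b[key])
            report[key] = (difference, difference <= tolerance)
    return report

# Iteration 45 from each trace shown above.
trace_a = parse_trace_line(
    "it 45, af 2.64653e-08, rc 0, timer 11.0, l 0.1382704643, m -0.0092286175, "
    "t 0.1881024947, h 0.3093668931 cc1 0.4861695553, cc2 0.6212795120, rl 1.0092221442"
)
trace_b = parse_trace_line(
    "it 45 af 0.00144784 rc 0 l 0.138401 m -0.00937062 t 0.188239 h 0.30934 "
    "cc1 0.486208 cc2 0.621241 rl -0.665511"
)

report = compare_traces(trace_a, trace_b, tolerance=1e-3)  # assumed tolerance
for key, (difference, ok) in report.items():
    status = "ok" if ok else "DIFFERS"
    print(f"{key:>4}: |A - B| = {difference:.6g}  {status}")
```

Variables that appear in only one trace (such as `timer`) are left out of the report, so the comparison covers only quantities both implementations print.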
28Case Study II: Docking
29Case Study II: Performance
30Summary & Conclusion
- Applied V&V techniques to distinct case studies
to increase model confidence
- Some techniques are more cost-effective
31Future Work
- More in-depth survey of V&V methods
- More rigorous quantitative methods
- Compare simulation results against empirical data
- Invalidation Testing
- More general and formalized V&V process model
32Bioinformatics Computing: Identifying
Transposable Elements in the Aedes aegypti Genome
33Overview
- Introduction
- Motivation
- Basic Biological Concepts
- Bioinformatics
- Aedes aegypti
- Transposable Elements
- Approaches to Identifying Transposable Elements
- Conclusion
- Future Work
34Motivation
- Bioinformatics field is rapidly growing
- Computer scientists can help advance its study
- A better understanding of the biology of
organisms would be helpful to scientists
- Transposable elements can be useful tools to
scientists
- Computer scientists can help biologists develop
advanced techniques to find transposable elements
35Biological Foundations
- All cells contain DNA, RNA, and protein molecules
- DNA
- Composed of four nucleotides
- Building block of life
- RNA
- Carries genetic information from DNA throughout a cell
- Protein
- Laborer of the cell
- Central Dogma of Molecular Biology
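The flow of information in the Central Dogma (DNA to RNA to protein) can be sketched in a few lines. This is a simplified illustration: transcription is modeled as replacing T with U on the coding strand, and `CODON_TABLE` is a deliberately partial table containing only a handful of standard codons; unknown codons are reported as "X". The function names are chosen here for illustration.

```python
# Partial codon table (standard genetic code); "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M",  # methionine (start)
    "GCC": "A",  # alanine
    "UUU": "F",  # phenylalanine
    "GGA": "G",  # glycine
    "UAA": "*", "UAG": "*", "UGA": "*",
}

def transcribe(dna):
    """DNA coding strand -> mRNA (simplified: T becomes U)."""
    return dna.upper().replace("T", "U")

def translate(mrna):
    """mRNA -> protein, reading codons from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")  # X = not in partial table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

dna = "ATGGCCTTTTAA"
mrna = transcribe(dna)
protein = translate(mrna)
print(dna, "->", mrna, "->", protein)
```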
36Bioinformatics
- Collective study of numerous fields and
techniques to solve biological problems
- Focused on the study of DNA and its underlying
characteristics
- Computer science lends itself well to
bioinformatics
37Bioinformatics Research Topics
- Genome Annotation
- Assigning biological meaning to regions of a
sequence
- Sequence Alignment
- Comparing two or more sequences
- Sequencing
- Finding the structure of a given sequence
- Genome Assembly
- Assembling many short sequences of DNA
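Sequence alignment, one of the topics above, can be illustrated with a small dynamic-programming sketch. The code below computes a global alignment score in the style of Needleman-Wunsch; the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration, and production tools such as BLAST use faster heuristics and richer scoring schemes.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of sequences a and b (row-by-row dynamic programming)."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix costs i gaps
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

print(global_alignment_score("ACGT", "ACGT"))  # identical sequences
print(global_alignment_score("ACGT", "AGT"))   # one base missing in the second sequence
```

Only two rows of the scoring matrix are kept in memory, which is enough to obtain the score; recovering the alignment itself would require storing the full matrix or traceback pointers.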
38Bioinformatics Tools
- Perl
- BioPerl
- BLAST
- Popular alignment tool
- Hidden Markov Model
- Clustal X
- Phylogenetic Tree
- Relationships between sequences
- Bioinformatics Collaboratories
- NCBI, Ensembl, VectorBase
39Aedes aegypti
- Tropical Mosquito
- Vector for dengue and yellow fever viruses
- Its unannotated genome was recently released
- Much larger genome than that of other mosquitoes
40Transposable Elements
- Often referred to as jumping genes
- Can make up large portions of a genome
- Can transfer genetic material
- Useful when performing evolutionary studies
- Typically divided into Class I, Class II, and
Class III elements
41Transposons
- Class II transposable elements
- Divided into many families
- piggyBac, Tc1, pogo, mariner, P element
- Typical structure of a transposon
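The structure figure from this slide is not reproduced in the transcript. Assuming the commonly described layout of a Class II transposon, in which the element is flanked by terminal inverted repeats, a simple check for that feature might look like the sketch below. The function names, the example sequence, and the requirement of an exact match are assumptions made for illustration; real inverted repeats are often imperfect.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def has_terminal_inverted_repeats(seq, repeat_length):
    """True if the first repeat_length bases equal the reverse complement of the last ones."""
    seq = seq.upper()
    if len(seq) < 2 * repeat_length:
        return False
    return seq[:repeat_length] == reverse_complement(seq[-repeat_length:])

# Hypothetical element: 5-base repeat, short internal region, inverted copy of the repeat.
element = "CAGTG" + "ATGAAACCC" + "CACTG"
print(has_terminal_inverted_repeats(element, 5))
```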
42Typical Approach
- BLAST known transposons against a new genome
- Good for identifying known or similar transposons
in new genomes
- Does not account for sequence variations
43Approach I
- Focused on identifying P elements
- Utilized multiple tools and scripts
- Able to identify previously unknown transposons
- Clustal X and the HMMER suite allowed us to
perform a more thorough search
- Cannot account for frame shifts
45Approach II
- Used for five families of transposons
- Utilized GeneWise
- Did not search for new transposons
46Hybrid Approach: A Transposable Element Discovery
Methodology
- Proposed approach
- Utilize better aspects of first two approaches
- Can be used for all families described in this
study
47Phylogenetic Tree
- mariner family
- Clustered clades indicate close relationships
48Summary & Conclusion
- Found a reasonable number of transposons
- Utilized novel approaches to finding transposons
- First such study using this type of approach on
the Aedes aegypti genome
- Proposed a hybrid approach
49Future Work
- Utilize hybrid approach
- Automate process
- Comparison of transposable elements found in
Aedes aegypti and Anopheles gambiae
50Questions or Comments?