Title: SIW Analysis Data Collection Working Group
1EMF Agenda
- 0800 - 0810 Introduction
Exercise Management Slot
- 0810 - 0840 Paper 043
- 0840 - 0910 Paper 138
- 0910 - 0940 Paper 144
- 0940 - 1000 Discussion
- 1000 - 1030 Break
FEDEP Automation Slot
- 1030 - 1100 Paper 049
- 1100 - 1130 Paper 092
- 1130 - 1200 Paper 184, Discussion
- 1200 - 1330 Lunch
Software Reuse Slot
- 1330 - 1400 Paper 168
- 1400 - 1430 Paper 181
- 1430 - 1500 Paper 210
- 1500 - 1530 Break
Data Collection Slot
- 1530 - 1600 Paper 129
- 1600 - 1630 Discuss Wed's DC session
- 1630 - 1730 Wrap-Up Session
2Data Collection Working Group
Fall Simulation Interoperability
Workshop September 1998
- 129 A Progress Report Recommended Practices
for Data Collection in HLA and ADS
Environments
- Discussion of Joint DC Sessions
3Origin
- 1997F-SIW Outbrief Summary Point 3
- "The Analysis Forum should investigate data
collection and analysis in distributed
simulation."
- Explanation: Data collection continues to be a
primary concern of the Analysis Forum and
significant work has occurred in this area by
various groups over the past year.
- Recommendation: The Analysis Forum should create
a working group to investigate data collection
and analysis in ADS using HLA.
4Charter
- Serve as a focal point for the discussion and
eventual formulation of guidelines or recommended
practices for data collection in HLA and in other
ADS environments.
5Tasks
- Review Existing Documentation
- 1278.3 Rationale
- AE Tiger Team report
- Recent SISO papers
- DIS DLIF
- Solicit Available HLA Data Collection Experiences
- Compile Lessons Learned
- Draft Guidelines
6Data Collection WG Strategies
- Workshops
- update on current status
- solicit papers on specific data collection topics
- Interim meetings
- focus on workgroup tasks
- Reflector/e-mail
- drafting guidelines
- debate and consensus building
- ANL SISO reflector
- ANL web page
- www.trac.nps.navy.mil/SIW_ANL
- Join group? email to neubergt_at_cna.org
7FSIW Data Collection Working Group
- Sponsored by ANL and EM Fora
- Special Sessions Tuesday, Wednesday, Thursday
- Paper 114 (Robert Michael Senko): Data Verification
Interactive Editor
- Paper 161 (Stephen Thor Berglie): DC in the
Integrated Ship Defense Federation
- Paper 199 (Brian Higgins/Don-May Lee): Advanced DC
Analysis Tool for HLA Federation
- Paper 207 (Jack Harrington): Run-Time Monitoring and
Analysis Tool for HLA Enable World (MATHEW)
- Paper 209 (Paul Brian Perkinson): Full FEDEP
Life-Cycle Data Management System
- Paper 255 (Jeff Opper): Federation-Neutral
Interchange Specification for Logged Simulation Data
- Paper 193 (Lee Lacy): Interchanging Simulation Data
using XML
- Paper 129 (Tom Neuberger): A Progress Report
Recommended Practices for DC in HLA and other ADS
Environments
8Collective Knowledge Recommended Practices?
- Data collection is not just about what happens,
but also about why. Causality is critical to many
applications.
9Discussion Topics
- What?
- Types of data
- Why?
- Entire simulation life-cycle
- How?
- General approach
- Distribution of data collection
- Channeling of DC efforts
- Management and monitoring
- Data credibility
- Storage
- HLA Challenges
10What Data?
- A data management system that provides access to
federation data required to answer difficult
operational questions and identify complex
relationships. If end-users cannot answer key
questions concerning simulation execution, the
collection system will be judged a failure.
11What Data? (Not just automatic data from RTI)
- Data automatically generated from simulation
- RTI generated data
- Specialized model output files
- Other electronically recorded data
- Manually collected data
- Formal data collectors
- Comments from observers/trainers/participants/subject
matter experts
- Operational data
- Information from C4I systems
- Federation and network performance data
12Why Collect Data?
Data collection is an important part of each step
of a simulation life cycle, to include
preprocessing (data preparation and management of
parameter files), run-time services (real-time
monitoring, logging), and post-processing (format
for analysis and replay). The cost of designing,
integrating, and executing a distributed
simulation makes reliable data collection,
analysis, and replay a necessity. Creating an
operational data store to capture federate data
provides analysts with the ability to answer
difficult questions and identify relationships
that could not be accomplished using traditional
loggers.
13Why? Prior to Exercise Execution
- Data preparation and management of parameter
files
- Federation development, testing, and management
- Integration testing, debugging, and dress
rehearsal
- Playback proxy if federations fail
14Why? Concurrent with Exercise Execution
- Monitor system via real-time exercise displays
and provide playback and other products
- immediate feedback to program leadership,
exercise management, analysts
- exercise credibility
- event reconstruction (focus on high interest
trigger events)
- session management decisions
- intel updates/BDA to feed subsequent exercise
execution
- Operational assessment of scenario interactions
15Why? Post Execution
- Formal AAR/Feedback
- reconstruct major events
- identification of driving issues
- focusing long-term analysis
- calculation of selected measures
- Support for Analysis Hot Wash
- Detailed Analysis
- calculation of analytical measures (MOE, etc.)
- exploration of major issues
- VV&A
- Data Archiving
16How to Collect Data?
- General approach
- Distribution of efforts
- Channeling of data collection
- Management and monitoring
- Data verification and credibility
- Storage
17How? General Approach
- Unfocused collection of all federation execution
details
- Focused collection for specific analytical (or
other) purpose
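The difference between the two approaches can be pictured as the presence or absence of a filter in front of the collector. The following is only a minimal sketch: the `Update` record layout, the class and attribute names, and the `collect` helper are all hypothetical illustrations and are not part of any RTI interface.

```python
from typing import Callable, Iterable, NamedTuple


class Update(NamedTuple):
    """Hypothetical attribute update as a collector might see it."""
    time: float
    object_class: str
    attribute: str
    value: object


def collect(updates: Iterable[Update],
            wanted: Callable[[Update], bool] = lambda u: True) -> list:
    """Keep every update by default (unfocused); pass a predicate to focus."""
    return [u for u in updates if wanted(u)]


# Illustrative traffic only; class and attribute names are made up.
traffic = [
    Update(1.0, "Aircraft", "position", (10, 20)),
    Update(1.5, "Ship", "position", (3, 4)),
    Update(2.0, "Aircraft", "fuel", 0.8),
]

# Unfocused: record all federation execution details.
everything = collect(traffic)

# Focused: only the data needed for a hypothetical aircraft-track measure.
tracks = collect(
    traffic,
    lambda u: u.object_class == "Aircraft" and u.attribute == "position",
)
```

The focused form stores less, but any question not anticipated by the predicate cannot be answered afterwards, which is the trade-off this slide raises.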
18How? General Approach
- How to access private model data?
- Separate network for transfer of model output
files
- Use HLA approach and special comprehensive object
model (Data Object Model, DOM?) or SOM to specify
non-public data collection requirements, formats,
and transfer processes
19How? Distribution of DC efforts
- The data collection process can be broken up into
two basic steps:
- Division of the relevant exercise data among the
collector applications for subscription (to avoid
redundant collection while ensuring complete
data subscription coverage)
- Insertion of all collected data into
centralized binary files in the data repository
- The location of the data collection systems must
take into account the desire to minimize network
traffic.
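The first step above, dividing the exercise data among collectors without gaps or overlap, can be sketched as a subscription plan plus a coverage check. This is an illustration under stated assumptions: the round-robin rule, the collector names, and the object class names are hypothetical, and a real federation would more likely divide by expected traffic volume than by class count.

```python
from itertools import cycle


def assign_classes(classes, collectors):
    """Round-robin each object class to exactly one collector."""
    plan = {name: [] for name in collectors}
    for object_class, name in zip(sorted(classes), cycle(collectors)):
        plan[name].append(object_class)
    return plan


def check_coverage(plan, classes):
    """Return (missing, duplicated) class sets for a subscription plan."""
    seen, duplicated = set(), set()
    for assigned in plan.values():
        for object_class in assigned:
            if object_class in seen:
                duplicated.add(object_class)
            seen.add(object_class)
    return set(classes) - seen, duplicated


# Hypothetical federation object classes and collector names.
classes = ["Aircraft", "Ship", "Tank", "Missile", "Radar"]
collectors = ["collector_a", "collector_b"]

plan = assign_classes(classes, collectors)
missing, duplicated = check_coverage(plan, classes)
```

An empty `missing` set means coverage is complete, and an empty `duplicated` set means no class is collected twice.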
20How? Distribution of DC efforts
- Data collection network traffic can be driven by
- Design to distribute all data to a central
location
- Transferring data to remote (or centralized)
location for use by analysts and other users
- While technological advances in relational
databases have made a distributed data store
possible, centralizing the data store
allows for more flexibility during analysis and
can reduce the amount of redundantly stored data,
thereby reducing hardware, software, and network
resource needs.
21How? Channeling of DC efforts
- Multicast communications technology can be used
by the RTI to create multiple channels that can
be exploited for data collection. A specialized
data collection LAN/WAN may help keep traffic
among data collection systems, and between data
collection systems and the AAR system, separate
from the simulation traffic.
- Dynamic load balancing can be used to partition
simulation data among the distributed collectors
as the scenario evolves while still reducing data
loss.
- Decoupling the data collection and data load
processes is necessary to ensure minimal data loss
without encumbering other data flow requirements
(queries, main data loading)
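One common way to decouple collection from loading is a bounded buffer between a collector callback and a separate loader thread. The sketch below assumes that pattern; it is not taken from any of the session papers. The names `on_update`, `loader`, and `repository`, and the buffer size, are hypothetical.

```python
import queue
import threading

buffer = queue.Queue(maxsize=1000)   # bounded hand-off between the two stages
repository = []                      # stand-in for the data repository
dropped = 0                          # updates lost because the buffer was full


def on_update(record):
    """Collector side: never block on the loader; count what is lost."""
    global dropped
    try:
        buffer.put_nowait(record)
    except queue.Full:
        dropped += 1


def loader():
    """Loader side: drain the buffer at its own pace until told to stop."""
    while True:
        record = buffer.get()
        if record is None:           # sentinel: collection has finished
            break
        repository.append(record)


worker = threading.Thread(target=loader)
worker.start()
for i in range(500):                 # simulated burst of updates
    on_update({"seq": i})
buffer.put(None)
worker.join()
```

Because the collector only enqueues, a slow database load delays storage rather than collection, and the `dropped` counter makes any loss visible instead of silent.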
22How? Management and monitoring
- Use of configuration files to set up data
collection systems simplifies the process and
reduces the chance for errors.
- A GUI to monitor the data collector status can
assist with the data subscription/load balancing
process.
- Scalability: the data collection design should
include an ability for the control system to add
collectors if required without
increasing system overhead.
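As an illustration of configuration-driven set-up, the sketch below reads collector settings from an INI-style text and rejects incomplete entries before the exercise starts. The section names, keys (`host`, `subscribe`), and addresses are hypothetical; no particular collection tool's file format is implied.

```python
import configparser

# Hypothetical settings file; section and key names are illustrative.
CONFIG_TEXT = """
[collector_a]
host = 10.0.0.11
subscribe = Aircraft, Missile

[collector_b]
host = 10.0.0.12
subscribe = Ship, Tank
"""


def load_collectors(text):
    """Parse collector settings and reject incomplete sections early."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    collectors = {}
    for name in parser.sections():
        section = parser[name]
        for key in ("host", "subscribe"):
            if key not in section:
                raise ValueError(f"{name}: missing '{key}'")
        classes = [c.strip() for c in section["subscribe"].split(",") if c.strip()]
        collectors[name] = {"host": section["host"], "subscribe": classes}
    return collectors


collectors = load_collectors(CONFIG_TEXT)
```

Under this layout, adding a collector is one more section in the file rather than a code change, which is the scalability point made above.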
23How? Management and monitoring
- The RTI can be used to ensure that the data
collection process provides non-overlapping, yet
complete data coverage.
- Ad hoc query and data visualization capabilities
are a must for taming the mountains of data
generated during federation executions.
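To show what an ad hoc query against collected data might look like, the sketch below loads a few invented updates into an in-memory SQLite table and asks a question that was not planned in advance. The table layout, federate names, and values are hypothetical and stand in for whatever schema a federation derives from its FOM.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE updates ("
    "time REAL, federate TEXT, object_class TEXT, attribute TEXT, value TEXT)"
)

# Invented sample data for illustration only.
rows = [
    (1.0, "air_sim", "Aircraft", "position", "10,20"),
    (1.5, "sea_sim", "Ship", "position", "3,4"),
    (2.0, "air_sim", "Aircraft", "fuel", "0.8"),
    (2.5, "air_sim", "Aircraft", "position", "11,21"),
]
db.executemany("INSERT INTO updates VALUES (?, ?, ?, ?, ?)", rows)

# Ad hoc question: how many updates did each federate publish per class?
summary = db.execute(
    "SELECT federate, object_class, COUNT(*) FROM updates "
    "GROUP BY federate, object_class ORDER BY federate, object_class"
).fetchall()
```

A flat logger file would need a purpose-built script for each such question; a queryable store answers it with one statement.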
24How? Management and monitoring
- The goal of collection should be to passively
collect as much information as possible about the
simulation without impacting the federation
performance. The best approach may be
collection federates.
- The FOM in OMT format should serve as the
foundation for the data collection and analysis
processes.
- Data management systems must be flexible,
scalable, and use commercial tools when possible
in order to reduce development cost and meet
scheduled milestones.
25How? Data verification and credibility
- Fault tolerance designs must be included in the
data collectors, to include notification of
unacceptable data loss, restart of failed
software, re-subscription procedures, and back-up
systems to cover for hardware failures.
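Notification of unacceptable data loss requires some way to measure loss. One possible approach, sketched below, assumes each sender stamps its updates with consecutive sequence numbers; that stamping, the function names, and the 1 percent threshold are assumptions for illustration, not something the source prescribes.

```python
def find_gaps(sequence_numbers):
    """Return (first_missing, last_missing) ranges in a received sequence."""
    gaps = []
    expected = None
    for number in sorted(set(sequence_numbers)):
        if expected is not None and number > expected:
            gaps.append((expected, number - 1))
        expected = number + 1
    return gaps


def loss_is_unacceptable(sequence_numbers, max_loss_fraction=0.01):
    """Flag a stream whose missing share exceeds a hypothetical threshold."""
    received = set(sequence_numbers)
    if not received:
        return True
    span = max(received) - min(received) + 1
    return (span - len(received)) / span > max_loss_fraction


received = [1, 2, 3, 6, 7, 10]
gaps = find_gaps(received)
alarm = loss_is_unacceptable(received)
```

This check only sees gaps between the first and last numbers received, so loss at the very start or end of a stream would need a separate heartbeat or end-of-stream marker.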
26How? Data storage
- Data storage needs to be optimized for the
expected analysis tasks which frequently evolve
during the course of an exercise
27HLA Challenges
- The same capabilities that make HLA a potent
environment for distributed simulation execution
complicate data collection and replay.
28HLA Challenges
- Data collection federates impact bandwidth
requirements
- Local logging presents correlation and reduction
challenges while limiting real-time analysis
- Dynamic nature of update regions poses problems
for complete data subscription coverage
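The correlation challenge of local logging can be illustrated by merging per-federate logs into one timeline. The sketch below assumes each local log is already sorted and that all federates share a common time base; the log entries and federate names are invented. In practice, differing clocks are a large part of the difficulty and are not addressed here.

```python
import heapq

# Hypothetical per-federate logs, each already ordered by simulation time.
air_log = [
    (1.0, "air_sim", "Aircraft 7 takes off"),
    (4.0, "air_sim", "Aircraft 7 fires"),
]
sea_log = [
    (2.5, "sea_sim", "Ship 3 detects Aircraft 7"),
    (4.2, "sea_sim", "Ship 3 hit"),
]

# Merge the sorted logs lazily into a single time-ordered sequence.
timeline = list(heapq.merge(air_log, sea_log, key=lambda entry: entry[0]))
```

Because the merge can only run once the local files have been gathered, this approach supports post-execution analysis but not the real-time analysis noted above.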
30Major Issues
- Logger Data Interchange Format (LDIF)
- Federate independent approach
- Future standard?
- Redundancy and Efficiency (RTI Limitations)
- Overhead of DC federate because not passive
- Some overlap in DC inevitable
- Handling opaque data from RTI
- Accessing private model data in addition to
public simulation data
- Same process used to access RTI data?
31Major Issues (cont)
- Analysis tools (LDIF processors) are federation
specific; isolate them from data collection
- Dynamic load balancing necessary to optimize DC
assets and reduce unnecessary flow
- Awareness of network/federation performance to
provide context for analytical results
- network outages and latency error impact
- DC linked to entire FEDEP life-cycle
32Major Issues (cont)
- Documentation of DC specifics?
- FOM/SOM/DOM/other documentation?
- Rigorously enforce compliance
- Need common lexicon for data
- Major need for overarching DC guidance vice
federation specific work-arounds
- XML shows promise for structuring data collection
language and format
- Analysis tools (LDIF processors) federation
specific
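As a rough picture of how XML could structure logged data, the sketch below writes one update as XML and reads it back. The element and attribute names (`update`, `object`, `attribute`) are purely illustrative; they are not the LDIF or any format proposed in the session papers.

```python
import xml.etree.ElementTree as ET


def to_xml(record):
    """Serialize one logged update; element names are illustrative only."""
    update = ET.Element("update", time=str(record["time"]),
                        federate=record["federate"])
    obj = ET.SubElement(update, "object", {"class": record["object_class"]})
    for name, value in record["attributes"].items():
        attribute = ET.SubElement(obj, "attribute", name=name)
        attribute.text = str(value)
    return ET.tostring(update, encoding="unicode")


def from_xml(text):
    """Parse the same layout back into a dictionary of strings."""
    update = ET.fromstring(text)
    obj = update.find("object")
    return {
        "time": update.get("time"),
        "federate": update.get("federate"),
        "object_class": obj.get("class"),
        "attributes": {a.get("name"): a.text for a in obj.findall("attribute")},
    }


# Invented record for illustration.
record = {"time": 12.5, "federate": "air_sim", "object_class": "Aircraft",
          "attributes": {"fuel": 0.8, "callsign": "Viper"}}
xml_text = to_xml(record)
restored = from_xml(xml_text)
```

Values come back as strings, so a real interchange format would also need to carry type information, for example by reference to the FOM.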
33Major Issues (cont)
- DC federates may be implemented on a separate
network to minimize run-time impact
- Separate model output files may be best approach
for legacy models
- Major data filtering challenges
- reduce data flow
- limit unnecessary collection and storage
- Challenge to make DC both federation and platform
independent
34In closing ...
- Data collection is not just about what happens,
but also about why.
- Can you
- resolve your DC issues through the experiences of
others?
- provide valuable DC lessons and solutions to help
others?
- contribute to the DCWG efforts?
35Get Involved with DCWG?
- ANL SISO reflector
- ANL web page
- www.trac.nps.navy.mil/SIW_ANL
- Join group? email to neubergt_at_cna.org
36Data Collection Working Group
Fall Simulation Interoperability
Workshop September 1998
- Analysis Forum Tuesday
- Focus Session Wednesday Morning
- Exercise Management Forum Thursday
- Tom Neuberger
- Paper 129 A Progress Report Recommended
Practices for Data Collection in HLA and other
ADS Environments
37Recent Accomplishments
- Conducted interim meeting
- Reviewed "Why collect data" slides
- Applicability of technical sim network DC
- Discussed recommended long-term structure for
DCWG
- Assigned action items for completion prior to
Fall SIW
- DCWG presence on ANL forum web site
- Special session with focus papers for FSIW98
- Tentative start on recommended practices