Title: Software Engineering and Systems Science
1Software Engineering and Systems Science
- David Stotts
- Computer Science Department
2Environmental Modeling
- Atmosphere Mechanics and Chemistry, Meteorology, Hydrology, Soil Chemistry, Geology, Marine Biology, Civil Engineering
- Different research communities, specialties
- Different models: different mathematics, different abstractions and domains of discourse, different software architectures
- Limited interactions, difficult information exchange
3Multi-media
- To EnviSci this means DIRT and AIR and WATERsheds in one analytical context
- Currently via manual data communications
- We ask you for some rainfall data from your atmosphere model
- We take it and make an input file in the format my soil model needs to generate runoff data for an estuary model...
- Further complications: to get me some rain data, you may need some evaporation data from my soil model (data dependency loops)
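The dependency loop described above can be detected mechanically before any model runs. The following is a minimal sketch, assuming each model simply declares which other models it needs data from; the model names and the `needs` mapping are hypothetical and chosen only to mirror the rain/evaporation example.

```python
from graphlib import TopologicalSorter, CycleError


def execution_order(needs):
    """Return a run order for the models, or raise CycleError on a loop.

    `needs` maps each model name to the set of models whose output it needs.
    """
    return list(TopologicalSorter(needs).static_order())


# Hypothetical one-way chain: air feeds soil, soil feeds estuary.
chain = {"air": set(), "soil": {"air"}, "estuary": {"soil"}}
order = execution_order(chain)

# Hypothetical loop: the air model also needs evaporation from the soil model.
loop = {"air": {"soil"}, "soil": {"air"}, "estuary": {"soil"}}
try:
    execution_order(loop)
    has_loop = False
except CycleError:
    has_loop = True
```

A detected loop means the models cannot run once in sequence; they have to exchange data repeatedly, which is the situation a federation has to manage.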
4Multi-media Modeling
Air model
Water model
soil model
5Not so fast...
- spatial scale mismatch
- sq.meters vs. sq.miles vs. cu.cm
- time scale mismatch
- hours per cycle vs. minutes vs. centuries
- math method mismatch
- finite elements vs. PDEs vs. genetic algs vs. simulated annealing
- units mismatch
- we make ints, you want floats
- error tracking, bounding during execution
6Multi-media Modeling
file of int, 5 per line
Air model
file of float, one per line with 3 ints after
5 minute steps
Access DB out
Water model
Unix pipe of char in
3 hour steps
soil model
1 month steps
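A format mismatch like the one in this diagram is usually bridged by a small filter between producer and consumer. The sketch below assumes a producer that writes integers five per line and a consumer that wants one float per line. The "3 ints after" part of the consumer format is left out because the slide does not say what those integers are, and the function name is hypothetical.

```python
import io


def ints_to_floats(src, dst):
    """Copy whitespace-separated ints from `src` to `dst`, one float per line.

    Returns the number of values written.
    """
    count = 0
    for line in src:
        for token in line.split():
            dst.write(f"{float(int(token))}\n")
            count += 1
    return count


# In-memory stand-ins for the producer's output file and the consumer's input.
producer_out = io.StringIO("1 2 3 4 5\n6 7 8 9 10\n")
consumer_in = io.StringIO()
written = ints_to_floats(producer_out, consumer_in)
```

Because the filter only relies on file-like objects, the same function could sit on a Unix pipe or a regular file without change.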
7Model Federations
- No scientist has expertise for the whole
- Scientists must be able to develop components individually, i.e., not be software engineers working as a cohesive team
- Component models must interoperate
Air model
Water model
soil model
8NC WeatherScope
- WRAL-TV weather forecasts for the next half-day done with a federation
- Work of McHenry, Coates, et alia at MCNC/NCSC
- 3 models running in a federation
9NC WeatherScope
- Runs on 3 hosts, collects data and generates images too
- Hand-knit with 80,000 lines of Unix shell script
- Provides jet-stream, precipitation, air-quality forecasts modified/enhanced by emissions/particulate data
10Model Federations
Mismatch management module
Air model
5 minute steps
Water model
3 hour steps
soil model
1 month steps
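One job of a mismatch management module is bridging time scales, for example feeding 5-minute output into a model that steps every 3 hours. The sketch below is one possible approach under stated assumptions: the function name is hypothetical, samples are combined by summing (reasonable for accumulated quantities such as rainfall; a mean would suit quantities such as temperature), and a trailing partial window is dropped.

```python
def aggregate(samples, fine_minutes, coarse_minutes, combine=sum):
    """Collapse fine-grained samples into one value per coarse time step.

    Any trailing samples that do not fill a whole coarse step are dropped.
    """
    if coarse_minutes % fine_minutes != 0:
        raise ValueError("coarse step must be a multiple of the fine step")
    per_window = coarse_minutes // fine_minutes
    usable = len(samples) - len(samples) % per_window
    return [
        combine(samples[i:i + per_window])
        for i in range(0, usable, per_window)
    ]


# Six hours of made-up 5-minute values: 72 samples, 36 per 3-hour step.
fine = [1.0] * 72
coarse = aggregate(fine, fine_minutes=5, coarse_minutes=180)
```

Going the other way (3-hour values into a 5-minute model) needs interpolation or repetition instead, which is a separate design decision for each pair of models.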
11The Question
- Can we automate some good chunk of the hand-knitting so that new federations can be assembled more easily?
12The Issues
- Mismatch management
- manual filter production, menu of existing common ones
- Execution mapping to hosts
- network use, distributed data
- Model execution/outputs bookkeeping
- Support for collaborations
- expertise sharing and leveraging
13Project Goals
- Modular framework for scientific model interconnection and interoperation
- Methods of composing models
- Support for distributed model data and net-based model execution
14JDeCo Federation
- Java-based Federation framework
- XML meta data describes the federation: connections among components, filters and mismatch management modules
- JDeCo1 uses RMI for multi-machine executions
- JDeCo2 uses Globus grid services
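To make the idea of XML federation metadata concrete, here is a small illustration. The slides do not show JDeCo's schema, so every element and attribute name below (`federation`, `component`, `filter`, `connection`, `from`, `via`, `to`, `step`, `kind`) is invented for this sketch and should not be read as JDeCo's actual format. The code only shows how such a description could be loaded and checked for dangling references.

```python
import xml.etree.ElementTree as ET

# Hypothetical federation description; NOT the real JDeCo schema.
FEDERATION_XML = """
<federation name="example">
  <component id="air" step="5min"/>
  <component id="water" step="3h"/>
  <filter id="air_to_water" kind="time-aggregate"/>
  <connection from="air" via="air_to_water" to="water"/>
</federation>
"""


def load_connections(xml_text):
    """Return (source, filter, target) triples, validating every reference."""
    root = ET.fromstring(xml_text)
    components = {c.get("id") for c in root.findall("component")}
    filters = {f.get("id") for f in root.findall("filter")}
    links = []
    for conn in root.findall("connection"):
        src, via, dst = conn.get("from"), conn.get("via"), conn.get("to")
        if src not in components or dst not in components:
            raise ValueError(f"unknown component in connection {src} -> {dst}")
        if via is not None and via not in filters:
            raise ValueError(f"unknown filter {via}")
        links.append((src, via, dst))
    return links


links = load_connections(FEDERATION_XML)
```

Keeping the wiring in data rather than in shell script is what lets a framework check a federation before running it.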
15Neuse River Study Area
16New Areas
- Human lung model: biochemistry, fluid dynamics, protein motors and cilia mechanics, cell physiology for cystic fibrosis research
17Carolina Environmental Bioinformatics Research
Center
- EPA center for toxicology research
- 3 main projects at UNC (one in CS)
- 5 years, $4.5 million
- Software infrastructure for systems science
18New Areas
- Bioinformatics
- searching genome databases with disparate set of tools
- each tool used for a partial solution, and one tool producing as output the data needed as input for another
19Information Flow in Systems Toxicology
20Research Problems
- Software technology
- Data source/generator federations
- Interconnection frameworks
- Execution platforms
- Grid computing
- Parallel algorithms and supercomputing
- Data access and management issues
- Data mining
- Data formats and interchange
- Common abstractions for scientific communities
21BREAK
22Distributed Pair Programming and the Transparent
Video FaceTop
- David Stotts
- Dept. of Computer Science
- Univ. of North Carolina at Chapel Hill
- August 2005
23Extreme Programming (XP)
- Kent Beck, late 90s
- One of the Agile software development processes
- change happens, deal with it
- Small teams (2 to 10 people)
- Modest sized projects
If we know something is a good idea, let's take it to the extreme
24XP Practices
- Test-first development
- Simple design (add no function before its time)
- Re-factoring
- User/client on-site
- Planning game
- 100% regression tests
- run end of each day
- No overtime
best known for Pair programming
26Pair Programming Benefits
- Increased productivity
- 15% increase in person-hours
- 45% reduction in clock hours
- Fewer errors
- Better program structure
- Increased engineer satisfaction
- Increased client satisfaction (for full XP)
27Teleworking Boom
- Half of American information workers work offsite at least part of the time
- Companies are looking at hot bunking as a way of controlling real estate costs and increasing productivity of the plant they are paying for
- National area of technical emphasis
28Distributed Pair Programming
- Assume you must develop across the wire
- If we use synchronous collaboration
- If we organize as pairs
Can we effectively develop good software? Will collocated PP benefits still hold? Can we apply all XP practices? Should we?
29Current Studies
- COTS software: how far can we get?
- Screen sharing (pcAnywhere, VNC)
- Audio/text exchange (Yahoo messenger)
- No video (bandwidth concerns)
- Broadband speed or better
30Findings
- "Distance matters" is a truism among collaborative systems researchers (Olson & Olson)
- Studies of graduate programmers (136, 16, 12, 2)
- Paired, non-paired, co-located, distributed
We are finding that distance doesn't always matter, or doesn't matter as much as thought
31Findings
- dPP software development over the wire is
- feasible,
- effective,
- affordable (COTS),
- pleasant for the participants
- Small-scale distributed development is better done as pairs (synchronous) than as individuals who integrate
- Synchronous pairing engenders better teamwork and communications for remote work
32Findings
- dPP programs were equal in quality to those of co-located pairs (also distributed non-paired teams)
- Distributed pairs maintain many of the advantages of collocated pairs (fewer errors, etc.)
- Distributed pairs maintain many of the pair effects (pair pressure, pair learning, two brains) observed in co-located PP
33Anecdotal Observation
- A student told me he had participated in 2 PP experiments: one co-located, one distributed
- He said he and his partner were more efficient in the dPP development, and that they did more chit-chatting in co-located development
- Not sure if this is social, or technical
- Or not real
- An IBM engineer confirmed a similar phenomenon there
34Video-Enhanced Environment
- In all studies, dPP programmers say
- we need a whiteboard
- we miss facial expressions
- we can't point at what we are sharing
- We are constructing a video-enhanced dPP
environment to address these issues
35The Basic Facetop
- Transparency combined with user self-view
- Uniquely integrates video conferencing with desktop information and applications
- Works as a single-user PC interface
- Works as dual-user collaboration support
- Effective in projected context
43Dual-user Collaborative FaceTop
50No Registration Issues
- No registration worries
- Object tracking (fingertip) allows camera motion
- User self-registers 30 fps
- For natural pointing, camera must be generally in front of user and oriented face-on
51 Early evaluations
- Had pairs that had done non-video dPP try Facetop for dPP
- Results very encouraging: no team found the system objectionable
- All found it preferable to non-video
- All found the pointing to be natural and effective
52 Other evaluations
- Had two people with hearing-impairments try it
- Asked them to attempt signing and lip reading via Facetop to discuss document on the screen
- No problems signing (18 fps on 100 Mb switched dept ethernet)
- Lip reading not possible yet (needs more fps)
- Tic-Tac-Toe collaborative chess coming
53 Studies
- User studies
- Controlled dPP studies in fall
- Medical imaging group
- Architects, blueprints
- Emergency room use
- Enabling technology (D. Miller)
- Hearing-impaired can participate in dPP, other sync. remote collaborations
- UML diagrams for visually impaired (Eclipse)
54END
57The Problem
- Existing simulation models
- Large programs, usually Fortran
- Built to run on their own
- Variety of data formats
- How to combine?
- How describe data?
- How describe programs?
- How express coordination?
58The Approach
- Treat components more formally
- Better understanding, documentation
- Consistency checking
- Automatic mediation
- Specify coordination declaratively
- What more than How
- Minimal context dependence
- Compositional and higher-order
59DeCo
- Declarative Coordination
- Prototype coordination framework
- DSEL
- Domain-Specific
- Embedded (in Haskell)
- Language
60Driving Problem
- Water quality study of Neuse River Estuary
- Federation of two existing models
- Water model: 1 Fortran 77 program, 9300 lines
- Sediment model: 2 Fortran 77 programs, 4700 lines
- Dozens of files input and output
61Driving Problem
- Alternate execution of two models
- But
- Spatial mismatch
- Temporal mismatch
- Units mismatch
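The alternating execution with mediation can be sketched as a small driver loop. This is an illustration only, written in Python rather than the Haskell-embedded DeCo language, and every name in it is hypothetical: the two step functions are trivial stand-ins for the Fortran water and sediment programs, and the two mediators do nothing more than a mg/L to µg/L unit conversion and back.

```python
def run_alternating(water_step, sediment_step, to_sediment, to_water,
                    state, cycles):
    """Alternate two models, mediating the data passed in each direction.

    Returns the water-side state after each full cycle.
    """
    history = []
    for _ in range(cycles):
        water_out = water_step(state)
        sediment_out = sediment_step(to_sediment(water_out))
        state = to_water(sediment_out)
        history.append(state)
    return history


# Stand-in models (assumptions, not the real Neuse River programs).
def water_step(concentration_mg_per_l):
    return concentration_mg_per_l + 1.0


def sediment_step(concentration_ug_per_l):
    return concentration_ug_per_l * 2.0


# Mediators handle the units mismatch between the two models.
def mg_to_ug(value):
    return value * 1000.0


def ug_to_mg(value):
    return value / 1000.0


history = run_alternating(water_step, sediment_step, mg_to_ug, ug_to_mg,
                          state=1.0, cycles=2)
```

Spatial and temporal mediation would plug into the same two slots; the driver itself does not need to know which kind of mismatch is being repaired.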
62Neuse River Study Flow