Title: Mega Software Engineering and EASE Project
1Mega Software Engineering and EASE Project
- Katsuro Inoue
- Osaka University
2Overview
- Proposed the concept of Mega Software Engineering (MSE), which shares experience and knowledge within a community
- Introduced the EASE project, which is based on the concept of MSE
- Presented an overview of the Empirical Environment and showed the current implementation of the Empirical Project Monitor (EPM), a partial realization of the Empirical Environment
- Outlined ongoing directions toward deeper analyses of empirical data
3Empirical Software Engineering
- Various technologies in software engineering based on empirical data
- Essential for scientific improvement of project processes and products
4Three Major Phases in Empirical SE
[Figure: the three phases; surviving label: collection]
5Classification of ESE Technologies by Target Scale
[Figure: ESE technologies by target scale; surviving label: Mega Software Engineering]
6Mega Software Engineering (MSE)
- Targets many projects
- A new concept, but not a new technology in itself
- A collection of key technologies, existing and emerging
  - Distributed environment and data sharing
  - Analysis and data mining
  - Project monitoring and controlling
  - Scalable computing
  - ...
- Exploits advances in hardware performance, e.g., network bandwidth, CPU clock, memory space, disk capacity, ...
- Software engineering technology should share in the hardware advances that are now used mainly for multimedia, grid, simulators, ...
7Characteristics of MSE
- Experience and knowledge of individual developers or projects are collected, refined into assets, and reused in the community
- Single-level, flat, static community for information sharing
- Automatic process: little burden is placed on each developer or manager
- A view of organizational benefits can be obtained directly (not an individual developer's view or a project view)
- Open source development is a simple case of MSE (MSE focuses on analysis and feedback)
8EASE Project
- Empirical Approach to Software Engineering
- Uses the concept of MSE as its basis
- Funded by MEXT (Ministry of Education, Culture, Sports, Science and Technology, Japanese government)
- 5-year project starting in 2003
Senri Lab.
9Project Target
Empirical software development environment from 1
to thousands of projects
Empirical Environment
10Project Objectives
- Development of the empirical environment
- Application of the empirical environment to real projects
- Collection of data and expertise of empirical SE
- Organizational benefits from applying the empirical environment
11Concept of Empirical Environment
[Figure labels: Internet, Public Domain Software, Open Source Project, Collection, Improvement]
12Implementing Empirical Environment
13(1) Policy for Collection
- Goal first (ideal cases)
- → Data collection first (realistic approach)
- Collect mainly product data (obtain process data from product data)
- Minimize developers' overhead for collection
- Raw data without human tampering
- Real-time collection
- Applicable to various projects
  - Small scale
  - Non-waterfall processes such as XP
  - Distributed development, including subcontracting
14(2) Policy for Analysis
Step-wise implementation, from simple (1) to difficult (5):
1. Process / product metrics inside a single project
2. Inter-project metrics
3. Classification and evolution
4. Reuse comp. / expertise
5.
15(3) Policy for Improvement
- Feedback method for each objective
- Various mechanisms for various cases
- Currently constructing a browser for visualizing collected data and measured metrics
16Empirical Project Monitor (EPM)
- A partial implementation of the Empirical Environment
- Collects, measures, and shows various data for project control
- Data sources
  - Versioning system: CVS
  - Mailing list manager: Mailman
  - Issue tracking tool: GNATS
17Architecture of EPM
[Figure: EPM architecture. Labels: versioning history, mail history, problem history, other tool data, etc.; standardized empirical SE data (in XML); PostgreSQL (repository); measurement of intra and inter projects; analysis tools; prediction / schedule; metrics value; developer; manager]
18Characteristics of EPM
- Uses open source development tools
- → Easy to introduce
- Small overhead of data collection
  - Most data comes from the versioning history
  - Communication through e-mail, and recording of issues by the tracking tool
- Other data formats are easy to transform into the standardized empirical SE data format
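The idea of translating tool data into one standardized XML format can be sketched as follows. This is only an illustration: the element and attribute names (`project`, `event`, `source`, `time`) and the sample records are hypothetical, since the slides do not show the actual EPM schema.

```python
import xml.etree.ElementTree as ET


def to_empirical_xml(project, events):
    """Serialize tool events into one XML document.

    The element and attribute names are hypothetical, not the EPM schema.
    """
    root = ET.Element("project", name=project)
    for ev in events:
        node = ET.SubElement(root, "event", source=ev["source"], time=ev["time"])
        # Every remaining field becomes a child element of the event.
        for key, value in ev.items():
            if key not in ("source", "time"):
                ET.SubElement(node, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical records, as a versioning tool and an issue tracker might yield them.
events = [
    {"source": "cvs", "time": "2003-10-01T09:30:00", "author": "dev1",
     "file": "main.c", "lines_added": 12, "lines_removed": 3},
    {"source": "gnats", "time": "2003-10-02T14:00:00", "issue_id": 7,
     "state": "open"},
]
xml_text = to_empirical_xml("EmpiriPrj", events)
print(xml_text)
```

Because every source is reduced to the same event shape, supporting another tool would only need a new function that produces such records.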
19Application Area of EPM
- Large projects
  - Share project status immediately
  - Reduce project management load
  - Reduce the risk of data tampering
- Small projects
  - Apply at small cost
- Apply to various projects, including XP and distributed development
20Features of Empirical Project Monitor
21EPM Analysis Tool
- Single activity view
  - Source code size
  - Issue resolution time
  - Cumulative number of issues, number of unsolved problems, ...
- Multiple activity view
  - Check-in and check-out
  - Issue and mail
  - Check-in and issue
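The issue-related views listed above (cumulative issues, unsolved issues, resolution time) can be derived from issue open and close dates. The following is a minimal sketch with made-up records; the field names `opened` and `closed` are assumptions, not the GNATS or EPM data model.

```python
from datetime import date


def issue_metrics(issues, as_of):
    """Return (cumulative issues, unsolved issues, mean resolution days) as of a date."""
    raised = [i for i in issues if i["opened"] <= as_of]
    resolved = [i for i in raised
                if i["closed"] is not None and i["closed"] <= as_of]
    days = [(i["closed"] - i["opened"]).days for i in resolved]
    mean_days = sum(days) / len(days) if days else None
    return len(raised), len(raised) - len(resolved), mean_days


# Hypothetical issue records.
issues = [
    {"opened": date(2003, 9, 1), "closed": date(2003, 9, 11)},
    {"opened": date(2003, 9, 20), "closed": date(2003, 10, 10)},
    {"opened": date(2003, 10, 5), "closed": None},
]
for month_end in (date(2003, 9, 30), date(2003, 10, 31)):
    print(month_end, issue_metrics(issues, month_end))
```

Evaluating the function at successive month ends gives the three series that a monthly chart of issues would plot.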
22Growth of LOC
- Progress monitoring
- Schedule vs. actual
[Figure: cumulative LOC of project EmpiriPrj by month]
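A growth curve of this kind can be computed from per-check-in line deltas. The sketch below assumes each check-in is available as a tuple of date, lines added, and lines removed; the tuple layout and the sample values are hypothetical.

```python
from itertools import accumulate


def cumulative_loc(checkins):
    """Return (date, running LOC) pairs from (date, lines_added, lines_removed) tuples."""
    ordered = sorted(checkins)  # ISO date strings sort chronologically
    totals = accumulate(added - removed for _, added, removed in ordered)
    return [(day, loc) for (day, _, _), loc in zip(ordered, totals)]


# Hypothetical check-ins, deliberately out of order.
checkins = [
    ("2003-10-03", 120, 0),
    ("2003-10-01", 300, 0),
    ("2003-10-07", 40, 60),
]
for day, loc in cumulative_loc(checkins):
    print(day, loc)
```

Plotting the same pairs against a planned curve would give a schedule-versus-actual comparison.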
23Growth of LOC (3 months)
[Figure: LOC of project EmpiriPrj by month, with check-in occurrences marked]
24Growth of LOC
Open source project nkf (character-code converter)
[Figure: LOC by month, with check-in occurrences marked]
25Cumulative Issues / Unsolved Issues / Mean Resolution Time
[Figure: cumulative issues, unsolved issues, and mean resolution days of project EmpiriPrj by month]
26Check-in and Check-out
[Figure: check-outs of project EmpiriPrj by month, with check-in occurrences marked]
27CVS Log View
28Growth of Mail and Issues
[Figure: cumulative mail of project EmpiriPrj by month, with check-ins, issues raised, and issues resolved marked]
29Mail Log View
30Cumulative Issues and Check-in
[Figure: cumulative issues of project EmpiriPrj by month, with check-in occurrences marked]
31Future of Empirical Environment
32Extending Analysis Features
- Make deeper analyses and extract organizational expertise
- Find and reuse expertise easily
33[Figure: extended Empirical Environment]
- Analysis tools: code clone detection, component search, metrics measurement, project categorization, collaborative filtering
- Archives: process data archive (XML format), product data archive (CVS format)
- Format translators for: versioning (CVS), mailing (Mailman), issue tracking (GNATS), other tool data
- Managers and developers of project x, project y, project z, ...
- Corporate Source GUI
34Example Scenario (1)
1. Scheduled progress of project X vs. actual progress of project X
2. Find projects similar to X
- Project categorization
- Collaborative filtering
[Figure: projects E, C, A, W, X, Y, V, Q, T, P]
35Example Scenario (2)
3. Average reuse rate in similar projects vs. project X's reuse rate
- Code-clone detection
4. Promote use of the software asset search engine to project X
- Software asset search engine
36Expected Effect
- Productivity can be drastically improved by reusing organizational assets
- Management of assets can be performed easily
- Cost can be controlled precisely relative to previous similar projects
- Reliability can be improved using issue history
37Analysis Technology (1) Fast Code Clone Detection
Code clones: similar portions of a program
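To make the notion of a code clone concrete, here is a deliberately naive sketch that reports windows of identical lines after whitespace normalization. It is not the fast detector the slide refers to; the window size, the normalization, and the sample files are assumptions chosen for illustration.

```python
from collections import defaultdict


def find_clones(files, window=3):
    """Map each repeated window of normalized lines to its (file, start line) locations."""
    seen = defaultdict(list)
    for name, text in files.items():
        # Collapse runs of whitespace so formatting differences do not hide a clone.
        lines = [" ".join(line.split()) for line in text.splitlines()]
        numbered = [(no, line) for no, line in enumerate(lines, start=1) if line]
        for i in range(len(numbered) - window + 1):
            chunk = tuple(line for _, line in numbered[i:i + window])
            seen[chunk].append((name, numbered[i][0]))
    return {chunk: locs for chunk, locs in seen.items() if len(locs) > 1}


# Hypothetical source files sharing a three-line fragment.
files = {
    "a.py": "x = 1\ny = x + 1\nprint(y)\n",
    "b.py": "z = 0\nx = 1\ny  =  x + 1\nprint(y)\n",
}
clones = find_clones(files)
for chunk, locations in clones.items():
    print(locations, chunk)
```

Exact line matching misses clones with renamed identifiers, which is one reason practical detectors compare token sequences instead.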
38Analysis Technology (2) System Similarity Using Code-Clone Detection
39Analysis Technology (3) Collaborative Filtering
Header labels (column assignment not recoverable): Representative, Focused, Collaborative, Q M, Resources, Outcome, Adopted
App. A: 7.5 (target), 9, 9, 9, 7
App. B: 8, 7, 8, ? (missing), 8
App. C: ? (missing), 8, 8, 8, 7
App. D: 7, 6, ? (missing), 9, 6
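How collaborative filtering can estimate a missing value from similar rows is sketched below, using the App. A to App. D values above with the target and the missing entries treated as unknown. The similarity measure (cosine over shared positions) and the weighted average are assumptions; the slide does not state which variant the project uses, and the meaning of each column is not recoverable.

```python
import math


def cosine(u, v):
    """Cosine similarity over the positions where both vectors have a value."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not pairs:
        return 0.0
    dot = sum(a * b for a, b in pairs)
    norm_u = math.sqrt(sum(a * a for a, _ in pairs))
    norm_v = math.sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict(ratings, target, column):
    """Similarity-weighted average of `column` over the other rows that have a value there."""
    num = den = 0.0
    for name, row in ratings.items():
        if name == target or row[column] is None:
            continue
        weight = cosine(ratings[target], row)
        num += weight * row[column]
        den += weight
    return num / den if den else None


# Values from the slide; None marks the target and the missing entries.
ratings = {
    "App. A": [None, 9, 9, 9, 7],
    "App. B": [8, 7, 8, None, 8],
    "App. C": [None, 8, 8, 8, 7],
    "App. D": [7, 6, None, 9, 6],
}
estimate = predict(ratings, "App. A", 0)
print(round(estimate, 2))
```

Only App. B and App. D have a value in the first position, so the estimate for App. A falls between their values, 8 and 7.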
40Analysis Technology (4) Java Class Search Engine SPARS-J
43Markov Model
- The component rank model can be considered a Markov chain of the user's focus
- The user's focus moves from one component to another along a use relation at fixed time intervals
- A node's weight represents the probability that the user's focus is on that node in the limit of infinite time
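The limiting probabilities described above can be approximated by repeatedly applying the transition step. This is a minimal sketch, not the SPARS-J implementation: the component names are hypothetical, and letting a component with no outgoing use relation move the focus to any component with equal probability is an assumption made here so that the chain keeps moving.

```python
def component_rank(uses, iterations=100):
    """Approximate the stationary distribution of a focus that follows use relations."""
    nodes = sorted(set(uses) | {t for targets in uses.values() for t in targets})
    weight = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            # Assumption: a component that uses nothing sends the focus anywhere.
            targets = uses.get(n) or nodes
            share = weight[n] / len(targets)
            for t in targets:
                nxt[t] += share
        weight = nxt
    return weight


# Hypothetical use relations: App uses Parser and Util, Parser uses Util.
uses = {"App": ["Parser", "Util"], "Parser": ["Util"], "Util": []}
ranks = component_rank(uses)
for name, value in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(name, round(value, 3))
```

In this example the widely used `Util` ends up with the largest weight, which matches the intuition that heavily used components should rank highest.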
44Demo of SPARS-J
45Current Status and Schedule
- Current
  - Demo version of EPM
- First quarter of 2004
  - A release of EPM
- First quarter of 2005
  - Application of EPM in industry
- End of 2005
  - Inclusion of analysis tools
  - User group, consortium, interest group, ...
46Summary
- Proposed the concept of Mega Software Engineering (MSE), which shares experience and knowledge within a community
- Introduced the EASE project, which is based on the concept of MSE
- Presented an overview of the Empirical Environment and showed the current implementation of the Empirical Project Monitor (EPM), a partial realization of the Empirical Environment
- Outlined ongoing directions toward deeper analyses of empirical data