Title: Automated Source Code Changes Classification
1Automated Source Code Changes Classification
for Effective Code Review and Analysis
- Evgeny G. Knyazev
- Senior developer
- Transas Technologies
- Post-graduate student
- SPb State University of Information Technologies,
Fine Mechanics and Optics
2Source Code Review
- Informal read-through of source code, aimed at finding
different kinds of problems in it
3Source Code Review Helps to
- increase code quality
- find errors at early stages
- get to know all the code of a system
- keep an eye on novices' work
4Source Control System and Code Changes Review
- Source control system keeps development history
- It allows reviewing only the changed code
[Diagram: Developer, Change Request (Revision X), Source Control System, Review]
5Code Change Review Example
6Changes Review Task Complexity
- In a large project, a lot of changes need to be
reviewed
7The Solution
- Split changes into classes
- Choose a class for review
8The Solution (2)
- Automate changes classification
[Diagram: a Change from the Source Control System goes to the Automated Changes Classifier, which outputs a Change Class; the Developer asks "Is this class interesting?" and, if yes, reviews the change]
9Known Code Changes Classification Methods
- Changes Comments Classification
  - "bug", "fixed": a bug fix
  - "implement", "feature": new feature implementation
- Refactoring Search Using Changes Metrics
  - Extract parent class ($\Delta DIT > 0 \land \Delta NOM < 0$, ...)
  - Move to other class ($\Delta DIT = 0 \land \Delta NOM < 0$, ...)
  - Split method ($\Delta NOM < T$, ...)
- Difference Search in Semantic Graphs
  - Build code graph before and after the change
  - Generate transition script
  - Search refactoring templates
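The first method above, classification by commit comments, can be sketched as a simple keyword lookup. This is only a minimal illustration: the four keywords come from the slide, while the class labels, the `KEYWORDS` table, and the `classify_comment` function are assumed names, not part of any described tool.

```python
# Keyword-based classification of commit comments (illustrative sketch).
# Only "bug", "fixed", "implement" and "feature" come from the slide;
# the class labels and the fallback "unknown" are assumptions.
KEYWORDS = {
    "bugfix": ("bug", "fixed"),
    "new feature": ("implement", "feature"),
}


def classify_comment(comment: str) -> str:
    """Return the first class whose keyword occurs in the comment."""
    text = comment.lower()
    for change_class, words in KEYWORDS.items():
        if any(word in text for word in words):
            return change_class
    return "unknown"


if __name__ == "__main__":
    for message in ("Fixed a crash on startup", "Implement route export"):
        print(message, "->", classify_comment(message))
```

A lookup like this depends entirely on how carefully developers write their comments, which is one motivation for classifying by code metrics instead.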
10Changes Metrics Clustering Method: Learning Phase
11Fuzzy Change Metrics Clustering Algorithm
12Changes Metrics Clustering Method: Changes Classification
13Changes Metrics
- Calculated as the difference between the metrics of consecutive revisions
- $\Delta M = M_r - M_{r-1}$
- CC: Cyclomatic Complexity (number of linearly
independent paths in the execution graph)
- CS: number of Classes/Structures
- eLOC: Effective Lines of Code (without empty and
comment lines)
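As a rough illustration of the definitions above, the sketch below counts eLOC for a piece of source text and subtracts the metrics of two revisions. The `Metrics` type, the function names, and the `//` whole-line comment prefix are assumptions made for the example; block comments are deliberately ignored to keep it short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    cc: int    # cyclomatic complexity
    cs: int    # number of classes/structures
    eloc: int  # effective lines of code


def effective_loc(source: str, comment_prefix: str = "//") -> int:
    """Count lines that are neither empty nor whole-line comments."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count


def change_metrics(previous: Metrics, current: Metrics) -> Metrics:
    """Delta M = M_r - M_(r-1), computed separately for each metric."""
    return Metrics(
        cc=current.cc - previous.cc,
        cs=current.cs - previous.cs,
        eloc=current.eloc - previous.eloc,
    )


if __name__ == "__main__":
    # Invented revision metrics, for illustration only.
    before = Metrics(cc=10, cs=2, eloc=100)
    after = Metrics(cc=8, cs=2, eloc=90)
    print(change_metrics(before, after))
```

Each change is then represented by its delta vector (here three numbers), which is what the clustering step operates on.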
14Metrics Calculation and Clustering of Changes
from Navi-Manager Project
15Fuzzy Clusters of Revisions Table
16Method Learning Example
- Project: Navi-Manager
- Size of Learning Set: 29 changes
- Number of Clusters: 4
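The slides do not name the clustering algorithm used in the learning phase. Purely as an assumption, the sketch below uses standard fuzzy c-means, with each change given as a tuple of metric deltas. The evenly spaced initial centers, the fuzzifier `m = 2`, and the fixed iteration count are illustrative choices, not settings taken from the Navi-Manager experiment.

```python
import math


def memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one point in every cluster (sums to 1)."""
    dists = [math.dist(point, center) for center in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [value / total for value in inverse]


def fuzzy_c_means(points, n_clusters, m=2.0, iterations=100):
    """Return (centers, membership rows) for a list of metric-delta tuples."""
    # Deterministic, arbitrary initialization: evenly spaced input points.
    centers = [points[i * len(points) // n_clusters] for i in range(n_clusters)]
    dim = len(points[0])
    for _ in range(iterations):
        rows = [memberships(p, centers, m) for p in points]
        new_centers = []
        for j in range(n_clusters):
            weights = [row[j] ** m for row in rows]
            total = sum(weights)
            # New center: mean of all points weighted by membership^m.
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ))
        centers = new_centers
    return centers, [memberships(p, centers, m) for p in points]


if __name__ == "__main__":
    # Invented two-dimensional deltas forming two obvious groups.
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    learned_centers, degrees = fuzzy_c_means(data, n_clusters=2)
    print(learned_centers)
```

With a learning set like the one on this slide, the call would take 29 delta vectors and `n_clusters=4`; the returned centers are what later changes are compared against.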
17Classification Example
18Classification Fuzziness
- Change r16833 ("Deleted an extra commit command") is
classified as:
  - 2% refactoring
  - 79% code deletion
  - 0% new functionality implementation
  - 20% bugfix
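Because a change belongs to every class to some degree, a reviewer usually still needs one label. The sketch below picks the dominant class and refuses to decide when no class is strong enough. It reads the slide's figures for r16833 as percentages; the `min_degree` threshold and the function name are assumptions added for the example.

```python
def dominant_class(memberships, min_degree=0.5):
    """Return the class with the highest membership, or None if it is too weak."""
    best = max(memberships, key=memberships.get)
    return best if memberships[best] >= min_degree else None


if __name__ == "__main__":
    # Membership degrees of change r16833, taken from the slide (rounded).
    r16833 = {
        "refactoring": 0.02,
        "code deletion": 0.79,
        "new functionality implementation": 0.0,
        "bugfix": 0.20,
    }
    print(dominant_class(r16833))
```

Keeping the full membership vector alongside the label preserves the information that r16833 also looks partly like a bugfix.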
19Code Changes Classification in Software
Development Process
20Changes Control During Important Development
Phases
- Deny potentially destabilizing classes of changes
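One way such a policy might be enforced is a check that rejects a change when its membership in the denied classes is too high. This is a hypothetical sketch: the function name, the choice of denied classes, and the `max_degree` threshold are all assumptions, and hooking it into the source control system is left out.

```python
def is_change_allowed(memberships, denied_classes, max_degree=0.3):
    """Allow a change only if its total membership in denied classes is small."""
    risk = sum(memberships.get(name, 0.0) for name in denied_classes)
    return risk <= max_degree


if __name__ == "__main__":
    # Invented membership degrees for a single change.
    change = {"refactoring": 0.6, "bugfix": 0.4}
    # Hypothetical policy for a stabilization phase: no refactorings.
    print(is_change_allowed(change, denied_classes={"refactoring"}))
```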
21Request List of Changes by Class
- For example, request the list of refactorings done in
a specific version X
[Diagram: the Dev Team Leader sends "Request Refactorings in Version X" to the Automated Source Code Changes Classifier, which takes the "List of Changes in Version X" from the Source Control System and returns the "List of Refactorings in Version X"]
22Project Statistics Analysis
23Achieved Results on Navi-Manager Project
- Effectiveness
  - More than 50% time savings on code review
- Discovery of Development Problems
  - Too many bugfixes compared to new feature
implementations
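A finding like the bugfix-to-feature imbalance can be read off the share of each class among the classified changes. The sketch below computes those shares; the function name and the sample labels are invented for illustration and are not the Navi-Manager data.

```python
from collections import Counter


def class_shares(change_classes):
    """Return the fraction of changes that fall into each class."""
    counts = Counter(change_classes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    # Invented labels, one per classified change.
    labels = ["bugfix", "bugfix", "bugfix", "new feature"]
    for name, share in class_shares(labels).items():
        print(f"{name}: {share:.0%}")
```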
24Automated Changes Classification Tool
- Works with Subversion
- Largely independent of the programming language
- Calculates CC, CS, eLOC metrics
- Discovers change classes
  - New feature implementation
  - Code deletion
  - Refactoring
  - Cosmetic changes
  - Bugfixes
25Future Research
- Method improvements
  - Gustafson-Kessel clustering
  - Use of object and coupling metrics
  - Classification of refactorings
- Wider application
  - Continuous use in the development process
  - Adaptability analysis for different types of
projects
26Thank you!
evgeny.knyazev_at_gmail.com