Fault Prediction and Software Aging

1
Fault Prediction and Software Aging
  • Carlos Perez

2
Outline
  • Software Lifecycle / Motivation
  • Software Aging / Problem
  • Fault Prediction / Approach
  • Methodology for Detection and Estimation of
    Software Aging
  • Approach / Preventive Maintenance
  • Experiment
  • Data Analysis
  • Results
  • Conclusion

3
The Software Lifecycle
  • Youth
  • Software is new, simple, efficient. Functionality
    might be limited.
  • Maturity
  • As new requirements arise software becomes
    complex and code limitations surface.
  • Elderliness
  • Aging has taken a heavy toll on performance.
  • Death
  • The legacy application is replaced by a newborn

4
DOS: A Case Study
  • Youth
  • DOS - Simple, but very limited functionality
  • Maturity
  • Windows 3.1: GUI on top of DOS. More
    functionality, but more bugs
  • Windows 95/97: More functionality, new bugs,
    performance has suffered.
  • Elderliness
  • Windows 98: Many bugs have been patched, but
    increasing functionality is risky at this point.
  • Death
  • Windows XP was introduced!

5
The Software Aging Problem
  • The main problem with legacy code is aging
  • What is Software Aging?
  • Deterioration in the availability of OS
    resources, data corruption and numerical error
    accumulation
  • Consequences
  • Performance degradation
  • Crash / Hang
  • Failure

6
Causes of Software Aging
  • Common causes of software aging are
  • Memory bloating or leaks
  • Unreleased file-locks
  • Data corruption
  • Storage space fragmentation
  • Accumulation of round off errors
  • Legacy code is more likely to experience these
    kinds of problems

7
Combating Software Aging
  • Research question: How can we combat software
    aging?
  • Why is it a challenging problem?
  • It is caused by Heisenbugs (hard-to-find bugs)
  • It is an inherent characteristic of elderly
    systems
  • It is hard to detect
  • It can be present in critical systems

8
Software Rejuvenation Approach
  • Software rejuvenation is a proactive fault
    management technique aimed at cleaning up the
    system internal state to prevent the occurrence
    of future failures.
  • Examples of cleaning
  • Garbage collection
  • Kernel table flushing
  • Rebooting
  • Advantages
  • Prevents crashes from occurring
  • Provides fault tolerance in the presence of bugs
  • Disadvantages
  • Introduces overhead

9
Fault Prediction
  • Fault prediction tries to anticipate failures
    before they happen
  • It monitors system resources in order to detect
    and estimate aging
  • It computes an estimated time to failure
  • Preventive measures can be taken to avoid crashes
  • Enables software rejuvenation

10
  • S. Garg et al., "A methodology for detection and
    estimation of software aging," in Proc. 9th
    International Symposium on Software Reliability
    Engineering, 1998.
  • Presents a methodology for fault prediction based
    on the characterization of software aging

11
Approach
  • Collect UNIX system resource usage at regular
    intervals using a distributed monitoring tool
  • Use statistical trend detection techniques to
    detect and validate the existence of aging in
    UNIX.

12
Experimental Setup
  • Distributed monitoring tool based on SNMP
  • Works like a distributed database
  • Monitors the state of UNIX running on the workstations
  • Monitoring station
  • Queries SNMP agent at each workstation
  • Determines health of each system

13
SNMP Model
  • SNMP: Simple Network Management Protocol
  • Supports monitoring of network-attached devices
  • Pro-Active Fault Management (PFM) MIB
  • Defines a set of objects that can be queried on
    any workstation by the managing station
  • These objects describe the state of the
    workstation

14
PFM MIBs
  • hostID: provides basic information about the
    station
  • timeVal: provides current time and time since
    last reboot
  • osResource: describes state of OS resources such
    as free memory, file table size, etc.
  • procStats: describes state of running processes
  • etc.

15
Data Collection
  • Heterogeneous UNIX workstations were monitored
  • Their resource data was gathered every 15 minutes
  • Crashes are recorded for correlation purposes
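The collected data can be pictured as one time-ordered series per monitored object on each workstation. The sketch below is only an illustration of that layout: the `Sample` record, the host name `ws1`, and the object names `freeMemory` and `fileTableSize` are hypothetical and are not taken from the PFM MIB definition.

```python
from collections import defaultdict
from dataclasses import dataclass

SAMPLE_INTERVAL_S = 15 * 60  # resource data is gathered every 15 minutes


@dataclass
class Sample:
    host: str        # workstation identifier (hypothetical field)
    timestamp: int   # seconds since the start of monitoring
    obj: str         # monitored object name (hypothetical naming)
    value: float     # observed value of that object


def to_time_series(samples):
    """Group samples into one time-ordered series per (host, object)."""
    series = defaultdict(list)
    for s in sorted(samples, key=lambda s: s.timestamp):
        series[(s.host, s.obj)].append((s.timestamp, s.value))
    return dict(series)


# Made-up readings, deliberately out of order.
samples = [
    Sample("ws1", SAMPLE_INTERVAL_S, "freeMemory", 510.0),
    Sample("ws1", 0, "freeMemory", 512.0),
    Sample("ws1", 0, "fileTableSize", 40.0),
]
series = to_time_series(samples)
```

Crash records could be stored the same way, keyed by host and timestamp, so that they can later be lined up against the resource series.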

16
Data Analysis
  • The data-gathering phase provides a time series
    for every object monitored
  • Using these time series several issues are
    addressed
  • Is aging present?
  • What is the nature of the variations in the
    value?
  • Can failures be related to observed values?
  • Can we quantify aging?

17
Data Analysis
  • Visual cues
  • Can periodicity be clearly seen from time series
    plots?
  • Is an increasing/decreasing trend visible?
  • What analysis should we do?
  • Classical time series analysis
  • Linear and periodic dependency analysis
  • Trend detection and estimation

18
Periodicity and Linear Dependence
  • Determines the nature of the variations in the data
  • Approach
  • Autocorrelation function
  • Harmonic Analysis
  • Confirms daily and weekly periodicities in the
    data
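As a rough illustration of the two techniques named on this slide, the following sketch computes a sample autocorrelation function and uses a discrete Fourier transform to find the dominant period. It runs on synthetic data with a built-in daily cycle, assuming 96 samples per day (one every 15 minutes); the function names are illustrative and the slides do not say which implementation was actually used.

```python
import numpy as np

SAMPLES_PER_DAY = 96  # one sample every 15 minutes


def autocorrelation(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.dot(d, d)
    return np.array(
        [np.dot(d[: len(d) - k], d[k:]) / denom for k in range(max_lag + 1)]
    )


def dominant_period(x):
    """Period (in samples) of the strongest non-constant Fourier component."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    freqs = np.fft.rfftfreq(len(x))  # cycles per sample
    k = 1 + int(np.argmax(power[1:]))  # skip the zero-frequency bin
    return 1.0 / freqs[k]


# Synthetic series: a daily cycle plus noise, 14 days long.
rng = np.random.default_rng(0)
t = np.arange(14 * SAMPLES_PER_DAY)
x = 10 * np.sin(2 * np.pi * t / SAMPLES_PER_DAY) + rng.normal(0, 1, t.size)

acf = autocorrelation(x, 2 * SAMPLES_PER_DAY)
period = dominant_period(x)
```

A strong positive autocorrelation at a lag of one day and a spectral peak at the same period are the kind of evidence that would point to a daily periodicity.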

19
Trend Detection and Estimation
  • Detection
  • Trends indicate the presence of aging
  • Approach: looks for monotonically
    increasing/decreasing trends in resources
  • Estimation
  • Trend estimation quantifies the aging
  • Approach: approximates slope of trend to estimate
    the expected time to resource exhaustion

20
Trend Detection
  • Smoothing
  • Robust Locally Weighted Regression
  • Reliable for nonlinear data
  • Test Trend Existence Hypothesis
  • Seasonal Kendall Test
  • Detects trends in the presence of cycles

21
Smoothing: Step 1
  • Start at focal point
  • Define the window width
  • Larger size causes heavier smoothing
  • Overall trend is captured

22
Smoothing: Step 2
  • Choose a weight function
  • Tricube weight function is the most common
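For reference, the tricube weight function is usually written as below. The symbols are assumptions introduced here: x_0 is the focal point, x_i a neighbouring observation, and h the half-width of the window.

```latex
w(u) =
\begin{cases}
  \left(1 - |u|^{3}\right)^{3}, & |u| < 1, \\
  0, & |u| \ge 1,
\end{cases}
\qquad u = \frac{x_i - x_0}{h}
```

Observations close to the focal point get a weight near 1, and the weight falls smoothly to 0 at the edge of the window.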

23
Smoothing: Step 3
  • Polynomial regression using weighted least
    squares
  • Take fitted value at focal point from regression
  • These steps are repeated at every X

24
Smoothing Results
  • Steps are repeated for every observation in the
    data
  • A separate local regression is performed at each
    X
  • The fitted value for each focal X is plotted

25
Trend Hypothesis
  • Seasonal Kendall test
  • Compares the relationships of points at different
    time periods (seasons)
  • Determines if a trend exists
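A compact sketch of the seasonal Kendall test is given below. It assumes the series is evenly sampled with a known number of observations per cycle (`period`), ignores the correction for tied values, treats the seasons as independent, and uses the usual normal approximation for the p-value. The function names are illustrative.

```python
import math

import numpy as np


def mann_kendall_s(x):
    """Mann-Kendall S statistic and its variance (no tie correction)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    s = 0
    for i in range(n - 1):
        # Count later values above x[i] minus later values below x[i].
        s += int(np.sign(x[i + 1:] - x[i]).sum())
    var = n * (n - 1) * (2 * n + 5) / 18.0
    return s, var


def seasonal_kendall(x, period):
    """Return (S, Z, two-sided p-value) summed over all seasons."""
    x = np.asarray(x, dtype=float)
    s_total, var_total = 0, 0.0
    for season in range(period):
        # Compare only observations taken at the same position in the cycle.
        s, var = mann_kendall_s(x[season::period])
        s_total += s
        var_total += var
    if var_total == 0 or s_total == 0:
        z = 0.0
    else:
        # Continuity correction: shrink |S| by one before standardizing.
        z = (s_total - math.copysign(1, s_total)) / math.sqrt(var_total)
    p = math.erfc(abs(z) / math.sqrt(2))
    return s_total, z, p


# Synthetic data: a 4-sample cycle repeated 10 times, with and without a trend.
t = np.arange(40)
cycle = np.array([0.0, 3.0, 1.0, 5.0])[t % 4]
s_trend, z_trend, p_trend = seasonal_kendall(cycle + 0.5 * t, period=4)
s_flat, z_flat, p_flat = seasonal_kendall(cycle, period=4)
```

Because each observation is compared only with observations from the same season, the cycle itself does not register as a trend, while the added linear drift does.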

26
Trend Estimation
  • Once we confirm the existence of a trend, we must
    estimate its slope
  • Sen's slope
  • Computes the slope between each pair of points
    and takes the median of those slopes.
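The slope estimate can be sketched in a few lines. This is the plain (non-seasonal) form of Sen's estimator; the slides do not say whether a seasonal variant was used, so treat this as an assumption. The sample data are made up.

```python
from itertools import combinations

import numpy as np


def sen_slope(t, x):
    """Median of the slopes between all pairs of observations."""
    slopes = [
        (x[j] - x[i]) / (t[j] - t[i])
        for i, j in combinations(range(len(t)), 2)
        if t[j] != t[i]
    ]
    return float(np.median(slopes))


# Four points on a line of slope 2, plus one large outlier.
t = [0, 1, 2, 3, 4]
x = [1, 3, 5, 7, 100]
slope = sen_slope(t, x)
```

Taking the median rather than the mean is what makes the estimate resistant to outliers such as the last point in this example.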

27
Results
  • Periodicities and Linear Dependence
  • Many values show daily and weekly periodic
    dependencies

28
Results
  • Existence of aging
  • Proved for file table size using seasonal trend
    decomposition
  • Original time series
  • Increasing trend from regression
  • Periodicities
  • Residual

29
Aging Quantification
  • Estimated time to failure due to aging is
    calculated with respect to a particular resource
  • Estimation is done from Sen's slope and initial
    values
  • Important resources can then be identified for
    monitoring and managing
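Under a linear-trend assumption, the estimate reduces to simple arithmetic: the distance from the initial value to the exhaustion limit, divided by the slope. The sketch below illustrates this; the function name, the limits, and all numbers are hypothetical and are not results from the study.

```python
import math


def time_to_exhaustion(initial, slope, limit):
    """Time until a linear trend starting at `initial` reaches `limit`.

    Returns infinity when the trend is flat or moves away from the limit.
    The result is in the same time unit as the slope's denominator.
    """
    if slope == 0:
        return math.inf
    t = (limit - initial) / slope
    return t if t >= 0 else math.inf


# Hypothetical: free memory starts at 1200 units and shrinks by 50 per day.
days_memory = time_to_exhaustion(initial=1200.0, slope=-50.0, limit=0.0)

# Hypothetical: file table holds 100 entries, grows by 5 per day, limit 600.
days_file_table = time_to_exhaustion(initial=100.0, slope=5.0, limit=600.0)
```

Comparing such estimates across resources is one way to see which resource would be exhausted first and therefore deserves the closest monitoring.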

30
Conclusion
  • Quantification of software aging is presented as
    a means of fault prediction
  • Statistical analysis is an appropriate method for
    the detection and estimation of software aging
  • Can help in developing a strategy for software
    rejuvenation