A Self-Healing Approach for Developing Complex Software Systems

1
The Shadows Project

A Self-Healing Approach for Developing Complex Software Systems

IBM Haifa Research Lab, Reliable Systems
Presented by Onn Shehory, Shadows project coordinator
IBM Academy Conference, April 2006
2
Outline

Introduction
Technical overview
Organization
Shadows background technologies
ConTest: Concurrency Testing
ATS: Automated Threshold Setting
BCT: Behavior Capture and Test
Contribution to standards
Summary

3
System Complexity
[Figure: actual application architecture of a consumer electronics company]
4
Shadows - Profile

Consortium formed to address challenges formulated by the EU
EU 6th Framework R&D Programme, call no. 5
Strategic Objective 2.5.5: Software and Services
Research proposal submitted to the EU in 9/2005
Members:
IBM, Univ. of Milan Bicocca, Univ. of Potsdam, Univ. of Brno, Artisys, Comverse Technologies, Net Technologies, Philips, Scapa Technologies,

Color legend: Blue = Technology Validator, Green = Technology Provider, Pink = Dissemination/Exploitation
5
Technical Overview

A paradigm for developing complex software systems with design-time and run-time self-healing (SH) capabilities
Goal: mitigate the challenge of growing software complexity and its detrimental impact on software quality
Integration of several SH technologies across the system lifecycle
Mainly in middleware and applications

6
Shadows Technologies

The underlying set of SH technologies will include:
Verification and run-time amelioration of concurrent systems (IBM)
Automatic Threshold Setting for performance management (IBM)
Behavior Capture and Test (Univ. of Milan)
Formal methods (Univ. of Brno / Univ. of Potsdam)

7
Technology Validation

The contribution of validators includes:
Gap analysis
Requirement definitions
Technology evaluation
Validation environments include, e.g.:
Real-time, resource-constrained embedded C software (Philips Nexperia)
Server-side Java software for high-availability telco systems (Comverse's MMS)
Avionics software (Artisys)

8
Methodology Flow
[Figure: methodology flow diagram. Recoverable box labels: Requirements Definition; System Design; Development; Testing / Debugging; System Deployment; Analysis; Healing; Assurance; "SH-Oriented" (three occurrences); "SH"]
9
Abstract Architecture
[Figure: abstract architecture diagram. Recoverable labels:]
Integrated Model-Based Framework for Designing and Managing Self-Healing Systems
System Design and Management Standards
Methodology and Tools
Concurrency Testing
Fault Prediction and Automatic Threshold Setting
Behavior Capture and Test
Model-Based Technologies
Open Standards: CIM, TPTP
10
Shadows Solution Architecture
11
Background Technologies

ConTest: Concurrency Testing
ATS: Automated Threshold Setting
BCT: Behavior Capture and Test

12
ConTest: Testing Concurrent and Distributed Applications
13
ConTest: the Challenge

Finding bugs in parallel and multi-threaded software is challenging
Bugs depend on the program execution order
In a lab environment, only a small subset of the possible execution orders occurs
As a result, many bugs are discovered only in the field

14
ConTest: the Solution

ConTest runs existing tests multiple times, using different scheduling orders that it generates
ConTest increases the probability of revealing timing-related bugs in Java programs
ConTest supports execution replay to reproduce the execution that caused a bug
Replay and debugging aids assist once a bug is found
Solution for Java done; C/C++ and C# under development
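
The idea of rerunning a test under varied schedules, and replaying a failing one, can be illustrated with a toy model. The sketch below is not ConTest and does not use its API. It assumes two simulated tasks written as Python generators, a seeded random scheduler, and hypothetical helper names (increment, execute, run_random, replay). Each task performs a non-atomic read-then-write increment, so some interleavings lose an update.

```python
import random


def increment(shared):
    """Non-atomic increment: read, allow a context switch, then write."""
    local = shared["counter"]        # read
    yield                            # point where the scheduler may switch tasks
    shared["counter"] = local + 1    # write


def execute(pick, n_tasks=2):
    """Run n_tasks increments; pick(alive_ids) chooses which task steps next.

    Returns the final counter and the trace of scheduling choices.
    """
    shared = {"counter": 0}
    tasks = {i: increment(shared) for i in range(n_tasks)}
    trace = []
    while tasks:
        tid = pick(sorted(tasks))
        trace.append(tid)
        try:
            next(tasks[tid])
        except StopIteration:
            del tasks[tid]           # task finished
    return shared["counter"], trace


def run_random(seed, n_tasks=2):
    """One test run under a schedule derived from the given seed."""
    rng = random.Random(seed)
    return execute(rng.choice, n_tasks)


def replay(trace, n_tasks=2):
    """Re-execute a recorded schedule and return the final counter."""
    steps = iter(trace)
    return execute(lambda alive: next(steps), n_tasks)[0]


# Rerun the same "test" under many schedules and keep the failing traces.
failing = [trace for seed in range(20)
           for result, trace in [run_random(seed)] if result != 2]
```

Recording the trace of scheduling choices is what makes a failing run reproducible in this toy. A real tool has to perturb and record an actual thread scheduler, which is considerably harder.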

15
ConTest: Technology in Brief
16
ConTest: Benefits and Future

Benefits
ConTest improves testing of concurrent and distributed applications for timing-related bugs from early development stages
ConTest has minimal impact on the testing process and allows reuse of existing tests
Higher quality reduces maintenance cost
Planned, or in the works
Automated fixing of concurrency bugs
For some bug families, this already works

17
ATS: Automated Threshold Setting
18
ATS: Problem Statement

Given:
A computer system, its components, and the applications running on the system
A service dependency of applications on components
When the dependency is unknown, one must revert to correlation analysis (data mining, statistical)
Service-Level Objectives (SLOs) for the system/applications, and indications of their violations
A monitoring infrastructure that
Monitors operational parameters at the components
Generates and sends component alarms when measurements violate thresholds
Compute thresholds on the operational values of each component metric, such that
Percentages of false alarms meet pre-specified levels
Adapt thresholds to changes in workload patterns, system configuration, and SLOs
The solution should be computationally efficient
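
The fallback to correlation analysis when the service dependency is unknown can be sketched roughly as follows. This is an illustrative assumption, not the ATS method: it scores each component metric by its Pearson correlation with the SLO-violation indicator and keeps those above a cutoff. The function names, the metric names, the sample data, and the 0.7 cutoff are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0
    return cov / (dev_x * dev_y)


def likely_dependencies(metrics, violations, min_abs_r=0.7):
    """Names of metrics whose correlation with SLO violations is strong.

    metrics: dict mapping a metric name to its list of samples.
    violations: list of booleans, one per sample interval.
    """
    flags = [1.0 if v else 0.0 for v in violations]
    scores = {name: pearson(series, flags) for name, series in metrics.items()}
    strong = [name for name, r in scores.items() if abs(r) >= min_abs_r]
    return sorted(strong, key=lambda name: -abs(scores[name]))


# Hypothetical monitoring history: six intervals, two component metrics.
history = {
    "db_latency_ms": [10, 12, 11, 40, 45, 50],
    "disk_free_gb": [50, 49, 51, 50, 49, 51],
}
slo_violated = [False, False, False, True, True, True]
candidates = likely_dependencies(history, slo_violated)
```

Correlation only suggests which components to watch. It does not establish that an application actually depends on them.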

19
ATS: Motivation

In complex computer systems, manually set thresholds are NOT:
Indicative
Adaptive
Scalable
Result: sub-optimal and rigid performance management, and overloaded administrators
Automating threshold setting will allow more reliable use of component-level performance parameters and thresholds for system-level performance management

20
ATS: Solution Approach

Use standard tools to measure operational parameters on components
Use SLOs set by administrators or policy
Automate the threshold computation procedure
Start with initial component-level threshold values
Use histories
Of thresholds
Of SLO violations
Build a statistical model for the positive and negative predictive values (PPV and NPV) of the thresholds, based on the SLO and threshold histories
Compute updated thresholds via the model to satisfy target PPV and NPV
Iterate the process to dynamically update the thresholds
Ordinary regression is inapplicable because the outcome (SLO violated or not) is binary, so we use logistic regression
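
To make the PPV/NPV targets concrete, here is a minimal sketch that swaps the logistic-regression model for plain counting over a history. It is a simplification for illustration, not the ATS algorithm. It assumes an alarm fires when a metric sample exceeds the threshold, that each sample is labelled with whether an SLO violation occurred, and the function names and data are hypothetical. PPV is the fraction of alarms that coincide with a violation; NPV is the fraction of non-alarms that coincide with no violation.

```python
def ppv_npv(values, violated, threshold):
    """Empirical PPV and NPV of the alarm rule `value > threshold`.

    Returns None for a measure whose denominator is zero.
    """
    tp = fp = tn = fn = 0
    for value, bad in zip(values, violated):
        alarm = value > threshold
        if alarm and bad:
            tp += 1
        elif alarm:
            fp += 1
        elif bad:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp) if tp + fp else None
    npv = tn / (tn + fn) if tn + fn else None
    return ppv, npv


def pick_threshold(values, violated, target_ppv, target_npv):
    """Lowest observed value that meets both targets as a threshold, or None."""
    for candidate in sorted(set(values)):
        ppv, npv = ppv_npv(values, violated, candidate)
        if ppv is None or npv is None:
            continue
        if ppv >= target_ppv and npv >= target_npv:
            return candidate
    return None


# Hypothetical history of one component metric and the SLO outcome per sample.
samples = [10, 20, 30, 40, 50, 60, 70, 80, 90, 95]
slo_bad = [False, False, False, False, True, False, True, True, True, True]
threshold = pick_threshold(samples, slo_bad, target_ppv=0.9, target_npv=0.8)
```

Re-running the selection over a sliding window of recent history would give a crude form of the iteration step. A fitted model, as on the slide, can also interpolate between thresholds that were never actually tried.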

21
ATS: Status and Plans

ATS algorithms formulated and successfully tested on a small laboratory system (2005)
Paper published and patent filed (2005)
Future versions will address large, complex systems
Multiple and compound SLOs
Suggest system reconfiguration to allow for better SLOs

22
BCT: Behavior Capture and Test
23
Component-based software

Component reuse
Reduce costs
Increase productivity
Unexpected failures
Components are robust and reliable, but designed without knowledge of the final system -> integration problems
Integration testing problems
No source code
Incomplete specifications

24
Integration problems

Inconsistent interpretation of parameters or values
Each component's interpretation may be reasonable, but incompatible (Mars Climate Orbiter, Sept. 1999)
Violations of value domains or of capacity or size limits
Implicit assumptions on ranges of values or sizes
Buffer overflow
Side effects on parameters or resources
Resources not explicitly mentioned in the interface
Temporary files
Missing or misunderstood functionality
Underspecified functionality leads to incorrect assumptions
Hit counts

25
Verifying component-based systems

Testing
Mutational analysis [Ghosh, Mathur, TOOLS 2000]
Dynamic analysis
Only numeric data [Raz, Koopman, Shaw, ICSE 2002]
Requires source code and focuses on data [McCamant, Ernst, ESEC/FSE 2003]

26
Behavior Capture and Test (BCT)

Key idea
Integration analysis and testing require information about component behavior
Extensive reuse of components produces a lot of information
Can we capture behavior information to test and analyze component integration?

27
BCT: Main Steps

BCT
Capture Behavior Data
Monitor component execution
Capture run-time information
Distill Behavior Models
I/O models
Interaction models
Verify the Run-Time Behavior
Verify reused/replaced components with behavior models
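
The three steps above can be sketched in a few lines. This is a loose illustration under stated assumptions, not the BCT tooling: the component is a one-argument numeric function, the "I/O model" is just the observed input and output ranges, and every name here (IOModel, check_component, to_percent, to_permille) is hypothetical. The replacement component uses a different output scale, echoing the inconsistent-interpretation problem from the integration slide.

```python
class IOModel:
    """Captures (input, output) pairs and distills simple range invariants."""

    def __init__(self):
        self.observations = []

    def capture(self, component):
        """Wrap a component so that every call is recorded."""
        def monitored(x):
            y = component(x)
            self.observations.append((x, y))
            return y
        return monitored

    def distill(self):
        """Reduce the recorded data to observed input and output ranges."""
        inputs = [x for x, _ in self.observations]
        outputs = [y for _, y in self.observations]
        return {"in": (min(inputs), max(inputs)),
                "out": (min(outputs), max(outputs))}


def check_component(model, component, inputs):
    """Return (input, reason) pairs where the component deviates from the model."""
    low_in, high_in = model["in"]
    low_out, high_out = model["out"]
    problems = []
    for x in inputs:
        if not low_in <= x <= high_in:
            problems.append((x, "input outside observed range"))
            continue
        if not low_out <= component(x) <= high_out:
            problems.append((x, "output outside observed range"))
    return problems


def to_percent(level):       # original component: 0..255 -> 0..100
    return level * 100 // 255


def to_permille(level):      # replacement with a different output scale
    return level * 1000 // 255


recorder = IOModel()
monitored = recorder.capture(to_percent)
for level in (0, 64, 128, 255):      # step 1: capture behavior data
    monitored(level)
model = recorder.distill()           # step 2: distill a behavior model
report = check_component(model, to_permille, [0, 128, 255, 300])  # step 3
```

Range invariants are about the weakest model one could distill. They catch a scale mismatch like this one but would miss, for example, a replacement that returns values in the right range in the wrong order.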

28
Contribution to Standards

The Shadows project will be based on open standards for software lifecycle management
Enable true collaboration and interoperability
Faster adoption
Example: the TPTP framework, enabled by the open-source Eclipse IDE
Supports software modeling, testing, logging and profiling

29
Contribution to Standards (cont.)

The Consortium seeks close and productive working connections with standards working groups
Potential collaboration with DMTF
CIM enhancements and refinements
Automated management models
Behavior and state models
Policy-based management
Self-healing models

30
Summary

The Shadows initiative is an independent R&D effort that aims to improve the state of the art in system lifecycle and system management
Shadows will rely on its background technologies
Expand them to fix bugs of various types
Combine them
to cover a large variety of problems
for data sharing and mutual improvement
Shadows will build on open standards and influence them
The project entails collaboration with partners in Europe
Feedback, Early Access, Validation

31
Backup Material
