Searching Call Stacks for Known Problem Determination

1
Searching Call Stacks for Known Problem
Determination
  • Rajeev Gupta, Laurent Mignet, Natwar Modani
  • (IBM India Research Lab, New Delhi, India)
  • Guy Lohman, Tanveer Syeda-Mahmood
  • (IBM Almaden Research Center, San Jose, CA, USA)
  • Mark Brodie, Sheng Ma
  • (IBM T.J. Watson Research Center, Yorktown
    Heights, NY)

2
Automated techniques for known problem
determination
Known Problems
New Problems
3
Value proposition
  • More than 70% of the problems reported for various
    IBM products are known problems.
  • Support staff spend more than 33% of their time
    identifying known problems.
  • Identifying known problems efficiently (and
    solving them) can reduce the cost of servicing
    software.

4
Problems distribution
A small number of problems account for most of
the crashes
Data collected using the automated data
collection mechanism of Lotus Notes.
5
Example stack
  • Start stack traceback
  • sqloDumpEDU 0x1C
  • sqldDumpContext 0x148
  • sqlrr_dump_ffdc 0x520
  • sqlzeDumpFFDC 0x48
  • sqlzeSqlCode 0x2D4
  • sqlnn_erds 0x174
  • sqlno_ff_compute 0x628
  • sqlno_ff_or 0x260
  • sqlno_ff_compute 0x3D8
  • sqlno_ff_or 0x260
  • sqlno_ff_compute 0x3D8
  • sqlno_ff_or 0x260
  • sqlno_ff_compute 0x3D8

Common Error Handling Routines
FAILURE!
Levels of Recursion Probably Not Relevant
6
  • sqlno_ntup_ff_scan 0x10E0
  • sqlno_prep_phase 0x1704
  • sqlno_exe 0x944
  • sqlnn_cmpl
  • sqlnn_cmpl
  • sqlrr_cmpl
  • sqlra_compile_var 0x1290

Entry-level Routines Definitely Not Relevant (Too
Common)
The importance of a function decreases as we go
down the stack
7
Implementation architecture
Diagram components: Known Problem Database (stacks),
text containing stack(s), parser, recursion removal,
insertion with indexing, queries, query stack,
matching algorithm, indexing algorithm, ranked
matches to the query.
8
Recursion removal
  • Call stacks that differ only in the number of
    repetitions of the same recursive subsequence are
    treated as matches.
  • Stacks are stored with each recursive pattern
    replaced by a single instance of the pattern.
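
One way to picture recursion removal is the minimal sketch below. It is an illustration, not the authors' implementation: the function name `collapse_recursion` and the strategy of deleting immediately repeated blocks (shortest period first) are assumptions made for this example.

```python
def collapse_recursion(frames):
    """Collapse immediately repeated blocks of frames to a single instance.

    Hypothetical helper: repeatedly deletes the second copy of any block
    that is directly followed by an identical block, until nothing changes.
    """
    frames = list(frames)
    changed = True
    while changed:
        changed = False
        # Try every block length that could still repeat at least once.
        for size in range(1, len(frames) // 2 + 1):
            i = 0
            while i + 2 * size <= len(frames):
                if frames[i:i + size] == frames[i + size:i + 2 * size]:
                    # Drop the duplicate block and re-check the same position.
                    del frames[i + size:i + 2 * size]
                    changed = True
                else:
                    i += 1
    return frames


deep = ["sqlnn_erds",
        "sqlno_ff_compute", "sqlno_ff_or",
        "sqlno_ff_compute", "sqlno_ff_or",
        "sqlno_ff_compute", "sqlno_ff_or",
        "sqlno_ff_compute", "sqlno_ntup_ff_scan"]
shallow = ["sqlnn_erds",
           "sqlno_ff_compute", "sqlno_ff_or",
           "sqlno_ff_compute", "sqlno_ntup_ff_scan"]

# Both recursion depths reduce to the same stored form.
print(collapse_recursion(deep) == collapse_recursion(shallow))
```

With this normalization, two stacks that differ only in recursion depth compare as equal before any matching is attempted.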

9
Matching call stacks
  • Each function can be treated as a symbol and each
    stack as a string of symbols.
  • String matching algorithms can then be used.
  • A dynamic programming approach to string matching
    (based on the longest common subsequence, LCS) is
    used.
  • The matching score of two sequences is written in
    terms of the matching scores of their sub-sequences.
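
The idea of treating stacks as strings can be sketched with the textbook LCS recurrence applied to lists of function names. This is the plain, unweighted LCS, shown only to illustrate the dynamic programming structure; the function name `lcs_length` is chosen for this example.

```python
def lcs_length(query, stored):
    """Length of the longest common subsequence of two call stacks.

    Each stack is a list of function names; each name acts as one symbol.
    """
    m, n = len(query), len(stored)
    # table[i][j] holds the LCS length of query[:i] and stored[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if query[i - 1] == stored[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]


query = ["sqlzeSqlCode", "sqlnn_erds", "sqlno_ff_compute", "sqlno_ff_or"]
stored = ["sqlnn_erds", "sqlno_ff_or", "sqlno_exe"]
print(lcs_length(query, stored))
```

Each table entry depends only on entries for shorter prefixes, which is the "score in terms of sub-sequence scores" property the slide refers to.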

10
  • Extended the Needleman-Wunsch algorithm.
  • For query sequence Q = q1, q2, ..., qm and stored
    sequence D = d1, d2, ..., dn, the matching score of
    the sub-sequences up to the ith and jth elements of
    the respective stacks, Hij(Q,D), is written in terms
    of:
  • sij: function match score, applied when qi = dj
  • gij: function mismatch penalty, applied when qi ≠ dj
11
  • We add sij to the matching score when two
    functions are the same. It depends on
      • the importance of the function: if the matching
        function rarely occurs in stacks, sij is high;
      • the value of i: a match near the top of the
        stack is more important.
  • The mismatch penalty increases with |i - j|.
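
The weighting described above can be sketched as a Needleman-Wunsch-style global alignment. Everything specific here is an assumption for illustration: the slides do not give the formulas, so the log-based rarity weight, the 1/(1 + i) position weight, the linear penalty in |i - j|, and the names `match_score`, `mismatch_penalty`, and `alignment_score` are all hypothetical.

```python
import math


def match_score(name, i, stack_count, total_stacks):
    """Hypothetical s_ij: rare functions and frames near the top score higher."""
    rarity = math.log((1 + total_stacks) / (1 + stack_count.get(name, 0))) + 1.0
    position = 1.0 / (1 + i)  # i = 0 is the top of the stack
    return rarity * position


def mismatch_penalty(i, j, base=0.5):
    """Hypothetical g_ij: grows with the positional offset |i - j|."""
    return base * (1 + abs(i - j))


def alignment_score(query, stored, stack_count, total_stacks):
    """Global alignment score H[m][n] using the assumed weights above."""
    m, n = len(query), len(stored)
    H = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        H[i][0] = H[i - 1][0] - mismatch_penalty(i, 0)
    for j in range(1, n + 1):
        H[0][j] = H[0][j - 1] - mismatch_penalty(0, j)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            g = mismatch_penalty(i, j)
            if query[i - 1] == stored[j - 1]:
                s = match_score(query[i - 1], i - 1, stack_count, total_stacks)
                diagonal = H[i - 1][j - 1] + s
            else:
                diagonal = H[i - 1][j - 1] - g
            # Either align the two frames or skip a frame in one of the stacks.
            H[i][j] = max(diagonal, H[i - 1][j] - g, H[i][j - 1] - g)
    return H[m][n]


# Hypothetical counts: in how many stored stacks each function appears.
counts = {"sqlnn_erds": 1, "sqlno_ff_compute": 3, "sqlra_compile_var": 9}
same = alignment_score(["sqlnn_erds", "sqlno_ff_compute"],
                       ["sqlnn_erds", "sqlno_ff_compute"], counts, 10)
other = alignment_score(["sqlnn_erds", "sqlno_ff_compute"],
                        ["sqlra_compile_var"], counts, 10)
print(same > other)
```

Stored stacks can then be ranked by this score for a given query; the weights would need tuning against labeled data, which matches the "better heuristics" item in the continuing work.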

12
Indexing scheme
  • The index is used for faster rejection of stacks.
    Indexing generates candidate stacks, on which the
    matching algorithm is run to obtain the matching
    stacks.
  • Create a hash table from function names to the
    stacks that contain them:
  • V(fi) = {Si1, Si2, ..., Sin}
  • For a query stack f1, f2, ..., fm we use the hash
    table to create a multi-set of stacks as V(f1) ∪
    V(f2) ∪ V(f3) ∪ ... ∪ V(fm).
  • All stacks appearing in the multi-set more than
    a certain threshold number of times are selected as
    candidate stacks.
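
The indexing scheme can be sketched with a dictionary acting as the hash table and a `Counter` acting as the multi-set. The names `build_index` and `candidate_stacks`, the stack IDs, and the choice to count each distinct query function once are assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_index(stacks):
    """Map each function name f to V(f), the set of stack IDs containing it."""
    index = defaultdict(set)
    for stack_id, frames in stacks.items():
        for name in frames:
            index[name].add(stack_id)
    return index


def candidate_stacks(query, index, threshold):
    """Return the stack IDs that occur more than `threshold` times in the multi-set."""
    votes = Counter()
    # Assumption: each distinct function in the query votes once.
    for name in set(query):
        votes.update(index.get(name, ()))
    return {stack_id for stack_id, count in votes.items() if count > threshold}


# Hypothetical known-problem database keyed by made-up stack IDs.
stacks = {
    "S1": ["sqlnn_erds", "sqlno_ff_compute", "sqlno_ff_or"],
    "S2": ["sqlnn_erds", "sqlra_compile_var"],
    "S3": ["sqlno_exe"],
}
index = build_index(stacks)
query = ["sqlnn_erds", "sqlno_ff_or", "sqlno_prep_phase"]
print(candidate_stacks(query, index, threshold=1))
```

Only the surviving candidates need to go through the more expensive dynamic programming match, which is where the speed-up comes from.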

13
Experiments
  • 2300 stacks were collected from a
    production-level problem database used by
    customer support.
  • Each problem record consisted of a defect ID,
    a problem description, and a call stack.
  • Stacks belonging to the same or similar problems
    were grouped by domain experts.
  • Initial results indicate that the algorithm is
    able to identify matching stacks with 80%
    accuracy. False negatives were around 25% of
    correct matches.

14
Continuing work
  • More experiments with bigger data sets covering
    other IBM products.
  • Better heuristics for the function match score and
    mismatch penalty.
  • Learning-based approaches for incorporating
    client feedback.