Title: Symptom Services
1. Symptom Services For Ambiguous Situations
Hoi Chan, Jeanette Rosenthal, Thomas Kwok
IBM T.J. Watson Research Center
{hychan, jeanie, kwok}@us.ibm.com
2. Motivation
- In a typical data center, thousands of different events report system faults, status, and performance information. Their occurrences are unpredictable, and new events and conditions appear as the operating environment changes.
- Traditional approaches to symptom recognition, which rely on static authoring of pattern-matching rules, become insufficient.
- On-demand and autonomic computing will benefit from problem determination and remediation systems that are responsive to new and ambiguous situations and able to learn from them.
3. A Statistical Approach
- A method by which problem symptoms can be recognized even in ambiguous situations.
- Treats the observed event-symptom relationship, represented by an event-symptom matrix, as a statistical problem.
- Using the Singular Value Decomposition (SVD) technique, the implicit higher-order structure in the association of events with symptoms is modeled to estimate event-symptom associations.
4. Singular Value Decomposition
- A classical mathematical technique closely related to a class of mathematical and statistical techniques, such as eigenvector decomposition, spectral analysis, and factor analysis.
- Widely used in applications such as latent semantic analysis for information retrieval, ink retrieval from handwritten documents, and document search.
- Well-established theoretical foundation and readily available tools.
5. Singular Value Decomposition
- Singular value decomposition takes a rectangular matrix of event and symptom data (defined as $A$, an $m \times n$ matrix) in which the $m$ rows represent the events and the $n$ columns represent the symptoms.
- The SVD theorem states:
- $A_{m \times n} = E_{m \times m} \, S_{m \times n} \, P_{n \times n}$
- where
- $E^T E = I_{m \times m}$
- $P^T P = I_{n \times n}$ (i.e., $E$ and $P$ are orthogonal).
- The columns of $E$ are the left singular vectors; $S$ (the same dimensions as $A$) is diagonal and holds the singular values (mode amplitudes); and the rows of $P$ are the right singular vectors. The SVD represents an expansion of the original data in a coordinate system where the covariance matrix is diagonal.
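The factorization above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the 4 x 3 matrix is a hypothetical toy example rather than the event-symptom data from these slides. `numpy.linalg.svd` returns its third factor already transposed, which matches the convention here in which the rows of P are the right singular vectors.

```python
import numpy as np

# HYPOTHETICAL toy event-symptom matrix (rows: events, columns: symptoms).
A = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
m, n = A.shape

# full_matrices=True yields a square E (m x m) and a square P (n x n).
E, singular_values, P = np.linalg.svd(A, full_matrices=True)

# Place the singular values on the diagonal of an m x n matrix S.
S = np.zeros((m, n))
k = len(singular_values)
S[:k, :k] = np.diag(singular_values)

# Check the three statements of the theorem.
reconstruction_ok = np.allclose(E @ S @ P, A)       # A = E S P
E_orthogonal = np.allclose(E.T @ E, np.eye(m))      # E^T E = I
P_orthogonal = np.allclose(P.T @ P, np.eye(n))      # P^T P = I

print(reconstruction_ok, E_orthogonal, P_orthogonal)
```

NumPy returns the singular values in descending order, so the leading columns of E and the leading rows of P correspond to the largest singular values.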
6. Creation of Event-Symptom Matrix and Space by SVD Technique
- A simplified set of events and symptoms
- E1: request response time > 400 ms
- E2: request queue length > 100
- E3: excessive logins in the entire system
- E4: excessive requests from a domain
- E5: excessive requests from an individual IP
- E6: average server utilization > 90%
- E7: connection from unknown source
- E8: frequent request timeouts
- E9: excessive unknown application terminations
- Symptom 1: saturated on-demand router
- Symptom 2: unexpected peak demand
- Symptom 3: possible intruder attack
- Symptom 4: network congestion
- Symptom 5: serious security breach
7. Creation of Event-Symptom Matrix and Space by SVD Technique
Event-Symptom Matrix (the A Matrix)
This dataset consists of m events (Em) and n symptoms (Pn), where m = 9 and n = 5. The m events are entered as rows and the n symptoms as columns. Each entry in the event-symptom matrix simply records the occurrences of the corresponding event in the corresponding symptom.
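As an illustration of how such a matrix might be assembled, here is a minimal sketch in plain Python. The event and symptom labels follow the simplified set on the previous slide, but the associations in `observed` are hypothetical and invented for this example; they are not the data behind these slides. The helper name `build_event_symptom_matrix` is likewise an assumption.

```python
# Event and symptom identifiers follow the simplified set listed earlier.
events = [f"E{i}" for i in range(1, 10)]            # E1..E9 become rows
symptoms = [f"Symptom {j}" for j in range(1, 6)]    # Symptom 1..5 become columns

# HYPOTHETICAL observations: which events were seen with which symptom.
# These associations are invented for illustration only.
observed = {
    "Symptom 1": ["E1", "E2", "E6"],
    "Symptom 2": ["E1", "E2", "E4"],
    "Symptom 3": ["E4", "E5", "E7"],
    "Symptom 4": ["E1", "E8"],
    "Symptom 5": ["E3", "E7", "E9"],
}

def build_event_symptom_matrix(events, symptoms, observed):
    """Return an m x n list of lists counting event occurrences per symptom."""
    row = {event: i for i, event in enumerate(events)}
    col = {symptom: j for j, symptom in enumerate(symptoms)}
    matrix = [[0] * len(symptoms) for _ in events]
    for symptom, seen_events in observed.items():
        for event in seen_events:
            matrix[row[event]][col[symptom]] += 1
    return matrix

A = build_event_symptom_matrix(events, symptoms, observed)
for label, counts in zip(events, A):
    print(label, counts)
```

Counting occurrences (rather than storing a 0/1 flag) lets repeated observations of the same event-symptom pair strengthen the association.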
8. Singular Value Decomposition Calculation
Split A into E, S, and P (using the Visual Numerics library):
$A = E S P$
9. E Matrix
- -0.65 -0.2 -0.17 0.28 0.18 0.44 0.12 0.40 0.04
- -0.48 0.09 -0.36 -0.22 0.44 -0.44 -0.12 -0.40 -0.04
- -0.09 0.33 0.43 0.15 0.25 -0.27 0.05 0.35 -0.62
- -0.16 -0.33 0.18 0.50 -0.25 -0.36 0.48 -0.36 -0.00
- -0.2 -0.56 0.48 -0.15 -0.07 -0.07 -0.60 -0.03 -0.04
- -0.39 0.46 0.27 -0.11 -0.36 0.42 -0.01 -0.45 -0.12
- -0.04 -0.23 0.29 -0.66 0.18 0.07 0.60 0.03 0.04
- -0.29 0.13 -0.15 -0.27 -0.62 -0.42 0.01 0.45 0.12
- -0.09 0.33 0.43 0.15 0.25 -0.15 -0.03 0.09 0.75
10. S Matrix
11. P Matrix
- -0.74 0.25 -0.25 -0.30 -0.47
- -0.41 -0.61 0.30 0.56 -0.19
- -0.10 -0.43 0.47 -0.74 0.14
- -0.46 -0.07 -0.33 0.04 0.81
- -0.23 0.61 0.70 0.17 0.19
12. Two-Dimensional Event-Symptom Space
13. Creation of Event-Symptom Matrix and Space by SVD Technique
- In a two-dimensional model, where k = 2:
- All the event-to-event, symptom-to-symptom, and event-to-symptom similarities are now approximated using only the two largest singular values of S.
- As a result, the row vectors of the reduced matrices (the shaded columns of the E matrix in Figure 3 and the P matrix in Figure 4) are taken as coordinates of points representing events and symptoms in a two-dimensional space, where events are represented as diamonds and symptoms as squares.
- The dot product or cosine between two vectors representing any two components corresponds to their estimated similarity.
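The cosine comparison can be sketched in a few lines of plain Python. This sketch assumes that the shaded columns are the first two columns of the E and P matrices listed earlier, and it uses those entries unscaled, because the S matrix values are not reproduced in this text. The dictionary names `event_xy` and `symptom_xy` are illustrative.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

# ASSUMPTION: coordinates are the first two columns of the corresponding
# rows of the E matrix (events) and P matrix (symptoms) shown earlier.
event_xy = {
    "E3": (-0.09, 0.33),
    "E6": (-0.39, 0.46),
    "E9": (-0.09, 0.33),
}
symptom_xy = {
    "Symptom 3": (-0.10, -0.43),
    "Symptom 5": (-0.23, 0.61),
}

# Event-to-event similarity: E3 and E9 share the same 2-D coordinates.
print("E3 vs E9:", cosine(event_xy["E3"], event_xy["E9"]))

# Event-to-symptom similarity: compare E6 against two symptoms.
for name, xy in symptom_xy.items():
    print("E6 vs", name, round(cosine(event_xy["E6"], xy), 3))
```

Because the cosine depends only on direction, two points at different distances from the origin can still be judged highly similar; the dot product would also reflect their magnitudes.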
14. Selection and Creation of a Symptom Based on a New Set of Events
- When an observed event set matches one or more of the existing event sets, the system simply retrieves the corresponding symptom from the event-symptom repository.
- When a new set of events occurs that contains no individual new event, a pseudo-symptom is constructed from the observed event set as the weighted sum of its constituent event vectors, placing the pseudo-symptom at the centroid of its corresponding event points.
- This pseudo-symptom is compared against all existing symptoms, using the cosine between the pseudo-symptom vector and each existing symptom vector as the similarity metric.
- The symptoms with the highest cosines (the nearest vectors) to the pseudo-symptom are selected.
- Clearly, the choice of the threshold cosine value plays a significant role in the number and accuracy of the symptoms selected.
- The common practice is to use a small cosine threshold initially to enable a broader search space, and to narrow the search space gradually as more data is accumulated, to maximize accuracy.
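The selection steps above can be sketched as follows. This is an illustrative sketch, not the authors' implementation. It assumes equal weights for the constituent events (so the weighted sum reduces to the centroid), takes 2-D coordinates from the first two columns of the E and P matrices listed earlier without scaling by the singular values, and uses a hypothetical helper name `select_symptoms` and an arbitrary example threshold of 0.9.

```python
import math

# ASSUMPTION: 2-D coordinates are the first two columns of the E and P
# matrices shown earlier, unscaled by the singular values.
EVENT_XY = {
    "E4": (-0.16, -0.33),
    "E5": (-0.20, -0.56),
    "E7": (-0.04, -0.23),
}
SYMPTOM_XY = {
    "Symptom 1": (-0.74, 0.25),
    "Symptom 2": (-0.41, -0.61),
    "Symptom 3": (-0.10, -0.43),
    "Symptom 4": (-0.46, -0.07),
    "Symptom 5": (-0.23, 0.61),
}

def cosine(u, v):
    """Cosine of the angle between two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def pseudo_symptom(observed_events):
    """Centroid of the observed events' points (equal weights assumed)."""
    points = [EVENT_XY[e] for e in observed_events]
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def select_symptoms(observed_events, threshold):
    """Return (symptom, cosine) pairs at or above the threshold, best first."""
    center = pseudo_symptom(observed_events)
    scored = [(name, cosine(center, xy)) for name, xy in SYMPTOM_XY.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(name, score) for name, score in scored if score >= threshold]

# Example: a new combination of already-known events.
matches = select_symptoms(["E4", "E5", "E7"], threshold=0.9)
for name, score in matches:
    print(name, round(score, 3))
```

Under these assumptions, the combination of E4, E5, and E7 lands closest to Symptom 3 (possible intruder attack), with Symptom 2 also above the example threshold. Lowering `threshold` admits more candidate symptoms, which corresponds to the broader initial search described above.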
15. Conclusion / New Problems Solved
- Traditional problem determination systems have focused on pattern matching.
- Here we introduce a statistical approach, via SVD, to the recognition of symptoms.
- It empowers a class of applications that require the problem determination and remediation system to handle ambiguous situations and that allow the system to evolve in response to changing operating and environmental conditions.
- This approach not only performs as well as traditional symptom recognition systems where the conditions for a symptom are fixed, but also performs reasonably well in ambiguous and unpredictable situations.