ASI: Automated Summary and Insight on data cubes

About This Presentation
Title:

ASI: Automated Summary and Insight on data cubes

Description:

... Clustered Subset Selection (CSS) Input: data matrix ... CSS vs. other deterministic methods. Comparison of 8 subset selection algorithms over 4 datasets. ...

Slides: 21
Provided by: IBMU444

Transcript and Presenter's Notes

Title: ASI: Automated Summary and Insight on data cubes


1
(No Transcript)
2
(No Transcript)
3
IBM: A large IT services provider organization
  • Services, Services, Services
  • - to different companies: Oracle, Intel, ...
  • - across the globe: US, Canada, India, China, ...
  • - various service lines: Database, Storage, ...
Prism Insight: A platform for IT service metric monitoring.

4
IT Service Data
  • IT Service Metrics
  • - MTTR: Mean Time to Resolve
  • - Opened Tickets
  • - Closed Tickets
  • WID: Geography, Service line, Workstream, Account

We model the data as matrices; each matrix corresponds to one metric.
[Figure: a metric matrix with axes WID and TIME]
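As a rough illustration of the matrix model on this slide, the sketch below arranges per-day metric values into one TIME x WID matrix. The record layout `(date, wid, value)`, the function name `build_metric_matrix`, the sample dates, and the WID labels are all hypothetical; the slides do not describe how the data is stored.

```python
import numpy as np

def build_metric_matrix(records, dates, wids):
    """Arrange (date, wid, value) records into a TIME x WID matrix.

    Rows index dates and columns index WIDs (an assumed orientation).
    Cells with no record are left as NaN.
    """
    row = {d: i for i, d in enumerate(dates)}
    col = {w: j for j, w in enumerate(wids)}
    matrix = np.full((len(dates), len(wids)), np.nan)
    for date, wid, value in records:
        matrix[row[date], col[wid]] = value
    return matrix

# Hypothetical MTTR values for two WIDs over two days.
records = [
    ("2008-01-01", "wid-a", 4.0),
    ("2008-01-01", "wid-b", 7.5),
    ("2008-01-02", "wid-a", 3.0),
]
mttr = build_metric_matrix(
    records, ["2008-01-01", "2008-01-02"], ["wid-a", "wid-b"]
)
```

One such matrix would be built per metric (MTTR, opened tickets, closed tickets).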
5
Summarizing the IT Service Data
  • - Select Representative WIDs

[Figure: representative WIDs]
6
(No Transcript)
7
(No Transcript)
8
(No Transcript)
9
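Connection between Subset Selection and Service Metric Summary
[Figure: the data matrix A and the selected-column matrix C, each with axes TIME and WID]
  • Problem: $\min_{C} \| A - C C^{+} A \|_F$, where $C$ consists of columns selected from $A$ and $C^{+}$ is the pseudoinverse of $C$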
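The objective on this slide can be evaluated directly for any candidate set of columns. The sketch below is a minimal NumPy version, assuming the residual is measured in the Frobenius norm and that columns of A correspond to WIDs; the function name `subset_residual` and the small example matrix are illustrative only.

```python
import numpy as np

def subset_residual(A, cols):
    """Frobenius-norm residual of A after projecting onto the chosen columns."""
    C = A[:, cols]                           # selected columns
    projection = C @ np.linalg.pinv(C) @ A   # C C^+ A
    return np.linalg.norm(A - projection, "fro")

# Toy matrix: column 1 is twice column 0, so columns 0 and 2 span everything.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 2.0, 1.0]])

good = subset_residual(A, [0, 2])  # these two columns reproduce A exactly
poor = subset_residual(A, [0, 1])  # redundant pair, column 2 is not captured
```

A smaller residual means the chosen columns summarize the full matrix better, which is what makes a WID "representative" in this formulation.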
10
(No Transcript)
11
Subset selection
12
(No Transcript)
13
Summary of deterministic Subset Selection algorithms
14
Our Clustered Subset Selection (CSS)
Input: data matrix (axes: WID, TIME)
  • Clustering: cluster the columns of the matrix
  • Column Selection: select one column from each cluster

15
Our Clustered Subset Selection (CSS)
Input: data matrix (your favorite matrix)
  • Clustering: cluster the columns of the matrix
  • - k-means
  • - k-median
  • - spectral clustering
  • - close-to-rank-1 clustering
  • Column Selection: select one column from each cluster
  • - select the best column by exhaustive enumeration
  • - use one QR-based deterministic technique
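The two CSS steps might be sketched as follows. This is not the authors' implementation: it assumes plain Lloyd's k-means with a deterministic farthest-point initialization for the clustering step, and for the selection step it assumes "best column" means the column whose span best reconstructs the other columns in its cluster, found by exhaustive enumeration. All function names are hypothetical.

```python
import numpy as np

def farthest_point_init(points, k):
    """Deterministic seeding: start at point 0, then add the farthest point."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[int(d.argmax())])
    return np.array(centers, dtype=float)

def kmeans_columns(A, k, n_iter=50):
    """Lloyd's k-means over the columns of A; returns one label per column."""
    points = A.T  # one point per column
    centers = farthest_point_init(points, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels

def best_column_in_cluster(A, members):
    """Exhaustively pick the column that best reconstructs its own cluster."""
    block = A[:, members]
    best, best_err = members[0], np.inf
    for idx in members:
        c = A[:, [idx]]
        err = np.linalg.norm(block - c @ np.linalg.pinv(c) @ block, "fro")
        if err < best_err:
            best, best_err = idx, err
    return best

def clustered_subset_selection(A, k):
    """Cluster the columns of A, then select one column from each cluster."""
    labels = kmeans_columns(A, k)
    selected = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if len(members):
            selected.append(int(best_column_in_cluster(A, members)))
    return sorted(selected)

# Toy matrix: columns 0-2 behave alike, columns 3-5 behave alike.
A = np.array([[10.0, 11.0, 9.0, 0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0, 10.0, 9.0, 11.0],
              [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]])
selected = clustered_subset_selection(A, k=2)
```

Any of the clustering options listed on the slide could replace `kmeans_columns`, and a QR-based technique could replace the exhaustive search when clusters are large.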

16
CSS vs. other deterministic methods
  • Comparison of 8 subset selection algorithms over 4 datasets.
  • Our method outperforms the others on all datasets.
  • Entries show the approximation error $\| A - CX \| / \| A - A_k \|$.
  • $\| A - A_k \|$ is the error of the best rank-$k$ approximation $A_k$, computed with the SVD.
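The error ratio reported in the comparison could be computed along these lines. The sketch assumes the Frobenius norm and takes $X = C^{+} A$, neither of which the slide states explicitly; `relative_error` is an illustrative name. The ratio is undefined when A has rank at most k, since the denominator is then zero.

```python
import numpy as np

def relative_error(A, cols, k):
    """Ratio of the column-subset error to the best rank-k (SVD) error."""
    C = A[:, cols]
    X = np.linalg.pinv(C) @ A                 # assumed choice of X
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_k = (U[:, :k] * s[:k]) @ Vt[:k]         # truncated SVD: best rank-k approximation
    subset_err = np.linalg.norm(A - C @ X, "fro")
    best_err = np.linalg.norm(A - A_k, "fro")
    return subset_err / best_err

A = np.diag([3.0, 2.0, 1.0])
ratio_best_col = relative_error(A, [0], k=1)   # picks the dominant column
ratio_worst_col = relative_error(A, [2], k=1)  # picks the weakest column
```

With at most k selected columns the ratio cannot drop below 1, so values close to 1 indicate a column subset that is nearly as good as the SVD.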


17
(No Transcript)
18
MTTR: Mean Time to Resolve a problem
[Figure: MTTR matrix with axes TIME (490 dates) and WID (834 WIDs)]
19
Conclusions: Our results
  • We proposed the Clustered Subset Selection algorithm to solve a classical linear algebraic problem.
  • We demonstrated the effectiveness of our algorithm in IT service metric summarization.

20
Conclusions: One idea to take home
  • If you have a summarization problem
  • - text summarization
  • - video summarization
  • - you name it
  • There is a lot of work in linear algebra that might be useful in your domain.