Enhancing Set-Analysis through Scalable Visualizations - PowerPoint PPT Presentation

About This Presentation
Title:

Enhancing Set-Analysis through Scalable Visualizations

Description:

CMSC 838S Information Visualization, Spring 2006. May 09, 2006.

Slides: 35
Provided by: csU2
Learn more at: http://www.cs.umd.edu

Transcript and Presenter's Notes


1
Enhancing Set-Analysis through Scalable
Visualizations
Presented by Hamid Haidarian Shahri (hamid_at_cs.umd.edu)
and Mudit Agrawal (mudit_at_cs.umd.edu)
2
Content
  • Problem Definition
  • Motivation
  • Dataset
  • Architecture
  • Visualization Methods
  • Interaction Tools
  • Demo
  • Future Work

3
Problem Definition
  • Analysis of sets by
  • representing the clusters graphically
  • depicting their internal and external links
  • Scaling visualization

4
Motivation
  • Sets are encountered in various domains
  • websites
  • commodities
  • publications
  • anything that has attributes!!
  • Visualization of sets to aid human perception is
    still an unsolved problem
  • sets (and their elements) have no direct relations in
    the spatial domain
  • elements can be grouped based on various attributes

5
Dataset
  • 2700 law cases
  • Each case identified by a numerical id ranging
    from 1000 to 3718
  • Each tuple in the dataset records a reference from one
    case to another
  • The relation is directed and not symmetric (a reference
    also implies a temporal constraint on the cases: the
    referencing case comes after the case it references)

6
Snapshot of the data
  • First 50 links (approximately 0.1 percent of
    whole dataset)
  • (1001,1105,'100 S.Ct. 318'),(1001,1612,'101 S.Ct. 2352'),
    (1001,1018,'107 S.Ct. 1232'),(1001,1016,'112 S.Ct. 2886'),
    (1001,2923,'113 S.Ct. 2264'),(1001,1016,'120 L.Ed.2d 798'),
    (1001,2923,'124 L.Ed.2d 539'),(1001,2286,'138 F.3d 1036'),
    (1001,2396,'238 F.3d 382'),(1001,3410,'438 U.S. 104'),
    (1001,1105,'444 U.S. 51'),(1001,1612,'452 U.S. 264'),
    (1001,1018,'480 U.S. 470'),(1001,1016,'505 U.S. 1003'),
    (1001,2923,'508 U.S. 602'),(1001,3410,'57 L.Ed.2d 631'),
    (1001,1105,'62 L.Ed.2d 210'),(1001,1612,'69 L.Ed.2d 1'),
    (1001,1789,'926 F.2d 1169'),(1001,1018,'94 L.Ed.2d 472'),
    (1001,3410,'98 S.Ct. 2646'),(1002,1276,'100 S.Ct. 2138'),
    (1002,1101,'105 S.Ct. 3108'),(1002,1018,'107 S.Ct. 1232'),
    (1002,1098,'107 S.Ct. 2378'),(1002,1016,'112 S.Ct. 2886'),
    (1002,1015,'114 S.Ct. 2309'),(1002,1016,'120 L.Ed.2d 798'),
    (1002,1013,'121 S.Ct. 2448'),(1002,1012,'122 S.Ct. 1465'),
    (1002,1015,'129 L.Ed.2d 304'),(1002,2316,'142 F.3d 1319'),
    (1002,1013,'150 L.Ed.2d 592'),(1002,1012,'152 L.Ed.2d 517'),
    (1002,1121,'266 F.3d 487'),(1002,3028,'306 F.3d 113'),
    (1002,3410,'438 U.S. 104'),(1002,1276,'447 U.S. 255'),
    (1002,1101,'473 U.S. 172'),(1002,1018,'480 U.S. 470'),
    (1002,1098,'482 U.S. 304'),(1002,1016,'505 U.S. 1003'),
    (1002,1015,'512 U.S. 374'),(1002,1013,'533 U.S. 606'),
    (1002,1012,'535 U.S. 302'),(1002,3410,'57 L.Ed.2d 631'),
    (1002,2091,'59 F.3d 852'),(1002,1276,'65 L.Ed.2d 106'),
    (1002,1889,'746 F.2d 135'),(1002,1101,'87 L.Ed.2d 126'),
    (1002,1018,'94 L.Ed.2d 472'),(1002,2319,'953 F.2d 1299'),
    (1002,1098,'96 L.Ed.2d 250'),(1002,3410,'98 S.Ct. 2646'),
    (1002,1022,'980 F.2d 84'),(1002,2670,'989 F.2d 362'),
    (1003,1104,'100 S.Ct. 383'),(1003,1611,'104 S.Ct. 2862'),
    (1003,1100,'106 S.Ct. 1018'),(1003,1099,'107 S.Ct. 2076'),
    (1003,1016,'112 S.Ct. 2886'),(1003,3110,'116 S.Ct. 2432'),
    (1003,1016,'120 L.Ed.2d 798'),(1003,1012,'122 S.Ct. 1465'),
    (1003,1881,'13 F.3d 1192'),(1003,3054,'133 F.3d 893'),
    (1003,3110,'135 L.Ed.2d 964'),(1003,1012,'152 L.Ed.2d 517'),
    (1003,1047,'18 F.3d 1560'),(1003,1886,'265 F.3d 1237'),
    (1003,2689,'271 F.3d 1090'),(1003,1358,'271 F.3d 1327'),
    (1003,1149,'28 F.3d 1171'),(1003,1040,'331 F.3d 891')

(1001,1105,'100 S.Ct. 318')
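
Tuples in this format could be turned into one reference set per case roughly as follows. This is a minimal sketch, assuming each tuple reads (referencing case id, referenced case id, citation string); that field interpretation and the helper name `build_reference_sets` are assumptions, not something stated on the slides.

```python
import re
from collections import defaultdict

# A few tuples copied from the snapshot. Assumed field meaning:
# (referencing case id, referenced case id, 'citation string').
SNAPSHOT = (
    "(1001,1105,'100 S.Ct. 318'),(1001,1612,'101 S.Ct. 2352'),"
    "(1001,1105,'444 U.S. 51'),(1002,1276,'100 S.Ct. 2138')"
)

TUPLE_RE = re.compile(r"\((\d+),(\d+),'([^']*)'\)")

def build_reference_sets(text):
    """Map each referencing case id to the set of case ids it references."""
    refs = defaultdict(set)
    for citing, cited, _citation in TUPLE_RE.findall(text):
        # Several citation strings for the same pair collapse into one link.
        refs[int(citing)].add(int(cited))
    return dict(refs)

reference_sets = build_reference_sets(SNAPSHOT)
print(reference_sets)
```

Using a set per case means that repeated tuples for the same pair of cases (the snapshot lists several citation strings per pair) count as a single link.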
7
Architecture
[Diagram components: Data, Similarity Metric, Clustering Module, Clustered Data, Visualization Module]
8
Routine K-Means Clustering
  • Data points are in vector space.
  • x and μ (a data point and a cluster centroid) are vectors.
  • This assumption does not hold for cases
    represented as sets.
  • Centroids are not simple geometric means.
  • In fact, the mean of a collection of sets is not defined.
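
For contrast, here is a minimal sketch of one routine K-means iteration on ordinary vectors. The function names and sample points are illustrative assumptions, and the sketch assumes every cluster keeps at least one point. The update step relies on adding coordinates and dividing by a count, which is exactly what sets of references do not support.

```python
def assign(points, centroids):
    """Assignment step: index of the nearest centroid for each point."""
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return [min(range(len(centroids)), key=lambda k: sqdist(p, centroids[k]))
            for p in points]

def update(points, labels, k):
    """Update step: each centroid becomes the coordinate-wise mean of its points."""
    centroids = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        # The mean needs addition and division by a scalar: a vector space.
        # Assumes the cluster is non-empty.
        centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
    return centroids

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
labels = assign(points, [(0.0, 0.0), (10.0, 10.0)])
centroids = update(points, labels, 2)
print(labels, centroids)
```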

9
Routine Self Organizing Map
  • Wv and D are assumed to be vectors.
  • $W_v(t+1) = W_v(t) + \Theta(t)\,\alpha(t)\,\bigl(D(t) - W_v(t)\bigr)$
  • This assumption does not hold.
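
The update rule on this slide can be sketched for plain vectors as below. The parameter names `theta` (neighborhood factor) and `alpha` (learning rate) are assumed labels for the two scalar factors in the rule, and the sample values are illustrative only. The subtraction and scaling of coordinates is what has no counterpart for sets.

```python
def som_update(w, d, theta, alpha):
    """One update of a node's weight vector: w + theta * alpha * (d - w)."""
    return [wi + theta * alpha * (di - wi) for wi, di in zip(w, d)]

# Move the weight vector halfway toward the input vector.
w_new = som_update(w=[0.0, 0.0], d=[1.0, 2.0], theta=1.0, alpha=0.5)
print(w_new)
```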

10
Similarity Measures
  • Jaccard similarity
  • Reference-based similarity
  • Weighted reference-based similarity
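
Of the three measures, Jaccard similarity has a standard definition: the size of the intersection divided by the size of the union. A minimal sketch follows, assuming it is applied to the sets of cases that two cases reference (the slides do not say which sets are compared). The two example sets are the distinct referenced ids of cases 1001 and 1002 in the data snapshot. The reference-based measures are not defined in this transcript, so they are not sketched.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # convention for two empty sets (an assumption)
    return len(a & b) / len(union)

# Distinct case ids referenced by cases 1001 and 1002 in the snapshot.
refs_1001 = {1105, 1612, 1018, 1016, 2923, 2286, 2396, 3410, 1789}
refs_1002 = {1276, 1101, 1018, 1098, 1016, 1015, 1013, 1012, 2316,
             1121, 3028, 3410, 2091, 1889, 2319, 1022, 2670}

similarity = jaccard(refs_1001, refs_1002)  # 3 shared ids out of 23 distinct ids
print(similarity)
```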

11
Contribution to clustering
  • Applying K-means and SOM for producing better
    visualizations
  • Not apparent at first glance, but these
    algorithms are not directly applicable to set
    visualization
  • They assume a 2D or nD (vector) representation
    for each data point (i.e. each law case). More
    specifically, the attributes must form a vector
    space.
  • This assumption does not hold: the dataset has
    no clear geometric attributes
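
One common way to cluster items that only have a pairwise similarity is to replace the mean with a medoid, i.e. an actual member of the cluster. The sketch below is a k-medoids-style loop over Jaccard distance. It is a swapped-in illustration, not the adaptation used in this presentation, and the function names, the naive initialisation, and the toy sets are all assumptions.

```python
def jaccard_distance(a, b):
    """1 - Jaccard similarity; two empty sets are treated as identical."""
    union = a | b
    return 1.0 - (len(a & b) / len(union) if union else 1.0)

def k_medoids(sets, k, iterations=10):
    """Cluster a list of sets; medoids are members, so no mean is needed."""
    medoids = list(range(k))  # naive initialisation: the first k items
    labels = []
    for _ in range(iterations):
        # Assignment: each set goes to its nearest medoid.
        labels = [min(range(k), key=lambda c: jaccard_distance(s, sets[medoids[c]]))
                  for s in sets]
        # Update: pick the member with the smallest total distance to its cluster.
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster emptied out
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(min(
                members,
                key=lambda i: sum(jaccard_distance(sets[i], sets[j]) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels, medoids

toy_sets = [{1, 2, 3}, {1, 2}, {2, 3}, {7, 8}, {7, 8, 9}, {8, 9}]
labels, medoids = k_medoids(toy_sets, k=2)
print(labels, medoids)
```

The only operation the loop needs from the data is a distance between two items, so any of the similarity measures could be substituted for `jaccard_distance`.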

12
Similarity Metrics → Geometric Metrics
  • 1-D Partitioning
  • 2-D Partitioning
  • Sequential arrangement
  • Distance based arrangement

1 2 5 9
3 4 7 12
6 8 11 14
10 13 15 16
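
The 4 x 4 grid above is consistent with numbering cells by their distance from the top-left corner. The sketch below reproduces it. The sort key (Manhattan distance, then the larger coordinate, then the row) is inferred from the grid itself and is an assumption, as is the function name.

```python
def distance_ordered_grid(n):
    """Number the cells of an n x n grid 1..n*n by closeness to the top-left corner."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Sort key inferred from the slide's grid (an assumption): Manhattan distance
    # from the corner, ties broken by the larger coordinate, then by row.
    cells.sort(key=lambda rc: (rc[0] + rc[1], max(rc), rc[0]))
    grid = [[0] * n for _ in range(n)]
    for rank, (r, c) in enumerate(cells, start=1):
        grid[r][c] = rank
    return grid

for row in distance_ordered_grid(4):
    print(*row)
```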
13
K-Means
14
K-Means
15
SOM after K-Means
16
Various Interactive Tools
  • Referencing pattern (activating all links)
  • Local referencing
  • Density map
  • Representative element
  • Tool tip
  • Link follow-up
  • Search

17
Referencing Pattern
18
Local Referencing
19
Local Referencing
20
Density Map
21
Density Map
22
Representative Element
23
Link Follow-up
24
Link Follow-up
25
Link Follow-up
26
Link Follow-up
27
Link Follow-up
28
Link Follow-up
29
Link Follow-up
30
Link Follow-up
31
DEMO
32
Future Work
  • Other clustering algorithms can be explored
  • Spectral
  • Fuzzy C-means
  • More similarity functions
  • Better initial posting of data
  • Zooming and Panning

33
References
  • Abello, J., Korn, J., Visualizing Massive
    Multi-Digraphs. Proceedings of the IEEE Symposium
    on Information Visualization 2000.
  • Berry, M.W., Drmač, Z., Jessup, E.R., Matrices,
    Vector Spaces, and Information Retrieval. SIAM
    Review, 41(2), 1999, pp. 335-362.
  • Gansner, E.R., Koutsofios, E., North, S.C., Vo,
    K.P., A Technique for Drawing Directed Graphs.
    IEEE Trans. on Soft. Eng. 19(3), 1993, pp.
    214-230.
  • Guimerà, R., Mossa, S., Turtschi, A., Amaral,
    L.A.N., The Worldwide Air Transportation Network:
    Anomalous Centrality, Community Structure, and
    Cities' Global Roles. Proceedings of the National
    Academy of Sciences 102, May 31, 2005, pp.
    7794-7799.
  • Jain, A.K., Murty, M.N., Flynn, P.J., Data
    Clustering: A Review. ACM Computing Surveys,
    1999.
  • Kohonen, T., The Self-Organizing Map. Proceedings
    of the IEEE, Volume 78, Issue 9, Sept. 1990, pp.
    1464-1480.
  • Kohonen, T., Kaski, S., Lagus, K., Salojärvi, J.,
    Honkela, J., Paatero, V., Saarela, A., Self
    organization of a massive document collection.
    IEEE Transactions on Neural Networks, Vol. 11,
    2000, pp. 574-585.
  • Kunz, C., Botsch, V., Ziegler, J., Spath, D.,
    Contextualizing Search Results in Networked
    Directories. Proceedings of HCII, 2003.
  • Leuski, A., Strategy-based Interactive Cluster
    Visualization for Information Retrieval.
    International Journal on Digital Libraries, Vol.
    3, Issue 2, 2000, pp. 170.
  • Liu, X., Luo, M., Shneiderman, B., Visualization of
    Sets. Unpublished manuscript, 2005.
  • MacQueen, J.B., Some Methods for Classification
    and Analysis of Multivariate Observations.
    Proceedings of 5th Berkeley Symposium on
    Mathematical Statistics and Probability,
    Berkeley, University of California Press, 1967,
    pp. 281-297.
  • Murata, T., Visualizing the Structure of Web
    Communities Based on Data Acquired From a Search
    Engine. IEEE Trans. on Industrial Electronics,
    Vol. 50, No. 5, 2003.
  • Palla, G., Derenyi, I., Farkas, I., Vicsek, T.,
    Uncovering the Overlapping Structure of Complex
    Networks in Nature and Society. Nature Letters,
    Vol. 435, 9 June 2005, pp. 814.
  • Self-organizing map. Wikipedia, The Free
    Encyclopedia.
  • Seo, J., Shneiderman, B., Understanding
    Hierarchical Clustering Results by Interactive
    Exploration of Dendrograms: A Case Study with
    Genomic Microarray Data. IEEE Computer Special
    Issue on Bioinformatics, Volume 35, No. 7, July
    2002, pp. 80-86.

34
Thanks!