Projective Mapping: A Non-iterative Method for the Layout of Multidimensional Data

1
Projective Mapping: A non-iterative method for the
Layout of Multidimensional Data

Dr. Karina Assiter
Wentworth Institute of Technology
SCI2002

2
Outline

Background
Projective Mapping
Experimental Results
Conclusions
References

3
Background

Data generation and Data understanding
Data mining example
Objective of Layout
Layout Methods
MDS Optimization
Faithfulness in Mapping
Subjective Layout

4
Data generation and data understanding

Data generation
Provides Raw material
Results in large, high-dimensional data sets
Amount of data in the world's databases doubles
every 20 months
Many attributes for each sample
Data understanding
Fuels business growth (Competitive advantage)
Insights and patterns used to predict the future
Techniques
Clustering
Classifying
Visualizing (layout)

5
Data Mining Example

Iris Dataset - R.A. Fisher (1930s)
150 samples (50 of each type)
Three types of plants (setosa, versicolor,
virginica)
Four attributes (sepal length/width, petal
length/width)
Techniques
Classify new plants (learn rules)
If petal-length < 2.45 then Iris-setosa
Cluster items that fall together
If we did not know the type of Iris
Visualize the dataset (layout)
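
The learned rule on this slide can be read as a simple predicate. A minimal sketch, assuming only the single setosa rule quoted above; the function name `classify_by_petal_length` is hypothetical, and samples the rule does not cover are left undecided.

```python
def classify_by_petal_length(petal_length):
    """Apply the one rule from the slide: petal-length < 2.45 -> Iris-setosa."""
    if petal_length < 2.45:
        return "Iris-setosa"
    # The slide gives no rule for versicolor or virginica, so return no label.
    return None

print(classify_by_petal_length(1.4))
print(classify_by_petal_length(4.7))
```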

6
Objective of Layout

Create a mapping F from h samples in an
n-dimensional sample set S to representative
points in an m-dimensional representation set R.

7
Layout Methods

Multidimensional scaling (MDS)
Position points in R so that the distances match
(as closely as possible) the set of
dissimilarities in S.
Self-organizing maps
Mapping from an input data space S onto a
two-dimensional array of nodes R
Similar samples in S mapped to nearby nodes in R
Subjective Layout methods
A sample's placement in R depends upon its
relationship with user-defined (and positioned)
anchors
Inter-sample relationships may or may not be
preserved.

8
Illustration of MDS
9
MDS Optimization Methods

Move points around in R to minimize error
function
Error function
How accurately proximity values in S are related
to distances in R
Example: $E_2 = \frac{1}{\sum_{i<j} d(P_i, P_j)} \sum_{i<j} \frac{\left(d(P_i, P_j) - d(Q_i, Q_j)\right)^2}{d(P_i, P_j)}$
General characteristics
Run time complexity of O(h^2)
h is number of samples
Iterative (continue until stopping condition)
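
The example error function on this slide appears to be Sammon's stress (Sammon, 1969). Assuming that reading, a minimal sketch of evaluating it for a given layout follows; the function name `sammon_stress` is hypothetical, and the samples in S are assumed to be pairwise distinct so that no distance in S is zero. The loop over all pairs is where the O(h^2) cost comes from.

```python
import math
from itertools import combinations

def sammon_stress(points_s, points_r):
    """Stress between pairwise distances in S and the distances of the layout in R."""
    total_d = 0.0   # sum of d(Pi, Pj) over i < j
    weighted = 0.0  # sum of (d(Pi, Pj) - d(Qi, Qj))^2 / d(Pi, Pj) over i < j
    for i, j in combinations(range(len(points_s)), 2):
        d_s = math.dist(points_s[i], points_s[j])
        d_r = math.dist(points_r[i], points_r[j])
        total_d += d_s
        weighted += (d_s - d_r) ** 2 / d_s
    return weighted / total_d

samples = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
print(sammon_stress(samples, [(0, 0), (1, 0), (0, 1)]))  # distances preserved
print(sammon_stress(samples, [(0, 0), (2, 0), (0, 1)]))  # one point moved
```

An iterative MDS method would re-evaluate a function like this after every move of the points in R, stopping once the error no longer improves.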

10
Faithfulness in Mapping

A mapping $F: S \to R$ of an x-dimensional
subspace of an n-dimensional space into an
m-dimensional reference space is faithful if the
mapping preserves proximities precisely, so that
$d(P_i, P_j) = d(F(P_i), F(P_j))$ when Pi and Pj are
in S.
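
The faithfulness condition can be checked directly for a finite set of samples. A minimal sketch, assuming Euclidean distance and a small numerical tolerance; the name `is_faithful` and the `tol` parameter are hypothetical.

```python
import math
from itertools import combinations

def is_faithful(points_s, mapped_r, tol=1e-9):
    """True if every pairwise distance in S equals the distance between the images in R."""
    return all(
        math.isclose(
            math.dist(points_s[i], points_s[j]),
            math.dist(mapped_r[i], mapped_r[j]),
            abs_tol=tol,
        )
        for i, j in combinations(range(len(points_s)), 2)
    )

# Samples lying in the x-z plane of a three-dimensional space.
samples = [(0, 0, 0), (1, 0, 0), (0, 0, 1)]
print(is_faithful(samples, [(0, 0), (1, 0), (0, 1)]))  # keep x and z
print(is_faithful(samples, [(0,), (1,), (0,)]))        # keep x only
```

Dropping the unused y coordinate keeps every distance, so the first mapping is faithful; keeping only x collapses two samples onto the same point, so the second is not.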

11
Illustration of Faithfulness
[Figure: mapping F from S to R, taking points P1, P2, P3, P4 in S to Q1, Q2, Q3, Q4 in R]
12
Subjective Layout

A sample's mapping in R depends upon its
relationship with user-defined (and positioned)
anchors
Inter-object relationships may or may not be
preserved.
Classified as
Query-relative (traditional)
Anchors in T are queries (keywords, mailing
lists, etc.)
Point-relative
Anchors in T are representatives of samples from
S

13
Illustration of Subjective Layout
14
Projective Mapping

Overview
Illustration of placing a sample
Explain reference set T
Method Details
Determine Sense of P in Plane P

15
Projective Mapping Overview

Samples in a dataset S are mapped into a
representation set R based on the centroid of
their geometric relationships to a set T of
pre-positioned references

16
Illustration of placing a sample
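[Figure: in S, sample P, references Sj and Sk, line L1, line L2, and point J; in R, references Rj and Rk, line L1, line L2, point J, and the result Qi]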
17
Reference Set T

k samples selected from S
S1, S2, ..., Sk
Mapped into R
Arbitrarily
User positions
Computed
Optimization method that preserves distances

18
Method Details
I. Determine plane P onto which we wish to project (w1, w2)
II. For each vector, P, in S
  A. For each valid pair of references, Sj and Sk
    1. Determine sense (left or right) of projected P in P (explained in next slide)
    2. In S, find unit vector B perpendicular to L1 through P
    3. In S, get t1 and t2 from
       $J = S_j + (S_k - S_j)\,t_1$
       $J = P + B\,t_2$
    4. In R, find unit vector B perpendicular to L1, depending on sense
       $B = (R_{k1} - R_{j1},\; R_{j0} - R_{k0})$ (sense was left)
       $B = (R_{j1} - R_{k1},\; R_{k0} - R_{j0})$ (sense was right)
    5. Use t1 to determine J in R
       $J = R_j + (R_k - R_j)\,t_1$
    6. Determine scale factor from ratio between line Sk - Sj in S and line Rk - Rj in R
       $sf = \lVert R_k - R_j \rVert / \lVert S_k - S_j \rVert$
    7. Calculate Qi based on t2 and B
       $Q_i = J + (B\,t_2\,sf)$
  B. Average results from each pair to get position Q.
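
The steps on this slide can be sketched in code. This is an illustration under stated assumptions, not the author's implementation: the operators lost in the transcript are read as J = Sj + (Sk - Sj) t1 and Qi = J + B t2 sf, w1 and w2 are assumed orthonormal, a pair is treated as valid when its references are distinct in both S and R, and the names `dot` and `project_sample` are hypothetical.

```python
import math
from itertools import combinations

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project_sample(p, refs_s, refs_r, w1, w2):
    """Place sample p in R by averaging its placement over all valid reference pairs."""
    def plane(x):
        return dot(x, w1), dot(x, w2)  # coordinates of x in the plane spanned by w1, w2

    xp, yp = plane(p)
    estimates = []
    for j, k in combinations(range(len(refs_s)), 2):
        sj, sk, rj, rk = refs_s[j], refs_s[k], refs_r[j], refs_r[k]
        len_s, len_r = math.dist(sj, sk), math.dist(rj, rk)
        if len_s == 0 or len_r == 0:
            continue  # assumption: a pair is valid only if its references are distinct
        # Step 1: sense of p relative to the projected line through Sj and Sk.
        (xj, yj), (xk, yk) = plane(sj), plane(sk)
        side = (yk - yj) * (xp - xj) + (xj - xk) * (yp - yj)
        left = side > 0  # assumption: points exactly on the line count as "right"
        # Steps 2-3: t1 places J along L1; t2 is the distance from p to J, both in S.
        d_s = [b - a for a, b in zip(sj, sk)]
        t1 = dot([a - b for a, b in zip(p, sj)], d_s) / len_s ** 2
        j_s = tuple(a + t1 * d for a, d in zip(sj, d_s))
        t2 = math.dist(p, j_s)
        # Step 4: unit vector B in R, perpendicular to L1, chosen by sense.
        if left:
            b = (rk[1] - rj[1], rj[0] - rk[0])
        else:
            b = (rj[1] - rk[1], rk[0] - rj[0])
        b = (b[0] / len_r, b[1] / len_r)
        # Steps 5-7: J in R, scale factor, and this pair's estimate Qi.
        j_r = (rj[0] + (rk[0] - rj[0]) * t1, rj[1] + (rk[1] - rj[1]) * t1)
        sf = len_r / len_s
        estimates.append((j_r[0] + b[0] * t2 * sf, j_r[1] + b[1] * t2 * sf))
    if not estimates:
        raise ValueError("no valid reference pairs")
    # Step B: average the per-pair estimates to get Q.
    n = len(estimates)
    return (sum(q[0] for q in estimates) / n, sum(q[1] for q in estimates) / n)

# Hypothetical data: samples in the x-z plane, references placed in R at their (x, z).
refs_s = [(1, 0, 0), (0, 0, 1), (-1, 0, 0)]
refs_r = [(1, 0), (0, 1), (-1, 0)]
print(project_sample((0.5, 0, 0.25), refs_s, refs_r, (1, 0, 0), (0, 0, 1)))
```

Because the three references here are placed in R without distortion, every pair yields the same estimate, and the sample lands at approximately (0.5, 0.25), its own x and z coordinates.
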
19
Determine Sense of P in Plane P
B-coord of Sk: (-2,1)
B-coord of Sj: (2,-1)
B-coord of P: (-1,1)
w2: (0,1,0)
w1: (1,0,0)
Plane P to project onto: (1,0,0), (0,1,0)
Equation of line between B-coord of Sj and B-coord of Sk: $2x + 4y = 0$
So $2(-1) + 4(1) > 0$: left side of line
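
The sense test on this slide amounts to evaluating the line equation at the projected sample. A minimal sketch using the slide's numbers; the names `line_through` and `sense` are hypothetical, and the orientation of the line's normal is an assumption chosen so that the coefficients match the slide's 2x + 4y = 0.

```python
def line_through(sj, sk):
    """Coefficients (a, b, c) of the line a*x + b*y + c = 0 through sj and sk."""
    a = sk[1] - sj[1]
    b = sj[0] - sk[0]
    c = -(a * sj[0] + b * sj[1])
    return a, b, c

def sense(sj, sk, p):
    """'left' if the line equation is positive at p, otherwise 'right'."""
    a, b, c = line_through(sj, sk)
    return "left" if a * p[0] + b * p[1] + c > 0 else "right"

# Plane coordinates from the slide: Sj = (2,-1), Sk = (-2,1), P = (-1,1).
print(line_through((2, -1), (-2, 1)))    # (2, 4, 0)
print(sense((2, -1), (-2, 1), (-1, 1)))  # left
```
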
20
Experimental Results

Example I
Embedded two-dimensional data
Example II
Lines to Lines
Example III
Rotation
Example IV
Identity mapping
Example V (Optional)
Cube series

21
Example I: Embedded two-dimensional data
Samples in S (triangles)
      x     y    z
P0    .5    0    0
P1    0     0    .5
P2    .5    0    -.5
P3    1     0    0
P4    0     0    -2
P5    -1    0    0
S0 at (1,0,0)
S1 at (0,0,1)
S2 at (-1,0,0)
S3 at (0,0,-1)
[Figure: plots of S and R; axis labels x, y, z]
22
Example II: Lines to lines
23
Example III: Rotation
24
Example IV: Identity Mapping
25
Example V: Cube series (Optional)
Front face, horizontal plane
Front face T, Plane between horizontal and 45°
Front face T, Plane between 45° and vertical
Front face T, Plane is vertical
Diagonal T, vertical plane
Diagonal T, horizontal plane
Diagonal T, Planes between horizontal and 45°
Diagonal T, Planes between 45° and vertical
26
Run-time Complexity

Inserting one sample
Depends on k
The number of references in T
Upper bound of O(k^2)
All ways to select 2 of the k items: $\binom{k}{2} = \frac{k!}{(k-2)!\,2!}$
Inserting h samples
Upper bound of O(k^2 h).
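
The pair count behind these bounds can be checked numerically. A small sketch; the function names `pairs_per_sample` and `pairs_total` are hypothetical.

```python
import math
from itertools import combinations

def pairs_per_sample(k):
    """Number of reference pairs examined for one sample: k! / ((k-2)! 2!)."""
    return math.comb(k, 2)

def pairs_total(k, h):
    """Number of reference pairs examined when inserting h samples."""
    return pairs_per_sample(k) * h

print(pairs_per_sample(4))                    # 6
print(len(list(combinations(range(4), 2))))   # 6, by enumeration
print(pairs_total(4, 150))                    # 900
```

Since k(k-1)/2 grows as k^2, the cost per sample is bounded by O(k^2) and does not depend on how many samples have already been placed.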

27
Comparing Methods
Hypercube: samples 00000-11111.
Projective mapping: references in S 00000-00011;
associated references in R 00-11. The plane for
the projection was (00001, 00010).
Projective Mapping on the Hypercube
Sammon Mapping on the Hypercube
Iris dataset: 150 samples, three clusters of
flowers.
Sammon Mapping on the Iris dataset
Projective Mapping on the Iris dataset
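
The hypercube setup described on this slide can be written out as data. A minimal sketch, assuming each bit string is read as a 0/1 coordinate vector and that the references in S are paired with those in R in the order listed; all variable names are hypothetical. It builds only the inputs and the plane coordinates used for the sense test, not the layout itself.

```python
from collections import Counter
from itertools import product

samples = list(product((0, 1), repeat=5))                         # 00000 ... 11111
refs_s = [(0, 0, 0, a, b) for a, b in product((0, 1), repeat=2)]  # 00000 ... 00011
refs_r = list(product((0, 1), repeat=2))                          # 00 ... 11
w1, w2 = (0, 0, 0, 0, 1), (0, 0, 0, 1, 0)                         # projection plane

def plane_coords(x):
    """Coordinates of x in the plane spanned by w1 and w2."""
    return (sum(a * b for a, b in zip(x, w1)), sum(a * b for a, b in zip(x, w2)))

# How many of the 32 samples share each position once projected onto the plane.
print(Counter(plane_coords(s) for s in samples))
```

With this plane, the 32 samples project onto only four distinct positions, eight samples each, which gives a sense of how much structure a single plane can hide in higher-dimensional data.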
28
Conclusions

Benefits of PM
Drawbacks of PM
Uses for Projective mapping
Future Work

29
Benefits of PM

Fast: O(k^2 h)
Non-iterative
Guaranteed upper-bound
Adaptive
Point-relative Subjective
In one and two-dimensions
Faithful mapping of references applied to samples
Linear transformation of references applied to
samples
Structure preserving (with distortion)
Works with fallible and sparse datasets
Creates consistent layouts
Not domain specific

30
Drawbacks of PM

With high-dimensional data
Distortion occurs
Plane selection is hard to optimize
Faithfulness requires Euclidean distances

31
Uses for Projective Mapping

Visualization to discover
Data Dimensionality
Data Structure
Relationship between a subset of references and
all other samples

32
Future work

Generalize for
N-dimensional to n-dimensional mapping
Vary for testing
Plane selection
Data dimensionality
Reference set
Selection
Size
Visualize as slide show
multiple-views of dataset

33
Selected References

Assiter, K.A. (2001). Projective Mapping: A non-iterative method for the layout of multidimensional data. Dissertation. Tufts University, Medford, MA, 2001.
Chalmers, M. and P. Chitson (1992). Bead: Explorations in Information Visualization. SIGIR '92: Proceedings of the Fifteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Denmark, ACM Press.
Cox, T. F. and M. A. Cox (1994). Multidimensional Scaling. Monographs on Statistics and Applied Probability. London, Chapman & Hall.
Fairchild, K. M., S. E. Poltrock and G. W. Furnas (1988). SemNet: Three-Dimensional Graphic Representations of Large Knowledge Bases. Cognitive Science and its Applications for Human-Computer Interaction. R. Guindon. Hillsdale, New Jersey, Lawrence Erlbaum Associates: 201-233.
Kruskal, J. B. (1964a). Multidimensional scaling by optimizing goodness of fit to a non-metric hypothesis. Psychometrika 29(1): 1-27. Reprinted in Key Texts in Multidimensional Scaling, P.M. Davies and A.P.M. Coxon, Eds. Heinemann Educational Books, Exeter, N.H., 1982, pp. 59-83.
Kruskal, J. B. (1964b). Non-metric multidimensional scaling: A numerical method. Psychometrika 29(2): 115-129. Reprinted in Key Texts in Multidimensional Scaling, P.M. Davies and A.P.M. Coxon, Eds. Heinemann Educational Books, Exeter, N.H., 1982, pp. 59-83.
Olsen, K. A. (1993). Visualization of a document collection: The Vibe System. Information Processing and Management 29(1): 69-81, Pergamon Press Ltd, 1993.
Sammon, J. W. (1969). A Nonlinear Mapping for Data Structure Analysis. IEEE Transactions on Computers 18(5): 401-409, May 1969.

34
End
35
Multidimensional Scaling

Metric
Preserve actual proximities
Types
Classical (PVA)
Least squares
Optimization method with global error function
Non-metric
Preserve rank order of proximities
Optimization method with global error function
Spring model methods
Attractive and repulsive forces between objects
act to either bring them together or push them
apart
Optimization with local error function

36
Notation
In S
S: N-dimensional sample set
Sj, Sk: Projection pair of references in S
J: Point where P is projected perpendicularly to L2
L1: Line that goes through Sj and Sk
L2: Line that goes through J and P
P: Sample in S to be mapped
t1: Placement of J along L1 (time parameter in linear equation)
t2: Distance between J and P (not a time parameter)
In R
R: Two-dimensional representation set
Rj, Rk: Projection pair of references in R
J: Intersection point between L1 and L2
L1: Line that goes through Rj and Rk
L2: Line that goes through J and Qi
Qi: Result of two-point n-dimensional projective mapping
