Projective Mapping: A non-iterative method for the Layout of Multidimensional Data



1
Projective Mapping: A non-iterative method for the
Layout of Multidimensional Data
  • Dr. Karina Assiter
  • Wentworth Institute of Technology
  • SCI2002

2
Outline
  • Background
  • Projective Mapping
  • Experimental Results
  • Conclusions
  • References

3
Background
  • Data generation and Data understanding
  • Data mining example
  • Objective of Layout
  • Layout Methods
  • MDS Optimization
  • Faithfulness in Mapping
  • Subjective Layout

4
Data generation and data understanding
  • Data generation
  • Provides Raw material
  • Results in large, high-dimensional data sets
  • Amount of data in the world's databases doubles
    every 20 months
  • Many attributes for each sample
  • Data understanding
  • Fuels business growth (Competitive advantage)
  • Insights and patterns used to predict the future
  • Techniques
  • Clustering
  • Classifying
  • Visualizing (layout)

5
Data Mining Example
  • Iris Dataset - R.A. Fisher (1930s)
  • 150 samples (50 of each type)
  • Three types of plants (setosa, versicolor,
    virginica)
  • Four attributes (sepal length/width, petal
    length/width)
  • Techniques
  • Classify new plants (learn rules)
  • If petal-length < 2.45 then Iris-setosa
  • Cluster items that fall together
  • If we did not know the type of Iris
  • Visualize the dataset (layout)

6
Objective of Layout
  • Create a mapping F from h samples in an
    n-dimensional sample set S to representative
    points in an m-dimensional representation set R.

7
Layout Methods
  • Multidimensional scaling (MDS)
  • Position points in R so that the distances match
    (as closely as possible) the set of
    dissimilarities in S.
  • Self-organizing maps
  • Mapping from an input data space S onto a
    two-dimensional array of nodes R
  • Similar samples in S mapped to nearby nodes in R
  • Subjective Layout methods
  • A sample's placement in R depends upon its
    relationship with user-defined (and positioned)
    anchors
  • Inter-sample relationships may or may not be
    preserved.

8
Illustration of MDS
9
MDS Optimization Methods
  • Move points around in R to minimize error
    function
  • Error function
  • How accurately proximity values in S are related
    to distances in R
  • Example:
    $$E2 = \frac{1}{\sum_{i<j} d(P_i, P_j)} \sum_{i<j} \frac{\left(d(P_i, P_j) - d(Q_i, Q_j)\right)^2}{d(P_i, P_j)}$$
  • General characteristics
  • Run-time complexity of O(h^2)
  • h is number of samples
  • Iterative (continue until stopping condition)
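
The example error function can be evaluated directly for a candidate layout. The sketch below assumes it is the normalized stress used in Sammon mapping (sum over pairs i < j, each squared difference divided by the original distance); the function name `stress` and the sample coordinates are illustrative and do not come from the slides.

```python
import math
from itertools import combinations

def stress(points_s, points_r):
    """Normalized squared error between distances in S and distances in R."""
    total_d = 0.0
    weighted_err = 0.0
    for i, j in combinations(range(len(points_s)), 2):  # all pairs with i < j
        d_s = math.dist(points_s[i], points_s[j])        # d(Pi, Pj)
        d_r = math.dist(points_r[i], points_r[j])        # d(Qi, Qj)
        total_d += d_s
        weighted_err += (d_s - d_r) ** 2 / d_s           # assumes distinct samples (d_s > 0)
    return weighted_err / total_d

# Hypothetical data: three samples in 3-D and two candidate 2-D layouts.
S = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
exact_layout = [(0, 0), (1, 0), (0, 1)]      # all distances preserved
stretched_layout = [(0, 0), (2, 0), (0, 1)]  # one point moved too far away

print(stress(S, exact_layout))
print(stress(S, stretched_layout))
```

An iterative MDS method would repeatedly adjust the points in R to reduce this value; evaluating it once visits every pair of the h samples, which is where the O(h^2) cost comes from.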

10
Faithfulness in Mapping
  • A mapping F: S -> R of an x-dimensional
    subspace of an n-dimensional space into an
    m-dimensional reference space is faithful if the
    mapping preserves proximities precisely, so that
    d(Pi,Pj) = d(F(Pi), F(Pj)) when Pi and Pj are
    in S.
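
Faithfulness can be checked numerically by comparing every pairwise distance before and after the mapping. This is a minimal sketch that assumes Euclidean distance; `is_faithful`, the tolerance, and the two candidate mappings are illustrative, and the sample coordinates are borrowed from Example I later in the deck.

```python
import math
from itertools import combinations

def is_faithful(samples, mapped, tol=1e-9):
    """True if every pairwise distance in S equals the distance between the images in R."""
    return all(
        math.isclose(math.dist(samples[i], samples[j]),
                     math.dist(mapped[i], mapped[j]), abs_tol=tol)
        for i, j in combinations(range(len(samples)), 2)
    )

# Samples lying in the plane y = 0 of a three-dimensional space
# (a two-dimensional subspace of S).
S = [(0.5, 0.0, 0.0), (0.0, 0.0, 0.5), (0.5, 0.0, -0.5), (1.0, 0.0, 0.0)]
drop_y = [(x, z) for x, _, z in S]  # keeps both in-plane coordinates
drop_z = [(x, y) for x, y, _ in S]  # discards one in-plane coordinate

print(is_faithful(S, drop_y))  # True
print(is_faithful(S, drop_z))  # False
```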

11
Illustration of Faithfulness
[Figure: mapping F takes points P1-P4 in S to points Q1-Q4 in R]
12
Subjective Layout
  • A sample's mapping in R depends upon its
    relationship with user-defined (and positioned)
    anchors
  • Inter-object relationships may or may not be
    preserved.
  • Classified as
  • Query-relative (traditional)
  • Anchors in T are queries (keywords, mailing
    lists, etc.)
  • Point-relative
  • Anchors in T are representatives of samples from
    S

13
Illustration of Subjective Layout
14
Projective Mapping
  • Overview
  • Illustration of placing a sample
  • Explain reference set T
  • Method Details
  • Determine Sense of P in P

15
Projective Mapping Overview
  • Samples in a dataset S are mapped into a
    representation set R based on the centroid of
    their geometric relationships to a set T of
    pre-positioned references

16
Illustration of placing a sample
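[Figure: in S, sample P is projected perpendicularly onto line L1 (through references Sj and Sk) at point J, with L2 the line through J and P; in R, J lies on L1 (through Rj and Rk) and Qi lies on L2 through J]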
17
Reference Set T
  • k samples selected from S
  • S1, S2, ..., Sk
  • Mapped into R
  • Arbitrarily
  • User positions
  • Computed
  • Optimization method that preserves distances

18
Method Details
I. Determine the plane P onto which we wish to project (defined by w1, w2)
II. For each vector, P, in S
  A. For each valid pair of references, Sj and Sk
    1. Determine sense (left or right) of projected P in plane P
       (explained in next slide)
    2. In S, find unit vector B perpendicular to L1 through P
    3. In S, get t1 and t2 from
         J = Sj + (Sk - Sj) t1
         J = P + B t2
    4. In R, find unit vector B perpendicular to L1, depending on sense
         B = (Rk1 - Rj1, Rj0 - Rk0)   if sense was left
         B = (Rj1 - Rk1, Rk0 - Rj0)   if sense was right
    5. Use t1 to determine J in R
         J = Rj + (Rk - Rj) t1
    6. Determine scale factor from the ratio between line Sk - Sj in S
       and line Rk - Rj in R
         sf = |Rk - Rj| / |Sk - Sj|
    7. Calculate Qi based on t2 and B
         Qi = J + (B t2 sf)
  B. Average results from each pair to get position Q.
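
The steps on this slide can be sketched in a few lines of Python. This is one reading of the slide, not the author's implementation, and it makes several assumptions: w1 and w2 are orthonormal, a "valid" pair is any pair of references with non-zero separation in both S and R, the perpendicular in R is normalized to unit length, and a point exactly on the projected line is treated as "right". The function names and the reference positions in R are hypothetical.

```python
import math
from itertools import combinations

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def place_sample(p, refs_s, refs_r, w1, w2):
    """Map sample p (in S) to a 2-D point in R by averaging over reference pairs."""
    results = []
    for j, k in combinations(range(len(refs_s)), 2):
        sj, sk, rj, rk = refs_s[j], refs_s[k], refs_r[j], refs_r[k]
        line_s = [b - a for a, b in zip(sj, sk)]       # Sk - Sj
        line_r = (rk[0] - rj[0], rk[1] - rj[1])        # Rk - Rj
        len_s, len_r = math.hypot(*line_s), math.hypot(*line_r)
        if len_s == 0 or len_r == 0:
            continue                                   # assumed meaning of an invalid pair
        # Step 1: sense of projected p relative to the projected line Sj-Sk.
        (xj, yj), (xk, yk), (xp, yp) = [(dot(v, w1), dot(v, w2)) for v in (sj, sk, p)]
        left = (yk - yj) * (xp - xj) + (xj - xk) * (yp - yj) > 0
        # Steps 2-3: the foot J of the perpendicular from p to L1 gives t1; |p - J| gives t2.
        t1 = dot([a - b for a, b in zip(p, sj)], line_s) / len_s ** 2
        foot = [a + d * t1 for a, d in zip(sj, line_s)]
        t2 = math.dist(p, foot)
        # Step 4: unit perpendicular to L1 in R, oriented by the sense.
        bx, by = (line_r[1], -line_r[0]) if left else (-line_r[1], line_r[0])
        bx, by = bx / len_r, by / len_r
        # Steps 5-7: J in R, the scale factor, and this pair's estimate Qi.
        sf = len_r / len_s
        jx, jy = rj[0] + line_r[0] * t1, rj[1] + line_r[1] * t1
        results.append((jx + bx * t2 * sf, jy + by * t2 * sf))
    if not results:
        raise ValueError("no valid reference pairs")
    n = len(results)
    return (sum(q[0] for q in results) / n, sum(q[1] for q in results) / n)

# References from Example I; their positions in R are an assumption (x and z kept).
refs_s = [(1, 0, 0), (0, 0, 1), (-1, 0, 0), (0, 0, -1)]
refs_r = [(1, 0), (0, 1), (-1, 0), (0, -1)]
w1, w2 = (1, 0, 0), (0, 0, 1)

print(place_sample((0.5, 0, 0), refs_s, refs_r, w1, w2))
print(place_sample((0, 0, -2), refs_s, refs_r, w1, w2))
```

With the references placed this way, each pair yields the same estimate, so the two samples should land at about (0.5, 0) and (0, -2), up to floating-point rounding.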
19
Determine Sense of P in P
B-coord of Sk = (-2,1)
B-coord of Sj = (2,-1)
B-coord of P = (-1,1)
w1 = (1,0,0)
w2 = (0,1,0)
Plane P to project onto: defined by (1,0,0), (0,1,0)
Equation of line between B-coord Sj and B-coord Sk: 2x + 4y = 0
So 2(-1) + 4(1) = 2 > 0: P is on the left side of the line
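
The sense test on this slide amounts to substituting the projected sample into the equation of the line through the projected references. A minimal sketch, assuming the coefficients are taken as a = Sk.y - Sj.y and b = Sj.x - Sk.x (which reproduces 2x + 4y = 0 for the values shown); the function name `sense` and the tie-breaking choice are assumptions.

```python
def sense(sj, sk, p):
    """Return 'left' or 'right' for 2-D point p relative to the line through sj and sk."""
    a = sk[1] - sj[1]
    b = sj[0] - sk[0]
    value = a * (p[0] - sj[0]) + b * (p[1] - sj[1])
    # A point exactly on the line (value == 0) is reported as 'right';
    # that tie-breaking rule is an arbitrary assumption.
    return "left" if value > 0 else "right"

# Values from the slide: Sj = (2,-1), Sk = (-2,1), P = (-1,1).
print(sense((2, -1), (-2, 1), (-1, 1)))  # left
```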
20
Experimental Results
  • Example I
  • Embedded two-dimensional data
  • Example II
  • Lines to Lines
  • Example III
  • Rotation
  • Example IV
  • Identity mapping
  • Example V (Optional)
  • Cube series

21
Example I Embedded two-dimensional data
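Samples in S (triangles)
       x    y    z
  P0   .5   0    0
  P1   0    0    .5
  P2   .5   0    -.5
  P3   1    0    0
  P4   0    0    -2
  P5   -1   0    0
References: S0 at (1,0,0), S1 at (0,0,1), S2 at (-1,0,0), S3 at (0,0,-1)
[Figure: samples and references plotted in S (x and z axes) and the resulting layout in R (x and y axes)]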
22
Example II Lines to lines
23
Example III Rotation
24
Example IV Identity Mapping
25
Example V Cube series (Optional)
Front face, horizontal plane
Front face T, Plane between horizontal and 45
Front face T, Plane between 45 and vertical
Front face T, Plane is vertical
Diagonal T, vertical plane
Diagonal T, horizontal plane
Diagonal T, Planes between horizontal and 45
Diagonal T, Planes between 45 and vertical
26
Run-time Complexity
  • Inserting one sample
  • Depends on k
  • The number of references in T
  • Upper bound of O(k^2)
  • k! / ((k-2)! 2!): all ways to select 2 items
    from k
  • Inserting h samples
  • Upper bound of O(k^2 h).
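
The bound follows from counting reference pairs: each sample is placed once per pair, and there are k(k-1)/2 pairs. A small sketch of that count; the function name `pair_evaluations` and the sizes used are illustrative.

```python
from itertools import combinations
from math import comb

def pair_evaluations(k, h):
    """Number of (reference pair, sample) placements for k references and h samples."""
    return comb(k, 2) * h

k = 4
pairs = list(combinations(range(k), 2))
print(len(pairs))              # 6, equal to k * (k - 1) // 2
print(pair_evaluations(k, 6))  # 36
```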

27
Comparing Methods
Hypercube: samples 00000-11111.
Projective mapping: references in S are 00000-00011; associated
references in R are 00-11. The plane for the projection was
defined by 00001, 00010.
Projective Mapping on the Hypercube
Sammon Mapping on the Hypercube
Iris dataset - 150 samples - three clusters of
flowers.
Sammon Mapping on the Iris dataset
Projective Mapping on the Iris dataset
28
Conclusions
  • Benefits of PM
  • Drawbacks of PM
  • Uses for Projective mapping
  • Future Work

29
Benefits of PM
  • Fast: O(k^2 h)
  • Non-iterative
  • Guaranteed upper bound
  • Adaptive
  • Point-relative Subjective
  • In one and two dimensions
  • Faithful mapping of references applied to samples
  • Linear transformation of references applied to
    samples
  • Structure preserving (with distortion)
  • Works with fallible and sparse datasets
  • Creates consistent layouts
  • Not domain specific

30
Drawbacks of PM
  • With high-dimensional data
  • Distortion occurs
  • Plane selection is hard to optimize
  • Faithfulness requires Euclidean distances

31
Uses for Projective Mapping
  • Visualization to discover
  • Data Dimensionality
  • Data Structure
  • Relationship between a subset of references and
    all other samples

32
Future work
  • Generalize for
  • N-dimensional to n-dimensional mapping
  • Vary for testing
  • Plane selection
  • Data dimensionality
  • Reference set
  • Selection
  • Size
  • Visualize as slide show
  • multiple-views of dataset

33
Selected References
  • Assiter, K. A. (2001). Projective Mapping: A
    non-iterative method for the layout of
    multidimensional data. Dissertation, Tufts
    University, Medford, MA.
  • Chalmers, M. and P. Chitson (1992). Bead:
    Explorations in Information Visualization. SIGIR
    '92: Proceedings of the Fifteenth Annual
    International ACM SIGIR Conference on Research
    and Development in Information Retrieval,
    Denmark, ACM Press.
  • Cox, T. F. and M. A. Cox (1994). Multidimensional
    Scaling. Monographs on Statistics and Applied
    Probability. London, Chapman & Hall.
  • Fairchild, K. M., S. E. Poltrock and G. W. Furnas
    (1988). SemNet: Three-Dimensional Graphic
    Representations of Large Knowledge Bases. In
    Cognitive Science and its Applications for
    Human-Computer Interaction, R. Guindon, Ed.
    Hillsdale, New Jersey, Lawrence Erlbaum
    Associates, 201-233.
  • Kruskal, J. B. (1964a). Multidimensional Scaling
    by optimizing goodness of fit to a non-metric
    hypothesis. Psychometrika 29(1): 1-27. Reprinted
    in Key Texts in Multidimensional Scaling, P. M.
    Davies and A. P. M. Coxon, Eds. Heinemann
    Educational Books, Exeter, N.H., 1982, pp. 59-83.
  • Kruskal, J. B. (1964b). Non-metric
    multidimensional Scaling: A numerical method.
    Psychometrika 29(2): 115-129. Reprinted in Key
    Texts in Multidimensional Scaling, P. M. Davies
    and A. P. M. Coxon, Eds. Heinemann Educational
    Books, Exeter, N.H., 1982, pp. 59-83.
  • Olsen, K. A. (1993). Visualization of a document
    Collection: The VIBE System. Information
    Processing and Management 29(1): 69-81, Pergamon
    Press Ltd.
  • Sammon, J. W. (1969). A Nonlinear Mapping for
    Data Structure Analysis. IEEE Transactions on
    Computers 18(5): 401-409, May 1969.

34
End
35
Multidimensional Scaling
  • Metric
  • Preserve actual proximities
  • Types
  • Classical (PVA)
  • Least squares
  • Optimization method with global error function
  • Non-metric
  • Preserve rank order of proximities
  • Optimization method with global error function
  • Spring model methods
  • Attractive and repulsive forces between objects
    act to either bring them together or push them
    apart
  • Optimization with local error function

36
Notation
In S
  • S: N-dimensional sample set
  • Sj, Sk: Projection pair of references in S
  • J: Point where P is projected perpendicularly
    onto L1 (along L2)
  • L1: Line that goes through Sj and Sk
  • L2: Line that goes through J and P
  • P: Sample in S to be mapped
  • t1: Placement of J along L1 (time parameter in
    linear equation)
  • t2: Distance between J and P (not a time
    parameter)
In R
  • R: Two-dimensional representation set
  • Rj, Rk: Projection pair of references in R
  • J: Intersection point between L1 and L2
  • L1: Line that goes through Rj and Rk
  • L2: Line that goes through J and Qi
  • Qi: Result of two-point n-dimensional projective
    mapping