Multidimensional Scaling

Provided by: andrew221
Learn more at: http://people.umass.edu
1
Multidimensional Scaling
2
Agenda
  • Multidimensional Scaling
  • Goodness of fit measures
  • Nosofsky, 1986

3
Proximities
$p_{\text{Amherst, Hadley}}$
             Amherst  Belchertown  Hadley  Leverett  Pelham  Shutesbury  Sunderland
Amherst            0         9.94    4.32      7.29    6.81        9.94        7.81
Belchertown                     0   14.06     14.94    8.25       13.96       17.66
Hadley                                  0     11.02   10.93       14.49         9.5
Leverett                                          0   12.57        7.45        5.18
Pelham                                                    0        5.71       16.16
Shutesbury                                                            0       11.04
Sunderland                                                                        0
4
Configuration (in 2-D)
$x_i$
5
Configuration (in 1-D)
6
Formal MDS Definition
  • $f: p_{ij} \to d_{ij}(X)$
  • MDS is a mapping from proximities to
    corresponding distances in MDS space.
  • After a transformation f, the proximities are
    equal to distances in X.

             Amherst  Belchertown  Hadley  Leverett  Pelham  Shutesbury  Sunderland
Amherst            0         9.94    4.32      7.29    6.81        9.94        7.81
Belchertown                     0   14.06     14.94    8.25       13.96       17.66
Hadley                                  0     11.02   10.93       14.49         9.5
Leverett                                          0   12.57        7.45        5.18
Pelham                                                    0        5.71       16.16
Shutesbury                                                            0       11.04
Sunderland                                                                        0
7
Distances, $d_{ij}$
$d_{\text{Amherst, Hadley}}(X)$
8
Distances, $d_{ij}$
9
Distances, $d_{ij}$
$d_{\text{Amherst, Hadley}}(X) = 4.32$
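To make $d_{ij}(X)$ concrete, here is a minimal sketch that reads it as the Euclidean distance between two points of a configuration. The slides do not list coordinates, so the ones below are hypothetical; Hadley is simply placed 4.32 units from Amherst to match the value on the slide. The helper name `distance` is also an assumption.

```python
import math

# Hypothetical 2-D configuration X. These coordinates are invented for
# illustration and do not come from the slides.
X = {
    "Amherst": (0.0, 0.0),
    "Hadley": (-4.32, 0.0),
    "Pelham": (6.0, 3.0),
}

def distance(X, i, j):
    """Euclidean distance d_ij(X) between points i and j of configuration X."""
    return math.dist(X[i], X[j])

print(round(distance(X, "Amherst", "Hadley"), 2))  # 4.32
```

The same function works unchanged for a 1-D configuration if each point is a one-element tuple.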
10
Proximities and Distances
Proximities
             Amherst  Belchertown  Hadley  Leverett  Pelham  Shutesbury  Sunderland
Amherst            0         9.94    4.32      7.29    6.81        9.94        7.81
Belchertown                     0   14.06     14.94    8.25       13.96       17.66
Hadley                                  0     11.02   10.93       14.49         9.5
Leverett                                          0   12.57        7.45        5.18
Pelham                                                    0        5.71       16.16
Shutesbury                                                            0       11.04
Sunderland                                                                        0
Distances
             Amherst  Belchertown   Hadley  Leverett   Pelham  Shutesbury  Sunderland
Amherst            0      10.0577   6.3325    7.4738   7.9313      7.8319      7.8328
Belchertown                     0  12.0455   16.8332   6.7959     12.7215     17.6600
Hadley                                   0   12.0350  13.1492     14.1632      8.1892
Leverett                                           0  12.2097      7.3591      6.6429
Pelham                                                      0      6.3360     15.4250
Shutesbury                                                              0     12.7366
Sunderland                                                                          0
11
The Role of f
  • f relates the proximities to the distances.
  • $f(p_{ij}) = d_{ij}(X)$

12
The Role of f
  • f can be linear, exponential, etc.
  • In psychological data, f is usually assumed to be any
    monotonic function.
  • That is, if $p_{ij} < p_{kl}$ then $d_{ij}(X) \le d_{kl}(X)$.
  • Most psychological data is on an ordinal scale,
    e.g., rating scales.

13
Looking at Ordinal Relations
Proximities
             Amherst  Belchertown  Hadley  Leverett  Pelham  Shutesbury  Sunderland
Amherst            0         9.94    4.32      7.29    6.81        9.94        7.81
Belchertown                     0   14.06     14.94    8.25       13.96       17.66
Hadley                                  0     11.02   10.93       14.49         9.5
Leverett                                          0   12.57        7.45        5.18
Pelham                                                    0        5.71       16.16
Shutesbury                                                            0       11.04
Sunderland                                                                        0
Distances
             Amherst  Belchertown   Hadley  Leverett   Pelham  Shutesbury  Sunderland
Amherst            0      10.0577   6.3325    7.4738   7.9313      7.8319      7.8328
Belchertown                     0  12.0455   16.8332   6.7959     12.7215     17.6600
Hadley                                   0   12.0350  13.1492     14.1632      8.1892
Leverett                                           0  12.2097      7.3591      6.6429
Pelham                                                      0      6.3360     15.4250
Shutesbury                                                              0     12.7366
Sunderland                                                                          0
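One way to look at the ordinal relations in the two tables above is to count how often a pair of entries is ordered one way in the proximities and the other way in the distances. The sketch below does that; the function name `count_order_violations` and the flat row-by-row layout of the data are assumptions made for the example.

```python
from itertools import combinations

# Upper-triangle entries of the two tables, read row by row
# (Amherst-Belchertown, Amherst-Hadley, ..., Shutesbury-Sunderland).
P = [9.94, 4.32, 7.29, 6.81, 9.94, 7.81,
     14.06, 14.94, 8.25, 13.96, 17.66,
     11.02, 10.93, 14.49, 9.5,
     12.57, 7.45, 5.18,
     5.71, 16.16,
     11.04]
D = [10.0577, 6.3325, 7.4738, 7.9313, 7.8319, 7.8328,
     12.0455, 16.8332, 6.7959, 12.7215, 17.6600,
     12.0350, 13.1492, 14.1632, 8.1892,
     12.2097, 7.3591, 6.6429,
     6.3360, 15.4250,
     12.7366]

def count_order_violations(p, d):
    """Count entry pairs (a, b) with p[a] < p[b] but d[a] > d[b]."""
    violations = 0
    for a, b in combinations(range(len(p)), 2):
        if (p[a] < p[b] and d[a] > d[b]) or (p[b] < p[a] and d[b] > d[a]):
            violations += 1
    return violations

n_comparisons = len(P) * (len(P) - 1) // 2
print(count_order_violations(P, D), "violations in", n_comparisons, "comparisons")
```

For instance, Amherst-Pelham has the smaller proximity (6.81 vs. 9.94) but the larger distance (7.9313 vs. 7.8319) compared with Amherst-Shutesbury, so that pair counts as one violation.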
14
Stress
  • It is not always possible to satisfy this mapping
    perfectly.
  • Stress measures how close the model comes.
  • Stress is essentially the scaled sum of squared
    errors between $f(p_{ij})$ and $d_{ij}(X)$.
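The slides do not spell out the scaling, so the sketch below assumes one common choice, Kruskal's Stress-1, which divides the summed squared error by the sum of squared distances and takes the square root. It also assumes f is the identity, so the proximities are compared to the distances directly. The data are the upper-triangle entries from the Proximities and Distances slide.

```python
import math

# Upper-triangle entries of the Proximities and Distances tables, row by row.
P = [9.94, 4.32, 7.29, 6.81, 9.94, 7.81,
     14.06, 14.94, 8.25, 13.96, 17.66,
     11.02, 10.93, 14.49, 9.5,
     12.57, 7.45, 5.18,
     5.71, 16.16,
     11.04]
D = [10.0577, 6.3325, 7.4738, 7.9313, 7.8319, 7.8328,
     12.0455, 16.8332, 6.7959, 12.7215, 17.6600,
     12.0350, 13.1492, 14.1632, 8.1892,
     12.2097, 7.3591, 6.6429,
     6.3360, 15.4250,
     12.7366]

def stress1(f_p, d):
    """Stress-1 (assumed normalization): sqrt(sum (f(p) - d)^2 / sum d^2)."""
    squared_error = sum((fp - dd) ** 2 for fp, dd in zip(f_p, d))
    scale = sum(dd ** 2 for dd in d)
    return math.sqrt(squared_error / scale)

# f is taken to be the identity here, which is an assumption.
print(stress1(P, D))
```

A perfect fit gives a stress of 0; larger values mean the distances reproduce the transformed proximities less well.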

15
Stress
[Figure: Stress plotted against Dimensions, with a point labeled "Correct Dimensionality"]
16
Distance Invariant Transformations
  • Scaling (all of X doubled in size, or flipped)
  • Rotation (X rotated 20 degrees left)
  • Translation (X moved 2 to the right)
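The three transformations can be checked numerically. The sketch below uses a hypothetical configuration (the coordinates are invented) and stdlib helpers whose names are assumptions. Rotation, translation, and flipping leave every pairwise distance unchanged; doubling X multiplies every distance by 2, so it preserves the ratios and rank order of the distances rather than their raw values.

```python
import math
from itertools import combinations

# Hypothetical 2-D configuration; coordinates invented for illustration.
X = [(0.0, 0.0), (3.0, 4.0), (-2.0, 1.0), (5.0, -1.0)]

def pairwise(points):
    """All pairwise Euclidean distances, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]

def rotate(points, degrees):
    """Rotate counterclockwise about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def translate(points, dx, dy):
    return [(x + dx, y + dy) for x, y in points]

def flip(points):
    """Reflect across the vertical axis."""
    return [(-x, y) for x, y in points]

def scale(points, k):
    return [(k * x, k * y) for x, y in points]

def close(u, v):
    return all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(u, v))

d0 = pairwise(X)
print(close(pairwise(rotate(X, 20)), d0))                 # True
print(close(pairwise(translate(X, 2, 0)), d0))            # True
print(close(pairwise(flip(X)), d0))                       # True
print(close(pairwise(scale(X, 2)), [2 * d for d in d0]))  # True
```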

17
Configuration (in 2-D)
18
Rotated Configuration (in 2-D)
19
Uses of MDS
  • Visually look for structure in data.
  • Discover the dimensions that underlie data.
  • Psychological model that explains similarity
    judgments in terms of distance in MDS space.

20
Simple Goodness of Fit Measures
  • Sum-of-squared error (SSE)
  • Chi-Square
  • Proportion of variance accounted for (PVAF)
  • $R^2$
  • Maximum likelihood (ML)

21
Sum of Squared Error
Data  Prediction  (Data - Prediction)^2
   7        5.03                   3.88
   8        6.97                   1.06
   1        2.12                   1.25
   8        8.91                   0.83
   6        6.97                   0.94
SSE = 7.97
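The table's arithmetic can be reproduced in a few lines. This is a minimal sketch using the five data/prediction pairs from the slide; the function name `sse` is an assumption.

```python
data = [7, 8, 1, 8, 6]
prediction = [5.03, 6.97, 2.12, 8.91, 6.97]

def sse(data, prediction):
    """Sum of squared differences between data and predictions."""
    return sum((y - p) ** 2 for y, p in zip(data, prediction))

print(round(sse(data, prediction), 2))  # 7.97
```

The unrounded sum is about 7.9652, which is why adding up the rounded column entries gives 7.96 while the total rounds to 7.97.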
22
Chi-Square
Data  Prediction  (Data - Prediction)^2  (Data - Prediction)^2 / Prediction
   7           5                      4                                0.80
   8           7                      1                                0.14
   1           2                      1                                0.50
   8           9                      1                                0.11
   6           7                      1                                0.14
Chi-Square = 1.70
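The same pattern gives the chi-square total: each squared error is divided by its prediction before summing. A minimal sketch with the slide's values (the function name `chi_square` is an assumption):

```python
data = [7, 8, 1, 8, 6]
prediction = [5, 7, 2, 9, 7]

def chi_square(data, prediction):
    """Sum of squared errors, each divided by the predicted value."""
    return sum((y - p) ** 2 / p for y, p in zip(data, prediction))

print(round(chi_square(data, prediction), 2))  # 1.7
```

Because of the division, a miss of 1 counts for more when the prediction is 2 than when it is 9.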
23
Proportion of Variance Accounted for
      Mean prediction       Model prediction
Data  Mean  Error  Error^2  Prediction  Error  Error^2
   7     6      1        1        5.03   1.97     3.88
   8     6      2        4        6.97   1.03     1.06
   1     6     -5       25        2.12  -1.12     1.25
   8     6      2        4        8.91  -0.91     0.83
   6     6      0        0        6.97  -0.97     0.94
SST = 34, SSE = 7.97
PVAF = (SST - SSE)/SST = (34 - 7.97)/34 = .77
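As a sketch of the calculation above: SST is the squared error of predicting the data mean for every observation, SSE is the squared error of the model, and PVAF is the proportional reduction. The function name `pvaf` is an assumption.

```python
data = [7, 8, 1, 8, 6]
prediction = [5.03, 6.97, 2.12, 8.91, 6.97]

def pvaf(data, prediction):
    """Proportion of variance accounted for: (SST - SSE) / SST."""
    mean = sum(data) / len(data)
    sst = sum((y - mean) ** 2 for y in data)
    sse = sum((y - p) ** 2 for y, p in zip(data, prediction))
    return (sst - sse) / sst

print(round(pvaf(data, prediction), 2))  # 0.77
```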
24
$R^2$
  • $R^2$ is PVAF, but

      Mean prediction       Model prediction
Data  Mean  Error  Error^2  Prediction  Error  Error^2
   7     6      1        1         5.9    1.1     1.21
   8     6      2        4        10.1   -2.1     4.41
   1     6     -5       25           4     -3        9
   8     6      2        4         5.9    2.1     4.41
   6     6      0        0           1      5       25
SST = 34, SSE = 44.03
PVAF = (SST - SSE)/SST = (34 - 44.03)/34 = -0.295
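The bullet on this slide breaks off after "but", so the sketch below only reproduces the table's arithmetic and then, as one possible contrast that is an assumption and not something the slides state, computes the squared Pearson correlation between data and predictions. What is certain is the arithmetic: PVAF goes negative whenever SSE exceeds SST, while a squared correlation cannot be negative. The function names are assumptions.

```python
data = [7, 8, 1, 8, 6]
prediction = [5.9, 10.1, 4, 5.9, 1]

def pvaf(data, prediction):
    """Proportion of variance accounted for: (SST - SSE) / SST."""
    mean = sum(data) / len(data)
    sst = sum((y - mean) ** 2 for y in data)
    sse = sum((y - p) ** 2 for y, p in zip(data, prediction))
    return (sst - sse) / sst

def squared_correlation(x, y):
    """Squared Pearson correlation; never negative by construction."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

print(round(pvaf(data, prediction), 3))  # -0.295
print(squared_correlation(data, prediction))
```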
25
Maximum Likelihood
  • Assume we are sampling from a population with
    probability $f(Y \mid \theta)$.
  • $Y$ is an observation and $\theta$ are the model
    parameters.

[Figure: normal density with $\mu = 0$ and the observation $Y$]
$N(-1.7 \mid \mu = 0) = 0.094$
26
Maximum Likelihood
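  • With independent observations, $Y_1, \dots, Y_n$, the joint
    probability of the sample observations is
    $f(Y_1, \dots, Y_n \mid \theta) = \prod_{i=1}^{n} f(Y_i \mid \theta)$

[Figure: normal density with $\mu = 0$ and observations $Y_1$, $Y_2$, $Y_3$]
0.094 × 0.2661 × 0.3605 = 0.0090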
27
Maximum Likelihood
  • Expressed as a function of the parameters, we
    have the likelihood function
    $L(\theta \mid Y_1, \dots, Y_n) = \prod_{i=1}^{n} f(Y_i \mid \theta)$
  • The goal is to maximize $L$ with respect to the
    parameters, $\theta$.

28
Maximum Likelihood
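[Figure: normal density with $\mu = 0$ and observations $Y_1$, $Y_2$, $Y_3$]
0.094 × 0.2661 × 0.3605 = 0.0090
(Assuming $\sigma = 1$)
[Figure: normal density with $\mu = -1.0167$ and observations $Y_1$, $Y_2$, $Y_3$]
0.3159 × 0.3962 × 0.3398 = 0.0425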
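The two products above can be reproduced with a normal density and unit standard deviation. Only the first observation, -1.7, appears on the slides; the values -0.9 and -0.45 used below for the other two are back-solved from the printed densities and should be treated as an assumption, as should the function names.

```python
import math

def normal_pdf(y, mu, sigma=1.0):
    """Density of a normal distribution with mean mu and SD sigma at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def likelihood(ys, mu, sigma=1.0):
    """Joint density of independent observations: a product of densities."""
    return math.prod(normal_pdf(y, mu, sigma) for y in ys)

# -1.7 is given on the slides. -0.9 and -0.45 are NOT given; they are
# inferred from the printed densities and are an assumption.
ys = [-1.7, -0.9, -0.45]

print(round(normal_pdf(-1.7, 0.0), 3))     # 0.094
print(round(likelihood(ys, 0.0), 4))       # 0.009
print(round(likelihood(ys, -1.0167), 4))   # 0.0425
```

With these inferred values, -1.0167 is the sample mean, which is consistent with it giving the larger likelihood of the two candidates.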
29
Maximum Likelihood
  • Preferred to other methods
  • Has very nice mathematical properties.
  • Easier to interpret.
  • We'll see specifics in a few weeks.
  • Often harder (or impossible?) to calculate than
    other methods.
  • Often presented as log likelihood, ln(ML).
  • Easier to compute (sums, not products).
  • Better numerical resolution.
  • Sometimes equivalent to other methods.
  • E.g., gives the same estimate as minimizing SSE when
    estimating the mean of a normal distribution.
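The last two points can be illustrated together. The sketch below computes a log likelihood as a sum of log densities and then searches a grid of candidate means, comparing the mean that maximizes the log likelihood with the one that minimizes SSE. It assumes a normal model with SD 1; the three observations are the same assumed values inferred from the densities on the earlier slides, and the grid search is only an illustration, not a recommended estimation method.

```python
import math

# Assumed observations: only -1.7 appears on the slides; the other two
# are inferred from the printed densities.
ys = [-1.7, -0.9, -0.45]

def log_likelihood(ys, mu, sigma=1.0):
    """Sum of log normal densities (a sum, not a product)."""
    const = -0.5 * math.log(2 * math.pi * sigma ** 2)
    return sum(const - 0.5 * ((y - mu) / sigma) ** 2 for y in ys)

def sse(ys, mu):
    """Sum of squared deviations of the observations from mu."""
    return sum((y - mu) ** 2 for y in ys)

# Candidate means from -2.000 to 0.000 in steps of 0.001.
grid = [i / 1000 for i in range(-2000, 1)]
best_by_ll = max(grid, key=lambda mu: log_likelihood(ys, mu))
best_by_sse = min(grid, key=lambda mu: sse(ys, mu))
sample_mean = sum(ys) / len(ys)

print(best_by_ll, best_by_sse, round(sample_mean, 3))
```

Both searches land on the grid point nearest the sample mean, because for a normal model with fixed SD the log likelihood is a constant minus half the SSE.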