Transcript and Presenter's Notes

Title: Data Visualization STAT 890, STAT 442, CM 462


1
Data VisualizationSTAT 890, STAT 442, CM 462
  • Ali Ghodsi
  • Department of Statistics
  • School of Computer Science
  • University of Waterloo
  • aghodsib@uwaterloo.ca
  • September 2006

2
Two Problems
  • Classical Statistics: infer information from small data sets (not enough data).
  • Machine Learning: infer information from large data sets (too much data).

3
Other Names for ML
  • Data mining
  • Applied statistics
  • Adaptive (stochastic) signal processing
  • Probabilistic planning or reasoning
  All of these are closely related to the second problem.

4
Applications
  • Machine Learning is most useful when the
    structure of the task is not well understood but
    can be characterized by a dataset with strong
    statistical regularity.
  • Search and recommendation (e.g. Google, Amazon)
  • Automatic speech recognition and speaker
    verification
  • Text parsing
  • Face identification
  • Tracking objects in video
  • Financial prediction, fraud detection (e.g.
    credit cards)
  • Medical diagnosis

5
Tasks
  • Supervised Learning: given examples of inputs and corresponding desired outputs, predict outputs on future inputs.
  • e.g. classification, regression
  • Unsupervised Learning: given only inputs, automatically discover representations, features, structure, etc.
  • e.g. clustering, dimensionality reduction, feature extraction (a minimal sketch of both settings follows below)
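To make the two settings concrete, here is a minimal sketch, assuming scikit-learn and its bundled iris data (my illustration, not part of the deck):

```python
# Minimal sketch of the two task families, assuming scikit-learn is installed.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)  # inputs X (150 x 4) and labels y

# Supervised learning: fit on (input, output) pairs, predict outputs for new inputs.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("predicted classes:", clf.predict(X[:5]))

# Unsupervised learning: only inputs are given; discover structure automatically.
labels = KMeans(n_clusters=3, n_init=10).fit_predict(X)  # clustering
X2 = PCA(n_components=2).fit_transform(X)                # dimensionality reduction
print("cluster labels:", labels[:5], "reduced shape:", X2.shape)
```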

6
Dimensionality Reduction
  • Dimensionality: the number of measurements available for each item in a data set.
  • The dimensionality of real-world items is very high.
  • For example, the dimensionality of a 600-by-600 image is 360,000.
  • The key to analyzing data is comparing these measurements to find relationships among the data points.
  • Usually these measurements are highly redundant, and relationships among data points are predictable.
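A quick check of that arithmetic, as a minimal NumPy sketch (NumPy is an assumed tool here, not something the deck specifies):

```python
import numpy as np

# A 600-by-600 grayscale image: one intensity measurement per pixel.
image = np.random.rand(600, 600)

# Treated as a single data point, its dimensionality is the number of pixels.
x = image.reshape(-1)  # flatten to a vector
print(x.shape)         # (360000,) -- 600 * 600 = 360,000 dimensions
```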

7
Dimensionality Reduction
  • Knowing the value of a pixel in an image, it is easy to predict the values of nearby pixels, since they tend to be similar.
  • Knowing that the word "corporation" occurs often in articles about economics but rarely in articles about art and poetry, it is easy to predict that it will not occur often in articles about love.
  • Although there are many measurements per item, far fewer quantities actually vary independently. Working with only those varying quantities lets humans quickly and easily recognize changes in high-dimensional data.
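One way to see such redundancy: when many measurements are driven by only a couple of underlying factors, the data's singular values drop sharply after the first few. A hedged sketch with synthetic data (my illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# 500 items, 100 measurements each -- but driven by only 2 underlying factors.
factors = rng.normal(size=(500, 2))   # the quantities that actually vary
mixing = rng.normal(size=(2, 100))    # each measurement is a blend of them
X = factors @ mixing + 0.01 * rng.normal(size=(500, 100))  # tiny noise

# Singular values reveal how many directions carry real variation.
s = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)
print(s[:4].round(2))  # two large values, then a sharp drop: ~2 effective dimensions
```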

8
Data Representation
(No Transcript)

9
Data Representation
(No Transcript)
10
Data Representation
1 1 1 1 1
1 0 1 0 1
1 1 1 1 1
1 0.5 0.5 0.5 1
1 1 1 1 1
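The grid above reads as a tiny 5-by-5 grayscale image: each entry is a pixel intensity, and the whole image is a single point in 25-dimensional space. A minimal NumPy sketch of that view (my illustration, not part of the deck):

```python
import numpy as np

# The 5-by-5 image from the slide, written as a matrix of pixel intensities.
image = np.array([
    [1, 1,   1,   1,   1],
    [1, 0,   1,   0,   1],
    [1, 1,   1,   1,   1],
    [1, 0.5, 0.5, 0.5, 1],
    [1, 1,   1,   1,   1],
])

x = image.flatten()  # the same image as a single 25-dimensional vector
print(x.shape)       # (25,)
```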
11
(No Transcript)
12
(Figure: matrix dimensions for a dimensionality-reduction example. 103 images of 23 by 28 pixels (644 pixels each) form a 644 by 103 data matrix; a 644 by 2 basis maps it to a 2 by 103 matrix of low-dimensional codes, so each 23 by 28 image corresponds to a 2 by 1 code vector and back.)
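Those dimensions match a standard PCA pipeline. A hedged sketch with random stand-in data (the deck's actual face images are not reproduced here):

```python
import numpy as np

n, h, w = 103, 23, 28                 # 103 images, each 23 x 28 = 644 pixels
images = np.random.rand(n, h, w)

X = images.reshape(n, -1).T           # 644-by-103 data matrix (one column per image)
Xc = X - X.mean(axis=1, keepdims=True)

# Basis: top 2 principal directions, a 644-by-2 matrix W.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
W = U[:, :2]                          # 644 x 2

Y = W.T @ Xc                          # 2-by-103 codes: one 2-by-1 vector per image
recon = (W @ Y[:, 0] + X.mean(axis=1)).reshape(h, w)  # code back to a 23-by-28 image
print(W.shape, Y.shape, recon.shape)  # (644, 2) (2, 103) (23, 28)
```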
13
(No Transcript)
14
(No Transcript)
15
(No Transcript)
16
(No Transcript)
17
(No Transcript)
18
(No Transcript)
19
(No Transcript)
20
(No Transcript)
21
Arranging words: each word was initially represented by a high-dimensional vector counting the number of times it appeared in different encyclopedia articles. Words that appear in similar contexts end up close together in the embedding.
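A hedged sketch of that construction on a toy corpus (the encyclopedia data are not reproduced; CountVectorizer and TruncatedSVD are assumed scikit-learn tools, not the deck's):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "the corporation reported record profit this quarter",
    "the corporation announced a merger and new profit targets",
    "the poem spoke of love and quiet sorrow",
    "art and poetry explore love in many forms",
]

# Each word becomes a vector of its counts across documents (term-document matrix).
counts = CountVectorizer().fit_transform(docs)  # docs x words, sparse
word_vectors = counts.T                         # words x docs

# Embed the high-dimensional count vectors in 2-D; co-occurring words land nearby.
coords = TruncatedSVD(n_components=2).fit_transform(word_vectors)
print(coords.shape)  # (n_words, 2)
```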
22
(No Transcript)
23
(No Transcript)
24
(No Transcript)
25
(No Transcript)
26
Different Features
27
Glasses vs. No Glasses
28
Beard vs. No Beard
29
Beard Distinction
30
Glasses Distinction
31
Multiple-Attribute Metric
32
Embedding of sparse music similarity graph
Platt, 2004
33
Reinforcement learning
Mahadevan and Maggioni, 2005
34
Semi-supervised learning
  • Use a graph-based discretization of the manifold to infer missing labels.
  • Build classifiers from the bottom eigenvectors of the graph Laplacian.

(Belkin and Niyogi, 2004; Zien et al., Eds., 2005)
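A minimal sketch of that recipe on synthetic data (my illustration; the cited papers differ in details such as the classifier used): build a nearest-neighbor graph over all points, take the bottom eigenvectors of its Laplacian as features, and train on only the labeled points.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neighbors import kneighbors_graph
from sklearn.linear_model import LogisticRegression

X, y = make_moons(n_samples=200, noise=0.05, random_state=0)
labeled = np.zeros(len(X), dtype=bool)
labeled[np.random.default_rng(0).choice(len(X), size=10, replace=False)] = True

# Graph-based discretization of the manifold: symmetric k-NN adjacency.
A = kneighbors_graph(X, n_neighbors=10, mode="connectivity")
A = (0.5 * (A + A.T)).toarray()

# Graph Laplacian L = D - A; its bottom eigenvectors vary smoothly over the graph.
L = np.diag(A.sum(axis=1)) - A
eigvals, eigvecs = np.linalg.eigh(L)
features = eigvecs[:, :6]  # bottom 6 eigenvectors as new coordinates

# Classifier trained on the few labeled points, evaluated on the rest.
clf = LogisticRegression().fit(features[labeled], y[labeled])
print("accuracy on unlabeled:", clf.score(features[~labeled], y[~labeled]))
```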
35
Learning correspondences
  • How can we learn manifold structure that is
    shared across multiple data sets?

36
Mapping and robot localization
(Bowling, Ghodsi, and Wilkinson, 2005; Ham, Lin, and D. D. Lee, 2005)
37
The Big Picture
38
Manifold and Hidden Variables
39
Reading
  • Journals: Neural Computation, JMLR, Machine Learning, IEEE PAMI
  • Conferences: NIPS, UAI, ICML, AISTATS, IJCAI, IJCNN
  • Vision: CVPR, ECCV, SIGGRAPH
  • Speech: EuroSpeech, ICSLP, ICASSP
  • Online: CiteSeer, Google
  • Books:
  • The Elements of Statistical Learning, Hastie, Tibshirani, and Friedman
  • Learning from Data, Cherkassky and Mulier
  • Machine Learning, Mitchell
  • Neural Networks for Pattern Recognition, Bishop
  • Introduction to Graphical Models, Jordan et al.