Discovery of Hidden Structure in High-Dimensional Data

1
Discovery of Hidden Structure in High-Dimensional
Data
  • Aapo Hyvärinen
  • Senior Research Scientist

University of Helsinki and Helsinki University of
Technology
2
Discovery of Hidden Structure in
High-Dimensional Data
  • Science produces enormous data sets, often with
    hidden structure
  • This theme: typically continuous data and/or
    probabilistic models
  • Tasks
      • Parsimonious models
      • Decomposition into dependent components
      • Non-Gaussian Bayesian networks
      • Spatiotemporal models
  • Applications
      • Neuroinformatics: imaging data analysis,
        functional modelling
      • Bioinformatics: genome structure, metabolic
        models, gene regulation
      • Telecom, linguistics, forestry, ecology,
        atmospheric data, etc.
  • Main teams: Hyvärinen, Mannila

3
Highlight/Background: Independent Component
Analysis
  • Linear decomposition of multivariate data
  • Finds hidden directions (green), in contrast to
    classic PCA (red)
  • Application in blind source separation, e.g. in
    brain imaging data
  • FastICA: the most popular ICA algorithm
    (Hyvärinen, IEEE Trans. NN, 1999)
  • Standard reference book on the theory:
    Independent Component Analysis. Hyvärinen,
    Karhunen, Oja. Wiley, 2001.

(Figure: mixing process)
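
The linear decomposition on this slide can be sketched numerically. The following is a minimal illustration, not the reference FastICA implementation: it is restricted to two signals so that it needs only the Python standard library, and it uses the one-unit fixed-point update with a tanh nonlinearity. The mixing coefficients, the source distributions (one uniform, one Laplacian), the sample size, and the helper names whiten, fastica_2d and corr are all assumptions made for the example.

```python
import math
import random


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def whiten(x1, x2):
    """Center two signals, decorrelate them, and scale each to unit variance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    x1 = [v - m1 for v in x1]
    x2 = [v - m2 for v in x2]
    # Entries of the 2x2 covariance matrix [[a, b], [b, d]].
    a = sum(u * u for u in x1) / n
    b = sum(u * v for u, v in zip(x1, x2)) / n
    d = sum(v * v for v in x2) / n
    # Rotation angle that diagonalizes the covariance (the PCA step).
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    p1 = [c * u + s * v for u, v in zip(x1, x2)]
    p2 = [-s * u + c * v for u, v in zip(x1, x2)]
    sd1 = math.sqrt(sum(u * u for u in p1) / n)
    sd2 = math.sqrt(sum(v * v for v in p2) / n)
    return [u / sd1 for u in p1], [v / sd2 for v in p2]


def fastica_2d(z1, z2, iters=50, seed=0):
    """One-unit fixed-point iteration with g = tanh on whitened 2-D data."""
    angle = random.Random(seed).uniform(0.0, math.pi)
    w1, w2 = math.cos(angle), math.sin(angle)
    n = len(z1)
    for _ in range(iters):
        e1 = e2 = eg = 0.0
        for u, v in zip(z1, z2):
            t = math.tanh(w1 * u + w2 * v)
            e1 += u * t
            e2 += v * t
            eg += 1.0 - t * t  # derivative of tanh
        # Update w <- E[z g(w.z)] - E[g'(w.z)] w, then renormalize.
        n1 = e1 / n - (eg / n) * w1
        n2 = e2 / n - (eg / n) * w2
        norm = math.hypot(n1, n2)
        w1, w2 = n1 / norm, n2 / norm
    # In two dimensions the second direction is the one orthogonal to w.
    y1 = [w1 * u + w2 * v for u, v in zip(z1, z2)]
    y2 = [-w2 * u + w1 * v for u, v in zip(z1, z2)]
    return y1, y2


rng = random.Random(1)
n = 4000
s1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]  # sub-Gaussian source
s2 = [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(n)]  # Laplacian
# Assumed mixing process x = A s with A = [[1.0, 0.6], [0.4, 1.0]].
x1 = [1.0 * a + 0.6 * b for a, b in zip(s1, s2)]
x2 = [0.4 * a + 1.0 * b for a, b in zip(s1, s2)]

z1, z2 = whiten(x1, x2)      # PCA-style whitening: uncorrelated, still mixed
y1, y2 = fastica_2d(z1, z2)  # ICA: rotate the whitened data to the sources

print("whitened vs sources:", round(corr(z1, s1), 3), round(corr(z1, s2), 3))
print("ICA vs sources:     ", round(corr(y1, s1), 3), round(corr(y1, s2), 3))
```

The recovered components match the sources only up to order and sign, so the comparison should use absolute correlations. Whitening alone removes correlation but generally leaves each signal a mixture of both sources, which is the contrast with PCA drawn on the slide.
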
4
Task: Decomposition into Dependent Components
  • Components are not independent in general
  • Extend to dependent components (Hyvärinen team)
  • Grouping and visual ordering (Hyvärinen et al.,
    Neural Comput., 2000, 2001)
  • Separation with some dependency (Hyvärinen and
    Hurri, Signal Proc., 2004)
  • Future: more general dependency structures
  • Related to nonlinear decompositions
  • Extend to binary data (teams Mannila, Toivonen)
  • Analyze stability of components (Himberg et al.,
    NeuroImage, 2004)
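
The first bullet can be made concrete with a toy construction, which is an assumption for illustration and not taken from the cited papers: two components that share a random scale factor are linearly uncorrelated, yet their energies (squares) are correlated, so they are not independent. The scale values, sample size, and the helper name corr are hypothetical choices.

```python
import math
import random


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


rng = random.Random(0)
n = 20000
# A shared scale variable (assumed for the example) couples the two components.
scale = [rng.choice((0.5, 2.0)) for _ in range(n)]
s1 = [k * rng.gauss(0.0, 1.0) for k in scale]
s2 = [k * rng.gauss(0.0, 1.0) for k in scale]

linear = corr(s1, s2)                                      # close to zero
energy = corr([u * u for u in s1], [v * v for v in s2])    # clearly positive

print("linear correlation:", round(linear, 3))
print("energy correlation:", round(energy, 3))
```

A decomposition that only enforces uncorrelatedness would treat s1 and s2 as unrelated; the energy correlation is one simple kind of residual dependency that a dependent-component model can exploit.
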

5
Highlight and Task: Non-Gaussian Bayesian
Networks
  • Non-Gaussianity enables learning network
    structure and weights in the basic linear DAG
    case (Shimizu, Hoyer, Hyvärinen, Kerminen, J.
    Mach. Learn. Res., 2006)
  • Another example of the power of non-Gaussianity
  • Enables inference on the direction of causality
  • Currently extending to, e.g.,
      • hidden confounding variables (Hyvärinen team)
      • nonlinearities (Hyvärinen team)
      • (partly) binary data (Mannila and Hyvärinen
        teams)
  • Applications to gene networks, brain imaging
    data: to be explored
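
Why non-Gaussianity gives a causal direction can be sketched for two variables. This is not the ICA-based algorithm of the cited paper; it is a simplified pairwise check under assumed conditions: a linear model x -> y with uniform (non-Gaussian) noise, and a crude dependence measure (correlation between squared regressor and squared residual). In the true direction the regression residual is independent of the regressor; in the reverse direction it is uncorrelated but still dependent. The coefficient 0.8, the sample size, and the helper names are hypothetical.

```python
import math
import random


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def dependence(cause, effect):
    """Regress effect on cause; return |corr| of squared cause and squared residual.

    Near zero when the residual is independent of the regressor. This is only
    a rough proxy for independence, chosen for brevity.
    """
    n = len(cause)
    mc, me = sum(cause) / n, sum(effect) / n
    c = [v - mc for v in cause]
    e = [v - me for v in effect]
    slope = sum(u * v for u, v in zip(c, e)) / sum(u * u for u in c)
    resid = [v - slope * u for u, v in zip(c, e)]
    return abs(corr([u * u for u in c], [r * r for r in resid]))


rng = random.Random(0)
n = 5000
x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
y = [0.8 * a + rng.uniform(-1.0, 1.0) for a in x]  # true model: x -> y

dep_xy = dependence(x, y)  # candidate model x -> y
dep_yx = dependence(y, x)  # candidate model y -> x
direction = "x -> y" if dep_xy < dep_yx else "y -> x"

print("dependence assuming x -> y:", round(dep_xy, 3))
print("dependence assuming y -> x:", round(dep_yx, 3))
print("inferred direction:", direction)
```

If x and the noise were Gaussian, the residual would be independent of the regressor in both directions, and the comparison would carry no information about the direction.
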

6
Future Vision
  • Probabilistic methods with emphasis on
    algorithmic aspects
  • Interface between computer science and statistics
  • Combine expertise on algorithms and multivariate
    statistics
  • Discovery of hidden components, clusters, or
    connections
  • Continuous data: non-Gaussianity, a nonclassic yet
    central tool
  • Discrete data: e.g., covering approaches
  • Applications in many different fields of science
  • Our special competence in neuro- and
    bioinformatics