A Framework for Modelling Short, High-Dimensional Multivariate Time Series: Preliminary Results in Virus Gene Expression Data Analysis

1
A Framework for Modelling Short, High-Dimensional
Multivariate Time Series: Preliminary Results in
Virus Gene Expression Data Analysis
  • Paul Kellam1, Xiaohui Liu2, Nigel Martin3,
    Christine Orengo4, Stephen Swift2, Allan Tucker2
  • 1Dept of Immunology and Molecular Pathology, UCL,
    UK
  • 2Dept of Information Systems and Computing,
    Brunel University, UK
  • 3Dept of Computer Science, Birkbeck College,
    London, WC1E 7HX, UK
  • 4Dept of Biochemistry and Molecular Biology, UCL,
    WC1E 6BT, UK

2
Framework
[Diagram labels: Expression Data, Clustering Algorithms, ClusterFusion, Clusters, Model Building, Robust Clusters, Forecasts, Explanations]
3
Clustering Algorithms
  • Hierarchical
  • The Grouping Genetic Algorithm
  • K-Means
  • The Self Organising Map

4
Cluster Fusion (1)
[Diagram: Cluster Method 1, Cluster Method 2, ..., Cluster Method N feed into "Construct Agreement Matrix", which feeds into Clusterfusion]
5
The Agreement Matrix
[Figure: a matrix labelled F, with axes labelled From Gene and To Gene]
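
The slides do not define the entries of the agreement matrix. A minimal sketch, assuming each entry counts how many clustering methods place a pair of genes in the same cluster, could look like this. The function name, the input layout and the toy labels are illustrative assumptions, not taken from the presentation.

```python
from itertools import combinations


def agreement_matrix(clusterings):
    """Count, for each gene pair, how many methods co-cluster the pair.

    clusterings[m][g] is the cluster label that method m assigns to gene g
    (a hypothetical layout chosen for this sketch).
    """
    n_genes = len(clusterings[0])
    matrix = [[0] * n_genes for _ in range(n_genes)]
    for labels in clusterings:
        if len(labels) != n_genes:
            raise ValueError("every method must label every gene")
        for i, j in combinations(range(n_genes), 2):
            if labels[i] == labels[j]:
                # Keep the matrix symmetric: agreement has no direction here.
                matrix[i][j] += 1
                matrix[j][i] += 1
    return matrix


# Toy data (made up): three methods, four genes.
methods = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
agreement = agreement_matrix(methods)
```

Cluster labels only need to be consistent within one method, so the third method above agrees with the first even though its label values are swapped.
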
6
Viral Gene Expression Data
  • Kaposi's Sarcoma-Associated Human Herpesvirus 8
    (HHV8)
  • 106 viral and human genes
  • Induced with 12-O-Tetradecanoylphorbol-13-Acetate
    (TPA)
  • 13 Measurements over time
  • Normalised expression levels

7
Evaluation
  • Compare cluster similarity using Weighted-Kappa
  • Compare clusters against biological domain
    knowledge
  • Clusterfusion
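
The slides do not give the Weighted-Kappa formula. As a sketch under one common assumption, two clusterings can be compared by asking, for every gene pair, whether each method puts the pair in the same cluster, and then computing Cohen's kappa on those yes/no decisions. With only two categories, the usual weighting schemes leave the value unchanged, so the unweighted form is shown. The function name and toy labels are hypothetical.

```python
from itertools import combinations


def pairwise_kappa(labels_a, labels_b):
    """Cohen's kappa on 'same cluster?' decisions over all gene pairs."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same genes")
    if len(labels_a) < 2:
        raise ValueError("at least two genes are needed to form a pair")
    both = only_a = only_b = neither = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            both += 1
        elif same_a:
            only_a += 1
        elif same_b:
            only_b += 1
        else:
            neither += 1
    total = both + only_a + only_b + neither
    observed = (both + neither) / total
    p_a = (both + only_a) / total  # share of pairs co-clustered by method A
    p_b = (both + only_b) / total  # share of pairs co-clustered by method B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both methods make the same constant decision for every pair.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy check (made-up labels): the same partition under different label names.
kappa_same = pairwise_kappa([0, 0, 1, 1], [1, 1, 0, 0])
kappa_diff = pairwise_kappa([0, 0, 1, 1], [0, 1, 0, 1])
```

A value near 1 indicates that the two methods group the genes almost identically, a value near 0 indicates agreement no better than chance.
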

8
Weighted-Kappa Results
  • Hx: Hierarchical Clustering with x Clusters
  • Kx: K-Means Clustering with x Clusters
  • Sx: Self Organising Map with x Clusters
  • Gx: Grouping Genetic Algorithm with x Clusters
9
Domain Knowledge Results
10
Clusterfusion Results
  • 48 out of 106 genes unassigned
  • Mostly pairs or triples
  • Only 3 of feature 2 are present!
  • There are nonetheless some interesting results, e.g.
    genes of unknown function placed with genes of known
    function
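
The presentation does not spell out the rule Clusterfusion uses to form robust clusters. One strict reading, which would be consistent with many genes staying unassigned and most groups being pairs or triples, is to keep only groups of genes that every method places together. The sketch below implements that assumed rule; the function name and the toy labels are hypothetical.

```python
from collections import defaultdict


def full_agreement_groups(clusterings):
    """Split genes into groups that all methods agree on, plus leftovers.

    clusterings[m][g] is the label method m gives gene g (assumed layout).
    Returns (groups, unassigned): groups have two or more genes, and
    unassigned lists genes that no other gene matches in every method.
    """
    n_genes = len(clusterings[0])
    by_signature = defaultdict(list)
    for gene in range(n_genes):
        # Genes share a signature only if every method co-clusters them.
        signature = tuple(labels[gene] for labels in clusterings)
        by_signature[signature].append(gene)
    groups = [g for g in by_signature.values() if len(g) > 1]
    unassigned = [g[0] for g in by_signature.values() if len(g) == 1]
    return groups, unassigned


# Toy data (made up): three methods, five genes.
toy_methods = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 2],
]
robust, leftover = full_agreement_groups(toy_methods)
```

A looser variant would accept pairs whose agreement count exceeds a threshold below the number of methods, at the cost of no longer guaranteeing that the resulting groups are disjoint.
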

11
Modelling
  • We have focussed on the Dynamic Bayesian Network
  • Models a temporal domain probabilistically
  • Consists of a graphical representation and
    conditional probability distributions
  • Facilitates the combining of expert knowledge and
    data
  • Models can be queried to investigate the
    relationships discovered from data
  • Requires data discretisation
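
The slides state that discretisation is required but not which scheme was used. As an illustration only, equal-width binning of one gene's expression series into a small number of states could be sketched as follows. The function name, the default number of states and the state numbering from 0 are assumptions.

```python
def discretise(series, n_states=3):
    """Map a numeric series to integer states 0..n_states-1 (equal-width bins)."""
    if n_states < 1:
        raise ValueError("n_states must be at least 1")
    low, high = min(series), max(series)
    if high == low:
        # A flat series carries no ordering information: use a single state.
        return [0] * len(series)
    width = (high - low) / n_states
    states = []
    for value in series:
        state = int((value - low) / width)
        # The maximum value would land one bin too high, so clamp it.
        states.append(min(state, n_states - 1))
    return states


# Toy series (made up), split into two states.
two_state = discretise([0.0, 1.0, 2.0, 3.0], n_states=2)
flat = discretise([5.0, 5.0, 5.0])
```

Equal-frequency binning is a common alternative when expression values are heavily skewed, since it keeps every state populated.
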

12
Dynamic Bayesian Networks
13
Modelling Results
  • Example DBNs (compact representation without lags
    included)

14
Forecast Results
15
Explanation
  • Apply inference given observations about certain
    nodes
  • Insert observations into DBN
  • Apply inference back in time
  • Construct explanations using posterior
    probabilities
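
To make the idea of inference back in time concrete, the sketch below applies Bayes' rule to a single arc from a parent node at time t-1 to a child node at time t. It is a minimal illustration, not the inference procedure used in the presentation, and the prior, the conditional probability table and the state names are invented for the example.

```python
def posterior_parent(prior, cpt, observed_child):
    """P(parent at t-1 | child at t = observed_child) via Bayes' rule.

    prior: dict parent_state -> P(parent_state)
    cpt:   dict parent_state -> dict child_state -> P(child_state | parent_state)
    """
    joint = {s: prior[s] * cpt[s][observed_child] for s in prior}
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("the observation has zero probability under the model")
    return {s: p / evidence for s, p in joint.items()}


# Hypothetical numbers: a two-state parent and a two-state child.
prior = {1: 0.5, 2: 0.5}
cpt = {
    1: {1: 0.9, 2: 0.1},
    2: {1: 0.2, 2: 0.8},
}
posterior = posterior_parent(prior, cpt, observed_child=2)
```

Reporting the most probable earlier state together with its posterior gives an explanation in the style "the child is in state 2, most likely because the parent was in state 2 one step earlier".
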

16
Explanation - Results
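  • An example explanation using a discovered DBN

[Figure: nodes of a discovered DBN annotated with posterior probabilities; the loose labels 1, 2, 2 and 1, 2, 1 in the figure cannot be attached to specific nodes or arcs from this transcript]
P(C7 is 2) = 1.000
P(C7 is 1) = 1.000
P(H8 is 2) = 0.999
P(B12 is 2) = 0.884
P(B6 is 2) = 0.568
P(B12 is 1) = 0.440
P(A7 is 1) = 0.510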
17
Conclusions
  • Modelling gene expression data is a challenging
    task
  • Introduced a framework for modelling such data
  • Encouraging preliminary results when applied to
    viral gene expression data
  • Future work: more rigorous testing on different datasets

18
Acknowledgements
  • Biotechnology and Biological Sciences Research
    Council (BBSRC), UK
  • The Engineering and Physical Sciences Research
    Council (EPSRC), UK