Exploring Gene Cluster Coherence from a Text Perspective

1
Exploring Gene Cluster Coherence from a Text
Perspective
  • Xin Ying Qiu, Padmini Srinivasan,
  • Olivier Bodenreider, Kelly Zeng
  • Department of Management Sciences, University
    of Iowa
  • School of Library and Information Science,
    University of Iowa
  • National Library of Medicine, National Institutes
    of Health

2
Clustering Genes Based on Expression Patterns
  • Goal
  • study the properties of all genes in an organism
    at once
  • Method
  • group together genes that show similar patterns of
    expression across a series of microarray
    experiments
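
The grouping step on this slide can be illustrated with a minimal sketch. It is not the clustering procedure used in the talk: it swaps in a simple single-linkage grouping that links two genes whenever the Pearson correlation of their expression profiles reaches a chosen threshold. The function names, the gene names and the toy profiles are all hypothetical.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    # A constant profile has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0

def threshold_clusters(profiles, threshold):
    """Single-linkage grouping (an assumption, not the talk's method):
    genes are linked when their profiles correlate at or above threshold."""
    parent = {g: g for g in profiles}

    def find(g):
        # Follow parent links to the group representative, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(profiles, 2):
        if pearson(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for g in profiles:
        groups.setdefault(find(g), []).append(g)
    # Singletons are treated as unclustered genes and dropped.
    return [sorted(members) for members in groups.values() if len(members) > 1]

# Hypothetical expression profiles over four experiments.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [1.0, 3.0, 2.0, 4.0],
}
strict = threshold_clusters(profiles, 0.9)
relaxed = threshold_clusters(profiles, 0.7)
```

With these toy profiles, a strict threshold keeps only the two nearly proportional profiles together, while a lower threshold pulls in a third gene, so lowering the threshold clusters more genes.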

3
Alternate Methods to Explore Gene Expression
Clusters
  • Coherence (or cohesiveness)
  • Similarity in a certain property among the
    members of a cluster

What property can describe clusters: eye color?
Or height and weight? Which property leads to
higher cluster coherence?
4
Alternate Methods to Explore Gene Expression
Clusters
  • Text-based methods to describe gene properties
  • For each gene, collect all annotation documents
  • Represent each gene with one of:
  • Free-text vector
  • Document id vector
  • MeSH metadata vector (via Manjal)
  • Basic GO terms vector
  • Expanded GO terms vector
  • Measure cluster coherence for each representation
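
As a rough illustration of the last step, the sketch below scores a cluster by the mean pairwise cosine similarity of its members' vectors. The slides do not give the exact coherence formula, so this definition is an assumption, and the gene names and GO-term weights are hypothetical toy values.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def coherence(cluster, vectors):
    """Mean pairwise cosine similarity among the genes of one cluster
    (an assumed definition of coherence, not taken from the slides)."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 0.0
    return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)

# Hypothetical GO-term vectors (term -> weight) for three genes.
vectors = {
    "geneA": {"GO:0006412": 1.0, "GO:0005840": 1.0},
    "geneB": {"GO:0006412": 1.0, "GO:0005840": 1.0},
    "geneC": {"GO:0006096": 1.0},
}
tight = coherence(["geneA", "geneB"], vectors)
loose = coherence(["geneA", "geneC"], vectors)
```

The same function works for any of the five representations, since each one reduces a gene to a weighted vector; only the keys (words, document ids, MeSH terms, GO terms) change.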

5
Comparing coherence using gold-standard clusters
6
Comparing Cluster Coherence using threshold
results
7
Results: Pair-wise Correlation of Methods by
Coherence Scores
(Figure annotation: un-correlated)
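
One way to compare two methods by their coherence scores is to correlate the scores they assign to the same clusters. The slide does not say which correlation statistic was used, so the Pearson coefficient below is an assumption, and the score lists are hypothetical toy values rather than results from the talk.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

# Hypothetical coherence scores for the same four clusters under three methods.
method_a = [0.1, 0.2, 0.3, 0.4]
method_b = [0.2, 0.4, 0.6, 0.8]
method_c = [0.3, 0.1, 0.1, 0.3]

r_ab = pearson(method_a, method_b)  # the two methods rank clusters alike
r_ac = pearson(method_a, method_c)  # no linear relationship
```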
8
Evaluating cluster quality using Dunn's Index
  • Coherence only measures the compactness of gene
    members in a cluster
  • Dunn's index: a measure of both the
    compactness and the separation of a clustering
    result
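
A minimal sketch of Dunn's index in its standard form: the smallest distance between members of different clusters divided by the largest distance between members of the same cluster. Euclidean distance on toy 2-D points is used here only for illustration; with text vectors a distance such as one minus cosine similarity would be a natural choice, but which distance the talk used is not stated.

```python
import math
from itertools import combinations

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def dunn_index(clusters, dist=euclidean):
    """Smallest between-cluster distance divided by largest cluster diameter.
    Higher values mean more compact, better separated clusters."""
    # Diameter: largest distance between two members of the same cluster.
    max_diameter = max(
        (dist(p, q) for c in clusters for p, q in combinations(c, 2)),
        default=0.0,
    )
    # Separation: smallest distance between members of different clusters.
    min_separation = min(
        dist(p, q)
        for c1, c2 in combinations(clusters, 2)
        for p in c1
        for q in c2
    )
    return min_separation / max_diameter if max_diameter else math.inf

# Hypothetical points: one sensible grouping and one poor grouping.
good = [[(0, 0), (0, 1)], [(5, 0), (5, 1)]]
bad = [[(0, 0), (5, 0)], [(0, 1), (5, 1)]]
good_score = dunn_index(good)
bad_score = dunn_index(bad)
```

The sensible grouping scores higher because its clusters are small relative to the gap between them, which is the property coherence alone cannot capture.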

9
Text-based methods: Dunn's index using
gold-standard data
10
Text-based methods: Dunn's index using threshold
data
11
Further Analysis: cluster size and distribution
  Threshold   # of clusters   # of genes   % of genes clustered
  0.9         33              151          7
  0.8         127             464          21
  0.7         307             1084         50
  0.6         384             1664         77
  0.5         328             2007         96
  0.4         201             2140         98.8
  0.3         94              2161         99.8
  0.2         37              2166         100
  0.1         9               2166         100
  • Considering the number of clusters, the number
    of genes clustered, and the performance of Dunn's
    index, threshold 0.5 seems to achieve the best
    overall quality, according to the text-based
    representations

12
Validate Text-based results with Gold standard
criterion
  • The Question
  • Does the clustering picked by text-based methods
    agree with an independent external criterion?
  • The Method
  • Use Eisen's gold standard genes, perform
    hierarchical clustering
  • Compute the Rand index to measure the agreement
    between the hierarchical clustering and Eisen's
    gold standard.
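
The agreement measure named above can be sketched as follows. This is the plain (unadjusted) Rand index: the fraction of gene pairs on which two clusterings agree, either by placing both genes together or by keeping them apart. The label lists are hypothetical toy data, not Eisen's gold standard.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of item pairs treated the same way by both clusterings.
    labels_a[i] and labels_b[i] are the cluster labels of item i."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

# Hypothetical labels for four genes.
gold = [0, 0, 1, 1]          # reference grouping
predicted = [0, 0, 1, 2]     # splits the second gold cluster
renamed = [1, 1, 0, 0]       # same grouping as gold, different label names

score = rand_index(gold, predicted)
same = rand_index(gold, renamed)
```

Because only pair membership is compared, the index does not depend on how the clusters are named, so a clustering cut at any threshold can be scored against the gold standard directly.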

13
Validate Text-based results with Gold standard
criterion
Threshold 0.5 is the best
14
Conclusions
  • The GO-expanded method and the Manjal (MeSH term
    based) method achieve higher coherence
  • The free-text method, the GO-expanded method and
    the Manjal (MeSH term based) method agree in
    identifying well-separated and highly compact
    clustering results
  • The good clustering result identified by the
    text-based methods is consistent with an external
    independent criterion
  • The document co-occurrence based method does not
    correlate with the other free-text based methods

15
Application of our findings
  • Alternate text-based methods can be applied to
    help identify highly compact and well-separated
    gene clusters
  • The gene clusters identified by text-based methods
    have strong cohesive properties that serve as
    descriptions of the cluster members