
Title: Building biological networks from diverse genomic data


1
Building biological networks from diverse genomic
data
  • Chad Myers
  • Department of Computer Science, Lewis-Sigler
    Institute for Integrative Genomics
  • Princeton University
  • PRIME Workshop on Pathway Databases and Modeling
    Tools
  • June 16, 2006

2
Motivation: building biological networks from
experimental data
  • Find missing pathway components
  • Detect uncharacterized crosstalk between
    pathways
  • Discover novel pathways

Explosion of functional genomic DATA
KNOWLEDGE of components and inter-relationships
that lead to function
3
Motivation: building biological networks from
experimental data
noisy
How can we harness this information without
sacrificing precision?
4
Directed network discovery: involving the
biologist in the search process
  • Previous approaches to network analysis from
    genomic data: largely undirected global
    approaches that detect interesting network
    features
  • Incorporating expert direction can improve
    sensitivity and precision by using context
    information
  • Incorporating expert direction can focus on
    relevant information for the biologist user
    (allows interactivity)

Figure: two-hybrid interaction network, yeast (SH3
domain), Boone lab
Previous work: Bader et al. (2003), Asthana et
al. (2004), Yamanashi et al. (2004, 2005), Kato et
al. (2005)
5
bioPIXIE system overview
bioPIXIE: Pathway Inference from eXperimental
Interaction Evidence
6
Overview
  • How do we integrate heterogeneous evidence?
  • Expert-driven network discovery
  • Making it usable: practical visualization and
    other interface considerations
  • Does it work? (evaluation experiments and
    biological validation)
  • Challenges/opportunities and future work

7
Heterogeneous data integration
  • Diverse forms of data: what's a unifying
    framework?
  • Variable coverage, reliability, and relevance
  • Integration scheme should utilize information in
    data when available, but be robust when missing

Data types: physical binding, cellular localization,
genetic interaction, expression, sequence (TF
motifs, coding, ...)
→ Map to associations of genes/proteins
→ Bayes net
8
Bayes net for evidence integration
We infer: Functional Relationship (fully connected,
weighted graph of proteins)
  • Input evidence grouped by lab (source) and by
    type
  • Structure: naïve Bayes (60 nodes); TAN was also
    tried
  • CPTs: learned from the GO gold standard

Evidence nodes (figure): microarray correlation,
shared transcription factors, synthetic lethality,
synthetic rescue, co-localization, purified complex,
two-hybrid, affinity precipitation
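
As a rough illustration of naïve Bayes evidence integration for a single protein pair, the sketch below combines binary evidence into a posterior probability of a functional relationship. The CPT values, the prior, and the names `CPTS`, `PRIOR_RELATED`, and `posterior_related` are assumptions made for this example; they are not taken from bioPIXIE.

```python
import math

# Hypothetical CPTs: for each evidence type,
# (P(observed | functionally related), P(observed | unrelated)).
# All numbers are invented for illustration only.
CPTS = {
    "two_hybrid": (0.20, 0.01),
    "affinity_precipitation": (0.30, 0.02),
    "co_localization": (0.60, 0.20),
    "synthetic_lethality": (0.10, 0.005),
}
PRIOR_RELATED = 0.05  # assumed prior probability of a functional relationship


def posterior_related(evidence, cpts=CPTS, prior=PRIOR_RELATED):
    """Naive Bayes posterior P(related | evidence) for one protein pair.

    `evidence` maps an evidence type to True/False. Evidence types that
    are absent from the dict are treated as missing and skipped, so the
    estimate falls back toward the prior when data are unavailable.
    """
    log_rel = math.log(prior)
    log_unrel = math.log(1.0 - prior)
    for name, observed in evidence.items():
        p_rel, p_unrel = cpts[name]
        if observed:
            log_rel += math.log(p_rel)
            log_unrel += math.log(p_unrel)
        else:
            log_rel += math.log(1.0 - p_rel)
            log_unrel += math.log(1.0 - p_unrel)
    # Normalize in log space for numerical stability.
    m = max(log_rel, log_unrel)
    rel = math.exp(log_rel - m)
    unrel = math.exp(log_unrel - m)
    return rel / (rel + unrel)
```

Skipping missing evidence rather than counting it as negative is one simple way to stay robust when a data set does not cover a given pair.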
9
Overview
  • How do we integrate heterogeneous evidence?
  • Expert-driven network discovery
  • Making it usable: practical visualization and
    other interface considerations
  • Does it work? (evaluation experiments and
    biological validation)
  • Challenges/opportunities and future work

10
Expert-driven network discovery
  • Local search in the PPI network centered at the
    query
  • Which proteins should we extract as a single,
    functionally coherent group?
  • Should consider confidence in links and topology
    surrounding query group

11
Extracting relevant proteins
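  • Basic idea: compute expected linkage to the
    query set
  • $e_{ij} = P(\text{protein } i \text{ is functionally related to protein } j \mid \text{evidence})$
  • $X_{ij}$: binary random variable equal to 1 with
    probability $e_{ij}$
  • $S_Q(p_i) = \sum_{j \in Q} X_{ij}$: number of links
    from protein $i$ to the query set $Q$
  • Find proteins that maximize the expected linkage
    $E[S_Q(p_i)] = \sum_{j \in Q} e_{ij}$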
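
The expected-linkage ranking can be sketched in a few lines. Because each link is a binary variable with probability e_ij, the expected number of links from a protein to the query set is just the sum of those probabilities. The edge dictionary layout and the function name `rank_by_expected_linkage` are assumptions for this example, not the bioPIXIE implementation.

```python
def rank_by_expected_linkage(edge_prob, query, candidates):
    """Rank candidate proteins by expected number of links to the query set.

    edge_prob  : dict mapping frozenset({a, b}) to e_ab, the probability
                 that proteins a and b are functionally related
                 (hypothetical data layout).
    query      : set of query protein names, Q.
    candidates : iterable of protein names to score.

    Returns a list of (protein, score) pairs, best first. Pairs with no
    entry in edge_prob contribute 0 to the sum.
    """
    scores = {}
    for p in candidates:
        if p in query:
            continue  # query proteins are not scored against themselves
        scores[p] = sum(edge_prob.get(frozenset((p, q)), 0.0) for q in query)
    # Sort by descending score, breaking ties by name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

For example, with a query of two proteins, a candidate linked to both at 0.9 and 0.8 scores about 1.7 and outranks a candidate linked to only one of them at 0.3.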

What about indirect links to the query set?
12
Graph search: handling indirect links
  • Solution: iterative expanding search where
    indirect links to the query through high
    confidence neighbors are counted
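
One way to picture an iterative expanding search is sketched below. The slide does not give the exact procedure, so the round count, the per-round limit, the score threshold, and the function name `expanding_search` are all assumptions: each round adds the best-scoring outside proteins to the group, and later rounds score against the enlarged group so that proteins linked to the query only through those high-confidence neighbors receive credit.

```python
def expanding_search(edge_prob, query, proteins,
                     rounds=2, per_round=1, min_score=0.5):
    """Grow a group around the query set over several rounds.

    edge_prob : dict mapping frozenset({a, b}) to a link probability
                (hypothetical data layout).
    query     : set of starting proteins.
    proteins  : iterable of all protein names in the network.
    rounds, per_round, min_score : illustrative tuning knobs, not values
                from the original system.
    """
    proteins = list(proteins)
    group = set(query)
    for _ in range(rounds):
        scores = {}
        for p in proteins:
            if p in group:
                continue
            # Expected number of links from p to the current group.
            scores[p] = sum(edge_prob.get(frozenset((p, g)), 0.0)
                            for g in group)
        best = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        added = [p for p, s in best[:per_round] if s >= min_score]
        if not added:
            break  # nothing confident enough to add; stop early
        group.update(added)
    return group
```

In this toy scheme a protein with no direct link to the query can still be reached in a later round if it is strongly linked to a neighbor that was added earlier.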

13
Overview
  • How do we integrate heterogeneous evidence?
  • Expert-driven network discovery
  • Making it usable: practical visualization and
    other interface considerations
  • Does it work? (evaluation experiments and
    biological validation)
  • Challenges/opportunities and future work

14
Making bioPIXIE usable

Guiding principles:
  • Accessibility (users can access the most recent
    data with little effort)
  • Simplicity vs. flexibility
  • Drill-down (details, e.g. supporting
    experimental data, hidden until requested)
  • Browseable

15
Graph visualization


16
Overview
  • How do we integrate heterogeneous evidence?
  • Expert-driven network discovery
  • Making it usable: practical visualization and
    other interface considerations
  • Does it work? (evaluation experiments and
    biological validation)
  • Challenges/opportunities and future work

17
Evaluation experiments
Recovering known network components: how much
does integration help?
  • Results averaged over 31 pathways, processes, and
    complexes (KEGG, GO, MIPS)
  • Use 10 random proteins as the query set and try
    to recover the remaining members
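
The recovery experiment can be sketched as follows: hold out all but 10 randomly chosen members of a known group, rank every other protein, and score how well the held-out members are recovered. Average precision is used here as a common stand-in for the area under the precision-recall curve; the function names, the `score_fn` callback, and the seed handling are assumptions for this example.

```python
import random


def average_precision(ranked, positives):
    """Average precision of a ranked list against a set of true positives."""
    hits = 0
    total = 0.0
    for rank, protein in enumerate(ranked, start=1):
        if protein in positives:
            hits += 1
            total += hits / rank  # precision at this recall point
    return total / len(positives) if positives else 0.0


def recovery_trial(pathway, all_proteins, score_fn, n_query=10, seed=0):
    """Run one recovery trial for a known pathway, process, or complex.

    pathway      : set of member proteins (the gold standard group).
    all_proteins : iterable of every protein that can be ranked.
    score_fn     : callable (protein, query_set) -> float; a hypothetical
                   hook for whatever scoring method is being evaluated.
    """
    rng = random.Random(seed)
    query = set(rng.sample(sorted(pathway), n_query))
    held_out = set(pathway) - query
    candidates = [p for p in all_proteins if p not in query]
    ranked = sorted(candidates, key=lambda p: score_fn(p, query), reverse=True)
    return average_precision(ranked, held_out)
```

Averaging `recovery_trial` over many groups and random seeds would give a summary comparable in spirit to the averaged results reported on this slide.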

18
Evaluation experiments (2)
Recovering known network components: do naïve
methods of integration/search work just as well?
  • Results averaged over 31 pathways, processes, and
    complexes (KEGG, GO, MIPS)
  • Use 10 random proteins as the query set and try
    to recover the remaining members

19
Biological validation: finding new components
  • Using bioPIXIE to characterize unknown genes

S. cerevisiae uncharacterized gene YPL077C:
predicted involvement in chromosome segregation
20
Biological validation: finding new components
P-value based on blind counting: $1.98 \times 10^{-7}$
(Fisher's exact test)
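
For readers unfamiliar with Fisher's exact test, the sketch below computes a one-sided P-value for a 2x2 contingency table from the hypergeometric distribution. The counts behind the P-value above are not given in the transcript, and it is not stated whether the reported test was one-sided, so the function name and any table passed to it are purely illustrative.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability, with all row and column totals fixed, of
    seeing a top-left count of `a` or larger under the null hypothesis
    of no association.
    """
    row1 = a + b          # total of the first row
    col1 = a + c          # total of the first column
    n = a + b + c + d     # grand total
    denom = comb(n, col1)
    upper = min(row1, col1)
    tail = 0
    for k in range(a, upper + 1):
        # Hypergeometric term: k from row 1 and col1 - k from row 2.
        tail += comb(row1, k) * comb(n - row1, col1 - k)
    return tail / denom
```

With a hypothetical table in which all three mutants and none of three controls show a phenotype, `fisher_exact_greater(3, 0, 0, 3)` evaluates to 1/20.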
21
Biological validation: novel links between
pathways
  • DNA replication initiation: Cdc7, the switch
    that starts replication (activated by Dbf4)
  • Linked to the Hsp90 complex by our method
  • Hsp90 (yeast: hsc82, hsp82): cytosolic molecular
    chaperone that participates in the folding of
    several signaling kinases and hormone receptors
(Helmut Pospiech)
22
Genetic analysis of DNA replication-Hsp90 link
[Figure: growth assays of wt; single mutants dbf4,
hsp82, hsc82, cpr7; and double mutants dbf4 hsp82,
dbf4 hsc82, dbf4 cpr7. 10^5 cells per condition at
RT, 30 °C, and 37 °C.]
YKO: Dbf4 vs. hsp82, hsc82 and co-chaperones
cpr7, sti1, cdc37
23
Overview
  • How do we integrate heterogeneous evidence?
  • Expert-driven network discovery
  • Making it usable: practical visualization and
    other interface considerations
  • Does it work? (evaluation experiments and
    biological validation)
  • Challenges/opportunities and future work

24
Practical challenges/opportunities
  • Visualizing complex networks of interactions in
    a meaningful way: how does it scale with added
    data? How do we make user navigation around the
    network easy?
  • Data-centric vs. established-knowledge views:
    how do we overlay current knowledge of pathways
    with predictions derived from experimental data?

25
Future work
An observation: the more specific we can be about
the end goal, the better the accuracy of our
prediction
26
Future work
Exploiting relevance and reliability variation:
context-specific integration
27
Summary
  • bioPIXIE can facilitate precise network discovery
    from experimental data using Bayesian data
    integration, expert-directed search, and a
    web-based dynamic interface
  • bioPIXIE is an effective tool for browsing
    genomic evidence and generating specific,
    testable hypotheses

http://pixie.princeton.edu
28
Acknowledgements
Olga Troyanskaya, Drew Robson, Adam Wible, Kara
Dolinski, Camelia Chiriac, Matt Hibbs, Curtis
Huttenhower, David Botstein Lab, Leonid Kruglyak Lab
Thank you!
http://pixie.princeton.edu
29
Evaluation experiments (3): what about noise in
the query set?
[Figure: AUPRC vs. number of random proteins out of
20 total query proteins]
30
Evaluation experiments (4)
Comparing with existing approaches:
  • SEEDY: proteins ranked by max. direct connection
    to query
  • Complexpander
31
Hydroxyurea sensitivity (replication inhibitor)
[Figure: growth assays of wt; single mutants dbf4,
hsp82, hsc82, cpr7, sti1; and double mutants dbf4
hsp82, dbf4 hsc82, dbf4 cpr7, dbf4 sti1. 10^6 cells
per condition at 30 °C and 37 °C, with HU at 0 mM,
50 mM, and 100 mM.]
32
Is this interaction specific to DNA replication?
MMS sensitivity (induces DNA damage)
Conclusions:
  • Hsp90 complex plays a specific role in DNA
    replication
  • Hsc82 and hsp82 do not have identical function
  • Possible new link between signaling cascades,
    stress, and DNA replication
  • Our system generates specific, testable
    hypotheses

[Figure: growth assays of wt; single mutants dbf4,
hsp82, hsc82, cpr7, sti1; and double mutants dbf4
hsp82, dbf4 hsc82, dbf4 cpr7, dbf4 sti1. 10^6 cells
at 37 °C.]
MMS treatment has no apparent effect at RT, 30 °C,
or 37 °C (shown)