Using Bayesian Networks to Analyze Expression Data - PowerPoint PPT Presentation

1
Using Bayesian Networks to Analyze Expression Data
Nir Friedman, Iftach Nachman, Dana Pe'er
Institute of Computer Science, Hebrew University
  • Presenter: Chai Xiaoyong

2
Outline
  • Biological Background
  • Bayesian Networks
  • Analyzing Expression data
  • Technical Aspects
  • Experiment
  • Conclusion and Future Work

3
Part I Biological Background
  • DNA
      • DNA is a double-stranded molecule
      • Hereditary information is encoded
      • Complementation rules
  • Gene
      • A gene is a segment of DNA
      • Contains the information required to make a protein

4
Part I Biological Background
  • Gene Expression refers to the processes involved
    in converting genetic information from a DNA
    sequence into an amino acid sequence, or protein.
  • The processes
    [Figure not preserved; surviving label: "(gene)"]

5
Part I Biological Background
  • Transcription (the focus of this research): the transfer of genetic
    information from DNA to messenger RNA (mRNA), a complementary copy of
    the gene.

6
Part I Biological Background
  • Each gene encodes a protein and proteins are the
    functional units of life
  • Every gene is present in every cell, but only a
    fraction of the genes are expressed at any time
  • Many diseases result from the interaction
    between genes
  • Understanding the mechanisms that determine
    which genes are expressed, and when they are
    expressed, is the key to the development of new
    treatments of diseases

7
Part I Biological Background
  • Why are some genes expressed while others are not, or expressed at
    different levels?
  • Gene expression levels are not independent
  • Interactions between genes exist
      • e.g. the expression of gene A promotes the expression of gene B
      • e.g. the expression of gene C inhibits the expression of gene D
  • Interactions can be complicated

8
Part I Biological Background
  • Traditional experimental means → inefficient and insufficient
  • Newly developed technique → DNA microarray
      • Makes it possible to quantitatively measure and compare the
        expression levels of tens of thousands of genes in cells in a
        single experiment

9
Part II Bayesian Networks
  • Prior work: clustering of expression data
      • Groups together genes with similar expression patterns
      • Disadvantage: does not reveal structural relations between genes
  • Big challenge
      • Extract meaningful information from the expression data
      • Discover interactions between genes based on the measurements

10
Part II Bayesian Networks
11
Part II Bayesian Networks
  • A Bayesian Network (BN) is a graphical representation of a joint
    probability distribution
  • Compact and intuitive representation
  • Useful for describing processes composed of locally interacting
    components
  • Has a good statistical foundation
  • Efficient model-learning algorithms
  • Captures causal relationships
  • Deals with noisy data
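
As a minimal sketch of the first bullet, the joint distribution of a small network factorizes into one local conditional per variable. The three-gene chain A -> B -> C and every probability value below are invented for illustration and do not come from the presentation.

```python
from itertools import product

# Hypothetical network A -> B -> C over binary variables (1 = expressed).
# All numbers are made up for illustration.
parents = {"A": (), "B": ("A",), "C": ("B",)}

# cpt[var][parent_values] = P(var = 1 | parents = parent_values)
cpt = {
    "A": {(): 0.3},
    "B": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.2, (1,): 0.7},
}


def joint_probability(assignment):
    """P(assignment) as the product of P(x_i | parents(x_i))."""
    prob = 1.0
    for var, pa in parents.items():
        p_one = cpt[var][tuple(assignment[p] for p in pa)]
        prob *= p_one if assignment[var] == 1 else 1.0 - p_one
    return prob


# The factorized joint sums to 1 over all 2^3 assignments.
total = sum(
    joint_probability(dict(zip("ABC", values)))
    for values in product((0, 1), repeat=3)
)
```

This toy network needs 1 + 2 + 2 = 5 parameters, compared with 7 for an unrestricted joint distribution over three binary variables, which is the sense in which the representation is compact.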

12
Part II Bayesian Networks
  • Why is it suitable for this problem?
      • Gene expression is an inherently stochastic phenomenon
      • It captures the nature of interactions between genes, especially
        causal connections
        [Figure not preserved; surviving node labels: A, B]
      • Microarray techniques are associated with missing and noisy data
        values

13
Part III Analyzing Expression Data
  • Practical problem: small data sets
      • Variables: hundreds or thousands of genes
      • Samples: just tens of microarray experiments
  • On the positive side, genetic regulation networks are sparse!
  • Characterize and learn features that are common to most of these
    networks

14
Part III Analyzing Expression Data
  • The first feature: Markov relations
      • Symmetric relation: Y is in X's Markov blanket iff there is either
        an edge between them, or both are parents of another variable
        (Pearl 98)
      • Biological interpretation: a Markov relation indicates that the two
        genes are related in some joint biological interaction or process
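
The Markov relation defined above can be checked directly on a directed graph. The following is a minimal sketch, assuming the network is given as a list of (parent, child) edges; the function name and the four-node example graph are hypothetical.

```python
def markov_related(edges, x, y):
    """True if x and y share an edge (either direction) or a common child."""
    edge_set = set(edges)
    if (x, y) in edge_set or (y, x) in edge_set:
        return True
    children_x = {child for parent, child in edge_set if parent == x}
    children_y = {child for parent, child in edge_set if parent == y}
    return bool(children_x & children_y)


# Hypothetical DAG: A -> C <- B, C -> D
edges = [("A", "C"), ("B", "C"), ("C", "D")]

a_b = markov_related(edges, "A", "B")  # related through the common child C
a_d = markov_related(edges, "A", "D")  # no edge and no common child
```

The check is symmetric by construction, matching the "symmetric relation" on the slide.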

15
Part III Analyzing Expression Data
  • The second feature: order relations
      • Global property: A is an ancestor of B in all the equivalent
        Bayesian networks learned
      • Biological interpretation: an order relation indicates that the
        transcription of one gene is a direct cause of the transcription of
        another gene
        [Figure not preserved; surviving node labels: A, B]
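
A rough sketch of the order relation: test whether A is an ancestor of B in every network of a given collection. It assumes the caller supplies the equivalent networks as edge lists; enumerating an equivalence class is not shown, and the function names and the two example chains are hypothetical.

```python
def is_ancestor(edges, a, b):
    """True if there is a directed path a -> ... -> b."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    stack, seen = list(children.get(a, [])), set()
    while stack:
        node = stack.pop()
        if node == b:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(children.get(node, []))
    return False


def order_relation(networks, a, b):
    """True if a is an ancestor of b in every supplied network."""
    return all(is_ancestor(edges, a, b) for edges in networks)


# Two equivalent chains (same skeleton, no v-structures): made-up example.
chain_forward = [("A", "B"), ("B", "C")]
chain_backward = [("C", "B"), ("B", "A")]

holds_in_one = order_relation([chain_forward], "A", "C")
holds_in_all = order_relation([chain_forward, chain_backward], "A", "C")
```

Because the two chains encode the same independencies but disagree on direction, the relation fails once both are considered, which is why the slide calls it a global property.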
16
Part IV Technical Aspects
  • Learning algorithm: induce network structure
      • Sparse Candidate algorithm
  • Feature estimation: extract useful features
      • A bootstrap approach

17
Part IV Technical Aspects
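  • Sparse Candidate algorithm
      • A heuristic, iterative approach
      • At each iteration n, identify a relatively small number of
        candidate parents $C_i^n$ for each variable (gene) $X_i$ based on
        simple local statistics
      • Restrict the search to networks with
        $Pa^{G_n}(X_i) \subseteq C_i^n$
      • Candidates $X_j$ are assessed by comparing
        $\mathrm{Score}(X_i, Pa^{G_{n-1}}(X_i) \cup \{X_j\} : D)$ with
        $\mathrm{Score}(X_i, Pa^{G_{n-1}}(X_i) : D)$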
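
The candidate-selection step can be illustrated with a small sketch. The slide does not name the local statistic, so empirical mutual information is used here purely as an assumption; the function names, the value of k, and the toy discretized data are all hypothetical. The subsequent restricted structure search is not shown.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete columns."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def candidate_parents(data, k):
    """For each variable, keep the k other variables with the highest MI."""
    candidates = {}
    for target, column in data.items():
        ranked = sorted(
            (name for name in data if name != target),
            key=lambda name: mutual_information(column, data[name]),
            reverse=True,
        )
        candidates[target] = ranked[:k]
    return candidates


# Made-up discrete expression levels for three genes over eight samples.
data = {
    "g1": [1, 1, 0, -1, -1, 0, 1, -1],
    "g2": [1, 1, 0, -1, -1, 0, 1, -1],  # identical to g1
    "g3": [0, 1, 0, 1, 0, -1, 0, 0],    # hypothetical unrelated profile
}
candidates = candidate_parents(data, k=1)
```

Limiting each gene to k candidates is what keeps the later search tractable when there are hundreds of variables.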
18
Part IV Technical Aspects
  • Bootstrap method
      • Generate perturbed versions of the original data set, and learn
        from them
  • For $i = 1, \dots, m$:
      • Resample with replacement N instances from D; call the result $D_i$
      • Learn on $D_i$ to induce a network structure $G_i$
  • For each feature f of interest, calculate
    $\mathrm{conf}(f) = \frac{1}{m} \sum_{i=1}^{m} f(G_i)$,
    where $f(G_i) = 1$ if f is a feature in $G_i$, and 0 otherwise
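
The bootstrap loop above can be sketched as follows. The structure learner is replaced by a deliberately trivial stand-in (`toy_learner`, a hypothetical function that reports one feature when two variables mostly agree), since the point here is only the resampling and the confidence estimate. The data and the 70% threshold are made up.

```python
import random
from collections import Counter


def bootstrap_confidence(data, learn, m, seed=0):
    """conf(f) = fraction of m bootstrap networks that contain feature f.

    `learn` maps a list of instances to a set of features (e.g. edges).
    Features never observed are absent from the result (confidence 0).
    """
    rng = random.Random(seed)
    n = len(data)
    counts = Counter()
    for _ in range(m):
        resample = [data[rng.randrange(n)] for _ in range(n)]  # D_i
        counts.update(learn(resample))                          # features of G_i
    return {feature: count / m for feature, count in counts.items()}


def toy_learner(instances):
    """Hypothetical stand-in for structure learning: report the pair
    ("X", "Y") when X and Y agree in more than 70% of the instances."""
    agree = sum(1 for x, y in instances if x == y)
    return {("X", "Y")} if agree / len(instances) > 0.7 else set()


# Made-up binary data: X and Y agree in 8 of 10 instances.
data = [(0, 0), (1, 1), (0, 0), (1, 1), (1, 1),
        (0, 0), (1, 1), (0, 0), (0, 1), (1, 0)]
conf = bootstrap_confidence(data, toy_learner, m=200)
```

Any real learner that returns a set of hashable features (edges, Markov pairs, order pairs) could be passed in place of `toy_learner`.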
19
Part V Experiment
  • Induce Bayesian Networks for 250 yeast genes from
    76 Microarray measurements
  • Analyze features in the networks

20
Part V Experiment
  • The map on the left shows an example of Markov relation features for
    gene SVS1.
  • The width of each edge corresponds to its confidence.

21
Part V Experiment
  • List of top Markov relations

22
Part VI Conclusions
  • Biological motivation
      • The development of microarray technology calls for methodologies
        that are both statistically sound and computationally tractable for
        analyzing data sets and inferring biological interactions from them
  • Advantages of Bayesian network models
      • Can describe locally interacting components
      • Can reveal the structure of the transcription regulation process
      • Provide clear methodologies for learning from data
      • Can deal with incomplete data sets

23
Part VI Future Work
  • Incorporate biological knowledge as a prior
  • Model condition attributes in the network, such as temporal
    indicators, background variables, and exogenous cellular conditions
  • Learn from continuous data
  • Combine Bayesian methods with clustering
    algorithms to learn models over clustered genes

24
VII Some Useful References
  • A Brief Introduction to Graphical Models and Bayesian Networks:
    http://www.ai.mit.edu/murphyk/Bayes/Bayes.html
  • DNA Microarray: http://www.gene-chips.com/
  • Project description: http://genome-www.stanford.edu/cellcycle/

25
The End
  • Thank You !

26
Complementation rules
  • Bases: A (adenine), C (cytosine), T (thymine), G (guanine)
  • A pairs with T; C pairs with G
    [Figure not preserved: base letters along paired DNA strands]
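
The pairing rules can be written as a small lookup. This is only an illustrative sketch; the function name and the example strand are hypothetical, and strand direction is ignored.

```python
# Standard base pairing: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand):
    """Return the base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand)


paired = complement_strand("CTCAAT")
```

Applying the function twice returns the original strand, since each base maps back to itself through its partner.
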
27
DNA Microarray (informal, intuitive)
  • Testing objects
    [Figure not preserved; surviving labels: (gene), experimental,
    controlled, referential cDNA]

28
DNA Microarray (cont.)
  • Microarray plate
      • Each slot corresponds to one gene
      • All genes to be studied are present