Detection and Compensation of Cross-Hybridization in DNA Microarray Data

1
Detection and Compensation of Cross-Hybridization
in DNA Microarray Data
Jim Huang (1)
  • Joint work with Quaid Morris (1), Tim Hughes (2),
    and Brendan Frey (1)
  • (1) Probabilistic and Statistical Inference Group,
    University of Toronto
  • (2) Banting and Best Department of Medical
    Research, University of Toronto

2
Description and Applications of DNA Microarrays
  • Microarrays consist of a 2-D array of probes,
    each with a short DNA sequence attached. These
    sequences are called oligonucleotide sequences.
  • The output of each probe is approximately
    proportional to the amount of DNA from a given
    tissue that binds to the probe. The data for each
    probe is therefore an N-dimensional expression
    profile vector, where N is the number of tissues
    used on the array.
  • DNA microarrays can be used to measure the level
    of gene expression across these N tissues.

3
Hybridization and cross-hybridization
  • The process of two complementary DNA strands
    binding is called hybridization.
  • Ideally, an oligonucleotide probe will only bind
    to the DNA sequence for which it was designed and
    to which it is complementary.
  • However, many DNA sequences are similar to one
    another, so a sequence can also bind to probes
    that were designed for other sequences.
  • This phenomenon is called cross-hybridization.

[Figure: DNA from a tissue sample binding to an oligonucleotide probe.
Hybridization: sample strand AGCTAGGAT, probe TCGATCCTA.
Cross-hybridization: sample strand ATCTAGAAT, same probe TCGATCCTA.]
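The pairing rule behind the two cases on this slide can be sketched in a few lines of Python. This is only an illustration, not part of the original presentation: the function names are hypothetical, the sequences are the ones shown in the slide figure (assuming the space inside the probe sequence is a rendering artifact), and strands are compared position by position as in the figure, ignoring strand orientation.

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(seq):
    """Base-wise complement of a DNA sequence (no reversal)."""
    return "".join(COMPLEMENT[base] for base in seq)

def mismatches(probe, target):
    """Count positions where the probe does not pair with the target."""
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    return sum(1 for p, t in zip(probe, target) if COMPLEMENT[t] != p)

probe = "TCGATCCTA"
print(mismatches(probe, "AGCTAGGAT"))  # 0: fully complementary (hybridization)
print(mismatches(probe, "ATCTAGAAT"))  # 2: near match (possible cross-hybridization)
```

A target with only a few mismatches can still bind the probe, which is what makes cross-hybridization possible.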
4
The trouble with cross-hybridization
  • With cross-hybridization, a probe can signal
    the presence of sequences other than the one
    it was designed for.
  • This skews the observed data away from the
    expected data.

[Figure: observed expression profile vector (cross-hybridized)
compared with the expected expression profile vector (no
cross-hybridization).]
5
Detecting cross-hybridization (1)
  • To test whether cross-hybridization is
    impacting the gene expression data, we perform a
    BLAST sequence match on all oligonucleotide probe
    sequences used on the microarray.
  • Many probes are matched with sequences for
    which they weren't specifically designed.
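The idea of searching for unintended matches can be illustrated with a much cruder stand-in for BLAST. The sketch below is not BLAST and is not the authors' pipeline: it only flags pairs of sequences that share at least one exact k-mer, loosely in the spirit of a seeding step. The function names, the toy sequences, and the choice k = 4 are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

def kmers(seq, k):
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def candidate_pairs(sequences, k):
    """Pairs of sequence names that share at least one exact k-mer."""
    index = defaultdict(set)              # k-mer -> names containing it
    for name, seq in sequences.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    pairs = set()
    for names in index.values():
        pairs.update(combinations(sorted(names), 2))
    return sorted(pairs)

# Toy sequences, hypothetical and far shorter than real probes.
sequences = {"p1": "TCGATCCTA", "p2": "TAGATCTTA", "p3": "GGGCCCAAA"}
print(candidate_pairs(sequences, k=4))    # [('p1', 'p2')], via shared "GATC"
```

In practice a real alignment tool would score and filter such candidates; the sketch only shows why similar sequences end up paired.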

6
Detecting cross-hybridization (2)
  • We compute the Pearson correlation coefficient ρ
    between the expression profiles of BLAST-matched
    probes and between the profiles of
    randomly paired probes.
  • Approximately 33% of the BLAST-matched probes
    have ρ > 0.95, whereas only 2% of
    randomly paired probes have ρ > 0.95.
  • This difference between the two distributions
    indicates that cross-hybridization indeed has a
    significant impact on the observed gene
    expression data.
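The comparison on this slide can be sketched as follows: compute the Pearson correlation coefficient for each pair of expression profiles, then report the fraction of pairs above 0.95 for matched and for randomly chosen pairs. The profiles and pair lists below are made-up toy values, and the function names are hypothetical; none of this reproduces the data or the percentages from the presentation.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (norm_u * norm_v)

def fraction_above(profiles, pairs, threshold=0.95):
    """Fraction of index pairs whose profiles correlate above threshold."""
    hits = sum(1 for i, j in pairs
               if pearson(profiles[i], profiles[j]) > threshold)
    return hits / len(pairs)

# Toy expression profiles over N = 4 tissues (illustrative values only).
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.1, 4.0, 6.2, 7.9],   # nearly proportional to profile 0
    [4.0, 1.0, 3.0, 2.0],
    [1.0, 3.0, 2.0, 4.0],
]
matched_pairs = [(0, 1)]          # stand-in for BLAST-matched probe pairs
random_pairs = [(0, 2), (2, 3)]   # stand-in for randomly paired probes

print(fraction_above(profiles, matched_pairs))  # 1.0
print(fraction_above(profiles, random_pairs))   # 0.0
```

A large gap between the two fractions is the kind of evidence the slide describes.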

7
Compensating for cross-hybridization
  • We model the observed, cross-hybridized
    expression profile vector x as the matrix product
    x = Λz of a hybridization matrix Λ and an
    unobserved expression profile vector z in which
    there is no cross-hybridization.
  • The elements Λij of the matrix Λ are set as
    parameterized functions of the Gibbs free energy
    ΔGij between probes i and j.
  • To compensate for cross-hybridization, we use a
    generalized Expectation-Maximization algorithm in
    which we solve for z and Λ iteratively.
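To make the linear model concrete, the sketch below applies a known hybridization matrix to a clean signal and then recovers that signal from the mixed observation. It is deliberately simpler than the method on the slide: the matrix (called lam here) is assumed to be given rather than estimated from Gibbs free energies, there is no noise model, and the recovery uses plain Jacobi iteration instead of generalized Expectation-Maximization. The vectors here run over probes for a single tissue, and all values and names are hypothetical.

```python
def mix(lam, z):
    """Forward model: observed signal x = lam * z (matrix-vector product)."""
    return [sum(l_ij * z_j for l_ij, z_j in zip(row, z)) for row in lam]

def compensate(lam, x, iters=200):
    """Solve lam * z = x by Jacobi iteration.

    Assumes lam is strictly diagonally dominant, i.e. each probe binds its
    own target more strongly than all other sequences combined.
    """
    n = len(x)
    z = list(x)                       # start from the observed signal
    for _ in range(iters):
        z = [(x[i] - sum(lam[i][j] * z[j] for j in range(n) if j != i))
             / lam[i][i]
             for i in range(n)]
    return z

# Hypothetical 3-probe hybridization matrix: off-diagonal entries model
# cross-hybridization between probes.
lam = [[1.0, 0.3, 0.0],
       [0.2, 1.0, 0.1],
       [0.0, 0.0, 1.0]]
z_true = [5.0, 1.0, 2.0]              # signal without cross-hybridization
x = mix(lam, z_true)                  # observed, cross-hybridized signal
z_hat = compensate(lam, x)            # recovered signal
print(x)
print(z_hat)
```

When the matrix is unknown, as in the presentation, it has to be estimated together with z, which is where the iterative scheme described above comes in.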