Rennie, C (1); Noyes, HA (2); Kemp, SJ (2); Hulme, H (1); Brass, A (1,3); Hoyle, DC (4)
(1) Faculty of Life Sciences, University of Manchester, Smith Building, Oxford Road, Manchester, M13 9PT, UK
(2) Biosciences Building, School of Biological Sciences, University of Liverpool, Crown Street, Liverpool, L69 7ZB, UK
(3) School of Computer Science, Kilburn Building, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
(4) North West Institute of Bio-Health Informatics, School of Medicine, Stopford Building, Oxford Road, Manchester, M13 9PT, UK
Mismatches between probe and target sequences
have a strong position-dependent effect on signal
ratios from aCGH using 60mer oligonucleotide
microarrays
Abstract
Sequence mismatches between probe and
fluorescently-labelled target strands are known
to affect the stability of the probe-target
duplex formed, and hence the strength of the
observed fluorescent signals in microarray
experiments. However, the exact effects of
sequence mismatches on microarray hybridisations
are not well characterised. Array-based
Comparative Genomic Hybridisation (aCGH) is a
common technique for identifying DNA copy number
variations. aCGH data are particularly suitable
for analysing the effects of probe-target
sequence mismatches because using genomic DNA
avoids complications due to the large ranges of
intracellular mRNA levels. A previous study
provided data for hybridisations comparing three
mouse strains to a C57BL/6 reference, using
Agilent 60mer oligonucleotide arrays. Sequence
mismatches between targets and probes were
identified using the Perlegen 8 million mouse SNP
dataset, and their effect on log2 signal ratio
between test and reference strains was
assessed. Observations indicated a strong effect
of sequence mismatches on log2 signal ratio,
dependent on the number of mismatches and on
their position relative to the probe sequence.
Ratios for probes with 1 mismatch or 2 mismatches
were strongly correlated when probes were matched
on the maximum length of perfectly matched
sequence between probe and target. An existing
model of nucleic acid melting was tested, but
predictions did not correspond to these results.
Progress has been made in developing a new
computational model to reproduce these findings.
Background
Nucleic acid hybridisation is the
formation of a double helix from two single
strands by complementary base-pairing. It is the
basis for many key biological techniques. An
obvious example is microarrays, which use
hybridisation to probes attached at a
surface. Sequence mismatches are often present,
for example in cross-species hybridisation, or
due to ordinary variation between strains, breeds
or individuals. They are known to have a strong
effect on results from short oligonucleotide
probes and less effect on cDNA probes.
Experimental dataset
Log2 signal ratios (see
equation 1) were obtained from hybridisations of
gDNA from three mouse test strains against a
C57BL/6 reference using Agilent 244K whole mouse
genome and 56K custom CGH array platforms. The
probe sequences and the Perlegen mouse SNP
dataset were compared to identify SNP loci that
would cause sequence mismatches between the
probes and the test strain targets. 15206 probes
on the whole genome array and 3710 probes on the
custom array overlapped 1 or more polymorphic
loci (see table 1).
Array              Test strain   1 SNP (% of probes)   2 SNPs (% of probes)   3 SNPs (% of probes)
244K whole genome  A/J           8032 (3.41)           803 (0.34)             36 (0.02)
244K whole genome  BALB/cJ       7417 (3.15)           724 (0.31)             45 (0.02)
244K whole genome  129P3/J       8106 (3.44)           868 (0.37)             41 (0.02)
244K whole genome  All strains   13984 (5.94)          1546 (0.66)            80 (0.03)
56K custom         A/J           1343 (2.51)           120 (0.22)             5 (0.01)
56K custom         BALB/cJ       1834 (3.43)           178 (0.33)             8 (0.01)
56K custom         129P3/J       2273 (4.25)           233 (0.44)             11 (0.02)
56K custom         All strains   5199 (9.71)           536 (1.00)             23 (0.04)
Table 1. Number of probes in each hybridisation
overlapping 1, 2 or 3 SNP loci that would cause a
mismatch in the probe-target duplex. There were
also 2 probes that overlapped 4 SNP loci, but
these were omitted from the analysis.
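The probe-to-SNP comparison described above can be sketched as an interval lookup. This is only an illustration under stated assumptions: the function names, the tuple layout and the 1-based inclusive coordinates are hypothetical, and the SNP lists are assumed to contain only loci where the test strain differs from the C57BL/6 reference.

```python
from bisect import bisect_left, bisect_right
from collections import Counter


def count_snps_per_probe(probes, snp_positions):
    """Count SNP loci falling inside each probe.

    probes: iterable of (probe_id, chrom, start, end), 1-based inclusive
            coordinates (an assumed layout, not taken from the poster).
    snp_positions: dict mapping chrom -> sorted list of SNP positions.
    """
    counts = {}
    for probe_id, chrom, start, end in probes:
        positions = snp_positions.get(chrom, [])
        # Number of sorted positions p with start <= p <= end.
        counts[probe_id] = bisect_right(positions, end) - bisect_left(positions, start)
    return counts


def tabulate(counts):
    """Number of probes overlapping 1, 2, 3, ... SNP loci (zero-SNP probes dropped)."""
    return Counter(n for n in counts.values() if n > 0)


probes = [
    ("p1", "chr1", 100, 159),
    ("p2", "chr1", 200, 259),
    ("p3", "chr2", 1, 60),
]
snps = {"chr1": [100, 159, 160, 230]}
per_probe = count_snps_per_probe(probes, snps)
summary = tabulate(per_probe)
```

A table such as Table 1 would then be one `tabulate` result per test strain and array.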
Equation 1. Log2 signal ratio. A higher ratio
indicates lower intensity for the test strain
(hence possible destabilisation of the duplex
between the probe and the test strain target).
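The equation itself is not present in this transcript. One form that would be consistent with the caption, in which a higher ratio corresponds to lower test-strain intensity, is given here as an assumption; the symbols I_reference and I_test are illustrative names for the reference and test signal intensities, not notation taken from the poster.

```latex
\text{log}_2\ \text{signal ratio}
  = \log_2\!\left(\frac{I_{\text{reference}}}{I_{\text{test}}}\right)
```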
Key observations from experimental data
The mean log2 signal ratio for each number of
mismatches was plotted for each hybridisation
(see figure 1). Larger numbers of mismatches were
associated with higher mean log2 signal ratios,
and there was a strong correlation between number
of known mismatches and log2 signal ratio
(r2 = 0.94), indicating that mismatches do have
an effect on the results from long
oligonucleotide probes.
For the probe-target pairs with 1 mismatch, the
mean log2 signal ratio was plotted for each
possible mismatch position, measured from the
nearest end of the probe (see figure 2).
Mismatches further from the end of the probe were
associated with higher mean log2 signal ratios,
and there was a strong correlation between
mismatch position and log2 signal ratio
(r2 = 0.92).
Moving the mismatch position nearer to the centre
of the probe reduces the length of continuous
complementary duplex that can be formed, so this
length of perfect match could itself be a factor.
When the mean log2 signal ratios for probe-target
pairs with 1 mismatch were compared with those
for pairs with 2 mismatches and the same length
of perfect match (see figure 3), the two sets of
results were correlated (Pearson's correlation
coefficient 0.65, r2 = 0.43, indicating that
length of perfect match accounts for
approximately 43% of the variance in log2 signal
ratio).
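The two quantities used in these comparisons, mismatch position from the nearest probe end and maximum length of perfect match, can be computed as in the following minimal sketch. The function names and the 1-based position convention are assumptions made for illustration; the poster does not state how positions were indexed.

```python
def distance_from_nearest_end(position, probe_length=60):
    """Distance of a 1-based mismatch position from the nearest probe end.

    A terminal base has distance 1 (assumed convention).
    """
    return min(position, probe_length - position + 1)


def longest_perfect_match(mismatch_positions, probe_length=60):
    """Longest run of consecutive matched bases, given 1-based mismatch positions."""
    # Sentinels just outside the probe make the end segments fall out naturally.
    boundaries = [0] + sorted(mismatch_positions) + [probe_length + 1]
    return max(b - a - 1 for a, b in zip(boundaries, boundaries[1:]))


single = longest_perfect_match([30])      # one mismatch near the centre
double = longest_perfect_match([20, 51])  # two mismatches, same longest run
```

In this example the single-mismatch and double-mismatch probes share the same perfect-match length, so under the matching scheme described above they would be compared with each other.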
Attempts to replicate experimental observations
with DINAMelt simulations
Simulations of
hybridising a 60mer probe to perfect and
mismatched targets were carried out using
DINAMelt, an existing model of nucleic acid
hybridisation. The difference in Gibbs free
energy (between a perfectly matched probe-target
duplex and one with mismatches) predicted by
DINAMelt is approximately equivalent to the log2
signal ratio in the experimental data. For each
simulation, the actual probe sequences from the
experimental data were used and perfect or
mismatched targets were generated.
DINAMelt appeared to replicate the effect of the
number of mismatches (see figure 5). The
simulations indicated that a larger number of
mismatches would lead to lower thermodynamic
stability of the duplex, consistent with the
experimental results. There was a correlation
between number of mismatches and difference in
Gibbs free energy (r2 = 0.663).
The results of analysing the effect of mismatch
position on the DINAMelt simulations showed much
less similarity to the experimental data (see
figure 6). Rather than stability continuing to
fall as the mismatch is moved further from the
end of the probe, a plateau is reached after
around 6 bases.
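Generating perfect and mismatched targets from a probe sequence could look like the sketch below. It is an assumption about one reasonable way to do this, not the procedure used in the study, and it does not call DINAMelt; the function names and the 0-based position argument are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def perfect_target(probe):
    """Reverse complement of the probe, i.e. a fully matched target strand."""
    return probe.translate(COMPLEMENT)[::-1]


def mismatched_target(probe, position, alt_base):
    """Target that pairs perfectly with the probe except at one base.

    position is 0-based along the probe (assumed convention); alt_base is the
    alternative allele on the probe strand and must differ from the probe base.
    """
    if alt_base == probe[position]:
        raise ValueError("alt_base must differ from the probe base")
    variant = probe[:position] + alt_base + probe[position + 1:]
    return perfect_target(variant)


probe = "ACGTAC"
matched = perfect_target(probe)
mismatched = mismatched_target(probe, 0, "G")
differences = sum(a != b for a, b in zip(matched, mismatched))
```

Each probe and target pair would then be passed to the hybridisation model, and the free energy of the mismatched duplex compared with that of the perfect duplex.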
Figure 1. Mean log2 signal ratio for each number
of mismatches.
Figure 5. Mean difference in Gibbs free energy
for each number of mismatches.
Figure 2. Mean log2 signal ratio for each
mismatch position (measured from the end of the
probe), only for probe-target pairs containing 1
mismatch.
Figure 6. Mean difference in Gibbs free energy
for each mismatch position (measured from the end
of the probe), only for pairs with 1 mismatch.
Possible reason why the position effect was not
reproduced by the simulations
There are two main differences between the
DINAMelt model and the hybridisation conditions
that produced the experimental data used in this
analysis.
Firstly, DINAMelt models hybridisations in
solution. Hybridisation at an array surface
differs thermodynamically from hybridisation in
solution in several ways, but factors such as the
presence of an array surface or probe density are
unlikely to lead to the observed dependence on
mismatch position and length of perfect match.
Secondly, DINAMelt uses parameters derived from
melting experiments. These largely used very
short nucleic acids and were carried out at much
lower temperatures than the 65 °C used for the
Agilent long oligonucleotide microarray
hybridisations. At higher temperatures, entropy
and the range of many possible partially-bound
duplex configurations will make a much greater
contribution to the total energy of the duplex.
If a mismatch lies within an unbound section of
the duplex, it will have no effect on the duplex
stability. If the majority of duplex
configurations are fully-bound or have internal
loops (see figure 7a), the mismatch position
does not alter the likelihood that it will affect
duplex stability. However, if the majority of
configurations are partially-bound and melt from
the ends (see figures 7b and 7c), mismatches
nearer the ends will be less likely to affect
duplex stability. This will lead to a greater
effect from mismatches near the middle of the
probe, as observed in the experimental data.
These configurations are the basis for a new
model of hybridisation that is being developed.
This model is an extension of the Poland-Scheraga
model that restricts the partition function to
states where the duplex opens only from the two
ends. In initial testing, the model successfully
replicates the position-dependent effect of
mismatches observed in the experimental results.
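The idea of a partition function restricted to states that open only from the two ends can be illustrated with a toy calculation. This is a sketch under strong assumptions and is not the model being developed: every base pair is given the same arbitrary energy, energies are in units of RT, loop and stacking terms are ignored, and the values `e_match=-0.2` and `e_mismatch=3.0` are illustrative only.

```python
import math


def partition_function(pair_energies):
    """Sum Boltzmann weights over states with one contiguous bound segment.

    Each state binds positions i..j (inclusive) and leaves both ends beyond
    that segment unbound. Energies are per base pair, in units of RT.
    """
    n = len(pair_energies)
    z = 0.0
    for i in range(n):
        energy = 0.0
        for j in range(i, n):
            energy += pair_energies[j]  # extend the bound segment by one pair
            z += math.exp(-energy)
    return z


def mismatch_penalty(n, position, e_match=-0.2, e_mismatch=3.0):
    """Free energy cost (in RT) of one mismatch at a 0-based position."""
    perfect = [e_match] * n
    mismatched = list(perfect)
    mismatched[position] = e_mismatch
    return -math.log(partition_function(mismatched) / partition_function(perfect))


near_end = mismatch_penalty(60, 0)
intermediate = mismatch_penalty(60, 10)
centre = mismatch_penalty(60, 29)
```

In this toy setting a mismatch near the end can be excluded from the bound segment at little cost, so its penalty is smaller than that of a central mismatch, which is the qualitative behaviour described above.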
Figure 3. Mean log2 signal ratio for probe-target
pairs with 1 mismatch compared to pairs with 2
mismatches that contain the same length of
perfect match.
Figure 7. Possible probe-target binding
configurations. a. Duplex with internal loops.
Mismatch position does not alter the likelihood
of the mismatch lying within a loop, and hence of
it affecting duplex stability. b. Duplex with
terminal unclosed loop or dangling end and
mismatch near end. c. As 7b, but the mismatch is
further from the end and so less likely to lie
within an unbound section and more likely to
affect duplex stability.
Figure 4. Mean log2 signal ratio for each
substitution type.
Mismatch position explains more log2 signal ratio
variation than polymorphism type
Mean log2 signal ratio was plotted for each type
of substitution (see figure 4). All substitutions
were associated with increased log2 signal ratio,
and the largest effect was seen for changes from
pyrimidines to a G. A two-way ANOVA was performed
to compare the scale of the polymorphism type
effect and the position effect. The majority of
the variation in log2 signal ratio (94.4%) was
explained by neither factor. However, both
factors were significant, and mismatch position
explained over 5 times as much variation as
polymorphism type.

Conclusion
Mismatches affect results from long
oligonucleotide probes. The effect is dependent
on the mismatch position and, for small numbers
of mismatches, on the maximum length of perfect
match between probe and target. These
observations have implications for data analysis
and probe design. They have informed the initial
design of a new model of nucleic acid
hybridisation that is currently being developed
and that successfully replicates the qualitative
aspects of these results.
Acknowledgements
Thanks to Tara Hill (Agilent) and Leanne
Wardlesworth (University of Manchester Core
Services Unit) for excellent technical
assistance. This research was partly funded by
the Wellcome Trust and by BBSRC.