Title: Computational methods to quantify transcriptome changes in bacteria
1Computational methods to quantify transcriptome
changes in bacteria
- Rebecca Pankow
- Mentor Dr. Jeff Chang
- Botany and Plant Pathology
- Oregon State University
2What makes a pathogen?
- Overcome host defenses
- Manipulate host cell
- Survive in host environment
Infections caused by Pseudomonas syringae
3Hypothesis
Genes that are expressed in conditions that mimic
the plant are candidates for host-associated
genes.
4Experimental Setup
Grow P. syringae in KB (rich media): no virulence gene expression
Grow P. syringae in minimal media (simulates environment of plant host): virulence gene expression
Identify differential expression of genes
5How to identify expressed genes?
- Transcriptome: all mRNAs in a cell at a given time
sequenced transcriptome
- completely sequenced genome
TAATTCTCGTTATCGTCCGG ATTAAGAGCAATAGCAGGCC
AGAGCAATAGCA
AGAGCAATAGCA
6How to quantify transcriptome changes?
mRNAs in transcriptome
- Next-Generation Illumina IIG Genome Sequencer
ACATAGGAGCTAGATAGCTATGCATCGATCGACATG GATCGACATGAGA
GTTACGAGTAGACTGAGAGATAT CTGAGAGATATGTTTACCCAGATTAC
TCTCCGATGC GATCGACATGAGAGTTACGAGTAGACTGAGAGATAT
36 base-long reads (36-mers)
7Computational Pipeline
Processed 36-mers
TGTTTACCCAGATTACTCTCCGATGCCAGGGAGAAT
GATCGACAGATGCATGTTTACCCAGATTACTCTCCG
ACATAGGAGCTAGATAGCTATGCATCGATCGACAGA GATCGACAGATGC
ATGTTTACCCAGATTACTCTCCG
Align to ref. genome
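The alignment step can be sketched as an exact-match search on both strands. This is a minimal illustration, not the aligner used in the pipeline; the function names and the toy genome (built by joining two of the reads shown on this slide) are assumptions made for the example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(seq))


def find_all(genome, read):
    """Return every 0-based start position where read occurs in genome."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        start = genome.find(read, start + 1)  # step by 1 to allow overlapping hits
    return positions


def align_exact(genome, read):
    """Map a read to both strands; returns (forward_hits, reverse_hits)."""
    return find_all(genome, read), find_all(genome, reverse_complement(read))


# Hypothetical toy genome: two 36-mers from the slide, joined end to end.
genome = ("TGTTTACCCAGATTACTCTCCGATGCCAGGGAGAAT"
          "GATCGACAGATGCATGTTTACCCAGATTACTCTCCG")
forward, reverse = align_exact(genome, "GATCGACAGATGCATGTTTACCCAGATTACTCTCCG")
print(forward, reverse)
```

Exact matching is the simplest possible mapping rule; it discards any read containing a sequencing error.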
8Signal Processing
0010100234201231201001022410301022040102020
Graph signal
reads that map to coordinates
genome coordinates of a potential transcription
unit
Not very informative!
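A raw signal like the one above can be produced by counting reads per genome coordinate. The sketch below assumes each read is counted once, at its start coordinate; the slide does not say whether starts or full read coverage were counted, and `build_signal` and the read positions are hypothetical.

```python
def build_signal(genome_length, read_starts):
    """Count how many reads start at each 0-based genome coordinate."""
    signal = [0] * genome_length
    for start in read_starts:
        if 0 <= start < genome_length:  # ignore positions outside the region
            signal[start] += 1
    return signal


# Hypothetical alignment positions, chosen to reproduce the first ten
# values of the signal shown on the slide.
read_starts = [2, 4, 7, 7, 8, 8, 8, 9, 9, 9, 9]
signal = build_signal(10, read_starts)
print("".join(str(count) for count in signal))  # → 0010100234
```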
9Signal Processing
Using sliding window approach to minimize noise
old signal
Set sliding window = 15
Sum of reads in sliding window: 19, 20, 22
processed signal
__________________________
19 _________________________
19 20 _______________________
19 20 22 _____________________
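The sliding-window sum can be reproduced from the raw signal on the Signal Processing slide. The sketch below assumes each digit of that signal is the read count at one coordinate and that only full windows are summed; the function name is hypothetical.

```python
def sliding_window_sums(signal, window=15):
    """Sum the signal inside each full window, moving one coordinate at a time."""
    if window <= 0 or window > len(signal):
        return []
    current = sum(signal[:window])
    sums = [current]
    for i in range(window, len(signal)):
        # Slide by one: add the entering value, drop the leaving value.
        current += signal[i] - signal[i - window]
        sums.append(current)
    return sums


raw = "0010100234201231201001022410301022040102020"
signal = [int(digit) for digit in raw]  # assumption: one digit per coordinate
processed = sliding_window_sums(signal, window=15)
print(processed[:3])  # → [19, 20, 22]
```

The first three sums match the values 19, 20, 22 shown on the slide.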
10Resulting signal
old signal
scaled and processed signal
More informative, but signal is jagged
11Smoothing the Signal
Iteration of the sliding window
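Iterating the window can be sketched as repeated passes of a moving average. Averaging (rather than summing) is an assumption made here so that the scale stays comparable between passes; the window width, iteration count, and function names are illustrative only.

```python
def moving_average(signal, window):
    """Average over each full window of the given width."""
    return [
        sum(signal[i:i + window]) / window
        for i in range(len(signal) - window + 1)
    ]


def smooth(signal, window=3, iterations=2):
    """Apply the moving average repeatedly.

    Each pass shortens the signal by window - 1 points.
    """
    for _ in range(iterations):
        signal = moving_average(signal, window)
    return signal


jagged = [0, 6, 0, 6, 0, 6, 0]
once = smooth(jagged, window=3, iterations=1)
twice = smooth(jagged, window=3, iterations=2)
print(once)   # → [2.0, 4.0, 2.0, 4.0, 2.0]
print(twice)
```

Each additional pass reduces the swing between neighbouring points, at the cost of blurring sharp transitions.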
12Deconvoluting Signal
Changes in the signal are found by applying the sliding
window to the first and second derivatives of the
signal.
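A minimal sketch of the derivative step, assuming an already smoothed signal and using simple neighbour differences as the discrete derivative. It does not apply the sliding window to the derivatives as the slide describes; the function names and the toy signal are hypothetical.

```python
def derivative(signal):
    """Discrete derivative: difference between neighbouring points."""
    return [b - a for a, b in zip(signal, signal[1:])]


def sign(value, tolerance=0.0):
    """Return 1, -1, or 0 depending on the direction of the slope."""
    if value > tolerance:
        return 1
    if value < -tolerance:
        return -1
    return 0


def change_points(signal, tolerance=0.0):
    """Signal indices at which the slope changes direction."""
    signs = [sign(v, tolerance) for v in derivative(signal)]
    return [i for i in range(1, len(signs)) if signs[i] != signs[i - 1]]


smoothed = [0, 0, 1, 3, 5, 5, 5, 3, 1, 0]  # toy signal
first = derivative(smoothed)
second = derivative(first)
print(first)                    # → [0, 1, 2, 2, 0, 0, -2, -2, -1]
print(second)                   # → [1, 1, 0, -2, 0, -2, 0, 1]
print(change_points(smoothed))  # → [1, 4, 6]
```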
13Deconvoluting Signal
- Refine signal divisions by looking in-between previous divisions
- Categorize signal divisions as increasing, decreasing, or flat
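The categorization step can be sketched by labelling each division according to its net change. The refinement step is not covered here. The boundaries, the tolerance, and the function name are assumptions made for illustration.

```python
def categorize(signal, boundaries, flat_tolerance=0.5):
    """Label each division between consecutive boundaries by its net change."""
    labels = []
    for start, end in zip(boundaries, boundaries[1:]):
        change = signal[end] - signal[start]
        if change > flat_tolerance:
            label = "increasing"
        elif change < -flat_tolerance:
            label = "decreasing"
        else:
            label = "flat"
        labels.append((start, end, label))
    return labels


signal = [0, 0, 1, 3, 5, 5, 5, 3, 1, 0]  # toy smoothed signal
boundaries = [0, 1, 4, 6, 9]             # hypothetical division points
for start, end, label in categorize(signal, boundaries):
    print(start, end, label)
```

With this toy input the divisions are labelled flat, increasing, flat, and decreasing, in that order.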
14Processing Empirical Data
Next-Generation Illumina IIG Genome Sequencer
ACATAGGAGCTAGATAGCTATGCATCGATCGACATG GATCGACATGAGA
GTTACGAGTAGACTGAGAGATAT CTGAGAGATATGTTTACCCAGATTAC
TCTCCGATGC GATCGACATGAGAGTTACGAGTAGACTGAGAGATAT
36 base-long reads (36-mers)
15Problems
Mistakes in sequencing can be made!
ACATAGGAGCTAGATAGCTATGCATCGATCGACATG GATCGACATGAGA
GTTACGAGTAGACTGAGAGATAT CTGAGAGATATGTTTACCCAGATTAC
TCTCCGATGC GATCGACATGAGAGTTACGAGTAGACTGAGAGATAT
30% of reads match the P. syringae genome
16Solution
Account for mismatches by treating each base in a
36-mer as a wildcard
ACATAGGAGCTAGATAGCTATGCATCGATCGACATG
_CATAGGAGCTAGATAGCTATGCATCGATCGACATG
A_ATAGGAGCTAGATAGCTATGCATCGATCGACATG
AC_TAGGAGCTAGATAGCTATGCATCGATCGACATG
36-mers containing wildcards are mapped back to
the original genome
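The wildcard approach can be sketched with regular expressions: each variant replaces one base with a pattern that matches any nucleotide. This is an illustration under stated assumptions, not the pipeline's implementation; the function names are hypothetical, and the toy genome is the slide's read with one base deliberately changed.

```python
import re


def wildcard_variants(read):
    """Yield the read once per position, with that base replaced by '_'."""
    for i in range(len(read)):
        yield read[:i] + "_" + read[i + 1:]


def map_with_wildcard(genome, read):
    """Return sorted genome positions matched by any single-wildcard variant."""
    hits = set()
    for variant in wildcard_variants(read):
        # A lookahead lets finditer report overlapping matches as well.
        pattern = re.compile("(?=" + variant.replace("_", "[ACGT]") + ")")
        hits.update(match.start() for match in pattern.finditer(genome))
    return sorted(hits)


read = "ACATAGGAGCTAGATAGCTATGCATCGATCGACATG"
variants = list(wildcard_variants(read))
print(variants[:3])

# Hypothetical genome: the read with base 10 changed (T to G), plus flanks.
genome = "TTT" + read[:10] + "G" + read[11:] + "AAA"
print(read in genome)                   # exact match fails
print(map_with_wildcard(genome, read))  # wildcard match recovers the position
```

Allowing one wildcard per read tolerates a single sequencing error while still requiring the other 35 bases to match.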
17Conclusions
- Computational pipeline developed to
- Generate and smooth signal
- Divide signal into sections that are going up, down, or are flat
- 30% of reads from transcriptome map back to original genome
18Future Work
- Quantify changes in bacterial transcriptome
under different treatments
19Acknowledgements
Jeff Chang, Jason Cumbie, Jeff Kimbrel, Bill Thomas, Cait Thireault, Allison Smith, Ryan Lilley, Phillip Hillenbrand, Jayme Stout, HHMI/USDA, Kevin Ahern