Title: Genomic Analysis Lecture on Microbial Genomes
1
- E. coli (Walkerton outbreak)
- Salmonella (food poisoning)
- Hepatitis (food handling)
- Meningitidis (Abbotsford outbreak)
- Anthrax (bioterrorism)
- Flesh-eating disease
2 Genome data for
- Anthrax
- Cat scratch disease
- Chancroid
- Chlamydia
- Cholera
- Dental caries
- Diarrhea (E. coli etc.)
- Diphtheria
- Epidemic typhus
- Mediterranean fever
- Gastroenteritis
- Gonorrhea
- Legionnaires' disease
- Leprosy
- Leptospirosis
- Listeriosis
- Lyme disease
- Melioidosis
- Meningitis
- Necrotizing fasciitis
- Paratyphoid/enteric fever
- Peptic ulcers and gastritis
- Periodontal disease
- Plague
- Pneumonia
- Salmonellosis
- Scarlet fever
- Shigellosis
- Strep throat
- Syphilis
- Toxic shock syndrome
- Tuberculosis
- Tularemia
- Typhoid fever
- Urethritis
- Urinary tract infections
- Whooping cough
- Hospital-acquired infections
3 More Bacterial Pathogens
- Chlamydophila psittaci: Respiratory disease, primarily in birds
- Mycoplasma mycoides: Contagious bovine pleuropneumonia
- Mycoplasma hyopneumoniae: Pneumonia in pigs
- Pasteurella haemolytica: Cattle shipping fever
- Pasteurella multocida: Cattle septicemia, pig rhinitis
- Ralstonia solanacearum: Plant bacterial wilt
- Xanthomonas citri: Citrus canker
- Xylella fastidiosa: Pierce's disease (grapevines)
Bacterial wilt
4 Yet More Pathogens
- Chickenpox
- Flu
- Herpes
- Mono
- Ebola
- Measles
- Mumps
- Rubella
- AIDS
- Meningitis
- Warts
- Hepatitis
5 Good Bacteria
- Make yogurt, cheese, sourdough bread
- Actinomycetes: produce antibiotics (bacteria as factories)
- Plant growth promoting bacteria
- Break down dead matter
- Break down chemicals (bioremediation)
- Food for many organisms
- Good bacteria in our bodies (trillions!)
6 Amazing But True
- More bacteria in our bodies than human cells!
- More different types of bacterial genes in our body than there are human genes!
- The second human genome project (David Relman)
7 Life's Diversity
8 Cracking the Code of Microbes
9 Cracking the Code of Microbes
- 1995: First complete genome sequence for a free-living organism (Haemophilus influenzae), cited more than 2,100 times!
- 2002: More than 60 bacterial genomes completed
Doolittle, Nature, Apr 18, 2002
10 Percent Regulators as a Function of Genome Size
[Figure: regulators (%) plotted against number of genes (x-axis 0 to 7000) for 13 numbered genomes, grouped as "Specialized environments" or "Free-living".]
Genomes: 1, Mycoplasma genitalium; 2, Chlamydia trachomatis; 3, Treponema pallidum; 4, Borrelia burgdorferi; 5, Chlamydia pneumoniae; 6, Helicobacter pylori; 7, Helicobacter pylori; 8, Haemophilus influenzae; 9, Neisseria meningitidis; 10, Mycobacterium tuberculosis; 11, Bacillus subtilis; 12, Escherichia coli; 13, Pseudomonas aeruginosa.
11 How to identify bacterial genes
Basic: Look for starts and stops (start and stop codons) and identify open reading frames.
What kinds of problems can lead you to identify a sequence as containing a gene when it really doesn't? How can you avoid such problems?
Think, Pair, Share!
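The basic "starts and stops" scan can be sketched in a few lines of Python. This is a minimal illustration, not a real gene finder: it assumes ATG as the only start codon and TAA/TAG/TGA as stops, and the names `find_orfs` and `min_codons` are hypothetical. The minimum-length cutoff is one simple guard against short open reading frames that occur by chance.

```python
from typing import List, Tuple

START = "ATG"                    # assumption: ATG is the only start codon considered
STOPS = {"TAA", "TAG", "TGA"}    # standard stop codons
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq: str, min_codons: int) -> List[Tuple[int, int]]:
    """Scan the three reading frames of one strand.

    Returns (start, end) offsets; end is exclusive and includes the stop codon.
    """
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i                      # first in-frame start opens an ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    found.append((start, i + 3))
                start = None                   # look for the next ORF in this frame
    return found


def find_orfs(seq: str, min_codons: int = 100) -> List[Tuple[str, int, int]]:
    """Return (strand, start, end) for ORFs in all six frames.

    Coordinates are always given on the forward strand.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [("+", s, e) for s, e in orfs_in_strand(seq, min_codons)]
    rc = reverse_complement(seq)
    hits += [("-", n - e, n - s) for s, e in orfs_in_strand(rc, min_codons)]
    return hits
```

Lowering `min_codons` finds more candidates but also more spurious ones, which is the trade-off the discussion question is pointing at.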
12 Glimmer and GeneMark
- Training on a dataset
- Hidden Markov Models based on coding and non-coding sequence
- Additional knowledge/data about gene grammar
http://www.tigr.org/softlab/glimmer/glimmer.html
http://opal.biology.gatech.edu/GeneMark/
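The idea of training on coding and non-coding sequence and then scoring a candidate can be illustrated with a toy example. The sketch below uses a plain fixed-order Markov chain with pseudocounts and a log-odds score; it is not the model that Glimmer or GeneMark actually implement, and all function names and parameters here are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable

ALPHABET = "ACGT"
Model = Dict[str, Dict[str, float]]


def train_markov(seqs: Iterable[str], order: int = 2, pseudocount: float = 1.0) -> Model:
    """Estimate P(base | previous `order` bases) from training sequences."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in seqs:
        seq = seq.upper()
        for i in range(order, len(seq)):
            context, base = seq[i - order:i], seq[i]
            # skip positions containing ambiguous bases such as N
            if base in ALPHABET and all(c in ALPHABET for c in context):
                counts[context][base] += 1
    model: Model = {}
    for context, nxt in counts.items():
        total = sum(nxt.values()) + pseudocount * len(ALPHABET)
        model[context] = {b: (nxt.get(b, 0.0) + pseudocount) / total for b in ALPHABET}
    return model


def log_likelihood(seq: str, model: Model, order: int = 2) -> float:
    """Sum of log-probabilities; unseen contexts fall back to a uniform 0.25."""
    uniform = 1.0 / len(ALPHABET)
    seq = seq.upper()
    total = 0.0
    for i in range(order, len(seq)):
        probs = model.get(seq[i - order:i])
        p = probs.get(seq[i], uniform) if probs else uniform
        total += math.log(p)
    return total


def coding_log_odds(seq: str, coding: Model, noncoding: Model, order: int = 2) -> float:
    """Positive scores mean the sequence looks more like the coding training set."""
    return log_likelihood(seq, coding, order) - log_likelihood(seq, noncoding, order)
```

In practice the training sets would be known genes and intergenic regions from the genome being annotated; the quality of the prediction depends heavily on that training data.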
13 Cracking the Code of Microbes
14 Figure 1: Circular genome map showing the position and orientation of known genes, pseudogenes and repetitive sequences. From the outside: circles 1 and 2 (clockwise and anticlockwise), genes on the - and + strands, respectively; circles 3 and 4, pseudogenes; 5 and 6, M. leprae specific genes; 7, repeat sequences; 8, G+C content; 9, G/C bias (skew), (G+C)/(G-C).
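The two innermost tracks of such a map are simple window statistics. The sketch below computes G+C content and a skew value per window; it assumes the commonly used skew definition (G - C)/(G + C), which may differ from the ratio as written in the caption, and the function names and window parameters are hypothetical.

```python
from typing import Iterator, Tuple


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq: str) -> float:
    """Skew under the assumed convention (G - C) / (G + C); 0.0 if no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_gc(seq: str, window: int, step: int) -> Iterator[Tuple[int, float, float]]:
    """Yield (window start, G+C content, GC skew) for full-length windows."""
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, gc_content(chunk), gc_skew(chunk)
```

Plotting the skew values around a circular chromosome typically shows a sign change near the replication origin and terminus, which is why this track appears on genome maps.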
15 Microbial Genome Surprises
- Prevalence of gene clusters and gene islands (genomic islands). Horizontal gene transfer between microbes, mediated by phage or phage-like elements, appears to be common
- Closely related bacteria can have significant differences in genome content and structure
16 Circular genome map of EDL933 compared with MG1655. Outer circle shows the distribution of islands: shared co-linear backbone (blue); position of EDL933-specific sequences (O-islands) (red); MG1655-specific sequences (K-islands) (green); O-islands and K-islands at the same locations in the backbone (tan); hypervariable (purple). Second circle shows the G+C content calculated for each gene longer than 100 amino acids, plotted around the mean value for the whole genome, colour-coded like the outer circle. Third circle shows the GC skew for third-codon position, calculated for each gene longer than 100 amino acids: positive values, lime; negative values, dark green. Fourth circle gives the scale in base pairs. Fifth circle shows the distribution of the highly skewed octamer Chi (GCTGGTGG), where bright blue and purple indicate the two DNA strands.
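The strand bias of Chi can be checked by counting the octamer on each strand. The sketch below is a minimal illustration with hypothetical function names: occurrences on the reverse strand are found by searching the forward sequence for the reverse complement of the motif.

```python
from typing import Tuple

CHI = "GCTGGTGG"  # Chi octamer as given in the caption
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps."""
    count = 0
    pos = seq.find(motif)
    while pos != -1:
        count += 1
        pos = seq.find(motif, pos + 1)
    return count


def chi_strand_counts(seq: str) -> Tuple[int, int]:
    """Return (forward-strand count, reverse-strand count) of the Chi octamer."""
    seq = seq.upper()
    forward = count_overlapping(seq, CHI)
    reverse = count_overlapping(seq, reverse_complement(CHI))
    return forward, reverse
```

Running the count separately on each half of the chromosome (split at the origin and terminus) is one way to see the skew that the fifth circle displays.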
17 Microbial Genome Surprises
- Intracellular bacterial genomes under isolation: reduced genome size
- Buchnera
- - endosymbiont of aphids
- - 50 million years of genetic isolation
- - only observe gene loss
- Rickettsia: 25% non-coding (vs 10% for most), evidence of decay
- Mycobacterium leprae: massive decay. Why?
18 Microbial Genome Surprises
- Always at least a quarter of the predicted genes in a microbial genome are hypothetical. Why?
- Roadblock to a full understanding of cellular machinery
19 What we can do with the code
- Identify all genes and proteins in a bacterium
- Identify new antibiotic drug targets
- Identify new vaccine targets
- Identify new diagnostics
- Understand how bacteria work: develop better bacterial factories for producing drugs, chemicals and foods, and for biodegradation and bioremediation
- - Antibiotic production, insulin production
- - Oil spill cleanup
- Identify new bacteria and new disease-causing agents
20 Other genomic approaches easier in bacteria
- Whole genome gene knockouts. Saturation transposon mutagenesis to identify essential genes
- Library of gene fusions with reporter genes
- Library of promoters from the genome cloned in front of a reporter gene (e.g. to identify in vivo expressed genes)
- Whole genome microarrays: RNA expression
- Hot area: proteome arrays
- Genome comparisons
- - Whole genome hybridizations to identify unique sequences
- - Whole genome mapping (pulsed-field gel electrophoresis, high-resolution fingerprinting)
21 Resources
- www.tigr.org
- www.sanger.ac.uk/Projects/Microbes