1. Analysis of Horizontal Gene Transfers of Potential Relevance to Microbial Virulence
Fiona Brinkman, Simon Fraser University, Greater Vancouver, British Columbia, Canada
2. Overview
- High pathogen-host protein similarities: detecting horizontal gene transfer
- Characteristics of proteins/genes putatively horizontally acquired by bacterial pathogens
- Implications
- Proposal: How we should be combating bacterial pathogens
3. Yersinia Type III secretion system
4. Approach
Idea: Could we identify novel virulence factors by identifying bacterial pathogen proteins more similar to host proteins than you would expect?
  1. Primary sequence similarity approach: identifies possible horizontal gene transfer
  (2. Structural similarity approach)
5. Unusual similarities between Bacteria and Eukaryote genes: Sequence similarity-based approach
- For each complete bacterial and eukaryote genome: BLASTP (and MSP Crunch) analysis of all deduced proteins, searched against the non-redundant SWALL database
- Overlay NCBI taxonomy information
- Query database for bacterial proteins whose top BLASTP scoring hit is eukaryotic (and eukaryotic proteins whose top hit is bacterial)
- Initial assumption: the three Domains of life (Bacteria, Eukarya, and Archaea) are so divergent that top hits to another Domain are rare
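The query described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Hit` record, the function names, and the example identifiers and scores are all invented here, and the real analysis ran against a database of BLASTP results rather than an in-memory list.

```python
from typing import Dict, List, NamedTuple, Optional


class Hit(NamedTuple):
    """Hypothetical record for one BLASTP hit after the taxonomy overlay."""
    query_id: str        # protein used as the BLASTP query
    subject_id: str      # database protein that was hit
    bit_score: float     # BLASTP bit score for this query/subject pair
    subject_domain: str  # domain of the subject ("Bacteria", "Eukaryota", "Archaea")


def top_hit(hits: List[Hit], query_id: str) -> Optional[Hit]:
    """Return the best-scoring hit for one query, ignoring self-hits."""
    candidates = [h for h in hits
                  if h.query_id == query_id and h.subject_id != query_id]
    return max(candidates, key=lambda h: h.bit_score, default=None)


def cross_domain_queries(hits: List[Hit], query_domains: Dict[str, str],
                         source: str, target: str) -> List[str]:
    """List queries from domain `source` whose top hit is in domain `target`."""
    flagged = []
    for qid in sorted(query_domains):
        if query_domains[qid] != source:
            continue
        best = top_hit(hits, qid)
        if best is not None and best.subject_domain == target:
            flagged.append(qid)
    return flagged


# Invented example data: two bacterial queries and their hits.
hits = [
    Hit("bac_protA", "bac_other1", 310.0, "Bacteria"),
    Hit("bac_protA", "euk_x1", 120.0, "Eukaryota"),
    Hit("bac_protB", "euk_x2", 450.0, "Eukaryota"),
    Hit("bac_protB", "bac_other2", 200.0, "Bacteria"),
]
query_domains = {"bac_protA": "Bacteria", "bac_protB": "Bacteria"}

print(cross_domain_queries(hits, query_domains, "Bacteria", "Eukaryota"))
```

Swapping the `source` and `target` arguments gives the reverse query (eukaryotic proteins whose top hit is bacterial).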
6. Unusual similarities between Bacteria and Eukaryote genes: Sequence similarity-based approach
- Problem: If a gene transfer occurs from a eukaryote to an ancestor of closely related bacteria, the top hit will be to other bacteria
- Therefore, perform a similar query, but filtering different taxonomic groups from the analysis
[Diagram labels: Bacteria 1 and Bacteria 2 (closely related bacteria: same species, family, etc.); Eukaryote]
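A minimal sketch of the filtering idea, assuming each hit carries its taxonomic lineage: hits from the excluded groups (for example the query's own species or family) are dropped before the top hit is chosen. The tuple layout, function name, placeholder taxon names, and scores below are hypothetical and not taken from the BAE-watch implementation.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical hit layout: (subject_id, bit_score, lineage),
# where lineage lists taxon names from domain down to species.
Hit = Tuple[str, float, Tuple[str, ...]]


def top_hit_excluding(hits: List[Hit],
                      excluded_taxa: Iterable[str]) -> Optional[Hit]:
    """Best-scoring hit whose lineage contains none of the excluded taxa."""
    excluded = set(excluded_taxa)
    kept = [h for h in hits if excluded.isdisjoint(h[2])]
    return max(kept, key=lambda h: h[1], default=None)


# Invented hits for one bacterial query that belongs to "FamilyA".
hits = [
    ("close_relative", 900.0, ("Bacteria", "FamilyA", "species A2")),
    ("eukaryote_protein", 850.0, ("Eukaryota", "FamilyE", "species E1")),
    ("distant_bacterium", 400.0, ("Bacteria", "FamilyB", "species B1")),
]

unfiltered = top_hit_excluding(hits, [])
filtered = top_hit_excluding(hits, ["FamilyA"])
print(unfiltered[0], unfiltered[2][0])  # close relative masks the eukaryotic hit
print(filtered[0], filtered[2][0])      # eukaryotic hit surfaces after filtering
```

Repeating the query while excluding progressively broader groups (species, then family, and so on) is one way to find transfers that predate the divergence of closely related bacteria.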
7. BAE-watch Database: Bacterial proteins with unusual similarity with Eukaryotic proteins
8. Problem: Proteins highly conserved in the three domains of life
Top hit to a protein from another domain may occur by chance. StepRatio score helps detect these.
Example: Glucose-6-Phosphate Reductase
9. Example of a case with a high StepRatio: Enoyl ACP reductase
10. Brinkman et al. (2001) Bioinformatics 17:385-387.
PhyloBLAST: a tool for analysis
11. BAE-watch Analysis of Haemophilus influenzae Rd-KW20 proteins for unusual eukaryotic protein similarities
12. Genome data for
Anthrax, Cat scratch disease, Chancroid, Chlamydia, Cholera, Dental caries, Diarrhea (E. coli etc.), Diphtheria, Epidemic typhus, Mediterranean fever, Gastroenteritis, Gonorrhea, Legionnaires' disease, Leprosy, Leptospirosis, Listeriosis, Lyme disease, Melioidosis, Meningitis, Necrotizing fasciitis, Paratyphoid/enteric fever, Peptic ulcers and gastritis, Periodontal disease, Plague, Pneumonia, Salmonellosis, Scarlet fever, Shigellosis, Strep throat, Syphilis, Toxic shock syndrome, Tuberculosis, Tularemia, Typhoid fever, Urethritis, Urinary Tract Infections, Whooping cough, Hospital-acquired infections
13. Bacterial Pathogens
- Chlamydophila psittaci: Respiratory disease, primarily in birds
- Mycoplasma mycoides: Contagious bovine pleuropneumonia
- Mycoplasma hyopneumoniae: Pneumonia in pigs
- Pasteurella haemolytica: Cattle shipping fever
- Pasteurella multocida: Cattle septicemia, pig rhinitis
- Ralstonia solanacearum: Plant bacterial wilt
- Xanthomonas citri: Citrus canker
- Xylella fastidiosa: Pierce's Disease - grapevines
Bacterial wilt
14. Trends in this Sequence-based Analysis
- Identifies the strongest cases of lateral gene transfer between bacteria and eukaryotes
- Most common:
  - Bacteria <-> Unicellular Eukaryote
- -> Makes sense:
  - Bacteria to multicellular eukaryote must involve germline
  - Eukaryote to bacteria must not involve introns
15. Trends in this Sequence-based Analysis
- Identifies nuclear genes with potential organelle origins
- A control: the method identifies all previously reported Chlamydia trachomatis plant-like genes.
16. First case: Bacterium-Eukaryote Lateral Transfer
N-acetylneuraminate lyase (NanA) of the protozoan Trichomonas vaginalis is 92-95% similar to NanA of Pasteurellaceae bacteria.
de Koning et al. (2000) Mol Biol Evol 17:1769-1773
17. N-acetylneuraminate lyase: role in pathogenicity?
- Pasteurellaceae
  - Mucosal pathogens of the respiratory tract
- T. vaginalis
  - Mucosal pathogen, causative agent of the STD trichomoniasis
18. N-acetylneuraminate lyase (sialic acid lyase, NanA)
Involved in sialic acid metabolism:
- Sialidase: hydrolysis of glycosidic linkages of terminal sialic residues in glycoproteins and glycolipids -> free sialic acid
- Transporter: uptake of free sialic acid
- NanA: free sialic acid -> N-acetyl-D-mannosamine + pyruvate
Role in Bacteria: Proposed to parasitize the mucous membranes of animals for nutritional purposes
Role in Trichomonas: ?
19. Another case: A Sensor Histidine Kinase for a Two-component Regulation System
Signal transduction in general:
- Histidine kinases more common in bacteria
- Ser/Thr/Tyr kinases more common in eukaryotes
However, a histidine kinase was recently identified in fungi, including the pathogens Fusarium solani and Candida albicans. How did it get there?
[Image: Candida]
20. Streptomyces Histidine Kinase. The Missing Link?
Brinkman et al. (2001) Infection and Immunity 69:5207-5211
[Figure: phylogenetic tree of sensor histidine kinases with node support values; scale bar 0.1. Taxa shown: Pseudomonas aeruginosa PhoQ; Xanthomonas campestris RpfC; Vibrio cholerae TorS; Escherichia coli TorS; Escherichia coli RcsC; Fungi (Candida albicans CaNIK1, Neurospora crassa NIK-1, Fusarium solani FIK1, Fusarium solani FIK2); Streptomyces coelicolor SC4G10.06c; Streptomyces coelicolor SC7C7.03; Pseudomonas aeruginosa GacS; Pseudomonas fluorescens GacS / ApdA; Pseudomonas tolaasii RtpA / PheN; Pseudomonas syringae GacS / LemA; Pseudomonas viridiflava RepA; Azotobacter vinelandii GacS; Erwinia carotovora RpfA / ExpS; Escherichia coli BarA; Salmonella typhimurium BarA. Legend: marked proteins are a virulence factor in every organism examined to date.]
21. Plant-like genes in Chlamydia
- Proteins: Unusually high number most similar to plant proteins
- Previous proposal: Obtained genes from a plant-like amoebal host? (A relative of Chlamydiaceae infects Acanthamoeba. Chlamydiaceae: obligate intracellular pathogens)
- However: Acanthamoeba relationship to plants is very controversial
22. Plant-like genes in Chlamydia
23. Plant-like genes in Chlamydia
24. Plant-like genes in Chlamydia
- Endosymbiotic theory
  - Rickettsia: Many eukaryotic-like genes
  - Synechocystis: Many plant-like genes
- Does Chlamydia share an ancient ancestral relationship with the ancestor of the Chloroplast?
25. Chlamydiaceae share an ancestral relationship with Cyanobacteria and Chloroplast
[Figure: 16S rRNA phylogenetic tree with node support values (out of 1000); scale bar 0.1. Labeled groups: Chlamydiaceae (Chlamydophila pneumoniae, Chlamydophila psittaci, Chlamydia muridarum, Chlamydia trachomatis); Chloroplasts (Chlamydomonas reinhardtii, Klebsormidium flaccidum, Zea mays, Nicotiana tabacum); Cyanobacteria (Synechococcus PCC6301, Synechocystis PCC6803, Microcystis viridis). Other taxa: Pyrococcus furiosus (Archaea), Thermotoga maritima, Aquifex pyrophilus, Bacillus subtilis, Escherichia coli, Zea mays mitochondrion, Rickettsia prowazekii, Caulobacter crescentus.]
26. Chlamydiaceae share an ancestral relationship with Cyanobacteria and Chloroplast
[Figure: comparison of ribosomal protein genes (S10, L23, L29, L22, L16, L14, L24, S14, L18, L30, L15, S19, S17, S3, S8, S5, L3, L4, L2, L5, L6) in Escherichia, Bacillus, Thermotoga, Synechocystis, and Chlamydia.]
Unique shared-derived characters unite Chlamydiaceae and Synechocystis
27. Chlamydiaceae plant-like genes reflect an ancestral relationship with Cyanobacteria and the Chloroplast
- Chlamydiaceae do not appear to be exchanging DNA with their hosts
- Existing knowledge of Cyanobacteria may stimulate ideas about the function and control of pathogenic Chlamydia?
- Brinkman et al. (2002) Genome Research 12:1159-1167.
28. Overview
- High pathogen-host protein similarities: detecting horizontal gene transfer
- Characteristics of proteins/genes putatively horizontally acquired by bacterial pathogens
- Implications
- Proposal: How we should be combating bacterial pathogens
29. Horizontal Gene Transfer and Bacterial Pathogenicity
- Pathogenicity Islands: Uro/Entero-pathogenic E. coli, Salmonella typhimurium, Yersinia spp., Helicobacter pylori, Vibrio cholerae
- Transposons: ST enterotoxin genes in E. coli
- Prophages: Shiga-like toxins in EHEC, Diphtheria toxin gene, Cholera toxin, Botulinum toxins
- Plasmids: Shigella, Salmonella, Yersinia
30. Pathogenicity Islands
- Associated with:
  - Atypical G+C content
  - tRNA sequences
  - Transposases, integrases and other mobility genes
  - Flanking repeats
31. IslandPath: Aiding identification of Pathogenicity Islands and other Genomic Islands
Display legend:
- Yellow circle: high GC
- Pink circle: low GC
- Region of unusual dinucleotide bias
- tRNA gene lies between the two dots
- rRNA gene lies between the two dots
- Both tRNA and rRNA lie between the two dots
- Dot is named a transposase
- Dot is named an integrase
Hsiao et al. (2003) Bioinformatics 19:418-420
32. Dinucleotide bias analysis
- Genome divided into ORF-clusters of 6 consecutive ORFs
- For each ORF-cluster, the average absolute dinucleotide relative abundance difference is
  $$\delta(f, g) = \frac{1}{16} \sum_{XY} \left| \rho_{XY}(f) - \rho_{XY}(g) \right|$$
  where
  - f (fragment) is derived from sequences in an ORF-cluster
  - g (genome) is derived from all predicted ORFs in the genome
- Dinucleotide relative abundance is
  $$\rho_{XY} = \frac{f_{XY}}{f_X f_Y}$$
  where
  - $f_X$ denotes the frequency of the mononucleotide X
  - $f_{XY}$ the frequency of the dinucleotide XY
See Hsiao et al. (2003) Bioinformatics 19:418-420 and Karlin, S. and Burge, C. (1995) Trends in Genetics 11:283-90 for review
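As an illustration of the definitions on this slide, the following sketch computes the dinucleotide relative abundance f_XY / (f_X f_Y) for a set of sequences and the average absolute difference between a fragment and a genome. The function names and toy sequences are invented here, and the sketch makes simplifying assumptions that may differ from IslandPath: it counts a single strand only (no reverse-complement pooling) and skips ambiguous bases.

```python
from collections import Counter
from itertools import product
from typing import Dict, Iterable

BASES = "ACGT"
DINUCLEOTIDES = ["".join(pair) for pair in product(BASES, repeat=2)]


def relative_abundance(seqs: Iterable[str]) -> Dict[str, float]:
    """rho_XY = f_XY / (f_X * f_Y), with frequencies pooled over all sequences."""
    mono: Counter = Counter()
    di: Counter = Counter()
    for seq in seqs:
        seq = seq.upper()
        mono.update(base for base in seq if base in BASES)
        # Dinucleotides are counted within each sequence, never across sequences.
        di.update(seq[i:i + 2] for i in range(len(seq) - 1)
                  if seq[i] in BASES and seq[i + 1] in BASES)
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    rho = {}
    for xy in DINUCLEOTIDES:
        f_x = mono[xy[0]] / n_mono if n_mono else 0.0
        f_y = mono[xy[1]] / n_mono if n_mono else 0.0
        f_xy = di[xy] / n_di if n_di else 0.0
        # Assumption: report 0 when a base is absent, to avoid dividing by zero.
        rho[xy] = f_xy / (f_x * f_y) if f_x and f_y else 0.0
    return rho


def dinucleotide_bias(fragment: Iterable[str], genome: Iterable[str]) -> float:
    """Average absolute relative-abundance difference over the 16 dinucleotides."""
    rho_f = relative_abundance(fragment)
    rho_g = relative_abundance(genome)
    return sum(abs(rho_f[xy] - rho_g[xy]) for xy in DINUCLEOTIDES) / 16


# Toy sequences, far shorter than real ORFs.
genome_orfs = ["ACGTACGTACGT", "ACGTTGCAACGT"]
cluster_orfs = ["AAAAAATTTTTT"]
print(dinucleotide_bias(cluster_orfs, genome_orfs))
```

A fragment compared with itself gives a bias of exactly 0, which is a quick sanity check for any implementation.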
33. Dinucleotide bias analysis
- ORF-clusters sampled in an overlapping manner (shift by one ORF at a time)
- The mean is calculated by averaging the results from all ORF-clusters in the genome
- Regions more than 1 standard deviation away from the mean are marked on the IslandPath graphical display with strikethrough lines
- Why did we use 6 ORFs per cluster?
  - Not enough bp in a single ORF to get a good estimate
  - 4.5 kb (corresponding to approximately 6-8 ORFs) is required for reliable estimation of nucleotide composition (Lawrence and Ochman, J Mol Evol 1997 44:383-97)
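The sampling and flagging steps on this slide can be sketched as follows. This is a hedged illustration: the function names are invented, the per-cluster scores are made-up numbers standing in for dinucleotide bias values, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the slide does not say which was used.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def orf_clusters(orfs: Sequence, size: int = 6) -> List[Sequence]:
    """Overlapping clusters of `size` consecutive ORFs, shifting by one ORF."""
    return [orfs[i:i + size] for i in range(len(orfs) - size + 1)]


def flag_clusters(scores: Sequence[float], n_sd: float = 1.0) -> List[bool]:
    """True for each cluster whose score is more than n_sd SDs from the mean."""
    mu = mean(scores)
    sd = pstdev(scores)  # assumption: population SD over all clusters
    return [abs(score - mu) > n_sd * sd for score in scores]


# Eight ORFs (placeholders) give three overlapping clusters of six.
orfs = ["orf1", "orf2", "orf3", "orf4", "orf5", "orf6", "orf7", "orf8"]
print(len(orf_clusters(orfs)))

# Made-up per-cluster bias scores; only the last one stands out.
scores = [1.0, 1.0, 1.0, 1.0, 5.0]
print(flag_clusters(scores))
```

In a full pipeline each cluster returned by `orf_clusters` would be scored against the whole genome before calling `flag_clusters`.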
34. Boxes: Known islands in the Salmonella typhi genome
[Figure: IslandPath display of the Salmonella typhi genome; boxes mark the known islands, labeled I to X.]
35. What features best predict Islands?
- Examined prevalence of features in over 200 known islands
- 94% of islands contain >25% dinucleotide bias (majority have >75% dinucleotide bias coverage)
- Mobility genes identified in >75% (but ID recently improved)
- Atypical GC (above cutoff used in Brinkman et al., 2002) not over 50% coverage on average, and tRNA genes not observed with >50% of known islands
36. Boxes: Insertions in the Salmonella typhi genome versus Salmonella typhimurium
37. Properties of genes in these islands?
- Defined a putative island as:
  - 8 or more genes in a row with dinucleotide bias
- Functional category analysis -> Any difference for genes in islands versus genome?
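The putative-island definition above (8 or more genes in a row with dinucleotide bias) amounts to finding long runs in a list of per-gene flags. A minimal sketch, assuming the flags have already been computed; the function name and the example flags are invented for illustration.

```python
from typing import List, Sequence, Tuple


def putative_islands(biased: Sequence[bool],
                     min_genes: int = 8) -> List[Tuple[int, int]]:
    """Return (first, last) gene indices of runs of at least min_genes biased genes."""
    islands = []
    start = None
    # A trailing False closes any run that reaches the end of the genome.
    for i, flag in enumerate(list(biased) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_genes:
                islands.append((start, i - 1))
            start = None
    return islands


# Invented flags: a run of 8 biased genes (kept) and a run of 5 (too short).
flags = [False] * 3 + [True] * 8 + [False] * 2 + [True] * 5
print(putative_islands(flags))
```

Indices are 0-based positions in gene order; a circular genome would need the run that wraps around the origin handled separately.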
38. Analysis 1: COG functional category analysis
Hypothetical genes are more common in putative islands vs the genome (Paired t-test, P = 6.8E-19)
[Chart legend: Genome; Putative Islands]
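The comparisons in these analyses are paired: each genome contributes one value for its putative islands and one for the genome as a whole. A minimal sketch of the paired t statistic using only the standard library follows; the function name and the per-genome fractions are invented and do not reproduce the reported results. Converting t to a P value requires the t distribution, which the standard library does not provide; `scipy.stats.ttest_rel` is one existing option if SciPy is available.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return the paired t statistic and degrees of freedom for matched samples."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    if len(a) < 2:
        raise ValueError("need at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # t = mean difference divided by its standard error (sample SD / sqrt(n)).
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Invented fractions of hypothetical genes for three genomes.
island_fraction = [0.5, 0.6, 0.7]
genome_fraction = [0.4, 0.4, 0.4]
t_stat, dof = paired_t(island_fraction, genome_fraction)
print(round(t_stat, 3), dof)
```

A positive t here means the island values are higher than the genome values on average.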
39. Analysis 2: SUPERFAMILY HMM search results
SUPERFAMILY: a set of HMMs built from SCOP superfamilies
Fewer ORFs in the putative islands were assigned to a SUPERFAMILY class (Paired t-test, P = 3.3E-14)
[Chart legend: Genome; Putative Islands]
40. Analysis 3: Gene size in Putative Islands vs. Non-Islands
ORFans (genes with no homologs among 60 microbial genomes) tend to be shorter genes. Are genes in putative islands shorter as well on average?
In most cases, average ORF length in putative islands is shorter (Paired t-test, P = 7.1E-34)
[Chart legend: Non-Island; Putative Islands]
41. Analysis 4: COG analysis after removing ORFs <300 bp
Hypothetical genes more common in islands? Paired t-test, P = 0.0016
- Genes may be less well predicted in such island/atypical dinucleotide bias regions
- Some genomes still show a marked increase in hypothetical genes in islands versus the genome
42. Summary: Bacteria gene transfer analysis
- No cases identified in our database to date of clear, recent horizontal gene transfer between bacteria and a multicellular eukaryote (involving >80% sequence similarity)
- The pathogens studied are not commonly acquiring genes from their hosts, or vice versa
- Bacterial and eukaryotic pathogens may have exchanged genes
- Overall increased prevalence of hypothetical genes in putative bacterial genomic islands? -> Cautionary note about gene prediction accuracy
43. Overview
- High pathogen-host protein similarities: detecting horizontal gene transfer
- Characteristics of proteins/genes putatively horizontally acquired by bacterial pathogens
- Implications
- Proposal: How we should be combating bacterial pathogens
44. Implications: Evolution of Pathogenicity
- Pathogen mimicry of their host: convergent evolution or genes selectively maintained
- Gene exchange between pathogens: "Arms Deals"
45. Pathogens and The Art of War
"What is of supreme importance in war is to attack the enemy's strategy. Next best is to disrupt his alliances by diplomacy. The next best is to attack his army. And the worst policy is to attack cities."
46. Functional Pathogenomics of Mucosal Immunity (FPMI)
www.pathogenomics.ca
- INDUSTRY: Anigenics Canada, Inimex Pharma Inc
- ACADEMIA: VIDO, U Sask, UBC, SFU, BCGSC
- GOVERNMENT: Genome Canada, Genome Prairie, Genome BC, Govt of Saskatchewan
47. BC Pathogenomics group
- Ann M. Rose, Yossef Av-Gay, David L. Baillie, Fiona S. L. Brinkman, Robert Brunham, Artem Cherkasov, Rachel C. Fernandez, B. Brett Finlay, Hans Greberg, Robert E.W. Hancock, Steven J. Jones, Patrick Keeling, Audrey de Koning, Don G. Moerman, Sarah P. Otto, B. Francis Ouellette, Nancy Price, William Hsiao.
- Jeff Blanchard (NCGR, New Mexico) and Olof Emanuelsson (Stockholm Bioinformatics Center)
- Peter Wall Institute for Advanced Studies, Genome Canada