Title: Using informatics to focus bacterial pathogenicity studies
1. Using informatics to focus bacterial pathogenicity studies
2. Using informatics to focus bacterial pathogenicity studies
Goal: Use informatic analyses to generate new testable hypotheses about pathogen protein function and pathogenicity mechanisms. Test the hypotheses in the laboratory.
3. Need for informatics in biology: origins
- Gramicidin S (Consden et al., 1947), partial insulin sequence (Sanger and Tuppy, 1951)
- First codon assignment, UUU/Phe (Nirenberg and Matthaei, 1961)
- 3.5 kb RNA bacteriophage MS2 (Fiers et al., 1976); 5.4 kb bacteriophage φX174 (Sanger et al., 1977)
- Early databases: Dayhoff, 1972; Erdmann, 1978
4. [Figure from the National Center for Biotechnology Information]
5. Explosion of data
- 22 of the 33 publicly available microbial genome sequences are for bacterial pathogens
- Approximately 18,000 pathogen genes with no known function!
- 95 bacterial pathogen genome projects in progress
6. Pathogen Informatics
- Pseudomonas aeruginosa
- Three-dimensional comparative protein modeling
- Phylogenetic analysis of gene families
- Other analyses: regulatory network complexity
- Pathogenomics Project
- Detecting eukaryote-pathogen homologs
- Detecting pathogenicity islands
7. Pseudomonas aeruginosa
- Found in soil, water, plants, animals
- Common cause of hospital-acquired infection: ICU patients, burn victims, cancer patients
- Almost all cystic fibrosis (CF) patients infected by age 10
- Intrinsically resistant to many antibiotics
- No vaccine
8. Outer membrane protein OprF
- Nonspecific porin
- Required for
- Maintenance of cell shape
- Growth in low-osmolarity environments
- OprF-deficient clinical mutant with multiple antimicrobial resistance being characterized
- Adhesin in plant-colonizing Pseudomonas species
- Proposed vaccine component
9. Gram-Negative Cell Envelope
[Figure: diagram of the cell envelope with labels: pore, LPS, porin, Mg, outer membrane, peptidoglycan, periplasm, cytoplasmic membrane]
10. Structure of the outer membrane protein A (OmpA) transmembrane domain
Pautsch and Schulz (1998). Nature Structural Biology 5:1013-1017
No channel formation detected
11. OprF and OmpA share only 15% identity
OprF   1  -QGQNSVEIEAFGKRYFTDSVRNMKN-------ADLYGGSIGYFLTDDVELALSYGEYH
OmpA   1  APKDNTWYTGAKLGWSQYHDTGLINNNGPTHENKLGAGAFGGYQVNPYVGFEMGYDWLG

OprF  52  DVRGTYETGNKKVHGNLTSLDAIYHFGTPGVGLRPYVSAGLA-HQNITNINSDSQGRQQ
OmpA  60  RMPYKGSVENGAYKAQGVQLTAKLGYPIT-DDLDIYTRLGGMVWRADTYSNVYGKNHDT

OprF 110  MTMANIGAGLKYYFTENFFAKASLDGQYGLEKRDNGHQG--EWMAGLGVGFNFG
OmpA 118  GVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGTRPDNGMLSLGVSYRFG
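A percent-identity figure like the one in this slide's title can be recomputed from a pairwise alignment. The following is a minimal Python sketch, not the method used in the talk: it assumes identity is counted only over columns where both sequences have a residue (other conventions divide by the full alignment length or by the shorter sequence), and the function name `percent_identity` is hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of gap-free alignment columns in which both residues match.

    Assumption: columns with a gap in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0   # columns where both sequences have a residue
    matches = 0   # aligned columns with identical residues
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        aligned += 1
        if a == b:
            matches += 1
    return 100.0 * matches / aligned if aligned else 0.0


# First alignment block from the slide (OprF from residue 1, OmpA from residue 1).
oprf_block = "-QGQNSVEIEAFGKRYFTDSVRNMKN-------ADLYGGSIGYFLTDDVELALSYGEYH"
ompa_block = "APKDNTWYTGAKLGWSQYHDTGLINNNGPTHENKLGAGAFGGYQVNPYVGFEMGYDWLG"
print(f"{percent_identity(oprf_block, ompa_block):.1f}% identity in block 1")
```

A single block will not necessarily reproduce the overall figure; concatenating all three blocks before calling the function gives the value for the whole aligned region.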
12. Model of the N-terminus of OprF based on OmpA
Brinkman, Bains and Hancock (2000). Journal of Bacteriology 182:5251-5255
13. OprF model (yellow and green) aligned with the crystal structure of OmpA (blue)
Many residues are in the same three-dimensional environment, though on different strands
14. OprF and OmpA similarity
OprF   1  -QGQNSVEIEAFGKRYFTDSVRNMKN-------ADLYGGSIGYFLTDDVELALSYGEYH
OmpA   1  APKDNTWYTGAKLGWSQYHDTGLINNNGPTHENKLGAGAFGGYQVNPYVGFEMGYDWLG

OprF  52  DVRGTYETGNKKVHGNLTSLDAIYHFGTPGVGLRPYVSAGLA-HQNITNINSDSQGRQQ
OmpA  60  RMPYKGSVENGAYKAQGVQLTAKLGYPIT-DDLDIYTRLGGMVWRADTYSNVYGKNHDT

OprF 110  MTMANIGAGLKYYFTENFFAKASLDGQYGLEKRDNGHQG--EWMAGLGVGFNFG
OmpA 118  GVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGTRPDNGMLSLGVSYRFG
15. Residues implicated in blocking channel formation in OmpA are not conserved in OprF
16. Planar Lipid Bilayer Apparatus
[Figure: schematic of the apparatus with labels: voltage source, current amplifier, protein, planar bilayer membrane, bathing solution]
17. The N-terminus of OprF forms channels in a lipid bilayer membrane
18. Upstream of oprF is a probable sigma factor gene, sigX
[Figure: gene map showing sigX and oprF, with promoter and transcription terminator marked]
19. Disruption of sigX reduces expression of OprF
- Marker
- Wild type
- sigX- mutant
- oprF- mutant
Shown for P. aeruginosa and P. fluorescens
20. [Figure: the sigX and oprF genes drawn for two cases, "No SigX expression" and "SigX expression"]
21. 18 ECF sigma factors in the P. aeruginosa genome
23. Percent Regulators as a Function of Genome Size
[Figure: scatter plot of regulators (%) against number of genes (0 to 7000), with numbered points for each genome and groups labeled "Specialized environments" and "Free-living"]
Genomes represented: 1, Mycoplasma genitalium; 2, Chlamydia trachomatis; 3, Treponema pallidum; 4, Borrelia burgdorferi; 5, Chlamydia pneumoniae; 6, Helicobacter pylori; 7, Helicobacter pylori (second genome); 8, Haemophilus influenzae; 9, Neisseria meningitidis; 10, Mycobacterium tuberculosis; 11, Bacillus subtilis; 12, Escherichia coli; 13, Pseudomonas aeruginosa.
24. P. aeruginosa Genome Sequence Analysis: Outer Membrane Proteins (OMPs)
Approximately 150 OMPs predicted, including three large paralogous families:
- OprM family of putative efflux and type I secretion proteins (18 members)
- OprD family of putative amino acid, peptide and aromatic compound transporters (19 members)
- TonB family of putative iron-siderophore receptors (34 members)
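25. OprM Family
[Figure: phylogenetic tree (scale bar 0.1) with groups labeled "Multidrug Efflux?" and "Protein Secretion?". Proteins shown: OprJ, OprM, OpmJ, OpmB, OpmA, OpmG, OpmE, OpmI, OprN, OpmD, OpmQ, AprF, OpmM, OpmN, OpmH, TolC, OpmK, OpmL, OpmF]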
26-28. OprM structural model based on TolC
29. Future Developments
- Modeling of other outer membrane proteins in Neisseria species
- Developing better algorithms for secondary structure prediction
30. Pathogenomics
Goal: Identify previously unrecognized mechanisms of microbial pathogenicity using a unique combination of informatics, evolutionary biology, microbiology and genetics.
31. Pathogenicity
Processes of microbial pathogenicity at the molecular level are still minimally understood.
Pathogen proteins have been identified that manipulate host cells by interacting with, or mimicking, host proteins.
Idea: Could we identify novel virulence factors by identifying pathogen genes more similar to host genes than you would expect based on phylogeny?
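One way to make this idea concrete is to compare, for each pathogen gene, its best database hit among eukaryotes with its best hit among other bacteria. The Python sketch below is an illustration under stated assumptions, not the project's actual screen: the `BestHits` record, the score fields, and the `min_ratio` and `min_score` thresholds are all hypothetical, and the scores are assumed to come from a prior similarity search.

```python
from dataclasses import dataclass


@dataclass
class BestHits:
    gene: str
    best_eukaryote_score: float  # best alignment score against any eukaryotic sequence
    best_bacteria_score: float   # best score against bacteria other than the pathogen itself


def eukaryote_like(hits, min_ratio=1.2, min_score=50.0):
    """Return (gene, ratio) pairs for genes that look more eukaryotic than bacterial.

    Assumptions: a gene is flagged when its eukaryotic score is at least
    `min_score` and at least `min_ratio` times its best bacterial score.
    """
    flagged = []
    for h in hits:
        if h.best_eukaryote_score < min_score:
            continue  # too weak a match to be meaningful
        if h.best_eukaryote_score >= min_ratio * h.best_bacteria_score:
            if h.best_bacteria_score > 0:
                ratio = h.best_eukaryote_score / h.best_bacteria_score
            else:
                ratio = float("inf")  # no bacterial hit at all
            flagged.append((h.gene, ratio))
    # Strongest eukaryotic bias first.
    flagged.sort(key=lambda item: item[1], reverse=True)
    return flagged


# Made-up scores for illustration only.
example = [
    BestHits("gene_a", 300.0, 120.0),
    BestHits("gene_b", 90.0, 400.0),
    BestHits("gene_c", 40.0, 0.0),
]
print(eukaryote_like(example))
```

A score ratio is only a first filter. A phylogenetic tree of the flagged gene and its homologs is still needed to decide whether the similarity is unexpected.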
32. Eukaryotic-like pathogen genes
- YopH, a protein-tyrosine phosphatase, of Yersinia pestis
- Enoyl-acyl carrier protein reductase (involved in lipid metabolism) of Chlamydia trachomatis
[Figure: phylogenetic tree (scale bar 0.1) with node support values ranging from 52 to 100. Taxa shown: Aquifex aeolicus, Haemophilus influenzae, Escherichia coli, Anabaena, Synechocystis, Chlamydia trachomatis, Petunia x hybrida, Nicotiana tabacum, Brassica napus, Arabidopsis thaliana, Oryza sativa]
33. Pathogens
Anthrax; Cat scratch disease; Chancroid; Chlamydia; Cholera; Dental caries; Diarrhea (E. coli etc.); Diphtheria; Epidemic typhus; Mediterranean fever; Gastroenteritis; Gonorrhea; Legionnaires' disease; Leprosy; Leptospirosis; Listeriosis; Lyme disease; Melioidosis; Meningitis
Necrotizing fasciitis; Paratyphoid/enteric fever; Peptic ulcers and gastritis; Periodontal disease; Plague; Pneumonia; Salmonellosis; Scarlet fever; Shigellosis; Strep throat; Syphilis; Toxic shock syndrome; Tuberculosis; Tularemia; Typhoid fever; Urethritis; Urinary tract infections; Whooping cough; Hospital-acquired infections
34. Pathogens
- Chlamydophila psittaci: respiratory disease, primarily in birds
- Mycoplasma mycoides: contagious bovine pleuropneumonia
- Mycoplasma hyopneumoniae: pneumonia in pigs
- Pasteurella haemolytica: cattle shipping fever
- Pasteurella multocida: cattle septicemia, pig rhinitis
- Ralstonia solanacearum: plant bacterial wilt
- Xanthomonas citri: citrus canker
- Xylella fastidiosa: citrus variegated chlorosis
- Bacterial wilt
35. Interdisciplinary group
- Informatics/Bioinformatics
- BC Genome Sequence Centre
- Centre for Molecular Medicine and Therapeutics
- Evolutionary Theory
- Dept of Zoology
- Dept of Botany
- Canadian Institute for Advanced Research
- Pathogen Functions
- Dept. Microbiology
- Biotechnology Laboratory
- Dept. Medicine
- BC Centre for Disease Control
- Host Functions
- Dept. Medical Genetics
- C. elegans Reverse Genetics Facility
- Dept. Biological Sciences, SFU
36. Approach
Screen for candidate genes.
- Search pathogen genes against sequence databases.
- Identify those with eukaryotic similarity/motifs.
Rank candidates.
- How much like host protein?
- Info available about protein?
Modify screening method/algorithm.
Evolutionary significance.
- Horizontal transfer?
- Similar by chance?
Prioritize for biological study.
- Previously studied biologically?
- Can UBC microbiologists study it?
- C. elegans homolog?
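The ranking and prioritizing steps can be sketched as a sort over a few yes/no and numeric criteria. This Python example is only an illustration: the `Candidate` fields mirror the questions on the slide, but the field names and the ordering of the criteria (unstudied genes first, then those with a C. elegans homolog, then higher host similarity, then annotated proteins) are assumptions, not the group's stated procedure.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    gene: str
    host_similarity: float    # how much like the host protein (e.g. percent similarity)
    has_annotation: bool      # is any information available about the protein?
    previously_studied: bool  # already studied biologically?
    c_elegans_homolog: bool   # does the model host have a homolog?


def prioritize(candidates):
    """Order candidates for biological study (assumed criteria, see lead-in)."""
    return sorted(
        candidates,
        key=lambda c: (
            c.previously_studied,     # False sorts before True: unstudied first
            not c.c_elegans_homolog,  # genes with a model-host homolog first
            -c.host_similarity,       # higher similarity first
            not c.has_annotation,     # annotated proteins first as a tie-breaker
        ),
    )


# Made-up candidates for illustration only.
pool = [
    Candidate("gene_1", 70.0, True, False, True),
    Candidate("gene_2", 90.0, True, True, False),
    Candidate("gene_3", 40.0, False, False, False),
]
print([c.gene for c in prioritize(pool)])
```

Using a tuple as the sort key keeps each criterion separate, so the order of priorities can be changed without touching the rest of the code.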
37. Bacterium-to-Eukaryote Horizontal Transfer
N-acetylneuraminate lyase (NanA) of the protozoan Trichomonas vaginalis is 92-95% similar to NanA of Pasteurellaceae bacteria.
38. N-acetylneuraminate lyase: role in pathogenicity?
- Pasteurellaceae
- Mucosal pathogens of the respiratory tract
- T. vaginalis
- Mucosal pathogen, causative agent of the STD trichomoniasis
39. N-acetylneuraminate lyase (sialic acid lyase, NanA)
Pathway shown:
- Sialidase: hydrolysis of glycosidic linkages of terminal sialic residues in glycoproteins and glycolipids, giving free sialic acid
- Transporter: free sialic acid
- NanA: N-acetyl-D-mannosamine + pyruvate
Involved in sialic acid metabolism.
Role in bacteria: proposed to parasitize the mucous membranes of animals for nutritional purposes.
Role in Trichomonas: ?
40. Eukaryote-to-Bacteria Horizontal Transfer?
GMP reductase of E. coli is 81% similar to the corresponding enzyme studied in humans and rats. Role in virulence not yet investigated.
[Figure: phylogenetic tree (scale bar 0.1). Taxa shown: Rat, Human, Escherichia coli, Caenorhabditis elegans, Pig roundworm, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Bacillus subtilis, Streptococcus pyogenes, Aquifex aeolicus, Acinetobacter calcoaceticus, Haemophilus influenzae, Chlorobium vibrioforme]
41. Eukaryote-to-Bacteria Horizontal Transfer?
Ralstonia solanacearum cellulase (endo-1,4-beta-glucanase) is 56% similar to endoglucanase present in a number of fungi. Demonstrated virulence factor for plant bacterial wilt.
42. Functional studies
Prioritized candidates
- Study function of similar gene in model host, C. elegans.
- Study function of gene. Investigate role of bacterial gene in disease: infection study in model host.
- Contact other groups for possible collaborations.
[Figure labels: C. elegans, DATABASE, World Research Community]
43. Pathogenicity Islands
- Virulence genes commonly in clusters
- Associated with
- tRNA sequences
- Transposases, Integrases and other mobility genes
- Flanked by repeats
44. GC Analysis: Identifying Pathogenicity Islands
Figure legend:
- Yellow circle: high GC
- Pink circle: low GC
- tRNA gene lies between the two dots
- rRNA gene lies between the two dots
- Both tRNA and rRNA lie between the two dots
- Dot is annotated as a transposase
- Dot is annotated as an integrase
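Since virulence genes tend to sit in clusters, one simple way to look for candidate islands is to scan the ORFs in genome order for runs of atypical GC content. The sketch below is a hedged illustration, not the tool behind the figure: the one-standard-deviation cutoff and the minimum run length of three ORFs are assumptions, and `atypical_runs` is a hypothetical name.

```python
def atypical_runs(orfs, mean_gc, sd_gc, min_len=3):
    """Find runs of consecutive ORFs whose GC% is more than one SD from the mean.

    `orfs` is a list of (name, gc_percent) pairs in genome order.
    Returns a list of runs, each a list of ORF names.
    """
    runs = []
    current = []
    for name, gc in orfs:
        if abs(gc - mean_gc) > sd_gc:
            current.append(name)      # extend the current atypical run
        else:
            if len(current) >= min_len:
                runs.append(current)  # keep runs that are long enough
            current = []
    if len(current) >= min_len:       # a run may end at the last ORF
        runs.append(current)
    return runs


# Toy genome segment with made-up GC values (mean 50, SD 5).
toy = [("a", 51.0), ("b", 40.0), ("c", 41.0), ("d", 39.0),
       ("e", 49.0), ("f", 60.0), ("g", 61.0)]
print(atypical_runs(toy, mean_gc=50.0, sd_gc=5.0))
```

Runs found this way are only candidates. The presence of a tRNA gene, a transposase or an integrase nearby, as marked in the figure legend, would strengthen the case.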
45. Neisseria meningitidis serogroup B strain MC58
Mean GC (%): 51.37; standard deviation: 7.57

GC (%)  SD  Location            Strand  Product
37.22   -1  1831577..1832527    +       pilin gene inverting
39.95   -1  1834676..1835113    +       VapD-related
51.96       1835110..1835211    -       cryptic plasmid A-related
39.13   -1  1835357..1835701    +       hypothetical
40.00   -1  1836009..1836203    +       hypothetical
42.86   -1  1836558..1836788    +       hypothetical
34.74   -2  1837037..1837249    +       hypothetical
43.96       1837432..1838796    +       conserved hypothetical
40.83   -1  1839157..1839663    +       conserved hypothetical
42.34   -1  1839826..1841079    +       conserved hypothetical
47.99       1841404..1843191    -       put. hemolysin activ. HecB
45.32       1843246..1843704    -       put. toxin-activating
37.14   -1  1843870..1844184    -       hypothetical
31.67   -2  1844196..1844495    -       hypothetical
37.57   -1  1844476..1845489    -       hypothetical
20.38   -2  1845558..1845974    -       hypothetical
45.69       1845978..1853522    -       hemagglutinin/hemolysin-rel.
51.35       1854101..1855066    +       transposase, IS30 family
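The SD column in this table appears to record how many whole standard deviations an ORF's GC content lies from the genome mean, capped at 2. That reading is inferred from the listed values, not stated on the slide. A minimal Python sketch under that assumption, with the hypothetical helper `sd_class`:

```python
import math

MEAN_GC = 51.37  # mean GC (%) given on the slide
SD_GC = 7.57     # standard deviation given on the slide


def sd_class(gc_percent, mean_gc=MEAN_GC, sd_gc=SD_GC, cap=2):
    """Whole standard deviations from the mean, truncated toward zero and capped.

    Assumption: this reproduces the slide's SD column, where 0 is shown as blank.
    """
    z = (gc_percent - mean_gc) / sd_gc
    whole = math.trunc(z)  # e.g. -1.87 becomes -1
    return max(-cap, min(cap, whole))


# A few rows taken from the table (location, GC%).
rows = [
    ("1831577..1832527", 37.22),
    ("1837037..1837249", 34.74),
    ("1837432..1838796", 43.96),
    ("1845558..1845974", 20.38),
]
for location, gc in rows:
    print(location, gc, sd_class(gc))
```

For these four rows the function returns -1, -2, 0 and -2, which matches the -1, -2, blank and -2 entries in the table.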
46. GC of ORFs: Analysis of Variance
- GC variance is similar within a given species
- Low GC variance correlates with an intracellular lifestyle for the bacterium and a clonal nature (P = 0.004)
- Neisseria meningitidis: +/- 7
- Chlamydia species: +/- 2
- Intracellular bacteria ecologically isolated?
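The spread of GC content across a genome's ORFs can be computed directly from their sequences. This is a minimal Python sketch, assuming the "+/-" figures on the slide are standard deviations of per-ORF GC percentages (the slide does not say how they were calculated); `gc_percent` and `gc_spread` are hypothetical names and the sequences are toy data.

```python
from statistics import mean, stdev


def gc_percent(seq):
    """GC content of a nucleotide sequence as a percentage of its A/C/G/T bases."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # ignore N and other symbols
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def gc_spread(orf_sequences):
    """Return (mean, sample standard deviation) of GC% over a list of ORFs."""
    values = [gc_percent(s) for s in orf_sequences]
    return mean(values), stdev(values)  # stdev needs at least two ORFs


# Toy ORFs for illustration only.
toy_orfs = ["ATGGCCGCTTAA", "ATGAAATTTTAA", "ATGGGCCCGTGA"]
average, spread = gc_spread(toy_orfs)
print(f"mean GC {average:.2f}%, SD {spread:.2f}")
```

Comparing the spread between genomes, as the slide does for Neisseria meningitidis and Chlamydia species, would mean running `gc_spread` once per genome on its full ORF set.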
47. Future Developments
- Identify eukaryotic motifs and domains in pathogen genes
- Identify further motifs associated with
- Pathogenicity islands
- Virulence determinants
- Functional tests for new potential virulence factors
- www.pathogenomics.bc.ca
48. Informatics as a focus
- Outer membrane protein modeling: focus mutational studies and studies of surface-exposed sequences
- Phylogenetic analyses: focus study of gene mutants under certain environmental conditions
- Other analyses (regulatory network complexity): change focus of regulation studies
- Eukaryote-pathogen homologs: focus identification of mimics
- Pathogenicity islands: focus identification of recently obtained virulence determinants
49. Acknowledgements
- Pathogenomics group: Ann Rose, Steven Jones, Ivan Wan, Hans Greberg, Yossef Av-Gay, David Baillie, Bob Brunham, Stefanie Butland, Rachel Fernandez, Brett Finlay, Patrick Keeling, Audrey de Koning, Sarah Otto, Francis Ouellette; Peter Wall Institute
- Pseudomonas Genome Project: PathoGenesis Corp. (Ken Stover) and University of Washington (Maynard Olsen)
- Outer membrane proteins: Manjeet Bains, Kendy Wong; Canadian Cystic Fibrosis Foundation
- Bob Hancock