Title: Immunological bioinformatics
1Immunological bioinformatics
- Ole Lund,
- Center for Biological Sequence Analysis (CBS)
- Denmark.
2World-wide Spread of SARS
Status as of July 11, 2003: 8437 infected, 813 dead
3SARS
- First severe infectious disease to emerge in the post-genomic era
- Modern societies are vulnerable to epidemics
- Classical containment strategies have been successful in controlling the epidemic, but
- SARS may resurface (e.g. be seasonal)
- Suggested existence of an animal reservoir could compromise the containment strategy
- Need to develop a vaccine strategy
- Biotechnology has provided new tools to analyze genome/proteome information and guide vaccine development.
- The causative virus, the SARS coronavirus (SARS CoV), has been isolated and full-length sequenced.
4Main scientific achievements
- Discovery of causative agent
- Genome(s)
- 3D Structure of main proteinase
- Seromics
- Fragments -> Antibodies
- T cell epitopes
- VSV-SARS(spike) Co-transfection
- Pseudovirus-light when entry
5Main scientific achievements
- Discovery of causative agent
- Genome(s)
- 3D Structure of main proteinase
- Origin
- Similar virus found in Himalayan palm civets and other animals, including a raccoon-dog, and in humans working at an animal market in Guangdong, China (Guan et al., Sep 4, 2003).
Himalayan (Masked) palm civet
Ferret-Badger
Raccoon-dog
http://biobase.dk/david-c/uk-dk-mammmal-list.htm
6Discovery of causative agent
Source: Albert Osterhaus, Beijing, June 2003; Nature, May 2003; Lancet, July 2003
- Random Arbitrarily Primed (RAP) PCR method
- Koch's postulates (as modified by Rivers for viruses, 1937) fulfilled
- Isolation of virus from diseased hosts
- Cultivation in host cells
- Proof of filterability
- Infection causes similar disease in original or related host species (macaques)
- Re-isolation of virus
- Detection of specific immune response to the virus
- Clinical samples
- SARS CoV found in 75% of SARS patients
- Human metapneumovirus found in 12% of patients
- Other agents only sporadically found
7Transmission
Source: Claus Stohr, Beijing, June 2003
- Transmission
- No symptoms -
- Early period
- Very ill
- 10 days post fever -
- 41 flights -> 25 transmitted cases
- Prevention
- Early tracking of contacts
- Origin
- Seroconversion (Guan, 2003)
- Animal traders 40%
- Vegetable traders 5%
8New corona viruses
Source: Michael Buchmeier, Beijing, June 2003
- 1978 Porcine epidemic diarrhea virus (PEDV)
- Probably from humans
- 1984 Porcine respiratory coronavirus
- 1987 Porcine reproductive and respiratory syndrome (PRRS)
- 1993 Bovine coronavirus
- 2003 SARS
9Will it be back?
- When?
- Every year? Like the flu.
- Every few years? Like measles used to.
- Sporadic? Like Ebola.
- Never?
- Lab safety: the patient, a 27-year-old virologist, worked on the West Nile virus in a biosafety level 3 lab at the Environmental Health Institute, where the SARS coronavirus was also studied (Enserink, 2003)
10How does the immune system see a virus?
11The immune system
- The innate immune system
- Found in animals and plants
- Fast response
- Complement, Toll-like receptors
- The adaptive immune system
- Found in vertebrates
- Stronger response 2nd time
- B lymphocytes
- Produce antibodies (Abs) that recognize 3D shapes
- Neutralize virus/bacteria outside cells
- T lymphocytes
- Cytotoxic T lymphocytes (CTLs) - MHC class I
- Recognize foreign protein sequences in infected cells
- Kill infected cells
- Helper T lymphocytes (HTLs) - MHC class II
- Recognize foreign protein sequences presented by immune cells
- Activate cells
12SARS, a corona virus
14Vaccine concerns
- Enhancement
- Inactivated RSV and measles vaccines can lead to a more severe disease
- Infection with one of the four serotypes of Dengue virus (40-65% identity) leads to more severe disease if later infected with another serotype
- Test
- Erasmus University monkey model
- Mobile vaccine efficacy clinical test protocols that can move to the site of an outbreak
15The SARS Genome
29,736 nt single-stranded RNA genome
16Weight matrices (Hidden Markov models)
YMNGTMSQV GILGFVFTL ALWGFFPVV ILKEPVHGV ILGFVFTLT
LLFGYPVYV GLSPTVWLS WLSLLVPFV FLPSDFFPS CVGGLLTMV
FIAGNSAYE
A2 Logo
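A minimal sketch of how a log-odds weight matrix could be estimated from the eleven aligned 9-mers above. The uniform background (1/20), the pseudocount of 1 and the function names are assumptions made for illustration; they are not the settings used to produce the A2 logo.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# The eleven 9-mer peptides listed on the slide.
PEPTIDES = [
    "YMNGTMSQV", "GILGFVFTL", "ALWGFFPVV", "ILKEPVHGV", "ILGFVFTLT",
    "LLFGYPVYV", "GLSPTVWLS", "WLSLLVPFV", "FLPSDFFPS", "CVGGLLTMV",
    "FIAGNSAYE",
]

def weight_matrix(peptides, pseudocount=1.0):
    """Return one {amino acid: log2 odds} dict per peptide position."""
    length = len(peptides[0])
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    matrix = []
    for pos in range(length):
        # Start every count at the pseudocount so unseen residues stay finite.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for pep in peptides:
            counts[pep[pos]] += 1.0
        total = sum(counts.values())
        matrix.append({aa: math.log2(counts[aa] / total / background)
                       for aa in AMINO_ACIDS})
    return matrix

def score(peptide, matrix):
    """Sum the position-specific log-odds values for a peptide."""
    return sum(column[aa] for aa, column in zip(peptide, matrix))

matrix = weight_matrix(PEPTIDES)
print(max(matrix[1], key=matrix[1].get))  # most favoured residue at position 2
print(score("LLFGYPVYV", matrix))
```

With only eleven peptides the pseudocount dominates rare residues, so the scores are only a rough ranking.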
17Protein sequence information content
- Entropy
- Average uncertainty in the random variable
- $H = -\sum_i p_i \log_2 p_i$, range 0 to $\log_2(20) \approx 4.3$
- Logo height $I = \log_2(20) - H$
- Relative entropy (Kullback-Leibler distance)
- $D = \sum_i p_i \log_2(p_i / q_i)$, range 0 to infinity
- Mutual information
- Reduction in uncertainty due to knowledge of another random variable (corresponds to correlation)
- $M = \sum_i \sum_j p_{ij} \log_2\left(p_{ij} / (p_i p_j)\right)$
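The three quantities on this slide can be computed directly from probability vectors. The sketch below is illustrative only: the function names are made up here, and no small-sample correction is applied.

```python
import math
from collections import Counter

def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def relative_entropy(p, q):
    """Kullback-Leibler distance D(p||q) in bits; q must be > 0 where p > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mutual_information(joint):
    """Mutual information in bits from a joint probability table (list of rows)."""
    row = [sum(r) for r in joint]          # marginal p_i
    col = [sum(c) for c in zip(*joint)]    # marginal p_j
    return sum(pij * math.log2(pij / (row[i] * col[j]))
               for i, r in enumerate(joint)
               for j, pij in enumerate(r) if pij > 0)

def logo_height(column):
    """Information content I = log2(20) - H of one alignment column."""
    counts = Counter(column)
    n = len(column)
    return math.log2(20) - entropy([c / n for c in counts.values()])

print(entropy([1 / 20] * 20))    # uniform over 20 amino acids
print(logo_height("LLLLLLLL"))   # fully conserved column
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))  # perfectly correlated pair
```

A uniform column has zero logo height and a fully conserved one reaches the maximum of log2(20) bits.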
18Prediction of MHC binding specificity
- Simple motifs
- Allowed (non-allowed) amino acids
- Extended motifs
- Amino acid preferences
- Structural models
- Limitations: precision of force field, and speed of calculations
- Neural networks
- Can take correlations into account
19Log odds ratios
- Used for scoring alignments (BLAST), HMMs, matrix methods
- Odds ratio of observing given amino acids
- Relative probability of observing amino acid i in motif position j
- $O_j = p(aa_i \text{ at pos } j) / p(aa_i)$
- Assumption of independence =>
- Odds for observing sequence: $O_1 O_2 \cdots O_n$
- Log odds ratio
- $LO = \log(O_1 O_2 \cdots O_n) = \log(O_1) + \log(O_2) + \cdots + \log(O_n)$
- LO in half bits: $2\,LO / \log(2)$
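As a small worked example of the conversion on this slide, the sketch below sums natural-log odds over positions and rescales to half bits. The per-position odds are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
import math

def log_odds_half_bits(odds):
    """Sum natural-log odds over positions and convert the total to half bits."""
    lo = sum(math.log(o) for o in odds)   # LO = log(O1) + ... + log(On)
    return 2.0 * lo / math.log(2)         # LO in half bits

# Hypothetical per-position odds for a three-residue motif.
example_odds = [4.0, 1.0, 0.5]
print(log_odds_half_bits(example_odds))
```

The product of the example odds is 2, i.e. one bit, which corresponds to two half bits.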
20G
F
C
A
21Evaluation of prediction accuracy
Coverage = TP / actual_positive
Reliability = TP / predicted_positive
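A short sketch of the two measures, assuming predictions and known binders are available as sets of peptide identifiers. The identifiers below are hypothetical placeholders, not data from the study.

```python
def coverage_and_reliability(predicted, actual):
    """predicted / actual: sets of peptide identifiers called / known positive."""
    tp = len(predicted & actual)                       # true positives
    coverage = tp / len(actual) if actual else 0.0
    reliability = tp / len(predicted) if predicted else 0.0
    return coverage, reliability

# Hypothetical toy data: four known binders, three predicted binders.
actual = {"p1", "p2", "p3", "p4"}
predicted = {"p1", "p2", "p5"}
coverage, reliability = coverage_and_reliability(predicted, actual)
print(coverage, reliability)
```

Here two of four known binders are recovered (coverage 0.5) and two of three predictions are correct (reliability about 0.67).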
22A1101 performance: 154 peptides, 9 binders
23The MHC gene region
From Bill Paul, Fundamental Immunology, 4th Ed
24Human leukocyte antigen (HLA = MHC in humans) polymorphism - alleles
A total of 229 HLA-A, 464 HLA-B and 111 HLA-C class I alleles have been named; a total of 2 HLA-DRA, 364 HLA-DRB, 22 HLA-DQA1, 48 HLA-DQB1, 20 HLA-DPA1 and 96 HLA-DPB1 class II sequences have also been assigned. As of October 2001 (http://www.anthonynolan.com/HIG/index.html)
25HLA polymorphism - supertypes
- Each HLA molecule within a supertype essentially binds the same peptides
- Nine major HLA class I supertypes have been defined
- HLA-A1, A2, A3, A24, B7, B27, B44, B58, B62
- Sette et al., Immunogenetics (1999) 50:201-212
26HLA polymorphism - frequencies
Phenotype frequencies (%)
Supertypes      Caucasian  Black  Japanese  Chinese  Hispanic  Average
A2, A3, B27     83         86     88        88       86        86
A1, A24, B44    100        98     100       100      99        99
B7, B58, B62    100        100    100       100      100       100
Sette et al., Immunogenetics (1999) 50:201-212
32Conclusions
- We suggest
- splitting some of the alleles in the A1 supertype into a new A26 supertype
- splitting some of the alleles in the B27 supertype into a new B39 supertype
- that the B8 alleles may define their own supertype
- The specificities of the class II molecules can be clustered into nine classes, which only partly correspond to the serological classification
Lund O, Nielsen M, Kesmir C, Petersen AG, Lundegaard C, Worning P, Sylvester-Hvid C, Lamberth K, Roder G, Justesen S, Buus S, Brunak S. Definition of supertypes for HLA molecules using clustering of specificity matrices. Immunogenetics. 2004 Feb 13 (Epub ahead of print)
33MHC class I binding of SARS peptides
- Predictions for all supertypes
- Broad population coverage
- Allele-specific neural networks
- Peptides with associated measured binding affinity
- A1 (A0101), A2 (A0204), A3 (A1101, A0301), B7 (B0702)
- Weight matrices
- Peptides from public databases (SYFPEITHI, MHCPEP)
- A24, B27, B44, B58 and B62
34Supertype weight matrices
B27
B44
B62
B58
35Proteasomal cleavage
36(No Transcript)
37Epitope predictions
- Binding to MHC class I
- High probability for C-terminal proteasomal cleavage
- No sequence variation
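The three criteria above can be read as a simple filter over candidate peptides. The sketch below is only an illustration of that reading: the `Candidate` fields, the cleavage cut-off of 0.5 and the placeholder peptides are all assumptions, and the 500 nM binding cut-off is borrowed from the binding-assay threshold used later in the talk.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    peptide: str
    binding_nm: float      # predicted MHC class I affinity in nM (lower is stronger)
    cleavage_prob: float   # predicted probability of C-terminal proteasomal cleavage
    variable: bool         # True if the peptide overlaps an observed variable site

def select_epitopes(candidates, max_nm=500.0, min_cleavage=0.5):
    """Keep candidates meeting all three criteria, strongest binders first."""
    kept = [c for c in candidates
            if c.binding_nm <= max_nm
            and c.cleavage_prob >= min_cleavage
            and not c.variable]
    return sorted(kept, key=lambda c: c.binding_nm)

# Hypothetical placeholder candidates (not real predictions).
candidates = [
    Candidate("peptide_1", 40.0, 0.9, False),
    Candidate("peptide_2", 900.0, 0.9, False),   # too weak a binder
    Candidate("peptide_3", 20.0, 0.1, False),    # unlikely to be cleaved
    Candidate("peptide_4", 10.0, 0.8, True),     # overlaps a variable site
    Candidate("peptide_5", 15.0, 0.7, False),
]
selected = [c.peptide for c in select_epitopes(candidates)]
print(selected)
```

A hard filter is the simplest choice; a weighted combination of the three scores would be an equally plausible design.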
38(No Transcript)
39- Inside out
- Position in RNA
- Translated regions (blue)
- Observed variable spots
- Predicted proteasomal cleavage
- Predicted A1 epitopes
- Predicted A0204 epitopes
- Predicted A1101 epitopes
- Predicted A24 epitopes
- Predicted B7 epitopes
- Predicted B27 epitopes
- Predicted B44 epitopes
- Predicted B58 epitopes
- Predicted B62 epitopes
40SARS - Experimental validation
- Peptides are synthesized, first one received May 8
- Peptide preparation (5400 tubes)
- Peptides are validated for purity and correct sequence
- Peptides are analyzed for peptide binding affinity; method used: the quantitative ELISA technique
- Calculation of binding affinity, KD
41Strategy for the quantitative ELISA assay
C. Sylvester-Hvid et al., Tissue Antigens, 2002, 59:251
- Step I: Folding of MHC class I molecules in solution
b2m
Heavy chain
peptide
- Step II: Detection of de novo folded MHC class I molecules by ELISA
42Summary of peptide binding assays
Supertype  Tested  Binding < 500 nM
A1         15      13
A2         15      12
A3         15      14
A24        0       -
B7         15      10
B27        13      2
B44        0       -
B58        15      13
B62        14      12
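The per-supertype counts above can be totalled to give the overall fraction of tested peptides that bind with an affinity better than 500 nM. The snippet below only repeats that arithmetic; the variable names are chosen here for illustration.

```python
# (tested, binding < 500 nM) per supertype, copied from the slide;
# A24 and B44 had no peptides tested and are left out.
ASSAYS = {
    "A1": (15, 13), "A2": (15, 12), "A3": (15, 14), "B7": (15, 10),
    "B27": (13, 2), "B58": (15, 13), "B62": (14, 12),
}

tested = sum(t for t, _ in ASSAYS.values())
binders = sum(b for _, b in ASSAYS.values())
percent = round(100 * binders / tested)
print(tested, binders, percent)  # 102 76 75
```

So 76 of the 102 peptides tested, roughly 75%, fall below the 500 nM threshold.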
43Initial polytope (19 HIV epitopes)
- New epitopes: 12
- Poor C-term cleavage: 8
- Cleavage within: 31
- Linker length: 12
44Optimized polytope
- New epitopes: 1
- Weak C-term cleavage: 3
- Cleavage within: 7
- Linker length: 37
45(No Transcript)
46(No Transcript)
47MHC class II Molecule
48Virtual matrices
- HLA-DR molecules sharing the same pocket amino acid pattern are assumed to have identical amino acid binding preferences.
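One way to picture the virtual-matrix idea is as a lookup: each pocket's binding profile is stored once per residue pattern, and a matrix for any allele is assembled from the profiles of its pockets. The sketch below is purely schematic; every pocket name, pattern, allele name and score is a hypothetical placeholder and none of it is taken from TEPITOPE or any real HLA-DR data.

```python
# Hypothetical pocket profiles: (pocket, residue pattern) -> amino acid preferences.
POCKET_PROFILES = {
    ("P1", "pattern_a"): {"F": 1.0, "Y": 0.8, "L": 0.3},
    ("P1", "pattern_b"): {"L": 1.0, "I": 0.9, "V": 0.7},
    ("P4", "pattern_c"): {"D": 1.0, "E": 0.9},
    ("P4", "pattern_d"): {"K": 1.0, "R": 0.8},
}

# Hypothetical alleles described only by the pattern found in each pocket.
ALLELE_POCKETS = {
    "allele_1": {"P1": "pattern_a", "P4": "pattern_c"},
    "allele_2": {"P1": "pattern_a", "P4": "pattern_d"},
    "allele_3": {"P1": "pattern_b", "P4": "pattern_d"},
}

def virtual_matrix(allele):
    """Assemble a matrix for an allele from the shared profile of each pocket."""
    return {pocket: POCKET_PROFILES[(pocket, pattern)]
            for pocket, pattern in ALLELE_POCKETS[allele].items()}

# allele_1 and allele_2 share the P1 pattern, so they share the P1 profile.
print(virtual_matrix("allele_1")["P1"] == virtual_matrix("allele_2")["P1"])
```

The attraction of this layout is that a matrix can be assembled for an allele with no binding data of its own, provided each of its pocket patterns has been characterized in some other allele.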
49MHC Class II binding
- Virtual matrices
- TEPITOPE: Hammer, J., Current Opinion in Immunology 7, 263-269, 1995
- PROPRED: Singh H, Raghava GP, Bioinformatics 2001 Dec;17(12):1236-7
- Web interface: http://www.imtech.res.in/raghava/propred
- Prediction results
50MHC class II prediction
- Complexity of problem
- Peptides of different length
- Weak motif signal
- Alignment crucial
- Gibbs Monte Carlo sampler
- RFFGGDRGAPKRG
- YLDPLIRGLLARPAKLQV
- KPGQPPRLLIYDASNRATGIPA
- GSLFVYNITTNKYKAFLDKQ
- SALLSSDITASVNCAK
- PKYVHQNTLKLAT
- GFKGEQGPKGEP
- DVFKELKVHHANENI
- SRYWAIRTRSGGI
- TYSTNEIDLQLSQEDGQTIE
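To make the alignment problem concrete, here is a simplified Gibbs site sampler run on the ten peptides above. It is a sketch under stated assumptions, not the sampler behind these slides: the 9-residue core length, the uniform background, the pseudocount of 1, the fixed number of sweeps and the absence of any temperature schedule are all choices made here for brevity.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
CORE = 9  # assumed length of the binding core

PEPTIDES = [
    "RFFGGDRGAPKRG", "YLDPLIRGLLARPAKLQV", "KPGQPPRLLIYDASNRATGIPA",
    "GSLFVYNITTNKYKAFLDKQ", "SALLSSDITASVNCAK", "PKYVHQNTLKLAT",
    "GFKGEQGPKGEP", "DVFKELKVHHANENI", "SRYWAIRTRSGGI",
    "TYSTNEIDLQLSQEDGQTIE",
]

def profile(cores, pseudocount=1.0):
    """Per-position natural-log odds (uniform background) from aligned cores."""
    total = len(cores) + pseudocount * len(AMINO_ACIDS)
    columns = []
    for pos in range(CORE):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for core in cores:
            counts[core[pos]] += 1.0
        columns.append({aa: math.log(counts[aa] / total * len(AMINO_ACIDS))
                        for aa in AMINO_ACIDS})
    return columns

def window_score(window, columns):
    """Log-odds score of one candidate core under the current profile."""
    return sum(col[aa] for aa, col in zip(window, columns))

def gibbs_align(peptides, sweeps=200, seed=0):
    """Return one sampled core start offset per peptide."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(p) - CORE + 1) for p in peptides]
    for _ in range(sweeps):
        for i, pep in enumerate(peptides):
            # Build the profile from every peptide except the one being resampled.
            others = [q[s:s + CORE]
                      for j, (q, s) in enumerate(zip(peptides, starts)) if j != i]
            columns = profile(others)
            offsets = list(range(len(pep) - CORE + 1))
            # Sample the new start in proportion to its odds under that profile.
            weights = [math.exp(window_score(pep[o:o + CORE], columns))
                       for o in offsets]
            starts[i] = rng.choices(offsets, weights=weights)[0]
    return starts

starts = gibbs_align(PEPTIDES)
for pep, s in zip(PEPTIDES, starts):
    print(pep[s:s + CORE])
```

Because the sampler is stochastic, different seeds can settle on different core registers, which is one reason such samplers are usually run several times or with a cooling schedule.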
51Class II binding motif
Panels: Random, ClustalW, Gibbs sampler
Alignment by Gibbs sampler
RFFGGDRGAPKRG YLDPLIRGLLARPAKLQV KPGQPPRLLIYDASNRATGIPA GSLFVYNITTNKYKAFLDKQ
SALLSSDITASVNCAK PKYVHQNTLKLAT GFKGEQGPKGEP DVFKELKVHHANENI
SRYWAIRTRSGGI TYSTNEIDLQLSQEDGQTI
52MHC class II predictions: allele DRB1_0401
Accuracy
55Epitope based genetic vaccines
Ishioka (1999)
- Advantages: epitope vaccines can
- be controlled better
- induce subdominant epitopes, for example against tumour antigens where there is tolerance against dominant epitopes
- target multiple conserved epitopes in rapidly mutating pathogens like HIV and HCV
- be analogued to break tolerance
- be protective in animal models (Rodriguez, 2001)
56Genetic vaccines
Ellis, RW 1999
- Stimulate synthesis only in cells
- Advantages
- Stimulate cellular immune responses
- Standardized method of production
- Disadvantages
- Needs boosting
57Polytope construction
Linker
NH2
M
COOH
Epitope
cleavage
C-terminal cleavage
New epitopes
Cleavage within epitopes
58Summary of SARS study
- We have combined bioinformatics and immunology to perform a proteome-wide scan for cytotoxic T cell epitopes directed against SARS and restricted to one of the nine human HLA supertypes (covering >99% of all major human populations).
- For each HLA supertype, the 15 top candidates were tested in biochemical binding assays.
- 75% of epitopes tested thus far bind with an affinity better than 500 nM
- More than 112 potential vaccine candidates have been identified thus far.
- They may be tested in SARS survivors and then included in future vaccine design.
59(No Transcript)
60(No Transcript)
61Prediction of Antibody epitopes
- Linear
- Hydrophilicity scales (average over a 7-residue window)
- Hopp and Woods (1981)
- Kyte and Doolittle (1982)
- Parker et al. (1986)
- Other scales combinations
- Pellequer and van Regenmortel
- Alix
- Discontinuous
- Protrusion (Novotny, Thornton, 1986)
- Neural networks (In preparation)
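The linear, scale-based approach amounts to averaging a per-residue scale over a sliding window of 7 and looking for the most hydrophilic stretches. The sketch below uses the Kyte and Doolittle hydropathy values with the sign flipped so that larger means more hydrophilic; the function names, the choice of this scale over the others listed and the example sequence (one of the class II peptides shown earlier) are assumptions for illustration.

```python
# Kyte and Doolittle (1982) hydropathy values; higher means more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def window_average(sequence, scale, window=7):
    """Average the scale over a sliding window; one value per full window."""
    values = [scale[aa] for aa in sequence]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def hydrophilic_peaks(sequence, scale=KYTE_DOOLITTLE, window=7):
    """Return (centre index, hydrophilicity) pairs, most hydrophilic first."""
    averages = window_average(sequence, scale, window)
    half = window // 2
    # Hydropathy is negated so that larger values mean more hydrophilic.
    return sorted(((i + half, -avg) for i, avg in enumerate(averages)),
                  key=lambda item: item[1], reverse=True)

for centre, value in hydrophilic_peaks("GSLFVYNITTNKYKAFLDKQ")[:3]:
    print(centre, round(value, 2))
```

Swapping in another scale only requires passing a different dictionary as `scale`.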
62Secondary structure in epitopes
Sec struct      H      T     B     E      S     G      I     .
Log odds ratio  -0.19  0.30  0.21  -0.27  0.24  -0.04  0.00  0.17
H: Alpha-helix (hydrogen bond from residue i to residue i+4)
G: 3-10 helix (hydrogen bond from residue i to residue i+3)
I: Pi helix (hydrogen bond from residue i to residue i+5)
E: Extended strand
B: Beta bridge (one-residue short strand)
S: Bend (five-residue bend centered at residue i)
T: H-bonded turn (3-turn, 4-turn or 5-turn)
.: Coil
63Amino acids in epitopes
Fre
Amino Acid G A V L I M P F W S
e/E 0.09 0.07 0.05 0.08 0.04 0.02 0.06 0.03 0.01 0.08
. 0.07 0.08 0.07 0.10 0.06 0.03 0.05 0.05 0.02 0.07
Amino acid C T Q N H Y E D K R
e/E 0.03 0.08 0.04 0.04 0.02 0.04 0.06 0.07 0.07 0.04
. 0.03 0.06 0.04 0.05 0.02 0.03 0.04 0.04 0.05 0.04
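The two frequency rows can be turned into per-residue log odds in the same spirit as the secondary-structure table on the previous slide. This sketch assumes that the "e/E" row gives frequencies inside annotated epitopes and the "." row frequencies elsewhere, and it uses base-2 logarithms; neither assumption is stated on the slide.

```python
import math

# Column order and frequencies copied from the two halves of the table above.
AMINO_ACIDS = "GAVLIMPFWSCTQNHYEDKR"
EPITOPE = [0.09, 0.07, 0.05, 0.08, 0.04, 0.02, 0.06, 0.03, 0.01, 0.08,
           0.03, 0.08, 0.04, 0.04, 0.02, 0.04, 0.06, 0.07, 0.07, 0.04]
OTHER = [0.07, 0.08, 0.07, 0.10, 0.06, 0.03, 0.05, 0.05, 0.02, 0.07,
         0.03, 0.06, 0.04, 0.05, 0.02, 0.03, 0.04, 0.04, 0.05, 0.04]

# Positive values: residue is relatively more frequent in the "e/E" row.
log_odds = {aa: math.log2(e / o)
            for aa, e, o in zip(AMINO_ACIDS, EPITOPE, OTHER)}

for aa, value in sorted(log_odds.items(), key=lambda item: item[1], reverse=True):
    print(aa, round(value, 2))
```

Since the frequencies are rounded to two decimals, the resulting log odds are coarse and should be read as trends only.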
64Dihedral angles in epitopes
Z-scores for number of dihedral angle
combinations in epitopes vs. non epitopes
Phi\Psi 1 2 3 4 5 6 7 8 9 10 11 12
1 -0.47 0.44 -0.58 0.45 0.46 0.00 0.00 -0.73 -0.79 0.00 -0.83 1.42
2 -0.01 -0.12 -1.82 0.52 1.75 0.00 0.00 0.00 1.42 -0.82 0.00 0.00
3 1.82 -2.26 -1.57 0.48 0.10 0.00 -0.77 0.45 1.77 0.00 -0.82 0.99
4 1.76 1.15 -0.34 0.75 0.00 0.00 0.97 0.16 0.38 1.03 0.00 0.00
5 -0.85 0.45 -1.09 0.57 0.00 0.00 0.00 0.13 1.52 0.00 1.02 -0.79
6 0.60 1.28 1.30 1.73 0.00 0.00 0.00 0.00 1.32 -0.89 -0.76 0.00
7 0.27 -0.91 1.67 -0.51 0.00 0.00 0.00 0.00 -1.02 -1.09 0.00 0.00
8 0.93 1.21 -0.23 -3.63 0.49 0.00 0.00 0.00 0.00 -0.19 0.31 -0.82
9 0.00 0.28 -0.67 0.33 0.01 -0.83 0.00 0.00 0.87 0.23 0.00 0.00
10 0.00 0.95 1.71 -0.70 0.00 0.00 0.00 1.29 1.08 0.00 1.00 0.00
11 0.00 0.00 1.02 0.00 0.00 0.00 0.00 0.86 -0.75 0.00 0.00 0.00
12 0.42 0.83 0.28 1.68 0.00 0.00 0.00 0.00 1.03 -0.21 -0.79 0.93
65Immunological bioinformatics
- Classical experimental research
- Few data points
- Data recorded by pencil and paper/spreadsheet
- New experimental methods
- Sequencing
- DNA arrays
- Proteomics
- Need to develop new methods for handling these large data sets
- Immunological Bioinformatics/Immunoinformatics
66Acknowledgements
CBS, Technical University of Denmark
Søren Brunak (Director of CBS)
Morten Nielsen (Epitope prediction)
Peder Worning (Genome atlases)
Claus Lundegaard (Data bases)
Mette Børgesen (CTL prediction)
Jesper Schantz (Polytope optimization)
IMMI, University of Copenhagen
Søren Buus (Professor)
Christina Sylvester-Hvid (Experimental coordinator)
Kasper Lamberth (Peptide bank, Quality control)
Erland Johansson, Jeanette Nielsen (Preparations of peptides)
Hanne Møller (ELISA binding assay)