Title: Comparative Genome and Proteome Analysis of Anopheles gambiae and Drosophila melanogaster
1Comparative Genome and Proteome Analysis of
Anopheles gambiae and Drosophila melanogaster
- Evgeny M. Zdobnov, Christian von Mering, Ivica
Letunic, David Torrents, Mikita Suyama, Richard
R. Copley, George K. Christophides, Dana
Thomasova, Robert A. Holt, G. Mani Subramanian,
Hans-Michael Mueller, George Dimopoulos, John H. Law, Michael A. Wells,
Ewan Birney, Rosane Charlab, Aaron L. Halpern,
Elena Kokoza, Cheryl L. Kraft, Zhongwu Lai,
Suzanna Lewis, Christos Louis, Carolina
Barillas-Mury, Deborah Nusskern, Gerald M. Rubin,
Steven L. Salzberg, Granger G. Sutton, Pantelis
Topalis, Ron Wides, Patrick Wincker, Mark
Yandell, Frank H. Collins, Jose Ribeiro, William
M. Gelbart, Fotis C. Kafatos, Peer Bork
Presented by Leon G Xing
SCIENCE VOL 298 4 OCTOBER 2002
2Why Anopheles gambiae?
- It is the principal vector of malaria
- It carries many other infectious diseases
- Malaria afflicts more than 500 million people
- More than 1 million people die each year from
malaria
3The Culprit
4Why Drosophila melanogaster?
- One of the most intensively studied organisms in
biology
- Serves as a model system for the investigation of
many developmental and cellular processes common
to higher eukaryotes
- Modest genome size (about 180 Mb)
- Its genome was sequenced in 2000
5Mosquito vs. Fruit Fly
- They diverged about 250 million years ago
- (Human and pufferfish diverged about 450
million years ago)
- They share considerable similarities
- Half of the genes in both genomes
are interpreted as orthologs
- Average sequence identity is about 56%
6Mosquito vs. Fruit Fly
- The Anopheles genome is about twice the size of the Drosophila genome
- Female Anopheles feed on blood (hematophagy),
which is essential for egg development and
propagation
- Viruses and parasites use Anopheles as a vehicle
for transmission
7Orthologs
- Genes in different species that evolved from a
common ancestral gene by speciation
- Typically retain the same function in the course
of evolution
8Paralogs
- Genes related by duplication within an organism
that have evolved related but different functions
9Predict the function of a new protein
- A powerful approach is to use bioinformatics and
domain database searches to find its
characterized orthologs
- We know a lot about Drosophila but not
much about Anopheles
- Comparing their genomes may reveal a lot of
information
10Drosophila melanogaster Genome
- The assembled and annotated genome sequence of 5
Drosophila melanogaster chromosomes is in GenBank
- It is a collaboration between Celera and the
Berkeley Drosophila Genome Project
- Published in the March 24, 2000 issue of Science
11Drosophila Genome
12Anopheles vs Drosophila Gene Comparison at
Protein Level
- The proteins are classified into 4 categories
based on
- 12,981 deduced Anopheles proteins out of 15,189
annotated transcripts
- Omitting transposon-derived or bacterial-like sequences
and alternative transcripts
13Classification of Anopheles proteins
- 1:1 orthologs
- Anopheles proteins with one clearly identifiable
counterpart in Drosophila and vice versa
- 47% of the Anopheles proteins
- 44% of the Drosophila proteins
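The slides do not say how the 1:1 orthologs were identified. One common heuristic for "one clearly identifiable counterpart ... and vice versa" is the reciprocal best hit, sketched below. The function names, protein identifiers, and scores are all hypothetical, and the published analysis may have used a different procedure.

```python
def best_hits(scores):
    """Map each query protein to its single highest-scoring subject.

    `scores` maps (query, subject) pairs to a similarity score
    (for example an alignment bit score; an assumption here).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_d, d_vs_a):
    """Return (anopheles, drosophila) pairs that are each other's best hit."""
    a_best = best_hits(a_vs_d)
    d_best = best_hits(d_vs_a)
    return sorted((a, d) for a, d in a_best.items() if d_best.get(d) == a)


# Toy similarity scores (made up for illustration).
a_vs_d = {("ag1", "dm1"): 900, ("ag1", "dm2"): 300,
          ("ag2", "dm2"): 500, ("ag3", "dm2"): 450}
d_vs_a = {("dm1", "ag1"): 880, ("dm2", "ag2"): 510, ("dm2", "ag3"): 440}

pairs = reciprocal_best_hits(a_vs_d, d_vs_a)
print(pairs)
```

In this toy data, ag3 also hits dm2 best, but dm2 prefers ag2, so ag3 is not reported as a 1:1 ortholog; it would fall into one of the other categories.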
14Classification of Anopheles proteins
- Many-to-many orthologs
- Gene duplication has occurred in one or both
species after divergence
- Includes 1779 Anopheles proteins
15Classification of Anopheles proteins
- The third category
- Have homologs in Drosophila and/or other species
but without easily discernible orthologous
relationships
- 3590 Anopheles predicted proteins
16Classification of Anopheles proteins
- The fourth category
- Have little or no homology in Drosophila but
instead have best matches in other species
- 1283 proteins
17Classification of Anopheles proteins
- Remaining proteins
- No detectable homologs in any other species with
a fully sequenced genome
- 1437 in Anopheles
- 2570 in Drosophila
- Might be new or quickly evolving genes.
18Classification of proteins
19Some Notes
- The numbers and derived estimates are
approximations
- Annotation of genomes is an ongoing effort
- Some Anopheles genes have not been sequenced yet
- Highly polymorphic regions and highly
repetitive contexts are prone to errors
- 70% accuracy
20The core of conserved proteins
- The 1:1 orthologs (6089 pairs) can be considered
the conserved core
- The average sequence identity is 56%
- Humans and pufferfish share 61%
- Indicates that insect proteins diverge at a
higher rate
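As a sketch of what an average sequence identity means, the snippet below computes percent identity for pre-aligned protein pairs and averages the results. The sequences are invented, and the choice to ignore gapped columns in the denominator is an assumption; identity conventions differ between tools, and the slides do not state which was used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_identity(aligned_pairs):
    """Average percent identity over a list of aligned sequence pairs."""
    values = [percent_identity(a, b) for a, b in aligned_pairs]
    return sum(values) / len(values)


# Toy aligned ortholog pairs (hypothetical sequences).
toy_pairs = [("MKT-AYLV", "MRTQAYIV"), ("MSEQ", "MSDQ")]
single = percent_identity(*toy_pairs[0])
average = mean_identity(toy_pairs)
print(round(single, 1), round(average, 1))
```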
21Properties of 1:1 orthologs
22Orthologous proteins constitute a core of
conserved functions
- Early embryogenesis is conserved between
Drosophila and Anopheles
- Of 315 early developmental genes in Drosophila,
251 showed a clear single ortholog in
Anopheles
23Orthologous proteins
- 85% of the developmental genes have single
orthologs
- 47% for the genome as a whole
24Protein family expansions and reductions
- Due to adaptations to environment and life
strategies
- Lead to changes in cellular and phenotypic
features
- Imply duplications after speciation
25Protein family expansions and reductions example
- Epsilon subunit of the adenosine
triphosphate (ATP) synthase complex
- Encoded by two genes in both Anopheles and
Drosophila
- They might share a single-copy ancestral gene
- After speciation, that gene was duplicated
independently in each lineage
26Expansions of proteins with FBN-like domains in
Anopheles.
- Fibrinogen (FBN) domains were originally found in human
blood coagulation proteins
- A greatly expanded family of mosquito proteins contains a
domain resembling the COOH-terminus of the beta
and gamma chains of FBN
27Expansions of proteins with FBN-like domains in
Anopheles.
- Phylogenetic tree of 58 Anopheles and
13 Drosophila FBN genes
- They largely belong to two distinct
species-specific clades
- Only two 1:1 orthologous relationships were identified
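To illustrate what species-specific clades look like in a gene tree, here is a minimal sketch that finds the largest subtrees containing genes from a single species. The nested-tuple tree encoding, the `Ag_`/`Dm_` leaf prefixes, and the toy tree are assumptions made for this example; it is not the tree from the paper.

```python
def species_of(leaf):
    """Species prefix of a leaf label such as 'Ag_fbn1' (naming is hypothetical)."""
    return leaf.split("_", 1)[0]


def leaves(tree):
    """Flatten a nested-tuple tree into a list of leaf labels."""
    if isinstance(tree, str):
        return [tree]
    result = []
    for child in tree:
        result.extend(leaves(child))
    return result


def single_species_clades(tree):
    """Return the maximal clades whose leaves all come from one species."""
    tips = leaves(tree)
    if len({species_of(tip) for tip in tips}) == 1:
        return [tips]
    clades = []
    for child in tree:  # mixed clade: descend into each subtree
        clades.extend(single_species_clades(child))
    return clades


# Toy gene tree: one Anopheles-only clade, one Drosophila-only clade,
# and one mixed cherry that would look like a 1:1 orthologous pair.
toy_tree = ((("Ag_a", "Ag_b"), "Ag_c"), (("Dm_x", "Dm_y"), ("Ag_d", "Dm_z")))
clades = single_species_clades(toy_tree)
print(clades)
```

Large single-species clades suggest duplication after speciation, while a cherry joining one gene from each species is the pattern expected for a 1:1 ortholog.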
28The significant implication of FBN gene expansion
- The massive expansion of the Anopheles FBN gene
family might be associated with particular
aspects of the mosquito's biology
- That is, hematophagy and exposure to Plasmodium
- A blood meal poses challenges associated with the
microbial flora in the gut and with blood coagulation
29The implication of FBN gene expansion
- The bacteria-binding properties of FBNs might be
important in controlling or aggregating bacteria
in the midgut
- These proteins might be used as competitive
inhibitors, i.e., anticoagulants
- Some mosquito FBN proteins are up-regulated by
invading malaria parasites
30Expansion of FBN-like proteins in Anopheles
31Gene losses in insects
- Some genes are absent in both Anopheles and
Drosophila but are present in other eukaryotes
- Criterion: genes must be present in at least one
animal and also in fungi or plants
32Gene losses in insects.
33Gene genesis and gene loss
- 1437 predicted genes in Anopheles have no
detectable homology with genes of other species
- 522 of these have putative paralogs only within
Anopheles
- At least 26 such genes are expressed in the adult
female salivary glands
34Strategy for identifying gene losses
- Search for genes that are present in only one of
the two insects but that do have orthologs in
other species
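The two search criteria for gene losses can be sketched as set operations on the species found in each orthologous group. The species lists and group names below are hypothetical placeholders, and a gene flagged as "lost" by this logic is only a putative loss.

```python
INSECTS = {"Anopheles", "Drosophila"}
OTHER_ANIMALS = {"human", "worm", "pufferfish"}  # hypothetical reference set
FUNGI_OR_PLANTS = {"yeast", "Arabidopsis"}       # hypothetical reference set


def classify_loss(species):
    """Label an orthologous group by which insects appear to lack the gene.

    `species` is the set of species in which the group has a member.
    Returns a label string, or None when no putative loss is indicated.
    """
    insects_present = species & INSECTS
    if not insects_present:
        # Absent from both insects: require a widespread gene, present in
        # at least one other animal and also in fungi or plants.
        if species & OTHER_ANIMALS and species & FUNGI_OR_PLANTS:
            return "putative loss in both insects"
        return None
    if len(insects_present) == 1 and species - INSECTS:
        # Present in only one insect but with orthologs elsewhere.
        missing = next(iter(INSECTS - insects_present))
        return "putative loss in " + missing
    return None


# Toy orthologous groups (made up for illustration).
groups = {
    "group1": {"human", "yeast"},
    "group2": {"Anopheles", "human", "worm"},
    "group3": {"Anopheles", "Drosophila", "human"},
    "group4": {"human", "worm"},
}
labels = {name: classify_loss(members) for name, members in groups.items()}
print(labels)
```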
35Gene Losses
- Widespread orthologs missing from both Anopheles
and Drosophila are putative insect-specific gene
losses
- Example
- Insects are known to be unable to synthesize sterols
- Absence of several enzymes involved in sterol
metabolism
36Gene Losses example
- Absence of the DNA repair enzyme uracil-DNA
glycosylase in insects
- DNA methylation can lead to spontaneous
deamination of cytosine to uracil
- Drosophila has long been known to have no or only
very little DNA methylation
37Cladogram based on Orthologs
38Intron gain and loss
- Drosophila is known to have a reduction of
noncoding regions
- 11,007 out of 20,161 Anopheles introns in 1:1
orthologs have equivalent positions in Drosophila
- Almost 10,000 introns have either been lost or
gained
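The "almost 10,000" figure follows from the two intron counts on this slide. A short check of the arithmetic, using only those numbers:

```python
total_introns = 20161   # Anopheles introns in the orthologs compared
shared_introns = 11007  # introns with equivalent positions in Drosophila

# Introns without an equivalent position were either lost or gained.
lost_or_gained = total_introns - shared_introns
shared_percent = 100.0 * shared_introns / total_introns

print(lost_or_gained, round(shared_percent, 1))
```

So 9,154 introns lack an equivalent position, and a little over half of the Anopheles introns are positionally conserved.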
39The Drosophila Dscam gene
- Able to encode up to 38,000 proteins through
extensive alternative splicing
- Three different cassettes of duplicated exons
that can generate a combinatorial number of
splice variants
- The numbers of exons within the cassettes are at
least similar in Anopheles
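The figure of about 38,000 comes from multiplying the number of alternatives at each variable position, since one exon is chosen per cassette. The cassette sizes below (12, 48, and 33 alternative exons, plus 2 alternatives at a fourth variable exon) are taken from the published description of Drosophila Dscam, not from these slides, so treat them as an outside assumption.

```python
from math import prod

# Alternatives per variable position in Drosophila Dscam (assumed from the
# literature; the slides only mention three cassettes of duplicated exons).
cassette_sizes = {"exon 4": 12, "exon 6": 48, "exon 9": 33}
transmembrane_alternatives = 2  # fourth variable exon, also an assumption

# One exon is picked from each cassette, so the counts multiply.
ectodomain_variants = prod(cassette_sizes.values())
total_isoforms = ectodomain_variants * transmembrane_alternatives

print(ectodomain_variants, total_isoforms)
```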
41Microsynteny
- Through evolution genome structure may vary
greatly, but small regions of conserved gene order will
be retained
- Microsynteny studies examine localized regions of
sequence with high similarity
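One way to picture microsynteny detection is to group ortholog pairs whose genes lie close together in both genomes. The sketch below does this with a simple single-linkage clustering. The tuple layout, the `max_gap` and `min_size` thresholds, and the toy coordinates are all assumptions for illustration, not the procedure used in the paper.

```python
from itertools import combinations


def microsynteny_blocks(pairs, max_gap=2, min_size=2):
    """Group ortholog pairs that are neighbors in both genomes.

    Each pair is (ag_arm, ag_index, dm_arm, dm_index), where the index is
    the gene's rank along its chromosomal arm. Two pairs are linked when
    they share both arms and their indices differ by at most `max_gap` in
    each genome; gene order inside a block is not required to match.
    """
    parent = list(range(len(pairs)))

    def find(i):
        # Union-find root lookup with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(pairs)), 2):
        a, b = pairs[i], pairs[j]
        same_arms = a[0] == b[0] and a[2] == b[2]
        if same_arms and abs(a[1] - b[1]) <= max_gap and abs(a[3] - b[3]) <= max_gap:
            parent[find(i)] = find(j)

    groups = {}
    for i, pair in enumerate(pairs):
        groups.setdefault(find(i), []).append(pair)
    return sorted(sorted(g) for g in groups.values() if len(g) >= min_size)


# Toy ortholog positions (hypothetical).
toy_pairs = [("3R", 10, "2L", 5), ("3R", 11, "2L", 7), ("3R", 12, "2L", 6),
             ("3R", 40, "2L", 90), ("X", 3, "3R", 8)]
blocks = microsynteny_blocks(toy_pairs)
print(blocks)
```

The first three pairs form one block even though their order differs between the genomes; the remaining two pairs are isolated and are dropped.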
42Microsynteny blocks
43Mapping of orthologs and microsynteny blocks to
chromosomal arms in Anopheles and Drosophila
44Chromosome mapping
- Both Anopheles and Drosophila have five major
chromosomal arms (X, 2L, 2R, 3L, and 3R), plus a
small chromosome 4 in Drosophila melanogaster
- In Drosophila, reassortment of recognizable
chromosomal arms occurs by fission and fusion at
the centromeres
45Chromosome mapping
- The most conserved pair of chromosomal arms is
Dm2L and Ag3R
- 76% of the orthologs and 95% of the microsynteny
blocks in Dm2L map to Ag3R
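Percentages like these can be obtained by counting, for each Drosophila arm, how its orthologs are distributed over the Anopheles arms. A minimal sketch with made-up data follows; the function name and the input layout are assumptions.

```python
from collections import Counter


def arm_mapping_fractions(ortholog_arms):
    """Fraction of each Drosophila arm's orthologs found on each Anopheles arm.

    `ortholog_arms` is a list of (dm_arm, ag_arm) tuples, one per ortholog pair.
    """
    pair_counts = Counter(ortholog_arms)
    dm_totals = Counter(dm_arm for dm_arm, _ in ortholog_arms)
    return {(dm_arm, ag_arm): count / dm_totals[dm_arm]
            for (dm_arm, ag_arm), count in pair_counts.items()}


# Toy data: four orthologs on Dm2L, three of which map to Ag3R.
toy_arms = [("2L", "3R"), ("2L", "3R"), ("2L", "3R"), ("2L", "2L"), ("X", "X")]
fractions = arm_mapping_fractions(toy_arms)
print(fractions)
```

The same counting applies to microsynteny blocks by passing one tuple per block instead of one per ortholog pair.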
46Chromosome mapping.
47Chromosome mapping surprise
- Significant portions of the Anopheles X
chromosome appear to have been derived from what
are presently autosomal Drosophila chromosome
segments
- 11% of Dm3R and 33% of Dm4
48Homology of chromosomal arms
49Thank you!