Title: What Have We Learned From Unicellular Genomes?
1What Have We Learned From Unicellular Genomes?
- Propionibacterium acnes
- Bacteroides thetaiotaomicron
- Mycoplasma genitalium
- Mimivirus
- Cyanobacteria
- Plasmodium
- Yeast
2Why do I get so many pimples?
- The genome of Propionibacterium acnes was
sequenced in July of 2004.
- P. acnes lives in sebaceous cysts and sometimes
stimulates an immune response.
- A group in Paris, along with two groups in
Germany, sequenced P. acnes.
- They found 2,333 genes in its 2.6 Mb genome.
- 68% of these had orthologs in other species.
- 20% had none, and 12% encoded only RNA.
3Anatomy of a pimple
5Genome-wide evaluations
- A first step following bacterial genome
sequencing is finding the origin (ori) and terminus
of replication.
- GC skew (the non-uniform distribution of Gs and Cs
between the two DNA strands) is used to find them.
- Oris tend to have the lowest skew, while termini
have the highest.
- Genes that originated by horizontal transfer (HT)
are identified using a sliding window to find
segments with abnormal GC content.
- Codon bias is also used to detect HT. Immunogenic
and metabolic genes were detected this way.
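The two sliding-window ideas above can be sketched in a few lines. This is a minimal illustration, not the method used by the sequencing groups: the function names, the window size, and the tolerance are assumptions, and "lowest skew" is read here as the minimum of the cumulative skew curve.

```python
def gc_skew_windows(seq, window):
    """(G - C) / (G + C) for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative_extremes(skews):
    """Window indices where the cumulative skew is lowest and highest.

    Assumption: the minimum is taken as the candidate ori and the
    maximum as the candidate terminus.
    """
    running, cumulative = 0.0, []
    for s in skews:
        running += s
        cumulative.append(running)
    return cumulative.index(min(cumulative)), cumulative.index(max(cumulative))


def atypical_gc_windows(seq, window, tolerance):
    """Indices of windows whose GC fraction differs from the genome-wide
    GC fraction by more than `tolerance` (candidate HT segments)."""
    seq = seq.upper()
    genome_gc = (seq.count("G") + seq.count("C")) / len(seq)
    flagged = []
    for i, start in enumerate(range(0, len(seq) - window + 1, window)):
        chunk = seq[start:start + window]
        gc = (chunk.count("G") + chunk.count("C")) / window
        if abs(gc - genome_gc) > tolerance:
            flagged.append(i)
    return flagged


# Toy sequences (hypothetical, not real genome data).
toy = "ACCC" * 50 + "AGGG" * 50            # C-rich half, then G-rich half
island = "AT" * 50 + "GC" * 10 + "AT" * 50  # AT-rich with a GC-rich insert

ori_window, ter_window = cumulative_extremes(gc_skew_windows(toy, 20))
foreign = atypical_gc_windows(island, 20, 0.3)
```

On real genomes the windows are typically thousands of bases wide, and flagged segments are only candidates that need confirmation, for example by codon bias.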
6Transcriptional Phase Variation
- During finishing, it was found that P. acnes had
a variable number of Gs associated with some genes.
- It is hypothesized that the initiation of
transcription depends on the number of consecutive
Gs.
- As rows of Gs are replicated, the number will
change.
- This leads to a mixed population of bacteria with
varying degrees of protein production.
- This diverse population is optimized to respond
differentially to various skin treatments.
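Runs of consecutive Gs like those described above are easy to locate in a sequence. The sketch below is only an illustration; the function name and the minimum run length of 6 are assumptions, not values from the P. acnes study.

```python
import re


def find_g_tracts(seq, min_len=6):
    """Return (start, length) for every run of at least `min_len` Gs."""
    pattern = re.compile("G{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]


# Hypothetical sequence: an 8-G run, a 3-G run (too short), and a 6-G run.
example = "AT" + "G" * 8 + "CAT" + "G" * 3 + "TT" + "G" * 6 + "A"
tracts = find_g_tracts(example)
```

Comparing tract lengths at the same position across sequencing reads would show the variation in G number within the population.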
9Digesting Our Cells For Food
- P. acnes was found to be able to grow
anaerobically as well as aerobically.
- Cells produce many enzymes that are able to
degrade lipids, esters, and amino acids.
- Some of these degradation products increase
adhesion to our cells.
- Many of the digestive enzymes contain a motif
(LPXTG) that targets them to the cell wall.
- Hyaluronate lyase is also found on the surface of
the bacteria; it destroys the extracellular
matrix that binds our cells together.
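A motif such as LPXTG, where X stands for any amino acid, can be searched for with a regular expression. This is a minimal sketch; the function name and the protein sequence are hypothetical.

```python
import re

# "." matches any single residue, standing in for the X in LPXTG.
LPXTG = re.compile(r"LP.TG")


def lpxtg_sites(protein):
    """Return the 0-based start position of each LPXTG match."""
    return [m.start() for m in LPXTG.finditer(protein.upper())]


# Hypothetical protein fragment with two matches (LPATG and LPQTG).
sites = lpxtg_sites("MKKLPATGAAALPQTGE")
```

A match alone does not prove cell wall targeting; it only flags candidates for closer inspection.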
11Stimulating the Immune Response
- P. acnes produces 5 CAMP factors (secreted
proteins that bind antibodies) that can form
pores in the cell membrane.
- A dipeptide motif (PT) is present in certain
proteins; this motif is also found in M.
tuberculosis.
- The bacterium also has at least 7 heat shock
protein genes.
- Porphyrin is also secreted, which produces toxic
forms of oxygen, further stimulating the immune
response.
12Withstanding the Environment
- P. acnes can signal nearby cells that something
has changed in the environment.
- Sensors called two-component systems (1 to sense,
1 to signal) exist in some bacteria; P. acnes
has 10 pairs.
- Quorum sensing is the ability to detect
conditions of overcrowding. The LuxS gene is
expressed in these instances, which produces a
universal signal for interspecies communication
among bacteria.
- Cells meshed together in biofilms protect
themselves.
13Are all bacteria living in us bad for us?
- An average human body is composed of about 10^13
cells.
- Our intestines have about 10^10 microbes/ml and
contain at least 1,000 ml.
- A majority of the cells in our bodies may be
bacteria! (500 - 1,000 different species)
- This accounts for 2-4 million non-human genes.
- Bacteroides thetaiotaomicron constitutes a
substantial portion of our intestinal flora.
- A group from Wash. U. in St. Louis sequenced its
genome.
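The arithmetic behind these estimates can be written out explicitly. The figure of 4,000 genes per species is an assumption chosen because it reproduces the 2-4 million range quoted above; it is not stated in the slides.

```python
human_cells = 10**13          # cells in an average human body
microbes_per_ml = 10**10      # microbes per ml of intestinal contents
gut_volume_ml = 1_000         # lower bound on intestinal volume

# Total gut microbes: at least as many as our own cells.
gut_microbes = microbes_per_ml * gut_volume_ml

species_range = (500, 1_000)
genes_per_species = 4_000     # assumption, see the note above
gene_range = tuple(n * genes_per_species for n in species_range)
```

Because the gut volume is a lower bound and other body sites also carry microbes, the microbial count can exceed the human cell count.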
15Overview of the Genome
- B. thetaiotaomicron's genome contains 6.3 Mb and
4,779 genes (plus a 33 kb plasmid).
- 58% of ORFs have known function, 18% have
orthologs of no known function, and 24% have no
homology with known proteins.
- COGs (functional categories of genes) are
determined following sequencing to create an
overview of a given genome.
- Many of the genes specialize in sugar uptake,
cell wall synthesis, environmental sensing and
signaling, as well as transposition.
16Major COGs
- Sugar metabolism: 170 genes fit into this
category; most bacteria have a set of 23.
- 61 of these appear to be secreted, which not only
benefits other bacteria but us as well.
- 163 paralogs of 2 genes (SusC and SusD) import
sugars into the cytoplasm of the microbe.
- Many two-component genes are present for
signaling; some of these interact with σ factors.
- 63 transposons are present, which may help spread
antibiotic resistance.
18Does Size Matter?
- The coding capacity for this genome is very high
(89% coding DNA) but it has a lower ratio of genes
to genome size than expected.
- This was a paradox until it was determined that
the ORFs of this microbe are unusually large. It
is unclear why this is the case.
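The relationship between coding capacity, gene count, and ORF size can be checked with the figures quoted in these slides (6.3 Mb, 4,779 genes, 89% coding). This is a rough sketch: the function name is hypothetical, and the mean ORF length assumes that all coding DNA belongs to the counted genes.

```python
def genome_stats(size_bp, genes, coding_fraction):
    """Return (kb of genome per gene, mean ORF length in bp)."""
    kb_per_gene = size_bp / genes / 1000
    mean_orf_bp = size_bp * coding_fraction / genes
    return round(kb_per_gene, 2), round(mean_orf_bp)


# B. thetaiotaomicron, using the numbers from the slides.
b_theta = genome_stats(6_300_000, 4_779, 0.89)  # -> (1.32, 1173)
```

With a fixed coding fraction, fewer genes per Mb necessarily means longer genes on average, which is how the paradox resolves.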
Summary
- Gut symbionts provide us with predigested sugars,
stimulate blood vessel formation, crowd out
pathogens, sequester limited resources, and
stimulate our mucosal layer.
19Can Microbial Genomes Become Dependent Upon Us?
- In the microbial world, if you don't use it, you
lose it.
- Mycoplasma genitalium has one of the most reduced
microbial genomes and the 2nd smallest prokaryotic
genome with 580 kb (the smallest is the archaeon
N. equitans with 490 kb).
- TIGR sequenced its genome in 1995.
- 470 ORFs were found, 96 of which have no known
orthologs.
- M. genitalium has an 88% coding capacity.
21Genes that have been lost
- M. genitalium has presumably lost many genes
involved in the synthesis of amino acids,
cofactors, cell envelope, and regulatory factors.
It has only 1 σ factor.
- The microbe has retained genes for energy
metabolism, fatty acid and phospholipid
metabolism, nucleotide production, replication,
transcription, and protein transport.
- The only category overrepresented is translation,
namely rRNA and tRNA genes.
23What is the Minimum Number of Genes?
- Craig Venter, along with Hamilton O. Smith, is
trying to construct an organism with the fewest
possible genes.
- A new field called synthetic biology seeks to
synthesize a functioning genome de novo.
- A better understanding of evolutionary principles
and genome circuitry is sought.
- Japanese and European scientists have tried to
identify the essential genes of B. subtilis.
- They have found that only 192 genes are
indispensable to life.
24Do all Viruses have Small Genomes?
- Most viral genomes are much smaller than
bacterial ones:
- HIV- 9,200 nt
- WNV- 10,962 nt
- SARS- 29,727 nt
- T7- 39,900 nt
- λ- 48,502 nt
- In 2003, a new virus that infects amoebae was
isolated that has a 1.2 Mb genome! A group in
Marseille, France sequenced Mimivirus, as it is
called.
26Mimivirus Genome
- 1,262 ORFs were identified; the coding capacity
is 90.5%.
- Like most viruses, the genome is linear, but it
has inverted repeats at both ends by which it may
circularize, perhaps during replication.
- Isoleucine is used twice as often as usual, and
there is a strong codon bias for codons lacking G
or C. The genome is 28% GC.
- Genes for translation, posttranslational
modification, and amino acid transport and
metabolism are overrepresented in Mimivirus.
27Is Mimivirus Alive?
- The genome of Mimivirus resembles a bacterial one,
and Mimivirus even stains Gram-positive. Is it a
virus?
- In 1957, the definition of a virus was proposed:
- 1) smaller than 0.2 microns
- 2) possesses DNA or RNA, not both
- 3) not able to synthesize its own proteins
- 4) cannot generate energy from substrates
- 5) cannot grow by binary fission
- Mimivirus only satisfies the 4th criterion; we are
not sure about the 5th.
29What is it then?
- Mimivirus has blurred the distinction between
prokaryotes and viruses.
- It is hypothesized that, like M. genitalium,
Mimivirus has lost genes over time.
- We will learn of more obligate intracellular
parasites later in class.
- Mimivirus may resemble some of the earliest forms
of life, and may have replicated independently
until it became a parasite.
30Genomes Reflect an Organism's Ecological Niche
- Cyanobacteria are the most productive
phytoplankton in the world.
- The two most abundant genera of cyanobacteria
are Prochlorococcus and Synechococcus. 3
genomes in the former group and 1 in the latter
were sequenced in 2003.
- Individual cells from both genera are referred to
using a numbering system to indicate different
ecotypes. Species designations are still difficult
to assign; Prochlorococcus was discovered in
the 1990s.
31Prochlorococcus
34Dot Plot Alignment
35Prochlorococcus MED4 vs. MIT9313
- These ecotypes share 1,352 orthologs.
- Short diagonal segments indicate synteny.
- A negative slope indicates that the segment was
inverted in one ecotype relative to the other.
- Segments with positive slope but located off the
diagonal indicate chromosome recombinations.
- Genes that lie along an axis are missing from
the other ecotype. MED4 has 364 genes not found
in MIT9313, which has 923 genes not found in
MED4.
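The reading of a dot plot described above can be sketched with gene-order positions instead of a picture. This is a simplified illustration, not the alignment method behind the MED4 vs. MIT9313 plot: the function names, gene names, and coordinates are hypothetical, and each ortholog is reduced to its rank along each genome.

```python
def classify_steps(pairs):
    """Label each step between neighbouring orthologs on a dot plot.

    pairs: (position_in_A, position_in_B) gene-order ranks of shared
    orthologs. Walking along genome A, a step is "syntenic" when B
    advances by the same amount (slope +1), "inverted" when B moves
    back by that amount (slope -1), and a "breakpoint" otherwise.
    """
    ordered = sorted(pairs)
    labels = []
    for (a1, b1), (a2, b2) in zip(ordered, ordered[1:]):
        da, db = a2 - a1, b2 - b1
        if db == da:
            labels.append("syntenic")
        elif db == -da:
            labels.append("inverted")
        else:
            labels.append("breakpoint")
    return labels


def unique_genes(genes_a, genes_b):
    """Genes found in only one genome (the points that sit on an axis)."""
    a, b = set(genes_a), set(genes_b)
    return a - b, b - a


# Hypothetical orthologs: three collinear genes, then an inverted block.
steps = classify_steps([(1, 1), (2, 2), (3, 3), (4, 9), (5, 8), (6, 7)])
only_a, only_b = unique_genes(["g1", "g2", "g3"], ["g2", "g3", "g4", "g5"])
```

A block that is syntenic after a breakpoint corresponds to a positive-slope segment off the main diagonal.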
36pcb gene family
- A major difference between the ecotypes is in the
pcb gene family, which encodes chlorophyll-binding,
light-harvesting antenna complex proteins that
help capture a wider spectrum of light.
- MED4 (high light) has only 1 pcb gene
- MIT9313 (medium light) has 2 (A and B)
- SS120 (low light) has 8 (A-H)
- MED4's gene does not respond to changes in Fe3+,
but MIT9313's is induced 7-fold and SS120's is
induced 23-fold.
37MED4's Small Genome
- MED4's genome is the smallest known for a
photoautotroph and may represent the minimum for
a photosynthetic organism.
- MED4 appears to have lost genes over time.
- A more streamlined genome means that an organism
is adapted to a narrower ecological range.
Synechococcus has the largest genome of this
group and the largest ecological range as well.
- People have proposed seeding the ocean with Fe3+
to help stimulate CO2 consumption.
38Gene deletions in Cyanobacteria
39Malaria
- Malaria, although it rarely makes news headlines,
is a daily threat to the 3 billion people who
live in tropical climates.
- In 2002, about 500 million people were infected.
About 2.7 million people die each year (about 90%
of these are < 5 years old).
- The cause of malaria has been known for 100 years
but we still can't stop its spread.
- The most lethal form of malaria is caused by
Plasmodium falciparum.
40Lifecycle of Plasmodium
41RBC Infection
- The most vulnerable time for Plasmodium is during
the RBC infection stage.
- The parasite must force its way into an RBC
without rupturing any plasma membranes.
- Three structures are important during infection:
- 1) extracellular coating to make cells sticky
- 2) apical end of cell must be oriented downward
- 3) apicoplast is an internalized algal symbiont
43Plasmodium Genomes
- Plasmodium actually has three genomes: nuclear,
mitochondrial, and apicoplast.
- Pulsed-field gel electrophoresis was used to
separate the chromosomes, followed by shotgun
sequencing.
- This proved to be the most AT-rich genome
sequenced so far (19.4% GC).
- The 22.9 Mb genome has 52.6% coding capacity and
5,268 ORFs (60% of which have no known function,
the largest fraction of any genome).
44Tricking the Immune System
- The genes of Plasmodium that are responsible for
binding to RBCs and for avoiding the immune
system are located near the telomeres of this
eukaryote.
- Genes located near Plasmodium telomeres have been
duplicated many times; all three gene families in
these categories (var, rif, stevor) are
polymorphic.
- There are 59 var paralogs, 149 rif, and 28
stevor. This may account for our immune system's
inability to deal with this parasite.
45The Plasmodium Proteome
- 1% of proteins are used for host cell invasion
- 4% help evade the immune response
- 31% are integral to the membrane
- 14% are enzymes (about 4x less than most proteomes)
- 10% are transported to the apicoplast
- 60% have unknown function
- The Krebs cycle is present, but the organism
grows anaerobically and only uses this cycle for
heme biosynthesis (which it could get from us)
46Apicoplast Proteome
- Similar to a chloroplast in origin but used for a
different purpose now.
- Only two photosynthetic orthologs remain.
- This organelle synthesizes fatty acids,
isoprenoids, and heme groups.
- Nuclear-encoded proteins sent here assist in DNA
replication and repair, transcription, translation,
posttranslational glycosylation, protein import,
and protein degradation.
47Comparing Plasmodia
- The Plasmodium sequencing project took 45 people
6 years to complete.
- At the same time, other groups were working on P.
yoelii, which infects rats and is used as a model
organism for malaria research.
- Unfortunately, this latter genome was never
finished, making comparisons difficult.
- P. yoelii has 600 additional ORFs, and the two
have 3,310 genes in common (56%).
- Is this similar enough to make a good model
organism?
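The 56% figure depends on which genome is used as the denominator. The sketch below assumes that "600 additional ORFs" is counted relative to the 5,268 ORFs reported earlier for P. falciparum; that reading is an assumption, not something the slides state.

```python
falciparum_orfs = 5_268
yoelii_orfs = falciparum_orfs + 600   # assumption: 600 more than P. falciparum
shared = 3_310

# Percentage of each genome's ORFs that have a counterpart in the other.
share_of_yoelii = round(100 * shared / yoelii_orfs)
share_of_falciparum = round(100 * shared / falciparum_orfs)
```

Under this assumption the shared genes make up about 56% of the P. yoelii ORFs and about 63% of the P. falciparum ORFs.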
48Malaria Treatment Options?
- Recently, a German and American team used reverse
genetics (starting with a gene sequence and
deducing its function) to target a gene in the
production of a knock-out strain. This strain is
expected to be less pathogenic than wild type.
Mice injected with this strain were protected for
30 days.
- Even if a better drug were produced, funding and
health care infrastructure are lacking in many
problem areas. Very little is spent on malaria
research.
49Yeast
50Yeast Genome
- The S. cerevisiae genome was sequenced in 1996.
- It took over 600 scientists in Europe, North
America, and Japan working together to sequence
the 12 Mb genome.
- Yeast has a 70.3% coding capacity, higher than
Plasmodium but lower than all bacteria.
- There is a gene every 2 kb in yeast, one every 6
kb in C. elegans, and one every 30 kb in humans.
Eukaryotes have more junk DNA than prokaryotes,
and enhancers, promoters, and introns add
substantially to the size of eukaryotic genes.
52Chromosome Structure in Yeast
- The 4 smallest chromosomes in yeast have a
unique structure. It was known from using YACs
that chromosomes smaller than 150 kb were not
stable in yeast. These chromosomes are
relatively gene-poor and undergo recombination at
high frequencies, perhaps to protect the larger
ones from the same fate.
- Transcriptionally silent genes are found in the
sub-telomeric regions of many chromosomes; this
may help identify the right and left sides of a
chromosome.
53Yeast Chromosomes
54Evolutionary History of Yeast
- A substantial number of genes were found in
duplicate copies in yeast.
- It was proposed that yeast had undergone
duplication events at some point in time.
- Many regions of chromosomes are syntenic with
regions on other chromosomes. Such paralogs are
seen as evolutionary experiments where one copy
can drift to provide new specialized functions.
- Some genes were initially thought to be extra
copies, but experiments proved that they differ.
55Predictions for the Future
- The authors of the landmark 1996 yeast sequencing
publication made the following predictions:
- 1) They described plans to produce a collection
of single, double, and even triple KO mutations.
- 2) They addressed the value of making all genome
sequences publicly available.
- 3) They felt WGS sequencing of large genomes was
not feasible.
- 4) They looked forward to comparing yeast with
S. pombe as well as the human genome.
57Better Annotation
- A number of yeast genomes have been sequenced
since 1996. With these, the need to annotate
genes based on GO (Gene Ontology) became clear.
- Improvements in computers, search algorithms, and
the increased volume of genes in the databases
led to better annotation.
- The original 5,885 annotated ORFs have been
increased to 6,672, many below the original
cutoff of 100 codons.
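The effect of a length cutoff on ORF counts can be illustrated with a simple ORF scanner. This is a minimal sketch, not the annotation pipeline used for yeast: the function name is hypothetical, only the three forward reading frames are scanned, and introns are ignored.

```python
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=100):
    """Return (start, end, codons) for each ATG...stop ORF on the forward
    strand that has at least `min_codons` codons (stop codon excluded)."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the ORF
            elif start is not None and codon in STOPS:
                codons = (i - start) // 3      # codons before the stop
                if codons >= min_codons:
                    orfs.append((start, i + 3, codons))
                start = None
    return orfs


# Hypothetical sequence: a 50-codon ORF followed by a 120-codon ORF.
short_orf = "ATG" + "GCT" * 49 + "TAA"
long_orf = "ATG" + "GCT" * 119 + "TAA"
strict = find_orfs(short_orf + long_orf, min_codons=100)
relaxed = find_orfs(short_orf + long_orf, min_codons=50)
```

Lowering the cutoff recovers the short ORF, which mirrors how the yeast ORF count grew once evidence from other genomes supported ORFs shorter than 100 codons.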