Title: Sarah Pyfrom
1CRISPR-associated Proteins
- Sarah Pyfrom
- sapyfrom_at_davidson.edu
2Research Questions
- What Cas-proteins does our species share with the 10 other species we chose to study?
- If any are shared, how do they compare?
- How do Cas-proteins function in relation to CRISPR units?
- Edit
- Why did JGI change its annotation?
3Cas Proteins
- Proteins that are almost always associated with (near) CRISPR sequences
- Originally four major families
- Now, at least 45 families total
4JGI annotation
Old annotation:
- Cas1
- Cas2
- Cas3
- Cas4
- TM1800
- TM1801
Revised annotation:
- Cas1
- Cas2
- Cas4
- Cas5
- Cas6
- Csh1
- Csh2
5Changes
- TM1800 → Cas5
- TM1801 → Csh2
- Hypothetical protein → Csh1
- Part of hypothetical protein → Cas6
- Cas3 → hypothetical protein
- Cas4
- MTDSSGDPVDRFLAAARDESAELPFRLTGVMFQYYVVCERELWFLSRDVE
IDRDTPAIVRGSDVDDSAYADKRRDVRVDGIIAIDVLDSGEILEVKPSSS
MTEPARLQLLFYLWYLDRVTGVEKTGVLAHPAEKRRETVELTPETSAEVE
SAIEGIRAVVTAESPPPAEEKPVCDSCAYHDFCWSC (red = original Cas4)
6Map of CRISPR region
[Figure: map of the CRISPR region. Labels: TM1800, TM1801, transposases, Cas3, hypothetical proteins, CRISPR, Cas1, unidentified, Csh1, Cas2, Cas5, Cas4, Cas6, Csh2]
7Cas1 (from Sulfolobus solfataricus)
- High-affinity nucleic acid binding protein
- Binds DNA, RNA and DNA-RNA hybrids
- Sequence non-specific, in a multi-site binding mode
- Promotes the hybridization of complementary nucleic acid strands
From "SSO1450 - A CAS1 protein from Sulfolobus solfataricus P2 with high affinity for RNA and DNA"
8Cas2: function unknown
- Cas3: usually similar to helicases
- Unwinds double-stranded DNA
- Thought to be involved in DNA metabolism and repair
- Cas4: often resemble RecB exonucleases
- Break down nucleic acid strands
- Thought to be involved in DNA metabolism
From GenBank
9- Often found with Cas1 and Cas6.
- Share an N-terminal region of about 43 amino acids in length
- Are usually 210-265 amino acids long
- Characterized by GhGxxxxxGhG motif, where h indicates a hydrophobic residue, at the C-terminus
From EMBL IPR013422 profile page (http://www.ebi.ac.uk/interpro/IEntry?ac=IPR013422)
From Sanger PF09559 profile page (http://pfam.sanger.ac.uk/family/PF09559)
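The GhGxxxxxGhG motif described above can be checked for with a regular expression. The following is a minimal sketch, not a validated profile search: the set of residues counted as hydrophobic and the size of the C-terminal window are assumptions chosen for illustration, and the function name is hypothetical.

```python
import re

# Assumption: "h" (hydrophobic) means one of these residues.
# Other definitions of hydrophobicity would change the matches.
HYDROPHOBIC = "AVILMFWYC"

# G-h-G, any five residues, then G-h-G
MOTIF = re.compile(rf"G[{HYDROPHOBIC}]G.{{5}}G[{HYDROPHOBIC}]G")


def find_c_terminal_motif(seq, window=40):
    """Return (start, match) for the last motif hit that lies entirely
    within the final `window` residues of `seq`, or None if there is none.
    `start` is a 0-based position in the full sequence."""
    seq = seq.upper()
    offset = max(0, len(seq) - window)
    tail = seq[offset:]
    hits = [(m.start() + offset, m.group()) for m in MOTIF.finditer(tail)]
    return hits[-1] if hits else None


# Toy sequence, invented for this example (not a real Cas protein)
toy = "M" * 30 + "GLGAAAAAGVG" + "KK"
print(find_c_terminal_motif(toy))
```

A real search would more likely use the Pfam or InterPro profile itself, since a plain pattern match ignores everything about the family except these eleven positions.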
10Csh1 and Csh2?
- Protein families determined for ease of alignment
- Often large differences between species
- Alignment easier if protein soup is divided
into more readily-compared subgroups.
11CRISPRs thought to create stable secondary RNA structures
- Spacers remain associated with their DR neighbors.
- Provide a way for Cas-Proteins to recognize the spacers and facilitate immune response.
From "Evolutionary conservation of sequence and secondary structures in CRISPR repeats"
12Cas-Proteins and Immunity
- Thought to act like Slicer and Dicer (eukaryotic counterparts)
- Create siRNA that will inhibit/break down invading RNA
- Not known if Cas-proteins are involved in integrating pathogenic DNA into spacers
- Video of eukaryotic siRNA process: http://www.youtube.com/watch?v=D-77BvIOLd0
13Alignments of Cas
- Compared Cas1, Cas2, Cas3 etc. proteins across
all 10 species
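The slides do not say which alignment tool was used. As an illustration of what a pairwise comparison of two Cas sequences involves, here is a minimal sketch of Needleman-Wunsch global alignment with a percent-identity score. The scoring values (match, mismatch, gap) are arbitrary assumptions, and real protein alignments would normally use a substitution matrix such as BLOSUM62 and an established tool.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment. Returns the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aln_a, aln_b):
    """Identical, non-gap columns as a percentage of alignment length."""
    if not aln_a:
        return 0.0
    same = sum(x == y and x != "-" for x, y in zip(aln_a, aln_b))
    return 100.0 * same / len(aln_a)


# Short toy fragments, invented for this example
aln_a, aln_b = global_align("MTDSSGDP", "MTDSGDP")
print(aln_a, aln_b, percent_identity(aln_a, aln_b))
```

Running this for every pair of species and every Cas protein gives a matrix of identities, which is one common starting point for a tree like the one on the later slide.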
14Comparison with other species (based on old proteins)
Species Cas1 Cas2 Cas3 Cas4 TM1800 TM1801
H. vallismortis
H. volcanii
H. sulfurifontis X X
H. sinaiiensis X X X X X
H. californiae X X X
H. utahensis X X X X X X
H. mucosum X X X X X X
H. mediterranei X X X X X X
H. denitrificans X X X X X X
H. mukohataei X X X X X X
15Phylogenetic tree comparing amino acid sequences for all CAS-proteins
[Figure: phylogenetic tree. Species shown: Halomicrobium mukohataei, Haloarcula sinaiiensis, Haloarcula californiae, Haloferax denitrificans, Haloferax mediterranei, Haloferax sulfurifontis, Haloferax mucosum, Halorhabdus utahensis. Leaf labels: 1, 2, 3, 4, 1800, 1801]
16Cas1 and Cas2 did not change
17- Cas4
- JGI revision shortened this protein
- Would expect low sequence similarity near end of
protein
18- TM1801 (Csh2)
- Revision by JGI simply renamed this protein
- Would expect sequence similarity
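The two expectations above (low similarity near the end of the shortened Cas4, similarity throughout the renamed TM1801/Csh2) can be checked by scoring identity in windows along an alignment. This is a minimal sketch that assumes the two sequences have already been aligned to equal length with "-" for gaps; the function name and the window size are hypothetical choices for illustration.

```python
def windowed_identity(aln_a, aln_b, window=10):
    """Percent identity in consecutive windows of two aligned sequences.

    Both inputs must be gapped strings of equal length.
    Returns a list of (window_start, percent_identity) pairs.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    result = []
    for start in range(0, len(aln_a), window):
        cols = list(zip(aln_a[start:start + window],
                        aln_b[start:start + window]))
        # A column counts as identical only if it is not a gap
        same = sum(x == y and x != "-" for x, y in cols)
        result.append((start, 100.0 * same / len(cols)))
    return result


# First 20 residues of the Cas4 sequence shown earlier, compared with an
# invented truncated copy (the truncation point is made up for this example)
full = "MTDSSGDPVDRFLAAARDES"
short = "MTDSSGDPVD----------"
print(windowed_identity(full, short))
```

A protein that was only renamed should score high in every window, while one that was shortened should drop off in the windows past the new end.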
19Map of CRISPR region
[Figure: map of the CRISPR region. Labels: TM1800, TM1801, transposases, Cas3, hypothetical proteins, CRISPR, Cas1, unidentified, Csh1, Cas2, Cas5, Cas4, Cas6, Csh2]
20In conclusion
We don't know much, but we do know everything that everybody else knows.
21Questions?
22References
- Kunin, V., Sorek, R., Hugenholtz, P. (2007). Evolutionary conservation of sequence and secondary structures in CRISPR repeats. Genome Biology. http://genomebiology.com/2007/8/4/R61. Accessed 24 Nov 2009.
- Haft, D.H., Selengut, J., Mongodin, E.F., Nelson, K.E. (2005). A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes. PLoS Comput Biol. http://www.ncbi.nlm.nih.gov/pubmed/16292354. Accessed 24 Nov 2009.