Databases. - PowerPoint PPT Presentation

About This Presentation
Title:

Databases.

Description:

Biochemistry – PowerPoint PPT presentation

Slides: 29
Provided by: m.prasadnaidu
Transcript and Presenter's Notes

1
BIOLOGICAL DATABASES
M. Prasad Naidu, MSc Medical Biochemistry, Ph.D.
2
INTRODUCTION
  • The database
  • must be maintained as a central shareable
    resource
  • should provide easy-to-use software to access the
    information (web-pages...)
  • has to be structurally organised and fully
    annotated to find the information needed
  • should not contain redundant information
  • should be error free

3
Levels of protein sequence databases and
structural organisation
  • Primary: sequence (e.g. AVILDRYFH), held in a
    primary database
  • Secondary: motif or pattern (e.g. AS-X-IL2-DE),
    held in a secondary database
  • Tertiary: domain (e.g. Rossmann fold, GTP-binding
    domain...), held in a structure database
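A motif at the secondary level is a pattern that can be searched for in a primary sequence. The following is a minimal sketch of such a search, assuming a simplified PROSITE-style syntax ('-' between elements, 'x' for any residue, '[..]' for alternatives, '(n)' for repeats). The function names are illustrative, and the bracketed pattern is only an assumed reading of the motif shown on the slide.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Convert a simplified PROSITE-style pattern to a Python regex.

    Assumed syntax: '-' separates elements, lowercase 'x' is any residue,
    '[ABC]' is a set of allowed residues, '(n)' repeats an element n times.
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"unsupported element: {element!r}")
        core, count = m.groups()
        core = "." if core == "x" else core
        parts.append(core + (f"{{{count}}}" if count else ""))
    return "".join(parts)

def find_motif(pattern: str, sequence: str) -> list[int]:
    """Return 1-based start positions of non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [m.start() + 1 for m in regex.finditer(sequence)]

# Assumed bracketed form of the slide's motif; the sequence is the
# primary-level example from the same slide.
EXAMPLE_PATTERN = "[AS]-x-[IL](2)-[DE]"
EXAMPLE_SEQUENCE = "AVILDRYFH"
```

Calling `find_motif(EXAMPLE_PATTERN, EXAMPLE_SEQUENCE)` reports where the motif occurs in the example sequence.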
4
Different Types Of Databases
  • Primary Databases.
  • Composite Databases.
  • Secondary Databases.

5
PRIMARY DATABASES
  • In 1980, the flood of sequence information
    created a need to store sequence data.
  • Primary databases contain sequence information.
  • Examples:
      Nucleic acid    Protein
      EMBL            PIR
      GenBank         MIPS
      DDBJ            SWISS-PROT
                      TrEMBL
                      NRL-3D

6
PIR
  • Developed at the National Biomedical Research
    Foundation (NBRF) in the 1960s by Margaret Dayhoff
    to investigate evolutionary relationships between
    proteins.
  • Maintained by PIR, an association of
    macromolecular sequence data collection centres:
  • PIR at NBRF
  • International Protein Information Database of
    Japan (JIPID)
  • Martinsried Institute for Protein Sequences
    (MIPS)

7
Quality of PIR Database
  • Has been split into 4 sections ranked
    according to quality:
  • PIR1: fully classified and annotated entries
  • PIR2: preliminary entries (may include
    redundancy)
  • PIR3: unverified entries
  • PIR4: conceptual translations

8
MIPS
  • Collects and processes sequence data for the PIR.
  • Is also distributed with PATCHX, a supplement of
    unverified protein sequences from external
    sources.

9
SWISS-PROT database
  • Produced by the Dept. of Medical Biochemistry at
    the University of Geneva and the EMBL in 1986.
  • Was transferred to the EBI in 1994.
  • Later moved to the Swiss Institute of
    Bioinformatics (SIB).
  • Has highly annotated entries with descriptions
    of function, structure and post-translational
    modifications.

10
Example of a flat file: SWISS-PROT Q14790
ID   ICE8_HUMAN     STANDARD;      PRT;   479 AA.
AC   Q14790; Q14791; Q14792; Q14793; Q14794;
AC   Q14795; Q14796; Q15780; Q15806; Q9UQ81;
AC   O14676;
DT   01-NOV-1997 (Rel. 35, Created)
DT   01-NOV-1997 (Rel. 35, Last sequence update)
DT   01-OCT-2000 (Rel. 40, Last annotation update)
DE   CASPASE-8 PRECURSOR (EC 3.4.22.-) (ICE-LIKE
DE   APOPTOTIC PROTEASE 5) (MORT1-ASSOCIATED
DE   CED-3 HOMOLOG) (MACH) (FADD-HOMOLOGOUS
DE   ICE/CED-3-LIKE PROTEASE) (FADD-LIKE ICE) (FLICE)
DE   (APOPTOTIC CYSTEINE PROTEASE) (APOPTOTIC
DE   PROTEASE MCH-5) (CAP4).
GN   CASP8 OR MCH5.
11
OS   Homo sapiens (Human).
OC   Eukaryota; Metazoa; Chordata; Craniata;
OC   Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Primates; Catarrhini;
OC   Hominidae; Homo.
OX   NCBI_TaxID=9606;
RN   [1]
RP   SEQUENCE FROM N.A., AND ALTERNATIVE SPLICING.
RC   TISSUE=Thymus, and B-cell;
RX   MEDLINE=96279826; PubMed=8681376;
RX   [NCBI, ExPASy, EBI, Israel, Japan]
RA   Boldin M.P., Goncharov T.M., Goltsev Y.V.,
RA   Wallach D.;
12
RT   "Involvement of MACH, a novel
RT   MORT1/FADD-interacting protease, in
RT   Fas/APO-1- and TNF receptor-induced cell death.";
RL   Cell 85:803-815(1996).
RN   [2]
RP   X-RAY CRYSTALLOGRAPHY (2.8 ANGSTROMS).
RX   MEDLINE=99451259; PubMed=10508784;
RX   [NCBI, ExPASy, EBI, Israel, Japan]
RA   Blanchard H., Kodandapani L., Mittl P.R.E.,
RA   Di Marco S., Krebs J.F., Wu J.C.,
RA   Tomaselli K.J., Gruetter M.G.;
RT   "The three-dimensional structure of
RT   caspase-8: an initiator enzyme in
RT   apoptosis.";
RL   Structure 7:1125-1133(1999).
13
CC   -!- FUNCTION: MOST UPSTREAM PROTEASE OF
CC       THE ACTIVATION CASCADE OF CASPASES
CC       RESPONSIBLE FOR THE FAS-RECEPTOR
CC       MEDIATED (CD95) AND TNFR-1 INDUCED CELL
CC       DEATH. BINDING TO THE ADAPTOR MOLECULE
CC       FADD RECRUITS IT TO EITHER RECEPTORS.
CC       THE RESULTING AGGREGATE CALLED THE
CC       DEATH-INDUCING SIGNALING COMPLEX (DISC)
CC       PERFORMS FLICE/MACH PROTEOLYTIC
CC       ACTIVATION. THE ACTIVE DIMERIC ENZYME IS
CC       THEN LIBERATED FROM THE DISC AND FREE TO
CC       ACTIVATE DOWNSTREAM APOPTOTIC PROTEASES.
CC       PROTEOLYTIC FRAGMENTS OF THE N-TERMINAL
CC       PROPEPTIDE (TERMED CAP3, CAP5 AND CAP6)
CC       ARE LIKELY RETAINED IN THE DISC. CLEAVES
Comments
14
CC       AND ACTIVATES CASPASE-3, -4, -6, -7, -9,
CC       AND -10. MAY PARTICIPATE IN THE GRANZYME B
CC       APOPTOTIC PATHWAYS. PROTEOLYTICALLY
CC       CLEAVES POLY(ADP-RIBOSE) POLYMERASE (PARP).
CC       HYDROLYZES THE SMALL-MOLECULE SUBSTRATE,
CC       AC-ASP-GLU-VAL-ASP--AMC. LIKELY TARGET
CC       FOR THE COWPOX VIRUS CRMA DEATH INHIBITORY
CC       PROTEIN.
CC   -!- SUBUNIT: HETERODIMER OF A 18 KDA (P18)
CC       AND A 10 KDA (P10) SUBUNIT. INTERACTS WITH
CC       CFLAR.
CC   -!- ALTERNATIVE PRODUCTS: 8 ISOFORMS;
CC       1-ALPHA (SHOWN HERE), 2-ALPHA/MCH5-BETA,
CC       3-ALPHA, 4-ALPHA, 1-BETA, 2-BETA, 3-BETA AND
CC       4-BETA; ARE PRODUCED BY ALTERNATIVE
CC       SPLICING.
Presence of subunits and of alternative proteins
15
CC   -!- TISSUE SPECIFICITY: ALPHA 1 AND BETA 1
CC       ISOFORMS ARE EXPRESSED IN A WIDE VARIETY
CC       OF TISSUES. HIGHEST EXPRESSION IN
CC       PERIPHERAL BLOOD LEUKOCYTES, SPLEEN,
CC       THYMUS AND LIVER. BARELY DETECTABLE IN
CC       BRAIN, TESTIS, AND SKELETAL MUSCLE.
CC   -!- PTM: GENERATION OF THE SUBUNITS
CC       REQUIRES ASSOCIATION WITH THE DISC,
CC       WHEREAS ADDITIONAL PROCESSING IS LIKELY
CC       DUE TO THE AUTOCATALYTIC ACTIVITY OF THE
CC       ACTIVATED PROTEASE. GRANZYME B AND
CC       CASPASE-10 CAN BE INVOLVED IN THESE
CC       PROCESSING EVENTS.
CC   -!- SIMILARITY: BELONGS TO PEPTIDASE
CC       FAMILY C14; ALSO KNOWN AS THE CASPASE
CC       FAMILY. CONTAINS 2 DEATH EFFECTOR
CC       DOMAINS (DED).
Tissue specificity, post-translational modifications, similarity
16
DR   EMBL; X98172; CAA66853.1; -. [EMBL / GenBank / DDBJ] [CoDingSequence]
DR   EMBL; X98173; CAA66854.1; -. [EMBL / GenBank / DDBJ] [CoDingSequence]
DR   EMBL; X98174; CAA66855.1; -. [EMBL / GenBank / DDBJ] [CoDingSequence]
DR   PDB; 1QDU; PRELIMINARY. [ExPASy / RCSB]
DR   SWISS-3DIMAGE; ICE8_HUMAN.
DR   InterPro; IPR001875; DED.
DR   Pfam; PF01335; DED; 2.
DR   Pfam; PF00655; ICE_p10; 1.
DR   Pfam; PF00656; ICE_p20; 1.
DR   PROSITE; PS50207; CASPASE_P10; 1.
DR   PROSITE; PS50208; CASPASE_P20; 1.
DR   PROSITE; PS50168; DED; 2.
Database cross-references with accession numbers
17
DR   ProDom [Domain structure / List of seq.
DR   sharing at least 1 domain]
DR   BLOCKS; Q14790.
DR   DOMO; Q14790.
DR   PROTOMAP; Q14790.
DR   PRESAGE; Q14790.
DR   DIP; Q14790.
DR   SWISS-2DPAGE; GET REGION ON 2D PAGE.
KW   Hydrolase; Thiol protease; Apoptosis;
KW   Zymogen; Alternative splicing;
KW   3D-structure.
Keywords
18
FT   PROPEP        1    216
FT   CHAIN       217    374       CASPASE-8 SUBUNIT P18.
FT   PROPEP      375    384
FT   CHAIN       385    479       CASPASE-8 SUBUNIT P10.
FT   ACT_SITE    317    317
FT   ACT_SITE    360    360
FT   DOMAIN        2     80       DED 1.
FT   DOMAIN      100    177       DED 2.
FT   VARSPLIC    102    102       R -> RFHFCRMSWAEANSQC
FT                                QTQSVPFWRRVDHLLIR (IN ISOFORM 4 ALPHA).
FT   VARSPLIC MISSING (IN ISOFORM 2 ALPHA,
FT                                ISOFORM 4 ALPHA AND ISOFORM 4 BETA).
FT   CONFLICT    285    285       D -> H (IN REF. 3 AND
FT                                5).
FT   CONFLICT    294    294       E -> D (IN REF. 4).
Feature Table
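Each feature-table line carries a feature key followed by start and end positions in the sequence. The following is a minimal sketch of reading such lines and deriving feature lengths, assuming whitespace-separated fields and 1-based inclusive coordinates. The `Feature` type and helper names are illustrative and not part of any SWISS-PROT tool.

```python
from typing import NamedTuple

class Feature(NamedTuple):
    key: str
    start: int
    end: int
    description: str

    @property
    def length(self) -> int:
        # Positions are treated as 1-based and inclusive (an assumption).
        return self.end - self.start + 1

def parse_ft_line(line: str) -> Feature:
    """Parse one 'FT key start end [description]' line."""
    fields = line.split(None, 4)
    if len(fields) < 4 or fields[0] != "FT":
        raise ValueError(f"not a simple FT line: {line!r}")
    description = fields[4].strip() if len(fields) > 4 else ""
    return Feature(fields[1], int(fields[2]), int(fields[3]), description)

def slice_feature(sequence: str, feature: Feature) -> str:
    """Return the residues covered by a feature."""
    return sequence[feature.start - 1:feature.end]

# Three lines taken from the feature table of the example entry.
FT_LINES = [
    "FT   CHAIN       217    374       CASPASE-8 SUBUNIT P18.",
    "FT   CHAIN       385    479       CASPASE-8 SUBUNIT P10.",
    "FT   ACT_SITE    317    317",
]
FEATURES = [parse_ft_line(line) for line in FT_LINES]
```

With these coordinates the P18 chain spans 158 residues and the P10 chain 95, and `slice_feature` would cut the corresponding subunit out of the full sequence.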
19
SQ   SEQUENCE   479 AA;  55391 MW;  7A5FEAA6B39B582F CRC64;
     MDFSRNLYDI GEQLDSEDLA SLKFLSLDYI PQRKQEPIKD ALMLFQRLQE KRMLEESNLS
     FLKELLFRIN RLDLLITYLN TRKEEMEREL QTPGRAQISA YRVMLYQISE EVSRSELRSF
     KFLLQEEISK CKLDDDMNLL DIFIEMEKRV ILGEGKLDIL KRVCAQINKS LLKIINDYEE
     FSKERSSSLE GSPDEFSNGE ELCGVMTISD SPREQDSESQ TLDKVYQMKS KPRGYCLIIN
     NHNFAKAREK VPKLHSIRDR NGTHLDAGAL TTTFEELHFE IKPHDDCTVE QIYEILKIYQ
     LMDHSNMDCF ICCILSHGDK GIIYGTDGQE APIYELTSQF TGLKCPSLAG KPKVFFIQAC
     QGDNYQKGIP VETDSEEQPY LEMDLSSPQT RYIPDEADFL LGMATVNNCV SYRNPAEGTW
     YIQSLCQSLR ERCPRGDDIL TILTEVNYEV SNKDDKKNMG KQMPQPTFTL RKKLVFPSD
//
The same file can also be viewed in a web-oriented layout via SWISS-PROT.
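In the flat file above, every line starts with a two-letter code (ID, AC, DT, DE, ..., SQ) and the entry ends with "//". The following is a minimal sketch of grouping the lines of one entry by code. It assumes the usual layout with the code in the first two columns and the data starting at column six. The function names are illustrative, and the sample is shortened from the example entry, so its sequence is truncated.

```python
from collections import defaultdict

def parse_flat_entry(text: str) -> dict[str, list[str]]:
    """Group the lines of one flat-file entry by their two-letter line code.

    Sequence lines (which follow SQ and carry no code) are stored under
    the key "  ". Parsing stops at the "//" terminator.
    """
    records: dict[str, list[str]] = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("//"):
            break
        if not line.strip():
            continue
        code, data = line[:2], line[5:].strip()
        records[code].append(data)
    return dict(records)

def sequence_of(records: dict[str, list[str]]) -> str:
    """Join the sequence lines and drop the spaces between blocks of ten."""
    return "".join("".join(records.get("  ", [])).split())

# Shortened sample: only some line types and the first sequence line.
SAMPLE = """\
ID   ICE8_HUMAN     STANDARD;      PRT;   479 AA.
AC   Q14790; Q14791; Q14792; Q14793; Q14794;
AC   Q14795; Q14796; Q15780; Q15806; Q9UQ81;
OS   Homo sapiens (Human).
SQ   SEQUENCE   479 AA;  55391 MW;  7A5FEAA6B39B582F CRC64;
     MDFSRNLYDI GEQLDSEDLA SLKFLSLDYI PQRKQEPIKD ALMLFQRLQE KRMLEESNLS
//
"""

entry = parse_flat_entry(SAMPLE)
# Accession numbers are separated by semicolons on the AC lines.
accessions = [a.strip() for line in entry["AC"]
              for a in line.split(";") if a.strip()]
```

Grouping by line code keeps the sketch independent of the order in which line types appear, at the cost of losing the association between reference lines (RN, RA, RT, RL) that belong together.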
20
TrEMBL database
  • Designed as a supplement to SWISS-PROT
  • Provides translations of all coding sequences
    in the EMBL nucleotide database
  • Consists of 2 sections:
  • SP-TrEMBL, with entries that will be
    incorporated into SWISS-PROT after annotation
  • REM-TrEMBL, with entries that are not destined
    to be included in SWISS-PROT (synthetic
    sequences, conceptual translations, ...), so
    that they do not compromise the quality of
    SWISS-PROT
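TrEMBL entries are translations of coding sequences. The following is a minimal sketch of such a conceptual translation, assuming the standard genetic code (NCBI translation table 1) and a coding sequence that starts in frame. The function name is illustrative, and the DNA in the usage note is a made-up example, not a real database entry.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), with codons ordered
# TTT, TTC, TTA, TTG, TCT, ... and '*' marking stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate_cds(cds: str) -> str:
    """Translate an in-frame coding sequence up to the first stop codon.

    Codons containing characters other than A, C, G, T/U become 'X'.
    """
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

For instance, `translate_cds("ATGGATTTCAGCTAA")` reads the codons ATG, GAT, TTC, AGC and stops at TAA.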

21
NRL-3D databases
  • Contains only protein sequences extracted from
    the Brookhaven Protein Data Bank (PDB)
  • But also includes:
  • bibliographic references and MEDLINE cross-
    references
  • secondary structure information
  • active and binding sites, modifications in the
    sequence
  • details on experimental method, resolution,
    R-factor, ...

22
Composite protein sequence Databases
  • 1) To render sequence searching more efficient
  • 2) To avoid having to choose the best primary
    database (which is the most up to date, which
    database to use, ...)

23
Some of the composite protein sequence databases
available
  NRDB                OWL          MIPSX        SPTrEMBL
  PDB                 SWISS-PROT   PIR          SWISS-PROT
  SWISS-PROT          PIR          MIPSOwn      TrEMBL
  PIR                 GenBank      MIPSTrn
  GenPept             NRL-3D       MIPSH
  SWISS-PROT update                PIRMOD
  GenPept update                   NRL-3D
                                   SWISS-PROT
                                   EMTrans
                                   GBTrans
                                   Kabat
                                   PseqIP

24
NRDB
  • NRDB (Non-Redundant Database) is built locally at
    the NCBI.
  • It is a composite of:
  • -GenPept (GenBank CDS translations)
  • -PDB sequences
  • -SWISS-PROT update (updates of SWISS-PROT)
  • -PIR
  • -GenPept update (daily updates of GenPept)
  • NRDB is not prone to errors.
  • NRDB is the database used by the NCBI BLAST
    services.
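A composite database merges several primary sources and removes redundant entries. The following is a minimal sketch of that idea, assuming redundancy means exactly identical sequences and that sources listed first take priority. The actual rules used for NRDB are not described in the slides, and the record names and accessions here are hypothetical.

```python
from typing import Iterable, NamedTuple

class Record(NamedTuple):
    source: str
    accession: str
    sequence: str

def merge_non_redundant(sources: Iterable[Iterable[Record]]) -> list[Record]:
    """Merge sources in priority order, keeping the first copy of each sequence."""
    seen: set[str] = set()
    merged: list[Record] = []
    for records in sources:
        for rec in records:
            key = rec.sequence.upper()  # compare sequences case-insensitively
            if key not in seen:
                seen.add(key)
                merged.append(rec)
    return merged

# Hypothetical records: the first PIR entry duplicates the SWISS-PROT one.
swissprot = [Record("SWISS-PROT", "SP0001", "AVILDRYFH")]
pir = [
    Record("PIR", "PIR0001", "avildryfh"),
    Record("PIR", "PIR0002", "MKTAYIAK"),
]
COMPOSITE = merge_non_redundant([swissprot, pir])
```

Exact-match filtering of this kind removes identical copies only; entries that differ by a single residue or by length would all be kept.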

25
OWL
  • Non-redundant protein sequence database.
  • Built at the University of Leeds in collaboration
    with the Daresbury Laboratory in Warrington.
  • Composite of:
  • -SWISS-PROT
  • -PIR
  • -GenBank
  • -NRL-3D

26
MIPSX
  • Merged database produced at the Martinsried
    Institute for Protein Sequences (MIPS) at the
    Max Planck Institute in Martinsried.
  • Composite of:
  • -PIR
  • -MIPSOwn
  • -MIPSTrn
  • -MIPSH
  • -PIRMOD
  • -NRL-3D
  • -SWISS-PROT
  • -EMTrans
  • -GBTrans

27
SWISS-PROT + TrEMBL
  • Database constructed at the EBI.
  • Composite of SWISS-PROT and TrEMBL.
  • Minimally redundant.
  • The Sequence Retrieval System (SRS) is used to
    retrieve the information.

28
THANK YOU