Bio-Trac 25 (Proteomics: Principles and Methods)

Description:

Title: Protein Family Classification for Functional Genomics
Author: wuc
Last modified by: zh9
Created Date: 3/9/2001 10:06:29 PM
Document presentation format – PowerPoint PPT presentation


Transcript and Presenter's Notes


1
Tutorial: Bioinformatics Resources

(http://pir.georgetown.edu/pirwww/workshop/bioinfo_resource.html)
  • Bio-Trac 25 (Proteomics: Principles and Methods)
  • October 5, 2007
  • Zhang-Zhi Hu, M.D.
  • Research Associate Professor
  • Protein Information Resource, Department of
    Biochemistry and Molecular & Cellular Biology
  • Georgetown University Medical Center

2
What is Bioinformatics?
computer (information) + mouse (biology) = bioinformatics
  • NIH Biomedical Information Science and Technology
    Initiative (BISTI) Working Definition (2000) -
    Research, development, or application of
    computational tools and approaches for expanding
    the use of biological, medical, behavioral or
    health data, including those to acquire, store,
    organize, archive, analyze, or visualize such
    data.

3
Molecular Biology Database Collection
(http://nar.oxfordjournals.org/cgi/content/full/35/suppl_1/D3/DC1)
4
Database Collection in Nucleic Acids Res.
5
2007
Online Access to Database Collection
http://pir.georgetown.edu/pirwww/workshop/2005_database_update.html
http://www.oxfordjournals.org/nar/database/cap/
6
Overview
Database Contents, Search and Retrieval
  1. Text search / Information retrieval
  2. Sequence and genomics databases
  3. Protein family databases
  4. Database of protein functions
  5. Databases of protein structures
  6. Proteomics databases

7
Entrez Text Searches
(http://www.ncbi.nlm.nih.gov/Entrez/)
Lab
8
PubMed Literature Database
(http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Search&DB=PubMed)
Lab
9
iProLINK: Protein Literature Mining Resource
Text mining for protein phosphorylation
Gene/protein name thesaurus: synonyms, ambiguous names
http://pir.georgetown.edu/iprolink/
Lab
10
BioThesaurus: Gene/protein name searches
(synonyms, ambiguous names)
Synonyms: CRYAA; crystallin, alpha A; CRYA1; HSPB4
http://pir.georgetown.edu/iprolink/biothesaurus
Lab
11
RLIMS-P: Text mining for protein phosphorylation
http://pir.georgetown.edu/iprolink/rlimsp/
Lab
12
UniProt Text Search
(http://www.pir.uniprot.org/cgi-bin/textSearch)
Google-type search vs. Boolean searches (AND, OR, NOT)
Lab
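
A Boolean text search combines term matches with set operations: AND is an intersection, OR is a union, and NOT is a set difference. The sketch below illustrates this on a toy in-memory index. The records, record IDs, and function names are made up for illustration and say nothing about how the UniProt text search is actually implemented.

```python
# Toy records; the text is illustrative and not taken from any database.
RECORDS = {
    "rec1": "alpha crystallin A chain rabbit",
    "rec2": "delta crystallin II argininosuccinate lyase duck",
    "rec3": "argininosuccinate lyase human",
}


def build_index(records):
    """Map each lowercase word to the set of record IDs containing it."""
    index = {}
    for rec_id, text in records.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(rec_id)
    return index


def search_and(index, *terms):
    """Records containing every term (set intersection)."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()


def search_or(index, *terms):
    """Records containing at least one term (set union)."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.union(*sets) if sets else set()


def search_not(index, include, exclude):
    """Records containing `include` but not `exclude` (set difference)."""
    return index.get(include.lower(), set()) - index.get(exclude.lower(), set())


INDEX = build_index(RECORDS)
print(sorted(search_and(INDEX, "crystallin", "lyase")))   # ['rec2']
print(sorted(search_or(INDEX, "alpha", "human")))         # ['rec1', 'rec3']
print(sorted(search_not(INDEX, "lyase", "crystallin")))   # ['rec3']
```

A Google-type search would instead rank records by relevance to free text, which this sketch does not attempt.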
13
PIR Text Search (I)
(http://pir.georgetown.edu/pirwww/search/textsearch.html)
Search: which alpha crystallin A chain entries are
classified in protein families?
Search for synonyms
Lab
14
PIR Text Search (II)
Search: which crystallins are enzymes, and which
families do they belong to?
Can you find which crystallins have a determined
3D structure?
Lab
15
I. Sequence and Genomics Databases
  • GenBank: An annotated collection of all publicly
    available nucleotide and protein sequences.
  • RefSeq: NCBI non-redundant set of reference
    sequences, including genomic DNA, transcript
    (RNA), and protein products.
  • UniProt Consortium Database: Universal protein
    resource, a central repository of protein
    sequence and function.
  • Entrez Gene: Gene-centered information at NCBI.
  • UniGene: Unified clusters of ESTs and full-length
    mRNA sequences.
  • OMIM: Online Mendelian Inheritance in Man, a
    catalog of human genetic and genomic disorders.
  • Model Organism Genome Databases: MGD, RGD, SGD,
    FlyBase
  • GeneCards: Integrated database of human genes,
    maps, proteins and diseases.
  • SNP Consortium Database, International HapMap
    Project: Genes associated with human disease

(http://www.oxfordjournals.org/nar/database/cap/)
16
UniProt Consortium Databases
Universal Protein Resource
http://beta.uniprot.org/
(http://www.uniprot.org)
17
UniProt Sequence Report (I)
UniProtKB
What's the difference between CRYAA_RABIT and
CYRBAA?
(http://www.pir.uniprot.org/cgi-bin/unipEntry?id=CRYAA_RABIT)
Lab
18
UniProt Report (II): UniRef100 and UniRef90
UniRef100
(http://www.pir.uniprot.org/cgi-bin/unipEntry?id=UniRef100_P02489)
UniRef90
(http://www.pir.uniprot.org/cgi-bin/unipEntry?id=UniRef90_P02489)
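
UniRef100 merges identical sequences (and subfragments) into a single entry, while UniRef90 groups sequences that share at least 90% identity. The toy sketch below only illustrates that idea of threshold-based clustering. It assumes a naive ungapped identity measure and a greedy strategy chosen for brevity, and the sequences and IDs are invented; real clustering relies on proper sequence alignment software.

```python
def naive_identity(a, b):
    """Fraction of matching positions, ungapped and left-aligned (toy measure)."""
    if not a or not b:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))


def greedy_cluster(seqs, threshold):
    """Put each sequence in the first cluster whose representative it matches."""
    clusters = []  # list of (representative_id, [member_ids])
    # Visit longest sequences first so representatives are the longest members;
    # ties are broken by ID to keep the result deterministic.
    for seq_id in sorted(seqs, key=lambda s: (-len(seqs[s]), s)):
        for rep_id, members in clusters:
            if naive_identity(seqs[rep_id], seqs[seq_id]) >= threshold:
                members.append(seq_id)
                break
        else:
            clusters.append((seq_id, [seq_id]))
    return clusters


# Hypothetical 10-residue sequences, not real proteins.
SEQS = {
    "seqA": "MDVTIQHPWF",
    "seqB": "MDVTIQHPWF",  # identical to seqA
    "seqC": "MDVTIQHPWY",  # differs from seqA at one position (9/10 identical)
    "seqD": "MKKLLPTAAA",  # unrelated
}

print(greedy_cluster(SEQS, 1.0))  # seqA and seqB share a cluster
print(greedy_cluster(SEQS, 0.9))  # seqC now joins the seqA cluster
```

Lowering the threshold can only merge clusters, which is why a 90% clustering has the same number of entries as the 100% clustering, or fewer.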
19
Entrez Gene: Gene-centric information
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?dbgenecmdRetrievedoptGraphicslist_uids12954ubor0_RefSeq
20
OMIM: Online Mendelian Inheritance in Man
(http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=123580)
21
II. Protein Family Databases
  • Whole Proteins
    • PIRSF: Network Classification Based on
      Evolutionary Relationship of Whole Proteins
    • COG (Clusters of Orthologous Groups) of Complete
      Genomes
    • PANTHER: Proteins Classified into
      Families/Subfamilies of Shared Function
    • ProtoNet: Automated Hierarchical Classification
      of Proteins
  • Protein Domains
    • Pfam: Alignments and HMM Models of Protein
      Domains
    • SMART: Protein Domain Families
    • CDD: Conserved Domain Database
  • Protein Motifs
    • PROSITE: Protein Patterns and Profiles
    • BLOCKS: Protein Sequence Motifs and Alignments
    • PRINTS: Compendium of Protein Fingerprints (a
      fingerprint is a group of conserved motifs)
  • Integrated Family Databases
    • InterPro: Integrates Pfam, PRINTS, PROSITE,
      ProDom, SMART, PIRSF, SUPERFAMILY

22
Protein Clustering
Initial version
COGs (http://www.ncbi.nlm.nih.gov/COG/)
New version: Includes Eukaryotic Clusters (KOGs)
23
PIRSF Full-Length Classification: iProClass
Family Report
Lab
(http://pir.georgetown.edu/cgi-bin/ipcSF?id=SF002280)
24
Domain Classification: Pfam Domain
(http://www.sanger.ac.uk/cgi-bin/Pfam/swisspfamget.pl?name=CRYAA_RABIT)
(http://pir.georgetown.edu/cgi-bin/ipcEntry?id=P02493)
25
Pfam Domain
(http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF00525)
26
Protein Motifs: PROSITE. A database of protein
families and domains. It consists of biologically
significant sites, patterns and profiles.
(http://us.expasy.org/prosite/)
27
Integrated Family Classification
  • InterPro
  • An integrated resource unifying PROSITE, PRINTS,
    ProDom, Pfam, SMART, TIGRFAMs, and PIRSF.
    (http://www.ebi.ac.uk/interpro/search.html)

Mapping of families
28
III. Databases of Protein Functions
  • Metabolic Pathways, Enzymes, and Compounds
    • Enzyme Classification: Classification and
      Nomenclature of Enzyme-Catalysed Reactions
      (EC-IUBMB)
    • KEGG (Kyoto Encyclopedia of Genes and Genomes):
      Metabolic Pathways
    • LIGAND (at KEGG): Chemical Compounds, Reactions
      and Enzymes
    • EcoCyc: Encyclopedia of E. coli Genes and
      Metabolism
    • MetaCyc: Metabolic Encyclopedia (Metabolic
      Pathways)
    • BRENDA: Enzyme Database
    • UM-BBD: Microbial Biocatalytic Reactions and
      Biodegradation Pathways
  • Inter-Molecular Interactions and Regulatory
    Pathways
    • IntAct: Protein interaction data from literature
      and user submission
    • BIND: Descriptions of interactions, molecular
      complexes and pathways
    • DIP: Catalogs experimentally determined
      interactions between proteins
    • Reactome: A curated knowledgebase of biological
      pathways
    • BioCarta: Biological pathways of human and mouse
  • GO: Gene Ontology Consortium Database
  • Pathway Resources: Pathguide
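
EC numbers assigned under the IUBMB enzyme nomenclature have four dot-separated levels, the first of which is the top-level enzyme class. For example, argininosuccinate lyase is EC 4.3.2.1, a lyase. The following is a minimal sketch for splitting an EC number into its levels. The function name and the layout of the returned dictionary are assumptions made for illustration, not part of any of the databases listed above.

```python
# Top-level EC classes (the first digit of an EC number).
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def parse_ec(ec_number):
    """Split an EC number such as '4.3.2.1' into its four levels."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()  # tolerate an optional 'EC ' prefix
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four levels in {ec_number!r}")
    top = int(parts[0])
    if top not in EC_CLASSES:
        raise ValueError(f"unknown top-level class in {ec_number!r}")
    # Lower levels may be '-' in partially specified numbers, e.g. '4.3.-.-'.
    levels = tuple(None if p == "-" else int(p) for p in parts)
    return {"levels": levels, "class_name": EC_CLASSES[top]}


print(parse_ec("EC 4.3.2.1"))
# {'levels': (4, 3, 2, 1), 'class_name': 'Lyases'}
```

Partially specified numbers keep their known levels, so `parse_ec("4.3.-.-")` still reports the class as Lyases.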

29
Biological Pathway Resource Collection
http://www.pathguide.org/
  • Protein-protein interactions
  • Metabolic pathways
  • Signaling pathways
  • Pathway diagrams
  • Transcription factors / gene regulatory networks
  • Protein-compound interactions
  • Genetic interaction networks

30
KEGG: Metabolic and Regulatory Pathways
Lab
  • KEGG is a suite of databases and associated
    software, integrating our current knowledge
    on molecular interaction networks, the
    information of genes and proteins, and of
    chemical compounds and reactions.
    (http://www.genome.ad.jp/kegg/kegg2.html)

(http://www.genome.ad.jp/dbget-bin/show_pathway?hsa00220+4.3.2.1)
31
BioCyc: EcoCyc/MetaCyc Metabolic Pathways
  • The BioCyc Knowledge Library is a collection of
    Pathway/Genome Databases (http://biocyc.org/)

32
BioCarta Cellular Pathways
(http://www.biocarta.com/index.asp)
33
Reactome (http://www.reactome.org/)
  • Collaboration of CSHL, EBI and GO Consortium
  • Curated resource of core pathways and reactions
    in human biology
  • Authored by biological researchers who are
    experts in their fields
  • Cross-referenced with NCBI, Ensembl, UniProt,
    HapMap, and KEGG
  • Inferred orthologous events in 22 non-human
    species (e.g., mouse, rat)

34
Transforming Growth Factor (TGF) beta signaling
[Homo sapiens]
(http://reactome.org/cgi-bin/eventbrowser?DB=gk_current&FOCUS_SPECIES=Homo%20sapiens&ID=170834)
Reactome events and objects (including modified
forms and complexes)
Event -> REACT_6879.1: Activated type I receptor phosphorylates R-SMAD directly [Homo sapiens]
Object -> REACT_7364.1: Phospho-R-SMAD [cytosol]
Event -> REACT_6760.1: Phospho-R-SMAD forms a complex with CO-SMAD [Homo sapiens]
Object -> REACT_7344.1: Phospho-R-SMAD:CO-SMAD complex [cytosol]
Event -> REACT_6726.1: The phospho-R-SMAD:CO-SMAD transfers to the nucleus
Object -> REACT_7382.2: Phospho-R-SMAD:CO-SMAD complex [nucleoplasm]
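
The listing above alternates between events (reactions) and the objects (molecules or complexes) they produce. One way to picture this is as an ordered chain of typed records. The sketch below is only an illustration of that reading: the `Step` class and helper functions are hypothetical, the colon in the complex names is an assumed separator, and none of this reflects Reactome's actual data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    kind: str       # "Event" or "Object"
    stable_id: str  # identifier as shown on the slide
    name: str


# Identifiers and names copied from the slide, in slide order.
PATHWAY = [
    Step("Event", "REACT_6879.1",
         "Activated type I receptor phosphorylates R-SMAD directly"),
    Step("Object", "REACT_7364.1", "Phospho-R-SMAD"),
    Step("Event", "REACT_6760.1",
         "Phospho-R-SMAD forms a complex with CO-SMAD"),
    Step("Object", "REACT_7344.1", "Phospho-R-SMAD:CO-SMAD complex"),
    Step("Event", "REACT_6726.1",
         "The phospho-R-SMAD:CO-SMAD transfers to the nucleus"),
    Step("Object", "REACT_7382.2", "Phospho-R-SMAD:CO-SMAD complex"),
]


def alternates(steps):
    """True if no two neighbouring steps have the same kind."""
    return all(a.kind != b.kind for a, b in zip(steps, steps[1:]))


def events(steps):
    """Only the reaction-like steps, in order."""
    return [s for s in steps if s.kind == "Event"]


print(alternates(PATHWAY))                    # True
print([s.stable_id for s in events(PATHWAY)])
```

Note that the last two objects share a name and differ only in identifier and cellular location (cytosol versus nucleoplasm), which is why the identifier, not the name, is the safer key.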
35
Protein-Protein Interaction Database - IntAct
(http://www.ebi.ac.uk/intact/)
36
Gene Ontology (GO)
(http://www.geneontology.org/)
- Molecular Function - Biological Process -
Cellular Component
37
IV. Databases of Protein Structures
  • Protein Structure
    • PDB: Structure Determined by X-ray
      Crystallography and NMR
    • PDBsum: Summaries and analyses of PDB structures
    • MMDB: NCBI's database of 3D structures, part of
      NCBI Entrez
    • SWISS-MODEL Repository: Database of annotated
      protein 3D models
    • ModBase: Annotated comparative protein structure
      models
  • Structure Classification
    • CATH: Hierarchical Classification of Protein
      Domain Structures
    • SCOP: Familial and Structural Protein
      Relationships
    • FSSP: Protein Fold Classification Based on
      Structure-Structure Alignment

38
PDB: Experimental 3D Structure Repository
Rat gamma-crystallin (chains A, B)
Can you do a text search at PIR to find this
(CRGE_RAT)?
(http://www.rcsb.org/pdb/)
Lab
39
PDBsum
Pictorial Database Providing Summaries and
Analyses of PDB Entries
Search
3-D structure summary
2-D structure
(http://www.ebi.ac.uk/thornton-srv/databases/pdbsum/)
40
Protein Structural Classification (1)
CATH: Hierarchical domain classification of
protein structures
(http://www.cathdb.info/latest/index.html)
41
Protein Structural Classification (2)
SCOP: Comprehensive description of the structural
and evolutionary relationships between all
proteins whose structure is known.
(http://scop.mrc-lmb.cam.ac.uk/scop/data/scop.b.html)
42
SWISS-MODEL Repository
A database of annotated three-dimensional
comparative protein structure models
(http://swissmodel.expasy.org/repository/smr.php?sptr_ac=CRGE_RAT&job=2)
43
V. Proteomic Resources
  • GELBANK (http://gelbank.anl.gov): 2D-gel patterns
    of species with completed genomes.
  • SWISS-2DPAGE (http://www.expasy.org/ch2d/): index
    of 2D-gels
  • PEP (http://cubic.bioc.columbia.edu/pep/):
    Predictions for Entire Proteomes, summarized
    analyses of protein sequences
  • Integr8 (http://www.ebi.ac.uk/integr8/): A
    browser for information relating to completed
    genomes and proteomes, based on data contained in
    Genome Reviews and the UniProt proteome sets
  • PRIDE (http://www.ebi.ac.uk/pride/): PRoteomics
    IDEntifications database; expression profiling
    databases
  • GPMdb (http://gpmdb.thegpm.org/): Mass spec
    proteomics databases

44
2D-Gel Image Databases
Lab
(http://us.expasy.org/ch2d/)
Part of WORLD-2DPAGE: index to 2-D PAGE databases
and services
(http://us.expasy.org/swiss-2dpage/ac=P02489)
45
GPMdb: MS Data Search
(http://gpmdb.thegpm.org/)
Craig et al., J Proteome Res. 2004, 3:1234-42.
46
PRIDE: centralized, standards-compliant, public
data repository for proteomics data
http://www.ebi.ac.uk/pride/
47
Lab
  • Text search / Information retrieval
    • Literature search and text mining
    • Finding synonyms (BioThesaurus)
    • Information extraction (e.g., protein
      phosphorylation sites)
    • Find the sequence for the rabbit alpha crystallin
      A chain
    • Find all alpha crystallin A chain entries
      classified in protein families
    • Search for crystallins that have enzyme
      activity
    • Find crystallins that have determined 3D
      structures
  • Database contents (reports)
    • Sequence and genomics databases (UniProt)
    • Protein family databases (PIRSF)
    • Database of protein functions (KEGG)
    • Databases of protein structures (PDB)
    • Proteomics databases (SWISS-2DPAGE)
  • Protein Examples
    • Rabbit alpha crystallin A (UniProtKB
      CRYAA_RABIT/P02493)
    • Delta crystallin II (Argininosuccinate lyase)
      (UniProtKB ARLY2_ANAPL/P24058)
    • Any additional proteins of your interest for
      search and retrieval
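
The protein examples above pair a UniProtKB entry name with an accession, written as `CRYAA_RABIT/P02493`. An entry name joins a protein mnemonic and a species mnemonic with an underscore, while the accession is the stable identifier to use in searches. The sketch below splits such a label into its parts; the function name and dictionary keys are chosen here for illustration only.

```python
def parse_example(label):
    """Split 'ENTRY_NAME/ACCESSION' (as written on the slide) into its parts."""
    entry_name, accession = label.strip().split("/")
    protein, species = entry_name.split("_")
    return {
        "entry_name": entry_name,
        "protein_mnemonic": protein,
        "species_mnemonic": species,
        "accession": accession,
    }


for label in ("CRYAA_RABIT/P02493", "ARLY2_ANAPL/P24058"):
    print(parse_example(label))
```

Either the entry name or the accession can be pasted into the text-search forms used in the lab.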