PIRSF Classification System - PowerPoint PPT Presentation

About This Presentation
Slides: 18
Provided by: wuc

Transcript and Presenter's Notes


1
PIRSF Classification System
Protein Classification and Functional Annotation
Discovery of New Knowledge by Using Information
Embedded within Families of Homologous Sequences
and Their Structures
  • PIRSF: Evolutionary relationships of proteins
    from super- to sub-families
  • Homeomorphic Family: Homologous proteins sharing
    full-length similarity and common domain
    architecture
  • Significance:
  • Improve sensitivity of protein identification and
    functional inference
  • Detect and correct genome annotation errors
    systematically
  • Provide basis for evolutionary and comparative
    genomics research
  • Provide basis for automated annotation of protein
    features: annotate generic biochemical and
    specific biological functions

2
A protein may be assigned to only one
homeomorphic family, which may have zero or more
child nodes and zero or more parent nodes. Each
homeomorphic family may have as many domain
superfamily parents as its members have domains.
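
The parent/child rules on this slide can be sketched as a small directed acyclic graph. This is only an illustrative model, not PIR's actual data structure: the class names, the level labels and all node and protein IDs below are hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the hierarchy (hypothetical model, not the PIR schema)."""
    node_id: str
    level: str  # "domain superfamily", "homeomorphic" or "subfamily"
    parents: set = field(default_factory=set)   # zero or more parents
    children: set = field(default_factory=set)  # zero or more children


class ClassificationDAG:
    def __init__(self):
        self.nodes = {}    # node_id -> Node
        self.hfam_of = {}  # protein_id -> its single homeomorphic family

    def add_node(self, node_id, level):
        self.nodes[node_id] = Node(node_id, level)

    def link(self, parent_id, child_id):
        # A node may gain several parents, so this is a DAG, not a tree.
        self.nodes[parent_id].children.add(child_id)
        self.nodes[child_id].parents.add(parent_id)

    def assign(self, protein_id, hfam_id):
        # Enforce the rule: a protein belongs to only one homeomorphic family.
        if self.nodes[hfam_id].level != "homeomorphic":
            raise ValueError(f"{hfam_id} is not a homeomorphic family")
        current = self.hfam_of.get(protein_id)
        if current is not None and current != hfam_id:
            raise ValueError(f"{protein_id} already belongs to {current}")
        self.hfam_of[protein_id] = hfam_id


# Placeholder IDs: a two-domain family has two domain superfamily parents.
dag = ClassificationDAG()
dag.add_node("DOMAIN_SF_1", "domain superfamily")
dag.add_node("DOMAIN_SF_2", "domain superfamily")
dag.add_node("HFAM_A", "homeomorphic")
dag.add_node("HFAM_B", "homeomorphic")
dag.add_node("SUBFAM_A1", "subfamily")
dag.link("DOMAIN_SF_1", "HFAM_A")
dag.link("DOMAIN_SF_2", "HFAM_A")
dag.link("HFAM_A", "SUBFAM_A1")
dag.assign("protein_1", "HFAM_A")
```

Assigning `protein_1` to `HFAM_B` afterwards raises `ValueError`, which mirrors the one-family-per-protein rule.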
3
Creation and Curation of PIRSFs
  • Computer-Generated (Uncurated) Clusters
  • Preliminary Curation
  • Membership
  • Signature Domains
  • Full Curation
  • Family Name, Description, Bibliography
  • PIRSF Name Rules

4
PIRSF family classification system
http://pir.georgetown.edu/pirwww/dbinfo/pirsf.shtml
5
PIRSF Text Search
Ways to get to PIRSF text search
Add extra input boxes for advanced search
Select field
6
PIRSF Text Search Result (I)
Things you can do from the result table:
1. Add search terms or start search over
2. Customize the table columns
3. Save your results as table or FASTA format
4. Select entries using check boxes and perform
analysis using tool bar options
5. Links to PIRSF records, PIRSF hierarchy, and
protein domains (Pfam)
7
PIRSF Text Search Result (II)
2. How to customize the table columns. Example: display
the KEGG pathway ID column
a- Select KEGGPathway ID in the "Fields not in
display" box
c- KEGG ID should now be in the "Fields in
display" box. Press the apply button for the changes to
take effect.
8
PIRSF Text Search Result (III)
3. Save your results as table or FASTA format
a- Select entries using check boxes in the PIRSF
column. To select all, check the box in the
column heading.
9
PIRSF Text Search Result (IV)
4. Select entries using check boxes and perform
analysis using tool bar options
a- Select families using check boxes in the PIRSF
ID column. To select all, check the box in the
column heading. Then select a tool, e.g., Taxonomy
Distribution.
Display the taxonomic distribution for the selected
families. In this case, PIRSF001501 and
PIRSF017318 contain members of the AroQ class
from prokaryotes and eukaryotes, respectively,
which is also reflected in the family names.
10
PIRSF Text Search Result (V)
  • Note on selecting families for analysis with
    Multiple Alignment and Domain Display:
  • If more than one family is selected, the chosen
    tool will perform the operation on representative
    members of the selected families. Example:
    multiple alignment of PIRSF001501, PIRSF500251,
    PIRSF026640 and PIRSF029775.
  • If one family is selected, the chosen tool will
    perform the operation on the seed members.
    Example: multiple alignment of PIRSF001501.
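
The selection rule above can be written as a short function. This is a sketch of the stated rule only, not the site's implementation: the function name, the dictionary layout and the family and member IDs are hypothetical.

```python
def members_for_analysis(selected_ids, families):
    """Pick the sequences a tool would operate on.

    selected_ids: list of family IDs chosen via the check boxes.
    families: dict mapping family ID -> {"seed": [...], "representative": [...]}
              (hypothetical layout, assumed for this sketch).
    """
    if not selected_ids:
        raise ValueError("select at least one family")
    if len(selected_ids) == 1:
        # One family selected: operate on its seed members.
        return list(families[selected_ids[0]]["seed"])
    # Several families selected: operate on representative members of each.
    members = []
    for family_id in selected_ids:
        members.extend(families[family_id]["representative"])
    return members


# Placeholder data, not real PIRSF membership.
families = {
    "FAM_A": {"seed": ["a_seed1", "a_seed2", "a_seed3"], "representative": ["a_rep"]},
    "FAM_B": {"seed": ["b_seed1", "b_seed2"], "representative": ["b_rep"]},
}
single = members_for_analysis(["FAM_A"], families)
multiple = members_for_analysis(["FAM_A", "FAM_B"], families)
```

With one family selected the result is that family's seed members; with two selected it is one representative list per family, concatenated in selection order.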

11
PIRSF Text Search Result (VI)
5. The result table contains summarized
information about family size, domain
architecture, and level of curation. Additional data
can be viewed by using the Display Option.
PIRSF Name: The names assigned to PIRSFs
predominantly reflect the membership. The main
source of PIRSF names is the literature. Fully
curated families have a name accompanied, in most
cases, by an evidence tag:
  • Validated: indicates that at least one member in
    the family has an experimentally determined function.
  • Predicted: for families whose functions are
    inferred computationally based on sequence
    similarity and/or functional associative analysis.
  • Tentative: for cases where experimental
    evidence is not decisive.
Curation Status: Indicates the level of manual
curation of the PIRSF.
  • Uncurated: Computer-generated protein clusters, no
    manual curation. The clusters are computationally
    defined using both pairwise-based parameters
    (sequence identity, sequence length ratio and
    overlap length ratio) and cluster-based
    parameters (matched members, distance to
    neighboring clusters and overall domain
    arrangement).
  • Preliminary: Computer-generated clusters are
    manually curated for membership (do proteins
    belong to the assigned cluster?) and domain
    architecture (Pfam domains listed from N- to
    C-termini).
  • Full/Full (with description): A name is assigned
    to the protein family, and accompanying references
    are listed when available. In many cases, brief
    descriptions are also provided.
Hfam/Superfam/Subfam: Indicates the hierarchical
level of the PIRSF: homeomorphic, superfamily or
subfamily level, respectively. Selecting the
button will show the PIRSF hierarchy in a DAG
view with Pfam as the top node.
12
5. PIRSF hierarchy in DAG view (cont.)
Pfam level
Hfam level
Subfam level
13
PIRSF Family Report (I): Curated Protein Family
Information
Level of manual curation
14
PIRSF Family Report (II)
Integrated value-added information from other
databases
Mapping to other protein classification databases
15
PIRSF Batch Retrieval
Retrieve PIRSF families by selecting a specific
identifier or a combination of identifiers.
Define IDs
Display the list of query/PIRSF matches
List IDs
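
Before pasting a list of IDs into the batch retrieval form, it can help to clean the list locally. The sketch below assumes the list holds PIRSF IDs and that they follow the pattern seen in this presentation ("PIRSF" plus six digits, e.g. PIRSF001501). The function name is hypothetical, and nothing here calls the PIR website.

```python
import re

# Assumed ID pattern, inferred from IDs such as PIRSF001501 and PIRSF017318.
PIRSF_ID = re.compile(r"PIRSF\d{6}")


def parse_id_list(text):
    """Split pasted text into unique, well-formed PIRSF IDs and rejected tokens."""
    valid, rejected = [], []
    # Accept whitespace, commas or semicolons as separators.
    for token in re.split(r"[\s,;]+", text.strip()):
        if not token:
            continue
        candidate = token.upper()
        if PIRSF_ID.fullmatch(candidate):
            if candidate not in valid:  # drop duplicates, keep input order
                valid.append(candidate)
        else:
            rejected.append(token)
    return valid, rejected


valid_ids, rejected_tokens = parse_id_list("PIRSF001501, pirsf017318\nPIRSF001501 foo")
```

Rejected tokens are returned rather than silently dropped, so a mistyped ID can be corrected before submitting the query.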
16
PIRSF SCAN (sequence search)
17
PIRSF SCAN (sequence search)
Returns only matches to fully curated PIRSFs.
UniProtKB sequence Q8Y5X7 is automatically
classified as chorismate mutase of the AroH
class (PIRSF005965).