Title: Integration of Protein Family, Function, Structure
Slide 1: iProClass Protein Knowledgebase
- Integration of Protein Family, Function, and Structure
- Rich Links to >90 Databases
- Value-Added Reports for UniProtKB Proteins
Slide 2: iProClass Text Search
Search tips:
1- Use "not null" or "null" to search for entries that do or do not contain information in the selected search field, respectively. In the present example, we want to search for proteins that have enzymatic activity corresponding to EC 1.14.16.1 and have a 3D structure (PDB ID not null).
2- Use the and/or/not logical operators to combine search terms.
Select field
Add (+)/delete (-) input boxes
Search!
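The filter logic of the example search can be sketched in a few lines of Python. This is only an illustration of what "EC number = 1.14.16.1 AND PDB ID not null" means; it does not call iProClass. The `Entry` class, its field names, and the placeholder records are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    """Hypothetical stand-in for one search result row (not the iProClass schema)."""
    accession: str
    ec_number: Optional[str]
    pdb_id: Optional[str]


def is_not_null(value: Optional[str]) -> bool:
    """'not null' means the field holds some information."""
    return value is not None and value.strip() != ""


def search(entries: List[Entry], ec_number: str) -> List[Entry]:
    """Keep entries with the given EC number AND a non-empty PDB ID."""
    return [e for e in entries
            if e.ec_number == ec_number and is_not_null(e.pdb_id)]


# Placeholder records, invented for illustration only.
entries = [
    Entry("ENTRY_A", "1.14.16.1", "PDB_X"),  # matches both conditions
    Entry("ENTRY_B", "1.14.16.1", None),     # right EC number, PDB ID is null
    Entry("ENTRY_C", "1.14.16.2", "PDB_Y"),  # has a structure, wrong EC number
]

hits = search(entries, "1.14.16.1")
print([e.accession for e in hits])  # ['ENTRY_A']
```

Replacing `and` with `or` or `and not` in `search` mirrors the other logical operators offered by the form.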
Slide 3: iProClass Text Search Result (I)
Things you can do from the result table:
1. Add search terms or start over
2. Customize the table columns
3. Save your results in table or FASTA format
4. Select entries using check boxes and perform analysis using the tool bar options
5. Follow links to protein records, to protein names (BioThesaurus), and to protein families (PIRSF)
Slide 4: iProClass Text Search Result (II)
2. How to customize the table columns: display the PDB ID column
a- Select PDB ID in the "Fields not in display" box.
c- PDB ID should now be in the "Fields in display" box. Press the Apply button for the changes to take effect.
Slide 5: iProClass Text Search Result (III)
3. Save your results in table or FASTA format
a- Select entries using the check boxes in the Protein AC/ID column. To select all, check the box in the column heading.
Slide 6: iProClass Text Search Result (IV)
4. Select entries using check boxes and perform analysis using the tool bar options
a- Select entries using the check boxes in the Protein AC/ID column. To select all, check the box in the column heading. Then select a tool, e.g., Domain Display.
Domain Display shows the Pfam domains present in the selected proteins.
Slide 7: iProClass Text Search Result (V)
5. Links to protein records, to protein names
(BioThesaurus), to protein families (PIRSF)
Link to protein reports
Link to PIRSF report
Link to pre-computed BLAST
Link to taxonomy
Link to protein names
Slide 8: iProClass Protein Report (I)
See protein synonyms
Pre-computed BLAST
Rich links: extensive cross-references
Slide 9: iProClass Protein Report (II)
Integrated value-added information from other databases
Slide 10: iProClass Protein Report (III)
Links to different protein family classification
databases
Interactive Domain and Sequence Display
Slide 11: iProClass Text Search Result (VII)
Slide 12: iProClass Text Search Result (VII)
Related Sequences (pre-computed BLAST) shows proteins similar to the query significantly faster than running BLAST in real time. It may also reveal tight protein clusters (indicated by a low number of related sequences).
Slide 13: iProClass Protein Knowledgebase
Slide 14: Batch Retrieval in iProClass
Due to the diversity of databases and the lack of
consistency in protein/gene names and/or
identifiers in the literature, it can be
difficult to retrieve multiple entries when
protein and gene identifiers come from different
sources. The batch retrieval tool overcomes this
problem and provides high flexibility, allowing
the retrieval of multiple entries from the
iProClass database by selecting a specific
identifier or a combination of them.
If possible, specify the type of ID
3979833 304131 24660393
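Before submitting a batch, a pasted identifier list usually needs light cleanup. The sketch below is a local helper, not part of the batch retrieval tool: the function names and the choice of delimiters (whitespace, commas, semicolons) are assumptions. It also flags purely numeric lists such as the example above, which carry no hint of their source database and therefore benefit from an explicit ID type.

```python
import re
from typing import List


def parse_id_list(raw: str) -> List[str]:
    """Split pasted text into identifiers, dropping blanks and duplicates.

    The order of first appearance is preserved.
    """
    tokens = [t for t in re.split(r"[\s,;]+", raw.strip()) if t]
    return list(dict.fromkeys(tokens))


def all_numeric(ids: List[str]) -> bool:
    """True when every identifier consists of digits only."""
    return bool(ids) and all(i.isdigit() for i in ids)


ids = parse_id_list("3979833 304131\n24660393 304131")
print(ids)               # ['3979833', '304131', '24660393']
print(all_numeric(ids))  # True: the ID type should be specified explicitly
```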
Slide 15: Batch Retrieval Result Page
Retrieve more sequences
Choose columns to be displayed
Links to iProClass and UniProtKB reports
Slide 16: iProClass Protein Knowledgebase
Slide 17: Search a Pattern in iProClass
Pattern search at PIR allows:
1- The search of a specific PROSITE or user-defined pattern against one of the following sequence databases:
(i) UniProtKB, the central hub for the collection of functional information on proteins, with accurate, consistent, and rich annotation. It consists of two sections: a section containing manually annotated records (UniProtKB/Swiss-Prot) and a section with computationally analyzed records that await full manual annotation (UniProtKB/TrEMBL).
(ii) A subset of UniProtKB entries belonging to a certain organism or taxon group.
(iii) UniRef100, which provides clustered sets of sequences at 100% identity from UniProtKB (including splice variants and isoforms) and selected UniParc records.
A pattern is a formula (regular expression) that
represents the conserved region of a group of
related proteins.
Enter pattern
P-D-x(2)-H-[DE]-[LIVF]-[LIVMFY]-G-H-[LIVMC]-[PA]
PROSITE is a database that contains patterns and
profiles specific for more than a thousand
protein families or domains.
Enter PROSITE ID
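Because a pattern is a regular expression, a user-defined pattern can be tried out locally before searching a database. The sketch below translates PROSITE notation into a Python regular expression; it is not the PIR implementation. It assumes standard PROSITE syntax: elements separated by `-`, `x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repetition, and `<` / `>` for the sequence ends. The function name and the test sequence are made up for illustration, and the slide's example pattern is written with the square brackets that standard notation requires for alternative residues.

```python
import re

# One pattern element, optionally followed by a repeat count such as (2) or (2,4).
ELEMENT = re.compile(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Simplified: anchors placed inside brackets (e.g. [G>]) are not handled.
    """
    regex = []
    for element in pattern.strip().rstrip(".").split("-"):
        start = "^" if element.startswith("<") else ""
        end = "$" if element.endswith(">") else ""
        element = element.lstrip("<").rstrip(">")
        core, low, high = ELEMENT.fullmatch(element).groups()
        if core == "x":
            residue = "."                      # any amino acid
        elif core.startswith("{"):
            residue = "[^" + core[1:-1] + "]"  # excluded residues
        else:
            residue = core                     # single letter or [ABC] group
        if low and high:
            count = "{%s,%s}" % (low, high)
        elif low:
            count = "{%s}" % low
        else:
            count = ""
        regex.append(start + residue + count + end)
    return "".join(regex)


pattern = "P-D-x(2)-H-[DE]-[LIVF]-[LIVMFY]-G-H-[LIVMC]-[PA]"
regex = prosite_to_regex(pattern)
print(regex)  # PD.{2}H[DE][LIVF][LIVMFY]GH[LIVMC][PA]

# Synthetic test sequence (not a real protein) built to contain one match.
match = re.search(regex, "AAPDAAHDLLGHVPAA")
print(match.start() + 1, match.end())  # 3 14 (1-based range of the match)
```

The printed range corresponds to the "sequence range where pattern is found" reported on the result page.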
Slide 18: Search a Pattern Result in iProClass
Display the query pattern
Sequence range where pattern is found
Slide 19: Search a Pattern in iProClass
Pattern search at PIR allows:
2- The search of PROSITE patterns (note that profiles are excluded) in a query sequence, entered either as a sequence in single-letter amino acid code or by its unique ID.
Enter sequence
MNDRADFVVPDITTRKNVGLSHDANDFTLPQPLDRYSAEDHATWATLYQR
QCKLLPGRACDEFMEGLERLEVD
Enter ID
Slide 20: iProClass Protein Knowledgebase
Slide 21: Protein ID Mapping Service
Maps between UniProtKB and more than 30 other
data sources to support data interoperability
among disparate data sources and to allow
integration and querying of data from
heterogeneous molecular biology databases.
Enter IDs
Load file with ID list
Slide 22: Protein ID Mapping Service
Example: we want to obtain a list of Entrez Gene IDs for a group of UniProtKB proteins.
P04176 P16331 P00439 P17276
Enter IDs
IDs can be cut and pasted if needed or saved as a
text file using the "save as" option provided by
your web browser.
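Once a mapping result has been saved as a text file, it can be looked up programmatically. The sketch below assumes a two-column, tab-delimited layout (UniProtKB accession, then Entrez Gene ID); the actual layout of the saved file may differ. The function names are hypothetical, and the `GENE_ID_*` values are placeholders, not real Entrez Gene IDs.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, List


def load_mapping(handle: Iterable[str]) -> Dict[str, List[str]]:
    """Read 'source ID <TAB> target ID' rows; one source may map to several targets."""
    mapping = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or not row[1]:
            continue  # skip blank or unmapped rows
        mapping[row[0]].append(row[1])
    return mapping


def map_ids(query_ids: List[str], mapping: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return the targets for each query ID; unmapped IDs get an empty list."""
    return {q: mapping.get(q, []) for q in query_ids}


# Placeholder table standing in for a saved mapping file.
table = io.StringIO(
    "P04176\tGENE_ID_1\n"
    "P16331\tGENE_ID_2\n"
    "P16331\tGENE_ID_3\n"
)
queries = "P04176 P16331 P00439 P17276".split()
result = map_ids(queries, load_mapping(table))
for accession, targets in result.items():
    print(accession, targets or "no mapping found")
```

Returning a list per query keeps one-to-many mappings visible and makes unmapped IDs easy to spot.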
Slide 23: iProClass Protein Knowledgebase
Cite iProClass
The iProClass integrated database for protein functional analysis. Wu CH, Huang H, Nikolskaya A, Hu Z, Yeh LS, Barker WC. Computational Biology and Chemistry, 28: 87-96, 2004.
iProClass Distribution
iProClass is freely available for academic institutions. Vendors and commercial entities who want to use and/or redistribute iProClass need to contact PIR to request a license (pirmail_at_georgetown.edu).