Considering protein interaction sites in coexpression networks and a tool called intersite - PowerPoint PPT Presentation

Provided by: johnvan3

1
Considering protein interaction sites in
co-expression networks and a tool called intersite
  • To analyze a co-expression network of Affy
    probes, the structure of each probe's putative
    protein product was considered via its number of
    interaction sites. A web-based tool called
    Intersite was developed as a Group Decision
    Support System for annotating the interaction
    sites on proteins.

2
Background
  • "Understanding Protein Function on a Genome scale
    using networks"
  • First Annual Midwest Computational Biology and
    Bioinformatics Symposium at Northwestern
    University, 9/2007
  • Talk by Dr. Mark Gerstein, Yale University
  • Problems with using pathway node degree
    (hub-ness) as an indicator of essentiality
  • Need to consider interact-ability via the number
    of interaction sites on a protein product
  • Showed a stronger correlation between
    essentiality (slower evolution rate) and network
    betweenness than between essentiality and node
    degree
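
The difference between node degree and betweenness can be seen on a toy graph. The following is a minimal sketch and is not taken from the talk: the graph, the node names, and the helper functions are all hypothetical. Betweenness is computed with Brandes' algorithm for an unweighted, undirected graph.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from a list of (a, b) edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree(graph):
    """Node degree (hub-ness): number of direct neighbors."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm)."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        order = []                                # nodes in BFS visiting order
        preds = {node: [] for node in graph}      # predecessors on shortest paths
        n_paths = {node: 0 for node in graph}     # number of shortest paths
        dist = {node: -1 for node in graph}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to the source.
        dependency = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each unordered pair of endpoints was counted twice.
    return {node: value / 2 for node, value in score.items()}


# Hypothetical toy network: two 4-node cliques joined through one bridge node "B".
cluster_p = ["p1", "p2", "p3", "p4"]
cluster_q = ["q1", "q2", "q3", "q4"]
edges = list(combinations(cluster_p, 2)) + list(combinations(cluster_q, 2))
edges += [("B", "p1"), ("B", "q1")]

graph = build_graph(edges)
deg = degree(graph)
bet = betweenness(graph)

for node in sorted(graph):
    print(node, "degree =", deg[node], "betweenness =", bet[node])
```

In this toy graph the bridge node B has the lowest degree, yet it lies on every shortest path between the two clusters and so has the highest betweenness.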

3
Background
  • Yeates, Todd O and Beeby, Morgan.
  • Proteins in a Small World.
  • Science, 22 Dec 2006, Vol. 314, no. 5807,
    pp. 1882-1883

4
Background
  • Kim PM, Lu LJ, Xia Y, Gerstein MB.
  • Relating three-dimensional structures to protein
    networks provides evolutionary insights.
  • Science. 2006 Dec 22;314(5807):1938-41.

Fig. 1. The creation of the structural
interaction network (SIN) data set. All
interactions from the filtered protein
interaction data set are mapped to Pfam domains
(30). The Pfam domains are mapped to known
structures of protein interactions by means of
iPfam (31). Only those interactions in which both
interaction partners (or a homologous domain of
either) can be found in a 3D structure of a
protein complex are kept. All interactions are
then classified into mutually exclusive and
simultaneously possible by 3D structural
exclusion. When a protein has more than one
simultaneously possible interaction, the number
of interaction interfaces is counted.
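
The final counting step of the caption can be sketched in code. This is only an illustration under stated assumptions: in the paper the classification comes from 3D structural exclusion, whereas here each interaction is assumed to arrive already labeled with the interface it uses. The protein names, interface labels, and function names are hypothetical.

```python
from collections import defaultdict


def count_interfaces(interactions):
    """Count distinct interaction interfaces per protein.

    interactions: iterable of (protein, partner, interface_id) tuples,
    where interface_id names the surface of `protein` used for `partner`.
    """
    interfaces = defaultdict(set)
    for protein, _partner, interface_id in interactions:
        interfaces[protein].add(interface_id)
    return {protein: len(ids) for protein, ids in interfaces.items()}


def mutually_exclusive_partners(interactions, protein):
    """Group partners of `protein` that compete for the same interface."""
    by_interface = defaultdict(set)
    for prot, partner, interface_id in interactions:
        if prot == protein:
            by_interface[interface_id].add(partner)
    return {
        interface_id: sorted(partners)
        for interface_id, partners in by_interface.items()
        if len(partners) > 1
    }


# Hypothetical annotations: P2 and P3 bind the same surface of P1,
# so they are mutually exclusive; P4 binds a different surface.
interactions = [
    ("P1", "P2", "ifaceA"),
    ("P1", "P3", "ifaceA"),
    ("P1", "P4", "ifaceB"),
    ("P5", "P1", "ifaceC"),
]

interface_counts = count_interfaces(interactions)
exclusive = mutually_exclusive_partners(interactions, "P1")
print(interface_counts)
print(exclusive)
```

Under these assumptions P1 has three partners but only two interfaces, which is the quantity the caption says is counted.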
5
The Idea
  • Facilitate the same analysis for other networks.
  • i.e., co-expression networks
  • Incorporate human decision making
  • Incorporate genomic/primary sequence data

6
A co-expression network
  • Cramer GR, Ergül A, Grimplet J, Tillett RL,
    Tattersall EA, Bohlman MC, Vincent D, Sonderegger
    J, Evans J, Osborne C, Quilici D, Schlauch KA,
    Schooley DA, Cushman JC.
  • "Water and salinity stress in grapevines: early
    and late changes in transcript and metabolite
    profiles."
  • Funct Integr Genomics. 2007 Apr;7(2):111-34. Epub
    2006 Nov 29.
  • Data available as experiment VV2 on plexdb.org
  • 4 timepoints x 3 treatments x 3 replicates
  • + 3 controls at time 0
  • = 39 Affy hybridizations with RMA normalization
  • x 16,602 Affy grape probes
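
The design figures on this slide can be checked with a few lines of arithmetic. This is a small sketch; the variable names are illustrative, and the count of possible probe pairs is added only to show why a cutoff for edge assignment is needed.

```python
# Experimental design as listed on the slide.
timepoints = 4
treatments = 3
replicates = 3
time_zero_controls = 3
probes = 16_602

# Each timepoint/treatment/replicate combination is one hybridization,
# plus the shared controls at time 0.
hybridizations = timepoints * treatments * replicates + time_zero_controls

# Expression matrix: one row per probe, one column per hybridization.
matrix_shape = (probes, hybridizations)
expression_values = probes * hybridizations

# Number of distinct probe pairs that a correlation network must consider.
probe_pairs = probes * (probes - 1) // 2

print("hybridizations:", hybridizations)
print("matrix shape:", matrix_shape)
print("expression values:", expression_values)
print("candidate probe pairs:", probe_pairs)
```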

7
A co-expression network
  • A program was written that, among other things:
  • Built a correlation network based on an
    arbitrarily set cutoff value for edge assignment
  • Thinned the network by dropping those edges
    associated with first- and second-order partial
    correlations not significantly different from
    zero
  • according to de la Fuente, et al.
    Bioinformatics. Vol. 20, No. 18, pp. 3565-3574,
    2004.
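
The two steps above can be sketched as follows. This is not the program described on the slide. It assumes a fixed cutoff on the absolute partial correlation in place of a significance test, covers only first-order partial correlations, and uses made-up expression profiles and hypothetical function names.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Zero-order Pearson correlation between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_first_order(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def build_network(profiles, cutoff):
    """Return all pairwise correlations and the edges passing the cutoff."""
    r = {
        frozenset(pair): pearson(profiles[pair[0]], profiles[pair[1]])
        for pair in combinations(profiles, 2)
    }
    edges = {pair for pair, value in r.items() if abs(value) >= cutoff}
    return r, edges


def thin_network(profiles, r, edges, partial_cutoff):
    """Drop an edge if some third probe explains the correlation away."""
    kept = set()
    for edge in edges:
        x, y = tuple(edge)
        explained = False
        for z in profiles:
            if z in edge:
                continue
            p = partial_first_order(
                r[edge], r[frozenset((x, z))], r[frozenset((y, z))]
            )
            if abs(p) < partial_cutoff:
                explained = True
                break
        if not explained:
            kept.add(edge)
    return kept


# Made-up profiles: gene_z drives both gene_x and gene_y, which each add
# their own independent pattern on top of it.
base = [-1, -1, -1, -1, 1, 1, 1, 1]
own_x = [1, -1, 1, -1, 1, -1, 1, -1]
own_y = [1, 1, -1, -1, 1, 1, -1, -1]
profiles = {
    "gene_z": [10 + 3 * b for b in base],
    "gene_x": [10 + 3 * b + e for b, e in zip(base, own_x)],
    "gene_y": [10 + 3 * b + e for b, e in zip(base, own_y)],
}

r, edges = build_network(profiles, cutoff=0.8)
thinned = thin_network(profiles, r, edges, partial_cutoff=0.1)
print(sorted(sorted(edge) for edge in edges))
print(sorted(sorted(edge) for edge in thinned))
```

In this example all three pairs pass the zero-order cutoff, but the gene_x to gene_y edge is removed because its correlation vanishes once gene_z is accounted for.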

8
A co-expression network
9
A co-expression network
  • Zero-order Pearson correlation networks were
    constructed with the following characteristics

10
A co-expression network
  • Related networks were created, with edges removed
    which are associated with a first-order partial
    correlation not significantly different from
    zero.

11
Probe ID -> a protein product
  • Map file from Wengang Zhou
  • Difficult problem!

[Diagram: Grape Probe ID, AT Gene Name, UniProt AC; links labeled One:Many and Many:Many]
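
Why this mapping is difficult can be shown by chaining two lookup tables. The following is a minimal sketch with hypothetical identifiers and function names. It does not use the real map file, and it assumes the chain runs from probe ID to gene name to UniProt AC.

```python
def compose(probe_to_genes, gene_to_proteins):
    """Chain probe -> gene and gene -> protein maps into probe -> proteins."""
    result = {}
    for probe, genes in probe_to_genes.items():
        proteins = set()
        for gene in genes:
            # A gene with no known protein accession contributes nothing.
            proteins |= gene_to_proteins.get(gene, set())
        result[probe] = proteins
    return result


def ambiguous(probe_to_proteins):
    """Probes that map to more than one protein product."""
    return {p for p, acs in probe_to_proteins.items() if len(acs) > 1}


def unmapped(probe_to_proteins):
    """Probes that map to no protein product at all."""
    return {p for p, acs in probe_to_proteins.items() if not acs}


# Hypothetical identifiers, for illustration only.
probe_to_genes = {
    "probe_1": {"gene_A", "gene_B"},   # one probe, many genes
    "probe_2": {"gene_B"},
    "probe_3": {"gene_C"},
}
gene_to_proteins = {
    "gene_A": {"ACC_1"},
    "gene_B": {"ACC_1", "ACC_2"},      # many genes, many proteins
}

probe_to_proteins = compose(probe_to_genes, gene_to_proteins)
print(probe_to_proteins)
print("ambiguous:", sorted(ambiguous(probe_to_proteins)))
print("unmapped:", sorted(unmapped(probe_to_proteins)))
```

Even in this small example two of the three probes end up with more than one candidate protein and one has none, so some rule or human decision is needed to pick a product per probe.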
12
Interactions?
  • Kim et al. used:
  • Pfam
  • Protein information based on a given network
  • Finn, et al. Pfam: clans, web tools and
    services. Bioinformatics. 35D 2005.
  • iPfam
  • A database within a database containing
    pairwise interaction information between proteins
    and even their respective residues.
  • Finn, Marshall, Bateman. iPfam: visualization
    of protein-protein interactions in PDB at domain
    and amino acid resolutions. Bioinformatics.
    21(3), 2005.
  • But

13
I'm on my own
14
Incorporate human decision making
  • Group Decision Support Systems
  • Electronically integrated decision help conducted
    by and for many people at once.
  • InterSite
  • a Group Decision Support System for the
    annotation and classification of proteins based
    on their number and types of interaction sites
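
One way a group decision could be reached from individual annotations is a simple plurality vote. The slides do not say how Intersite combines votes, so the following is purely an illustrative sketch: the vote format, the user names, the accessions, and the tie rule are all assumptions.

```python
from collections import Counter


def tally(votes):
    """Return the plurality site count per protein, or None on a tie.

    votes: list of (user, protein_ac, n_sites) tuples.
    """
    per_protein = {}
    for _user, protein_ac, n_sites in votes:
        per_protein.setdefault(protein_ac, Counter())[n_sites] += 1

    consensus = {}
    for protein_ac, counts in per_protein.items():
        ranked = counts.most_common()
        top_count = ranked[0][1]
        winners = [value for value, count in ranked if count == top_count]
        # A tie is left undecided so the group can revisit it.
        consensus[protein_ac] = winners[0] if len(winners) == 1 else None
    return consensus


# Hypothetical votes on the number of interaction sites per protein.
votes = [
    ("ann", "ACC_1", 2),
    ("bob", "ACC_1", 2),
    ("cat", "ACC_1", 3),
    ("ann", "ACC_2", 1),
    ("bob", "ACC_2", 4),
]

consensus = tally(votes)
print(consensus)
```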

15
Intersite Database Schema
16
Intersite Data Flow Diagram
Login
Enter UniProt ACs
17
Fetch and store GenBank data
18
Incorporate human decision making
19
Logged in
20
View Proteins
21
Map External IDs
22
Application to smallest VV2 network
23
Incorporate genomic data: Site Similarity Network
Site similarity is also used to assist decision
making
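
A site similarity network can be sketched as a graph in which two proteins are linked when any of their annotated site sequences are sufficiently alike. The slides do not state how Intersite measures similarity. This sketch assumes the ratio from Python's `difflib.SequenceMatcher` and a fixed threshold, and the accessions and site sequences are invented.

```python
from difflib import SequenceMatcher
from itertools import combinations


def site_similarity(a, b):
    """Similarity of two site sequences as a ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def similarity_network(protein_sites, threshold):
    """Link proteins whose best-matching pair of sites meets the threshold.

    protein_sites: dict mapping a protein accession to a list of site
    sequences. Returns a dict mapping (protein, protein) to the best score.
    """
    edges = {}
    for p, q in combinations(sorted(protein_sites), 2):
        best = max(
            (
                site_similarity(a, b)
                for a in protein_sites[p]
                for b in protein_sites[q]
            ),
            default=0.0,
        )
        if best >= threshold:
            edges[(p, q)] = best
    return edges


# Invented accessions and site sequences, for illustration only.
protein_sites = {
    "ACC_A": ["KRKRK"],
    "ACC_B": ["KRKRK", "GGSGG"],
    "ACC_C": ["LLVLL"],
}

edges = similarity_network(protein_sites, threshold=0.8)
print(edges)
```

Here ACC_A and ACC_B are linked because they share an identical site, while ACC_C stays isolated.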
24
Annotate/Classify
25
GenBank file doesn't know it all
26
User-defined sites
27
Try it
  • Go to http://lab.bcb.iastate.edu/sandbox/jlv/intersite/
  • Log in
  • Use the file linked with VV2 UniProt ACs
  • Paste in the top 20 ACs
  • Put on your Biochemistry hat and vote!
  • Use the tools to help make a decision
  • EBI fetch
  • PIRSF scan
  • UniProt record
  • GenBank file

28
Results: Number of Interactions
29
Results: Number of Interactions
30
Results: Site similarity network
31
Results: Site similarity network
32
Results: Site similarity network
33
Results: Site similarity network
34
Results: Site similarity network
35
Results: Site similarity network
36
Results: Site similarity network
37
Discussion
  • This project was an interesting exercise in
    using multiple bioinformatics tools (and
    creating some new ones) to ask and answer
    questions relating sequence and structure.
    Future work may include separate analysis of the
    groups of VV probes separated by their associated
    proteins' numbers of interaction sites. Also,
    as in [1], the betweenness, closeness, hub-ness,
    and hierarchy level of the proteins in the
    pathways can be calculated.
  • A useful outcome of the project is the ability
    to build protein networks based on interaction
    site similarity and to visualize the resulting
    clusters. For the data used here, nucleus,
    membrane, and secreted proteins each share unique
    interaction site sequences. This is probably
    because nucleus proteins must be delivered to the
    nucleus using signal sequences and special
    chaperone interaction sites. The same is true for
    secreted proteins and membrane proteins. Further
    analysis could consider the rest of the locations
    specified for proteins with multiple subcellular
    location values.
  • Availability
  • Intersite is available online at
    http://lab.bcb.iastate.edu/sandbox/jlv/intersite.
    More details on this project and all the
    scripts used can be found at
    http://www.public.iastate.edu/jlv/intersite.shtml

38
References
  • [1] Gerstein, Mark. Presentation.
    Understanding Protein Function on a Genome scale
    using networks. First Annual Midwest
    Computational Biology and Bioinformatics
    Symposium, Northwestern University.
  • [2] Kim PM, Lu LJ, Xia Y, Gerstein MB. Relating
    three-dimensional structures to protein networks
    provides evolutionary insights. Science. 2006
    Dec 22;314(5807):1938-41.
  • [3] Finn, Marshall, Bateman. iPfam:
    visualization of protein-protein interactions in
    PDB at domain and amino acid resolutions.
    Bioinformatics. 21(3), 2005.
  • [4] Finn, et al. Pfam: clans, web tools and
    services. Bioinformatics. 35D 2005.
  • [5] Cramer GR, Ergül A, Grimplet J, Tillett RL,
    Tattersall EA, Bohlman MC, Vincent D, Sonderegger
    J, Evans J, Osborne C, Quilici D, Schlauch KA,
    Schooley DA, Cushman JC. "Water and salinity
    stress in grapevines: early and late changes in
    transcript and metabolite profiles." Funct
    Integr Genomics. 2007 Apr;7(2):111-34. Epub 2006
    Nov 29.
  • [6] de la Fuente, et al. Bioinformatics. Vol. 20,
    No. 18, pp. 3565-3574, 2004.
  • [8] Wu, Cathy H, et al. The iProClass
    integrated database for protein functional
    analysis. Computational Biology and Chemistry.
    28 (2004) 87-96.
  • [9] Liu, Hongfang, et al. BioThesaurus: a
    web-based thesaurus of protein and gene names.
    Bioinformatics. 22(1), 2006, pp. 103-105.