Title: A Graph-Theoretic Method for Mining Functional Modules in Large Sparse Protein Interaction Network
1A Graph-Theoretic Method for Mining Functional
Modules in Large Sparse Protein Interaction
Network
- Zhang et al.
- Presented by Youngik Yang
2Motivation
- Network modules do not occur by chance, so
identifying them is likely to capture
biologically meaningful interactions
3Existing Methods
- Hierarchical clustering methods
- Prediction Methods
- Combined methods (enumeration of complete
sub-graphs, superparamagnetic clustering and
Monte Carlo simulation)
4Problems with existing methods
- Partition algorithms: each protein belongs to
only one specific module, so they are not suitable
for finding overlapping modules.
- PPI networks are very sparse, while most methods
only identify strongly connected subgraphs as
modules, so only a few modules are detected
5Real Networks
- Figures from Palla et al., Nature, 2005
7Approach: CPM + LGT
- Clique Percolation Method (CPM), based on cliques,
can reveal the overlapping module structure of
complex networks.
- Shortcoming: it relies on 3-clique structures.
- spoke-like modules cannot be detected
- only a few modules can be detected in large
sparse PPI networks (fly, yeast, worm, etc.).
- Line Graph Transformation (LGT) is introduced to
overcome this shortcoming
8Data
- Data collected from various sources such as
MIPS, PreBIND, BIND, GRID, and released papers
from Nucleic Acids Research and Science
- Preprocessing: remove self-interactions and
duplicated interactions
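The preprocessing step can be sketched as follows. This is a minimal illustration, assuming the interactions arrive as an undirected list of protein-name pairs; the function name `clean_interactions` and the placeholder protein labels are hypothetical and not taken from the slides.

```python
def clean_interactions(pairs):
    """Drop self-interactions and duplicated interactions from an edge list.

    Interactions are treated as undirected, so (A, B) and (B, A) count as
    the same interaction. Input order of first occurrences is preserved.
    """
    seen = set()
    cleaned = []
    for a, b in pairs:
        if a == b:
            continue  # self-interaction: a protein paired with itself
        edge = tuple(sorted((a, b)))  # canonical form for an undirected edge
        if edge not in seen:
            seen.add(edge)
            cleaned.append(edge)
    return cleaned


# Placeholder labels, not real protein identifiers.
raw = [("A", "B"), ("B", "A"), ("A", "A"), ("C", "B")]
print(clean_interactions(raw))
```

Sorting each pair gives a canonical key, so duplicates reported in either direction collapse to one edge.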
9Procedure
- Step 1. Compute the line graph L(G)
- Step 2. Apply CPM on L(G)
- Step 3. Resulting modules in L(G) are transformed
back to modules in G
- Step 4. Merge any two heavily overlapping modules
10Method Illustration
11Clique Percolation Method (CPM)
- k-clique community: the union of all k-cliques
that can be reached from one another through
a series of adjacent k-cliques (where adjacency
means sharing k-1 nodes)
- A k-clique community can be regarded as a
module because of its dense internal linkage and
sparse external linkage with the rest of the
network.
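The k-clique community definition can be illustrated with a small brute-force sketch. It is not the implementation used by the authors: it enumerates k-cliques by testing every k-node subset, which is only feasible for toy graphs, and the names `build_adjacency` and `clique_percolation` are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build a dict mapping each node to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clique_percolation(adj, k):
    """Return k-clique communities as a list of node sets (brute force)."""
    nodes = sorted(adj)
    # A k-subset is a clique when every pair inside it is connected.
    cliques = [frozenset(c) for c in combinations(nodes, k)
               if all(v in adj[u] for u, v in combinations(c, 2))]

    # Union-find over cliques: adjacent cliques end up in the same group.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:  # adjacency: share k-1 nodes
            parent[find(i)] = find(j)

    # A community is the union of the nodes of all cliques in one group.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return list(groups.values())


# Triangles {1,2,3} and {2,3,4} share two nodes; {4,5,6} shares only node 4.
edges = [(1, 2), (2, 3), (1, 3), (2, 4), (3, 4), (4, 5), (5, 6), (4, 6)]
communities = clique_percolation(build_adjacency(edges), 3)
print(communities)
```

In this toy graph node 4 belongs to two communities, which is the overlapping behaviour that partition-based clustering cannot express.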
13Line Graph Transformation (LGT)
- Problem with CPM: too restrictive to detect
proper modules in sparse networks, e.g. spoke-like
modules cannot be detected.
- LGT: each edge of G becomes a node of L(G); two
nodes of L(G) are linked when the corresponding
edges of G share an endpoint.
- L(G) retains the information of the original network
- L(G) is more highly structured than the original
network, so applying clique percolation to L(G) is
much more convenient than applying it directly to G.
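A minimal sketch of the line graph transformation, assuming G is given as a list of undirected edges with no self-loops or duplicates. The function name `line_graph` is chosen here for illustration.

```python
from itertools import combinations


def line_graph(edges):
    """Return L(G) as a dict: node of L(G) -> set of adjacent nodes of L(G).

    Each node of L(G) is an edge of G, stored as a sorted tuple. Two nodes of
    L(G) are adjacent when the corresponding edges of G share an endpoint.
    """
    edge_nodes = [tuple(sorted(e)) for e in edges]
    adj = {e: set() for e in edge_nodes}
    for e, f in combinations(edge_nodes, 2):
        if set(e) & set(f):  # the two edges of G have a common endpoint
            adj[e].add(f)
            adj[f].add(e)
    return adj


# A spoke-like module: hub H connected to four proteins, no triangles in G.
spoke = [("H", "a"), ("H", "b"), ("H", "c"), ("H", "d")]
for node, neighbours in line_graph(spoke).items():
    print(node, len(neighbours))
```

The spoke example shows why the transformation helps: the star contains no 3-clique in G, but all of its edges share the hub, so its line graph is a complete graph on four nodes and clique percolation can pick it up.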
14Reverse Transformation (RLG)
- Edges in G that correspond to the nodes of a
module in L(G) form a subnet of the original
network G
- Add the lost edges among the nodes of the subnet
to form modules in the original PPI network.
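The reverse transformation can be sketched as below, reading "add the lost edges" as taking every edge of G whose two endpoints both lie in the subnet; that reading, and the function name `module_from_line_graph`, are assumptions made for illustration.

```python
def module_from_line_graph(lg_module, graph_edges):
    """Map a module found in L(G) back to a module in G.

    lg_module   : iterable of nodes of L(G), each an edge (u, v) of G
    graph_edges : iterable of all edges (u, v) of G
    Returns (nodes, edges) of the module in G.
    """
    # Endpoints of the selected edges give the proteins of the subnet.
    nodes = set()
    for u, v in lg_module:
        nodes.update((u, v))
    # Add back every edge of G that runs between two proteins of the subnet.
    edges = {tuple(sorted((u, v))) for u, v in graph_edges
             if u in nodes and v in nodes}
    return nodes, edges


g_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
nodes, edges = module_from_line_graph([("a", "b"), ("b", "c")], g_edges)
print(sorted(nodes), sorted(edges))
```

In the example the edge (a, c) was not part of the L(G) module but is restored because both of its endpoints belong to the subnet.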
15Post-processing
- Merging is executed for two modules which have a
large overlap
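The slides do not state how overlap is measured or what counts as "large". The sketch below therefore assumes a hypothetical criterion: two modules are merged when the number of shared proteins divided by the size of the smaller module reaches a threshold (0.5 here, chosen arbitrarily). Modules are assumed to be non-empty.

```python
def merge_overlapping(modules, threshold=0.5):
    """Repeatedly merge any two modules whose overlap score >= threshold.

    Overlap score (an assumption, not from the slides):
        |A & B| / min(|A|, |B|)
    """
    modules = [set(m) for m in modules]
    merged = True
    while merged:
        merged = False
        for i in range(len(modules)):
            for j in range(i + 1, len(modules)):
                a, b = modules[i], modules[j]
                if len(a & b) / min(len(a), len(b)) >= threshold:
                    modules[i] = a | b  # replace the pair by its union
                    del modules[j]
                    merged = True
                    break
            if merged:
                break  # restart the scan, since the module list changed
    return modules


print(merge_overlapping([{1, 2, 3, 4}, {3, 4, 5}, {7, 8}]))
```

Restarting the scan after each merge lets a newly enlarged module absorb further neighbours, at the cost of extra passes over the list.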
17Validation of protein complexes
- P-value: the probability that a given set of
proteins is enriched by a given functional group
merely by chance
- Notation: module M contains k proteins in
functional category F, and the PPI network contains
N proteins
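The P-value formula itself is not reproduced in the slide text. A common choice for this kind of enrichment test is the hypergeometric tail, P = 1 - sum_{i=0}^{k-1} C(|F|, i) C(N-|F|, |M|-i) / C(N, |M|), and the sketch below assumes that form. The function name and argument names are illustrative only.

```python
from math import comb


def enrichment_p_value(n_total, n_category, n_module, k):
    """Probability of seeing at least k category proteins in the module by chance.

    n_total    : N, number of proteins in the PPI network
    n_category : |F|, number of proteins annotated with category F
    n_module   : |M|, number of proteins in module M
    k          : number of module proteins annotated with F (k <= |M|)

    Assumes the hypergeometric tail; the exact formula on the slide is unknown.
    """
    below_k = sum(comb(n_category, i) * comb(n_total - n_category, n_module - i)
                  for i in range(k))
    return 1.0 - below_k / comb(n_total, n_module)


# Toy numbers: 10 proteins, 5 in the category, a module of 2 with both annotated.
print(enrichment_p_value(10, 5, 2, 2))
```

A small P-value means the annotation is unlikely to have ended up concentrated in the module by random assignment.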
18Cont.
- By minimizing the probability P_ol of a random
overlap between a computational group and an
experimental group, we can determine the
best-matching experimental complex for a module.
- Notation: C and M are the sizes of an experimental
complex and a computed module, respectively
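The expression for P_ol is also missing from the slide text. The sketch below assumes the hypergeometric form P_ol = C(C, k) C(N-C, M-k) / C(N, M), where k (the number of shared proteins) and N (the total number of proteins) are symbols assumed here rather than taken from the slides. Skipping complexes with no shared protein is likewise a design choice of this sketch.

```python
from math import comb


def overlap_probability(n_total, c_size, m_size, k):
    """Probability that a random module of size m_size shares exactly k
    proteins with a complex of size c_size (assumed hypergeometric form)."""
    return comb(c_size, k) * comb(n_total - c_size, m_size - k) / comb(n_total, m_size)


def best_matching_complex(module, complexes, n_total):
    """Return (name, P_ol) of the complex with the smallest P_ol.

    complexes : dict mapping a complex name to its set of proteins
    Complexes sharing no protein with the module are skipped.
    Returns (None, None) when nothing overlaps.
    """
    module = set(module)
    best_name, best_p = None, None
    for name, members in complexes.items():
        k = len(module & set(members))
        if k == 0:
            continue
        p = overlap_probability(n_total, len(members), len(module), k)
        if best_p is None or p < best_p:
            best_name, best_p = name, p
    return best_name, best_p


# Placeholder complexes over a toy network of 20 proteins.
toy_complexes = {"X": {1, 2, 3, 9}, "Y": {4, 10, 11}, "Z": {12, 13}}
print(best_matching_complex({1, 2, 3, 4}, toy_complexes, 20))
```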
19Results
20Proteins in the same module have the same
localization
21Functional annotation of network modules
22Matching with experimentally determined complexes
- Cellular complexes (550.1.136)
- Coat complexes II (260.30.20)
- Membrane complex (290.10)
24Statistical properties of overlapping modules