Clustering Metabolic Networks Using Minimum Cut Trees

Ryan Kellogg¹, Allison Heath², Lydia Kavraki²,³
¹ Carnegie Mellon University, Department of Electrical and Computer Engineering
² Rice University, Department of Computer Science
³ Rice University, Department of Bioengineering
Problem
Results
Method
This project is about the discovery and analysis
of clusters in metabolic networks. We implement
an algorithm for cluster detection based on
minimum cut trees, apply the algorithm to
metabolic network data, analyze the identified
clusters and discuss the biological implications.

The minimum cut tree clustering (MCTC) algorithm
proceeds as follows [1]:

Overview
Tuning Alpha
Clustering Algorithm
  • We seek an objective way to select alpha in our
    analysis
  • Choose the value whose clusters best fit known
    metabolic pathway structure
  • To calculate it, find the alpha at which average
    pathways per cluster (PPC) and average clusters
    per pathway (CPP) intersect
  • The accompanying figure shows best-fit alpha
    values for the four organisms in our study
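The selection rule above can be sketched in a few lines. The poster does not define PPC and CPP precisely, so this sketch assumes PPC counts the distinct pathways that share at least one reaction with a cluster, and CPP counts the distinct clusters a pathway's reactions fall into. The function names, the toy pathways, and the precomputed clusterings per alpha are all hypothetical.

```python
def average_ppc_cpp(clusters, pathways):
    """clusters: list of sets of reactions; pathways: dict name -> set of reactions."""
    # Assumed definition: pathways touched by each cluster.
    ppc = [sum(1 for members in pathways.values() if members & c) for c in clusters]
    # Assumed definition: clusters touched by each pathway.
    cpp = [sum(1 for c in clusters if members & c) for members in pathways.values()]
    return sum(ppc) / len(ppc), sum(cpp) / len(cpp)


def best_fit_alpha(clusterings, pathways):
    """Pick the alpha whose average PPC and CPP are closest to each other."""
    def gap(alpha):
        ppc, cpp = average_ppc_cpp(clusterings[alpha], pathways)
        return abs(ppc - cpp)
    return min(clusterings, key=gap)


# Toy data (hypothetical): two pathways of three reactions each.
pathways = {"P1": {"r1", "r2", "r3"}, "P2": {"r4", "r5", "r6"}}
clusterings = {
    0.1: [{"r1", "r2", "r3", "r4", "r5", "r6"}],          # one big cluster
    1.0: [{"r1", "r2", "r3"}, {"r4", "r5", "r6"}],        # clusters match pathways
    10.0: [{"r1"}, {"r2"}, {"r3"}, {"r4"}, {"r5"}, {"r6"}],  # all singletons
}
print(best_fit_alpha(clusterings, pathways))
```

On a coarse grid of alpha values the two curves rarely cross exactly, which is why the sketch picks the alpha with the smallest gap rather than testing for equality.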

1. Begin with an undirected, weighted graph G.
  • Finding clusters in metabolic networks is
    important for several reasons:
  • Clusters may correspond to groups of reactions
    that perform a common function
  • Complex metabolic networks can be simplified
    based on their cluster composition
  • Insights about large-scale organization and
    evolutionary history can be achieved [3]

  • Our approach is interesting because:
  • One can change the size and number of clusters
    produced by adjusting a single parameter
  • The algorithm is elegant and mathematically
    robust
  • Execution is efficient and based on network flow
    computations

Motivation
2. Attach an artificial sink to G, connecting it to
each node with an edge of weight alpha. Call the
resulting structure the expanded graph.
  • We obtain optimal clusterings for each of the
    four organisms and compare them with known
    metabolic pathways. Matches fall roughly into
    four categories:
  • Full match: A cluster coincides exactly with a
    pathway.
  • Partial match: A cluster is contained by but does
    not fill a pathway.
  • Multi-match: A single cluster spans multiple
    pathways.
  • No match: There is little discernible clustering
    in a pathway.
  • We present an example of each type.
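As a rough illustration of the four categories, the sketch below labels a single cluster against a set of known pathways. It is an assumption-laden simplification, not the procedure used for the poster: the function name, the `min_size` threshold, and the treatment of tiny or leftover clusters as "no match" are all hypothetical choices.

```python
def classify_cluster(cluster, pathways, min_size=2):
    """Label one cluster (a set of reactions) against known pathways.

    pathways: dict mapping pathway name -> set of reactions.
    min_size: hypothetical threshold below which a cluster is treated as
              showing no discernible clustering.
    """
    if len(cluster) < min_size:
        return "no match"
    touched = [name for name, members in pathways.items() if cluster & members]
    if len(touched) > 1:
        return "multi-match"          # one cluster spans several pathways
    if len(touched) == 1:
        members = pathways[touched[0]]
        if cluster == members:
            return "full match"       # cluster coincides with the pathway
        if cluster < members:
            return "partial match"    # cluster sits strictly inside the pathway
    return "no match"                 # anything else is left unmatched here


# Toy pathways (hypothetical reaction identifiers).
pathways = {"P1": {"r1", "r2", "r3"}, "P2": {"r4", "r5"}}
for cluster in ({"r1", "r2", "r3"}, {"r1", "r2"}, {"r3", "r4"}, {"r5"}):
    print(sorted(cluster), classify_cluster(cluster, pathways))
```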

Biological Analysis
3. Compute the minimum cut tree of the expanded
graph.
No match: A. thaliana, reductive carboxylate cycle
The algorithm for detecting clusters is based on
a structure called a minimum cut tree [2]. The
minimum cut tree T of a graph G has the property
that the lowest edge weight along the path between
two nodes in T equals the capacity of the minimum
cut between the same two nodes in G. Consider the
example graph and its corresponding minimum cut
tree in the accompanying figure.
Explanation: Suppose we are
interested in the minimum cut between nodes A and
F. The dashed red line indicates this cut, which
has capacity 17. Consequently, in the min-cut
tree, the lowest edge weight along the path
between nodes A and F is 17.
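The path-minimum property can be checked by brute force on a very small graph. The sketch below uses a hypothetical three-node triangle (not the graph from the poster's figure) together with a minimum cut tree for it that was written out by hand, and compares the lowest edge weight on each tree path with the minimum cut found by enumerating every partition.

```python
from itertools import combinations

# Hypothetical weighted triangle; NOT the example graph from the poster.
nodes = ["A", "B", "C"]
edges = {("A", "B"): 1, ("B", "C"): 2, ("A", "C"): 3}

# A minimum cut tree for this triangle, worked out by hand.
tree = {("A", "C"): 4, ("B", "C"): 3}


def brute_force_min_cut(s, t):
    """Try every node set that contains s but not t; return the cheapest cut."""
    others = [n for n in nodes if n not in (s, t)]
    best = float("inf")
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            side = {s, *extra}
            value = sum(w for (u, v), w in edges.items()
                        if (u in side) != (v in side))
            best = min(best, value)
    return best


def tree_path_min(s, t):
    """Lowest edge weight on the unique path from s to t in the tree."""
    adj = {}
    for (u, v), w in tree.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    stack = [(s, None, float("inf"))]
    while stack:
        node, prev, lowest = stack.pop()
        if node == t:
            return lowest
        for nxt, w in adj[node]:
            if nxt != prev:
                stack.append((nxt, node, min(lowest, w)))
    raise ValueError("t is not reachable from s in the tree")


for s, t in combinations(nodes, 2):
    print(s, t, brute_force_min_cut(s, t), tree_path_min(s, t))
```

Enumerating partitions is exponential in the number of nodes, so this is only a sanity check for toy inputs; real networks need a max-flow routine.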
Minimum Cut Trees
4. Remove the artificial sink from the minimum cut
tree. The connected components that remain are
the clusters of G.
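The four steps of the MCTC algorithm (start from G, attach the sink, compute the minimum cut tree, remove the sink) can be sketched in pure Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the max-flow routine is a plain Edmonds-Karp, the tree is built with Gusfield's variant of the Gomory-Hu construction, the graph is assumed to be a symmetric dict of dicts, and the names `min_cut`, `gomory_hu_tree`, `cut_clustering`, and the sink label are hypothetical.

```python
from collections import defaultdict, deque


def min_cut(cap, s, t):
    """Edmonds-Karp max flow. Returns (cut value, nodes on the s side)."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}  # residual capacities
    value = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # BFS for an augmenting path
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:  # no path left: the BFS reached exactly the s side
            return value, set(parent)
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push
        value += push


def gomory_hu_tree(cap):
    """Gusfield's construction; the tree is returned as parent pointers."""
    nodes = list(cap)
    parent = {v: nodes[0] for v in nodes}
    weight = {}
    for s in nodes[1:]:
        t = parent[s]
        value, side = min_cut(cap, s, t)
        weight[s] = value
        for v in nodes:
            if v != s and v in side and parent[v] == t:
                parent[v] = s
        if parent[t] in side:  # s must sit between t and t's old parent
            parent[s], parent[t] = parent[t], s
            weight[s], weight[t] = weight[t], value
    return parent, weight


def cut_clustering(graph, alpha):
    """graph: symmetric dict of dicts, graph[u][v] == graph[v][u] == weight."""
    sink = "_artificial_sink_"  # hypothetical label, assumed unused in graph
    cap = {u: dict(nbrs) for u, nbrs in graph.items()}
    cap[sink] = {}
    for u in graph:  # step 2: build the expanded graph
        cap[u][sink] = alpha
        cap[sink][u] = alpha
    parent, _ = gomory_hu_tree(cap)  # step 3
    adj = defaultdict(set)
    for v, p in parent.items():  # step 4: keep tree edges that avoid the sink
        if v != p and sink not in (v, p):
            adj[v].add(p)
            adj[p].add(v)
    clusters, seen = [], set()
    for start in graph:  # collect connected components
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            x = stack.pop()
            if x not in comp:
                comp.add(x)
                stack.extend(adj[x] - comp)
        seen |= comp
        clusters.append(comp)
    return clusters


def undirected(edge_list):
    g = {}
    for u, v, w in edge_list:
        g.setdefault(u, {})[v] = w
        g.setdefault(v, {})[u] = w
    return g


# Toy graph (hypothetical): two tight triangles joined by one weak edge.
demo = undirected([("a", "b", 10), ("b", "c", 10), ("a", "c", 10),
                   ("d", "e", 10), ("e", "f", 10), ("d", "f", 10),
                   ("c", "d", 1)])
print(cut_clustering(demo, alpha=1))
```

On this toy graph, alpha = 1 separates the two triangles, alpha = 0 leaves a single cluster, and a large alpha such as 100 isolates every node, which mirrors how the single parameter controls cluster size and number.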
Cluster Statistics
Full match: E. coli, fatty acid biosynthesis
Partial match: S. cerevisiae, nucleotide sugars
metabolism
  • Interesting observations:
  • The number of clusters changes with alpha in a
    step-like fashion
  • Moderately sized clusters occur only for a small
    range of alpha
  • Overall behavior is as expected

Conclusion and Future Work
This is an ongoing project. More analysis is
necessary to determine the extent to which the MCTC
algorithm is useful for understanding metabolic
networks. Current progress is encouraging: the
algorithm seems to produce biologically
meaningful clusters with reasonable efficiency.
In future work, we will explore cluster detection
when pathway structure is unknown, simplified
network representations based on cluster
composition, and applications in other types of
biological networks, such as motif identification
in regulatory networks.

Data
  • We model metabolic networks using a directed,
    bipartite graph:
  • One set of nodes represents compounds
  • One set of nodes represents reactions
  • Edges associate compounds with reactions
  • Metabolic networks are very complex. This model
    is a first-order approximation. It captures the
    topological information necessary for cluster
    identification.
  • Our data comes from the Kyoto Encyclopedia of
    Genes and Genomes (KEGG). We study the full
    metabolism of four organisms:
  • Saccharomyces cerevisiae
  • Arabidopsis thaliana
  • Escherichia coli
  • Homo sapiens
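The bipartite model described above can be illustrated with a small sketch. The reaction records, identifiers, and the `build_bipartite` helper are hypothetical and do not reflect the KEGG data format; the edge direction (substrate to reaction, reaction to product) is also an assumption about how the directed graph is oriented.

```python
def build_bipartite(reactions):
    """reactions: dict mapping reaction id -> (substrates, products).

    Returns (compound nodes, reaction nodes, directed edges). Every edge joins
    a compound and a reaction, never two nodes of the same kind.
    """
    compounds, edges = set(), set()
    for rxn, (substrates, products) in reactions.items():
        compounds.update(substrates)
        compounds.update(products)
        edges.update((c, rxn) for c in substrates)  # substrate -> reaction
        edges.update((rxn, c) for c in products)    # reaction -> product
    return compounds, set(reactions), edges


# Toy records (hypothetical identifiers).
reactions = {
    "R1": (["glucose", "ATP"], ["G6P", "ADP"]),
    "R2": (["G6P"], ["F6P"]),
}
compounds, reaction_nodes, edges = build_bipartite(reactions)
print(sorted(edges))
```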

Metabolic Networks as Graphs
References
[1] G.W. Flake, R.E. Tarjan, K. Tsioutsiouliklis. Graph Clustering and Minimum Cut Trees. Internet Mathematics 1: 385-408. 2002.
[2] R.E. Gomory and T.C. Hu. Multi-terminal Network Flows. J. Soc. Indust. Appl. Math. 9: 551-571. 1961.
[3] P. Holme and M. Huss. Discovery and Analysis of Biochemical Network Hierarchies. Bioinformatics 19: 532-538. 2003.

For questions or comments: Allison Heath, aheath_at_cs.rice.edu