1
Remembrance of Experiments Past
2
Bud Mishra
  • Professor of Computer Science, Mathematics and
    Cell Biology
  • Courant Institute, NYU School of Medicine, Tata
    Institute of Fundamental Research, and Mt. Sinai
    School of Medicine

3
Ludwig Joseph Johann Wittgenstein
4
When Wittgenstein told Russell that he wanted to
leave Cambridge and go to Norway, Russell tried
to dissuade him: "I said it would be dark, and he
said he hated daylight. I said it would be
lonely, and he said he prostituted his mind
talking to intelligent people. I said he was mad,
and he said God preserve him from sanity. (God
certainly will.)"
5
Wittgenstein: Brown Book
  • Augustine, in describing his learning of
    language, says that he was taught to speak by
    learning the names of things
  • Suppose a man describes a game of chess, without
    mentioning the existence and operations of the
    pawns. His description of the game as a natural
    phenomenon will be incomplete. On the other hand,
    we may say that he has completely described a
    simpler game.

6
An Augustinian Game
  • Four Characters Analyzed using Mathematical Logic
  • Can this be applied to Time-Series
    Gene-Expression Data?

7
Female, Sings, Overweight
X1 X2 X3 X4
Male, Talks, Thin
Male, Sings, Overweight
Female, Sings, Underweight
8
(Figure: diagram labeled with X1, X2, X3, X4)
9
(Figure: the same diagram, labeled with X1, X2, X3, X4)
10
(Figure: the same diagram, labeled with X1, X2, X3, X4)
11
(Figure: state graph with nodes X2 X4, X3 X4, X1 X2, Finale, X3, and X1)
12
(Figure: the same graph, with nodes annotated "O, FLS" or "O")
13
(Figure: the same graph, with the label "O U FLS" added to three nodes)
14
(Figure: the same graph, with the label "O U FLS" added to a fourth node)
15
It ain't over 'til the fat lady sings
(Figure: the same graph, with the label "A O U FLS" added to one node)
16
Augustine
  • The good Christian should beware of
    mathematicians and all those who make empty
    prophecies. The danger already exists that
    mathematicians have made a covenant with the
    devil to darken the spirit and confine man in the
    bonds of Hell.

17
A Picture
Biological System Part-List Functional
Relations
Measurements
Regulatory, Metabolic Signaling Relations
Recomputation
Redescription
Descriptive Relation
Numerical Traces
KRIPKE MODELS
Invariants
18
Typical Analysis
  • Time-Course data set
  • Example
  • Fibroblast response to Serum
  • Iyer et al., Science (1999)
  • Cluster by global expression pattern
  • Manually look for genes of interest

19
GOALIE: Gene Ontology Algorithmic Logic for
Information Extraction
  • Assign biological vocabulary to expression data
  • Piece together into Kripke-structure model
  • Support query, inference, and comparative
    assessment tasks
  • Present descriptive summaries

20
Examples
21
Reconstructing Temporal Invariants
  • GOALIE aims to reconstruct temporal invariants
    from time course experiments
  • Several illustrations follow

22
Example Datasets
  • Spellman cell cycle dataset (4 cycles: alpha
    factor, Cdc15, elutriation, Cdc28)
  • Tu et al., dynamic organization of cell cycle (3
    cycles of 12 time points each)
  • Results show significant potential to recover
    common trends in cell cycle control
  • 18 functions common to all Spellman cycles
  • 67 functions common to all Tu cycles
  • 16 functions common to all

23
Invariants
G0
G1(I)
M
G1(II)
S
G2
24
Example Gantt Chart
25
Yeast Cell Cycle Data: Spellman et al.
26
Chasing Time
27
Reconstructing Temporal Models
  • Cluster time windows to compress gene expression
    values but preserve information about category
    membership
  • Clusters are modeled using Gaussian/vMF
    distributions; category memberships are modeled
    as multinomial
  • Stochastic approximation algorithm for
    iteratively re-assigning genes to clusters

28
Host-Pathogen Interaction
  • Pre-apoptosis: analysis of six time-point data
  • Six time points, at 2h, 4h, 6h, 8h, 12h, 24h
  • Kidney cells treated with SEB (anthrax)
  • Control: untreated cells
  • Data from Jett-Lab (Walter-Reed)

29
Hypothesized pathway
30
Clusters by P.C., TJU
31
GOALIE: GO Algorithmic Logic for Invariant
Extraction
Clusters connection tree: each level is a window
Micro-array accessions
GO categories
Cluster Information
Connection information
Clusters information
32
GOALIE: GO Algorithmic Logic for Invariant
Extraction
GO categories describing source cluster but not
destination
GO categories describing destination cluster
but not source
GO categories shared with destination cluster
GO categories describing genes in source cluster
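The legend above splits the GO categories on a connection into three groups. A minimal sketch of that split using plain set operations follows; the function name `edge_labels` and the example category sets are hypothetical and are not taken from the GOALIE implementation.

```python
def edge_labels(source_terms, dest_terms):
    """Split GO categories on a cluster-to-cluster connection into three groups."""
    source_terms, dest_terms = set(source_terms), set(dest_terms)
    return {
        "source_only": source_terms - dest_terms,       # describe source but not destination
        "shared": source_terms & dest_terms,            # shared with destination cluster
        "destination_only": dest_terms - source_terms,  # describe destination but not source
    }


# Illustrative category sets only (assumed for the example).
source = {"circulation", "antigen presentation", "antigen processing"}
dest = {"circulation", "regulation of lymphocyte proliferation"}
labels = edge_labels(source, dest)
print(labels["shared"])
```

Under this reading, a category in "source_only" is one that becomes inactive along the connection, and a category in "destination_only" is one that becomes active.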
33
GOALIE SEB Analysis: Preliminary Results
  • Time Course Window 1 to Time Course Window 2,
    Connection 1:9 to 2:18. Inspecting the first
    cluster in the first window (Cluster 1:9), we
    note that one of its connections, to a cluster
    in the second window (Cluster 2:18), is labeled
    (among many others) by the GO categories
    circulation (GO:0008015) and negative
    regulation of heart rate (GO:0045822).
    This represents a constant set of biological
    processes shared by this cluster chain, which
    traverses Cluster 3:17 to Cluster 4:13.
  • Time Course Window 1 to Time Course Window 2,
    Connection 1:9 to 2:6. The connection between
    Cluster 1:9 and Cluster 2:6 is interesting
    because it shows how the category regulation of
    lymphocyte proliferation (GO:0050670) becomes
    active in the next time window (Cluster 2:6),
    while the categories antigen presentation and
    antigen processing become inactive. This
    suggests that some of the genes in the clusters
    start a response to the pathogen at the second
    time point.
  • Goalie movies\GOALIEshort.avi

34
Framework Outline
  • Language
  • Ontology
  • Controlled Vocabulary
  • Logic Models
  • Kripke Structure
  • Temporal Logic
  • Model Checking
  • Model Building
  • Model Building Hidden Kripke Models (HKM)
  • Information Bottleneck
  • Invariants Redescriptions
  • Labeling with Propositions
  • Statistical Significance
  • Examples
  • Yeast Cell Cycle
  • Host-Pathogen Interaction
  • Life Cycle of a Parasite
  • Cancer Initiation and Progression
  • Implementation

35
Language Ontology
36
The Gene Ontology (GO) Consortium
  • Ashburner et al., Nature Genetics 25: 25-29.
    http://www.geneontology.org
  • GO provides three structured networks of defined
    terms to describe gene product attributes
  • Molecular Function Ontology (7304 terms as of
    April 5, 2004): the tasks performed by
    individual gene products; examples are
    carbohydrate binding and ATPase activity
  • Biological Process Ontology (8517 terms): broad
    biological goals, such as mitosis or purine
    metabolism, that are accomplished by ordered
    assemblies of molecular functions
  • Cellular Component Ontology (1394 terms):
    subcellular structures, locations, and
    macromolecular complexes; examples include
    nucleus, telomere, and origin recognition complex

37
Gene Ontology
  • Systematic Hierarchical

38
  • Note that GOALIE is not intimately tied to any
    particular ontology. It can be customized to work
    with other ontologies, e.g., MeSH, UMLS, MetaCyc,
    Reactome
  • Thus GOALIE also provides a way to compare
    different ontologies

From the GO web site. The path back to each
ontology from a gene. We will call each term in
a path a split.
39
Redescription
  • A redescription mining algorithm was used to
    relate concerted clusters of gene expression to
    membership in GO taxonomical categories. Prior
    work (e.g., host-pathogen interactions in plants)
    has shown how this algorithm identifies
    functionally coherent subsets of genes, yielding
    testable biological hypotheses.

40
Logic Model Checking
41
Task: Verify Temporal Properties of a Reactive
System
  • Formally encode the behavior of the system

42
Kripke Structure
  • Formal Encoding of a Dynamical System
  • Simple and intuitive pictorial representation of
    the behavior of a complex system
  • A Graph with nodes representing system states
    labeled with information true at that state
  • The edges represent system transitions as the
    result of some action

43
Kripke Structure Example
  • There is a finite set of states
  • Some are initial states
  • Total transition relation: every state has at
    least one next state, i.e., all paths are infinite
  • There is a set of basic environmental variables
    or features (atomic propositions)
  • In each state, some atomic propositions are true
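The ingredients listed above can be sketched as a small data structure. This is a minimal illustration, not the representation used by GOALIE; the class name `KripkeStructure`, its field names, and the four-state example with its proposition names are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class KripkeStructure:
    """Finite states, initial states, a transition relation, and a labeling."""
    states: set
    initial: set
    transitions: set                             # set of (source, target) pairs
    labels: dict = field(default_factory=dict)   # state -> set of atomic propositions

    def successors(self, state):
        """States reachable from `state` in one transition."""
        return {t for (s, t) in self.transitions if s == state}

    def is_total(self):
        """True if every state has at least one next state."""
        return all(self.successors(s) for s in self.states)


# Hypothetical four-state cycle; the proposition names are illustrative only.
cycle = KripkeStructure(
    states={"G1", "S", "G2", "M"},
    initial={"G1"},
    transitions={("G1", "S"), ("S", "G2"), ("G2", "M"), ("M", "G1")},
    labels={"G1": {"growth"}, "S": {"dna_replication"},
            "G2": {"growth"}, "M": {"division"}},
)
print(cycle.is_total())
print(cycle.successors("M"))
```

Checking totality up front matters because the path semantics used by temporal logics assume that every path can be extended indefinitely.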

44
Task: Verify Temporal Properties of a Reactive
System
  • Formally encode the behavior of the system
  • Formally encode the properties of interest

45
Temporal Logic
  • First-Order Logic: time is an explicitly
    quantified variable
  • Propositional modal logic was invented to
    formalize modal notions and suppress the
    quantified variables with the operators "possibly
    P" and "necessarily P" (similar to "eventually"
    and "henceforth")
  • Temporal Logic:
  • Shorthand for describing the way properties of
    the system change with time
  • Time is implicit

46
Branching versus Linear Time
  • Linear-time: only one possible future at any moment
  • Look at individual computations
  • Branching-time: it may be possible to split to
    different courses depending on possible futures
  • Look at the tree of computations

Time is Linear
Time is Branching
47
Computation Tree Logic (CTL)
  • Branching-time temporal logic, interpreted over
    an execution tree where branching denotes
    non-deterministic actions
  • Explicitly quantify over two modes: the path and
    the time
  • Each time we talk about a temporal property, we
    also specify whether it is true on all possible
    paths or whether it is true on at least one path
    (path quantifiers)
  • A: for all future paths
  • E: for some future path

48
Semantics for CTL
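  • For $p \in AP$:
  • $s \models p \iff p \in L(s)$; $s \models \neg p \iff p \notin L(s)$
  • $s \models f \wedge g \iff s \models f$ and $s \models g$
  • $s \models f \vee g \iff s \models f$ or $s \models g$
  • $s \models EX\, f \iff \exists \pi = \langle s_0 s_1 \ldots \rangle$ from $s$: $s_1 \models f$
  • $s \models E(f\, U\, g) \iff \exists \pi = \langle s_0 s_1 \ldots \rangle$ from $s$:
    $\exists j \ge 0$: $s_j \models g$ and $\forall i$, $0 \le i < j$: $s_i \models f$
  • $s \models EG\, f \iff \exists \pi = \langle s_0 s_1 \ldots \rangle$ from $s$: $\forall i \ge 0$, $s_i \models f$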

49
Some CTL Operators
AF g
EG g
EF g
AG g
50
CTL*
  • A path quantifier can be followed by an arbitrary
    number of temporal operators
  • There are properties expressible in CTL but not
    in LTL and vice-versa
  • LTL and CTL are both contained in CTL*

51
Task: Verify Temporal Properties of a Reactive
System
  • Formally encode the behavior of the system
  • Formally encode the properties of interest
  • Automate the process of checking if the formal
    model of the system satisfies the formally
    encoded properties

52
CTL Model-Checking
  • Straightforward approach: recursive descent on
    the structure of the query formula
  • Label the states with the terms in the formula
  • Proceed by marking each point with the set of
    valid sub-formulas
  • Global algorithm:
  • Iterate on the structure of the property,
    traversing the whole of the model in each step
  • Use fixed-point unfolding to interpret Until

53
Naïve CTL Model-Checker
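A minimal sketch of such a naive checker is given here, assuming formulas are written as nested tuples and the model is given as a successor map plus a labeling. The tuple encoding, the function name `check`, and the four-state example are assumptions made for illustration; this is not the checker used in GOALIE. Each call returns the set of states satisfying a sub-formula, and EU and EG are computed by fixed-point iteration.

```python
def check(formula, states, succ, label):
    """Return the set of states satisfying a CTL formula (nested tuples)."""
    def sat(f):
        op = f[0]
        if op == "true":
            return set(states)
        if op == "ap":                       # atomic proposition
            return {s for s in states if f[1] in label[s]}
        if op == "not":
            return set(states) - sat(f[1])
        if op == "and":
            return sat(f[1]) & sat(f[2])
        if op == "or":
            return sat(f[1]) | sat(f[2])
        if op == "EX":                       # some successor satisfies f[1]
            target = sat(f[1])
            return {s for s in states if succ[s] & target}
        if op == "EU":                       # least fixed point for E(f U g)
            hold, result = sat(f[1]), set(sat(f[2]))
            while True:
                new = {s for s in hold - result if succ[s] & result}
                if not new:
                    return result
                result |= new
        if op == "EG":                       # greatest fixed point for EG f
            result = set(sat(f[1]))
            while True:
                keep = {s for s in result if succ[s] & result}
                if keep == result:
                    return result
                result = keep
        raise ValueError(f"unknown operator: {op}")
    return sat(formula)


# Hypothetical four-state cycle with illustrative labels.
states = {"G1", "S", "G2", "M"}
succ = {"G1": {"S"}, "S": {"G2"}, "G2": {"M"}, "M": {"G1"}}
label = {"G1": {"growth"}, "S": {"dna_replication"},
         "G2": {"growth"}, "M": {"division"}}

ef_division = ("EU", ("true",), ("ap", "division"))      # EF division
ag_ef_division = ("not", ("EU", ("true",), ("not", ef_division)))
print(check(ef_division, states, succ, label))
```

The remaining operators can be derived rather than implemented directly: EF g is E(true U g), AG g is not EF not g, and AF g is not EG not g.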
54
Other Model Checking Algorithms
  • LTL Model Checking: tableau-based
  • CTL* Model Checking: combine CTL and LTL model
    checkers
  • Symbolic Model Checking
  • Binary Decision Diagram
  • OBDD-based model-checking for CTL
  • Fixed-point Representation
  • Automata-based LTL Model-Checking
  • SAT-based Model Checking
  • Algorithmic Algebraic Model Checking
  • Hierarchical Model Checking

55
Complexity Comparison
  • Size of transition system: $n$
  • Size of temporal logic formula: $m$
  • Worst-case comparison:
  • CTL: linear, $O(nm)$
  • LTL: exponential, $n \cdot 2^{O(m)}$
  • For an LTL formula that can also be expressed in
    $\forall$CTL, LTL model checking can be done in time
    linear in the size of the formula
  • LTL model checking is PSPACE-complete. The
    Hamiltonian Path problem can be reduced to an
    LTL model-checking problem:
  • $F p_1 \wedge F p_2 \wedge F p_3 \wedge \cdots$
  • $G(p_1 \to XG\, \neg p_1) \wedge G(p_2 \to XG\, \neg p_2) \wedge \cdots$

56
Story generation
  • Temporal Logic formulae can be rendered in
    English.
  • Temporal Logic formulae can be generated
    automatically (with care).
  • Each formula can be tested against a set of
    datasets; differences can then be noted.

57
Cell Cycle Story Generation: Results (HTML
rendering)
  • Report on "Test Experiment WT, 1 Mutant, 2
    Mutants".
  • The results refer to the following datasets:
  • The first dataset is named "Experiment/Yeast
    Dataset WT".
  • The second dataset is named "Experiment/Yeast
    Dataset Mut1".
  • The third dataset is named "Experiment/Yeast
    Dataset mut2".
  • CDH1 less than or equal to 1.0071783 will
    always hold until CDH1 activates CYCB, is true
    in the first dataset, is true in the second
    dataset, and is false in the third dataset.
  • CDH1 represses CYCB implies CYCB is greater than
    or equal to 0.65, is false in the first
    dataset, is true in the second dataset, and is
    true in the third dataset.
  • eventually, CDH1 is less than or equal to CYCB,
    is false in the first dataset, is true in the
    second dataset, and is true in the third
    dataset.
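Report sentences of the kind shown above could be produced by a small recursive renderer. The sketch below is an assumption about how such rendering might work, not the GOALIE story generator; the tuple encoding of formulas and the names `render`, `report`, and `ORDINALS` are hypothetical. The truth values per dataset are supplied by the caller, since evaluating a formula is a separate step.

```python
ORDINALS = ["first", "second", "third", "fourth", "fifth"]


def render(formula):
    """Render a temporal-logic formula (nested tuples) as an English phrase."""
    op = formula[0]
    if op == "atom":
        return formula[1]
    if op == "eventually":
        return "eventually, " + render(formula[1])
    if op == "always":
        return "always, " + render(formula[1])
    if op == "implies":
        return render(formula[1]) + " implies " + render(formula[2])
    if op == "until":
        return render(formula[1]) + " will always hold until " + render(formula[2])
    raise ValueError(f"unknown operator: {op}")


def report(formula, truth_values):
    """One report sentence stating the formula's truth value in each dataset."""
    clauses = [
        f"is {'true' if value else 'false'} in the {ORDINALS[i]} dataset"
        for i, value in enumerate(truth_values)
    ]
    if len(clauses) > 1:
        clauses[-1] = "and " + clauses[-1]
    return render(formula) + ", " + ", ".join(clauses) + "."


until_formula = ("until",
                 ("atom", "CDH1 less than or equal to 1.0071783"),
                 ("atom", "CDH1 activates CYCB"))
print(report(until_formula, [True, True, False]))
```

Keeping rendering separate from evaluation means the same formula can be reported against any number of datasets (up to the length of `ORDINALS` in this sketch).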

58
Model Building
59
Hidden Kripke Model
  • Hidden Kripke Model
  • Reconstruction via ontology-based redescription
    of time-sliced clusters of time-course
    measurements (arrays)
  • Information Bottleneck: Parsimony
  • Example Kripke Models:
  • Spellman's Yeast Cell Cycle
  • SEB host-pathogen data from WRAIR
  • P. falciparum dataset: Bozdech et al., 1(1)085
  • Subset of Genome Module Map dataset: Segal et al.

60
Information Bottleneck
  • The computational principle uses Information
    Bottleneck Theory
  • $D_i$: the random variable whose samples are the
    rows of submatrix $D_i$
  • A sample corresponds to a gene expression vector
    of size $(n - k)$.
  • Using the GO biological process assignments for
    each gene, we posit a joint distribution $P(D_i, O)$,
    where $O$ is a second random variable whose sample
    space is the process ontology.
  • Cluster $D_i$ into disjoint clusters (without using
    $O$), effectively identifying a third random
    variable $X_i$ such that the mutual information
    between $D_i$ and $X_i$, $I(D_i; X_i)$, is minimized.
  • Further, relate the conditional distribution
    $P(O \mid X_i)$ with $P(O \mid X_{i+1})$, $1 \le i < k$,
    and with $P(O \mid X_{i-1})$, $1 < i \le k$. In the
    information bottleneck framework, the mutual
    information terms $I(O \mid X_i;\, O \mid X_{i+1})$ and
    $I(O \mid X_i;\, O \mid X_{i-1})$ must be
    simultaneously maximized.

61
Information Bottleneck
  • To construct the Hidden Kripke Model, the
    clusters and cluster-edges must optimize the
    mutual information terms
  • A variational approach with Lagrange multipliers
    captures the tradeoff between these
    objectives:
  • minimize
  • $I(D_i; X_i) - \beta_1 I(O \mid X_i;\, O \mid X_{i+1}) - \beta_2 I(O \mid X_i;\, O \mid X_{i-1})$
  • Notice that, conditional on $D_i$, $O$ is independent
    of $X_i$. This suggests an EM-style alternating
    algorithm:
  • First cluster each $D_i$ and identify connections
    across clusters in neighboring time points
  • Use these connections to derive new constraints
    on clustering, and re-cluster.
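Every term in the objective above is a mutual information. As a hedged illustration of that one building block (not of the full alternating algorithm), the sketch below computes the mutual information between two discrete variables from a joint probability table; the function name `mutual_information` and the toy distributions are assumptions.

```python
import math


def mutual_information(joint):
    """I(X;Y) in bits, for a joint distribution given as {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p      # marginal of X
        py[y] = py.get(y, 0.0) + p      # marginal of Y
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


# Toy joint distributions over (cluster, GO category); values are illustrative.
independent = {("c1", "t1"): 0.25, ("c1", "t2"): 0.25,
               ("c2", "t1"): 0.25, ("c2", "t2"): 0.25}
dependent = {("c1", "t1"): 0.5, ("c2", "t2"): 0.5}
print(mutual_information(independent), mutual_information(dependent))
```

A clustering whose cluster labels say nothing about GO categories gives a value near zero, while one whose clusters determine the category gives the maximum for that table.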

62
GOALIE Data Flow
63
Invariants Redescriptions
64
Statistical Tests
  • How do we associate a term (a GO category) to a
    cluster?
  • Fisher Exact Test
  • Used to determine whether or not nonrandom
    associations between two categorical variables
    exist
  • Null hypothesis generally assumes that there is
    no association
  • Actually a multivariate generalization of the
    hypergeometric test
  • Results of the test are p-values
  • Binomial Test
  • Used for large data sets
  • Results are approximated p-values
  • Previous work:
  • GoMiner, GOstats, CLASSIFI, GenMAPP, FatiGO

65
Implementation of Fisher's Exact Test
  • Let there exist an $m \times n$ observation matrix with
    entries $a_{ij}$
  • Sum the rows and the columns to obtain the row
    and column sums
  • Add these sums to obtain the total sum (first
    figure)
  • Calculate the conditional probability of getting
    the actual matrix given the particular row and
    column sums (second figure)
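The two figures referred to above are not in the transcript. In the standard statement of the test, the quantities are as follows; the symbols $R_i$, $C_j$, and $N$ are notation assumed here, not taken from the slides.

```latex
R_i = \sum_{j=1}^{n} a_{ij}, \qquad
C_j = \sum_{i=1}^{m} a_{ij}, \qquad
N = \sum_{i=1}^{m} R_i = \sum_{j=1}^{n} C_j

P_{\text{cond}} =
  \frac{\left(\prod_{i=1}^{m} R_i!\right)\left(\prod_{j=1}^{n} C_j!\right)}
       {N! \,\prod_{i=1}^{m}\prod_{j=1}^{n} a_{ij}!}
```

For a $2 \times 2$ table this reduces to the hypergeometric probability of the observed top-left entry given the margins.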

66
Implementation Continued
  • Once the conditional probability is found, all
    possible matrices of non-negative integers
    consistent with the row and column sums must be
    found
  • For each new matrix, the associated conditional
    probability must be calculated using the previous
    formula, where the sum of these probabilities
    must be 1
  • To compute the p-value of the test, the tables
    must then be ordered by some criterion that
    measures dependence
  • Those tables that represent equal or greater
    deviation from independence than the observed
    table are the ones whose probabilities are added
    together to determine the p-value.
  • This p-value is then compared to the original
    alpha-level to determine statistical significance

67
The Binomial Test and Drawbacks of the Fisher
Exact Test
  • For large $m \times n$ matrices, the Fisher Exact test
    becomes unwieldy and incredibly difficult to
    compute
  • The binomial test provides a substitute for this
    test in the presence of large amounts of data
  • The test measures whether the proportion of two
    categorical dependent variables significantly
    differs from a hypothesized proportion
  • The result is only an approximate p-value, but
    requires less computational time
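One way such a test could be applied to category enrichment is sketched below, as an assumption rather than the procedure used in GOALIE: treat each of the `n` genes in a cluster as an independent draw that falls in the category with the background proportion `p`, and ask how likely it is to see at least `k` annotated genes. The function name `binomial_upper_tail` and the example numbers are hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): a one-sided enrichment p-value."""
    return sum(
        comb(n, i) * p ** i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )


# Illustrative numbers only: 5 of 20 cluster genes carry a category that
# 10% of all genes on the array carry.
print(binomial_upper_tail(5, 20, 0.10))
```

The approximation comes from treating draws as independent (sampling with replacement), whereas the hypergeometric test accounts for sampling without replacement; the two agree closely when the cluster is small relative to the array.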

68
Controls over the False Discovery Rate
  • In order to avoid a lot of spurious positives,
    the α-level needs to be lowered to account for
    the number of comparisons being performed.
  • Corrections to avoid a large number of type I
    errors:
  • The Bonferroni correction, a simple yet highly
    conservative approach
  • The Benjamini-Hochberg Procedure

69
FDR Controls Continued
  • Benjamini and Hochberg Correction
  • This correction is becoming increasingly popular
    and provides just one alternative to the
    Bonferroni Correction
  • It provides strong control over the rate of false
    discovery under positive regression dependency of
    the null hypothesis

70
The Benjamini-Hochberg Procedure
  • Consider testing a set of hypotheses $H_1, \ldots, H_m$
    based on corresponding p-values $P_1, \ldots, P_m$
  • Also, impose an ordering on the p-values such
    that $P_{(1)} \le P_{(2)} \le \cdots \le P_{(m)}$, and denote by $H_{(i)}$ the null
    hypothesis corresponding to $P_{(i)}$
  • Define the following multiple testing procedure:
  • Let $k$ be the largest $i$ for which $P_{(i)} < (i/m)\,q$,
    where $q$ is some predefined limit
  • Then reject all $H_{(i)}$, $i = 1, 2, \ldots, k$
  • For independent test statistics and for any
    configuration of false null hypotheses, the above
    procedure controls the false discovery rate at $q$
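The procedure above can be sketched in a few lines. The function name `benjamini_hochberg` and the example p-values are assumptions for illustration; the comparison is written as a strict `<` to match the slide, although the procedure is often stated with `<=`.

```python
def benjamini_hochberg(p_values, q):
    """Return the indices of the hypotheses rejected at false discovery rate q."""
    m = len(p_values)
    # Indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] < rank / m * q:
            k = rank                     # remember the largest rank that passes
    # Reject the k hypotheses with the smallest p-values.
    return sorted(order[:k])


# Illustrative p-values only.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p_values, q=0.05))
```

Because `k` is the largest passing rank, a hypothesis can be rejected even when its own p-value fails its threshold, as long as a later-ranked p-value passes; this step-up behavior is what makes the procedure less conservative than Bonferroni.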

71
GOALIE Architecture
The overall GOALIE architecture. Several Common
Lisp modules have been developed to take care of
bits and pieces of the system. Several libraries
were also used in the process. The core is ANSI
Common Lisp; the user interface is CAPI/LispWorks.
72
Other Genomic Data Analysis Tools
73
BiNGO
  • Java-based tool that works in conjunction with
    Cytoscape, a software platform for visualizing
    molecular interaction networks
  • Determines statistically overrepresented
    categories in a set of genes
  • Allows for two statistical tests:
  • Hypergeometric test (nearly equivalent to the
    Fisher Exact Test)
  • Binomial test
  • Currently allows for only the most widely
    used/basic test corrections:
  • Bonferroni correction
  • Benjamini-Hochberg correction

74
BiNGO
  • BiNGO produces a color-coded graph which
    visualizes the GO categories that were found
    significantly over-represented
  • Size of the nodes is proportional to the number
    of genes in the test set which are annotated to
    that node
  • The color of the node represents the strength of
    the p-value
  • White: not significantly over-represented
  • All others are colored on a scale ranging from
    yellow to dark orange
  • Dark orange represents p-values that are 5 orders
    of magnitude smaller than the chosen significance
    level

75
Other Aspects of BiNGO
  • Allows for both Standard and Custom Annotations
    and Ontologies
  • Is able to provide both GO and GOSlim ontologies
  • Graphs can be saved and manipulated

76
GoMiner
  • A program package that calculates the enrichment
    or depletion of categories with genes that have
    changed expression
  • Takes as input two lists:
  • The total set on the array
  • The subset that the user flags as interesting
  • Displays the genes within the framework of the
    Gene Ontology hierarchy as a directed acyclic
    graph and a tree structure

77
GoMiner
  • Incorporates several external data resources
    including LocusLink, PubMed, MedMiner, GeneCards,
    NCBI's Structure Database, and BioCarta
  • Utilizes a two-sided Fisher Exact Test to
    determine the association between categories
  • Null hypothesis: for each category, there is no
    difference between the proportion of flagged
    genes that fall into the category and the
    proportion of flagged genes that do not fall into
    the category
  • Currently uses the Bonferroni Correction, but the
    authors are working on other approaches

78
CLASSIFI (Cluster Assignment for Biological
Inference)
  • Data-mining tool that identifies significant
    co-clustering of genes with similar functional
    properties
  • Uses a Hypergeometric Distribution Test to
    determine statistical significance
  • Orders ontologies based on the p-values
    determined from the test
  • Raw data must already be filtered, normalized,
    and clustered

79
CLASSIFI
  • Does not work well on small data sets
  • User must perform p-value corrections
  • Authors suggest the Bonferroni Correction
  • Generates 3 files
  • A GO file that contains the results from the
    automated annotation
  • An output file that contains all of the
    enumerated variables that were used in the
    hypergeometric test and the probability results
    from the calculation
  • A Top file that lists the GO IDs that give the
    lowest p-value in each of the gene clusters

80
dChip
  • Supports Gene Ontology
  • Allows time course data
  • Clusters not connected through time
  • Can filter data by gene annotations

81
STEM: Short Time-series Expression Miner
  • Limited to 8 time points (algorithm tailored
    specifically to small datasets)
  • Clusters not connected through time
  • Can compare experiments
  • Uses Gene Ontology database

82
Bozdech et al. P. falciparum analysis with GOALIE
  • A new, very preliminary, analysis with GOALIE
  • CAVEAT: the analysis was done in a few hours
    without input from a biologist
  • Datasets from:
  • "The Transcriptome of the Intraerythrocytic
    Developmental Cycle of Plasmodium falciparum", Z.
    Bozdech, M. Llinas, B. L. Pulliam, E. D. Wong, J.
    Zhu, and J. L. DeRisi, PLoS Biology,
    1(1): 085-100.
  • Dataset chosen for analysis was s003, which is
    described as the overview dataset.
  • GO associations were downloaded from the GO site
  • (http://www.geneontology.org/GO.current.annotations.shtml)
  • Annotations with GeneDB from Sanger Institute

83
Data preparation
  • The P. falciparum dataset contains 48 time-points
  • The analysis presented here arbitrarily groups
    the first 34 time points into windows of size 8,
    with two time points overlapping
  • Each time window was clustered separately using a
    k-means algorithm from the Genesis tool (Sturn et
    al., 2003-2005)

84
Examples Follow
  • The examples show chains where many DNA and RNA
    manipulation processes appear
  • The examples also show processes more
    mechanical in nature, like cell-cell adhesion

85
Cell-cell adhesion process
  • The cell-cell adhesion process becomes active
    between the second and third time window and
    remains active until the last one
  • The genes involved then start acting together
    with others involved in tRNA modification and drug
    susceptibility/resistance

86
tRNA modification
  • tRNA modification processes become active
    between the second and the third time window

87
DNA processes
  • In this last example we note how DNA
    (methylation and topological change) processes
    become active while drug susceptibility/resistance
    and cell-cycle related activities stop

88
Dana Scott: Advice on Modal Logic
  • We have to consider not only of the flow of time
    but also of alternative courses of events. No,
    come to think of it, that is not the answer
    either, for that only makes the individual
    concept t fatter but not more amusing. Or
    maybe it does.
  • (Oh my, I see that much more thought and
    experimentation are needed to make the ideas into
    something useful. In any case I feel that a
    precise and general semantical framework is
    essential, and that is, as I have been trying to
    indicate, now available.)

89
Dana Scott: Advice on Modal Logic
  • Postscript (December, 1969)
  • This paper was written very hastily in the latter
    part of May, 1968. The haste is apparent and the
    style intolerable; I find it now very painful
    reading.
  • Dana Scott, Advice on Modal Logic. In
    Philosophical Problems in Logic: Some Recent
    Developments, K. Lambert (ed.), pp. 143-173,
    Dordrecht: Reidel, 1970.

90
Processes Involved in Lumen Formation: Breast
Cancer
Architecture of Mammary Tissue: Brugge Lab
(Harvard)
91
Epithelial Cell Morphogenesis
92
Genes under transcriptional regulation during
lumen morphogenesis
93
Hypothesis Generation with GOALIE Analysis
  • What triggers the selective apoptosis of inner
    cells?
  • One hypothesis is that loss of adhesion can
    induce epithelial cell apoptosis. If true, what
    is the key player linking the two processes?
  • How were the different morphogenic changes
    coordinated?
  • What are the metabolic processes being regulated
    during morphogenesis? How are they
    coordinated with the major events (polarity,
    growth arrest, selective cell death)?
  • What is the difference between the cell adhesion
    program in 3D growth and 2D culture?

94
DEMO
96
Quine: Epistemology Naturalized
  • Philosophers have rightly despaired of
    translating everything into observational and
    logico-mathematical terms. Carnap and the other
    logical positivists of the Vienna Circle had
    already pressed the term "metaphysics" into
    pejorative use, as connoting meaninglessness; and
    the term "epistemology" was next. Wittgenstein
    and his followers, mainly at Oxford, found a
    residual philosophical vocation in therapy: in
    curing philosophers of the delusion that there
    were epistemological problems.

97
How to convert static network maps into dynamic
mathematical models
How to predict protein function ab initio
How to build hierarchical models across multiple
scales of time and space
How to reduce complex multi-dimensional models
to underlying principles
Glycolysis
SIMPATHICA
98
SimPathica System
99
Simpathica is a multi-functional system
100
Simpathica is a modular system
Canonical Form
  • Characteristics
  • Predefined Modular Structure
  • Automated Translation from Graphical to
    Mathematical Model
  • Scalability

101
Purine Metabolism
  • Purine Metabolism
  • Provides the organism with building blocks for
    the synthesis of DNA and RNA.
  • The consequences of a malfunctioning purine
    metabolism pathway are severe and can lead to
    death.
  • The entire pathway is almost closed but also
    quite complex. It contains
  • several feedback loops,
  • cross-activations and
  • reversible reactions
  • Thus is an ideal candidate for reasoning with
    computational tools.

102
Simple Model
103
Biochemistry of Purine Metabolism
  • The main metabolite in purine biosynthesis is
    5-phosphoribosyl-α-1-pyrophosphate (PRPP).
  • A linear cascade of reactions converts PRPP into
    inosine monophosphate (IMP). IMP is the central
    branch point of the purine metabolism pathway.
  • IMP is transformed into AMP and GMP.
  • Guanosine, adenosine and their derivatives are
    recycled (unless used elsewhere) into
    hypoxanthine (HX) and xanthine (XA).
  • XA is finally oxidized into uric acid (UA).

104
Purine Metabolism
105
Queries
  • Variation of the initial concentration of PRPP
    does not change the steady state:
    (PRPP = 10 * PRPP1) implies steady_state()
  • This query will be true when evaluated against
    the modified simulation run (i.e., the one where
    the initial concentration of PRPP is 10 times the
    initial concentration in the first run, PRPP1).
  • Persistent increase in the initial concentration
    of PRPP does cause unwanted changes in the steady
    state values of some metabolites.
  • If the increase in the level of PRPP is on the
    order of 70%, then the system does reach a steady
    state, and we expect to see increases in the
    levels of IMP and of the hypoxanthine pool of a
    comparable order of magnitude:
    Always (PRPP = 1.7 * PRPP1) implies steady_state()

TRUE
TRUE
106
Queries
  • Consider the following statement:
  • Eventually (Always (PRPP = 1.7 * PRPP1) implies
    steady_state()
    and Eventually (Always (IMP < 2 * IMP1))
    and Eventually (Always (hx_pool < 10 * hx_pool1)))
  • where IMP1 and hx_pool1 are the values observed
    in the unmodified trace. The above statement
    turns out to be false over the modified
    experiment trace.
  • In fact, the increase in IMP is about 6.5 fold
    while the hypoxanthine pool increase is about 60
    fold.
  • Since the above queries turn out to be false over
    the modified trace, we conclude that the model
    over-predicts the increases in some of its
    products and that it should therefore be amended

False
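Queries of this "Eventually (Always ...)" shape can be evaluated over a recorded simulation trace. The sketch below assumes a finite trace stored as a list of values and reads "always" as "at every remaining point of the trace"; it is not the Simpathica query engine, and the function names, the baseline value, and the trace values are all made up for illustration.

```python
def always(predicate, trace, start=0):
    """True if the predicate holds at every point of the trace from `start` on."""
    return all(predicate(value) for value in trace[start:])


def eventually_always(predicate, trace):
    """True if, from some point on, the predicate holds through the end of the trace."""
    return any(always(predicate, trace, t) for t in range(len(trace)))


# Hypothetical IMP concentrations; IMP1 stands for the unmodified-run value.
IMP1 = 1.0
modified_trace = [1.0, 3.0, 5.5, 6.5, 6.5]     # settles at about 6.5 times IMP1
settling_trace = [1.0, 2.5, 1.8, 1.5]          # dips back below twice IMP1

print(eventually_always(lambda imp: imp < 2 * IMP1, modified_trace))
print(eventually_always(lambda imp: imp < 2 * IMP1, settling_trace))
```

On a finite trace this is only an approximation of the temporal operators, since nothing can be said about behavior after the last recorded time point.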
107
Final Model
108
Purine Metabolism
109
A Tragedy, a Scandal or a Challenge
  • The lack of real contact between mathematics and
    biology is either a tragedy, a scandal or a
    challenge, it is hard to decide which.
  • Gian-Carlo Rota (1986, in Discrete Thoughts)

112
Thanks to
  • NYU Bioinformatics
  • Group
  • Anantharaman
  • Antoniotti
  • Daruwala
  • Mishra
  • Paxia
  • Collaborators
  • Bhardwaj
  • Brugge
  • Cordoza
  • Cantor
  • Demidov
  • Dynlacht
  • Fitch
  • Gimzewski
  • Gromov
  • Hubbard
  • Cheng
  • Gill
  • Heymann
  • Ionita
  • Kleinberg
  • Lord
  • Mysore
  • Rudkevich
  • Sun
  • Waltman
  • Jett
  • Lazebnik
  • Ostrer
  • Pagano
  • Parida
  • Ramakrishnan
  • Reed
  • Sobel
  • States
  • Teitell
  • Zhou
