Title: little b, a language for building modular models
Slide 1: little b, a language for building modular models
sri international, palo alto, ca, weds feb 6, 2008
- aneil mallavarapu
- department of systems biology
- harvard medical school
Slide 2:
- today, models are monolithic and used only by a small cadre of computational biologists
- biological complexity requires detailed accounting at the level of mathematics
- how can models become a part of everyday scientific life, as gene sequences have become?
Slide 3: little b
- a high-level, terse, modular modeling language
- designed for specifying biological systems,
- and generating mathematical models
Slide 4: [diagram relating knowledge representation to analytic representation through little b]
KNOWLEDGE REPRESENTATION: molecules, reactions, cellular structure, pathways, mechanisms and kinetic constants
ANALYTIC REPRESENTATION: ode / simulation or steady state analysis, pde / simulation, stochastic, lattice, pi / model checking, matlab, analysis methods
Slide 5: representation high and low
- knowledge representation (biocyc, ingenuity, biopax): high-level specification for search and inference
- model specification languages (little b, bng, kappa, sbml level3): mid-level specification for model generation
- analytic languages (matlab, mathematica, pi calculus, maude, kappa, numerica, maple, etc.): low-level specification for model execution
Slide 6: motivations
Slide 7: motivations
- does order, distributivity, processivity matter?
Slide 8: models combine reactions and species in compartments
[diagram: species and reactions (E, P, X) grouped into models 1 and 2]
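Slide 9: translating to matlab code introduces further bookkeeping requirements
$d[E]/dt = -[E][P]\,k_{E+P}, \quad d[P]/dt = -[E][P]\,k_{E+P}, \quad d[X]/dt = [E][P]\,k_{E+P}$
$d[E]/dt = -[E][P]\,k_{E+P}, \quad d[S]/dt = -[E][S]\,k_{E+S}, \quad d[P]/dt = [E][S]\,k_{E+S} - [E][P]\,k_{E+P}, \quad d[X]/dt = [E][P]\,k_{E+P}$
$d[E]/dt = 0, \quad d[S]/dt = -[E][S]\,k_{E+S}, \quad d[P]/dt = [E][S]\,k_{E+S}$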
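The bookkeeping described on this slide can be delegated to a program. The following is a minimal Python sketch, not little b itself: the `Reaction` tuple, the `mass_action_odes` helper and the rate-constant names `k1` and `k2` are hypothetical names introduced for illustration. It derives mass-action right-hand sides from reaction lists, so combining two models becomes list concatenation rather than hand-editing equations.

```python
from collections import Counter, defaultdict, namedtuple

# Hypothetical representation: reactants, products, rate-constant name.
Reaction = namedtuple("Reaction", "reactants products k")

def mass_action_odes(reactions):
    """Map each species to its list of signed mass-action rate terms."""
    odes = defaultdict(list)
    for r in reactions:
        rate = "*".join([r.k] + [f"[{s}]" for s in r.reactants])
        net = Counter(r.products)
        net.subtract(Counter(r.reactants))   # net stoichiometric change
        for species, n in net.items():
            terms = odes[species]            # registers species even if n == 0
            if n:
                terms.append(f"{n:+d}*{rate}")
    return dict(odes)

model_1 = [Reaction(("E", "S"), ("E", "P"), "k1")]  # E converts S to P
model_2 = [Reaction(("E", "P"), ("X",), "k2")]      # E and P combine into X

combined = mass_action_odes(model_1 + model_2)
for species, terms in combined.items():
    print(f"d[{species}]/dt =", " ".join(terms) or "0")
```

In `model_1` alone the enzyme has no net change, so its term list is empty and it prints as `0`. Once `model_2` is appended, the same species picks up a consumption term without any equation being edited by hand.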
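Slide 10: assumptions affect mathematical structure
1. mechanism (e.g., steps)
$d[E]/dt = 0, \quad d[S]/dt = -[E][S]\,k_{E+S}, \quad d[P]/dt = [E][S]\,k_{E+S}$
$d[E]/dt = [ES]\,k_{ES} - [E][S]\,k_{E+S}, \quad d[S]/dt = -[E][S]\,k_{E+S}, \quad d[P]/dt = [ES]\,k_{ES}, \quad d[ES]/dt = [E][S]\,k_{E+S} - [ES]\,k_{ES}$
6 changes + 3 = total of 9 changes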
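Slide 11: assumptions affect mathematical structure (2)
2. kinetics
$m_1 R_1 + m_2 R_2 + \dots + m_n R_n \rightarrow\ ?$
$d[E]/dt = [ES]\,k_{ES} - [E]\,([S]/(K+[S]))^{h_{ES}}, \quad d[S]/dt = -[E]\,([S]/(K+[S]))^{h_{ES}}, \quad d[P]/dt = [ES]\,k_{ES}, \quad d[ES]/dt = [E]\,([S]/(K+[S]))^{h_{ES}} - [ES]\,k_{ES}$
mass-action:
$d[E]/dt = [ES]\,k_{ES} - [E][S]\,k_{E+S}, \quad d[S]/dt = -[E][S]\,k_{E+S}, \quad d[P]/dt = [ES]\,k_{ES}, \quad d[ES]/dt = [E][S]\,k_{E+S} - [ES]\,k_{ES}$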
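One way to see why a kinetic assumption should be a swappable part is to keep the mechanism fixed and pass the rate law in as a function. This is a hedged Python sketch, not little b code: `mass_action`, `saturating` and `rhs` are hypothetical names, and the saturating form `E * (S / (K + S)) ** h` is an assumed reading of the slide's second rate law.

```python
from functools import partial

def mass_action(E, S, k):
    """Binding rate proportional to both concentrations."""
    return k * E * S

def saturating(E, S, K, h):
    """Assumed saturating rate law: E * (S / (K + S)) ** h."""
    return E * (S / (K + S)) ** h

def rhs(state, binding, k_cat):
    """Two-step mechanism E + S -> ES -> E + P with a pluggable binding law."""
    E, S, ES, P = state
    bind = binding(E, S)      # rate of complex formation
    cat = k_cat * ES          # rate of product release
    return (cat - bind, -bind, bind - cat, cat)  # dE, dS, dES, dP

state = (1.0, 2.0, 0.5, 0.0)
print(rhs(state, partial(mass_action, k=3.0), k_cat=2.0))
print(rhs(state, partial(saturating, K=2.0, h=2), k_cat=2.0))
```

Only the `binding` argument changes between the two calls. The four equations are never rewritten, which is the property the slide is asking for.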
Slide 12: today the modeling community thinks about
- representation
- analytic methods
- semantics
- syntax
Slide 13: modeling language ilities
- readability
- writeability
- shareability
- reusability
- modularity
- composability
- extensibility
- verifiability
- affordability
- adaptability
- dependability
- simplicity
Slide 14: "Programs must be written for people to read, and only incidentally for machines to execute."
- Abelson and Sussman, Structure and Interpretation of Computer Programs
Slide 15: toy egf receptor model - parts
[diagram labels: egf, egfr, egfr-egf complex, and two forms each of mapkkk, mapkk and mapk]
Slide 16: the awesome power of lisp macros: code which writes code
- develop concise notations for particular purposes
Slide 17: toy egf receptor model - reactions
[diagram labels: egf, egfr, egfr-egf complex, and two forms each of mapkkk, mapkk and mapk]
Slide 18: toy egf receptor model - modular mechanism
enzymatic-reaction mapkk mapk mapk (irreversible)
enzymatic-reaction mapkk mapk mapkk (reversible, irreversible), via the intermediate ES
enzymatic-reaction generates reaction-types on your behalf
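The idea of a mechanism that generates reaction-types on your behalf can be sketched outside little b. The Python function below is an illustration only: its name mirrors the slide, but its signature, the `enzyme:substrate` complex naming and the `mapk-p` product name are assumptions, not little b's API.

```python
ARROWS = {"reversible": "<->", "irreversible": "->"}

def enzymatic_reaction(enzyme, substrate, product, steps=("irreversible",)):
    """Expand one mechanism description into its elementary reactions."""
    start = f"{enzyme} + {substrate}"
    end = f"{enzyme} + {product}"
    if len(steps) == 1:
        stages = [start, end]                              # direct conversion
    elif len(steps) == 2:
        stages = [start, f"{enzyme}:{substrate}", end]     # via an ES complex
    else:
        raise ValueError("this sketch supports one or two steps")
    return [f"{a} {ARROWS[kind]} {b}"
            for a, kind, b in zip(stages, steps, stages[1:])]

print(enzymatic_reaction("mapkk", "mapk", "mapk-p"))
print(enzymatic_reaction("mapkk", "mapk", "mapk-p",
                         steps=("reversible", "irreversible")))
```

Switching from the one-step to the two-step mechanism is a change to one argument. The intermediate complex and the extra reaction are produced by the function rather than written by hand.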
Slide 19: modular kinetics
Slide 20: in the future we can imagine
- libraries of such components have been previously defined by experts, and are available over the web, in a database in your lab, or in your own personal collection
- b enables these parts to be combined
Slide 21: let's describe a situation composed of predefined parts
[diagram: cell-a in a dish]
Slide 22: a base language and libraries
common lisp (ANSI X3J13)
Slide 23: little b builds symbolic mathematical expressions
[diagram: cell-a in a dish]
- object-oriented syntax meets symbolic math
- enables programmers and theorists to write and debug functions which translate between the world of objects and the world of mathematical expressions
Slide 24: print-eval consistency: objects print as code
- fragments of expressions can be copied, pasted, evaluated
- a useful manipulation and inspection capability
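The same property can be illustrated in Python, where a dataclass prints as a constructor call that evaluates back to an equal object. This is an analogy rather than little b's mechanism, and the `Species` class and its fields are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Species:
    name: str
    location: str

s = Species("mapk", "cell-a")
text = repr(s)        # the printed form is itself a code fragment
copy = eval(text)     # evaluating the printed form rebuilds the object
print(text, copy == s)
```

Because the printed form is valid code, a value seen in a session can be pasted back into an expression and inspected or modified there.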
Slide 25: set initial conditions and perform numerical integration in matlab
Slide 26: extend the model with a phosphatase
[diagram labels: mapkkk, mapkk, and two forms of mapk]
Slide 27: recap
- shareable, modular biochemical models
- symbolic language for describing and manipulating objects
- symbolic math system brings mathematics and objects together
- a terse, readable, writeable format
- in-memory database and reasoning capability
Slide 28: Section II
Slide 29: molecular complexes
- graphs specify physical connectivity
- built from components called monomers
(defmonomer erb L D C)  ; a monomer with 3 sites
Slide 30: bond sites connect
(defmonomer erb L D C)
site bindings can be given by order, by name, or implicitly:
- [erb 1 1 _]
- [erb L.1 D.1 C._]
- [R L.1 D.1]
- [R 1 1]
- [erb _ 1 1] = [erb c.1 a._ b.1]
Slide 31: connecting monomers
[diagram: erb with sites L, D, C]
Slide 32: connecting monomers
(defmonomer egf R)
Slide 33: specifying classes of reactions with patterns using the wildcard *
{[erb L._ D.* C.*] + [egf R._] ->> [[erb L.1 D.* C.*][egf R.1]]}
Slide 34: the rest-bindings , __
{[erb L._ D.* C.*] + [egf R._] ->> [[erb L.1 D.* C.*][egf R.1]]}
{[erb L._] + [egf R._] ->> [[erb L.1][egf R.1]]}
Slide 35: state sites encode state
(defmonomer mapk
  (b :documentation "binding site")
  (p :states (member u p)
     :documentation "phosphorylation site"))
[mapk _ u] = [mapk p.u]
[mapk _ p] = [mapk p.p]
[diagram: mapk with binding site b and state site p, drawn once in state u and once in state p]
Slide 36: Four members of the ERB family
- bind ligands
- form homo- and hetero-dimers
- complex with internal components and external components
- internalized into subcellular compartments
[diagram: ligands hrg and egf; receptors erbb1, erbb2, erbb3, erbb4]
- carlos lopez (sorger lab)
- will chen (sorger lab)
Slide 37: with-substitution-table
- define 4 erb receptors with common structure
(with-substitution-table
    (NAME erbb1 erbb2 erbb3 erbb4)
  (defmonomer NAME L D C))
- Expands to
(progn (defmonomer erbb1 L D C)
       (defmonomer erbb2 L D C)
       (defmonomer erbb3 L D C)
       (defmonomer erbb4 L D C))
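The expansion above can be mimicked with a few lines of Python to show what the macro does. This is a rough sketch under stated assumptions: `substitute` is a hypothetical helper, and it works by naive text replacement, whereas a Lisp macro rewrites the code tree itself.

```python
def substitute(table, template):
    """Instantiate `template` once per row of the substitution table."""
    names = list(table)
    forms = []
    for row in zip(*table.values()):          # one row per set of values
        form = template
        for name, value in zip(names, row):
            form = form.replace(name, value)  # textual stand-in for macro rewriting
        forms.append(form)
    return forms

table = {"NAME": ["erbb1", "erbb2", "erbb3", "erbb4"]}
for form in substitute(table, "(defmonomer NAME L D C)"):
    print(form)
```

Adding a second column to `table` would substitute several placeholders per row, which is how a single template can stamp out a family of related definitions.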
Slide 38: receptor dimerization
Slide 39: with-data-table
(with-data-table (:rows (R1 L) :cols R2 :cells (Kf Kr) :ignore _)
    ((             erbb1    erbb2    erbb3    erbb4)
     ((erbb1 egf)  (.1 .3)  (.1 .3)  (.2 .7)  (.4 .01))
     ((erbb3 hrg)  (.2 .2)  _        (.1 .1)  (.1 .1))
     ((erbb4 hrg)  (.3 .7)  (.1 .7)  (.4 .6)  (.8 .1)))
  {[[R1 L.1][L R.1]] + [R2 D._ __] <<->>
   [[R1 L.1 D.2][L R.1][R2 D.2 __]]}
  :documentation "Receptor-ligand Binding"
  (.set-rate-function 'mass-action :fwd Kf :rev Kr))
- Expands to 11 reversible reactions
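The count of 11 follows directly from the table: three rows of four cells, minus the one ignored cell. A small Python sketch of the same expansion makes that explicit. The dictionary layout, the `IGNORE` marker and the `a:b` complex notation are illustrative assumptions, not little b syntax.

```python
IGNORE = "_"   # cells marked this way produce no reaction
cols = ["erbb1", "erbb2", "erbb3", "erbb4"]
rows = {
    ("erbb1", "egf"): [(.1, .3), (.1, .3), (.2, .7), (.4, .01)],
    ("erbb3", "hrg"): [(.2, .2), IGNORE,   (.1, .1), (.1, .1)],
    ("erbb4", "hrg"): [(.3, .7), (.1, .7), (.4, .6), (.8, .1)],
}

reactions = []
for (r1, ligand), cells in rows.items():
    for r2, cell in zip(cols, cells):
        if cell == IGNORE:
            continue
        kf, kr = cell   # forward and reverse rate constants for this pair
        reactions.append({
            "reaction": f"{r1}:{ligand} + {r2} <-> {r1}:{ligand}:{r2}",
            "fwd": kf,
            "rev": kr,
        })

print(len(reactions))
```

Each table cell carries only the numbers that differ between reactions, while the reaction pattern and rate function are written once.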
Slide 40: lisp is the metalanguage
erbb1.documentation => (FLD ERBB1 DOCUMENTATION)
erbb1.(in cell.membrane) => (FLD ERBB1 IN (FLD CELL MEMBRANE))
[[erb D.1][erb D.1]] => (OBJECT COMPLEX-SPECIES-TYPE (QUOTE (ERB (FLD D 1)) (ERB (FLD D 1))))
{A B} => (MATH (OP A B))
Slide 41: 377 species-types, 862 reaction-types
Schoeberl et al., Nature Biotech 2002
Slide 42: [diagram of the erbb model. labels: egf, hrg, erbb 1-4, ras, Pip3, shc, PI3K, grb, sos, raf, dep1, phosphatases, vesicle, endosome, mek, ptp1b, mkp, kinase cascade, erk, pdk1, pdk2, akt]
w/ Carlos Lopez, Will Chen, Peter Sorger
- 19 user-specified monomers
- 29 user-specified reaction-patterns
- 442 lines (incl. comments, spaces), 6.5 pages of code
- 247 complex-reaction-types
- 742 species-types (complexes)
- 4947 reaction-types
- 975 species
- 10,187 lines of Matlab code
Slide 44: Inspect and verify using queries
Slide 45: visualization
B-USER> [[erb D.1][erb D.1 L.2][egf 2]].show
Slide 46: Section III
Slide 47: multi-compartmental reactions
[diagram: cell-a and cell-b in a dish]
Slide 48: multicellular / multicompartmental
Slide 49: von dassow et al. wrote ingeneue software to investigate multicellular models
- topology is relatively robust to parametric variation
- 49 params
- 1/200 randomly chosen parameter sets produce pattern successfully
"The segment polarity network is a robust developmental module" - von Dassow et al., 2000
Slide 50: accounting for realistic cellular lattices
- ingeneue's modulo-6 arithmetic: S(c,m) reacts with S(c1,mod(m6,6))
- representing this realistic lattice requires reasoning about geometry and computing location-sensitive identities of species
matt thomson
Slide 51: little b symbolic math subsystem enables some simplification and manipulation
- units, dimensions
- quantities, gaussian distributions
- polynomial, rational-polynomial and radical
expressions
Slide 52: an extensible system of units and dimensions
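To make the value of tracking units and dimensions concrete, here is a minimal Python sketch of dimension checking. It is not little b's implementation: the `Quantity` class, the `unit` helper and the dimension names `conc` and `time` are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    value: float
    dims: tuple  # sorted tuple of (base dimension, exponent) pairs

    def __mul__(self, other):
        exps = dict(self.dims)
        for dim, exp in other.dims:
            exps[dim] = exps.get(dim, 0) + exp    # exponents add under multiplication
        dims = tuple(sorted((d, e) for d, e in exps.items() if e))
        return Quantity(self.value * other.value, dims)

    def __add__(self, other):
        if self.dims != other.dims:               # only like dimensions may be added
            raise TypeError(f"dimension mismatch: {self.dims} vs {other.dims}")
        return Quantity(self.value + other.value, self.dims)

def unit(value, **dims):
    return Quantity(value, tuple(sorted(dims.items())))

k = unit(2.0, conc=-1, time=-1)   # second-order rate constant
E = unit(0.5, conc=1)
S = unit(3.0, conc=1)
rate = k * E * S                  # concentration per unit time
print(rate)
```

With this in place, adding `rate + E` raises a `TypeError` instead of silently producing a meaningless number. That is the kind of error a units-aware modeling language can catch before any simulation runs.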
Slide 53: serious mathematics
- Maxima 5.12
- GPL fork of DOE Macsyma (developed at MIT between 1967-1982), under continuous public development since then
- defined the approach used by Mathematica, Maple and other symbolic tools
- Common Lisp
- plotting, simplification, polynomials, elliptic fns, limits, integration, differentiation, statistics, matrices, linear algebra, affine, tensor, series and more
- Fortran numerical libraries
- f2cl translator (used in Maxima)
- ODEPACK/DAEPACK (numerical ode)
- LAPACK/BLAS (linear algebra)
- SLICOT (control theory)
ben ullian
Slide 54: little b as service
[diagram labels: matlab, octave, scipy; little b via JSON; tcp/ip interface; little b]
(query erbb1 mapk )
[[erbb1 _ 1][mapk 1 p]] [[erbb1 _ 1][mapk 1 u]] [[erbb1 1 2][egf 1][mapk 2 u]]
Slide 55: future directions
- maxima integration
- new analytic approaches
- graph-rule-based stochastic modeling
- spatial stochastic lattices
- continuous spatial models (PDEs)
- discrete (boolean) networks
- hybrid models (mixed continuous/discrete)
- pathway libraries (e.g., of specific signal transduction molecules)
- generic multicellular systems (e.g., vesicular trafficking)
- model reduction
- infrastructure
- graphical display
- curation site / model parts database
- output to mathml/sbml/mathematica/
Slide 56: little b/lisp as metalanguage
- programmable syntax
- programmable semantics
- symbolic computing
- custom notation or even mini-languages
Slide 57: how did we do on the ilities?
- readability
- writeability
- shareability
- reusability
- modularity
- composability
- extensibility
- verifiability
- affordability
- adaptability
- dependability
- simplicity
Slide 58: greenspun's 10th rule
"Any sufficiently complicated C or Fortran program contains a slow, bug-ridden implementation of half of Common Lisp"
- phil greenspun
Slide 59: acknowledgements
- erbb modelling: carlos lopez, will chen, peter sorger
- segment polarity network: matt thomson
- multisite phosphorylation: jeremy gunawardena, german enciso
- vesicular modules: felix bonowski
- allostery model: david croll
- maxima: ben ullian
- visualization: albert krewinkel
- biomodels wiki: jeremy muhlich
- LISA: david young
- Lisp support: dave fox
Slide 60: language as currency
- currency permits exchange
- a history of the universal equivalent
- goats, beads, shells, gold, money
- language is the currency of ideas
- modeling languages are the currency of formalized scientific ideas
- diagrams, ode models, pi calculus, xml
Slide 61: "I think conventional languages are for the birds. They're just extensions of the von Neumann computer, and they keep our noses in the dirt of dealing with individual words and computing addresses, and doing all kinds of silly things like that, things that we've picked up from programming for computers; we've built them into programming languages; we've built them into Fortran; we've built them in PL/1; we've built them into almost every language."
- john backus, turing award lecture