Title: Reproducible Computational Experiments Using MADAGASCAR Software Package
1Reproducible Computational Experiments Using
MADAGASCAR Software Package
- Sergey Fomel
- Bureau of Economic Geology
- University of Texas at Austin
Applied Inverse Problems, Vancouver, BC, June 29, 2007
http://rsf.sf.net/
2Principles of Scientific Software
- Encapsulation
- File Formats
- Testing
- Reproducibility
- Maintenance
4Encapsulation
- Information hiding (Parnas, 1972)
- Separation of concerns (Dijkstra, 1974)
- Separate physics from mathematics
- A is physics
- Going from b to x is mathematics
5Example Velocity Transform
6Physics of Velocity Transform
9Encapsulation in Programming
- Separation of concerns
- Classes or templates (C++)
- Function pointers (C)
- Function interfaces (Fortran-90)
    /* initialize velocity transform (A) */
    veltran_init (true, x0, dx, nx, s0, ds, nv, o1, d1, nt, s02, anti, psun1, psun2);

    /* least-squares minimization of ||A x - b||^2, x=vscan, b=cmp */
    sf_solver (veltran_lop, sf_cgstep, ntv, ntx, vscan, cmp, niter,
               "err", error, "nmem", 0, "nfreq", miter, "mwt", mask, "end");
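The same separation can be sketched in Python: the solver below knows only mathematics (conjugate gradients for least squares), while the physics enters as a pair of callables. This is a minimal illustration, not the Madagascar implementation. The names `cgls`, `causint`, and `causint_adj` are hypothetical, and causal integration stands in for the velocity transform as a toy operator.

```python
from typing import Callable, List

Vector = List[float]
Operator = Callable[[Vector], Vector]


def dot(a: Vector, b: Vector) -> float:
    return sum(p * q for p, q in zip(a, b))


def cgls(forward: Operator, adjoint: Operator, b: Vector,
         nx: int, niter: int, tol: float = 1e-10) -> Vector:
    """Minimize ||A x - b||^2 by conjugate gradients (CGLS).

    The solver never looks inside A: it only calls forward (A)
    and adjoint (A'), so any operator pair can be plugged in.
    """
    x = [0.0] * nx
    r = list(b)              # residual b - A x for the starting x = 0
    s = adjoint(r)           # gradient direction A' r
    p = list(s)              # search direction
    gamma = dot(s, s)
    gamma0 = gamma
    for _ in range(niter):
        if gamma <= tol * tol * gamma0:
            break            # gradient is small enough
        q = forward(p)
        qq = dot(q, q)
        if qq == 0.0:
            break
        alpha = gamma / qq
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = adjoint(r)
        gamma_new = dot(s, s)
        beta = gamma_new / gamma
        p = [si + beta * pi for si, pi in zip(s, p)]
        gamma = gamma_new
    return x


# Toy "physics" (an assumption for illustration): causal integration.
def causint(x: Vector) -> Vector:
    out, total = [], 0.0
    for value in x:
        total += value
        out.append(total)
    return out


# Its adjoint: integration running backwards.
def causint_adj(y: Vector) -> Vector:
    out, total = [0.0] * len(y), 0.0
    for i in range(len(y) - 1, -1, -1):
        total += y[i]
        out[i] = total
    return out
```

Calling `cgls(causint, causint_adj, data, nx, niter)` with another operator pair requires no change to the solver, which is the point of the encapsulation.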
10Encapsulation in UNIX
- Write programs that do one thing and do it well.
- Write programs to work together.
- Write programs to handle text streams, because
that is a universal interface.
11Encapsulation in UNIX Shell
    bash$ sfveltran < cmp.rsf > vtran.rsf adj=y v0=1 dv=0.025 nv=60
    bash$ sfdottest sfveltran mod=vtran.rsf dat=cmp.rsf v0=1 dv=0.025 nv=60
    sfdottest:  L[m]*d=21665.9
    sfdottest: L'[d]*m=21665.9
    bash$ sfdottest sfveltran mod=vtran.rsf dat=cmp.rsf v0=1 dv=0.025 nv=60
    sfdottest:  L[m]*d=21906.2
    sfdottest: L'[d]*m=21906.2
    bash$ sfconjgrad sfveltran < cmp.rsf > vtran.rsf niter=3 v0=1 dv=0.025 nv=60
    sfconjgrad: iter 1 of 3
    sfconjgrad: grad=6.36797e+09
    sfconjgrad: iter 2 of 3
    sfconjgrad: grad=1.39068e+09
    sfconjgrad: iter 3 of 3
    sfconjgrad: grad=7.50257e+08
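The two numbers printed by each sfdottest run agree, which is what a dot-product test checks: for a linear operator L and its adjoint L', the inner products of L m with d and of m with L' d must match for arbitrary m and d. The sketch below shows the idea in Python. It assumes random inputs (which would also explain why the two runs above print different values), and the function names and the causal-integration operator are hypothetical stand-ins, not Madagascar code.

```python
import random


def dot_product_test(forward, adjoint, nm, nd, seed=None):
    """Return (<L m, d>, <m, L' d>) for random vectors m and d.

    The two values should agree to rounding error when adjoint
    really is the adjoint of forward.
    """
    rng = random.Random(seed)
    m = [rng.uniform(-1.0, 1.0) for _ in range(nm)]
    d = [rng.uniform(-1.0, 1.0) for _ in range(nd)]
    lhs = sum(a * b for a, b in zip(forward(m), d))
    rhs = sum(a * b for a, b in zip(m, adjoint(d)))
    return lhs, rhs


# Toy operator pair (an assumption for illustration): causal integration
# and its adjoint, integration running backwards.
def causint(x):
    out, total = [], 0.0
    for value in x:
        total += value
        out.append(total)
    return out


def causint_adj(y):
    out, total = [0.0] * len(y), 0.0
    for i in range(len(y) - 1, -1, -1):
        total += y[i]
        out[i] = total
    return out


if __name__ == "__main__":
    lhs, rhs = dot_product_test(causint, causint_adj, 50, 50)
    print(" L[m]*d=%g" % lhs)
    print("L'[d]*m=%g" % rhs)
```

Because the test needs only the forward and adjoint calls, it can be written once and applied to every operator, in the same way that sfdottest takes sfveltran as an argument.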
12Principles of Scientific Software
- Encapsulation
- File Formats
- Testing
- Reproducibility
- Maintenance
13The Art of UNIX Programming
- (Raymond, 2004)
- To design a perfect anti-Unix, make all file
formats binary and opaque, and require
heavyweight tools to read and edit them.
- If you feel an urge to design a complex binary
file format, or a complex binary application
protocol, it is generally wise to lie down until
the feeling passes.
14RSF (Regularly Sampled Format)
- SEPlib (Stanford Exploration Project)
- Data separated from text headers
- Conceptually N-dimensional hypercubes
- Multiple files for complex geometries
- Not application specific
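A text header that is separate from the data can be read with a few lines of code and no special tools. The sketch below assumes a header made of `key=value` tokens, with `n1`, `n2`, ... giving the hypercube dimensions and `in` giving the location of the binary data. The example header contents and the function names are hypothetical, for illustration only.

```python
import shlex

# Hypothetical header text, for illustration only.
EXAMPLE_HEADER = '''
n1=1000 d1=0.004 o1=0
n2=60 d2=0.025 o2=1
esize=4 data_format="native_float"
in="/tmp/cmp.rsf@"
'''


def parse_header(text):
    """Collect key=value tokens from a text header into a dict.

    A later assignment to the same key overrides an earlier one.
    """
    params = {}
    for line in text.splitlines():
        for token in shlex.split(line):
            key, sep, value = token.partition("=")
            if sep and key:
                params[key] = value
    return params


def hypercube_shape(params):
    """Return [n1, n2, ...] for as many axes as the header defines."""
    dims = []
    axis = 1
    while "n%d" % axis in params:
        dims.append(int(params["n%d" % axis]))
        axis += 1
    return dims


if __name__ == "__main__":
    header = parse_header(EXAMPLE_HEADER)
    print(hypercube_shape(header), header["in"])
```

Nothing in the parser refers to seismic data: the header describes a regularly sampled hypercube, which is what makes the format not application specific.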
15Principles of Scientific Software
- Encapsulation
- File Formats
- Testing
- Reproducibility
- Maintenance
16Testing
- Test-driven development (Beck, 2003)
- YAGNI principle
- Always implement things when you actually need
them, never when you just foresee that you need
them.
- In scientific software development, tests are
computational experiments
17Testing with SCons
- Software Construction
- Replacement for make
- reliable and extensible dependency analysis
- configuration files are Python scripts
- cross-platform
- open-source
18SConstruct File
    # Mobil AVO CMP gather 807 at well4 location
    Fetch('cmp807_raw.HH','rad')

    # Preprocessing
    Flow('cmp','cmp807_raw.HH',
         'dd form=native | tpow tpow=2 | mutter half=n v0=1.3 tp=0.2')
    Plot('cmp','grey title="Input CMP Gather" ')

    # Velocity Transform
    Flow('veltran','cmp','veltran s02=0.25 v0=1.250 dv=0.025 nv=60 adj=y')
    Plot('veltran','grey title="Velocity Scan" ')

    # Display Side by Side
    Result('veltran','cmp veltran','SideBySideAniso')
19Experimenting with SCons
    bash$ scons
    retrieve(["cmp807_raw.HH"], [])
    < cmp807_raw.HH sfdd form=native | sftpow tpow=2 | sfmutter half=n v0=1.3 tp=0.2 > cmp.rsf
    < cmp.rsf sfgrey title="Input CMP Gather" > cmp.vpl
    < cmp.rsf sfveltran s02=0.25 v0=1.250 dv=0.025 nv=60 adj=y > veltran.rsf
    < veltran.rsf sfgrey title="Velocity Scan" > veltran.vpl
    vppen yscale=2 vpstyle=n gridnum=2,1 cmp.vpl veltran.vpl > Fig/veltran.vpl
    bash$ sed s/Velocity/Slowness/ < SConstruct > SConstruct2
    bash$ mv SConstruct2 SConstruct
    bash$ scons
    < veltran.rsf sfgrey title="Slowness Scan" > veltran.vpl
    vppen yscale=2 vpstyle=n gridnum=2,1 cmp.vpl veltran.vpl > Fig/veltran.vpl
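In the second scons run only the plot whose title changed, and the figure that depends on it, are rebuilt. The data and the velocity transform are left alone. A minimal sketch of how such a decision can be made is shown below. It assumes that a target is rebuilt when its command or the content of any of its sources has changed since the last build. The function names and the in-memory `db` dictionary are hypothetical; this is not SCons code.

```python
import hashlib
import os


def signature(path):
    """Content signature of a file (SHA-256 of its bytes)."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def build(target, sources, command, action, db):
    """Run action only when the target is out of date.

    The target is up to date if it exists and both the command and
    the source signatures match what was recorded in db at the last
    build. Returns True if the action was run.
    """
    state = {
        "command": command,
        "sources": {src: signature(src) for src in sources},
    }
    if os.path.exists(target) and db.get(target) == state:
        return False
    action(target, sources, command)
    db[target] = state
    return True
```

Comparing content rather than timestamps means that touching a file without changing it triggers nothing, while editing a single parameter in a command rebuilds exactly the targets downstream of it.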
20Principles of Scientific Software
- Encapsulation
- File Formats
- Testing
- Reproducibility
- Maintenance
21Reproducible Research at Stanford
- (Knuth, 1992)
- A computer program should be written with human
readability as a primary goal.
- (Claerbout and Karrenbach, 1992)
- The purpose of reproducible research is to
facilitate someone going a step further by
changing something.
- (Buckheit and Donoho, 1995)
- An article about computational science in a
scientific publication is not the scholarship
itself, it is merely advertising of the
scholarship.
22Reproducible Experiments
- Within the world of science, computation is now
rightly seen as a third vertex of a triangle
complementing experiment and theory. However, as
it is now often practiced, one can make a good
case that computing is the last refuge of the
scientific scoundrel. ... Where else in science
can one get away with publishing observations
that are claimed to prove a theory or illustrate
the success of a technique without having to give
a careful description of the methods used, in
sufficient detail that others can attempt to
repeat the experiment? (LeVeque, 2006)
25Principles of Scientific Software
- Encapsulation
- File Formats
- Testing
- Reproducibility
- Maintenance
26Maintenance
- Computational experiments that are not
continuously maintained lose reproducibility.
- Regression testing (Brooks, 1975)
- Contribute computational software and experiments
to a community-maintained repository to enable
research productivity.
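One way to keep an experiment from silently losing reproducibility is a regression test that reruns it and compares the result with a stored reference. The sketch below assumes the result can be summarized as a list of numbers kept in a JSON file. The function name, the file layout, and the tolerance are hypothetical choices for illustration.

```python
import json
import math
import os


def check_against_reference(values, ref_path, rel_tol=1e-6):
    """Compare values with a stored reference.

    On the first run the reference file does not exist yet, so the
    values are recorded and the check passes. On later runs the check
    passes only if every value matches the reference within rel_tol.
    """
    if not os.path.exists(ref_path):
        with open(ref_path, "w") as handle:
            json.dump(values, handle)
        return True
    with open(ref_path) as handle:
        reference = json.load(handle)
    if len(values) != len(reference):
        return False
    return all(
        math.isclose(v, r, rel_tol=rel_tol, abs_tol=1e-12)
        for v, r in zip(values, reference)
    )
```

Running such a check on every change to the software turns each published experiment into a test of the whole package.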
27Open Science
28Conclusions
- Principles of Scientific Software
- Encapsulation
- File Formats
- Testing
- Reproducibility
- Maintenance
- Madagascar software package
- Open source, open community, open science