GENIEfy: Collaborative study of GENIE Earth System Models on the Grid


Title: GENIEfy: Collaborative study of GENIE Earth System Models on the Grid


1
GENIEfy: Collaborative study of GENIE Earth
System Models on the Grid
  • NERC Annual eScience Meeting
  • 26th-27th April 2006
  • Simon Cox
  • Southampton Regional e-Science Centre

2
The GENIE / GENIEfy Team
  • Principal Investigator - GENIEfy
  • Tim Lenton UEA Norwich
  • Research Team and Collaborators
  • James Annan FRSGC, Japan
  • Chris Armstrong Manchester
  • Chris Brockwell UEA Norwich
  • David Cameron CEH Edinburgh
  • Peter Cox Hadley Centre (UKMO)
  • Neil Edwards Open University
  • Sudipta Goswami UEA Norwich
  • Robin Hankin NOC
  • Julia Hargreaves FRSGC, Japan
  • Phil Harris CEH Wallingford
  • Zhuoan Jiao Southampton e-Science Centre
  • Eleftheria Katsiri London e-Science Centre
  • Valerie Livina UEA Norwich
  • Dan Lunt Bristol
  • Richard Myerscough NOC
  • Principal Investigator - GENIE
  • Paul Valdes Bristol
  • Co-Investigators / Management team
  • Peter Challenor NOC
  • Trevor Cooper-Chadwick Southampton e-Sci. Centre
  • Simon Cox Southampton e-Sci. Centre
  • John Darlington London e-Science Centre
  • Rupert Ford Manchester
  • Eric Guilyardi Reading
  • John Gurd Manchester
  • Richard Harding CEH Wallingford
  • Robert Marsh NOC
  • Tony Payne Bristol
  • Graham Riley Manchester
  • John Shepherd NOC
  • Rachel Warren UEA Norwich
  • Andrew Watson UEA Norwich

3
GENIE / GENIEfy
GENIEfy Grid ENabled Integrated Earth System
Model for the Community
  • The GENIE project has developed a Grid-based
    system to
  • Flexibly couple together state-of-the-art
    components to form a unified Earth system model
  • Execute the resulting model on the Grid
  • Share the distributed data produced in
    simulations
  • Provide high-level open access to the system,
    creating and supporting virtual organisations of
    Earth system modellers

4
Scientific Aims
  • Orbital parameters affect incident radiation and
    climate
  • Biological and geological processes interact
    with, and feed back upon, the climate (via, for
    instance, CO2)

5
The target GENIE Model
6
Flexible modelling framework
  • Modularity
  • Swappable components throughout
  • e.g. Atmosphere: 2D Energy-Moisture Balance Model
    or 3D Intermediate GCM
  • Scalability
  • Variable resolution
  • e.g. Ocean: 18x18, 36x36, 72x72; 8-32 depth
    layers
  • Traceability
  • Common physics when resolution is varied
  • Where a process is not resolved, parameterise it
    based on a resolution that does resolve it
    reasonably

7
GENIE Science achievements
  • Scientific outcomes from the first phase of the
    GENIE project

8
GENIE achievements
  • Ground-breaking science with modularised Earth
    system models
  • Stability of the ocean thermohaline circulation
    to multiple forcings
  • Transient simulations from last glacial maximum
    to pre-industrial
  • Long-term climate change projections with a
    closed carbon cycle
  • Using the power of the Grid to enable the
    science
  • Extensive exploration of model parameter space
  • Automated simultaneous tuning of multiple model
    parameters
  • Data assimilation and characterisation of model
    uncertainty
  • Practical software engineering
  • Version control
  • Build-test-deploy cycle
  • Nightly build and test

9
Bi-Stability of the Thermohaline Circulation (THC)
  • Figure: ON and OFF states of the THC at a single point in model parameter
    space that is sensitive to initial conditions
  • Panels: Atlantic Meridional Overturning Circulation, MOC (Sv); Annual
    Average Air Temperature Difference (K), ON - OFF
  • How close are we to collapse of the thermohaline circulation?
Marsh, R. J. et al. (2004) Climate Dynamics
10
THC bi-stability search
  • Vary 2 parameters controlling N. Atlantic freshwater supply
  • X: Anomaly in Atlantic to Pacific atmospheric moisture transport (DFWX)
  • Y: Atmospheric moisture diffusivity (DIFF), controls Equator to Pole
    transport
  • Top panel shows results of 961 x 4000 yr simulations
  • Bottom panel shows range from 9 restart experiments
  • Total 40 Myr of simulations
  • Figure labels: ON, OFF, Current state, Atmosphere more diffusive,
    Atmosphere less diffusive, Narrow region of bi-stability
Marsh, R. J. et al. (2004) Climate Dynamics
11
Data assimilation
James Annan, Julia Hargreaves
  • Ensemble Kalman Filter (EnKF)
  • A sample of the posterior probability
    distribution defined by prior beliefs and climate
    observations
  • Data
  • World Ocean Atlas temperature and salinity
  • NCEP reanalysis surface air temperature, humidity
  • THC strength and heat transport at 25°N
  • The system is close to a non-linear threshold
  • Initial conditions are uncertain
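As a rough illustration of the Ensemble Kalman Filter idea named on this slide, the sketch below applies one perturbed-observation analysis step to a scalar ensemble. It is not the GENIE implementation: the prior, the observation value (3.0), the observation error variance (0.25) and the helper name `enkf_update` are all assumptions made for the example.

```python
import random
import statistics

def enkf_update(ensemble, observation, obs_var, rng):
    """One perturbed-observation EnKF analysis step for a scalar state."""
    prior_var = statistics.variance(ensemble)   # sample variance of the prior ensemble
    gain = prior_var / (prior_var + obs_var)    # scalar Kalman gain
    updated = []
    for member in ensemble:
        # Each member is nudged towards its own noisy copy of the observation.
        perturbed_obs = observation + rng.gauss(0.0, obs_var ** 0.5)
        updated.append(member + gain * (perturbed_obs - member))
    return updated

rng = random.Random(0)                               # fixed seed for repeatability
prior = [rng.gauss(0.0, 1.0) for _ in range(200)]    # hypothetical prior beliefs
posterior = enkf_update(prior, observation=3.0, obs_var=0.25, rng=rng)

prior_mean = statistics.mean(prior)
post_mean = statistics.mean(posterior)
prior_spread = statistics.variance(prior)
post_spread = statistics.variance(posterior)
```

The updated ensemble is a sample that sits between the prior and the observation, with a smaller spread than the prior; in a parameter-estimation setting the state would also carry the uncertain model parameters.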

12
GENIE ensembles compared to CMIP
13
Unpredictability in THC response
Data assimilated 54 member ensemble, 3.3 x CO2
87
Model with enhanced hydrological sensitivity
14
GENIE e-Science Tools
  • Software deployed within the GENIE framework to
    exploit Grid technology

15
Geodise Toolboxes
  • Geodise Compute Toolbox
  • Grid access from the Desktop
  • Matlab and Jython interfaces
  • Globus and Condor support
  • Geodise Database Toolbox
  • Associate metadata with data
  • Programmatic and GUI access
  • OptionsMatlab
  • Engineering Design Optimisation
  • Suite of multi-dimensional optimisation algorithms

16
Grid Computation
Institutional Resources (GT2)
National Grid Service (GT2)
17
Need Metadata
18
Data Management System
19
GENIE Toolbox
  • Provide functions to manage generic time-stepping
    codes on the Grid
  • User provides
  • Archive containing the model binary and input data
  • Metadata describing the model configuration
  • Metadata specifying a compute resource
  • Client manages
  • Model configuration, transfer and job submission
  • Job monitoring
  • Data retrieval and upload to database

20
Client Session Job submission
21
Tuning GENIE Models
  • Exploiting e-Science tools to tune the free
    parameters of models in the GENIE framework

22
Scripted Tuning Study
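Matlab >> optimum = fminsearch(@genie, params, )
GENIE Database
function error = genie(params)
  % Query the database for this parameter set
  error = gd_query(params)
  % Describe the model configuration as metadata
  metadata.param1 = params(1)
  metadata.param2 = params(2)
  % Submit the job, wait for it, then retrieve and archive the result
  [handle, retrieve] = gc_jobsubmit(metadata, runtime, resource)
  gd_jobpoll(handle)
  error = gc_jobretrieve(retrieve)
  gd_archive(metadata, error)
  return error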
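The pattern on this slide (query the database first, run the model only when no archived result exists, then archive the new result) can be sketched as follows. This is a minimal illustration, not the Geodise API: the database is a plain dictionary, `run_model` is a cheap analytic stand-in for a Grid job, and `make_objective` is a hypothetical helper.

```python
def run_model(params):
    """Stand-in for a Grid job: a cheap analytic 'model error' (hypothetical)."""
    a, b = params
    return (a - 1.0) ** 2 + (b + 2.0) ** 2

def make_objective(database, model, counter):
    """Wrap `model` so that repeated parameter sets are answered from `database`."""
    def objective(params):
        key = tuple(params)            # metadata identifying this configuration
        if key in database:            # query: has this run been archived already?
            return database[key]
        counter["runs"] += 1           # a real client would submit and poll a job here
        error = model(params)
        database[key] = error          # archive the result against its metadata
        return error
    return objective

database = {}
counter = {"runs": 0}
objective = make_objective(database, run_model, counter)

first = objective([0.0, 0.0])     # triggers a model run
second = objective([0.0, 0.0])    # served from the database, no new run
third = objective([1.0, -2.0])    # new parameters, new run
```

Because every evaluation is recorded, an optimiser that revisits a point, or a second user repeating the study, does not pay for the same model run twice.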
23
1D and 2D Optimisation
% Specify a starting point
parameters = 0.5
% Perform the minimisation
optimum = fminsearch(@cgoldstein_1D, parameters, optimisation_parameters)
% Specify a starting point
parameters = [420 5000000]
% Perform the minimisation
optimum = fminsearch(@cgoldstein_2D, parameters, optimisation_parameters)
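Matlab's fminsearch is a derivative-free Nelder-Mead simplex search. The sketch below is a simplified pure-Python version of that family of method, shown only to make the 1D and 2D calls above concrete. The two objective functions are invented stand-ins for the cGOLDSTEIN misfit (their minima at 2.0 and at (500, 4e6) are assumptions), and the 5% initial perturbation and the stopping rule are choices made for this sketch.

```python
def nelder_mead(func, start, rel_step=0.05, tol=1e-8, max_iter=500):
    """Minimise func(list_of_floats) with a basic Nelder-Mead simplex search."""
    n = len(start)
    # Initial simplex: the start point plus one point per perturbed coordinate.
    simplex = [list(start)]
    for i in range(n):
        point = list(start)
        point[i] += rel_step * point[i] if point[i] != 0 else rel_step
        simplex.append(point)
    values = [func(p) for p in simplex]

    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        best, worst = simplex[0], simplex[-1]
        # Stop once every vertex is within a relative distance tol of the best.
        size = max(abs(p[j] - best[j]) / max(abs(best[j]), 1.0)
                   for p in simplex[1:] for j in range(n))
        if size < tol:
            break
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        f_reflected = func(reflected)
        if f_reflected < values[0]:
            # Reflection beat the best point: try going further (expansion).
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            f_expanded = func(expanded)
            if f_expanded < f_reflected:
                simplex[-1], values[-1] = expanded, f_expanded
            else:
                simplex[-1], values[-1] = reflected, f_reflected
        elif f_reflected < values[-2]:
            simplex[-1], values[-1] = reflected, f_reflected
        else:
            # Contract the worst point towards the centroid.
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            f_contracted = func(contracted)
            if f_contracted < values[-1]:
                simplex[-1], values[-1] = contracted, f_contracted
            else:
                # Shrink every vertex towards the current best.
                simplex = [best] + [[b + 0.5 * (x - b) for b, x in zip(best, p)]
                                    for p in simplex[1:]]
                values = [values[0]] + [func(p) for p in simplex[1:]]
    best_index = min(range(n + 1), key=lambda i: values[i])
    return simplex[best_index], values[best_index]

def misfit_1d(p):
    """Hypothetical 1D model error with its minimum at 2.0."""
    return (p[0] - 2.0) ** 2

def misfit_2d(p):
    """Hypothetical 2D model error with its minimum at (500, 4e6)."""
    return ((p[0] - 500.0) / 500.0) ** 2 + ((p[1] - 4.0e6) / 4.0e6) ** 2

best_1d, value_1d = nelder_mead(misfit_1d, [0.5])
best_2d, value_2d = nelder_mead(misfit_2d, [420.0, 5000000.0])
```

In the real study each call to the objective is a model run, so the number of function evaluations, not the arithmetic of the simplex, dominates the cost.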
24
Response Surface Modelling
  • Optimisation of 12 parameters in cGOLDSTEIN ocean
    model
  • Each objective function calculation (model run)
    takes 1 hour
  • Direct Search methods require too many
    evaluations to be practical
  • Employ a Kriging method to construct a Response
    Surface Model
  • Search a stochastic process model of the
    underlying objective function
  • Iteratively update the metamodel → Converge on an
    optimal solution

R² = 0.9052
  • Optimal solutions: EnKF = 0.4986, ACCPM = 0.4891,
    Krig = 0.4913

25
Genetic Algorithm tuning of IGCM
Figure legend: Default, Tuned, Data
  • 36% reduction in error statistic compared to
    default parameters
  • Similar result to a parallel study performed
    using the Ensemble Kalman Filter
  • Model physics insufficient to perfectly match
    observational data.

26
Multi-Objective Optimisation
  • Single objective function
  • Weighted sum of (model - observation) RMS
    differences
  • Some objectives can be improved at the expense of
    others
  • Little improvement in the precipitation and
    evaporation fields
  • Multi-objective optimisation
  • Employ a Pareto Front to optimise multiple
    objectives
  • Implementation of the Non-dominated Sorting
    Genetic Algorithm (NSGA-II, Deb (2002))
  • 3 objective functions
  • Weighted sum of the RMS differences between
    seasonal averages of model fields and equivalent
    observational data
  • OBJ1: (sensible heat, latent heat, net solar,
    net long)
  • OBJ2: (precipitation rate, evaporation)
  • OBJ3: (wind stress_x, wind stress_y)
  • IGCM problem definition
  • 32 free parameters (TXBLCNST TYBLCNST)
  • 2 constraints on the parameters
  • HUMCLOUDMAX > HUMCLOUDMIN
  • SNOLOOK2 > ALBEDO_ICEHSEET

27
Pareto Front Progression
  • 50 generations of the NSGA-II algorithm

28
Multi-objective Optimisation
  • 5000 model invocations
  • Southampton University Condor pool
  • Iridis2 Compute Cluster
  • National Grid Service
  • Pareto Front driven towards origin
  • 3 objective functions reduced
  • Targeted improvements
  • Evaporation fields improved without compromising
    other fields

29
Collaborative Ensemble Studies
  • Investigation of bi-stability in the THC at
    varying resolution under a dynamic atmosphere
    using a distributed Grid-enabled Problem Solving
    Environment

30
GENIE-2 on 3 different grids: experimental design
  • Three grids
  • (i) 36x36 and (ii) 72x72 equal area grids
  • (iii) 64x32 grid of resolution 5.625° (as IGCM)
  • 1-D parameter sweeps, varying Atlantic-Pacific
    freshwater flux adjustment from 0 to 2 x default
    value (0.32 Sv)
  • GENIE-2 run for 1000 years (sufficient for
    equilibrium)
  • Restarts from Conveyor Off and Conveyor On
    states to investigate model dependence of THC
    bistability

31
GENIE-2 Configuration
3D atmosphere IGCM (64 x 32 x 7)
2D slab sea-ice
3D ocean (36 x 36 x 8), (64 x 32 x 8), (72 x 72 x 16)
2D land surface
32
GENIE Grids
  • Some GENIE grids
  • (Lenton et al., in prep.)
  • Lowest resolution, equal-area grid, used since
    2002 (the basis for GENIE predecessor,
    C-GOLDSTEIN)
  • The original grid featuring constant increments
    of latitude
  • The 5.625° resolution grid which exactly matches
    that of the IGCM

33
Surface freshwater flux correction
  • Three zones where A-P flux (Fa) is applied,
    indicating default values (from Oort 1983)

34
Collaborative Study
35
Collaborative Model Study
  • 12 Ensemble Experiments
  • 3 x 1D FWF adjustment
  • 3 GOLDSTEIN grid resolutions (36x36, 64x32,
    72x72)
  • 1 x 2D FWF adjustment, Boundary layer factor
  • 36x36 GOLDSTEIN grid
  • 8 x 1D FWF adjustment, restarted from output of
    phase 1
  • 3 GOLDSTEIN grid resolutions (36x36, 64x32,
    72x72)
  • 72x72 model runs performed on both Linux and
    win32 platforms
  • Addition of resource
  • National Grid Service (Oxford, Leeds, Manchester,
    RAL, Bristol)
  • Condor Pools (Southampton, Bristol, NOC)
  • Clusters
  • Cluster1 (Norwich)
  • Pacifica (Southampton), Iridis2 / Pacifica2 (dual
    processor, dual core)

36
Experiment Setup
  • Create an Experiment definition in the database
  • Specify an ensemble of simulations
  • Associate the control scripts with the experiment
    definition
  • Record metadata describing the experiment
  • Users can contribute by downloading the Setup
    script
  • Query the database for an experiment to
    contribute to
  • Click the hyperlink to SetupExperiment
  • Download the file to the local filesystem
  • Execute the script
  • Experiment files are retrieved from the database
    and installed on the client machine

37
Client Session
38
Autonomous worker script
  • Update experiment
  • Check that local scripts are up to date
  • Stage 1
  • Query for existing jobs
  • Process any completed jobs and archive the
    restart files and output data
  • Stage 2
  • Query database for new work units
  • User specifies logic for job selection
  • Find all simulations yet to be started
  • If available, select restart files with maximum
    achieved timestep in each simulation
  • Filter out simulations with active jobs
  • Stage 3
  • Submission of new jobs
  • Number of timesteps reduced to ensure model
    finishes at the end of a model year

39
Resource Usage
  • 5 client installations
  • 9 Grid resources exploited
  • 352 simulations defined (1000 and 2000 yrs)
  • 3,736 compute tasks submitted
  • 46,992 CPU hours (estimated)
  • 428,000 IGCM-GOLDSTEIN years performed
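A few derived ratios help put these totals in context. The sketch below only divides the figures quoted above; the variable names are illustrative, and since the CPU-hour total is itself an estimate, the ratios are indicative at best.

```python
SIMULATIONS = 352
TASKS = 3736
CPU_HOURS = 46992        # estimated, as quoted on the slide
MODEL_YEARS = 428000

tasks_per_simulation = TASKS / SIMULATIONS          # compute tasks per simulation
cpu_hours_per_task = CPU_HOURS / TASKS              # average length of one task
model_years_per_cpu_hour = MODEL_YEARS / CPU_HOURS  # simulated years per CPU hour
```

On these figures each simulation was split into roughly ten or eleven restartable tasks averaging a little over twelve CPU hours each.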

40
Resource Usage
41
Bistability of the thermohaline circulation in
all three GENIE-2 resolutions
42
Two Parameter Study
43
Summary
  • THC as a function of freshwater fluxes obtained
    for
  • Two levels of atmospheric complexity (including
    previous studies)
  • Three model grids (varying horizontal resolution)
  • THC bistability is more extensive with
  • More complex atmosphere
  • A regular 5.625° grid
  • Results tentatively support the existence of THC
    bistability in the real world (i.e. towards
    infinite complexity)

44
NIEeS GENIE Workshop
  • 26th-28th June 2006, NIEeS, Cambridge
  • Aims of the workshop
  • To showcase the use of Grid technology for Earth
    system modelling
  • To decide on the next development steps for
    GENIE, adopting community standards
  • To begin the design of a generic physical coupler
    for the GENIE framework
  • To start a GENIE user group and ascertain the
    users' requirements
  • To strengthen our national and international
    collaborative links
  • To launch the Quaternary QUEST project, which is
    applying GENIE to understanding
    glacial-interglacial variations in atmospheric
    carbon dioxide