Title: climateprediction.net: Project Overview
1. climateprediction.net Project Overview
- Dave Frame
- Climate Dynamics Group, Atmospheric, Oceanic and Planetary Physics, University of Oxford
- dframe@atm.ox.ac.uk
- Thanks to David Stainforth, Myles Allen, Carl Christensen, Andrew Martin, Jamie Kettleborough, Mat Collins, Tolu Aina, Richard Gault, Martin Dzbor
2. The HadCM3L Climate Model
- HadCM3L
- 2.5° (lat) x 3.75° (lon) grid
- 19 vertical levels (atmosphere)
- 20 levels (ocean)
- Solves the Navier-Stokes equations
- Does a fairly good job of representing large-scale features of climate (it can explicitly model many of these)
- Not so good at small-scale features: features at sub-grid scales need to be parameterised (as the sketch below illustrates)
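To make the resolution concrete, here is a rough back-of-envelope sketch in Python; the grid-point counts are inferred from the stated spacing rather than quoted from the slides.

```python
# Rough grid arithmetic for a HadCM3L-like atmosphere (illustrative only).
LAT_SPACING_DEG = 2.5
LON_SPACING_DEG = 3.75
ATM_LEVELS = 19

n_lat = int(180 / LAT_SPACING_DEG) + 1   # 73 latitude rows, poles included
n_lon = int(360 / LON_SPACING_DEG)       # 96 longitude columns
cells_per_level = n_lat * n_lon          # 7008 columns per level
total_cells = cells_per_level * ATM_LEVELS

print(f"{n_lat} x {n_lon} grid, {total_cells} atmospheric cells in total")
# A 2.5 x 3.75 degree cell spans roughly 280 x 300 km at mid-latitudes, so
# clouds, convection, etc. sit far below the grid scale and must be parameterised.
```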
3. Uncertainty in Climate Prediction
- Models depend on parameterisations of small-scale processes.
- Parameterisations represent the feedbacks between smaller and larger scales.
- Many prescribed parameters (e.g. ice fall speed in clouds) are poorly constrained.
- Two UK projects are working together to explore this sort of uncertainty in climate models:
- QUMP
- climateprediction.net
4Perturbed-Physics Ensembles
How many simulations? O(1M) Far more than is
possible on supercomputers.
- To quantify model uncertainties in an AOGCM we
can change parameter values, or whole
parameterizations, and study the models
response. A perturbed physics ensemble. - But parameters/parameterizations interact
non-linearly so we need to try many combinations
of parameter values. - For each parameter combination we need
- An initial condition ensemble to facilitate
comparison with observations. - An ensemble to test against variants of recent
and future forcing.
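A minimal sketch of where the O(1M) figure comes from. The slides name only "ice fall speed in clouds" as a perturbed parameter; the other names, value counts, and ensemble sizes below are hypothetical stand-ins.

```python
from itertools import product

# Hypothetical perturbed parameters; only ice_fall_speed comes from the slides.
perturbed_params = {
    "ice_fall_speed":     (0.5, 1.0, 2.0),  # scaling factors: low/default/high
    "entrainment_coeff":  (0.6, 1.0, 1.5),
    "cloud_to_rain_rate": (0.5, 1.0, 2.0),
}

combos = list(product(*perturbed_params.values()))  # 3^3 = 27 combinations
init_cond_members = 8   # initial-condition ensemble per combination (assumed)
forcing_variants = 4    # recent/future forcing variants (assumed)

print(f"{len(combos) * init_cond_members * forcing_variants} runs for 3 parameters")
# With ~10 parameters at 3 settings each, the total is 3**10 * 8 * 4 ~ 1.9M runs,
# consistent with the O(1M) figure above.
print(f"{3**10 * init_cond_members * forcing_variants} runs for 10 parameters")
```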
5. climateprediction.net
- The Method
- Invite the public to download a full-resolution, 3D climate model and run it locally on their PC.
- Use each PC to run a single member of a massive, perturbed-physics ensemble.
- Provide visualization software and educational packages to maintain interest and facilitate school and undergraduate projects etc.
6. Distribution Plans
- We plan to:
- Distribute 2M versions of HadCM3L, each set up for a pre-packaged (unique) simulation of 1950-2050
- Estimate uncertainty from collated results
- Participants will:
- Run up to 115 years of the UM (the Unified Model)
- View model output
- Compare their results on the web
- Take an OU short course associated with climateprediction.net
- Utilise KMI's ground-breaking distance-learning resources to find out more about climate (modelling)
7. Experimental Release Plan
Each run comprises three 15-year phases (diagram summarised; see the sketch below):
- Calibration: 15 yr spin-up; derived fluxes
- Control: 15 yr at base-case CO2
- Double CO2: 15 yr at 2 x CO2; diagnostics from the final 8 yrs
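A minimal sketch of that three-phase structure in Python; the field names are illustrative, not the project's actual configuration format.

```python
# Three 15-year phases per run, as summarised from the release-plan diagram.
EXPERIMENT_PHASES = [
    {"name": "calibration", "years": 15, "co2": "base", "note": "spin-up, derived fluxes"},
    {"name": "control",     "years": 15, "co2": "base", "note": "base-case CO2"},
    {"name": "double_co2",  "years": 15, "co2": "2x",   "note": "diagnostics from final 8 yrs"},
]
TOTAL_YEARS = sum(p["years"] for p in EXPERIMENT_PHASES)  # 45 model years per run
```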
8. Research Sectors
- Computational science (e-Science)
- Collaborators: Oxford University Computing Laboratory, Rutherford Appleton Laboratory, Tessella, Research Systems Inc., NAG. Also PCMDI, Oxford Brookes University, UC Berkeley.
- Public involvement and distance, collaborative learning
- Collaborators: The Open University. Also the Institute of Physics.
- Climate science
- Collaborators: Oxford University, Rutherford Appleton Laboratory, the Hadley Centre, the University of Reading, University of Victoria. Also MIT, LSE.
9. High-Throughput Distributed Computing
- e-Science
- GRID: access to computers and equipment.
- DataGRID: access to large, distributed datasets.
- High-throughput distributed computing
- Mechanisms to access the vast unused resources of PCs connected via intranets and the internet.
- Public-resource distributed computing
- A huge computational GRID with a novel architecture in which bandwidth is the weakest link.
- There is vast potential to tap into under-utilised resources to:
- carry out academic and business computing,
- encourage public involvement in research.
10. Public-Resource Distributed Computing
- Precedents
- SETI@home, FightAIDS@Home, Folding@home, Cure Cancer, etc.
- Differences: climateprediction.net takes the concept several steps further:
- Granularity and duration of the computational task.
- Volume management of data.
- Scale of the data collection / analysis problem.
- Need to maintain long-term interest.
11. e-Science in climateprediction.net
- Develop the tools and expertise necessary to exploit the opportunities to use public-resource distributed computing to undertake major modelling tasks.
- Develop tools and expertise to analyse very large distributed datasets (2-10 petabytes/yr, comparable with the LHC):
- Core data on 10-100 upload servers worldwide.
- Full dataset on O(millions) of machines intermittently connected by variable-bandwidth connections.
- Optimised analysis and visualisation processing.
- Tools suitable for a range of users, from researchers to teachers to children.
12. Design Challenges (1)
13. Design Challenges (2)
14. Computational Structure
- Client/Server Interaction
15. Security
- Threats to participants (unexpected costs of participation)
- Software package will be digitally signed.
- Communications will always be initiated by the client.
- HTTP over a secure socket layer will be used where necessary to protect participant details and guarantee reliable data collection.
- Digitally signed files will be used where necessary.
- Thorough server security management and frequent backups.
- Threats to the experiment (falsified data)
- Two types of run replication:
- Small number of repeated identical runs.
- Large numbers of initial condition ensembles.
- Checksum tracking of client package files to discourage casual tampering (a minimal sketch follows this list).
- Server-based data analysis will be based on the US ESG II (Earth System Grid II) software and Globus, utilizing their security features.
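A minimal sketch of checksum tracking over package files, assuming SHA-256 digests against a signed manifest; the slides do not specify the algorithm or file layout, so the names below are placeholders.

```python
import hashlib
from pathlib import Path

def file_checksum(path):
    """Return the SHA-256 hex digest of a file, read in 64 KB chunks."""
    h = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_package(manifest, root):
    """Compare each packaged file against its expected digest;
    return the names of any files that differ (possible tampering)."""
    return [name for name, expected in manifest.items()
            if file_checksum(Path(root) / name) != expected]

# e.g. verify_package({"model.exe": "ab12..."}, "client_dir")
# The file name and "ab12..." digest above are placeholders.
```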
16. Status and Future Developments
- Current Status
- Beta testing is ongoing.
- Public release in August '03.
- Future Work
- Expansion of visualization packages, inc. peer-to-peer platforms.
- Educational and public involvement packages.
- DataGRID systems for data extraction and analysis.
- Intranet installation procedures.
- Additional models (PCM, ECHAM, etc.).
- Climate research.
17. Server Side
- Central Server for Registration, Trickle, Experiment Handling (will be database-driven)
- Currently simply MySQL, Apache, Perl under Linux
- Will be migrating to Oracle
- Possibility of replication for a hot backup and load balancing
- Upload Servers (Unix file-system-driven)
- Will probably be dumb virtual fileservers -- e.g. Apache + Perl, possibly GridFTP for transfer to researchers
- On the order of 10-100 around the world, totalling 1-3 petabytes
- Clients retain most data (500 MB-5 GB) and upload 5-10% of that; 2 million clients expected, so clients potentially hold 10 petabytes (see the arithmetic sketch below)
- Report Servers
- Under design/development; will handle summary data for each experiment run, and probably cache files from upload servers
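A back-of-envelope check of the storage figures above; the 5-10% upload fraction is as read from the slide, and everything else follows from it.

```python
# Storage arithmetic for the client/upload-server figures quoted above.
N_CLIENTS = 2_000_000
GB_PER_CLIENT = (0.5, 5.0)      # 500 MB - 5 GB retained locally per client
UPLOAD_FRACTION = (0.05, 0.10)  # 5-10% of retained data is uploaded

retained_pb = [N_CLIENTS * gb / 1e6 for gb in GB_PER_CLIENT]
uploaded_pb = [N_CLIENTS * GB_PER_CLIENT[1] * f / 1e6 for f in UPLOAD_FRACTION]

print(f"retained on clients: {retained_pb[0]:.0f}-{retained_pb[1]:.0f} PB")  # 1-10 PB
print(f"uploaded to servers: {uploaded_pb[0]:.1f}-{uploaded_pb[1]:.1f} PB")  # 0.5-1.0 PB
# Order of magnitude matches the 1-3 PB quoted for the upload servers.
```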
18. Use Case
19. To Infinity and Beyond...
- 300 beta testers now, adding a few hundred invited participants per week
- Public launch: late August 2003, at the Science Museum in London
- User Forum activity and interest are picking up (http://www.climateprediction.net/board)
- Need to iron out central-server DBMS and upload-server installation (have about 20 institutions who want to host an upload server)
- Strict time and monetary constraints, as opposed to a commercial or industrial-backed project -- a start-up company feel without the possibility of an IPO!
- An interesting inverse of Fred P. Brooks' Mythical Man-Month, e.g. how many resources (equipment, people) can you take away and still get a great system out on time! (Or: the difficult we do today, the impossible takes a little longer!)
- Test, test, test!
20. User Forum
21. User Pages
Need to give the users plenty of information about their runs. Currently developing the ability to trickle data back and display this on customisable, interactive web pages. This will give a context for their runs (and those of others in various groups: members of the initial condition ensemble, neighbours in parameter space, neighbours in physical space, chosen friends).
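To make "trickle data" concrete, a hypothetical sketch of what such a client-to-server summary message might look like; the actual format is not specified in these slides.

```python
import json

# Hypothetical trickle message; field names and values are placeholders.
trickle_msg = {
    "run_id": "a1b2c3",                        # placeholder run identifier
    "model_year": 1987,
    "summary": {"global_mean_temp_K": 287.4},  # placeholder diagnostic
    "groups": ["init_cond_ensemble", "param_space_neighbours", "friends"],
}
payload = json.dumps(trickle_msg)              # sent client -> server over HTTP
```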
22. Some Preliminary Results (from beta)