Title: Replica Optimisation Within The EU DataGrid
1. Replica Optimisation Within The EU DataGrid
- David Cameron
- e-Science Summer School
- 16-21 September 2002
2. Summary
- The need for Grid.
- Grid architecture.
- Replica management and optimisation through economic models.
- Grid simulation: OptorSim.
- Some results.
- Simulation demo.
3. The Large Hadron Collider
4. Complexity: CPU Requirements
- Complex events
- Large number of signals
- "Good" signals are covered with background
- Many events
- 10^9 events/experiment/year
- 1-25 MB/event raw data
- Several PB per year
- Need world-wide GRID computing
- 7×10^6 SPECint95 (3×10^8 MIPS)
- Several PB of storage space
5. A Physics Event
- Gated electronics response from a proton-proton collision
- Raw data: hit addresses, digitally converted charges and times
- Marked by a unique code
- Proton bunch crossing number, RF bucket
- Event number
- Collected, Processed, Analyzed, Archived.
- Variety of data objects become associated
- Event migrates through analysis chain
- may be reprocessed
- selected for various analyses
- replicated to various locations.
6. Data Structure
[Diagram labels: Trigger System, Data Acquisition, Run Conditions, Level 3 trigger, Calibration Data, Raw Data, Trigger Tags, Reconstruction, Event Summary Data (ESD), Event Tags.]
REAL and SIMULATED data required.
7. Data Hierarchy
RAW (2 MB/event)
- Recorded by DAQ, triggered events
- Detector digitisation
ESD (100 kB/event)
- Reconstructed information
- Pseudo-physical information: clusters, track candidates (electrons, muons), etc.
AOD (10 kB/event)
- Selected information
- Physical information: transverse momentum, association of particles, jets, (best) id of particles, physical info for relevant objects
TAG (1 kB/event)
- Analysis information
- Relevant information for fast event selection
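To make the per-event sizes concrete, the sketch below multiplies each tier's size by a yearly event count. It assumes the 10^9 events/experiment/year figure quoted on the earlier slide and decimal units (1 TB = 10^12 bytes); the function and constant names are illustrative only.

```python
# Per-event sizes taken from the data hierarchy slide, in bytes (decimal units).
EVENT_SIZE_BYTES = {
    "RAW": 2_000_000,  # 2 MB/event
    "ESD": 100_000,    # 100 kB/event
    "AOD": 10_000,     # 10 kB/event
    "TAG": 1_000,      # 1 kB/event
}

# Assumption: 10^9 events per experiment per year, as quoted earlier in the talk.
EVENTS_PER_YEAR = 10**9


def yearly_volume_tb(events=EVENTS_PER_YEAR):
    """Return the yearly data volume of each tier in terabytes (1 TB = 10^12 bytes)."""
    return {tier: size * events / 1e12 for tier, size in EVENT_SIZE_BYTES.items()}


if __name__ == "__main__":
    for tier, terabytes in yearly_volume_tb().items():
        print(f"{tier}: {terabytes:g} TB/year")
```

Under these assumptions RAW data alone comes to about 2 PB per year, which matches the "several PB" scale quoted earlier, while TAG data is small enough to copy widely.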
8. Physics Analysis
[Diagram. Tiers: Tier 0, 1 (collaboration wide); Tier 2 (analysis groups); Tier 3, 4 (physicists). Other labels: ESD Data or Monte Carlo, Raw Data, Event Tags, Calibration Data, Event Selection, Analysis, Skims, Physics Objects, Physics Analysis. Arrow label: INCREASING DATA FLOW.]
9. Tier-0 - CERN
Commodity Processors, IBM (mirrored) EIDE Disks, Storage Systems.
2004 Scale: 1,000 CPUs, 1 PByte
10. UK Tier-1 RAL
New Computing Farm: 4 racks holding 156 dual 1.4GHz Pentium III CPUs. Each box has 1GB of memory, a 40GB internal disk and 100Mb ethernet.
Tape Robot: upgraded last year, uses 60GB STK 9940 tapes; 45TB current capacity, could hold 330TB.
50TByte disk-based Mass Storage Unit after RAID 5 overhead. PCs are clustered on network switches with up to 8x1000Mb ethernet out of each rack.
2004 Scale: 1000 CPUs, 0.5 PBytes
11. UK Tier-2 ScotGRID
- 59 IBM X Series 330 dual 1 GHz Pentium III with 2GB memory
- IBM X Series 370 PIII Xeon with 512 MB memory, 32 x 512 MB RAM
- 70 x 73.4 GB IBM FC Hot-Swap HDD
2004 Scale: 300 CPUs, 0.1 PBytes
12. Grid architecture
13. Replica management
- Replica Manager
- copyFile()
- copyAndRegisterFile()
- listReplicas()
- deleteFile()
- Replica Catalogue (LFN -> PFNs)
- registerEntry()
- unregisterEntry()
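The division of labour between the two components can be illustrated with a toy in-memory model. This is only a sketch: the method names are snake_case analogues of the calls listed above, while the signatures, the `storage` dictionary and the example file names are assumptions made for illustration, not the EU DataGrid API.

```python
from collections import defaultdict


class ReplicaCatalogue:
    """Toy catalogue mapping a logical file name (LFN) to its physical file names (PFNs)."""

    def __init__(self):
        self._entries = defaultdict(set)

    def register_entry(self, lfn, pfn):
        self._entries[lfn].add(pfn)

    def unregister_entry(self, lfn, pfn):
        self._entries[lfn].discard(pfn)
        if not self._entries[lfn]:
            del self._entries[lfn]  # drop LFNs that have no replicas left

    def list_replicas(self, lfn):
        return sorted(self._entries.get(lfn, ()))


class ReplicaManager:
    """Toy manager: moves file contents and keeps the catalogue in step (hypothetical signatures)."""

    def __init__(self, catalogue):
        self.catalogue = catalogue
        self.storage = {}  # assumption: PFN -> file contents, standing in for real storage

    def copy_file(self, source_pfn, dest_pfn):
        self.storage[dest_pfn] = self.storage[source_pfn]

    def copy_and_register_file(self, lfn, source_pfn, dest_pfn):
        self.copy_file(source_pfn, dest_pfn)
        self.catalogue.register_entry(lfn, dest_pfn)

    def list_replicas(self, lfn):
        return self.catalogue.list_replicas(lfn)

    def delete_file(self, lfn, pfn):
        self.storage.pop(pfn, None)
        self.catalogue.unregister_entry(lfn, pfn)


if __name__ == "__main__":
    manager = ReplicaManager(ReplicaCatalogue())
    manager.storage["site1:/data/run1.dat"] = b"events"
    manager.catalogue.register_entry("lfn:run1", "site1:/data/run1.dat")
    manager.copy_and_register_file("lfn:run1", "site1:/data/run1.dat", "site2:/data/run1.dat")
    print(manager.list_replicas("lfn:run1"))
```

The point of the split is that a plain copy leaves the catalogue untouched, whereas copy-and-register makes the new replica visible to other sites.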
14. Submitting a Job
[Diagram labels: User, Scheduler, The Grid (Site 1, Site 2, Site 3).]
15. Replica Optimisation
- Optimise use of computing, storage and network resources.
- Short term optimisation
- Minimise running time of current job.
- Get me the files for my job as quickly as possible.
- Long term optimisation
- Minimise running time of all jobs.
- Make sure files are in the best places for all my future jobs.
16. Optimisation Through Economic Models
- Files represent goods.
- Bought by Computing Elements for jobs.
- Bought and sold by Storage Elements to make profit.
- Investment decisions use a projected future value estimated from previous file access patterns.
- Storage Elements can buy popular files independently of running jobs.
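One way to picture the investment decision is the minimal sketch below. It is not the model from the talk: it assumes a file's projected future value is simply the number of times it appears in a sliding window of recent requests, and that a full Storage Element replaces its least valuable file only when the candidate is worth strictly more. The class, its methods and the `history_length` parameter are all hypothetical.

```python
from collections import Counter, deque


class StorageElement:
    """Toy Storage Element that 'invests' in files it expects to be requested again."""

    def __init__(self, capacity, history_length=50):
        self.capacity = capacity                     # number of files that fit
        self.files = set()                           # files currently stored
        self.history = deque(maxlen=history_length)  # sliding window of recent requests

    def record_access(self, file_id):
        self.history.append(file_id)

    def predicted_value(self, file_id):
        # Assumption: future value is proportional to recent popularity.
        return Counter(self.history)[file_id]

    def consider_buying(self, file_id):
        """Return True if the file is stored locally once the decision is made."""
        if file_id in self.files:
            return True
        if len(self.files) < self.capacity:
            self.files.add(file_id)
            return True
        # Full: compare the candidate with the least valuable stored file.
        cheapest = min(self.files, key=lambda f: (self.predicted_value(f), f))
        if self.predicted_value(file_id) > self.predicted_value(cheapest):
            self.files.remove(cheapest)
            self.files.add(file_id)
            return True
        return False


if __name__ == "__main__":
    se = StorageElement(capacity=2)
    for request in ["a", "a", "b", "c", "c", "c"]:
        se.record_access(request)
    for candidate in ["a", "b", "c"]:
        se.consider_buying(candidate)
    print(sorted(se.files))
```

Because the decision depends only on the request history, such a Storage Element could also acquire a popular file when no local job is asking for it.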
17. Replica optimiser architecture
- Access Mediator (AM) - contacts replica optimisers to locate the cheapest copies of files and makes them locally available.
- Storage Broker (SB) - manages files stored in the storage element, trying to maximise profit for the finite amount of storage space available.
- P2P Mediator (P2PM) - establishes and maintains P2P communication between grid sites.
18. Auction Mechanism
- Use Vickrey auction
- Every seller makes a bid lower than the asking price.
- File is sold to lowest bidder at second lowest price.
- Ensures
- Low price for purchaser.
- Trading fairness.
- Minimal messaging.
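The auction rule above can be sketched as a sealed-bid, second-price reverse auction. The function name and signature are illustrative. Two details are assumptions not stated on the slide: bids at or above the asking price are discarded, and a lone valid bidder is paid the asking price.

```python
def run_vickrey_auction(asking_price, bids):
    """Pick the seller of a file in a reverse Vickrey (second-price) auction.

    bids maps seller name -> offered price. Returns (winner, price_paid),
    or None when no seller bids below the asking price.
    """
    # Keep only bids strictly below the asking price, cheapest first.
    valid = sorted(
        (price, seller) for seller, price in bids.items() if price < asking_price
    )
    if not valid:
        return None
    _, winner = valid[0]
    # The winner is paid the second lowest bid; with a single valid bid
    # we fall back to the asking price (an assumption of this sketch).
    price_paid = valid[1][0] if len(valid) > 1 else asking_price
    return winner, price_paid


if __name__ == "__main__":
    offers = {"site1": 12.0, "site2": 8.0, "site3": 10.0, "site4": 20.0}
    print(run_vickrey_auction(15.0, offers))
```

Since the price paid does not depend on the winner's own bid, sellers have no incentive to bid anything other than their true cost, and the auction completes in a single round of messages.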
19. OptorSim: a replica optimiser simulation
- Need to tune optimisation algorithms.
- Develop Grid simulation in Java.
- Input: network configuration, files and jobs.
- Jobs transfer the files defined in the job description to the CE running the job.
20. OptorSim: a replica optimiser simulation
- Schedule to CE using
- CEcost, computed from queueSize and accessCost
- Files requested according to access pattern.
- Sequential
- Random
- Unitary random walk
- Gaussian random walk
- Zipf distribution (not yet implemented).
- No processing involved, only file transfer.
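The access patterns listed above can be sketched as generators of file-index sequences. This is an illustration, not OptorSim's code. It assumes files are numbered 0..n-1, that both walks wrap around at the ends of the file list, and that the Gaussian step has a hypothetical standard deviation `sigma`. The Zipf pattern is left out here as well.

```python
import random


def sequential(n_files, n_requests):
    """Request files in order, starting again from the first when the list ends."""
    return [i % n_files for i in range(n_requests)]


def uniform_random(n_files, n_requests, rng):
    """Request each file with equal probability, independently of earlier requests."""
    return [rng.randrange(n_files) for _ in range(n_requests)]


def unitary_random_walk(n_files, n_requests, rng):
    """Move one file up or down the list at each step (wrap-around is an assumption)."""
    position = rng.randrange(n_files)
    requests = []
    for _ in range(n_requests):
        requests.append(position)
        position = (position + rng.choice((-1, 1))) % n_files
    return requests


def gaussian_random_walk(n_files, n_requests, rng, sigma=3.0):
    """Move by a rounded Gaussian-distributed step each time (sigma is hypothetical)."""
    position = rng.randrange(n_files)
    requests = []
    for _ in range(n_requests):
        requests.append(position)
        position = (position + round(rng.gauss(0.0, sigma))) % n_files
    return requests


if __name__ == "__main__":
    rng = random.Random(2002)  # fixed seed so a run can be repeated
    print(sequential(5, 8))
    print(unitary_random_walk(5, 8, rng))
    print(gaussian_random_walk(5, 8, rng))
```

Sequential and walk-based patterns revisit nearby files, which is what gives a replication strategy something to predict; the uniform random pattern offers no such locality.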
21. OptorSim: a replica optimiser simulation
Data Sample            Number of Files   Total Size (GB)
Central J/ψ                        120              1200
High pt electrons                   20               200
Inclusive electrons                500              5000
Inclusive muons                    140              1400
High Et photons                    580              5800
Z0 -> b bbar                        60               600
- Input site policies and experiment data files
(simplified CDF jobs).
- Tested replication strategies
- No replication
- Always Replicate, Delete Oldest File
- Always Replicate, Delete Least Accessed File
- Economic Model
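The two "Always Replicate" strategies differ only in which file they delete when the local store is full. The sketch below is a hypothetical illustration. It assumes that "oldest" means the file replicated earliest, that "least accessed" counts requests since the file was replicated, and that capacity is measured in number of files.

```python
class ReplicaStore:
    """Toy local store that always replicates a requested file, evicting when full."""

    POLICIES = ("oldest", "least_accessed")

    def __init__(self, capacity, policy):
        if policy not in self.POLICIES:
            raise ValueError(f"unknown policy: {policy}")
        self.capacity = capacity
        self.policy = policy
        self.arrival = {}   # file -> tick at which it was replicated
        self.accesses = {}  # file -> number of requests since replication
        self.tick = 0

    def request(self, file_id):
        """Serve one request; return True on a local hit, False if a copy was fetched."""
        self.tick += 1
        if file_id in self.arrival:
            self.accesses[file_id] += 1
            return True
        if len(self.arrival) >= self.capacity:
            self._evict()
        self.arrival[file_id] = self.tick
        self.accesses[file_id] = 1
        return False

    def _evict(self):
        if self.policy == "oldest":
            victim = min(self.arrival, key=self.arrival.get)
        else:
            # Least accessed; ties are broken in favour of deleting the older file.
            victim = min(self.accesses, key=lambda f: (self.accesses[f], self.arrival[f]))
        del self.arrival[victim]
        del self.accesses[victim]


if __name__ == "__main__":
    for policy in ReplicaStore.POLICIES:
        store = ReplicaStore(capacity=2, policy=policy)
        hits = [store.request(f) for f in ["a", "a", "b", "c"]]
        print(policy, sorted(store.arrival), hits)
```

On the request sequence a, a, b, c with room for two files, the first policy drops "a" (replicated first) while the second drops "b" (requested only once).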
22. Results
- Eco model 40% better for sequential access but no better for the other patterns; this is expected, since the eco model is tuned for sequential access.
23. Future Work
- 3rd party replication
- SAM access patterns
- Integration: Optor, Reptor, Testbed
24. Conclusions
- Simulation shows Eco Model successful.
- Further simulation will help tune algorithms.
- Integration into testbed code soon.