Title: Nordic DataGRID Facility
1. Nordic DataGRID Facility
2. Mega-Science
- The coming era in science is expected to be dominated by Mega-Science
- 10^4 scientists on a project
- Huge data production
- Collaboration within the projects
- Examples
- CERN LHC
- ALMA
3. NDGF
4. Who?
- Denmark
- Finland
- Norway
- Sweden
5. Purpose
NDGF's main focus is to create and operate the mechanisms, organizational structures, and development projects needed to leverage the national investments in computing infrastructure and meet the long-term vision. It is important to underline that the proposed Nordic commitment does not substitute for the relevant and needed national funding for competence and infrastructure build-up in the participating Nordic countries; rather, NDGF will facilitate sharing of these national resources, creating added Nordic value.
6. What?
7. What?
8. What?
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 2 (0x2)
        Signature Algorithm: md5WithRSAEncryption
        Issuer: C=DK, ST=Denmark, O=IMADA, OU=SDU, CN=NorduGridCA
        Validity
            Not Before: Oct  8 12:02:56 2004 GMT
            Not After : Oct  8 12:02:56 2005 GMT
        Subject: C=DK, ST=Denmark, L=IMADA, O=SDU, CN=Brian Vinter
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
            RSA Public Key: (1024 bit)
                Modulus (1024 bit):
                    00:d8:86:49:c5:3d:5a:05:32:c4:43:ec:78:64:88:
                    [snip]
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Basic Constraints:
                CA:FALSE
            Netscape Comment:
                OpenSSL Generated Certificate
            X509v3 Subject Key Identifier:
                AC:12:72:6F:0B:EC:4D:38:6B:3E:C2:30:46:10:FD:C2:2A:E1:D1:D6
            X509v3 Authority Key Identifier:
                keyid:FA:74:8F:4C:1D:2B:9A:14:AC:93:D7:41:55:E3:01:BC:4C:ED:2B:87
                DirName:/C=DK/ST=Denmark/O=IMADA/OU=SDU/CN=NorduGridCA
                serial:00
    Signature Algorithm: md5WithRSAEncryption
        1e:47:53:0f:3f:75:e0:ce:e7:c0:18:8f:94:a0:0d:39:d6:37:
        e9:bd:96:aa:43:7c:7b:a1:db:4a:54:cd:36:82:ff:2d:71:cb:
        91:db:fc:b2:f2:1f:0f:37:a4:b5:0c:6b:cf:55:4d:23:cc:a3:
        f1:f0:f9:9a:4e:57:26:26:c7:de:0f:49:4d:28:3a:6b:bb:00:
        6b:16:c4:e3:10:ef:b7:b8:c9:6d:1a:d1:63:d4:6d:58:d6:9d:
        9f:3f:47:aa:b3:3b:3b:3f:b4:79:38:c6:9d:53:67:51:80:ce:
        c5:2c:23:07:b5:39:47:15:b3:b9:25:5e:9b:40:44:bb:29:31:
        21:93
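The dump above is standard `openssl x509` text output. A certificate with the same field layout can be generated and inspected as sketched below (file names and subject fields here are made up for illustration, not the actual NorduGrid CA setup):

```shell
# Create a throwaway self-signed certificate for illustration
# (hypothetical file names and subject; NOT the NorduGrid CA's values)
openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout /tmp/demo-key.pem -out /tmp/demo-cert.pem \
    -days 365 -subj "/C=DK/ST=Denmark/O=SDU/CN=Demo User"

# Print the same fields shown on the slide: issuer, subject, validity
openssl x509 -noout -issuer -subject -dates -in /tmp/demo-cert.pem
```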
9. GRID
10. The GRiD
- Named after the power grid
- Sometimes referred to as the information power grid
- Like the power grid, GRID should be powered by large installations, not individual generators
11. Philosophy
- The current Internet only allows access to information
- The GRiD should provide access to any desired resource
- CPU/supercomputers
- Storage
- Applications
12. Why is GRID possible now?
- We now have the security in place
- Network performance is now here
13. Transparent Remote File Access
- Huge input files incur a number of problems
- Download time vs. total execution time
- Job execution on the resource is delayed
- Storage requirements on resources
- Often only small scattered fragments of input files are needed
- How about automatic on-demand download of the needed data?
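The on-demand idea in the last bullet can be sketched as a file proxy that downloads fixed-size blocks only when they are first read. This is a toy model, not the actual NorduGrid implementation; the class name, block size, and fetch callback are all invented for illustration:

```python
class OnDemandFile:
    """Read a large remote file by fetching fixed-size blocks lazily."""

    def __init__(self, fetch_block, size, block_size=64 * 1024):
        self.fetch_block = fetch_block  # callable: block index -> bytes
        self.size = size
        self.block_size = block_size
        self.cache = {}                 # blocks downloaded so far

    def read(self, offset, length):
        """Return `length` bytes from `offset`, fetching only the
        blocks that overlap the requested range."""
        end = min(offset + length, self.size)
        first = offset // self.block_size
        last = (end - 1) // self.block_size
        for i in range(first, last + 1):
            if i not in self.cache:     # download on first touch only
                self.cache[i] = self.fetch_block(i)
        data = b"".join(self.cache[i] for i in range(first, last + 1))
        start = offset - first * self.block_size
        return data[start:start + (end - offset)]


# Simulate a 1 MiB "remote" file held by a server
remote = bytes(range(256)) * 4096

def fetch(i, bs=64 * 1024):
    return remote[i * bs:(i + 1) * bs]

f = OnDemandFile(fetch, len(remote))
fragment = f.read(500_000, 16)          # touches a single 64 KiB block
print(len(f.cache), "block(s) downloaded for a 16-byte read")
```

A 16-byte read near the middle of the file pulls down one block instead of the whole megabyte, which is exactly the win when only small scattered fragments are needed.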
14. Experiments
- 4 experiments
- Overhead: read a one-byte file
- I/O-intensive application: checksum a 1 GB file
- I/O-balanced application: process a 1 GB file
- Partial file traversal: search a 360 MB B-tree for a random key
- 3 test setups
- Local execution
- Copy model
- Remote access model
15. Baseline Performance
16. Latency tests
17. Checksum
18. Balanced
19. B-Tree
20. Examples
21. Use resources better
Dalton Portal
22. Chemistry / Dalton
- Dalton is a very commonly used application within chemistry
- Large number of functions to support quantum chemistry
- Some models run excellently on cluster computers while others require large shared memory
- Grid can help us
23. Dalton: Cluster or SMP? Today
24. Dalton: Cluster or SMP? Today
25. Dalton: Cluster or SMP? Tomorrow
GRID
26. Dalton: Cluster or SMP? Tomorrow
GRID
27. Dalton: Cluster or SMP? Tomorrow
GRID
28. Dalton
29. Dalton Portal
30. Dalton
31. Use more resources
GRID BLAST
32. Sequence alignment: the challenge
- Data volume doubles every 6-12 months
- CPU power doubles every 18 months (Moore's law)
- The CPU power needed for a BLAST search is linearly related to the data volume
This slide is borrowed from Jan Host Jensen, Novo Nordic
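The mismatch between the two doubling times can be made concrete with a little arithmetic (taking 9 months as a middle value for the data doubling period, an assumption within the 6-12 month range on the slide):

```python
# Data volume doubles every ~9 months, CPU power every 18 months
# (Moore's law). Since BLAST cost scales linearly with data volume,
# the search cost relative to available CPU power grows as
#   2**(t/9) / 2**(t/18) = 2**(t/18)
# i.e. the relative cost itself doubles every 18 months.
for months in (0, 18, 36, 54):
    data = 2 ** (months / 9)    # data volume growth factor
    cpu = 2 ** (months / 18)    # CPU power growth factor
    print(f"after {months:2d} months: search cost / CPU power = {data / cpu:.0f}x")
# -> 1x, 2x, 4x, 8x
```

So even with hardware improving at Moore's-law pace, a fixed BLAST workload becomes relatively more expensive every year, which is the case for pooling CPU power on a grid.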
33. GRID BLAST
- First prototype with a BLAST-compatible interface is ready
- Outstanding problems include
- A naming convention for different versions of the databases
- Security: many users are unwilling to send their own databases to the Grid
34. Bio-BLAST
- Grid BLAST
- Split the databases into blocks
- Match block against block
- Assemble the results
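The three steps above form a scatter-gather pattern that can be sketched as follows. This is a toy model: exact substring search stands in for real BLAST alignment, and the function names, block size, and sample sequences are invented:

```python
# Toy scatter-gather in the spirit of Bio-BLAST: split the database
# into blocks, "match" each query block against each database block,
# then assemble the per-block results.

def split(seqs, block_size):
    """Split a list of sequences into blocks for distribution."""
    return [seqs[i:i + block_size] for i in range(0, len(seqs), block_size)]

def match_block(queries, database):
    """One grid job: match one query block against one database block
    (exact substring search instead of real alignment)."""
    hits = []
    for q in queries:
        for name, seq in database:
            if q in seq:
                hits.append((q, name))
    return hits

def grid_blast(queries, database, block_size=2):
    """Scatter every (query block, database block) pair, gather hits."""
    results = []
    for qb in split(queries, block_size):
        for db_block in split(database, block_size):
            results.extend(match_block(qb, db_block))  # would run remotely
    return sorted(results)

db = [("ecoli_1", "ATGACCGT"), ("human_1", "GGTACCAT"), ("human_2", "TTGACCGA")]
print(grid_blast(["GACC", "TACC"], db))
```

Because every (query block, database block) pair is independent, each `match_block` call can be shipped to a different grid resource and the results merged afterwards.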
35. Bio-BLAST: E. coli vs. Human
36. Bio-BLAST: E. coli vs. Human
37. Bio-BLAST: E. coli vs. Human
38. Bio-BLAST: E. coli vs. Human
1 GB
39. Bio-BLAST: E. coli vs. Human
1 GB
512 MB
40. Use resources more easily
GRID VCR
41. GRID VCR
- A GRID-enabled VCR
- Allows programming a recording as a GRID job
- Allows sharing of recordings within a virtual organization
- Really just a prototype for any apparatus you might want to share
- Or simply to ease access to it
42. Grid VCR

&(executable="VCR_HOME/vcrrecord.sh")
 (arguments="-t 00:00:10")
 (runtimeenvironment="VCRRECORD/DK/TV2")
 (jobName="ng-vcrrecord-job")
 (outputFiles=("video/*.avi" "")("video/*" ""))
 (stdout="stdout")
 (stderr="stderr")
 (CPUTime=100)
 (startTime="2004-09-28 08:52")
43. Own fewer resources
Grid terminal
44. The Grid Terminal
- If all applications are best run on the Grid anyway, why not just do it?
- The Grid terminal is a 500 MHz VIA processor-based machine (clocked down to 250 MHz)
- Uses 15 W
- Has no moving parts, so no noise
- All applications are run on the Grid
45. The Grid Terminal
46. Grid terminal
47. Use resources more easily
High Bandwidth
48. High-Throughput Application Experiments
49. High-Throughput Application Experiments
- The experiment was hooked up to the ATLAS data challenge 2
- A cluster was set up in Odense to simulate an accelerator
- Work clusters were placed at
- DIKU: 1 Gb link
- NBI: 1 Gb link
- IMM: 1 Gb link
- IFD: Forskningsnettet
- Sweden: NORDUnet
- Norway: NORDUnet
- Canada: Internet?
50. The Setup
51. Conclusions
- Using Grid allowed the setup to run continually even though some sites crashed or disappeared during the test
- The 10 Gb switch could not switch at 10 Gb
- Thus we ended up with 1 Gb VLAN links in the 10 Gb fiber
- We were able to saturate the network
- Not reaching 10 Gb, but managing to stall both producers and consumers
52. Towards an Open Compute Market
53. Plans
The NDGF core facility has several responsibilities: the management of a shared production Grid, integrated middleware development, an empowered committee to decide on cross-national policy issues, user and application maintenance, and possibly a Nordic Grid Forum to coordinate academic and industrial issues and initiatives.
54. Towards an Open Compute Market
- We are moving towards an economy-based, open Nordic compute market
- Users will have GRID accounts
- but no longer machine accounts
- Jobs are submitted to the GRID, where the centers compete to win the task
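The "centers compete to win the task" idea amounts to a reverse auction, which can be sketched as below. The center names, prices, capacities, and cheapest-bidder rule are all invented for illustration, not an NDGF design:

```python
# Minimal reverse-auction sketch: each center offers a price per CPU
# hour, and a job is awarded to the cheapest center with free capacity.

centers = {  # hypothetical centers: (price per CPU hour, free CPU hours)
    "copenhagen": (1.20, 500),
    "helsinki":   (0.95, 200),
    "oslo":       (1.10, 1000),
    "stockholm":  (0.90, 50),
}

def submit(job_cpu_hours):
    """Award the job to the cheapest center that can run it,
    returning (winner, total cost) or None if nobody can."""
    eligible = [(price, name) for name, (price, free) in centers.items()
                if free >= job_cpu_hours]
    if not eligible:
        return None
    price, winner = min(eligible)
    return winner, price * job_cpu_hours

# stockholm is cheapest but lacks capacity for 100 CPU hours,
# so the job goes to the next-cheapest center
print(submit(100))
```

A real bourse would also need accounting, trust between sites, and job migration, but the core matching step is this simple.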
55. NDGF Bourse
56. One Motivation
57. Advantages
- Competition will spread
- between centers
- to vendors
- Larger market for niche architectures
- Clusters are currently the flavor of the day
- SMPs are kept quite small
- No room for
- Vector machines
- Reconfigurable architectures
58. Organization
- NDGF will provide a common Nordic market for computing
- Shared access to all Nordic resources
- Common storage
- Strong, self-supported middleware
- NDGF should focus on science, not GRID!
59. NDGF Tasks
60. Evaluation
The Evaluation Panel agrees with the idea of establishing a common Nordic grid. Individually, the Nordic countries appear to have difficulty establishing the critical mass needed to become attractive collaboration partners in the larger grid projects in Europe and the rest of the world. The establishment of NDGF, however, will ensure that scientists in the Nordic countries have a grid infrastructure of sufficient critical mass to join the various scientific projects requiring grid and high-performance computing in Europe and the rest of the world.