1
Joint DOE/NSF Status Meeting of U.S. LHC
Software and Computing, National Science
Foundation, Arlington, VA, July 8, 2003
US CMS Software and Computing: Overview and
Project Status
  • L.A.T. Bauerdick, Project Manager
  • Agenda
  • Overview and Project Status -- L.A.T. Bauerdick
  • Preparations for CMS Milestones -- Ian Fisk
  • Discussions

2
Plan for this Talk
  • Responses to previous recommendations
  • Grids and Facilities
  • Software
  • Management
  • Status FY03
  • Issues for the next 12 months
  • Production Quality Grid
  • Middleware and Grid Services Architecture
  • Environment for Distributed Analysis
  • Grid Services Infrastructure
  • Ian Fisk's talk: Status of and preparations for
    Data Challenge 2004 and the Physics TDR

3
Recommendations: Grids
  • The committee encourages both experiments to
    continue efforts to get the prototype Tier 2
    centers into operation and to establish MOU's
    with the iVDGL Tier 2 centers concerning various
    production deliverables.
  • three sites are fully operational and active
    participants in CMS production
  • Caltech, UCSD, U.Florida
  • have finalized and signed the iVDGL MOU!
  • production deliverables delineated in the MOU
  • However, the prototype Tier 2 centers are also
    excellent sites for testing prototype software
    developments and deployments. They should be
    allowed to operate for part of the time in
    research mode, which is consistent with their
    charter. This research will facilitate
    opportunistic use of non-ATLAS/CMS owned
    resources in the future.
  • working with Tier-2 sites on many aspects of
    Grids
  • e.g. dCache test for pile-up serving, networking
    studies, Replica management etc
  • defining the joint US Grid projects / US LHC Grid3
    project with all T2 sites participating
  • it also has a specific CS-demo component related
    to GriPhyN

4
Recommendations: Grids (cont'd)
  • There are good efforts underway to pursue grid
    monitoring. The committee encourages continued
    efforts to develop integrated monitoring
    capabilities that provide end-to-end information
    and cooperation with the LCG to develop common
    monitoring capabilities
  • MonALISA monitoring provides such an end-to-end
    system for US CMS Grids (see the illustrative
    agent sketch at the end of this slide)
  • MonALISA is a model for an agent-based Grid
    services architecture
  • is becoming part of the VDT and thus will be
    available to any VDT-based Grid environment (LCG
    and EDG are VDT-based)
  • Monitoring in LCG is an ongoing activity -
  • definition of requirements through GOC effort
  • have seen initial GOC document, provided
    feedback, new doc coming out today
  • interest of INFN to provide tools
  • new LCG GOC plan is to have a federated
    approach
  • with GOCs in Europe (RAL), US, Asia
  • work on instrumentation of CMS application not
    yet started,
  • but is part of the work plan
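As a toy illustration of the agent-based, end-to-end monitoring model described on this slide (a minimal sketch only: the collector address, the metric names and the message format are invented here and do not reproduce MonALISA's actual protocol or API):

```python
# Minimal sketch of an agent-style monitor in the spirit of end-to-end
# Grid monitoring. The metric names and the collector endpoint are
# hypothetical; this is not MonALISA's real protocol or API.
import json
import os
import socket
import time


def sample_metrics():
    """Collect a few local metrics an agent might publish (Unix-only loadavg)."""
    load1, load5, _ = os.getloadavg()
    return {
        "host": socket.gethostname(),
        "timestamp": time.time(),
        "load1": load1,
        "load5": load5,
    }


def publish(report, collector=("localhost", 9999)):
    """Send one UDP datagram to a (hypothetical) collector service."""
    payload = json.dumps(report).encode()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, collector)


if __name__ == "__main__":
    for _ in range(3):           # a real agent would loop indefinitely
        publish(sample_metrics())
        time.sleep(5)
```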

5
Recommendations: Grids & Facilities
  • US CMS should ensure that FNAL has a
    networking plan that meets CMS needs as a
    function of time from DC04 to turn on (in the
    presence of other FNAL demands).
  • Offsite data transfer requirements have
    consistently outpaced available bandwidth
  • The ESnet upgrade to OC12 (12/02) is already
    becoming saturated
  • FNAL is planning to obtain an optical network
    connection to the premier optical network
    switching center on the North American continent,
    StarLight in Chicago, which enables network
    research and holds promise
  • for handling peak production loads at times when
    production demand exceeds what ESnet can supply
  • for acting as a backup if the link to ESnet is
    unavailable
  • Potential on a single fiber pair:
  • Wavelength Division Multiplexing (WDM) for
    multiple independent data links
  • Capable of supporting 66 independent 40 GB/s
    links if fully configured
  • Initial configuration is for 4 independent 1 Gbps
    links across 2 wavelengths
  • Allows bandwidth to be configured to provide a mix
    of immediate service upgrades as well as validation
    of non-traditional network architectures
  • Immediate benefit to production bulk data
    transfers, a test bed for high-performance network
    investigations, and scalability into the era of
    LHC operations

6
Current T1 Off-Site Connectivity
  • All FNAL off-site traffic carried by ESnet link
  • ESnet Chicago PoP has 1Gb/s Starlight link
  • Peering with CERN, Surfnet, CAnet there
  • Also peering with Abilene there (for now)
  • ESnet peers with other networks at other places

7
T1 Off-Site Connectivity w/ StarLight link
  • Dark fiber as an alternate path to
    StarLight-connected networks
  • Also an alternate path back into ESnet

8
Network Integration Issues
  • End-to-End Performance - Network Performance and
    Prediction
  • US CMS actively pursues integration of network
    stack implementations that support ultra-scale
    networking: rapid data transactions,
    data-intensive tasks
  • Maintain statistical multiplexing and end-to-end
    flow control
  • Maintain functional compatibility with Reno/TCP
    implementation
  • The FAST Project (Caltech) has shown dramatic
    improvements over the Reno stack by moving from
    loss-based congestion control to a delay-based
    control mechanism (sketched below)
  • with standard segment size and fewer streams;
    sender-side-only modifications
  • Fermilab/US CMS is a FAST partner
  • as a well-supported user, having the FAST stack
    installed on
  • Facility R&D data servers (first results look
    very promising)
  • Aiming at Installations/Evaluations for
    Integration with Production Environment at
    Fermilab, CERN and US Tier-2 sites
  • Work in Collaboration with the FAST project team
    at Caltech and Fermilab Computing Division
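For orientation, the sketch below is a toy model of the delay-based idea only: the window keeps growing while the measured RTT stays near the propagation delay, and growth is throttled as queueing delay builds up. The update rule, parameters and single-bottleneck link model are simplified assumptions made for this illustration, not the production FAST stack or its tuning.

```python
# Toy model of a delay-based congestion window update in the spirit of
# FAST (Reno reacts to packet loss; a delay-based scheme reacts to
# queueing delay). The rule, parameters and link model are simplified
# for illustration and are not the actual FAST TCP implementation.

def delay_based_update(w, base_rtt, rtt, alpha=200.0, gamma=0.5):
    """One window update: damp growth as queueing delay (rtt - base_rtt) rises."""
    target = (base_rtt / rtt) * w + alpha
    return min(2 * w, (1 - gamma) * w + gamma * target)


def simulate(capacity_pkts_per_rtt=1000, base_rtt=0.1, steps=30):
    """Single flow on one bottleneck; queueing delay grows with the excess load."""
    w = 10.0
    for step in range(steps):
        backlog = max(0.0, w - capacity_pkts_per_rtt)           # packets queued
        rtt = base_rtt * (1 + backlog / capacity_pkts_per_rtt)  # delay feedback
        w = delay_based_update(w, base_rtt, rtt)
        print(f"step {step:2d}  cwnd {w:7.1f}  rtt {rtt * 1000:6.1f} ms")
    # cwnd settles near capacity + alpha packets in flight, i.e. a small,
    # controlled backlog instead of driving the link to loss.


if __name__ == "__main__":
    simulate()
```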

9
Real-world Networking Improvements
  • Measured TCP throughput between Fermilab Tier-1
    and CERN
  • requirements until '05: 30 MBytes/sec average,
    100 MBytes/sec peak (see the conversion check below)

[Plots: measured throughput in Mbit/s as a function of TCP window size and number of streams, using the standard stack and using the FAST stack; the 622 Mbit/s scale is indicated.]
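A quick arithmetic check of the requirement quoted above against the OC12 link rate that appears in the throughput plots (the byte-to-bit conversion is standard; the percentages follow directly from the slide's numbers):

```python
# Compare the CERN-FNAL requirement quoted above (30 MBytes/sec average,
# 100 MBytes/sec peak) with the ~622 Mbit/s OC12 link.
OC12_MBIT_S = 622

def mbytes_to_mbits(mbytes_per_s):
    return mbytes_per_s * 8               # 1 byte = 8 bits

for label, rate_mb in [("average", 30), ("peak", 100)]:
    mbit = mbytes_to_mbits(rate_mb)
    share = mbit / OC12_MBIT_S
    print(f"{label}: {rate_mb} MB/s = {mbit} Mbit/s = {share:.0%} of one OC12")
# average: 30 MB/s = 240 Mbit/s = 39% of one OC12
# peak: 100 MB/s = 800 Mbit/s = 129% of one OC12
# i.e. the stated peak alone already exceeds a single OC12 link.
```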
10
Recommendations: Grids & Facilities
  • US CMS should formulate a clear strategy for
    dealing with homogeneity vs. interoperability
    (LCG-x / VDT / DPE) and argue for it within
    International CMS and the LCG process. We further
    recommend a preference for interoperability to
    facilitate opportunistic use of external
    resources.
  • argued for interoperability and (some)
    heterogeneity with CMS and LCG
  • seem to have found understanding, specifically
    with the LCG/EDG delays
  • developed and deployed DPE as basis for
    Pre-Challenge Production in the US
  • CMS application platform based on VDT and EDG
    components
  • added functionality in terms of Grid services,
    like VO management, storage resource management,
    data movement
  • advertised in the GDB the idea of a federated
    interoperable LHC Grid
  • issues like Grid Operations, different
    implementations of middleware services (RLS,
    MDS), organizational and political realities,
    like EGEE in Europe
  • have seen both resistance and support for these
    ideas - ongoing process
  • devised and agreed on the next joint US LHC / US
    Grid projects step: Grid3
  • conceived the long-term goal of Open Science Grid

11
Recommendations: Grids & Facilities
  • US CMS should work with US ATLAS, DOE and NSF
    to develop a plan for the long-term support of
    grid packages
  • Virtual Data Toolkit (VDT) is now the agreed
    standard for US LHC, LCG and EDG
  • so we are establishing the boundary conditions
    and the players
  • many open issues
  • but also many developments e.g. EGEE working
    with VDT principals etc
  • what can we expect from NMI et al?

12
Recommendations: Grids & Facilities
  • We recommend that the issue of authentication
    in heterogeneous environments (including Kerberos
    credentials) for all of CMS International should
    be further studied.
  • have started a project that addresses these issues:
    the VOX project, with US CMS, Fermilab, BNL, iVDGL
    (Indiana, Fermilab)
  • US CMS registration services, and Local
    authorization/authentication
  • address the labs' Kerberos-based vs Grid (PKI)
    security (illustrated in the sketch below)
  • many fundamental issues related to security on
    the Grid -- see also the Fermilab Cyber Security
    report, with an endorsement for Fermilab to tackle them!
  • immediate issues like KCA/PKI addressed through the
    VOX project
  • other issues with Grid authorization will come up
    very soon
  • work with Global Grid Forum activities
  • next issues are privilege management (attributes
    like privilege, right, clearance, quota),
    policy-based resource access, secure credential
    repositories
  • these need to be addressed to even be able to
    formulate Service Level Agreements!
  • also TRUST ITR proposal of Foster et al.
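To make the Kerberos-vs-PKI gap concrete, here is a minimal sketch of the kind of certificate-subject-to-local-account lookup that registration and local authorization have to keep consistent. The DNs and account names are invented, and the quoted-DN-plus-account format follows the familiar Globus grid-mapfile convention; it is not meant as the actual VOX design.

```python
# Minimal sketch: map an authenticated Grid (PKI) identity to a local
# account, the kind of bridge needed between lab (Kerberos) accounts and
# Grid certificates. DNs and accounts below are invented; the format
# mirrors the well-known grid-mapfile convention.
import shlex

EXAMPLE_GRIDMAP = '''
"/DC=org/DC=doegrids/OU=People/CN=Example Physicist 12345" uscms01
"/DC=org/DC=doegrids/OU=People/CN=Another User 67890" uscms02
'''


def parse_gridmap(text):
    """Return {certificate subject DN: local account} from grid-mapfile text."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        dn, account = shlex.split(line)     # shlex handles the quoted DN
        mapping[dn] = account
    return mapping


def authorize(subject_dn, mapping):
    """Map a Grid identity to a local account; None means not registered here."""
    return mapping.get(subject_dn)


if __name__ == "__main__":
    gridmap = parse_gridmap(EXAMPLE_GRIDMAP)
    dn = "/DC=org/DC=doegrids/OU=People/CN=Example Physicist 12345"
    print(authorize(dn, gridmap))           # -> uscms01
```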

13
VOX Architecture
  • Many of the components exist, some are being
    developed
  • allows T1 security requirements and Grid Security
    Infrastructure to co-exist
  • provides registration services for Virtual
    Organization

14
Recommendations: Software
  • The committee concurs with the US CMS
    assessment that further reduction of manpower
    would be detrimental to the ability of the team
    to complete its mission. US CMS management should
    not allow the current manpower allocation to
    erode further and it needs to push for the needed
    ramp up in the coming years.
  • with the start of the NSF Research Program in
    FY03 and the funding guidance we have received
    for FY04 it will be feasible to stabilize and
    consolidate the CAS effort
  • CERN is looking at the experiments' software
    manpower (including physics)
  • a review in September will establish an update on
    FTE requirements w/ LCG-AA
  • also looking at physics software
  • expect that the previously planned (1 FTE) ramp-up
    for CAS in FY04 will be required
  • other areas are starting in FY04
  • Grid services support, physics support
  • relies on the availability of NSF Research
    Program funds!

15
Recommendations: Software
  • US CMS should prioritize grid projects and
    reassign people as needed in order to prevent
    schedule slip, but should avoid diverting
    personnel from other high-priority projects to
    grid projects.
  • all Grid-related efforts are now in the User
    Facilities sub-project
  • some re-assignments of deliverables/scope from
    CAS to UF has occurred, freeing corresponding CAS
    manpower (1 FTE)
  • however, there is an open position at Fermilab
    for CAS, and we have so far been unsuccessful in
    filling this slot from existing Computing
    Division manpower
  • The CAS team should provide vigorous and loud
    input to the prioritization of the LCG projects
    in order to ensure completion of those most
    critical to CMS milestones
  • our CAS engineers have strong institutional and
    work team loyalties to CMS
  • are either directly involved in the LCG project
    (e.g. Tuura, Zhen)
  • or work closely with the LCG (Tanenbaum, Wildish,
    etc)
  • this also occurs reasonably efficiently through
    the oversight committees
  • SC2, POB, GDB

16
Recommendations: Software
  • CMS should promote the tightest possible coupling
    between key CMS software efforts, with their
    specific needs, and off-project (quasi-external)
    efforts that will benefit CMS (e.g., GAE R&D).
  • we are actively addressing this
  • main off-project R&D efforts on distributed
    analysis and grid services architecture: small
    ITRs (Cal, MIT, Princeton), the Caltech GAE project
  • becoming successful in convincing the LCG and CCS
    to actively pursue an architectural effort for
    distributed analysis
  • LCG Grid Applications Group effort for analysis
    use cases, including Grid use cases (HEPCAL II)
  • ARDA: Architectural Roadmap towards Distributed
    Analysis (RTAG 11)
  • very useful PPDG CS11 activity on Interactive
    Analysis
  • June Caltech workshop discussing Grid services
    architecture for distributed analysis
  • Clarens is being tracked in CMS, presentations
    in CMS and CPT weeks
  • Clarens will be part of the DC04 analysis
    challenge
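Clarens exposes analysis services as web-service (XML-RPC over HTTP) calls; the sketch below shows what a client session could look like using Python's standard XML-RPC module. The server URL is a placeholder and the echo method is only a hypothetical example; system.listMethods is generic XML-RPC introspection, not a documented Clarens guarantee.

```python
# Sketch of calling a web-service analysis server such as Clarens from a
# user session. The URL is a placeholder and "echo" is a hypothetical
# example method; none of this reproduces the actual Clarens API.
import xmlrpc.client

SERVER_URL = "http://clarens.example.org:8080/clarens/"   # placeholder


def main():
    proxy = xmlrpc.client.ServerProxy(SERVER_URL)
    try:
        # Standard XML-RPC introspection: ask the server what it offers.
        print(proxy.system.listMethods())
        # A hypothetical call a distributed-analysis client might make.
        print(proxy.echo("hello from a physicist's desktop"))
    except (xmlrpc.client.Error, OSError) as err:
        print("server not reachable in this sketch:", err)


if __name__ == "__main__":
    main()
```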

17
Recommendations: Software
  • While the search for NSF funds through the ITR
    program appears necessary, US CMS CAS management
    should be wary of the complex relationships and
    expansive requirements that come into play in
    such an environment.
  • Yes! we are w(e)ary...
  • Nevertheless, we have submitted two ITR
    proposals: DAWN and GECSR
  • we also realize the great opportunities the ITR
    program gives
  • we need to have CS directly involved in the LHC
    computing efforts
  • we will need to implement much of what is
    described in DAWN
  • distributed workspaces are at the core of how
    distributed resources get to the users
  • this is new and exciting, and the next step for
    the Grids
  • middleware (OGSI) -- made available to individuals
    and communities through dynamic workspaces
  • through the ITR program the NSF should foster a
    scientific approach to the problem!!
  • otherwise, we will need to implement much of that
    by straight programming

18
Recommendations: Management
  • It would be useful if the US CMS Software and
    Computing Project Management Plan could be
    updated to reflect its new management within the
    US CMS Research Program.
  • there is the project management plan for S&C,
    which was written and approved before the start
    of the Research Program
  • this plan is going to be adapted, and the planned
    time scale is to have an update in draft form at
    the next Program Management Group meeting, August
    8, 2003
  • We consider it important to continue to allow
    the flexibility to shift funds among different
    components in the coming era of funding as a
    research project.
  • we agree, and this is being done through the
    Research Program Manager, Dan Green

19
Recommendations: Management
  • The committee recommends that US CMS closely
    monitor plans for providing support of externally
    provided software, and especially grid
    middleware.
  • yes, see above!

20
Project Organization
  • In April, Ian Fisk joined Fermilab as Associate
    Scientist
  • and as Level2 manager for User Facilities
  • we are in the process of defining the services
    that UF delivers
  • project deliverables are being formulated in
    terms of services
  • that allows us to map out Computing, Data,
    Support, Grid services, etc
  • a program of work is being performed in order to
    implement and run a service
  • program of work related to service, with
    sub-projects to implement/integrate/deploy etc,
    as needed, and operations teams to run services
  • tracked and managed through the resource-loaded
    WBS and schedule
  • service coordinators responsible for these main
    deliverables
  • also defining the roles of the Tier-1 Facility
    Manager, the User Services Coordinator
  • Robert Clare of UC Riverside has agreed to
    become the CAS L2 manager

21
FY03 funding
  • Very substantial cut in FY03 DOE funding guidance
    w.r.t. previous plans
  • O'Fallon letter March 2002, subsequent DOE
    guidance -> Bare Bones scope
  • US CMS S&C was able to maintain the Bare Bones
    scope of $4000k
  • DOE shortfalls mitigated by the start of the NSF
    research program funding
  • Research Program Management has allocated the
    funding accordingly
  • DOE total RP funding is $3315k
  • S&C was allocated $3115k
  • NSF total RP funding is $2500k ($1000k two-year
    grant + $1500k new)
  • S&C was allocated $750k, out of the two-year
    grant
  • total S&C funding in FY03 is $3865k (bare-bones
    profile $4005k); see the check below
  • however, due to the drastic reduction in FY03 DOE
    RP guidance (March 02)
  • S&C could not yet start all the foreseen NSF RP
    activities (leadership profile)
  • For FY04 we need DOE to ramp to the required
    level so that we can realize the potential of
    the US LHC program!
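A small check of the FY03 arithmetic quoted above (all figures in $k, taken from this slide):

```python
# FY03 S&C funding check, figures in $k as quoted on this slide.
doe_sc = 3115            # S&C share of the $3315k DOE Research Program funds
nsf_sc = 750             # S&C share drawn from the two-year NSF grant
total = doe_sc + nsf_sc
bare_bones = 4005
print(total, bare_bones - total)   # 3865 140 -> about $140k below bare bones
```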

22
FY03 Funding Status
  • Budget allocation for FY03

23
FY03 Funding Status
  • Effort as Reported and ACWP as Invoiced

24
Plans for FY04
  • NSF Leadership profile!
  • Physics support and Grid services support start
  • Tier-2 pilot implementation
  • CAS gets an additional FTE
  • Plans for US-CERN Edge Computing Systems
  • Tier-1 stays at 13 FTE
  • rather small T1 upgrades

Bare-Bones Scope BCWS FY02-FY05
25
FY04 US LHC Edge Computing Systems
  • following discussions around the C-RRB meeting,
    US CMS is scoping out a project with the LCG and CCS
  • look at a flexible way of moving streams of raw
    data to the Tier-1 centers
  • with some "intelligence" in doing some selection
    and thus producing pre-selected data streams,
  • to be able to optimize access to it later
  • (and eventually enable re-processing, or even
    processing of dedicated lower-priority triggers)
  • while at the same time ensuring a consistent and
    complete 2nd set of raw data.
  • provide the US Edge-Computing Systems needed at
    CERN for flexible streaming of data to the US
  • part of these facilities would be located at the
    CERN Tier-0
  • main function is to enable LHC computing in the
    US and to facilitate streaming and distribution
    of data to Tier-1 centers off the CERN site
  • The associated equipment would also be available
    to the LCG in helping the planned tests for the
    computing model
  • exact scope of this project being discussed in US
    CMS and with the LCG facilities group
  • going to be looked at by CMS within the next
    couple of months
  • will eventually increase the CMS physics reach,
  • help to understand better the issues of such a
    distributed data model.
  • inputs to the "economics model", PRS
    participation in the project probably would be
    appropriate.
  • to be presented to US CMS, then to the CMS SB for
    approval; expected equipment costs about $500k

26
FY03 Accomplishments
  • Prototyped Tier-1 and Tier-2 centers and deployed
    a Grid System
  • Participated in a world-wide 20TB data production
    for HLT studies
  • US CMS delivered key components: IMPALA, DAR
  • Made available large data samples (Objectivity
    and nTuples) to the physics community
  • -> successful submission of the CMS DAQ TDR
  • Worked with Grid Projects and VDT to harden
    middleware products
  • Integrated the VDT middleware in CMS production
    system
  • Deployed Integration Grid Testbed and used for
    real productions
  • Decoupled the CMS framework from Objectivity
  • allows data to be written persistently as ROOT/IO
    files
  • Released a fully functional Detector Description
    Database
  • Released Software Quality and Assessment Plan
  • well underway getting ready for DC04 -> Ian
    Fisk's talk

27
Shifting the Focus to Distributed Analysis
  • going forward to analysis means a significant
    paradigm shift
  • from well-defined production jobs to interactive
    user analysis
  • from DAGs of processes to sessions and stateful
    environments
  • from producing sets of files to accessing massive
    amounts of data
  • from files to data sets and collections of objects
  • from using essentially raw data to complex
    layers of event representation
  • from assignments from the RefDB to Grid-wide
    queries (see the sketch at the end of this slide)
  • from user registration to enabling sharing and
    building communities
  • are the (Grid) technologies ready for this?
  • there will be a tight inter-play between
    prototyping the analysis services and developing
    the lower level services and interfaces
  • how can we approach a roadmap towards an
    Architecture?
  • what are going to be the new paradigms that
    will be exposed to the user?
  • user analysis session transparently extended to a
    distributed system
  • but requires a more prescriptive and declarative
    approach to analysis
  • set of services for collaborative work
  • new paradigms beyond analysis
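To illustrate the shift from explicit file lists toward datasets, object collections and Grid-wide queries, here is a toy sketch; the catalogue schema, field names and dataset names are invented for illustration and do not describe an existing CMS or Grid service.

```python
# Toy illustration of declarative, Grid-wide dataset selection instead of
# hand-maintained file lists. The catalogue layout and names are invented.
from dataclasses import dataclass


@dataclass
class DatasetEntry:
    dataset: str        # logical dataset name
    tier: str           # event representation, e.g. "Hit", "Digi", "DST"
    site: str           # where a replica lives
    n_events: int


CATALOGUE = [
    DatasetEntry("example_ttbar_inclusive", "Digi", "FNAL", 2_000_000),
    DatasetEntry("example_ttbar_inclusive", "DST", "Caltech", 500_000),
    DatasetEntry("example_qcd_background", "Digi", "UFlorida", 5_000_000),
]


def query(catalogue, **selection):
    """Declarative selection: the user states *what*, not which files where."""
    return [entry for entry in catalogue
            if all(getattr(entry, key) == value
                   for key, value in selection.items())]


# "Give me the DST-level view of this dataset, wherever it is replicated."
for entry in query(CATALOGUE, dataset="example_ttbar_inclusive", tier="DST"):
    print(entry.site, entry.n_events)      # Caltech 500000
```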

28
(No Transcript)
29
LHC Multi-Tier Structured Computing Resources
Peta Scales!!
30
Building a Production-Quality Grid
  • We need to make the Grid work, so that large
    resources become available to the experiments

31
Getting CMS DC04 underway
  • pre-challenge Production to provide 50 million
    events for DC04
  • Generation already started
  • Generator level pre-selection in place to enrich
    simulated background samples
  • Goal is 50 million useful events, simulated and
    reconstructed
  • To fit scale of DC04
  • Also fits scale of first round of Physics TDR
    work
  • Simulation will start in July
  • Assignments going out now
  • all US sites already certified
  • G4 and CMSIM, some samples simulated with both
  • Expect G4 version to go through a few versions
    during production
  • Mix and choice of G4/CMSIM to be determined
  • Data production rate will be about 1TB/day
  • Simulated Data kept at Tier-1 centers
  • Reconstructed data sent to CERN for DC04 proper
    in spring 2004

32
CMS Computing TDR Schedule

33
Plan for CMS DC04
34
Preparation of the LHC Grid in the US: Grid3
  • Prepare for providing the Grid services for DC04
    and beyond: the Grid3 project
  • contributions from US LHC and the Trillium
    projects (PPDG, iVDGL, GriPhyN)
  • integrate the existing US CMS and US Atlas
    testbeds, including the existing iVDGL
    infrastructure
  • deploy a set of emerging Grid services built upon
    VDT, EDG components (provided by LCG) and some
    specific U.S. services, as required
  • e.g. monitoring, VO management etc
  • demonstrate functionalities and capabilities,
    prove that the U.S. Grid infrastructure is ready
    for real-world LHC-scale applications
  • specific well-defined performance metrics: robust
    data movement, job execution
  • Not least
  • a demonstration Grid, showcasing NSF and DOE
    infrastructure achievements
  • provides a focal point for participation of
    others
  • e.g. Iowa, Vanderbilt (BTeV), etc

35
Grid3 is Trillium and US LHC
  • Multiple Virtual Organization persistent Grid for
    application demonstrators that hums along (not
    expected to purr yet)
  • Well aligned with deployment and use of LCG-1 to
    provide US peer services in fall 2003
  • Well aligned with preparations for US ATLAS and
    US CMS Data Challenges
  • Demonstrators are production, data management or
    analysis applications needed for data challenges.
  • Application demonstrators running in production
    environment to show capabilities of the grid and
    allow testing of the envelope.
  • Clear performance goals geared at DC04
    requirements. Metrics will be defined and
    tracked.
  • Computer Science application demonstrators should
    help determine benefits from and readiness of
    core technologies.
  • Grid3 will be LCG-1 compliant wherever possible.
  • This could mean, for example, using the same VDT
    release as LCG-1, but in practice will probably
    mean service-compatible versions

36
LCG Middleware Layers
  • LCG Prototype LCG-1 Architecture of Middleware
    Layers

Middleware!!
37
Middleware continues to be a focus
  • LCG is struggling with middleware components -
    commend VDT project

38
Building the LHC GRID
  • Working with Grid middleware providers, we have
    found ways to make a CMS Grid work
  • This way large computing resources become available
  • if the GRID software and GRID management can be
    good enough
  • GRID Software still has a long way to go
  • And it is only the basic layer
  • Much of what we have is a good prototype
  • Need to address how to approach the next level of
    functionality
  • specifically Globus Toolkit 3 and Open Grid
    Services Infrastructure (OGSI)
  • US and CERN/Europe will need to find a way to
    address the maintenance issue
  • Basic Grid functionality now works
  • Working with LCG to implement and/or integrate
    other required features
  • See a path towards getting the Grid Middleware
    for basic CMS production, data management in place

39
HEP-specific Grid Layers, End-to-end Services
  • HEP Grid Architecture (H. Newman)
  • Layers Above the Collective Layer
  • Physicists' Application Codes
  • Reconstruction, Calibration, Analysis
  • Experiments' Software Framework Layer
  • Modular and Grid-aware Architecture able to
    interact effectively with the lower layers
    (above)
  • Grid Applications Layer (Parameters and
    algorithms that govern system operations)
  • Policy and priority metrics
  • Workflow evaluation metrics
  • Task-Site Coupling proximity metrics (toy example
    at the end of this slide)
  • Global End-to-End System Services Layer
  • Workflow monitoring and evaluation mechanisms
  • Error recovery and long-term redirection
    mechanisms
  • System self-monitoring, steering, evaluation and
    optimization mechanisms
  • Monitoring and Tracking Component performance
  • Already investigating a set of prototypical
    services and architectures

(I.Foster et al.)
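As a toy example of the "parameters and algorithms that govern system operations" in the Grid Applications Layer, the sketch below ranks candidate sites for a task by a simple task-site coupling score; the weights, metrics and site numbers are invented for illustration.

```python
# Toy task-site coupling metric: rank candidate sites by data proximity,
# current load and a policy priority. Weights and site data are invented.
SITES = {
    "FNAL":     {"data_fraction": 0.9, "queue_load": 0.7, "policy_weight": 1.0},
    "Caltech":  {"data_fraction": 0.4, "queue_load": 0.2, "policy_weight": 0.8},
    "UFlorida": {"data_fraction": 0.1, "queue_load": 0.1, "policy_weight": 0.8},
}


def coupling_score(site, w_data=0.6, w_load=0.3, w_policy=0.1):
    """Higher is better: prefer local data, idle queues and favored policy."""
    return (w_data * site["data_fraction"]
            + w_load * (1.0 - site["queue_load"])
            + w_policy * site["policy_weight"])


ranking = sorted(SITES, key=lambda name: coupling_score(SITES[name]), reverse=True)
print(ranking)   # ['FNAL', 'Caltech', 'UFlorida'] with these invented numbers
```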
40
Grid Services Architecture
  • We have seen Grid services technologies, e.g.
    OGSI -- how about architectures?

41
Distributed Analysis
  • Unclear in the LHC community how we should
    approach that new focus
  • Distributed Analysis effort not yet projectized
    in the US CMS WBS
  • Need to understand what should be in CMS, in LCG
    AA, in R&D projects
  • perception of (too many) independent
    (duplicating) efforts (?)
  • What can we test/use in DC04?
  • Some prototypes can be tested soon and for DC04
  • What are the assumptions they make on the
    underlying GRID?
  • On physicists' work patterns?
  • How are their architectures similar/different?
  • Are there similarities that can sensibly be
    abstracted into common layers?
  • Or is it premature for that?
  • Diversity is probably good at this time!
  • LCG RTAG on "An Architectural Roadmap towards
    Distributed Analysis"
  • review existing efforts, confront them with the
    HEPCAL use cases, consider interfaces between
    Grid, LCG and application services,
  • To develop a roadmap specifying wherever possible
    the architecture, the components and potential
    sources of deliverables to guide the medium term
    (2 year) work of the LCG and the DA planning in
    the experiments.

42
LHC Architecture for Distributed Analysis

43
DAWN: Scientists within Dynamic Workspaces!
  • How will Communities of Scientists Work Locally
    Using the Grid?
  • Infrastructure for sharing, consistency of
    physics and calibration data, software

Communities!!
44
Dynamic Workspaces: DAWN
  • This is About Communities of Scientists Doing
    Research in a Global Setting

45
Dynamic Workspaces: DAWN
  • DAWN Proposal Focusses on Dynamic Workspaces
    within the Peta-scale Grid!

46
The Vision of Dynamic Workspaces
  • Science has become a vastly more complex human
    endeavor. Scientific collaborations are becoming
    not only larger, but also more distributed and
    more diverse. The scientific community has
    responded to the challenge by creating global
    collaborations, petascale data infrastructures
    and international computing grids. This
    proposal is about taking the next step, to
    research, prototype and deploy the user level
    tools that will enable far-flung scientific
    collaborators to work together as collocated
    peers. We call this new class of scientific tool
    a dynamic workspace and it will fundamentally
    change the way we will do science in the
    future. Dynamic workspaces are environments for
    scientific reasoning and discovery. These
    environments are based on advanced grid
    middleware but extend beyond grids. Their design
    and development will require the creation of a
    multidisciplinary team that combines the skills
    of computer scientists, technologists and those
    of domain experts. We are focusing on the needs
    of the particle physics community, specifically
    those groups working on the LHC. Dynamic
    Workspaces are managed collections of objects and
    tools hosted on a grid based distributed
    computing and collaboration infrastructure.
    Workspaces extend the current capabilities of the
    grid by enabling distributed teams to work
    together on complex problems, which require grid
    resources for analysis. Dynamic workspaces
    expand the capabilities of existing scientific
    collaborations by creating the ability to
    construct and share the scientific context for
    discovery.

47
CS Research and Work Areas
  • Workspaces are about building the capability to
    involve a community in the process of doing
    science. To develop a community oriented approach
    to science requires progress in three key areas
    of Computer Science research
  • Knowledge Management
  • the techniques and tools for collecting and
    managing the context in which the mechanical
    aspects of the work are done. The resulting
    methods and systems will enable the workspace to
    not only hold the scientific results, but also to
    be able to explain and archive the reasons for
    progress.
  • Resource Management
  • workspaces will sit at the top of resource
    pyramids. Each workspace will have access to
    many types of resources from access to data to
    access to supercomputers. A coherent set of
    policies and mechanisms will be developed to
    enable the most effective use of the variety of
    resources available.
  • Interaction Management
  • The many objects, people and resources in a
    workspace need to be managed in a way that key
    capabilities are available when and where users
    need them. The task of coupling the objects in a
    workspace and facilitating their use by people is
    the goal of interaction management.

48
DAWN Model for CS Applications collaboration
49
DAWN Project Structure ITR US LHC
  • CS Applications Areas LHC Systems Integration

50
Grid Services Infrastructure
  • Grid Layer: Abstraction of Facilities -- Rich
    with Services!

Services!!
51
Steps towards Grid Service Infrastructure
  • Initial Testbeds in US Atlas and US CMS,
    consolidation of middleware to VDT
  • VDT agreed as basis of emerging LCG service,
    basis of the EDG 2.0 distribution
  • Build a functional Grid between Atlas and CMS in
    the US: Grid3
  • based on VDT, with a set of common services: VO
    management, information services, monitoring,
    operations, etc
  • demonstrate this infrastructure using
    well-defined metrics for LHC applications
  • November: CMS demonstration of reliable massive
    production (job throughput), robust data
    movement (TB/day), consistent data management
    (files, sites) -- see the sketch below
  • at the scale of the 5% data challenge DC04, planned
    for Feb. 2004
  • Get LHC Grid stakeholders together in the US and
    form the Open Science Consortium
  • LHC labs, Grid PIs, Tera Grid, Networking
  • develop a plan for implementing and deploying the
    Open Science Grid, peering with the EGEE in Europe
    and Asia, to provide LHC infrastructure
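A sketch of the kind of check the "consistent data management (files, sites)" metric implies: compare the catalogue's view of replicas with what each site actually reports. Site names, logical file names and the catalogue layout are invented for this illustration.

```python
# Consistency check sketch: catalogue view vs. per-site storage reports.
# All names and contents are invented for illustration.
CATALOGUE = {                      # logical file name -> sites expected to hold it
    "lfn:ttbar_0001.root": {"FNAL", "UCSD"},
    "lfn:ttbar_0002.root": {"FNAL"},
    "lfn:qcd_0001.root": {"UFlorida"},
}

SITE_REPORTS = {                   # what each site actually reports on disk
    "FNAL": {"lfn:ttbar_0001.root", "lfn:ttbar_0002.root"},
    "UCSD": set(),                                          # replica missing
    "UFlorida": {"lfn:qcd_0001.root", "lfn:orphan.root"},   # unregistered file
}


def check_consistency(catalogue, site_reports):
    missing, orphans = [], []
    for lfn, sites in catalogue.items():
        for site in sites:
            if lfn not in site_reports.get(site, set()):
                missing.append((site, lfn))        # catalogued but not on disk
    for site, files in site_reports.items():
        for lfn in files:
            if site not in catalogue.get(lfn, set()):
                orphans.append((site, lfn))        # on disk but not catalogued
    return missing, orphans


missing, orphans = check_consistency(CATALOGUE, SITE_REPORTS)
print("missing replicas:", missing)    # [('UCSD', 'lfn:ttbar_0001.root')]
print("unregistered files:", orphans)  # [('UFlorida', 'lfn:orphan.root')]
```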

52
Proposed OSG Goals and Scope
  • We have started to develop a plan
  • and a proposal to the DOE and NSF, over the next
    few months
  • and to forge an organization and collaboration
  • building upon the previously proposed Open
    Science Consortium
  • to build an Open Science Grid in the US on a
    Peta-Scale
  • for the LHC and other science communities
  • Goals and Scope
  • Develop and deploy services and capabilities for
    a Grid infrastructure that would make LHC
    computing resources, and possibly other computing
    resources for HEP and other sciences (Run 2
    etc), available to the LHC Science community,
  • as a functional, managed, supported and
    persistent US national resource.
  • Provide a persistent 24x7 Grid that peers and
    interoperates, interfaces to and integrates with,
    other national and international Grid
    infrastructures
  • in particular the EGEE in Europe (which will
    provide much of the LHC Grid resources in Europe
    to the LCG)
  • This would change how we do business in US LHC
    and maybe at Fermilab

53
A Project to Build the Open Science Grid
  • Scope out services and interface layers between
    Applications and Facilities
  • LHC already has identified funding for the fabric
    and its operation
  • Work packages to acquire and/or develop enabling
    technologies as needed
  • goal to enable "persistent organizations" like
    the national labs to provide those
    infrastructures to the application communities
    (CMS, Atlas, etc)
  • develop the "enabling technologies" and systems
    concepts that allow the fabric providers to
    function in a Grid environment, and the
    applications and users to seamlessly use it for
    their science
  • develop well defined interfaces and a services
    architecture
  • issues like distributed databases, object
    collections, global queries
  • work on the technologies enabling end-to-end
    managed, resilient and fault-tolerant systems:
    networks, site facilities, cost estimates
  • devise strategies for resource use, and
    dependable "service contracts"
  • Put up the initial operation infrastructure

54
Initial Roadmap to Open Science Grid
  • PMG members and ASCB members have seen a first
    draft of the Open Science Grid document
  • Briefing of Computing Division, strong
    endorsement from CD head
  • Discussion with Atlas and general agreement
  • Discussions with Grid PIs on OSG and Grid3 (iVDGL
    and PPDG steering)
  • started formulation of the Grid3 plan, and task
    force to define Grid3 workplan
  • initial discussions with DOE/NSF
  • starting to develop a technical document
  • Workshop at Caltech June 2003 and start of ARDA
  • starting the Grid services architecture for Grid
    Enabled Analysis
  • starting to define examples for service
    architectures and interfaces
  • Roadmap towards Architecture for all four LHC
    experiments in October
  • Planning for initial Open Science Consortium
    meeting in July

55
Summary: US CMS Grid Activity Areas
  • Peta-Scales: Building Production-Quality Grids
  • US CMS pre-challenge production, LCG-1, Grid3
  • Middleware: Drafting the Grid Services
    Architecture
  • VDT and EGEE, DPE and LCG-1, ARDA and GGF
  • Communities: Dynamic Workspaces and
    Collaboratories
  • DAWN and GECSR NSF ITR
  • Services: Building the Grid Services
    Infrastructure for providing the persistent
    services and the framework for running the
    infrastructure
  • Open Science Grid and Open Science Consortium,
    labs and DOE
  • Adapting the US project to provide the Grid
    Services Infrastructure

56
Conclusions on US CMS S&C
  • The US CMS S&C Project is delivering a working Grid
    environment, with a strong participation of
    Fermilab and U.S. Universities
  • There is still a lot of R&D and most of the
    engineering to do
  • With a strong operations component to support
    physics users
  • US CMS has deployed an initial Grid system that is
    delivering to CMS physicists and shows that the
    US Tier-1/Tier-2 User Facility system can indeed
    work to deliver effort and resources to US CMS!
  • With the funding advised by the funding agencies
    and project oversight, we will have the manpower
    and equipment at the lab and universities to
    participate strongly in the CMS data
    challenges,
  • bringing the opportunity for U.S. leadership into
    the emerging LHC physics program
  • Next Steps are crucial for achieving an LHC
    computing environment that is truly reaching out
    into the US
  • a global production of 50M events in preparation
    of the next Data Challenge, run in Grid-mode in
    the US
  • Grid3 and the integration with the European Grid
    efforts and the LCG
  • performing the DC, streaming data at the 5% level,
    throughout the integrated LHC Grid

57
CMS Timelines
(Timeline chart spanning the CMS/CCS, CMS/PRS, LCG and POOL tracks, 2003-2006)
  • 2003: New Persistency Validated; General Release;
    Physics Model Drafted; OSCAR Validated; LCG-1
    Middleware and Centers Ramping Up; DC04 Test LCG-1;
    Start PTDR; Computing Model Drafted
  • 2004: LCG-3 Final Prototype; Physics TDR Work;
    DC05 Test LCG-3; Computing TDR
  • 2005: Complete; LCG TDR; Computing MOUs; Physics
    TDR
  • 2006: Purchasing; DC06 Readiness Check
58
CCS Level 2 milestones
(Chart of CCS Level 2 milestones; DC04 and LCG-1 indicated)
59
CMS Milestones v33 (June 2002)