Title: SciDAC Center for Enabling Distributed Petascale Science
1 SciDAC Center for Enabling Distributed Petascale Science
- Argonne National Laboratory
- Fermi National Accelerator Laboratory
- Lawrence Berkeley National Laboratory
- University of Southern California
- University of Wisconsin
- www.cedps.net
- Jennifer Schopf, ANL
- jms@mcs.anl.gov
2 The Petascale Data Challenge
(Diagram: DOE facilities, massive data, remote distributed users)
- DOE facilities generate many petabytes of data (2 petabytes = all U.S. academic research libraries!)
- Remote users (at labs, universities, industry) need data!
- Rapid, reliable access is key to maximizing the value of facilities
3 Bridging the Divide (1): Move Data to Users When and Where Needed
(Diagram: "Deliver this 100 Terabytes to locations A, B, C by 9am tomorrow")
- Fast: >10,000x faster than usual Internet
- Reliable: recover from many failures
- Predictable: data arrives when scheduled
- Secure: protect expensive resources and data
- Scalable: deal with many users, much data
4 Bridging the Divide (2): Allow Users to Move Computation Near Data
(Diagram: "Perform my computation F on datasets X, Y, Z")
- Science services provide analysis functions near the data source
- Flexible: easy integration of functions
- Secure: protect expensive resources and data
- Scalable: deal with many users, much data
5 Bridging the Divide (3): Troubleshoot End-to-End Problems
(Diagram: "Why did my data transfer (or remote operation) fail?")
- Identify and diagnose failures and performance problems
- Instrument: include monitoring points in all system components
- Monitor: collect data in response to problems
- Diagnose: identify the source of problems
6 Overview
- For Each Area
- Current Work
- An example in current use
- How to combine the tools for CEDPS
- Work with Applications
- Contributing to Globus
7 Data Services in CEDPS
- Ann Chervenak, ISI, is the CEDPS Data lead
- annc@isi.edu
- These slides are adapted from hers
- Develop tools and techniques for reliable, high-performance, secure, and policy-driven placement of data within a distributed science environment
- Data placement and distribution services that implement different data distribution and placement behaviors
- Managed Object Placement Service: enhancement to today's GridFTP that allows for management of
- Space
- Bandwidth
- Connections
- Other resources needed at the endpoints of data transfers
8 Existing Globus Data Services
- Tools for efficient, reliable data management
- GridFTP
- Fast, secure data transport
- The Reliable File Transfer Service (RFT)
- Data movement services for GT4
- The Replica Location Service (RLS)
- Distributed registry that records locations of data copies
- The Data Replication Service (DRS)
- Integrates RFT and RLS to replicate and register files
- The Data Access and Integration Service (DAIS)
- Service to access relational and XML databases
9 GridFTP
- A high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks
- FTP with well-defined extensions
- Uses basic Grid security (control and data channels)
- Multiple data channels for parallel transfers
- Partial file transfers
- Third-party (direct server-to-server) transfers (example below)
- Reusable data channels
- Command pipelining
- GGF recommendation GFD.20
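As a minimal sketch of the parallel-stream and third-party features above, the standard globus-url-copy client can be driven from Python; this assumes the client is installed, a valid proxy credential exists, and the host names and paths are placeholders.

    # Sketch: third-party GridFTP transfer with parallel data channels,
    # driven via the globus-url-copy command-line client.
    # Host names and paths are placeholders.
    import subprocess

    src = "gsiftp://source.example.org/data/run042.dat"
    dst = "gsiftp://dest.example.org/archive/run042.dat"

    cmd = [
        "globus-url-copy",
        "-p", "8",    # 8 parallel TCP streams on the data channel
        "-vb",        # report transfer rate while running
        src, dst,     # both endpoints are GridFTP servers: third-party transfer
    ]
    subprocess.run(cmd, check=True)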
10 GridFTP in GT4
(Chart: disk-to-disk throughput on TeraGrid)
- 100% Globus code
- No licensing issues
- Stable, extensible
- IPv6 support
- XIO for different transports
- Striping for multi-Gb/sec wide-area transport
- Pluggable
- Front-end: e.g., future WS control channel
- Back-end: e.g., HPSS, cluster file systems
- Transfer: e.g., UDP, NetBLT transport
11 GridFTP Does NOT Require GSI
- All the GridFTP speed and features with the following security options:
- Anonymous mode
- Clear text passwords
- GridFTP-SSH
- GridFTP-SSH: only needs SSH keys on the server
- No certificates or CAs
- Keys already exist on most systems
- SSH is used to form the control channel
- Only need ssh running on the server side
- Standard login audit trails
12 Reliable File Transfer
- Service that accepts requests for third-party file transfers
- Maintains state in a DB about ongoing transfers
- Recovers from RFT service failures
- Increased reliability because state is stored in a database
- Service interface
- The client can submit the transfer request and then disconnect and go away
- Similar to a job scheduler for transfer jobs
- Two ways to check status
- Subscribe for notifications
- Poll for status (can check for missed notifications; sketch below)
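The "submit, disconnect, then poll" pattern above might look like the following sketch; the client object and its submit/status methods are hypothetical stand-ins for whatever RFT client bindings a deployment provides, not the GT4 API itself.

    # Sketch of the RFT usage pattern: submit a third-party transfer
    # request, disconnect, and later poll for status.
    # `client` is a hypothetical stand-in for real RFT client bindings.
    import time

    TRANSFERS = [
        ("gsiftp://a.example.org/data/f1", "gsiftp://b.example.org/data/f1"),
        ("gsiftp://a.example.org/data/f2", "gsiftp://b.example.org/data/f2"),
    ]

    def submit_and_wait(client, transfers, poll_seconds=30):
        job_id = client.submit(transfers)   # after this, the caller may disconnect
        while True:                         # later: poll, catching missed notifications
            status = client.status(job_id)
            if status in ("Done", "Failed"):
                return status
            time.sleep(poll_seconds)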
13 Reliable File Transfer: Third-Party Transfer
(Diagram: an RFT client sends SOAP messages to the RFT service and optionally receives notifications; the RFT service drives GridFTP servers at both endpoints)
- Fire-and-forget transfer
- Web services interface
- Many files and directories
- Integrated failure recovery
- Has transferred 900K files
14 The Globus Replica Location Service
- A Replica Location Service (RLS) is a distributed registry that records the locations of data copies and allows replica discovery
- RLS maintains mappings between logical identifiers and target names
- Must perform and scale well: support hundreds of millions of objects, hundreds of clients
- E.g., the LIGO (Laser Interferometer Gravitational Wave Observatory) project
- RLS servers at 10 sites
- Maintain associations between 11 million logical file names and 120 million physical file locations
15 Replica Location Service
- Distributed registry
- Records the locations of data copies for replica discovery
- Maintains mappings between logical identifiers and target names
- Local Replica Catalogs (LRCs) contain consistent information about logical-to-target mappings
- Replica Location Index (RLI) nodes aggregate information about one or more LRCs (illustration below)
- LRCs use soft-state update mechanisms to inform RLIs about their state; relaxed consistency of the index
- Optional compression of state updates reduces communication, CPU, and storage overheads
- Membership service registers participating LRCs and RLIs and deals with changes in membership
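The LRC/RLI split above can be illustrated with plain Python data structures (this is an illustration of the structure, not the RLS API; file and site names are placeholders).

    # Illustration of the RLS structure described above (not the RLS API):
    # Local Replica Catalogs (LRCs) map logical names to target URLs;
    # a Replica Location Index (RLI) records only which LRC holds each name.
    lrc_site_a = {"frame-0042.dat": ["gsiftp://site-a.example.org/frames/frame-0042.dat"]}
    lrc_site_b = {"frame-0042.dat": ["gsiftp://site-b.example.org/frames/frame-0042.dat"]}

    # Soft-state updates: each LRC periodically tells the RLI which logical
    # names it currently maps (the index is allowed to be slightly stale).
    rli = {}
    for site, lrc in (("site-a", lrc_site_a), ("site-b", lrc_site_b)):
        for lfn in lrc:
            rli.setdefault(lfn, set()).add(site)

    # Replica discovery: ask the RLI which LRCs to query, then resolve targets there.
    lfn = "frame-0042.dat"
    print(lfn, "is registered at LRCs:", sorted(rli.get(lfn, set())))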
16 Motivation for Higher-Level Data Management Services
- Data-intensive applications need higher-level data management services that integrate lower-level Grid functionality
- Efficient data transfer (GridFTP, RFT)
- Replica registration and discovery (RLS)
- Eventually validation of replicas, consistency management, etc.
- Goal is to generalize the custom data management systems developed by several application communities
- Eventually plan to provide a suite of general, configurable, higher-level data management services
- Globus Data Replication Service (DRS) is the first of these services
17 Data Replication Service
- Included in the GT 4.0.2 release
- Design based on the publication component of the LIGO Lightweight Data Replicator system
- Developed by Scott Koranda
- Client specifies (via the DRS interface) which files are required at the local site
- DRS uses (sketch below):
- Globus Delegation Service to delegate proxy credentials
- RLS to discover where replicas exist in the Grid
- Selection algorithm to choose among available source replicas (provides a callout; default is random selection)
- Reliable File Transfer (RFT) service to copy data to the site
- Via the GridFTP data transport protocol
- RLS to register new replicas
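The DRS steps listed above can be summarized as the following illustrative sketch; every function here is a placeholder for the corresponding GT4 service interaction, not a real binding.

    # Illustrative sketch of the DRS flow: discover, select, transfer, register.
    # The rls/rft objects and their methods are placeholders, not GT4 bindings.
    import random

    def replicate_to_local_site(lfn, local_url, rls, rft):
        """Make a local copy of the file named by lfn and register it."""
        sources = rls.query(lfn)           # 1. discover existing replicas via RLS
        if not sources:
            raise RuntimeError("no replica registered for %s" % lfn)
        source = random.choice(sources)    # 2. default selection is random
                                           #    (DRS allows a selection callout here)
        rft.transfer(source, local_url)    # 3. copy via RFT / GridFTP
        rls.register(lfn, local_url)       # 4. register the new replica in RLS
        return local_url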
18 NeST
- Software network storage appliance
- Provides guaranteed storage allocation
- Allocation units, called lots, provide guaranteed space for a period of time
- http://www.cs.wisc.edu/condor/nest/
19 Stork
- Scheduling and management of data placement jobs
- Provides for multiple transfer mechanisms and retries in the event of transient failures
- Integrated into the Condor system
- Stork jobs can be managed with Condor's workflow management software (DAGMan)
- http://www.cs.wisc.edu/condor/stork/
20 Storage Resource Manager (SRM)
- Provides protocol negotiation
- Dynamic transfer URL allocation
- Advanced space and file reservation
- Reliable replication mechanisms
- http://computing.fnal.gov/ccf/projects/SRM/
21 dCache
- Manages individual disk storage nodes
- Makes them appear as a single storage space with a single file system root
- SRM v1 and v2 interfaces
- Supports GridFTP and other transports for whole-file data movement
- Includes a proprietary POSIX-like interface (dcap) for random access to file contents
- http://www.dcache.org/
22 Globus Data Tools in Production: The LIGO Project
- Laser Interferometer Gravitational Wave Observatory
- Data sets first published at Caltech
- Publication includes specification of metadata attributes
- Data sets may be replicated at up to 10 LIGO sites
- Sites perform metadata queries to identify desired data
- Pull copies of data from Caltech or other LIGO sites
- Customized data management system: the Lightweight Data Replicator system (LDR)
- Built on top of Globus data tools: GridFTP, RLS
23 Globus Data Tools in Production: The Earth System Grid
- Climate modeling data (CCSM, PCM)
- Data management coordinated by the ESG portal
- RLS, GridFTP
- Datasets stored at NCAR
- 64.41 TB in 397,253 total files
- IPCC data at LLNL
- 26.50 TB in 59,300 files
- Data downloaded: 56.80 TB in 263,800 files
- Avg. 300 GB downloaded/day
- All files registered and located using RLS, moved among sites using GridFTP
24 Data Services in CEDPS
- Develop tools and techniques for reliable, high-performance, secure, and policy-driven placement of data within a distributed science environment
- Data placement and distribution services that implement different data distribution and placement behaviors
- Managed Object Placement Service: enhancement to today's GridFTP that allows for management of
- Space
- Bandwidth
- Connections
- Other resources needed at the endpoints of data transfers
- Services to move computation to data
25 Layered Architecture
26 Higher-Level Data Placement Services
- Decide where to place objects and replicas in the distributed Grid environment
- Policy-driven, based on the needs of the application
- Effectively creates a placement workflow that is then passed to the Reliable Distribution Service layer for execution
- Simplest: push- or pull-based service that places an explicit list of data items
- Similar to existing DRS
- Metadata-based placement
- Decide where data objects are placed based on results of metadata queries for data with certain attributes
- Example: LIGO replication
27 Higher-Level Data Placement Services
- N-copies: maintain N copies of data items (sketch below)
- Placement service checks existing replicas
- Creates/deletes replicas to maintain N
- Keeps track of lifetime of allocated storage space
- Example: UK QCDGrid
- Publication/subscription
- Allows sites or clients to subscribe to topics of interest
- Data objects are placed or replicated as indicated by these subscriptions
- Question: What higher-level placement policies would be desirable for Fermi applications?
- High energy physics
- Others
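A minimal sketch of the N-copies policy described above, assuming placeholder catalog and distribution-layer helpers standing in for RLS/RFT-style services.

    # Sketch of an N-copies placement policy: keep exactly n replicas of each
    # item, creating or deleting copies as needed. The catalog and
    # distribution objects are placeholders for RLS/RFT-style services.
    def enforce_n_copies(lfn, n, catalog, sites, distribution):
        replicas = catalog.locations(lfn)            # current replica locations
        if len(replicas) < n:
            # create replicas at sites that do not yet hold the item
            candidates = [s for s in sites if s not in replicas]
            for site in candidates[: n - len(replicas)]:
                distribution.copy(lfn, source=replicas[0], dest=site)
                catalog.register(lfn, site)
        elif len(replicas) > n:
            # remove surplus replicas (and release their storage allocation)
            for site in replicas[n:]:
                distribution.delete(lfn, site)
                catalog.unregister(lfn, site)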
28 Reliable Distribution Layer
- Responsible for carrying out the distribution or placement plan generated by the higher-level service
- Examples:
- Reliable File Transfer Service
- U Wisconsin Stork
- LBNL DataMover-Lite
- Provide feedback to higher-level placement services on the outcome of the placement workflow
- Call on lower-level services to coordinate
29 Managed Object Placement Service
- Building blocks:
- Data Transfer Service
- GridFTP server; needs resource management
- Disk Space Manager
- Provides local storage allocation
- NeST storage appliance: provides storage and connection management and bookkeeping
- Stork: data placement manager and matchmaker for co-scheduling of connections between the endpoints
- dCache storage management (Fermi): improve scalability and fault tolerance; jointly develop interfaces and interaction with GridFTP
- Storage Resource Manager
- Connection management: incoming and outgoing
- Scheduler (dCache, Stork, RFT) includes queue
- Eventually interact with both endpoints of the transfer
30 Science Services in CEDPS
- Kate Keahey, ANL, is the CEDPS Scalable Services lead
- keahey@mcs.anl.gov
- Some slides compliments of Kate
- Develop tools and techniques for construction, operation, and provisioning of scalable science services
- Service construction tools that make it easy to take application code (whether simulation, data analysis, command-line program, or library function) and wrap it as a remotely accessible service
- Service provisioning tools that allow dynamic management of the computing, storage, and networking resources required to execute a service, and the configuration of those resources to meet application requirements
- Services to move computation to data
31 PyGlobus
- Python implementation of the WS-Resource Framework
- Includes support for WS-Addressing, WS-Notification, WS-Lifetime management, and WS-Security
- Compatible with the GT4 Java WS Core
- A lightweight standalone container
- Automatic service startup on container start
- Basic API for resource persistence and recovery
- Support for wrapping legacy codes and command-line applications as Grid services (sketch below)
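The "wrap a legacy command-line program" idea might look like the following sketch: a Python function that shells out to an existing analysis executable, which a hosting container could then expose as a service operation. The executable name and its argument conventions are placeholders; this is not the PyGlobus API itself.

    # Sketch of wrapping a legacy command-line analysis code as a callable
    # operation that a service container could expose. The "analyze"
    # executable and its flags are placeholders.
    import subprocess
    import tempfile

    def run_analysis(input_path, threshold=0.5):
        """Run the legacy 'analyze' program on one input file and
        return the path of the result file."""
        out = tempfile.NamedTemporaryFile(suffix=".out", delete=False)
        cmd = ["analyze", "--threshold", str(threshold),
               "--input", input_path, "--output", out.name]
        subprocess.run(cmd, check=True)   # raise if the legacy code fails
        return out.name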
32 Virtual Workspace Project
- Virtual workspace
- Abstraction of an execution environment
- Dynamically available to authorized clients
- Abstraction captures:
- Resource quota for the execution environment on deployment
- Software configuration aspects of the environment
- Workspace Service allows a Grid client to dynamically deploy and manage workspaces
- Built on:
- Xen hypervisor, an open-source, efficient implementation
- GT4 authentication and authorization mechanisms
33 Recent Demonstration with STAR
- Problem: STAR is a relatively complex code and extremely hard to install; even if resources are available, users can't use them because there is no easy way to automatically install the application
- Solution: Put STAR in a VM, and use the workspace service to dynamically deploy those STAR VMs based on need
- Users submit requests for STAR execution to those nodes
- Demoed at SC06; the biggest obstacle is not technology but deployment: Xen and the workspace service are not available on many platforms
34 Workspace Service Backstage
(Diagram: a Trusted Computing Base (TCB), typically a cluster, containing the VWS node, an image node, and a pool of worker nodes)
- The VWS manages a set of nodes inside the TCB (typically a cluster); this is called the node pool
- The workspace service has a WSRF frontend that allows users to deploy and manage virtual workspaces
- Each pool node must have a VMM (Xen) installed, along with the workspace backend (software that manages individual nodes)
- VM images are staged to a designated image node inside the TCB
35 Scalable Science Services
- Service-enabling applications is too hard for application developers
- Community may already have a required data analysis function, typically implemented as a standalone (parallel or sequential) program or library
- Turning this existing implementation into a service is arduous and time-consuming
- Process involves knowledge about the mechanics of service container implementation and Grid mechanisms
- Solution: Formalize and automate this process in service wrapping tools
- Automate the process of embedding application code into an Application Hosting Service (AHS)
36 Service Construction: Application Hosting Environment
- An AHS involves:
- An application-specific service interface to analysis code
- A management interface that allows the service provider to monitor and control the AHS's execution
- The AHS interacts with external policy decision points (PDPs) for policy enforcement and with provisioners for service scalability
37 Service Provisioning with Variance
- If the load on a service varies significantly over time, then the number of resources allocated to the service must also vary
- Solution: Introduce provisioning tools that
- Allow a service provider to specify desired performance levels, while other system components
- Monitor service behavior
- Dynamically add and remove resources applied to service activity, so as to adapt to varying service load (sketch below)
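The monitor-and-adapt loop described above could be sketched as follows; the monitor and provisioner objects, and the specific thresholds, are placeholders rather than CEDPS components.

    # Sketch of the provisioning loop: compare observed load against the
    # provider's target, and grow or shrink the resource pool accordingly.
    # The monitor/provisioner objects and thresholds are placeholders.
    import time

    def provision_loop(monitor, provisioner, target_latency, interval=60):
        while True:
            latency = monitor.mean_response_time()    # observed service behavior
            nodes = provisioner.allocated_nodes()
            if latency > 1.2 * target_latency:
                provisioner.add_nodes(1)              # under-provisioned: grow
            elif latency < 0.5 * target_latency and nodes > 1:
                provisioner.remove_nodes(1)           # over-provisioned: shrink
            time.sleep(interval)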
38 Configuring and Discovering
- Execution environments are difficult to configure and discover
- Scientific applications often require specific, customized environments
- Variations in OS, middleware version, libraries, file system, etc., pose barriers to application portability
- Solution: Use resource catalogs and virtual machine technology to streamline service deployment
- Our resource catalogs will exploit schemas and information providers describing the relevant characteristics of an environment
- Advertise these descriptions through MDS4 to allow the application provisioner to discover and select a set of platforms suitable for application execution
39 Time-Varying Requirements
- Dynamic community demands mean that the number and type of science services required can vary over time
- Solution: Allow for dynamic service deployment
- Mechanisms that can allow new instances of services to be created on demand
- Used to instantiate both application services and data placement services
- Based on current work with dynamic deployment of Web services, executable programs, and virtual machines
- Develop these mechanisms further to provide a powerful and flexible service deployment infrastructure that allows for the creation, monitoring, and control of arbitrary services
40 Troubleshooting in CEDPS
- Brian Tierney, LBNL, is the CEDPS Troubleshooting lead
- BLTierney@lbl.gov
- Develop tools and techniques for failure detection and diagnosis in distributed systems
- Better logs and logging services to understand application and service behavior
- Better diagnostic tools to discover failures and performance faults, and for notification of these errors
41 MDS4
- Basic Grid monitoring service
- Information providers translate data from a variety of sources to a standard interface
- Index Service is a caching registry
- Trigger Service provides errors and warnings
42 NetLogger
- Extremely lightweight approach to reliably collecting monitoring events from multiple distributed locations
- Log file management tools
- Store-and-forward with rollover and stale file removal
- Targeted for high-volume logs typical of application instrumentation (sketch below)
- Efficient in-memory summarization to reduce data volume
- Prototype anomaly detection tool
- Locate missing workflow events based on a predefined list of expected events
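Application instrumentation in the NetLogger spirit emits one timestamped name=value record per event; the sketch below shows the pattern, with all field names other than ts and event chosen for illustration.

    # Sketch of NetLogger-style instrumentation: one timestamped, name=value
    # record per event, cheap enough to leave enabled in production code.
    # Field names other than ts/event are illustrative.
    import sys
    import time

    def log_event(event, **fields):
        ts = time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime()) + "Z"
        pairs = " ".join("%s=%s" % (k, v) for k, v in sorted(fields.items()))
        sys.stdout.write("ts=%s event=%s %s\n" % (ts, event, pairs))

    log_event("transfer.start", file="run042.dat", dest="dest.example.org")
    # ... perform the transfer ...
    log_event("transfer.end", file="run042.dat", status="ok", bytes=104857600)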
43 ESG and the Trigger Service
- Trigger Service monitors seven services across five sites
- 3 years of experience
- Minimal load on the services
- Policy to prevent false positives
- Increased ability to detect and assess cross-Grid failures
44 Troubleshooting and CEDPS
45 Need for Unique IDs
- Tracking distributed activities that involve many service and software layers
- A single high-level request (e.g., "distribute these files") may involve many distinct activities (e.g., reserve space, move file, authenticate) at different sites
- To diagnose failures or performance problems we need to be able to identify and access just the corresponding log records
- Solution: Associate a globally unique identifier with every activity and service invocation (sketch below)
- Extends previous demonstration work with biology workflows, where we transferred and logged the workflow's Activity ID at every step
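Carrying one globally unique ID through every log record a request produces could look like this minimal sketch (the event names and the activities listed are illustrative).

    # Sketch: generate one globally unique ID per high-level request and
    # attach it to every log record emitted by the activities that serve it,
    # so records from different sites and layers can be correlated later.
    import logging
    import uuid

    logging.basicConfig(format="%(asctime)s %(message)s", level=logging.INFO)
    log = logging.getLogger("cedps.sketch")

    def handle_request(files):
        activity_id = uuid.uuid4()                   # one GUID per request
        log.info("event=request.start id=%s nfiles=%d", activity_id, len(files))
        for f in files:
            log.info("event=reserve_space id=%s file=%s", activity_id, f)
            log.info("event=move_file id=%s file=%s", activity_id, f)
        log.info("event=request.end id=%s", activity_id)

    handle_request(["f1.dat", "f2.dat"])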
46 Logs and Log Management
- Logging and monitoring data is hard to find and manage
- Data scattered across sites and within a site
- No agreed-on standards for what logs should look like
- Large volumes of data are possible
- A heavily loaded GridFTP server with all network and disk I/O instrumented generates 1.1 GB of log data per hour
- Introduce best practices for logging
- Then implement for GT, Condor, and others
- Log collection service
- Work to deploy on OSG as first case
- Log management functions
- Provide for turning logging on and off, moving log data back to the user for analysis, deleting old log data, etc.
47 Automatic Failure Detection
- Automated failure detection for infrastructure services
- Distributed systems often run many services with little or no 24/7 support
- Failures are discovered only when user tasks fail
- Solution: Deploy monitoring information providers to gather behavior data on resources and services (sketch below)
- Use this data to warn system administrators of faults and to study how fault behaviors change over time
- Extend the MDS4 Trigger Service
- Create a NetLogger-based event-driven monitoring system to gather runtime data from services and applications
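A minimal sketch of the probe-and-warn idea: periodically exercise each service with a lightweight check and alert an administrator when it fails. The endpoint list and the notify() hook are placeholders; only the default GridFTP and RLS port numbers are taken from the real services.

    # Sketch of automated failure detection: periodically probe each service
    # with a lightweight connectivity check and notify an admin on failure.
    # The endpoint list and notify() hook are placeholders.
    import socket
    import time

    ENDPOINTS = [("gridftp.example.org", 2811), ("rls.example.org", 39281)]

    def probe(host, port, timeout=5):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    def notify(message):
        print("ALERT:", message)      # placeholder for mail/trigger action

    while True:
        for host, port in ENDPOINTS:
            if not probe(host, port):
                notify("%s:%d is not accepting connections" % (host, port))
        time.sleep(300)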
48 Performance Degradation Detection
- Performance degradation is often overlooked
- Many systems have long-running services used by different users at different times, and no single group tracking behavior
- Solution: Develop and apply analysis functions to archived background monitoring and event-driven log data (sketch below)
- Track executions dynamically and compare them to past behaviors and service guarantees
- When service behavior degrades or deadlines are threatened, the proper services or users can be notified to take action
- Develop analysis components for end-to-end bottleneck analysis and detection, trend analysis, and alarm generation
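Comparing current behavior to an archived baseline can be as simple as a mean/standard-deviation check over past transfer rates, as in this sketch (the history values are made-up illustrative numbers).

    # Sketch of degradation detection: flag a transfer whose rate falls well
    # below the historical baseline for the same endpoint pair.
    import statistics

    def is_degraded(history_mbps, current_mbps, n_sigma=2.0):
        """history_mbps: archived transfer rates for this path (MB/s)."""
        if len(history_mbps) < 5:
            return False                   # not enough history to judge
        mean = statistics.mean(history_mbps)
        stdev = statistics.pstdev(history_mbps)
        return current_mbps < mean - n_sigma * stdev

    history = [92.0, 88.5, 95.2, 90.1, 87.4, 93.3]   # illustrative values
    print(is_degraded(history, 41.0))                # True: well below past behavior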
49 Work with Applications
- Strong collaborations with DOE applications,
SciDAC software centers, and DOE facilities
50 Work with Applications
- CEDPS, ESG, and OSG starting to plan closer cooperation as part of SciDAC-2
- Earth System Grid
- Data work
- Error and warning alpha tester
- Open Science Grid
- Data services for ATLAS and CMS
- Services work with STAR
- Logging service alpha tester
- Second wave:
- LIGO (OSG): data focus
- GADU: scalable services and data focus
- Fusion FACETS (Keith Jackson): scalable services focus
- DANSE (Keith Jackson): scalable services focus
51 CEDPS Senior Personnel
- PI: Ian Foster, foster@mcs.anl.gov
- Project Manager: Jennifer Schopf, jms@mcs.anl.gov
- Area Leads:
- Data: Ann Chervenak, annc@isi.edu
- Services: Kate Keahey, keahey@mcs.anl.gov
- Troubleshooting: Brian Tierney, BLTierney@lbl.gov
- Site representatives:
- ANL: Jennifer Schopf, jms@mcs.anl.gov
- FNAL: Gene Oleynik, oleynik@fnal.gov
- ISI: Carl Kesselman, carl@isi.edu
- LBNL: Keith Jackson, KRJackson@lbl.gov
- U Wisc: Miron Livny, livny@cs.wisc.edu
52 Expanding the Community of Globus Contributors
- Creation of the dev.globus community process
- Provides an open forum for discussion and enhancement of current Globus software
- Enabling the integration of 20 new components from the US and Europe as incubators
53 Globus Development Environment
- Based on Apache Jakarta
- Individual development efforts organized as projects
- Consensus-based decision making
- Control over each project in the hands of its most active and respected contributors (committers)
- Globus Management Committee (GMC) providing overall guidance and conflict resolution
54 Common Infrastructure
- Code repositories (CVS, SVN)
- Mailing lists
- -dev, -user, -announce, -commit
- Issue tracking (bugzilla)
- Including roadmap info for future development
- Wikis
- License (Apache 2)
- Known interactions for people accessing your
project
55 Current Technology Projects
- Common runtime projects
- C Core Utilities, C WS Core, CoG jglobus, Core WS Schema, Java WS Core, Python Core, XIO
- Data projects
- Data Replication, GridFTP, OGSA-DAI, Reliable File Transfer, Replica Location
- Execution projects
- GRAM, GridWay, MPICH-G2
- Information services projects
- MDS4
- Security projects
- C Security, CAS/SAML Utilities, Delegation Service, GSI-OpenSSH, MyProxy
56 Non-Technology Projects
- Distribution projects
- Globus Toolkit Distribution
- Process was used for the April 4.0.2 and 4.0.3 releases
- Documentation projects
- GT Release Manuals
- Incubation projects
- Incubation management project
- And any new projects wanting to join
57 Incubator Process
- Entry point for new Globus projects
- Incubator Management Project (IMP)
- Oversees the incubator process from first contact to becoming a Globus project
- Quarterly reviews of current projects
- Process being debugged by Incubator Pioneers
- http://dev.globus.org/wiki/Incubator/Incubator_Process
58 Escalation
(Diagram: any Grid project submits a proposal to become an incubator project, which can later be escalated to a full Globus project)
59 Current Incubator Projects (dev.globus.org/wiki/Welcome, Incubator_Projects)
- Distributed Data Management (DDM)
- Dynamic Accounts
- Grid Authentication and Authorization with Reliably Distributed Services (GAARDS)
- Grid Development Tools for Eclipse (GDTE)
- GridShib
- Grid Toolkit Handle System (gt-hs)
- Higher Order Component Service Architecture (HOC-SA)
- Introduce
- Local Resource Manager Adaptors (LRMA)
- Metrics
- MEDICUS
- OGCE
- Portal-based User Registration Service (PURSe)
- ServMark
- UCLA Grid Portal Software (UGP)
- Workflow Enactment Engine Project (WEEP)
- Cog Workflow
- Virtual Workspaces
60 We've Just Had Our First Escalation!
- GridWay Meta-Scheduling Project
- Ignacio Llorente, Universidad Complutense de Madrid
- Provides scheduling functionality similar to that found in local DRM (Distributed Resource Management) systems
- Advanced scheduling capabilities on a Grid consisting of Globus services
- Dynamic discovery and selection
- Opportunistic migration
- Support for the definition of new scheduling policies
- Detection and recovery from remote and local failures
- Straightforward deployment that does not require new services apart from those provided by the Globus Toolkit: MDS, GRAM, GridFTP, and RFT
61 For More Information
- Jennifer Schopf
- jms@mcs.anl.gov
- http://www.mcs.anl.gov/jms
- CEDPS
- http://www.cedps.net
- Globus main website
- http://www.globus.org
- dev.globus
- http://dev.globus.org