Application Services Work Group Report
Provided by: harvey



1
  • Application Services Work Group Report

Frank van Lingen, California Institute of Technology
UltraLight Meeting, NSF
January 4, 2006
2
Application Work Group Core Team
  • Frank van Lingen (Coordinator)
  • Rick Cavanaugh (Physics Analysis)
  • Dimitri Bourilkov (Physics Analysis)
  • Jang Uk (Scheduling)
  • Mandar Kulkarni (Scheduling)
  • Laukik Chitnis (Scheduling)
  • Iosif Legrand (Monitoring)
  • Julian Bunn (GAE/TeraGrid)
  • Conrad Steenberg (GAE)
  • Michael Thomas (GAE)
  • Philippe Galvez (VRVS)

Replaced by ..
3
Network Usage in Scenarios
  • Monte Carlo Production
  • CPU intensive and generates large amounts of
    data
  • Network: Distributed generated datasets need to
    be merged
  • User Analysis (Single User View)
  • Analyzing large amounts of data (data sets)
  • Network: When a site queue is too long, move data
    to another site where the user can process it
  • User Analysis (Multi User View)
  • Many users analyzing large amounts of data
  • Network: If datasets become popular and jobs
    have to wait to process them, replicate them.
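
The network rules in the three scenarios above can be read as a small data placement policy. The following is a minimal sketch, not the project's implementation: the field names (`queue_wait_s`, `waiting_jobs`) and both thresholds are assumptions introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SiteState:
    name: str
    queue_wait_s: float  # estimated wait before a job starts at this site
    waiting_jobs: int    # jobs currently waiting for the dataset at this site


# Hypothetical limits, chosen only for illustration.
MAX_QUEUE_WAIT_S = 3600
MAX_WAITING_JOBS = 10


def placement_action(current: SiteState, alternatives: List[SiteState]) -> str:
    """Return 'stay', 'move:<site>' or 'replicate:<site>' for one dataset."""
    if not alternatives:
        return "stay"
    best = min(alternatives, key=lambda s: s.queue_wait_s)
    if best.queue_wait_s >= current.queue_wait_s:
        return "stay"  # no other site would start the job sooner
    # Multi user view: a popular dataset with many waiting jobs is replicated.
    if current.waiting_jobs > MAX_WAITING_JOBS:
        return f"replicate:{best.name}"
    # Single user view: the queue is too long, so the data moves elsewhere.
    if current.queue_wait_s > MAX_QUEUE_WAIT_S:
        return f"move:{best.name}"
    return "stay"
```

In practice the inputs would come from monitoring and estimator services rather than fixed numbers, and the choice between moving and replicating would also depend on policy and storage limits.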

4
Other Clients: Web browser, ROOT (analysis tool, Caves), Python (CODESH),
COJAC (detector viz.) / IGUANA (CMS viz. tool)
Emerging Vision: A distributed set of rich and
complex services to support analysis in a
distributed environment
- Clients talk standard protocols to Grid
Services Web Server
- Simple Web service API allows simple or complex
analysis clients
- Clarens portal hides complexity
- Key features: Global Scheduler, Catalogs,
Monitoring, Grid-wide Execution service.
Analysis Flight Deck: JobMon Client, JobStatus
Client, MCPS Client
  • HTTP,
  • SOAP,
  • XML-RPC,
  • JSON, RMI

[Figure: GAE Architecture. Diagram labels: Grid Services Web Server, Clarens, Monitoring Clients, MonALISA Clients, Tier2 Site, Compute Site, MCPS, Workflow Execution, Workflow Definitions, Discovery, Runjob, JobStatus, JobMon, Catalogs, Metadata, Virtual Data, Replica, Scheduler, Sphinx, Fully-Abstract Planner, Partially-Abstract Planner, Fully-Concrete Planner, Data Management, Storage, DCache, Applications, ROOT, FAMOS, ORCA, BOSS, estimators, steering, Monitoring, MonALISA, Network, Reservation, Planning, Global Command Control, Execution Priority Manager, Grid Wide Execution Service]
Build services on web service frameworks such as
Clarens and provide end-2-end monitoring using
systems such as MonALISA
5
GAE and Ultralight: Make the Network an Integrated
Managed Resource
Application Interfaces
  • Unpredictable multi user analysis
  • Overall demand typically fills the capacity of
    the resources
  • Real time monitor systems for networks, storage,
    computing resources, E2E monitoring

[Diagram labels: Request Planning, Monitor, Network Planning, Network Resources]
Support data transfers ranging from the
(predictable) movement of large scale (simulated)
data to the highly dynamic analysis tasks
initiated by rapidly changing teams of scientists
6
(Physics) Analysis on the Grid: Move from Existing
Components to a Coherent System
  • Catalogs to select datasets
  • Resource and Application Discovery
  • Schedulers guide jobs to resources
  • Policies enable fair access to resources
  • Robust (large size) data (set) transfer
  • Feedback to users (e.g. status of their jobs)
  • Crash recovery of components (identify and
    restart)
  • Provide secure authorized access to resources and
    services.

[Diagram: numbered steps 1-9 connecting Client Application, Steering, Dataset service, Discovery, Catalogs, Planner/Scheduler, Job Submission, Execution, Storage Management, Monitor Information, Data Transfer, and Policy]

Ultralight: core data transfer, planning,
scheduling, (sophisticated) policy management on
VO level, integration

7
(No Transcript)
8
Clarens Grid Toolkit
  • Provide developers with a framework to develop
    grid enabled web services.
  • Grid portal for users
  • Hide complexity of grid environment from users
  • Standard Services
  • Authentication, Authorization, Access Control
  • File Access
  • VO and User Management
  • Proxy Management
  • Python and Java framework
  • Easier integration with emerging Java
    technologies
  • More choice for service developers
  • Also used by
  • LambdaStation
  • OSG Accounting
  • HOTGrid (Astronomy portal)
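
As an illustration of a client talking a standard protocol to a web service, the sketch below makes XML-RPC calls using only the Python standard library. It is not the Clarens API: the server is a local stand-in started in a background thread, and the method names `echo.echo` and `file.ls` and the returned file names are hypothetical.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def start_demo_server() -> SimpleXMLRPCServer:
    """Start a local stand-in service on a free port (not a Clarens server)."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    # Hypothetical methods, registered under dotted names.
    server.register_function(lambda text: text, "echo.echo")
    server.register_function(lambda path: ["run1.root", "run2.root"], "file.ls")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_demo():
    """Call both hypothetical methods over XML-RPC and return the results."""
    server = start_demo_server()
    host, port = server.server_address
    proxy = xmlrpc.client.ServerProxy(f"http://{host}:{port}/")
    try:
        return proxy.echo.echo("hello grid"), proxy.file.ls("/store")
    finally:
        server.shutdown()
        server.server_close()
```

A real deployment would add the authentication, authorization, and access control services listed above, which this sketch leaves out entirely.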

9
Clarens Java Client
  • Richer GUI environment than HTML/Javascript
  • Enable multiple service/server connections
  • Concept of favorites
  • Store state information
  • Pluggable architecture
  • Enables third party plugin development.
  • Achievements this year
  • Core GUI framework (including connection
    management)
  • File Service plugin
  • Discovery Service plugin (under construction)
  • Rudimentary plugins for estimators and scheduling
    (under construction)

10
Discovery
  • Web Service Catalog, suited for dynamic grid
    environment.
  • Integrates with MonALISA
  • Based on JClarens
  • Achievements this year
  • Software discovery service
  • Associate key/value pairs to service and software
    description
  • Better scalability (minimized resource usage)
  • UDDI backend (UDDI: service discovery standard)
  • Work with EGEE project on standard discovery
    interface
  • Work with Globus on interop. with MDS (Discovery
    Catalog)
  • Part of OSG distribution
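
The idea of associating key/value pairs with service and software descriptions can be sketched as an in-memory catalog that is queried by attribute. This is only an illustration under assumed names (`ServiceEntry`, `DiscoveryCatalog`, and the attribute keys are hypothetical); the actual discovery service is backed by MonALISA and UDDI rather than a Python dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServiceEntry:
    name: str
    url: str
    attributes: Dict[str, str] = field(default_factory=dict)


class DiscoveryCatalog:
    """Toy catalog: entries are keyed by URL and matched on key/value pairs."""

    def __init__(self) -> None:
        self._entries: Dict[str, ServiceEntry] = {}

    def register(self, entry: ServiceEntry) -> None:
        # Re-registering the same URL overwrites the old description.
        self._entries[entry.url] = entry

    def find(self, **required: str) -> List[ServiceEntry]:
        """Return entries whose attributes contain every required pair."""
        matches = [
            e for e in self._entries.values()
            if all(e.attributes.get(k) == v for k, v in required.items())
        ]
        return sorted(matches, key=lambda e: e.url)
```

A client could then ask for all entries with `type="software"` without knowing in advance which sites publish them.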

[Diagram labels: Clients, Clarens Servers, Clarens Discovery Servers (JINI Clients), MonALISA JINI Network, DS, SS]
11
Monte Carlo Processing
  • Started as an idea for how to allow users to make
    small custom simulation samples.
  • Enable remote grid submission and return of
    results
  • Separation of the user from the resources by
    connecting through grid (user) interfaces
  • Provide Access Control and Quotas

Based on the concept of Me, My Friends, and the
Anonymous Grid
12
Monte Carlo Processing
  • Designed for Monte Carlo processing but more
    widely applicable.
  • Expose different workflows
  • Users receive a tracking number they use for job
    status access (like FedEx and UPS tracking)
  • Working on improvements for Monte Carlo
    processing
  • Sites pull (parts) of request for processing
    using an agent like
  • Submitting monitor information to MonALISA and
    other repositories (e.g. BOSS) (support for job
    tracking)
  • Achievements this year
  • Service backend with authorization and access
    control.
  • Simple workflow specification format.
  • Auto generation of workflow specifications into a
    web form.
  • HTML/JavaScript front end for user interaction.
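
The tracking number and site pull ideas above can be sketched together. This is a minimal in-memory illustration, assuming hypothetical names (`RequestTracker`, the status strings) and a random UUID as the tracking number; the presentation does not say how tracking numbers are generated or stored.

```python
import uuid
from collections import deque
from typing import Deque, Dict, Optional, Tuple


class RequestTracker:
    """Toy request store: users submit, sites pull parts, users poll status."""

    def __init__(self) -> None:
        self._status: Dict[str, Dict[int, str]] = {}
        self._pending: Deque[Tuple[str, int]] = deque()

    def submit(self, n_parts: int) -> str:
        """Split a request into parts and return its tracking number."""
        tracking = uuid.uuid4().hex  # assumption: any unique token would do
        self._status[tracking] = {i: "queued" for i in range(n_parts)}
        self._pending.extend((tracking, i) for i in range(n_parts))
        return tracking

    def pull(self, site: str) -> Optional[Tuple[str, int]]:
        """Called by a site agent to fetch the next part, if any."""
        if not self._pending:
            return None
        tracking, part = self._pending.popleft()
        self._status[tracking][part] = f"running@{site}"
        return tracking, part

    def finish(self, tracking: str, part: int) -> None:
        self._status[tracking][part] = "done"

    def status(self, tracking: str) -> Dict[int, str]:
        """Look up per-part status using only the tracking number."""
        return dict(self._status[tracking])
```

The pull model keeps the service passive: a site asks for work when it has capacity, and the user needs nothing but the tracking number to follow progress.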

13
SPHINX
14
Estimators
  • Schedulers: selection of execution site based
    on
  • User deadlines and required quality of service.
  • Quota requirements specified by user.
  • ..
  • Execution site will have its own set of
    estimators
  • Support making intelligent decisions on resource
    selection by estimating
  • Site Access (latency for submitting a job)
  • Job runtime (If possible)
  • Queue wait time
  • File transfer time
  • Achievements this year
  • Runtime estimator (history based approach)
  • SDSC Data
  • CMS Data
  • Prime number computation jobs
  • File transfer time estimator
  • IPERF (intrusive and should be replaced)
  • Queue time estimator (works on Condor queue)
  • Site access time estimator
  • Integration with steering, job monitoring and
    prototype scheduler services
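
The four estimates listed above (site access, queue wait, file transfer, runtime) can be combined into a single completion time per site, which a scheduler can compare against a user deadline. The sketch below is illustrative only: using the mean of past runtimes as the history based estimate, dividing file size by bandwidth for the transfer time, and all names and defaults are assumptions, not the estimators described in the presentation.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional, Sequence


def estimate_runtime(history_s: Sequence[float], default_s: float = 3600.0) -> float:
    """History based estimate: mean runtime of comparable past jobs."""
    return mean(history_s) if history_s else default_s


def estimate_transfer(size_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Naive transfer time: size divided by available bandwidth."""
    return size_bytes / bandwidth_bytes_per_s


@dataclass
class SiteEstimates:
    name: str
    access_s: float               # latency for submitting a job
    queue_wait_s: float           # estimated time spent in the site queue
    bandwidth_bytes_per_s: float  # estimated bandwidth to the site
    runtime_history_s: List[float]


def completion_time(site: SiteEstimates, input_bytes: float) -> float:
    """Sum of site access, queue wait, file transfer, and runtime estimates."""
    return (site.access_s
            + site.queue_wait_s
            + estimate_transfer(input_bytes, site.bandwidth_bytes_per_s)
            + estimate_runtime(site.runtime_history_s))


def choose_site(sites: Sequence[SiteEstimates], input_bytes: float,
                deadline_s: Optional[float] = None) -> Optional[str]:
    """Pick the site with the smallest estimate, or None if the deadline fails."""
    if not sites:
        return None
    best = min(sites, key=lambda s: completion_time(s, input_bytes))
    if deadline_s is not None and completion_time(best, input_bytes) > deadline_s:
        return None
    return best.name
```

With small inputs the site with the shorter queue tends to win, while large inputs shift the choice toward the site with the better network path, which is the kind of trade-off the estimators are meant to expose.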

15
Data Transfer
  • Redesign of CMS transfer tools (Phedex)
  • Redesign of CMS event data model (EDM)
  • Impacts the transfer of data
  • Achievements this year
  • Ultralight kernel based on FAST (See Network
    talk)
  • Monitoring network transfers (See Network
    Services talk)
  • Benchmarks of different transfer protocols (BBCP,
    XRootD) (See network talk)
  • Improvements on SRM/DCache (learning to tune and
    install it)
  • Getting expertise with new Phedex (benchmarks,
    tuning)
  • Feedback to experts on missing functionality.
  • Integrate Storage/Transfer (SRM/Dcache/Phedex)
    with network

16
BOSS Integration (Execution Service)
  • BOSS: execution and monitor application used in
    CMS
  • Achievements this year
  • Providing a service wrapper and GUI for BOSS
  • Set of service APIs providing access to BOSS
  • Ability to schedule tasks over web (through a
    GUI)
  • Execution takes place in a secure (sandbox)
    environment
  • Provides task control features, through BOSS
    (kill, delete)
  • Used in demonstration at DOSAR workshop in Sao
    Paulo

17
Jobmon
  • With GRID job submission a large number of things
    can go wrong
  • resources (databases, storage) broken or
    inaccessible
  • User errors
  • ..
  • Quick and efficient access needed to detect and
    diagnose problems.
  • (Secure) Access to (your) running jobs before
    completion
  • Read log files
  • Kill jobs
  • Developed for CDF, but also applicable in CMS

Achievements this year
18
Many GAE Services Integrated with Network and
Monitor Services
  • Sphinx Scheduler (UFL): Service based scheduler
  • Job Submission: BOSS (Collaboration with INFN)
  • Caves (UFL): Analysis code and command sharing
    environment
  • Steering service: First prototype of steering
    service
  • Discovery Service
  • Jobmon: Real time troubleshooting of a user's
    jobs (FNAL)
  • Estimators: Providing estimates for schedulers
    and other services on job execution, data
    transfer,
Monitoring
Ultralight will focus on integration and
sophisticated automated decisions based on
monitor information
End-2-end monitoring
Other Synergistic Activities
  • Monte Carlo Processing Service (Fermilab) SC05
    0.13 Tbps challenge
  • Other Science disciplines (Astronomy, Earth
    Science) HotGRID
  • L-Store collaboration. Utilize storage expertise
    to complement network expertise.
  • Lambda Station: Authorized programmability of
    routers using MonALISA and Clarens

19
Outlook: Towards System Level Services
  • Development of Java client plugins
  • Include IM functionality for job interactivity (3
    months)
  • Integration with MonALISA clients (2 months)
  • New plugins for managing access control to data
    and services,
  • Improving current framework and plugins (2-3
    months)
  • E2E error trapping and diagnosis: cause and
    effect
  • Feedback of job information through MonALISA, via
    JobMon and other sources. (aggregation of
    information)
  • Need a uniform mechanism to propagate and report
    errors in Web Services. (3-4 months)
  • Strategic Workflow re-planning
  • Collaboration with CMS Monte Carlo Team
  • Work on new production environment (3-4 months)

20
Outlook: Towards System Level Services
  • Adaptive steering and optimization algorithms
  • First step with steering service prototype and
    estimators
  • Work on refinement of estimator functions (2-3
    months)
  • Work with (CMS) analysis group
  • Integrating current BOSS work (2-3 months)
  • Work on (analysis) submission plugin for client
    (3-4 months)
  • Further propagate work into OSG (ongoing)
  • Integration of network and storage
  • benchmarks, tuning (3-4 months)
  • test and integrate new CMS data management tools
    with networks (3-4 months)

21
Summary
  • Ultralight Application Workgroup made a lot of
    progress
  • Integration of many GAE and Ultralight components
    ongoing
  • End-2-end monitoring through MonALISA
    integration
  • Move towards collaboration between storage and
    network resources


22
Related Publications
23
www.ultralight.org
Monitor Ultralight
WIKI
News
Related Publications