other servers - PowerPoint PPT Presentation

Provided by: Frankva6
GRID Analysis Environment: Where the physics
gets done
More information:
  • GAE web page: http://ultralight.caltech.edu/gaeweb/
  • Clarens web page: http://clarens.sourceforge.net
  • MonALISA: http://monalisa.cacr.caltech.edu/
  • SPHINX: http://www.griphyn.org/sphinx/Research/research.php
Scientific Exploration at the High Energy Physics
Frontier
Grid Analysis Environment (GAE)
  • Physics experiments consist of large
    collaborations: CMS and ATLAS each encompass 2000
    physicists from approximately 150 institutes
    (300-400 physicists in 30 institutes in the US)
  • Experiments produce petabytes to exabytes of data
  • The Acid Test for Grids: crucial for LHC
    experiments
  • Large, diverse, distributed community of users
  • Support for 100s to 1000s of analysis tasks,
    shared among dozens of sites
  • Widely varying task requirements and priorities
  • Need for priority schemes, robust authentication
    and security
  • Operates in a severely resource limited and
    policy constrained global system
  • Dominated by collaboration policy and strategy
  • Requires real-time monitoring, task and workflow
    tracking; decisions are often based on a global
    system view
  • Where physicists learn to collaborate on analysis
    across the country and across world regions
  • Focus is on the LHC CMS experiment, but the
    architecture and services can potentially be used
    in other (physics) analysis environments

HEP Challenges: Frontiers of Information
Technology
  • Rapid access to petabyte-scale data stores
  • Secure, efficient, transparent access to
    heterogeneous worldwide distributed computing and
    data handling resources
  • A collaborative scalable distributed environment
    for thousands of physicists to enable physics
    analysis
  • Tracking the state and usage patterns of
    computing and data resources, to make possible
    rapid turnaround and efficient utilization of
    resources

Challenges need to be met so as to provide an
integrated, managed, distributed infrastructure
that can serve virtual organizations on a
global scale
Analysis clients: Web browser, ROOT (analysis tool), Python, Cojac
(detector viz.), IGUANA (CMS viz. tool)
Finding data for CMS analysis (GAE use case)
The GAE Architecture
Structured Peer-to-Peer GAE Architecture
[Figure labels: Analysis Client, service]
  • The GAE, based on the Clarens web services
    framework, easily allows a Peer-to-Peer
    configuration to be built, with the associated
    robustness and scalability features
  • Flexible: allows easy creation, use and
    management of complex VO structures
  • A typical Peer-to-Peer scheme would involve the
    Clarens servers acting as Global Peers that
    broker GAE client requests among all the Clarens
    servers available worldwide

  • Analysis clients talk standard protocols to the
    Grid Services Web Server, a.k.a. the Clarens
    Grid Portal
  • Simple web service API allows analysis clients
    (simple or complex) to operate in this
    architecture
  • The Clarens portal hides the complexity of the
    Grid services from the client, but can expose it
    in as much detail as required, e.g. for
    monitoring
  • Key features: global scheduler, catalogs,
    monitoring, Grid wide execution service
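
The idea that any client able to speak a standard protocol can use the portal can be illustrated with Python's standard XML-RPC modules. This is a minimal sketch, not the Clarens API: the method name `catalog.find_dataset`, the dataset name and the file paths are hypothetical, and a local stand-in server replaces a real Grid Services Web Server. Only `system.listMethods` is a standard XML-RPC introspection call.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical in-memory catalog; dataset names and file paths are made up.
CATALOG = {
    "demo_dataset": ["/store/demo/file1.root", "/store/demo/file2.root"],
}


def find_dataset(name):
    """Return the files registered for a dataset (empty list if unknown)."""
    return CATALOG.get(name, [])


def start_server():
    """Start a local stand-in XML-RPC server on a free port."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_introspection_functions()  # enables system.listMethods
    server.register_function(find_dataset, "catalog.find_dataset")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    return server, f"http://{host}:{port}/"


server, url = start_server()

# The client only needs the URL and the protocol, not the server internals.
with ServerProxy(url) as portal:
    methods = portal.system.listMethods()
    files = portal.catalog.find_dataset("demo_dataset")

server.shutdown()
server.server_close()

print(methods)
print(files)
```

The same calls could be made from any language with an XML-RPC library, which is the point of exposing services through a simple web service API.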

[Figure: GAE architecture and the "Finding data for CMS analysis" use case; labels only]
  • Client protocols: HTTP, SOAP, XML-RPC
  • Clarens Grid Services Web Server: Discovery, ACL management,
    Certificate based access
  • Use case steps: (1) Discover catalogs, grid schedulers;
    (2) Query for dataset; (3) Submit analysis job(s) with
    dataset(s); Download data
  • Client actions: Discover services, Discover service (e.g.
    Catalog), Query for data, Download data
  • Hosts shown: Host 2, Host 3, Host 4, Host 6, Host 7
  • Architecture components: Grid scheduler/Queue, Scheduler,
    Fully-Abstract Planner, Partially-Abstract Planner,
    Fully-Concrete Planner, Execution Priority Manager, Grid Wide
    Execution Service, Catalogs, Metadata, Virtual Data, Replica,
    Provenance Catalog, Data Management, Autonomous replication,
    Monitoring, Applications, Grid
  • Software named in the diagram: Sphinx, RefDB, MCRunjob,
    MonALISA, ORCA, Chimera, MOPDB, FAMOS, BOSS, ROOT, POOL,
    VDT-Server
  • Multiple clients will query and submit jobs
  • Client code has no knowledge about the location of services,
    except for several URLs for discovery services
  • Implementations developed within the physics and CS
    communities are associated with GAE components
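
The three numbered steps of the use case (discover, query, submit) can be sketched in a few lines. This is an illustrative model under stated assumptions, not GAE code: the classes `DiscoveryService`, `Catalog` and `Scheduler`, the `example.org` URLs and the dataset contents are all hypothetical, and a dictionary stands in for the network. The one property taken from the slide is that the client starts out knowing only the discovery service.

```python
from dataclasses import dataclass, field


@dataclass
class DiscoveryService:
    """Maps a service type (e.g. 'catalog') to URLs of servers offering it."""
    registry: dict = field(default_factory=dict)

    def register(self, service_type, url):
        self.registry.setdefault(service_type, []).append(url)

    def discover(self, service_type):
        return list(self.registry.get(service_type, []))


@dataclass
class Catalog:
    datasets: dict  # dataset name -> list of file names

    def query(self, name):
        return self.datasets.get(name, [])


@dataclass
class Scheduler:
    submitted: list = field(default_factory=list)

    def submit(self, job_name, files):
        job_id = len(self.submitted) + 1
        self.submitted.append((job_id, job_name, tuple(files)))
        return job_id


# Stand-in for the network: URL -> service object (all URLs are made up).
SERVICES = {
    "https://host3.example.org/catalog": Catalog({"demo_dataset": ["f1.root", "f2.root"]}),
    "https://host2.example.org/scheduler": Scheduler(),
}


def run_analysis(discovery, dataset, job_name):
    # (1) Discover catalogs and grid schedulers.
    catalog_urls = discovery.discover("catalog")
    scheduler_urls = discovery.discover("scheduler")
    if not catalog_urls or not scheduler_urls:
        raise LookupError("no catalog or scheduler discovered")

    # (2) Query the catalogs for the dataset until one knows it.
    files = []
    for url in catalog_urls:
        files = SERVICES[url].query(dataset)
        if files:
            break
    if not files:
        raise LookupError(f"dataset {dataset!r} not found")

    # (3) Submit the analysis job together with the dataset files.
    return SERVICES[scheduler_urls[0]].submit(job_name, files)


discovery = DiscoveryService()
discovery.register("catalog", "https://host3.example.org/catalog")
discovery.register("scheduler", "https://host2.example.org/scheduler")

job_id = run_analysis(discovery, "demo_dataset", "my_analysis")
print(job_id)
```

Because the client resolves every service through discovery, a catalog or scheduler could move to another host without any change to client code.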
Scheduling: Push/Pull Model (GAE use case)
GAE backbone: Clarens web service framework
GAE development (services)
  • POOL file catalog. Developed at CERN
  • RefDB/PubDB. Production database developed within
    the CMS experiment
  • BOSS. Uniform job submission layer developed in
    collaboration with INFN
  • SPHINX. Grid scheduler developed at UFL
  • CAVES. Analysis code sharing environment
    developed at UFL
  • MCRunjob/MOP. Monte Carlo production submission
    and tracking tool developed at FNAL
  • PhEDEx. Production transfer management
    application for CMS
  • Information service. Stores key/value pairs to
    describe the environment. Developed in collaboration
    with the LHCb experiment
  • Core services (Clarens): Discovery,
    Authentication, Proxy, Remote file access, Access
    control management, Virtual Organization
    management
  • Under development: dCache, catalog, local manager
    (job submission), global manager (scheduler) in
    collaboration with the CDF experiment

Push model has limitations once the system
becomes resource limited
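
One way to read the statement about the push model is shown in the toy comparison below. It is a sketch under simplifying assumptions, not the GAE or SPHINX scheduler: the `Site` class, the slot counts and the round-robin push policy are all hypothetical. In the push case the scheduler commits each job to a site immediately, so jobs can wait at a full site while another site has free slots. In the pull case sites take work from a central queue only when they have capacity.

```python
from collections import deque


class Site:
    """A compute site with a fixed number of job slots (hypothetical model)."""

    def __init__(self, name, slots):
        self.name = name
        self.slots = slots
        self.running = []
        self.backlog = deque()  # jobs pushed here that cannot start yet

    def free_slots(self):
        return self.slots - len(self.running)


def push_schedule(jobs, sites):
    """Push: assign every job to a site right away (round robin, assumed)."""
    for i, job in enumerate(jobs):
        site = sites[i % len(sites)]
        if site.free_slots() > 0:
            site.running.append(job)
        else:
            site.backlog.append(job)  # stuck at this site even if others are idle


def pull_schedule(queue, sites):
    """Pull: each site takes jobs from the central queue while it has free slots."""
    for site in sites:
        while site.free_slots() > 0 and queue:
            site.running.append(queue.popleft())


def running_total(sites):
    return sum(len(site.running) for site in sites)


jobs = [f"job{i}" for i in range(1, 7)]

push_sites = [Site("A", 1), Site("B", 4)]
push_schedule(jobs, push_sites)

pull_sites = [Site("A", 1), Site("B", 4)]
central_queue = deque(jobs)
pull_schedule(central_queue, pull_sites)

print("push running:", running_total(push_sites))
print("pull running:", running_total(pull_sites))
```

In this example the push run leaves a slot idle at site B while two jobs wait at site A, whereas the pull run fills every slot and keeps the remaining job in the central queue, where any site can pick it up later.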
  • Clarens: a portal system providing a common
    infrastructure for deploying Grid-enabled web
    services
  • Features:
      • Access control to services
      • Session management
      • Service discovery and invocation
      • Virtual Organization management
      • PKI based security
      • Good performance (up to 1400 calls per second)
  • Role in GAE:
      • Connects clients to Grid or analysis applications
      • Acts in concert with other Clarens servers to
        form a P2P network of service providers
  • Two implementations:
      • Python/C using Apache web server
      • Java using Tomcat servlets
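
How access control, Virtual Organization management and certificate identities might fit together can be sketched as follows. This is an assumption-laden illustration, not the Clarens ACL implementation: the group names, the distinguished names (DNs), the method names and the "most specific prefix wins, default deny" rule are all hypothetical choices made for the example.

```python
# Hypothetical VO groups: group name -> certificate DNs of its members.
GROUPS = {
    "cms_analysts": {"/O=Example/CN=Alice"},
    "admins": {"/O=Example/CN=Bob"},
}

# Hypothetical ACL: method-name prefix -> groups allowed to call it.
ACL = {
    "catalog": {"cms_analysts", "admins"},
    "catalog.admin": {"admins"},
}


def is_allowed(dn, method, acl=ACL, groups=GROUPS):
    """Check a caller DN against the most specific ACL entry for a method.

    The method name is matched from its longest dotted prefix down to its
    shortest; if no prefix has an entry, access is denied.
    """
    parts = method.split(".")
    for n in range(len(parts), 0, -1):
        prefix = ".".join(parts[:n])
        if prefix in acl:
            return any(dn in groups.get(group, set()) for group in acl[prefix])
    return False  # default deny


alice = "/O=Example/CN=Alice"
bob = "/O=Example/CN=Bob"

print(is_allowed(alice, "catalog.query"))
print(is_allowed(alice, "catalog.admin.delete"))
print(is_allowed(bob, "catalog.admin.delete"))
```

In a real deployment the DN would come from the client certificate verified during the PKI handshake rather than from a string passed in by the caller.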

[Figure: scheduling push/pull model and Clarens web server; labels only]
  • (1) Submit job(s) with dataset(s) for
    reconstruction/analysis
  • (2) Query resource status
  • (3) Submit/pull job(s)
  • Components: service, Grid scheduler/Queue, Web server,
    monitors, Uniform job submission layer
  • Combining push and pull to get better scalability
  • Clients: Java client, ROOT (analysis tool), IGUANA (CMS
    viz. tool), ROOT-CAVES client (analysis sharing
    tool), any app that can make XML-RPC/SOAP calls
  • Protocols: http/https
  • Clarens scalable web server, other servers
GRID Enabled Analysis: User view of a
collaborative desktop
  • Physics analysis requires varying levels of
    interactivity, from instantaneous response, to
    background, to batch mode
  • Requires adapting the classical Grid
    batch-oriented view to a services-oriented
    view, with tasks monitored and tracked
  • Use Web Services, leveraging wide applicability
    of commodity tools
  • Implement the Clarens Web Services layer as
    mediator between authenticated clients and
    services as part of the GAE architecture
  • Clarens presents a consistent analysis
    environment to users, based on WSDL/SOAP or
    XML-RPC, with PKI based authentication for security

Clarens Grid Portal: secure cert-based access
to services through a browser
[Figure labels: Service discovery, External Services, Remote file
access, Job submission, Catalog access]