Interactive Data Analysis on the Grid with JAS and Globus

Learn more at: https://chep03.ucsd.edu
1
Interactive Data Analysis on the Grid with JAS and Globus
David Alexander, Brian Miller, John Exby
Tech-X Corporation (www.techxhome.com), Boulder, Colorado
Tony Johnson, Massimiliano Turri, Booker Bense
Stanford Linear Accelerator Center, Menlo Park, California
Supported by U.S. Department of Energy Small Business Innovation
Research Grant DE-FG03-02ER83556 and Stanford Linear Accelerator Center
2
Project Overview
  • Started with Java Analysis Studio (JAS)
  • JAS already has a distributed analysis system based on RMI
  • Set up test grids on Linux clusters
  • Used Globus Toolkit 2.0
  • Each node had GRAM and GridFTP servers and a Java
    Runtime Environment
  • Wrote a JAS grid plug-in
  • Used Java CoG Kit 0.9
  • Demonstrated at SC2002
  • Ran against both a remote and an on-site cluster

3
Java Analysis Studio (JAS), jas.freehep.org
  • Open source application
  • Built for interactive data analysis, but flexible
    and modularized
  • Publication quality plotting facilities
  • User writes Java code to analyze data

4
Java Analysis Studio (JAS), jas.freehep.org
  • Abstracted data source interface
  • Modules are written to work with a variety of
    file formats (PAW, HIPPO, AIDA, Root, ODBC, flat
    files, SIO, HEP)
  • Distributed System Available
  • Versatile: well used in high energy physics
  • Pure Java (portable, Web Start installation and
    upgrade)
  • Flexible topology (stand-alone, client/server,
    cluster)
  • Integration w/ BaBar, Geant4, Wired
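
The abstracted data source idea on this slide can be illustrated with a small sketch. This is not the JAS API: the names RecordSource, flatFile, and sum are hypothetical, chosen only to show how analysis code can stay independent of the file format while one module per format supplies the rows.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DataSourceSketch {
    /** Hypothetical abstraction: analysis code sees rows, never the file format. */
    interface RecordSource extends Iterable<Map<String, Double>> {}

    /**
     * One hypothetical format module: a whitespace-separated flat file whose
     * first line holds the column names and whose other lines hold numbers.
     */
    static RecordSource flatFile(List<String> lines) {
        String[] names = lines.get(0).trim().split("\\s+");
        List<Map<String, Double>> rows = new ArrayList<>();
        for (String line : lines.subList(1, lines.size())) {
            String[] cells = line.trim().split("\\s+");
            Map<String, Double> row = new LinkedHashMap<>();
            for (int i = 0; i < names.length; i++) {
                row.put(names[i], Double.parseDouble(cells[i]));
            }
            rows.add(row);
        }
        return rows::iterator; // RecordSource has a single abstract method
    }

    /** Format-independent analysis step: sums one named column. */
    static double sum(RecordSource src, String column) {
        double total = 0.0;
        for (Map<String, Double> row : src) {
            total += row.get(column);
        }
        return total;
    }
}
```

A module for another format would only need to return its own RecordSource; the sum method would not change.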

TechXHome.com
5
Design Ideas and Added Features
  • Goal: clustered deployment, launch, federation
  • Special JAS Job use
  • Minimal prerequisites
  • Bare grid: Globus, Java, nothing else
  • Heterogeneous cluster
  • Off-grid (or not) client, data, codebase
  • Clients don't need to be superusers
  • Optional background deployment
  • Single sign on

6
About Resource Discovery
  • Resource discovery
  • Software needs location of data files
  • Software needs location of Java-enabled hosts
  • Pluggable LDIF source (MDS, URL of text file)
  • Community Authorization Service
  • Fine-grained access control
  • Is itself a kind of resource discovery, in a way
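
A pluggable LDIF source could, for example, be reduced to a small parser that turns LDIF text (from MDS or from a text file at a URL) into a list of Java-enabled hosts. The sketch below is an assumption-laden illustration, not the plug-in's code: the attribute names "host" and "javaEnabled" are hypothetical, and LDIF continuation lines, base64 values, and multi-valued attributes are ignored.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LdifHosts {
    /** Splits LDIF text into blank-line-separated records of attribute -> value. */
    static List<Map<String, String>> parse(String ldif) {
        List<Map<String, String>> records = new ArrayList<>();
        Map<String, String> current = new LinkedHashMap<>();
        for (String line : ldif.split("\\r?\\n")) {
            if (line.trim().isEmpty()) {          // a blank line ends a record
                if (!current.isEmpty()) {
                    records.add(current);
                    current = new LinkedHashMap<>();
                }
                continue;
            }
            if (line.startsWith("#")) continue;   // LDIF comment line
            int colon = line.indexOf(':');
            if (colon < 0) continue;              // not an "attribute: value" line
            current.put(line.substring(0, colon).trim(),
                        line.substring(colon + 1).trim());
        }
        if (!current.isEmpty()) records.add(current);
        return records;
    }

    /** Hosts whose (hypothetical) "javaEnabled" attribute equals "true". */
    static List<String> javaHosts(String ldif) {
        List<String> hosts = new ArrayList<>();
        for (Map<String, String> record : parse(ldif)) {
            if ("true".equals(record.get("javaEnabled")) && record.containsKey("host")) {
                hosts.add(record.get("host"));
            }
        }
        return hosts;
    }
}
```

Because the parser takes plain text, the same code serves either source: only the step that fetches the text differs.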

7
Move code to data with GridFTP
  • Location transparency
  • User sees data sets
  • Could also have user choice
  • Automatic deployment of JAS
  • Multi-threaded task set
  • Verify code version; GridFTP the codebase to the
    node if new
  • GridFTP or link data to user sandbox
  • Deploy control and catalog servers only on
    cluster head node
  • Worker nodes wait for catalog server to run
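
The "verify version, transfer only if new" step can be sketched as a pure planning function. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, versions are compared as plain strings, and the actual GridFTP transfer (done through the Java CoG Kit in the project) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DeployPlan {
    /**
     * Returns the nodes that need a fresh copy of the codebase.
     *
     * @param clientVersion version string of the code on the client
     * @param installed     node name -> version found on that node,
     *                      with null meaning nothing is installed yet
     */
    static List<String> nodesNeedingCodebase(String clientVersion,
                                             Map<String, String> installed) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, String> entry : installed.entrySet()) {
            // equals(null) is false, so nodes with no install are included
            if (!clientVersion.equals(entry.getValue())) {
                stale.add(entry.getKey());
            }
        }
        return stale;
    }
}
```

Each returned node could then be handled by its own thread in the multi-threaded task set, so that one slow transfer does not hold up the others.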

8
Launch Application with GlobusRun
  • Automatic launch of Java servers
  • Java Data Servers are run on specified
    JRE-enabled nodes
  • Special Grid Job is now started (exit the Wizard)
  • Code loaded into client or written in editor is
    - compiled
    - automatically distributed to Java Data Servers
    - results (std out, std err, histograms) sent
      back
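
Launching a Java server through GRAM comes down to submitting an RSL job description, for example with globusrun. The sketch below only builds such an RSL string; the class and method names are hypothetical, the Java path and arguments are placeholders, and it assumes the RSL quoting rule that a double quote inside a quoted literal is written twice.

```java
import java.util.List;

public class RslBuilder {
    /** Quotes an RSL literal; assumes an embedded double quote is doubled. */
    static String quote(String s) {
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }

    /** Builds an RSL job description that runs a JVM with the given arguments. */
    static String javaJob(String javaPath, List<String> args) {
        StringBuilder rsl = new StringBuilder("&(executable=")
                .append(quote(javaPath))
                .append(")");
        if (!args.isEmpty()) {
            rsl.append("(arguments=");
            for (int i = 0; i < args.size(); i++) {
                if (i > 0) rsl.append(' ');   // arguments are space-separated
                rsl.append(quote(args.get(i)));
            }
            rsl.append(")");
        }
        return rsl.toString();
    }
}
```

For instance, javaJob("/usr/bin/java", List.of("-cp", "jas.jar", "DataServer")) yields &(executable="/usr/bin/java")(arguments="-cp" "jas.jar" "DataServer"), where jas.jar and DataServer stand in for whatever the real deployment uses.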

9
A Few More Impressive Features
  • User can stop analysis, change code, restart.
  • Distributed debugging can catch individual node
    failures.
  • Histogram re-bin slider surprisingly responsive
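
The slides do not say how JAS re-bins. One simple technique that stays fast enough for a slider is to merge adjacent fine bins on the client rather than re-reading the data, sketched here with a hypothetical class and method name.

```java
public class Rebin {
    /**
     * Merges every `factor` adjacent bins into one by summing their contents.
     * A trailing group with fewer than `factor` bins becomes a final partial bin.
     */
    static double[] merge(double[] bins, int factor) {
        if (factor < 1) {
            throw new IllegalArgumentException("factor must be >= 1");
        }
        int merged = (bins.length + factor - 1) / factor; // ceiling division
        double[] out = new double[merged];
        for (int i = 0; i < bins.length; i++) {
            out[i / factor] += bins[i];
        }
        return out;
    }
}
```

With this approach the servers would fill a fine-grained histogram once, and moving the slider would cost only a pass over an in-memory array.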

10
Headaches and Issues
  • Versions of Globus vs. Java CoG Kit
  • CoG properties configuration
  • Client and server clocks disagree
  • MS-Windows text line breaks
  • Abandoned jobs
  • Firewalls
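
The slides list the line-break issue without detail. Assuming it concerns text files written on an MS-Windows client and then shipped to Linux nodes, a defensive normalization before transfer might look like this sketch (the class and method names are hypothetical).

```java
public class LineBreaks {
    /** Converts Windows (CR LF) and bare CR line breaks to Unix LF. */
    static String toUnix(String text) {
        // Replace CR LF pairs first so they do not turn into two line feeds.
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }
}
```

The order of the two replacements matters: handling bare carriage returns first would split each Windows line break into two.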

11
Future Ideas
  • Upgrade to Globus Toolkit 3
  • Pre-install code on cluster head or portal
    machine and deploy from there
  • Use more grid services (Condor, Replica)
  • Implement interfaces or service descriptions from
    PPDG CS-11 group.

12
Further Information on JAS
  • For the latest on JAS, see the 3pm Category 9
    paper
  • JAS3 - A general purpose data analysis framework
    for HENP and beyond.
  • CONTACTS
  • David Alexander, alexanda_at_txcorp.com
  • Brian Miller, bmiller_at_txcorp.com
  • Tony Johnson, tony_johnson_at_SLAC.stanford.edu
  • Massimiliano Turri, turri_at_SLAC.stanford.edu
  • Java Analysis Studio, http://jas.freehep.org