A Mobile-Agent-Based Performance-Monitoring System at RHIC

1
A Mobile-Agent-Based Performance-Monitoring
System at RHIC
  • Richard Ibbotson

2
Overview
  • Motivation for a new monitoring system
  • Design of the Instrumentation system
  • Use of mobile agents (mobile programs vs remote
    procedures)
  • How it works, what it does and doesn't do
  • Practical experiences with a test instrument
  • What works well and what doesn't
  • Future enhancements

3
Monitoring System Purpose
  • The system should:
      • Provide performance monitoring at service level
      • End-to-end tests yielding mixed information on
        the functioning of several services
      • Track performance changes during configuration
        changes
      • Monitor current health of the system
      • Provide some error-tracking/reporting
        capabilities
      • Be a tool for administrators and experimenters
  • It will not:
      • Provide detailed system information for fault
        diagnosis (system-specific, vendor-supplied tools
        already exist)

4
Desired Features of the System
  • View / compare past and current measurements
  • Inspect correlations between metrics
  • Allow variation of sampling rate
  • Automatically execute scheduled measurements
  • Perform measurements on demand at shorter
    intervals
  • Perform OS-independent measurements
  • Use a small fraction of available resources

5
Components of the System
  • Instruments which perform measurements
  • Centralized database of Instruments (code) and
    time-stamped results
  • Allows simple addition of new metrics
  • Allows previously run tests to be reproduced
  • Mechanism for remote execution of Instruments
  • IBM Aglets mobile-agent system
    (http://www.trl.ibm.co.jp/aglets)

[Diagram labels: parameters, code, monitor, sequence of measurements]
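
The central database of Instruments and time-stamped results could be laid out roughly as follows. This is a sketch only: the table names, column names and helper functions are assumptions, since the slides state only that code and time-stamped results are stored centrally. An in-memory SQLite database stands in for the real one.

```python
import sqlite3
import time

# Hypothetical schema: the slides only state that Instrument code and
# time-stamped results live in one central database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instrument (
        name TEXT PRIMARY KEY,
        code TEXT NOT NULL          -- source or class name of the Instrument
    );
    CREATE TABLE result (
        instrument TEXT NOT NULL REFERENCES instrument(name),
        host       TEXT NOT NULL,
        parameters TEXT NOT NULL,   -- parameter set used for this run
        taken_at   REAL NOT NULL,   -- UNIX timestamp
        value      REAL NOT NULL
    );
""")


def register_instrument(name, code):
    """Add a new metric by storing its Instrument in the database."""
    conn.execute("INSERT INTO instrument VALUES (?, ?)", (name, code))


def store_result(instrument, host, parameters, value, taken_at=None):
    """Store one measurement, stamped with the current time by default."""
    stamp = time.time() if taken_at is None else taken_at
    conn.execute("INSERT INTO result VALUES (?, ?, ?, ?, ?)",
                 (instrument, host, parameters, stamp, value))


register_instrument("file_access", "FileAccessInstrument")
store_result("file_access", "client01", "size=1MB", 0.42, taken_at=1000.0)
rows = conn.execute("SELECT host, taken_at, value FROM result").fetchall()
```

Keeping the parameter set next to each result is what would allow a previously run test to be reproduced later.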
6
Mobile Agents vs. RPC
  • Remote Procedure Call
      • A pre-defined procedure on the remote host executes
        and returns the result
      • [Diagram labels: user's system, remote system, search
        request, local search utility, dataset to search]
  • Mobile Agent
      • Daemon on the remote host accepts the agent and allows
        execution
      • Increased network load for large agents
      • [Diagram labels: user's system, remote system, search
        request, local search utility, dataset to search]
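
The difference between the two models can be illustrated with a small in-process simulation. Nothing here touches a network or uses the Aglets API; the classes `RpcHost` and `AgentHost` and the function `count_tests` are hypothetical names chosen for the sketch.

```python
class RpcHost:
    """Remote host that only runs procedures installed in advance."""

    def __init__(self, dataset):
        self.dataset = dataset
        self.procedures = {"search": self._search}

    def _search(self, term):
        return [item for item in self.dataset if term in item]

    def call(self, name, *args):
        # Only pre-defined procedures are available to the caller.
        return self.procedures[name](*args)


class AgentHost:
    """Remote host whose daemon accepts an agent and lets it execute."""

    def __init__(self, dataset):
        self.dataset = dataset

    def accept(self, agent):
        # The agent carries its own logic and runs next to the data.
        return agent(self.dataset)


data = ["nfs write test", "nfs read test", "cpu load test"]

rpc_result = RpcHost(data).call("search", "nfs")


# A metric defined later, on the user's side, with no change to the host.
def count_tests(dataset):
    return sum(1 for item in dataset if item.endswith("test"))


agent_result = AgentHost(data).accept(count_tests)
```

With RPC, a new operation requires installing a new procedure on the remote host. With the agent model, the new logic travels with the request, at the cost of moving more bytes when the agent is large.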
7
Advantages of Mobile Agents
  • Metrics can be defined at any time, and
    implemented on the central host
  • Performance is measured on the relevant host
  • Aglets system is Java-based, providing
    platform-independent execution
  • Sophisticated security model exists for
    restricting actions of the agents

8
Use of Mobile Agents In Monitoring
  • The simplest approach, Single-Remote-Host, was
    implemented for the initial configuration
  • Waiting between tests is done on the central server
    for reliability

[Diagram: Single-Remote-Host approach and Itinerary approach, each drawn with a central server and three target hosts]
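
The two mobility patterns can be compared by listing the transfers each one generates. This is a sketch under the assumption that every move between hosts counts as one transfer; the function names and host names are hypothetical.

```python
def single_remote_host(central, targets):
    """Agent visits one target and returns to the central server each time."""
    hops = []
    for target in targets:
        hops.append((central, target))
        hops.append((target, central))
    return hops


def itinerary(central, targets):
    """Agent visits every target in turn before returning once."""
    stops = [central] + list(targets) + [central]
    return list(zip(stops, stops[1:]))


def transfers_at(host, hops):
    """Number of transfers that start or end at the given host."""
    return sum(1 for src, dst in hops if host in (src, dst))


targets = ["hostA", "hostB", "hostC"]
single = single_remote_host("central", targets)
tour = itinerary("central", targets)
```

For three targets, every transfer in the Single-Remote-Host pattern involves the central server, while the itinerary touches it only on departure and on return.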
9
Anatomy of an Instrument
The code defining a specific implementation of an
Instrument is ~30 lines
[Class diagram: two "inherits from" relationships]
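
One way a concrete Instrument can stay short is to keep the shared logic in a base class. The sketch below is written in Python for brevity and does not reproduce the actual Java/Aglets class hierarchy; `Instrument`, `measure`, `run` and `SleepLatency` are hypothetical names.

```python
import time
from abc import ABC, abstractmethod


class Instrument(ABC):
    """Hypothetical base class: handles timing and result packaging."""

    def __init__(self, parameters):
        self.parameters = parameters

    @abstractmethod
    def measure(self):
        """Perform one measurement and return a numeric value."""

    def run(self):
        # Shared logic lives here so that a concrete Instrument stays short.
        return {
            "instrument": type(self).__name__,
            "parameters": self.parameters,
            "taken_at": time.time(),
            "value": self.measure(),
        }


class SleepLatency(Instrument):
    """Toy Instrument: times a short sleep of the requested length."""

    def measure(self):
        begin = time.perf_counter()
        time.sleep(self.parameters["seconds"])
        return time.perf_counter() - begin


report = SleepLatency({"seconds": 0.01}).run()
```

The concrete class only overrides `measure`; everything else is inherited.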
10
Test Instrument: File Access
  • NFS access time (write) used as test of concept
  • File size, location (file-system) are passed as
    parameters in database (specified at run-time)
  • Measurements are started by automated process as
    specified by Schedule table in database
  • Tested access to one file-system on several
    client computers
  • Linux (PIII) system with NFSv2, 1KB blocksize
  • Linux (PIII) system with NFSv2, 8KB blocksize
  • Linux (PIII) system with NFSv3
  • Solaris system with NFSv3
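
A write-time measurement of this kind might look like the following sketch. The function `time_file_write` and its parameters are assumptions, and a local temporary directory stands in for the NFS mount point. Note that `block_size` here is the size of each application write, not the NFS mount block size listed above.

```python
import os
import tempfile
import time


def time_file_write(directory, size_bytes, block_size=8192):
    """Write size_bytes to a new file in directory; return (seconds, bytes)."""
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        begin = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            remaining = size_bytes
            while remaining > 0:
                chunk = block[:min(block_size, remaining)]
                handle.write(chunk)
                remaining -= len(chunk)
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to flush buffered data
        elapsed = time.perf_counter() - begin
        written = os.path.getsize(path)
    finally:
        os.remove(path)
    return elapsed, written


# A local temporary directory stands in for the NFS mount point here.
with tempfile.TemporaryDirectory() as mount_point:
    elapsed, written = time_file_write(mount_point, size_bytes=100_000)
```

File size and location are passed in as arguments, mirroring the way the slides describe them as run-time parameters.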

11
Report Generation Tool
  • Sample tests are carried out automatically by a
    Scheduler Aglet
  • Reports are requested via an HTML form. Users
    specify a test type, parameter set and target
    host. A Perl CGI script queries the database and
    plots results using Gnuplot.
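
The query step of the report tool can be sketched as below. The original uses a Perl CGI script; this Python version, the `result` table layout and the sample rows are all illustrative assumptions. The output is one whitespace-separated "timestamp value" line per measurement, a layout Gnuplot can read as a data file.

```python
import sqlite3

# Hypothetical table and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE result (
    test_type TEXT, parameter_set TEXT, host TEXT,
    taken_at REAL, value REAL)""")
conn.executemany("INSERT INTO result VALUES (?, ?, ?, ?, ?)", [
    ("file_access", "1MB", "client01", 100.0, 0.40),
    ("file_access", "1MB", "client01", 200.0, 0.55),
    ("file_access", "1MB", "client02", 100.0, 0.90),
])


def report_data(test_type, parameter_set, host):
    """Return time-ordered 'timestamp value' lines, one per measurement."""
    rows = conn.execute(
        "SELECT taken_at, value FROM result "
        "WHERE test_type = ? AND parameter_set = ? AND host = ? "
        "ORDER BY taken_at",
        (test_type, parameter_set, host),
    )
    return [f"{taken_at:.0f} {value:.2f}" for taken_at, value in rows]


lines = report_data("file_access", "1MB", "client01")
```

The three arguments correspond to the three fields the user fills in on the form.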

12
Sample Report for File Access
  • Results indicate server load, client config
[Plot annotations: nightly backups, weekly de-frag]
13
Problems With the Mobile Agents
  • Transfer interrupted when several agents move to
    / from the same host within ~1-2 sec
  • Small size of Aglets currently used (~15KB)
    cannot explain the effective dead-time
  • The failure is presented to the Aglet as a
    refusal (can detect, wait and retry)
  • Congestion at central host can be relieved by
    following a circuit before returning (multiple
    hosts)
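
The detect, wait and retry behaviour can be sketched as follows. The exception `TransferRefused`, the function `dispatch_with_retry` and the doubling delay are assumptions made for illustration; the slides do not say how long the agents wait or how many times they retry.

```python
import time


class TransferRefused(Exception):
    """Stand-in for the refusal an agent sees when a transfer fails."""


def dispatch_with_retry(dispatch, attempts=5, delay=0.01, sleep=time.sleep):
    """Call dispatch(); on refusal wait and retry, doubling the delay."""
    for attempt in range(1, attempts + 1):
        try:
            return dispatch(), attempt
        except TransferRefused:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= 2


# Simulated host that refuses the first two transfers.
state = {"calls": 0}


def flaky_dispatch():
    state["calls"] += 1
    if state["calls"] <= 2:
        raise TransferRefused("host busy")
    return "arrived"


outcome, tries = dispatch_with_retry(flaky_dispatch)
```

Increasing the delay after each refusal spreads out agents that would otherwise retry against the same host at the same moment.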

14
Future System Development
  • Solve transfer interruption problem
  • Development of other mobility patterns
  • NFS read-access may be tested by writing on one
    host and timing a read on a different host (to
    avoid caching)
  • Use of itinerary can ease network congestion at
    the central server
  • A tracking / error-reporting system is being
    developed, and will be connected to a paging
    system

15
Summary
  • Initial implementation is proving useful
  • Mobile agent architecture adds design work but
    eases implementation, adds flexibility
  • Transfer interruption is causing scalability
    problems, but these are not insurmountable
  • Plan to have expanded system running before
    data-taking begins

16
Questions...
Richard Ibbotson, BNL ibbotson_at_bnl.gov
Thanks to
David Stampf, BNL; Tom Throwe, BNL; Bruce Gibbard, BNL