
Title: Tycho: A Resource Discovery and Messaging Framework for Distributed Applications


1
Tycho: A Resource Discovery and Messaging
Framework for Distributed Applications
Matthew Grove, m.grove_at_reading.ac.uk
Viva Presentation, November 2006
2
Outline
  • Research Goals,
  • An Overview Of Tycho,
  • Comparative Benchmarks,
  • Applications of Tycho,
  • Tycho Swarm, a File Distribution Utility
    (Demo),
  • Summary.

3
Some Background
  • Two key services for distributed systems are a
    mechanism for discovering remote components (such
    as a registry) and a mechanism for sending
    messages between these components.
  • These two services are interdependent.
  • Current solutions require the application
    scientists to assemble their systems from a
    diverse range of services.
  • One approach has been to produce toolkits which
    have pre-selected sets of services bundled
    together, for example Globus.

4
Research Goals
  • The thesis of this research work is that by
    combining registry and messaging into a single
    software framework, the task of binding together
    distributed systems can be simplified.
  • The proposed solution uses an Internet-based
    architecture that keeps complexity at the edges
    of a robust and secure set of core services - a
    novel approach!
  • This framework facilitates extensibility while
    limiting the installation and management costs of
    using the software.
  • The design and development of the framework -
    known as Tycho - has an overarching goal of
    reducing the complexity of developing distributed
    applications.

5
High-level Requirements
  • These are the desirable features for Tycho, as
    argued in the dissertation:
  • Scalability: able to cope with the sizes
    typical of modern distributed systems,
  • High performance,
  • Extensibility: able to add new features and
    interoperate with other systems,
  • Security out of the box,
  • Manageability: ease of installation and use,
    for example minimizing elements like software
    dependencies, firewall requirements and the
    amount of configuration needed to deploy Tycho.

6
The Tycho Implementation
  • Tycho is the reference implementation of the
    framework developed during the PhD.
  • The Tycho components are:
  • Mediators,
  • Clients (Producers and Consumers),
  • Utilities.
  • The Tycho mediator provides services that allow
    clients to discover each other using a Virtual
    Registry (VR) made up of a network of mediators;
    this also aids communication over both LAN and
    WAN.
  • Utilities are extensions to Tycho's
    functionality.
  • Tycho used to be called javaGMA or jGMA (poor
    choice of name!)
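
The idea of pairing discovery with messaging in one component can be illustrated with a toy, in-memory sketch. It is not Tycho's API: the class name ToyMediator and the methods register, discover and send are hypothetical, and a single object stands in for what the slides describe as a network of mediators.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for a mediator; NOT Tycho's real API.
class ToyMediator {
    // Registry side: producer name -> handler supplied by that producer.
    private final Map<String, Consumer<String>> registry = new ConcurrentHashMap<>();

    // A producer advertises itself under a name and supplies a message handler.
    void register(String name, Consumer<String> handler) {
        registry.put(name, handler);
    }

    // A consumer discovers producers whose name starts with the given prefix.
    // Results are sorted so the output is deterministic.
    List<String> discover(String prefix) {
        List<String> matches = new ArrayList<>();
        for (String name : registry.keySet()) {
            if (name.startsWith(prefix)) {
                matches.add(name);
            }
        }
        Collections.sort(matches);
        return matches;
    }

    // Messaging side: deliver a message to a previously discovered producer.
    // Returns false if no producer is registered under that name.
    boolean send(String name, String message) {
        Consumer<String> handler = registry.get(name);
        if (handler == null) {
            return false;
        }
        handler.accept(message);
        return true;
    }
}
```

A consumer would call discover with a prefix such as "webcam/" and then send to one of the returned names; both steps go through the same object, which is the simplification the thesis argues for.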

7
Tycho's Architecture
8
General Design Philosophy
  • Reuse existing software components, if possible,
    rather than reinvent existing services or
    functionality.
  • Try to make use of existing software
    infrastructure.
  • Ensure that Tycho is simple to install, configure
    and use.
  • Provide a basic release with the ability to
    extend functionality with a further more
    sophisticated component - Tycho utilities.
  • Because we require portability and
    interoperability with other distributed systems,
    Java was a good choice of implementation language.

9
Tycho Mediator Implementation
  • Tycho provides a choice of implementations for
    each core service.
  • Tycho's design is described in a paper for the
    "Work-in-Progress Novel Grid Technologies" track
    of the IEEE International Conference Cluster
    Computing and Grid 2005 (CCGrid 2005).

10
Tycho Clients and Utilities
  • The Tycho Connector provides the API for building
    producers and consumers.
  • Extra functionality can be added as utilities.

11
An Example of Tycho's Setup
12
Tycho Benchmarks
  • Three rounds of benchmarking measured the
    performance of Tycho compared to state-of-the-art
    and widely used systems:
  • Communications - measured the performance of
    inter-client and inter-mediator messaging for
    Tycho and NaradaBrokering.
  • Virtual Registry tests - measured and compared
    the performance of the Tycho VR to Globus MDS4
    and gLite R-GMA.
  • Component Tests - different components of the VR
    were tested in various configurations.
  • Results are presented in a paper in the
    proceedings of the IEEE International Conference
    on Cluster Computing 2006 (Cluster 2006).

13
Sample VR Benchmark Results
MDS4 out of memory
14
Benchmark Results Summary
  • Tycho has better performance and client
    scalability than R-GMA, MDS4 and
    NaradaBrokering.
  • R-GMA, MDS4 and NaradaBrokering all crashed
    during testing when they exceeded the maximum
    memory available for the tests (1.5 Gbytes).
  • Memory management in Java systems is an issue:
  • Without bounded buffering or flow control, the
    Java heap can be exhausted.
  • Storing information internally using XML seems to
    be a source of some of these memory problems.
  • Java database solutions such as HSQLDB can
    provide a high-performance solution for
    off-loading some of the storage requirements to
    disk.
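
The point about buffering and flow control can be shown with a small sketch using the standard java.util.concurrent.ArrayBlockingQueue. This is an illustration of a bounded buffer in general, not how Tycho or the benchmarked systems handle messages; the class name BoundedInbox and its methods are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical message inbox with a fixed capacity, so queued messages
// cannot grow without limit and consume the whole heap.
class BoundedInbox {
    private final BlockingQueue<byte[]> queue;

    BoundedInbox(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Non-blocking enqueue: returns false when the inbox is full, which
    // tells the sender to back off (a simple form of flow control).
    boolean tryAccept(byte[] message) {
        return queue.offer(message);
    }

    // Removes and returns the oldest message, or null if the inbox is empty.
    byte[] next() {
        return queue.poll();
    }

    // Number of messages currently waiting.
    int pending() {
        return queue.size();
    }
}
```

With a capacity of 2, a third tryAccept call returns false until next() has drained at least one message, so memory use is capped by capacity times the largest message size.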

15
Tycho Core Future Work
  • Some more performance improvements:
  • Caching of local mediator queries to reduce
    response times,
  • Use of a hybrid VR-interconnect that uses IRC for
    query routing and HTTP for transporting large
    responses.
  • Additional functionality can be added to provide
    advanced services:
  • WS-based transport handlers for interoperability.
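
The slides do not say how the proposed query cache would work. As one possible shape, the sketch below uses a small least-recently-used cache built on the standard LinkedHashMap in access order. The class name QueryCache, the use of strings for queries and responses, and the LRU eviction policy are all assumptions; a real cache would also need some way to expire stale registry entries.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache mapping a query string to its cached response.
class QueryCache extends LinkedHashMap<String, String> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    QueryCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most
        // recently accessed, which is what LRU eviction needs.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts
    // the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        return size() > maxEntries;
    }
}
```

A mediator using such a cache would call get(query) first and only run the full query on a miss, then put the response back into the cache.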

16
Tycho Applications
  • We developed a number of applications to further
    validate the implementation.
  • These include:
  • Demonstrations of publishing and discovering
    distributed webcams,
  • Remote resource discovery for the VOTechBroker
    project:
  • Part of the European Virtual Observatory project,
    Tycho provides automatic resource discovery for
    job submission.
  • Binding components together for the Semantic Log
    Analyser (Slogger) project:
  • Here Tycho helps locate and gather distributed
    logs for analysis.

17
Content Distribution With Tycho
  • We wanted to develop a Tycho utility that would
    demonstrate and validate the utility concept.
  • We wanted to create something useful!
  • We created a content distribution system called
    the Tycho swarm utility.
  • The swarm utility provides content distribution
    similar to BitTorrent and overcomes the common
    2 Gigabyte file size problem.
  • Content is split into chunks and the VR is used
    to store chunk availability.
  • Peers use the VR to locate each other and decide
    what chunks to download.
  • Tycho messages are used to transfer the chunks
    between peers and peers cooperate to distribute
    the content throughout the swarm.
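
The chunking and chunk-selection steps described above can be sketched as follows. This is a simplified illustration, not the swarm utility's code: the class SwarmSketch and its methods are hypothetical, content is held in memory as a byte array, and chunk availability is passed in as plain sets of chunk indices rather than being read from the VR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper showing chunk splitting and chunk selection.
class SwarmSketch {
    // Split content into fixed-size chunks; the last chunk may be shorter.
    static List<byte[]> split(byte[] content, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < content.length; offset += chunkSize) {
            int end = Math.min(content.length, offset + chunkSize);
            chunks.add(Arrays.copyOfRange(content, offset, end));
        }
        return chunks;
    }

    // Chunk indices that a peer advertises and that we do not yet hold,
    // returned in ascending order.
    static Set<Integer> wanted(Set<Integer> have, Set<Integer> peerHas) {
        Set<Integer> result = new TreeSet<>(peerHas);
        result.removeAll(have);
        return result;
    }
}
```

For example, splitting 10 bytes with a chunk size of 4 gives three chunks of 4, 4 and 2 bytes, and a peer holding chunks 0 and 1 would want chunks 2 and 3 from a peer advertising 1, 2 and 3.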

18
Swarm Utility Architecture
19
Swarm Utility Summary
  • The utility was developed to test the potential
    of Tycho utilities and also further stress test
    the overall infrastructure
  • By simultaneously utilising the VR and messaging
    functionality,
  • Storing and updating thousands of entry records
    in the VR,
  • Sending thousands of multi-megabyte messages
    between clients.
  • Its potential uses include
  • Distributing files for collaboration purposes,
  • Staging data for computation,
  • Mirroring and managing large data sets.

20
Swarm Utility Demo
21
Summary
  • The reference implementation of Tycho has been
    completed.
  • Tycho has been released under the LGPL Open
    Source license:
  • http://acet.rdg.ac.uk/projects/tycho/
  • The focus now is on developing Tycho utilities to
    provide more feature-rich functionality.
  • This work has been summarised in a paper accepted
    for a special issue of The Journal of
    Supercomputing.

22
Research Goals
  • Scalability and high-performance have been
    demonstrated by the benchmarking.
  • Extensibility has been shown with the development
    of the swarm utility and the different services
    and protocols supported by Tycho.
  • Tycho has security out of the box, using HTTPS
    and passwords or certificates for wide-area
    access control and encryption - no comparable
    system we reviewed has this currently.
  • Manageability has been maximised: Tycho requires
    one firewall port, has no external dependencies
    other than a JVM and can run with zero
    configuration.

23
Some Experiences / Observations
  • Java developers should think carefully about how
    memory is used in their applications.
  • Systems which store their data internally as XML
    will probably have relatively poor performance
    and require large amounts of memory and resources
    to work.
  • If you use a servlet container, Jetty offers much
    better performance than Apache Tomcat.
  • Instead of using a separate database, consider
    the Java-based HSQLDB. We have shown it can
    achieve excellent performance and it removes an
    external dependency from your software.
  • Java is not a magic bullet for portability;
    systems such as R-GMA are evidence of this.

24
Links
  • Project Web page
  • http://acet.rdg.ac.uk/projects/tycho/
  • The DSG Web page
  • http://dsg.port.ac.uk/
  • The ACET Web page
  • http://acet.port.ac.uk/