A Scalable Service Architecture for Distributed Search


Transcript and Presenter's Notes


1
A Scalable Service Architecture for Distributed
Search
  • Mark Jessop
  • University of York

2
The Search Problem
  • DAME demonstrated the Grid for distributed
    decision support in maintenance environments
    (Rolls-Royce)
  • The core element is searching large distributed
    datasets
  • A query-by-content approach is used to find
    vibration patterns similar to observed engine
    anomalies
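
As a rough illustration of query by content, the sketch below slides a short query pattern over a longer signal and ranks the windows by Euclidean distance. The slides do not describe the matching algorithm DAME used, so the distance measure, the function name `best_matches` and the sample data are all assumptions made for this example.

```python
import math

def best_matches(signal, query, k=3):
    """Return the k (distance, offset) pairs whose windows lie closest to `query`.

    Hypothetical helper: plain Euclidean distance over a sliding window,
    not the pattern matching engine used in DAME.
    """
    m = len(query)
    scored = []
    for start in range(len(signal) - m + 1):
        window = signal[start:start + m]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, query)))
        scored.append((dist, start))
    scored.sort()  # smallest distance first; ties broken by offset
    return scored[:k]

# Made-up vibration trace containing one exact and one near match.
signal = [0, 0, 1, 3, 1, 0, 0, 1, 3, 2, 0]
query = [1, 3, 1]
matches = best_matches(signal, query, k=2)
print(matches)
```

The query itself is only a few values long while the signal may be very large, which is why sending the query to the data is cheaper than moving the data to the query.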

3
Service Architecture
  • Large datasets and small queries: move
    processing to the data
  • Service stack: Web Services, native apps, data
    management service
  • PMC: front end, manages search requests
  • PMS: performs pattern matching
  • SRB: gives a virtualised view of the data
  • All services work only at the local level
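
The division of labour above can be sketched in a few lines: each node searches only its own data, and a front end fans the query out and merges the partial results. This is an in-process toy, not the DAME services; `LocalNode`, `front_end_search`, the numeric "records" and the absolute-difference score are hypothetical stand-ins for the PMS, the PMC and real pattern matching.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

class LocalNode:
    """Hypothetical node: holds its own data and searches only locally."""

    def __init__(self, name, records):
        self.name = name
        self.records = records  # the data never leaves the node

    def search(self, query, limit):
        # Score every local record, return only the best `limit` hits.
        scored = [(abs(value - query), self.name, value) for value in self.records]
        return heapq.nsmallest(limit, scored)

def front_end_search(nodes, query, limit=100):
    """Hypothetical front end: send the small query out, merge the results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda node: node.search(query, limit), nodes))
    return heapq.nsmallest(limit, (hit for part in partials for hit in part))

nodes = [
    LocalNode("york", [1, 8, 15]),
    LocalNode("leeds", [4, 9, 30]),
    LocalNode("sheffield", [10, 11, 50]),
]
top = front_end_search(nodes, query=10, limit=3)
print(top)
```

Only the query and at most `limit` hits per node cross the network in this arrangement; the merge step at the front end is the serial part of the work.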

4
Evaluation
  • Aim was to be efficient and scalable in
    distributed search
  • Performance evaluation of the architecture was
    required for verification
  • Three implementations compared:
  • Java/OGSA web services
  • C (gSOAP) web services
  • Native sockets (C)
  • How many nodes can the architecture support?

5
Modelling the System
  • Assumptions:
  • Each node is of the same specification
  • The pattern matching algorithm is fully parallel
  • Every node is an equal distance from the master
  • Clients ask for no more than 100 results
  • The model investigates the performance of a
    single search

T = time, w = work, n = number of nodes, o = serial
part of overhead, c = parallel part of overhead

6
Measuring the System
  • Performance increases as nodes are added, until
    the optimum number of nodes is reached
  • The larger the workload, the more nodes can be
    added while still maintaining efficiency
  • The serial overhead governs efficiency and
    scalability
  • Experiments aim to measure this overhead
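
The behaviour described above can be reproduced with a small model. The transcript does not preserve the model equation, so the form used here, T = w/n + o*n + c, is an assumption chosen to match the statement that serial overhead grows with the number of nodes; the parameter values are invented for illustration.

```python
import math

def search_time(w, n, o, c):
    """Assumed model, not taken from the slides: T = w/n + o*n + c.

    w/n is the perfectly parallel work, o*n the serial overhead paid once
    per node, and c an overhead incurred on all nodes at the same time.
    """
    return w / n + o * n + c

def optimal_nodes(w, o):
    """Node count minimising the assumed model.

    Setting d/dn (w/n + o*n) = 0 gives n = sqrt(w/o).
    """
    return math.sqrt(w / o)

w, o, c = 400.0, 1.0, 2.0          # illustrative values only
n_star = optimal_nodes(w, o)
print(n_star, search_time(w, n_star, o, c))
print(search_time(w, 10, o, c), search_time(w, 40, o, c))
```

Under this assumed form, quadrupling the work doubles the optimum node count, and halving the serial overhead `o` has the same effect as doubling the work, which is why the experiments concentrate on measuring that overhead.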

7
Performance
  • Nodes running at Sheffield, Leeds and York
  • Measured invocation times from Sheffield and
    Leeds to York.
  • Each measurement recorded three search
    invocations at a time for each implementation,
    repeated throughout the day
  • Recorded raw performance, no security
    infrastructure
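
A measurement loop of this shape might look like the sketch below: time a batch of three invocations, then repeat the batch several times. The function name `measure_invocations`, the `rounds` and `pause_s` parameters and the `fake_search` stand-in are assumptions; the real experiments invoked remote search services rather than a local sleep.

```python
import statistics
import time

def measure_invocations(invoke, batch=3, rounds=4, pause_s=0.0):
    """Time `batch` calls to `invoke`, repeated `rounds` times.

    Returns one wall-clock duration in seconds per call.
    """
    samples = []
    for _ in range(rounds):
        for _ in range(batch):
            start = time.perf_counter()
            invoke()
            samples.append(time.perf_counter() - start)
        time.sleep(pause_s)  # spacing between batches, e.g. across a day
    return samples

def fake_search():
    """Stand-in for a remote search invocation (hypothetical)."""
    time.sleep(0.001)

samples = measure_invocations(fake_search)
summary = {
    "count": len(samples),
    "mean": statistics.mean(samples),
    "median": statistics.median(samples),
}
print(summary)
```

Reporting the median alongside the mean helps when a few invocations are slowed by transient network load.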

8
Performance
9
Hierarchical Organisation
  • In the current model, the overhead is
    proportional to the number of nodes
  • A hierarchy allows concurrency in delivering
    requests and merging results
  • Search time complexity could be reduced from
    w/n + n to w/n + log n
  • Increases management complexity
  • Assumes the structure doesn't add overhead
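
To make the comparison concrete, the sketch below evaluates both complexity terms for one configuration. It assumes a binary tree, a unit cost `o` per overhead step, and, as the slide itself assumes, that the hierarchy adds no overhead of its own; the function names and numbers are illustrative only.

```python
import math

def flat_time(w, n, o):
    """Flat organisation: the master handles each of the n nodes in turn."""
    return w / n + o * n

def hierarchical_time(w, n, o):
    """Binary-tree organisation (an assumption): overhead grows with tree depth."""
    return w / n + o * math.log2(n)

w, n, o = 640.0, 64, 1.0            # illustrative values only
print(flat_time(w, n, o))
print(hierarchical_time(w, n, o))
```

With these values the parallel work term is the same in both cases, so the whole difference comes from replacing the linear overhead term with a logarithmic one.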

10
Hierarchical Organisation
11
Conclusions
  • Use of Grid Services in Java restricts
    performance and scalability
  • C services and raw sockets are similar in
    performance
  • Hierarchical organisation could significantly
    improve performance