Pattern Matching in DAME using AURA technology

1
Pattern Matching in DAME using AURA technology
  • Jim Austin, Robert Davis, Bojian Liang, Andy
    Pasley
  • University of York

2
Overview
  • Context
  • AURA technology
  • DAME pattern matching problem
  • AURA solution
  • Search performance
  • Next steps

3
Context
  • Vibration data from all engines in flight
  • Detection of unusual vibration patterns
    • Novelties, anomalies
    • Automatic or manual
  • Search for similar vibration behaviour
    • Need to search large volumes of historical
      vibration data
  • Investigate search results and associated data
    • Service data records
    • CBR tools (Sheffield)

4
AURA technology
  • AURA
    • Proven technology for searching large data sets
    • Ability to scale and maintain performance
    • Easily parallelised
  • Examples
    • Address matcher
    • Molecular matcher
  • Operation
    • Vectors compared to stored examples
    • Uses bit-level comparison methods
    • Correlation Matrix Memory operations
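
The Operation bullets can be pictured with a small binary correlation matrix memory (CMM). What follows is a minimal sketch of a generic binary CMM, not the AURA library itself: the class name `BinaryCMM`, its methods, and the toy patterns are all hypothetical and chosen only for illustration.

```python
# Minimal binary correlation matrix memory (CMM) sketch.
# Illustrative only: names here are hypothetical, not the AURA API.

class BinaryCMM:
    def __init__(self, n_inputs, n_outputs):
        # One row of bits per input line, one column per output line.
        self.weights = [[0] * n_outputs for _ in range(n_inputs)]

    def store(self, input_bits, output_bits):
        # Set a weight wherever an input bit and an output bit are both 1
        # (the outer product is OR-ed into the matrix).
        for i, x in enumerate(input_bits):
            if x:
                row = self.weights[i]
                for j, y in enumerate(output_bits):
                    if y:
                        row[j] = 1

    def recall(self, input_bits):
        # Sum the rows selected by the set input bits, column by column.
        n_outputs = len(self.weights[0])
        sums = [0] * n_outputs
        for i, x in enumerate(input_bits):
            if x:
                for j in range(n_outputs):
                    sums[j] += self.weights[i][j]
        return sums

    def recall_thresholded(self, input_bits, threshold):
        # Keep only the columns whose sum reaches the threshold.
        return [1 if s >= threshold else 0 for s in self.recall(input_bits)]


cmm = BinaryCMM(n_inputs=6, n_outputs=4)
cmm.store([1, 0, 1, 0, 0, 0], [1, 0, 0, 0])  # pattern A -> record 0
cmm.store([0, 1, 0, 1, 0, 0], [0, 1, 0, 0])  # pattern B -> record 1
cmm.store([1, 0, 0, 0, 1, 0], [0, 0, 1, 0])  # pattern C -> record 2

query = [1, 0, 1, 0, 0, 0]
sums = cmm.recall(query)
matches = cmm.recall_thresholded(query, threshold=2)
```

In this toy case the query shares two set bits with pattern A and one with pattern C, so the column sums are [2, 0, 1, 0] and a threshold of 2 leaves only record 0. Lowering the threshold admits partial matches, which is one way such a memory can support imprecise search.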

5
AURA architecture
[Figure: AURA architecture diagram. Labelled components: Data Adaptor, binary AURA SearchEngine, Candidate Selector, Indexer, Candidate Engine (back check), and stores for patterns, results, and indexes or data. Labelled flows: input pattern, output pattern, store, search, result.]

6
AURA storage and recall
[Figure: Correlation Matrix Memories inside the binary AURA SearchEngine, with an input pattern, an output pattern, and the example values 2, 1, 2, 0, 0, 0, 0.]

7
AURA software
  • AURA re-designed
  • To improve performance of the AURA library in
    terms of both memory usage and search times
    • 3-fold reduction in memory
    • 3-fold reduction in search time
  • To make the library easy to use
    • Simple API
    • Typically only 4 or 5 API calls used
  • Enable implementation as an OGSI GT3 service
  • To engineer the library to commercial software
    standards
    • Comprehensive user guide and reference manual

8
Pattern matching problem
  • Vibration data from sensors forms Z-mod data.
  • Tracked orders extracted from Z-mod data

[Figure: Z-mod data and an extracted tracked order; axes labelled Frequency, Amplitude, and Time.]

9
Pattern matching problem
  • Novelty or anomaly identified in tracked order
    data by feature detectors

[Figure: the region flagged in the tracked order forms the query sub-sequence.]

10
Pattern matching problem
  • Search for sub-sequences similar to the query in
    a large volume of tracked order data
    • Need to investigate all possible alignments
    • Benchmark method is sequential scan
  • Noisy data, so imprecise matching is required
  • Various possible similarity measures
    • Euclidean distance
    • Correlation
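
The benchmark sequential scan can be sketched as follows: slide the query over every alignment of the stored series and score each window. This is a minimal sketch under stated assumptions, not the DAME implementation. The function names, the top-k return convention, and the toy data are hypothetical; Euclidean distance is used for ranking and Pearson correlation is shown as the alternative measure.

```python
import math

def euclidean(a, b):
    # Euclidean distance between two equal-length sequences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson(a, b):
    # Pearson correlation coefficient between two equal-length sequences.
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # undefined for a flat window; treated here as no match
    return cov / math.sqrt(var_a * var_b)

def sequential_scan(series, query, k=3):
    # Score every alignment of the query and return the k closest
    # as (distance, start_index) pairs, nearest first.
    m = len(query)
    scored = []
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        scored.append((euclidean(window, query), start))
    scored.sort()
    return scored[:k]

# Toy tracked-order data (made up for illustration).
series = [0.0, 0.1, 0.0, 1.0, 2.0, 1.0, 0.0, 0.1, 0.9, 2.1, 1.1, 0.0]
query = [1.0, 2.0, 1.0]
best = sequential_scan(series, query, k=2)
```

Every alignment costs one full distance computation, so the work grows with the product of series length and query length. That cost is what the later stages aim to avoid.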

11
AURA solution
[Figure: the query time series and the stored time series are encoded; the AURA SearchEngine compares the encoded query with the encoded time series to produce candidate matches, which the AURA backcheck reduces to results.]

12
AURA solution
  • Encoding: reduction in dimensionality
    • e.g. from 100 points to 10 values
  • Approximate search
    • From 1,000,000s of alignments down to 1000s of
      candidate matches
  • Backcheck
    • From 1000s of candidate matches to 100 or fewer
      results

13
Encoding technique
  • Piecewise Aggregate Approximation
  • Values encoded using integer bins
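
A minimal sketch of the encoding step, assuming equal-length segments and evenly spaced bins. The slide does not give the bin boundaries or the number of bins, so `low`, `high`, and `n_bins` are hypothetical parameters, and the function names are illustrative.

```python
def paa(values, segments):
    # Piecewise Aggregate Approximation: mean of each equal-length segment.
    if len(values) % segments != 0:
        raise ValueError("length must be a multiple of segments")
    size = len(values) // segments
    return [sum(values[i * size:(i + 1) * size]) / size
            for i in range(segments)]

def to_bins(values, low, high, n_bins):
    # Map each value to an integer bin index in [0, n_bins - 1].
    # Evenly spaced bins are an assumption; out-of-range values are clamped.
    width = (high - low) / n_bins
    codes = []
    for v in values:
        code = int((v - low) / width)
        codes.append(min(max(code, 0), n_bins - 1))
    return codes

# A 100-point window reduced to 10 values, then to integer codes.
window = [i / 10 for i in range(100)]          # ramp from 0.0 to 9.9
reduced = paa(window, segments=10)             # segment means k + 0.45
codes = to_bins(reduced, low=0.0, high=10.0, n_bins=5)
```

For this ramp the ten segment means are 0.45, 1.45, and so on up to 9.45, which fall into bins [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]. The integer codes are the kind of discrete values that can then be turned into binary patterns for bit-level matching.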

14
Search efficiency
  • Approximate search using AURA
    • Fast method of discarding poor matches
    • AURA search typically an order of magnitude or
      more faster than sequential scan
    • Candidate matches typically <1% of total
  • Back check stage very efficient due to reduction
    in volume of data
    • Typically 1% or less of the processing time for
      a full sequential scan
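
The two-stage idea (cheap filter, then exact back check on the survivors) can be sketched as below. Note the substitution: the filter here is a plain lower-bounding distance on segment means, used as a stand-in for the AURA search, which this sketch does not reproduce. The function names, the radius-based match criterion, and the synthetic data are assumptions.

```python
import math

def paa(values, segments):
    # Segment means; assumes len(values) is a multiple of segments.
    size = len(values) // segments
    return [sum(values[i * size:(i + 1) * size]) / size
            for i in range(segments)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def filter_then_backcheck(series, query, segments, radius):
    m = len(query)
    # Scaling the distance between segment means by sqrt(m / segments)
    # gives a lower bound on the full Euclidean distance, so the filter
    # never discards a window that is truly within the radius.
    scale = math.sqrt(m / segments)
    coarse_query = paa(query, segments)

    # Stage 1: cheap filter on the reduced representation.
    # (A real system would encode the stored data once, ahead of time.)
    candidates = []
    for start in range(len(series) - m + 1):
        coarse = paa(series[start:start + m], segments)
        if scale * euclidean(coarse, coarse_query) <= radius:
            candidates.append(start)

    # Stage 2: back check the candidates against the full-resolution data.
    results = [s for s in candidates
               if euclidean(series[s:s + m], query) <= radius]
    return candidates, results

# Synthetic data, made up for illustration.
series = [math.sin(i / 3.0) + 0.5 * math.sin(i / 7.0) for i in range(200)]
query = series[40:56]
candidates, results = filter_then_backcheck(series, query,
                                            segments=4, radius=0.5)
```

The back check only touches the candidate list, which is why its cost stays small when the filter removes most alignments.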

15
Data size
  • Assume
    • Fleet of 100 aircraft, 4 engines each
    • Flying 10 hours per day
    • 5 data points per tracked order per second
    • 4 bytes per data point
  • Totals
    • Approx. 100 gigabytes per year per tracked order
    • Roughly 10 tracked orders of interest, so total
      approx. 1 terabyte per year
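
The totals follow directly from the assumptions. A short worked calculation, assuming 365 flying days per year (the slide does not state this) and decimal gigabytes:

```python
AIRCRAFT = 100
ENGINES_PER_AIRCRAFT = 4
FLIGHT_HOURS_PER_DAY = 10
POINTS_PER_SECOND = 5        # per tracked order
BYTES_PER_POINT = 4
TRACKED_ORDERS = 10
DAYS_PER_YEAR = 365          # assumption: not stated on the slide

# Data points per tracked order per year, across the whole fleet.
points_per_year = (AIRCRAFT * ENGINES_PER_AIRCRAFT * FLIGHT_HOURS_PER_DAY
                   * 3600 * POINTS_PER_SECOND * DAYS_PER_YEAR)

bytes_per_order_per_year = points_per_year * BYTES_PER_POINT
total_bytes_per_year = bytes_per_order_per_year * TRACKED_ORDERS

print(points_per_year)                  # 26280000000
print(bytes_per_order_per_year / 1e9)   # 105.12 (GB per tracked order)
print(total_bytes_per_year / 1e12)      # 1.0512 (TB for 10 tracked orders)
```

The point count of roughly 26 billion per year is also close to the figure of 25,000,000,000 alignments per year used for the search estimates.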

16
Search performance
  • Deployed system assumptions
    • 100 CPUs, 2 GHz each, with 1 GByte RAM
    • One per aircraft
    • Each search needs to check 25,000,000,000
      alignments of the query per year of tracked
      order data
  • Sequential scan
    • Measured at approx. 2 seconds for 5,000,000
      alignments of a 100 data point query (one CPU)
    • Extrapolates to approx. 500 seconds to search 5
      years of data, assuming 1 CPU per aircraft
    • This is too slow! Need to support multiple
      searches and searches on more than one tracked
      order
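
The 500-second figure can be reproduced from the measured scan rate. This worked calculation assumes the alignments divide evenly across the 100 CPUs with no communication or scheduling overhead, which is an idealisation:

```python
ALIGNMENTS_PER_YEAR = 25_000_000_000   # fleet-wide, one tracked order
YEARS = 5
CPUS = 100                             # one per aircraft

MEASURED_ALIGNMENTS = 5_000_000        # benchmark on one CPU
MEASURED_SECONDS = 2.0

# Sequential-scan throughput of a single CPU.
alignments_per_second = MEASURED_ALIGNMENTS / MEASURED_SECONDS   # 2.5e6

# Work per CPU if the data is split evenly (idealised assumption).
alignments_per_cpu = ALIGNMENTS_PER_YEAR * YEARS / CPUS          # 1.25e9

scan_seconds = alignments_per_cpu / alignments_per_second
print(scan_seconds)   # 500.0
```

This covers a single query on a single tracked order, so several concurrent searches or several tracked orders multiply the time accordingly.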

17
Search performance
  • Using AURA and PAA based approach
    • Search time reduced by approx. an order of
      magnitude
    • Can search 5 years of data for 100 aircraft in
      approx. 50 seconds
    • Believe this to be a workable solution
  • But response times potentially slower than this
    • Need to handle a number of searches in parallel
    • Communications and other overheads

18
Next steps
  • Technology
    • Refine similarity measures and encoding methods
  • Architecture
    • Develop additional services to distribute and
      organise the search
    • Support multiple searches in parallel
  • Measurement
    • Perform scaling trials on engine data
    • Obtain better estimates of overall performance
      • Multiple searches
      • Overheads