Knowledge Extraction from Scientific Data

1
  • Knowledge Extraction from Scientific Data
  • Roy Williams
  • California Institute of Technology
  • roy_at_caltech.edu
  • SDMIV, 24 October 2002, Edinburgh

KE Tools
S Data
2
Scientific Data
  • Datacubes
  • N-dimensional array
  • spectrum, time-series, image, voxels, hyperspectral image
  • Concentration
  • Pattern matching
  • Integration
  • Event Sets
  • Often derived from pattern matching
  • A set of events is a table
  • Integrating Event Sets
  • Clustering

3
Knowledge Extraction
  • Concentration
  • principal components
  • cluster/outlier finding
  • Datacube → Eventset
  • Pattern matching
  • From theory or from training set
  • Integration
  • registration of datacubes
  • join / crossmatch of eventsets

4
Datacube
Some stars from the DPOSS survey
5
Datacube
An AVIRIS image of San Francisco Bay
atmospheric absorption
400-2500 nm in 224 bands (R. Green, JPL)
6
Concentrating Information
  • e.g. Principal Component Analysis
  • Given a set of vectors
  • Compute dot products
  • (same as correlations)
  • Diagonalize
  • Throw out weaker (noise) components
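
The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the talk: the function name `pca` is hypothetical, and power iteration with deflation stands in for the "diagonalize" step so that the sketch needs only the standard library.

```python
import math

def pca(vectors, n_keep, iterations=500):
    """Return (eigenvalues, components, projected) for the n_keep strongest components."""
    n, dim = len(vectors), len(vectors[0])
    # Centre the data so that dot products between columns are covariances.
    mean = [sum(v[j] for v in vectors) / n for j in range(dim)]
    centred = [[v[j] - mean[j] for j in range(dim)] for v in vectors]
    # "Compute dot products": the covariance matrix of the columns.
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    eigenvalues, components = [], []
    for _ in range(n_keep):
        # "Diagonalize": power iteration converges to the strongest remaining
        # eigenvector (assumes the start vector is not orthogonal to it).
        vec = [1.0 + k for k in range(dim)]
        scale = math.sqrt(sum(x * x for x in vec))
        vec = [x / scale for x in vec]
        for _step in range(iterations):
            nxt = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm < 1e-12:
                break  # only (numerically) zero variance is left
            vec = [x / norm for x in nxt]
        lam = sum(vec[i] * cov[i][j] * vec[j]
                  for i in range(dim) for j in range(dim))
        eigenvalues.append(lam)
        components.append(vec)
        # Deflate: remove the component just found before searching for the next.
        cov = [[cov[i][j] - lam * vec[i] * vec[j] for j in range(dim)]
               for i in range(dim)]
    # "Throw out weaker components": project onto the kept components only.
    projected = [[sum(r[j] * c[j] for j in range(dim)) for c in components]
                 for r in centred]
    return eigenvalues, components, projected

# Four 2-D vectors lying on one line collapse to a single coordinate each.
values, comps, reduced = pca([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]], 1)
```

For real datacubes a library eigensolver would replace the hand-written iteration; the point here is only the order of the steps.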

7
Information concentration
Principal Component Analysis
8
Event Sets
  • Created by pattern matching
  • from a known rule
  • from a training set
  • by finding clusters

9
Event Set Table
10^3?
Column metadata:
name: longitude, content: Earth coordinate, units: degrees, datatype: double, display: f6.2
name: ID, content: key, units: none, datatype: char
Sample values:
longitude: 43.4, 87.2, 83.2
ID: E3948547, E3948545, E3943766
10^8?
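
One way to picture an event set table with per-column metadata is the sketch below. The class names `Column` and `EventSet` are hypothetical, and pairing the three longitudes with the three IDs in the order shown is an assumption; the slide does not say which value belongs to which event.

```python
from dataclasses import dataclass, field

@dataclass
class Column:
    """Metadata for one column, using the attribute names from the slide."""
    name: str
    content: str
    units: str
    datatype: str
    display: str = ""

@dataclass
class EventSet:
    """A table of events: one row per event, one described column per field."""
    columns: list
    rows: list = field(default_factory=list)

    def add_event(self, *values):
        if len(values) != len(self.columns):
            raise ValueError("one value is required per column")
        self.rows.append(tuple(values))

    def column_values(self, name):
        index = [c.name for c in self.columns].index(name)
        return [row[index] for row in self.rows]

table = EventSet(columns=[
    Column("longitude", "Earth coordinate", "degrees", "double", "f6.2"),
    Column("ID", "key", "none", "char"),
])
# Row pairing is assumed for illustration only.
table.add_event(43.4, "E3948547")
table.add_event(87.2, "E3948545")
table.add_event(83.2, "E3943766")
```

Keeping units and content type next to the data is what lets a later join or crossmatch check that two columns are actually comparable.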
10
Gravitational Lenses
Pattern matching finds events in datacubes
A. Szalay, Johns Hopkins
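
As a rough illustration of pattern matching turning a datacube into an event set, the sketch below slides a known template along a one-dimensional time series and records every offset where the normalized correlation exceeds a threshold. The function names, the threshold, and the toy data are assumptions for illustration; they are not taken from the talk.

```python
import math

def normalized_correlation(window, template):
    """Pearson correlation between a data window and a template of equal length."""
    n = len(template)
    mean_w = sum(window) / n
    mean_t = sum(template) / n
    num = sum((w - mean_w) * (t - mean_t) for w, t in zip(window, template))
    den = math.sqrt(sum((w - mean_w) ** 2 for w in window)
                    * sum((t - mean_t) ** 2 for t in template))
    return num / den if den > 0 else 0.0

def find_events(series, template, threshold=0.9):
    """Return an event set: one record per offset where the template matches."""
    n = len(template)
    events = []
    for offset in range(len(series) - n + 1):
        score = normalized_correlation(series[offset:offset + n], template)
        if score >= threshold:
            events.append({"offset": offset, "score": score})
    return events

# Toy datacube: a flat time series with the template planted at offsets 3 and 12.
template = [0.0, 1.0, 2.0, 1.0, 0.0]
series = [0.0] * 20
series[3:8] = template
series[12:17] = template
events = find_events(series, template)
```

The output is a small table of events (offset and score), which is the Datacube → Eventset step in miniature.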
11
Black hole collisions
LIGO: Laser Interferometer Gravitational-Wave Observatory
12
Creating Event Sets
Supervised Classification
Given a set of volcanoes, find a lot more volcanoes.
Here we use Singular Value Decomposition.
13
Multiparameter data: colour-colour-fx/fopt
Symbols: X-ray source counterparts
Contours: all optical objects
Mike Watson, Leicester University
14
Integrating Datacubes
Find a mapping from one domain to the other.
Registration of DPOSS and Hubble Deep Field
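
A very reduced form of registration is sketched below: it searches for the integer translation that best aligns two small images by maximizing their mean pixel product over the overlapping region. This assumes the mapping is a pure shift; registering real surveys would also need rotation, scale, and distortion terms. The function name `best_shift` and the toy images are hypothetical.

```python
def best_shift(reference, image, max_shift=3):
    """Find (dy, dx) such that reference[y][x] best matches image[y + dy][x + dx]."""
    rows, cols = len(reference), len(reference[0])
    best, best_score = (0, 0), float("-inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0.0, 0
            for y in range(rows):
                for x in range(cols):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < rows and 0 <= xx < cols:
                        total += reference[y][x] * image[yy][xx]
                        count += 1
            # Mean over the overlap, so large and small overlaps are comparable.
            score = total / count if count else float("-inf")
            if score > best_score:
                best, best_score = (dy, dx), score
    return best

# Two 8x8 images containing the same two bright sources, offset by (1, 2).
reference = [[0.0] * 8 for _ in range(8)]
shifted = [[0.0] * 8 for _ in range(8)]
reference[2][3], reference[5][1] = 5.0, 3.0
shifted[3][5], shifted[6][3] = 5.0, 3.0
mapping = best_shift(reference, shifted)
```

The brute-force search is quadratic in the shift range; it is meant only to show what "find a mapping" means for the simplest possible mapping.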
15
Datacube Registration
Movement of ice inferred from registration
16
Integrating Event Sets
  • Database Join
  • Fuzzy Join
  • e.g. astronomical crossmatch
  • Distributed Join
  • does the Grid do databases?
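
The difference between an exact database join and a fuzzy join can be sketched with a toy astronomical crossmatch: instead of requiring equal keys, two rows match when their sky positions agree within a tolerance. The catalog entries, field names, and the 2 arcsecond radius below are made up for illustration.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    r1, d1, r2, d2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))

def crossmatch(cat_a, cat_b, radius_arcsec):
    """Fuzzy join: pair each source in cat_a with its nearest cat_b source in range."""
    radius_deg = radius_arcsec / 3600.0
    matches = []
    for a in cat_a:
        best = None
        for b in cat_b:
            sep = angular_separation_deg(a["ra"], a["dec"], b["ra"], b["dec"])
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (b, sep)
        if best is not None:
            matches.append((a["id"], best[0]["id"], best[1] * 3600.0))
    return matches

# Hypothetical catalogs: only A1 and B1 lie within 2 arcseconds of each other.
catalog_a = [{"id": "A1", "ra": 10.0, "dec": 20.0},
             {"id": "A2", "ra": 50.0, "dec": -5.0}]
catalog_b = [{"id": "B1", "ra": 10.0002, "dec": 20.0001},
             {"id": "B2", "ra": 120.0, "dec": 30.0}]
pairs = crossmatch(catalog_a, catalog_b, radius_arcsec=2.0)
```

This nested loop compares every pair of rows, which is fine for a handful of sources but not for large catalogs; a spatial index or a partitioned, distributed join would be needed at survey scale.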

17
Integration of Star Catalogs
18
Visualizing Event Sets
Unsupervised clustering
50000 stars in color-color space
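
The slide does not say which clustering method produced the picture. As one possible illustration of unsupervised clustering in a two-dimensional colour-colour space, here is a minimal k-means sketch; the function name, the toy points, and the choice of the first k points as starting centroids are all assumptions.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means: returns (centroids, labels) for a list of coordinate tuples."""
    # Simplistic, deterministic start: the first k points serve as centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        for i, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            labels[i] = dists.index(min(dists))
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Toy "colour-colour" points forming two well-separated groups.
colours = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]
centres, membership = kmeans(colours, k=2)
```

The cluster label becomes one more column in the event set, which can then be used to colour points in a visualization.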
19
A Grid of Services
Human gets Data
Understood by human
Further processing after format change
Network of Services
Grid of pipes and engines
Switches and actuators
data flow
20
Example Grid of Services
Catalog Service
Query Check Service
Query Estimator
DPOSS Service
User's code
Crossmatch Service
2MASS Service
Storage Service
flexible complex metadata AND broadband binary
21
Computing Challenges
Clustering, Classification, Visualization, Outlier Detection
  • High-dimensional
  • Visualization of 10^10 points
  • Database access to 10^10 points
  • Large Distributed Join

22
Standards needed
  • Bundling diverse objects together
  • with code and references
  • Referencing data resources on the Grid
  • local, remote, replicated, ....

23
Problem Solving Environment
  • Plumbing (big data) and electrical (control,
    metadata)
  • Web service and workflow
  • Finding service classes/implementations by
    semantics
  • GUI / Executive / IO adapters / Algorithms