Visual Data Mining for Quantized, Spatial Data

1
Visual Data Mining for Quantized, Spatial Data
Amy Braverman
Jet Propulsion Laboratory, California Institute of Technology
Mail Stop 169-237, 4800 Oak Grove Drive, Pasadena, CA 91109-8099
Amy.Braverman_at_jpl.nasa.gov
2
Outline
  • Motivation.
  • Approach.
  • AIRS data collection.
  • Quantization.
  • Visual data mining (I).
  • Visual data mining (II).
  • Hierarchical Quantization.
  • Visual data mining (III).
  • Summary.

3
Motivation
  • Earth Observing System satellites return
    massive data volumes.
  • Traditional approach to data exploration:
    produce maps of one degree averages and standard
    deviations for each parameter of interest.
  • Good news: this is easy, practical, and
    everybody understands it.
  • Bad news: the method throws away almost all of
    the distributional information in the data,
    including covariance and higher-order statistics.
  • Need to mine the data, i.e., how do
    characteristics of joint distributions change in
    time and space and across resolutions?
    Characterize forcings and feedbacks.

4
Approach
  • New approach: produce an estimate of the joint
    (empirical) probability distribution of variables
    of interest within each one degree grid cell.
  • Use a clustering algorithm such as K-means to
    partition data into groups, represent each group
    by its centroid and (normalized) membership
    count.
  • Collection of all 180 x 360 = 64,800 grid cell
    distribution estimates is a proxy for the
    original data.
  • How to find relationships? We need to visualize
    multivariate relationships while maintaining
    spatial context.
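
The quantization step described on this slide can be sketched roughly as follows. This is a minimal illustration, not the presenter's implementation: it assumes observations arrive as (lat, lon, vector) triples with longitude in [-180, 180], uses a plain Lloyd's K-means with random initialization, and all function names (cell_index, quantize_cell, quantize_grid) are hypothetical.

```python
import math
import random
from collections import defaultdict


def cell_index(lat, lon):
    """Map latitude/longitude in degrees to a (row, col) cell on a 180 x 360 grid."""
    row = min(int(math.floor(lat)) + 90, 179)   # lat = 90 falls in the top row
    col = min(int(math.floor(lon)) + 180, 359)  # lon = 180 falls in the last column
    return row, col


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p, centroids):
    """Index of the centroid closest to p."""
    return min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))


def quantize_cell(points, k, iters=20, seed=0):
    """Summarize one cell's observations as (centroid, normalized count) pairs."""
    rng = random.Random(seed)
    k = min(k, len(points))
    centroids = rng.sample(points, k)  # assumed initialization: k random points
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to its group mean; an empty group keeps its centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    # Final assignment so the counts match the returned centroids.
    counts = [0] * len(centroids)
    for p in points:
        counts[nearest(p, centroids)] += 1
    n = len(points)
    return [(c, m / n) for c, m in zip(centroids, counts) if m > 0]


def quantize_grid(observations, k):
    """observations: iterable of (lat, lon, vector). Returns {cell: summary}."""
    cells = defaultdict(list)
    for lat, lon, vec in observations:
        cells[cell_index(lat, lon)].append(tuple(vec))
    return {cell: quantize_cell(pts, k) for cell, pts in cells.items()}
```

Each cell's summary is a small list of representatives whose weights sum to one, so it can stand in for the cell's empirical joint distribution while keeping the cell's place on the map.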

5
AIRS Data Collection
6
Quantization
[Figure: AIRS granules; labels: 135 footprints, 1 degree lat/lon.]
7
Visual Data Mining (I)
  • Data: 11 AIRS channels observed over 3 days
    (July 20-22, 2002).
  • Compare joint distributions among grid cells:
  • Are the grid cell data homogeneous or
    heterogeneous?
  • What physical processes account for the shapes
    of the representatives and the distribution?
  • What physical processes might account for
    differences between grid cells?
  • Are there outliers?

8
Visual Data Mining (II)
  • Data in this region: 10,498 clusters
    representing 60,681 observations.
  • Can we summarize the whole region as one?
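
One possible way to summarize a whole region as one is to pool the cluster representatives from all of its grid cells and quantize them again, weighting each representative by the number of observations it stands for. The sketch below assumes raw membership counts are kept alongside each centroid (a normalized weight can be converted by multiplying by the cell's observation count); the function name summarize_region and the weighted K-means used here are illustrative assumptions, not the method shown in the talk.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p, centers):
    """Index of the center closest to p."""
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))


def summarize_region(cell_summaries, k, iters=20, seed=0):
    """Re-quantize pooled (centroid, count) pairs from many cells into k pairs."""
    reps = [rep for summary in cell_summaries for rep in summary]
    rng = random.Random(seed)
    k = min(k, len(reps))
    centers = [c for c, _ in rng.sample(reps, k)]  # assumed random initialization
    for _ in range(iters):
        mass = [0.0] * k
        sums = [[0.0] * len(c) for c in centers]
        for c, n in reps:
            j = nearest(c, centers)
            mass[j] += n
            for d, x in enumerate(c):
                sums[j][d] += n * x  # count-weighted sum of centroids
        # New center is the count-weighted mean; an empty group keeps its center.
        centers = [
            tuple(s / mass[j] for s in sums[j]) if mass[j] > 0 else centers[j]
            for j in range(k)
        ]
    # Final assignment so the returned counts match the returned centers.
    mass = [0.0] * k
    for c, n in reps:
        mass[nearest(c, centers)] += n
    return [(c, m) for c, m in zip(centers, mass) if m > 0]
```

Because counts are carried through, the regional summary still accounts for every observation, and the same function can be applied again to summaries of sub-regions.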

9
Hierarchical Quantization
10
Visual Data Mining (III)
  • More questions:
  • How do the distributions change as you move
    from east to west? (Suggested approach: subdivide
    the region into western half and eastern half.
    Summarize separately and compare to each other
    and to the summary of the whole. Subdivide again, etc.)
    North to south?
  • What other regions are similar to this one? Are
    they the ones we expect based on physics? Does
    spatial resolution matter for answering the
    question? If so, how?
  • Where are the regions of high complexity
    (variability or distribution entropy)? Do the
    physics support this?
  • How does the regression of channel 1 on channel
    2 change spatially?
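
Two of the questions above can be answered directly from a quantized summary. The sketch below assumes a summary is a list of (centroid, weight) pairs with weights summing to one, that centroid index 0 holds channel 1 and index 1 holds channel 2, and that Shannon entropy of the weights is an acceptable stand-in for "distribution entropy"; the function names are hypothetical. The regression is a weighted least-squares fit to the centroids, which ignores within-cluster spread and so only approximates a fit to the raw observations.

```python
import math


def distribution_entropy(summary):
    """Shannon entropy (in nats) of the normalized membership weights."""
    return -sum(w * math.log(w) for _, w in summary if w > 0)


def weighted_regression(summary, y_index=0, x_index=1):
    """Fit y = a + b * x to the centroids by weighted least squares.

    Returns (a, b). Defaults regress channel 1 (index 0) on channel 2 (index 1).
    """
    total = sum(w for _, w in summary)
    mean_x = sum(w * c[x_index] for c, w in summary) / total
    mean_y = sum(w * c[y_index] for c, w in summary) / total
    sxx = sum(w * (c[x_index] - mean_x) ** 2 for c, w in summary)
    sxy = sum(w * (c[x_index] - mean_x) * (c[y_index] - mean_y) for c, w in summary)
    if sxx == 0:
        raise ValueError("predictor has no spread; slope is undefined")
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b
```

Computing either quantity for every grid cell yields one number (or one coefficient pair) per cell, which can then be mapped to show where complexity is high or how the channel relationship varies spatially.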

11
Summary
  • Accept coarser spatial resolution (one degree) to
    achieve replication and estimate distributions.
  • Explore quantized data interactively by comparing
    distributions at different levels of aggregation
    and in different locations (and times).
  • We are mining the data, not making inferences. No
    spatial statistical models.
  • AIRS data will be available at
    http://daac.gsfc.nasa.gov/atmodyn/airs/index.html.
  • More information about AIRS:
    http://www-airs.jpl.nasa.gov.