Scientific Data Mining: Digging for Nuggets

Title: Scientific Data Mining: Digging for Nuggets


1
Scientific Data Mining: Digging for Nuggets
  • Kirk D. Borne

QSS Group Inc. and George Mason University, NASA-Goddard
kirk.borne_at_gsfc.nasa.gov or kborne_at_gmu.edu
http://rings.gsfc.nasa.gov/nvo_datamining.html
2
Scientific Data Mining: Digging For Nuggets
SSDOO Brownbag Seminar, GSFC Code 630, July 5, 2006
Kirk Borne (QSS / SSDOO)
  • ABSTRACT: Data Mining is the killer app for
    large scientific databases.  It enables discovery
    of new knowledge in large data collections.
    Discovering hidden knowledge is both fun and a
    scientific imperative, as the sizes of our data
    collections grow at exponential rates, faster
    than humans can assimilate their contents.  I
    will describe some of the background, techniques,
    and examples of data mining in action in science
    and elsewhere.  The application of scientific
    data mining to NASA's space science data
    collections will meet these two objectives: (1)
    it will demonstrate and augment the legacy value
    of the tremendous investment of resources that
    have gone into the acquisition of these large
    NASA mission data sets and (2) it will enable us
    to reap the maximum scientific benefit from those
    investments.
  • BIO: Dr. Kirk Borne has a PhD in Astronomy from
    Caltech, and he subsequently had positions at the
    University of Michigan, Carnegie's Department of
    Terrestrial Magnetism, Space Telescope Science
    Institute, and Hughes/Raytheon STX in Goddard's
    Code 631. He currently works for QSS Group Inc.
    as Program Manager for Goddard's SSDOO support
    contract, managing staff in Codes 612.4, 690.1,
    and 605. Dr. Borne is also Associate Research
    Professor of Astrophysics and Computational
    Sciences at George Mason University (GMU) in
    Fairfax, Virginia, and he is also Adjunct
    Associate Professor in the Database Technologies
    Program at the University of Maryland University
    College where he teaches a graduate course in
    data mining. He is a senior member of the U.S.
    National Virtual Observatory (NVO) project and of
    the planned Large Synoptic Survey Telescope
    project. His research interests include
    extragalactic astronomy, numerical modeling,
    scientific data mining, computational science,
    and science education technologies.

3
OUTLINE
  • The New Face of Science
  • Heliophysics (Data) Environment
  • Knowledge Discovery
  • Data Mining Examples and Techniques
  • Heliophysics Example
  • Other Earth and Space Science Examples

5
The New Face of Science (1)
  • Big Data (usually geographically distributed)
  • High-Energy Particle Physics
  • Astronomy and Space Physics
  • Earth Observing System (Remote Sensing)
  • Human Genome and Bioinformatics
  • Numerical Simulations of any kind
  • Digital Libraries (electronic publication
    repositories)
  • e-Science
  • Built on Web Services (e-Gov, e-Biz) paradigm
  • Distributed heterogeneous data are the norm
  • Data integration across projects & institutions
  • One-stop shopping: The right data, right now.

6
The New Face of Science (2)
  • Databases enable scientific discovery
  • Data Handling and Archiving (management of
    massive data resources)
  • Data Discovery (finding data wherever they exist)
  • Data Access (WWW-Database interfaces)
  • Data/Metadata Browsing (serendipity)
  • Data Sharing and Reuse (within project teams and
    by other scientists: scientific validation)
  • Data Integration (from multiple sources)
  • Data Fusion (across multiple modalities &
    domains)
  • Data Mining (KDD = Knowledge Discovery in
    Databases)

7
The Promise of e-Science
  • The best of Google and Amazon.com
  • Go to one place to shop for all your data needs
  • Use scientific indexing (through scientific
    metadata)
  • Find the data that you need
  • Ignore data that are not relevant
  • Recommend also relevant data sets
  • Access distributed data seamlessly
    (transparently)
  • Integrate multiple data sets
  • Integrate data sets into analysis/visualization
    software packages
  • Provide value-added services
  • Provide intelligence within the archive
  • Provide intelligence at the point of service

8
OUTLINE
  • The New Face of Science
  • Heliophysics (Data) Environment
  • Knowledge Discovery
  • Data Mining Examples and Techniques
  • Heliophysics Example
  • Other Earth and Space Science Examples

9
Sun-Earth Space Environment: Rich Source of
Heliophysical Phenomena
10
Multi-point Observations and Models of Space
Plasmas Deliver a Deluge of Physical Measurements
11
(No Transcript)
12
Space Science data volumes are growing and
growing and ...
  • a few terabytes "yesterday" (10,000 CDROMs)
  • tens of terabytes "today" (100,000 CDROMs)
  • 100s of petabytes "tomorrow"
    (within 10-20 years) (1,000,000,000 CDROMs)

13
Technological Advances: the cause and the
solution?
14
Data Access and Analysis Tools are Essential,
but do not scale well with Exponential Data
Growth
15
The Data Flood is Everywhere!
  • Huge quantities of data are being generated in
    all business, government, and research domains
  • Banking, retail, marketing, telecommunications,
    homeland security, computer networks, other
    business transactions ...
  • Scientific data: genomics, space science,
    physics, etc.
  • Web, text, and e-commerce

16
OUTLINE
  • The New Face of Science
  • Heliophysics (Data) Environment
  • Knowledge Discovery
  • Data Mining Examples and Techniques
  • Heliophysics Example
  • Other Earth and Space Science Examples

17
How do we learn about our Universe and the World
around us?
Data → Information → Knowledge → Understanding /
Wisdom!
WE GATHER INFORMATION, FROM WHICH WE DERIVE
KNOWLEDGE, FROM WHICH WE LEARN WHAT IT ALL MEANS
18
Data-Information-Knowledge-Wisdom
  • T.S. Eliot (1934)
  • Where is the wisdom we have lost in knowledge?
  • Where is the knowledge we have lost in
    information?

19
Astronomy Example
Data
(a) Imaging data (ones & zeroes)
(b) Spectral data (ones & zeroes)
  • Information (catalogs / databases)
  • Measure brightness of galaxies from image (e.g.,
    14.2 or 21.7)
  • Measure redshift of galaxies from spectrum (e.g.,
    0.0167 or 0.346)

Knowledge: Hubble Diagram → Redshift-Brightness
Correlation → Redshift = Distance
Understanding: the Universe is expanding!!
20
So what is Data Mining?
  • Data Mining is Knowledge Discovery in Databases
    (KDD)
  • Data mining is defined as "an information
    extraction activity whose goal is to discover
    hidden facts contained in (large) databases."

21
OUTLINE
  • The New Face of Science
  • Heliophysics (Data) Environment
  • Knowledge Discovery
  • Data Mining Examples and Techniques
  • Heliophysics Example
  • Other Earth and Space Science Examples

22
Data Mining
  • Data Mining is the Killer App for Scientific
    Databases.
  • Scientific Data Mining References
  • http://rings.gsfc.nasa.gov/nvo_datamining.html
  • http://www.itsc.uah.edu/f-mass/
  • Framework for Mining and Analysis of Space
    Science data (F-MASS)
  • Data mining is used to find patterns and
    relationships in data. (EDA = Exploratory Data
    Analysis)
  • Patterns can be analyzed via 2 types of models
  • Descriptive: Describe patterns and create
    meaningful subgroups or clusters. (Unsupervised
    Learning, Clustering)
  • Predictive: Forecast explicit values, based
    upon patterns in known results. (Supervised
    Learning, Classification)
  • How does this apply to Scientific Research?
  • through KNOWLEDGE DISCOVERY
  • Data → Information → Knowledge →
    Understanding / Wisdom!

23
Data Mining is a core database function
  • Data Mining has many names / aliases
  • Knowledge Discovery in Databases (KDD)
  • Machine Learning (ML)
  • Exploratory Data Analysis (EDA)
  • Intelligent Data Analysis (IDA)
  • On-Line Analytical Processing (OLAP)
  • Business Intelligence (BI)
  • Customer Relationship Management (CRM)
  • Business Analytics
  • Target Marketing
  • Cross-Selling
  • Market Basket Analysis
  • Credit Scoring
  • Case-Based Reasoning (CBR)
  • Connecting the Dots
  • Intrusion Detection Systems (IDS)
  • Recommendation / Personalization Systems!

24
Examples of real Data Mining in Action
  • Classic Textbook Example of Data Mining
    (Legend?) Data mining of grocery store logs
    indicated that men who buy diapers also tend to
    buy beer at the same time.
  • Blockbuster Entertainment mines its video rental
    history database to recommend rentals to
    individual customers.
  • Astronomers examined objects with extreme colors
    in a huge database to discover the most distant
    Quasars ever seen.
  • Credit card companies recommend products to
    cardholders based on analysis of their monthly
    expenditures.
  • Airline purchase transaction logs revealed that
    9/11 hijackers bought one-way airline tickets
    with the same credit card.
  • Wal-Mart studied product sales in their Florida
    stores in 2004 when several hurricanes passed
    through Florida. Wal-Mart found that, before the
    hurricanes arrived, people purchased 7 times as
    many strawberry pop tarts compared to normal
    shopping days.
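
The grocery-store example above is, in data mining terms, an association rule. As a rough sketch of how such a rule is scored, the snippet below computes support, confidence, and lift for "diapers → beer". The baskets are invented purely for illustration and are not real store data.

```python
# Toy market-basket analysis; the baskets are invented for illustration only.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "milk"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def rule_stats(antecedent, consequent, baskets):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    s_both = support(set(antecedent) | set(consequent), baskets)
    s_ante = support(antecedent, baskets)
    s_cons = support(consequent, baskets)
    confidence = s_both / s_ante   # P(consequent | antecedent)
    lift = confidence / s_cons     # > 1 means the items co-occur more than by chance
    return s_both, confidence, lift

s, c, l = rule_stats({"diapers"}, {"beer"}, baskets)
print(f"support={s:.3f} confidence={c:.3f} lift={l:.3f}")
```

A lift above 1 is the usual signal that two items are bought together more often than independent purchases would predict.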

25
Strawberry pop tarts???
26
Astronomy Data Mining in Action
Exploring the Time Domain
Mega-Flares on normal Sun-like stars: a star
like our Sun increased in brightness 300X in one
night! Say what??
27
Data Mining Methods and Some Examples
  • Clustering
  • Classification
  • Associations
  • Neural Nets
  • Decision Trees
  • Pattern Recognition
  • Correlation/Trend Analysis
  • Principal Component Analysis
  • Independent Component Analysis
  • Regression Analysis
  • Outlier/Glitch Identification
  • Visualization
  • Autonomous Agents
  • Self-Organizing Maps (SOM)
  • Link (Affinity) Analysis

Group together similar items and separate dissimilar items in DB
Classify new data items using the known classes & groups
Find unusual co-occurring associations of attribute values among DB items
Predict a numeric attribute value
Organize information in the database based on relationships among key data descriptors
Identify linkages between data items based on features shared in common
28
Some Data Mining Techniques Graphically
Represented
  • Self-Organizing Map (SOM)

Clustering
Neural Network
Outlier (Anomaly) Detection
Link Analysis
Decision Tree
29
Data Mining Application: Outlier Detection
Figure: The clustering of data clouds (dc)
within a multidimensional parameter space
(p). Such a mapping can be used to search for
and identify clusters, voids, outliers,
one-of-a-kinds, relationships, and associations
among arbitrary parameters in a database (or
among various parameters in geographically
distributed databases).
  • statistical analysis of typical events
  • automated search for rare events

30
Outlier Detection: Serendipitous Discovery of
Rare or New Objects & Events
31
Learning From Legacy Temporal Data (Time
Series): Classify New Data (Bayes Analysis or
Markov Modeling)
32
Principal Components Analysis & Independent
Components Analysis
Cepheid Variables: Cosmic Yardsticks -- One
Correlation -- Two Classes!
33
Classification Methods: Decision Trees, Neural
Networks, SVM (Support Vector Machines)
  • There are 2 Classes!
  • How do you ...
  • Separate them?
  • Distinguish them?
  • Learn the rules?
  • Classify them?

Apply Kernel
(SVM)
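
The "Apply Kernel" step can be illustrated with a small sketch. The data below are synthetic (two concentric rings, invented for this example), and the code does not train an SVM. It only shows the underlying idea: two classes that no straight line separates in the original space become separable by a plane once an extra feature is added, which is what a kernel does implicitly.

```python
import math

# Two synthetic classes on concentric rings (invented data): no straight line
# in the (x1, x2) plane separates them.
angles = [2.0 * math.pi * k / 40 for k in range(40)]
inner = [(1.0 * math.cos(a), 1.0 * math.sin(a)) for a in angles]  # class 0
outer = [(3.0 * math.cos(a), 3.0 * math.sin(a)) for a in angles]  # class 1
points = inner + outer
labels = [0] * len(inner) + [1] * len(outer)

def feature_map(p):
    """Lift (x1, x2) to (x1, x2, x1^2 + x2^2).

    A degree-2 polynomial kernel implicitly includes the squared terms used here.
    """
    x1, x2 = p
    return (x1, x2, x1 * x1 + x2 * x2)

lifted = [feature_map(p) for p in points]

# In the lifted space the plane z3 = 5 separates the classes:
# the inner ring sits near z3 = 1 and the outer ring near z3 = 9.
predicted = [1 if z[2] > 5.0 else 0 for z in lifted]
accuracy = sum(p == t for p, t in zip(predicted, labels)) / len(labels)
print("accuracy in lifted space:", accuracy)
```

A real SVM would learn the separating plane (and its margin) from the data rather than using the hand-picked threshold above.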
34
Sample Scientific Data Mining Use Cases
  • Data Mining (KDD) is the killer app for
    scientific databases
  • Space and Earth Science Examples
  • Neural Network for Pixel Classification: Event
    Detection and Prediction (e.g., Wildfires)
  • Bayesian Network for Object Classification
  • PCA for finding Fundamental Planes of Galaxy
    Parameters
  • PCA (weakest component) for Outlier Detection:
    anomalies, novel discoveries, new objects
  • Link Analysis (Association Mining) for Causal
    Event Detection (e.g., linking Solar Surface,
    CME, and Space Weather events)
  • Clustering analysis: Spatial, Temporal, or any
    scientific database parameters
  • Markov models: Temporal mining of time series
    data
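
As a minimal sketch of the "PCA (weakest component)" idea listed above: when two parameters are tightly correlated, almost all of the variance lies along the first principal component, so a point with a large projection onto the weakest component is one that breaks the correlation. The two-parameter catalog below is invented, and the closed-form 2x2 eigen-decomposition is used only to keep the example dependency-free; a general implementation would use a linear algebra library.

```python
import math

# Synthetic 2-parameter catalog (invented): points follow y ~ 2x closely,
# except the last one, which breaks the correlation.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 2.5]
ys = [0.1, 1.9, 4.1, 5.9, 8.1, 9.9, 12.1, 13.9, 9.0]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
dx = [x - mx for x in xs]
dy = [y - my for y in ys]

# Sample covariance matrix [[a, b], [b, c]]
a = sum(u * u for u in dx) / (n - 1)
b = sum(u * v for u, v in zip(dx, dy)) / (n - 1)
c = sum(v * v for v in dy) / (n - 1)

# Smallest eigenvalue and its eigenvector (closed form for a symmetric 2x2
# matrix; this shortcut assumes b != 0, which holds for correlated data).
half_trace = (a + c) / 2.0
root = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
lam_min = half_trace - root
vx, vy = b, lam_min - a
norm = math.hypot(vx, vy)
vx, vy = vx / norm, vy / norm

# Absolute projection of each centered point onto the weakest component
scores = [abs(u * vx + v * vy) for u, v in zip(dx, dy)]
outlier_index = max(range(n), key=lambda i: scores[i])
print("most anomalous point:", (xs[outlier_index], ys[outlier_index]))
```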

35
Why use Data Mining? Here are 6 reasons...
  • Most projects now collect massive quantities of
    data.
  • Because of the enormous potential for new
    discoveries in existing huge databases.
  • Data mining moves beyond the analysis of past
    events to predicting future trends and behaviors
    that may be missed because they lie outside
    experts' expectations.
  • Data mining tools can answer complex questions
    that traditionally were too time-consuming to
    resolve.
  • Data mining tools can explore the intricate
    interdependencies within databases in order to
    discover hidden patterns and relationships.
  • Data mining allows decision-makers to make
    proactive, knowledge-driven decisions.

36
OUTLINE
  • The New Face of Science
  • Heliophysics (Data) Environment
  • Knowledge Discovery
  • Data Mining Examples and Techniques
  • Heliophysics Example
  • Other Earth and Space Science Examples

37
Existing Space Science Data Infrastructure
  • The Recent Past: many independent distributed
    heterogeneous data archives
  • Today: VxOs = Virtual Observatories
  • Web Services-enabled e-Science paradigm
    (middleware, standards, protocols)
  • Provides seamless uniform access to distributed
    heterogeneous data sources
  • Find the right data, right now
  • One-stop shopping for all of your data needs
  • Emerging environment consists of many VxOs, for
    example:
  • NVO = National Virtual Observatory (precursor to
    VAO = Virtual Astro Obs)
  • VSO = Virtual Solar Observatory
  • VSPO = Virtual Space Physics Observatory
  • NVAO = National Virtual Aeronomy Observatory
  • VITMO = Virtual Ionospheric, Thermospheric,
    Magnetospheric Observatory
  • VHO = Virtual Heliospheric Observatory
  • VMO = Virtual Magnetospheric Observatory
  • Standards for data formats, data/metadata
    exchange, data models, registries, Web Services,
    VO queries, query results, semantics
  • And of course: The Grid, Web Services,
    Semantic Web, etc. ...

38
Space Science Knowledge Discovery
39
Heliophysics Space Weather Example
CME = Coronal Mass Ejection; SEP = Solar Energetic
Particle
40
Machine Learning and Data Mining for Automatic
Detection and Interpretation of Solar Events
PI: Art Poland (GMU). Co-Is: Jie Zhang, K. Borne,
Harry Wechsler (GMU). Collaborator: Oscar Olmedo
(GMU student)
  • Project Objectives
  • Our main objective is to develop an automatic
    system for CME (Coronal Mass Ejection) detection,
    tracking, characterization, and source region
    location.
  • An automatic system is needed for
  • Timely detection, necessary for space weather
    forecasting
  • Objective characterization, removing human bias
  • Reducing human cost
  • Data volume and number of events are enormous
  • Explosive growth of data (from SOHO, STEREO, and
    SDO)
  • Science Problem: Which Solar surface features
    are causally related to the generation of CMEs
    (coronal features) that cause Space Weather
    (i.e., hazardous energetic particle events near
    the Earth)?

41
OUTLINE
  • The New Face of Science
  • Heliophysics (Data) Environment
  • Knowledge Discovery
  • Data Mining Examples and Techniques
  • Heliophysics Example
  • Other Earth and Space Science Examples
  • Wildfire Example
  • Space Science Examples

43
Automated Wildfire Detection (and Prediction)
through Artificial Neural Networks (ANN)
  • Short Description of Wildfire Project
  • Identify all wildfires in Earth-observing
    satellite images
  • Train ANN to mimic human analysts'
    classifications
  • Apply ANN to new data (from 3 remote-sensing
    satellites: GOES, AVHRR, MODIS)
  • Extend NOAA fire product from USA to the whole
    Earth

44
NOAA'S HAZARD MAPPING SYSTEM
  • NOAA's Hazard Mapping System (HMS) is an
    interactive processing system that allows
    trained satellite analysts to manually integrate
    data from 3 automated fire detection algorithms
    corresponding to the GOES, AVHRR and MODIS
    sensors. The result is a quality-controlled fire
    product in graphic (Fig 1), ASCII (Table 1) and
    GIS formats for the continental US.
  • Figure: Hazard Mapping System (HMS)
    Graphic Fire Product for day 5/19/2003

45
OVERALL TASK OBJECTIVES
  • To mimic the NOAA-NESDIS Fire Analysts'
    subjective decision-making and fire detection
    algorithms with a Neural Network in order to
  • remove subjectivity in results
  • improve automation & consistency
  • allow NESDIS to expand coverage globally
  • Sources of subjectivity in Fire Analysts'
    decision-making
  • Fire is not burning very hot, small in areal
    extent
  • Fire is not burning much hotter than surrounding
    scene
  • Dependency on Analysts' aggressiveness in
    finding fires
  • Determination of false detects

46
Hazard Mapping System (HMS) ASCII Fire Product
    OLD FORMAT          NEW FORMAT (as of May 16, 2003)
    Lon, Lat            Lon, Lat, Time, Satellite, Method of Detection
    -80.531, 25.351     -80.597, 22.932, 1830, MODIS AQUA, MODIS
    -81.461, 29.072     -79.648, 34.913, 1829, MODIS, ANALYSIS
    -83.388, 30.360     -81.048, 33.195, 1829, MODIS, ANALYSIS
    -95.004, 30.949     -83.037, 36.219, 1829, MODIS, ANALYSIS
    -93.579, 30.459     -83.037, 36.219, 1829, MODIS, ANALYSIS
    -108.264, 27.116    -85.767, 49.517, 1805, AVHRR NOAA-16, FIMMA
    -108.195, 28.151    -84.465, 48.926, 2130, GOES-WEST, ABBA
    -108.551, 28.413    -84.481, 48.888, 2230, GOES-WEST, ABBA
    -108.574, 28.441    -84.521, 48.864, 2030, GOES-WEST, ABBA
    -105.987, 26.549    -84.557, 48.891, 1835, MODIS AQUA, MODIS
    -106.328, 26.291    -84.561, 48.881, 1655, MODIS TERRA, MODIS
    -106.762, 26.152    -84.561, 48.881, 1835, MODIS AQUA, MODIS
    -106.488, 26.006    -89.433, 36.827, 1700, MODIS TERRA, MODIS
    -106.516, 25.828    -89.750, 36.198, 1845, GOES, ANALYSIS
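
A minimal sketch of how the new-format records above might be read into a program. It assumes one comma-separated record per line with the five fields in the order named on the slide. The `FireDetection` class and `parse_new_format` function are hypothetical names introduced here, not part of any NOAA software. The time field is kept as text because the slide does not state its convention.

```python
import csv
from collections import Counter
from dataclasses import dataclass

@dataclass
class FireDetection:
    lon: float
    lat: float
    time: str        # kept as text, e.g. "1830"
    satellite: str
    method: str

def parse_new_format(lines):
    """Parse lines of 'Lon, Lat, Time, Satellite, Method of Detection'."""
    detections = []
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) != 5:
            continue  # skip blank or malformed lines
        lon, lat, time_, satellite, method = (field.strip() for field in row)
        detections.append(FireDetection(float(lon), float(lat), time_, satellite, method))
    return detections

# Three records copied from the slide
sample = [
    "-80.597, 22.932, 1830, MODIS AQUA, MODIS",
    "-85.767, 49.517, 1805, AVHRR NOAA-16, FIMMA",
    "-84.465, 48.926, 2130, GOES-WEST, ABBA",
]
detections = parse_new_format(sample)
by_method = Counter(d.method for d in detections)
print(by_method)
```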

47
GOES CH2 (3.78 - 4.03 µm): Northern Florida
Fire
  • 2003 Day 126, 82.10 Deg West Longitude, 30.49
    Deg North Latitude
  • File: florida_ch2.png

48
Zoom of GOES CH2 (3.78 - 4.03 µm): Northern
Florida Fire
2003 Day 126,
82.10 Deg W Long, 30.49 Deg N Lat
Local minimum in vicinity of core pixel
used as fire location.
File: florida_fire_ch2_zoom.png
File: florida_ch2_zoom.png

49
NOAA-NESDIS FIRE DETECTION SYSTEM
(Flow diagram; labels grouped by data path)
  • GOES EAST-WEST IMAGER: 5 CHAN, 10-BIT WDS; GVAR FORMAT;
    MCIDAS (COTS); CHS 1, 2, 4 (0.62, 3.9, 10.7 µm), 8-BIT WDS, LCC;
    WF-ABBA FIRE DET CHs 1, 2, 4 (0.62, 3.9, 10.7 µm)
  • NOAA 14-17 AVHRR: 5 CHAN, 10-BIT WDS; HRPT FORMAT;
    TERASCAN (COTS); Geo-correction; CHS 1, 2, 3b (0.63, 0.91, 3.7 µm),
    8-BIT WDS, LCC; FIMMA FIRE DET CHs 2, 3b, 4, 5 (0.91, 3.7,
    10.8, 12 µm)
  • TERRA-AQUA MODIS: 36 CHAN, 12-BIT WDS; HDF FORMAT;
    Bow-Tie Effect Removal; MCIDAS (COTS); CHS 1, 2, 22 (0.66, 0.86,
    3.96 µm), 8-BIT WDS, LCC; MODIS MOD14 FIRE PRODUCT CHs 2, 22, 31
    (0.86, 3.9, 11 µm)
  • Spacecraft labels: NOAA S/C, NASA S/C
  • NASA TAP-OFF POINT FOR IMAGERY
  • HAZARD MAPPING SYSTEM (HMS): ENVI, FIRE ANALYSTS
  • DAILY NOAA FIRE PRODUCT (automated algorithms and
    manual additions)
Abbreviations
  • FIMMA = Fire Identification, Mapping and Monitoring Algorithm
  • WF-ABBA = Wildfire Automated Biomass Burning Algorithm
  • LCC = Lambert Conformal Conic Projection
  • MCIDAS = Man Computer Interactive Data Access System
50
SIMPLIFIED DATA EXTRACTION PROCEDURE
(Flow diagram; labels as they appear on the slide)
  • DATA: GOES (96 Files/day), AVHRR (25 Files/day),
    MODIS (14 Files/day)
  • Daily HMS ASCII Fire Product: Geographic Coords
    (lat/lon)
  • ENVI Function Call: Conversion to Image Coords
    (row/col)
  • Image Coords
  • Image Refs
  • Spectral Data
  • Filter Out Bad data points
  • Neural Network Training Set
51
DECISION REGIONS AND BOUNDARIES FOR HIGHLY IDEAL
SCATTER PLOT CLUSTERING PATTERNS
(Figure: two scatter-plot panels, each with axes X1 and X2)
  • Single Fire Signature: Fire, Background
  • Multiple Fire Signatures: Surface Fire, Crown Fire,
    Ground Fire, Background
52
Scatter Plot of Background-Subtracted GOES CH 1
vs. CH 2
  • Fire (lower) and non-fire (upper)
    separation of clusters
  • 2003 June 2, Northern Florida
    File: scatter_fires12.png

  • (GOES CH1, CH2, CH4 are input to neural network)

53
Scatter Plot of Background Subtracted GOES CH 2
vs. CH 4
  • Fire (left) and non-fire
    (right) separation of clusters
  • 2003 June 2, Northern Florida
    File: scatter_fires22.png
    (GOES CH1, CH2, CH4 are input to
    neural network)

54
Neural Network Configuration for Wildfire
Detection Neural Network
  • Input Layer 0: Band A Inputs 1 - 49, Band B Inputs 50 - 98,
    Band C Inputs 99 - 147
  • Connections (weights)
  • Hidden Layer 1
  • Connections (weights)
  • Output Layer 2: Output Classification (fire / no-fire)
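
The three-layer layout on this slide can be sketched as a plain forward pass. Only the 147 inputs (3 bands of 49 values each) and the single fire / no-fire output come from the slide. The hidden-layer size, the sigmoid activation, the 0.5 threshold, and the random weights are assumptions made for illustration, so the printed label carries no meaning.

```python
import math
import random

N_INPUTS = 147   # 3 bands x 49 inputs per band, as on the slide
N_HIDDEN = 10    # hypothetical; the slide does not give the hidden-layer size

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_weights(n_in, n_out, rng):
    """One row per output unit: n_in weights followed by a bias term."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]

def layer(inputs, weights):
    """Fully connected layer with sigmoid activation (activation is an assumption)."""
    outputs = []
    for row in weights:
        # zip stops at len(inputs), so the trailing bias is added separately
        total = row[-1] + sum(w * x for w, x in zip(row, inputs))
        outputs.append(sigmoid(total))
    return outputs

def classify(band_a, band_b, band_c, w_hidden, w_out, threshold=0.5):
    """Concatenate the three 49-value band vectors and run a forward pass."""
    inputs = list(band_a) + list(band_b) + list(band_c)
    if len(inputs) != N_INPUTS:
        raise ValueError("expected 3 x 49 = 147 input values")
    hidden = layer(inputs, w_hidden)
    score = layer(hidden, w_out)[0]
    return ("fire" if score >= threshold else "no-fire"), score

rng = random.Random(0)
w_hidden = init_weights(N_INPUTS, N_HIDDEN, rng)
w_out = init_weights(N_HIDDEN, 1, rng)

# Untrained (random) weights and made-up pixel values: this only shows the
# data flow through the three layers.
band = [0.5] * 49
label, score = classify(band, band, band, w_hidden, w_out)
print(label, score)
```

In the project described here the weights would be learned from the analyst-labeled training set rather than drawn at random.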
55
Typical Error Matrix (for MODIS instrument)
RESULTS
TP = True Positive, FP = False Positive, FN = False Negative, TN = True
Negative

                                     TRAINING DATA
Neural Network Classification    Fire         NonFire      Totals
Fire                             2834 (TP)    173 (FP)     3007
NonFire                          318 (FN)     3103 (TN)    3421
Totals                           3152         3276         6428
56
Typical Measures of Accuracy
  • Overall Accuracy = (TP+TN)/(TP+TN+FP+FN)
  • Producer's Accuracy (fire) = TP/(TP+FN)
  • Producer's Accuracy (nonfire) = TN/(FP+TN)
  • User's Accuracy (fire) = TP/(TP+FP)
  • User's Accuracy (nonfire) = TN/(TN+FN)

Accuracy of our NN Classification
  • Overall Accuracy = 92.4%
  • Producer's Accuracy (fire) = 89.9%
  • Producer's Accuracy (nonfire) = 94.7%
  • User's Accuracy (fire) = 94.2%
  • User's Accuracy (nonfire) = 90.7%
57
OUTLINE
  • The New Face of Science
  • Heliophysics (Data) Environment
  • Knowledge Discovery
  • Data Mining Examples and Techniques
  • Heliophysics Example
  • Other Earth and Space Science Examples
  • Wildfire Example
  • Space Science Examples

58
Automated Classification of X-ray Sources (PI:
Susan Hojnacki, RIT)
  • High energy X-ray spectrum divided into 42
    spectral bands
  • Photon counts within the 42 bands are used as
    multivariate input variables
  • The plot below spans the first 2 principal
    components showing the source classes
  • Progression of classes moving clockwise around
    the arch forms a sequence of decreasing spectral
    hardness

59
Autonomous Mineral Detectors for Mars Rovers and
Landers
  • PI: Martha Gilmore, Wesleyan University

Objective: Design and develop software to
enable rovers to autonomously analyze spectral
data and identify data indicating geologically
important signatures.
Motivation: Both rover
and orbital missions can collect more data than
can be returned due to downlink restrictions.
Results: Software is designed to allow onboard
processing of Vis/NIR spectra to identify and
select spectra that contain minerals of geologic
interest autonomously.
Figure labels: Non-carbonates, Carbonates
60
A Neural Map View of Planetary Spectral Images
for Precision Data Mining and Rapid Resource
Identification
  • PI: Erzsébet Merényi, Rice University

Uses advanced variants of the self-organized
machine learning paradigm (Self-Organizing
Map), applied to spectral imagery. They detected
orthopyroxene- and clinopyroxene-dominated mineral
subclasses within a rare undifferentiated mineral
type nicknamed "black rock" by geologists. SOM
by eye!
61
Application of Machine Learning Technology to
Martian Geology
PI: Ruye Wang, Harvey Mudd College
  • Machine Learning algorithms have been applied to
    the analysis of THEMIS (Thermal Emission Imaging
    System) image data of Mars, for the purpose of
    studying mountain ranges on Mars (the Thaumasia
    Highlands and Coprates Rise).
  • Specifically, various clustering and
    classification algorithms (e.g., K-means,
    competitive neural network, support vector
    machine, Independent Components Analysis) have
    been applied to the THEMIS image data covering
    certain areas in the Thaumasia Highlands.
  • Objectives
  • Develop an intelligent system for robust
    detection and accurate classification in
    multispectral remote sensing image data
  • Demonstrate system in context of Martian geology
    application

62
K-Means Clustering of Martian Geologic Spectral
Features
  • Clustering requires a distance metric. Two
    approaches were applied to the spectral data:
  • Euclidean distance
  • Spectral Angle Mapper (SAM) distance

Figure labels: u, v
Comparison of Clustering Results based upon
Spectral Angle Mapper (SAM) versus Euclidean
Distances
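
To make the difference between the two metrics concrete, here is a small sketch. It treats two spectra u and v as vectors; the spectral angle is arccos(u·v / (|u||v|)). The 4-band spectra are invented for illustration: v has the same shape as u but is twice as bright, while w has a different shape.

```python
import math

def euclidean_distance(u, v):
    """Straight-line distance between two spectra."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def spectral_angle(u, v):
    """Angle (radians) between spectra u and v, treated as vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard against rounding
    return math.acos(cos_angle)

# Invented 4-band spectra
u = [0.10, 0.20, 0.30, 0.40]
v = [0.20, 0.40, 0.60, 0.80]   # same shape as u, twice as bright
w = [0.40, 0.30, 0.20, 0.10]   # different shape

print("u vs v: Euclidean", euclidean_distance(u, v), "angle", spectral_angle(u, v))
print("u vs w: Euclidean", euclidean_distance(u, w), "angle", spectral_angle(u, w))
```

With these values the Euclidean metric places u closer to w than to v, while the spectral angle treats u and v as essentially identical. The angle ignores overall brightness, which is one reason the two metrics can produce different clusters.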
63
Data Mining: It is more than just connecting the
dots
Reference: http://homepage.interaccess.com/purcellm/lcas/Cartoons/cartoons.htm