1Peter Bajcsy, 1Chulyun Kim, 2Jihua Wang and 2Yu-Feng Lin - PowerPoint PPT Presentation

1
A FRAMEWORK FOR GEOSPATIAL MODELING FROM SPARSE
FIELD MEASUREMENTS USING IMAGE PROCESSING AND
MACHINE LEARNING
  • 1Peter Bajcsy, 1Chulyun Kim, 2Jihua Wang and
    2Yu-Feng Lin
  • 1National Center for Supercomputing Applications
    (NCSA)
  • 2Illinois State Water Survey (ISWS)
  • University of Illinois at Urbana-Champaign (UIUC)

2
Outline
  • Introduction
  • Problems Addressed by Spatial Pattern To Learn
    (SP2Learn)
  • SP2Learn Architecture and Functionality Overview
  • Running SP2Learn
  • Summary

3
Introduction
4
General Problem
  • Compute a set of geo-spatially dense, accurate
    predictions of variables, given
  • a set of direct, geo-spatially sparse point
    measurements and
  • auxiliary variables with implicit relationships
    to the predicted variable
  • Motivation:
  • minimize the cost of taking direct point measurements
  • maximize the accuracy of predictions and
  • automate the discovery of relationships between direct
    field measurements and indirect variables

5
Formulation
  • Input: sets of geo-spatially sparse variables
    V_i(p_ij); dense auxiliary variables; a priori
    tacit knowledge of experts
  • Output: geo-spatially dense (raster) O_k
  • Unknown: selection of methods; workflow of
    operations/methods; parameters of methods;
    relationships of auxiliary variables w.r.t. O_k;
    quantitative metric of output goodness

[Figure: diagram with labels p1j, p2j, V1, V2, Interpolations, Mathematical models, Auxiliary Variables, Tacit Knowledge, O1]
6
Applied Problem
Recharge and Discharge (R/D) Rate Prediction
[Figure labels: bedrock elevation, water table elevation, recharged, discharged]
7
Interdisciplinary Objectives
  • Ground Water (Hydrologic Science) View
  • Evaluation of Alternative Conceptual (implicit
    relationships) and Mathematical Models (explicit
    relationships)
  • Accurate Prediction of Groundwater Recharge and
    Discharge Rates from Limited Number of Field
    Measurements
  • Computer Science View
  • Computer-Assisted Learning to Assess Alternative
    Conceptual and Mathematical Models
  • Optimization of Prediction Models From a Set of
    Geo-Spatially Sparse Point Measurements

DIALOG
8
State-of-the-Art Results
  • Limited Spatial Resolution and Accuracy

9
Existing Software for Groundwater and Surface
Water Modeling
  • MODFLOW is a three-dimensional finite-difference
    ground-water model
  • http://water.usgs.gov/nrp/gwsoftware/modflow2005/modflow2005.html
    - freeware (2005)
  • PEST is software for model calibration,
    parameter estimation and predictive uncertainty
    analysis
  • http://www.sspa.com/pest/ - freeware (2007)
    University of Queensland, Australia
  • Precipitation-Runoff Modeling System (PRMS) is a
    deterministic, distributed-parameter modeling
    system developed to evaluate the impacts of
    various combinations of precipitation, climate,
    and land use on streamflow, sediment yields, and
    general basin hydrology
  • http://water.usgs.gov/software/prms.html -
    freeware (1996) USGS
  • Deep Percolation Model (DPM) facilitates
    estimation of ground-water recharge under a large
    range of climatic, landscape, land-use and
    land-cover conditions
  • http://pubs.usgs.gov/sir/2006/5318/ USGS

10
Related Work
  • Singh A. et al. Expert-Driven Perceptive
    Models for Reducing User Fatigue in an
    Interactive Hydrologic Model Calibration
    Framework

Conductivity (K) and Hydraulic heads (H) for the
hypothetical aquifer
11
Motivation
  • Ground Water (Hydrologic) Science
  • Currently, no single method can estimate R/D
    rates and patterns for all practical
    applications.
  • Therefore, cross-analyzing results from various
    estimation methods and related field information
    is likely to be superior to using only a single
    estimation method.
  • Computer Science
  • It is currently impossible
  • (a) to replace an expert who has a lot of tacit
    domain knowledge with computer algorithms or
  • (b) for an expert to learn new I/O relationships
    from a plethora of possible variables and an
    extremely large space of processing methods and
    their parameters
  • Thus, assisting experts to discover, evaluate
    and validate new relationships in an iterative
    way will likely enable
  • (a) a better understanding of the underlying
    phenomena, and
  • (b) more automated and cost-efficient predictions

12
Problems Addressed by Spatial Pattern To Learn
13
Our Approach
  • Data-Driven Analyses to Test Alternative Models,
    and to Search the Space of Processing Operations
    and Their Parameters
  • Interpolation methods
  • Mathematical models
  • Image processing algorithms
  • Machine learning algorithms
  • Scalability of algorithms with large size data
  • Computer-Assisted Comparisons and Evaluations of
    Multiple Models and Sub-Optimal Solutions
  • Model/Solution Representation
  • Closed Loop (Iterative) Workflows
  • Human Computer Interfaces
  • Overall Approach: An Exploration Framework for a
    Class of Alternative Models/Hypotheses and
    Optimal Solutions

14
SP2Learn Problem Formulation
  • Given a set of geo-spatially sparse field
    measurements and auxiliary variables, derive an
    accurate, spatially dense R/D rate map by
  • (a) using a physics-based model
  • (b) incorporating boundary conditions and
  • (c) exploring auxiliary variables that represent
    prior knowledge about R/D patterns but are missing
    from the physics-based model

15
Challenges
  • (1) How to recognize a meaningful pattern in the
    predicted map?
  • (2) How to quantify the goodness of the pattern?
  • Approach:
  • (1a) Recognize patterns by applying multiple
    image enhancement and segmentation techniques
    to R/D rate predictions
  • (1b) Introduce a relationship between the R/D
    pattern and auxiliary (a priori reference)
    information
  • (2a) Define goodness w.r.t. reference information
    using experts' selection of meaningful
    relationships
  • (2b) Define goodness w.r.t. reference information
    using the complexity of the machine-learned model

16
Using Physics-Based Model
R/D Rate Prediction
Field Measurements
[Figure labels: water table elevation, hydraulic conductivity, incoming water, outgoing water, bedrock elevation, recharged, discharged]
Groundwater flux = hydraulic conductivity × cell
area × gradient of water table elevation (head)
over cell distance
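
The flux relation on this slide can be sketched in a few lines of Python. This is only an illustrative sketch, not the SP2Learn implementation: the function names, the 1-D row of cells, the uniform conductivity, and the sign convention (positive means recharge) are all assumptions made for the example.

```python
def darcy_flux(conductivity, cell_area, head_from, head_to, distance):
    """Flux from one cell toward a neighbor.

    flux = hydraulic conductivity * cell area * head gradient,
    where the gradient is the head difference over the cell distance.
    Positive values mean water moves from `head_from` to `head_to`.
    """
    gradient = (head_from - head_to) / distance
    return conductivity * cell_area * gradient


def rd_rate_1d(heads, conductivity, cell_area, distance):
    """Mass balance for each interior cell of a 1-D row of cells.

    Assumed convention: outgoing minus incoming lateral flux.
    A positive value is read as recharge (water must be entering the
    cell from outside the lateral flow), a negative one as discharge.
    """
    rates = []
    for i in range(1, len(heads) - 1):
        incoming = darcy_flux(conductivity, cell_area,
                              heads[i - 1], heads[i], distance)
        outgoing = darcy_flux(conductivity, cell_area,
                              heads[i], heads[i + 1], distance)
        rates.append(outgoing - incoming)
    return rates


# A uniform head gradient gives no net recharge or discharge.
print(rd_rate_1d([10.0, 9.0, 8.0], 2.0, 1.0, 1.0))
# A steeper gradient downstream means more water leaves than enters.
print(rd_rate_1d([10.0, 9.0, 7.0], 2.0, 1.0, 1.0))
```

A real raster model would apply the same balance over the four neighbors of each cell and use per-cell conductivity values.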
17
Incorporating Spatial Boundary Conditions
  • BC: The R/D rate prediction could have smooth
    transitions, and recharge and discharge regions
    (contiguous pixels) should be clearly delineated
  • Approach: Apply image restoration and de-noising
    techniques
  • Moving-average-based low pass filter
  • TVL (Total Variation regularized L1-norm
    function) based filter
  • Morphological-operation-based filter
  • Using multiple techniques multiple times
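
As a rough illustration of the first technique in the list, a moving average low pass filter over a raster can be written as below. This is a minimal pure-Python sketch, not SP2Learn code; the function name, the list-of-lists raster, the assumption of odd kernel sizes, and the choice to clip windows at the map border are all assumptions.

```python
def moving_average(grid, kernel_rows, kernel_cols):
    """Replace each pixel by the mean of the window centered on it.

    Assumes odd kernel sizes. Windows are clipped at the map border,
    so border pixels are averaged over fewer neighbors.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    half_r, half_c = kernel_rows // 2, kernel_cols // 2
    smoothed = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - half_r), min(n_rows, r + half_r + 1))
                for cc in range(max(0, c - half_c), min(n_cols, c + half_c + 1))
            ]
            row.append(sum(window) / len(window))
        smoothed.append(row)
    return smoothed


# A single noisy spike is spread out and damped by a 3x3 kernel.
spike = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
for row in moving_average(spike, 3, 3):
    print(row)
```

Applying the function repeatedly, or chaining it with other filters, corresponds to the "multiple techniques multiple times" item.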

[Figure labels: discharged, recharged]
18
Exploring Auxiliary Variables Driving R/D Patterns
Prior Tacit Knowledge about R/D and Auxiliary
Variables
  • Proximity to river: P(R or D area | river is
    close) = high
  • Soil type: P(R or D area | soil = clay) = low
  • Slope: P(R or D area | slope = high) = low

[Figure panel titles: "moving average normalization TVL", "normalization TVL", "moving average"]
19
From Auxiliary Variables To Knowledge and
Accurate R/D
Load Variables
Integrate Maps
Load R/D Map
Define ROI
Create Decision Tree
Apply Rules
20
SP2Learn Output
  • A set of rules that define relationships between
    predicted (R/D rate) variable and auxiliary
    variables
  • Modified (more accurate) predictions according to
    the user-selected rules defining relationships of
    predicted and auxiliary variables
  • Sensitivity analysis results with respect to:
  • Methods (interpolations, image enhancement, ...)
  • Models
  • Parameters

21
Example Results
ROI
  • <RULE ID=138 NUM_OF_CASES=3975 SUPPORT=32.65>
  • <IF>Elevation is not in 330-344 AND
  • Soil type is in RmRoscommon muck AND
  • Proximity to water body is not near_water AND
  • Slope is in 0-0.9 </IF>
  • <THEN>R/D rate is -0.004,-0.002</THEN>
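
To make the structure of such a rule concrete, the sketch below encodes rule 138 as a Python dictionary and checks it against one pixel's attribute values. The dictionary layout, the operator names, the inclusive range bounds, and the label "far_from_water" are assumptions for illustration; they are not the SP2Learn rule format.

```python
# Hypothetical in-memory encoding of the example rule above.
RULE_138 = {
    "id": 138,
    "conditions": [
        ("elevation", "not_in_range", (330, 344)),
        ("soil_type", "in_set", {"Roscommon muck"}),
        ("proximity", "not_in_set", {"near_water"}),
        ("slope", "in_range", (0, 0.9)),
    ],
    "conclusion": (-0.004, -0.002),  # R/D rate interval
}


def condition_holds(value, operator, operand):
    """Evaluate one condition; range bounds are assumed inclusive."""
    if operator == "in_range":
        low, high = operand
        return low <= value <= high
    if operator == "not_in_range":
        low, high = operand
        return not (low <= value <= high)
    if operator == "in_set":
        return value in operand
    if operator == "not_in_set":
        return value not in operand
    raise ValueError(f"unknown operator: {operator}")


def rule_matches(rule, pixel):
    """A rule fires only if all of its AND-ed conditions hold."""
    return all(
        condition_holds(pixel[attribute], operator, operand)
        for attribute, operator, operand in rule["conditions"]
    )


pixel = {"elevation": 350, "soil_type": "Roscommon muck",
         "proximity": "far_from_water", "slope": 0.5}
print(rule_matches(RULE_138, pixel))
```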



22
SP2Learn Architecture and Functionality
23
Underlying SP2Learn Technology
24
SP2Learn Functionality Overview
Load Raster Step
Integration Step
Create Mask Step
Rules Step
Attribute Selection Step
Apply Rule Step
25
SP2Learn Workflow
26
On-Line Help
27
Software and Test Data Download
  • Download web page of Image Spatial Data Analysis
    group at NCSA: http://isda.ncsa.uiuc.edu/download/

28
Running SP2Learn
29
Input Data to SP2Learn
  • Raster files (maps)
  • Predicted R/D rate models
  • Auxiliary variables
  • For mask creation
  • Tables with geo-points
  • Vector files with boundaries
  • Raster files of categorical or continuous
    variables

30
Image Processing
  • Filtering Methods
  • Low pass (moving average) filters
  • Morphological filters
  • TVL1 (Total Variation regularized L1 function)
  • Using multiple techniques multiple times
  • Parameters
  • Kernel size (row dimension, column dimension)
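
The role of the kernel size parameter can be illustrated with morphological opening and closing using a rectangular kernel. The sketch below is a generic pure-Python version written for this transcript, not SP2Learn code; the function names, the assumption of odd kernel sizes, and the border handling (windows clipped at the map edge) are assumptions.

```python
def _window_reduce(grid, kernel_rows, kernel_cols, reducer):
    """Apply `reducer` (min or max) over a rectangular window per pixel."""
    n_rows, n_cols = len(grid), len(grid[0])
    half_r, half_c = kernel_rows // 2, kernel_cols // 2
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - half_r), min(n_rows, r + half_r + 1))
                for cc in range(max(0, c - half_c), min(n_cols, c + half_c + 1))
            ]
            row.append(reducer(window))
        out.append(row)
    return out


def erode(grid, kernel_rows, kernel_cols):
    return _window_reduce(grid, kernel_rows, kernel_cols, min)


def dilate(grid, kernel_rows, kernel_cols):
    return _window_reduce(grid, kernel_rows, kernel_cols, max)


def opening(grid, kernel_rows, kernel_cols):
    """Erosion then dilation: removes regions smaller than the kernel."""
    return dilate(erode(grid, kernel_rows, kernel_cols), kernel_rows, kernel_cols)


def closing(grid, kernel_rows, kernel_cols):
    """Dilation then erosion: fills holes smaller than the kernel."""
    return erode(dilate(grid, kernel_rows, kernel_cols), kernel_rows, kernel_cols)


# An isolated pixel disappears under a 3x3 opening.
speck = [[0] * 5 for _ in range(5)]
speck[2][2] = 1
print(opening(speck, 3, 3))
```

A larger kernel removes or fills larger regions, which is why the kernel's row and column dimensions are the parameters a user tunes.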

31
Example Input Maps
[Figure: example input maps after morphological opening, morphological closing, and low pass filtering, each shown with kernel (10,10) and kernel (5,5)]
32
Example Auxiliary Maps
  • Slope
  • DEM
  • Soil
  • River Stream

33
Loading Files
  • Load R/D rate models (maps)
  • Load auxiliary maps to explore alternative models
  • Proximity to water
  • Soil type
  • Slope


34
Mosaic Maps
  • Large spatial coverage: a set of tiles
  • Out-of-core representation


35
Viewing Images
  • Right mouse click
  • Image information
  • Zoom
  • Check boxes
  • Pseudo-color
  • Auto-fit images


36
Registration
  • Integration of all maps (raster images) to a
    common projection and spatial resolution
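
One part of this integration step, bringing rasters to a common spatial resolution, can be sketched with nearest-neighbor resampling. This is an assumed, simplified illustration: it ignores map projections entirely, the function name is hypothetical, and the slides do not state which resampling method SP2Learn uses.

```python
def resample_nearest(grid, new_rows, new_cols):
    """Resample a raster to a new size by nearest-neighbor lookup.

    Each output pixel copies the input pixel whose index is the
    proportionally scaled output index (integer floor).
    """
    old_rows, old_cols = len(grid), len(grid[0])
    return [
        [grid[r * old_rows // new_rows][c * old_cols // new_cols]
         for c in range(new_cols)]
        for r in range(new_rows)
    ]


coarse = [[1, 2],
          [3, 4]]
for row in resample_nearest(coarse, 4, 4):
    print(row)
```

Nearest-neighbor lookup keeps categorical values such as soil type intact, whereas averaging would produce meaningless in-between classes.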


Before Convert
After Convert
37
Create Mask
[Screenshot labels: A, B, C; Mask Parameters, Visualization Panel, Mask Operations]
38
Mask Creation Options in SP2Learn
39
User Defined Mask Creation
  • Set Parameter: User defined
  • Mouse click-and-drag selection of region
  • Click Paint and Show
  • Click Apply

40
Label Editor
  • Assign categorical labels to colors



41
Attribute Selection
  • Output: Predicted Variable
  • Input: Auxiliary Variables
  • Check-boxes
  • Show Table
  • Prune Tree

42
Decision Tree Based Modeling
  • Tree structure can be represented as a set of
    rules
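
The conversion from a tree to rules amounts to listing every root-to-leaf path: the tests along the path become the conditions and the leaf label becomes the conclusion. The sketch below assumes a hypothetical nested-dictionary tree and made-up test strings; it does not reflect SP2Learn's internal tree representation.

```python
def tree_to_rules(node, conditions=()):
    """Return one (conditions, conclusion) pair per leaf of the tree.

    A leaf is a dict with a "label" key; an internal node is a dict
    with a "branches" list of (test description, child node) pairs.
    """
    if "label" in node:
        return [(list(conditions), node["label"])]
    rules = []
    for test, child in node["branches"]:
        rules.extend(tree_to_rules(child, conditions + (test,)))
    return rules


# Hypothetical two-level tree over auxiliary variables.
tree = {
    "branches": [
        ("slope <= 0.9", {
            "branches": [
                ("soil is muck", {"label": "discharge"}),
                ("soil is not muck", {"label": "recharge"}),
            ],
        }),
        ("slope > 0.9", {"label": "no change"}),
    ],
}

for conditions, conclusion in tree_to_rules(tree):
    print("IF", " AND ".join(conditions), "THEN", conclusion)
```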

43
Rules from Decision Tree
  • Num: Node number in a decision tree.
  • Support (%): Among all cases satisfying the
    conditions, the ratio of cases having the same
    class (conclusion).
  • # of cases: The number of cases satisfying the
    conditions
  • Class: Conclusion of a rule
  • Conditions: Conditions of a rule
  • MDL Score: Minimum Description Length (MDL) score
    of a decision tree. The lower the score, the
    better the tree
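
The two per-rule statistics defined above can be computed directly from a table of cases. In this sketch the case records, the "rd_class" field name, and the expression of support as a percentage are assumptions for illustration.

```python
def rule_stats(cases, condition, conclusion_class):
    """Return (number of cases, support in percent) for one rule.

    number of cases: how many cases satisfy the rule's conditions.
    support: among those, the share whose class equals the conclusion.
    """
    matching = [case for case in cases if condition(case)]
    if not matching:
        return 0, 0.0
    agreeing = sum(1 for case in matching
                   if case["rd_class"] == conclusion_class)
    return len(matching), 100.0 * agreeing / len(matching)


# Hypothetical pixel table: slope value and R/D class per case.
cases = [
    {"slope": 0.1, "rd_class": "discharge"},
    {"slope": 0.2, "rd_class": "discharge"},
    {"slope": 0.5, "rd_class": "discharge"},
    {"slope": 0.8, "rd_class": "recharge"},
    {"slope": 2.0, "rd_class": "discharge"},
]

n_cases, support = rule_stats(cases, lambda case: case["slope"] <= 0.9, "discharge")
print(n_cases, support)
```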

44
Show Decision Tree
Show Tree Option
45
Export Rules
  • XML format

Export Rules Option
46
Apply Rules
  • Visualization of
  • Modified output variable
  • Changed pixels
  • Magnitude of changes (differences)
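
The three visualized quantities can be derived by comparing the map before and after rule application. The sketch below is an assumed, minimal version using lists of lists; the function name and return layout are hypothetical.

```python
def compare_maps(original, modified):
    """Compare two rasters of equal size.

    Returns a boolean map of changed pixels, a map of absolute
    differences, and the count of changed pixels.
    """
    changed = []
    magnitude = []
    for row_o, row_m in zip(original, modified):
        changed.append([o != m for o, m in zip(row_o, row_m)])
        magnitude.append([abs(m - o) for o, m in zip(row_o, row_m)])
    n_changed = sum(sum(row) for row in changed)
    return changed, magnitude, n_changed


before = [[1.0, 2.0],
          [3.0, 4.0]]
after = [[1.0, 2.5],
         [3.0, 3.0]]
changed, magnitude, n_changed = compare_maps(before, after)
print(changed)
print(magnitude)
print(n_changed)
```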

47
Summary
  • Novel Frameworks and Methodologies for
    Exploratory Data-Driven Modeling and Scientific
    Discoveries
  • Problems addressed in the prototype SP2Learn
    solution
  • Prediction accuracy improvement by a combination
    of mathematical models and data-driven (knowledge
    based) models, supervised and unsupervised
    iterative model optimization
  • Better Data Utilization!

48
Extra Information
  • A stack of informatics and cyber-infrastructure
    software is open source
  • Other software of potential interest
  • GeoLearn is an exploratory framework for
    extracting information and knowledge from remote
    sensing imagery
  • CyberIntegrator to support creation of
    exploratory workflows, reuse of workflows, remote
    server execution, data and process provenance
    tracking and analysis, streaming data support
  • Image Provenance to Learn (IP2Learn) to support
    decision processes based on visual inspection of
    images
  • Load Estimation (work in progress) to support
    optimal sampling of sediment loads using several
    sediment-discharge rating curves, bias correction
    factors and Monte Carlo simulations to predict
    confidence limits
  • Download web page of Image Spatial Data Analysis
    group at NCSA: http://isda.ncsa.uiuc.edu/download/

49
Acknowledgement
  • Funding Agencies
  • NASA, NARA, NSF, NIH, NAVY, DARPA, ONR, NCSA
    Industrial Partners, NCSA Internal, COM UIUC,
    State of Illinois
  • Full Time Employees
  • Peter Bajcsy, Rob Kooper, Sang-Chul Lee, Luigi
    Marini
  • Students
  • Shadi Ashnai, Melvin Casares, Miles Johnson,
    Chulyun Kim, Qi Li, Tim Nee, Arlex Torres, Ryo
    Kondo, Henrik Lomotan, James Rapp
  • Collaborators
  • College of Applied Health Sciences UIUC,
    Kinesiology Dept. UIUC, CEE UIUC, CS UIUC, GISLIS
    UIUC
  • UIC, UC Berkeley, Univ. of Texas at Austin, Univ.
    of Iowa
  • ISWS, NARA, Nielsen, State Farm
  • Instituto Tecnológico de Costa Rica, UNESCO-IHE
    Netherlands

50
Thank you!
  • Questions?
  • Peter Bajcsy: pbajcsy_at_ncsa.uiuc.edu
  • Need More Details?
  • Publications: http://isda.ncsa.uiuc.edu

51
Backup