Title: TerraStream: From Elevation Data to Watershed Hierarchies
1 TerraStream: From Elevation Data to Watershed Hierarchies
Andrew Danner (Swarthmore), T. Moelhave
(Aarhus), K. Yi (HKUST), P. K. Agarwal (Duke),
L. Arge (Aarhus), H. Mitasova (NCSU)
Thursday, 08 November 2007
2 Current Problem: Large Point Data Sets
- LIDAR
- NC Coastline: 200 million points, over 7 GB
- Neuse River basin (NC): 500 million points, over 17 GB
- Grid elevation models
- Neuse River basin
- 20 ft: 2.5 GB
- 10 ft: 10 GB
- 5 ft: 40 GB
- Data too big for RAM
- Must reside on disk
- Disk is slow
3 I/O-efficient Algorithms [AV88]
- Traditional algorithms optimize CPU computation
- Not aware of performance penalty of disk access
- Virtual memory and swap space can't predict disk access
- I/O model
- Memory is finite
- Data is transferred in blocks
- Complexity measured in disk blocks transferred
[Figure: I/O model diagram showing memory of size M and disk blocks of size B]
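Later slides quote costs as Sort(N) I/Os. As a rough illustration of what that means in this model, the sketch below counts block transfers using the standard bounds Scan(N) = N/B and Sort(N) = (N/B) log_{M/B}(N/B). The function names and the example sizes are assumptions made for illustration, and constant factors are ignored.

```python
def scan_ios(n, block):
    """Blocks transferred to read n items once: ceil(n / B)."""
    return -(-n // block)


def sort_ios(n, memory, block):
    """Rough Sort(N): (N/B) blocks per pass times log_{M/B}(N/B) passes.

    Assumes memory holds at least two blocks (memory >= 2 * block).
    """
    blocks = scan_ios(n, block)
    fan_out = memory // block  # number of blocks that fit in memory
    passes = 0
    reach = 1
    # Count how many times we must multiply by M/B to cover all N/B blocks.
    while reach < blocks:
        reach *= fan_out
        passes += 1
    return blocks * max(1, passes)


# Example with made-up sizes: 10^6 items, memory of 10^4 items, blocks of 100 items.
print(scan_ios(10**6, 100))         # 10000
print(sort_ios(10**6, 10**4, 100))  # 20000
```

With these sizes sorting costs only two scans' worth of transfers, which is why a Sort(N) bound is considered practical for data that does not fit in RAM.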
4 TerraStream: Terrain processing pipeline
5 TerraStream Goals
- Scalable: All stages must work for 100 million points/cells
- General: Stages should work with either TIN or grid data
- Automated: No need for manual intervention/preprocessing
- Modular: Users only need to run the stages they want
- Adaptable: Allow each stage to support multiple models
6 TerraStream: Terrain processing pipeline
7 Points to DEM
- Grid DEM
- Interpolation
- Use quad tree to automatically tile terrain
- Use quad tree neighbors for smooth boundary transitions
- TIN
- I/O-efficient Delaunay triangulation
- Constrained Delaunay also possible if constraints (breaklines) fit in memory
- Height graph
- View both grids and TINs as a height graph
- Nodes, neighbors, and edges between neighboring nodes
- Definition of node, neighbor different in TIN/Grid
- Design algorithms to work on height graphs
8 Flow Modeling
9 Flow Modeling
- Identifying minima due to noise
- Removing noise from terrains
- Modeling flow directions, extracting river
networks
10 Coping with Noisy Data
- Identifying minima likely due to noise
- Topological persistence: computed in Sort(N) I/Os
- Assign a significance score to each minimum (low score → likely noise)
- Provide mechanism for removing low-scoring sinks
- User can select score threshold
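The scoring idea can be sketched in memory with a union-find sweep: nodes are added in order of increasing height, and when two basins meet, the one with the higher minimum "dies" with a score equal to the merge height minus its minimum's height. This is a plain in-memory illustration of persistence of minima, not the Sort(N) I/O algorithm the slide refers to; the graph layout and function name are assumptions, and the graph is assumed to be connected.

```python
def sink_persistence(graph):
    """graph: {node: (height, [neighbors])}. Returns {minimum: persistence}."""
    order = sorted(graph, key=lambda v: (graph[v][0], v))
    rank = {v: i for i, v in enumerate(order)}
    parent = {}   # union-find forest over nodes added so far
    lowest = {}   # component root -> lowest node in that component
    result = {}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for v in order:
        parent[v] = v
        lowest[v] = v
        for u in graph[v][1]:
            if u not in parent:
                continue  # neighbor is higher, not added yet
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            mu, mv = lowest[ru], lowest[rv]
            survivor, dying = (mu, mv) if rank[mu] < rank[mv] else (mv, mu)
            if dying != v:
                # A real basin is absorbed at height(v): record its score.
                result[dying] = graph[v][0] - graph[dying][0]
            parent[ru] = rv
            lowest[rv] = survivor
    result[order[0]] = float("inf")  # the global minimum never dies
    return result


# A 1D profile with heights 1, 5, 2, 4, 0 (nodes 0..4 in a path).
h = [1, 5, 2, 4, 0]
path = {i: (h[i], [j for j in (i - 1, i + 1) if 0 <= j < len(h)])
        for i in range(len(h))}
print(sink_persistence(path))  # {2: 2, 0: 4, 4: inf}
```

With a user threshold of, say, 3, only the sink at node 2 (score 2) would be treated as noise.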
11 Noise Removal
[Figure panels: noisy terrain; after noise removal]
Flooding in Sort(N) I/Os. Other mechanisms? Carving?
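For intuition about what flooding does, the sketch below raises every node to the lowest level at which water could escape to an outlet, using a priority queue in the style of the well-known priority-flood approach. It is an in-memory stand-in, not the I/O-efficient method from the slides, and it fills every sink rather than only those below a chosen score threshold. The graph layout and the outlets parameter are assumptions.

```python
import heapq


def flood(graph, outlets):
    """Return {node: flooded height}.

    graph: {node: (height, [neighbors])}; outlets: nodes where water may
    leave the terrain (for example, boundary nodes).
    """
    filled = {}
    seen = set(outlets)
    heap = [(graph[v][0], v) for v in outlets]
    heapq.heapify(heap)
    while heap:
        level, v = heapq.heappop(heap)
        filled[v] = level
        for u in graph[v][1]:
            if u not in seen:
                seen.add(u)
                # Water at u cannot drain below the level it must cross.
                heapq.heappush(heap, (max(level, graph[u][0]), u))
    return filled


# Same 1D profile as a path: heights 1, 5, 2, 4, 0 with the outlet at node 4.
h = [1, 5, 2, 4, 0]
path = {i: (h[i], [j for j in (i - 1, i + 1) if 0 <= j < len(h)])
        for i in range(len(h))}
print(flood(path, [4]))  # {4: 0, 3: 4, 2: 4, 1: 5, 0: 5}
```

Note that the filled sinks become flat areas, which is why flat-area handling matters after flooding.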
12 From Elevation to River Networks
- Where does water go?
- From higher elevation to lower elevation
- Single flow directions form a tree
- Support for multiple flow directions
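A minimal sketch of single flow directions on a height graph: each node points to one strictly lower neighbor, so following the pointers always descends and the result is a tree rooted at each sink. Choosing the lowest neighbor (rather than the steepest slope) and the function name are simplifying assumptions of this sketch.

```python
def flow_directions(graph):
    """graph: {node: (height, [neighbors])}.

    Returns {node: downstream node}, with None at sinks.
    """
    downstream = {}
    for v, (h, nbrs) in graph.items():
        lower = [u for u in nbrs if graph[u][0] < h]
        # Pick the lowest strictly lower neighbor; ties broken by node id.
        downstream[v] = (min(lower, key=lambda u: (graph[u][0], u))
                         if lower else None)
    return downstream


# A V-shaped 1D profile: heights 3, 2, 1, 2, 3 (nodes 0..4 in a path).
h = [3, 2, 1, 2, 3]
path = {i: (h[i], [j for j in (i - 1, i + 1) if 0 <= j < len(h)])
        for i in range(len(h))}
print(flow_directions(path))  # {0: 1, 1: 2, 2: None, 3: 2, 4: 3}
```

A multiple flow direction model would instead keep every lower neighbor and split the flow among them.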
13 Drainage Area
- How much area is upstream of each node?
- Each node has initial drainage area (1)
- Drainage area of internal nodes depends on
drainage area of children
[Figure: example flow tree; visible node labels: 3, 3, 5]
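The drainage area rule above can be sketched as a single pass over the flow tree from high to low elevation, so every node's total is complete before it is passed downstream. The input layout (a height graph plus a downstream map with None at sinks) and the function name are assumptions for illustration.

```python
def drainage_area(graph, downstream):
    """graph: {node: (height, [neighbors])}; downstream: {node: node or None}.

    Each node starts with area 1 and adds its total to its downstream node.
    Assumes every downstream node is strictly lower than its source.
    """
    area = {v: 1 for v in graph}
    for v in sorted(graph, key=lambda v: graph[v][0], reverse=True):
        d = downstream[v]
        if d is not None:
            area[d] += area[v]
    return area


# V-shaped 1D profile: heights 3, 2, 1, 2, 3, draining to node 2.
h = [3, 2, 1, 2, 3]
path = {i: (h[i], [j for j in (i - 1, i + 1) if 0 <= j < len(h)])
        for i in range(len(h))}
down = {0: 1, 1: 2, 2: None, 3: 2, 4: 3}
print(drainage_area(path, down))  # {0: 1, 1: 2, 2: 5, 3: 2, 4: 1}
```

The sink collects all five nodes: two from each side plus itself.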
14 Computing Flow Directions/Drainage
- Terraflow
- Sort(N) I/Os on grids
- Modified to work on height graphs
- Same I/O bound
- Now works on TINs
- New implementation
- More robust, portable
- Incorporate new sink removal
- Better handling of flat areas
15 Flow Modeling Improvements
- Detection of flat areas
- Improved method on grids if O(1) rows fit in memory
- Routing on flat areas
- Soille extension of Garbrecht and Martz
- Flat areas are usually the result of hydrological conditioning with flooding
16 Hierarchical Watershed Decomposition
17 Watershed Hierarchies
- Decompose a river network into a hierarchy of hydrological units (HUs)
- All water in an HU flows to a common outlet
- Hierarchy provides tunable level of detail
- Method used: Pfafstetter
- Want a solution scalable to large modern hi-res terrains
18 Pfafstetter
- Find main river
- Find four largest tributaries
- Label basins/interbasins
- Recurse until single path
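One level of the labeling step can be sketched as follows, using the usual Pfafstetter convention: the four largest tributaries get even digits 2, 4, 6, 8 from downstream to upstream, and the interbasins between them get odd digits 1, 3, 5, 7, 9. The input format (tributaries listed with their junction distance from the outlet and their drainage area) and the function name are assumptions; smaller tributaries are simply assigned to the interbasin they join.

```python
def pfafstetter_digits(tributaries):
    """tributaries: list of (name, distance_from_outlet, drainage_area)
    for every tributary joining the main river.

    Returns {name: digit} for one level of the hierarchy.
    """
    # Four largest tributaries by drainage area, ordered downstream to upstream.
    largest = sorted(tributaries, key=lambda t: t[2], reverse=True)[:4]
    largest.sort(key=lambda t: t[1])
    digits = {name: 2 * (i + 1) for i, (name, _, _) in enumerate(largest)}

    # Every other tributary lies in an interbasin: count how many of the
    # large junctions are downstream of it.
    junctions = [dist for _, dist, _ in largest]
    for name, dist, _ in tributaries:
        if name not in digits:
            below = sum(1 for j in junctions if j < dist)
            digits[name] = 2 * below + 1
    return digits


# Hypothetical tributaries: (name, distance from outlet, drainage area).
tribs = [("a", 10, 5), ("b", 20, 50), ("c", 30, 40), ("d", 40, 3),
         ("e", 50, 60), ("f", 60, 70), ("g", 70, 2)]
print(pfafstetter_digits(tribs))
# {'b': 2, 'c': 4, 'e': 6, 'f': 8, 'a': 1, 'd': 5, 'g': 9}
```

Recursing means repeating the same step inside each basin and interbasin, appending one digit per level.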
19 Example Watershed Boundaries
All levels computed in one run. User selects level of detail with map algebra.
20 A Complete Pipeline
21 Implementation
- TPIE: C++ primitives for I/O-efficient algorithms
- Standalone command line apps with GDAL
- GRASS (open source GIS) plugins
- ArcGIS plugins (soon)
- Test Data
- North Carolina LIDAR
- Neuse river basin: 400 million points (NC Floodmaps)
- Outer banks coastal data: 128 million points (NOAA CSC)
- USGS 30m NED
22 Our Results
- Experimental Results
- Scales to over 400 million points
- Other software tools crash at 25 million points
- Keeps memory usage low using I/O efficient
methods
Format                   20ft grid   10ft grid   TIN
HG vertices (millions)   397         1590        469
Pipeline stage
DEM construction         19h 56m     27h 12m     4h 20m
Building height graph    0h 07m      0h 30m      11h 42m
Hydro. conditioning      1h 17m      7h 25m      10h 03m
Flow modeling
  Routing                1h 26m      6h 34m      15h 08m
  Accumulation           1h 40m      7h 35m      2h 05m
Watershed delineation    2h 28m      14h 39m     6h 26m
Total                    25h 54m     63h 34m     49h 44m
23 Future Directions: Grid Construction
- Interpolate leaves in parallel
- Test other interpolation methods
- Test with more data sources
- Finding the ideal resolution
24 Future Directions: Noise Removal
- Bridge detection/removal
- Hydrological conditioning with carving
- Scoring of sinks based on volume
- Other flow routing methods
- Further flat routing improvements
25 Flow Routing and Bridges
Use flooded terrain for connectivity, but use original terrain for routing.
26 Questions?