Augmenting the Generalized Hough Transform to Enable the Mining of Petroglyphs presentation

1
Augmenting the Generalized Hough Transform to
Enable the Mining of Petroglyphs

Qiang Zhu, Xiaoyue Wang, Eamonn Keogh, Sang-Hee Lee¹
Dept. of Computer Science & Eng., ¹Dept. of Anthropology
University of California, Riverside

2
Outline

Motivation
Approach
Evaluation
Conclusion

3
Motivation (1) - applications

Petroglyphs are one of the earliest expressions of abstract thinking.
They provide a rich source of information about:
- climate change
- the existence of a certain species
- patterns of human migration and interaction

4
Motivation (2) - difficulties

Progress in petroglyph research has been frustratingly slow:
- petroglyphs have an extraordinarily diverse and complex structure
- most matching algorithms cannot capture the similarity of petroglyphs
- those that can, even in limited cases, do not scale to large collections

5
Approach

How to preprocess the raw data?
How to define the distance measure?
How to speed up?

6
Preprocessing (1)

With rare exceptions, petroglyphs do not lend themselves to automatic extraction with segmentation algorithms.

[Figure: the border of the rock may be recognized as the edge of the petroglyph.]
7
PetroAnnotator
Load the raw image into our human computation tool
8
PetroAnnotator (cont.)
Draw an approximate boundary around the object, and then trace the shape
9
Preprocessing (2) - downsampling

(A) Two overlaid skeleton traces (340 by 250) of the same image of a Bighorn sheep. Less than 3.5% of the pixels from each image overlap.
(B) The same two images after downsampling (30 by 23). 75.6% of the pixels (denoted by black) are common to both.
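
The effect of downsampling on overlap can be sketched in a few lines. The slides do not say how the traces are downsampled or exactly how the overlap percentage is computed, so the "a coarse cell is set if any pixel falls in it" rule, the union-based overlap, and the names `downsample` and `overlap_fraction` are all assumptions made for illustration.

```python
def downsample(points, size, new_size):
    """Map edge pixels from a size=(w, h) grid onto a new_size=(w, h) grid.

    A coarse cell is set if any original edge pixel falls inside it
    (an assumed pooling rule, not one stated in the slides).
    """
    w, h = size
    nw, nh = new_size
    return {(x * nw // w, y * nh // h) for x, y in points}


def overlap_fraction(a, b):
    """Share of edge pixels common to both traces, relative to their union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Two hypothetical traces of the same diagonal stroke, drawn one pixel apart.
trace_a = {(i, i) for i in range(99)}
trace_b = {(i + 1, i) for i in range(99)}

full = overlap_fraction(trace_a, trace_b)
coarse = overlap_fraction(
    downsample(trace_a, (100, 100), (10, 10)),
    downsample(trace_b, (100, 100), (10, 10)),
)
print(full, coarse)  # no overlap at full resolution, over half after downsampling
```

Representing a trace as a set of (x, y) pixels keeps the sketch short; a bitmap array would work equally well.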

10
Distance Measure - why GHT?

GHT essentially makes no assumptions about the data:
- open/closed boundaries
- connected/disconnected shapes
It correctly captures similarity:
- subjective/objective similarity on unlabeled/labeled datasets
Its distance can be tightly lower bounded:
- allowing for very efficient searches in large datasets

11
Classic GHT

The GHT is a useful method for detecting arbitrary two-dimensional shapes.

12
(1) Find the star-pattern
13
(2) Superimpose and Accumulate
14
(3) Find the peak
[Figure: toy example with labels Q, R, and A; the accumulator values peak at 3.]
15
A Basic Distance Measure

Classic GHT doesn't explicitly encode a similarity measure.
We can simply define a GHT-based distance, the minimal unmatched edge points (MUE):
MUE = number of edge points in Q - maximal matched edge points
    = 4 - 3 = 1 (for our toy example)
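
As a rough illustration of the definition above, the sketch below votes over translations only and subtracts the accumulator peak from the number of query points. Shapes are assumed to be sets of (x, y) edge pixels; the function names and the toy shapes are hypothetical and are not the ones drawn on the slides.

```python
from collections import Counter


def ght_accumulator(query, candidate):
    """Classic GHT voting, restricted to translation.

    Every pairing of a query edge point with a candidate edge point votes
    for the shift (dx, dy) that would move the former onto the latter.
    """
    votes = Counter()
    for qx, qy in query:
        for cx, cy in candidate:
            votes[(cx - qx, cy - qy)] += 1
    return votes


def mue_distance(query, candidate):
    """Minimal unmatched edge points: |Q| minus the accumulator peak."""
    votes = ght_accumulator(query, candidate)
    peak = max(votes.values(), default=0)
    return len(query) - peak


# Hypothetical toy shapes: three of the four query points can be matched.
toy_q = {(0, 0), (1, 0), (2, 0), (1, 1)}
toy_c = {(5, 5), (6, 5), (7, 5), (9, 9)}
print(mue_distance(toy_q, toy_c))  # 1
```

The double loop makes the cost of filling the accumulator proportional to the number of query points times the number of candidate points.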

16
A New Cell Incrementation Strategy

When can we obtain the value of a particular cell in the accumulator?
In the classic GHT, not until all incrementation has finished.
Is it possible to obtain the cell values one by one?
To do so, we need to check all positions that could increase the cell's value.
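
One way to read this idea, for translation only, is sketched below: a single accumulator cell is evaluated on its own by checking just the candidate positions that could vote for it. The function name `cell_value` and the toy shapes are hypothetical.

```python
def cell_value(query, candidate, shift):
    """Value of one accumulator cell, computed without the other cells.

    Only the positions q + shift, for q in the query, can increase this
    cell, so checking those positions in the candidate is sufficient.
    """
    dx, dy = shift
    return sum((qx + dx, qy + dy) in candidate for qx, qy in query)


# Hypothetical shapes stored as sets of (x, y) edge pixels.
toy_q = {(0, 0), (1, 0), (2, 0), (1, 1)}
toy_c = {(5, 5), (6, 5), (7, 5), (9, 9)}
print(cell_value(toy_q, toy_c, (5, 5)))  # 3
```

Storing the candidate as a set makes each membership check constant time on average, so one cell costs time proportional to the number of query points.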

17
Lower Bound
[Figure: query Q overlaid on candidate C, compared column by column.]
In this column Q needs 2 pixels in C, and C has 3
In this column Q needs 2 pixels in C, and C has 2
In this column Q needs 4 pixels in C, and C has only 2
In this column Q needs 2 pixels in C, and C has 2
In this column Q needs 2 pixels in C, and C has 3
Minimal missed points: 2 (per column: 0, 0, 2, 0, 0)
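
The column-by-column reasoning above can be sketched as follows. The code uses column counts only, as in the figure, and minimizes over horizontal shifts; whether the authors also use rows is not stated in the slides, and the function names are hypothetical. The example shapes are built to have the column counts quoted above (2, 2, 4, 2, 2 for Q and 3, 2, 2, 2, 3 for C).

```python
from collections import Counter


def column_signature(points):
    """Number of edge pixels in each column (x coordinate) of a shape."""
    return Counter(x for x, _ in points)


def column_lower_bound(query, candidate):
    """Lower bound on the minimal number of unmatched query points.

    For a horizontal shift dx, a query column can match at most as many
    pixels as the candidate column it lands on contains. The shortfall is
    summed over columns and minimized over dx.
    """
    q_sig = column_signature(query)
    c_sig = column_signature(candidate)
    if not q_sig or not c_sig:
        return len(query)
    # Shifts outside this range leave no overlapping columns at all.
    shifts = range(min(c_sig) - max(q_sig), max(c_sig) - min(q_sig) + 1)
    return min(
        sum(max(0, n - c_sig.get(x + dx, 0)) for x, n in q_sig.items())
        for dx in shifts
    )


def stack(counts):
    """Build a shape with the given number of pixels in each column."""
    return {(x, y) for x, n in enumerate(counts) for y in range(n)}


print(column_lower_bound(stack([2, 2, 4, 2, 2]), stack([3, 2, 2, 2, 3])))  # 2
```

Because the bound depends only on the one-dimensional signatures, it never touches the two-dimensional accumulator.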
18
Time Complexity

Classic GHT
O(N_Q N_C + S^2)
superimposes all query vectors on all edge points in the candidate image
Lower bound GHT
O(S^2)
compares one-dimensional signatures
further reduced by early abandoning and shift ordering
one to two orders of magnitude speed-up
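
The slides name early abandoning and shift ordering without giving details, so the following is only one plausible reading: the per-column shortfall for a shift is summed until it can no longer beat the best shift seen so far, and shifts near zero are tried first so that a good bound is found early. Signatures are assumed to be plain dicts mapping a column index to its pixel count; all names are hypothetical.

```python
def shortfall_with_early_abandon(q_sig, c_sig, dx, best_so_far):
    """Sum the per-column shortfall for one shift, stopping early.

    Returns None as soon as the running sum reaches best_so_far, because
    this shift can then no longer improve on the best shift already seen.
    """
    total = 0
    for x, needed in q_sig.items():
        total += max(0, needed - c_sig.get(x + dx, 0))
        if total >= best_so_far:
            return None
    return total


def lower_bound_early_abandon(q_sig, c_sig, shifts):
    """Minimum shortfall over the given shifts, abandoning hopeless ones."""
    best = sum(q_sig.values())  # worst case: every query pixel is missed
    for dx in shifts:
        total = shortfall_with_early_abandon(q_sig, c_sig, dx, best)
        if total is not None:
            best = total
    return best


q_sig = {0: 2, 1: 2, 2: 4, 3: 2, 4: 2}
c_sig = {0: 3, 1: 2, 2: 2, 3: 2, 4: 3}
shifts = sorted(range(-4, 5), key=abs)  # try small shifts first
print(lower_bound_early_abandon(q_sig, c_sig, shifts))  # 2
```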

19
Variants on the Basic Distance Measure

Query-by-Content
Clustering
Finding Motifs

20
Evaluation

We performed three sets of experiments:
Evaluation of Utility - on unlabeled data
Evaluation of Accuracy - on labeled data
Evaluation of Scalability - on synthetic data

21
Evaluation of Utility (1)

Our GHT-based distance measure correctly groups all seven pairs.
The higher-level structure of the dendrogram also correctly groups similar petroglyphs.

[Figure: a clustering of typical Southwestern USA petroglyphs, with groups labeled Atlatls, Anthropomorphs, and Bighorn Sheep.]
22
Evaluation of Utility (2)
24
Evaluation of Utility (3)

Can our distance measure find meaningful motifs?
2,852 real petroglyphs
4,065,526 possible pairs
52 top motifs (0.00128% of all pairs) selected by the motif cutoff
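
The counts on this slide can be checked with a short calculation, reading 0.00128 as a percentage of all pairs (an assumption, since the percent sign does not appear in the transcript). The variable names are illustrative.

```python
from math import comb

n_petroglyphs = 2852
n_pairs = comb(n_petroglyphs, 2)   # unordered pairs of distinct petroglyphs
motif_share = 52 / n_pairs * 100   # top motifs as a percentage of all pairs
print(n_pairs, round(motif_share, 5))  # 4065526 0.00128
```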
25
Evaluation of Accuracy - datasets

NicIcon dataset
24,441 images
14 categories
33 volunteers
234 × 234 pixels
writer-dependent (WD) and writer-independent (WI) tests

Farsi digits dataset
From 11,942 registration forms
60,000 digits for training
20,000 digits for testing
54 × 64 pixels (largest MBR)

26
(1) Test the Downsampling Size
In both datasets, the error rate of the one-nearest-neighbor test varies little once the resolution is greater than 10 × 10.

[Figure: error rate (%) versus resolution (R × R) of the downsampled images, one panel for NicIcon (WD and WI curves) and one for Farsi digits.]
27
(2) Competitive accuracy

NicIcon dataset
Error rate: 4.78% for WD, 8.46% for WI
The dataset creators tested three classifiers on the online data.
Only one of them (DTWB) is more accurate; however, it is slower.

Farsi digits dataset
Error rate: 4.54%
Borji et al. performed extensive empirical tests on this dataset.
Of the twenty reported error rates, the mean was 8.69%.
Only four beat our approach, but they require setting at least six parameters.

28
Evaluation of Scalability - datasets

We made 8 synthetic petroglyph datasets
Based on 22 classic petroglyphs
Duplicated by 10 volunteers on a tablet
Applied a Random Polynomial Transformation
Containing up to 1,280,000 objects

29
(1) Querying by Content

Leave-one-out one-nearest-neighbor test.
The test was repeated 10 times on each dataset.
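
A leave-one-out one-nearest-neighbor test can be sketched generically as below. The distance function is passed in as a parameter, so the same loop would work with a GHT-based distance; the function name and the one-dimensional toy data are hypothetical.

```python
def leave_one_out_error(items, labels, dist):
    """Error rate of 1-NN when each item is classified by all the others."""
    errors = 0
    for i, item in enumerate(items):
        nearest = min(
            (j for j in range(len(items)) if j != i),
            key=lambda j: dist(item, items[j]),
        )
        errors += labels[nearest] != labels[i]
    return errors / len(items)


# Hypothetical one-dimensional data; the last item sits closer to class "b".
values = [0, 1, 10, 11, 5.6]
classes = ["a", "a", "b", "b", "a"]
print(leave_one_out_error(values, classes, lambda p, q: abs(p - q)))  # 0.2
```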

30
(2) Finding Motifs

A brute force algorithm requires time quadratic in the size of the dataset.
By using the triangle inequality of our distance measure, we only need to calculate a tiny fraction of the exact distances.
Even for the smallest dataset:
- our algorithm is 712 times faster
- we can prune 99.84% of the calculations
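
The slides do not describe the motif-finding algorithm beyond its use of the triangle inequality. The sketch below shows a generic reference-point pruning scheme for finding the closest pair under any metric; it is an illustration of the principle, not the authors' algorithm, and all names are hypothetical.

```python
def closest_pair_with_pruning(items, dist):
    """Find the closest pair under a metric, skipping provably distant pairs.

    Distances to one reference item are computed up front. By the triangle
    inequality, |d(a, r) - d(b, r)| <= d(a, b), so a pair whose reference
    distances differ by at least the best distance found so far is skipped.
    Returns (best distance, pair of indices, number of exact distances).
    """
    ref = items[0]
    ref_d = [dist(x, ref) for x in items]
    order = sorted(range(len(items)), key=ref_d.__getitem__)
    best, best_pair, computed = float("inf"), None, len(items)
    for pos, i in enumerate(order):
        for j in order[pos + 1:]:
            if ref_d[j] - ref_d[i] >= best:
                break  # every later j is at least this far from i
            computed += 1
            d = dist(items[i], items[j])
            if d < best:
                best, best_pair = d, (i, j)
    return best, best_pair, computed


def euclid(p, q):
    """Euclidean distance between two 2-D points."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5


points = [(0, 0), (10, 0), (10.5, 0), (3, 4), (20, 20)]
best, pair, computed = closest_pair_with_pruning(points, euclid)
print(best, pair)  # 0.5 (1, 2)
```

Sorting by distance to the reference lets the inner loop stop at the first pruned pair, since every later pair has an even larger lower bound.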

31
Conclusion

In this work we considered, for the first time,
the problem of mining large collections of rock
art.
Introduced a novel distance measure
Found an efficiently computable tight lower bound
to this measure
Enabled mining large data archives effectively

32
All datasets and the code can be downloaded from http://www.cs.ucr.edu/qzhu/petro.html
Thanks for listening!
