Vector Error Diffusion

About This Presentation

Title:

Vector Error Diffusion

Description:

Preferably use properties of the human visual system. Trade-offs in image hashing ... Cells in the visual cortex that help in object recognition ... – PowerPoint PPT presentation

Slides: 66

Provided by: niranjanda

Learn more at: http://signal.ece.utexas.edu

Transcript and Presenter's Notes

Title: Vector Error Diffusion

1
Perceptually Based Methods for Robust Image Hashing
Vishal Monga
Committee Members: Prof. Ross Baldick, Prof. Brian L. Evans (Advisor), Prof. Wilson S. Geisler, Prof. Joydeep Ghosh, Prof. John E. Gilbert, Prof. Sriram Vishwanath
Ph.D. Qualifying Exam, Communications, Networks, and Systems Area
Dept. of Electrical and Computer Engineering, The University of Texas at Austin
April 14th, 2004
2
Outline

Introduction
Related work
Digital signature techniques for image
authentication
Robust feature extraction from images
Open research issues
Expected contributions
Framework for robust image hashing using feature
points
Clustering algorithms for feature vector
compression
Image authentication under geometric attacks via
structure matching
Conclusion

3
Hash Example
Introduction

Hash function: projects a value from a set with a large (possibly infinite) number of members to a set with a fixed number of (fewer) members in an irreversible manner
Provides a short, simple representation of a large digital message
Hash scheme: sum of the ASCII codes of the characters in a name, computed modulo N (= 7), a prime number

Database name search example
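
A minimal sketch of the hash scheme on this slide, assuming N = 7 as stated. The function name `name_hash` and the sample names are illustrative and do not come from the slides.

```python
def name_hash(name: str, modulus: int = 7) -> int:
    """Sum of the ASCII codes of the characters in `name`, modulo a prime."""
    return sum(ord(ch) for ch in name) % modulus


# Hypothetical names, grouped by hash value as a database lookup table would.
buckets = {}
for name in ["Ada", "Grace", "Alan"]:
    buckets.setdefault(name_hash(name), []).append(name)
```

Many different names share each of the 7 possible values, which is the irreversibility the slide refers to.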
4
Introduction
Image Hashing Motivation

Hash functions
Fixed length binary string extracted from a
message
Used in compilers, database searching,
cryptography
Cryptographic hash: security applications, e.g. message authentication, ensuring data integrity
Traditional cryptographic hash
Not suited for multimedia → very sensitive to input, i.e. a change in one input bit changes the output dramatically
Need for robust perceptual image hashing
Perceptual: based on human visual system response
Robust: hash values for perceptually identical images must be the same (with a high probability)
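
The sensitivity of a cryptographic hash can be seen in a small experiment. The sketch below uses SHA-256 from Python's standard library as a stand-in for "a traditional cryptographic hash"; the byte string is a made-up placeholder for image data.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


message = bytearray(b"pixel data standing in for an image")
flipped = bytearray(message)
flipped[0] ^= 0x01  # change exactly one input bit

digest_a = hashlib.sha256(bytes(message)).digest()
digest_b = hashlib.sha256(bytes(flipped)).digest()
changed_bits = bit_difference(digest_a, digest_b)  # out of 256 output bits
```

A perceptual hash is meant to behave the opposite way for changes of this size: the two inputs would be perceptually identical and should hash to the same value.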

5
Introduction
Image Hashing Motivation

Applications
Image database search and indexing
Content dependent key generation for watermarking
Robust image authentication: hash must tolerate incidental modifications yet be sensitive to content changes

[Figure: the original image and its JPEG-compressed version yield the same hash value h1; the tampered image yields a different hash value h2]
6
Introduction
Perceptual Hash Desirable Properties

Perceptual robustness
Fragility to distinct inputs
Randomization
Necessary in security applications
to minimize vulnerability against
malicious attacks

7
Outline

Introduction
Related work
Digital signature techniques for image
authentication
Robust feature extraction from images
Open research issues
Expected contributions
Framework for robust image hashing using feature
points
Clustering algorithms for feature vector
compression
Image authentication under geometric attacks via
structure matching
Conclusion

8
Related Work
Content Based Digital Signatures

Goal
Authenticate image based on extracted signature
Image statistics based on
Intensity histograms of image blocks [Schneider et al., 1996]
Mean, variance and kurtosis of intensity values extracted from image blocks, compared to the statistics of a reference image [Kailasanathan et al., 2001]
Drawbacks
Easy to modify the image without altering its intensity histogram → scheme is less secure
Intensity statistics can be altered easily without significantly changing the image appearance

9
Related Work
Content Based Digital Signatures

Feature point based methods
Wavelet based corner detection [Bhattacharjee et al., 1998]
Canny edge detection [Dittmann et al., 1999]
Apply public key encryption on the features to arrive at the digital signature
Relation based methods [Lin & Chang, 2001]
Invariant relationship between discrete cosine transform (DCT) coefficients of two different blocks
Common characteristic of the above methods
Work well for some attacks, e.g. JPEG compression
Still sensitive to several incidental modifications that do not alter the image appearance

10
Related Work
Robust Image Hashing Method 1

Image statistics vector from wavelet decomposition of image [Venkatesan et al., 2000]
Averages of wavelet coefficients in coarse sub-bands and variances in other sub-bands

[Diagram: wavelet sub-bands (coarse details, horizontal, vertical and diagonal frequencies) → extract statistics vector and quantize → error correction decoding → hash value]
11
Related Work
Robust Image Hashing Method 2

Preserve magnitude of low frequency DCT coefficients [Fridrich et al., 2001]
Survives JPEG compression, linear filtering attacks
Very sensitive to geometric distortions (local and global)
Randomize using a secret key K
Generate N random smooth patterns P(i), i = 1, ..., N
Take vectorized dot product of low frequency DCT coefficients (in block B) with random patterns and use threshold Th to obtain N bits bi
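
The key-dependent bit extraction above can be sketched as follows. This is an illustration under stated assumptions, not the published method: the 3×3 box blur stands in for whatever smoothing makes the patterns "smooth", the patterns are made zero-mean, the block is assumed square, and the names `smooth_pattern` and `keyed_bits` are hypothetical.

```python
import random
from typing import List, Sequence

Matrix = List[List[float]]


def smooth_pattern(rng: random.Random, size: int) -> Matrix:
    """Uniform noise, 3x3 box blur with clamped edges, then mean removal."""
    noise = [[rng.random() for _ in range(size)] for _ in range(size)]

    def at(r: int, c: int) -> float:
        return noise[min(max(r, 0), size - 1)][min(max(c, 0), size - 1)]

    blurred = [[sum(at(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)) / 9.0
                for c in range(size)] for r in range(size)]
    mean = sum(map(sum, blurred)) / (size * size)
    return [[v - mean for v in row] for row in blurred]


def keyed_bits(block: Sequence[Sequence[float]], key: int,
               n_bits: int, threshold: float) -> List[int]:
    """One bit per pattern: 1 if |<block, pattern>| >= threshold, else 0."""
    rng = random.Random(key)  # the secret key K seeds the pattern generator
    size = len(block)         # assumption: square block of low-frequency coefficients
    bits = []
    for _ in range(n_bits):
        pattern = smooth_pattern(rng, size)
        projection = sum(pattern[r][c] * block[r][c]
                         for r in range(size) for c in range(size))
        bits.append(1 if abs(projection) >= threshold else 0)
    return bits
```

Without the key an attacker cannot regenerate the patterns, which is the point of the randomization step.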

12
Related Work
Robust Image Hashing Method 3
DC sub-band

Invariance of coarse wavelet coefficients [Mihcak et al., 2001]
Key observation
Main geometric features of image stay
invariant under small perturbations to image
Hash algorithm
Threshold wavelet coefficients of DC sub-band
(coarse robust features) to obtain a binary
matrix
Perform filtering and re-thresholding to
iteratively arrive at binary map which is then
used as the hash
Iterative procedure is designed so as to preserve
significant image geometry

[Figure: 3-level Haar wavelet decomposition]
13
Related Work
Robust Digital Signature Method 4

Interscale relationship of wavelet coefficients [Lu & Liao, 2003]
Magnitude difference between a parent node and its four child nodes is difficult to destroy (alter) under content-preserving manipulations
s: wavelet scale, o: orientation, 0 ≤ i, j ≤ 1

2-D wavelet decomposition tree: parent w0,0(x, y) with children w1,0(2x, 2y), w1,1(2x+1, 2y), w1,2(2x, 2y+1), w1,3(2x+1, 2y+1)
14
Open Issues
Related Work
Contribution 1

A robust feature point scheme for hashing
Inherent sensitivity to content-changing
manipulations e.g. could be useful in
authentication
Representation of image content robust to both
global and local geometric distortions
Preferably use properties of the human visual
system
Trade-offs in image hashing
Robustness vs. Fragility, Randomness
Question: what is the minimum length of the final hash value (binary string) needed to meet the above goals?
Randomized algorithms for secure image hashing

[Slide callouts map these open issues to Contributions 1, 2 and 3]
15
Outline

Introduction
Related Work
Digital signature techniques for Image
Authentication
Robust feature extraction from Images
Open research issues
Expected contributions
Framework for robust image hashing using feature
points
Clustering algorithms for feature vector
compression
Image authentication under geometric attacks via
structure matching
Conclusion

16
Hashing Framework
Expected Contribution 1

Proposed two-stage hash algorithm

[Diagram: input image I → feature extraction → compression → final hash]

Feature vectors extracted from perceptually identical images must be close in a distance metric

17
Hypercomplex or End-stopped cells
End-stopping and Image Geometry [Dobbins, 1989]

Cells in the visual cortex that help in object recognition
Respond strongly to line end-points, corners and points of high curvature [Hubel et al., 1965; Dobbins, 1989]

Develop filters/kernels that capture this behavior
To maintain robustness to changes in image resolution, a wavelet-based approach is needed

18
End-Stopped Wavelet Basis

Morlet wavelets [Antoine et al., 1996]
To detect linear (or curvilinear) structures having a specific orientation
End-stopped wavelet [Vandergheynst et al., 2000]
Apply First Derivative of Gaussian (FDoG) operator to detect end-points of structures identified by Morlet wavelet

x = (x, y): 2-D spatial coordinates; k0 = (k0, k1): wave-vector of the mother wavelet; orientation control -
19
End-Stopped Wavelets: Example

Morlet Wavelet along the u-axis
Detects vertically oriented linear structures
FDoG operator along frequency axis v
Applied on the Morlet wavelet to detect
end-points and corners

[Figure panels: synthetic L-shaped image; response of Morlet wavelet, orientation 0 degrees; response of the end-stopped wavelet]
20
Expected Contribution 1
Computing Wavelet Transform

Generalize end-stopped wavelet
Employ the wavelet family
Scale parameter = 2, i = scale of the wavelet
Discretize orientation range [0, π] into M intervals, i.e.
θ_k = kπ/M, k = 0, 1, ..., M − 1
Finally, the wavelet transform is given by

21
Expected Contribution 1
Proposed Feature Detection Method [Monga & Evans, 2004]

Compute wavelet transform at suitably chosen scale i for several different orientations
Significant feature selection: locations (x, y) in the image that are identified as candidate feature points satisfy
Avoid trivial (and fragile) features: qualify a location as a final feature point if

Randomization: partition the image into N random regions using a secret key K, extract features from each random region
Probabilistic quantization: quantize feature vector based on distribution (histogram) of image feature points to enhance robustness

22
Expected Contribution 1
Iterative Feature Extraction Algorithm [Monga & Evans, 2004]

1. Extract feature vector f of length P from image I, quantize f probabilistically to obtain a binary string bf1 (increase count)
2. Remove weak image geometry: compute 2-D order statistics (OS) filtering of I to produce Ios = OS(I; p, q, r)
3. Preserve strong image geometry: perform low-pass linear shift invariant (LSI) filtering on Ios to obtain Ilp
4. Repeat step 1 with Ilp to obtain bf2
5. IF (count = MaxIter) go to step 6.
ELSE IF D(bf1, bf2) < ρ go to step 6.
ELSE set I = Ilp and go to step 1.
6. Set fv(I) = bf2

MaxIter, ρ and P are algorithm parameters. count = 0 to begin with. fv(I) denotes the quantized feature vector. D(., .) is the normalized Hamming distance between its arguments.
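
The control flow of steps 1 to 6 can be sketched as below. The feature extractor and the two filters are passed in as callables because the slides do not specify them in code form; the names `extract_bits`, `os_filter` and `lsi_filter` are placeholders, and the defaults for `max_iter` and `rho` are the values listed on the algorithm-parameters backup slide.

```python
from typing import Callable, List, Sequence, TypeVar

Img = TypeVar("Img")


def normalized_hamming(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of positions in which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def iterative_feature_extraction(
    image: Img,
    extract_bits: Callable[[Img], List[int]],  # step 1: extract + quantize
    os_filter: Callable[[Img], Img],           # step 2: order statistics filter
    lsi_filter: Callable[[Img], Img],          # step 3: low-pass LSI filter
    max_iter: int = 20,
    rho: float = 0.001,
) -> List[int]:
    count = 0
    while True:
        b1 = extract_bits(image)                 # step 1
        count += 1
        filtered = lsi_filter(os_filter(image))  # steps 2 and 3
        b2 = extract_bits(filtered)              # step 4
        # Step 5: stop at the iteration cap or once the bits have stabilized.
        if count >= max_iter or normalized_hamming(b1, b2) < rho:
            return b2                            # step 6
        image = filtered
```

The loop stops when filtering no longer changes the quantized features, which is the sense in which only "strong" image geometry survives.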
23
Expected Contribution 1
Preliminary Results Feature Extraction
[Figure: image features at algorithm convergence for the original image, JPEG (QF = 10), and AWGN (σ = 20)]
24
Expected Contribution 1
Preliminary Results Feature Extraction

Quantized Feature Vector Comparison
D(fv(I), fv(Isim)) < 0.2
D(fv(I), fv(Idiff)) > 0.3

Table 1. Comparison of quantized feature vectors: normalized Hamming distance between quantized feature vectors of original and attacked images. Attacked images generated by Stirmark benchmark software
25
Expected Contribution 1
Preliminary Results Feature Extraction
YES → survives attack, i.e. hash was invariant. Content-changing manipulations should be detected
26
Expected Contribution 1
Highlights

Framework for image hashing using feature points
Two stage hash algorithm
Any visually robust feature point detector is a
good candidate to be used with the iterative
algorithm
Trade-offs facilitated
Robustness vs. fragility: select feature points such that
T1, T2 large enough ensures that features are retained in several attacked versions of the image, else removed easily
Robustness vs. randomization: number of random regions
As long as N < Nmax, robustness is largely preserved; otherwise random regions shrink to the extent that they do not contain significant chunks of image geometry

27
Expected Contribution 2
Feature Vector Compression

Goals in compressing to a final hash value
Cancel small perturbations between feature
vectors of perceptually identical images
Maintain fragility to distinct inputs
Retain and/or enhance randomness properties for
secure hashing
Problem statement: retain perceptual significance
Let (li, lj) denote vectors in the metric space of feature vectors V and 0 < ε < δ, then it is desired
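
The two conditions themselves did not survive in this transcript. A reading consistent with the goals listed above and with the later references to conditions (1) and (2), offered as an assumption rather than a quotation, is the following, where C(l) denotes the cluster (final hash value) that vector l is mapped to:

```latex
\begin{align}
D(l_i, l_j) < \epsilon \;&\Rightarrow\; C(l_i) = C(l_j) \tag{1} \\
D(l_i, l_j) > \delta \;&\Rightarrow\; C(l_i) \neq C(l_j) \tag{2}
\end{align}
```

Condition (1) expresses robustness (close vectors share a hash) and condition (2) expresses fragility (distant vectors do not).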

28
Expected Contribution 2
Possible Solutions

Error correction decoding [Venkatesan et al., 2000]
Applicable to binary feature vectors
Break the vector down to segments close to the length of codewords in a suitably chosen error-correcting code
More generally: vector quantization/clustering
Minimize an average distance to achieve compression close to the rate distortion limit
P(l): probability of occurrence of vector l; D(., .): distance metric defined on the feature vectors
ck: codewords/cluster centers; Sk: kth cluster

29
Expected Contribution 2
Is Average Distance the Appropriate Cost for the
Hashing Application?

Problems with average distance VQ
No guarantee that perceptually distinct feature vectors indeed map to different clusters; no straightforward way to trade off between the two goals
Must decide number of codebook vectors in advance
Must penalize some errors harshly, e.g. if vectors that are really close are not clustered together, or vectors very far apart are compressed to the same final hash value
Define alternate cost function for hashing
Develop clustering algorithm that tries to
minimize that cost

30
Expected Contribution 2
Cost Function for Feature Vector Compression

Define joint cost matrices C1 and C2 (n × n)
n: total number of vectors to be clustered; C(li), C(lj) denote the clusters that these vectors are mapped to
Exponential cost
Ensures that a severe penalty is associated if feature vectors far apart, and hence perceptually distinct, are clustered together

α > 0, Γ > 1 are algorithm parameters
31
Expected Contribution 2
Cost Function for Feature Vector Compression

Further define S1 as
S2 is defined similarly
Normalize to get the normalized cost matrices
Then, minimize the expected cost
p(i) = p(li), p(j) = p(lj)

32
Expected Contribution 3
Image Authentication Under Geometric Attacks

Basic premise
Feature points of a reference image and a
geometrically attacked image are related by a
suitable transformation
Affine transformation models the geometric distortion: y = Rx + t
x = (x1, x2), y = (y1, y2); R: 2 × 2 matrix, t: 2 × 1 vector
Hausdorff distance to compare feature points from two images [Atallah, 1983; Rote, 1991]
Used in computer vision for locating objects in
an image
Relatively insensitive to perturbations in
feature points, can tolerate errors due to
occlusion or feature detector failure
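
As a small illustration of the affine model, the sketch below maps a list of feature points through y = Rx + t. The function name `apply_affine` and the sample rotation are illustrative choices, not values from the slides.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def apply_affine(points: Sequence[Point],
                 R: Sequence[Sequence[float]],
                 t: Sequence[float]) -> List[Point]:
    """Map each point x = (x1, x2) to y = R x + t, with R 2x2 and t of length 2."""
    return [(R[0][0] * x1 + R[0][1] * x2 + t[0],
             R[1][0] * x1 + R[1][1] * x2 + t[1])
            for x1, x2 in points]


# Hypothetical attack: rotate by 90 degrees, then shift by (1, 1).
rotation_90 = [[0.0, -1.0], [1.0, 0.0]]
moved = apply_affine([(1.0, 0.0), (0.0, 0.0)], rotation_90, (1.0, 1.0))
```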

33
Expected Contribution 3
Image Authentication Under Geometric Attacks

Hausdorff distance between point sets A and B
A = {a1, ..., ap} and B = {b1, ..., bq}
where
Measures degree of mismatch between two sets
Employ structure matching algorithms [Huttenlocher et al., 1993; Rucklidge, 1995]
To determine G such that
Here, fr and fc denote feature point sets from
reference and candidate image to be authenticated
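
The formula on this slide was lost in extraction. The sketch below uses the standard definition of the Hausdorff distance, H(A, B) = max(h(A, B), h(B, A)) with h(A, B) = max over a in A of min over b in B of ‖a − b‖, and assumes the Euclidean norm. It is a brute-force illustration for small point sets, not the structure matching algorithm cited above.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def directed_hausdorff(A: Sequence[Point], B: Sequence[Point]) -> float:
    """h(A, B): the largest distance from a point of A to its nearest point in B."""
    return max(min(math.dist(a, b) for b in B) for a in A)


def hausdorff(A: Sequence[Point], B: Sequence[Point]) -> float:
    """H(A, B) = max(h(A, B), h(B, A))."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))


reference = [(0.0, 0.0), (1.0, 0.0)]
candidate = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]  # one extra, spurious feature point
mismatch = hausdorff(reference, candidate)
```

Here h(reference, candidate) is 0 while h(candidate, reference) is 3, so the asymmetric, directed distance is what tells you which set contains the unmatched point.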

34
Conclusion
Conclusion and Future Work

Feature point based hashing framework
Iterative feature detector that preserves
significant image geometry, features invariant
under several attacks
Trade-offs facilitated between hash algorithm
goals
Algorithms for feature vector compression
Novel cost function for the hashing application
Heuristic clustering algorithm(s) to minimize
this cost
Randomized clustering for secure hashing
Image authentication under geometric attacks
Affine transformation to model geometric
distortions
Hausdorff distance and structure matching
algorithms to determine affine transformation and
authenticate

35
Conclusion
Proposed Schedule
36
Backup Slides
37
Hash Illustrative Example
Introduction

Parsing in compiling a program
Variable names kept in a data structure
Array of pointers, each pointer points to a
linked list
Index into the array is a hash value
Example variable name: university
Hashing scheme: sum of the ASCII codes of the characters in a variable name, computed modulo N, a prime number
Check linked list at array index, add string to linked list if it had not been previously parsed
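
A minimal sketch of the symbol table described above. It assumes N = 7 (the value used in the earlier name-search example) and uses Python lists in place of linked lists; the class name `SymbolTable` is illustrative.

```python
from typing import List


class SymbolTable:
    """Array of chains indexed by (sum of ASCII codes) mod N."""

    def __init__(self, n_buckets: int = 7) -> None:
        self.buckets: List[List[str]] = [[] for _ in range(n_buckets)]

    def _index(self, name: str) -> int:
        return sum(ord(ch) for ch in name) % len(self.buckets)

    def add(self, name: str) -> bool:
        """Insert `name` if unseen; return True only when it was added."""
        chain = self.buckets[self._index(name)]
        if name in chain:
            return False
        chain.append(name)
        return True


table = SymbolTable()
first_time = table.add("university")  # parsed for the first time
second_time = table.add("university")  # already in the chain
```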

38
Expected Contribution 1
End-Stopped Wavelets: Example

Morlet wavelet along the u-axis
FDoG operator along frequency axis v

[Figure panels (spatial domain, frequency domain): synthetic L-shaped image; response of Morlet wavelet, orientation 0 degrees; response of the end-stopped wavelet]
39
Feature Detection
Content Changing Manipulations
Original image
Maliciously manipulated image
40
Algorithm Parameters
Results

Image Conditioning
All images resized to 512 x 512 via triangular
interpolation prior to feature extraction
Intensity planes of color images were used
Pixel neighborhood
Circular to detect isotropic features
Radius of 5 pixels
Iterative Feature Extraction
Wavelet scale i = 3
MaxIter = 20, ρ = 0.001, P = 128
LSI filter: zero-phase low-pass filter (11 × 11) designed using McClellan transformations
Order statistics filtering: median with 5 × 5 window
41
Feature Detection
Experimental Results
90 degree rotation
AWGN, σ = 20
42
Expected Contribution 1
Trade-offs

Perceptual robustness vs. fragility
Size of the search neighborhood large → feature points are more robust
Select feature points such that
T1, T2 large enough implies features are retained in several attacked versions of the image, else removed easily
Robustness vs. Randomization
As long as N < Nmax, robustness is largely retained; otherwise random regions shrink to the extent that they do not contain significant chunks of image geometry
43
Digital Signature Techniques
Relation Based Scheme DCT coefficients

Discrete Cosine Transform (DCT)
Typically employed on 8 x 8 blocks
Digital Signature by Lin
Fp, Fq, DCT coefficients at the same positions in
two different 8 x 8 blocks
, DCT coefficients in the compressed
image

[Figure: two 8 × 8 blocks p and q within an N × N image]
44
Wavelet Decomposition
Multi-Resolution Approximations
45
46
Wavelet Decomposition
Examples of Perceptually Identical Images
Original Image
Contrast Enhanced
JPEG, QF = 10
10% cropping
3 degree rotation
2 degree rotation
47
Expected Contribution 1
Iterative Hash Algorithm
[Flowchart: input image → extract feature vector → probabilistic quantization (b1); order statistics filtering → linear shift invariant low-pass filtering → extract feature vector → probabilistic quantization (b2); iterate until D(b1, b2) < ρ]
48
Quantization
Probabilistic Quantization

Feature Vector
fmn m Hn
Quantization Scheme
L quantization levels
Design quantization bins [l(i-1), l(i)) such that
Quantization Rule
49
Feature Detection
Feature Vector Extraction

Randomization
Partition the image into N regions using k-means segmentation; extract feature points from each region
Secret key K is used to generate initial guesses for the clusters (centroids of random regions)
Avoid very small regions since they would not yield robust image features
50
Expected Contribution 1
Preliminary Results
Table 1. Comparison of quantized feature vectors
Normalized Hamming distance between quantized
feature vectors of original and attacked images
51
Clustering Algorithms
Minimizing the Cost

Decision Version of the Clustering Problem
For a fixed number of clusters k, is there a
clustering with cost less than a constant?
Shown to be NP-complete via a reduction from the k-way graph cut problem [Monga et al., 2004]
Polynomial time greedy heuristic to solve the problem
Select cluster centers based on probability mass of vectors in V; minimize error probabilities in a rigorous sense
Trade-offs Exclusive minimization of
would compromise and vice-versa
Basic algorithm with variations to facilitate
trade-offs

52
Clustering Algorithms
Basic Clustering Algorithm

1. Obtain ε, δ, set k = 1. Select the data point associated with the highest probability mass, label it l1
2. Make the first cluster by including all unclustered points lj such that D(l1, lj) < ε/2
3. k = k + 1. Select the highest probability data point lk amongst the unclustered points such that
where S is any cluster, C is the set of clusters formed till this step and
4. Form the kth cluster Sk by including all unclustered points lj such that D(lk, lj) < ε/2
5. Repeat steps 3-4 till no more clusters can be formed
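
The greedy procedure can be sketched as follows. The selection condition in step 3 is not preserved in this transcript, so the sketch ASSUMES that a new center must lie at least 3ε/2 from every point already clustered; by the triangle inequality this keeps clusters more than ε apart, but it is a guess at the lost condition and not a quotation. The function name and the toy data are illustrative.

```python
from typing import Callable, List, Sequence, Set, Tuple, TypeVar

T = TypeVar("T")


def basic_clustering(
    points: Sequence[T],
    probs: Sequence[float],
    dist: Callable[[T, T], float],
    eps: float,
) -> Tuple[List[List[int]], List[int]]:
    """Return (clusters, leftover) as lists of indices into `points`."""
    clusters: List[List[int]] = []
    clustered: Set[int] = set()
    # Visiting points by decreasing probability picks the same centers as
    # repeatedly choosing the most probable eligible point (steps 1 and 3),
    # because a point that fails the test below can never pass it later.
    for center in sorted(range(len(points)), key=lambda i: -probs[i]):
        if center in clustered:
            continue
        # ASSUMED step-3 test: at least 3*eps/2 from every clustered point.
        if any(dist(points[center], points[j]) < 1.5 * eps for j in clustered):
            continue
        # Steps 2 and 4: gather unclustered points within eps/2 of the center.
        members = [j for j in range(len(points))
                   if j not in clustered and dist(points[center], points[j]) < eps / 2]
        clusters.append(members)
        clustered.update(members)
    leftover = [i for i in range(len(points)) if i not in clustered]
    return clusters, leftover


# Toy 1-D data with absolute difference as the metric.
pts = [0.0, 0.1, 0.2, 5.0, 5.1, 1.0]
mass = [0.3, 0.1, 0.1, 0.25, 0.15, 0.1]
found, unclustered = basic_clustering(pts, mass, lambda a, b: abs(a - b), eps=1.0)
```

In the toy run the point at 1.0 is too close to the first cluster to seed a new one and too far from its center to join it, so it is left for the second phase that handles unclustered points.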

53
Clustering Algorithms
Visualization of the Clustering Algorithm
54
Clustering Algorithms
Observations

For any (li, lj) in cluster Sk
No errors till this stage of the algorithm
Each cluster is at least ε away from any other cluster and hence there are no errors by violating (1)
Within each cluster the maximum distance between any two points is at most ε, and because 0 < ε < δ there are no errors by violation of (2)
The data points that are left unclustered are at least 3ε/2 away from each of the existing clusters
Next
Two different approaches to handle the
unclustered points

55
Hashing Framework
Expected Contribution 1

Two-stage Hash algorithm

Input Image I
Extract visually robust feature vector
Feature Vectors extracted from perceptually
identical images must be close in a distance
metric
Compress Features
Final Hash Value
56
Clustering Algorithms
Approach 1

1. Select the data point l amongst the unclustered data points that has the highest probability mass
2. For each existing cluster Si, i = 1, 2, ..., k compute
Let S(δ) = {Si such that di ≤ δ}
3. IF S(δ) = ∅ THEN k = k + 1. Sk = {l} is a cluster of its own
ELSE for each Si in S(δ) define
where the complement of Si means all clusters in S(δ) except Si. Then, l is assigned to the cluster S = arg min F(Si)
4. Repeat steps 1 through 3 till all data points are exhausted

57
Clustering Algorithms
Approach 2

1. Select the data point l amongst the unclustered data points that has the highest probability mass
2. For each existing cluster Si, i = 1, 2, ..., k define
and β lies in [1/2, 1]
where the complement of Si means all existing clusters except Si. Then, l is assigned to the cluster S = arg min F(Si)
3. Repeat steps 1 and 2 till all data points are exhausted

58
Clustering Algorithms
Summary

Approach 1
Tries to minimize conditioned on
0
Approach 2
Smoothly trades off the minimization of
vs.
via the parameter β
β = 1/2 → joint minimization
β = 1 → exclusive minimization of
Final hash length determined automatically!
Given by ⌈log2 k⌉ bits, where k is the total number of clusters formed
Proposed clustering can be used to compress feature vectors in any metric space, e.g. Euclidean, Hamming

59
Clustering Algorithms
Randomized Clustering for Secure Hashing

Heuristic for the deterministic map
Select the highest probability data point amongst
the unclustered data points
Randomization Scheme
Normalize the probabilities of the existing
unclustered data points to define a new
probability mass such that
where i runs over unclustered points,
Employ a uniformly distributed random variable in [0, 1] (generated via a secret key) to select the data point i as a cluster center with probability

60
Clustering Algorithms
Randomized Clustering Illustration

Example: s = 1
4 data points with probabilities 0.5, 0.25, 0.125, 0.125
Key Observations
s = 0: the new probability mass is uniform, i.e. any point is selected as the cluster center with the same probability
s → ∞: deterministic clustering

Uniform number generation to select data point
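
The randomized selection can be sketched as below. The normalization formula did not survive in the transcript; the sketch ASSUMES the new mass is p_i^s divided by the sum of p_j^s over unclustered points, which reproduces both key observations (uniform at s = 0, deterministic as s grows). Python's `random.Random` seeded with the key stands in for the keyed uniform number generator, and the function names are illustrative.

```python
import random
from typing import List, Sequence


def tempered_mass(probs: Sequence[float], s: float) -> List[float]:
    """ASSUMED normalization: p_i**s / sum_j p_j**s over the unclustered points."""
    weights = [p ** s for p in probs]
    total = sum(weights)
    return [w / total for w in weights]


def pick_center(probs: Sequence[float], s: float, key: int) -> int:
    """Select an index by comparing a keyed uniform draw with the cumulative mass."""
    u = random.Random(key).random()  # uniform in [0, 1), reproducible from the key
    cumulative = 0.0
    for i, m in enumerate(tempered_mass(probs, s)):
        cumulative += m
        if u < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point rounding


probs = [0.5, 0.25, 0.125, 0.125]  # the four data points from the example
uniform = tempered_mass(probs, 0)
unchanged = tempered_mass(probs, 1)
nearly_deterministic = tempered_mass(probs, 50)
```

With s = 1 the draw follows the original probabilities, so the first point is chosen half of the time.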
61
Clustering Algorithms
Clustering Results

Compress binary feature vector of L = 240 bits
Final hash length 46 bits, with Approach 2, β = 1/2
Average distortion VQ at the same rate
Value of cost function is orders of magnitude
lower for the proposed clustering

62
Clustering Algorithms
Conclusion and Future Work

Perceptual Image Hashing via Feature Points
Extract Feature Points that preserve significant image geometry
Based on properties of the Human Visual System
(HVS)
Robust to local and global geometric distortions
Clustering Algorithms for compression
Randomized to minimize vulnerability against
malicious attacks generated by an adversary
Trade-offs facilitated between robustness and
randomness, fragility
Future Work
Authentication under geometric attacks
Information theoretically secure hashing

63
Perceptual Image Hashing Via Feature Points
Image Hashing Via Feature Points

Feature Points are required to be invariant
across perceptually identical images
Primary geometric features of the image are largely preserved under small perturbations [Mihcak et al., 2001]
i.e. extract significant image geometry preserving feature points
Identify what the human eye perceives as robust or invariant geometric features
Edge based detection is not suited
Has problems with high compression ratios, quantization and scaling [Zheng and Chellappa, 1993]
Human recognition performance does not degrade even when much edge information is lost [Biederman, 1987]

64
End-stopping and image features
ES2 Wavelet

Example Wavelets
SDoG operator on the Morlet wavelet
Wavelet behavior
produces a strong response at the center of any
oriented linear stimuli of a particular length
determined by s

65
Clustering Algorithms
Clustering Dependence on source distribution

Source distributions may be very skewed
Trivial clusters may be formed i.e. with very low
probability points included
For efficient compression, the number of clusters
formed should accurately represent the statistics
of the source
Solution
Consider the algorithm when m clusters are formed, m < k, and i < n points are already clustered
Assign remaining points, i.e. i + 1, ..., n, to the remaining clusters in a fashion similar to the basic algorithm
Compare the expected cost of this clustering vs.
the one with k clusters as formed by the
algorithm described before, if the increase is
not significant terminate with the current number
of clusters