Title: Vector Error Diffusion
1 Perceptually Based Methods for Robust Image Hashing
Vishal Monga
Committee Members: Prof. Ross Baldick, Prof. Brian L. Evans (Advisor), Prof. Wilson S. Geisler, Prof. Joydeep Ghosh, Prof. John E. Gilbert, Prof. Sriram Vishwanath
Ph.D. Qualifying Exam
Communications, Networks, and Systems Area
Dept. of Electrical and Computer Engineering
The University of Texas at Austin
April 14th, 2004
2 Outline
- Introduction
- Related work
  - Digital signature techniques for image authentication
  - Robust feature extraction from images
- Open research issues
- Expected contributions
  - Framework for robust image hashing using feature points
  - Clustering algorithms for feature vector compression
  - Image authentication under geometric attacks via structure matching
- Conclusion
3 Hash Example
Introduction
- Hash function: projects a value from a set with a large (possibly infinite) number of members to a set with a fixed, smaller number of members in an irreversible manner
  - Provides a short, simple representation of a large digital message
- Hash scheme: sum of the ASCII codes of the characters in a name, computed modulo N (= 7), a prime number
Database name search example
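The hash scheme above can be sketched in a few lines of Python. The function name `name_hash` and the list of names are illustrative choices, not taken from the slide; only the rule (sum of ASCII codes modulo N = 7) comes from the slide.

```python
def name_hash(name: str, n: int = 7) -> int:
    """Sum of the ASCII codes of the characters in `name`, modulo n."""
    return sum(ord(ch) for ch in name) % n

# Illustrative names only; any strings would do.
names = ["Ghosh", "Monga", "Baldick", "Evans", "Geisler", "Gilbert"]

# Group names by hash value, as a database index keyed on the hash would.
buckets = {}
for name in names:
    buckets.setdefault(name_hash(name), []).append(name)

for value in sorted(buckets):
    print(value, buckets[value])
```

Names that land in the same bucket collide, which is expected: many names map onto only seven hash values.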
4 Introduction
Image Hashing: Motivation
- Hash functions
  - Fixed-length binary string extracted from a message
  - Used in compilers, database searching, cryptography
  - Cryptographic hash: security applications, e.g. message authentication, ensuring data integrity
- Traditional cryptographic hash
  - Not suited for multimedia: very sensitive to input, i.e. a change in one input bit changes the output dramatically
- Need for robust perceptual image hashing
  - Perceptual: based on human visual system response
  - Robust: hash values for perceptually identical images must be the same (with high probability)
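The sensitivity of a traditional cryptographic hash can be illustrated with a short Python sketch. SHA-256 is used here only as a convenient stand-in; the slides do not name a specific cryptographic hash, and the byte string below is made-up data, not an actual image.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Made-up stand-in for image data.
message = bytearray(b"pixel data of an image" * 4)
flipped = bytearray(message)
flipped[0] ^= 0x01  # flip a single input bit

h1 = hashlib.sha256(bytes(message)).digest()
h2 = hashlib.sha256(bytes(flipped)).digest()
changed = bit_difference(h1, h2)
print(f"{changed} of {len(h1) * 8} output bits changed")
```

A perceptual hash is expected to behave in the opposite way for such a small, visually irrelevant change: the hash value should stay the same with high probability.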
5 Introduction
Image Hashing: Motivation
- Applications
  - Image database search and indexing
  - Content-dependent key generation for watermarking
  - Robust image authentication: hash must tolerate incidental modifications yet be sensitive to content changes
[Figure: original image and JPEG-compressed image give the same hash value h1; tampered image gives a different hash value h2]
6 Introduction
Perceptual Hash: Desirable Properties
- Perceptual robustness
- Fragility to distinct inputs
- Randomization
  - Necessary in security applications to minimize vulnerability against malicious attacks
8 Related Work
Content Based Digital Signatures
- Goal
  - Authenticate image based on extracted signature
- Image statistics based on
  - Intensity histograms of image blocks [Schneider et al., 1996]
  - Mean, variance and kurtosis of intensity values extracted from image blocks, compared to the statistics of a reference image [Kailasanathan et al., 2001]
- Drawbacks
  - Easy to modify the image without altering its intensity histogram, so the scheme is less secure
  - Intensity statistics can be altered easily without significantly changing the image appearance
9 Related Work
Content Based Digital Signatures
- Feature point based methods
  - Wavelet based corner detection [Bhattacharjee et al., 1998]
  - Canny edge detection [Dittmann et al., 1999]
  - Apply public key encryption on the features to arrive at the digital signature
- Relation based methods [Lin & Chang, 2001]
  - Invariant relationship between discrete cosine transform (DCT) coefficients of two different blocks
- Common characteristics of the above methods
  - Work well for some attacks, viz. JPEG compression
  - Still sensitive to several incidental modifications that do not alter the image appearance
10 Related Work
Robust Image Hashing: Method 1
- Image statistics vector from wavelet decomposition of image [Venkatesan et al., 2000]
  - Averages of wavelet coefficients in coarse sub-bands and variances in other sub-bands
[Figure: wavelet decomposition with sub-bands labeled coarse details, horizontal freqs., vertical freqs. and diagonal freqs.; the statistics vector is extracted and quantized (00100101 01110100100111001 001010), and error correction decoding gives the hash value (00100101 011)]
11 Related Work
Robust Image Hashing: Method 2
- Preserve magnitude of low frequency DCT coefficients [Fridrich et al., 2001]
  - Survives JPEG compression, linear filtering attacks
  - Very sensitive to geometric distortions (local and global)
- Randomize using a secret key K
  - Generate N random smooth patterns P(i), i = 1, ..., N
  - Take the vectorized dot product of the low frequency DCT coefficients (in block B) with the random patterns and use a threshold Th to obtain N bits b_i
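The key-dependent projection step of Method 2 can be sketched as below. This is a simplified illustration under stated assumptions, not the published algorithm: the patterns are plain uniform noise rather than smoothed patterns, the bit rule (1 when the magnitude of the projection reaches Th) is an assumed reading of "use threshold Th", and the coefficient values and the name `keyed_bits` are made up.

```python
import random

def keyed_bits(coeffs, key, n_bits, threshold):
    """Project coefficients onto key-dependent random patterns, one bit per pattern."""
    rng = random.Random(key)  # the secret key K seeds the pattern generator
    bits = []
    for _ in range(n_bits):
        # Zero-mean random pattern; NOT smoothed here, unlike the patterns on the slide.
        pattern = [rng.uniform(-1.0, 1.0) for _ in coeffs]
        projection = sum(c * p for c, p in zip(coeffs, pattern))
        # Assumed bit rule: 1 if |projection| >= threshold, else 0.
        bits.append(1 if abs(projection) >= threshold else 0)
    return bits

# Made-up low-frequency DCT values for one block.
block = [52.0, -13.5, 8.25, 4.0, -2.5, 1.0]
print(keyed_bits(block, key=2004, n_bits=8, threshold=10.0))
```

Without the key, the patterns cannot be regenerated, which is what makes the resulting bits hard to predict for an attacker.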
12 Related Work
Robust Image Hashing: Method 3
- Invariance of coarse wavelet coefficients [Mihcak et al., 2001]
- Key observation
  - Main geometric features of an image stay invariant under small perturbations to the image
- Hash algorithm
  - Threshold wavelet coefficients of the DC sub-band (coarse robust features) to obtain a binary matrix
  - Perform filtering and re-thresholding to iteratively arrive at a binary map, which is then used as the hash
  - Iterative procedure is designed to preserve significant image geometry
[Figure: 3-level Haar wavelet decomposition with the DC sub-band marked]
13 Related Work
Robust Digital Signature: Method 4
- Interscale relationship of wavelet coefficients [Lu & Liao, 2003]
  - Magnitude difference between a parent node and its four child nodes is difficult to destroy (alter) under content-preserving manipulations
  - s: wavelet scale, o: orientation, 0 ≤ i, j ≤ 1
[Figure: 2-D wavelet decomposition tree; parent node w_{0,0}(x, y) with child nodes w_{1,0}(2x, 2y), w_{1,1}(2x+1, 2y), w_{1,2}(2x, 2y+1), w_{1,3}(2x+1, 2y+1)]
14 Open Issues
Related Work
- A robust feature point scheme for hashing
  - Inherent sensitivity to content-changing manipulations, e.g. could be useful in authentication
  - Representation of image content robust to both global and local geometric distortions
  - Preferably use properties of the human visual system
- Trade-offs in image hashing
  - Robustness vs. fragility, randomness
  - Question: what is the minimum length of the final hash value (binary string) needed to meet the above goals?
- Randomized algorithms for secure image hashing
[Slide callouts: Contribution 1, Contribution 2, Contribution 3]
16 Hashing Framework
Expected Contribution 1
- Proposed two-stage hash algorithm
[Figure: two-stage pipeline from Input Image I through Compression to Final Hash]
- Feature vectors extracted from perceptually identical images must be close in a distance metric
17 Hypercomplex or End-stopped Cells
End-stopping and Image Geometry [Dobbins, 1989]
- Cells in the visual cortex that help in object recognition
- Respond strongly to line end-points, corners and points of high curvature [Hubel et al., 1965; Dobbins, 1989]
- Develop filters/kernels that capture this behavior
- To maintain robustness to changes in image resolution, a wavelet based approach is needed
18 End-Stopped Wavelet Basis
- Morlet wavelets [Antoine et al., 1996]
  - To detect linear (or curvilinear) structures having a specific orientation
- End-stopped wavelet [Vandergheynst et al., 2000]
  - Apply First Derivative of Gaussian (FDoG) operator to detect end-points of structures identified by the Morlet wavelet
x = (x, y): 2-D spatial co-ordinates; k0 = (k0, k1): wave-vector of the mother wavelet (orientation control)
19 End-Stopped Wavelets: Example
- Morlet wavelet along the u-axis
  - Detects vertically oriented linear structures
- FDoG operator along frequency axis v
  - Applied on the Morlet wavelet to detect end-points and corners
[Figure: synthetic L-shaped image; response of Morlet wavelet, orientation 0 degrees; response of the end-stopped wavelet]
20 Expected Contribution 1
Computing Wavelet Transform
- Generalize end-stopped wavelet
  - Employ the wavelet family
  - Scale parameter: 2; i: scale of the wavelet
  - Discretize orientation range [0, π] into M intervals, i.e. θ_k = kπ/M, k = 0, 1, ..., M − 1
- Finally, the wavelet transform is given by
21 Expected Contribution 1
Proposed Feature Detection Method [Monga & Evans, 2004]
- Compute wavelet transform at a suitably chosen scale i for several different orientations
- Significant feature selection: locations (x, y) in the image that are identified as candidate feature points satisfy
- Avoid trivial (and fragile) features: qualify a location as a final feature point if
- Randomization: partition the image into N random regions using a secret key K, extract features from each random region
- Probabilistic quantization: quantize feature vector based on distribution (histogram) of image feature points to enhance robustness
22 Expected Contribution 1
Iterative Feature Extraction Algorithm [Monga & Evans, 2004]
- 1. Extract feature vector f of length P from image I, quantize f probabilistically to obtain a binary string bf1 (increase count)
- 2. Remove weak image geometry: compute 2-D order statistics (OS) filtering of I to produce Ios = OS(I; p, q, r)
- 3. Preserve strong image geometry: perform low-pass linear shift invariant (LSI) filtering on Ios to obtain Ilp
- 4. Repeat step 1 with Ilp to obtain bf2
- 5. IF (count = MaxIter) go to step 6.
  - ELSE IF D(bf1, bf2) < ρ go to step 6.
  - ELSE set I = Ilp and go to step 1.
- 6. Set fv(I) = bf2
MaxIter, ρ and P are algorithm parameters; count = 0 to begin with. fv(I) denotes the quantized feature vector; D(., .) is the normalized Hamming distance between its arguments.
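The control flow of the six steps can be sketched in Python as follows. This is a skeleton only: the feature extractor and the two filters are passed in as callables (`extract_bits`, `os_filter`, `lsi_filter` are hypothetical names), since the wavelet-based detector and the filter designs are not reproduced here. The threshold is called `rho` in the sketch.

```python
def normalized_hamming(a, b):
    """Fraction of positions in which two equal-length bit sequences differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def iterative_features(image, extract_bits, os_filter, lsi_filter,
                       max_iter=20, rho=0.001):
    """Repeat extract -> OS filter -> LSI filter until the bits stabilize."""
    count = 0
    while True:
        bf1 = extract_bits(image)                  # step 1
        count += 1
        smoothed = lsi_filter(os_filter(image))    # steps 2 and 3
        bf2 = extract_bits(smoothed)               # step 4
        # Step 5: stop at the iteration limit or when the bits barely change.
        if count == max_iter or normalized_hamming(bf1, bf2) < rho:
            return bf2                             # step 6
        image = smoothed
```

Passing the stages as functions keeps the loop independent of any particular detector, in line with the remark that any visually robust feature point detector could be used with the iterative algorithm.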
23 Expected Contribution 1
Preliminary Results: Feature Extraction
[Figure: image features at algorithm convergence for the original image, JPEG with QF = 10, and AWGN with σ = 20]
24 Expected Contribution 1
Preliminary Results: Feature Extraction
- Quantized feature vector comparison
  - D(fv(I), fv(Isim)) < 0.2
  - D(fv(I), fv(Idiff)) > 0.3
Table 1. Comparison of quantized feature vectors: normalized Hamming distance between quantized feature vectors of original and attacked images. Attacked images were generated by the Stirmark benchmark software.
25 Expected Contribution 1
Preliminary Results: Feature Extraction
YES: survives attack, i.e. hash was invariant. Content-changing manipulations should be detected.
26 Expected Contribution 1
Highlights
- Framework for image hashing using feature points
  - Two-stage hash algorithm
  - Any visually robust feature point detector is a good candidate to be used with the iterative algorithm
- Trade-offs facilitated
  - Robustness vs. fragility: select feature points such that
    - T1, T2 large enough ensures that features are retained in several attacked versions of the image, else removed easily
  - Robustness vs. randomization: number of random regions
    - As long as N < Nmax, robustness is largely preserved; otherwise random regions shrink to the extent that they do not contain significant chunks of image geometry
27 Expected Contribution 2
Feature Vector Compression
- Goals in compressing to a final hash value
  - Cancel small perturbations between feature vectors of perceptually identical images
  - Maintain fragility to distinct inputs
  - Retain and/or enhance randomness properties for secure hashing
- Problem statement: retain perceptual significance
  - Let (li, lj) denote vectors in the metric space of feature vectors V and 0 < ε < δ; then it is desired that
    - (1) if D(li, lj) < ε then C(li) = C(lj)
    - (2) if D(li, lj) ≥ δ then C(li) ≠ C(lj)
  - Here C(l) denotes the cluster (final hash value) that l is mapped to
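The problem statement can be made concrete with a small checker. It assumes the two requirements are: vectors closer than ε must share a cluster, and vectors at least δ apart must not. The function names, the toy vectors and the thresholds below are illustrative only.

```python
from itertools import combinations

def hamming(a, b):
    """Normalized Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def count_violations(vectors, labels, eps, delta, dist=hamming):
    """Return (close pairs that were split, far pairs that were merged)."""
    split_close = 0
    merged_far = 0
    for i, j in combinations(range(len(vectors)), 2):
        d = dist(vectors[i], vectors[j])
        same = labels[i] == labels[j]
        if d < eps and not same:
            split_close += 1      # perceptually identical, different hash
        elif d >= delta and same:
            merged_far += 1       # perceptually distinct, same hash
    return split_close, merged_far

vectors = [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1)]
print(count_violations(vectors, [0, 0, 1], eps=0.3, delta=0.7))
print(count_violations(vectors, [0, 0, 0], eps=0.3, delta=0.7))
```

Pairs whose distance falls between ε and δ are not counted either way, which leaves a clustering algorithm some freedom for those vectors.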
28 Expected Contribution 2
Possible Solutions
- Error correction decoding [Venkatesan et al., 2000]
  - Applicable to binary feature vectors
  - Break the vector down into segments close to the length of codewords in a suitably chosen error-correcting code
- More generally: vector quantization/clustering
  - Minimize an average distance to achieve compression close to the rate distortion limit
  - P(l): probability of occurrence of vector l; D(., .): distance metric defined on the feature vectors
  - ck: codewords/cluster centers; Sk: kth cluster
29 Expected Contribution 2
Is Average Distance the Appropriate Cost for the Hashing Application?
- Problems with average distance VQ
  - No guarantee that perceptually distinct feature vectors indeed map to different clusters; no straightforward way to trade off between the two goals
  - Must decide the number of codebook vectors in advance
  - Must penalize some errors harshly, e.g. if vectors that are really close are not clustered together, or vectors very far apart are compressed to the same final hash value
- Define alternate cost function for hashing
  - Develop clustering algorithm that tries to minimize that cost
30 Expected Contribution 2
Cost Function for Feature Vector Compression
- Define joint cost matrices C1 and C2 (n x n)
  - n: total number of vectors to be clustered; C(li), C(lj) denote the clusters that these vectors are mapped to
- Exponential cost
  - Ensures that a severe penalty is incurred if feature vectors that are far apart, and hence perceptually distinct, are clustered together
α > 0, Γ > 1 are algorithm parameters
31 Expected Contribution 2
Cost Function for Feature Vector Compression
- Further define S1 as
- S2 is defined similarly
- Normalize S1 and S2
- Then, minimize the expected cost
  - p(i) = p(li), p(j) = p(lj)
32 Expected Contribution 3
Image Authentication Under Geometric Attacks
- Basic premise
  - Feature points of a reference image and a geometrically attacked image are related by a suitable transformation
  - Affine transformation models the geometric distortion: y = Rx + t
  - x = (x1, x2), y = (y1, y2); R: 2 x 2 matrix, t: 2 x 1 vector
- Hausdorff distance to compare feature points from two images [Atallah, 1983; Rote, 1991]
  - Used in computer vision for locating objects in an image
  - Relatively insensitive to perturbations in feature points, can tolerate errors due to occlusion or feature detector failure
33 Expected Contribution 3
Image Authentication Under Geometric Attacks
- Hausdorff distance between point sets A and B
  - A = {a1, ..., ap} and B = {b1, ..., bq}
  - H(A, B) = max(h(A, B), h(B, A)), where h(A, B) = max_{a ∈ A} min_{b ∈ B} ||a − b||
  - Measures degree of mismatch between two sets
- Employ structure matching algorithms [Huttenlocher et al., 1993; Rucklidge, 1995]
  - To determine G such that
  - Here, fr and fc denote feature point sets from reference and candidate image to be authenticated
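A brute-force Python sketch of the two ingredients named above: an affine map applied to 2-D feature points, and the Hausdorff distance in its standard form (the larger of the two directed max-min distances, using the Euclidean norm). The function names and the sample points are illustrative, and the search for the best transformation performed by structure matching algorithms is not shown.

```python
import math

def apply_affine(points, R, t):
    """Map each point x to R x + t, with R a 2 x 2 matrix and t a 2-vector."""
    return [(R[0][0] * x + R[0][1] * y + t[0],
             R[1][0] * x + R[1][1] * y + t[1]) for x, y in points]

def directed_hausdorff(A, B):
    """Largest distance from a point of A to its nearest point in B."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

reference = [(0.0, 0.0), (1.0, 0.0)]
candidate = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]   # one extra, spurious point
print(hausdorff(reference, candidate))

rotate_90 = [[0.0, -1.0], [1.0, 0.0]]
print(apply_affine(reference, rotate_90, (2.0, 0.0)))
```

The brute-force distance costs O(pq) point comparisons for sets of sizes p and q, which is acceptable for a few hundred feature points.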
34 Conclusion
Conclusion and Future Work
- Feature point based hashing framework
  - Iterative feature detector that preserves significant image geometry; features invariant under several attacks
  - Trade-offs facilitated between hash algorithm goals
- Algorithms for feature vector compression
  - Novel cost function for the hashing application
  - Heuristic clustering algorithm(s) to minimize this cost
  - Randomized clustering for secure hashing
- Image authentication under geometric attacks
  - Affine transformation to model geometric distortions
  - Hausdorff distance and structure matching algorithms to determine affine transformation and authenticate
35 Conclusion
Proposed Schedule
36 Backup Slides
37 Hash: Illustrative Example
Introduction
- Parsing in compiling a program
  - Variable names kept in a data structure
  - Array of pointers, each pointer points to a linked list
  - Index into the array is a hash value
- Example: variable name "university"
  - Hashing scheme: sum of ASCII codes of characters in a variable name, computed modulo N, a prime number
  - Check linked list at array index; add string to linked list if it has not been previously parsed
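The compiler example can be sketched as a small chained hash table. Python lists stand in for the linked lists, N = 7 is an assumed table size (the slide only says N is prime), and the class name `SymbolTable` is hypothetical.

```python
class SymbolTable:
    """Array of chains indexed by a hash of the variable name."""

    def __init__(self, n_buckets=7):
        # One chain per array slot; a Python list stands in for a linked list.
        self.buckets = [[] for _ in range(n_buckets)]

    def index(self, name):
        """Sum of ASCII codes of the characters, modulo the table size."""
        return sum(ord(ch) for ch in name) % len(self.buckets)

    def add(self, name):
        """Add `name` if it has not been parsed before; report whether it was new."""
        chain = self.buckets[self.index(name)]
        if name in chain:
            return False
        chain.append(name)
        return True

table = SymbolTable()
print(table.index("university"), table.add("university"), table.add("university"))
```

Only the chain at one array index is searched on each lookup, so the cost depends on the chain length rather than on the total number of names.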
38 Expected Contribution 1
End-Stopped Wavelets: Example
- Morlet wavelet along the u-axis
- FDoG operator along frequency axis v
[Figure: spatial domain and frequency domain views; synthetic L-shaped image; response of Morlet wavelet, orientation 0 degrees; response of the end-stopped wavelet]
39 Feature Detection
Content Changing Manipulations
[Figure: original image and maliciously manipulated image]
40 Algorithm Parameters
Results
- Image conditioning
  - All images resized to 512 x 512 via triangular interpolation prior to feature extraction
  - Intensity planes of color images were used
- Pixel neighborhood
  - Circular, to detect isotropic features
  - Radius of 5 pixels
- Iterative feature extraction
  - Wavelet scale i = 3
  - MaxIter = 20, ρ = 0.001, P = 128
  - LSI filter: zero-phase low pass filter (11 x 11) designed using McClellan transformations
  - Order statistics filtering: median with 5 x 5 window
41 Feature Detection
Experimental Results
[Figure: detected features under 90 degree rotation and under AWGN with σ = 20]
42 Expected Contribution 1
Trade-offs
- Perceptual robustness vs. fragility
  - Large search neighborhood: feature points are more robust
  - Select feature points such that
  - T1, T2 large enough implies features are retained in several attacked versions of the image, else removed easily
- Robustness vs. randomization
  - As long as N < Nmax, robustness is largely retained; otherwise random regions shrink to the extent that they do not contain significant chunks of image geometry
43 Digital Signature Techniques
Relation Based Scheme: DCT Coefficients
- Discrete Cosine Transform (DCT)
  - Typically employed on 8 x 8 blocks
- Digital signature by Lin
  - Fp, Fq: DCT coefficients at the same positions in two different 8 x 8 blocks
  - Corresponding DCT coefficients in the compressed image
[Figure: N x N image with two 8 x 8 blocks p and q]
44 Wavelet Decomposition
Multi-Resolution Approximations
46 Wavelet Decomposition
Examples of Perceptually Identical Images
[Figure: original image; contrast enhanced; JPEG, QF = 10; 10% cropping; 3 degree rotation; 2 degree rotation]
47 Expected Contribution 1
Iterative Hash Algorithm
[Flowchart: Input Image → Extract Feature Vector → Probabilistic Quantization; Order Statistics Filtering → Linear Shift Invariant Low Pass Filtering → Extract Feature Vector → Probabilistic Quantization; test D(b1, b2) < ρ]
48 Quantization
Probabilistic Quantization
- Feature vector
  - fmn m Hn
- Quantization scheme
  - L quantization levels
  - Design quantization bins [l_{i−1}, l_i) such that
- Quantization rule
49 Feature Detection
Feature Vector Extraction
- Randomization
  - Partition the image into N regions using k-means segmentation; extract feature points from each region
  - Secret key K is used to generate initial guesses for the clusters (centroids of random regions)
  - Avoid very small regions since they would not yield robust image features
50 Expected Contribution 1
Preliminary Results
Table 1. Comparison of quantized feature vectors: normalized Hamming distance between quantized feature vectors of original and attacked images
51 Clustering Algorithms
Minimizing the Cost
- Decision version of the clustering problem
  - For a fixed number of clusters k, is there a clustering with cost less than a constant?
  - Shown to be NP-complete via a reduction from the k-way graph cut problem [Monga et al., 2004]
- Polynomial time greedy heuristic to solve the problem
  - Select cluster centers based on probability mass of vectors in V; minimize error probabilities in a rigorous sense
  - Trade-offs: exclusive minimization of one cost term would compromise the other, and vice versa
  - Basic algorithm with variations to facilitate trade-offs
52 Clustering Algorithms
Basic Clustering Algorithm
- 1. Obtain ε, δ; set k = 1. Select the data point associated with the highest probability mass, label it l1
- 2. Make the first cluster by including all unclustered points lj such that D(l1, lj) < ε/2
- 3. k = k + 1. Select the highest probability data point lk amongst the unclustered points such that
  - where S is any cluster, C is the set of clusters formed till this step and
- 4. Form the kth cluster Sk by including all unclustered points lj such that D(lk, lj) < ε/2
- 5. Repeat steps 3-4 till no more clusters can be formed
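The greedy procedure can be sketched in Python as below. One part is an assumption: the selection condition in step 3 did not survive on the slide, so the sketch requires a new center to be at least 3ε/2 away from every point already clustered, a threshold borrowed from the 3ε/2 figure quoted on the Observations slide. The function name and the toy 1-D data are illustrative.

```python
def basic_clustering(points, probs, eps, dist):
    """Greedy clustering: highest-probability center, then an eps/2 ball around it."""
    unclustered = set(range(len(points)))
    clusters = []
    while unclustered:
        clustered = [j for c in clusters for j in c]
        # ASSUMED separation rule for step 3: at least 3*eps/2 from all clustered points.
        candidates = [i for i in sorted(unclustered)
                      if all(dist(points[i], points[j]) >= 3 * eps / 2
                             for j in clustered)]
        if not candidates:
            break  # step 5: no more clusters can be formed
        center = max(candidates, key=lambda i: probs[i])
        # Steps 2 and 4: gather unclustered points within eps/2 of the center.
        members = [j for j in sorted(unclustered)
                   if dist(points[center], points[j]) < eps / 2]
        clusters.append(members)
        unclustered -= set(members)
    return clusters, sorted(unclustered)

# Toy 1-D data with absolute difference as the metric.
points = [0, 1, 10, 11, 6]
probs = [0.4, 0.1, 0.3, 0.1, 0.1]
print(basic_clustering(points, probs, eps=4, dist=lambda a, b: abs(a - b)))
```

Under this assumed rule, any two points in one cluster are less than ε apart and points in different clusters are at least ε apart; points that fit neither condition stay unclustered and are returned separately.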
53 Clustering Algorithms
Visualization of the Clustering Algorithm
54 Clustering Algorithms
Observations
- For any (li, lj) in cluster Sk
- No errors till this stage of the algorithm
  - Each cluster is at least ε away from any other cluster, hence there are no errors by violating (1)
  - Within each cluster the maximum distance between any two points is at most ε, and because 0 < ε < δ there are no errors by violation of (2)
- The data points that are left unclustered are at least 3ε/2 away from each of the existing clusters
- Next
  - Two different approaches to handle the unclustered points
55 Hashing Framework
Expected Contribution 1
[Figure: Input Image I → Extract visually robust feature vector → Compress Features → Final Hash Value]
- Feature vectors extracted from perceptually identical images must be close in a distance metric
56 Clustering Algorithms
Approach 1
- 1. Select the data point l amongst the unclustered data points that has the highest probability mass
- 2. For each existing cluster Si, i = 1, 2, ..., k compute
  - Let S(δ) = {Si such that di ≤ δ}
- 3. IF S(δ) = Φ THEN k = k + 1; Sk = l is a cluster of its own
  - ELSE for each Si in S(δ) define
  - where the complement of Si means all clusters in S(δ) except Si. Then, l is assigned to the cluster S = arg min F(Si)
- 4. Repeat steps 1 through 3 till all data points are exhausted
57 Clustering Algorithms
Approach 2
- 1. Select the data point l amongst the unclustered data points that has the highest probability mass
- 2. For each existing cluster Si, i = 1, 2, ..., k define
  - and β lies in [1/2, 1]
  - where the complement of Si means all existing clusters except Si. Then, l is assigned to the cluster S = arg min F(Si)
- 3. Repeat steps 1 and 2 till all data points are exhausted
58 Clustering Algorithms
Summary
- Approach 1
  - Tries to minimize one cost term conditioned on the other being 0
- Approach 2
  - Smoothly trades off the minimization of the two cost terms via the parameter β
  - β = 1/2: joint minimization
  - β = 1: exclusive minimization of one of the two
- Final hash length determined automatically!
  - Given by ⌈log2 k⌉ bits, where k is the total number of clusters formed
- Proposed clustering can be used to compress feature vectors in any metric space, e.g. Euclidean, Hamming
59 Clustering Algorithms
Randomized Clustering for Secure Hashing
- Heuristic for the deterministic map
  - Select the highest probability data point amongst the unclustered data points
- Randomization scheme
  - Normalize the probabilities of the existing unclustered data points to define a new probability mass, where i runs over unclustered points
  - Employ a uniformly distributed random variable in [0, 1] (generated via a secret key) to select the data point i as a cluster center with the corresponding probability
60 Clustering Algorithms
Randomized Clustering: Illustration
- Example: s = 1
  - 4 data points with probabilities 0.5, 0.25, 0.125, 0.125
- Key observations
  - s = 0: the new probability mass is uniform, i.e. any point is selected as the cluster center with the same probability
  - s → ∞: deterministic clustering
[Figure: uniform number generation to select data point]
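The randomized selection can be sketched as follows. The exact normalization formula did not survive on the slides, so the sketch assumes the new mass is p_i^s divided by the sum of p_j^s over unclustered points; this choice reproduces the two limiting cases listed above (uniform at s = 0, deterministic as s grows). The names `selection_mass` and `pick_center` are hypothetical, and a seeded generator stands in for the secret key.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def selection_mass(probs, s):
    """ASSUMED form of the new probability mass: p_i**s, renormalized."""
    weights = [p ** s for p in probs]
    total = sum(weights)
    return [w / total for w in weights]

def pick_center(probs, s, u):
    """Choose an index using a uniform draw u in [0, 1)."""
    cdf = list(accumulate(selection_mass(probs, s)))
    # bisect_right finds the first interval whose upper edge exceeds u.
    return min(bisect_right(cdf, u), len(probs) - 1)

probs = [0.5, 0.25, 0.125, 0.125]
rng = random.Random(1234)   # stand-in for a key-derived generator
print(selection_mass(probs, 1))
print(pick_center(probs, 1, rng.random()))
```

With s = 1 the four points partition [0, 1) into intervals of lengths 0.5, 0.25, 0.125 and 0.125, and the uniform draw selects the point whose interval it falls in.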
61 Clustering Algorithms
Clustering Results
- Compress binary feature vector of L = 240 bits
  - Final hash length 46 bits, with Approach 2, β = 1/2
- Average distortion VQ at the same rate
- Value of cost function is orders of magnitude lower for the proposed clustering
62 Clustering Algorithms
Conclusion and Future Work
- Perceptual image hashing via feature points
  - Extract feature points that preserve significant image geometry
  - Based on properties of the Human Visual System (HVS)
  - Robust to local and global geometric distortions
- Clustering algorithms for compression
  - Randomized to minimize vulnerability against malicious attacks generated by an adversary
  - Trade-offs facilitated between robustness and randomness, fragility
- Future work
  - Authentication under geometric attacks
  - Information theoretically secure hashing
63 Perceptual Image Hashing Via Feature Points
Image Hashing Via Feature Points
- Feature points are required to be invariant across perceptually identical images
  - Primary geometric features of the image are largely preserved under small perturbations [Mihcak et al., 2001]
  - i.e. extract feature points that preserve significant image geometry
  - Identify what the human eye perceives as robust or invariant geometric features
- Edge based detection is not suited
  - Has problems with high compression ratios, quantization and scaling [Zheng and Chellappa, 1993]
  - Human recognition performance is not impeded even when much edge information is lost [Biederman, 1987]
64 End-stopping and Image Features
ES2 Wavelet
- Example wavelets
  - SDoG operator on the Morlet wavelet
- Wavelet behavior
  - Produces a strong response at the center of any oriented linear stimulus of a particular length determined by σ
65 Clustering Algorithms
Clustering: Dependence on Source Distribution
- Source distributions may be very skewed
  - Trivial clusters may be formed, i.e. with very low probability points included
  - For efficient compression, the number of clusters formed should accurately represent the statistics of the source
- Solution
  - Consider the algorithm when m clusters are formed, m < k, and i < n points are already clustered
  - Assign the remaining points, i.e. i + 1, ..., n, to the remaining clusters in a fashion similar to the basic algorithm
  - Compare the expected cost of this clustering with that of the k-cluster solution formed by the algorithm described before; if the increase is not significant, terminate with the current number of clusters