Title: AFRL Presentation
1. Lossless Watermarking of Entropy Coded Sources
Electrical and Computer Engineering Department
Villanova University
2. Outline
- Goals and motivation for work
- Brief overview of relevant literature
- Theory
  - Codespace
  - Codeword-pairing
  - Binary Tree Structure
- Practice: applying theory to JPEG
  - Introduction to JPEG
  - Redundancy in JPEG images
  - Actual encoding and decoding
- Results
- Conclusions and Future Related Work
3. Goals
- Develop a watermarking scheme for metadata embedding that is:
  - Applied directly in the compressed domain
  - Losslessly reversible
  - File-size preserving
  - Format compliant
4. Motivations
- Watermarking within the compressed domain allows for fast, real-time applications.
- Watermarking within the VLC portion of compressed data will not change the existing format, allowing standard software to read the data while it is watermarked.
- Lossless recovery of the original data allows for many metadata applications.
5. Previous Work
- Some algorithms embed in DCT coefficients.
  - Algorithms such as JSteg, OutGuess, and F5.
- Embedding in DCT coefficients requires at least partial decompression of the data.
6. Previous Work
- Few algorithms work directly with VLCs in the compressed domain.
  - Langelaar et al. proposed a form of LSB watermarking.
  - Chun-Shien et al. proposed modulating DCT coefficient level values.
- However, both of these methods are lossy.
7. Algorithm
[Figure: block diagram with blocks labeled Watermark, Offline data analysis, VLC analysis, Compressed format data, and Watermarked data]
- Offline analysis may only need to occur once, even for different data.
- It is not necessary to have the full video; only the VLC table is required for analysis.
8. Theory Outline
- Concept of codespace in relation to entropy codes
- Improved capacity through codeword-pairing
- Efficiency of binary code trees
9. Variable Length Encoding
- Variable length encoding attempts to minimize the average codeword length by assigning shorter codewords to symbols that appear more often within a given data stream.
- This requires a priori knowledge of the data (or a reasonable expected distribution).
- Huffman coding is the most common method of variable length encoding; among codes that assign a whole number of bits to each symbol, it gives the average codeword length closest to the source entropy.
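As a rough illustration of the last point, the sketch below builds Huffman codeword lengths for a small made-up distribution and compares the average codeword length with the source entropy. The symbols, probabilities, and the helper name `huffman_lengths` are assumptions chosen for illustration; they do not come from the presentation.

```python
import heapq
import math

def huffman_lengths(probs):
    """Return {symbol: codeword length} for a Huffman code over probs."""
    # Heap entries: (probability, unique tie-breaker, symbols in the subtree).
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probs}
    counter = len(heap)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol beneath them.
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, syms1 + syms2))
        counter += 1
    return lengths

# Hypothetical source distribution (illustration only).
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = huffman_lengths(probs)
avg_len = sum(probs[s] * lengths[s] for s in probs)
entropy = -sum(p * math.log2(p) for p in probs.values())
print(lengths, avg_len, entropy)
```

For this particular distribution every probability is a power of two, so the average length equals the entropy. With other distributions the average length is generally somewhat larger.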
10. RVLCs
- VLCs are instantaneously decodable from left to right.
  - No codeword can be the prefix of another.
  - E.g., 10 and 101 cannot be in the same VLC table.
- Reversible variable length codes (RVLCs) are two-way decodable.
  - The most basic RVLCs are symmetric.
  - E.g., 00, 010, 101, 0110, etc.
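The prefix condition, and the matching suffix condition needed for right-to-left decoding, can be checked mechanically. The following is a minimal sketch using the codewords quoted on this slide; the function names are illustrative and not part of the presented tool.

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword."""
    return not any(a != b and b.startswith(a)
                   for a in codewords for b in codewords)

def is_suffix_free(codewords):
    """True if no codeword is a suffix of a different codeword."""
    return not any(a != b and b.endswith(a)
                   for a in codewords for b in codewords)

vlc_bad = ["10", "101"]               # 10 is a prefix of 101
rvlc = ["00", "010", "101", "0110"]   # symmetric codewords from the slide

print(is_prefix_free(vlc_bad))                     # False
print(is_prefix_free(rvlc), is_suffix_free(rvlc))  # True True
```

A table that passes both checks can be decoded instantaneously in either direction.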
11. RVLCs for English Alphabet
12. Codeword Pairs
- Examine a fixed length code example.
  - The code consists of 10 and 11.
  - This code takes up 50% of the total possible codespace when considering data only one codeword long.
- Consider codeword pairs (i.e. 1010, 1011, 1110, 1111): now only 25% of the possible codespace is being employed.
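One way to reproduce these percentages is to measure codespace as the Kraft sum, the total of 2^(-length) over all codewords. That interpretation is an assumption made here for illustration; the slides do not define the measure explicitly.

```python
from fractions import Fraction
from itertools import product

def codespace_used(codewords):
    """Fraction of the binary codespace occupied (Kraft sum)."""
    return sum(Fraction(1, 2 ** len(c)) for c in codewords)

code = ["10", "11"]
# Every ordered concatenation of two codewords.
pairs = [a + b for a, b in product(code, repeat=2)]

print(codespace_used(code))   # 1/2
print(codespace_used(pairs))  # 1/4
```

The unused share grows from one half to three quarters when moving from single codewords to pairs, which is the extra room that codeword-pairing exploits.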
13. Concept
[Figure: codeword-pairing diagram. VLCs Vi and Vj combine into the codeword-pair Vij; RVLCs lead to codeword-pairs]
- Codeword-pairing creates additional watermark capacity.
14. Detecting the Watermark
[Figure: codeword-pair Vij with the watermark applied and detected; one case labeled Collision (Vij, Vxy) and one labeled Good Table]
- A collision occurs when a watermarked VLC violates the prefix condition or causes another VLC to violate the prefix condition.
15. Reversing the Watermark
[Figure: Initial Table and Final Table, with a collision marked]
- Multiple watermarked codeword-pairs may overlap, making it impossible to identify the original codeword-pair.
- The final table eliminates all such ambiguities.
16. Overview of Offline Analysis
[Figure: flow diagram with blocks labeled Input VLC table, Create exhaustive pairing of codewords, Locate redundant bits, and Watermark bits must be unambiguous]
- Limit the number of bits before an error must be discovered.
- Create codeword pairs.
- Locate redundant bits.
- Specify which redundant bits can unambiguously be watermarked.
17. Estimated Capacity
- Estimated capacity is based on the RVLC table; actual capacity will vary based on the compressed data.
- Estimated capacity is calculated by summing, over all codeword-pairs, each pair's probability of occurrence divided by its length. The result is expressed as a percentage.
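The calculation can be sketched as follows, under two assumptions that are not stated on the slide: only watermarkable codeword-pairs contribute to the sum, and each such pair carries one watermark bit. The probabilities and lengths below are made-up numbers, not results from the presentation.

```python
def estimated_capacity(pairs):
    """Sum probability / length over watermarkable codeword-pairs.

    pairs: iterable of (probability, length_in_bits, is_watermarkable).
    Assumes one watermark bit per watermarkable pair (an assumption).
    """
    return sum(p / length for p, length, ok in pairs if ok)

# Hypothetical codeword-pair statistics (illustration only).
example_pairs = [
    (0.50, 4, True),
    (0.25, 5, False),
    (0.25, 5, True),
]
capacity = estimated_capacity(example_pairs)
print(f"{100 * capacity:.1f}% watermark bits per compressed bit")
```

With these numbers the estimate is 17.5%, meaning roughly 0.175 watermark bits for every bit of compressed data.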
18. Results
- The algorithm was applied to the encoding of the English alphabet using an asymmetric RVLC [3].
19. Results
- Algorithm fulfills the following criteria:
  - Watermark within the compressed domain.
  - Does not change format of data.
  - Losslessly removable.
- Algorithm has great potential for embedding metadata in a wide range of applications.
20. Binary Tree Structure
- Previous work
  - Used computationally expensive, exhaustive searches to determine watermark bit locations.
  - Resulted in lookup tables.
- Binary Tree Structure
  - Exponentially decreases complexity for determining watermark bit locations.
  - Result is a binary tree that can be used for both watermarking and decoding.
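To make the decoding use of a binary code tree concrete, here is a minimal sketch that stores a VLC table as a tree of nested dictionaries and walks it bit by bit. The symbol names A to D are hypothetical, and this is a generic code-tree decoder, not the watermark-aware tree of the presented algorithm.

```python
def build_tree(codebook):
    """Build a binary tree (nested dicts) from {symbol: codeword}."""
    root = {}
    for symbol, codeword in codebook.items():
        node = root
        for bit in codeword:
            node = node.setdefault(bit, {})
        node["symbol"] = symbol  # marks a leaf node
    return root

def decode(bits, root):
    """Walk the tree one bit at a time, emitting a symbol at each leaf."""
    symbols, node = [], root
    for bit in bits:
        if bit not in node:
            raise ValueError("bit sequence leaves the code tree")
        node = node[bit]
        if "symbol" in node:
            symbols.append(node["symbol"])
            node = root  # restart at the root for the next codeword
    if node is not root:
        raise ValueError("trailing bits do not form a complete codeword")
    return symbols

# Hypothetical symbol assignment for the RVLC codewords 00, 010, 101, 0110.
codebook = {"A": "00", "B": "010", "C": "101", "D": "0110"}
tree = build_tree(codebook)
decoded = decode("00" "0110" "101" "010", tree)
print(decoded)  # ['A', 'D', 'C', 'B']
```

Each bit costs one dictionary lookup, so decoding time grows with the number of bits rather than with the size of the table.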
21. Example Binary Tree
[Figure: binary code tree with branches labeled 0 and 1. Legend: Leaf node, Branch node, Available node. Node labels: 00, 010, 011, 100, 101, 110, 111, 0110]
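22. Codeword-pair Binary Tree
[Figure: binary tree of codeword-pairs with branches labeled 0 and 1. Legend: Leaf node, Branch node. Node labels: 0000, 00010, 01000, 011000, 000110, 010010, 0100110, 0110010, 01100110]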
23. Failed Watermark Attempt
[Figure: codeword-pair binary tree with branches labeled 0 and 1. Legend: Leaf node, Branch node, Collision node. Node labels: 0001, 0000, 00010, 01000, 011000, 000110, 010010, 0100110, 0110010, 01100110]
24. Successful Watermark Attempt
[Figure: codeword-pair binary tree with branches labeled 0 and 1. Legend: Leaf node, Branch node, Watermark node. Node labels: 0010, 0000, 00010, 01000, 011000, 000110, 010010, 0100110, 0110010, 01100110]
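The difference between the failed and successful attempts can be checked against the prefix condition. The sketch below assumes that 0001 and 0010 are the attempted watermarked values and that the remaining labels on slides 22 to 24 are the codeword-pairs in the tree; both readings are inferred from the figure labels rather than stated in the text.

```python
def collides(candidate, codewords):
    """True if candidate breaks the prefix condition against codewords.

    A collision means the candidate is a prefix of an existing codeword,
    or an existing codeword is a prefix of the candidate.
    """
    return any(c.startswith(candidate) or candidate.startswith(c)
               for c in codewords)

# Codeword-pair labels from the tree figures.
pair_codewords = ["0000", "00010", "01000", "011000", "000110",
                  "010010", "0100110", "0110010", "01100110"]

failed = collides("0001", pair_codewords)      # prefix of 00010 and 000110
successful = collides("0010", pair_codewords)  # lands on an unused branch
print(failed, successful)  # True False
```

In tree terms, 0001 falls on a node that already has codewords beneath it, while 0010 falls on a branch that no codeword-pair uses.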
25. Application: JPEG
- Knowledge of watermarking and binary tree structure applied to watermarking of JPEG images.
- Goals of JPEG watermarking
  - Designed for metadata applications
  - Algorithm should still be applied in compressed domain, file-size preserving, and lossless
26. JPEG Compression
[Figure: flow diagram. Raw image, Group into 8x8 pixel blocks, Each block is forward DCT transformed, Each block is quantized, Quantized coefficients are zigzag scanned into one-dimensional array, Entropy encoded, Add headers and markers, JPEG file]
- The quantization table is an 8x8 matrix.
- Increasing the values in the quantization table decreases file size but deteriorates image quality; this allows for various compression rates.
- Entropy encoding can be Huffman or arithmetic.
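The zigzag scan step in the pipeline above can be sketched in a few lines. This is a minimal illustration of the standard JPEG scan order, not code from the presented tool, and the function names are chosen here for convenience.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag scan order."""
    def key(pos):
        row, col = pos
        diagonal = row + col
        # Odd anti-diagonals are traversed with the row increasing,
        # even anti-diagonals with the row decreasing.
        return (diagonal, row if diagonal % 2 else -row)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def zigzag_scan(block):
    """Flatten a square block into a one-dimensional list in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

# Fill the block with its own raster indices to make the order visible.
block = [[r * 8 + c for c in range(8)] for r in range(8)]
scanned = zigzag_scan(block)
print(scanned[:10])  # [0, 1, 8, 16, 9, 2, 3, 10, 17, 24]
```

The scan places low-frequency coefficients first, so the zeros produced by quantization tend to cluster at the end of the array, where run/size coding handles them compactly.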
27. Redundancy: Custom vs. Standard
- The example AC VLC table has 162 codewords.
- Actual images typically require fewer than half.
[Figure: comparison of Custom and Standard tables]
- The JPEG standard allows custom AC VLC tables to be created for each image.
- The JPEG standard also has an example table that includes every possible run/size combination.
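The figure of 162 codewords follows from the run/size symbol alphabet of baseline JPEG: 16 run values times 10 size values, plus the two special symbols EOB and ZRL. The sketch below enumerates that alphabet; `used_fraction` is a hypothetical helper for measuring how much of the table a given image exercises.

```python
EOB = 0x00  # end-of-block symbol
ZRL = 0xF0  # run of 16 zero coefficients

def ac_symbols():
    """All AC run/size symbols of baseline JPEG (run in high nibble)."""
    symbols = [(run << 4) | size
               for run in range(16) for size in range(1, 11)]
    return sorted(symbols + [EOB, ZRL])

def used_fraction(symbols_in_image):
    """Share of the 162-entry table that an image actually uses."""
    return len(set(symbols_in_image)) / len(ac_symbols())

print(len(ac_symbols()))  # 162
```

Codewords assigned to symbols that never occur in an image are the unused entries that the watermarking scheme can map to.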
28. Redundancy
- Many images use the example AC VLC table provided in the standard regardless of the content of the image.
- Popular software tools such as MATLAB and Microsoft Paint use the example table when creating JPEG images.
- Since the AC VLC table is not optimized for the specific image, the entropy coding is not perfect, meaning that there is some inherent redundancy.
29. Example Binary Tree
[Figure: binary code tree with branches labeled 0 and 1. Legend: Leaf node, Branch node, Available node. Node labels: 00, 010, 011, 100, 101, 110, 111, 0110]
30. Watermarking
[Figure: flow diagram with blocks labeled JPEG image, Parse: pull out used AC VLCs, Binary Tree: find watermark bit locations in VLCs, Check capacity of image, Choose Metadata, Embed watermark, and Watermarked JPEG Image]
- The three major components of the JPEG watermarking tool are:
  - Parsing the image and pulling out the AC VLCs that occur in the image
  - Using the binary tree structure to determine watermark bit locations
  - Watermark embedding
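As a starting point for the parsing component, the sketch below reads the Huffman tables from the DHT segments of a JPEG byte string and assigns codewords in the canonical order used by the JPEG standard. It is a simplified illustration, not the presented tool: it assumes a well-formed file without fill bytes between segments, and it stops at the start of scan without decoding the entropy-coded data, so it does not yet determine which VLCs actually occur.

```python
import struct

def assign_codes(counts, symbols):
    """Map each symbol to its codeword, given 16 per-length counts."""
    codes, code, index = {}, 0, 0
    for bit_length, count in enumerate(counts, start=1):
        for _ in range(count):
            codes[symbols[index]] = format(code, f"0{bit_length}b")
            code += 1
            index += 1
        code <<= 1  # move on to the next codeword length
    return codes

def read_huffman_tables(data):
    """Return {(table_class, table_id): {symbol: codeword}}.

    table_class is 0 for DC tables and 1 for AC tables.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    tables, pos = {}, 2
    while pos + 4 <= len(data):
        marker, length = struct.unpack(">HH", data[pos:pos + 4])
        if marker == 0xFFDA:  # SOS: entropy-coded data follows
            break
        if marker == 0xFFC4:  # DHT: may hold several tables
            seg, end = pos + 4, pos + 2 + length
            while seg < end:
                table_class, table_id = data[seg] >> 4, data[seg] & 0x0F
                counts = data[seg + 1:seg + 17]
                total = sum(counts)
                symbols = data[seg + 17:seg + 17 + total]
                tables[(table_class, table_id)] = assign_codes(counts, symbols)
                seg += 17 + total
        pos += 2 + length  # the length field counts its own two bytes
    return tables
```

Calling `read_huffman_tables(open(path, "rb").read())` on a baseline JPEG file, with `path` supplied by the caller, would return the AC tables under keys of the form `(1, table_id)`.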
31. JPEG Results
- Algorithm applied to Lena image
  - Image is grayscale, JPEG compressed at quality factor of 90
  - Image is 45.6 Kb
  - Uses the example AC VLC table in the JPEG standard
32. JPEG Results
- Algorithm applied to Lena image
  - Virtually no change in file size (change on the order of bytes due to zero-padding)
  - Lossless
  - Applied directly in compressed domain
33. JPEG Visualization
- Loss of synchronization for watermark-unaware decoders
  - Example image only has two watermark bits embedded
  - Large visual distortion; often the image cannot be displayed at all
34. The Next Step: Visual Masking
- Current method causes large visual distortions for standard decoders (unaware of the watermark algorithm)
- Goal is to modify the algorithm to mask the watermark by maintaining synchronization with standard decoders without sacrificing any previous criteria
35. Visual Masking
- To mask the visual impact, modify the run/size entries in the Huffman table for the unused VLCs that watermarked codewords are mapped to.
  - Change the run/size of watermarked VLCs to match the original run/size
  - This is constrained by the length of the VLCs and the number of expected appended bits (size)
36. Future Work
- JPEG Related Work
  - JPEG visualization
  - Incorporate JPEG work into a stand-alone package
  - Maximize capacity for JPEG application
  - Add additional security features to JPEG work
- Apply the general algorithm to other specific compression standards: H.263, MPEG-1, MPEG-2
37. References
- 1. G.C. Langelaar et al., "Watermarking Digital Image and Video Data," IEEE Signal Proc. Magazine, Vol. 17, No. 5, Sept. 2000, pp. 20-46.
- 2. L. Chun-Shien, J. Chen, H. Liao, and K. Fan, "Real-Time MPEG-2 Video Watermarking in the VLC Domain," International Conference on Pattern Recognition, Vol. 2, pp. 552-555, 2002.
- 3. F. Hartung and B. Girod, "Watermarking of Uncompressed and Compressed Video," Signal Processing (special issue on watermarking), Vol. 66, No. 3, pp. 283-302, 1998.
- 4. I. Setyawan et al., "Low-bit-rate video watermarking using temporally extended differential energy watermarking (DEW) algorithm," SPIE Proceedings on Security and Watermarking of Multimedia Contents III, San Jose, USA, January 22-25, 2001, pp. 73-84.
- 5. Takishima, Wada, and Murakami, "Reversible Variable Length Codes," IEEE Trans. on Communications, Vol. 43, No. 2/3/4, February/March/April 1995, pp. 158-162.
- 6. Alattar, Celik, and Lin, "Evaluation of Watermarking Low Bit-rate MPEG-4 Bit Streams," SPIE Proceedings on Security and Watermarking of Multimedia Contents V, San Jose, USA, January 21-24, 2003, pp. 440-451.
- 7. Lang, Thiemert, Hauer, Liu, and Petitcolas, "Authentication of MPEG-4 data: risks and solutions," SPIE Proceedings on Security and Watermarking of Multimedia Contents V, San Jose, USA, January 21-24, 2003, pp. 452-461.
- 8. I. Moccagatta, S. Soudagar, J. Liang, and H. Chen, "Error-Resilient Coding in JPEG-2000 and MPEG-4," IEEE Journal on Selected Areas in Communications, Vol. 18, No. 6, pp. 899-914, June 2000.