Title: Extracting the Hidden: Paper Watermark Location and Identification
1Extracting the Hidden: Paper Watermark Location
and Identification
School of Computing FACULTY OF ENGINEERING
- Roger Boyle, roger_at_comp.leeds.ac.uk
- Kia Ng, kia_at_comp.leeds.ac.uk
2Gants (2000)
- the crucial understanding of paper, the
commodity that links all the individual
warehouses, bookstalls, and printing houses. Gants, D.
The study of watermarks is a seductive if
somewhat esoteric pastime. While it is normally
the beauty and aesthetic quality of watermarks
that initially attract the researcher, they are
more than just pretty affectations and can shed
light on historic trends and events. Pavelka
3Motivation
The manufacture, trading, retailing and use of
paper are key aspects in attribution. But there
are also core codicological interests in
determining what was made and used by whom and
when. The watermark is the best-known signature
of a paper mould. There are others, arguably
better.
4Paper
- Watermarks
- Chain lines
- Laid lines
- Paper texture
- Twins
Watermarks in Incunabula printed in the Low
Countries: http://watermark.kb.nl/
5Motivation
- Increasing availability of digital repositories
- Improvements in underlying pattern recognition
and extraction techniques
- The right place at the right time
6Challenges
- Hidden (by design)
- Many documents of interest are:
  - Delicate
  - In private collections
  - Inscribed recto and verso
- Obstructions and interference
- Defects, e.g. fold marks, paper texture, etc.
7Back-lighting Acquisition System
8Example Input
9Back-light
10Back-light Contrast
11Overall Framework
12Layers
- Separate the input image into several layers
  - $I_a$: the image after removing foreground interference, e.g. writing
  - $I_b$: non-uniform background, e.g. texture, noise, folding, etc.
  - $I_w = I_a - I_b$: watermark (and some residual noise)
- Use morphological operations to suppress interference
  - A combination of morphological dilation ($C = A \oplus B$) and erosion ($C = A \ominus B$) operations
  - where $A$ is the image and $B$ the structuring element
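The dilation and erosion operations named above can be illustrated with a minimal sketch. It assumes a square structuring element, edge-replicated borders, and a toy page in which ink is darker than paper; the helper names `dilate` and `erode` and the toy values are illustrative choices, not taken from the slides.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def dilate(image, radius):
    """Grey-scale dilation: maximum over a (2r+1) x (2r+1) square window."""
    size = 2 * radius + 1
    padded = np.pad(image, radius, mode="edge")
    return sliding_window_view(padded, (size, size)).max(axis=(2, 3))


def erode(image, radius):
    """Grey-scale erosion: minimum over a (2r+1) x (2r+1) square window."""
    size = 2 * radius + 1
    padded = np.pad(image, radius, mode="edge")
    return sliding_window_view(padded, (size, size)).min(axis=(2, 3))


# Toy page (assumed values): light paper (200) with a one-pixel-wide
# dark ink stroke (20) running down column 3.
page = np.full((7, 7), 200, dtype=np.uint8)
page[:, 3] = 20

brightened = dilate(page, 1)  # the thin dark stroke is overwritten by paper
darkened = erode(page, 1)     # the stroke grows to three pixels wide
```

On this toy page, dilation removes the thin dark stroke entirely, which is why it is a natural tool for suppressing writing, while erosion thickens it.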
13Element Size B
- Apply a contrast stretching process
  - the darkest pixels are mapped to zero intensity
  - find the percentage of such pixels, x
- Within the original image, determine the grey level g such that x% of pixels have intensity below g
- Dilate the input image, starting with a structuring element of size 1 and increasing the size
  - until all pixel values are above g
  - this gives the optimal structuring element size to remove foreground interference.
Number of pixels of values below g plotted
against structuring element size
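The search for the structuring element size might be sketched as follows. Several details are assumptions made for illustration: the contrast stretch is a linear stretch between hypothetical `low` and `high` grey levels, the structuring element is a square of side 2r+1 (so radius 0 is the element of size 1), and the loop stops when no pixel remains below g, which is the quantity the plot above tracks.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def dilate(image, radius):
    """Grey-scale dilation with a (2r+1) x (2r+1) square window."""
    size = 2 * radius + 1
    padded = np.pad(image, radius, mode="edge")
    return sliding_window_view(padded, (size, size)).max(axis=(2, 3))


def foreground_element_radius(image, low, high, max_radius=20):
    """Return (radius, x, g, counts) for the smallest dilation that clears the ink.

    `low` and `high` are assumed stretch limits: grey levels <= low
    saturate to zero after stretching.
    """
    stretched = np.clip((image.astype(float) - low) / (high - low), 0.0, 1.0)
    dark = int(np.count_nonzero(stretched == 0.0))
    x = 100.0 * dark / image.size  # percentage of darkest pixels
    # Grey level g in the original image with x% of pixels strictly below it.
    g = np.sort(image, axis=None)[min(dark, image.size - 1)]
    counts = []  # number of pixels below g for each radius tried
    for radius in range(max_radius + 1):
        below = int(np.count_nonzero(dilate(image, radius) < g))
        counts.append(below)
        if below == 0:
            return radius, x, g, counts
    return None, x, g, counts


# Toy page (assumed values): paper at 200, a three-pixel-wide ink stroke at 20.
page = np.full((9, 9), 200, dtype=np.uint8)
page[:, 3:6] = 20

radius, x, g, counts = foreground_element_radius(page, low=50, high=180)
```

Here `counts` plays the role of the curve in the plot: it falls to zero at the first radius whose window is wider than the stroke.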
15Background Estimation
- To estimate the image background
  - Remove the watermark pattern
  - Find a structuring element size that is large enough to cover a single feature of the pattern
- Opening is useful for separating touching features, and removing small regions and sharp peaks.
- Morphological top-hat transform:
  - $A - (A \circ B)$, where $\circ$ is morphological opening
  - $C = A \circ B = (A \ominus B) \oplus B$
  - $A$ is the image and $B$ the structuring element
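As a rough illustration of the opening and top-hat transform above, the sketch below estimates a background by opening and subtracts it from the image. The square structuring element, the edge-replicated borders, and the toy page (a stepped background with a thin bright line standing in for a watermark feature) are all assumptions made for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(image, radius):
    """All (2r+1) x (2r+1) neighbourhoods, with edge-replicated borders."""
    size = 2 * radius + 1
    padded = np.pad(image, radius, mode="edge")
    return sliding_window_view(padded, (size, size))


def erode(image, radius):
    return _windows(image, radius).min(axis=(2, 3))


def dilate(image, radius):
    return _windows(image, radius).max(axis=(2, 3))


def opening(image, radius):
    """Opening: erosion followed by dilation with the same element."""
    return dilate(erode(image, radius), radius)


def top_hat(image, radius):
    """Top-hat: the image minus its opening."""
    return image - opening(image, radius)


# Toy back-lit page (assumed values): a non-uniform background
# (100 on the left, 110 on the right) with a thin bright line on row 4.
page = np.full((9, 9), 100.0)
page[:, 5:] += 10.0
page[4, :] += 30.0

background = opening(page, 1)  # element wider than the line removes it
watermark = top_hat(page, 1)   # what the opening removed
```

Because the structuring element is wider than the line, the opening returns the stepped background alone and the top-hat keeps only the line.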
16Watermark Size
- Now possible, after the removal of obstructing
foreground features
- Apply a series of morphological openings with
structuring elements of increasing size
- The sum of pixel intensity values in the output
image after each opening is stored
Cumulative intensities plotted against
structuring element radius
17Difference
- Difference of total intensities between two sequential openings
  - distribution of object sizes at that scale
  - the pattern spectrum of the image
- A local minimum at a specific radius indicates the existence of many image objects of that radius.
- The global minimum, Rmin, indicates the highest cumulative intensity of objects at that radius.
Granulometry (size distribution) of image objects
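The granulometry described on the last two slides can be sketched as a loop over openings of increasing radius. The square structuring element, the function name `pattern_spectrum`, and the toy image of bright square blobs are assumptions for illustration only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(image, radius):
    size = 2 * radius + 1
    padded = np.pad(image, radius, mode="edge")
    return sliding_window_view(padded, (size, size))


def opening(image, radius):
    """Erosion followed by dilation with a (2r+1) x (2r+1) square."""
    eroded = _windows(image, radius).min(axis=(2, 3))
    return _windows(eroded, radius).max(axis=(2, 3))


def pattern_spectrum(image, max_radius):
    """Sum of intensities after each opening, and successive differences."""
    sums = np.array([opening(image, r).sum() for r in range(max_radius + 1)])
    spectrum = np.diff(sums)  # spectrum[i] is the change from radius i to i + 1
    r_min = int(np.argmin(spectrum)) + 1  # radius of the global minimum
    return sums, spectrum, r_min


# Toy image (assumed values): three 3x3 bright blobs and one 5x5 blob.
layer = np.zeros((20, 20))
layer[1:4, 1:4] = 50.0
layer[1:4, 8:11] = 50.0
layer[8:11, 1:4] = 50.0
layer[12:17, 12:17] = 50.0

sums, spectrum, r_min = pattern_spectrum(layer, max_radius=3)
```

In this toy case the three small blobs vanish together at radius 2, so the largest drop in total intensity, and hence the global minimum of the differences, occurs there.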
19Difference Contrast
21Top-down approach
- Our approach has been demonstrated successfully
on a range of inputs.
- But it will fail on challenging data exhibiting
thick paper, heavy interference, or damage.
22Hard data
- Part of the elaborate Leeds Arabic collection
- Pillage from the battle of Omdurman
- Studied in detail by Brockett (1987)
23What you see
24What you see
A watermark is just discernible in the RH
margin. Most details are very faint. Fainter
examples exist in the text.
25What we do
- Rather than seek results bottom up,
pixel-by-pixel, we construct a computational
model of what the backlighting does.
26What we do
27What we do
28What we do
Leading to a representation of just the verso and
the interior.
29What we do
And thus we derive a picture that isolates the
information not visible as either recto or
verso. This includes the watermark and mould
features, but also paper irregularities and
various other noise.
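The slides do not spell out the back-lighting model, so the following is only a sketch under a strong assumption: that the back-lit image is approximately the pixel-wise product of three attenuation layers (recto ink, paper interior, verso ink), and that registered images of the recto and verso are available. The function `remove_layer` and all values are hypothetical and should not be read as the authors' method.

```python
import numpy as np


def remove_layer(observed, layer, eps=1e-6):
    """Divide out one multiplicative layer (a subtraction in the log domain)."""
    return observed / np.maximum(layer, eps)


# Assumed toy layers, as transmittance factors in (0, 1].
recto = np.ones((6, 6))
recto[1, :] = 0.5          # a line of writing on the front
verso_scan = np.ones((6, 6))
verso_scan[:, 0] = 0.5     # writing on the back, as seen from the back
interior = np.full((6, 6), 0.6)
interior[3:5, 2:4] = 0.8   # thinner paper where the watermark lies

# Seen from the front, the verso appears mirrored left to right.
verso = np.fliplr(verso_scan)

# Assumed forward model of the back-lit image.
backlit = recto * interior * verso

# Step 1: remove the recto, leaving verso and interior.
verso_and_interior = remove_layer(backlit, recto)
# Step 2: remove the verso, leaving the interior alone.
interior_estimate = remove_layer(verso_and_interior, verso)
```

With real scans the layers would only be known approximately, so the recovered interior would also carry paper irregularities and noise, as the slide notes.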
30What we see
- Statistical attacks and top-down reasoning can
betray the presence of incomplete, damaged
patterns.
- Capturing watermark fragments (incomplete and
possibly inaccurate) is straightforward from
some pages
31What we see
32What we see
33Potential
- Reliable identification of these partial patterns
permits, by aggregation, the recapture of patterns
not seen before
34Undiscovered countermark information
35Potential
- We can study subtleties of manufacture
36Conclusions
- That which lies within is often as valuable as
the inscription
- Sometimes, wrestling with the paper is essential
- These are not new problems; enhanced digital
access is new
- Computer science can bring tools from other
domains of significant benefit
  - (and benefit itself)
37References
Hazem Hiary and Kia Ng, A System for Segmenting
and Extracting Paper-based Watermark Designs,
International Journal on Digital Libraries,
6(4):351-361, Springer, 2007.
Roger Boyle and Hazem Hiary, Watermark location
via back-lighting and recto removal, International
Journal of Document Analysis and Recognition,
12(1), 33-, 2009.
Thank you
Thanks to the Special Collections of the Leeds
University Library for the manuscripts and other
test samples used in this research project, and
ICSRiM (www.icsrim.org.uk) for the acquisition
system.