XML Watermarking - PowerPoint PPT Presentation

Author: XMSUN. Last modified by: jtxy. Created: 7/17/2004 7:55:04 AM.

Slides: 40
Provided by: XMS6
1
XML Watermarking and Information Hiding
2
Markup Language
  • SGML (Standard Generalized Markup Language)
  • XML (Extensible Markup Language)
  • HTML (HyperText Markup Language)
  • XHTML

3
Publishing Information in WWW
4
Publishing Information in WWW
5
XML Document
Corresponding watermarking and information
hiding techniques can be employed for each content type
  • XML element type
  • text
  • image
  • video
  • audio
  • executable code

Can the markup's own information be used for
watermarking or information hiding?
6
Known content-based techniques
  • Change font size or color
  • Append white space at the end of a line
  • bit 0: space (0x0020)
  • bit 1: tab (0x0009)
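
A minimal Python sketch of the end-of-line white space technique, assuming one bit per line (space for 0, tab for 1). The function names are hypothetical and only illustrate the idea.

```python
def embed_trailing_whitespace(text: str, bits: str) -> str:
    """Append one white space character per line: space for '0', tab for '1'."""
    lines = text.split("\n")
    if len(bits) > len(lines):
        raise ValueError("not enough lines to carry the payload")
    out = []
    for i, line in enumerate(lines):
        line = line.rstrip(" \t")  # drop any pre-existing trailing white space
        if i < len(bits):
            line += " " if bits[i] == "0" else "\t"
        out.append(line)
    return "\n".join(out)


def extract_trailing_whitespace(text: str) -> str:
    """Read the bits back from the last character of each marked line."""
    bits = []
    for line in text.split("\n"):
        if line.endswith(" "):
            bits.append("0")
        elif line.endswith("\t"):
            bits.append("1")
    return "".join(bits)
```

Each embedded bit adds one byte to the page, which is the size increase listed among the shortcomings of this technique.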

7
Shortcomings
  • White space at the end of a line
  • Increases page size
  • May change the layout
  • Easily detected by selecting the text

8
Specification
  • Element (Entity)
  • <name attribute1 ... attributen> contents </name>
  • <name attribute1 ... attributen> </name>
  • <name attribute1 ... attributen>
  • Attribute
  • name="value"
  • Example
  • <font face="Verdana" size="4" color="#FFFF00">Student Number</font>

9
Properties of markup labels
  • Property 1: Element and attribute names are
    case-insensitive (in HTML; XML and XHTML names
    are case-sensitive)
  • <font face="Verdana" size="4" color="#FFFF00">Student Number</font>
  • <Font face="Verdana" size="4" color="#FFFF00">Student Number</font>
  • <font face="Verdana" size="4" color="#FFFF00">Student Number</Font>
  • <Font face="Verdana" size="4" color="#FFFF00">Student Number</Font>
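
The four variants above suggest that each opening or closing tag can carry one bit through the case of its name. A rough Python sketch under that assumption (lowercase name for 0, capitalized name for 1) follows. The regular expression and function names are illustrative only, and the scheme applies to HTML, not to case-sensitive XML or XHTML.

```python
import re

# Matches "<name" or "</name"; a simplification that ignores comments and scripts.
TAG_NAME = re.compile(r"(</?)([A-Za-z][A-Za-z0-9]*)")


def embed_case(html: str, bits: str) -> str:
    """Encode one bit per tag: lowercase name for '0', capitalized for '1'."""
    remaining = iter(bits)

    def repl(match):
        bit = next(remaining, None)
        if bit is None:
            return match.group(0)  # payload exhausted, leave the tag as is
        name = match.group(2)
        name = name.lower() if bit == "0" else name.capitalize()
        return match.group(1) + name

    return TAG_NAME.sub(repl, html)


def extract_case(html: str, n_bits: int) -> str:
    """Read n_bits back from the case of the first letter of each tag name."""
    bits = ["1" if m.group(2)[0].isupper() else "0"
            for m in TAG_NAME.finditer(html)]
    return "".join(bits[:n_bits])
```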

10
Properties of markup labels
  • Property 2: Attributes are order-insensitive
  • <font face="Verdana" size="4" color="#FFFF00">Student Number</font>
  • <font size="4" face="Verdana" color="#FFFF00">Student Number</font>

11
Pair attributes technique
  • Attribute pair order (Corinna John)
  • Each pair has a key attribute and a corresponding attribute
  • key before corresponding encodes 1; corresponding before key encodes 0
  • <font face="Verdana" size="4" color="#FFFF00">Student Name</font>
  • <font size="4" face="Verdana" color="#FFFF00">Student Name</font>
  • Requires a key / corresponding table
  • No size change; difficult to detect
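
The pair rule can be sketched in Python as follows. The pair table `PAIR_TABLE` (face as key, size as corresponding attribute) and the function names are assumptions made for illustration. A tag is represented as a list of (name, value) tuples.

```python
# Hypothetical pair table: key attribute -> corresponding attribute.
PAIR_TABLE = {"face": "size"}


def embed_pair_bit(attrs, bit):
    """Reorder one attribute pair so that key-before-corresponding means 1."""
    names = [name for name, _ in attrs]
    for key, corr in PAIR_TABLE.items():
        if key in names and corr in names:
            i, j = names.index(key), names.index(corr)
            key_first = i < j
            if key_first != (bit == 1):
                attrs = list(attrs)
                attrs[i], attrs[j] = attrs[j], attrs[i]  # swap the pair
            return attrs
    raise ValueError("no attribute pair from the table in this tag")


def extract_pair_bit(attrs):
    """Return 1 if the key attribute comes first, 0 if it comes second."""
    names = [name for name, _ in attrs]
    for key, corr in PAIR_TABLE.items():
        if key in names and corr in names:
            return 1 if names.index(key) < names.index(corr) else 0
    return None  # the tag carries no bit
```

Both sides need the same pair table, which is the extra payload this technique requires.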

12
Attributes permutation technique
  • Equivalent attribute permutations
  • <font face="Verdana" size="4" color="#FFFF00">Student Name</font>
  • <font face="Verdana" color="#FFFF00" size="4">Student Name</font>
  • <font size="4" face="Verdana" color="#FFFF00">Student Name</font>
  • <font size="4" color="#FFFF00" face="Verdana">Student Name</font>
  • <font color="#FFFF00" face="Verdana" size="4">Student Name</font>
  • <font color="#FFFF00" size="4" face="Verdana">Student Name</font>
  • Lexicographic (alphabetic) order: a permutation f precedes a
    permutation g iff f(k) < g(k) for the minimum
    value of k such that f(k) ≠ g(k).

13
Attributes permutation technique
  • Generating attribute permutations in
    lexicographical order
  • <font color="#FFFF00" face="Verdana" size="4">Student Name</font>
  • <font color="#FFFF00" size="4" face="Verdana">Student Name</font>
  • <font face="Verdana" color="#FFFF00" size="4">Student Name</font>
  • <font face="Verdana" size="4" color="#FFFF00">Student Name</font>
  • <font size="4" color="#FFFF00" face="Verdana">Student Name</font>
  • <font size="4" face="Verdana" color="#FFFF00">Student Name</font>
  • Attribute permutations ↔ order numbers
  • color face size 0
  • color size face 1
  • face color size 2
  • face size color 3
  • size color face 4
  • size face color 5
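
One way to compute the mapping between attribute permutations and order numbers is the factorial number system. The Python sketch below assumes strict lexicographic order of the attribute names (color < face < size). The function names `unrank` and `rank` are illustrative.

```python
from math import factorial


def unrank(names, k):
    """Return the k-th permutation (0-based) of names in lexicographic order."""
    pool = sorted(names)
    if not 0 <= k < factorial(len(pool)):
        raise ValueError("order number out of range")
    perm = []
    while pool:
        block = factorial(len(pool) - 1)  # permutations sharing one first name
        idx, k = divmod(k, block)
        perm.append(pool.pop(idx))
    return perm


def rank(perm):
    """Return the order number of perm among all permutations of its names."""
    pool = sorted(perm)
    k = 0
    for name in perm:
        idx = pool.index(name)
        k += idx * factorial(len(pool) - 1)
        pool.pop(idx)
    return k
```

To embed, the attributes of a tag are rewritten in the order `unrank(names, k)`, where k is the next chunk of the payload. To extract, `rank` is applied to the attribute names as they appear in the tag.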

14
Attributes permutation technique
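  • If an element has at least 2 attributes, it
    can be used to embed hidden information or a
    watermark
  • Let $e_1, \dots, e_m$ be the elements in a web page whose
    numbers of attributes are $n_1, \dots, n_m$, with $n_i \ge 2$;
    since $e_i$ admits $n_i!$ attribute orderings, the
    embedded capacity is
    $C = \sum_{i=1}^{m} \lfloor \log_2 (n_i!) \rfloor$ bits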
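
As a rough illustration, the capacity of a page can be estimated in Python with the standard `html.parser` module. The sketch assumes that an element with n attributes carries floor(log2(n!)) bits, because its attributes admit n! orderings. The class and function names are hypothetical.

```python
from html.parser import HTMLParser
from math import factorial


class AttrCounter(HTMLParser):
    """Collect the number of attributes of every start tag in a page."""

    def __init__(self):
        super().__init__()
        self.counts = []

    def handle_starttag(self, tag, attrs):
        self.counts.append(len(attrs))


def page_capacity_bits(html: str) -> int:
    """Sum floor(log2(n!)) over all elements, n being the attribute count."""
    parser = AttrCounter()
    parser.feed(html)
    parser.close()
    # For a positive integer x, x.bit_length() - 1 equals floor(log2(x)) exactly.
    return sum(factorial(n).bit_length() - 1 for n in parser.counts)
```

Elements with zero or one attribute contribute 0 bits, since 0! = 1! = 1.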

15
Embedded capacity example
Name of web page     Capacity (bytes)
www.163.com          48
www.sina.com.cn      279
www.sohu.com.cn      338
www.microsoft.com    15
www.ebay.com         78
www.yahoo.com        33
16
Perceivability
  • Cannot be perceived when browsing the page
  • Hard to perceive when reading the source code

17
Robustness against editing
  • Contents can be changed without disturbing the attribute order

18
Robustness against editing
  • Font, size, and color values can be changed without disturbing the attribute order

19
Security
  • Attribute permutations ↔ order numbers
  • color face size 0
  • color size face 1
  • face color size 2
  • face size color 3
  • size color face 4
  • size face color 5
  • Apply a hash to the concatenation of the attributes and a key
    to get the order number
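
One possible reading of the keyed numbering, sketched in Python: every permutation is tagged with a keyed hash of its concatenated attribute names, and the permutations are numbered in the order of those tags. HMAC-SHA256 is used here as the keyed hash. That choice, the separator, and the function name are assumptions and are not taken from the slides.

```python
import hashlib
import hmac
from itertools import permutations


def keyed_order_table(names, key: bytes):
    """Map each permutation of names to a secret, key-dependent order number."""
    perms = list(permutations(sorted(names)))

    def tag(perm):
        message = " ".join(perm).encode("utf-8")  # concatenated attribute names
        return hmac.new(key, message, hashlib.sha256).digest()

    ranked = sorted(perms, key=tag)  # ordering depends on the secret key
    return {perm: number for number, perm in enumerate(ranked)}
```

Without the key, an observer who sees the attribute order cannot tell which order number it encodes.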

20
Performance comparison
Type                   Size change  Perceivable by view  Perceivable by code  Capacity (bit)  Extra payload
White space            Y            easy                 easy                 Page lines      N
Case change            N            N                    easy                 Tags            N
Attribute pair         N            N                    hard                                 Pair table
Equivalent attributes  N            N                    hard                                 N
21
Other potential properties
  • String delimiters
  • name="value"
  • name='value'
  • White space between the element name and the
    first attribute
  • <font face=verdana size=3>
  • <font  face=verdana size=3>
  • White space between attributes
  • <font face=verdana size=3>
  • <font face=verdana  size=3>
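
The string delimiter property could be exploited as sketched below in Python, assuming the two delimiters are the double quote (read as 0) and the single quote (read as 1). The regular expression and function names are illustrative. Values that themselves contain a quote character are skipped, because swapping their delimiters would not be safe.

```python
import re

# name=<quote>value<same quote>; a simplified attribute pattern.
ATTR = re.compile(r"""([A-Za-z_:][\w:.-]*)=(["'])(.*?)\2""")


def _usable(value: str) -> bool:
    """A value can carry a bit only if it contains no quote character."""
    return '"' not in value and "'" not in value


def embed_quotes(tag: str, bits: str) -> str:
    """Encode one bit per attribute: double quotes for '0', single for '1'."""
    remaining = iter(bits)

    def repl(match):
        name, value = match.group(1), match.group(3)
        if not _usable(value):
            return match.group(0)
        bit = next(remaining, None)
        if bit is None:
            return match.group(0)  # payload exhausted
        quote = '"' if bit == "0" else "'"
        return f"{name}={quote}{value}{quote}"

    return ATTR.sub(repl, tag)


def extract_quotes(tag: str) -> str:
    """Read one bit from the delimiter of every usable attribute."""
    return "".join("0" if m.group(2) == '"' else "1"
                   for m in ATTR.finditer(tag) if _usable(m.group(3)))
```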

22
Other potential properties
  • White space after =
  • <font face=verdana size=3>
  • <font face= verdana size=3>
  • White space between elements
  • <td>con1</td><td>con2</td>
  • <td>con1</td> <td>con2</td>

23
Other potential properties
  • The default value of an attribute
  • <font face=verdana size=3>
  • <font face=verdana>

24
Current progress
  • Introduce insignificant attributes
  • <font face=verdana>
  • <font face=verdana xyz=abcd>
  • Break through the capacity bottleneck
  • Web page watermarking
  • Text watermarking

25
Our focus on watermarking
  • Text content security
  • Funded by NSFC Key Project 60736016
  • Funded by NSFC 60373062
  • Software watermarking
  • Funded by NSFC 60573045
  • Wireless sensor network security
  • Funded by 973 Project 2006CB303000
  • Funded by NSFC 60873198
  • Steganalysis
  • Funded by 115 Project

26
Tel: 0731-8821341, 13875971258
Email: sunnudt_at_163.com
http://nisl.hnu.cn/
27
  • HyperText Markup Language (HTML), version 4.0,
    the publishing language of the World Wide Web
  • Recall that in HTML, element and attribute names
    are case-insensitive; the convention is meant to
    encourage readability.
  • Element and attribute names in this document have
    been marked up and may be rendered specially by
    some user agents.
  • http://www.w3.org/TR/1998/REC-html40-19980424/about.html#h-1.2.1

28
http://www.w3.org/TR/html/#xhtml
  • HTML 4 [HTML4] is an SGML (Standard Generalized
    Markup Language) application conforming to
    International Standard ISO 8879, and is widely
    regarded as the standard publishing language of
    the World Wide Web.
  • SGML is a language for describing markup
    languages, particularly those used in electronic
    document exchange, document management, and
    document publishing. HTML is an example of a
    language defined in SGML.
  • SGML has been around since the middle 1980's and
    has remained quite stable. Much of this stability
    stems from the fact that the language is both
    feature-rich and flexible. This flexibility,
    however, comes at a price, and that price is a
    level of complexity that has inhibited its
    adoption in a diversity of environments,
    including the World Wide Web.
  • HTML, as originally conceived, was to be a
    language for the exchange of scientific and other
    technical documents, suitable for use by
    non-document specialists. HTML addressed the
    problem of SGML complexity by specifying a small
    set of structural and semantic tags suitable for
    authoring relatively simple documents. In
    addition to simplifying the document structure,
    HTML added support for hypertext. Multimedia
    capabilities were added later.
  • In a remarkably short space of time, HTML became
    wildly popular and rapidly outgrew its original
    purpose. Since HTML's inception, there has been
    rapid invention of new elements for use within
    HTML (as a standard) and for adapting HTML to
    vertical, highly specialized, markets. This
    plethora of new elements has led to
    interoperability problems for documents across
    different platforms.

29
  • XML is the shorthand name for Extensible Markup
    Language [XML].
  • XML was conceived as a means of regaining the
    power and flexibility of SGML without most of its
    complexity. Although a restricted form of SGML,
    XML nonetheless preserves most of SGML's power
    and richness, and yet still retains all of SGML's
    commonly used features.
  • While retaining these beneficial features, XML
    removes many of the more complex features of SGML
    that make the authoring and design of suitable
    software both difficult and costly.

30
  • XHTML is a family of current and future document
    types and modules that reproduce, subset, and
    extend HTML 4 [HTML4]. XHTML family document
    types are XML based, and ultimately are designed
    to work in conjunction with XML-based user
    agents. The details of this family and its
    evolution are discussed in more detail in
    [XHTMLMOD].
  • XHTML 1.0 (this specification) is the first
    document type in the XHTML family. It is a
    reformulation of the three HTML 4 document types
    as applications of XML 1.0 [XML]. It is intended
    to be used as a language for content that is both
    XML-conforming and, if some simple guidelines are
    followed, operates in HTML 4 conforming user
    agents. Developers who migrate their content to
    XHTML 1.0 will realize the following benefits:
  • XHTML documents are XML conforming. As such, they
    are readily viewed, edited, and validated with
    standard XML tools.
  • XHTML documents can be written to operate as well
    or better than they did before in existing
    HTML 4-conforming user agents as well as in new,
    XHTML 1.0 conforming user agents.
  • XHTML documents can utilize applications (e.g.
    scripts and applets) that rely upon either the
    HTML Document Object Model or the XML Document
    Object Model [DOM].
  • As the XHTML family evolves, documents conforming
    to XHTML 1.0 will be more likely to interoperate
    within and among various XHTML environments.
  • The XHTML family is the next step in the
    evolution of the Internet. By migrating to XHTML
    today, content developers can enter the XML world
    with all of its attendant benefits, while still
    remaining confident in their content's backward
    and future compatibility.

31
Terrorism
http://www.arabteam2000-forum.com/
Jihad
32
Watermark embedding
33
Watermark detection
34
Classification of watermarking by host
  • Image
  • Audio
  • Video
  • Text (Document)
  • Software / Executable code
  • Database

35
Text Watermarking and Information Hiding
Watermarking
Information hiding
36
Any redundancy?
No: each character maps to its code one to one
37
Utilize format information
  • Line-shift Coding
  • vertically displacing an entire text line
  • Word-shift Coding
  • horizontally shifting the location of a word
    within a text line
  • Character feature coding
  • altering a particular feature of an individual
    character

38
Utilize language information
  • Synonym substitution
  • Syntactic transform
  • TMR tree (text meaning representation)
  • Add spaces at the end of a line

39
Text recoverable watermarking
  • Format-based watermarking?
  • Natural language watermarking?
  • How to combine them?
  • Text recoverable watermarking?