Least-Significant Bit Steganography and Steganalysis

1
Least-Significant Bit Steganography and
Steganalysis
  • Brian Mearns

2
What is Steganography?
  • Steganography is the science of embedding
    communications into other unassuming cover
    data.
  • A subfield of data hiding
  • Cryptography is used to prevent people from
    understanding secret communications
  • Steganography is used to prevent people from
    knowing the secret communication even exists!

3
And Steganalysis?
  • Steganalysis is the counter-measure against
    steganography.
  • Attempts to analyze a data stream to determine
    whether or not it contains hidden messages.
  • More ambitiously, can attempt to actually recover
    the hidden message.
  • Frequently, just detecting the presence of a
    hidden message is sufficient.

4
Some historical examples
5
Tattoo messages in Ancient Greece
  • Herodotus reports that messages were tattooed
    onto the shaved heads of slaves. Once the hair
    grew back, the slaves were sent to the recipient,
    with the message hidden in plain sight.

6
DeCSS
  • When the DVD copy-protection circumvention
    program DeCSS was declared illegal, hackers used
    clever (and frequently ironic) steganographic
    techniques to continue spreading the program.

Scan of the preliminary injunction issued against
DeCSS, with the program embedded in the image's
color palette.
Image of MPAA president Jack Valenti. The bane
of his existence is embedded in his face.
7
Quick overview of Information Theory in
Steganography
8
Terminology
  • Cover object
  • Payload (secret message)
  • Stego-object
[Figure: a payload (secret message) is embedded in a cover object to produce the stego-object.]
9
Known interference
  • As with other forms of data hiding, the cover
    object can be viewed as channel interference
    known to the sender (but usually not the receiver).
  • With the interference known to the encoder, some
    of the available resources can be used to cancel
    the interference.
  • But this wastes resources and reduces the maximum
    communication rate.

10
Dirty Paper
  • In his 1983 paper, Max Costa likened the
    situation to writing on dirty paper.
  • Instead of wasting resources trying to avoid
    (cancel) the dirty spots, they can be
    incorporated into the communications.
  • Costa showed that the capacity is the same as if
    the interference was not there.

11
Wet Paper
  • Jessica Fridrich et al. expanded this metaphor to
    writing on wet paper in their 2005 paper.
  • Their method accommodates the issue of
    perceptibility of the hidden message in the
    stego-object.
  • The wet spots are locations in the cover object
    that can't be changed.
  • In an image, for instance, changes made to
    certain pixels may cause greater visual
    distortion than other pixels.

12
Some Steganography Techniques
13
Basic premise for Steganography
  • Any time we have a choice, we have an opportunity
    to encode data.

14
Palette Images
  • Palette images define a list (or palette) of
    all the colors used in the image.
  • Pixels are encoded as indices into the palette,
    instead of the actual RGB values.
  • We have complete freedom to arrange the palette
    however we want, without changing the actual
    image.
  • For a 256-color palette, there are 256! > 8e506
    possible orderings.
  • Equivalent to about 1,684 bits (roughly 210 bytes) of
    information embedded without any visual change to
    the image.
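
As a rough sketch of how a palette ordering can carry data, the message can be treated as one large integer and mapped to a permutation through the factorial number system. The function names below are hypothetical, and the sketch assumes all palette colors are distinct and that both sides agree on a canonical reference order (for example, colors sorted by RGB value). Remapping the pixel indices to match the reordered palette is left out.

```python
import math


def capacity_bits(n_colors: int) -> float:
    """Bits of information carried by the ordering of n_colors distinct entries."""
    return math.log2(math.factorial(n_colors))


def int_to_permutation(value: int, n: int) -> list[int]:
    """Map an integer in [0, n!) to a unique ordering of 0..n-1."""
    if not 0 <= value < math.factorial(n):
        raise ValueError("value does not fit in a permutation of this size")
    remaining = list(range(n))
    perm = []
    for i in range(n, 0, -1):
        # Each digit of the factorial number system selects one remaining entry.
        index, value = divmod(value, math.factorial(i - 1))
        perm.append(remaining.pop(index))
    return perm


def permutation_to_int(perm: list[int]) -> int:
    """Inverse of int_to_permutation: recover the integer from the ordering."""
    remaining = sorted(perm)
    value = 0
    for position, entry in enumerate(perm):
        index = remaining.index(entry)
        value += index * math.factorial(len(perm) - 1 - position)
        remaining.pop(index)
    return value


if __name__ == "__main__":
    print(f"capacity of a 256-entry palette: {capacity_bits(256):.1f} bits")
    message = int.from_bytes(b"hidden in the palette", "big")
    order = int_to_permutation(message, 256)
    recovered = permutation_to_int(order)
    print(recovered.to_bytes((recovered.bit_length() + 7) // 8, "big"))
```

The entries of the returned permutation are positions in the canonical order, so the sender writes the palette in that order and the receiver recovers the integer by comparing the stored palette against the canonical sort.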

15
LSB Overwriting
  • For many types of data (like non-palette images),
    values can be altered slightly without much
    perceptible change.
  • Overwriting the least significant bit (LSB) of
    all or some of the bytes in this type of data is
    an effective way to embed a message.
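
A minimal sketch of LSB overwriting on a raw byte buffer is shown here. It assumes the cover is already a flat sequence of sample bytes (for example, decoded pixel values), that the payload is written into the first bytes in order, and that the receiver knows the payload length. The function names are illustrative only.

```python
def embed_lsb(cover: bytes, payload: bytes) -> bytes:
    """Overwrite the LSB of successive cover bytes with the payload bits."""
    # Unpack the payload into bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("payload is too large for this cover")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the bit
    return bytes(stego)


def extract_lsb(stego: bytes, n_bytes: int) -> bytes:
    """Read n_bytes of payload back out of the LSBs."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for sample in stego[8 * i: 8 * i + 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    cover = bytes(range(100, 164))
    stego = embed_lsb(cover, b"Hi")
    print(extract_lsb(stego, 2))
    print(max(abs(a - b) for a, b in zip(cover, stego)))  # no sample moves by more than 1
```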

16
LSB Parity Encoding
  • A variation of LSB overwriting is to convey
    message bits in the LSB parity of a group of
    bytes.
  • Freedom to alter any byte in the group in order
    to set the parity
  • Allows the steganographer to be more
    discriminating about what data is changed to
    minimize the perceptibility (like wet-paper
    coding).
  • Also causes the disturbance to the cover image to
    be more randomly distributed. Generally harder to
    detect than periodic disturbances.
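
The parity variant can be sketched as follows, under the assumption that groups are consecutive, non-overlapping runs of `group_size` bytes. The `choose` callback is a hypothetical hook for deciding which byte of a group to alter (for instance, the one whose change is least visible). When it is omitted, this sketch simply flips the first byte of the group.

```python
from typing import Callable, Optional


def group_parity(group: bytes) -> int:
    """Parity (0 or 1) of the LSBs in a group of bytes."""
    return sum(b & 1 for b in group) & 1


def embed_parity(cover: bytes, bits: list[int], group_size: int,
                 choose: Optional[Callable[[bytes], int]] = None) -> bytes:
    """Encode one message bit per group as the parity of the group's LSBs."""
    if len(bits) * group_size > len(cover):
        raise ValueError("not enough cover bytes for this message")
    stego = bytearray(cover)
    for g, bit in enumerate(bits):
        start = g * group_size
        group = bytes(stego[start:start + group_size])
        if group_parity(group) != bit:
            # Flipping the LSB of any single byte fixes the parity.
            offset = choose(group) if choose else 0
            stego[start + offset] ^= 1
    return bytes(stego)


def extract_parity(stego: bytes, n_bits: int, group_size: int) -> list[int]:
    """Recover the message bits from the group parities."""
    return [group_parity(stego[g * group_size:(g + 1) * group_size])
            for g in range(n_bits)]


if __name__ == "__main__":
    cover = bytes(range(40))
    message = [1, 0, 1, 1]
    stego = embed_parity(cover, message, group_size=10)
    print(extract_parity(stego, len(message), group_size=10))
```

At most one byte per group changes, and about half of the groups need no change at all, which is where the lower and less regular disturbance comes from.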

17
How to detect hidden messages
18
The stupid way
  • One way to know whether or not a data set
    contains hidden information is to learn every
    steganographic algorithm available, and check the
    data against each one.
  • Even if you had all the time in the world to try
    to pull this off, messages are generally
    compressed and/or encrypted before being
    embedded.
  • When you use a given algorithm to extract a
    hidden message, it will look like random bits.
    How do you know if it's a message or just garbage?

19
The smarter way
  • If a class of objects can be shown to share a
    particular set of characteristics, and these
    characteristics change after a message is
    embedded into the object, then these
    characteristics form the basis for a
    stego-analytical investigation of objects in this
    class.

20
Stupid example
  • A particular class of images is composed of all
    those images that are a single solid color.
  • After a message is embedded into such an image,
    it will no longer be a solid color.

21
LSB Steganalysis in natural images
22
Why natural images?
  • Natural images are images of things that exist in
    the real world: landscapes, people, food, ...
  • Digital photos
  • Natural choice for hiding messages because the
    high level of non-uniform detail makes subtle
    changes difficult to perceive.

23
Dumitrescu et al.
  • Technique to estimate the length of messages
    hidden in natural images.
  • Divides the image into pairs of adjacent pixels.
  • Puts each pair into one of four mutually
    exclusive primary sets.
  • The authors propose some assumptions about
    natural images that allow them to establish some
    properties regarding the size of the sets.
  • Natural images are presumed to be isotropic: the
    gradient in any direction is positive or negative
    with equal probability.
  • The sign of the gradient (of a pixel-pair) is
    independent of whether the second pixel in the
    pair is odd or even.

24
Dumitrescu et al. (cont.): Initial Sets
  • P is the set of all pixel pairs (u,v)
  • A pair (u,v) is even or odd based on whether v is
    even or odd (respectively).
  • Gradient of the pair is just u-v
  • X: all even pairs that have a negative gradient,
    and all odd pairs with positive gradient.
  • Y: opposite of X.
  • Z: all pairs with 0 gradient.
  • $P = X \cup Y \cup Z$

25
Dumitrescu et al. (cont.): Primary Sets
  • Y is subdivided into V and W
  • W is the set of all the pairs from Y which have a
    gradient of +/- 1
  • V is everything else from Y
  • The four primary sets are X, Z, V, and W
  • $P = X \cup Z \cup V \cup W$
  • All primary sets are mutually exclusive
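
The set definitions translate directly into a small classifier. This is a sketch only: the function names are made up, and pairing up non-overlapping horizontal neighbours in a flat list of pixel values is an assumption about how the image is divided into pairs.

```python
from collections import Counter


def classify_pair(u: int, v: int) -> str:
    """Return the primary set (X, V, W or Z) of the pixel pair (u, v)."""
    gradient = u - v
    if gradient == 0:
        return "Z"
    v_is_even = v % 2 == 0
    # X: even pairs with negative gradient, odd pairs with positive gradient.
    if (v_is_even and gradient < 0) or (not v_is_even and gradient > 0):
        return "X"
    # Everything else is in Y, which is split into W (gradient of +/- 1) and V.
    return "W" if abs(gradient) == 1 else "V"


def primary_set_sizes(pixels: list[int]) -> Counter:
    """Count how many non-overlapping adjacent pairs fall into each set."""
    pairs = zip(pixels[0::2], pixels[1::2])
    return Counter(classify_pair(u, v) for u, v in pairs)


if __name__ == "__main__":
    row = [4, 6, 5, 4, 3, 3, 6, 4]
    print(primary_set_sizes(row))
```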

26
Dumitrescu et al. (cont.): Primary Sets
  • The primary sets can be expressed as the
    following patterns of bit strings:
  • X: (Q0, [Q+N]0), (Q1, [Q+N]0), ([Q+N]0, Q1),
    ([Q+N]1, Q1)
  • V: ([Q+N]0, Q0), ([Q+N]1, Q0), (Q0, [Q+N]1),
    (Q1, [Q+N]1)
  • W: (Q1, Q0), (Q0, Q1)
  • Z: (Q0, Q0), (Q1, Q1)
  • Q is any string of (n-1) bits (for n-bit pixels),
    consistent in each pixel pair.
  • N is any integer value such that (Q+N) > Q;
    [Q+N] is the (n-1)-bit string for that larger value.
  • The trailing 0 and 1 are the least significant bits

27
Dumitrescu et al. (cont.): Primary Set Migration
  • Under LSB manipulation, each pair will undergo
    one of 4 possible mutations:
  • Both pixels changed
  • Neither pixel changed
  • One or the other changed (two possible cases)
  • Represent these as a pair of bits, where a 1
    means the corresponding pixel changed.

28
Dumitrescu et al. (cont.): Primary Set Migration
  • Using the bit pattern definitions for each of the
    4 primary sets, it is easy (but tedious) to see
    where pixel pairs in each set will end up under
    each possible mutation pattern.
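
The tedious part can be handed to a brute-force check. The sketch below assumes 8-bit pixels and reads the first bit of a mutation pattern as "first pixel changed". It enumerates every pair, flips the indicated LSBs, and records which primary set the pair lands in. The classifier and function names are illustrative.

```python
def classify_pair(u: int, v: int) -> str:
    """Return the primary set (X, V, W or Z) of the pixel pair (u, v)."""
    gradient = u - v
    if gradient == 0:
        return "Z"
    v_is_even = v % 2 == 0
    if (v_is_even and gradient < 0) or (not v_is_even and gradient > 0):
        return "X"
    return "W" if abs(gradient) == 1 else "V"


def migration_table(n_bits: int = 8) -> dict[tuple[str, str], set[str]]:
    """Map (source set, mutation pattern) to the set of destination sets."""
    table: dict[tuple[str, str], set[str]] = {}
    for u in range(2 ** n_bits):
        for v in range(2 ** n_bits):
            source = classify_pair(u, v)
            for mutation in ("00", "01", "10", "11"):
                # XOR with 1 flips the LSB; XOR with 0 leaves the pixel alone.
                u_new = u ^ int(mutation[0])
                v_new = v ^ int(mutation[1])
                table.setdefault((source, mutation), set()).add(
                    classify_pair(u_new, v_new))
    return table


if __name__ == "__main__":
    for (source, mutation), destinations in sorted(migration_table().items()):
        print(source, mutation, "->", ", ".join(sorted(destinations)))
```

Every (set, mutation) combination maps to exactly one destination. Pairs move only between X and V, or only between W and Z.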

29
Dumitrescu et al. (cont.): Primary Set Migration
30
Dumitrescu et al. (cont.): Effects of embedding on
cardinality
  • Using these migration patterns, we can generate
    expressions for the size of each set after
    embedding, in terms of the sizes before embedding
    and the probability of each kind of mutation.

31
Dumitrescu et al. (cont.): Effects of embedding on
cardinality
  • For example, the set X after embedding will be
    composed of pairs originally in X that underwent
    a 00 or 10 mutation, and all the pairs
    originally in V that underwent a 11 or 01
    mutation.
  • So the size of X after embedding, written |X'|,
    is given by
  • $|X'| = |X|\,(P(00) + P(10)) + |V|\,(P(11) + P(01))$

32
Dumitrescu et al. (cont.): Determining the length
of message
  • Using the two assumptions about natural images,
    it is easy to show that |X| = |Y| for unaltered
    images (no embedded message).
  • One further assumption is that the altered pixels
    are randomly distributed through the image.
  • This allows us to express the probability of each
    kind of mutation in terms of p, the ratio of
    message bits to image pixels.
  • Some simple arithmetic can now be used to find a
    quadratic expression for p, based on the
    equations for the sizes of each primary set after
    embedding. Primes denote the sets as measured
    after embedding; |W'| + |Z'| = |W| + |Z| because
    pairs only ever move between W and Z.
  • $0.5\,(|W'| + |Z'|)\,p^2 + (2|X'| - |P|)\,p + |Y'| - |X'| = 0$
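
The mutation probabilities can be written out explicitly under one extra modelling assumption that is not spelled out on the slide: taking p as the number of message bits per pixel, each pixel's LSB is flipped independently with probability p/2, since a message bit that looks random already matches the existing LSB about half the time. With primes marking sets measured after embedding, a sketch of the resulting expressions is:

```latex
\begin{align*}
P(00) &= \left(1 - \tfrac{p}{2}\right)^{2} \\
P(01) = P(10) &= \tfrac{p}{2}\left(1 - \tfrac{p}{2}\right) \\
P(11) &= \left(\tfrac{p}{2}\right)^{2} \\
|X'| &= |X|\,\bigl(P(00) + P(10)\bigr) + |V|\,\bigl(P(11) + P(01)\bigr)
      = |X|\left(1 - \tfrac{p}{2}\right) + |V|\,\tfrac{p}{2}
\end{align*}
```

The same substitution applied to V, W and Z, together with |X| = |Y| for the unaltered image, is what reduces to a quadratic in p.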

33
Dumitrescu et al. (cont.): Determining the length
of message
  • $0.5\,(|W'| + |Z'|)\,p^2 + (2|X'| - |P|)\,p + |Y'| - |X'| = 0$
    (primes denote the sets as measured after embedding)
  • In most cases, this should yield two values for
    p.
  • The actual length of the embedded message will
    be given by the smaller of the two values.
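
Solving for p is then a direct application of the quadratic formula. The sketch below is illustrative and not the authors' implementation: the function name is made up, and its inputs are the sizes of the four primary sets as measured on the image under test. The coefficients used are 0.5(|W| + |Z|), 2|X| - |P| and |Y| - |X|, with |Y| = |V| + |W| and |P| = |X| + |Y| + |Z|.

```python
import math
from typing import Optional


def estimate_p(x: int, v: int, w: int, z: int) -> Optional[float]:
    """Estimate p (message bits per pixel) from the measured primary set sizes.

    Returns the smaller real root of the quadratic, or None if there is none.
    """
    y = v + w                # Y is the union of V and W
    total = x + y + z        # P is the union of X, Y and Z
    a = 0.5 * (w + z)
    b = 2 * x - total
    c = y - x
    if a == 0:
        # Degenerate case: the equation is linear in p.
        return -c / b if b != 0 else None
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return None
    root = math.sqrt(discriminant)
    return min((-b - root) / (2 * a), (-b + root) / (2 * a))


if __name__ == "__main__":
    # Synthetic sizes consistent with a clean image of |X|=400, |V|=300,
    # |W|=100, |Z|=200 after embedding at p = 0.4 under the random-flip model.
    print(estimate_p(x=380, v=320, w=132, z=168))
```

Multiplying the estimate by the number of pixels gives the estimated message length in bits.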

34
Dumitrescu et al. (cont.): Back in reality
  • This is a really clever idea using simple
    measurements and calculations.
  • The assumptions about natural images are not
    perfectly accurate.
  • A test batch of assorted natural images yielded
    believable false positives.
  • p values as high as 5%, which gives a message
    length of 30kB. In fact, there was no message in
    the image.

35
Dumitrescu et al. (cont.): Back in reality
  • Additionally, the assumption about the
    probability of each kind of mutation is way off
    for a lot of common embedding schemes.
  • For instance, if we embed a bit into the LSB of
    every k-th pixel (for k > 1), then the probability
    of both pixels in the pair being altered (i.e.,
    P(11)) is 0.
  • Same for parity encoding where the group size is
    an even number (so no pixel pair spans two
    groups).
  • The probability of each kind of alteration is
    much more complex than given in the paper, which
    changes the expressions for cardinality after
    embedding.
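
The every-k-th-pixel case is easy to check with a small simulation. This sketch assumes non-overlapping pairs of consecutive pixels and random message bits, so a touched pixel actually changes about half the time. The function name and parameters are illustrative.

```python
import random
from collections import Counter


def mutation_frequencies(n_pixels: int, k: int, seed: int = 0) -> dict[str, float]:
    """Empirical frequency of each mutation pattern when every k-th pixel is used."""
    rng = random.Random(seed)
    changed = [False] * n_pixels
    for i in range(0, n_pixels, k):
        # A random message bit differs from the existing LSB half the time.
        changed[i] = rng.random() < 0.5
    counts: Counter = Counter()
    for i in range(0, n_pixels - 1, 2):
        counts[f"{int(changed[i])}{int(changed[i + 1])}"] += 1
    total = sum(counts.values())
    return {m: counts[m] / total for m in ("00", "01", "10", "11")}


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, mutation_frequencies(20000, k))
```

For k = 1 the 11 pattern shows up in roughly a quarter of the pairs, as the random-flip model predicts. For any k > 1 it never occurs, so the model's P(11) overstates what actually happens.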

36
Conclusions
  • Steganography is really cool
  • It's fun to play with, and basic embedding schemes
    are easy to implement but fairly effective.
  • Obviously has a lot of good and bad applications,
    as with any technology.
  • Steganalysis is still playing catch up
  • Much like with cryptanalysis, early approaches
    were brute-force and clumsy.
  • New approaches involving statistical
    classification are much more promising, but still
    have a ways to go.

37
The end
This image of the Mona Lisa has been embedded
into the Stego-saurus background with LSB
parity encoding, with a group size of 10 pixels.
Dinosaur image thanks to stegosaurus.org