Feature-Based Steganalysis for JPEG images and its applications for future design of steganographic schemes. - Jessica Fridrich

1
Feature-Based Steganalysis for JPEG images and
its applications for future design of
steganographic schemes.
- Jessica Fridrich
  • Submitted by

  • Praveena Gummadi

  • Vandana Vasudeva

2
Contents
  1. Abstract
  2. Previous work
  3. Proposed Research
  4. Experimental results
  5. Conclusion
  6. Comments
  7. Acknowledgements

3
Abstract
  • The goal of forensic steganalysis is to detect
    the presence of embedded data and, eventually, to
    extract the secret message. The paper introduces
    a new feature-based steganalytic method for JPEG
    images. The detection method is a
    linear classifier trained on feature vectors
    corresponding to cover and stego images. Each
    feature is calculated as the L1 norm of the
    difference between a specific macroscopic
    functional calculated from the stego image and the
    same functional obtained from a decompressed,
    cropped, and recompressed stego image. The
    functionals are built from marginal and joint
    statistics of DCT coefficients. Because the
    features are calculated directly from DCT
    coefficients, conclusions can be drawn about the
    impact of embedding modifications on detectability.
  • Three different steganographic examples are
    tested and compared. The experimental results
    reveal new facts about current steganographic
    methods for JPEGs and new design principles for
    more secure JPEG steganography.

4
Previous work on Steganalytic methods
  • Chi-square attack by Westfeld: the original
    version could detect sequentially embedded
    messages and was based on first-order statistics.
  • Methods based on a distinguishing statistic: the
    steganalyst first inspects the embedding algorithm
    and then identifies a quantity (the distinguishing
    statistic, DS) that changes predictably with the
    length of the embedded message. For JPEG images,
    calibration is done by decompressing the stego
    image, cropping it by a few pixels in each
    direction, and recompressing it using the same
    quantization table. The DS calculated from this
    image is used as an estimate of the same quantity
    for the cover image. Using this calibration, a
    highly accurate and reliable estimate of the
    embedded message length can be constructed for
    many schemes.
  • Blind classifiers by Memon and Farid: a blind
    detector learns what a typical unmodified image
    looks like in a multidimensional feature space,
    and a classifier is then trained to learn the
    difference between cover and stego image
    features.
  • The introduction of blind detectors prompted
    further research in steganography. Tzschoppe
    constructed a JPEG steganographic scheme, HPDM
    (histogram preserving data mapping), which is
    undetectable using Farid's scheme but is easily
    detectable using a single scalar feature, the
    calibrated spatial blockiness. This suggests
    computing calibrated features directly in the DCT
    domain rather than from a wavelet decomposition.

5
Proposed Research
  • The paper combines the concept of calibration
    with feature-based classification to devise a
    blind detector specific to JPEG images.
  • Calibrated Features
  • Two types of features are used in the analysis:
    first-order features and second-order features.
    All features are constructed in the following
    manner.
  • A vector functional F is applied to the stego
    JPEG image J1. This functional could be the global
    DCT coefficient histogram, a co-occurrence matrix,
    spatial blockiness, etc. The stego image J1 is
    decompressed to the spatial domain, cropped by 4
    pixels in each direction, and then recompressed
    with the same quantization table as J1 to obtain
    J2. The same vector functional F is then applied
    to J2. The final feature f is obtained as the L1
    norm of the difference:
  • $f = \lVert F(J_1) - F(J_2) \rVert_{L_1}$
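
The construction above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the paper's implementation: the "JPEG" here is a toy codec (8 x 8 DCT plus uniform quantization, no entropy coding), "cropped by 4 pixels in each direction" is read as removing 4 pixels from every side, the histogram functional is normalized so that images with different block counts are comparable, and all function names are hypothetical.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix; D @ block @ D.T is the 2-D DCT of a block."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d

D = dct_matrix()

def to_blocks(pixels):
    """Split an image into 8x8 blocks, dropping incomplete edge blocks."""
    h, w = (pixels.shape[0] // 8) * 8, (pixels.shape[1] // 8) * 8
    p = pixels[:h, :w].reshape(h // 8, 8, w // 8, 8)
    return p.transpose(0, 2, 1, 3)           # shape (rows, cols, 8, 8)

def compress(pixels, q):
    """Toy JPEG encoder: quantized DCT coefficients of every 8x8 block."""
    blocks = to_blocks(pixels.astype(float) - 128.0)
    return np.rint((D @ blocks @ D.T) / q).astype(int)

def decompress(coeffs, q):
    """Toy JPEG decoder: dequantize, inverse DCT, round to 8-bit pixels."""
    rows, cols = coeffs.shape[:2]
    blocks = D.T @ (coeffs * q) @ D + 128.0
    pixels = blocks.transpose(0, 2, 1, 3).reshape(rows * 8, cols * 8)
    return np.clip(np.rint(pixels), 0, 255)

def calibrate(coeffs, q, crop=4):
    """J1 -> J2: decompress, crop 4 pixels per side, recompress with the same q."""
    pixels = decompress(coeffs, q)
    return compress(pixels[crop:-crop, crop:-crop], q)

def global_histogram(coeffs, lo=-8, hi=8):
    """Example functional F: normalized histogram; tails are lumped into end bins."""
    vals = np.clip(coeffs.ravel(), lo, hi)
    counts = np.bincount(vals - lo, minlength=hi - lo + 1)
    return counts / counts.sum()

def calibrated_feature(coeffs, q, functional=global_histogram):
    """f = || F(J1) - F(J2) ||_L1 for a given vector functional F."""
    j2 = calibrate(coeffs, q)
    return np.abs(functional(coeffs) - functional(j2)).sum()

# Usage on a synthetic image (a stand-in for a real stego JPEG).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64))
q_table = np.full((8, 8), 16.0)               # hypothetical flat quantization table
j1 = compress(image, q_table)
print(calibrated_feature(j1, q_table))
```

Any other functional that maps a coefficient array to a vector can be passed as `functional`, which mirrors how the slides apply the same F to both J1 and J2.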

7
  • The basic logic behind this choice of features
    is the following.
  • Cropping and recompression produce a
    calibrated image with most macroscopic features
    similar to the original cover image, because the
    cropped stego image is perceptually similar to
    the cover image and thus its DCT coefficients
    have approximately the same statistical
    properties as those of the cover image.
  • Cropping by 4 pixels is important because the
    8 x 8 grid of the recompression does not see the
    previous JPEG compression, and thus the obtained
    DCT coefficients are not influenced by the
    previous quantization in the DCT domain.
  • We can think of the cropped/recompressed image as
    an approximation to the cover image or as side
    information.

8
  • First order Features
  • The simplest first-order statistic of DCT
    coefficients is their histogram. Suppose the
    stego JPEG file is represented by a DCT
    coefficient array $d_k(i,j)$ and a quantization
    matrix $Q(i,j)$, where k indexes the B blocks of
    the image. The global histogram of all 64B DCT
    coefficients is denoted $H_r$, where
    $r = L, \ldots, R$, $L = \min_{k,i,j} d_k(i,j)$, and
    $R = \max_{k,i,j} d_k(i,j)$.
  • For a fixed DCT mode (i,j), let $h_r^{ij}$ denote
    the individual histogram of the values $d_k(i,j)$.
    To provide additional first-order macroscopic
    statistics to our set of functionals, we use the
    dual histogram, given as
    $g_{ij}^{d} = \sum_{k=1}^{B} \delta(d, d_k(i,j))$
  • where $\delta(u,v) = 1$ if $u = v$ and 0 otherwise.
  • The value $g_{ij}^{d}$ is the number of times the
    value d occurs as the (i,j)-th DCT coefficient
    over all B blocks in the JPEG image.
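
As a rough illustration of the three first-order functionals, the sketch below assumes the quantized coefficients are already available as an integer NumPy array of shape (B, 8, 8). The function names are hypothetical, and the DCT mode indices are 0-based here, whereas the slides count modes from 1.

```python
import numpy as np

def global_histogram(d):
    """H_r for r = L..R over all coefficients; d has shape (B, 8, 8)."""
    L, R = int(d.min()), int(d.max())
    H = np.bincount((d - L).ravel(), minlength=R - L + 1)
    return L, R, H                      # H[r - L] is the count of value r

def mode_histogram(d, i, j, L, R):
    """h_r^{ij}: histogram of the (i, j)-th coefficient over all blocks."""
    return np.bincount(d[:, i, j] - L, minlength=R - L + 1)

def dual_histogram(d, value):
    """g_{ij}^{value}: 8x8 matrix counting occurrences of `value` per mode."""
    return (d == value).sum(axis=0)

# Usage on a tiny hand-made coefficient array with B = 3 blocks.
d = np.zeros((3, 8, 8), dtype=int)
d[0, 0, 0] = 5
d[1, 0, 0] = 5
d[2, 1, 1] = -2
L, R, H = global_histogram(d)
print(L, R, H)
print(dual_histogram(d, 5)[0, 0])
```

The dual histogram is the same counting operation as the individual histogram, viewed with the value fixed and the mode varying instead of the other way around.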

9
Second order Features
  • If the corresponding DCT coefficients from
    different blocks were independent, then any
    embedding scheme that preserves the first-order
    statistics (the histogram) would be undetectable
    by Cachin's definition of steganographic
    security. Thus we use features that capture
    inter-block dependencies, as these are likely to
    be violated by most steganographic algorithms.
  • Let $I_r$ and $I_c$ denote the vectors of block
    indices obtained while scanning the image by rows
    and by columns, respectively. The first functional
    capturing inter-block dependencies is the
    variation V, which sums the absolute differences
    between corresponding DCT coefficients of blocks
    that are adjacent in $I_r$ and in $I_c$.
  • Embedding changes also increase the
    discontinuities along the 8 x 8 block boundaries,
    thus two blockiness measures $B_\alpha$,
    $\alpha = 1, 2$, are included in our set of
    functionals. The blockiness is calculated from
    the decompressed JPEG image and represents an
    integral measure of inter-block dependency over
    all DCT modes over the whole image.
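
The variation and the two blockiness measures might be computed along the following lines. This is a sketch under assumptions: coefficients are stored as an array of shape (rows, cols, 8, 8), the variation is normalized by |I_r| + |I_c|, and the blockiness is the mean of |difference|^alpha across horizontal and vertical block boundaries. The exact normalizations are not recoverable from the transcript, and the function names are hypothetical.

```python
import numpy as np

def variation(coeffs):
    """V: absolute differences of same-mode coefficients in consecutive blocks,
    scanning blocks by rows (I_r) and by columns (I_c)."""
    rows_scan = coeffs.reshape(-1, 8, 8)
    cols_scan = coeffs.transpose(1, 0, 2, 3).reshape(-1, 8, 8)
    total = (np.abs(np.diff(rows_scan, axis=0)).sum()
             + np.abs(np.diff(cols_scan, axis=0)).sum())
    return total / (len(rows_scan) + len(cols_scan))

def blockiness(pixels, alpha):
    """B_alpha: mean |x - x'|^alpha over pixel pairs straddling 8x8 boundaries."""
    x = pixels.astype(float)
    M, N = x.shape
    nr, nc = (M - 1) // 8, (N - 1) // 8
    r = 8 * np.arange(1, nr + 1)         # 0-based index of the row below each boundary
    c = 8 * np.arange(1, nc + 1)         # 0-based index of the column right of each boundary
    horiz = np.abs(x[r - 1, :] - x[r, :]) ** alpha
    vert = np.abs(x[:, c - 1] - x[:, c]) ** alpha
    return (horiz.sum() + vert.sum()) / (N * nr + M * nc)

# Usage: a 16x16 image with a step of height 10 exactly on a block boundary.
step = np.zeros((16, 16))
step[8:, :] = 10
print(blockiness(step, 1), blockiness(step, 2))
```

In the calibrated setting, each of these functionals would be evaluated on both J1 and J2 and the absolute difference taken as the feature.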

10
  • In the blockiness expression, M and N are the
    image dimensions and $x_{ij}$ are the grayscale
    values of the decompressed JPEG image.
  • The final three functionals are calculated from
    the co-occurrence matrix C of neighboring DCT
    coefficients, which is a D x D matrix with
    $D = R - L + 1$. Its entries $C_{st}$ describe
    the probability distribution of pairs (s, t) of
    neighboring DCT coefficients.
  • Let $C(J_1)$ and $C(J_2)$ be the co-occurrence
    matrices for JPEG images J1 and J2, respectively.
    Due to the approximate symmetry of $C_{st}$ around
    $(s, t) = (0, 0)$, the differences
    $C_{st}(J_1) - C_{st}(J_2)$ for
    $(s, t) \in \{(0,1), (1,0), (-1,0), (0,-1)\}$ are
    strongly positively correlated, and the same is
    true for the group
    $(s, t) \in \{(1,1), (-1,1), (1,-1), (-1,-1)\}$.
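
A co-occurrence matrix of this kind could be accumulated as in the sketch below. It assumes coefficients of shape (rows, cols, 8, 8), pairs the same DCT mode in consecutive blocks of the row scan and of the column scan, clips values to [L, R], and normalizes so that the entries sum to 1. The clipping and the normalization are illustrative choices, and the function name is hypothetical.

```python
import numpy as np

def cooccurrence(coeffs, L, R):
    """C[s - L, t - L]: relative frequency of value s in one block followed by
    value t in the same DCT mode of the next block (row and column scans)."""
    size = R - L + 1
    C = np.zeros((size, size))
    rows_scan = coeffs.reshape(-1, 8, 8)
    cols_scan = coeffs.transpose(1, 0, 2, 3).reshape(-1, 8, 8)
    for scan in (rows_scan, cols_scan):
        s = np.clip(scan[:-1], L, R).ravel() - L   # value in block k
        t = np.clip(scan[1:], L, R).ravel() - L    # value in block k + 1
        np.add.at(C, (s, t), 1)                    # unbuffered add handles repeated pairs
    return C / C.sum()

# Usage: two blocks, all zeros followed by all ones.
coeffs = np.zeros((1, 2, 8, 8), dtype=int)
coeffs[0, 1] = 1
C = cooccurrence(coeffs, L=-1, R=1)
print(C)
```

`np.add.at` is used instead of plain fancy-index assignment because the same (s, t) pair occurs many times and each occurrence has to be counted.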

11
  • The co-occurrence matrix of the embedded image
    can be obtained as a convolution $C * P(q)$, where
    P is the probability distribution of the embedding
    distortion, which depends on the relative message
    length q. The values of $C * P(q)$ are therefore
    more spread out than those of C, and three
    quantities measuring this spread were taken as
    our features.
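
The formulas for the three features did not survive in the transcript. A form consistent with the two correlated groups of (s, t) pairs named on the previous slide, offered here as an assumption rather than a quotation, is the central difference plus one summed difference per group:

```latex
\begin{aligned}
N_{00} &= C_{0,0}(J_1) - C_{0,0}(J_2), \\
N_{01} &= \sum_{(s,t) \in \{(0,1),\,(1,0),\,(-1,0),\,(0,-1)\}} \bigl( C_{s,t}(J_1) - C_{s,t}(J_2) \bigr), \\
N_{11} &= \sum_{(s,t) \in \{(1,1),\,(-1,1),\,(1,-1),\,(-1,-1)\}} \bigl( C_{s,t}(J_1) - C_{s,t}(J_2) \bigr).
\end{aligned}
```

Summing within each group is natural because the individual differences in a group are strongly positively correlated, so keeping them separate would add little information.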

12
  • The final set of 23 functionals used in the
    paper is summarized in the table below.

13
Experiment
  • The paper used the Greenspun image database
    consisting of 1814 images of size 780x540. All
    these images were converted to grayscale, the
    black border frame was cropped away, and the
    images were compressed using 80% quality JPEG.
    The paper selected three different steganographic
    algorithms for JPEG images, namely the F5
    algorithm, OutGuess 0.2, and model-based
    steganography without and with deblocking (MB1
    and MB2).
  • Each technique was analyzed separately. For a
    fixed relative message length, expressed in bits
    per non-zero DCT coefficient of the cover image,
    a training database of embedded images was
    created. The Fisher linear discriminant
    classifier was trained on 1314 cover and stego
    images. The generalized eigenvector obtained
    from this training was then used to calculate the
    ROC curve for the remaining 500 cover and stego
    images. The detection performance was evaluated
    using the detection reliability ρ, defined as
  • $\rho = 2A - 1$,
  • where A is the area under the ROC (receiver
    operating characteristic) curve, also called
    accuracy. The accuracy is scaled so that ρ = 1
    for perfect detection and ρ = 0 when the ROC
    coincides with the diagonal line (where the
    reliability of detection is 0). The detection
    reliability of all three methods is shown in
    Table 2.
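
The evaluation step can be illustrated with a small NumPy sketch. It assumes feature vectors are stored row-wise, uses the two-class closed form of the Fisher linear discriminant (within-class scatter inverse times the difference of class means, which is the generalized eigenvector in the two-class case), and computes the ROC area as the probability that a stego score exceeds a cover score. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def fisher_direction(cover, stego):
    """Projection vector w = Sw^{-1} (mean_stego - mean_cover)."""
    mc, ms = cover.mean(axis=0), stego.mean(axis=0)
    sw = (np.atleast_2d(np.cov(cover, rowvar=False)) * (len(cover) - 1)
          + np.atleast_2d(np.cov(stego, rowvar=False)) * (len(stego) - 1))
    return np.linalg.solve(sw, ms - mc)   # a singular sw would need regularization

def roc_area(cover_scores, stego_scores):
    """Area under the ROC curve: P(stego score > cover score), ties count 1/2."""
    c = np.asarray(cover_scores, dtype=float)[:, None]
    s = np.asarray(stego_scores, dtype=float)[None, :]
    return ((s > c).sum() + 0.5 * (s == c).sum()) / (c.size * s.size)

def detection_reliability(cover_scores, stego_scores):
    """rho = 2A - 1: 1 for perfect detection, 0 for a ROC on the diagonal."""
    return 2.0 * roc_area(cover_scores, stego_scores) - 1.0

# Usage with synthetic 3-dimensional features (stand-ins for the 23 real ones).
rng = np.random.default_rng(0)
cover_train, cover_test = rng.normal(size=(300, 3)), rng.normal(size=(100, 3))
shift = np.array([2.0, 0.0, 0.0])
stego_train = rng.normal(size=(300, 3)) + shift
stego_test = rng.normal(size=(100, 3)) + shift
w = fisher_direction(cover_train, stego_train)
print(detection_reliability(cover_test @ w, stego_test @ w))
```

Training and testing on disjoint images, as in the 1314/500 split described above, keeps the reported reliability from being inflated by overfitting.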

14
  • Table 2.
  • Detection reliability ρ for the F5 algorithm with
    matrix embedding $(1, k, 2^k - 1)$, F5 with matrix
    embedding turned off, OutGuess 0.2, and
    model-based steganography without and with
    deblocking (MB1 and MB2) for different embedding
    rates.

15

Figure 1. Capacity for the tested techniques
expressed in bits per non-zero DCT coefficient.

16

Figure 2. ROC curves for embedding
capacities and methods from table 2.
17

Table 3. Detection reliability for individual
features for all three embedding algorithms for
fully embedded images.
18
Conclusion
  • From Table 2 we can see that the OutGuess
    algorithm is the most detectable and also provides
    the smallest capacity. The detection reliability
    is relatively high even for embedding rates as
    small as 0.05 bpc, and the method becomes highly
    detectable for messages above 0.1 bpc.
  • The F5 algorithm performs better than OutGuess
    even when matrix embedding is turned off. Matrix
    embedding further decreases the detectability of
    short messages because it improves the embedding
    efficiency.
  • From Table 3 it can be seen that both MB1 and MB2
    clearly have the best performance of all
    three tested algorithms. MB1 preserves not only
    the global histogram but all marginal statistics
    (histograms) for each individual DCT mode. MB2
    has the same embedding mechanism as MB1 but
    reserves one half of the capacity for
    modifications that bring the blockiness of the
    stego image back to its original value.

19
Comments
  • Looking at the results in Table 1 and Table 2,
    there is no doubt that model-based steganography
    (MB1 and MB2) is by far the most secure of the
    three tested paradigms. MB1 and MB2 preserve not
    only the global histogram but also all histograms
    of individual DCT coefficients, and hence all dual
    histograms are also preserved. MB2 also preserves
    one second-order functional, the L1 blockiness.
    Thus, we conclude that the more statistical
    measures an embedding method preserves, the more
    difficult it is to detect.
  • One surprising fact revealed is that preserving a
    specific functional does not mean that the
    calibrated feature will be preserved. Preserving
    the blockiness along the original 8x8 grid does
    not mean that the blockiness along the shifted
    grid will also be preserved. This is because the
    embedding and deblocking changes are likely to
    introduce distortion into the middle of the
    blocks and thus disturb the blockiness feature,
    which is the difference between the blockiness
    along the solid and dashed lines shown in
    Figure 3 below.

20
Figure 3
21
  • It is further pointed out that the features
    derived from the co-occurrence matrix are very
    influential for all three schemes, especially for
    the model-based steganographic methods.
  • MB2 is currently the only JPEG steganographic
    method that takes into account the inter-block
    dependencies between DCT coefficients, that is,
    the probability distribution of coefficient pairs
    from neighboring blocks.

22
Acknowledgements
  • Information Hiding: 6th International Workshop,
    IH 2004, Toronto, Canada, May 23-25, 2004, Revised
    Selected Papers
  • By Jessica Fridrich
  • Published by Springer, 2004
  • Determining the stego algorithm for JPEG
    images. Pevny, T., Fridrich, J., Dept. of
    Computer Science, State Univ. of New York,
    Binghamton, NY

23
  • Thank You!