Introduction to Wavelet Transform - PowerPoint PPT Presentation

1
Introduction to Wavelet Transform
2
Time Series are Ubiquitous!
A random sample of 4,000 graphics from 15 of the
world's newspapers published from 1974 to 1989
found that more than 75% of all graphics were
time series (Tufte, 1983).
3
Why is Working With Time Series so Difficult?
Answer: We are dealing with subjective notions of
similarity.
The definition of similarity depends on the
user, the domain and the task at hand. We need to
be able to handle this subjectivity.
4
Wavelet Transform - Overview
History
  • Fourier (1807)
  • Haar (1910)

5
Wavelet Transform - Overview
  • What kind of basis function could be useful?
  • Impulse function (Haar): best time resolution
  • Sinusoids (Fourier): best frequency resolution
  • We want the best of both resolutions
  • Heisenberg (1930)
  • Uncertainty Principle
  • There is a lower bound on the joint
    time-frequency resolution (an intuitive proof
    is given in [Mac91])

6
Wavelet Transform - Overview
  • Gabor (1945)
  • Short Time Fourier Transform (STFT)
  • Disadvantage: fixed window size

7
Wavelet Transform - Overview
  • Constructing Wavelets
  • Daubechies (1988)
  • Compactly Supported Wavelets
  • Computation of WT Coefficients
  • Mallat (1989)
  • A fast algorithm using filter banks

8
Discrete Fourier Transform I
Basic Idea: Represent the time series as a linear
combination of sines and cosines, but keep only
the first n/2 coefficients. Why n/2
coefficients? Because each sine wave requires 2
numbers, for the phase (w) and amplitude (A, B).
[Figure: time series X and its DFT approximation X'; portrait of Jean Fourier, 1768-1830]
Excellent free Fourier primer: Hagit Shatkay, "The
Fourier Transform - A Primer", Technical Report
CS-95-37, Department of Computer Science, Brown
University, 1995. http://www.ncbi.nlm.nih.gov/CBBresearch/Postdocs/Shatkay/
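The keep-the-first-n/2-coefficients idea above can be sketched in a few lines of numpy (this example and its signal are illustrative, not from the slides):

```python
import numpy as np

# Approximate a time series by keeping only its first few DFT coefficients.
t = np.linspace(0, 1, 128, endpoint=False)
x = np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 7 * t)

coeffs = np.fft.rfft(x)            # complex coefficients: amplitude + phase
keep = 8                           # keep only the first 8 coefficients
truncated = np.zeros_like(coeffs)
truncated[:keep] = coeffs[:keep]
x_approx = np.fft.irfft(truncated, n=len(x))

# Both sinusoids (frequency bins 3 and 7) fall inside the kept range,
# so the reconstruction error is tiny.
print(np.max(np.abs(x - x_approx)))
```

For a natural signal the energy concentrates in the low-frequency bins, which is why this truncation compresses well.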
9
Discrete Fourier Transform II
  • Pros and Cons of DFT as a time series
    representation.
  • Good ability to compress most natural signals.
  • Fast, off the shelf DFT algorithms exist.
    O(nlog(n)).
  • (Weakly) able to support time warped queries.
  • Difficult to deal with sequences of different
    lengths.
  • Cannot support weighted distance measures.

[Figure: time series X and its DFT reconstruction X']
Note: The related transform, the DCT, uses only cosine
basis functions. It does not seem to offer any
particular advantage over the DFT.
10
History
11
Discrete Wavelet Transform I
Basic Idea: Represent the time series as a linear
combination of wavelet basis functions, but keep
only the first N coefficients. Although there
are many different types of wavelets, researchers
in time series mining/indexing generally use Haar
wavelets. Haar wavelets seem to be as powerful
as the other wavelets for most problems and are
very easy to code.
Alfred Haar 1885-1933
Excellent free wavelets primer: Stollnitz, E.,
DeRose, T., Salesin, D. (1995). "Wavelets for
computer graphics: A primer". IEEE Computer
Graphics and Applications.
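The slide says Haar wavelets are very easy to code; here is a minimal sketch (function name and test signal are illustrative assumptions). One level is just pairwise averages and differences scaled by 1/sqrt(2); iterating on the averages gives the full decomposition, and keeping only the first N output coefficients gives the compressed representation:

```python
import numpy as np

def haar_dwt(x):
    """Full orthonormal Haar decomposition of a signal whose length is a power of two."""
    x = np.asarray(x, dtype=float)
    out = []
    approx = x
    while len(approx) > 1:
        avg = (approx[0::2] + approx[1::2]) / np.sqrt(2)  # coarser approximation
        det = (approx[0::2] - approx[1::2]) / np.sqrt(2)  # detail coefficients
        out.append(det)
        approx = avg
    out.append(approx)
    return np.concatenate(out[::-1])  # coarsest coefficients first

x = np.array([8, 4, 1, 3], dtype=float)
print(haar_dwt(x))
```

With the coarsest coefficients first, truncating to the first N entries keeps the large-scale shape of the series.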
12
Wavelet Series
13
Discrete Wavelet Transform III
  • Pros and Cons of Wavelets as a time series
    representation.
  • Good ability to compress stationary signals.
  • Fast linear time algorithms for DWT exist.
  • Able to support some interesting non-Euclidean
    similarity measures.
  • Signals must have a length n = 2^(some integer)
  • Works best if N is 2^(some integer). Otherwise
    wavelets approximate the left side of the signal at
    the expense of the right side.
  • Cannot support weighted distance measures.

14
Singular Value Decomposition I
Basic Idea: Represent the time series as a linear
combination of eigenwaves, but keep only the first
N coefficients. SVD is similar to the Fourier and
wavelet approaches: we represent the data in
terms of a linear combination of shapes (in this
case, eigenwaves). SVD differs in that the
eigenwaves are data dependent. SVD has been
successfully used in the text processing
community (where it is known as Latent Semantic
Indexing) for many years. Good free SVD primer:
"Singular Value Decomposition - A Primer", Sonia
Leach.
[Figure: time series X and its SVD approximation X'; portraits of James Joseph Sylvester 1814-1897, Camille Jordan 1838-1921, Eugenio Beltrami 1835-1899]
15
Singular Value Decomposition II
How do we create the eigenwaves?
We have previously seen that we can regard time
series as points in high dimensional space. We
can rotate the axes such that axis 1 is aligned
with the direction of maximum variance, axis 2 is
aligned with the direction of maximum variance
orthogonal to axis 1 etc. Since the first few
eigenwaves contain most of the variance of the
signal, the rest can be truncated with little
loss.
[Figure: time series X and its rank-N SVD approximation X']
This process can be achieved by factoring an M-by-n
matrix of time series into 3 other matrices,
and truncating the new matrices at size N.
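The factor-and-truncate step can be sketched with numpy (the rank-2 synthetic data and the variable names are illustrative assumptions):

```python
import numpy as np

# Factor an M-by-n matrix of time series with the SVD,
# then keep only the first N singular values/vectors.
rng = np.random.default_rng(1)
M, n, N = 20, 8, 2
X = rng.standard_normal((M, 2)) @ rng.standard_normal((2, n))  # rank-2 data

U, svals, Vt = np.linalg.svd(X, full_matrices=False)
X_approx = U[:, :N] * svals[:N] @ Vt[:N, :]   # truncate the factors at size N

# Because the data really is rank 2, truncation at N = 2 loses almost nothing.
print(np.max(np.abs(X - X_approx)))
```

The rows of Vt are the data-dependent eigenwaves; real time series are only approximately low-rank, so truncation discards the low-variance directions.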
16
Singular Value Decomposition III
  • Pros and Cons of SVD as a time series
    representation.
  • Optimal linear dimensionality reduction
    technique .
  • The eigenvalues tell us something about the
    underlying structure of the data.
  • Computationally very expensive.
  • Time: O(Mn²)
  • Space: O(Mn)
  • An insertion into the database requires
    recomputing the SVD.
  • Cannot support weighted distance measures or
    non-Euclidean measures.

[Figure: time series X and its SVD approximation X']
Note: There has been some promising research into
mitigating SVD's time and space complexity.
17
Piecewise Linear Approximation I
Basic Idea: Represent the time series as a
sequence of straight lines. If lines must be
connected, we are allowed N/2 lines;
if lines may be disconnected, we are
allowed only N/3 lines. Personal experience on
dozens of datasets suggests disconnected is
better. Also, only the disconnected version allows a
lower-bounding Euclidean approximation.
[Figure: time series X and its piecewise linear approximation X'; portrait of Karl Friedrich Gauss, 1777-1855]
  • Connected: each line segment has
  •   length
  •   left_height
  •   (right_height can be inferred by looking at the
      next segment)
  • Disconnected: each line segment has
  •   length
  •   left_height
  •   right_height
18
Problem with Fourier
Fourier analysis breaks down a signal into
constituent sinusoids of different frequencies.
A serious drawback: in transforming to the
frequency domain, time information is lost. When
looking at the Fourier transform of a signal, it is
impossible to tell when a particular event took
place.
19
Function Representations
  • sequence of samples (time domain)
  • finite difference method
  • pyramid (hierarchical)
  • polynomial
  • sinusoids of various frequency (frequency domain)
  • Fourier series
  • piecewise polynomials (finite support)
  • finite element method, splines
  • wavelet (hierarchical, finite support)
  • (time/frequency domain)

20
What Are Wavelets?
  • In general, a family of representations using
  • hierarchical (nested) basis functions
  • finite (compact) support
  • basis functions often orthogonal
  • fast transforms, often linear-time

21
Function Representations Desirable Properties
  • generality approximate anything well
  • discontinuities, nonperiodicity, ...
  • adaptable to application
  • audio, pictures, flow field, terrain data, ...
  • compact approximate function with few
    coefficients
  • facilitates compression, storage, transmission
  • fast to compute with
  • differential/integral operators are sparse in
    this basis
  • Convert n-sample function to representation in
    O(nlogn) or O(n) time

22
Wavelet History, Part 1
  • 1805 Fourier analysis developed
  • 1965 Fast Fourier Transform (FFT) algorithm
  • 1980s beginnings of wavelets in physics, vision,
    speech processing (ad hoc)
  • little theory why/when do wavelets work?
  • 1986 Mallat unified the above work
  • 1985 Morlet & Grossman: continuous wavelet
    transform, asking: how can you get perfect
    reconstruction without redundancy?

23
Wavelet History, Part 2
  • 1985 Meyer: tried to prove that no orthogonal
    wavelet other than Haar exists, but found one by
    trial and error!
  • 1987 Mallat: developed multiresolution theory,
    DWT, wavelet construction techniques (but still
    noncompact)
  • 1988 Daubechies: added theory; found compact,
    orthogonal wavelets with an arbitrary number of
    vanishing moments!
  • 1990s wavelets took off, attracting both
    theoreticians and engineers

24
Time-Frequency Analysis
  • For many applications, you want to analyze a
    function in both time and frequency
  • Analogous to a musical score
  • Fourier transforms give you frequency
    information, smearing time.
  • Samples of a function give you temporal
    information, smearing frequency.
  • Note substitute space for time for pictures.

25
Comparison to Fourier Analysis
  • Fourier analysis
  • Basis is global
  • Sinusoids with frequencies in arithmetic
    progression
  • Short-time Fourier Transform (= Gabor filters)
  • Basis is local
  • Sinusoid times Gaussian
  • Fixed-width Gaussian window
  • Wavelet
  • Basis is local
  • Frequencies in geometric progression
  • Basis has constant shape independent of scale

26
Wavelets are faster than FFTs!
27
The results of the CWT are many wavelet
coefficients, which are a function of scale and
position
28
Gabor's Proposal: Short Time Fourier Transform
Requirements:
A signal in the time domain requires a short time window
to depict its features.
A signal in the frequency domain requires a short
frequency window (a long time window) to depict its
features.
29
What are wavelets?
Wavelets are functions defined over a finite
interval and having an average value of zero.
30
What is wavelet transform?
The wavelet transform is a tool for carving up
functions, operators, or data into components of
different frequency, allowing one to study each
component separately.
The basic idea of the wavelet transform is to
represent any arbitrary function f(t) as a
superposition of a set of such wavelets or basis
functions.
These basis functions or baby wavelets are
obtained from a single prototype wavelet called
the mother wavelet, by dilations or contractions
(scaling) and translations (shifts).
31
The continuous wavelet transform (CWT)
Fourier Transform
The FT is the sum over all time of the signal f(t)
multiplied by a complex exponential.
32
The variables s and τ are the new dimensions,
scale and translation (position), after the
wavelet transform.
33
s is the scale factor, τ is the translation
factor, and the factor s^(-1/2) is for energy
normalization across the different scales.
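The slide's own equation did not survive transcription; reconstructed in its standard form (with τ denoting translation), the continuous wavelet transform is:

$$\gamma(s,\tau) = \frac{1}{\sqrt{s}} \int_{-\infty}^{\infty} f(t)\,\psi^{*}\!\left(\frac{t-\tau}{s}\right)\,\mathrm{d}t$$

The factor 1/√s (i.e. s^(-1/2)) is the energy normalization described above.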
It is important to note that in the above
transforms the wavelet basis functions are not
specified.
This is a difference between the wavelet
transform and the Fourier transform, or other
transforms.
34
Scale
Scaling a wavelet simply means stretching (or
compressing) it.
35
Scale and Frequency
Low scale a: compressed wavelet; rapidly changing details.
High scale a: stretched wavelet; slowly changing details.
Translation (shift)
Translating a wavelet simply means delaying (or
hastening) its onset.
36
(No Transcript)
37
Discrete Wavelets
A discrete wavelet is written as a scaled and
translated copy of the mother wavelet, where
j and k are integers and s0 > 1 is a fixed
dilation step. The translation factor τ0 depends
on the dilation step. The effect of discretizing
the wavelet is that the time-scale space is now
sampled at discrete intervals. We usually choose
s0 = 2.
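The equation for this slide did not survive transcription; the standard form consistent with the description is:

$$\psi_{j,k}(t) = s_0^{-j/2}\,\psi\!\left(s_0^{-j}\,t - k\,\tau_0\right)$$

with integers j and k, dilation step s0 > 1, and translation factor τ0.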
38
A band-pass filter
The wavelet has a band-pass like spectrum
From Fourier theory we know that compression in
time is equivalent to stretching the spectrum and
shifting it upwards
Suppose a = 2:
This means that a time compression of the wavelet
by a factor of 2 will stretch the frequency
spectrum of the wavelet by a factor of 2 and also
shift all frequency components up by a factor of
2.
39
Subband coding
If we regard the wavelet transform as a filter
bank, then we can consider wavelet transforming a
signal as passing the signal through this filter
bank.
The outputs of the different filter stages are
the wavelet- and scaling function transform
coefficients.
In general we will refer to this kind of
analysis as multiresolution analysis.
This is called subband coding.
40
Splitting the signal spectrum with an iterated
filter bank.
Summarizing, if we implement the wavelet
transform as an iterated filter bank, we do not
have to specify the wavelets explicitly! This is
a remarkable result.
41
The Discrete Wavelet Transform
Calculating wavelet coefficients at every
possible scale is a fair amount of work, and it
generates an awful lot of data. What if we choose
only a subset of scales and positions at which to
make our calculations?
It turns out, rather remarkably, that if we
choose scales and positions based on powers of
two -- so-called dyadic scales and positions --
then our analysis will be much more efficient and
just as accurate. We obtain just such an analysis
from the discrete wavelet transform (DWT).
42
Approximations and Details
The approximations are the high-scale,
low-frequency components of the signal. The
details are the low-scale, high-frequency
components. The filtering process, at its most
basic level, looks like this:
The original signal, S, passes through two
complementary filters and emerges as two signals.
43
Downsampling
Unfortunately, if we actually perform this
operation on a real digital signal, we wind up
with twice as much data as we started with.
Suppose, for instance, that the original signal S
consists of 1000 samples of data. Then
the approximation and the detail will each have
1000 samples, for a total of 2000.
To correct this problem, we introduce the
notion of downsampling. This simply means
throwing away every second data point.
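The filter-then-downsample step can be sketched as follows (the Haar filters and the ramp signal are illustrative assumptions; the slides do not specify which filters are used):

```python
import numpy as np

# Low/high-pass a signal with Haar analysis filters,
# then downsample by keeping every second sample.
s = np.arange(8, dtype=float)            # stand-in for the original signal S
lo = np.array([0.5, 0.5])                # averaging (low-pass) filter
hi = np.array([0.5, -0.5])               # differencing (high-pass) filter

approx_full = np.convolve(s, lo, mode='full')  # before downsampling, the two
detail_full = np.convolve(s, hi, mode='full')  # outputs hold twice the data

approx = approx_full[1::2]               # downsample: keep every second point
detail = detail_full[1::2]
print(len(approx) + len(detail))         # back to the original sample count
```

Without the `[1::2]` step the approximation and detail together would be about twice the length of `s`, which is exactly the problem the slide describes.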
44
An example
45
Reconstructing Approximation and Details
Upsampling
46
Wavelet Decomposition
Multiple-Level Decomposition
The decomposition process can be iterated, with
successive approximations being decomposed in
turn, so that one signal is broken down into many
lower-resolution components. This is called the
wavelet decomposition tree.
47
The signal f(t) can be expressed as a DWT.
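The equation itself did not survive transcription; reconstructed in its standard form, f(t) is a superposition of the discrete wavelets:

$$f(t) = \sum_{j}\sum_{k}\,\gamma(j,k)\,\psi_{j,k}(t)$$

where γ(j,k) are the DWT coefficients at scale index j and position index k.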
48
(No Transcript)
49
(No Transcript)
50
Wavelet Reconstruction (Synthesis)
Perfect reconstruction
51
52
2-D Discrete Wavelet Transform
A 2-D DWT can be done as follows:
Step 1: Replace each row with its 1-D DWT.
Step 2: Replace each column with its 1-D DWT.
Step 3: Repeat steps (1) and (2) on the lowest
subband for the next scale.
Step 4: Repeat step (3) until as many scales as
desired have been completed.
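Steps 1 and 2 can be sketched directly with a one-level Haar transform on rows and then columns (the unnormalized average/difference variant and the 4x4 test image are illustrative assumptions):

```python
import numpy as np

def haar_1d(v):
    # One level of the Haar transform: averages first, then differences.
    v = np.asarray(v, dtype=float)
    return np.concatenate([(v[0::2] + v[1::2]) / 2, (v[0::2] - v[1::2]) / 2])

def dwt2_one_scale(img):
    # Step 1: transform every row; Step 2: transform every column.
    img = np.apply_along_axis(haar_1d, 1, img)   # rows
    img = np.apply_along_axis(haar_1d, 0, img)   # columns
    return img                                   # top-left quadrant = lowest sub-band

img = np.arange(16, dtype=float).reshape(4, 4)
out = dwt2_one_scale(img)
print(out[:2, :2])   # the approximation sub-band
```

Step 3 is then just a recursive call of `dwt2_one_scale` on the top-left quadrant `out[:2, :2]`.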
53
Image at different scales
54
Correlation between features at different scales
55
Wavelet construction a simplified approach
  • Traditional approaches to wavelets have used a
    filterbank interpretation
  • Fourier techniques required to get synthesis
    (reconstruction) filters from analysis filters
  • Not easy to generalize

56

Wavelet construction lifting
  • 3 steps
  • Split
  • Predict (P step)
  • Update (U step)

57
Example: the Haar wavelet
  • S step
  • Splits the signal into odd and even samples

even samples
odd samples
58
Example: the Haar wavelet
  • P step
  • Predict the odd samples from the even samples

For the Haar wavelet, the prediction for the odd
sample is the previous even sample
59
Example: the Haar wavelet
Detail signal
60
Example: the Haar wavelet
  • U step
  • Update the even samples to produce the next
    coarser scale approximation

The signal average is maintained
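The three lifting steps described above (split, predict, update) can be sketched for the Haar wavelet as follows (function name and test signal are illustrative assumptions):

```python
import numpy as np

def haar_lifting(x):
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]   # S step: split into even and odd samples
    d = odd - even                 # P step: detail = prediction error at odd samples
    s = even + d / 2               # U step: s = (even + odd) / 2, so the average is kept
    return s, d

s, d = haar_lifting(np.array([2.0, 4.0, 6.0, 8.0]))
print(s, d)   # s = pairwise means, d = pairwise differences
```

Note that `mean(s) == mean(x)`, which is exactly the "signal average is maintained" property the slide states.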
61
Summary of the Haar wavelet decomposition
Can be computed in place.

[Diagram: in-place computation; P step applies weight -1, U step applies weight 1/2]
62
Inverse Haar wavelet transform
  • Simply run the forward Haar wavelet transform
    backwards!

Then merge the even and odd samples.
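Running the lifting steps backwards, as the slide says, looks like this (a sketch; function name is an illustrative assumption):

```python
import numpy as np

def haar_lifting_inverse(s, d):
    even = s - d / 2                    # undo the U step
    odd = d + even                      # undo the P step
    x = np.empty(2 * len(s))
    x[0::2], x[1::2] = even, odd        # merge even and odd samples
    return x

print(haar_lifting_inverse(np.array([3.0, 7.0]), np.array([2.0, 2.0])))
```

Feeding in the averages and differences from the forward transform recovers the original signal exactly; this is why the inverse is "trivial" in the lifting scheme.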
63
General lifting stage of wavelet decomposition

[Diagram: Split, then P step subtracted to form the detail, then U step added to form the approximation]
64
Multi-level wavelet decomposition
  • We can produce a multi-level decomposition by
    cascading lifting stages


[Diagram: cascade of lifting stages]
65
General lifting stage of inverse wavelet synthesis

[Diagram: U step subtracted, P step added, then Merge]
66
Multi-level inverse wavelet synthesis
  • We can produce a multi-level inverse wavelet
    synthesis by cascading lifting stages

[Diagram: cascade of inverse lifting stages]
67
Advantages of the lifting implementation
  • Inverse transform
  • Inverse transform is trivial: just run the code
    backwards
  • No need for Fourier techniques
  • Generality
  • The design of the transform is performed without
    reference to particular forms for the predict and
    update operators
  • Can even include non-linearities (for integer
    wavelets)

68
Example 2: the linear spline wavelet
  • A more sophisticated wavelet uses slightly more
    complex P and U operators
  • It uses linear prediction to determine the odd samples
    from the even samples

69
The linear spline wavelet
  • P step: linear prediction

[Figure: original signal, linear prediction at odd samples, and the detail signal (prediction error at odd samples)]
70
The linear spline wavelet
  • The prediction for the odd samples is based on
    the two even samples either side

71
The linear spline wavelet
  • The U step uses the current and previous detail
    signal samples

72
The linear spline wavelet
  • Preserves signal average and first-order moment
    (signal position)

73
The linear spline wavelet
  • Can still be implemented in place

[Diagram: in-place computation; P step applies weights -1/2, -1/2; U step applies weights 1/4, 1/4]
74
Summary of linear spline wavelet decomposition
Computing the inverse is trivial; the even and odd
samples are then merged as before.
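The linear spline lifting steps can be sketched as follows. Periodic boundary handling (via `np.roll`) is an assumption on my part, since the slides do not specify how the edges are treated:

```python
import numpy as np

def linear_spline_lifting(x):
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]                 # split
    # P step: predict each odd sample as the mean of its two even neighbours
    d = odd - (even + np.roll(even, -1)) / 2
    # U step: update each even sample with 1/4 of the two adjacent details,
    # which preserves the signal average (and first-order moment)
    s = even + (np.roll(d, 1) + d) / 4
    return s, d

s, d = linear_spline_lifting(np.array([0.0, 1.0, 2.0, 3.0]))
print(s, d)
```

The -1/2, -1/2 prediction weights and 1/4, 1/4 update weights match the in-place diagram on the previous slide; the nonzero final detail here comes from the periodic wrap at the boundary.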
75
Wavelet decomposition applied to a 2D image
76
Wavelet decomposition applied to a 2D image
77
Why is wavelet-based compression effective?
  • Allows for intra-scale prediction (like many
    other compression methods); equivalently, the
    wavelet transform is a decorrelating transform,
    just like the DCT as used by JPEG
  • Allows for inter-scale (coarse-to-fine scale)
    prediction

78
Why is wavelet-based compression effective?
[Figure: original image and its 1-level Haar, 1-level linear spline, and 2-level Haar decompositions]
79
Why is wavelet-based compression effective?
  • Wavelet coefficient histogram

80
Why is wavelet-based compression effective?
  • Coefficient entropies

81
Why is wavelet-based compression effective?
  • Wavelet coefficient dependencies

82
Why is wavelet-based compression effective?
  • Let's define sets S (small) and L (large) of
    wavelet coefficients
  • The following two probabilities describe
    inter-scale dependencies

83
Why is wavelet-based compression effective?
  • Without inter-scale dependencies

84
Why is wavelet-based compression effective?
  • Measured dependencies from Lena


0.886 0.529 0.781 0.219
85
Why is wavelet-based compression effective?
  • Intra-scale dependencies


[Figure: coefficient X and its spatial neighbours X1 ... X8]
86
Why is wavelet-based compression effective?
  • Measured dependencies from Lena


0.912 0.623 0.781 0.219
87
Why is wavelet-based compression effective?
  • Have to use a causal neighbourhood for spatial
    prediction

88
Example image compression algorithms
  • We will look at 3 state-of-the-art algorithms
  • Set partitioning in hierarchical trees (SPIHT)
  • Significance-linked connected components analysis
    (SLCCA)
  • Embedded block coding with optimal truncation
    (EBCOT), which is the basis of JPEG2000

89
The SPIHT algorithm
  • Coefficients transmitted in partial order

[Diagram: coefficient numbers 1, 2, 3, ... transmitted in partial order, with bit-planes from the msb (5) down to the lsb (0)]
90
The SPIHT algorithm
  • 2 components to the algorithm 
  • Sorting pass
  • Sorting information is transmitted on the basis
    of the most significant bit-plane
  • Refinement pass
  • Bits in bit-planes lower than the most
    significant bit plane are transmitted

91
The SPIHT algorithm
N = msb of max(abs(wavelet coefficient))
for bit-plane-counter = N downto 1:
    transmit significance/insignificance w.r.t. the bit-plane counter
    transmit refinement bits of all coefficients that are already significant
92
The SPIHT algorithm
  • Insignificant coefficients (with respect to
    current bitplane counter) organised into
    zerotrees

93
The SPIHT algorithm
  • Groups of coefficients made into zerotrees by set
    partitioning

94
The SPIHT algorithm
  • SPIHT produces an embedded bitstream

[Figure: embedded bitstream .110010101110010110001101011100010111011011101101...]
95
The SLCCA algorithm
[Block diagram: wavelet transform; quantise coefficients; cluster and transmit significance map; bit-plane encode significant coefficients]
96
The SLCCA algorithm
  • The significance map is grouped into clusters

97
The SLCCA algorithm
  • Clusters grown out from a seed

[Figure: cluster growth from a seed, with significant and insignificant coefficients marked]
98
The SLCCA algorithm
  • Significance link symbol

[Figure: a significance link between clusters]
99
Image compression results
  • Evaluation 
  • Mean squared error
  • Human visual-based metrics
  • Subjective evaluation

100
Image compression results
  • Mean-squared error

Usually expressed as peak signal-to-noise ratio (in dB).
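The MSE-to-PSNR conversion mentioned above can be sketched as follows, using the usual definition for 8-bit images (function name and test arrays are illustrative assumptions):

```python
import numpy as np

def psnr(original, compressed, peak=255.0):
    # PSNR = 10 * log10(peak^2 / MSE), conventionally with peak = 255 for 8-bit images
    mse = np.mean((original.astype(float) - compressed.astype(float)) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

a = np.full((4, 4), 100.0)
b = a + 5.0                     # every pixel off by 5, so MSE = 25
print(psnr(a, b))               # 10 * log10(65025 / 25) ≈ 34.15 dB
```

Higher PSNR means lower distortion; typical wavelet coders are compared at fixed bit rates using exactly this number.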
101
Image compression results
102
Image compression results
103
Image compression results
SPIHT 0.2 bits/pixel
JPEG 0.2 bits/pixel
104
Image compression results
SPIHT
JPEG
105
EBCOT, JPEG2000
  • JPEG2000, based on embedded block coding with
    optimal truncation, is the state-of-the-art
    compression standard
  • Wavelet-based
  • It addresses the key issue of scalability
  • SPIHT is distortion scalable, as we have already
    seen
  • JPEG2000 introduces resolution and spatial
    scalability as well
  • An excellent reference on JPEG2000 and
    compression in general is JPEG2000 by D. Taubman
    and M. Marcellin

106
EBCOT, JPEG2000
  • Resolution scalability is the ability to extract
    from the bitstream the sub-bands representing any
    resolution level

[Figure: the sub-bands for one resolution level highlighted within the embedded bitstream]
107
EBCOT, JPEG2000
  • Spatial scalability is the ability to extract
    from the bitstream the sub-bands representing
    specific regions in the image
  • Very useful if we want to selectively decompress
    certain regions of massive images

[Figure: the sub-band segments for a spatial region highlighted within the embedded bitstream]
108
Introduction to EBCOT
  • JPEG2000 is able to implement this general
    scalability by implementing the EBCOT paradigm
  • In EBCOT, the unit of compression is the
    codeblock, which is a partition of a wavelet
    sub-band
  • Typically, following the wavelet transform, each
    sub-band is partitioned into small blocks
    (typically 32x32)

109
Introduction to EBCOT
  • Codeblocks: partitions of wavelet sub-bands

[Figure: codeblock grid overlaid on the wavelet sub-bands]
110
Introduction to EBCOT
  • A simple bit stream organisation could comprise
    concatenated code-block bit streams

[Figure: concatenated code-block streams, each preceded by the length of the next code-block stream]
111
Introduction to EBCOT
  • This simple bit stream structure is resolution
    and spatially scalable but not distortion
    scalable
  • Complete scalability is obtained by introducing
    quality layers
  • Each code block bitstream is individually
    (optimally) truncated in each quality layer
  • The loss of parent-child redundancy is more than
    compensated for by the ability to individually
    optimise separate code block bitstreams

112
Introduction to EBCOT
  • Each code block bit stream partitioned into a set
    of quality layers




113
EBCOT advantages
  • Multiple scalability
  • Distortion, spatial and resolution scalability
  • Efficient compression
  • This results from independent optimal truncation
    of each code block bit stream
  • Local processing
  • Independent processing of each code block allows
    for efficient parallel implementations as well as
    hardware implementations

114
EBCOT advantages
  • Error resilience
  • Again this results from independent code block
    processing which limits the influence of errors

115
Performance comparison
  • A performance comparison with other wavelet-based
    coders is not straightforward as it would depend
    on the target bit rates which the bit streams
    were truncated for
  • With SPIHT, we simply truncate the bit stream
    when the target bit rate has been reached
  • However, we only have distortion scalability with
    SPIHT
  • Even so, we still get favourable PSNR (dB)
    results when comparing EBCOT (JPEG2000) with SPIHT

116
Performance comparison
  • We can understand this more fully by looking at
    graphs of distortion (D) against rate (R)
    (bitstream length)

[Figure: R-D curve for continuously modulated quantisation step size, with truncation points marked on the distortion (D) vs rate (R) axes]
117
Performance comparison
  • Truncating the bit stream to some arbitrary rate
    will yield sub-optimal performance

[Figure: distortion-rate curve; arbitrary truncation between the optimal points yields sub-optimal performance]
118
Performance comparison
119
Performance comparison
  • Comparable PSNR (dB) results between EBCOT and
    SPIHT even though
  • Results for EBCOT are for 5 quality layers (5
    optimal bit rates)
  • Intermediate bit rates sub-optimal
  • We have resolution, spatial, distortion
    scalability in EBCOT but only distortion
    scalability in SPIHT