Title: CGMB424: IMAGE PROCESSING AND COMPUTER VISION
1. CGMB424 IMAGE PROCESSING AND COMPUTER VISION
- Image analysis (Part I)
- Preprocessing
2. Objectives
- To understand the system model for image analysis
- To learn how images are processed (edge/line detection, segmentation, discrete transforms, feature extraction)
3. Introduction - System Model
- Image analysis process
  - Preprocessing
    - Used to remove noise and eliminate irrelevant information
    - Noise: unwanted information that can result from the image acquisition process
    - Gray-level/spatial quantization
    - Finding regions of interest
  - Data reduction
    - Reducing data in the spatial domain
    - Transforming it into another domain, the frequency domain
  - Feature analysis
    - Features extracted by the data reduction process are examined and evaluated for use in the application
4. Introduction - System Model
[Figure: block diagram of the system model. Labels: Input Image, Acquisition, Preprocessing, Filtering, Segmentation, Transform, Spatial Information, Spectral Information, Feature extraction, Feature analysis.]
5. Preprocessing - Region of Interest (ROI)
- An ROI is a more specific area within the image
- To get the ROI, we need operations that modify the spatial coordinates of the image
- These operations include crop, zoom, enlarge, shrink, translate, and rotate
- Crop: the process of selecting a small portion of the image and cutting it away from the rest of the image
- Zoom: can be done in numerous ways, but typically by zero-order hold or first-order hold
6. Preprocessing - Zooming Process
- There are numerous ways to zoom, but the most commonly used are:
  - Zero-order hold
    - Performed by repeating the previous pixel values, which creates a blocky effect
  - First-order hold
    - Performed by linear interpolation between adjacent pixels
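Zero-order hold can be sketched in a few lines. This is a minimal illustration, assuming NumPy is available and that every pixel is repeated by an integer factor; the function name `zero_order_hold` and the sample values are hypothetical, not taken from the slides.

```python
import numpy as np

def zero_order_hold(img, factor=2):
    """Enlarge an image by repeating each pixel `factor` times
    along both the rows and the columns."""
    img = np.asarray(img)
    repeated_rows = np.repeat(img, factor, axis=0)
    return np.repeat(repeated_rows, factor, axis=1)

# Hypothetical 2 x 2 image used only for illustration.
small = np.array([[8, 4],
                  [2, 6]])
zoomed = zero_order_hold(small)
print(zoomed)
```

Each original pixel becomes a 2 x 2 block of identical values, which is the source of the blocky effect mentioned above.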
7. Preprocessing - Zooming Process
[Figure: original image, with the area to be zoomed outlined at the center; image enlarged by zero-order hold; image enlarged by first-order hold.]
8. Preprocessing - Zooming Process - First-Order Hold
- There are two ways to do first-order hold
  - Averaging
    - The first two pixels in the first row are averaged
    - The average is then inserted between those pixels
    - This method allows us to enlarge an N × N image to a size of (2N - 1) × (2N - 1)
  - Convolution
    - Extend the image by adding rows and columns of zeros between the existing rows and columns
    - Perform the convolution
9. Preprocessing - First-Order Hold - Averaging
- Average the first two pixels in the first row: (8 + 4)/2 = 6
- Insert the value (6) between the two pixels (8 and 4)
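The averaging steps can be sketched as follows. This is one possible reading, assuming NumPy and assuming the averaging is applied first along the rows and then along the columns of the widened image; the function name and the 2 x 2 sample image (whose first row is 8, 4) are hypothetical.

```python
import numpy as np

def first_order_hold_average(img):
    """Enlarge an N x M image to (2N - 1) x (2M - 1) by inserting
    the average of each pair of adjacent pixels between them."""
    img = np.asarray(img, dtype=float)
    n_rows, n_cols = img.shape

    # Step 1: insert averages between horizontal neighbours.
    wide = np.empty((n_rows, 2 * n_cols - 1))
    wide[:, 0::2] = img
    wide[:, 1::2] = (img[:, :-1] + img[:, 1:]) / 2

    # Step 2: insert averages between vertical neighbours.
    out = np.empty((2 * n_rows - 1, 2 * n_cols - 1))
    out[0::2, :] = wide
    out[1::2, :] = (wide[:-1, :] + wide[1:, :]) / 2
    return out

enlarged = first_order_hold_average([[8, 4],
                                     [4, 8]])
print(enlarged)  # first row is 8, 6, 4
```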
10. Preprocessing - First-Order Hold - Convolution
[Figure: the original image and the convolution mask.]
- The convolution mask slides across the extended image
11. Preprocessing - First-Order Hold - Convolution
- Extend the image by adding rows and columns of zeros between the existing rows and columns
- Use the convolution mask, which slides across the extended image
[Figure: convolution mask for first-order hold.]
12. Preprocessing - First-Order Hold - Convolution
- The convolution process requires us to overlay the mask on the image, multiply the coincident values, and sum all the results, which is equivalent to finding the vector inner product of the mask with the underlying subimage
- Referring to the previous example, if we put the mask over the upper-left corner of the image, we get:
  ¼(0) + ½(0) + ¼(0) + ½(0) + 1(3) + ½(0) + ¼(0) + ½(0) + ¼(0) = 3
- Note that the existing image values do not change. The next step is to slide the mask over by one pixel and repeat the process:
  ¼(0) + ½(0) + ¼(0) + ½(3) + 1(0) + ½(5) + ¼(0) + ½(0) + ¼(0) = 4
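The two sums above can be reproduced with a short sketch. It assumes NumPy, takes the mask coefficients (¼, ½, ¼ / ½, 1, ½ / ¼, ½, ¼) from the worked sums, and assumes the extended image also has a border of zeros, which is what the first sum implies (the pixel 3 sits under the mask center with zeros all around). The function name and the second row of the sample image are hypothetical.

```python
import numpy as np

# Mask coefficients as they appear in the worked sums above.
MASK = np.array([[0.25, 0.5, 0.25],
                 [0.50, 1.0, 0.50],
                 [0.25, 0.5, 0.25]])

def first_order_hold_conv(img):
    """Zero-interleave the image (with a zero border, an assumption),
    then slide the mask over it and write results to a new array."""
    img = np.asarray(img, dtype=float)
    n_rows, n_cols = img.shape
    ext = np.zeros((2 * n_rows + 1, 2 * n_cols + 1))
    ext[1::2, 1::2] = img                    # original pixels, zeros elsewhere

    out = np.empty((2 * n_rows - 1, 2 * n_cols - 1))   # separate output buffer
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            # The mask is symmetric, so no flip is needed here.
            out[r, c] = np.sum(ext[r:r + 3, c:c + 3] * MASK)
    return out

result = first_order_hold_conv([[3, 5],
                                [2, 7]])
print(result[0, 0], result[0, 1])  # 3.0 4.0, matching the two sums above
```

Because the mask weight at the center is 1 and all neighbours of an original pixel are zeros, the original values pass through unchanged, while the zero positions receive the average of their nonzero neighbours.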
13. Preprocessing - First-Order Hold - Convolution
- This process continues until the end of the row. The mask then moves down one row, and the process repeats until the convolution has covered all the pixels
- Question: Perform zooming using the convolution process, given the image and convolution mask below
[Figure: image and convolution mask for the exercise.]
14. Preprocessing - Convolution
- For each convolution process, the output image must be put in a separate image array called a buffer
- This prevents the existing values from being overwritten
- If we call the convolution mask M(r, c) and the image I(r, c), the convolution equation is:
  $$\sum_{x=-\infty}^{\infty} \sum_{y=-\infty}^{\infty} I(r - x,\, c - y)\, M(x, y)$$
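A direct, unoptimized sketch of this double sum is shown below. It assumes NumPy, a finite odd-sized mask whose center element corresponds to x = 0, y = 0, and that pixels outside the image count as zero; these are illustrative choices, and the function name `convolve` is hypothetical.

```python
import numpy as np

def convolve(image, mask):
    """Evaluate sum_x sum_y I(r - x, c - y) * M(x, y) for every pixel,
    writing the results into a separate buffer."""
    image = np.asarray(image, dtype=float)
    mask = np.asarray(mask, dtype=float)
    half_r, half_c = mask.shape[0] // 2, mask.shape[1] // 2
    rows, cols = image.shape

    buffer = np.zeros_like(image)            # output never overwrites the input
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for x in range(-half_r, half_r + 1):
                for y in range(-half_c, half_c + 1):
                    rr, cc = r - x, c - y
                    if 0 <= rr < rows and 0 <= cc < cols:   # outside = 0
                        total += image[rr, cc] * mask[x + half_r, y + half_c]
            buffer[r, c] = total
    return buffer

# Convolving a single bright pixel reproduces the mask itself.
impulse = np.zeros((3, 3))
impulse[1, 1] = 1
print(convolve(impulse, np.arange(9).reshape(3, 3)))
```

Writing into `buffer` rather than into `image` matters because each output value depends on neighbouring input values that must still be the originals when they are read.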
15. Preprocessing - Convolution - Why?
- Many computer imaging boards can perform convolution in hardware, which is generally fast, typically much faster than applying a faster algorithm in software
- You can also perform convolution for zero-order hold by applying this mask
[Figure: convolution mask for zero-order hold.]
- These methods enlarge an N × N image to a size of (2N - 1) × (2N - 1)
16. Preprocessing - Convolution - Why?
- To enlarge an image by a factor other than (2N - 1), follow these steps:
  - Define the enlargement number, K
  - Subtract the two adjacent pixel values to find their difference
  - Divide the result by K
  - Add that result to the smaller value
  - Keep adding the result from the 3rd step in a running total until all (K - 1) intermediate pixel locations are filled
17. Preprocessing - Convolution - Why?
- Example: Let's enlarge an image to 3 times its original size, where two adjacent pixel values are 125 and 140
  - Define K: K = 3
  - Find the difference between the two values: 140 - 125 = 15
  - Divide the difference by K: 15/3 = 5
  - Add 5 to the smaller value, and keep adding: 125 + 5 = 130, 130 + 5 = 135
  - The two pixel values between 125 and 140 are 130 and 135
- Question: Enlarge an image to 4 times its original size. The adjacent pixel values given are 150 and 200.
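The K-step procedure translates almost line by line into code. The sketch below uses plain Python; the function name `intermediate_pixels` is hypothetical, and returning the values in left-to-right order when the first pixel is the larger one is an assumption.

```python
def intermediate_pixels(a, b, k):
    """Return the (k - 1) values to insert between adjacent pixels a and b."""
    low, high = min(a, b), max(a, b)
    step = (high - low) / k                   # difference divided by K
    values = [low + step * i for i in range(1, k)]   # running total
    # Keep the values in left-to-right order if a is the larger pixel.
    return values if a <= b else values[::-1]

print(intermediate_pixels(125, 140, 3))  # [130.0, 135.0]
```

The same function can be used to check the answer to the question above once it has been worked by hand.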
18. Preprocessing - Translation and Rotation
- May be performed for many application-specific reasons, e.g., to align an image with a known template
- Translation can be done by:
  - $r' = r + r_0$
  - $c' = c + c_0$
  - where $r'$ and $c'$ are the new coordinates, $r$ and $c$ are the original coordinates, and $r_0$ and $c_0$ are the distances to translate
- The rotation process uses this formula:
  - $r' = r\cos\theta + c\sin\theta$
  - $c' = -r\sin\theta + c\cos\theta$
  - where $r'$ and $c'$ are the new coordinates, $r$ and $c$ are the original coordinates, and $\theta$ is the angle to rotate
19. Preprocessing - Translation and Rotation
- Rotation and translation can also be combined:
  - $r' = (r + r_0)\cos\theta + (c + c_0)\sin\theta$
  - $c' = -(r + r_0)\sin\theta + (c + c_0)\cos\theta$
  - where $r'$ and $c'$ are the new coordinates, and $r$, $c$, $r_0$, $c_0$, and $\theta$ are as previously defined
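The coordinate formulas can be checked numerically with a small sketch. It uses only the standard `math` module; the function names are hypothetical, and the angle is assumed to be in radians.

```python
import math

def translate(r, c, r0, c0):
    """New coordinates after shifting by (r0, c0)."""
    return r + r0, c + c0

def rotate(r, c, theta):
    """New coordinates after rotating by theta (radians)."""
    r_new = r * math.cos(theta) + c * math.sin(theta)
    c_new = -r * math.sin(theta) + c * math.cos(theta)
    return r_new, c_new

def translate_then_rotate(r, c, r0, c0, theta):
    """Combined form: translate first, then rotate the shifted point."""
    shifted_r, shifted_c = translate(r, c, r0, c0)
    return rotate(shifted_r, shifted_c, theta)

print(rotate(1, 0, math.pi / 2))
print(translate_then_rotate(0, 0, 1, 0, math.pi / 2))
```

These functions only map coordinates. Applying them to a whole image additionally requires deciding what to do with positions that fall between pixels or outside the image, which the slides address separately.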
20. Preprocessing - Translation and Rotation (Practical Difficulties)
- Translation: when you translate an image one row down, what happens to the top row (the leftover space)?
  - Fill the top row with a constant value (0 or 255), or
  - Wrap around by shifting the bottom row to the top
- Rotation: when you rotate an image, part of it might end up off the screen (image plane), so how can you solve it?
  - Translating it back to the center might leave leftover space at the corners
  - So fill this space with a constant, or extract the central, rectangular portion of the image and enlarge it to the original image size
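The two options for the leftover row after a downward translation can be sketched as follows, assuming NumPy and a shift smaller than the image height. The function name `translate_down` and its parameters are hypothetical.

```python
import numpy as np

def translate_down(img, rows=1, wrap=False, fill=0):
    """Shift an image down by `rows`. Either wrap the bottom rows
    around to the top, or fill the vacated top rows with a constant."""
    img = np.asarray(img)
    shifted = np.roll(img, rows, axis=0)     # np.roll returns a new array
    if not wrap:
        shifted[:rows, :] = fill             # overwrite the wrapped-in rows
    return shifted

# A four-row image, translated down by one row (r0 = 1).
four_rows = np.array([[1, 1], [2, 2], [3, 3], [4, 4]])
print(translate_down(four_rows, wrap=True))   # row 4 moves to the top
print(translate_down(four_rows, wrap=False))  # top row filled with 0
```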
21. Preprocessing - Translation and Rotation (Practical Difficulties)
[Figure: translation. (a) Before: a four-row image to be translated down by one row, r0 = 1. (b) After: with wrap-around, row 4 moves into the vacated top row; otherwise, the top row is filled with a constant, typically zero.]
[Figure: rotation. (a) The image is rotated off the screen. (b) Fix by translating towards the center. (c) Translation complete. (d) Crop and enlarge if desired.]
22. Preprocessing - Image Algebra
- There are two primary categories of algebraic operations
  - Arithmetic
    - Addition, subtraction, division, and multiplication
  - Logic
    - AND, OR, NOT
- All operations involve two images, except for the NOT operation
23. Preprocessing - Arithmetic - Addition and Subtraction
- Image addition
  - Used to combine the information in two images
  - Applications include image restoration algorithms for modeling additive noise, and special effects (morphing)
- Subtraction
  - Often used to detect motion
  - Where nothing has changed, the resulting image is black (filled with zeros)
  - Where there are changes, the subtraction produces a nonzero result
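A minimal sketch of both operations on 8-bit images follows, assuming NumPy. Clipping sums at 255 and taking the absolute difference are illustrative choices, not something the slides specify, and the function names and sample frames are hypothetical.

```python
import numpy as np

def add_images(a, b):
    """Pixel-wise sum, clipped to the 8-bit range (assumed policy)."""
    total = a.astype(np.int32) + b.astype(np.int32)
    return np.clip(total, 0, 255).astype(np.uint8)

def subtract_images(a, b):
    """Pixel-wise absolute difference (assumed policy); unchanged
    pixels come out as 0, changed pixels as nonzero."""
    diff = np.abs(a.astype(np.int32) - b.astype(np.int32))
    return diff.astype(np.uint8)

# Two hypothetical frames that differ in one pixel only.
frame1 = np.full((2, 2), 100, dtype=np.uint8)
frame2 = frame1.copy()
frame2[0, 0] = 150
motion = subtract_images(frame2, frame1)
print(motion)  # nonzero only where the frames differ
```

The intermediate `int32` conversion avoids the silent wrap-around that 8-bit arithmetic would otherwise produce.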
24. Preprocessing - Arithmetic - Addition
[Figure: image morphing example; additive noise example.]
25. Preprocessing - Arithmetic - Subtraction
[Figure: from left to right: input image, reference image, output image. Figure taken from "Bare-Hand Human-Computer Interaction", ACM International Conference Proceeding Series, Proceedings of the 2001 Workshop on Perceptive User Interfaces, 2001.]
26. Preprocessing - Arithmetic - Multiplication and Division
- Multiplication and division are used to adjust the brightness of the image
  - Multiplying by a constant greater than 1 produces a brighter image
  - Dividing by a constant greater than 1 produces a darker image
- Question: Why is that?
[Figure: original image; image multiplied; image divided.]
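A brightness adjustment by a constant can be sketched as below, assuming NumPy and 8-bit pixel values. Rounding and clipping to the 0 to 255 range are assumptions, and `scale_brightness` is a hypothetical name; division is expressed as multiplication by a factor below 1.

```python
import numpy as np

def scale_brightness(img, factor):
    """Multiply every pixel by `factor`, then round and clip to 0..255."""
    scaled = np.asarray(img, dtype=float) * factor
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)

pixels = np.array([100, 200], dtype=np.uint8)
print(scale_brightness(pixels, 1.5))  # larger values: brighter
print(scale_brightness(pixels, 0.5))  # smaller values: darker
```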
27. Preprocessing - Logic - AND, OR, NOT
- AND, OR, and NOT form a complete set: XOR, NOR, and NAND can be created by combining these 3 basic operations
- AND and OR are used to combine the information in 2 images
  - Usually used for special effects, or for masking operations (to choose an ROI)
- The NOT operation creates a negative of the original image by inverting each pixel value
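The three logic operations map directly onto NumPy's bitwise functions. The sketch below assumes 8-bit images; the pixel values and the mask (255 inside the region of interest, 0 outside) are hypothetical.

```python
import numpy as np

img = np.array([[12, 200],
                [90, 255]], dtype=np.uint8)

# Hypothetical mask: 255 (all bits set) over the ROI, 0 elsewhere.
mask = np.array([[255, 0],
                 [255, 0]], dtype=np.uint8)

roi = np.bitwise_and(img, mask)    # keeps pixels under the mask, zeros the rest
merged = np.bitwise_or(img, mask)  # forces pixels under the mask to white
negative = np.bitwise_not(img)     # inverts every bit: 255 - value for uint8

print(roi)
print(merged)
print(negative)
```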
28. Preprocessing - Logic - AND, OR
[Figure: original image; square used for the AND mask; resulting AND image; square used for the OR mask; resulting OR image.]
29. Preprocessing - Complement Image
[Figure: original image; NOT operator applied to the image.]
30. Preprocessing - Spatial Filters
- Done for noise removal or to perform some type of image enhancement
- Mean filters
  - Used to conceal/remove noise
  - Add a softer look to an image
- Median filters
  - Used to conceal/remove noise
- Enhancement filters
  - Implemented with a convolution mask
31. Preprocessing - Spatial Filters
- Why convolution?
  - It provides a result that is a weighted sum of the values of a pixel and its neighbors (a linear filter)
  - The overall effect of a mask can be predicted from its general pattern
    - If the coefficients of the mask sum to 1, the average brightness of the image is retained
    - If the coefficients of the mask sum to 0, the average brightness is lost and the image will be darker
    - If the coefficients alternate between positive and negative, the mask is a filter that returns edge information
    - If the coefficients are all positive, the mask is a filter that blurs the image
32. Preprocessing - Spatial Filters - Mean Filter
- It is an averaging filter
- Operates on local groups of pixels called neighborhoods, and replaces the center pixel with the average of the pixels in its neighborhood
- E.g.
[Figure: example mean filter convolution mask.]
  - The coefficients sum to 1: brightness is retained
  - The coefficients are all positive: the image is blurred
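A 3 × 3 mean filter can be sketched as follows. It assumes NumPy and a mask of nine equal coefficients of 1/9 (a common choice that satisfies both properties above, but not necessarily the mask shown on the slide). Leaving border pixels unchanged is also an assumption, and `mean_filter_3x3` is a hypothetical name.

```python
import numpy as np

def mean_filter_3x3(img):
    """Replace each interior pixel with the average of its 3 x 3
    neighborhood. Border pixels are copied unchanged (assumption)."""
    img = np.asarray(img, dtype=float)
    out = img.copy()                          # separate output buffer
    for r in range(1, img.shape[0] - 1):
        for c in range(1, img.shape[1] - 1):
            out[r, c] = img[r - 1:r + 2, c - 1:c + 2].mean()
    return out

# A single bright noise pixel is spread out and reduced.
noisy = np.zeros((3, 3))
noisy[1, 1] = 90
print(mean_filter_3x3(noisy)[1, 1])  # 10.0
```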
33. Preprocessing - Spatial Filters - Median Filter
- A nonlinear filter
  - Its result cannot be found by a weighted sum of the neighborhood pixels, as is done with a convolution mask
- It does operate on a local neighborhood
- After the size of the local neighborhood is defined, the center pixel is replaced with the median, or center value, of the sorted neighborhood values, rather than their average
- The output must be written to a separate image (a buffer)
- E.g.
[Figure: example 3 × 3 neighborhood.]
  - First sort the values in order
  - Since it is a 3 × 3 neighborhood, the median is the 5th value, which is 20
  - This 20 is then placed at the center point
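The sort-and-pick-the-middle procedure can be sketched as below, assuming NumPy. The neighborhood values are hypothetical (chosen so that the 5th sorted value is 20, as in the example above, but not the values from the slide), border pixels are left unchanged by assumption, and `median_filter_3x3` is a hypothetical name.

```python
import numpy as np

def median_filter_3x3(img):
    """Replace each interior pixel with the median of its 3 x 3
    neighborhood, writing into a separate buffer."""
    img = np.asarray(img, dtype=float)
    out = img.copy()
    for r in range(1, img.shape[0] - 1):
        for c in range(1, img.shape[1] - 1):
            out[r, c] = np.median(img[r - 1:r + 2, c - 1:c + 2])
    return out

# Hypothetical neighborhood with an outlier (200) at the center.
patch = np.array([[10, 20, 30],
                  [15, 200, 25],
                  [5, 20, 18]])
print(sorted(patch.ravel().tolist()))   # the 5th value is 20
print(median_filter_3x3(patch)[1, 1])   # 20.0
```

Unlike the mean filter, the outlier value does not influence the result at all, which is why the median filter removes isolated noise pixels without blurring as much.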
34. Preprocessing - Spatial Filters - Enhancement Filter
- Includes Laplacian-type and difference filters
- These tend to bring out, or enhance, the details of the image
- The Laplacian-type filters enhance details in all directions equally
- The difference filters enhance details in the direction specific to the mask selected; there are four difference filter convolution masks, corresponding to lines in the vertical, horizontal, and two diagonal directions
35. Preprocessing - Spatial Filters - Enhancement Filter
[Figure: Laplacian-type filter masks; difference filter masks: vertical, horizontal, diagonal 1, diagonal 2.]
36. Preprocessing - Spatial Filters
[Figure: original image with noise; mean filtered image, 8 × 8 kernel; original image; median filtered image.]
37. Preprocessing - Spatial Filters
[Figure: original image; horizontal difference filter image; vertical difference filter image; original image; image after Laplacian filter; contrast-enhanced version of the Laplacian-filtered image.]
38. Preprocessing - Image Quantization
- The process of reducing the image data by removing some of the image detail information, by mapping groups of data points to a single point
- This can be done either to the pixel values themselves, I(r, c), or to the spatial coordinates, (r, c)
- Operations on pixel values are referred to as gray-level reduction
- Operations on spatial coordinates are called spatial reduction
39. Preprocessing - Image Quantization - Gray-Level Reduction - Thresholding
- The simplest method is thresholding
- Effectively turns a gray-level image into a binary image
- What is the usual practice?
  - Set a threshold value, a
  - Pixel values larger than a are changed to 255
  - Pixel values smaller than a are changed to 0
- Used for the extraction of shapes, area, and perimeter
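Thresholding is a one-line operation in NumPy. In the sketch below, the function name `threshold` is hypothetical, and pixels exactly equal to the threshold are mapped to 0, which is an assumption since the rule above does not cover that case.

```python
import numpy as np

def threshold(img, a):
    """Map pixels above `a` to 255 and all others to 0.
    Pixels equal to `a` go to 0 (an assumption)."""
    img = np.asarray(img)
    return np.where(img > a, 255, 0).astype(np.uint8)

print(threshold([50, 127, 128, 200], 127))  # [  0   0 255 255]
```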
40. Preprocessing - Image Quantization - Gray-Level Reduction - Thresholding
[Figure: original image; thresholded images with threshold values 66, 127, and 186.]
41. Preprocessing - Image Quantization - Gray-Level Reduction - AND Process
- Gray-level reduction can be done efficiently by masking the lower bits via an AND operation
- By masking the lower 3 bits, we reduce 256 gray levels to 32 gray levels: 256/8 = 32
- In general, we mask k bits, where 2^k is divided into the original gray-level range to get the quantized range desired
- Through this method, we can reduce the number of gray levels to any power of 2
- As the number of gray levels decreases, contouring increases
- Contouring appears in images as false edges/lines
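Masking the lower k bits can be sketched as follows, assuming NumPy and 8-bit pixel values. The function name `quantize_and` is hypothetical.

```python
import numpy as np

def quantize_and(img, k):
    """Zero the lowest k bits of every 8-bit pixel, leaving
    256 / 2**k distinct gray levels."""
    mask = (0xFF << k) & 0xFF                 # k = 3 gives 0b11111000
    return np.bitwise_and(np.asarray(img, dtype=np.uint8), np.uint8(mask))

print(quantize_and([255, 7, 100], 3))                   # [248   0  96]
print(len(np.unique(quantize_and(np.arange(256), 3))))  # 32
```

Every value is pushed down to the bottom of its 8-wide range, so 96 through 103 all become 96.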
42. Preprocessing - Image Quantization - Gray-Level Reduction - AND Process
[Figure: original image; images quantized to 128, 64, 32, 16, 4, and 2 gray levels.]
43. Preprocessing - Image Quantization - Gray-Level Reduction - IGS
- These false lines can be reduced using IGS (improved gray-scale) quantization
- ANDing quantizes gray-level values to the low end of each range; to quantize to the high end, simply use an OR operation
- To determine the number of 1 bits in the OR mask, apply the same method as for ANDing
- You can also map the quantized values to the midpoint of the range
  - This is done by an AND after the OR, or
  - By doing the AND first, then the OR, to shift the values either up or down
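The OR-based and midpoint mappings can be sketched as below, assuming NumPy, 8-bit pixels, and k of at least 1. This covers only the OR mask and the AND-then-OR midpoint mapping described above, not the IGS method itself; the function names are hypothetical.

```python
import numpy as np

def quantize_or(img, k):
    """Set the lowest k bits, mapping each value to the high end of its range."""
    mask = (1 << k) - 1                       # k = 3 gives 0b00000111
    return np.bitwise_or(np.asarray(img, dtype=np.uint8), np.uint8(mask))

def quantize_midpoint(img, k):
    """AND away the lowest k bits, then OR in the top bit of that group,
    mapping each value to the midpoint of its range."""
    img = np.asarray(img, dtype=np.uint8)
    and_mask = np.uint8((0xFF << k) & 0xFF)   # k = 3 gives 0b11111000
    mid_bit = np.uint8(1 << (k - 1))          # k = 3 gives 0b00000100
    return np.bitwise_or(np.bitwise_and(img, and_mask), mid_bit)

print(quantize_or([100, 0, 255], 3))        # [103   7 255]
print(quantize_midpoint([100, 0, 255], 3))  # [100   4 252]
```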
44. Preprocessing - Image Quantization - Spatial Reduction
- It reduces the actual size of the image
- Done by taking a group of spatially adjacent pixels and mapping them to one pixel
- Three ways
  - Averaging
    - Take all the pixels in each group and find the average
  - Median
    - Sort all the pixel values in each group and take the middle value
  - Decimation (subsampling)
    - Eliminate some of the data
    - E.g., to reduce an image by a factor of 2, we simply delete every other row and column
- Anti-aliasing filtering might be needed to improve the image quality
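The three spatial reduction methods can be compared with a short sketch. It assumes NumPy, a reduction factor of 2 with non-overlapping 2 × 2 groups, and that any odd trailing row or column is dropped; the function name `reduce_by_2` and the sample values are hypothetical, and no anti-aliasing filter is applied.

```python
import numpy as np

def reduce_by_2(img, method="average"):
    """Halve the image size by mapping each 2 x 2 group to one pixel."""
    img = np.asarray(img, dtype=float)
    rows = img.shape[0] // 2 * 2              # drop an odd trailing row
    cols = img.shape[1] // 2 * 2              # drop an odd trailing column

    if method == "decimate":
        return img[:rows:2, :cols:2]          # keep every other row and column

    # Gather each 2 x 2 group into the last axis: shape (rows/2, cols/2, 4).
    blocks = (img[:rows, :cols]
              .reshape(rows // 2, 2, cols // 2, 2)
              .swapaxes(1, 2)
              .reshape(rows // 2, cols // 2, 4))
    if method == "average":
        return blocks.mean(axis=2)
    if method == "median":
        return np.median(blocks, axis=2)
    raise ValueError(f"unknown method: {method}")

sample = np.array([[1, 3, 10, 10],
                   [5, 7, 10, 30]])
print(reduce_by_2(sample, "average"))   # [[ 4. 15.]]
print(reduce_by_2(sample, "median"))    # [[ 4. 10.]]
print(reduce_by_2(sample, "decimate"))  # [[ 1. 10.]]
```

With an even number of values per group, `np.median` averages the two middle values, which is a slight departure from "take the middle value" for odd-sized groups.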