Title: Automatic Disease Detection In Citrus Trees Using Machine Vision
1. Automatic Disease Detection In Citrus Trees Using Machine Vision
- Rajesh Pydipati
- Research Assistant
- Agricultural Robotics and Mechatronics Group (ARMg)
- Agricultural and Biological Engineering
2. Introduction
- The citrus industry is an important constituent of Florida's overall agricultural economy.
- Florida is the world's leading producing region for grapefruit and is second only to Brazil in orange production.
- The state produces over 80 percent of the United States' supply of citrus.
3. Research Justification
- Citrus diseases cause economic losses in citrus production through long-term tree damage and through fruit defects that reduce crop size, quality, and marketability.
- Early detection systems that could detect, and possibly treat, citrus trees for observed diseases or nutrient deficiencies could significantly reduce annual losses.
4. Objectives
- Collect an image data set of various common citrus diseases.
- Evaluate the Color Co-occurrence Method (CCM) for disease detection in citrus trees.
- Develop various strategies and algorithms for classification of citrus leaves based on the features obtained from the color co-occurrence method.
- Compare the classification accuracies of the algorithms.
5. Vision based classification
6. Sample Collection and Image Acquisition
- Leaf sample sets were collected from a typical Florida grapefruit grove for three common citrus diseases and from normal leaves.
- Specimens were separated by classification into plastic Ziploc bags and stored in an environmental chamber maintained at 10 degrees centigrade.
- Forty digital RGB format images were collected for each classification and stored to disk in uncompressed JPEG format. Alternating image selection was used to build the test and training data sets.
7. Leaf sample images
Greasy spot diseased leaf
Melanose diseased leaf
8. Leaf sample images
Scab diseased leaf
Normal leaf
9. Ambient vs. Laboratory Conditions
- Initial tests were conducted in a laboratory to minimize the uncertainty created by ambient lighting variation.
- An effort was made to select an artificial light source that would closely represent ambient light.
- Leaf samples were analyzed individually to identify variations between leaf fronts and backs.
10. (Image-only slide, no transcript)
11. Spectrum Comparison with NaturaLight Filter
12. Image Acquisition Specifications
- Four 16 W Cool White fluorescent bulbs (4500 K) with NaturaLight filters and reflectors
- JAI MV90 3-CCD color camera with a 28-90 mm zoom lens
- Coreco PC-RGB 24-bit color frame grabber at 480 by 640 pixels
- MV Tools image capture software
- MATLAB Image Processing Toolbox
- SAS statistical analysis package
13. Image Acquisition System
14. Camera Calibration
- The camera was calibrated under the artificial light source using a calibration grey-card.
- An RGB digital image was taken of the grey-card, and each color channel was evaluated using histograms and mean and standard deviation statistics.
- Red and green channel gains were adjusted until the grey-card images had similar means in R, G, and B of approximately 128, which is mid-range for a scale from 0 to 255. The standard deviation of the calibrated pixel values was approximately 3.0 (a small sketch of this check follows).
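As a rough illustration of the calibration check, the sketch below (Python/NumPy, not the original MV Tools/MATLAB workflow) computes the per-channel mean and standard deviation of a grey-card image; the file name is hypothetical.

```python
import numpy as np
from PIL import Image

# Hypothetical file name; the original images came from the frame grabber.
img = np.asarray(Image.open("grey_card.png").convert("RGB"), dtype=float)

for i, channel in enumerate("RGB"):
    values = img[:, :, i]
    print(f"{channel}: mean = {values.mean():.1f}, std = {values.std():.1f}")
    # Target after gain adjustment: mean near 128 (mid-range of 0-255)
    # and standard deviation near 3.0 on the uniform grey card.
```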
15. Image acquisition and classification flow chart
16. Color Co-occurrence Method
- The Color Co-occurrence Method (CCM) uses HSI pixel maps to generate three unique Spatial Gray-level Dependence Matrices (SGDMs).
- Each sub-image was converted from RGB (red, green, blue) to HSI (hue, saturation, intensity) color format.
- The SGDM is a measure of the probability that a pixel at one particular gray level will occur at a distinct distance and orientation angle from another pixel, given that the other pixel has a second particular gray level (a sketch of building one such matrix follows).
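A minimal sketch of building the hue-channel SGDM with scikit-image, assuming quantization to 64 gray levels and a one-pixel horizontal offset; the thesis implementation used MATLAB, so this is illustrative only.

```python
import numpy as np
from skimage.color import rgb2hsv
from skimage.feature import graycomatrix

def hue_sgdm(rgb_image, levels=64):
    """Quantize the hue plane and build a co-occurrence matrix for a
    one-pixel offset at 0 degrees (nearest horizontal neighbour)."""
    hsv = rgb2hsv(rgb_image)                      # hue values in [0, 1]
    hue = (hsv[:, :, 0] * (levels - 1)).astype(np.uint8)
    glcm = graycomatrix(hue, distances=[1], angles=[0],
                        levels=levels, symmetric=False, normed=True)
    return glcm[:, :, 0, 0]                       # the levels x levels SGDM
```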
17. CCM Texture Statistics
- CCM texture statistics were generated from the SGDM of each HSI color feature.
- Each of the three matrices is evaluated by thirteen texture statistic measures, resulting in 39 texture features per image (a sketch of a few of these statistics follows).
- The CCM texture statistics were used to build four data models, each using a different combination of the HSI color co-occurrence texture features. SAS STEPDISC was used to reduce the data models through a stepwise variable elimination procedure.
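A few of the thirteen statistics can be computed directly from a normalized SGDM, as sketched below with approximate textbook definitions; the exact feature definitions used in the thesis and the STEPDISC reduction in SAS are not reproduced here.

```python
import numpy as np

def texture_stats(p):
    """A handful of texture statistics from a normalized SGDM `p`
    (shape levels x levels); indices follow the list on slide 18."""
    levels = p.shape[0]
    i, j = np.meshgrid(np.arange(levels), np.arange(levels), indexing="ij")
    eps = 1e-12
    mu_i = (i * p).sum()
    mu_j = (j * p).sum()
    var_i = ((i - mu_i) ** 2 * p).sum()
    var_j = ((j - mu_j) ** 2 * p).sum()
    return {
        "uniformity": (p ** 2).sum(),                        # I1
        "correlation": (((i - mu_i) * (j - mu_j) * p).sum()
                        / np.sqrt(var_i * var_j + eps)),     # I4
        "entropy": -(p * np.log(p + eps)).sum(),             # I7
        "contrast": ((i - j) ** 2 * p).sum(),                # I12
    }
```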
18. Intensity Texture Features
- I1 - Uniformity
- I2 - Mean
- I3 - Variance
- I4 - Correlation
- I5 - Product Moment
- I6 - Inverse Difference
- I7 - Entropy
- I8 - Sum Entropy
- I9 - Difference Entropy
- I10 - Information Correlation Measure 1
- I11 - Information Correlation Measure 2
- I12 - Contrast
- I13 - Modus
19. Classification Models
20. Classifier Based on the Mahalanobis Distance
- The Mahalanobis distance is a very useful way of determining the similarity of a set of values from an unknown sample to a set of values measured from a collection of known samples.
- The Mahalanobis distance method is very sensitive to inter-variable changes in the training data.
21. Mahalanobis Distance (contd.)
- The Mahalanobis distance is measured in terms of standard deviations from the mean of the training samples.
- The reported matching values give a statistical measure of how well the spectrum of the unknown sample matches (or does not match) the original training spectra.
22. Formula for Calculating the Squared Mahalanobis Distance Metric
$d^2(\mathbf{x}) = (\mathbf{x} - \boldsymbol{\mu})^{T} \, \Sigma^{-1} \, (\mathbf{x} - \boldsymbol{\mu})$
- $\mathbf{x}$ is the N-dimensional test feature vector (N is the number of features)
- $\boldsymbol{\mu}$ is the N-dimensional mean vector for a particular class of leaves
- $\Sigma$ is the N x N covariance matrix for a particular class of leaves
23. Minimum Distance Principle
- The squared Mahalanobis distance was calculated from a test image to each class of leaves (see the sketch after this slide).
- The minimum distance was used as the criterion for making classification decisions.
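A minimal sketch of the minimum-distance rule, assuming per-class mean vectors and covariance matrices estimated from the training feature vectors (39 CCM features per sample); names are illustrative, not the thesis code.

```python
import numpy as np

def squared_mahalanobis(x, mean, cov):
    """Squared Mahalanobis distance of feature vector x from one class."""
    diff = x - mean
    return float(diff @ np.linalg.inv(cov) @ diff)

def classify(x, class_stats):
    """class_stats maps a class label to its (mean vector, covariance
    matrix); the label with the smallest squared distance wins."""
    return min(class_stats,
               key=lambda c: squared_mahalanobis(x, *class_stats[c]))

# Usage sketch:
# class_stats = {"greasy_spot": (mu1, cov1), "melanose": (mu2, cov2), ...}
# label = classify(test_features, class_stats)
```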
24. Neural Networks
- A neural network is a system composed of many simple processing elements operating in parallel whose function is determined by network structure, connection strengths, and the processing performed at computing elements or nodes.
- (According to the DARPA Neural Network Study, 1988, AFCEA International Press, p. 60)
25. Neural Networks (contd.)
- A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:
  1. Knowledge is acquired by the network through a learning process.
  2. Inter-neuron connection strengths, known as synaptic weights, are used to store the knowledge.
- (According to Haykin, S. (1994), Neural Networks: A Comprehensive Foundation, NY: Macmillan, p. 2)
26. A Basic Neuron
27. Multilayer Feed-forward Neural Network
28. Back-propagation
- In the MFNN shown earlier, the input layer of the BP network is generally fully connected to all nodes in the following hidden layer.
- Inputs are generally normalized to values between -1 and 1.
- Each node in the hidden layer acts as a summing node for all of its inputs and also applies an activation function.
29. MFNN with Back-propagation
- Each hidden-layer neuron first sums all of its connection inputs and then sends this result to the activation function for output generation.
- The outputs are propagated through all the layers until the final output is obtained.
30. Mathematical Equations
- The governing equations of a basic neuron are:
  $u = \sum_{i} w_i x_i = w_1 x_1 + w_2 x_2$
  $y = f(u - \theta)$
- where $x_1, x_2$ are the input signals, $w_1, w_2$ are the synaptic weights, $u$ is the activation potential of the neuron, $\theta$ is the threshold, $y$ is the output signal of the neuron, and $f(\cdot)$ is the activation function.
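A quick numerical check of these equations, with invented input values and a tanh (tansig-like) activation; this is only an illustration of the neuron model, not code from the research.

```python
import numpy as np

x = np.array([0.5, -0.2])   # input signals x1, x2 (invented values)
w = np.array([0.8, 0.4])    # synaptic weights w1, w2 (invented values)
theta = 0.1                 # threshold
u = np.dot(w, x)            # activation potential u = w1*x1 + w2*x2
y = np.tanh(u - theta)      # output through a tansig-style activation
print(f"u = {u:.3f}, y = {y:.3f}")
```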
31. Back-propagation
- The back-propagation algorithm is the most important algorithm for the supervised training of multilayer feed-forward ANNs.
- The BP algorithm was originally developed using gradient descent to train multilayer neural networks to perform desired tasks.
32. Back-propagation Algorithm
- The BP training process begins by selecting a set of training input vectors along with their corresponding output vectors.
- The outputs of the intermediate stages are forward-propagated until the output layer nodes are activated.
- The actual outputs are compared with the target outputs using an error criterion.
33. Back-propagation (contd.)
- The connection weights are updated using the gradient descent approach by back-propagating the weight changes from the output layer to the input layer (a minimal numerical sketch follows).
- The net changes to the network are accomplished at the end of one training cycle.
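A minimal NumPy sketch of these forward and backward passes for a tiny one-hidden-layer network; the data shapes, targets, and learning rate are invented for illustration and do not reflect the network used in the research.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))           # 8 samples, 5 features (invented)
T = rng.integers(0, 2, size=(8, 4))   # 8 samples, 4 target outputs (invented)
W1 = rng.normal(scale=0.1, size=(5, 10))
W2 = rng.normal(scale=0.1, size=(10, 4))
eta = 0.1                             # learning rate

for epoch in range(100):
    H = np.tanh(X @ W1)               # forward pass, hidden layer
    Y = np.tanh(H @ W2)               # forward pass, output layer
    E = Y - T                         # error against target outputs
    dY = E * (1 - Y ** 2)             # back-propagate through tanh
    dH = (dY @ W2.T) * (1 - H ** 2)
    W2 -= eta * H.T @ dY              # gradient-descent weight updates
    W1 -= eta * X.T @ dH
```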
34. BP Network Architecture Used in the Research
- Network architecture: an input layer, two hidden layers with 10 processing elements each, and an output layer of 4 output neurons (an approximate sketch follows).
- A tansig activation function was used at all layers.
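A comparable architecture could be sketched with scikit-learn as below; this only approximates the MATLAB back-propagation network used in the research (scikit-learn applies a softmax output rather than tansig output neurons), and the training parameters are assumptions.

```python
from sklearn.neural_network import MLPClassifier

# Two hidden layers of 10 tanh units, trained with stochastic gradient descent.
mlp = MLPClassifier(hidden_layer_sizes=(10, 10), activation="tanh",
                    solver="sgd", learning_rate_init=0.01,
                    max_iter=2000, random_state=0)

# Usage sketch (X_train: selected CCM features, y_train: class labels 1-4):
# mlp.fit(X_train, y_train)
# accuracy = mlp.score(X_test, y_test)
```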
35. Radial Basis Function Networks
- A radial basis function (RBF) network is a neural network designed by viewing the problem as curve fitting (approximation) in a high-dimensional space.
- Learning is equivalent to finding a multidimensional function that provides a best fit to the training data.
36. An RBF Network
37. RBF (contd.)
- The front layer of the RBF network is the input layer, where the input vector is applied to the network.
- The hidden layer consists of radial basis function neurons, which perform a fixed non-linear transformation mapping the input space into a new space.
- The output layer serves as a linear combiner over the new space.
38. RBF Network Used in the Research
- Network architecture: 80 radial basis functions in the hidden layer and 2 outputs in the output layer (a rough sketch of such a network follows).
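A rough NumPy sketch of an RBF network of this kind, with Gaussian hidden units and a least-squares linear output layer; the centre selection and width are illustrative assumptions, not the configuration used in the thesis.

```python
import numpy as np

def rbf_design(X, centers, sigma):
    """Gaussian hidden-layer responses for each sample and each centre."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma ** 2))

def rbf_fit(X_train, Y_train, centers, sigma):
    """Fit the linear output layer (the 'linear combiner') by least squares."""
    Phi = rbf_design(X_train, centers, sigma)
    W, *_ = np.linalg.lstsq(Phi, Y_train, rcond=None)
    return W

def rbf_predict(X, centers, sigma, W):
    return rbf_design(X, centers, sigma) @ W

# Usage sketch: centers could be 80 training samples chosen as basis centres,
# Y_train a matrix of target outputs, sigma a hand-tuned width.
```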
39. Data Preparation
- Forty images of each of the four classes of leaves were taken.
- The images were divided into training and test data sets sequentially for all the classes (see the sketch below).
- Feature extraction was performed on all the images following the CCM method.
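A small sketch of the alternate-image split described on slide 6, assuming the per-class feature rows have already been extracted; the data layout and labels here are illustrative.

```python
def alternate_split(features_by_class):
    """Even-indexed images per class go to training, odd-indexed to test
    (20 each from the 40 images per class)."""
    train, test = [], []
    for label, rows in features_by_class.items():
        train += [(label, r) for r in rows[0::2]]
        test += [(label, r) for r in rows[1::2]]
    return train, test
```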
40. Data Preparation (contd.)
- Finally, the data was divided into two text files:
  1) Training texture feature data (with all 39 texture features), and
  2) Test texture feature data (with all 39 texture features).
- Each file had 80 rows, representing 20 samples from each of the four classes of leaves, as discussed earlier. Each row had 39 columns, representing the 39 texture features extracted for a particular sample image.
41. Data Preparation (contd.)
- Each row also carried a unique class label (1, 2, 3, or 4) identifying the class to which that row of data belonged.
- These basic files were used to select the appropriate inputs for the various data models based on the SAS analysis.
42. Experimental Methods
- The training data was used to train the various classifiers discussed in the earlier slides.
- Once training was complete, the test data was used to measure classification accuracies.
- Results for the various classifiers are given in the following slides.
43. Results
44. Results
45. Results
46. Results
47. Comparison of Various Classifiers for Model 1B (classification accuracy, %)
Classifier     Greasy spot   Melanose   Normal   Scab   Overall
SAS            100           100        90       95     96.3
Mahalanobis    100           100        100      95     98.75
NN-BP          100           90         95       95     95
RBF            100           100        85       60     86.25
48. Summary
- Model 1B, consisting of features from hue and saturation, is concluded to be the best model for the task of citrus leaf classification.
- Eliminating intensity from the texture feature calculation is its major advantage: it counteracts the effect of lighting variations in an outdoor environment.
49. Conclusion
- The research was a feasibility analysis of whether the techniques investigated here can be implemented in future real-time applications.
- The results are a positive step in that direction. Nevertheless, a real-time system would require some modifications and trade-offs to make it practical for outdoor applications.
50. Thank You
- May I answer any questions?