Title: Forecasting future technological needs for rice crop in India Questionnaire
1Pests and Diseases Forewarning System
Amrender Kumar
Scientist, Indian Agricultural Statistics Research Institute,
Library Avenue, New Delhi, India
akjha_at_iasri.res.in
2Crop Pests - Weather Relationship
Crop
Weather
Pests
3- Diseases and pests are major causes of reduction in crop yields.
- However, if information about the time and severity of an outbreak of diseases and pests is available in advance, timely control measures can be taken to reduce the losses.
- Weather plays an important role in pest and disease development.
- Therefore, weather-based models can be an effective scientific tool for forewarning diseases and pests.
4Why pests and disease forewarning
- Forewarning / assessment of disease is important for crop production management
- for timely plant protection measures
- if information on whether the disease status is expected to be below or above the threshold level is enough, models based on qualitative data can be used (qualitative models)
- for loss assessment
- forewarning of actual intensity is required (quantitative model)
5Variables of interest
- Maximum pest population or disease severity.
- Pest population / disease severity at the most damaging stage, i.e. egg, larva, pupa or adult.
- Pest population or disease severity at different stages of crop growth or at various standard weeks.
- Time of first appearance of pests and diseases.
- Time of maximum population / severity of pests and diseases.
- Weekly monitoring of pest and disease progress.
- Occurrence / non-occurrence of pests and diseases.
- Extent of damage.
6Data Structure
Historical data at periodic intervals for 10-15 years

Year     Observation
         1     2     3     4     ...
1        y11   y12   ...
2        y21   y22   ...
...
10-15    ...
7- Historical data for 10-15 years at one point of time
- overall status
- disease intensity
- crop damage.
8- Data for 5-6 years at periodic intervals
- For week-wise models, data points inadequate
- combined model for the whole data in two steps
- Data at one point of time for 5-6 years
- Model development not possible
- Qualitative data for 10-15 years
- Qualitative forewarning
- Occurrence / non-occurrence of disease
- Mixed data: conversion to qualitative categories
- Data collected at periodic intervals for one year
- Within year growth model
9- Choice of explanatory variables
- Relevant weather variables
- appropriate lag periods depending on life cycle
- Crop stage / age
- Natural enemies
- Starting population / previous year's last population of pathogen
10Forecast Models
- Between-year models
- These models are developed using previous years' data.
- The forecast for pests and diseases can be obtained by substituting the current year's data into a model developed on the previous years.
- Within-year models
- Sometimes past data are not available, but the pest and disease status at different points of time during the current crop season is available.
- In such situations, a within-year growth model can be used, provided there are 10-12 data points between the time of first appearance of pests and diseases and the maximum or most damaging stage.
- The methodology consists of fitting an appropriate growth pattern to the pest and disease data, based on partial data.
11- Thumb rules
- Most common
- Extensively used
- Judgment based on past experience with no or little mathematical background
- Example
- A day is potato late blight favorable if
- the last 5-day temperature average is < 25.5 °C
- the total rainfall for the last 10 days is > 3.0 cm
- the minimum temperature on that day is > 7.2 °C
- Trivedi et al. (1999)
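The thumb rule above can be written as a single check. The following is a minimal sketch, assuming that all three conditions must hold together and that the "temperature average" is the average of daily mean temperatures; the function name and the sample readings are hypothetical and not part of the original rule.

```python
def is_late_blight_favourable(mean_temps_c, rainfall_cm, min_temp_today_c):
    """Return True when all three thumb-rule conditions hold for today.

    mean_temps_c: daily mean temperatures (deg C), oldest first, today last
    rainfall_cm: daily rainfall (cm), oldest first, today last
    min_temp_today_c: today's minimum temperature (deg C)
    """
    if len(mean_temps_c) < 5 or len(rainfall_cm) < 10:
        raise ValueError("need 5 days of temperature and 10 days of rainfall")
    avg_temp_5d = sum(mean_temps_c[-5:]) / 5      # last 5-day average
    total_rain_10d = sum(rainfall_cm[-10:])       # last 10-day total
    # Assumption: the three conditions are combined with a logical AND.
    return (avg_temp_5d < 25.5
            and total_rain_10d > 3.0
            and min_temp_today_c > 7.2)


# Hypothetical readings, for illustration only
print(is_late_blight_favourable([20, 21, 22, 23, 24], [0.5] * 10, 10.0))
```

A rule like this needs no fitted parameters, which is why it is widely used, but the thresholds are fixed and cannot adapt to a location.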
12- Regression models
- Relationship between two or more quantitative variables
- The model is of the form
- Y = β0 + β1 X1 + β2 X2 + ... + βp Xp + e,
- where
- βi's are regression coefficients
- Xi's are independent variables
- Y is the variable to forecast
- e is the random error
- Variables can be taken as such or after some suitable transformation
13- Cotton
- % incidence of Bacterial blight (Akola): weekly models (42nd to 44th SMW)
- Data used: 1993-1999 on MAXTemp, MINTemp, RH1 (morning), RH2 (afternoon) and RF (X1 to X5), lagged by 2 to 4 weeks
- Model for 44th SMW
- Y = 133.18 - 3.09 RH2L4 + 1.68 RFL4 (R² = 0.78)
15- Potato
- Potato aphid is an abundant potato pest and a vector of potato leaf-roll virus, potato virus Y, PVA, etc.
- Potato aphid population, Pantnagar (weekly models)
- Data used: 1974-96 on MAXT, MINT and RH (X1 to X3), lagged by 2 weeks
- Model for December 3rd week
- Y = 80.25 + 40.25 cos(2.70 X12 - 14.82) + 35.78 cos(6.81 X22 + 8.03)
16Aphid popn. in 3rd week of December at
Pantnagar
17GDD approach
- GDD = Σ (mean temperature - base temperature)
- The decision of
- Base temperature
- Initial time
- Not much work on base temperature for various diseases
- Normally base temperature is taken as 5 °C
- Under Indian conditions, mean temperature is seldom below 5 °C
- Use of GDD and simple accumulation of mean temperature will provide similar results in statistical models
- Need for work on base temperature and initial time of calculation
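The GDD accumulation can be sketched in a few lines. This is an illustration only: the function name and the sample temperatures are hypothetical, and treating days below the base temperature as contributing zero is a common convention that is assumed here, not stated on the slide.

```python
def growing_degree_days(daily_mean_temps_c, base_temp_c=5.0):
    """Accumulate (mean temperature - base temperature) over the given days.

    Assumption: a day colder than the base temperature contributes 0.
    """
    return sum(max(t - base_temp_c, 0.0) for t in daily_mean_temps_c)


# Hypothetical daily mean temperatures (deg C), all above the 5 deg C base
temps = [18.0, 20.5, 22.0, 19.5]
gdd = growing_degree_days(temps)
simple_total = sum(temps)

# When no day falls below the base, GDD is the simple total minus a constant
# (number of days x base temperature), so both carry the same information
# in a regression-type model.
print(gdd, simple_total - len(temps) * 5.0)
```

This is the reason the slide expects GDD and simple accumulation to give similar results under Indian conditions: the two differ only by a constant when the mean temperature stays above the base.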
18- Under Indian conditions, other variables are also important
- Model using simple accumulations not found appropriate
- Models based on weighted weather indices
where
19- Y: variable to forecast
- xiw: value of the i-th weather variable in the w-th period
- riw: weight given to the i-th weather variable in the w-th period
- rii'w: weight given to the product of xi and xi' in the w-th period
- p: number of weather variables
- n1 and n2: the initial and final periods for which weather variables are to be included in the model
- e: error term
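The model equation itself did not survive in this transcript. A form that is consistent with the symbols defined above, and with the Z indices that appear in the fitted models later in the deck, might read as follows. The coefficient names a_0, a_i and a_{ii'} and the index names Z_i and Z_{ii'} are assumed notation, not taken from the slide.

```latex
Y = a_0 + \sum_{i=1}^{p} a_i Z_i + \sum_{i \neq i'}^{p} a_{ii'} Z_{ii'} + e,
\qquad
Z_i = \sum_{w=n_1}^{n_2} r_{iw}\, x_{iw},
\qquad
Z_{ii'} = \sum_{w=n_1}^{n_2} r_{ii'w}\, x_{iw}\, x_{i'w}
```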
20- Experience based weights
- Subjective weights based on experience.
- Weather variable not favourable: weight = 0
- Weather variable favourable: weight = ½
- Weather variable very favourable: weight = 1
21- Example
- Favourable relative humidity ≥ 92%
- Most favourable relative humidity ≥ 98%
- Weather data (relative humidity, %)

Year    Week No.
        1      2      3      4      5      6
1993    88.7   90.1   94.4   98.3   98.0   95.0
.       94.0   93.3   94.9   93.3   92.0   88.1
.       90.3   91.9   90.4   87.9   86.4   89.7
...
22- Weighted Index
- 0 x 88.7 + 0 x 90.1 + 0.5 x 94.4 + 1 x 98.3 + 1 x 98.0 + 0.5 x 95.0 = 291.0
- 0.5 x 94.0 + 0.5 x 93.3 + 0.5 x 94.9 + 0.5 x 93.3 + 0.5 x 92.0 + 0 x 88.1 = 233.75
- 0 x 90.3 + 0 x 91.9 + 0 x 90.4 + 0 x 87.9 + 0 x 86.4 + 0 x 89.7 = 0.0
- ...
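The weighted index for the relative humidity example can be computed directly from the weekly values. The sketch below assumes the thresholds and experience-based weights given in the example (0 below 92, ½ from 92, 1 from 98); the function names are hypothetical.

```python
def experience_weight(rh, favourable=92.0, most_favourable=98.0):
    """Experience-based weight for one weekly relative humidity value."""
    if rh >= most_favourable:
        return 1.0
    if rh >= favourable:
        return 0.5
    return 0.0


def weighted_index(weekly_values):
    """Sum of weight x value over the weeks of one year."""
    return sum(experience_weight(v) * v for v in weekly_values)


# Weekly relative humidity (%) for the three rows of the example table
rows = [
    [88.7, 90.1, 94.4, 98.3, 98.0, 95.0],
    [94.0, 93.3, 94.9, 93.3, 92.0, 88.1],
    [90.3, 91.9, 90.4, 87.9, 86.4, 89.7],
]
indices = [round(weighted_index(r), 2) for r in rows]
print(indices)
```

Evaluated on the tabulated values, the three rows give 291.0, 233.75 and 0.0. One index value per year is then used as a regressor in the forecast model.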
23Interaction
- Both variables not favourable: weight = 0
- One variable not favourable, one variable favourable: weight = 1/8
- One variable not favourable, one variable highly favourable: weight = ¼
- Both variables favourable: weight = ½
- One variable favourable, one variable highly favourable: weight = ¾
- Both variables highly favourable: weight = 1
24Correlation based weights
- riw: correlation coefficient between Y and the i-th weather variable in the w-th period
- rii'w: correlation coefficient between Y and the product of xi and xi' in the w-th period
25- Modified model
- Model using both weighted and unweighted indices
where
26- For each weather variable, two types of indices have been developed
- Simple total of values of the weather variable in different periods
- Weighted total, weights being correlation coefficients between the variable to forecast and the weather variable in respective periods
- The first index represents the total amount of the weather variable received by the crop during the period under consideration
- The other one takes care of the distribution of the weather variable with reference to its importance in different periods in relation to the variable to forecast
- On similar lines, composite indices were computed with products of weather variables (taken two at a time) for joint effects.
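The two indices for a single weather variable can be sketched as follows. This is an illustration under stated assumptions: the function names and the small data set are hypothetical, and the correlation weights are ordinary Pearson coefficients computed across years for each period.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def weather_indices(x_by_year, y):
    """x_by_year[t][w]: value of one weather variable in year t, period w.
    y[t]: variable to forecast in year t.
    Returns (simple totals, weighted totals, weights), one total per year."""
    n_periods = len(x_by_year[0])
    weights = [pearson([row[w] for row in x_by_year], y)
               for w in range(n_periods)]
    simple_total = [sum(row) for row in x_by_year]
    weighted_total = [sum(r * x for r, x in zip(weights, row))
                      for row in x_by_year]
    return simple_total, weighted_total, weights


# Hypothetical data: 4 years, 2 periods
x = [[1, 5], [2, 4], [3, 6], [4, 3]]
y = [1, 2, 3, 4]
simple_total, weighted_total, weights = weather_indices(x, y)
print(simple_total, weighted_total, weights)
```

Composite indices for joint effects follow the same pattern, with the product of two weather variables in each period taking the place of the single variable.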
27Pigeon pea
- Phytophthora blight (Kanpur)
- Average percent incidence of phytophthora blight at one point of time
- Data used: 1985-86 to 1999-2000 on MAXT, MINT, RH1, RH2 and RF (X1-X5) from 28th to 33rd SMW
- Y = 330.77 + 0.12 Z121 .. (R² = 0.77)
28- Sterility Mosaic
- Average percent incidence of sterility mosaic
- Data used: 1983-84 to 1999-2000 on MAXT, MINT, RH1, RH2 and RF (X1-X5) from 20th to 32nd SMW
- Y = -180.41 + 0.09 Z121 (R² = 0.84)
29- Validation for subsequent years
31Groundnut
- Late Leaf Spot (LLS) and Rust, Tirupati
- Disease indices at one point of time
- Data used: MAXT, MINT, RH1, RH2, RF and WS (X1-X6) from
- 10th to 14th SMW (Rabi or post rainy)
- 41st to 46th SMW (Kharif or rainy)
32Models for LLS and Rust Disease Index - groundnut (Tirupati)

Disease       Data used    Model                                                    R²
LLS Kharif    1990-1998    Y = 39.40 - 0.00921 Z120 + 0.00037 Z460 + 0.0022 Z141    0.84
LLS Rabi      1990-1999    Y = 15.95 + 0.12 Z151 + 0.0057 Z350                      0.83
Rust Kharif   1990-1995    Y = 0.4213 + 0.0167 Z231 - 0.147 Z10                     0.94
34- Principal component regression
- Independent variables large in number and correlated
- Independent variables transformed to principal components
- First few principal components explaining the desired variation selected
- Regression model using principal components as regressors
35- Discriminant function analysis
- Based on disease status, years grouped into different categories: low, medium, high
- Linear / quadratic discriminant function using weather data in the above categories
- Discriminant score of weather for each year
- Regression model using disease data as the dependent variable and discriminant scores of weather as independent variables.
- Data requirement is greater.
- Can also be used if disease data are qualitative
- Johnson et al. (1996) used discriminant analysis for forecasting potato late blight.
36- Deviation method
- Useful when only 5-6 years of data are available for different periods
- Week-wise data not adequate for modeling
- Combined model considering the complete data.
- Not used for disease forewarning but in pest forewarning
37- Assumption: pest population / disease incidence in a particular year at a given point of time is composed of two components.
- Natural growth pattern
- Weather fluctuations
- Natural pattern to be identified using data in different periods averaged over years.
- Deviations of individual years in different periods from the predicted natural pattern to be related to deviations of weather.
38- Mango
- Mango fruitfly, Lucknow (weekly models)
- Data used: 1993-94 to 1998-99 on MAXT, MINT and RH (X1 to X3)
- Model for natural pattern
t = week no., Yt = fruitfly population count at week t
40Forecast model
- Y ? 125.766 0.665 (Y2) 0.115 (1/X222 )
10.658 (X212) - 0.0013 (Y23) 31.788 (1/Y3) ? 21.317
(X12) - ? 2.149 (1/X233) ? 1.746 (1/X234)
- Y = deviation of fruitfly population from natural cycle
- Yi = fruitfly population in i-th lag week
- Xij = deviation from average of i-th weather variable (i = 1, 2, 3 corresponds to maximum temperature, minimum temperature and relative humidity) in j-th lag week.
41Soft Computing Techniques
42- With the development of computer hardware and software and the rapid computerization of business, huge amounts of data have been collected and stored in centralized or distributed databases.
- Data are heterogeneous (a mixture of text, symbolic, numeric, texture and image data), huge (both in dimension and size) and scattered.
- The volume of such stored data is growing at a phenomenal rate.
- As a result, traditional statistical techniques and data management tools are no longer adequate for analyzing this vast collection of data.
43- One of the applications of Information Technology that has drawn the attention of researchers is data mining, to which pattern recognition, image processing and machine intelligence (i.e. the development of algorithms and techniques that allow a system to "learn") are directly related.
- Data Mining involves
- Statistics: provides the background for the algorithms.
- Artificial Intelligence: provides the required heuristics for learning the system.
- Data Management: provides the platform for storage and retrieval of raw and summary data.
44- Pattern Recognition and Machine Learning principles applied to a very large (both in size and dimension) heterogeneous database for Knowledge Discovery
- Knowledge Discovery is the process of identifying valid, novel, potentially useful and ultimately understandable patterns in data. Patterns may embrace associations, correlations, trends, anomalies, statistically significant structures, etc.
- Without Soft Computing, Machine Intelligence and Data Mining may remain incomplete
45Soft Computing
- Soft Computing is a new multidisciplinary field proposed by Dr. Lotfi Zadeh, whose goal was to construct a new generation of Artificial Intelligence, known as Computational Intelligence.
- The concept of Soft Computing has evolved. Dr. Zadeh defined Soft Computing in its latest incarnation as the fusion of the fields of fuzzy logic, neural networks, neuro-computing, evolutionary and genetic computing, and probabilistic computing into one multidisciplinary system.
- Soft Computing is the fusion of methodologies that were designed to model and enable solutions to real-world problems which are not modelled, or are too difficult to model. These problems are typically associated with fuzzy, complex and dynamical systems with uncertain parameters.
- These systems are the ones that model the real world and are of most interest to modern science.
46- The main goal of Soft Computing is to develop intelligent systems and to solve nonlinear and mathematically unmodelled system problems (Zadeh 1993, 1996, 1999).
- The applications of Soft Computing have two main advantages.
- First, it made it possible to solve nonlinear problems for which mathematical models are not available.
- Second, it introduced human knowledge such as cognition, recognition, understanding, learning and others into the fields of computing.
- This resulted in the possibility of constructing intelligent systems such as autonomous self-tuning systems and automated designed systems.
47Soft computing tools
- Soft computing tools include
- Fuzzy sets
- Fuzzy sets provide a natural framework for dealing with uncertainty
- Artificial neural networks
- Neural networks are widely used for modelling complex functions and provide learning and generalization capabilities
- Genetic algorithms
- Genetic algorithms are an efficient search and optimization tool
- Rough set theory
- Rough sets help in granular computation and knowledge discovery
48- Why Neural Networks are desirable
- Human brain can generalize from abstract
- Recognize patterns in the presence of noise
- Recall memories
- Make decisions for current problems based on prior experience
- Why desirable in Statistics
- Prediction of future events based on past experience
- Able to classify patterns in memory
- Predict latent variables that are not easily measured
- Non-linear regression problems
49Application of ANNs
- Modelling and Control
- control systems
- system identification
- composing music
- Forecasting
- economic indicators
- energy requirements
- medical outcomes
- crop forecasts
- environmental risks
- Classification
- medical diagnosis
- signature verification
- character recognition
- voice recognition
- image recognition
- face recognition
- loan risk evaluation
- data mining
50- Neural networks are being successfully applied across an extraordinary range of problem domains, in areas as diverse as finance, medicine, engineering, geology, biology, physics and agriculture.
- From a statistical perspective, neural networks are interesting because of their potential use in prediction and classification problems.
- A very important feature of these networks is their adaptive nature, where "learning by example" replaces "programming" in solving problems.
- The basic capability of neural networks is to learn patterns from examples
51- Type of neural network models
- Two types of neural network models
- Multilayer perceptron (MLP) with different hidden layers and nodes
- Radial basis function (RBF)
52Neural network based model
- Steps in developing a neural network model
- Forming training, testing and validation sets
- Neural network model
- No. of input nodes
- No. of hidden layers
- No. of hidden nodes
- No. of output nodes
- Activation function
- Model building
- Sensitivity Analysis
53Data sets
- The available data are divided into three data sets
- Training set: represents the input-output mapping, and is used to modify the weights.
- Validation set: required only to decide when to stop training the network, and not for weight update.
- Test set: the part of the collected data that is set aside to test how well a trained neural network generalizes.
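A three-way split of this kind can be sketched as below. The 60/20/20 proportions, the fixed random seed and the function name are assumptions made for illustration; the slides do not state how the data were divided.

```python
import random


def split_three_ways(records, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle the records and cut them into training, validation and test sets.

    Assumption: proportions are 60/20/20; whatever remains after the
    training and validation cuts becomes the test set.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # reproducible shuffle
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


# Hypothetical records: 20 observation identifiers
records = list(range(20))
train, valid, test = split_three_ways(records)
print(len(train), len(valid), len(test))
```

Only the training set drives the weight updates; the validation set is consulted to stop training, and the test set is left untouched until the final check.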
54- No. of input nodes: more than one
- No. of hidden layers: one / two
- No. of hidden nodes: decided by various rules
- No. of output nodes: one
- Activation function: hyperbolic tangent
55- Activation function
- Activation functions determine the output of a processing node. Non-linear functions such as logistic, tanh, etc. have been used as activation functions.
- Activation functions such as the sigmoid are commonly used because they are nonlinear and continuously differentiable, which are desirable properties for network learning
- Logistic activation functions are mainly used for classification problems, which involve learning about average behavior
- Hyperbolic tangent functions are used for problems that involve learning about deviations from the average, such as forecasting problems.
- Therefore, in the present study, the hyperbolic tangent (tanh) function has been used as the activation function for neural network models based on the MLP architecture.
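The difference between the two activation functions is easy to see numerically. The sketch below is illustrative only; the node inputs, weights and function names are hypothetical and do not come from the study.

```python
import math


def logistic(z):
    """Logistic (sigmoid) activation: output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def node_output(inputs, weights, bias, activation=math.tanh):
    """Output of one processing node: activation of the weighted sum plus bias."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# tanh is a rescaled logistic, centred on zero: tanh(z) = 2 * logistic(2z) - 1
z = 0.7
print(math.tanh(z), 2 * logistic(2 * z) - 1)

# Hypothetical node with two inputs
print(node_output([1.0, -2.0], [0.5, 0.25], bias=0.0))
```

Because tanh outputs are centred on zero and span (-1, 1), positive and negative deviations from the average are represented symmetrically, which fits the forecasting use described above.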
56Input
Output
57Learning of ANNs
- The most significant property of a neural network is that it can learn from its environment and can improve its performance through learning
- Learning is the process of modifying the weights in the network
- The network becomes more knowledgeable about its environment after each iteration of the learning process.
- There are mainly two types of learning paradigms
- Supervised learning
- Unsupervised learning
58A learning cycle in the MLP (Backpropagation
Learning Algorithm)
59Three-layer back-propagation neural network
60- Mustard
- Alternaria blight (Varuna, Rohini, Binoy)
- Bharatpur (Raj)
- Berhampur (WB)
- Dholi (Bihar)
- Powdery mildew (Varuna and GM2)
- S.K.Nagar
- Variable to forewarn
- crop age at first appearance of disease
- crop age at peak severity of disease
- maximum severity of disease
- Cotton
- Bacterial blight (% of disease incidence) - Akola
61Pests / diseases forewarning-Mustard
- Data have been taken from the Mission Mode Project under the National Agricultural Technology Project, entitled "Development of weather based forewarning system for crop pests and diseases", at CRIDA, Hyderabad.
- Models were developed for forecasting different aspects of Alternaria Blight (AB) and Powdery Mildew (PM) in mustard.
- The field trials were sown on 10 dates at weekly intervals (01, 08, 15, 22, 29 October, 05, 12, 19, 26 November and 03 December) at each of the locations, viz. Bharatpur, Dholi and Berhampur for Alternaria Blight and S.K.Nagar for Powdery Mildew.
- Data for different dates of sowing were taken together for model development.
- Weekly data on weather variables from the week of sowing up to six weeks of crop growth were considered
- Forewarning models were developed for two varieties of mustard crop for
- Alternaria Blight on leaf and pod (Varuna and Rohini at Bharatpur, Varuna and Binoy at Berhampur, and Varuna and Pusabold at Dholi) and
- Powdery Mildew on leaf (Varuna and GM2 at S.K.Nagar)
- Models have been validated using data from subsequent years not included in developing the models.
62Mean Absolute Percentage Error of various models at Bharatpur in different varieties in mustard crop for Alternaria blight (AB) - 2006-07

Character              Variety            MLP     RBF     WI
Maximum severity       Varuna (on Leaf)   111.0   153.8   150.1
Age at First app       Varuna (on Leaf)   14.0    15.1    14.7
Age at Peak Severity   Varuna (on Leaf)   14.1    27.3    22.3
Maximum severity       Varuna (on Pod)    113.7   143.6   132.6
Age at First app       Varuna (on Pod)    15.7    9.2     14.2
Age at Peak Severity   Varuna (on Pod)    3.9     6.4     5.4
Maximum severity       Rohini (on Leaf)   184.0   200.6   196.3
Age at First app       Rohini (on Leaf)   12.0    15.5    8.9
Age at Peak Severity   Rohini (on Leaf)   28.3    27.8    26.2
Maximum severity       Rohini (on Pod)    174.8   220.4   229.6
Age at First app       Rohini (on Pod)    29.3    28.2    24.7
Age at Peak Severity   Rohini (on Pod)    17.2    20.7    19.6
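The comparison criterion in the table, mean absolute percentage error (MAPE), can be sketched as below using its usual definition. The function name and the sample values are hypothetical; the study's own observed and predicted values are not given in the slides.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, in percent.

    Uses the usual definition: the average of |observed - predicted| / |observed|,
    multiplied by 100. Observed values of zero are rejected.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty series of equal length")
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observed value is zero")
    total = sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


# Hypothetical values, for illustration only
print(mape([10.0, 20.0], [12.0, 18.0]))   # close predictions
print(mape([2.0, 4.0], [5.0, 9.0]))       # errors larger than the observed values
```

MAPE is not capped at 100: when the absolute errors exceed the observed values, as can happen with small observed severities, the figure goes above 100, which is the pattern seen in the maximum severity rows.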
63- Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and classifications
- Neural networks do not perform miracles. But if used sensibly they can produce some amazing results
64- Model for qualitative data
- Data in categories
- Occurrence / non-occurrence, low / medium / high, etc.
- Classified as 0 / 1 (two categories) or 0, 1, 2 (three categories)
- Quantitative data / mixed data can be converted to categories
65Logistic Regression model
- where L = β0 + β1 x1 + β2 x2 + ... + βn xn
- x1, x2, x3, ..., xn are weather variables / weather indices
- e is the random error
- Forecast / Prediction rule
- If P < 0.5, then the probability of epidemic occurrence will be minimal
- If P ≥ 0.5, then there is more chance of occurrence of epidemic.
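The prediction rule can be sketched as follows. The probability equation itself is not in this transcript, so the sketch assumes the standard logistic form P = 1 / (1 + exp(-L)); the function names and the sample inputs are hypothetical.

```python
import math


def epidemic_probability(intercept, coefficients, xs):
    """P = 1 / (1 + exp(-L)) with L = intercept + sum(coefficient * x).

    Assumption: the standard logistic link; the slide's own equation
    is missing from the transcript.
    """
    linear = intercept + sum(b * x for b, x in zip(coefficients, xs))
    return 1.0 / (1.0 + math.exp(-linear))


def forewarn(probability, threshold=0.5):
    """Return 1 (epidemic expected) when P >= threshold, else 0."""
    return 1 if probability >= threshold else 0


# Hypothetical coefficient and input, for illustration only
p = epidemic_probability(0.0, [1.0], [0.0])
print(p, forewarn(p))
```

Applied to forecast probabilities such as 0.51 and 0.49, the rule returns 1 and 0 respectively, so values near the threshold deserve a cautious reading.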
66- Rice
- Leaf blast severity (%) - Palampur, at one point of time
- Data used: 1991-92 to 1998-99 on MAXT, MINT, RH1, RH2, BSH and RF (X1 to X6) from 23rd to 31st SMW.
- Model
- L = 394.8 - 0.0520 Z351 - 1.5414 Z10
- Validation for subsequent years

Year      Observed   Forewarning   Probability
1999-00   1          1             0.88
2000-01   1          1             0.63
67Mustard
- Alternaria blight and White rust
- Data used: 1987-88 to 1998-99 on MAXT, MINT, RH1, RH2 and BSH (X1 to X5) from week of sowing (n1) to 50th SMW (n2)
- Model for Alternaria blight
- L = -8.8347 + 0.0163 Z120 - 0.00037 Z130 - 0.00472 Z450
- Model for White rust
- L = 5.8570 - 0.0293 Z40 + 0.00264 Z230
- Forecasts of subsequent years are

          Alternaria blight                   White rust
Year      Observed   Forewarning   Prob.      Observed   Forewarning   Prob.
1999-00   1          1             0.51       1          1             0.96
2000-01   0          0             0.13       0          0             0.49
2001-02   1          1             0.62       0          0             0.37
68- Within year model
- Model using only one year's data
- Data availability for several dates of sowing
- If dates of sowing are adequate, models similar to between-year models could be developed
- Use for forewarning subsequent years (?)
- Model for single date of sowing
- Forewarning of maximum disease severity
- Applicable when there are 10-12 observations between first disease appearance and maximum disease severity
- Non-linear model for disease development pattern (growth) using partial data
69- Mustard
- Alternaria blight cv. Varuna (% disease severity) - Kumarganj
- Data used: 1999-2000
- Model
- Yt = A exp(B/t)
- Yt = pds at time t; A and B are parameters
- t = week after sowing (1, 2, ...)
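A growth curve of the form Yt = A exp(B/t) can be fitted to partial-season data as sketched below. The slides do not say how A and B were estimated; this sketch swaps in a simple log-linear least-squares fit (ln Yt = ln A + B/t), which is an assumption, and the severity values are synthetic.

```python
import math


def fit_growth_curve(weeks, pds):
    """Fit Y_t = A * exp(B / t) by ordinary least squares on ln(Y) against 1/t.

    Assumption: log-linear fitting stands in for whatever non-linear
    estimation the original study used. All pds values must be positive.
    """
    u = [1.0 / t for t in weeks]
    v = [math.log(y) for y in pds]
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    slope = (sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
             / sum((ui - mean_u) ** 2 for ui in u))
    a = math.exp(mean_v - slope * mean_u)
    return a, slope


def predict(a, b, t):
    """Predicted percent disease severity at week t."""
    return a * math.exp(b / t)


# Synthetic partial-season data generated from A = 60, B = -8 (illustration only)
weeks = [4, 5, 6, 7, 8, 9, 10]
pds = [60.0 * math.exp(-8.0 / t) for t in weeks]
a_hat, b_hat = fit_growth_curve(weeks, pds)
print(a_hat, b_hat, predict(a_hat, b_hat, 12))   # forecast two weeks ahead
```

With a negative B the curve rises towards the ceiling A as t grows, so a fit on the early weeks gives a forecast of the severity the crop is heading towards.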
70Observed, predicted and forecasts of max. percent
disease severity (PDS)
- Reliable forecast of max. pds could be obtained
for 2 weeks in advance
71Models developed at IASRI
- Sugarcane
- Pyrilla
- Early shoot borer
- Top borer
- Pigeon pea
- Pod fly
- Pod borer
- Sterility Mosaic
- Phytophthora Blight
- Rice
- BPH
- Gall midge
- Mango
- Powdery Mildew
- hoppers
- fruit-fly
- Mustard
- Alternaria Blight
- White Rust
- Powdery Mildew
- Aphid
- Cotton
- American boll worm
- Pink boll worm
- Spotted boll worm
- Whitefly
- Groundnut
- Spodoptera litura
- Late leaf blast
- Rust
- Onion
- Thrips