Title: Use of spectral preprocessing to obtain a common basis for robust regression
Robust Regression for Inter-Brand Standardization
Benoit Igne (igneb_at_iastate.edu), Glen R. Rippke (rippke_at_iastate.edu), Charles R. Hurburgh, Jr. (tatry_at_iastate.edu).
Department of Agricultural and Biosystems Engineering, Iowa State University, Ames, Iowa.
Introduction
Results
- Data Preprocessing for common basis
  - Baseline and Offset Correction
    - Smoothing / Derivative.
    - Detrending.
    - Weighted Least Squares Baseline.
  - Sample Normalization or Light Scattering Correction
    - Normalization.
    - Standard Normal Variate (SNV).
    - Multiplicative Scatter Correction (MSC).
  - Interference Removal or Multivariate Filtering
    - Orthogonal Signal Correction (OSC).
    - Generalized Least Squares Weighting (GLSW).
  - Variable Scaling
    - Mean-Center.
    - Autoscaling.
  - All reasonable and meaningful spectral pretreatment combinations were evaluated (n = 75).
- Common Standardization techniques
  - Optical Techniques
- Use of spectral preprocessing to obtain a common basis for robust regression
  - 5 spectral preprocessing combinations gave significantly higher RPDs (α = 5%)
    - Second derivative (25-point window) + Normalization.
    - SNV + Second derivative (25-point window) + Normalization.
    - MSC + Second derivative (25-point window) + Normalization.
    - Second derivative (25-point window) + Normalization + OSC + Autoscaling.
    - Second derivative (25-point window) + GLSW + Normalization + Mean-Center.
  - The five combinations were averaged and compared to other standardization techniques.
- Comparison among standardization techniques
  - Figures 1 and 2 show results obtained when predicting validation sets where Foss Infratec 1241 and Dickey-john OmegAnalyzer G 106118 were the respective network masters.
  - PDS and DS gave significantly lower RPDs.
  - Other techniques were not significantly different.
  - Network master RPDs were significantly higher.
  - Among common standardization techniques, post-regression correction gave results as good as or better than individual models (developed on their own calibration set).
  - Robust techniques also gave results as good as or better than other techniques in 6 of 8 cases.
  - Calibration transfer from Foss Infratec to Dickey-john OmegAnalyzer G gave more precise validation results than the reverse case.
- The transfer of calibrations from instrument to instrument is an important research area. Many methods have been developed to transfer a prediction model from a master unit to a secondary unit:
  - Optical techniques (Piecewise Direct Standardization, Direct Standardization)
  - Post-regression correction techniques (slope and bias, or bias-only, correction)
  - Model adaptation techniques (robust regression)
- The new challenge is to transfer calibrations across brands.
- Robust models are attractive because they allow the use of historical databases and of the same samples scanned on different instruments. However, robust models often increase the prediction error because they add noise.
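As an illustration of the post-regression correction family listed above, the following is a minimal sketch of slope and bias (and bias-only) correction. It assumes predictions from the secondary unit are regressed against master-unit values for the same standardization samples; the function names and the numbers are hypothetical and are not taken from the poster.

```python
from statistics import mean


def fit_slope_bias(secondary_pred, master_values):
    """Ordinary least-squares line mapping secondary predictions to master values."""
    x_bar, y_bar = mean(secondary_pred), mean(master_values)
    sxx = sum((x - x_bar) ** 2 for x in secondary_pred)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(secondary_pred, master_values))
    slope = sxy / sxx
    bias = y_bar - slope * x_bar
    return slope, bias


def fit_bias_only(secondary_pred, master_values):
    """Bias-only variant: slope fixed at 1, only the mean offset is corrected."""
    return 1.0, mean(master_values) - mean(secondary_pred)


def apply_correction(predictions, slope, bias):
    """Apply the fitted correction to new secondary-unit predictions."""
    return [slope * p + bias for p in predictions]


# Made-up protein predictions (%) for four standardization samples.
master = [34.0, 36.0, 38.0, 40.0]
secondary = [33.0, 34.0, 35.0, 36.0]

slope, bias = fit_slope_bias(secondary, master)
corrected = apply_correction(secondary, slope, bias)
_, offset_only = fit_bias_only(secondary, master)
```

The correction acts on predicted values only, so the calibration model and the spectra themselves are left untouched. That is what distinguishes this family from the optical techniques.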
Objectives
- Evaluate the use of preprocessing techniques in the creation of robust models for inter-brand standardization.
- Compare results with the performance of known standardization techniques.
Materials and Methods
- Data collection
  - Soybean samples (whole)
    - Calibration set: 638 samples from the 2002 to 2006 crop years.
    - Two validation sets
      - Set 1: 20 samples representative of the variability of the calibration set.
      - Set 2: 40 very diverse samples from the 2006 crop year.
  - Spectral data
    - Four transmittance units, spectral range 850-1048 nm with a 2 nm increment.
    - 2 Foss Infratec (Foss North America, Eden Prairie, MN): Foss Infratec 1229 (S/N 553075) and Foss Infratec 1241 (S/N 12410350).
    - 2 Dickey-john/Bruins OmegAnalyzer G (Dickey-john Corporation, Auburn, IL): S/N 106110 and 106118.
  - Reference analysis
    - Protein content by combustion (AOAC 990.03), Eurofins, Des Moines.
    - Oil content by ether extract (AOCS Ac 3-44), Eurofins, Des Moines.
- Calibration method
  - Partial Least Squares Regression (PLS).
- Robust Regression
  - Two types of robust models were created:
    - Combine the historical databases of each brand master.
    - Use the historical database of one brand master.
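Pooling spectra from different brands only makes sense once they sit on a comparable scale. As a hedged illustration, the sketch below applies Standard Normal Variate (SNV), one of the pretreatments evaluated here, to two synthetic spectra on the 850 to 1048 nm, 2 nm grid described above. The spectra are invented, and the second is deliberately a scaled and offset copy of the first. Real inter-brand differences are more complex than this, so the example shows only what SNV removes, not how well it performs on actual instruments.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: center and scale one spectrum by its own
    mean and sample standard deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)
    return [(a - m) / s for a in spectrum]


# Wavelength axis: 850 to 1048 nm in 2 nm steps (100 points).
wavelengths = list(range(850, 1049, 2))

# Synthetic absorbance-like curve (hypothetical, for illustration only).
unit_a = [0.5 + 1e-5 * (w - 950) ** 2 for w in wavelengths]

# Same curve as seen by a hypothetical second unit: multiplicative and
# additive differences only.
unit_b = [1.3 * a + 0.2 for a in unit_a]

snv_a = snv(unit_a)
snv_b = snv(unit_b)

# Largest remaining difference between the two treated spectra.
max_diff = max(abs(x - y) for x, y in zip(snv_a, snv_b))
```

After SNV the two spectra coincide up to floating-point error, because a pure offset and positive scale factor are exactly what row-wise centering and scaling remove.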
Conclusions
The calculation of intermediate standardization
parameters for optical techniques (DS, PDS)
increased the error. Results from other
standardization techniques were similar across
standardization sets and instruments. The
transformation of spectral data to a common basis
by preprocessing techniques (before robust
regression) gave the best results in 75% of the
cases. The transfer of historical databases from
one instrument brand to another was proven
possible, with or without spectral data from the
secondary brand, using robust regressions
developed on a common basis obtained by spectral
pretreatment.
Procedure
- Establish baseline calibration performance when each instrument is calibrated on its own calibration set.
- Apply common standardization techniques to inter-brand standardization (Infratec 1241 and OmegAnalyzer G 106118 were brand masters).
- Compare inter-brand standardization results developed from the best spectral preprocessing combinations with common standardization results.
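The comparisons in this work are reported as RPD values. The poster does not spell out the formula, so the sketch below assumes a common definition: the standard deviation of the reference values divided by the standard error of prediction (SEP), with SEP taken as the bias-corrected standard deviation of the residuals. Function names and data are hypothetical.

```python
from statistics import stdev


def sep(reference, predicted):
    """Standard error of prediction: sample standard deviation of the
    residuals (bias-corrected, since stdev subtracts the mean residual)."""
    residuals = [p - r for r, p in zip(reference, predicted)]
    return stdev(residuals)


def rpd(reference, predicted):
    """Assumed RPD definition: SD of reference values divided by SEP."""
    return stdev(reference) / sep(reference, predicted)


# Made-up reference and predicted protein values (%) for a validation set.
reference = [34.0, 36.0, 38.0, 40.0, 42.0]
predicted = [34.5, 35.5, 38.5, 39.5, 42.0]

validation_sep = sep(reference, predicted)
validation_rpd = rpd(reference, predicted)
```

Because RPD scales the prediction error by the spread of the validation set, values computed on sets with different variability (such as the two validation sets used here) are not directly comparable with each other.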
Figure 1. Foss Infratec 1241 as Overall Master for Common Methods.
Figure 2. Dickey-john OmegAnalyzer G 106118 as Overall Master for Common Methods.