Multivariate Regression Analysis

Slides: 27
Provided by: geirrun

Transcript and Presenter's Notes

Title: Multivariate Regression Analysis


1
Multivariate Regression Analysis
2
Aim
  • Establish a predictive model between one or more
    response variables and one or more input variables

(Diagram labels: Measurement, Response)
3
Areas where Regression Analysis is useful
  • Process and Environmental Monitoring
  • Process Control
  • Product Quality/Product Properties

4
Why?
  • Reveal correspondences/correlations
  • Increased Accuracy/Precision in the Information
    Process
  • Improved (reduced) Response time in the
    Information Process (on-line, at-line)

5
How?
  • 1. Collect Data
  • 2. Analyse Data
  • 3. Establish a Predictive Model

Several responses: $Y = BX$, $y_i = f(x_1, x_2, \ldots, x_m)$
One response: $y = bx$, $y = f(x_1, x_2, \ldots, x_m)$
6
Start
7
Multivariate Regression
Model: $y = Xb + e$
8
The solution of regression problems
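Model: $y = Xb + e$
When $e$ is minimised: $y = Xb$, so $X^{t}y = X^{t}Xb$
The normal equation: $(X^{t}X)^{-1}X^{t}y = b$
Minimise the sum of squared residuals, $e^{t}e$,
with respect to $b_0, b_1, \ldots, b_M$
Condition: $X^{t}X$ must have full rank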
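
A minimal NumPy sketch of the normal equation on made-up data. The sizes, coefficients and noise level are assumptions for illustration, not values from the slides. For brevity the intercept b0 is left out and X is assumed to have full column rank.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 50 objects, 3 x-variables, known coefficients.
n_objects, n_vars = 50, 3
X = rng.normal(size=(n_objects, n_vars))
b_true = np.array([1.5, -2.0, 0.5])
y = X @ b_true + 0.01 * rng.normal(size=n_objects)

# Normal equation: (X^t X) b = X^t y.
# Solving the linear system avoids forming the explicit inverse.
XtX = X.T @ X
Xty = X.T @ y
b_hat = np.linalg.solve(XtX, Xty)

# Full-rank condition: rank of X^t X must equal the number of x-variables.
rank = np.linalg.matrix_rank(XtX)

# Reference solution from NumPy's least-squares routine.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

In practice `np.linalg.lstsq` is usually preferred to forming `X.T @ X`, since it is less sensitive to rounding; both give the same estimate when the rank condition holds.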
9
Problems
  • Many x-variables, few objects (measurements)
  • Correlation between the x-variables

$\det(X^{t}X) \approx 0 \Rightarrow (X^{t}X)^{-1}$ does not exist!
  • Noise in X
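
The first two problems can be made visible numerically. The sketch below uses invented data (all sizes and noise levels are assumptions) to compare the determinant and condition number of the cross-product matrix for nearly collinear and independent x-variables, and to show the rank deficiency when there are more variables than objects.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical example: two x-variables that are almost identical.
n_objects = 20
x1 = rng.normal(size=n_objects)
x2 = x1 + 1e-6 * rng.normal(size=n_objects)   # nearly collinear with x1
X_collinear = np.column_stack([x1, x2])

# Two independent x-variables for comparison.
X_independent = rng.normal(size=(n_objects, 2))

# Determinant of the cross-product matrix in both cases.
det_collinear = np.linalg.det(X_collinear.T @ X_collinear)
det_independent = np.linalg.det(X_independent.T @ X_independent)

# The condition number is a more robust diagnostic than the determinant.
cond_collinear = np.linalg.cond(X_collinear.T @ X_collinear)
cond_independent = np.linalg.cond(X_independent.T @ X_independent)

# Many x-variables, few objects: the 10 x 10 cross-product matrix
# built from only 5 objects cannot have full rank.
X_wide = rng.normal(size=(5, 10))
rank_wide = np.linalg.matrix_rank(X_wide.T @ X_wide)   # at most 5, not 10
```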

10
Generalised inverse
Generalised inverse: $X^{+} = (X^{t}X)^{-1}X^{t}$ $\Rightarrow$ Normal
equation: $b = X^{+}y$
Biased regression methods differ in the way that
the generalised inverse is calculated
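
As an illustration, NumPy's `np.linalg.pinv` computes a generalised (Moore-Penrose) inverse from the singular value decomposition. This is one possible way of calculating it, not necessarily the one the slides have in mind, and the data below are invented. With full column rank it reproduces the explicit formula; with more variables than objects it still returns a solution although the ordinary inverse does not exist.

```python
import numpy as np

rng = np.random.default_rng(2)

# Full-rank case: the generalised inverse equals (X^t X)^{-1} X^t.
X = rng.normal(size=(30, 4))
y = rng.normal(size=30)
X_plus_explicit = np.linalg.inv(X.T @ X) @ X.T
X_plus = np.linalg.pinv(X)
b = X_plus @ y

# Rank-deficient case (10 variables, 5 objects): X^t X is singular,
# but the SVD-based generalised inverse is still defined and gives
# the minimum-norm coefficient vector that reproduces y.
X_wide = rng.normal(size=(5, 10))
y_wide = rng.normal(size=5)
b_wide = np.linalg.pinv(X_wide) @ y_wide
```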
11
Latent Variable Regression
12
Problem Specification
Standards with known concentrations are measured
at two highly correlated wavelengths. Make a
calibration model between the concentrations and
the measured intensities at the two
wavelengths: $c = f(x_1, x_2)$
13
Dimensionality Reduction
The score vector $t$ is related to the concentration
vector $c$: quantitative information about the
concentration is contained in $t$
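
A small simulated version of the two-wavelength problem may help. Everything below is assumed for illustration: the concentrations, the sensitivities `k` at the two wavelengths and the noise level. The first principal component of the centred intensities serves as the score vector t, and its correlation with c shows that the concentration information survives the reduction from two variables to one.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical calibration set: 10 standards with known concentrations.
c = np.linspace(0.1, 1.0, 10)
# Assumed sensitivities at the two wavelengths.
k = np.array([2.0, 1.8])
X = np.outer(c, k) + 0.01 * rng.normal(size=(10, 2))

# The two measured intensities are highly correlated.
corr_x1_x2 = np.corrcoef(X[:, 0], X[:, 1])[0, 1]

# First principal component of the centred data via SVD.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
p = Vt[0]        # loading vector (direction in wavelength space)
t = Xc @ p       # score vector (one number per standard)

# The sign of a principal component is arbitrary, so only the
# magnitude of this correlation is meaningful.
corr_t_c = np.corrcoef(t, c)[0, 1]
```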
14
The Regression
15
Calculation of the Regression Coefficient
16
Regression modelling
17
Solution
  • 1. Decompose the matrix of spectral data (X) into
    (orthogonal) latent variables (LVs)
  • 2. Model the dependent variable in terms of the
    latent-variable score vectors
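
The two steps can be sketched as principal component regression, taking the SVD as the decomposition. That choice, the function name `pcr_fit` and the simulated spectra are assumptions for illustration; the slides do not fix a particular algorithm at this point.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Hypothetical helper: principal component regression via SVD."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    # Step 1: decompose X into orthogonal latent variables.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T            # loadings (variables x LVs)
    T = Xc @ P                         # scores (objects x LVs)
    # Step 2: regress y on the scores. The score vectors are
    # orthogonal, so each coefficient can be computed separately.
    q = (T.T @ yc) / np.sum(T ** 2, axis=0)
    b = P @ q                          # regression vector in x-space
    b0 = y_mean - x_mean @ b           # intercept
    return b, b0, T

# Simulated data: 15 samples, 6 wavelengths, analyte plus one interferent.
rng = np.random.default_rng(4)
C = rng.uniform(0.0, 1.0, size=(15, 2))
S = rng.uniform(0.5, 2.0, size=(2, 6))     # assumed pure spectra
X = C @ S + 0.001 * rng.normal(size=(15, 6))
y = C[:, 0]

b, b0, T = pcr_fit(X, y, n_components=2)
y_fit = X @ b + b0
rmse = np.sqrt(np.mean((y - y_fit) ** 2))
```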
18
Scores and Loadings
  • Scores
    • $t = f(c_1, c_2, \ldots)$
    • Contain quantitative information about the
      concentrations
  • Loadings
    • $p = f(\lambda_1, \lambda_2, \ldots)$
    • Contain qualitative information about the spectra

19
Regression Methods
  • Partial Least Squares (PLS): best for prediction
  • Principal Component Regression (PCR): best for
    outlier checking

⇒ Combine the methods
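
The slides do not spell out the PLS algorithm. As one common formulation, the sketch below implements PLS for a single response (PLS1) in the NIPALS style, with the regression vector formed as W (P^t W)^{-1} q. The function name `pls1_fit` and the simulated data are assumptions, and the code presumes the residual of y is not exactly zero before the last component is extracted.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Hypothetical helper: PLS1 regression, NIPALS-style deflation."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E = X - x_mean                     # X residuals
    f = y - y_mean                     # y residuals
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))
    P = np.zeros((n_vars, n_components))
    q = np.zeros(n_components)
    for a in range(n_components):
        w = E.T @ f                    # direction of maximum covariance with y
        w /= np.linalg.norm(w)
        t = E @ w                      # score vector
        tt = t @ t
        p = E.T @ t / tt               # X loading
        q[a] = f @ t / tt              # y loading
        E = E - np.outer(t, p)         # deflate X
        f = f - q[a] * t               # deflate y
        W[:, a], P[:, a] = w, p
    b = W @ np.linalg.solve(P.T @ W, q)
    b0 = y_mean - x_mean @ b
    return b, b0

# Simulated data: 20 samples, 6 wavelengths, analyte plus one interferent.
rng = np.random.default_rng(5)
C = rng.uniform(0.0, 1.0, size=(20, 2))
S = rng.uniform(0.5, 2.0, size=(2, 6))     # assumed pure spectra
X = C @ S + 0.001 * rng.normal(size=(20, 6))
y = C[:, 0] + 0.001 * rng.normal(size=20)

b, b0 = pls1_fit(X, y, n_components=2)
rmse = np.sqrt(np.mean((y - (X @ b + b0)) ** 2))
```

Unlike PCR, the weight vectors here are chosen using y as well as X, which is the usual argument for PLS needing fewer latent variables for prediction.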
20
Visualisation of PLS
(Figure labels: X, Y)
21
Data described by several Latent Variables
Model
22
Calculation of the regression vector
23
Latent-Variable Regression Modelling
The modelling process:
  • Validation
  • Interpretation (regression coefficients, loadings)
  • Number of latent variables (explained variance in X
    and Y, cross validation, regression coefficients,
    loadings, etc.)
  • Outlier detection
24
Cross Validation (statistical validation)
  • i) Divide the samples into a number of groups,
    $n_g$.
  • ii) For each LV dimension, $a = 1, 2, \ldots, A+1$,
    perform the following calculations:
    1. Estimate LV $a$ with group $k$ of samples excluded.
    2. Predict the responses for the samples in group $k$.
    3. Calculate the squared prediction error for the
       left-out samples.
  • iii) Repeat step ii) until all samples have been
    kept out once, and only once, then calculate SEP(a).
  • iv) If SEP(a) < SEP(a-1), go to ii); otherwise stop
    and select the number of dimensions (LVs) in the
    model as $A = a - 1$.
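
The procedure above can be sketched as follows. Several things are assumptions here: PCR stands in for the latent-variable method, SEP(a) is taken as the root of the mean squared prediction error over all left-out samples (the slide's own formula did not survive in the transcript), groups are assigned by sample index, and the data are simulated. All function names are hypothetical.

```python
import numpy as np

def pcr_coefficients(X, y, n_components):
    """PCR regression vector and intercept (stand-in LV method)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T
    T = Xc @ P
    q = (T.T @ yc) / np.sum(T ** 2, axis=0)
    b = P @ q
    return b, y_mean - x_mean @ b

def sep_for_dimension(X, y, a, groups):
    """Steps ii) and iii): leave each group out once for dimension a."""
    press = 0.0
    for k in np.unique(groups):
        left_out = groups == k
        b, b0 = pcr_coefficients(X[~left_out], y[~left_out], a)
        residual = y[left_out] - (X[left_out] @ b + b0)
        press += np.sum(residual ** 2)
    return np.sqrt(press / len(y))     # assumed definition of SEP(a)

def select_n_lvs(X, y, n_groups, max_lvs):
    """Step iv): add LVs while SEP keeps decreasing."""
    groups = np.arange(len(y)) % n_groups      # step i)
    sep_prev = np.inf
    seps = []
    for a in range(1, max_lvs + 1):
        sep = sep_for_dimension(X, y, a, groups)
        seps.append(sep)
        if not sep < sep_prev:
            return a - 1, seps                 # A = a - 1
        sep_prev = sep
    return max_lvs, seps

# Simulated data: 30 samples, 8 wavelengths, 3 underlying components.
rng = np.random.default_rng(6)
C = rng.uniform(0.0, 1.0, size=(30, 3))
S = rng.uniform(0.5, 2.0, size=(3, 8))
X = C @ S + 0.01 * rng.normal(size=(30, 8))
y = C[:, 0] + 0.01 * rng.normal(size=30)

n_lvs, seps = select_n_lvs(X, y, n_groups=5, max_lvs=5)
```

Stopping at the first increase in SEP follows the slide literally; a common variation is to evaluate all dimensions up to a maximum and pick the minimum, which is less sensitive to a single noisy step.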

25
Application Example 1
  • Process industry, where the principal qualities [1]
    of products are linked to the chemical composition of
    the raw material and the manufacturing process.

[1] O. M. Kvalheim, Chemom. Intell. Lab. Syst. 19
(1993) iii-iv.
26
Application Example 2
  • Environmental sciences, such as the prediction of
    the diversity of a biological system (the principal
    environmental responses) from instrumental
    fingerprinting of the chemical environment.