Title: European Molecular Biology Laboratory (EMBL)
1???????????? ?????? ? ???????? ?????????????
??????? ???/???
- ?????? ??????? ?????????
- ?????????? ??????????????? ????????
- European Molecular Biology Laboratory (EMBL)
???????? ? ?????? ??????????? ?????? (?????
WSC-5), 16 ??????? 2006, ??????
2?????? ??????????? ??????????
- ?????? ??????? ?????????
- ?????????? ??????????????? ????????
- European Molecular Biology Laboratory (EMBL)
???????? ? ?????? ??????????? ?????? (?????
WSC-5), 16 ??????? 2006, ??????
3???? ??????
- ??????????? ??????????
- Multivariate Calibration
?????? ??????????? ?????? (???????????)
Multivariate Data Analysis (Chemometrics)
4? ??????? ? ??????? ????????????
- ?????? ???? ??????????? - ??????????
- ???????????? ?? 30 ??? ????????? ??????,
????????, ?????, ??????????? - ????????? ???????????? PCA, PCR, PLS, SIMCA,
RMSEP, etc. - ?? ????????? ? ??????????? - ??????? ???????????? ????????? ??????
- ????? ?? ???????? ??!
- ???????? scores and loadings (!?)
- ????? ?????, ????? ??????? ??????? ????? ? ??????
- ? ????????? ?????? - ???????????? ????????????
5?????????? ??? ????????????
- ? ??????? ????? ??? ??????? ???????
- ?????????? (??????? ?????????) ????????????
????????, ??????????? ? ????? ??????????? ?
????????????? ?????????????? ????????
??????????????? ????????????? ? (???) ???????????
? ?????????? ??????? ????????? - ??????????? ??????????????? ????????, ???
?????? ??????? ??????????????? ???????? ???? ???
???????? ????? ?????????????? ??????? ?????????
????????... - ?? ?????????? ??? ??????????? ??? calibration
- ??????????? ??????????? ??????
- ? ?????? ????? ?????????????? ???????????? ??????
??????????
6????????? ??????????
- "Regression is an approach for relating two sets
of variables to each other." (Kim Esbensen)
- "Calibration is a process of constructing a
mathematical model to relate the output of an
instrument to properties of samples." (Kenneth Beebe)
- ?????????? ?????????
7????????????? ??????
- ???????? ?????????
- $Y = XB + E$
- ??? (PCA) ????????????? (X)
- ????????? ????????????? (X,Y)
8???????????? ??????
??????? (X)
???????????? (Y)
9??? ???? ????? ???????????
- ?????? ??????? ????????? ????????????? ????????,
?????????? ???????, ?????????????? ? ?????? - ????? ??????????? ????????? ???? ?????? ?????????
????????????? ???????? ???????????? - ??????
- ?????????
- ???????? ????? ???????
- ???????? ????????????
- ??????????? ??????????, ? ?. ?.
- ? ??????????? ????? ???????????? ???????? ?????
?????? ?????????!
10??????? ?? ????????? ????????
- ????? ?????????? ?????????? ?1 ???????????????
??????? - ???????? ???????????????? ?????? ????? ????
????????? ??? ????? ??????? - ???????? ???????????? ??????, ????????,
??????????? ?????? ? ????? ?????????????????
(??????? ??) - ?????????? ?????? ???????? ????? ???????????
?????????? ??????????, ?????????? ????????????
????????? ?????? - ?????????? ? ??????? ???????????? ????? ????
???????? ?????? ?? ???????????? ??????
11?????????? ?????????? ???? ?????????
univariate calibration
12?????????? ?????????? ????????????????? ?????
???????????????? ?????
??????????
13??????????? ??????????
$y = xb + e$
$Y = XB + E$
14???????????? ??????????? ??????????
- ??????????? ????????????? ????????? ???????????
???????????? - ??????? ? ???????? ?? ?????????? ???
????????????? ??????????, ? ?.?. ??????
????????????? ????????? (???????) - ??????????? ??????????? ?????? ???????? ?
???????? ???????????? - ????????????????? ????? ? ???????? ? ???????
??????? - ? ?????????? ??? ????????? (PLS-R) ?????????????
???????? ?? ????? ????? ?? ???????? ??????????
??????? ???????
15?????????? ? ????????????
16???????????? ? ????????? ??????
- ??? ???????? ??????? ? ??????????? ??????????
- ???????????? ??? (Classical Least Squares, CLS)
??????? ?? ?????? ??????? ?????????
??????-????????-???? - $A = Ce$; $X = Ye$
- ????????? ??? (Inverse Least Squares, ILS) ??????
????????? ???? - $C = Ab$; $Y = Xb$
- ? ????????? ?????? ?????? ILS
17????????????? ???????? ????????? (???)
Multiple Linear Regression (MLR)
$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p + e$
??????? $b = (X^T X)^{-1} X^T y$
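A minimal NumPy sketch of the MLR solution $b = (X^T X)^{-1} X^T y$ stated on this slide. The data are synthetic and hypothetical (not the Simdata set used later in the deck). The intercept $b_0$ is handled by prepending a column of ones to X, which is an assumption about how the slide treats it.

```python
import numpy as np

# Hypothetical synthetic data: 50 samples, 3 predictors, known coefficients.
rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
b_true = np.array([1.5, -2.0, 0.5])
y = 0.7 + X @ b_true + 0.01 * rng.normal(size=n)

# Prepend a column of ones so that b[0] plays the role of the intercept b0.
X1 = np.column_stack([np.ones(n), X])

# Normal equations: solve (X^T X) b = X^T y instead of forming the inverse.
b_normal = np.linalg.solve(X1.T @ X1, X1.T @ y)

# Least-squares solver; usually the safer choice when columns are nearly collinear.
b_lstsq, *_ = np.linalg.lstsq(X1, y, rcond=None)

print(np.round(b_normal, 3))
```

Both routes give the same coefficients here because the predictors are well conditioned. With collinear X, such as spectra, the matrix $X^T X$ becomes close to singular and the normal-equation route degrades.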
18?????????? ???
- ??? ????? ?? ?????????, ????
- ?????? ?????????????? ? X (???????)
- ???????????? ??????? ??? ???????????? ?????
??????????? ??????????????? $(X^T X)^{-1} X^T$ - ??????? ??????? ????, ?????? ? X
- ?????????? ??????, ??? ???????? (??????? ???
???????????? ??????) - ???? ???????? ??????????? ????? ???????????
?????? X - ?????????? ????????????? ???-???????
??????????????
19?????? ???????????? ???????????????????????
????????????
n ????????? Cn, M
1 2-?????????????? 0 - 1
2 2-???????????????????? 0 - 0.5
3 3-???????????????????? 0 - 0.05
20????????????????? ???????????? ????????? ?
???????? ??????
simdata
21???-?????????? (Simdata)
???????? ???-?????? ??? ?3 (3-?? ??????????
????? ???) ???????????????????
22????? ??????? ????????? (???) - ?????? ??????
??????????????
- ??? (Principal Component Analysis) -
?????????????? $X = TP^T + E$ - ????? T (scores) ? ???????? P (loadings)
?????????? ???????????? ??????? ????????? - T ???????????? ? ???????? ???????? ?????? ?? ??
- T ????? ???????????? ?????? X ??? ??????? (!)
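The decomposition $X = TP^T + E$ can be sketched with a singular value decomposition. This is one common way to obtain scores and loadings, not necessarily the one used in the course software. The function name `pca` and the synthetic two-component data are hypothetical, and mean-centring X first is an assumption.

```python
import numpy as np

def pca(X, n_components):
    """Return scores T, loadings P and residuals E of mean-centred X."""
    Xc = X - X.mean(axis=0)                     # centring is assumed here
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]  # scores
    P = Vt[:n_components].T                     # loadings (orthonormal columns)
    E = Xc - T @ P.T                            # what the model leaves unexplained
    return T, P, E

# Hypothetical data: 30 samples, 8 collinear variables built from 2 factors.
rng = np.random.default_rng(1)
X = rng.uniform(size=(30, 2)) @ rng.uniform(size=(2, 8))
X += 0.001 * rng.normal(size=X.shape)

T, P, E = pca(X, n_components=2)
gram = T.T @ T   # off-diagonal entries vanish: score columns are orthogonal
```

The eight collinear columns of X are summarised by two mutually orthogonal score columns, which is what makes T usable as a regression input.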
23????????? PCA ?? ???????
X = A(430 nm), Y = A(550 nm), Z = A(750 nm)
X = A(522 nm), Y = A(644 nm), Z = A(714 nm)
24??? ??? ???! (PCA + MLR = PCR)
- ???-????? (PCA scores) T ????? ????????????
?????? X ??? ?????????? ???-?????? (MLR)
- MLR: $y = Xb + e$, $b = (X^T X)^{-1} X^T y$, $y_{new} = X_{new} b$ (I)
- PCR: $y = Tb + e$, $b = (T^T T)^{-1} T^T y$, $y_{new} = T_{new} b$ (II)
- ????? ?????????? ????????? ?? ???????
??????????, ??? (Principal Component Regression,
PCR)
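A sketch of the PCR route (II) from this slide: PCA scores T replace X in the MLR solution, and new samples are projected onto the same loadings before prediction. The helper names `pcr_fit` and `pcr_predict` and the three-component synthetic "spectra" are hypothetical. Centring X and y is an assumption.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """PCA on centred X, then MLR of centred y on the scores T."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                        # loadings
    T = Xc @ P                                     # scores
    b = np.linalg.solve(T.T @ T, T.T @ (y - y_mean))
    return x_mean, y_mean, P, b

def pcr_predict(model, X_new):
    """Project new samples onto the loadings, then apply y_new = T_new b."""
    x_mean, y_mean, P, b = model
    T_new = (X_new - x_mean) @ P
    return y_mean + T_new @ b

# Hypothetical mixture data: 3 components, 20 collinear channels.
rng = np.random.default_rng(2)
S = rng.uniform(size=(3, 20))                      # pure-component "spectra"
C_cal, C_test = rng.uniform(size=(40, 3)), rng.uniform(size=(20, 3))
X_cal = C_cal @ S + 0.001 * rng.normal(size=(40, 20))
X_test = C_test @ S + 0.001 * rng.normal(size=(20, 20))

model = pcr_fit(X_cal, C_cal[:, 0], n_components=3)
y_hat = pcr_predict(model, X_test)
rmsep = float(np.sqrt(np.mean((y_hat - C_test[:, 0]) ** 2)))
```

Because the score columns are orthogonal, $T^T T$ is diagonal and the inversion that is unstable for collinear X becomes trivial.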
25????? ??? (PCR) ?????????
PCA
MLR
26????????????? ???-??????
- ????????????? ?????? ?????? ??? ????????
?????????? ????????? ?????? - ??????
- ???????
- ????? ????? X ? Y
- ??????????? ??????????? ??? (PCA) ???????? ? ???
(PCR) - ?????? ?????? (scores)
- ?????? ???????? (loadings)
- ?????? ?????? ? ???????? ?????? (bi-plot)
- ?????? ???????? (residuals)
- ??????????? ??????????? ???
- ?????????? ?????? ???????? X ? Y
27?????? ???-?????? (Simdata)
28?????? ???-?????? (simdata)
29???????? (?????????) ??????
- ???????? (validation) ?????? ?????? ???
- ??????????? ??????????? ?????? (????? ??)
- ?????? ???????????????? ??????????? ??????
- ???????? ?????? ???????????? ? ??????? ????????
?????? - ???? ?? ????????? ? ???? ?? ???????? ???
????????? ?????? (?? ?? ??????????? ???????) - ?????????? ????????????????
- ??? ?????-????????? (cross-validation)
- ?????? (leave-one-out, LOO)
- ?????????? (????????, Venetian blind)
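The two cross-validation schemes named on this slide differ only in how samples are assigned to the left-out segment. A small sketch of the index bookkeeping follows. The generator names are hypothetical, and the Venetian-blind layout shown (every k-th sample goes to the same segment) is the usual convention, assumed here.

```python
import numpy as np

def leave_one_out(n_samples):
    """Yield (train, test) index arrays with exactly one test sample each."""
    idx = np.arange(n_samples)
    for i in idx:
        yield idx[idx != i], idx[idx == i]

def venetian_blinds(n_samples, n_segments):
    """Yield (train, test) splits; segment k holds samples k, k+K, k+2K, ..."""
    idx = np.arange(n_samples)
    for k in range(n_segments):
        test = idx[k::n_segments]
        yield np.setdiff1d(idx, test), test

loo_splits = list(leave_one_out(4))
vb_splits = list(venetian_blinds(7, 3))
vb_test_sets = [test.tolist() for _, test in vb_splits]
```

In each round the model is fitted on `train` and its predictions for `test` are collected. Every sample is predicted exactly once by a model that did not see it.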
30?????????????????? ?????? ???????????? (RMSEP)
- RMSEC: Root Mean Square Error of Calibration
- RMSEP: Root Mean Square Error of Prediction
- ??????? ?? ?????? RMSEP ????????
????????? ????? ?? - RMSEP ?????? ???????? ? ???????? ????????? (!)
- RMSEP ???????????? ??? ????????? ???????
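Both error measures are the square root of the mean squared difference between predicted and reference values. The only difference is which samples are used: calibration samples for RMSEC, validation or test samples for RMSEP. The sketch below uses a hypothetical helper `rmse` and made-up numbers.

```python
import numpy as np

def rmse(y_ref, y_pred):
    """Root mean square error, in the same units as y."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_ref) ** 2)))

# Made-up reference values and predictions for four validation samples.
y_ref = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
rmsep = rmse(y_ref, y_pred)   # sqrt((0.01 + 0.01 + 0.04 + 0.04) / 4)
```

Computing the same quantity for models with 1, 2, 3, ... components and plotting it against the number of components gives the curve used to choose model complexity.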
31????? ????????? ?????? ??????? ?? ?????? RMSEP?
?????????? ??????????
?????????? ??????
32?????? ????? ????????? ? ???
- ?????????? ????? ????? ??????? ?????????
(principal components, PC) - ???????? ????????
??????????? ?????????? - ?????? ? ????????????? ?????? ?? (underfitting)
?? ?????????? ???? ???????? ?????????? ?? ?????? - ?????? ? ?????????? ?????? ?? (overfitting)
???????? ???????????? ??? (??????) - ????? ??????????? ??????????? ???????? ????????
?????? (validation set)
33????? ????????? ??? - simdata
34????? ????????? ??? - simdata
35?????? ????? ?? ? ??? ???????????
- ????? ??????? ????????? (??????????? ??????)
???????????? ? ??? (PCR) ??????? ??????????, ? ??
??????????? ????????? ? ??????????? ??? (PCA) - ???????????
- ? ??? ???? RMSEP
- ??????? ???????????? ???????? ?????? (test set)
- ??????? ?? ?????? RMSEP - ???????? ?????????
????? ?? - ??? ???????????? ?????? ????????????? ????? ????
????? X-???????? (X-loadings) - ??????? ?????? ?? ?????????!
36?????????????? ???
- ??? (PCR) ?????? ????? ??????????? ??????????
- ????? ??????????? ???????????? ????? MLR
- ??????, ?? ?????? ????????????? ??? ??????????
- ???????????? ?? ?? ????????? ????????? Y ? ?????
????? X ? Y - ????? ?? ?????? ??? ????? ??? ??????????
???????????? ??????? - ??, ??? ?????? PLS!
37????????? ????????????
- ????????? PCA ????? ????????????? ?????
- $X = TP^T + E$
- ?????????????? ?????????? ????????? ???????????,
????????? ?????? ?? ????????? ????????????
(factor space) - ?????? ??????? ? T ? P ?????????? ?????????
(factors) - ??????? ?????????? ?????? ?????? ??????????
????????????, ?? ?? ???????????? - ????????? ???????????? ????? ?????????????? ???
??????? ?????????? ?????? - ?? (PC) ?????????? ??? ???????????? ????????? X
- ??? ?????????????? ???????????? ??? ???????????
38PLS ?????? ???????????? PCR
- ????? ???????? ?? ????????? ????????? (???) ?
???-????????? (???-?) - PLS: Partial Least Squares ->
Projection on Latent Structures
- ???-???????????? ????????? ??? ??????? ????
?????????? X ? Y ???????????? - ???????? ????????????? ??? ?????????
(??????????) ? X, ??????? ??????????? ? Y - ????????, ???????????? ?????? (X), ???????
???????? ?? ???????????? ??????????(??), ????????
? Y, ??????? ? ?????? ??????? ??? - ????? ??? ????????????? ??? ?????????????? ???????
39???-????????? ????????????? ?????????????
- ????????? ??? ??????? X ? Y
- ??????? ?????????????? ?? ??????? ????????
NIPALS -> 2 ?????? ?????? (scores) T, U ? ????????
(loadings) P, Q ???? ??????? W ??????????
???????? (loading-weights)
- ???????????? ????????? ??????, ?????
??????????????? cov(T,U)
- ???????????? $Y = T_{new} B_t$
- $Y = X_{new} B$
- $B = W (P^T W)^{-1} Q^T$
$X = TP^T + E_x$, $Y = UQ^T + E_y$
[1] S. Wold, H. Martens, H. Wold, Lecture Notes
Math. 973 (1983) 286-293
40??? ????????????? ??? ???1 ? ???2
- ?????????? ??? ?????????? ????????????? ??? ???1
(PLS1) ? ???2 (PLS2) - ???1 ?????? ???????? ??? ???????????? ??????????
y (????????), ????????, ??? ???????????? ??????
?????????? ????? - ???? ????? ?????????? ?? ?????????? ?????????,
???????? ????????? ??????????? ??????? - ???2 ?????????????? ??? ?????????? ???????
???????????? - ????????? ????????? ??????? ??????????
??????????????
41?????? ????????? ???
- ???-???????????? ???????????? ????????? NIPALS
- NIPALS: Non-linear Iterative Partial Least
Squares
- ??????? ????????? ?? ???????, ???? ?? ??????,
?????? ???? ???????? (??? ? SVD) ?? ??????????
- ???????????? ?????? ???????? $u_f$ -> $t_f$ ? $u_f$ -> $t_f$
??? ?????????? ???????? ??????? f -
??????????????? ?????? ???2 - ???????? ???????? ?? ?????????? ????????
?????????? - ??????????? ? ?????????????? ??????, ??????? ?
????? ?????? ???2
42NIPALS ???????? ??? ???2
0. $u_f$: ????? ?????????? ??????????? u
1. $w_f = X_f^T u_f / \|X_f^T u_f\|$: ?????? ???????????????? ??????? ?????e???? ???????? w
2. $t_f = X_f w_f$: ?????? ??????? ????? t
3. $q_f = Y_f^T t_f / \|Y_f^T t_f\|$: ?????? ???????????????? ??????? ???????? q
4. $u_f = Y_f q_f$: ?????? ??????? ?????? u
5. $\|t_{f,new} - t_{f,old}\| < lim$? ???????? ?????????? ?? -> go to 1.
6. $p_f = X_f^T t_f / (t_f^T t_f)$: ?????? ??????? ????? p
7. $b_f = u_f^T t_f / (t_f^T t_f)$: ?????? ??????????? ???????????? ????????? b
8. $X_{f+1} = X_f - t_f p_f^T$, $Y_{f+1} = Y_f - b_f t_f q_f^T$: ?????? ??????? X ? Y
9. $f = f + 1$: ??????? ? ?????????? ???????
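A NumPy sketch of steps 0 to 9, read in the usual NIPALS form: scores are computed as $t = X_f w$ and $u = Y_f q$, and deflation subtracts $t p^T$ and $b\,t q^T$. Several choices are assumptions not taken from the slide: the function name `pls2_nipals`, mean-centring, starting u from the Y column with the largest variance, and a relative convergence tolerance. The regression matrix uses $B = W (P^T W)^{-1} Q^T$ with the inner coefficient $b_f$ folded into each column of Q.

```python
import numpy as np

def pls2_nipals(X, Y, n_factors, tol=1e-10, max_iter=500):
    """NIPALS PLS2; returns centring terms and regression matrix B."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xf, Yf = X - x_mean, Y - y_mean
    W, P, Q = [], [], []
    for _factor in range(n_factors):
        u = Yf[:, np.argmax(Yf.var(axis=0))]       # step 0: starting u (assumed choice)
        t_old = np.zeros(X.shape[0])
        for _iteration in range(max_iter):
            w = Xf.T @ u
            w /= np.linalg.norm(w)                 # step 1: loading weights
            t = Xf @ w                             # step 2: X-scores
            q = Yf.T @ t
            q /= np.linalg.norm(q)                 # step 3: Y-loadings
            u = Yf @ q                             # step 4: Y-scores
            if np.linalg.norm(t - t_old) < tol * np.linalg.norm(t):
                break                              # step 5: converged
            t_old = t
        tt = t @ t
        p = Xf.T @ t / tt                          # step 6: X-loadings
        bf = (u @ t) / tt                          # step 7: inner coefficient
        Xf = Xf - np.outer(t, p)                   # step 8: deflate X
        Yf = Yf - bf * np.outer(t, q)              #         and Y
        W.append(w)
        P.append(p)
        Q.append(bf * q)                           # b_f folded into q_f
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q).T
    B = W @ np.linalg.solve(P.T @ W, Q.T)          # B = W (P^T W)^-1 Q^T
    return x_mean, y_mean, B

# Hypothetical noise-free mixture: 3 components, 20 channels.
rng = np.random.default_rng(3)
S = rng.uniform(size=(3, 20))
C_cal, C_new = rng.uniform(size=(25, 3)), rng.uniform(size=(5, 3))
x_mean, y_mean, B = pls2_nipals(C_cal @ S, C_cal, n_factors=3)
C_hat = y_mean + (C_new @ S - x_mean) @ B
```

With noise-free data and as many factors as there are components, the predictions reproduce the true concentrations. On real spectra the number of factors has to be chosen by validation.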
43NIPALS ???????? ??? ???1
1. $w_f = X_f^T y_f / \|X_f^T y_f\|$: ?????? ???????????????? ??????? ?????e???? ???????? w
2. $t_f = X_f w_f$: ?????? ??????? ????? t
3. $q_f = y_f^T t_f / (t_f^T t_f)$: ?????? ???????? q (??????) ??????? f
4. $p_f = X_f^T t_f / (t_f^T t_f)$: ?????? ??????? ????? p
5. $X_{f+1} = X_f - t_f p_f^T$, $y_{f+1} = y_f - q_f t_f$: ?????? ??????? X ? y
6. $f = f + 1$: ??????? ? ?????????? ???????
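For a single response the inner iteration disappears, so each factor costs only a few matrix-vector products. The sketch below follows steps 1 to 6 in their usual form ($t = X_f w$, deflation by $t p^T$ and $q\,t$) and then assembles the regression vector as $b = W (P^T W)^{-1} q$. The function name `pls1_nipals`, the mean-centring and the synthetic data are assumptions.

```python
import numpy as np

def pls1_nipals(X, y, n_factors):
    """Non-iterative NIPALS for one response; returns centring terms and b."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xf, yf = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _factor in range(n_factors):
        w = Xf.T @ yf
        w /= np.linalg.norm(w)          # step 1: normalised loading weights
        t = Xf @ w                      # step 2: scores
        tt = t @ t
        qf = (yf @ t) / tt              # step 3: scalar y-loading
        p = Xf.T @ t / tt               # step 4: X-loadings
        Xf = Xf - np.outer(t, p)        # step 5: deflate X
        yf = yf - qf * t                #         and y
        W.append(w)
        P.append(p)
        q.append(qf)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)  # b = W (P^T W)^-1 q
    return x_mean, y_mean, b

# Hypothetical noise-free mixture; only the third component is modelled.
rng = np.random.default_rng(4)
S = rng.uniform(size=(3, 20))
C_cal, C_new = rng.uniform(size=(25, 3)), rng.uniform(size=(5, 3))
x_mean, y_mean, b = pls1_nipals(C_cal @ S, C_cal[:, 2], n_factors=3)
y_hat = y_mean + (C_new @ S - x_mean) @ b
```

Running this once per response gives a separate PLS1 model for each component, as opposed to one joint PLS2 model for all columns of Y.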
45???1 ? ???2
- ???1 ?????????? ?????? ???? ?????????? y ?? ???
- ???2 ????????? ???????????? ????? ??????????
?????????? Y ??? ?? ?????????? ????????? - ?? ??????? ????? ?????????? ??? ??????????
?????????? ??????? - ??????, ???1 ???? ?? ????????? ?????? ?? ??????
?? ???????????? ???????, ????????, ? ?????????
?????? ???????? - ?? ????? ?? ????? ??????????? ??????? ??????
?????? ????????? - ???????????? ?????? ???
- ??????? ?????? ?? ????????!
46?????? ???2-?????? (Simdata)
n ????????? Cn, M
1 2-?????????????? 0 - 1
2 2-???????????????????? 0 - 0.5
3 3-???????????????????? 0 - 0.05
47????????????? ???-???????
- ????????????? ?????? ?????? ??? ????????
?????????? ????????? ?????? - ??????
- ???????
- ????????????
- ???????? ? ??? (PCR)
- X-????? ? ???????? (scores loadings)
- ???????????
- ?????? t u ????? ??????????? ????????
(outliers) - ??????? ???????? w w ????? ??????????
- c???????? ???? X-???????? p w ????????? Y
???????? ?? ???????????? X - ?????? w q
48????????????? ??????????2 ?????? ???
PLS2
49????????????? ??????????1 ?????? ???2
50????????????? ???-???????????? X ? Y (Simdata)
51????????????? ???-????????????? (Octane)
52???????? ????????????? ???????
- ???????? (validation) ?????? ?????????? ???
???????? ???? - ??????????? ???????????? ????? ?????????
- ?????? ???????? ??? ? ???
- ??????? RMSEP
- ?????? ???????????????? ??????????? ??????
- ?????? ????????????? ???????????? ?????????
(predicted vs measured) - RMSEP
53???????? ????????????? ??????? simdata ???1
54????????? ??????? Simdata
????????? ??????? ?????????? ????????????????
????? ??? (simdata)
      ??? (MLR)   ??? (PCR)   ???1-? (PLS1-R)   ???2-? (PLS2-R)
C1    0.1312      0.0576      0.0575            0.0575
C2    0.0527      0.0241      0.0245            0.0245
C3    0.01579     0.00246     0.00246           0.00249
- ????? ?????? ???, ???1-?, ???2-? ????????
???????? ?????? ??? ?????????? ???? ?????? (???
??????????) - ?????????? ??? ??????????? ????, ??? C3 -
????????????????????
55????????? ??????? ??????????
- ??? (MLR) ????? ???????? ??? ??????????????????
?????? - ??? (PCR) ????? ??????????, ?? ?????? ????????
??? ?????????? ?????????? - ??? ????????? (PLS-R) ???????? ?????? ????????
??? ??????????? ???????????? ????? - PLS1 ??? PLS2?
- ??? ??????? ?????? ?????????!
- ??? ?????????? ?????? ??????? RMSEP
56???????? ????????? ? ????????????
X: 100 x 351
r = 0.999
57???????????? ??????????? ???????????? ?????
????????
- ?? ??? ???????? ????????????? ? ???????????
????????????? ??????! - ??????????? ????????? ????????, ??
??????????????? ?????? ????????????? ??????
???????? ????? ?? ??????????? ?????????????
??????? - Deviation - ???????????? ????????,
??????????????? ???? ???????????? ?????? ???????
????????????? ?????? - ?????????? ??? ??????
58??????????? ????????????(Simdata)
59??????????? ???????????????1 - Simdata
C1: 0 - 1 M; C2: 0 - 0.5 M; C3: 0 - 0.05 M
      C1     C2     C3
U1    0.5    0.25   0.025
U2    0.9    0      0.01
U3    0      0      0.2
U4    0.5    0      0
U5    0      0.3    0
U6    0.5    0.5    0.5
60??????? ?????????? ??????? ??????????
- ????????? ??????????? (???????) ???????
- ????????? ??????? ??????, ???? ??????????,
????????? ??????????????? ????????? ??????
(pre-processing) - ???? ?????????? ????????? ????????????/
??????????? (scaling/weighting) - ???????????????? ??????, ??????? ?????????
??????, ??????? ? ??????? ????????? ??????? - ????????? ??????? ??????????? ??????,
??????????????? ?????? - ??????????????? ????????????
61???? ????????
- ?????? 1. ???????????????? ??????????
???????????????? ????? ??? ?? ???????? ?
??-??????? ??????? (????????????? ??????). - ????? ?????? ??????????, ????????????? ?
??????????? ??????, ???????????? ?? ?????????
?????? - ?????? 2. ??????????? ?????????? ????? ??????? ??
???????? ???????? ??. - ?????????? ?? ???????? ??????, ??????????? ?
???????? ???????? - ?????? 3. ???????? ??????? (?????????????).
- ??????????????? ?????????? ??????????, MSC, ?????
??????????
62????????????? ??????????
- Richard Kramer
- Chemometric Techniques for Quantitative Analysis
- Kim H. Esbensen
- Multivariate Data Analysis - in Practice
- Kenneth R. Beebe et al.
- Chemometrics: A Practical Guide
- Harald Martens, Tormod Naes
- Multivariate Calibration
- Richard G. Brereton
- Chemometrics: Data Analysis for the Laboratory
and Chemical Plant
- Edmund R. Malinowski
- Factor Analysis in Chemistry
63?????? 1 ?????????? ????? ???
- ???? Simdata
- ???? ????????? ??????? ?????????? ? ??????????
Unscrambler - ??????? ?????? ?????? ?????????, ????????,
unknown - ? ???????, ??? ????? ???????? - ????????? ?????????? ???, ???2 - ???????? ??????
- ????????? ???1 ??? ??????? ?? 3-? ???????????,
?????????? ??????????? ??????? - ??????? ??????? scores, loadings, T-U, predicted
vs measured, RMSEP, Variance ??? ?1 - ?3 ?
?????? ??????????? ???????? - ??????????? ??????????? ???????
64?????? 2 ??????????? ?????????? ????? ???????
- ???. 139, ???? Octane
- ???? ?????? ? ????????? ???????, ??????????? ?
?????????? ???????? - ??????????????? ?? ?????
- ????????? ?????????? ???1, ???????????????
- ?????????? ???????, ???????, ???????? ??????
- ????????? ?????? ?????????? ?????????, ???????
???????? ????? - ????????? ???, ???????? ??????
- ??????????? ??????????? ???????
65?????? 3 ???????? ???????
- ???. 150, ???? Wheat
- ???? ??????????????? ?????????? ?????????????
?????? - ?????????? ??????? ???1/2, ????????? ???????
- ??????????? ? ???????? ????????
- ?????????? MSC
- ??????????? ???????? ?????????? ??? ?????????
??????