Title: Identification of Causal Variables for Building Energy Fault Detection by Semisupervised LDA
1Identification of Causal Variables for Building Energy Fault Detection by Semi-supervised LDA and Decision Boundary Analysis
2nd Workshop on Domain Driven Data Mining, Session I S2208, Dec. 15, 2008, Palazzo dei Congressi, Pisa, Italy
- Keigo Yoshida, Minoru Inui, Takehisa Yairi, Kazuo Machida (Dept. of Aeronautics and Astronautics, the Univ. of Tokyo)
- Masaki Shioya and Yoshio Masukawa (Kajima Corp.)
2Main Point of the Presentation
- We propose
- A supportive method for anomaly cause identification
- by combining traditional data analysis and domain knowledge
- Applied to a real Building Energy Management System (BEMS)
- Root cause of energy wastes was found successfully
3Outline
- Introduction
- Theories
- Experiments for Real Data
- Conclusions
4Introduction: What is BEMS?
- Building Energy Management Systems
- Collect and monitor sensor data in buildings
- (temperature, heat consumption, etc.)
- Energy-efficient control
- Discover energy faults (wastes)
5Introduction: Problem of BEMS
- Hard to identify root causes of Energy Faults (EF)
- Complex relations between pieces of equipment
- Data deluge from numerous sensors
- (approx. 2000 sensors, 20000 points for a 20-story building)
- Current EF detection
- Heuristics based on experts' empirical knowledge,
- usually fuzzy IF-THEN rules.
- Heuristic diagnostics is incomplete
- Fuzziness leads to false negative errors
- Detection alone cannot improve systems
6Early Fault Diagnosis Methods
[Figure: early fault diagnosis methods arranged along a performance axis. Methods shown: Feature Extraction, Neural Networks, FTA/FMEA, Bayesian, Filtering, FDA. Groupings: Expert System, Fuzzy Logic, Supervised Learning, Unsupervised Learning / Data Mining. Annotations: Knowledge Acquisition Bottleneck; Neglecting Useful Knowledge.]
7Proposed Method
[Figure: same performance chart with the proposal added. Labels: Proposal (Domain Knowledge and Data Analysis), Expert System, Fuzzy Logic, Supervised Learning, Unsupervised Learning / Data Mining.]
- Characteristics -
- Interpretation: exploits domain knowledge
- Cost: not so high, empirical knowledge only
- Versatility: easy to apply to problems in various domains
- Performance: better than heuristics
8Conceptual Diagram
[Figure: conceptual diagram. Labels: Experts; Detection Rule; Acquire Reliable Labels with Given Rule; Data Distribution; Semi-supervised LDA; Learning Boundary; DBA; Feedback.]
9Outline
- Introduction
- Theories
- Semi-Supervised Linear Discriminant Analysis
- Decision Boundary Analysis
- Experiments for Real Data
- Conclusions
10Semi-supervised LDA
[Figure labels: Acquire Reliable Labels with Given Rule; Data Distribution; Learning Boundary.]
11Manifold Regularization [M. Belkin et al. 05]
- Labeled data only
- Squared loss for labeled data
- Penalty term (usually squared function norm)
12Manifold Regularization [M. Belkin et al. 05]
- Regularized Least Squares (RLS): labeled data only
- Squared loss for labeled data
- Penalty term (usually squared function norm)
- Laplacian RLS: uses labeled and unlabeled data
- Assumption: geometrically close points have similar labels
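The objective functions on these two slides did not survive extraction. As a reminder, the forms usually given for RLS and Laplacian RLS in the manifold regularization literature are sketched below. The notation is an assumption rather than a transcription of the slide: l labeled points, u unlabeled points, weights \gamma_A and \gamma_I, RKHS norm \|f\|_K, and graph Laplacian L built over all l + u points.

```latex
% Regularized Least Squares (labeled data only)
f^{*} = \arg\min_{f \in \mathcal{H}_K}
  \frac{1}{l} \sum_{i=1}^{l} \bigl( y_i - f(x_i) \bigr)^2
  + \gamma_A \, \| f \|_K^2

% Laplacian RLS (labeled and unlabeled data)
f^{*} = \arg\min_{f \in \mathcal{H}_K}
  \frac{1}{l} \sum_{i=1}^{l} \bigl( y_i - f(x_i) \bigr)^2
  + \gamma_A \, \| f \|_K^2
  + \frac{\gamma_I}{(u + l)^2} \, \mathbf{f}^{\top} L \, \mathbf{f},
\qquad
\mathbf{f} = \bigl[ f(x_1), \dots, f(x_{l+u}) \bigr]^{\top}
```

The extra term penalizes functions that change quickly between neighboring points in the graph, which is how the "geometrically close, similar label" assumption enters the objective.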
13Semi-Supervised Linear Discriminant Analysis (SS-LDA)
- LDA seeks a projection with small within-class covariance and large between-class covariance
- Regularized Discriminant Analysis [Friedman 89]
- Semi-Supervised Discriminant Analysis (SS-LDA)
- Equation terms labeled on the slide: between-class, within-class
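The SS-LDA formula itself is missing from the extracted slide. The following is a minimal sketch of one common way to combine LDA with a graph-Laplacian penalty, not necessarily the authors' exact formulation. It assumes a two-class problem, a symmetrized 0/1 kNN graph, and two hypothetical weights `alpha` (ridge term) and `beta` (Laplacian term); all function names are illustrative.

```python
import numpy as np

def knn_laplacian(X, k=3):
    """Unnormalized Laplacian of a symmetrized kNN graph with 0/1 weights."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)              # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]       # k nearest neighbors per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                    # symmetrize the graph
    return np.diag(W.sum(axis=1)) - W

def ss_lda_direction(X_lab, y_lab, X_all, k=3, alpha=1e-3, beta=1.0):
    """Unit-norm projection direction from labeled scatter plus a Laplacian penalty."""
    d = X_all.shape[1]
    mu = X_lab.mean(axis=0)
    Sb = np.zeros((d, d))                     # between-class scatter (labeled only)
    Sw = np.zeros((d, d))                     # within-class scatter (labeled only)
    for c in np.unique(y_lab):
        Xc = X_lab[y_lab == c]
        mc = Xc.mean(axis=0)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)
        Sw += (Xc - mc).T @ (Xc - mc)
    L = knn_laplacian(X_all, k)               # uses labeled and unlabeled points
    R = Sw + alpha * np.eye(d) + beta * X_all.T @ L @ X_all
    # Generalized eigenproblem Sb w = lambda R w, solved via R^{-1} Sb
    evals, evecs = np.linalg.eig(np.linalg.solve(R, Sb))
    w = np.real(evecs[:, np.argmax(np.real(evals))])
    return w / np.linalg.norm(w)

# Toy data: two clusters separated along the first axis, 10 labels per class
rng = np.random.default_rng(0)
A = rng.normal([-3.0, 0.0], 0.5, size=(50, 2))
B = rng.normal([3.0, 0.0], 0.5, size=(50, 2))
X_all = np.vstack([A, B])
X_lab = np.vstack([A[:10], B[:10]])
y_lab = np.array([0] * 10 + [1] * 10)
w = ss_lda_direction(X_lab, y_lab, X_all)
```

On this toy data the direction should be dominated by the first coordinate, since that is the only axis separating the clusters. The Laplacian term plays the role of the regularizer in Regularized Discriminant Analysis, but it is shaped by the unlabeled data distribution instead of being a plain identity matrix.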
14Decision Boundary Analysis
15Decision Boundary Analysis
- Feature extraction method proposed by Lee and Landgrebe
- C. Lee and D. A. Landgrebe, "Feature Extraction Based on Decision Boundaries," IEEE Trans. Pattern Anal. Mach. Intell. 15(4): 388-400, 1993
- Extracts informative features from normal vectors on the boundary
16Decision Boundary Feature Matrix
- Defines the responsibility of each variable for discrimination
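The slide's formula for "responsibility" is not in the extracted text. As a hedged illustration, the sketch below builds a decision boundary feature matrix (DBFM) as the average outer product of unit normal vectors, then scores each variable by eigenvalue-weighted squared eigenvector loadings. That scoring rule is an assumption chosen for illustration, and the names `dbfm_from_normals` and `responsibility` are hypothetical.

```python
import numpy as np

def dbfm_from_normals(normals):
    """Average outer product of unit normal vectors sampled on the boundary."""
    N = np.asarray(normals, dtype=float)
    N = N / np.linalg.norm(N, axis=1, keepdims=True)
    return N.T @ N / len(N)

def responsibility(dbfm):
    """Assumed score: sum_i lambda_i * v_ij^2 per variable j, normalized to sum to 1."""
    evals, evecs = np.linalg.eigh(dbfm)       # symmetric matrix, real spectrum
    evals = np.clip(evals, 0.0, None)         # remove tiny negative round-off
    score = (evecs ** 2) @ evals
    return score / score.sum()

# Linear boundary w.x + b = 0: the normal vector is w at every boundary point
w = np.array([3.0, 0.0, 4.0])
scores = responsibility(dbfm_from_normals([w]))
```

For the linear boundary in the example, the second variable receives a score of zero because the boundary normal has no component along it, and the remaining scores are the squared components of the unit normal.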
17Outline
- Introduction
- Theories
- Experiments
- Application to Energy Fault Analysis
- Conclusions
18Energy Fault Diagnosis Problem
- EF: inverter overloaded
- Detection rule: 6h M.A. (moving average) of inverter output, threshold 100, flags an EF
- but I don't know the cause
[Figure: Air Handling Unit. Labels: cold, inverter, hot, coil, humidity.]
19Energy Fault Diagnosis Problem
- EF: inverter overloaded
- Detection rule: 6h M.A. (moving average) of inverter output, threshold 100, flags an EF
- but I don't know the cause
- Goal: find the root cause of the inverter overload
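The detection rule above can be sketched as a trailing moving average compared against a threshold. The comparison operator and units were lost in extraction, so the `>=` test, the hourly sampling, and the function name `detect_energy_fault` are assumptions made for illustration.

```python
from collections import deque

def detect_energy_fault(inverter_output, window=6, threshold=100.0):
    """Flag each hour whose trailing `window`-hour moving average reaches `threshold`.

    Hours before a full window is available are never flagged.
    """
    buf = deque(maxlen=window)                # keeps only the last `window` values
    flags = []
    for value in inverter_output:
        buf.append(value)
        full = len(buf) == window
        flags.append(full and sum(buf) / window >= threshold)
    return flags

# Hypothetical hourly inverter output values
hourly = [80, 100, 100, 100, 100, 100, 100, 90]
flags = detect_energy_fault(hourly)
```

A rule like this only says when the overload happens. It carries no information about which other variables drive it, which is the gap the rest of the talk addresses.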
20Energy Fault Diagnosis - Settings
- Air-conditioning time-series sensor data for 1 unit
- Instances: 744
- Labeled samples: 10 for each class (3% of all)
- (chosen with probability proportional to distance from boundary)
- Hyper-parameters
- 13 attributes, all continuous
21
22Results (100 times ave.)
[Figure label: Inverter]
LDA: Inverter (96)
Annotation: Trivial
23Results (100 times ave.)
[Figure labels: SA Temp., Cooling water]
SS-LDA: Cool water (75), SA temp. (12)
LDA: Inverter (96)
24Results (100 times ave.)
Annotation: Not distinctive!
SS-LDA: Cool water (75), SA temp. (12)
KDA: Cool water (19), MA. Pressure (15), Inverter (15)
LDA: Inverter (96)
25Results (100 times ave.)
[Figure labels: (1) SA Temp., (2) SA Setting, (3) Cooling water, Inverter]
SS-LDA: Cool water (75), SA temp. (12)
KDA: Cool water (19), MA. Pressure (15), Inverter (15)
LDA: Inverter (96)
SS-KDA: Inverter (33), SA temp. (19), Cool water (17), SA setting (13)
26Energy Fault Diagnosis: Examine Raw Data
- Cooling water valve opening (3)
- Valve opens completely, but this is a result of the EF, not the cause
27Energy Fault Diagnosis: Examine Raw Data
- Cooling water valve opening
- Valve opens completely, but this is a result of the EF, not the cause
- SS-LDA/SS-KDA show that SA temp. (1) and SA setting (2) are responsible
- To reduce this deviation (SA temp. vs. SA setting)
- Operate inverter at peak power
- Open cooling water valve
28Evaluation
29Outline
- Introduction
- Theories
- Experiments for Real Data
- Conclusions
30Conclusions
- Introduced an identification method for causal variables
- by combining semi-supervised LDA and DBA
- Labels are acquired from an imperfect domain-specific rule
- SS-LDA/SS-KDA reflect domain knowledge and avoid over-fitting
- DBA extracts informative features from the normal direction of the boundary
- Applied to energy fault cause diagnosis
- Succeeded in extracting some responsible features
- beginning with fuzzy heuristics based on domain knowledge
31Room for improvements
- Consider temporal continuity
- Time-series is not i.i.d.
- Find True Cause from Correlating Variables
32Thank you for your kind attention
33
34Minor improvements
- Optimize Hyper-parameters
- AIC, BIC,
- Cross Validation
- Regularization Term
- L1-norm will give sparse solution
- Comparison to other discrimination methods
- SVM
- Laplacian SVM etc.
35Extension to Multiple Energy Faults
- In real systems, various faults take place
- Fault cause varies among phenomena
- Need to separate phenomena and diagnose each one separately
- Our approach:
- 1. Extract points detected by existing heuristics
- 2. Reduce dimensionality and visualize data in low-dim. space
- 3. Cluster the data and give the clusters labels
- 4. Identify variables discriminating each cluster from normal data
36Experimental Condition and Results
- Air-conditioning sensor data, 13 attributes, same heuristics
- 748 instances, operating time only (hourly data for 2 months)
- 137 points are detected by heuristics
- Reduce dimensionality by Isomap [J.B. Tenenbaum 00] (kNN = 5)
- Contribution score is given by SS-KDA (kNN = 5, ...)
[Figure: 2D representation. 2 major clusters, 4 anomalies.]
37Experimental Condition and Results
- Air-conditioning sensor data, 13 attributes, same heuristics
- 748 instances, operating time only (hourly data for 2 months)
- 137 points are detected by heuristics
- Reduce dimensionality by Isomap [J.B. Tenenbaum 00] (kNN = 5)
- Contribution score is given by SS-KDA (kNN = 5, ...)
[Figure: 2D representation. 2 major clusters, 4 anomalies.]
- Average temp. is very high, so the inverter operates hard for air-conditioning: detected, but this is not an EF
38Experimental Condition and Results
- Air-conditioning sensor data, 13 attributes, same heuristics
- 748 instances, operating time only (hourly data for 2 months)
- 137 points are detected by heuristics
- Reduce dimensionality by Isomap [J.B. Tenenbaum 00] (kNN = 5)
- Contribution score is given by SS-KDA (kNN = 5, ...)
[Figure: 2D representation. 2 major clusters, 4 anomalies.]
- Contribution score for red points
- Deviation of room air temp. around detected points: detected, and this is an EF
39Data Distribution
40Data Distribution
41Probabilistic Labeling
- Points distant from the boundary are reliable as class labels
- Keep robustness against outliers
- Points are stochastically given labels based on reliability
[Figure labels: Rule, outlier, Unreliable]
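The labeling idea can be sketched as weighted sampling on each side of the rule boundary. This is a minimal sketch under stated assumptions: a one-dimensional rule with a single threshold, weights directly proportional to distance from the boundary, and sampling without replacement. The function name `probabilistic_labels` is hypothetical.

```python
import random

def probabilistic_labels(values, boundary, n_per_class, seed=0):
    """Pick up to n_per_class indices on each side of a 1-D rule boundary.

    Each point is drawn with probability proportional to its distance from
    the boundary, so distant (more reliable) points are favored without
    always picking the single most extreme one.
    """
    rng = random.Random(seed)
    sides = (
        (+1, lambda v: v > boundary),         # rule fires
        (-1, lambda v: boundary > v),         # rule does not fire
    )
    chosen = {}
    for label, on_side in sides:
        pool = [(i, abs(v - boundary)) for i, v in enumerate(values) if on_side(v)]
        picked = []
        while pool and n_per_class > len(picked):
            idxs, weights = zip(*pool)
            i = rng.choices(idxs, weights=weights, k=1)[0]
            picked.append(i)
            pool = [(j, w) for j, w in pool if j != i]   # no replacement
        chosen[label] = picked
    return chosen

# Hypothetical rule variable values with the boundary at 4
values = [1.0, 2.0, 3.9, 4.1, 6.0, 9.0]
labels = probabilistic_labels(values, boundary=4.0, n_per_class=2)
```

Drawing stochastically instead of taking the farthest points is one way to read the "robustness against outliers" bullet: an extreme outlier is likely, but not guaranteed, to be selected.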
42Estimate DBFM
- Linear case
- Nonlinear case
- Difficult to acquire points on the boundary and calculate gradient vectors
- Discriminant function is linear in feature space: Kernelized SSLDA (SS-KDA)
43DBFM for Nonlinear Distribution (1)
- 1. Generate points on boundary in feature space
- 2. Gradient vector at corresponding point
- for Gaussian kernel
- But finding the pre-image is generally difficult
- By the kernel trick, the pre-image problem is avoidable
[Figure label: Input space]
44DBFM for Nonlinear Distribution (2)
- Finally we have gradient vectors on the boundary for each point
- 3. Construct estimated DBFM
- Define the responsibility of each variable for discrimination
[Equation label: Max. eigenvalue]
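The gradient expressions on slides 43 and 44 were lost in extraction. As a hedged sketch, for a kernel discriminant f(x) = sum_i alpha_i k(x_i, x) + b with a Gaussian kernel, the input-space gradient has a closed form, and an estimated DBFM can be assembled from unit gradients. The code assumes the evaluation points are already available in input space; how the slides obtain them through the kernel trick is not recoverable from the text, and all names here are illustrative.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """k(a, b) = exp(-||a - b||^2 / (2 sigma^2)), broadcast over rows of a."""
    return np.exp(-np.sum((a - b) ** 2, axis=-1) / (2.0 * sigma ** 2))

def discriminant(x, X_train, alpha, b, sigma):
    """Kernel discriminant f(x) = sum_i alpha_i k(x_i, x) + b."""
    return alpha @ gaussian_kernel(X_train, x, sigma) + b

def discriminant_gradient(x, X_train, alpha, sigma):
    """Analytic gradient: sum_i alpha_i k(x_i, x) (x_i - x) / sigma^2."""
    k = gaussian_kernel(X_train, x, sigma)
    return ((alpha * k)[:, None] * (X_train - x)).sum(axis=0) / sigma ** 2

def estimated_dbfm(points, X_train, alpha, sigma):
    """Average outer product of unit gradient vectors at the given points."""
    G = np.array([discriminant_gradient(p, X_train, alpha, sigma) for p in points])
    G = G / np.linalg.norm(G, axis=1, keepdims=True)
    return G.T @ G / len(G)

# Random stand-ins for training points, coefficients, and evaluation points
rng = np.random.default_rng(0)
X_train = rng.normal(size=(6, 3))
alpha = rng.normal(size=6)
b, sigma = 0.1, 1.5
x0 = rng.normal(size=3)
grad = discriminant_gradient(x0, X_train, alpha, sigma)
M = estimated_dbfm(rng.normal(size=(4, 3)), X_train, alpha, sigma)
```

The resulting matrix is symmetric with unit trace, so its eigen-decomposition can be used to rank variables in the same way as in the linear case.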
46Verification by Benchmark Data - wine discrimination -
- UCI Machine Learning Repository Wine Dataset
- Consider 2-class problem (original data contain 3)
- Number of instances: wine A 59, wine B 71
- 13 attributes, all continuous
- 1. Alcohol
- 2. Malic acid
- 3. Ash
- 4. Alkalinity of Ash
- 5. Magnesium
- 6. Phenols
- 7. Flavonoids
- 8. Nonflavonoid phenols
- 9. Proanthocyanins
- 10. Color intensity
- 11. Hue
- 12. OD280/OD315 of diluted wines
- 13. Proline
Histogram
47Result on Benchmark Data
- Acquire only 3 labels for each class based on probability proportional to distance from boundary (color intensity = 4)
- Hyper-parameters: nearest neighbors = 3
- 100 times average
Top 3 responsible attributes (attribute no. in parentheses, then score):
LDA: 1. Flavonoids (7) 18.0; 2. Color intensity (10) 13.2; 3. Phenols (6) 11.6; total 42.8
SS-LDA: 1. Proline (13) 26.5; 2. Color intensity (10) 22.1; 3. Alcohol (1) 14.2; total 62.8
48Comparison of SSLDA with LDA
- Plot data in the space spanned by the 3 most responsible features
[Figure panels: LDA, SSLDA]
- Apparently SSLDA gives effective features for discrimination