Title: Ph.D. Research Proposal: Leveraging Operational Data for Intelligent Decision Support in Construction Equipment Management
1. Ph.D. Research Proposal: Leveraging Operational Data for Intelligent Decision Support in Construction Equipment Management
- Hongqin Fan
- Provisional Ph.D. Candidate in Construction Engineering and Management
- University of Alberta
- April 24, 2006
2. Agenda
- Introduction
- Problem Statement and research motivation
- Related research
- Research methodology
- Data warehousing for equipment management
- Resolution-based outlier mining algorithm
- AutoRegression Tree based Prediction
- AutoRegression Tree based time series forecasting
- Expected contributions
- Summary and conclusions
3. Introduction
Introduction
Research Motivation
Related Research
Research Methodology
Expected Contributions
Conclusions
- Definition of construction equipment management
  - Manage equipment resources to maximize the return on capital investments and satisfy the needs of project management in a timely and cost-effective manner (adapted from Vorster and Livermore 1994).
- Major construction equipment management tasks
  - Corporate level: equipment acquisition, finance, life cycle costing
  - Operational level: logistics, maintenance and repair
4. Introduction
- Recent trends in construction equipment management
  - Computerized construction equipment management
  - Increasing automation in data collection and control
  - Management of large fleets
  - Diversification of the equipment acquisition and service market
5. Introduction
- Most data are collected and stored in electronic format.
- Commonly cited problems with the current equipment data
  - Data problems: noisy, fragmented, and difficult to retrieve
  - Underutilization of data assets due to a lack of advanced computer tools
- This research will improve the current situation through data warehousing and data mining.
6. Introduction
The equipment management team has more advanced
tools for decision support in addition to the
summary reports.
7. Introduction
- The research will achieve the following objectives:
  - Build a prototype construction equipment data warehouse as the enterprise data source for decision support. Explore the opportunities and challenges at different stages of data warehousing, including planning, design and implementation, for equipment management.
  - Design and test a novel nonparametric outlier mining algorithm for generic problem detection in construction equipment data, as well as other engineering data.
  - Test, evaluate and modify current data mining algorithms for decision support in construction equipment management.
  - Design and implement a prototype intelligent equipment management system using an integrated equipment data warehouse and embedded data mining models; make recommendations on the planning and design of an intelligent system.
8. Problem Statement and Research Motivation
- An equipment maintenance management system called MTrack was developed by the NSERC/Alberta Construction Research Chair and has been used by Standard General Inc. (SGI) since 1997, generating a large collection of equipment management data.
- Based on the case of SGI, and common to the industry, a number of problems undermine the usability of the collected data:
  - Equipment data
    - Data quality issues
    - Scattered data sources
    - Most data are stored in relational databases, which perform well in data storage and updating but have limited capability for data analysis
  - Equipment data utilization
    - Large amounts of data, but a low rate of utilization for decision support in equipment management
    - Lack of advanced computer tools for decision support using these data. Data analysis is commonly conducted by exporting data to statistical tools or spreadsheets.
9. Problem Statement and Research Motivation
- Relational databases are the back ends of most equipment management systems.
  - Equipment data stored in a relational database are considered superior to those in applications, spreadsheets, or text files for analysis.
- Still, equipment data in a relational database suffer from two drawbacks in terms of data analysis:
  - They are organized in a relational data model, which is optimized for adding and updating data but does not perform well in data retrieval for analysis.
  - Extracting data from the equipment database can only be performed by database specialists; as a result, the user has limited control over what data can be extracted.
10. Problem Statement and Research Motivation
- This research is motivated by recent developments in data warehousing and data mining techniques:
  - Affordability as commercial products
  - Integration with other applications through standard communication protocols
  - Capability to improve data quality, data structure and knowledge extraction, and therefore data utilization
11. Problem Statement and Research Motivation
- Data warehousing for construction equipment management
  - A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process (Inmon 1996).
  - A construction equipment data warehouse improves data quality through preprocessing and integrates data by pulling the needed data from various sources.
  - A construction equipment data warehouse uses a multidimensional data model and is built around the subjects of equipment management, making interactive data analysis by end users possible.
  - A construction equipment data warehouse can serve as a significantly improved data source for knowledge discovery and prediction.
12. Problem Statement and Research Motivation
- Data mining for identifying patterns, prediction and forecasting
  - Data mining is the process of extracting nontrivial, implicit, previously unknown and potentially useful information from large collections of data (Frawley et al. 1992).
  - Data mining can
    - Identify patterns (common patterns or unusual patterns)
    - Create data mining models for exploring hidden knowledge in data
    - Use models for prediction and forecasting
These tasks are common problems in decision making for equipment management.
13. Problem Statement and Research Motivation
- Data mining models are data-driven: the data mining algorithms derive models from historical data and can update the models dynamically when the data change.
- Comparison with statistical, mathematical and simulation models
14. Problem Statement and Research Motivation
Three research topics on data mining are selected
15. Related Research
- Research on construction equipment management has primarily focused on
  - Automation and robotic technologies
  - Real-time data communications and information processing
  - Statistical and analytical modeling for decision support in equipment management
- Research on intelligent equipment/asset management systems has been conducted for the maintenance operations of the following facilities:
  - Power plants
  - Industrial plants
  - Military facilities
16. Related Research
- Data warehousing technology has been applied in the following areas of the construction industry:
  - Inventory management of construction materials (Chau et al. 2002)
  - Document management for multiple parties and purposes (Ma et al. 2005)
- Data mining techniques have been applied in various research projects and applications in the construction industry:
  - Estimation of construction productivity using an artificial neural network (Lu et al. 2002)
  - Construction delay evaluation using a C4.5 decision tree (Soibelman and Kim 2002)
  - Classification and quantification of the cumulative impact of change orders on productivity using a decision tree (Lee et al. 2004)
  - Automatic classification of construction documents using a Support Vector Machine (SVM) (Caldas et al. 2002)
- Preliminary research has been conducted on data preprocessing and knowledge discovery from construction data (Soibelman and Kim 2002).
17. Research Methodology
Equip. data warehousing
RB-outlier mining
ART prediction
ART forecasting
- The research is based on the case of equipment management at Standard General Inc., Alberta, Canada.
- An overview of the research scope
19. Research Methodology (Data warehousing for equipment management)
- Procedures for construction equipment data warehousing
  - Identify all the data sources containing data related to equipment management
  - Extract, transform and load the data into the data warehouse repository
  - Expose the data for interactive data analysis, knowledge discovery or reports
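The extract, transform and load step can be sketched in a few lines of Python. This is a minimal illustration only: the CSV layout, the column names (`unit_no`, `repair_date`, `cost`), the cleaning rules and the in-memory `warehouse` dictionary are all assumptions made for the example, not the MTrack schema or the design used in this research.

```python
import csv
import io

# Hypothetical export from a maintenance system; the second record has no cost.
WORK_ORDERS_CSV = """unit_no,repair_date,cost
l-101,2005-01-14,650.00
L-102,2005-02-03,
T-201,2005-02-20,2000.00
"""

def extract(text):
    """Extract: read raw work-order records from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(records):
    """Transform: standardise unit numbers and drop records without a cost."""
    clean = []
    for row in records:
        if not row["cost"]:
            continue  # assumed cleaning rule: skip records with a missing cost
        clean.append({
            "unit_no": row["unit_no"].strip().upper(),
            "repair_date": row["repair_date"],
            "cost": float(row["cost"]),
        })
    return clean

def load(warehouse, table, rows):
    """Load: append cleaned rows to a named table in the repository."""
    warehouse.setdefault(table, []).extend(rows)
    return warehouse

warehouse = load({}, "repair_cost", transform(extract(WORK_ORDERS_CSV)))
```

Dropping incomplete records is only one possible cleaning policy; a real warehouse might flag or impute them instead.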
20. Research Methodology (Data warehousing for equipment management)
Procedures for building equipment data warehouse
21. Research Methodology (Data warehousing for equipment management)
- Data modeling and architectural design
  - Design a multidimensional data model for each subject, using a star schema.
  - Each data model contains a fact table surrounded by a number of dimension tables.
  - Descriptive attributes are usually organized in hierarchies to allow data analysis at different levels of detail.
  - Because many data cubes share dimensions in the equipment data warehouse, the Bus Matrix model proposed by Kimball and Ross (2002) is used for architectural design.
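A star schema of this kind can be sketched with Python's built-in `sqlite3` module. The table names, columns and sample rows below are hypothetical and chosen only to show one fact table joined to dimension tables and rolled up at different levels; they are not the actual Repair Cost model of this research.

```python
import sqlite3

def build_repair_cost_star(conn):
    """Create a tiny star schema: one fact table and two dimension tables."""
    conn.executescript("""
        CREATE TABLE dim_equipment (
            equipment_key INTEGER PRIMARY KEY,
            unit_no TEXT, equipment_class TEXT, manufacturer TEXT
        );
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            year INTEGER, quarter INTEGER, month INTEGER
        );
        CREATE TABLE fact_repair_cost (
            equipment_key INTEGER REFERENCES dim_equipment(equipment_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            labour_cost REAL, parts_cost REAL
        );
    """)
    conn.executemany("INSERT INTO dim_equipment VALUES (?, ?, ?, ?)", [
        (1, "L-101", "Loader", "MakerA"),
        (2, "L-102", "Loader", "MakerB"),
        (3, "T-201", "Truck", "MakerA"),
    ])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?, ?)", [
        (200501, 2005, 1, 1),
        (200502, 2005, 1, 2),
        (200507, 2005, 3, 7),
    ])
    conn.executemany("INSERT INTO fact_repair_cost VALUES (?, ?, ?, ?)", [
        (1, 200501, 400.0, 250.0),
        (1, 200507, 100.0, 50.0),
        (2, 200502, 300.0, 0.0),
        (3, 200502, 800.0, 1200.0),
    ])

def cost_by(conn, level):
    """Total repair cost grouped by one descriptive attribute of the equipment dimension."""
    if level not in ("unit_no", "equipment_class", "manufacturer"):
        raise ValueError("unknown attribute")
    rows = conn.execute(
        f"SELECT e.{level}, SUM(f.labour_cost + f.parts_cost) "
        "FROM fact_repair_cost f JOIN dim_equipment e USING (equipment_key) "
        f"GROUP BY e.{level} ORDER BY e.{level}"
    ).fetchall()
    return dict(rows)

conn = sqlite3.connect(":memory:")
build_repair_cost_star(conn)
by_class = cost_by(conn, "equipment_class")   # coarser level of detail
by_unit = cost_by(conn, "unit_no")            # finer level of detail
```

The same fact rows answer both queries; only the dimension attribute used for grouping changes, which is what allows analysis at different levels of detail.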
22. Research Methodology (Data warehousing for equipment management)
[Figure: Multidimensional data model for the subject Repair Cost. Callouts: the fact table holds the measurements of performance in equipment management; the dimensions describe the measurements in the fact table.]
23. Research Methodology (Resolution-based outlier mining algorithm)
- Outlier mining identifies irregular patterns or inconsistent records in a dataset.
- Current statistical methods (e.g. multivariate outlier detection) and outlier mining algorithms do not perform well in engineering applications.
- A non-parametric outlier mining algorithm based on resolution change is proposed in this research.
24. Research Methodology (Resolution-based outlier mining algorithm)
- Why is it possible to identify outliers during resolution change?
  - Given a set of data objects, the underlying clusters and outliers change as the resolution of the data objects is increased or decreased.
  - This makes it possible to identify outliers by successively changing the resolution of a set of data objects and collecting pre-defined statistical properties.
  - A resolution-based outlier factor (ROF) is defined to measure the degree to which a data point is outlying.
- Overall procedures for the algorithm
25. Flowchart for resolution-based outlier mining algorithm
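The resolution-change idea can be illustrated with a simplified Python sketch. It is not the ROF, nor the procedure in the flowchart, neither of which is spelled out in this transcript. Here "resolution" is assumed to be a neighbourhood radius, clusters are connected groups of points within that radius, and the collected statistic is a hypothetical isolation score: the average, over all radii, of how small a point's cluster is relative to the dataset.

```python
import math

def cluster_sizes(points, radius):
    """Size of the connected cluster each point belongs to when points
    within `radius` of each other are treated as neighbours."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= radius:
                parent[find(i)] = find(j)

    counts = {}
    for i in range(n):
        root = find(i)
        counts[root] = counts.get(root, 0) + 1
    return [counts[find(i)] for i in range(n)]

def isolation_scores(points, radii):
    """Hypothetical score in [0, 1]: 1 means always alone, 0 means always
    merged with every other point, averaged over all resolutions."""
    n = len(points)
    scores = [0.0] * n
    for r in radii:
        for i, size in enumerate(cluster_sizes(points, r)):
            scores[i] += 1.0 - (size - 1) / (n - 1)
    return [s / len(radii) for s in scores]

# Four points in a tight square plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = isolation_scores(points, radii=[0.5, 1.5, 5.0, 20.0])
most_outlying = scores.index(max(scores))
```

The distant point stays isolated across most resolutions and therefore receives the highest score, while the clustered points merge early and score low.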
26. Research Methodology (Resolution-based outlier mining algorithm)
Preliminary experimental results on a synthetic
dataset
Top-10 outliers detected from a synthetic
200-tuple Dataset
27. Research Methodology (Resolution-based outlier mining algorithm)
- The research will compare the proposed algorithm with the current distance-based outlier mining algorithm (Knorr and Ng 1998) and the local density-based outlier mining algorithm (Breunig et al. 2000) from the following perspectives:
  - Results of experimental tests on synthetic datasets and real-life equipment datasets
  - Explanation of the test results in terms of the outlier definitions and outlier mining algorithms
  - Pros and cons of the three algorithms in engineering applications
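For reference, the distance-based baseline can be sketched as follows. Under the commonly cited DB(p, D) definition, an object is an outlier if at least a fraction p of the objects in the dataset lie farther than distance D from it. The naive nested loop, the function name and the sample points below are illustrative assumptions, not the optimized algorithms of Knorr and Ng (1998).

```python
import math

def db_outliers(points, p, d):
    """Indices of DB(p, d)-outliers: points for which at least a fraction p
    of all points in the dataset lie farther away than distance d."""
    n = len(points)
    flagged = []
    for i, a in enumerate(points):
        far = sum(1 for b in points if math.dist(a, b) > d)
        if far >= p * n:
            flagged.append(i)
    return flagged

sample = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
outliers = db_outliers(sample, p=0.8, d=3.0)
```

Unlike a non-parametric approach, this definition requires the user to choose both p and D in advance, and the result depends on that choice.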
28. Research Methodology (AutoRegression Tree based prediction)
- For a numerical target attribute, the prediction problem in data mining is to estimate the target attribute from a set of known attributes (categorical or numerical values).
- The AutoRegression Tree (ART) data mining algorithm (Meek et al. 2002) is a hybrid of decision tree and multivariate linear regression, designed for prediction.
- Using the training dataset, the ART algorithm grows a top-down decision tree with a linear regression model in each leaf node.
29. Research Methodology (AutoRegression Tree based prediction)
- Information gain is used to select attributes and splits when growing a C4.5 decision tree (Kantardzic 2003).
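Information gain for a categorical split can be computed as in the sketch below: the entropy of the target before the split minus the weighted entropy of the groups after it. The work-order records and attribute names are made up for illustration. Note that C4.5 itself normalizes this quantity into a gain ratio, which is not shown here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting rows on `attribute`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Hypothetical work orders: was the repair over its estimate?
rows = [
    {"repair_type": "engine", "shift": "day",   "over_estimate": "yes"},
    {"repair_type": "engine", "shift": "night", "over_estimate": "yes"},
    {"repair_type": "tire",   "shift": "day",   "over_estimate": "no"},
    {"repair_type": "tire",   "shift": "night", "over_estimate": "no"},
]
gain_repair_type = information_gain(rows, "repair_type", "over_estimate")
gain_shift = information_gain(rows, "shift", "over_estimate")
```

In this toy data, `repair_type` separates the classes perfectly and `shift` not at all, so a tree grower would split on `repair_type` first.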
30. Research Methodology (AutoRegression Tree based prediction)
- The least squares method is used to build the multivariate linear regression model in each leaf node.
- An example of ART estimation is the evaluation of work orders estimated by the equipment superintendent:
  - Given the impact factors, such as equipment manufacturer, age, component, repair type and estimated hours, what is the likely error of the estimates?
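The combination of a tree split with a least-squares model in each leaf can be sketched as below. This is a deliberately reduced illustration, not the ART algorithm of Meek et al. (2002): the tree has a single, hand-picked split on a categorical attribute instead of learned splits, each leaf fits a one-variable line rather than a multivariate regression, and the work-order data and attribute names are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_one_split_tree(rows, split_attr, x_attr, y_attr):
    """Partition rows by `split_attr` and fit one regression line per leaf."""
    leaves = {}
    for row in rows:
        leaves.setdefault(row[split_attr], []).append(row)
    return {
        value: fit_line([r[x_attr] for r in group], [r[y_attr] for r in group])
        for value, group in leaves.items()
    }

def predict(tree, row, split_attr, x_attr):
    """Route the row to its leaf, then apply that leaf's regression line."""
    intercept, slope = tree[row[split_attr]]
    return intercept + slope * row[x_attr]

# Hypothetical work orders: estimated versus actual repair hours.
work_orders = [
    {"repair_type": "engine",    "estimated_hours": 2.0, "actual_hours": 5.0},
    {"repair_type": "engine",    "estimated_hours": 4.0, "actual_hours": 8.0},
    {"repair_type": "engine",    "estimated_hours": 6.0, "actual_hours": 11.0},
    {"repair_type": "hydraulic", "estimated_hours": 1.0, "actual_hours": 1.0},
    {"repair_type": "hydraulic", "estimated_hours": 3.0, "actual_hours": 3.0},
    {"repair_type": "hydraulic", "estimated_hours": 5.0, "actual_hours": 5.0},
]
tree = fit_one_split_tree(work_orders, "repair_type", "estimated_hours", "actual_hours")
new_order = {"repair_type": "engine", "estimated_hours": 10.0}
predicted_actual = predict(tree, new_order, "repair_type", "estimated_hours")
likely_error = predicted_actual - new_order["estimated_hours"]
```

Because each leaf keeps an explicit intercept and slope, the model stays readable: in this toy data, engine repairs are systematically underestimated while hydraulic estimates are accurate.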
31. Research Methodology (AutoRegression Tree based prediction)
Part of the Induced Work Order Evaluation ART
Model
32. Research Methodology (AutoRegression Tree based prediction)
- The research will evaluate the ART algorithm in terms of
  - Estimation accuracy
  - Interpretability of the derived model
  - Flexibility in solving different problems in equipment management
- The ART algorithm will also be compared with the Classification and Regression Tree (CART) algorithm (Breiman et al. 1984).
33. Research Methodology (AutoRegression Tree based time series forecasting)
- Time series data are a series of observations collected over successive increments of time.
- Time series forecasting predicts future values of a time series based on historical observations, assuming the current trend continues.
34. Research Methodology (AutoRegression Tree based time series forecasting)
- Traditional statistical methods decompose a time series into four basic movements: trend, cyclic, seasonal, and irregular movements.
- The basic assumption of the time series forecasting model is autoregression: the current value of a time series depends on its previous n observed values, plus noise:
  $x_t = f(x_{t-1}, x_{t-2}, \dots, x_{t-n}) + \varepsilon_t$, where $\varepsilon_t$ is the noise term.
- In the simplest case, f is a linear regression and the problem is solved using the least squares method.
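The simplest case, a linear autoregression fitted by least squares, can be sketched in plain Python. The series below is synthetic, generated without noise from x_t = 1 + x_{t-1} - x_{t-2} so that the fit can be checked by hand; the function names and the choice of solving the normal equations by Gaussian elimination are assumptions made for this illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ar(series, n_lags):
    """Least-squares fit of x_t = c + a_1 x_{t-1} + ... + a_n x_{t-n}.
    Returns [c, a_1, ..., a_n]."""
    rows = [[1.0] + [series[t - k] for k in range(1, n_lags + 1)]
            for t in range(n_lags, len(series))]
    targets = series[n_lags:]
    size = n_lags + 1
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(size)] for i in range(size)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(size)]
    return solve(xtx, xty)

def forecast_next(series, coeffs):
    """One-step-ahead forecast from the most recent observations."""
    n_lags = len(coeffs) - 1
    return coeffs[0] + sum(coeffs[k] * series[-k] for k in range(1, n_lags + 1))

# Synthetic, noise-free series following x_t = 1 + x_{t-1} - x_{t-2}.
series = [0.0, 1.0, 2.0, 2.0, 1.0, 0.0, 0.0, 1.0, 2.0, 2.0, 1.0, 0.0]
coeffs = fit_ar(series, n_lags=2)
next_value = forecast_next(series, coeffs)
```

Replacing the single linear model with a tree of linear models, or with a neural network, changes only how f is represented; the lagged-value setup stays the same.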
35. Research Methodology (AutoRegression Tree based time series forecasting)
- Current approaches
  - Statistical modeling and forecasting
    - ARMA (AutoRegressive Moving Average) is a representative statistical approach for modeling and forecasting.
  - Neural network forecasting
    - Use a neural network (NN) to replace the regression model.
- This research uses the ART model for forecasting
  - Use the AutoRegression Tree (ART) data mining model to replace the regression model.
36. Preliminary test results
Forecasting results using the ART model to predict monthly equipment repair and maintenance costs at Standard General Inc.
37. Research Methodology (AutoRegression Tree based time series forecasting)
- Time series prediction based on the ART data mining algorithm will be compared with the ARMA statistical method and neural networks from the following perspectives:
  - Accuracy of prediction
  - Extensibility and transparency of the prediction model
  - Pros and cons in system integration for equipment management
38. Expected Contributions
- This research will provide guidelines for applying data warehousing technology to construction equipment management for improved decision support. These include the opportunities, challenges, and suggestions for the planning and design of an equipment data warehouse.
- A novel non-parametric outlier mining algorithm is proposed for generic problem detection in both equipment management and other engineering applications. This will contribute to the body of knowledge in the data mining community.
- Current data mining algorithms, such as AutoRegression Tree, will be tested, evaluated and modified for intelligent decision support in construction equipment management. This research will report the findings and make recommendations on the general application of data mining technology in construction equipment management.
- This research will summarize and make recommendations on the architectural design and implementation of an intelligent equipment management information system using combined data warehousing and data mining techniques, to meet industrial expectations.
39. Conclusions
- This research will address data management issues and facilitate the transformation of data into useful information and knowledge that the equipment management team can act upon.
- The focuses of this research are the high-level design of intelligent systems for equipment management, a detailed study of a novel data mining algorithm, and the evaluation of the current ART data mining algorithm for engineering applications.
- This research addresses both academic issues and real-life application issues; therefore it will directly benefit the construction industry.
40. References
- Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984). Classification and Regression Trees. Chapman & Hall/CRC Press, Boca Raton, FL.
- Breunig, M., Kriegel, H., Ng, R. and Sander, J. (2000). LOF: identifying density-based local outliers. Proceedings of the ACM SIGMOD 2000 International Conference on Management of Data, Dallas, TX, USA.
- Caldas, C. H., Soibelman, L. and Han, J. (2002). Automated Classification of Construction Project Documents. ASCE Journal of Computing in Civil Engineering, 16(4), 234-243.
- Chau, K.W., Cao, Y., Anson, M. and Zhang, J. (2002). Application of Data Warehouse and Decision Support System in Construction Management. Automation in Construction, 12, 213-224.
- Frawley, W., Piatetsky-Shapiro, G. and Matheus, C. (1992). Knowledge Discovery in Databases: An Overview. AI Magazine, Fall 1992, pp. 213-228.
- Inmon, W.H. (1996). Building the Data Warehouse. John Wiley & Sons, New York.
- Kantardzic, M. (2003). Data Mining: Concepts, Models, Methods, and Algorithms. John Wiley & Sons, Inc., NJ, USA.
- Kimball, R. and Ross, M. (2002). The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, second edition. John Wiley & Sons, Inc., New York, pp. 1-388.
- Knorr, E. and Ng, R. (1998). Algorithms for mining distance-based outliers in large datasets. Proceedings of the Very Large Data Bases Conference, New York, USA.
- Lee, M., Hanna, A.S. and Loh, W.Y. (2004). Decision Tree Approach to Classify and Quantify Cumulative Impact of Change Orders on Productivity. J. Comp. in Civ. Engrg., ASCE, 18(2), 132
- Lu, M., AbouRizk, S.M. and Hermann, U.H. (2002). Estimating labor productivity using probability inference neural network. J. Comp. in Civ. Engrg., ASCE, 14(4), 241-248.
- Ma, Z., Wond, K.D., Heng, L. and Jun, Y. (2005). Utilizing exchanged documents in construction projects for decision support based on data warehousing technique. Automation in Construction, 14(3), 405-412.
- Meek, C., Chickering, D.M. and Heckerman, D. (2002). Autoregressive Tree Models for Time-Series Analysis. Proceedings of the 2nd SIAM International Conference on Data Mining, Arlington, VA, USA.
- Soibelman, L. and Kim, H. (2002). Data Preparation Process for Construction Knowledge Generation through Knowledge Discovery in Databases. ASCE Journal of Computing in Civil Engineering, 16(1), 39-48.
- Vorster, M. C. and Livermore, M. E. (1994). Executive development for equipment managers. Proceedings of the conference Equipment Resource Management into the 21st Century, Nashville, Tennessee, pp. 87-95.
41. H. Fan, Prov. Ph.D. Candidate
University of Alberta