Title: Enterprise and Business Intelligence Systems (e.bis.business.utah.edu) Research Lab, UA -> UU Director Ol
1 Enterprise and Business Intelligence Systems (e.bis.business.utah.edu) Research Lab, UA -> UU
Director: Olivia R. Liu Sheng, Ph.D.
Emma Eccles Jones Presidential Chair of Business
School of Accounting and Information Systems
David Eccles School of Business
University of Utah
801-585-9071, olivia.sheng_at_business.utah.edu
2 e.bis Research Focus
- Enterprise Systems
  - E-procurement technology
  - Web content caching and storage mgmt
  - Enterprise application integration
  - Process modeling and re-use
  - System security and risk management
  - Portal design and management
- Business Intelligence Systems
  - Decision support systems
  - Data/web mining
  - Knowledge management
  - Knowledge refreshing
  - Personalization
3 e.bis Research Output
- Models
- Methods
- Technology
- Analyses
Fueled by Applications!
4 Faculty
- Olivia R. Liu Sheng, Ph.D. (UU)
- Paul Hu, Ph.D. (UU)
Ph.D. students and Post Docs
- Xiao Fang, 5th-yr Ph.D. student (UA)
- Lin Lin, 3rd-yr Ph.D. student (UA)
- Wei Gao, 3rd-yr Ph.D. student (UA)
- Hua Su, post-doc (UA)
- Xiaoyun Sun, 1st-yr Ph.D. student (UA)
- Zhongmin Ma, 1st-yr Ph.D. student (UU)
6 to 10 Master and UG students per yr
International and industrial collaborators
5 Web Mining for Knowledge Management
6 What is Data Mining?
- The automated process of discovering relationships and patterns in data
- Related terms: knowledge discovery in databases (KDD), machine learning
- A step in the knowledge discovery process consisting of particular algorithms (methods) that, under some acceptable objective, produce a particular enumeration of patterns (models) over the data.
- An iterative process within which progress is defined by discovery, through either automatic or manual methods
- The application of statistical and artificial intelligence techniques (algorithms) for discovering patterns and regularities in large volumes of data.
7 Why Data Mining
- Type of knowledge (more abstract) and the level of sophistication in required computation, e.g.,
  - Which buyers are likely to be late on future payments?
  - Which sellers are likely to be late on future deliveries?
  - If a seller increases product-in-week by x units, how much of a sales increase can be expected?
  - Which buyers are similar in their buying power and in their product and contract preferences?
- The frequency at which knowledge must be discovered and applied runs into bottlenecks in human processing
  - Decision support for buyers, sellers and market hosts at each transaction decision point
- Data Visualization Needs
  - Going beyond business charts (e.g., pie, line, bar charts)
  - Maps, trees, 2-D, and 3-D
8 Taxonomies of Data Mining
9 Data Mining Tasks
- Association/Sequential Patterns
  - The discovery of co-occurrence correlations among a set of items.
- Clustering
  - Identifying clusters embedded in the data, where a cluster is a collection of data objects that are similar to one another.
- Classification
  - Analyzing a set of training data and constructing a model for each class based on the features in the data.
- Class Description
  - Providing a concise and succinct summarization of a collection of data.
- Time-series Analysis
  - Analyzing large sets of time-series data to find certain regularities and interesting characteristics.
10 Market Basket (Association Rule) Analysis
- A market basket is a collection of items purchased by a customer in an individual customer transaction, which is a well-defined business activity
- Ex:
  - a customer's visit to a grocery store
  - an online purchase from a virtual store such as Amazon.com
11 Market Basket (Association Rule) Analysis
- Market basket analysis is a common analysis run against a transaction database to find sets of items, or itemsets, that appear together in many transactions. Each pattern extracted through the analysis consists of an itemset and the number of transactions that contain it.
- Applications
  - improve the placement of items in a store
  - the layout of mail-order catalog pages
  - the layout of Web pages
  - others?
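As a rough illustration of the itemset counting described above, the following Python sketch enumerates every itemset of up to two items and keeps those found in at least `min_count` transactions. The baskets, the function name `frequent_itemsets`, and the `min_count` and `max_size` parameters are all hypothetical, chosen only for this example. It is a brute-force count, not a pruning algorithm such as Apriori, so it is suitable only for small data.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_count, max_size=2):
    """Count itemsets of up to max_size items; keep those seen min_count times."""
    counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore duplicate items within one basket
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    # Each surviving pattern is an itemset plus the number of
    # transactions that contain it.
    return {itemset: n for itemset, n in counts.items() if n >= min_count}


# Hypothetical baskets, invented for illustration.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]
patterns = frequent_itemsets(baskets, min_count=2)
```

Here `patterns` maps sorted item tuples to transaction counts, so a store-layout or catalog-layout decision could start from the pairs with the highest counts.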
12 Clustering
Clustering distributes data into several groups so that similar objects fall into the same group. For example, we can cluster customers based on their purchase behavior.
Applications: customer, web content, document and gene segmentation
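One way to picture customer clustering is with k-means, used here only as a familiar example; the slide does not name a specific algorithm. The sketch below assumes each customer is described by two invented features (orders per month and average order value) and that the starting centroids are supplied by the caller.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment, centroids


# Hypothetical customers: (orders per month, average order value).
customers = [(1, 20), (2, 25), (1, 30), (10, 200), (12, 220), (11, 210)]
labels, centers = kmeans(customers, centroids=[customers[0], customers[3]])
```

With these made-up values the occasional small buyers and the frequent large buyers end up in separate groups. In practice the features would need scaling, and the number of clusters would be a modeling choice.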
13 Classification
Classification classifies data into pre-defined outcome classes
Example
14 Classification
[Figure: decision tree with the tests "Age < 25" and "Car Type in {sports}" and leaves High, Low, High]
Applications: customer profiling, shopping prediction, diagnostic decision support
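The decision tree on this slide survives only as its node labels, so the sketch below is a guess at its shape: it assumes the root tests age, that the second test on car type applies only to customers aged 25 or older, and that the leaves are arranged as shown in the code. The function name `classify` and the sample customers are hypothetical.

```python
def classify(age, car_type):
    """Walk an assumed two-level decision tree and return a class label."""
    if age < 25:                  # first test named on the slide
        return "High"
    if car_type in {"sports"}:    # second test named on the slide
        return "High"
    return "Low"


# Hypothetical customers: (age, car type).
customers = [(23, "family"), (40, "sports"), (40, "family")]
predictions = [classify(age, car) for age, car in customers]
```

A classifier learned from training data would produce tests like these automatically. Here they are written by hand only to show how a new record is mapped to one of the pre-defined classes.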
15 By Data
- Structured alphanumeric data
  - Buyer, supplier, product, order, bank acct
- Image data
  - Satellite, patient, document, handwriting, facial, etc.
- Spatial data
  - Map, traffic, geological, CAD, graphics, etc.
16 By Data, Contd
- Temporal data
  - Time series, population, stock, inventory, sales, etc.
- Spatial and temporal data: trajectory
- Text: documents, web pages, etc.
- Video/audio: surveillance video, voice, music, etc.
17 Web (Data) Mining
- Web data: generated or used by the Web
  - Web content: static or dynamic
  - Web structure: hyperlinks
  - Web usage: web access log
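To make the web usage case concrete, here is a small sketch that counts successful page requests in an access log. It assumes the log follows the Common Log Format. The sample lines, the regular expression, and the function name `page_hits` are illustrative assumptions, not taken from the slides.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "METHOD path PROTOCOL" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) \S+$')


def page_hits(log_lines):
    """Count successful (2xx) requests per requested path."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed format
        host, timestamp, method, path, status = match.groups()
        if status.startswith("2"):
            hits[path] += 1
    return hits


# Hypothetical log lines, invented for illustration.
sample_log = [
    '10.0.0.1 - - [01/Jan/2003:10:00:00 -0700] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.2 - - [01/Jan/2003:10:00:05 -0700] "GET /products.html HTTP/1.0" 200 2310',
    '10.0.0.1 - - [01/Jan/2003:10:00:09 -0700] "GET /products.html HTTP/1.0" 200 2310',
    '10.0.0.3 - - [01/Jan/2003:10:00:12 -0700] "GET /missing.html HTTP/1.0" 404 210',
]
hits = page_hits(sample_log)
```

Per-page counts like these are only a starting point. Grouping requests by host and time into sessions would be the usual next step before looking for navigation patterns.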
18 Why is Web Mining Important?
- Rich data gathering and access medium
- A variety of important applications
  - Information retrieval
  - Ecommerce: CRM, SCM, etc.
  - Knowledge management
- Interesting challenges
  - Scalability: global, multi-lingual, growth
  - Agility of knowledge
19 What is knowledge?
- Relationships and patterns in data
- Organized, analyzed and understandable
- Truths, beliefs, perspectives, concepts, procedures, judgments, expectations, methodologies, heuristics, restrictions, know-how
- Applicable to problem solving and decision making
- DBs, documents, policies and procedures as well as the un-captured, tacit expertise and experience
- Actionable, at the right place and right time!!!
20 What is Knowledge Management?
- Views
  - Process (KM activities)
  - Goal (Operational efficiency and innovations)
  - Methodology (formalization, control and technology)
- Delphi Group: Leveraging collective wisdom to increase responsiveness and innovation.
21 What is a KM program?
- Processes
- Organizational structure and policies
- Management theories and methodologies
- Information assurance
- Technologies and resources
- Implementation, training and change management
- Measurement, maintenance and evolution
- A multi-disciplinary effort!!!
  - Managerial and cultural
  - Technological and engineering
- Resources, support and technology for
  - Creation, acquisition, organization, storage, retrieval, visualization and sharing of knowledge
22 KM Process
- Identify
- Collect
- Organize
- Represent
- Store
- Locate
- Retrieve
- Extract
- Discover
- Visualize
- Interpret
- Share
- Transfer
- Adapt
- Apply
- Monitor
- Evaluate
- Create
23 Data Mining KM
- Data mining -> discover knowledge
- Data mining -> support management of KM infrastructure
  - (Personalized) content management
  - Security management
  - Workflow management
  - Scalable performance
24 Web Mining KM
- Web mining -> discover knowledge
- Web mining -> support management of web KM portal
  - R&D
  - Intranet
  - Consulting
  - B2B, B2C, e-government, e-financing, e-risk management
25 Web Mining Knowledge Refreshing
26 The KDD Process
[Figure: the KDD process; surviving label: Data]
27 Types of Domain Knowledge
[Figure labels: DBA Knowledge, Data, Domain Expert Knowledge, Data Mining Expert Knowledge]
28 Fundamental Problems
- The size of the database is significantly large
- The number of rules resulting from mining activity is also large
- The knowledge derived from a database reflects only the current state of the database
29 Issues in the KDD Process
[Figure labels: Agility, Scalability, Data]
30 Knowledge Refreshing
- The process of efficiently updating discovered knowledge as data and domain knowledge change.
- Goals
  - Up-to-date knowledge (Agility)
  - Knowledge Re-use (Scalability)
31 Type of Changes
[Figure: the domain knowledge diagram (DBA Knowledge, Data, Domain Expert Knowledge, Data Mining Expert Knowledge) with elements marked NEW]
32 Knowledge Refreshing
- Needs assessment
  - Monitoring vs. analytic approaches
  - Monitor/estimate changes in knowledge to determine if and when to re-mine
- Incremental data mining (learning)
  - How to leverage knowledge previously discovered from data mining to improve computational efficiency and quality of knowledge
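The incremental idea can be sketched for itemset counts: instead of re-mining the whole database when new transactions arrive, the previously computed counts are reused and only the new batch is scanned. The class `RefreshableCounts`, its methods, and the sample data are hypothetical. Keeping counts for every itemset is a simplification that real incremental mining methods avoid, since they typically track only frequent itemsets and a small set of candidates.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Count every itemset of up to max_size items in the given transactions."""
    counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    return counts


class RefreshableCounts:
    """Keeps itemset counts current by scanning only newly arrived data."""

    def __init__(self, max_size=2):
        self.max_size = max_size
        self.counts = Counter()
        self.n_transactions = 0

    def add_batch(self, transactions):
        transactions = list(transactions)
        # Reuse the old counts; only the new batch is scanned.
        self.counts.update(count_itemsets(transactions, self.max_size))
        self.n_transactions += len(transactions)

    def support(self, itemset):
        if self.n_transactions == 0:
            return 0.0
        return self.counts[tuple(sorted(itemset))] / self.n_transactions


# Hypothetical data: an existing database and a newly arrived batch.
old_data = [["bread", "milk"], ["bread", "beer"]]
new_data = [["bread", "milk"], ["bread", "milk", "beer"]]

store = RefreshableCounts()
store.add_batch(old_data)
before = store.support(["bread", "milk"])
store.add_batch(new_data)
after = store.support(["bread", "milk"])
```

Comparing `before` and `after` is also a crude form of the monitoring approach: a large shift in the support of known patterns is one possible signal that a full re-mine is worth its cost.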