Title: A Survey of Process Mining in ProM
1. A Survey of Process Mining in ProM
Decision Systems Lab (DSL) Seminar
School of Computer Science and Software Engineering
Faculty of Informatics
DSL 7 September 2009
2. Outline
- What is Process Mining?
- Objectives of Process Mining
- Background of Process Mining
- Current Process Mining Techniques
- Effectiveness of Process Mining
- A Process Mining Tool: ProM
3. What is Process Mining?
- Process mining automatically determines and analyses actual process execution, showing how processes are performing in a completely new, process-oriented way.
- The basic idea behind process mining is to extract knowledge from event logs recorded by IT systems.
- Data mining practice has been developed and adapted to create the business process mining techniques that are now used to mine logs containing process execution data.
4. What is Process Mining? (cont.)
- Note that this concept is not limited to IT systems; it can also be used to monitor other operational processes or systems, such as:
  - Complex workflows in a large enterprise
  - Complex devices (e.g. X-ray machines, supercomputers, etc.)
5. What is Process Mining? (cont.)
- An example of the process mining paradigm:
1. Information system: it contains valuable information about (the performance of) the organization.
2. Event logs: they contain historical data of actual process execution. Indeed, they contain the implicit answers to the famous questions: who did what, when, and how.
3. Answering questions about process execution: the answers can be extracted from the event logs through a process mining technique.
Figure: An Example of Process Mining Paradigm
6. Objectives of Process Mining
- The knowledge extracted from event logs is used:
  - To maintain business processes
  - To improve real business processes
  - To (re)design actual business processes
Figure: An Example of Process Redesign Cycle
7. Background of Process Mining Techniques (1)
- Agrawal et al. (1998) were early pioneers of process mining. Their algorithmic approach to process mining allowed the construction of process flow graphs from execution logs of a workflow application.
- The discipline of process mining also has its roots in the work of Cook and Wolf (1998), who attempted to discover software process models from the data contained in event logs.
- van der Aalst (2004) compares the method of extracting process models from data with that of distillation.
- In terms of business process mining, van der Aalst (2004) states that almost any transactional information system can provide suitable data.
References:
Agrawal, R., Gunopulos, D., Leymann, F. (1998), "Mining process models from workflow logs", in Schek, H.J. (Eds), Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology, Springer Verlag, Heidelberg.
Cook, J.E., Wolf, A.L. (1998a), "Discovering models of software processes from event-based data", ACM Transactions on Software Engineering and Methodology, Vol. 7 No. 3, pp. 215-49.
van der Aalst, W.M.P. (2004a), "Process mining: a research agenda", Computers in Industry, Vol. 53, pp. 231-44.
8. Background of Process Mining Techniques (2)
- van der Aalst (2003) identifies two broad types of workflow meta-models:
  - (1) Block-oriented meta-model
  - (2) Graph-oriented meta-model
- Each type of model has its own language and graphical representation.
- Aguilar-Saven (2004) adds net-based languages to this definition (with block-oriented models/languages being grouped under the term workflow languages).
Figure: An Example of Block-oriented Meta Model
References:
van der Aalst, W.M.P. (2003), "Workflow mining: a survey of issues and approaches", Data & Knowledge Engineering, Vol. 47, pp. 237-67.
Aguilar-Saven, R.S. (2004), "Business process modelling: review and framework", International Journal of Production Economics, Vol. 90, pp. 129-49.
9. Background of Process Mining Techniques (3)
- The most common form of graph-oriented meta-model is the directed graph.
- Agrawal et al. (1998) were among the first to use directed graphs in process mining.
- The authors describe a number of constructs involved in the actual graph. Activities, usually enclosed in boxes or circles, are referred to as vertices, and the arrows between the activities, which indicate the direction of flow, are known as edges.
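The vertex and edge vocabulary above can be illustrated with a small Python sketch. It is not the algorithm of Agrawal et al.; it only shows one plausible way to hold a process flow graph in memory, assuming the execution log is already available as ordered lists of activity names. The function name `build_flow_graph` and the sample traces are hypothetical.

```python
from collections import defaultdict


def build_flow_graph(traces):
    """Build a directed graph from execution traces.

    Vertices are activity names; an edge (a, b) is recorded whenever
    activity b directly follows activity a in at least one trace.
    """
    vertices = set()
    edges = defaultdict(int)  # (source, target) -> number of observations
    for trace in traces:
        vertices.update(trace)
        for source, target in zip(trace, trace[1:]):
            edges[(source, target)] += 1
    return vertices, dict(edges)


# Hypothetical execution log: each inner list is one case, in execution order.
example_traces = [
    ["register", "check", "approve", "archive"],
    ["register", "check", "reject", "archive"],
    ["register", "check", "approve", "archive"],
]

vertices, edges = build_flow_graph(example_traces)
for (source, target), count in sorted(edges.items()):
    print(f"{source} -> {target} (seen {count}x)")
```

Keeping a count on each edge is a design choice made here so that frequent and rare flows can be told apart; a plain set of edges would be enough for the graph structure alone.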
Figure: Examples of Graph-oriented Meta Model
References:
Agrawal, R., Gunopulos, D., Leymann, F. (1998), "Mining process models from workflow logs", in Schek, H.J. (Eds), Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology, Springer Verlag, Heidelberg.
10. Current Process Mining Techniques
- There are several techniques that may be used to mine business processes, such as:
  - Genetic algorithms: algorithms designed around the process of Darwinian natural selection (Alves de Medeiros et al., 2004).
  - General algorithmic approach: custom algorithms designed for mining processes by individual authors (van der Aalst and Song, 2004), for example Petri-net-based approaches.
  - Markovian approach: an algorithm that examines past and future behaviour to define a potential current state (Cook and Wolf, 1998a).
  - Neural network: models the human mind in its ability to learn and then identify patterns in data (Cook and Wolf, 1998a).
  - Cluster analysis: divides a group of solutions into homogeneous subgroups (Schimm, 2004).
11. Effectiveness of Process Mining
- Using process mining, typical manager questions that can be answered include:
  - What is the most frequent path in a process?
  - To what extent do the cases comply with a process model?
  - What are the routing probabilities in a process?
  - What are the throughput times of cases?
  - What are the service times for tasks?
  - When will a case be completed?
  - How much time was spent between any two tasks in a process?
  - What are the business rules in a process, and are they being obeyed?
  - How many people are typically involved in a case?
  - Which people are central in an organization?
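Two of the questions above (the most frequent path and the throughput time per case) can be answered with a few lines of Python once the log is in memory. The sketch below is only an illustration under stated assumptions: the event tuples, the timestamp format and the helper names are hypothetical and are not taken from ProM.

```python
from collections import Counter
from datetime import datetime

# Hypothetical event log: one (case id, task, timestamp) tuple per event.
events = [
    ("c1", "register", "2009-09-01 09:00"),
    ("c1", "check", "2009-09-01 10:30"),
    ("c1", "approve", "2009-09-01 12:00"),
    ("c2", "register", "2009-09-01 09:15"),
    ("c2", "check", "2009-09-01 11:15"),
    ("c2", "reject", "2009-09-01 11:45"),
    ("c3", "register", "2009-09-02 08:00"),
    ("c3", "check", "2009-09-02 08:30"),
    ("c3", "approve", "2009-09-02 09:00"),
]


def group_by_case(events):
    """Return {case id: [(timestamp, task), ...]} sorted by timestamp."""
    cases = {}
    for case_id, task, stamp in events:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        cases.setdefault(case_id, []).append((when, task))
    for entries in cases.values():
        entries.sort()
    return cases


def most_frequent_path(cases):
    """Return the most common task sequence and how many cases follow it."""
    paths = Counter(tuple(task for _, task in entries) for entries in cases.values())
    return paths.most_common(1)[0]


def throughput_hours(cases):
    """Return the time from first to last event of each case, in hours."""
    return {
        case_id: (entries[-1][0] - entries[0][0]).total_seconds() / 3600
        for case_id, entries in cases.items()
    }


cases = group_by_case(events)
path, frequency = most_frequent_path(cases)
durations = throughput_hours(cases)
print(path, frequency)
print(durations)
```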
12. A Process Mining Tool: ProM
- ProM (Process Mining) is a generic open-source framework for implementing process mining tools in a standard environment.
- It is an extensible framework that supports a wide variety of process mining techniques in the form of plug-ins.
- It is platform independent, as it is implemented in Java.
- The ProM framework receives as input logs in the Mining XML (MXML) format.
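To give a feel for what an MXML log looks like, the following sketch reads a tiny hand-written fragment with Python's standard `xml.etree.ElementTree` module. The element names (`WorkflowLog`, `Process`, `ProcessInstance`, `AuditTrailEntry`, `WorkflowModelElement`, `EventType`, `Timestamp`, `Originator`) are given as the MXML schema is commonly described and should be checked against the schema shipped with ProM. The sample data and the function `read_mxml` are hypothetical, and this is not how ProM itself loads logs.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; element names are assumed to follow the MXML schema.
MXML_SAMPLE = """<WorkflowLog>
  <Process id="order_handling">
    <ProcessInstance id="case_1">
      <AuditTrailEntry>
        <WorkflowModelElement>register</WorkflowModelElement>
        <EventType>complete</EventType>
        <Timestamp>2009-09-01T09:00:00.000+10:00</Timestamp>
        <Originator>alice</Originator>
      </AuditTrailEntry>
      <AuditTrailEntry>
        <WorkflowModelElement>check</WorkflowModelElement>
        <EventType>complete</EventType>
        <Timestamp>2009-09-01T10:30:00.000+10:00</Timestamp>
        <Originator>bob</Originator>
      </AuditTrailEntry>
    </ProcessInstance>
  </Process>
</WorkflowLog>"""


def read_mxml(text):
    """Return {(process id, case id): [event dict, ...]} from MXML text."""
    root = ET.fromstring(text)
    traces = {}
    for process in root.iter("Process"):
        for instance in process.iter("ProcessInstance"):
            entries = []
            for entry in instance.iter("AuditTrailEntry"):
                entries.append({
                    "task": entry.findtext("WorkflowModelElement"),
                    "event_type": entry.findtext("EventType"),
                    "timestamp": entry.findtext("Timestamp"),
                    "originator": entry.findtext("Originator"),
                })
            traces[(process.get("id"), instance.get("id"))] = entries
    return traces


traces = read_mxml(MXML_SAMPLE)
for key, entries in traces.items():
    print(key, [(e["task"], e["originator"]) for e in entries])
```

Each audit trail entry carries the answers to "who did what, when": the originator, the task and the timestamp.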
13. Mining Plugins
- There are mining plugins, such as:
  - Plugins supporting control-flow mining techniques (such as the Alpha algorithm, Genetic mining, Multi-phase mining, ...)
  - Plugins analysing the organizational perspective (such as the Social Network miner, the Staff Assignment miner, ...)
  - Plugins dealing with the data perspective (such as the Decision miner, ...)
  - Plugins for mining less-structured, flexible processes (such as the Fuzzy Miner)
  - Elaborate data visualization plugins (such as the Cloud Chamber Miner)
- Furthermore, there are analysis plugins dealing with:
  - The verification of process models (e.g., Woflan analysis)
  - Verification of Linear Temporal Logic (LTL) formulas on a log
  - Checking the conformance between a given process model and a log
  - Performance analysis (basic statistical analysis, and performance analysis with a given process model)
14. An Overview of Process Mining in ProM
15. Petri Net
- It is one of several mathematical modelling languages for the description of discrete distributed systems.
- A Petri net is a directed bipartite graph in which the nodes are transitions (i.e. discrete events that may occur) and places (i.e. conditions), connected by directed arcs (which describe which places are pre- and/or post-conditions for which transitions).
Figure: Example of a bipartite graph
- Petri nets were invented in August 1939 by Carl Adam Petri at the age of 13.
16. Petri Net as Graphs
In Petri nets, nodes of the first subset of vertices are called places, and nodes of the second are called transitions.
- Places usually model resources or partial states of the system. The symbol of a place is a circle or an ellipse.
- Transitions model state transitions and synchronization. The symbol of a transition is a solid bar or a rectangle.
- The edges of the graph are called arcs.
Tokens
- Tokens are denoted by solid dots and can be placed inside the place symbol.
- They indicate the presence or absence of, for example, a resource.
- Places can hold any number of tokens or only a limited number (capacitated places).
17. Petri Net as Graphs (cont.)
- Transition (firing) rule:
  - A transition t is enabled if each input place p has at least w(p, t) tokens.
  - An enabled transition may or may not fire.
  - Firing an enabled transition t removes w(p, t) tokens from each input place p, and adds w(t, p') tokens to each output place p'.
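The firing rule above can be sketched in a few lines of Python. The class below is a minimal illustration, assuming places without capacity limits; the class name, the dictionary layout and the initial marking are hypothetical choices. The net models the water formation reaction 2H2 + O2 → 2H2O with a single transition.

```python
class PetriNet:
    """Minimal place/transition net with weighted arcs and unbounded places."""

    def __init__(self, inputs, outputs, marking):
        self.inputs = inputs      # inputs[t][p] = w(p, t), arc from place p to t
        self.outputs = outputs    # outputs[t][p] = w(t, p), arc from t to place p
        self.marking = dict(marking)  # current number of tokens per place

    def is_enabled(self, t):
        """t is enabled if every input place p holds at least w(p, t) tokens."""
        return all(self.marking.get(p, 0) >= w for p, w in self.inputs[t].items())

    def fire(self, t):
        """Remove w(p, t) tokens from each input place, add w(t, p') to each output."""
        if not self.is_enabled(t):
            raise ValueError(f"transition {t!r} is not enabled")
        for p, w in self.inputs[t].items():
            self.marking[p] -= w
        for p, w in self.outputs[t].items():
            self.marking[p] = self.marking.get(p, 0) + w


# Assumed initial marking: two H2 tokens, two O2 tokens, no H2O tokens.
net = PetriNet(
    inputs={"react": {"H2": 2, "O2": 1}},
    outputs={"react": {"H2O": 2}},
    marking={"H2": 2, "O2": 2, "H2O": 0},
)

enabled_before = net.is_enabled("react")
net.fire("react")
enabled_after = net.is_enabled("react")
print(net.marking, enabled_before, enabled_after)
```

After one firing the H2 place is empty, so the transition is no longer enabled even though one O2 token remains.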
18. Petri Net as Graphs (cont.)
Firing Example
2H2 + O2 → 2H2O
Figure: the net before firing (starting graph) and after firing
19. Petri Net in ProM
- The type of data in an event log determines which perspectives of process mining can be discovered.
- ProM is used for mining control-flow from event logs.
- If the log (i) provides the tasks that are executed in the process and (ii) makes it possible to infer their order of execution and to link these tasks to individual cases (or process instances), then the control-flow perspective can be mined.
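As an illustration of why tasks, their order and their case are enough for control-flow mining, the sketch below derives the log-based ordering relations that the Alpha algorithm starts from (causality, parallelism, no relation). It covers only this first step, not the construction of the Petri net, and it is not ProM code; the event list and function names are hypothetical.

```python
from itertools import product

# Hypothetical log in recorded order: (case id, task). Cases are interleaved.
events = [
    ("case1", "A"), ("case2", "A"), ("case1", "B"), ("case2", "C"),
    ("case3", "A"), ("case1", "C"), ("case2", "B"), ("case3", "E"),
    ("case1", "D"), ("case2", "D"), ("case3", "D"),
]


def traces_by_case(events):
    """Link each task to its case, keeping the order of execution."""
    traces = {}
    for case_id, task in events:
        traces.setdefault(case_id, []).append(task)
    return traces


def ordering_relations(traces):
    """Classify every ordered pair of tasks.

    "->" a is directly followed by b, never the reverse (causality)
    "<-" the reverse of causality
    "||" both orders are observed (parallelism)
    "#"  neither order is observed
    """
    tasks = sorted({task for trace in traces.values() for task in trace})
    follows = {
        (a, b) for trace in traces.values() for a, b in zip(trace, trace[1:])
    }
    relations = {}
    for a, b in product(tasks, repeat=2):
        forward, backward = (a, b) in follows, (b, a) in follows
        if forward and backward:
            relations[(a, b)] = "||"
        elif forward:
            relations[(a, b)] = "->"
        elif backward:
            relations[(a, b)] = "<-"
        else:
            relations[(a, b)] = "#"
    return tasks, relations


traces = traces_by_case(events)
tasks, relations = ordering_relations(traces)
for a in tasks:
    print(a, " ".join(f"{relations[(a, b)]:>2}" for b in tasks))
```

The printed table is the so-called footprint of the log: here B and C appear in both orders and are therefore treated as parallel, while E is an alternative to them.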
20. An Example of Petri Net in ProM
Figure: Petri net illustrating the control-flow perspective that can be mined from the event log
21. Cleaning the Log
- To get better results when mining knowledge from event logs, the log should be cleaned before mining.
- In ProM, a log can be filtered by applying the provided Log Filter.
- There are five log filters: Processes, Event types, Start events, End events, and Events.
  - The Processes log filter is used to select which processes should be taken into account when running a process mining algorithm. Note that a log may contain one or more process types.
  - The Event types log filter allows us to select the types of events (or tasks) that we want to consider while mining the log.
  - The Start events filter keeps only the traces (or cases) that start with the indicated tasks.
  - The End events filter works in a similar way, but the filtering is done with respect to the final tasks in the log trace.
  - The Events filter is used to set which events to keep in the log.
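The effect of three of these filters can be mimicked in plain Python. The functions below are a sketch of the filtering logic only; they are not ProM's Log Filter API, and the function names, the trace layout and the sample log are hypothetical.

```python
def filter_event_types(traces, keep_types):
    """Keep only events whose event type is in keep_types."""
    return {
        case: [(task, etype) for task, etype in trace if etype in keep_types]
        for case, trace in traces.items()
    }


def filter_start_events(traces, start_tasks):
    """Keep only traces whose first task is one of start_tasks."""
    return {
        case: trace for case, trace in traces.items()
        if trace and trace[0][0] in start_tasks
    }


def filter_end_events(traces, end_tasks):
    """Keep only traces whose final task is one of end_tasks."""
    return {
        case: trace for case, trace in traces.items()
        if trace and trace[-1][0] in end_tasks
    }


# Hypothetical log: each trace is a list of (task, event type) pairs.
log = {
    "case1": [("register", "start"), ("register", "complete"),
              ("check", "complete"), ("archive", "complete")],
    "case2": [("check", "complete"), ("archive", "complete")],   # beginning missing
    "case3": [("register", "complete"), ("check", "complete")],  # still running
}

cleaned = filter_event_types(log, {"complete"})
cleaned = filter_start_events(cleaned, {"register"})
cleaned = filter_end_events(cleaned, {"archive"})
print(cleaned)
```

Applying the start and end filters together is a common way to drop incomplete cases before mining, since such cases would otherwise add misleading start and end points to the mined model.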
22. The Examples of Effectiveness of ProM (1)
- To mine the control-flow of a process from an event log.
23. The Examples of Effectiveness of ProM (2)
- To mine organization-related information about a process.
- It can help to answer questions regarding the social (organizational) aspect of an organization, such as:
  1. How many people are involved in a specific case?
  2. What is the communication structure, and what are the dependencies among people?
  3. How many transfers happen from one role to another role?
  4. Who are the important people in the communication flow (the most frequent flow)?
  5. Who subcontracts work to whom?
  6. Who works on the same tasks?
- These and other related questions can be answered by using the mining plug-ins Social Network Miner and Organizational Miner, and the analysis plug-in Analyze Social Network.
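A rough idea of how such questions are answered from a log: if every event records who performed it, transfers of work between people can be counted directly. The sketch below is a simplified handover-of-work count and is not the implementation of the Social Network Miner; the sample log and function names are hypothetical.

```python
from collections import Counter

# Hypothetical log: each case is an ordered list of (task, performer) pairs.
cases = {
    "case1": [("register", "alice"), ("check", "bob"), ("approve", "carol")],
    "case2": [("register", "alice"), ("check", "bob"), ("reject", "bob")],
    "case3": [("register", "dave"), ("check", "bob"), ("approve", "carol")],
}


def handover_of_work(cases):
    """Count how often work passes directly from one person to another."""
    handovers = Counter()
    for trace in cases.values():
        performers = [who for _, who in trace]
        for giver, receiver in zip(performers, performers[1:]):
            if giver != receiver:  # consecutive tasks by one person are no handover
                handovers[(giver, receiver)] += 1
    return handovers


def people_per_case(cases):
    """Count the distinct performers involved in each case."""
    return {case: len({who for _, who in trace}) for case, trace in cases.items()}


def performers_per_task(cases):
    """Group performers by task, showing who works on the same tasks."""
    result = {}
    for trace in cases.values():
        for task, who in trace:
            result.setdefault(task, set()).add(who)
    return result


handovers = handover_of_work(cases)
involved = people_per_case(cases)
same_task = performers_per_task(cases)
print(handovers.most_common())
print(involved)
print(same_task)
```

The handover counts can be read as weighted edges of a social network in which people are the nodes; in this sample every case passes through bob, which makes him the central node.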
24. An Example of Analyze Social Network
- A social network is a description of the social structure between actors, mostly individuals or organizations.
- It indicates the ways in which they are connected through various social familiarities, ranging from casual acquaintance to close familial bonds.
25. An Example of Organizational Miner
26. Evaluation of Mining Techniques in ProM
- ProM uses the same evaluation measures that are often used in the information retrieval area: precision and recall.
- Recall is the percentage of all relevant documents that are found by a search.
- Precision is the percentage of retrieved documents that are relevant.
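The two definitions above translate directly into code. The sketch below uses the information retrieval setting (sets of document identifiers); how ProM maps these measures onto mined models is not covered here. The function name and sample sets are hypothetical, and returning 0.0 for an empty set is a convention chosen for this sketch.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # documents that are both retrieved and relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search result: 4 documents retrieved, 3 documents relevant,
# 2 of the retrieved documents are relevant.
precision, recall = precision_recall(
    retrieved={"d1", "d2", "d3", "d4"},
    relevant={"d2", "d4", "d5"},
)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

Here precision is 2/4 and recall is 2/3: half of what was retrieved is relevant, and two thirds of what is relevant was found.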
27. Thank you