Title: Mining Plans for Customer-Class Transformation
1. Mining Plans for Customer-Class Transformation
- Qiang Yang
- Hong Kong University of Science and Technology
- and
- Hong Cheng
- UIUC, Illinois USA
2. Example 1
- What to do to help Sammy get accepted to postgraduate school?
- Plan 1 (Dylan): improve Rec to above 4 AND GPA to 3.6
- Plan 2 (Steve): improve TOEFL to 610
3. A Marketing Example

| Name  | Salary | Cars | Loan | Signup |
|-------|--------|------|------|--------|
| John  | 80K    | 3    | None | Y      |
| Mary  | 40K    | 1    | 300K | Y      |
| Steve | 40K    | 1    | None | N      |

- Suppose a company is interested in marketing to a group of customers in the Customer table
- In addition, we have a database of past plans

| Plan No. | Action No. | State Before Action (Salary) | Action Taken |
|----------|------------|------------------------------|--------------|
| 1        | 1          | 50K                          | Mail         |
| 1        | 2          | 50K                          | Gift         |
| 1        | 3          |                              |              |
| 2        | 1          | 20K                          | Mail         |

- A candidate plan is: Step 1: Send mails; Step 2: Call home; Step 3: Offer low interest rate
4. A Planning Problem
- Recognize who the (potential) negative-class members are
  - a segmentation problem
- Recommend a near-optimal sequence of actions to help them switch to the positive class
  - planning to achieve goals
- What does near-optimal mean?
  - Cost
  - Probability of success
  - Benefit
  - Utilities
5. Related Work (I): Markov Decision Process (MDP) Model
- An MDP model consists of:
  - a set of environment states S,
  - a set of agent actions A and transition probabilities P(s1, a, s2), and
  - a set of scalar reinforcement (reward) signals R(s, a)
- Aim: a policy π, a mapping from states to actions
- An MDP satisfies the Markov property.
- The aim of MDP is to find a policy to direct the agent's action no matter where the agent is observed to land.
- An optimal action is chosen based on the agent's observed resulting state.
- It is more suitable for direct marketing (Pednault et al. 2002).
6. Policies
- Nonstationary policy
- Stationary policy
  - π: S → A
  - π(s) is the action to take at state s (regardless of time)
  - analogous to a reactive or universal plan
- These assume or have the following properties:
  - full observability
  - history-independence
  - deterministic action choice
7. Learning an Optimal Policy
- Policy
  - π: S × T → A
  - π(s, t) is the action to take at state s with t stages-to-go
- Under policy π, the value of a state is the expected total reward (see the equations sketched below).
- The optimal value function is defined by maximizing this value over actions at each stage.
- Given the optimal value function, the policy is obtained by picking the maximizing action.
- Value iteration -- indirect
  - An iterative algorithm to learn the optimal value function.
  - The optimal policy can be derived from the optimal value function.
- Policy iteration -- direct
  - Manipulates the policy directly.
  - Tries to improve a policy by trying new actions under states.
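For reference, the standard finite-horizon Bellman equations these bullets refer to, written in the S, A, P, R notation of slide 5 (a sketch, not copied verbatim from the slides):

```latex
% Value of state s with t stages-to-go under policy \pi:
V^{\pi}_{t}(s) = R\big(s, \pi(s,t)\big) + \sum_{s'} P\big(s, \pi(s,t), s'\big)\, V^{\pi}_{t-1}(s')

% Optimal value function (value-iteration update):
V^{*}_{t}(s) = \max_{a \in A} \Big[ R(s,a) + \sum_{s'} P(s, a, s')\, V^{*}_{t-1}(s') \Big]

% Policy extracted from the optimal value function:
\pi^{*}(s, t) = \arg\max_{a \in A} \Big[ R(s,a) + \sum_{s'} P(s, a, s')\, V^{*}_{t-1}(s') \Big]
```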
8. Optimal Policy
- π*(s0) = argmax_a [ R(s0, a) + Σ_s' P(s0, a, s') V*(s') ]
9. Related Work (II): Sequence Mining
- Han et al. 99 focused on mining significant patterns of plans in a large plan database using a divide-and-conquer strategy.
- Zaki et al. 98 developed a technique for mining plans which indicate a high incidence of plan failure.
- Discussion
  - These works mainly focused on frequent pattern mining.
  - No definition of plan utility, state transitions or action effects.
  - No composition of multiple segments into a plan.
- Apriori-based
  - AprioriAll, AprioriSome and DynamicSome (Agrawal and Srikant 95)
  - GSP (Srikant and Agrawal 96)
  - PSP (Masseglia et al. 98)
- Lattice-based
  - SPADE (Zaki 01)
- Projection-based
  - FreeSpan (Han et al. 00)
  - PrefixSpan (Pei et al. 01)
10. Formulate as an MDP?
- An MDP approach (e.g., Sun and Sessions 01)
  - First, optimally solve the MDP for all states
  - Then, extract a marketing plan from the state space
- Problem: this approach will not provide an optimal solution
- We show a counterexample next
11. Counter Example
- Actions a1, a2; utilities marked at the leaves.
- Assume all transition probabilities p(s, a, s') = 0.5; Cost(a2) = 2, Cost(a1) = 1.
- Best plan: a1 then a2, Utility = 2.
- MDP plan: a2 then a1, Utility = 1.5.

[Figure: a two-level state tree rooted at S0. Actions a1 and a2 branch from S0 to states S1-S4, and again from each of those to leaves S5-S20; leaf utilities range from 0 to 6.]
12. Finding One Plan
- Objective: convert customers from the negative (-) class to the positive (+) class with the lowest cost
- Plan: a sequence of actions <a1, a2, ..., an>
- Plan cost
- Success threshold: E(Sn) > θ, where
  - E(Sn) is the expected value of the state classification probability p(s) over all terminal states s the plan leads to
  - θ is a user-defined probability threshold
- Length constraint:
  - the number of actions must be at most Max_Step
[Figure: a plan as a chain of states and actions: S0 --a1--> S1 --a2--> ... --an--> Sn]
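As an illustration of the constraints above, here is a minimal Python sketch, assuming an estimated transition model trans[(s, a)] and a classifier probability p_plus[s] for each state; the function and variable names are hypothetical, not from the paper:

```python
def evaluate_plan(plan, s0, trans, p_plus, cost, theta, max_step):
    """Check a candidate plan <a1, ..., an> against the constraints on this slide.

    plan     : list of actions [a1, ..., an]
    s0       : initial state
    trans    : dict (s, a) -> {s': probability}, the estimated transition model
    p_plus   : dict s -> probability the classifier assigns to the positive class (p(s))
    cost     : dict a -> cost of action a
    theta    : user-defined success-probability threshold
    max_step : maximum allowed number of actions
    """
    if len(plan) > max_step:                      # length constraint
        return False, 0.0, 0.0

    # Propagate the state distribution forward through the plan.
    dist = {s0: 1.0}
    for a in plan:
        nxt = {}
        for s, p in dist.items():
            for s2, q in trans.get((s, a), {}).items():
                nxt[s2] = nxt.get(s2, 0.0) + p * q
        dist = nxt

    expected_success = sum(p * p_plus.get(s, 0.0) for s, p in dist.items())  # E(Sn)
    plan_cost = sum(cost[a] for a in plan)
    return expected_success > theta, expected_success, plan_cost
```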
13. Finding All Plans: AUPlan
- Issues
  - MPlan terminates when it finds one plan
  - We wish to find all plans whose utility is greater than a minimum threshold
- We add utility to the Apriori algorithm
  - Still consider frequency
  - If a plan occurs too infrequently, we have low confidence in it.
  - To balance the utility and frequency factors, we define a parameter minSU: plans must have the product of utility and support greater than minSU.
14. Mapping from Planning to Mining

| State0 = S0 | Action1 | Action2 | Action3 |
|-------------|---------|---------|---------|
| S0          | A1      | A2      | A1      |
| S0          | A1      | A3      | A2      |
| S0          | A3      | A1      | An      |

| State0 = S0 | Action1 | State1 | Action2 | State2 |
|-------------|---------|--------|---------|--------|
| S0          | A1      | S1     | A2      | S2     |
| S0          | A1      | S2     | A3      | S1     |
| S0          | A3      | Sk     | A1      | S3     |

- Consider the problem of finding all plans starting from state S0
- We need to consider only action sequences, not states, because the states resulting from actions cannot be observed during plan execution (see the sketch below).
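A minimal sketch of this projection, assuming plan traces are stored as alternating state/action lists (the names are hypothetical):

```python
def project_to_action_sequences(plan_db, start_state="S0"):
    """Map full plan traces (alternating states and actions) to action-only
    sequences, keeping only traces that start from `start_state`.

    plan_db: list of traces, each a list [s0, a1, s1, a2, s2, ...]
    returns: list of action sequences [a1, a2, ...]
    """
    sequences = []
    for trace in plan_db:
        if not trace or trace[0] != start_state:
            continue
        # Actions occupy the odd positions of the trace.
        sequences.append(trace[1::2])
    return sequences

# Example corresponding to the second table above:
plan_db = [
    ["S0", "A1", "S1", "A2", "S2"],
    ["S0", "A1", "S2", "A3", "S1"],
    ["S0", "A3", "Sk", "A1", "S3"],
]
print(project_to_action_sequences(plan_db))
# [['A1', 'A2'], ['A1', 'A3'], ['A3', 'A1']]
```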
15. Defining Utility
- Utility is defined recursively over the states a plan can reach (see the sketch below)
- P and P' are two plans
- If s is the initial state for plan P, the utility of P is defined as the utility of s
- If s is a leaf node, its utility is given directly, since no further actions (and hence no further costs) follow
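The slide's formulas can plausibly be read as the following recursion; this is an assumption, chosen to be consistent with the counter-example on slide 11 (a plan's utility is the expected leaf utility minus the action costs along the way), and V denotes an assumed benefit of a positive-class customer:

```latex
% If action a is taken at a non-leaf state s within plan P:
U(s) = \sum_{s'} P(s, a, s')\, U(s') \;-\; \mathrm{Cost}(a)

% If s is a leaf (terminal) state:
U(s) = p_{+}(s) \cdot V

% Utility of a plan P starting at s_0:
U(P) = U(s_0)
```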
16. AUPlan: Candidate Generation and Pruning
- Utility does not satisfy the anti-monotone property.
- Anti-monotone property for plans: if a shorter plan cannot satisfy the constraint, then no super-plan can satisfy the constraint either.
- Example: consider s0 --(cost 5)--> s1 --(cost 6)--> s2, with U(s1) = 10 and U(s2) = 20.
  - Plan1 (s0 to s1): U(Plan1) = 10 - 5 = 5
  - Plan2 (s0 to s1 to s2): U(Plan2) = 20 - 11 = 9
  - The longer plan has the higher utility, so pruning extensions of low-utility plans would be unsafe.
17. New Pruning Criterion
- Need to design an upper bound on the utility measure to ensure that we never prune high-utility plans
- Utility upper bound:
  - the maximum of Pr(S) over all reachable states S, minus Cost(plan so far)
  - denote this as UtilityUpperBound
- A plan P is pruned if even its UtilityUpperBound cannot meet the minimum threshold
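Spelled out, one reading of this bound and of the pruning test; combining the bound with the minSU criterion from slide 13 is an assumption, not stated explicitly on the slide:

```latex
% Optimistic bound on the utility any extension of plan P can achieve:
\mathrm{UtilityUpperBound}(P) \;=\; \max_{S\ \mathrm{reachable}} \Pr(S) \;-\; \mathrm{Cost}(P_{\mathrm{so\ far}})

% Assumed pruning rule, using the minSU criterion from slide 13:
\text{prune } P \quad \text{if} \quad \mathrm{Support}(P)\cdot \mathrm{UtilityUpperBound}(P) \;<\; \mathrm{minSU}
```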
18. Estimating Utility Upper Bound
19. AUPlan: Candidate Generation and Pruning (Pseudocode)
- For each candidate plan P in Ck:
  - if P satisfies the minSU criterion (support x utility)
    - then
      - insert(Lk, P)
      - for (a in action set)
        - P' = append a to P
        - insert(Ck+1, P')
    - else
      - for (a in action set)
        - P' = append a to P
        - if (the UtilityUpperBound of P' cannot meet the threshold)
          - prune P'
        - else
          - insert(Ck+1, P')
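A minimal Python sketch of one reading of this step: the threshold tests combine the minSU criterion from slide 13 with the UtilityUpperBound pruning from slide 17, and all function names are hypothetical.

```python
def generate_candidates(Ck, actions, support, utility, utility_upper_bound, min_su):
    """One Apriori-style pass: keep qualified plans and extend candidates.

    Ck      : list of candidate plans (each a tuple of actions)
    actions : iterable of all actions
    support, utility, utility_upper_bound : functions plan -> float
    min_su  : minimum support-utility product threshold
    Returns (Lk, Ck_next): qualified length-k plans and length-(k+1) candidates.
    """
    Lk, Ck_next = [], []
    for P in Ck:
        if support(P) * utility(P) >= min_su:
            # P itself qualifies; keep it and extend it.
            Lk.append(P)
            Ck_next.extend(P + (a,) for a in actions)
        else:
            # P does not qualify, but an extension still might, unless even
            # the optimistic upper bound rules it out.  Support is
            # anti-monotone, so support(P) bounds support of any extension.
            for a in actions:
                P2 = P + (a,)
                if support(P) * utility_upper_bound(P2) < min_su:
                    continue                      # prune P2
                Ck_next.append(P2)
    return Lk, Ck_next
```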
20. AUPlan: Adapting the Apriori Algorithm
- Input: a plan database, minSU and maxlength.
- Output: high-utility plans.
- Algorithm (see the sketch below):
  - 1. C1 = length-1 plans
  - 2. K = 1
  - 3. While (K < maxlength)
    - 3.1 Count the support and calculate the utility of each plan P in Ck; keep the qualified plans in Lk.
    - 3.2 Generate Ck+1 from Ck.
    - 3.3 K = K + 1
    - End while
  - 4. L = the union of all Lk
  - 5. For each state s
    - 5.1 For each plan P in L starting with s, calculate its utility.
    - 5.2 Select the plan with the highest utility as the plan starting from state s.
  - 6. Output the plans.
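Putting the steps together, a minimal sketch of the main loop: it reuses the hypothetical generate_candidates sketch from slide 19, and start_state, support, utility and utility_upper_bound are assumed to be computed from the plan database.

```python
def auplan(actions, support, utility, utility_upper_bound, start_state,
           min_su, max_length):
    """Apriori-style mining of high-utility plans (steps 1-6 on this slide).

    Plans are tuples of actions; `start_state(plan)` returns the state a
    plan begins from.
    """
    Ck = [(a,) for a in actions]          # 1. C1 = length-1 plans
    L = []                                # collects all qualified plans (step 4)
    k = 1                                 # 2.
    while k < max_length:                 # 3.
        # 3.1 evaluate plans in Ck and keep Lk, 3.2 build Ck+1
        Lk, Ck = generate_candidates(Ck, actions, support, utility,
                                     utility_upper_bound, min_su)
        L.extend(Lk)
        k += 1                            # 3.3
    best = {}                             # 5. best plan per starting state
    for P in L:
        s = start_state(P)
        if s not in best or utility(P) > utility(best[s]):
            best[s] = P
    return best                           # 6. output plans
```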
21. Experimental Results
- Data generation
  - IBM Synthetic Generator (Quest) to generate a Customer table.
  - We generate a plan database of traces with temporal order.
  - 100,000 records
  - Nine features
  - The positive class has 30K records and the negative has 70K.
  - A classifier is trained using the C4.5 decision tree algorithm; the classifier gives p(s).
- Objective
  - Quality and speed
- Algorithm candidates
  - Optimal: OptPlan
  - Sun and Sessions 01: QPlan
  - MPlan
  - AUPlan
  - APlan, a simplified version of AUPlan
22. Experimental Results (I): Scale Up
- We generate five plan databases with different sizes.

| Plan database | Length limit (max. number of actions) | Database size (MB) | Switching rate (%) |
|---------------|---------------------------------------|--------------------|--------------------|
| Plan DB1      | 5                                     | 2.74               | 20                 |
| Plan DB2      | 9                                     | 4.19               | 40                 |
| Plan DB3      | 14                                    | 5.49               | 60                 |
| Plan DB4      | 29                                    | 6.53               | 80                 |
| Plan DB5      | 100                                   | 7.44               | 100                |
23. Experimental Results (I): Quality (Expected Utility)
- OptPlan has the maximum utility.
- AUPlan achieves about 80% of the optimal solution.
- MPlan and QPlan achieve less than 60% of the optimal solution.

Figure 2. Relative utility of different algorithms vs. different plan databases
24. Experimental Results (I): DB Size vs. CPU Time
- MPlan is the most efficient algorithm.
- AUPlan comes next.
- OptPlan changes little for constant length.

Figure 1. CPU time of different algorithms vs. the size of the plan databases
25. Experimental Results: Comparison with Pure Apriori
- We compare the quality of plans found by APlan and AUPlan.
- The lift chart shows the percentage of utility of APlan and AUPlan vs. the percentage of plans.
- The curve of AUPlan is closer to the upper-left corner than that of APlan.
- This shows that support alone is not a good measure for mining plans.

Figure 5. Lift chart: relative utility of APlan and AUPlan vs. the percentage of the number of plans
26. Conclusions
- Combine both data mining and planning
  - to discover high-utility plans from large plan databases.
- We define states, actions and utilities in the sequence database and go beyond the limits of sequence mining.
- Our experiments show that
  - our plan mining algorithm is more efficient and scalable than MDP-based methods,
  - while the approximate solution quality is about 80% of the optimal solution.
27. Future Work
- Partially observable states
- Improve the efficiency of AUPlan.
- Can we mine plans without candidate generation?
- Relate to Conformant Planning