Title: Asim Ansari
1E-Customization
2Introduction
- Customization key to managing relationships
- Marketing
- Targeted Promotions
- List Segmentation
- Conjoint Analysis
- Recommendation Systems
- Computer Science
- Collaborative filtering
- Machine learning
3Customization and Electronic Media
- Electronic media facilitate customization
- Low production costs
- Timely data (received) and information (sent)
- Personalizable
- Reach
4Customization Benefits
- Content Providers
- Increasing site usage via customization can increase advertising revenue
- Internet advertising is forecast to grow rapidly
- E-commerce
- Increasing sales via customization
5E-Customization Contexts
- Content providers can customize
- content (editorial)
- design (how many links and what order)
- to increase site visits, advertising revenue and loyalty.
- E-commerce firms can customize
- content (products, price, incentives, etc.) and
- design (how many items and what order)
- to increase sales and loyalty.
- The structure of the problem is identical.
6E-Customization Strategies
- Two customization Strategies
- Onsite
- External e-mails
- Customizable at low-cost
- Need not wait for customers to come to site
- We take an external customization approach
7E-mail Example
8E-mail Marketing Volume Growth
[Chart: E-mails (billions), 0 to 250, by year, 1999 to 2004; series: Email acquisition services, Email retention services. Source: Forrester Report, Email Marketing Dialog, January 2000]
9E-mail Marketing Services Revenue Growth
[Chart: Revenues (billions), 0 to 5, by year, 1999 to 2004; series: Email acquisition services, Email retention services. Source: Forrester Report, Email Marketing Dialog, January 2000]
10Email Design Problem
- Determine the content and layout of the e-mail on a one-on-one basis
- Example content sections: Sports, International News, National News, Weather, Arts
11Approach
[Diagram with boxes: E-mail Configuration, Click-through Data, Statistical Model, Individual-level preference coefficients, Optimization, New E-mail Configuration]
12Statistical Model
- Probability of clicking on a link depends upon the utility of clicking
- Utility of clicking on a link = f(
- observed e-mail variables (html, links),
- observed link variables (content and order of link),
- unobserved user effect,
- unobserved e-mail effect,
- unobserved link effect,
- error)
13Probit Model
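$$
\begin{aligned}
U_{ijk} ={}& \mu_1 + \mu_2\,\mathrm{Text}_j + \mu_3\,\mathrm{NumItems}_j + \mu_4\,\mathrm{Position}_{jk} + \mu_5\,\mathrm{Content}_k && \text{(population component)} \\
&+ \lambda_{i1} + \lambda_{i2}\,\mathrm{NumItems}_j + \lambda_{i3}\,\mathrm{Position}_{jk} + \lambda_{i4}\,\mathrm{Content}_k && \text{(random across individuals)} \\
&+ \theta_{j1} + \theta_{j2}\,\mathrm{Position}_{jk} + \theta_{j3}\,\mathrm{Content}_k && \text{(random across e-mails)} \\
&+ \gamma_{k1} \\
&+ \varepsilon_{ijk}
\end{aligned}
$$
- $i$ is person, $j$ is e-mail and $k$ is link.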
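As a minimal sketch of how such a utility turns into a click probability, the snippet below assumes a standard probit link: a click occurs when the utility is positive and the error term is standard normal, so the probability is the normal CDF of the non-error part. The function and argument names are hypothetical, and Content is treated as a single numeric indicator purely for illustration.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def click_probability(mu, lam_i, theta_j, gamma_k,
                      text_j, num_items_j, position_jk, content_k):
    """P(click) for person i, e-mail j, link k under an assumed probit link.

    mu      : 5 population coefficients (intercept, Text, NumItems, Position, Content)
    lam_i   : 4 person effects (intercept, NumItems, Position, Content)
    theta_j : 3 e-mail effects (intercept, Position, Content)
    gamma_k : scalar link effect
    """
    # Population component
    u = (mu[0] + mu[1] * text_j + mu[2] * num_items_j
         + mu[3] * position_jk + mu[4] * content_k)
    # Component that varies across individuals
    u += (lam_i[0] + lam_i[1] * num_items_j
          + lam_i[2] * position_jk + lam_i[3] * content_k)
    # Component that varies across e-mails
    u += theta_j[0] + theta_j[1] * position_jk + theta_j[2] * content_k
    # Link effect
    u += gamma_k
    # Assumption: error ~ N(0, 1), click when utility > 0
    return std_normal_cdf(u)
```

With all random effects set to zero the function returns the population-level click probability; nonzero person, e-mail, and link effects shift that probability for a specific recipient and message.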
14Modeling Heterogeneity
- Random effects are assumed to come from a population distribution with zero mean
- $\lambda_i \sim G_1$
- $\theta_j \sim G_2$
- $\gamma_k \sim G_3$
15Modeling Heterogeneity
16Modeling Heterogeneity Dirichlet Process Priors
- Dirichlet Process Priors can be used to model the uncertainty about the functional form of the population distribution G
- Allows semi-parametric estimation of random effects
17Dirichlet Process Priors
- A Dirichlet Process prior for a distribution G has two parameters
- A distribution function $G_0(\cdot)$ and
- A positive scalar precision parameter $\alpha$
- We write
- $G \sim D(G_0, \alpha)$
- where $G_0$ represents the expected value of G and $\alpha > 0$ represents the strength of prior beliefs that sampled distributions G will be close to $G_0$
18Dirichlet Process Priors
- Let G be a random distribution from the Dirichlet Process,
- Let then,
19Dirichlet Process Role of $\alpha$
- Large $\alpha$
- Large number of distinct values from the base distribution
- Sampled distribution approximates base distribution
- Small $\alpha$
- Sample will have a small number of distinct values
- Sampled distribution approximates a finite mixture
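The effect of the precision parameter can be illustrated with the Polya urn (Blackwell-MacQueen) representation of the Dirichlet Process, which draws values with G integrated out. This is a sketch under the assumption of a standard normal base distribution; the function names are hypothetical and not taken from the slides.

```python
import random


def polya_urn_sample(n, alpha, base_draw, rng):
    """Draw n values whose marginal law is that of draws from G ~ DP(G0, alpha)."""
    values = []
    for i in range(n):
        # With probability alpha / (alpha + i), draw a fresh value from G0;
        # otherwise repeat one of the i earlier draws chosen uniformly.
        if rng.random() < alpha / (alpha + i):
            values.append(base_draw(rng))
        else:
            values.append(rng.choice(values))
    return values


def count_distinct(n, alpha, seed=0):
    """Number of distinct values among n urn draws (base: standard normal, an assumption)."""
    rng = random.Random(seed)
    draws = polya_urn_sample(n, alpha, lambda r: r.gauss(0.0, 1.0), rng)
    return len(set(draws))
```

Calling `count_distinct(1000, 0.5)` and `count_distinct(1000, 500)` shows the contrast: a small precision yields a handful of clusters, as in a finite mixture, while a large precision yields many distinct values that trace out the base distribution.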
20Dirichlet Process Priors Advantages
- Accommodates non-normality, multi-modality and skewness
- Provides a semi-parametric alternative to the normal distribution
- Provides accurate individual-level estimates
- Allows a synthesis of Finite Mixtures and Normal Heterogeneity
21Modeling Heterogeneity
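- $\lambda_i \sim G_1, \quad G_1 \sim D(N(0, \Lambda), \alpha_1)$
- $\theta_j \sim G_2, \quad G_2 \sim D(N(0, \Theta), \alpha_2)$
- $\gamma_k \sim G_3, \quad G_3 \sim D(N(0, \tau), \alpha_3)$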
22Inference
- Bayesian Inference
- Priors
- $\mu \sim$ Multivariate Normal
- $\Lambda^{-1} \sim$ Wishart
- $\Theta_{ll} \sim$ Inverse Gamma
- $\tau \sim$ Inverse Gamma
- $\alpha_1, \alpha_2, \alpha_3 \sim$ Gamma
23Sampling Based Inference
- Joint Posterior Density is very complex and cannot be summarized in closed form
- Sampling Based Inference
- Gibbs Sampling
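To make the idea of Gibbs sampling concrete, here is a deliberately reduced sketch: a probit model with only an intercept, sampled by alternating between latent utilities and the intercept (a data-augmentation scheme commonly used for probit models). It is not the sampler for the full model in these slides; the flat prior, the function names, and the default iteration counts are all assumptions made for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def draw_truncated_normal(mean, positive, rng):
    """Draw z ~ N(mean, 1) restricted to z > 0 (positive=True) or z <= 0, by inverse CDF."""
    p_neg = STD_NORMAL.cdf(-mean)                    # P(z <= 0)
    lo, hi = (p_neg, 1.0) if positive else (0.0, p_neg)
    u = lo + (hi - lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)              # keep inv_cdf away from 0 and 1
    return mean + STD_NORMAL.inv_cdf(u)


def gibbs_probit_intercept(clicks, n_iter=400, burn_in=100, seed=0):
    """Posterior mean of the intercept in an intercept-only probit model."""
    rng = random.Random(seed)
    n = len(clicks)
    mu = 0.0
    kept = []
    for it in range(n_iter):
        # Full conditional 1: latent utilities given mu and the observed clicks
        z = [draw_truncated_normal(mu, y == 1, rng) for y in clicks]
        # Full conditional 2: mu given the utilities (flat prior => N(mean(z), 1/n))
        mu = rng.gauss(sum(z) / n, (1.0 / n) ** 0.5)
        if it >= burn_in:
            kept.append(mu)
    return sum(kept) / len(kept)
```

Each pass draws every unknown from its distribution conditional on the current values of the others; the full model adds analogous steps for the person, e-mail, and link effects and their hyperparameters.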
24Full Conditionals
- Unknowns include
- $u, \mu, \lambda_i, \theta_j, \gamma_k, \Lambda, \Theta, \tau, \alpha_1, \alpha_2, \alpha_3$
- Full conditionals for DP mixed model are very
similar to those for normal population
distributions
25Full Conditionals for Individual-level parameters DP model
- $G_b$ is the posterior distribution under the normal base distribution
- This is akin to collaborative filtering on parameter space
26Application
- Large content provider with many areas in site
- One area in the site sends e-mails to registered recipients in an effort to attract them to the area
- Permission marketing
- Design targeting issues
- Number of links, order of links, text or html
- Content targeting issues
- Content type (health, financial, etc.)
27Data
- Three months of e-mails, 1048 users
- E-mail file: e-mail date, number of links, order of links, link content, html or text
- User file: when received, by whom (registration data), which links clicked (cookies)
- Sample: 11,475 observations
- 7% response rate for links
- 36% click on more than one link
28Models
- No heterogeneity
- Person heterogeneity
- Person, E-mail and Link heterogeneity (Full Model)
29Predictive Ability
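Classification table (actual vs. predicted):
- Actual no click: predicted no click = a, predicted click = b (False Positives)
- Actual click: predicted no click = c (False Negatives), predicted click = d
- False Negative Fraction = c/(c+d), False Positive Fraction = b/(a+b)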
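The two fractions can be computed for any cut-off on the predicted click probability, and sweeping the cut-off traces out an ROC curve. The sketch below reads the fractions as c/(c+d) and b/(a+b), with a, b, c, d denoting true negatives, false positives, false negatives, and true positives; that cell labeling and the function names are assumptions made for illustration.

```python
def error_fractions(actual, predicted_prob, threshold):
    """Return (false negative fraction, false positive fraction) at one cut-off."""
    a = b = c = d = 0
    for y, p in zip(actual, predicted_prob):
        predicted_click = p >= threshold
        if y == 0 and not predicted_click:
            a += 1          # true negative
        elif y == 0 and predicted_click:
            b += 1          # false positive
        elif y == 1 and not predicted_click:
            c += 1          # false negative
        else:
            d += 1          # true positive
    fnf = c / (c + d) if (c + d) else 0.0
    fpf = b / (a + b) if (a + b) else 0.0
    return fnf, fpf


def roc_points(actual, predicted_prob, thresholds):
    """List of (false positive fraction, true positive fraction) pairs, one per cut-off."""
    points = []
    for t in thresholds:
        fnf, fpf = error_fractions(actual, predicted_prob, t)
        points.append((fpf, 1.0 - fnf))
    return points
```

A model with better predictive ability produces points closer to the top-left corner, that is, a high true positive fraction at a low false positive fraction.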
30Predictive Ability Link Level ROC Curves
[Figure: link-level ROC curves; vertical axis True Positive Fraction (1 - FNF), horizontal axis False Positive Fraction]
31Predictive Ability Email Level ROC Curves
[Figure: e-mail-level ROC curves; vertical axis True Positive Fraction (1 - FNF), horizontal axis False Positive Fraction]
32Results - Parameter Estimates Full Model
- Parameter: Value (Prob($\mu < 0$))
- Design Variables
- Intercept ($\mu_0$): -1.47 (1.0)
- Person Random Effects (Std. $\lambda_{0i}$): 0.51
- E-mail Random Effects (Std. $\theta_{0j}$): 0.45
- Link Random Effects (Std. $\gamma_{0k}$): 0.21
- E-mail Type ($\mu_1$): 0.29 (0.48)
- Link Order ($\mu_2$): -0.37 (1.0)
- Person Random Effects (Std. $\lambda_{2i}$): 0.49
- E-mail Random Effects (Std. $\theta_{2j}$): 0.22
- Number of Links ($\mu_3$): -0.02 (0.55)
- Person Random Effects (Std. $\lambda_{3i}$): 0.18
33Parameter Estimates
Dirichlet Process precision parameters
- User: $\alpha_1 = 103 \Rightarrow 61$ clusters
- Email: $\alpha_2 = 114 \Rightarrow 65$ clusters
- Links: $\alpha_3 = 383 \Rightarrow 383$ clusters
34Link Level Predictions - Calibration Data
35Link Level Predictions - Validation Data
36E-mail Level Prediction - Calibration Data
37E-mail Level Predictions - Validation Data
38Optimization Model Overview
- Editorial content is fixed on a given day.
- n links available for k positions, $n \geq k$
- How many links to include, what content to include, and how should it be ordered?
- Objective
- Maximize the expected number of click-backs to the site
- Maximize the likelihood of returning to the site
39Optimization Procedures
- Alternative 1 Complete Enumeration
- With many links, computational constraints
- Alternative 2 Assignment Algorithm
40Optimization Objective Function
- Maximize expected number of click-throughs to site
- Let $x_{ij} = 1$ if link i is in position j
- Let $p_{ij}$ be the probability of click-through if link i is in position j
- Maximize Objective function
- Maximize likelihood of at least one click-through
- Minimize Objective function
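One way to write the two objectives with the definitions above is sketched here. The second form assumes that clicks on different links are independent given the configuration, which is an assumption for this sketch rather than something stated on the slide.

```latex
% Expected number of click-throughs
\max_{x} \; \sum_{i}\sum_{j} p_{ij}\, x_{ij}

% At least one click-through: minimize the probability of no click
\min_{x} \; \prod_{i}\prod_{j} (1 - p_{ij})^{x_{ij}}
\;\Longleftrightarrow\;
\min_{x} \; \sum_{i}\sum_{j} x_{ij}\,\ln(1 - p_{ij})
```

Both forms are linear in the indicators $x_{ij}$ (the second after taking logs), which is what makes an assignment formulation possible.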
41Optimization Model
- Step 1
- Maximize Obj (x p(x)k)
- Subject to
- Assignment algorithm provides exact solution
- Step 2
- Maximize over $k = 1, \ldots, n$.
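The two-step search can be sketched in a few lines for small problems. Note that this sketch uses complete enumeration for the inner step rather than an assignment algorithm, so it is only practical for a handful of links. The `click_prob(i, j, k)` callback is a hypothetical interface that returns the click probability of link i in position j of a k-link e-mail, and the at-least-one-click objective assumes independent clicks.

```python
import itertools
import math


def best_layout(click_prob, n_links, objective="at_least_one"):
    """Return (objective value, layout), where layout[j] is the link placed in position j."""
    best_value, best_links = -math.inf, None
    for k in range(1, n_links + 1):                                   # outer step: e-mail length k
        for layout in itertools.permutations(range(n_links), k):      # inner step: all assignments
            probs = [click_prob(i, j, k) for j, i in enumerate(layout)]
            if objective == "expected_clicks":
                value = sum(probs)
            else:
                # Probability of at least one click, assuming independent clicks
                value = 1.0 - math.prod(1.0 - p for p in probs)
            if value > best_value:
                best_value, best_links = value, layout
    return best_value, best_links
```

Because the click probability may depend on k, a recipient who dislikes clutter can end up with a shorter e-mail even when more links are available.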
42Heuristic Approaches
- Original - No change in content or order
- Greedy - No change in content, order highest utility first
- Order - No change in content, optimize order
- Optimal - Optimize content (number of links) and order (our procedure)
43Optimization Results
44Optimization Results
- Objective: At Least One Click
- Optimal leads to 56% increase in at least one click.
- Re-ordering gives 52% improvement, content selection is the balance.
- Optimal improves over Order for 43% of e-mails (those adverse to clutter).
- Greedy and Order are similar; however, for users who have a high positive effect for order (scroll to bottom), Greedy does poorly (one user went from 81% to 43%).
- Objective: Expected Number of Clicks
- Similar results
45Optimization Results
46Conclusions
- Modeling link response
- Varies with content (information) and design (how much, what order)
- Heterogeneity in persons, links, and e-mails
- E-targeting
- Potential to considerably enhance clicks (and presumably advertising revenue and loyalty)
- Our approach can be applied to both internal and external targeting strategies
- Our approach can also be applied to e-tailing
47Future
- Targeting
- Products and services for purchases
- Advertising
- E-grocers (features, displays, prices)
- How much is a feature worth?
- Other areas
- On-line choice processes
- Agent queries
48Dirichlet Process Moments
- $E[G(B)] = G_0(B)$
- and $\mathrm{Var}[G(B)] = G_0(B)\,(1 - G_0(B)) / (\alpha + 1)$
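These moments can be checked by simulation. For a two-set partition, a set B and its complement, the mass G(B) follows a Beta distribution with parameters $\alpha G_0(B)$ and $\alpha (1 - G_0(B))$, whose variance is $G_0(B)(1 - G_0(B))/(\alpha + 1)$. The sketch below draws from that Beta distribution; the function name and default sample size are assumptions made for illustration.

```python
import random


def simulate_dp_mass(alpha, g0_b, n_draws=20000, seed=0):
    """Sample mean and variance of G(B) ~ Beta(alpha * G0(B), alpha * (1 - G0(B)))."""
    rng = random.Random(seed)
    draws = [rng.betavariate(alpha * g0_b, alpha * (1.0 - g0_b))
             for _ in range(n_draws)]
    mean = sum(draws) / n_draws
    var = sum((d - mean) ** 2 for d in draws) / (n_draws - 1)
    return mean, var
```

For example, with a precision of 4 and a base mass of 0.3, the theoretical values are a mean of 0.3 and a variance of 0.21 / 5 = 0.042. Larger precision shrinks the variance, so sampled distributions stay closer to the base distribution.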
49Full Conditionals for Individual Level Model Normal Heterogeneity
- Standard Case (Simple Model)
50Dirichlet Process Priors
- A c.d.f. G on $\Theta$ follows a Dirichlet Process if, for any measurable finite partition $(B_1, B_2, \ldots, B_m)$ of $\Theta$, the joint distribution of the random variables
- $(G(B_1), G(B_2), \ldots, G(B_m))$ is
- $\mathrm{Dirichlet}(\alpha G_0(B_1), \ldots, \alpha G_0(B_m))$,
- where $G_0$ is the base distribution and $\alpha$ is the precision parameter