Chapter 6 slides, Computer Networking, 3rd edition - PowerPoint PPT Presentation

Provided by: JimKurosea346
1
Data Mining and Its Applications
  • Data Mining Techniques: For Marketing, Sales,
    and Customer Support, by Michael J.A. Berry and
    Gordon Linoff, John Wiley & Sons, Inc., 1997.
  • Discovering Data Mining: From Concept to
    Implementation, by Cabena, Hadjinian, Stadler,
    Verhees and Zanasi, Prentice Hall, 1997.
  • Building Data Mining Applications for CRM, by
    Alex Berson, Stephen Smith and Kurt Thearling,
    McGraw-Hill, 1999.
  • Data Mining Cookbook: Modeling Data for
    Marketing, Risk, and Customer Relationship
    Management, by Olivia Parr Rud, John Wiley &
    Sons, Inc., 2001.
  • Mastering Data Mining: The Art and Science of
    Customer Relationship Management, by Michael
    J.A. Berry and Gordon S. Linoff, John Wiley &
    Sons, Inc., 2000.
  • Machine Learning, by Tom M. Mitchell,
    McGraw-Hill, 1997.
  • Data Mining: Concepts and Techniques, by Jiawei
    Han and Micheline Kamber, Morgan Kaufmann, 2001.
  • Introduction to Data Mining, by Pang-Ning Tan,
    Michael Steinbach, and Vipin Kumar, Addison
    Wesley, 2005.
2
Why Mine Data?
  • Lots of data is being collected and warehoused
  • Web data, e-commerce
  • purchases at department/grocery stores
  • Bank/Credit Card transactions
  • Computers have become cheaper and more powerful
  • Competitive Pressure is Strong
  • Provide better, customized services for an edge
    (e.g. in Customer Relationship Management)

3
Mining Large Data Sets - Motivation
  • There is often information hidden in the data.
  • Human analysts may take weeks to discover useful
    information.
  • Much of the data is never analyzed at all.

[Figure: The Data Gap. Total new disk (TB) since 1995 compared with the number of analysts.]
4
What is Data Mining?
  • Many Definitions
  • Non-trivial extraction of implicit, previously
    unknown and potentially useful information from
    data
  • Exploration and analysis, by automatic or
    semi-automatic means, of large quantities of
    data in order to discover meaningful patterns

5
What is (not) Data Mining?
  • What is Data Mining?
  • Certain names are more prevalent in certain US
    locations (O'Brien, O'Rourke, O'Reilly in Boston
    area)
  • Group together similar documents returned by
    search engine according to their context (e.g.
    Amazon rainforest, Amazon.com)
  • What is not Data Mining?
  • Look up phone number in phone directory
  • Query a Web search engine for information about
    Amazon

6
Origins of Data Mining
  • Draws ideas from machine learning/AI, pattern
    recognition, statistics, and database systems
  • Traditional techniques may be unsuitable due to
  • Enormity of data
  • High dimensionality of data
  • Heterogeneous, distributed nature of data

[Figure: Data mining at the intersection of statistics/AI, machine learning/pattern recognition, and database systems.]
7
Data Mining Tasks
  • Prediction Methods
  • Use some variables to predict unknown or future
    values of other variables.
  • Description Methods
  • Find human-interpretable patterns that describe
    the data.

From Fayyad et al., Advances in Knowledge
Discovery and Data Mining, 1996
8
Data Mining Tasks...
  • Classification
  • Clustering
  • Association Rule Discovery
  • Sequential Pattern Discovery

9
The Virtuous Cycle of Data Mining
  • Identify business problems and areas where
    analyzing data can provide value.
  • Act on the information.
  • Measure the results of your efforts to provide
    insight on how to exploit your data.
Taken from a talk given by Michael J.A. Berry on
Data Mining for CRM.
10
Some Typical Business Problems
  • Customer profiling
  • Customer segmentation
  • Direct marketing
  • Customer retention
  • Basket analysis (retail)
  • Cross selling
  • Fraud detection

11
Customer Profiling
  • Question
  • What kinds of customers were profitable in the
    last year?
  • Data
  • Customer details such as Age, Gender, Occupation,
    Salary Levels, Account, etc.
  • Earnings from customers in the last year.
  • Data Mining
  • Divide customers into profitability categories
    according to earnings such as highly profitable,
    profitable, non-profitable, loss.
  • Find rules using data mining techniques.
  • Analyze the rules and take actions.
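
The first data mining step above, dividing customers into profitability categories by earnings, could be sketched as follows. The cut-off values `high` and `low` are illustrative assumptions; the slide names the four categories but gives no thresholds.

```python
def profitability_category(earnings, high=1000.0, low=0.0):
    """Map last year's earnings from a customer to one of four categories.

    The thresholds `high` and `low` are hypothetical; pick them from
    the actual earnings distribution.
    """
    if earnings >= high:
        return "highly profitable"
    if earnings > low:
        return "profitable"
    if earnings == low:
        return "non-profitable"
    return "loss"


# Hypothetical yearly earnings per customer id.
earnings_by_customer = {"c1": 2500.0, "c2": 300.0, "c3": 0.0, "c4": -120.0}
categories = {cid: profitability_category(e)
              for cid, e in earnings_by_customer.items()}
print(categories)
```

The resulting category label can then serve as the target attribute when searching for rules.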

12
Customer Profiling Rules
  • IF age > 30 and age < 45 and
  • occupation is professional and
  • salary level is between 50,000 and 70,000
  • THEN this customer is profitable
  • Each rule comes with statistical measures such as
    support and confidence.
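
As a minimal sketch, the rule above can be applied to a customer table and its support and confidence computed directly. The five customer records and the field names (`age`, `occupation`, `salary`, `profitable`) are made up for illustration and do not come from the slides.

```python
def matches_rule(c):
    """Condition part of the rule: age, occupation and salary level."""
    return (30 < c["age"] < 45
            and c["occupation"] == "professional"
            and 50_000 <= c["salary"] <= 70_000)


def support_and_confidence(customers):
    """Support = share of all customers that match the rule AND are profitable.
    Confidence = share of rule-matching customers that are profitable."""
    matched = [c for c in customers if matches_rule(c)]
    both = [c for c in matched if c["profitable"]]
    support = len(both) / len(customers) if customers else 0.0
    confidence = len(both) / len(matched) if matched else 0.0
    return support, confidence


# Hypothetical sample data.
customers = [
    {"age": 35, "occupation": "professional", "salary": 60_000, "profitable": True},
    {"age": 40, "occupation": "professional", "salary": 55_000, "profitable": True},
    {"age": 32, "occupation": "professional", "salary": 65_000, "profitable": False},
    {"age": 50, "occupation": "professional", "salary": 60_000, "profitable": True},
    {"age": 28, "occupation": "clerical", "salary": 30_000, "profitable": False},
]
support, confidence = support_and_confidence(customers)
print(support, confidence)
```

Here three customers satisfy the condition and two of them are profitable, so the rule has support 2/5 and confidence 2/3 on this toy data.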

13
Customer Segmentation
  • Consumers are not all the same. They need to be
    treated differently. Segmentation is essential in
    marketing.
  • Different spending capability
  • Different spending potential
  • Different behaviors
  • Different profitability
  • Different preferences
  • Different hobbies
  • Different lifestyles

14
Customer Segmentation
  • Customer segmentation is a process to divide
    customers into different groups or segments.
    Customers in the same segment have similar needs
    or behaviors so that similar marketing strategies
    or service policies can be applied to them.
  • Customer segments are required in several
    business areas including
  • Marketing
  • Customer services
  • Products and service development
  • Sales promotion
  • Customer retention

15
Direct Marketing
  • Question
  • Select a customer mailing list for a product
    campaign.
  • Purposes
  • Reduce the campaign cost and obtain a high
    response rate.
  • Data
  • Customer details and previous campaign data.

16
Life Cycle of a Loan Product
17
Business Objectives
  • Mellon Bank Corporation is a major financial
    services company headquartered in Pittsburgh.
  • Build an extendible loan secured by the value of
    a client's own property.
  • Achieve the highest possible Return On Investment
    (ROI).
  • Based on customers with demand deposit accounts
    (DDA), build a model for the home equity line of
    credit (HELOC).

18
Data Preparation
  • The primary data source was the approximately
    40,000 Mellon customers who had (or once had)
    HELOCs and DDAs.
  • Data
  • Demographic data sourced both internally and
    externally (age, income, length of residence, and
    other indicators of economic condition)
  • DDA data (history of loan balance over 3, 6, 9,
    12, 18 months, history of returned checks,
    history of interest rates)
  • Property data sourced externally (home purchase
    price, loan-to-value ratio)
  • Other data related to credit worthiness
  • Use 120 variables

20
Responders
21
Classification
22
Customer Retention
  • Question
  • Find out what kinds of customers tend to churn
    and build a model which can predict the
    likely-to-churn customers.
  • Data mining solution
  • Collect data about the customers who have
    churned.
  • Select a set of customers who have been loyal.
  • Merge the two data sets to form training, testing
    and evaluation data sets.
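
The merge-and-split step above might look like the sketch below. The 60/20/20 split, the fixed random seed, and the 1/0 churn label are assumptions for illustration; the slide does not specify them.

```python
import random


def build_datasets(churned, loyal, percents=(60, 20, 20), seed=0):
    """Label, merge and shuffle both customer sets, then cut them into
    training, testing and evaluation sets.

    `percents` (train, test, evaluation) is a hypothetical choice.
    """
    labelled = [(c, 1) for c in churned] + [(c, 0) for c in loyal]
    random.Random(seed).shuffle(labelled)  # reproducible shuffle
    n = len(labelled)
    n_train = n * percents[0] // 100
    n_test = n * percents[1] // 100
    train = labelled[:n_train]
    test = labelled[n_train:n_train + n_test]
    evaluation = labelled[n_train + n_test:]  # whatever remains
    return train, test, evaluation


# Hypothetical customer ids.
churned = [f"churned_{i}" for i in range(10)]
loyal = [f"loyal_{i}" for i in range(10)]
train, test, evaluation = build_datasets(churned, loyal)
print(len(train), len(test), len(evaluation))
```

Shuffling before the split keeps churned and loyal customers mixed in all three sets, so a model is not trained on one class and tested on the other.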

23
Basket Analysis
24
Basket Analysis
Transaction  Items
1            A, B, C
2            A, C, D
3            B, C, D
4            A, D, E
5            B, C, E

Rule        A → D   C → A   A → C   B & C → D
Support     2/5     2/5     2/5     1/5
Confidence  2/3     2/4     2/3     1/3
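
The support and confidence figures above can be reproduced with a short script. It assumes the five transactions are {A, B, C}, {A, C, D}, {B, C, D}, {A, D, E} and {B, C, E}, a reconstruction that is consistent with the item counts and rule statistics on the slide. The function name `rule_stats` is introduced here for illustration.

```python
# Five transactions, reconstructed so that they match the slide's figures.
transactions = [
    {"A", "B", "C"},
    {"A", "C", "D"},
    {"B", "C", "D"},
    {"A", "D", "E"},
    {"B", "C", "E"},
]


def rule_stats(antecedent, consequent, transactions):
    """Return (both, antecedent_only, total) counts for a rule.

    support    = both / total
    confidence = both / antecedent_only
    """
    both_items = antecedent | consequent
    both = sum(1 for t in transactions if both_items <= t)
    antecedent_only = sum(1 for t in transactions if antecedent <= t)
    return both, antecedent_only, len(transactions)


rules = [({"A"}, {"D"}), ({"C"}, {"A"}), ({"A"}, {"C"}), ({"B", "C"}, {"D"})]
for lhs, rhs in rules:
    both, base, total = rule_stats(lhs, rhs, transactions)
    print(sorted(lhs), "->", sorted(rhs),
          f"support {both}/{total}", f"confidence {both}/{base}")
```

Counts are kept as integers rather than floats so the output reads as the same fractions used in the table.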
25
Cross Selling: Citicorp/Travelers Group Merger
  • Online Newshour, April 7, 1998, reports
  • In the largest proposed corporate merger in
    history, the banking giant Citicorp and insurance
    titan Travelers will join forces. The new
    company, to be called Citigroup, would be the
    largest financial services company in the world
  • One rationale for the merger
  • Marcus Alexis, Northwestern University, commented
  • Well, they have competitive issues. There
    are certain synergies. Not only do customers like
    to get a full range of services from a single
    vendor but also there are certain economies in
    cross selling by them.

(Online Newshour April 7, 1998)
26
Opportunities of Cross Selling
  • Travelers Group can increase sales of insurance
    products to the Citicorp customer base.
  • Citicorp can increase sales of financial services
    to the Travelers Group customer base.
  • Customers get the convenience of one-stop
    shopping for both financial services and
    insurance products.

27
Cross Selling and Up Selling
  • Cross selling is the process of selling current
    customers new products after they purchased
    products of different categories.
  • E.g., sell car maintenance products to customers
    who just bought new cars.
  • Up selling is the process of selling current
    customers upgraded products or services after
    they purchased products of same category.
  • E.g., sell data service to mobile voice service
    users.

28
Understanding Customers
[Figure: Revenue, profit and loss over time. More efficient acquisition, longer lasting relationships and more frequent up/cross sell lead to less loss and more profit.]
Taken from SPSS talk.
29
Understanding Customers
[Figure: Revenue, profit and loss over time. More efficient acquisition, longer lasting relationships and more frequent up/cross sell lead to less loss and even more profit.]
Taken from SPSS talk.
30
How Cross Selling Works
  • Assume a marketing manager in a bank has the
    following products for customers
  • Savings account
  • Checking account
  • Standard credit card
  • Gold credit card
  • Primary mortgage
  • Secondary mortgage
  • The manager wants to design a new campaign for
    customers who
  • Are preparing to buy a new home
  • Are preparing to refinance an existing home
  • Are preparing to add a second mortgage

31
How to match customers with offers
  • Determine three offers to customers
  • New first mortgage
  • Refinance of first mortgage
  • Second mortgage
  • Each customer is only made one offer
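
One way to respect the one-offer-per-customer constraint is to give each customer the offer with the highest estimated response score. The slide does not say how the offer is chosen, so this selection rule and the score values below are assumptions; the scores would come from some per-offer response model not shown here.

```python
OFFERS = (
    "new first mortgage",
    "refinance of first mortgage",
    "second mortgage",
)


def assign_offers(scores):
    """Pick exactly one offer per customer: the one with the highest score.

    `scores` maps customer id -> {offer name: estimated response score}.
    Offers without a score are treated as 0.0.
    """
    return {
        cid: max(OFFERS, key=lambda offer: offer_scores.get(offer, 0.0))
        for cid, offer_scores in scores.items()
    }


# Hypothetical scores from per-offer response models.
scores = {
    "c1": {"new first mortgage": 0.10,
           "refinance of first mortgage": 0.40,
           "second mortgage": 0.25},
    "c2": {"second mortgage": 0.30},
}
assignment = assign_offers(scores)
print(assignment)
```

A campaign with budget or capacity limits per offer would need a more careful assignment than this greedy per-customer choice.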

32
The Impact of Fraud
  • GAO (the United States General Accounting Office)
    cited $19.1 billion in improper government
    payments in 17 major programs for fiscal year
    1998.
  • Medicare: $12.6 billion
  • Supplemental Security Income: $1.6 billion
  • The Food Stamp Program: $1.4 billion
  • Old Age and Survivors Insurance: $1.2 billion
  • Disability Insurance: $941 million
  • Housing Subsidies: $847 million
  • Veterans Benefits, Unemployment Insurance and
    others: $514 million

33
Background
  • HIC (The Health Insurance Commission) in
    Australia is a federal government agency.
  • HIC pays insurance claims of more than 20 million
    Australian dollars and pays out about A$8 billion
    in funds every year.
  • More than 300 million transactions are processed
    and stored every year: 1.3 TB in five years.

34
Preventing Fraud and Abuse
  • Business Objectives
  • The focus of the HIC project was on the recent
    and steady 10% annual rise in the cost of
    pathology claims for clinical tests.
  • Approaches
  • To identify potential fraudulent claims or claims
    arising from inappropriate practice, and
  • To develop general profiles of the GP practices
    in order to compare practice behaviors of
    individual GPs.

35
Data Preprocessing
  • Two databases
  • Episode Database
  • One Episode record records a patient visit.
  • In total, 6.8 million records.
  • There were 227 different pathology tests.
  • GP (doctor) database
  • There are 17,000 records related to active GPs
  • The behavior of 10,409 GPs was to be studied.
  • A matrix of 10,409 by 227 elements.
  • The elements were then scaled from 0 to 1 with
    respect to the total number of tests of each kind.
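
The scaling step can be read as dividing each GP's count for a test by the total number of times that test was ordered across all GPs, which keeps every element between 0 and 1. This is only one interpretation of the slide (dividing by the per-test maximum is another), and the tiny 3 by 3 matrix below stands in for the real 10,409 by 227 one.

```python
def scale_by_test_totals(counts):
    """Scale a GP-by-test count matrix column by column.

    counts[g][t] is how often GP g ordered test t. Each element is divided
    by the column total for test t (an assumed reading of the slide).
    Columns whose total is zero are left at 0.0.
    """
    n_tests = len(counts[0]) if counts else 0
    totals = [sum(row[t] for row in counts) for t in range(n_tests)]
    return [
        [row[t] / totals[t] if totals[t] else 0.0 for t in range(n_tests)]
        for row in counts
    ]


# Hypothetical counts: 3 GPs, 3 test kinds.
counts = [
    [2, 0, 5],
    [6, 0, 5],
    [2, 0, 0],
]
scaled = scale_by_test_totals(counts)
print(scaled)
```

With this reading, each non-empty column of the scaled matrix sums to 1, so a row shows what share of each test kind a given GP accounts for.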

36
Input to Segmentation
37
Overview
38
Data Mining
  • The team conducted association rule mining. At a
    support of 0.25, they decided that the presence
    of some tests in the input database, the
    Pathology Episode Initiation (PEI) tests, was
    causing spurious rules to be revealed.
  • PEI tests depend on who ordered them and where
    they were ordered.
  • When the PEI tests were removed, the number of
    rules dropped significantly.

39
Result Analysis
  • A request for a microscopic examination of feces
    for parasites (OCP) was associated with a
    cultural examination of feces (FCS) in 0.85% of
    cases.
  • There was a 92.6% chance that if OCP tests were
    requested, they would be done with FCS.
  • In 0.61% of cases, OCP was associated with a
    different, more expensive test called MCS32,
    which costs A$13.55 per test.

40
GP Profiles
41
Discussions
  • Segment 13
  • Represents the majority of traditional GPs who
    are practicing conventionally. 5,450 GPs, or 52%
    of all GPs.
  • Only 6.2% of the medical pathology tests.
  • Segment 4
  • 54 GPs. Only 0.51% of GPs.
  • 2.7% of the medical pathology tests.

42
?? 2004.4.21