Chapter 1 Why - PowerPoint PPT Presentation

About This Presentation
Title:

Chapter 1 Why

Slides: 31
Provided by: ronno

Transcript and Presenter's Notes


1
Chapter 1: Why & What is Data Mining?
  • Note: Included in this slide set are both Chapter
    1 material and additional material from the
    instructor.

2
Data Mining is a subset of Business Intelligence
(BI)
3
Topics to Discuss in Session 1
  • What is Data Mining (DM)?
  • Who uses DM?
  • Why DM?
  • Where DM?
  • When DM?
  • How DM?
  • Why study DM?

4
Data Mining: Definition & Goal
What, Who
  • Definition: DM is the exploration and analysis of large
    quantities of data in order to discover
    meaningful patterns and rules.
  • Goal: To allow an enterprise to IMPROVE its ______
    through better understanding of its ______.
  • Potential for Competitive Advantage.

Synonyms for "enterprise" include corporation, firm, non-profit
organization, and government agency.
5
Foundations of Data Mining
  • Data mining is the process of using raw data to
    infer important business relationships.
  • Despite a consensus on the value of data mining,
    a great deal of confusion exists about what it
    is.
  • Data Mining is a collection of powerful
    techniques intended for analyzing large amounts
    of data.
  • There is no single data mining approach, but
    rather a set of techniques that can be used stand
    alone or in combination with each other.

6
Data Mining: Why now?
Why, Where, When
  • So much data are being produced!
  • Data are being warehoused
  • Computing power is more affordable
  • Competitive pressures are enormous
  • Data Mining software is available

7
Customer Relationship Management (CRM)
How
8
Customer Relationship Management (CRM)
How
In order to form a learning relationship with its
customers, an enterprise (firm) must be able to
  • Notice what its customers are doing
  • Remember what it and its customers have done
    over time
  • Learn from what it has remembered
  • Act on what it has learned to make customers
    more profitable

9
Based on Transaction Data
How
10
Based on Transaction Data
How
11
Identifying and Remembering Relationships is the
Key!
How
12
Group Exercise 1
  • Time Box: 15 minutes
  • Teams of 4 or fewer
  • Discuss DM situations among yourselves and pick
    one to report to the class
  • What to report (verbally, 5 minutes max):
  • Describe the DM situation
  • How does it help the enterprise?
  • Presentations: another 15 to 30 minutes

13
Why Study Data Mining? Open discussion to
identify the reasons
14
Topics to Discuss in Session 2
  • Data Mining History
  • Data Warehouse
  • Data Mart

15
Data Mining History
  • The approach has roots in practice dating back
    over 40 years.
  • In the 1960s and 1970s, data mining was called
    statistical analysis, and the pioneers were the
    makers of statistical software such as SAS and
    SPSS.
  • By the late 1980s, the traditional techniques had
    been augmented by new methods such as fuzzy
    logic, heuristics and neural networks.

16
Definitions of a Data Warehouse
1. "A subject-oriented, integrated, time-variant and
non-volatile collection of data in support of
management's decision making process"
- W.H. Inmon
2. "A copy of transaction data, specifically
structured for query and analysis"
- Ralph Kimball
17
Data Warehouse
  • For organizational learning to take place, data
    from many sources must be gathered together and
    organized in a consistent and useful way; hence,
    Data Warehousing (DW).
  • DW allows an organization (enterprise) to
    remember what it has noticed about its data
  • Data Mining techniques make use of the data in a
    DW

18
Data Warehouse
[Diagram: Enterprise Database (Customers, Orders, Transactions,
Vendors, Products, etc.) -> copied, organized, summarized ->
Data Warehouse -> Data Mining]
  • Data Miners
  • Farmers - they know
  • Explorers (Prospectors) - unpredictable
19
Data Warehouse
  • A data warehouse is a copy of transaction data
    specifically structured for querying, analysis
    and reporting; hence, data mining.
  • Note that the data warehouse contains a copy of
    the transactions, and that copy is not updated or
    changed later by the transaction system.
  • Also note that this data is specially structured,
    and may have been transformed when it was copied
    into the data warehouse.
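
The copy-and-transform idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields, the `load_into_warehouse` helper, and the specific transformations (cents to dollars, date to month) are all hypothetical and do not come from the slides.

```python
from types import MappingProxyType

# Hypothetical transaction records as an operational system might hold them.
transactions = [
    {"order_id": 1, "customer": "alice", "amount_cents": 1250, "date": "2024-01-05"},
    {"order_id": 2, "customer": "bob", "amount_cents": 4000, "date": "2024-01-06"},
]

def load_into_warehouse(rows):
    """Copy each row, restructure it for analysis, and make the copy read-only."""
    warehouse = []
    for row in rows:
        transformed = {
            "order_id": row["order_id"],
            "customer": row["customer"].title(),       # normalized for reporting
            "amount_dollars": row["amount_cents"] / 100,
            "month": row["date"][:7],                   # coarser grain for analysis
        }
        # MappingProxyType gives a read-only view of the transformed copy.
        warehouse.append(MappingProxyType(transformed))
    return tuple(warehouse)

warehouse = load_into_warehouse(transactions)

# A later change in the transaction system does not reach the warehouse copy.
transactions[0]["amount_cents"] = 9999
print(warehouse[0]["amount_dollars"])  # 12.5
```

The read-only wrapper stands in for the "not updated or changed later" property; a real warehouse would enforce this through its load process rather than in application code.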

20
Data Mart
  • A Data Mart is a smaller, more focused Data
    Warehouse: a mini-warehouse.
  • A Data Mart typically reflects the business rules
    of a specific business unit within an enterprise.

21
Data Warehouse to Data Mart
[Diagram: Data Warehouse feeding three Decision Support
Information outputs]
22
Data Warehouse & Mart
  • Set of Tables: 2 or more dimensions
  • Designed for Aggregation
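
To make "2 or more dimensions" and "designed for aggregation" concrete, here is a minimal Python sketch. The sales rows, the two dimensions (store and product), and the `aggregate` helper are assumptions made up for illustration; they are not part of the original slides.

```python
from collections import defaultdict

# Hypothetical fact rows: (store, product, units_sold).
# "store" and "product" are the two dimensions; units_sold is the measure.
sales = [
    ("North", "Milk", 10),
    ("North", "Bread", 4),
    ("South", "Milk", 7),
    ("South", "Bread", 6),
    ("North", "Milk", 5),
]

def aggregate(rows, dimension_index):
    """Sum units_sold along one dimension (0 = store, 1 = product)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension_index]] += row[2]
    return dict(totals)

by_store = aggregate(sales, 0)
by_product = aggregate(sales, 1)
print(by_store)    # {'North': 19, 'South': 13}
print(by_product)  # {'Milk': 22, 'Bread': 10}
```

The same rows answer different questions depending on which dimension is summed over, which is the point of structuring the tables around dimensions.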

23
Group Exercise 2
  • Time Box: 15 minutes
  • Teams of 4 or fewer
  • Discuss Data Warehouse to Data Mart situations
    among yourselves and pick one to report to the
    class
  • What to report (verbally, 5 minutes max):
  • Describe the DW to Data Mart situation
  • How does it help the enterprise's business
    unit?
  • Presentations: another 15 to 30 minutes

24
Topics to Discuss in Session 3
  • Data Mining Flavors
  • Data Mining Examples
  • Data Mining Tasks
  • Data Mining's Biggest Challenge
  • What does all of this mean?

25
Data Mining Flavors
  • Directed: Attempts to explain or categorize some
    particular target field such as income or
    response.
  • Undirected: Attempts to find patterns or
    similarities among groups of records without the
    use of a particular target field or collection of
    predefined classes.
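
The contrast between the two flavors can be sketched as follows. Everything here is an illustrative assumption: the customer records, the age cutoff, and the simple two-cluster routine (a bare-bones 1-D k-means with k = 2, which assumes at least two distinct values) are not taken from the slides.

```python
# Hypothetical customer records: (age, income_thousands, responded).
records = [
    (22, 30, False), (25, 35, False), (27, 32, True),
    (48, 80, True), (52, 90, True), (55, 85, False),
]

def response_rate_by_group(rows, age_cutoff=40):
    """Directed: summarize the target field 'responded' for each age band."""
    groups = {"under": [], "over": []}
    for age, _income, responded in rows:
        groups["under" if age < age_cutoff else "over"].append(responded)
    return {name: sum(vals) / len(vals) for name, vals in groups.items()}

def two_means(values, iterations=10):
    """Undirected: split values into two clusters; no target field is used."""
    lo, hi = min(values), max(values)  # initial cluster centers
    for _ in range(iterations):
        a = [v for v in values if abs(v - lo) <= abs(v - hi)]
        b = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = sum(a) / len(a), sum(b) / len(b)
    return sorted(a), sorted(b)

rates = response_rate_by_group(records)
low_income, high_income = two_means([income for _age, income, _r in records])
print(rates)
print(low_income, high_income)  # [30, 32, 35] [80, 85, 90]
```

The directed function is built around the target field `responded`; the undirected one never looks at it and only groups records by how similar their incomes are.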

26
Data Mining Examples in Enterprises
For Illustration Purposes Only
  • US Government
  • FBI: track down criminals (local police also)
  • Treasury Dept: suspicious international funds
    transfers
  • Phone companies
  • Supermarkets & Superstores (Vons, Albertsons,
    Wal-Mart, Costco)
  • Mail-Order, On-Line Order (L.L. Bean, Victoria's
    Secret, Lands' End, Amazon!)
  • Financial Institutions (BofA, Wells Fargo,
    Charles Schwab)
  • Insurance Companies (USAA, Allstate, State Farm)
  • Tons of others

27
Data Mining Tasks
  • Classification. Example: Fr, So, Jr, Sr, ND
  • Estimation. Example: household income
  • Prediction. Example: predict the average amount
    of a credit card balance transfer
  • Affinity Grouping. Example: people who buy X
    often buy Y also, with probability Z
  • Clustering: similar to classification, but with no
    predefined classes
  • Description and Profiling: behavior begets an
    explanation, such as "Men tend to prefer
    Burger King; women prefer Wendy's."
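
The affinity grouping example ("people who buy X often buy Y also, with probability Z") can be worked through on a toy data set. The baskets and the `confidence` helper below are hypothetical; Z is estimated here simply as the share of X-baskets that also contain Y.

```python
# Hypothetical market baskets, one set of items per purchase.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def confidence(baskets, x, y):
    """Estimate Z = P(Y in basket | X in basket) from basket counts."""
    with_x = [b for b in baskets if x in b]
    if not with_x:
        return 0.0  # X never bought, so there is nothing to estimate from
    return sum(1 for b in with_x if y in b) / len(with_x)

# 3 baskets contain bread; 2 of those also contain milk.
z = confidence(baskets, "bread", "milk")
print(round(z, 2))  # 0.67
```

With only four baskets the estimate is not meaningful in practice; the sketch just shows where the "probability Z" in the bullet comes from.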

28
Data Mining's Biggest Challenge
  • The largest challenge a data miner may face is
    the sheer volume of data in the data warehouse.
  • It is quite important, then, that summary data
    also be available to get the analysis started.
  • A major problem is that this sheer volume may
    mask the important relationships the data miner
    is interested in.
  • The ability to overcome the volume and be able to
    interpret the data is quite important.

29
What Does All of This Mean?
  • On a regular basis, farmers and explorers
    utilize their data warehouses to give guidance
    for and/or answer a limitless variety of
    questions.
  • Nothing is free, however, and the costs may be
    heavy.
  • The value of a data warehouse and subsequent data
    mining is a result of the new and changed
    business processes it enables, and of the
    competitive advantage that follows.
  • There are limitations, though - A Data Warehouse
    cannot correct problems with its data, although
    it may help to more clearly identify them.

30
End of Chapter 1