229-611 Data Warehousing and Data Mining - PowerPoint PPT Presentation

Slides: 35

Transcript and Presenter's Notes

Title: 229-611 Data Warehousing and Data Mining


1
229-611 Data Warehousing and Data Mining
  • ??.??. ?????? ??????????????
  • ?????????????????????????? ?????????????????????
    ???
  • ???? CS 108 E-mail wwettayaprasit_at_yahoo.com
  • Website http://staff.cs.psu.ac.th/wiphada

2
????????????
  • 1.  ??????????????????????????????????????????????
    ?????????????????
  • 2.  ??????????????????????????????????
  • 3. ??????????????????????????????????????????????
    ?????? ???????????????

3
???????
  • Chapter 1 Introduction
  • Chapter 2 Data Warehouse
  • Chapter 3 Data Mining
  • Chapter 4 Basic Data Mining Techniques
  • Chapter 5 Data Mining: A Closer Look
  • Chapter 6 Cross Validation
  • Chapter 7 Decision Tree
  • Chapter 8 Association Rules
  • Chapter 9 The K-Means Algorithm
  • Chapter 10 Neural Networks
  • Chapter 11 Statistical Techniques
  • Chapter 12 Rule-Based Systems

4
??????????????????
  • 1. Data Mining: A Tutorial-Based Primer,
    Richard J. Roiger and Michael W. Geatz,
    Pearson Education Inc., 2003.
  • 2. Mining Very Large Databases with Parallel
    Processing, Alex A. Freitas
    and Simon H. Lavington, Kluwer Academic
    Publishers, 1998.
  • 3. ??????????????????????????? (Data
    Warehouse), ????? ?????????????,
    ???????????? ????? ????? ????????, 2546
  • 4. ??????????????????????????????
    ??????????????????? (Decision Support
    Systems and Expert Systems), ?????????
    ????????, ?????? ?????? ????? ?????
    ????????, 2546

5
Chapter 1
  • Introduction

6
Content
  • Data Warehouse (??????????)
  • Data Warehousing (?????????????)
  • Data Mining (????????????)

7
?????????????????????????????????????
  • 1. H/W S/W ????????
  • 2. Data Redundancy ????????????????????????
  • 3. Data Inconsistency ????????????????????
  • 4. Coding System ?????????????????????????????????
    ?? (Multiple Standard)
  • ?????????????????????? (Silo-based System)
  • ????????????

8
??????????????????
  • Business Integration
  • ??????????????????????????????????????????????
  • ?? 2 ???
  • 1. Partial Business Integration
  • Point to Point Business Integration
  • Middleware Business Integration
  • 2. Overall Business Integration

9
??????????????????
  • 1. Partial Business Integration
  • Point to Point Business Integration
  • ???????????????????? 2 ???????????????????
  • ???????????????????????????????
  • ??????? Spaghetti Phenomenon
  • Middleware Business Integration
  • ??????? H/W S/W ?????????????????????????????????
    ????????????????????????????
  • ????????????????????
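The "Spaghetti Phenomenon" named on this slide is usually explained by how quickly direct links multiply. The sketch below is only an illustration under two simplifying assumptions that are not taken from the slides: every pair of systems needs its own link in point-to-point integration, and middleware behaves as a single shared hub. The function names are hypothetical.

```python
def point_to_point_links(n_systems: int) -> int:
    """Links needed when every pair of systems is wired directly (assumption)."""
    return n_systems * (n_systems - 1) // 2


def middleware_links(n_systems: int) -> int:
    """Links needed when each system connects once to a shared hub (assumption)."""
    return n_systems


# Compare how the two approaches grow as systems are added.
for n in (2, 5, 10, 20):
    print(f"{n:>2} systems: point-to-point={point_to_point_links(n):>3}, "
          f"middleware={middleware_links(n):>2}")
```

Under these assumptions the point-to-point count grows quadratically while the hub count grows linearly, which is one common way to motivate middleware.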

10
Point to Point Business Integration
11
??????????????????
  • 2. Overall Business Integration
  • ?????????????????????? ???????????????????????????
    ?????????????????????????????? ??????????????????
  • ??????????????????????? ???????????????
  • Unified Standard
  • Maximize data consistency
  • Minimize data redundancy

12
Data Warehouse (??????????)
  • ?????????? ???????.... ??????????????????
    ???????????? ????????????
    ??????????????????????????????????
    ????????????????????? ??????????????????????
  • ??????????????????????????????????????????????
    ?????
  • ?????????? ???????.... ???????????????????????????
    ?? ???????????????????????????????????????????????
    ??
  • ?????????? ??????????????? ?????????????????
  • ?????????? ???????????????????????????????
  • (Organization Customized System)

13
??????????????????????
  • 1. Subject-Oriented
  • 2. Integrated
  • 3. Time-Variant
  • 4. Non-Volatile

14
??????????????????????
  • 1. Subject-Oriented
  • ???????????????????????????????????????
    ?????????????????????????? ????
  • ?????? ?????? ??????
  • ????????....?????????????????????????....?????????
    ??????????????? ????
  • ??????????????????? ?????????????????
  • 2. Integrated
  • ????????????????????????????????
    ???????????????????????

15
??????????????????????
  • 3. Time-Variant
  • ??????????????????????? ?????????????????? 5-10
    ??
  • 4. Non-Volatile
  • ??????????????????????????????????????????????????
    ?? ?????????????????????????
  • ??????????????????....?????????...?????????
    Normalize ????????????????? (Data based)
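The Time-Variant and Non-Volatile properties can be illustrated with a small append-only store. This is a sketch rather than the course's own design: the class `HistoryStore`, its methods, and the sample keys and values are all hypothetical. Rows are only ever appended with a load date (non-volatile), and queries ask what was known as of a given date (time-variant).

```python
from datetime import date


class HistoryStore:
    """Append-only store: rows are added with a load date and never changed."""

    def __init__(self):
        self._rows = []  # list of (load_date, key, value)

    def load(self, key, value, as_of):
        # Non-volatile: new facts are appended; old rows are never updated.
        self._rows.append((as_of, key, value))

    def value_as_of(self, key, when):
        # Time-variant: answer with the latest row loaded on or before `when`.
        candidates = [(d, v) for d, k, v in self._rows if k == key and d <= when]
        if not candidates:
            return None
        return max(candidates, key=lambda c: c[0])[1]


store = HistoryStore()
store.load("cust-1:city", "Songkhla", date(2001, 1, 1))
store.load("cust-1:city", "Bangkok", date(2003, 6, 1))

print(store.value_as_of("cust-1:city", date(2002, 1, 1)))
print(store.value_as_of("cust-1:city", date(2004, 1, 1)))
```

Because earlier rows are kept, the same key can be read as it stood at different points in time.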

16
??????????????????
  • 1. ????????????????????????
  • 2. ???????????????? ??????????????????????????????
    ??????????????????????? ?????????????????????
  • 3. ?????????????????????????????
    ???????????????????????????????????

17
????????????????????
  • 1. ??????????????????????????????
    ?????????????????????????????????????????
  • 2. ?????????????????????????????????????????
    ?????????????????????????????????
  • 3.??????????????????????????????
  • 4.??????????????????????????????

18
Data Warehousing (?????????????)
  • ????????????? ??? ????????? ??????????????????????
    ???????????????? ??????
  • ?????????????????????
  • ?????????????????????????????????
  • ???????????????? ?????????????????????????????

19
????????????????????????
  • 1. Data Acquisition System
  • 2. Data Staging Area
  • 3. Data Warehouse Database /Data Store
  • 4. Data Provisioning Area /Data Mart
  • 5. End User Terminal
  • 6. Metadata Repository

20
????????????????????????
21
????????????????????????
  • 1. Data Acquisition System
  • ??????????????????
  • 2. Data Staging Area
  • Data Cleansing ??????????????????????
  • Filtering ?????????????????????????????
  • 3. Data Warehouse Database /Data Store
  • Data Model ????????????????????
  • ????????????????
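The Data Cleansing and Filtering steps of the Data Staging Area can be sketched as below. The field names (`customer`, `amount`), the cleansing rules, and the `min_amount` threshold are assumptions made for illustration; the slides do not specify them.

```python
from typing import Optional


def cleanse(row: dict) -> Optional[dict]:
    """Normalize one raw row; return None if it cannot be repaired."""
    name = row.get("customer", "").strip().title()
    amount_text = row.get("amount", "").replace(",", "").strip()
    try:
        amount = float(amount_text)
    except ValueError:
        return None  # unparseable amount: reject the row
    if not name:
        return None  # missing customer name: reject the row
    return {"customer": name, "amount": amount}


def stage(rows, min_amount=0.0):
    """Cleanse every row, then filter out rejected or out-of-range rows."""
    cleaned = (cleanse(r) for r in rows)
    return [r for r in cleaned if r is not None and r["amount"] >= min_amount]


raw_rows = [
    {"customer": "  alice ", "amount": "1,200.50"},
    {"customer": "BOB", "amount": "n/a"},
    {"customer": "", "amount": "30"},
    {"customer": "carol", "amount": "-5"},
]
staged = stage(raw_rows)
print(staged)
```

Only rows that survive both steps would move on to the warehouse database in this sketch.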

22
????????????????????????
  • 4. Data Provisioning Area / Data Mart
  • ??????????????????????????????????????????????????
    ????
  • 5. End User Terminal
  • Simple Report Tool
  • Multi Dimensional Tools
  • Data Mining Tools
  • 6. Metadata Repository
  • ?????????????????????????????????????????????

23
??????????????????????????????
  • 1. Query and Report Generator
  • 2. Multidimensional Data Analysis
  • 3. Online Analytical Processing (OLAP)
  • 4. Data Mining Tools

24
??????????????????????????????
25
??????????????????????????????
26
Online Analytical Processing (OLAP)
  • ?????????????????????????????????????
    ????????????????????????????????????
    (Multidimensional Data Analysis)
  • ??????????????? OLAP
  • 1. Roll up / Consolidation
  • ????????????????????????????????
    ??????????????????...????????..????????
  • 2. Drill Down
  • ????????????????????????????????
    ???????????????...????????.. ??????????????
  • 3. Slice
  • ???????????????????????????????????????
    ??????????????????????????????????????????????????
    ?????????????
  • 4. Dice
  • ?????????????????????????????????
    ????????????????????????????????
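The four OLAP operations listed above can be sketched on a tiny in-memory fact table. The table contents, the dimension names, and the helper functions are hypothetical; a real OLAP tool would run these against a multidimensional store rather than a Python list.

```python
from collections import defaultdict

# Hypothetical fact table: (year, quarter, region, product, sales)
facts = [
    (2003, "Q1", "North", "Pen", 100),
    (2003, "Q1", "South", "Pen", 80),
    (2003, "Q2", "North", "Ink", 50),
    (2003, "Q2", "South", "Pen", 70),
    (2004, "Q1", "North", "Ink", 60),
    (2004, "Q1", "South", "Ink", 40),
]
DIMS = ("year", "quarter", "region", "product")


def aggregate(rows, by):
    """Sum sales grouped by the named dimensions."""
    idx = [DIMS.index(d) for d in by]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[4]
    return dict(totals)


def slice_(rows, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]


def dice(rows, **allowed):
    """Dice: keep a sub-cube by restricting several dimensions at once."""
    return [r for r in rows
            if all(r[DIMS.index(d)] in vals for d, vals in allowed.items())]


by_year = aggregate(facts, ("year",))               # roll up: quarters into years
by_quarter = aggregate(facts, ("year", "quarter"))  # drill down: years into quarters
north = slice_(facts, "region", "North")            # slice: one region
sub_cube = dice(facts, year={2003}, product={"Pen"})  # dice: year and product

print(by_year)
print(by_quarter)
```

Roll up and drill down are the same aggregation at coarser or finer granularity, while slice and dice only select which cells of the cube are considered.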

27
Data Mining (????????????)
  • ???????????? ?????????????????????????????????????
    ???????????????????????????????????
  • ???????????? ?????????????? ??? Application
    ????????????????????????????????????????????
  • ???????????? ?????????????????????????
    ?????????????? ???????????????????????????????????
    ??????????????????????????????????
    ??????????????????????????????????
  • ?????????????????? (Knowledge Discovery)
  • ????????????????????????? (Rule)

28
???????????????????????
  • 1. Classification
  • 2. Clustering
  • 3. Association
  • 4. Visualization

29
???????????????????????
p.85
  • 1. Classification ??????????????????????????????
    ??????????????????????????????????????
  • ????????????????????????????????????? (Predictive
    Model) ??????? ???????? ......Supervised Learning
  • ?? 2 ??????
  • Tree Induction
  • Neural Network
  • 2. Clustering ??????????????????????????????????
    ????????????????????????????????
    ????????????????????????????????????????????
    ???????? .......Unsupervised Learning
  • 3. Association ???????????????????????????????
    ??????????????????????????????????????????????????
    ????????????????????
  • 4. Visualization ????????????????????????????????
    ????????? ??????????????
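As an illustration of the Association technique, the sketch below computes the two standard measures for a rule such as "bread implies milk": support and confidence. The basket data and function names are made up for the example and do not come from the slides.

```python
# Hypothetical market baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    base = support(antecedent, baskets)
    if base == 0:
        raise ValueError("antecedent never occurs in the baskets")
    return support(set(antecedent) | set(consequent), baskets) / base


print(support({"bread", "milk"}, transactions))
print(confidence({"bread"}, {"milk"}, transactions))
```

An association-rule miner would search for all rules whose support and confidence exceed chosen thresholds; this sketch only evaluates one rule.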

30
????????????????????????
  • 1. ??????????????????????????????????????
  • 2. ?????????????????????????????????
    ????????????????????
  • 3. ??????????????????????????????????
    ????????????????????????????????????????
  • 4. ???????????????????????????????????????????

31
?????????????????????????????????
  • 1. ???????
  • ????????????????????????????????????
  • 2. ????????????????
  • ???????????????????????????????????
  • 3. ?????????
  • 4. ?????? ???????
  • 5. ??????????????
  • 6. ???????????????
  • 7. H/W S/W ???????????
  • 8. ?????????????
  • 9. ?????????

32
???????????????????????
  • 1. ???????????????????????????????????????????
  • 2. ????????????????? Client/Server
  • 3. ?????????????????????????????????????????????
  • 4. ???????????????????????????????????????????????
    ? ????????????????????????????????????????
  • 5. ???????????????????????????????????????????????
    ????????????????????

33
Homework 1
  • 1. ?????????????????????????????? ??????????? 2
    ?????
  • ???????????????????????????????????????
  • Data Warehouse (??????????)
  • Data Mining (????????????)
  • 2. ??????? Data Mining Tool ??????????
    ?????????????????????????
  • ??? ?????? (next week in class)
  • Hard Copy
  • File
  • Presentation 2 min (no slide)

34
The road to success is always
under construction
Jim Miller