Data Wharehousing OLAP Data Mining - PowerPoint PPT Presentation

About This Presentation
Title:

Data Wharehousing OLAP Data Mining

Description:

... Systems: The Complete Book, by Hector Garcia-Molina, Jeff Ullman, and Jennifer ... Aggiornata a intervalli stabiliti (a seconda dell'applicazione) ... – PowerPoint PPT presentation

Number of Views:72
Avg rating:3.0/5.0
Slides: 21
Provided by: giusepped8
Category:

less

Transcript and Presenter's Notes

Title: Data Wharehousing OLAP Data Mining


1
Data Wharehousing OLAP Data Mining
  • S. Costantini
  • Università degli Studi di LAquila
  • stefcost_at_di.univaq.it

2
Ringraziamenti (Acknowledgment)
  • Part of this material is taken from Database
    Systems The Complete Book, by Hector
    Garcia-Molina, Jeff Ullman, and Jennifer Widom,
    edited by Prentice-Hall.
  • URL http//www-db.stanford.edu/ullman/dscb.html

3
Cosè in sostanza un Data Wharehouse?
  • E una vista materializzata
  • Aggiornata a intervalli stabiliti (a seconda
    dellapplicazione)
  • E un cosiddetto sistema di integrazione di
    dati perché può contenere dati provenienti da
    vari database (detti sorgenti)

4
Perché i Data Warehouse?
  • Perché le query di analisi statistica ed esame
    dei dati per estrarne varie informazioni (dette
    query OLAP, vedi seguito) sono pesanti e
    diminuiscono troppo la performance del sistema.
    Però non necessitano della versione più
    aggiornata dei dati.

5
Perché i Data Wharehouse
  • Allora conviene separare le query usuali dalle
    query OLAP, creando per queste ultime un Data
    Wharehouse
  • Per le query OLAP il modello relazionale non è
    ottimale, quindi nel creare un Data Wharehouse il
    modello dei dati viene modificato.

6
Observation
  • Traditional database systems are tuned to many,
    small, simple queries.
  • Some new applications use fewer, more
    time-consuming, complex queries.
  • New architectures have been developed to handle
    complex analytic queries efficiently.

7
The Data Warehouse
  • The most common form of data integration.
  • Copy sources into a single DB (warehouse) and try
    to keep it up-to-date.
  • Usual method periodic reconstruction of the
    warehouse, perhaps overnight.
  • Frequently essential for analytic queries.

8
OLTP
  • Most database operations involve On-Line
    Transaction Processing (OTLP).
  • Short, simple, frequent queries and/or
    modifications, each involving a small number of
    tuples.
  • Examples Answering queries from a Web interface,
    sales at cash registers, selling airline tickets.

9
OLAP
  • Of increasing importance are On-Line Application
    Processing (OLAP) queries.
  • Few, but complex queries --- may run for hours.
  • Queries do not depend on having an absolutely
    up-to-date database.

10
OLAP Examples
  • Amazon analyzes purchases by its customers to
    come up with an individual screen with products
    of likely interest to the customer.
  • Analysts at Wal-Mart look for items with
    increasing sales in some region.

11
Data Warehouses
  • Doing OLTP and OLAP in the same database system
    is often impractical
  • Different performance requirements
  • Analysis queries require data from many sources
  • Solution Build a data warehouse
  • Copy data from various OLTP systems
  • Optimize data organization, system tuning for
    OLAP
  • Transactions arent slowed by big analysis
    queries
  • Periodically refresh the data in the warehouse

12
Common Architecture
  • Relational Databases handle OLTP.
  • Local databases copied to a central warehouse
    overnight.
  • Analysts use the warehouse for OLAP.

13
Definition of data warehousing
A data warehouse is a subject-oriented,
integrated, time-variant and non-volatile
collection of data in support of managements
decision making process.
14
Loading the Data Warehouse
Data is periodically extracted
Data is cleansed and transformed
Data Staging Area
Users query the data warehouse
Source Systems
(OLTP)
Data Warehouse
15
Data Mining
  • Data mining is a popular term for queries that
    summarize big data sets in useful ways.
  • Examples
  • Clustering all Web pages by topic.
  • Finding characteristics of fraudulent credit-card
    use.

16
Data Warehouse
Enterprise Database
Customers
Orders
Transactions
Vendors
Etc
Etc
  • Data Miners
  • Farmers they know
  • Explorers - unpredictable

Copied, organized summarized
Data Warehouse
Data Mining
17
Market-Basket Data
  • An important form of mining from relational data
    involves market baskets sets of items that
    are purchased together as a customer leaves a
    store.
  • Summary of basket data is frequent itemsets
    sets of items that often appear together in
    baskets.

18
Data Mining Flavors
  • Directed Attempts to explain or categorize some
    particular target field such as income or
    response.
  • Undirected Attempts to find patterns or
    similarities among groups of records without the
    use of a particular target field or collection of
    predefined classes.

19
Data Mining Examples in Enterprises
  • Government
  • Track down criminals (Police also)
  • Treasury Dept suspicious intl funds transfer
  • Phone companies
  • Supermarkets Superstores
  • Mail-Order, On-Line Order

20
Data Mining Examples in Enterprises
  • Financial Institutions
  • Insurance Companies
  • Web sites
  • Many others
Write a Comment
User Comments (0)
About PowerShow.com