OAW - PowerPoint PPT Presentation

About This Presentation
Title:

OAW

Description:

Data warehouse / data webhouse. Dimensional modeling ... Subdomain. parameter. phrase. keyword. type. Content dimension. content_key (PK) content_section ...

Slides: 14
Provided by: andersbla
Tags: oaw | subdomain

Transcript and Presenter's Notes

Title: OAW


1
OAW LESSON 11
2
AGENDA
  • Data warehouse / data webhouse
  • Dimensional modeling (star schema)
  • Extract, transform, and load (ETL)

3
WHAT IS ER
FLAT DATA
ENTITY RELATION
4
WHAT IS ER
  • The relational database revolution bloomed in the
    mid-1980s.
  • The highest form of ER modeling is to remove all
    redundancy in the data
  • With hundreds or even thousands of entities, we
    have made a database that cannot be queried!
  • This creates demand for a simpler model

5
WHAT IS DM
  • One table with multiple keys: the fact table.
    This table contains only keys (integers).
  • A set of smaller tables, called dimension tables,
    containing text information
  • Highly recognizable to the end user.
  • Gracefully extensible to new data elements
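A minimal sketch of the fact/dimension split, using Python's built-in sqlite3 module. The table names click_fact and content_dimention (spelled as in the slides) and the columns click_key, content_key, content_section and content_subsection come from this presentation; the column types, the in-memory database and the sample row values are assumptions made only for illustration.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Dimension table holds the descriptive text; the fact table holds only
# integer keys. Column types are assumptions, not taken from the slides.
conn.executescript("""
CREATE TABLE content_dimention (
    content_key        INTEGER PRIMARY KEY,
    content_section    TEXT,
    content_subsection TEXT
);
CREATE TABLE click_fact (
    click_key   INTEGER PRIMARY KEY,
    content_key INTEGER REFERENCES content_dimention(content_key)
);
""")

# Made-up sample data: one dimension row, one click pointing at it.
conn.execute("INSERT INTO content_dimention VALUES (1, 'udland', 'europa')")
conn.execute("INSERT INTO click_fact VALUES (100, 1)")

# The text shown to the end user is reached through the key in the fact row.
row = conn.execute("""
    SELECT f.click_key, d.content_section
    FROM click_fact AS f
    JOIN content_dimention AS d ON d.content_key = f.content_key
""").fetchone()
print(row)
```

Adding a new descriptive attribute then means adding a column to a dimension table, while the fact table stays unchanged.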

6
ER vs DM
7
WHAT IS A DATA WAREHOUSE?
8
OPERATIONAL DATA
  • Hard to understand (function-oriented)
  • Many sources (distributed)
  • Updated continuously and often contains no history
  • Much of the data is irrelevant from an analytical
    perspective

9
REQUIREMENTS FOR ANALYTICAL DATA
  • Data must be understandable (subject-oriented)
  • Accessible to everyone
  • Fast response times
  • Consistent
  • Interactive and flexible
  • Must contain history

10
Example of a dimensional model (star schema)
With a little goodwill: a star shape
11
ELABORATION: THE DIMENSION TABLE
  • Time (month, day, minute)
  • Content (section, subsection)
  • Region (country, county, municipality, city)
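Each dimension lists its levels from coarse to fine, so one source value is split into several columns. As a rough sketch for the time dimension, a single click timestamp could be broken down as follows; the function name time_dimension_row, the time_key format and the choice of "minute of the day" are assumptions, not something the slides specify.

```python
from datetime import datetime


def time_dimension_row(ts: datetime) -> dict:
    """Split one timestamp into the hierarchy levels of a time dimension."""
    return {
        # Assumed surrogate key format: YYYYMMDDhhmm as an integer.
        "time_key": int(ts.strftime("%Y%m%d%H%M")),
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        # Minute of the day (0..1439), one possible reading of "minute".
        "minute": ts.hour * 60 + ts.minute,
    }


row = time_dimension_row(datetime(2002, 3, 14, 9, 5))
print(row)
```

Storing every level as its own column is what lets a query group by month or by day without parsing timestamps at query time.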

12
DATA NAVIGATION
  • Operation
  • select count(click_key) from click_fact
  • Drill up / drill down (group by)
  • select content_section, count(click_key) from
    click_fact, content_dimention group by
    content_section
  • select content_subsection, count(click_key) from
    click_fact, content_dimention group by
    content_subsection
  • Slice and dice (section = 'udland')
  • select count(click_key) from click_fact,
    content_dimention where content_section =
    'udland'
  • select count(click_key) from click_fact,
    content_dimention, time_dimension where
    content_section = 'udland' and year = 2002
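The slide queries leave the join conditions between the fact table and the dimension tables implicit. A runnable sketch with explicit joins might look like the following, using Python's sqlite3 module. The key column time_key, the column types and all sample rows are assumptions made for illustration; only the table names and the content columns appear in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE content_dimention (content_key INTEGER PRIMARY KEY,
                                content_section TEXT,
                                content_subsection TEXT);
-- time_key is an assumed join key; the slides do not name it.
CREATE TABLE time_dimension (time_key INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE click_fact (click_key INTEGER PRIMARY KEY,
                         content_key INTEGER, time_key INTEGER);
-- Made-up sample data.
INSERT INTO content_dimention VALUES
    (1, 'udland', 'europa'), (2, 'udland', 'usa'), (3, 'sport', 'fodbold');
INSERT INTO time_dimension VALUES (1, 2001), (2, 2002);
INSERT INTO click_fact VALUES
    (1, 1, 1), (2, 1, 2), (3, 2, 2), (4, 3, 2), (5, 3, 2);
""")

# Operation: total number of clicks.
total = conn.execute("SELECT count(click_key) FROM click_fact").fetchone()[0]

# Drill up: clicks per section.
by_section = dict(conn.execute("""
    SELECT d.content_section, count(f.click_key)
    FROM click_fact f
    JOIN content_dimention d ON d.content_key = f.content_key
    GROUP BY d.content_section
""").fetchall())

# Drill down: the same count one level finer, per subsection.
by_subsection = dict(conn.execute("""
    SELECT d.content_subsection, count(f.click_key)
    FROM click_fact f
    JOIN content_dimention d ON d.content_key = f.content_key
    GROUP BY d.content_subsection
""").fetchall())

# Slice and dice: restrict on two dimensions at once.
udland_2002 = conn.execute("""
    SELECT count(f.click_key)
    FROM click_fact f
    JOIN content_dimention d ON d.content_key = f.content_key
    JOIN time_dimension t ON t.time_key = f.time_key
    WHERE d.content_section = ? AND t.year = ?
""", ("udland", 2002)).fetchone()[0]

print(total, by_section, by_subsection, udland_2002)
```

Without the join conditions, each query would count every combination of fact and dimension rows rather than the clicks themselves.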

13
EXTRACT, TRANSFORM AND LOAD
  • Typically run once a day
  • Typically at night, while the business sleeps
  • But the web never sleeps: how do we handle this in
    real time?
  • Log data is full of errors due to uncoordinated
    web design
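One batch of such a job could be sketched as below, with malformed log lines dropped during extraction. The log line format, the clicked_at column and the function names extract and load are hypothetical; the slides do not describe the actual log format or loading procedure.

```python
import sqlite3

# Hypothetical raw log lines: "<timestamp> <section>/<subsection>".
RAW_LOG = [
    "2002-03-14T09:05 udland/europa",
    "2002-03-14T09:06 sport/fodbold",
    "garbage line without a path",
    "2002-03-14T09:07 udland/",  # missing subsection, rejected
]


def extract(lines):
    """Yield (timestamp, section, subsection); skip malformed lines."""
    for line in lines:
        parts = line.split()
        if len(parts) != 2 or "/" not in parts[1]:
            continue
        section, _, subsection = parts[1].partition("/")
        if not section or not subsection:
            continue
        yield parts[0], section, subsection


def load(conn, rows):
    """Insert a dimension row on first sight, then one fact row per click."""
    loaded = 0
    for ts, section, subsection in rows:
        hit = conn.execute(
            "SELECT content_key FROM content_dimention "
            "WHERE content_section = ? AND content_subsection = ?",
            (section, subsection),
        ).fetchone()
        if hit is None:
            content_key = conn.execute(
                "INSERT INTO content_dimention "
                "(content_section, content_subsection) VALUES (?, ?)",
                (section, subsection),
            ).lastrowid
        else:
            content_key = hit[0]
        conn.execute(
            "INSERT INTO click_fact (content_key, clicked_at) VALUES (?, ?)",
            (content_key, ts),
        )
        loaded += 1
    return loaded


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE content_dimention (content_key INTEGER PRIMARY KEY,
                                content_section TEXT,
                                content_subsection TEXT);
-- clicked_at is an assumed column for the raw timestamp.
CREATE TABLE click_fact (click_key INTEGER PRIMARY KEY,
                         content_key INTEGER, clicked_at TEXT);
""")
loaded = load(conn, extract(RAW_LOG))
print(loaded)
```

Running the same two functions on smaller, more frequent batches is one way to move from a nightly load toward near real time, at the cost of more frequent writes to the warehouse.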