Cash Registers

1 / 28
About This Presentation
Title:

Cash Registers

Description:

Medicine Gene/Disease Associations. Etc. Data Management Goal ... Attributes, entities, and relationships are described in the data dictionary. Entity ... – PowerPoint PPT presentation

Number of Views:17
Avg rating:3.0/5.0
Slides: 29
Provided by: Nes87

less

Transcript and Presenter's Notes

Title: Cash Registers


1
Cash Registers Satellites
  • Briefing to the 2006 NOAATech Conference
  • November 2, 2005
  • Stan Cutler
  • Stanley.cutler_at_noaa.gov
  • 301-457-5210 x 163
  • Mitretek Systems/NESDIS/OSD

2

Purpose
  • Improve communication between NOAAs developers
    and the wider community of data management
    professionals
  • Introduce vocabulary
  • Identify NOAA applications that can be described
    using common vocabulary

3

Agenda
  • Universal Data Management Challenges
  • Notional Data Warehouse Architecture
  • Data Modeling Approaches
  • Relational
  • Dimensional

4
I. Universal Data Management Challenges
5
Data Mining Example Market Basket Analysis
  • Decisions
  • Move beer display closer to the diaper display
  • On Thursdays, sell beer diapers at full price
  • Rationale
  • 1) When men bought diapers on Thursdays and
    Saturdays, they also tended to buy beer
  • 2) Men typically did their weekly grocery
    shopping on Saturdays
  • 3) On Thursdays, they only bought a few items


6
Many Disciplines Mine Their Data
  • Law Enforcement - Optimal Deployment
  • Health Care Coverage Risks
  • E-Commerce Pop-up/Link Selection
  • Medicine Gene/Disease Associations
  • Etc.

Data Management Goal Develop systems in which the
data and procedures are configured to answer
questions that are important to the enterprise
7
NOAAs Future
We are not unique. Any enterprise that collects
large amounts of data has the same kind of
challenges and goals
  • Integrating Global (Environmental Observations)
    and Data Management
  • Ensuring Sound, State-of-the-Art (Research)
  • Developing, Valuing, and Sustaining a World-Class
    Workforce

8
We can find valuable expertise outside the NOAA
community
  • Ask the same kinds of questions as those
    challenged with similar problems
  • Understand the constructs and vocabulary
  • Architectures
  • Data Modeling

9
II. Notional Data Warehouse Architecture
10
Hub and Spoke Architecture
Application Specific Data Marts use OLAP
Technologies ()
ETL Extract, Transform and Load
Internal Data
Transform Cleanse
Data Staging Area
Data Ware-house
External Data
Application Neutral
OLAP Online Analytical Processing
11
Retail ApplicationHub and Spoke Architecture
Sales Data
Transform Cleanse
External Customer Lists
Application Neutral
OLAP Data Marts (Application Specific)
12
Notional NOAA Hub and Spoke Architecture
NOAA Applications (Data Marts using OLAP)
CLASS
ESPC
Data Staging Area (Rich Inventory?)
Data Ware-house
Ecosystems Management
Transform Cleanse
Other Satellite Archives
Data Centers
Commerce Transportation
External Customers
Application Neutral
13
III. Data Modeling Approaches
14
Relational Vocabulary
  • Relational technologies
  • Relational Data Base Management Systems (RDBMS)
  • COTS Products (INFORMIX, DB2, ORACLE, MS/SS,
    etc.)
  • Proprietary data management/manipulation software
  • RDBMS Extensions (Most COTS products built on an
    RDBMS)
  • GUIs, CASE Tools, COOP, Application Generators,
    Security, etc.
  • Relational Data Models - Evolutionary approach
    to data base design
  • Conceptual Entity Relationship Diagrams (ERD)
    used to identify data requirements,
    relationships, rules
  • Diagrams
  • Data Dictionaries
  • Logical ERDs used to normalize (eliminate
    redundancies)
  • Physical models are the Table Schema entered into
    the RDBMS
  • Online Transaction Processing (OLTP)
  • e.g., CLASS

15
Entity Relationship Diagram (ERD)
The foundation of all OLTP systems, such as CLASS
Entity
Entity Class
Relationship
Attributes
Cardinality (1, Many, or 0)
Attributes, entities, and relationships are
described in the data dictionary
16
Object Models inherit ERD constructs
Object Class
key Behavior gtgtgtgt gtgtgtgt
17
Pros Cons of systems based on Relational models
  • Strengths
  • Referential integrity
  • Data locking
  • Fast Look-up and Retrieval
  • GUIs
  • Weaknesses
  • Entity proliferation
  • Users dont understand them
  • Complex code must be written to accumulate
    multiple instances (Hard to use for Data Mining)

18
Dimensional Data Models
  • Fact
  • An instance of numeric data
  • Dimension
  • Foreign key
  • Fact Table
  • Key is a concatenation of foreign keys
    (dimensions)
  • An instance can have dozens of foreign keys
  • Millions of instances (rows) often required
  • Programmers revenge on Data Base Administrators
  • Break many relational rules
  • Re-invented often

19
A Dimensional Data Model for Retailing
  • Who (buys, sells)
  • Customer (age, gender, marital status,
    occupation, etc.)
  • Sales person ( , , training, etc.)
  • Cash Register
  • What (products)
  • Brand, color, size, type, etc
  • When
  • Time of day, day of week, season
  • Where
  • Store (location, size, type), Shelf
  • Why
  • Promotions, advertising, discounts, economic
    trends
  • How much (was spent)
  • Per product, per total sale

20
Classical Star Schema Point of Sale
FACT
Time_key Customer_key Store_key Clerk_key Promo_ke
y Product_key Register_key Dollars Sold Units
Sold Dollars Cost
21
Snowflake Schema Point of Sale
Brand-ID_PK Maker-Desc
Model-Num_PK Model-Desc
FACT
Sub-Type_PK Sub-Type-Desc
Time_key Customer_key Store_key Clerk_key Promo_ke
y Product_key Register_key Dollars Sold Units
Sold Dollars Cost
Product Dimension
Product_Type_PK Product_Type_Desc
Sub-Type_PK Sub-Type-Desc
Sub-Type_PK Sub-Type-Desc
Brand-ID_PK Maker-Desc
Model-Num_PK Model-Desc
Sub-Type_PK Sub-Type-Desc
22
Metadata in Dimensional Modeling
  • NOAA usage
  • If its not a fact
  • If its not a key
  • Its metadata
  • Conventional Dimensional usage
  • If its not a fact
  • If its not a key
  • Its documentation
  • BUT
  • If its a key
  • Its metadata (because it describes the
    fact)

23
Dimensional Models for NOAA
  • Which
  • Satellite
  • Instrument
  • When
  • Orbit, UTC, Season, decade, epoch, etc
  • Where
  • Geospatial coordinates
  • Who
  • User affiliation
  • Developer affiliation
  • FACT How much?
  • Temperature, moisture, radiance, color, etc.

24
A NOAA Star Schema?
FACT TABLE
Time_key (fk) Location-key (fk) Altitude key
(fk) Product_key (fk) Satellite_key
(fk) Instrument_key (fk) Temperature
25
Pros Cons of systems based on dimensional models
  • Strengths
  • Very few entity types needed
  • Decision Support Systems (DSS)
  • End-Users construct complex queries by selecting
    dimensions from a GUI
  • Statistical analysis of very large data bases
  • Artificial Intelligence (AI)
  • Automated scheduling of continuous executions
  • System identifies (discovers) new relationships
  • Discoveries shape successive execution
  • Weaknesses
  • Development Cost
  • Storage
  • Operational Cost - Requires much care and
    feeding

26
False Dichotomy Relational vs. Dimensional
  • Relational and dimensional systems are not
    mutually exclusive
  • Data warehouses usually extract fact tables from
    relational data bases
  • Data warehouse capabilities are extensions in
    RDBMSs
  • Depends on the business
  • Feasibility Is the application data good enough
    for ETL?
  • ROI Does the business benefit outweigh the cost?

27
SUMMARY
  • NOAAs data mining challenge is similar to that
    of other enterprises
  • A world-wide community of IT professionals uses a
    particular vocabulary to address the challenge
  • Relational technologies models are the
    essential first step
  • Dimensional technologies models come next

28
Questions
Stan Cutler Mitretek System/NESDIS/OSD Stanley.Cut
ler_at_nooa.gov 301-457-5210 ex 163
Write a Comment
User Comments (0)