Title: OLAP Online analytical processing
1OLAP On-line analytical processing
- dr. János Abonyi
- University of Veszprem
- abonyij_at_fmt.vein.hu
- www.fmt.vein.hu/softcomp/dw
- www.fmt.vein.hu/ai_phd
2Business Intelligence Environment
3What is Data Warehouse?
- Defined in many different ways, but not
rigorously. - A decision support database that is maintained
separately from the organizations operational
database - Supports information processing by providing a
solid platform of consolidated, historical data
for analysis. - A data warehouse is a subject-oriented,
integrated, time-variant, and nonvolatile
collection of data in support of managements
decision-making process.W. H. Inmon - Data warehousing
- The process of constructing and using data
warehouses
4OLTP vs. OLAP
5From Tables and Spreadsheets to Data Cubes
- A data warehouse is based on a multidimensional
data model which views data in the form of a data
cube - A data cube, such as sales, allows data to be
modeled and viewed in multiple dimensions - Dimension tables, such as item (item_name, brand,
type), or time(day, week, month, quarter, year) - Fact table contains measures (such as
dollars_sold) and keys to each of the related
dimension tables - In data warehousing literature, an n-D base cube
is called a base cuboid. The top most 0-D cuboid,
which holds the highest-level of summarization,
is called the apex cuboid. The lattice of
cuboids forms a data cube.
6Cube A Lattice of Cuboids
0-D(apex) cuboid
1-D cuboids
2-D cuboids
3-D cuboids
4-D(base) cuboid
7A Sample Data Cube
Total annual sales of TV in U.S.A.
8Cuboids Corresponding to the Cube
all
0-D(apex) cuboid
country
product
date
1-D cuboids
product,date
product,country
date, country
2-D cuboids
3-D(base) cuboid
product, date, country
9Example of Star Schema
Sales Fact Table
time_key
item_key
branch_key
location_key
units_sold
dollars_sold
avg_sales
Measures
10Example of Fact Constellation
Shipping Fact Table
time_key
Sales Fact Table
item_key
time_key
shipper_key
item_key
from_location
branch_key
to_location
location_key
dollars_cost
units_sold
units_shipped
dollars_sold
avg_sales
Measures
11Multidimensional Data
- Sales volume as a function of product, month, and
region
Dimensions Product, Location, Time Hierarchical
summarization paths
Region
Industry Region Year Category
Country Quarter Product City Month
Week Office Day
Product
Month
12Example I.
all
all
Europe
North_America
...
region
Mexico
Canada
Spain
Germany
...
...
country
Vancouver
...
...
Toronto
Frankfurt
city
M. Wind
L. Chan
...
office
13(No Transcript)
14Browsing a Data Cube
- Visualization
- OLAP capabilities
- Interactive manipulation
15A Star-Net Query Model
Customer Orders
Shipping Method
Customer
CONTRACTS
AIR-EXPRESS
ORDER
TRUCK
PRODUCT LINE
Product
Time
DAILY
QTRLY
ANNUALY
PRODUCT ITEM
PRODUCT GROUP
CITY
SALES PERSON
COUNTRY
DISTRICT
REGION
DIVISION
Each circle is called a footprint
Location
Organization
Promotion
16Typical OLAP Operations
- Roll up (drill-up) summarize data
- by climbing up hierarchy or by dimension
reduction - Drill down (roll down) reverse of roll-up
- from higher level summary to lower level summary
or detailed data, or introducing new dimensions - Slice and dice
- project and select
- Pivot (rotate)
- reorient the cube, visualization, 3D to series of
2D planes. - Other operations
- drill across involving (across) more than one
fact table - drill through through the bottom level of the
cube to its back-end relational tables (using SQL)
17OLAP operations
18OLAP operations I.
- Roll up (drill-up) summarize data
- by climbing up hierarchy or by dimension
reduction - Drill down (roll down) reverse of roll-up
- from higher level summary to lower level summary
or detailed data, or introducing new dimensions
19OLAP operations II.
- Slice and dice
- project and select
- Pivot (rotate)
- reorient the cube, visualization, 3D to series of
2D planes. - Other operations
- drill across involving (across) more than one
fact table - drill through through the bottom level of the
cube to its back-end relational tables (using SQL)
20Pivot Tables and Charts
The Pivot Table in Excel is an interactive
worksheet that allows you to summarize large
amounts of information. The graphical
representation of the Pivot Table is a Pivot
Chart.
21Excel Data List Files
In the Excel files suitable for Pivot Table
analysis, columns are considered fields, column
headings are field names and rows are records.
The file must not contain blank columns or rows.
22Name the Data Range
Highlight the data list. Then enter a name for
the range in the Range Name box.
23Create Pivot Table
Place cursor in data range
Choose Pivot Table from Data menu
24Pivot Table Step 1 of 3
Select the data source Excel list
Select the desired result
25Pivot Table Step 2 of 3
If a data range has been defined, the range will
automatically be displayed
26Pivot Table Step 3 of 3
Choose pivot table destination
Click the Layout Button
27Pivot Table Layout
28Sample Pivot Table
The data list fields are pivoted around the
core data, gross sales, providing different
views of the data.
29Change Table Layout
View gt Toolbars gt Pivot Table
Pivot Chart Wizard
Or, drag field names to create new layout
30Create a Pivot Chart
View gt Toolbars gt Pivot Table
Pivot Chart Button
31Sample Pivot Chart
32 Pivot Table Capabilities
- Drill Down for Detail
- Automatic Updates
- Multiple Data Views
33Drill Down
Double click in cell of Pivot Table
Results in Detail from Raw Data
34Updates to Core Data
When core data is updated . . .
. . . use Pivot Table Toolbar to Refresh table
35Consolidate Grouped Data
36Grouping Data Fields
- Highlight Jan, Feb Mar labels
- Data gt Group and Outline gt Group
- Enter Quarter Label
37Data Grouped by Quarter
38Tutorial