Title: Advanced Topics: Business Intelligence Features
1Advanced TopicsBusiness Intelligence Features
- Data Warehousing, Materialized View,
Partitioning, SQL Analytics
2Data Warehousing
- Purpose Query and Analysis of Data
- Usually contains historical data derived from
transaction data (OnLine Trans. Processing or
OLTP) - OLAP (OnLine Analytical Processing)
- Enable Fast execution of ad-hoc queries
3Data Warehouse Architecture
Data Mart small, specialized DW for a dept or so
DW Purpose Enable analysis using Unified Data
4OLAP example
- Assume one (real-valued) measure value eg Sales
Amount and several finite dimensional attributes
e.g. Item, Place, Month. - Example fact Iced Tea was sold in Auckland in
January. Measure 20k - Maybe no fact about Iced Tea in Auckland in
August. - Mapping ItemPlaceTime ? Sales Amount called a
cube. (Think array.) - Answer queries based on dimensions time, place
group by month, country or year, state, etc.) - All possible combinations need to be fast.
5Logical Design Star Schema
Central Fact table storing measure attributes
such as sales Dimension tables store
information about the dimension
attributes Queries Find sales by product,
region, customers, time (year),..
6Dimensions
- Hierarchy of attributes
- Analysis based on any level in hierarchy
- sales on region, subregion, country_name,
7Logical Design Snowflake Schema
- Snowflake complex improvisation of star.
- Products divided into categories and suppliers
(more normalized) - Queries will be slower
8Physical Design
Tables Can be regular tables or Partitioned
Tables Indexes Specialized indexes like
bitmap Additional Objects Materialized views,
Dimensions
9Partitioned Tables
- Behave like regular tables
- Query on entire table or on a specific partition
- Parallelism on multiple partitions
- Can load/modify multiple partitions concurrently
10Query Analysis Functions
- All SQL analysis functions
- SELECT SUM(amout_sold), p.pname
- FROM sales s, products p, times t
- WHERE s.product_id p.product_id
- GROUP BY p.pname
11More SQL Analytic Functions
Conversion Functions ASCIISTRCASTCHARTOROWIDCO
MPOSECONVERTDECOMPOSEHEXTORAWNUMTODSINTERVALN
UMTOYMINTERVALRAWTOHEXRAWTONHEXROWIDTOCHARROWI
DTONCHARSCN_TO_TIMESTAMPTIMESTAMP_TO_SCNTO_BINA
RY_DOUBLETO_BINARY_FLOATTO_CHAR
(character)TO_CHAR (datetime)TO_CHAR
(number)TO_CLOBTO_DATETO_DSINTERVALTO_LOBTO_M
ULTI_BYTETO_NCHAR (character)TO_NCHAR
(datetime)TO_NCHAR (number)TO_NCLOBTO_NUMBER,
- NUMERICABSACOSASINATANATAN2BITANDCEILCOSC
OSHEXPFLOORLNLOGMODNANVLPOWERREMAINDERROU
ND (number)SIGNSINSINHSQRTTANTANHTRUNC
(number)WIDTH_BUCKET
- CHARACATERCHRCONCATINITCAPLOWERLPADLTRIMNLS
_INITCAPNLS_LOWERNLSSORTNLS_UPPERREGEXP_REPLAC
EREGEXP_SUBSTRREPLACERPADRTRIMSOUNDEXSUBSTR
TRANSLATETREATTRIMUPPER
- DATEADD_MONTHSCURRENT_DATECURRENT_TIMESTAMPDBT
IMEZONEEXTRACT (datetime)FROM_TZLAST_DAYLOCALT
IMESTAMPMONTHS_BETWEENNEW_TIMENEXT_DAYNUMTODSI
NTERVALNUMTOYMINTERVALROUND (date)SESSIONTIMEZO
NESYS_EXTRACT_UTCSYSDATESYSTIMESTAMPTO_CHAR
(datetime)TO_TIMESTAMPTO_TIMESTAMP_TZTO_DSINTER
VALTO_YMINTERVALTRUNC (date)TZ_OFFSET
12Analytic Functions
- AVG, CORR, COVAR_SAMP, COVAR_POP, COUNT,
CUME_DIST, DENSE_RANK, FIRST, LAST, LAG, LEAD,
MAX, MIN, PERCENTILE_RANK, RANK, ROW_NUMBER,
STDDEV, SUM, VARIANCE,
13Materialized Views
- Queries joining fact and dimension tables
- Precompute the join and store as Materialized
Views - Denormalized and so not in 3NF
- Queries are automatically rewritten using
Materialized view - Materialized views
- Facilitate fast execution of queries
14Summary
- Fast Query and Analysis
- Denormalization into Materialized Views
- Queries implicitly rewritten under the covers
- Research issues
- What queries can be rewritten using the MV
- Pushing the updates to the MVs
- Refreshing the MVs