Title: Oracle Database 10g SQL Model Clause
1(No Transcript)
240166
Oracle Database 10g SQL Model Clause
Andy Witkowski, Architect Thomas Kyte, VP
Oracle Corporation
3Whats now in SQL for Modeling
- Aggregation Enhancements
- Cube, Rollup, Grouping Sets
- New aggregates Inverse Distribution,
FIRST/LAST,etc - Analytic Functions
- Window Functions Rank, Moving, Cumulative
- Statistical Functions Correlation, Linear
Regression,etc - Old tools still have more modeling power than SQL
- Spreadsheets, MOLAP engines
SQL Model enhances SQL with modeling power
4Case Study Modeling with Excel
- Excel fits well at the personal scale
- UI and Formatting
- Calculations (build-in functions, formulas)
- What-If analysis
- Excel fits poorly at corporate scale for modeling
- Cryptic row-column addressing
- No metadata, No standards, No mathematical model
- 100s of spreadsheets and consolidation by hand
- Does not scale (1000s formulas, TB of data)
- Perpetual data exchange databases-gtExcel
Replace Excel Modeling with SQL Modeling
5Modeling with SQL Model
- Language Spreadsheet-like calculations in SQL
- Inter-row calculation. Treats relations as an
N-Dim array - Symbolic references to cells and their ranges
- Multiple Formulas over N-Dim arrays
- Automatic Formula Ordering
- Recursive Model Solving
- Model is a relation can be processed further in
SQL - Multiple arrays with different dimensionality in
one query - Performance
- Parallel Processing in partitioning formulas
- Multiple-self joins with one data access
structure - Multiple UNIONs with one data access structure
- Why Better?
- Automatic Consolidation (models as views
combine using SQL) - Self Adjusting (as database changes no need to
re-define) - One version of truth (calc directly over data
base, no exchange)
6 SQL Model Concepts
7Define Relation as Array
prod time s
vcr 2001 9 dvd 2001 0
Relation
SELECT prod, time, s FROM sales
Array
1999
1 2 3 4
2000
5 6 7 8
DIMENSION BY (prod, time) MEASURES (s)
time
2001
9 0 1 2
vcr
dvd
tv
pc
prod
8Define Business Rules
prod time s
vcr 2001 9 dvd 2001 0
Relation
SELECT prod, time, s FROM sales
Array
1999
1 2 3 4
2000
5 6 7 8
DIMENSION BY (prod, time) MEASURES (s)
time
2001
9 0 1 2
vcr
dvd
tv
pc
RULES UPSERT ( sANY, 2000 sCV(prod),
CV(time) - 1 2, svcr, 2002 svcr,
2001 svcr, 2000, sdvd, 2002 AVG(s)
CV(prod), timelt2001 )
prod
Sales in 2000 2x of previous year
Predict vcr sales in 2002
Predict dvd sales in 2002
9Evaluate Formulas 1st
prod time s
vcr 2001 9 dvd 2001 0
Relation
SELECT prod, time, s FROM sales
Array
1999
1 2 3 4
2000
2 4 6 8
DIMENSION BY (prod, time) MEASURES (s)
2001
9 0 1 2
vcr
dvd
tv
pc
RULES UPSERT ( sANY, 2000 sCV(prod),
CV(time) - 1 2, svcr, 2002 svcr,
2001 svcr, 2000, sdvd, 2002 AVG(s)
CV(prod), timelt2001 )
Sales in 2000 2x of previous year
Predict vcr sales in 2002
Predict dvd sales in 2002
10Evaluate Formulas 2nd
prod time s
vcr 2001 9 dvd 2001 0
Relation
SELECT prod, time, s FROM sales
1999
1 2 3 4
2000
2 4 6 8
DIMENSION BY (prod, time) MEASURES (s)
2001
9 0 1 2
2002
11
RULES UPSERT ( sANY, 2000 sCV(prod),
CV(time) - 1 2, svcr, 2002 svcr,
2001 svcr, 2000, sdvd, 2002 AVG(s)
CV(prod), timelt2001 )
vcr
dvd
tv
pc
Sales in 2000 2x of previous year
Predict vcr sales in 2002
Predict dvd sales in 2002
11Evaluate Formulas 3rd
prod time s
vcr 2001 9 dvd 2001 0
Relation
SELECT prod, time, s FROM sales
1999
1 2 3 4
2000
2 4 6 8
DIMENSION BY (prod, time) MEASURES (s)
2001
9 0 1 2
2002
11 3
RULES UPSERT ( sANY, 2000 sCV(prod),
CV(time) - 1 2, svcr, 2002 svcr,
2001 svcr, 2000, sdvd, 2002 AVG(s)
CV(prod), timelt2001 )
vcr
dvd
tv
pc
Sales in 2000 2x of previous year
Predict vcr sales in 2002
Predict dvd sales in 2002
12Return as Relation
prod time s
vcr 2001 9 dvd 2001 0
Relation
SELECT prod, time, s FROM sales
1999
1 2 3 4
2000
2 4 6 8
DIMENSION BY (prod, time) MEASURES (s)
2001
9 0 1 2
2002
11 3
RULES UPSERT ( sANY, 2000 sCV(prod),
CV(time) - 1 2, svcr, 2002 svcr,
2001 svcr, 2000, sdvd, 2002 AVG(s)
CV(prod), timelt2001 )
vcr
dvd
tv
pc
Self-join.
join UNION
join UNION
Relation again
vcr 2001 9 dvd 2001 0
Rows updated inserted by the Model clause
vcr 2002 11 dvd 2002 3
13Model Clause Components
Model clause
14Key Concepts (1)
- New SQL Model Clause
- Data as N-dim arrays with DIMENSIONS MEASURES
- Data can be PARTITION-ed - creates an array per
partition - Formulas defined over the arrays express a
(business) model - Formulas within a Model
- Use symbolic addressing using familiar array
notation - Can be ordered automatically based on dependency
between cells - Can be recursive with a convergence condition
recursive models - Can UPDATE or UPSERT cells
- Support most SQL functions including aggregates
15Key Concepts (2)
- Result of a SQL Model is a relation
- Can participate further in processing via joins,
etc. - Can define views containing Model computations
- SQL Model is the last query clause
- Executed after joins, aggregation, window
functions - Before ORDER BY
- Main Model and Reference Models
- Can relate models of different dimensionality
16Formula Fundamentals (1)
- Formulas SQL expressions over cells with aggs,
functions, etc. - Formula has a left and right side and represents
assignment - svcr, 2002 svcr, 2001 svcr, 2000
single ref - svcr, 2002 AVG(s)vcr, tlt2002
multi ref on right - Left side can qualify multiple cells
- sp IN (vcr,dvd), tlt2002 1000
multi ref on left - sANY, t2002 2 sCV(p), CV(t)-1
left-right correlation - sp IN (SELECT prod FROM prod_tb), 2000 1000
- Formula can operate in update or upsert mode
- update svcr, 2002 svcr, 2001 svcr,
2000 - upsert svcr, 2002 svcr, 2001
svcr, 2000
17Formula Fundamentals (2)
- Function CV(dimension) propagates values from
left to the right side. In example, products in
2002 are sum of two previous years. - sANY, 2002 sCV(p), CV(t)-1 sCV(p), CV(t)
2
18Formula Fundamentals (2)
- Function CV(dimension) propagates values from
left to the right side. In example, products in
2002 are sum of two previous years. - sANY, 2002 sCV(p), CV(t) -1 sCV(p),
CV(t) 2 - Formula result can depend on processing order.
Can specify order in each formula. E.g., shift by
time - svcr, ANY ORDER BY t svcr, CV(t) - 1
19Formula Fundamentals (2)
- Function CV(dimension) propagates values from
left to the right side. In example, products in
2002 are sum of two previous years. - sANY, 2002 sCV(p), CV(t) -1 sCV(p),
CV(t) 2 - Formula result can depend on processing order.
Can specify order in each formula. E.g., shift by
time - svcr, ANY ORDER BY t svcr, CV(t) - 1
vcr 2001 300.00 0 vcr
2002 350.00 300.00 vcr 2003
400.00 vcr 2004 450.00 vcr
2005 500.00
ORDER BY t
20Formula Fundamentals (2)
- Function CV(dimension) propagates values from
left to the right side. E.g, products in 2002 are
sum of two previous years - sANY, 2002 sCV(p), CV(t) -1 sCV(p),
CV(t) 2 - Formula result can depend on processing order.
Can specify order in each formula. E.g., shift by
time - svcr, ANY ORDER BY t svcr, CV(t) - 1
vcr 2001 300.00 0 vcr
2002 350.00 300.00 vcr 2003
400.00 350.00 vcr 2004 450.00
vcr 2005 500.00
ORDER BY t
21Formula Fundamentals (2)
- Function CV(dimension) propagates values from
left to the right side. E.g, products in 2002 are
sum of two previous years - sANY, 2002 sCV(p), CV(t) -1 sCV(p),
CV(t) 2 - Formula result can depend on processing order.
Can specify order in each formula. E.g., shift by
time - svcr, ANY ORDER BY t svcr, CV(t) - 1
vcr 2001 300.00 0 vcr
2002 350.00 300.00 vcr 2003
400.00 350.00 vcr 2004 450.00
400.00 vcr 2005 500.00 450.00
ORDER BY t
22Model Options Fundamentals
global options
rule options
23NAV Options Handling Sparse Data
West dvd 2001 300.00 West tv
2002 500.00 West vcr 2001
200.00 West vcr 2002 400.00
keep nav
?
2001
West dvd 2001 300.00 West tv
2002 500.00 West dvd 2003
- West tv 2003 500.00 West
vcr 2001 200.00 West vcr
2002 400.00
24NAV Options Handling Sparse Data
West dvd 2001 300.00 West tv
2002 500.00 West vcr 2001
200.00 West vcr 2002 400.00
ignore nav
assume 0
West dvd 2001 300.00 West tv
2002 500.00 West dvd 2003
300.00 West tv 2003
500.00 West vcr 2001 200.00 West
vcr 2002 400.00
25NAV Options Handling Sparse Data
West dvd 2001 300.00 West tv
2002 500.00 West vcr 2001
200.00 West vcr 2002 400.00
keep nav
ignore nav
assume 0
?
2001
West dvd 2001 300.00 West tv
2002 500.00 West dvd 2003
- West tv 2003 500.00 West
vcr 2001 200.00 West vcr
2002 400.00
West dvd 2001 300.00 West tv
2002 500.00 West dvd 2003
300.00 West tv 2003
500.00 West vcr 2001 200.00 West
vcr 2002 400.00
26Automatic Formula Ordering
27Automatic Formula Ordering
28Automatic Formula Ordering
29UPDATE, UPSERT Partitions
Region Product Time s East
dvd 2001 100
East dvd 2002 150 East
vcr 2002 100 West dvd
2001 200
30UPDATE, UPSERT Partitions
Region Product Time s East
dvd 2001 100
East dvd 2002 150 East
vcr 2002 100 West dvd
2001 200
Region Product Time Old s New
s East dvd 2001 100
100 East dvd 2002 150
100 East vcr 2002
100 120 West dvd
2001 200 200
updated
31UPDATE, UPSERT Partitions
Region Product Time s East
dvd 2001 100
East dvd 2002 150 East
vcr 2002 100 West dvd
2001 200
Region Product Time Old s New
s East dvd 2001 100
100 East dvd 2002 150
100 East vcr 2002
100 120 East
dvd 2003 -
250 West dvd 2001 200
200 West dvd 2003 -
200
updated
upserted
32Different dimensions Reference
Relate Models with different dimensions.
Represent each as n-dimensional array one main,
others as reference or lookup arrays.
Sales Table
Conv table converts currency to
c p t s USA dvd
2001 300.00 USA tv 2001
500.00 Poland vcr 2001
200.00 zl France vcr 2001 100.00
fr
c ratio USA 1 Poland
0.24 France 0.12
33Different dimensions Reference
Sales Table
Conv table converts currency to
c p t s USA dvd
2001 300.00 USA tv 2001
500.00 Poland vcr 2001
200.00 zl France vcr 2001 100.00
fr
c ratio USA 1 Poland
0.24 France 0.12
USA dvd 2001 300.00 USA
tv 2001 500.00 Poland vcr
2001 48.00 France vcr 2001
12.00
Converted values
34Recursive Model Solving
- Model can contain cyclic (recursive) formulas. -
If cyclic formulas desired, use ITERATE option-
If ITERATE not present, cyclic formulas
automatically detected, and an error
reported. - Use ITERATE clause to specify of iterations or
- Use UNTIL clause to specify convergence conditions
Iteration 1 2 3
4 5 6 7
8 S value 1024 512 128
64 32 16 8
4
35Recursive Model Solving
- Model can contain cyclic (recursive) formulas. -
If cyclic formulas desired, use ITERATE option-
If ITERATE not present, cyclic formulas
automatically detected, and an error
reported. - Use ITERATE clause to specify of iterations or
- Use UNTIL clause to specify convergence conditions
Iteration 1 2 3
4 5 6 7
8 S value 1024 512 128
64 32 16 8
4
36Recursive Model Solving
- Model can contain cyclic (recursive) formulas. -
If cyclic formulas desired, use ITERATE option-
If ITERATE not present, cyclic formulas
automatically detected, and an error
reported. - Use ITERATE clause to specify of iterations or
- Use UNTIL clause to specify convergence conditions
Iteration 1 2 3
4 5 6 7
8 S value 1024 512 128
64 32 16 8
4
37Recursive Model Solving
- Model can contain cyclic (recursive) formulas. -
If cyclic formulas desired, use ITERATE option-
If ITERATE not present, cyclic formulas
automatically detected, and an error
reported. - Use ITERATE clause to specify of iterations or
- Use UNTIL clause to specify convergence conditions
Iteration 1 2 3
4 5 6 7
8 S value 1024 512 128
64 32 16 8
4
38Recursive Model Solving with Until
- Model can contain cyclic (recursive) formulas. -
If cyclic formulas desired, use ITERATE option-
If ITERATE not present, cyclic formulas
automatically detected, and an error
reported. - Use ITERATE clause to specify of iterations or
- Use UNTIL clause to specify convergence conditions
Iteration 1 2 3
4 5 6 7
8 9 10 S value
1024 512 256 128
64 32 16 8
4 2
previous(s1)
- s1 512
39Recursive Model Solving with Until
- Model can contain cyclic (recursive) formulas. -
If cyclic formulas desired, use ITERATE option-
If ITERATE not present, cyclic formulas
automatically detected, and an error
reported. - Use ITERATE clause to specify of iterations or
- Use UNTIL clause to specify convergence conditions
Iteration 1 2 3
4 5 6 7
8 9 10 S value
1024 512 256 128
64 32 16 8
4 2
previous(s1)
- s1 256
40Recursive Model Solving with Until
- Model can contain cyclic (recursive) formulas.
They are automatically detected, and error is
reported. Unless cycles are intentional which is
indicated with ITERATE option - Use ITERATE clause to specify of iterations or
- Use UNTIL to specify convergence conditions. Stop
if true.
Iteration 1 2 3
4 5 6 7
8 9 10 S value
1024 512 256 128
64 32 16 8
4 2
previous(s1)
- s1 4
41 SQL Model Business Examples
42Time Series Calculation (1)
Compute the ratio of current month sales of each
product to sales one year ago, one quarter ago
and one month ago.
Assume Sales cube with product sales per year,
quarter, and month a time table mapping
periods to prior year, quarter and month
time table maps t to y_ago, q_ago, m_ago
Sales cube prod sales per y, q, m
t y_ago q_ago m_ago
1999-m01 1998-m01 1998-m10 1998-m12
1999-m02 1998-m02 1998-m11 1999-m01
1999-q01 1998-q01 1998-q04 NULL
1999-y 1998-y NULL NULL
t product sales
1999-m01 vcr 100.00
1999-m02 vcr 120.00
1999-q01 vcr 360.00
1999-y vcr 2200.00
43Time Series Calculation (2)
- Reference model with Time table acts like
look-up table - CV carries values from the left side to the
right side - Without Model, you need 3 outer joins and a
regular join
SELECT product, sales, r_y_ago, r_q_ago,
r_m_ago FROM sales_cubeMODEL REFERENCE r ON
(SELECT from time) DIMENSION BY (t) MEASURES
(y_ago, q_ago, m_ago)MAIN PARTITION BY (product)
DIMENSION BY (t) MEASURES (sales, 0
r_y_ago, 0 r_q_ago, 0 r_m_ago)RULES (
r_y_agoANY sCV(t) / s y_agoCV(t) ,
-- year ago r_q_agoANY sCV(t) / s
q_agoCV(t) , -- quarter ago
r_m_agoANY sCV(t) / s m_agoCV(t)
-- month ago )
44Time Series Calculation (3)
Compute the ratio of current period sales of each
product to sales a year ago, quarter ago and a
month ago. For each row, we use the reference
Model to find 3 other rows.
Sales cube prod sales per y, q, m
t product sales r_y_ago r_q_ago a_m_ago
1999-m01 vcr 100.00 0.050 0.280 0.830
1999-m02 vcr 120.00
1999-q01 vcr 360.00
1998-q04 vcr 370.00
1999-y vcr 2200.00
2000-y vcr 2100.00
45Time Series Calculation (3)
Compute the ratio of current period sales of each
product to sales a year ago, quarter ago and a
month ago. For each row, we use the reference
Model to find 3 other rows.
Sales cube prod sales per y, q, m
t product sales r_y_ago r_q_ago a_m_ago
1999-m01 vcr 100.00 0.050 0.280 0.830
1999-m02 vcr 120.00 0.055 0.330
1999-q01 vcr 360.00 0.160 0.970 null
1998-q04 vcr 370.00
1999-y vcr 2200.00
2000-y vcr 2100.00
46Time Series Calculation (3)
Compute the ratio of current period sales of each
product to sales a year ago, quarter ago and a
month ago. For each row, we use the reference
Model to find 3 other rows.
Sales cube prod sales per y, q, m
t product sales r_y_ago r_q_ago a_m_ago
1999-m01 vcr 100.00 0.050 0.280 0.830
1999-m02 vcr 120.00 0.055 0.330 ...
1999-q01 vcr 360.00 0.160 0.970 null
1998-q04 vcr 370.00 null
1999-y vcr 2200.00 1.050 null null
2000-y vcr 2100.00 null null
47Recursive Model Solving Ledger (1)
In my ledger, I have accounts Net income,
Interest, Taxes, etc.
- I want to have 30 of my Net income as Interest
(F1) - My Net income is Salary minus Interest, minus Tax
(F2) - Taxes are 38 of Gross (salaryinterest) and 28
of Capital_gain (F3)
48Recursive Model Solving Ledger (1)
In my ledger, I have accounts Net income,
Interest, Taxes, etc.
- I want to have 30 of my Net income as Interest
(F1) - My Net income is Salary minus Interest, minus Tax
(F2) - Taxes are 38 of Gross (salaryinterest) and 28
of Capital_gain (F3)
net
two cycles in the formulas
F1
F2
F3
interest
tax
interest
49Recursive Model Solving Ledger (2)
In my ledger, I know Salary Capital_gains.
What are my Net income, Interest expense
Taxes?
Input Ledger
Output
Account Balance salary
100,000 capital_gains 15,000 net
0 tax
0 interest 0
Account Balance salary
100,000 capital_gains 15,000 net
100,000 tax
42,220 interest 30,000
Iterate till accuracy of .01
after 1st iteration
50Recursive Model Solving Ledger (2)
In my ledger, I know Salary Capital_gains. What
is my Net Taxes?
Input Ledger
Output
Account Balance salary
100,000 capital_gains 15,000 net
0 tax
0 interest 0
Account Balance salary
100,000 capital_gains 15,000 net
27,800 tax
30,800 interest 8,340
Iterate till accuracy of .01
after 2nd iteration
51Recursive Model Solving Ledger (2)
In my ledger, I know Salary Capital_gains. What
is my Net Taxes?
Input Ledger
Output
Account Balance salary
100,000 capital_gains 15,000 net
0 tax
0 interest 0
Account Balance salary
100,000 capital_gains 15,000 net
48,735 tax
36,644 interest 14,620
Iterate till accuracy of .01
after reaching accuracy (26 iterations)
52Financial Functions NPV
NPV net present value of a series of periodic
cash flows.
Cash_Flow table
year i prod amount npv
1999 0 vcr -100.00
2000 1 vcr 12.00
2001 2 vcr 10.00
2002 3 vcr 20.00
1999 0 dvd -200.00
2000 1 dvd 22.00
2001 2 dvd 12.00
2002 3 dvd 14.00
53Financial Functions NPV
NPV net present value of a series of periodic
cash flows.
Cash_Flow table
year i prod amount npv
1999 0 vcr -100.00
2000 1 vcr 12.00
2001 2 vcr 10.00
2002 3 vcr 20.00
1999 0 dvd -200.00
2000 1 dvd 22.00
2001 2 dvd 12.00
2002 3 dvd 14.00
amount1/power(1rate,1) npv1-1
54Financial Functions NPV
NPV net present value of a series of periodic
cash flows.
Cash_Flow table
year i prod amount npv
1999 0 vcr -100.00
2000 1 vcr 12.00
2001 2 vcr 10.00
2002 3 vcr 20.00
1999 0 dvd -200.00
2000 1 dvd 22.00
2001 2 dvd 12.00
2002 3 dvd 14.00
amount2/power(1rate,2) npv2-1
55Financial Functions NPV
NPV net present value of a series of periodic
cash flows.
Cash_Flow table
year i prod amount npv
1999 0 vcr -100.00
2000 1 vcr 12.00
2001 2 vcr 10.00
2002 3 vcr 20.00
1999 0 dvd -200.00
2000 1 dvd 22.00
2001 2 dvd 12.00
2002 3 dvd 14.00
amount3/power(1rate,3) npv3-1
56Financial Functions NPV
NPV net present value of a series of periodic
cash flows.
Cash_Flow table
year i prod amount npv
1999 0 vcr -100.00
2000 1 vcr 12.00
2001 2 vcr 10.00
2002 3 vcr 20.00
1999 0 dvd -200.00
2000 1 dvd 22.00
2001 2 dvd 12.00
2002 3 dvd 14.00
amounti/power(1rate, i) npvi-1
57Financial Functions NPV (2)
NPV Net present value of a series of
periodic cash flows.
Cash_Flow table and npv for rate 0.14
year i prod amount npv
1999 0 vcr -100.00 -100.00
2000 1 vcr 12.00 -89.47
2001 2 vcr 10.00 -81.78
2002 3 vcr 20.00 -68.28
1999 0 dvd -200.00 -200.00
2000 1 dvd 22.00 -180.70
2001 2 dvd 12.00 -171.47
2002 3 dvd 14.00 -162.02
58 SQL Model Performance
59SQL Model Time Series
Earlier example ratio of sales to year, quarter
and month ago SELECT product, sales, r_y_ago,
r_q_ago, r_m_ago FROM sales_cubeMODEL REFERENCE
r ON (SELECT from time) DIMENSION BY (t)
MEASURES (y_ago, q_ago, m_ago)MAIN PARTITION BY
(product) DIMENSION BY (t) MEASURES
(sales, 0 r_y_ago, 0 r_q_ago, 0 r_m_ago)RULES
( r_y_agoANY sCV(t) / s
y_agoCV(t) , -- year ago
r_q_agoANY sCV(t) / s q_agoCV(t) ,
-- quarter ago r_m_agoANY sCV(t) /
s m_agoCV(t) -- month ago )
- ANSI SQL version needs outer join for each
formula plus a join for reference model. - N formulas, M reference models ? NM joins ? 4
joins in this example - sales_cube ? time ? sales_cube ?
sales_cube ? sales_cube
60SQL Model vs. ANSI Joins
Query response time
400
ANSI joins
350
300
250
200
SQL Model
150
100
50
Number of rules or joins
1
6
2
3
4
5
10
9
8
7
11
12
13
14
61Summary
- New facility for spreadsheet-like computations in
SQL - High Performance
- Replaces multiple joins, unions
- Scalable in size and parallel processing
- Powerful optimizations
- Collaborative analysis
- Move external processing such as spreadsheets
into RDBMs for manageability and consolidation
62Next Steps.
- Demonstration at Oracle DEMOgrounds
- Exhibit hall, Booth 1326, Database Area
- Monday 500 PM - 800, Tuesday 1030 - 100,
300 - 600, Wednesday 1100 - 430, Thursday
1030 - 200 - Hands-on Lab
- Marriott Hotel - Golden Gate B1
- Lab Section Use Information from your Data
WarehouseLesson 1 Using the SQL Model clause - Monday 1030 - 500, Tuesday 830 - 1230,
300 - 500,Wednesday 830 - 430,
Thursday 830 - 230
63Reminder please complete the OracleWorld
online session surveyThank you.
64A
65(No Transcript)