Title: Explicit and Implicit LIST Aggregate Function for Relational Databases
1Explicit and Implicit LIST Aggregate Function for
Relational Databases
- Witold LitwinUniversité Paris 9 Dauphine
- mailtoWitold.litwin_at_dauphine.fr
2Summary
- New Aggregate Function
- Transforms a set of values into single one
- Char type
- A basic long time need
- Should be highly useful
3Plan
- Motivating Examples
- Explicit LIST
- Implicit LIST
- Conlusion
- Further Work
4Motivating Example 1
- The Supplier-Part (SP) table of the best-known
S-P database
5Motivating Example 1
select SP.S, Sum(SP.Qty) AS Total Qty from
SP group By SP.S S Total Qty S1 1300 S2 700
S3 200 S4 900
- How to get also the individual quantities ?
6Motivating Example 2
- A database of persons having
- Multiple Hobbies
- Multiple preferred Restaurants
- Many Friends
- Best design
- four 4-NF tables
- P (SS , Name), H (SS, Hobby), R (SS, Rest), F
(SS, Friend)
7Database
8Fragment
9Query
Select Name, Friends, Restaurants, Hobbies, of
Person SS1
SQL
select P.SS, P.Name, F.Friend, R.Rest,
H.Hobby from ((P INNER JOIN F ON P.SS
F.SS) INNER JOIN H ON P.SS H.SS)
INNER JOIN R ON P.SS R.SS where P.SS
"ss1"
10Result
Usable ???
11General Problem
- Current RDBs manage tables in 1NF
- All attributes are single-valued (atomic values)
- Example 1 We wished
- Single-valued attribute
- SUM(QTY)
- Multi-valued attribute
- Individual quantities
- The result would not be 1NF
12General Problem
- RDB manages tables in 1NF
- All attributes are single-valued
- Example 2 We wished
- Single-valued attributes
- S, Name
- Multi-valued attributes (multi-sets)
- Hobby, Rest, Friend
- The result is normalized to 1NF
- (ss1,Witold, x, y, z) x ? Hobby, y ? Rest, z
? Friend - The table is not in 4NF
- Subject to well-known anomalies
13Solutions
- Design RDBS for 0NF tables
- A revolution
- 0NF RDBS will not be here for years
- Aggregate set or multi-set values into atomic
values - An evolution
- All RDBS already do it using
- SUM, AVG, COUNT
- perhaps with GROUP BY
- We need a new aggregate leaving the entire set
visible - E.g (multi)-set of values X gt (single) list
of values X
14Local Culinary Example
- The set-valued attribute
- (Schwarz, Wälder Kirchen, Chocoladen, Torte)
- The aggregated attribute
- Schwarzwälderkirchenchocoladentorte
- Local specialty, try it !
15Explicit LIST function
Select S, sum (Qty) AS Total Qty, LIST (Qty)
AS Histogram from SP group by S
16Explicit LIST function
select P.SS, Name, LIST (DISTINCT (Friend)),
LIST (DISTINCT (Rest)), LIST (DISTINCT (Hobby))
from P, F, R, H where P.SS F.SS and F.SS
R.SS and R.SS H.SS and P.SS "ss1" group
by P.SS, Name
17Explicit LIST function
- Simulated actual output using MsAccess forms
with list boxes - Form with three subforms
- No SQL query used
18Explicit LIST function
- select P, SUM (Qty) as Total Qty, LIST (S,
Qty) as Per supplier - from SP
- group by P
19Implicit LIST function
- For any single-valued A
- A LIST (A)
- Any non-aggregated attribute in an SQL query has
to be in the GROUP BY clause - Now, any non-aggregated perhaps composite
attribute A from a single table and not in GROUP
BY clause is implicitly under - LIST (DISTINCT (A))
- Queries may become less procedural
20Implicit LIST function
- select P, SUM (Qty) as Total Qty, S, Qty
- from SP
- group by P
- having S QTY like s4
- Implicit LIST is LIST (S, QTY)
21Implicit LIST function
- Query
- Select S., P, Qty From S, SP Where S.S SP.S
- Repeats all the data of the supplier S in
every resulting tuple - 6 times for S1 its Name, City, Status
- Query
- Select S., P, Qty From S, SP Where S.S SP.S
- Group By S.S
- Does it only once per supplier
- Less redundancy
22Implicit Joins and From
- Equijoins following the referential semantic
links or integrity may be implicit - MsAccess, SQL Server
- FROM clause content can be inferred from the
attribute names - Even less procedural formulation may result
- select P.SS, Name, Friend, Rest, Hobby
- group by P.SS, Name
23Implementation Issues
- Should be easy for the RDBS owner
- Any RDB already processes the aggregates
Already done hiding the list
Should also be shown
24Implementation Issues
- For explicit LIST, foreign function interface may
suffice - Oracle, DB2, Yukon
- See related work in the paper for current
(limited) proposals - Oracle iAnywhere (core code)
- Not for the implicit LIST
- Access to core code is necessary
25Conclusion
- LIST is a new aggregate function
- Aggregates a multi-valued attribute into a single
value - Responds to a long-standing fundamental RDBS user
need - - 30 years ?
- Should be rather easy to implement
- Future work should start with the implementation
- Using foreign functions for explicit LIST
26Research Support
- European Commission ICONS Project
- no. IST-2001-32429.
- Microsoft Research
27Thank You for Your Attention
Witold LitwinUniversité Paris 9
Dauphine mailtoWitold.litwin_at_dauphine.fr