Title: Multi-valued Dependencies and Fourth Normal Form
1Multi-valued Dependencies and Fourth Normal Form
2Topics Covered
- Definition of Multivalued Dependencies
- Reasoning about MVDs
- Fourth Normal Form
3Motivation
- There are schemas that are in BCNF that do not
seem to be sufficiently normalized
Stars
name
street
city
title
year
C. Fisher
123 Maple Str.
Hollywood
Star Wars
1977
C. Fisher
5 Locust Ln.
Malibu
Star Wars
1977
C. Fisher
123 Maple Str.
Hollywood
Empire Strikes Back
1980
C. Fisher
5 Locust Ln.
Malibu
Empire Strikes Back
1980
C. Fisher
123 Maple Str.
Hollywood
Return of the Jedi
1983
5 Locust Ln.
C. Fisher
Malibu
Return of the Jedi
1983
4Attribute Independence
- No reason to associate address with one movie and
not another - When we repeat address and movie facts in all
combinations, there is obvious redundancy - However, NO BCNF violation in Stars relation
- There are no non-trivial FDs at all, all five
attributes form the only superkey - Why?
5Multi-valued Dependency
- Definition Multivalued dependency (MVD)
- A1A2An ?? B1B2Bm holds for relation R if
- For all tuples t, u in R
- If tA1A2...An uA1A2...An, then there
exists a v in R such that - (1) vA1A2...An tA1A2...An
uA1A2...An - (2) vB1B2Bm tB1B2Bm
- (3) vC1C2Ck uC1C2Ck, where
C1C2Ck is all - attributes in R except (A1A2...An ?
B1B2Bm)
6Pictorially Speaking...
As
Bs
Others
t
v
u
w
- An MVD guarantees v exists
- The existence of a fourth tuple w is implied by
interchanging t and u
7Example name ?? street city
Stars
name
street
city
title
year
C. Fisher
123 Maple Str.
Hollywood
Star Wars
1977
t
C. Fisher
123 Maple Str.
Hollywood
Empire Strikes Back
1980
v
C. Fisher
5 Locust Ln.
Malibu
Empire Strikes Back
1980
u
8Example name ?? street city
Stars
name
street
city
title
year
C. Fisher
123 Maple Str.
Hollywood
Star Wars
1977
u
C. Fisher
5 Locust Ln.
Malibu
Star Wars
1977
w
C. Fisher
123 Maple Str.
Hollywood
Empire Strikes Back
1980
v
C. Fisher
5 Locust Ln.
Malibu
Empire Strikes Back
1980
t
9More on MVDs
- Intuitively, A1A2An ?? B1B2Bm says that the
relationship between A1A2An and B1B2Bm is
independent of the relationship between A1A2An
and R -B1B2Bm - MVD's uncover situations where independent facts
related to a certain object are being squished
together in one relation - Functional dependencies rule out certain tuples
from being in a relation - How?
- Multivalued dependencies require that other
tuples of a certain form be present in the
relation - a.k.a. tuple-generating dependencies
10Lets Illustrate
- In Stars, we must repeat the movie (title, year)
once for each address (street, city) a movie star
has - Alternatively, we must repeat the address for
each movie a star has made - Example Stars with name ?? street city
name
street
city
title
year
C. Fisher
123 Maple Str.
Hollywood
Star Wars
1977
C. Fisher
5 Locust Ln.
Malibu
Empire Strikes Back
1980
C. Fisher
123 Maple Str.
Hollywood
Return of the Jedi
1983
- Is an incomplete extent of Stars
- Infer the existence of a fourth tuple under the
given MVD
11Trivial MVDs
- Trivial MVD
- A1A2An ?? B1B2Bm where B1B2Bm is a subset
of A1A2An or (A1A2An ? B1B2Bm ) contains all
attributes of R
12Reasoning About MVDs
- FD-IS-AN-MVD Rule (Replication)
- If A1A2An ? B1B2Bm then
- A1A2An ?? B1B2Bm holds
13Reasoning About MVDs
- COMPLEMENTATION Rule
- If A1A2An ?? B1B2Bm then A1A2An ?? C1C2Ck
where C1C2Ck is all attributes in R except
(A1A2An ? B1B2Bm ) - AUGMENTATION Rule
- If X??Y and W?Z then WX ??YZ
- TRANSITIVITY Rule
- If X??Y and Y??Z then X ?? (Z-Y)
14Coalescence Rule for MVD
X ?? Y
Then X ? Z
If
?
?
?WW ? Z
Remark Y and W have to be disjoint and Z has to
be a subset of or equal to Y
15Definition 4NF
- Given relation R and set of MVD's for R
- Definition R is in 4NF with respect to its MVD's
if for every non-trivial MVD A1A2An??B1B2Bm ,
A1A2An is a superkey - Note Since every FD is also an MVD, 4NF implies
BCNF - Example Stars is not in 4NF
16Decomposition Algorithm
- (1) apply closure to the user-specified FD's and
MVD's - (2) repeat until no more 4NF violations
- if R with AA -gtgt BB violates 4NF then
- (2a) decompose R into R1(AA,BB) and
-
R2(AA,CC), where CC is all - attributes in R except (AA ?
BB) - (2b) assign FD's and MVD's to the new
relations - MVD's hard problem!
- No simple test analogous to computing the
attribute closure for FDs exists for MVDs. You
are stuck to have to use the 5 inference rules
for MVDs when computing the closure!
17Exercise
- Decompose Stars into a set of relations that are
in 4NF. - name??street city is a 4NF violation
- Apply decomposition
- R(name, street, city)
- S(name, title, year)
- What about name??street city in R and name??title
year in S?
18Exercise
- For the relation R(A,B,C,D) with only MVDs
A??B and A??C find all 4NF violations and
decompose R into a collection of relation schemas
in 4NF.
19Solution
- Since there are no functional dependencies, the
only key is all four attributes, ABCD. - Thus, each of the nontrivial multivalued
dependencies A-gt-gtB and A-gt-gtC violate 4NF. - Separate out the attributes of these
dependencies, first decomposing into AB and ACD - Then decompose the latter into AC and AD because
A-gt-gtC is still a 4NF violation for ACD. - The final set of relations are AB, AC, and AD.
20Exercise
- Suppose we have relation R(A,B,C) with MVD
A??B. If we know that the tuples (a,b1,c1),
(a,b2,c2), and (a,b3,c3) are in the current
instance of R, what other tuples do we know must
also be in R?
21Solution
- Since A-gt-gtB, and all the tuples have the same
value for attribute A, we can pair the B-value
from any tuple with the value of the remaining
attribute C from any other tuple. - Thus, we know that R must have at least the nine
tuples of the form (a,b,c), where b is any of b1,
b2, or b3, and c is any of c1, c2, or c3. That
is, we can derive, using the definition of a
multivalued dependency, that each of the tuples
(a,b1,c2), (a,b1,c3), (a,b2,c1), (a,b2,c3),
(a,b3,c1), and (a,b3,c2) are also in R.
22Another View of 4NF
True MVD X??Y non-trivial X?Y does not hold
s
True MVDs
MVDs that are also FDs
Remark If X??Y is a true MVD then X cannot be a
superkey (because X?Y does not hold) Therefore,
true MVDs always violate 4NF (true MVDs are
always bad)
Trivial MVDs
4NF Relation is in BCNF and there are no true
MVDs (yellow part is empty)
X??Y and X?Y
X??Y and not X?Y
X??Y
23H1-2005-Problem8
- 8) Normalization 6 graded
- R(A,B,C,D,E,F) is given with (1) AB?CD (2)CD?AB
(3)AB?F (4) F?E - What are the candidate keys of relation R? 1
- b) Transform R into a relational schema that is
in BCNF and does not have any lost functional
dependencies! 5 - Correct Solution
- Candidate keys AB and CD
- Decompose R into R1(A,B,C,D,F) with local FDs
(1), (2), (3) and R2(E,F) with local FDs (4) Due
to the fact that all four dependencies are still
present no functional dependency has been lost.
Moreover, all functional depencies are good - A non-optimal (too many relations) solution I
also saw was Decompose R into R1(A,B,C,D) with
local functional dependencies AB?CD and CD?AB,
R2(A,B,F) with local functional dependencies AB?F
and R3(F,E) with local functional dependencies
F?A..
24Problem 1 H1-2004
- Candidate keys are a,b, a,d, a,e
- 14 superkeys total
- All but the first functional dependency are bad
25Problem 2 H2-04
- No E??BC is a true multi-valued dependency and
E is not a candidate key (as a matter of fact
EA,D,E,F see below) - No (but just mentioning neither E ?ABC nor E? CF
holds is not sufficient (e.g. if E??ABC holds
then the decomposition is lossless!) ) --- a
counter examples should be given to show that the
statement is false! - Yes C is candidate key therefore C?BDEF
therefore C??BDEF - Yes E ??BC and BC ??BCD implies E??D due to
MVD-transitivity (C?CD?BC?BCD ? BC??BCD) - Yes E??BC therefore E??ADF moreover, C?ADF and
using the Coalescence Rule we obtain E?ADF
therefore, E?A holds - No R is not in BCNF because E?ADF holds and E is
not a candidate key.
26Problem 3a-2004
- From A??B and A??C we can infer A??BC??
- (1) A??C ?A??AC
- (2) A??B ? AC??ABC
- (3) AC??ABC ? AC??DE
- (4) A??AC, AC??DE ?A??DE
- (5) A??DE ?A??BC
- Using 1. Augmentation, 2.Augmentation,
3.Complementation, 4.Transitivity, 5.
Complementation
Wrong!!
Remark This problem will be revised in
Homework3-2005 it is too complicated to worry
about it for the midterm exam!
27MDVs and FDs --- Ungraded Homework
- Assume we have a relation R(A,B,C,D,E) with the
following dependencies - (1)Â Â AB? CDE
- (2)Â Â CD ? ABE
- (3)Â Â E?? DB
- Answer the following questions giving reasons for
your answers - a)Â Â Â Â Â Is R in BCNF? ????? (answer after Spring
break) Warning The presence of the MVD might
imply other functional dependencies (see textbook
page 637) - b)Â Â Â Â Â Does ABE ? D hold for R? yes
- c)Â Â Â Â Â Does CD ?? B hold for R? yes
- d)Â Â Â Â Â Does E?? D always hold for R (either show
that this dependency can be inferred from the
given 3 dependencies, or give a counter example
of a relation that satisfies (1), (2), (3) but
violates E??D)? No