Title: Solution of Midterm3
1Solution of Midterm3
Lecture 24 CS 157 B
2Purpose of Normalization
- To reduce the chances for anomalies to occur in a
database. - normalization prevents the possible corruption of
databases stemming from what are called
"insertion anomalies," "deletion anomalies," and
"update anomalies."
3Normal Forms
- Each normal form is a set of conditions on a
schema that guarantees certain properties
(relating to redundancy and update anomalies) - First normal form (1NF) is the same as the
definition of relational model (relations sets
of tuples each tuple sequence of atomic
values) - Second normal form (2NF) a research lab
accident has no practical or theoretical value
wont discuss - The two commonly used normal forms are third
normal form (3NF) and Boyce-Codd normal form
(BCNF)
4(No Transcript)
5BCNF
- Definition A relation schema R is in BCNF if for
every FD X? Y associated with R either - Y ? X (i.e., the FD is trivial) or
- X is a superkey of R
- Example Person1(SSN, Name, Address)
- The only FD is SSN ? Name, Address
- Since SSN is a key, Person1 is in BCNF
6(non) BCNF Examples
- Person (SSN, Name, Address, Hobby)
- The FD SSN ? Name, Address does not satisfy
requirements of BCNF - since the key is (SSN, Hobby)
- HasAccount (AcctNum, ClientId, OfficeId)
- The FD AcctNum? OfficeId does not satisfy BCNF
requirements - since keys are (ClientId, OfficeId) and (AcctNum,
ClientId) not AcctNum.
7Third Normal Form
- A relational schema R is in 3NF if for every FD
X? Y associated with R either - Y ? X (i.e., the FD is trivial) or
- X is a superkey of R or
- Every A? Y is part of some key of R
- There is no X? Y for non-prime attributes X,Y.
- 3NF is weaker than BCNF (every schema that is in
BCNF is also in 3NF)
BCNF conditions
83NF Example
- HasAccount (AcctNum, ClientId, OfficeId)
- ClientId, OfficeId ? AcctNum
- OK since LHS contains a key
- AcctNum ? OfficeId
- OK since RHS is part of a key
- HasAccount is in 3NF but it might still contain
redundant information due to AcctNum ? OfficeId
(which is not allowed by BCNF)
9Example
- R1 (A1, A2, A3, A5)
- R2 (A1, A3, A4)
- R3 (A4, A5)
- FD1 A1 ? A3 A5
- FD2 A5 ? A1 A4
- FD3 A3 A4 ? A2
10Example (cont)
- A1 A2 A3 A4 A5
- R1 a(1) a(2) a(3) b(1,4)
a(5) - R2 a(1) b(2,2) a(3) a(4)
b(2,5) - R3 b(3,1) b(3,2) b(3,3) a(4)
a(5)
11Example (cont)
- By FD1 A1 ? A3 A5
- A1 A2 A3 A4 A5
- R1 a(1) a(2) a(3) b(1,4)
a(5) - R2 a(1) b(2,2) a(3) a(4)
b(2,5) - R3 b(3,1) b(3,2) b(3,3) a(4)
a(5)
12Example (cont)
- By FD1 A1 ? A3 A5
- we have a new result table
- A1 A2 A3 A4 A5
- R1 a(1) a(2) a(3) b(1,4)
a(5) - R2 a(1) b(2,2) a(3) a(4)
a(5) - R3 b(3,1) b(3,2) b(3,3) a(4)
a(5)
13Example (cont)
- By FD2 A5 ? A1 A4
- A1 A2 A3 A4 A5
- R1 a(1) a(2) a(3) b(1,4)
a(5) - R2 a(1) b(2,2) a(3) a(4)
a(5) - R3 b(3,1) b(3,2) b(3,3) a(4)
a(5)
14Example (cont)
- FD2 A5 ? A1 A4
- we have a new result table
- A1 A2 A3 A4 A5
- R1 a(1) a(2) a(3) a(4)
a(5) - R2 a(1) b(2,2) a(3) a(4)
a(5) - R3 a(1) b(3,2) b(3,3) a(4)
a(5)
15- FD1. A?C, FD2. B?C, FD3. C?D FD4. D, E?C,
FD5. C,E? A,
16R(A,B,C,D,E)
17FD1 A-gtC
18FD2 B-gtC
19FD3 C-gtD
20FD4 D,E-gtC
21FD5 C,E-gtA
22It is Lossless
23Multivalued Dependencies (cont)
- t1 a t2 a t3 a t4 a
- t3 b t1 b
- t3 R - b t2 R - b
- t4 b t2 b
- t4 R - b t1 R - b
- The multivalued dependency a ?? b says that the
relationship between a and b is independent of
the relationship between a and R - b.
24Multivalued Dependencies (cont)
- If the multivalued dependency a ?? b is satisfied
by all relations on schema R, then a ?? b is a
trivial multivalued dependency on schema R. - Thus, a ?? b is trivial if b C a or b 4 a R
- Tabular representation of a ?? b
25Multivalued Dependencies (cont)
- To illustrate the difference between functional
and multivalued dependencies, we consider again
the BC-schema. - Graph 1
26Multivalued Dependencies (cont)
- On graph 1, we must repeat the loan number once
for each address a customer has, and we must
repeat the address for each loan a customer has.
This repetition is unnecessary, since the
relationship between that customer and his
address is independent of the relationship
between that customer and a loan. - If a customer (say, Smith) has a loan (say, loan
number L-23), we want that loan to be associated
with all Smiths addresses.
27Multivalued Dependencies (cont)
- The relation on graph 2 is illegal, therefore to
make this relation legal, we need to add the
tuples (L-23, Smith, Main, Manchester) and (L-27,
Smith, North, Rye) to the bc relation of graph 2. - Graph 2 (an illegal bc relation)
28Multivalued Dependencies (cont)
- Comparing the preceding example with our
definition of multivalued dependency, we see that
we want the multivalued dependency to hold. - customer-name ?? customer-street customer-city
- As was the case for functional dependencies, we
shall use multivalued dependencies in two ways - 1. To test relations to determine whether they
are legal under a given set of functional and
multivalued dependencies. - 2. To specify constraints on the set of legal
relations we shall thus concern ourselves with
only those relations that specify a given set of
functional and multivalued dependencies.
293NF
- One FD structure causes problems
- If you decompose, you cant check all the FDs
only in the decomposed relations. - If you dont decompose, you violate BCNF.
- Abstractly AB ? C and C ? B.
- Example 1 title city ? theatre and theatre ?
city. - Example 2 street city ? zip,zip ? city.
- Keys A, B and A, C, but C ? B has a left
side that is not a superkey. - Suggests decomposition into BC and AC.
- But you cant check the FD AB ? C in only these
relations.
30Multivalued Dependencies
- The multivalued dependency X ?? Y holds in a
relation R if whenever we have two tuples of R
that agree in all the attributes of X, then we
can swap their Y components and get two new
tuples that are also in R. - X Y others
31Example
- Drinkers(name, addr, phones, beersLiked) with MVD
Name ?? phones. If Drinkers has the two tuples - name addr phones beersLiked
- sue a p1 b1
- sue a p2 b2
- it must also have the same tuples with phones
components swapped - name addr phones beersLiked
- sue a p2 b1
- sue a p1 b2
- Note we must check this condition for all pairs
of tuples that agree on name, not just one pair.
32Dependency Preservation
- Let F F1 È F2 È . È Fn.
- F is a set of functional dependencies on schema
R, but, in general, F ¹ F.
33Dependency Preservation
- A decomposition having the property
F F - is a dependency-preserving decomposition.
34(1) Normal forms are (a) classifications
of relations based on the types of modification
anomalies to which they are vulnerable. (b)
Techniques for preventing anomalies. (c) Both
(a) and (b). (d) None of the above. Answer
35(2) Given a relation schema and associated
functional dependencies, it is always possible to
a) find a dependency preserving
decomposition of the relation into BCNF b)
find a lossless join decomposition of the
relation into BCNF c) both (a) and (b)
d) none of the above Answer
36) Given relation schema R(A,B,C,D) with FDs F
AB?C BC?D A?B, then which of the
following statements is true? a) B?C is a
member of F b) ABC?D is a member of
F c) CD?CD is a member of F d) Both
(b) and (c)
37(4) Given the relation schema R(A,B,C) and
functional dependencies F AB? C, B?A
C?B . Which attribute(s) are prime, i.e. part
of a candidate key? a) only A b) only
B c) A and B d) B and C
38(5) Given the relation schema R(A,B,C) and
functional dependencies F A?B, B?C,
AC?B. What is the result of using the
Relational database design algorithm for
producing a database schema which is dependency
preserving and has the lossless join property for
relations in 3rd normal form? a) R1(A,B),
R2(B,C) and R3(A,C,B) b) R1(A,B) and
R2(A,C) c) R1(A,B) and R2(B,C)
d) none of the above
39(2) Which of the following are informal design
guidelines for relational schema? (a)Reduce
the redundant values in tuples (b)Reduce the
null values in tuples (c )Disallow the
potential for generating spurious tuples
(d)All of the above Answer
40 (3) Given the relation schema, DeptSales(DeptNo,
Dname, Month, Year, Sales) and the set of
functional dependencies, F DeptNo?Dname
DeptNo,Month,Year?Sales, then which of the
following functional dependencies is a valid
inference? (a)DeptNo?Sales (b)DeptNo,Month,Year?Dn
ame (c)Dname?Sales (d)None of the above Answer
41(5) Given relation schema R(A,B,C,D) with FDs F
AB?C BC?D A?B, then which of the
following statements is true? B?C is a member
of F ABC?D is a member of F CD?CD is a member
of F Both (b) and (c) Answer
42Q2. (1 mark) Imagine that we have the relation R
(ABCDE) with FD1. AB?C, FD2. AB?D, FD3.
C,D?A,B. (a)Decide whether this relation is in
3NF. Answer yes (b)Is it in in
BCNF? Answer no
43Q3.(1 mark) What is the higest normal form of
this table? Draw the functional dependencygraph
Prime attributes P, F, B Thus R is 3NF
44Q4. (1 mark) Given a transaction table D, find
the support and the confidence for an
association rule B,D ? E Answer Support
confidence
45Q5.(1 mark) Use the Prims algorithm find the
minimum spanning tree step by step . Start from s
46Q2. (1 mark) Suppose Prim's minimum-spanning tree
algorithm is applied to this graph, starting with
node C . List the edges that are chosen for the
MST, in the order they are chosen.
47Q6. (1 mark) Give a relation schema R(A,B,C,D)
and functional dependencies F , such that the
Table is 3NF but not BCNF. Answer
48Q5.(1 mark) Consider a relation R(A,B,C,D).
which contains the following four tuples A
B C D 1 2 3 1 1 2 4
1 1 2 3 3 5 1 6 2
f we know MVD AC?gtD , how many tuples
(including the above 4 tuples) at least R must
have ?
49Give an example of a table which is 3NF but not
BCNF Example 1 F1. A?BCD
F2. AB ?CD Wrong it is BCNF Example 2 F1. AB
?C F2. AB ?D
F3. C ?D Wrong Example 3 F1. ABC ?D
F2. B ?C Wrong Example 4 F1. A ?C
F2. B ?D Wrong
50Example 5 F1. ABC ?D F2. D
?B F3. B ?D Wrong Example 6
F1. A ?BC F2. D ?B Wrong
51(No Transcript)
52(No Transcript)
53(No Transcript)
54(No Transcript)
55(No Transcript)
56(No Transcript)
57(No Transcript)
58(No Transcript)
59(No Transcript)
60(No Transcript)
61(No Transcript)
62(No Transcript)
63(No Transcript)
64(No Transcript)
65(No Transcript)
66Q8. (2 marks) Use the Apriori algorithm, find the
frequent-item sets . Show all your steps.
67(No Transcript)
68Q4. (1 mark) Given the following data set
Find the support and confidence of 3? 2.
Solution
69Q8. (2 marks) Use the Apriori algorithm find the
frequent itemsets for the following transaction
table with min-supp 40.
Solution 5x40 2
70Q8.
L1
Item-set
a1 a2 a3 a5
2 3 3 3
L3
a2 a3 a5
2
Fequent-item set a1, a2, a3, a5, a1a3, a2a3,
a2a5, a3a5, a2a3a5