Midterm 3 Revision 1 - PowerPoint PPT Presentation

1 / 61
About This Presentation
Title:

Midterm 3 Revision 1

Description:

Note the very different meanings of the attribute salary' Note ... for all rows in S which have the same symbols in the columns corresponding to attributes in X ... – PowerPoint PPT presentation

Number of Views:18
Avg rating:3.0/5.0
Slides: 62
Provided by: Lee144
Learn more at: http://www.cs.sjsu.edu
Category:

less

Transcript and Presenter's Notes

Title: Midterm 3 Revision 1


1
Midterm 3 Revision 1
CS157A Lecture 19
  • Prof. Sin-Min Lee
  • Department of Computer Science

2
Revision
  • For each of the E-R problems, develop a
    relational schema
  • Identify the primary keys, foreign keys and
    attribute domains
  • Comment on the data quality of each of the
    relational schema that you have developed

3
Normalization
  • A process to design good relations
  • Not a requirement of the relational model or any
    other model
  • Attempts to reflect the semantics of the data.
    This meaning is determined by the organization.
  • What does salary mean? Profits? Stock Price?
  • Relations can be in 1 NF, 2 NF, 3 NF, Boyce-Codd
    NF, 4 NF, and 5 NF
  • In this revision, we will concern ourselves with
    1 NF, 2 NF, 3 NF and 4NF.

4
First Normal Form
  • All attributes must be single-valued
  • Each row can assume only one value for a given
    attribute
  • Consider employees working for a given department
    at a given salary level
  • EMPLOYEE(EMPNO, department, salary)
  • Now consider employees who works for a given
    department and receive a raise every year. We
    are interested in maintaining data about each
    employees yearly salary.
  • EMPLOYEE(EMPNO, YEAR, department, salary)
  • Note the very different meanings of the attribute
    salary
  • Note the change in primary key - why?

5
Second Normal Form
6
(No Transcript)
7
Third Normal Form
  • Consider the following A large urban university
    offers several types of scholarships. Lets
    assume that there are four types and we have a
    predefined code for each. The type of
    scholarship that a student is eligible for is
    based on their major. An analyst sets up the
    following database
  • (SSNO, name, major, GPA, scholarship type)
  • What problems are we likely to encounter? Why?
  • Note that scholarship type is not dependent on
    (determined by) SSNO. Rather, it is determined
    by major, which is determined by SSNO.
  • This represents a transitive dependency.

8
Third Normal Form
  • Decompose the relation into two relations
  • Student(SSNO, name, major, GPA)
  • Scholarship_Eligibility(Major, Scholarship_Type)
  • The 3 NF rule
  • Once you are in 2 NF, examine if there are any
    transitive dependencies
  • Engage in lossless decomposition so as to remove
    these transitive dependencies
  • You are now in 3 NF!

9
Review
  • Given R(A, B, C) and FA-gtB
  • Is R in 2NF?
  • Is R in 3NF?
  • Is R in BCNF

10
Review
  • Given R(A, B, C) and FA-gtB, B-gtC
  • Is R in 2NF?
  • Is R in 3NF?
  • Is R in BCNF

11
Review
  • Given R(A, B, C) and FA-gtB, B-gtC
  • Is R in 2NF?
  • Is R in 3NF?
  • Is R in BCNF

12
Lossless Join Property
  • A decomposition D R1, R2, ..., Rm of R has
    the lossless join property with respect to the
    set of dependencies F on R if for every relation
    instance r of R that satisfies F, the following
    holds
  •   (?ltR1gt(r), ... , ?ltRmgt(r)) r

13
(No Transcript)
14
 Testing for the lossless join property
  • Consider R(A, B, C, D, E) and
  • FAB ? C, B ? D
  • DR1(A, B, C), R2(B,D)
  • Is D a lossless decomposition?

15
Lossless Join Property
  • Consider R(A, B, C, D) and F AB ?C
  • Let D1R1(A, B, C), R2(C, D)
  •  Let r be one of the legal instance
  • A B C D
  • ----------------------------
  • a1 b1 c1 d1
  • a2 b1 c1 d2
  • a2 b2 c2 d2

16
Lossless Join Property
  • ?ltR1gt ?ltR2gt
  • A B C C D
  • ------------------- -----------
  • a1 b1 c1 c1 d1
  • a2 b1 c1 c1 d2
  • a2 b2 c2 c2 d2

17
Lossless Join Property
  • ?ltR1gt ?ltR2gt
  • A B C D
  • --------------------------------
  • a1 b1 c1 d1
  • a1 b1 c1 d2 lt------- Spurious tuple
  • a2 b1 c1 d1 lt-------- Spurious tuple
  • a2 b1 c1 d2
  • a2 b2 c2 d2
  • R1 and R2 are not a lossless decomposition.

18
Lossless Join Property
  • Informally, a decomposition is lossless if no
    spurious tuples appear when the relations in the
    decomposition are JOINed.
  • Thus, decomposition D1 is not lossless

19
 Testing for the lossless join property
  • Step1 create a matrix S with one row i for each
    relation Ri in the decomposition D, and one
    column j for each attribute Aj in R
  • Step 2 set S(i,j)bij for all matrix entries

20
 Testing for the lossless join property
  • Step 3.
  • for each row i representing relation schema Ri
  • for each column j representing attribute Aj
  • if Ri includes attribute Aj
  • then set S(i,j)aj

21
 Testing for the lossless join property
  • Step 4.
  • repeat the following until a loop execution
    results
  • in no changes to S
  • for each functional dependency X ? Y in F
  • for all rows in S which have the same
    symbols in the columns corresponding to
    attributes in X

22
 Testing for the lossless join property
  • Step 4 continued .
  • make the symbols in each column that correspond
  • to an attribute in Y be the same in all these
    rows as follows
  • if any of the rows has an "a" symbol for the
    column, set the other rows to the same "a" symbol
    in the column
  • if no "a" symbol exists for the attribute in any
    of the rows choose one of the "b" symbols that
    appear in one of the rows for the attribute and
    set the other rows to the "b" symbols in the
    column

23
 Testing for the lossless join property
  • Step 5
  • If a row is made up entirely of "a" symbols,
    then the decomposition has the lossless join
    property-otherwise, it does not
  •  

24
 Testing for the lossless join property
  • Consider R(A,B,C) and FA?B, B ? C
  • D R1(A,B), R2(B,C)
  • A B C A B C
  • ---------------------- ----------------------
  • R1 a1 a2 b13 gt a1 a2 a3
  • R2 b21 a2 a3 b21 a2 a3
  •  
  • Thus, D is a lossless decomposition.

25
(No Transcript)
26
Review
  • Given R(A, B, C) and
  • FFD1,FD2,FD3,,FDk
  • Is R in 2NF?
  • Is R in 3NF?
  • Is R in BCNF
  • which is in 3NF but not in BCNF.

27
(No Transcript)
28
(No Transcript)
29
(No Transcript)
30
(No Transcript)
31
Overview
  • It is possible to decompose any relational schema
    into a set of relational schemas with the
    following properties
  • 1) Attribute Preserving
  • 2) FDs preserving
  • 3) Lossless Join

32
(No Transcript)
33
(No Transcript)
34
BCNF normalisation
35
(No Transcript)
36
The Decomposition Algorithm
  • Input A relational schema R and a set of FDs F.
  • Output A set of relational schemas R1, R2, ...,
    Rm

37
The Decomposition Algorithm
  • Step 1. Find a minimal cover G for F
  • Step 2. For each left-hand side X that appears in
    G, create a relation schema with attributes
  • X ? A1 ? A2, ... , ? Am
  • where X ? A1, X ? A2, ... , X ? Am are all
    dependencies in G with X as left-hand side.
  • Step 3. If none of the relation schemas contains
    a key of R, create one more relation schema that
    contains attributes that form a key for R.
  •  

38
Example
  • Consider R(A, B, C, D, E) and
  • FAB ? C, A ? BE, C ?E
  • Step 1. minimal cover
  • FminA?C, A?B, C?E
  • Step 2. R1(A,B,C), R2(C,E)
  • Step 3. Key A,D
  • we have R3(A,D)
  •  Final Result
  • R1(A, B,C), R2(C,E), and R3(A,D)

39
BCNF Decomposition
  • Property LJ1
  • A decomposition DR1, R2 of R has the lossless
    join property with respect to a set of functional
    dependencies F on R if and only if either
  • the FD ((R1 ? R2) ? (R1 - R2)) is in F, or
  • the FD ((R1 ? R2) ? (R2 - R1)) is in F

40
BCNF Decomposition
  • Property LJ2
  • If a decomposition DR1, R2, ..., Rm of R has
    the lossless join property with respect to a set
    of functional dependencies F on R, and if a
    decomposition D1Q1, Q2, ... ,Qk of Ri has the
    lossless join property with respect to the
    projection of F on Ri, then the decomposition
  • D2R1, R2, ... Ri-1, Q1, Q2, ..., Qk, Ri1,
    ..., Rm of R has the lossless join property with
    respect to F

41
BCNF Decomposition - Algorithm
  • 1. Set D ? R
  • 2. While there is a relation schema Q in D that
    is not in BCNF do
  • begin
  • choose a relation schema Q in D that is not in
    BCNF
  • find a functional dependency X ? Y in Q that
    violates BCNF
  • replace Q in D by two schemas
  • (Q-Y) and (X ? Y)
  • end
  •  

42
BCNF Decomposition - Example
  • Consider R(A,B,C,D) and
  • FA ? B, B ? C, D ? B
  • Decompose R into BCNF relations.
  • Step 1.DR(A,B,C,D)
  • Step 2. Loop 1.R is not in BCNF because A ? B and
    A is not a superkey
  • decompose R into R1(A, C, D), and R2(A, B)
  • Loop 2. R1 is not in BCNF because A ? C and A
    is not a superkey
  • decompose R1 into R11(A, D) and R12(A, C)
  • ResultDR11(A,D), R12(A,C), R2(A,B)
  •  

43
(No Transcript)
44
Questions
  • Is the decomposition always unique?
  • Is a decomposition still useful if it does not
    preserve the lossless property?
  •  Is a decomposition still useful if it does not
    preserve the dependency preservation condition?
  •  Is a decomposition still useful if it does not
    preserve the attribute preservation condition? 

45
Overview
  • Given a relation schema R(A1, A2, ... , An).
  • If R is not in the third normal form (3NF), we
    wan to decompose it into a set of relation schema
    D R1, R2, ... ,Rm , where each Ri is in 3NF,
    such that the following conditions are held
  • Attribute preservation condition.
  • Dependency preservation condition
  • Lossless join condition

46
(No Transcript)
47
Attribute Preservation Condition
  • Attribute preservation condition states that the
    union of attributes of Ri equal to the set of
    attributes of R.
  •  For example Given R(A, B, C, D) and the
    decomposition
  • D1 R1(A,B), R2(B,C) and R3(A,C,D). D1
    satisfies the attribute preservation condition.

48
Attribute Preservation Condition
  • Given R(A, B, C, D) and the decomposition
  • D2R1(A, B), R2(B,C), R3(A, C),
  • The attribute preservation condition is violated
    because D is missing (not preserved in the
    decomposition).

49
Dependency Preservation Condition
  • We say that a decomposition DR1, R2, ... , Rm
    of R is dependency preserving with respect to F
    if the union of the projections of F on each Ri
    in D is equivalent to F. That is
  • ((?F(R1) ? ... ? ?F(Rm)) F
  • Given a set of dependencies F on R, the
    projection of F on Ri, denoted by ?F(Ri) where Ri
    is a subset of R, is the set of dependencies X ?
    Y in F such that the attributes in X ? Y are all
    contained in Ri.

50
Dependency Preservation Condition
  • Given R(A, B, C, D) and F A ? B, B ? C, C ?
    D
  •  Let D1R1(A,B), R2(B,C), R3(C,D)
  • ?F(R1)A ? B
  • ?F(R2)B ? C
  • ?F(R3)C ? D
  • FDs are preserved.

51
Dependency Preservation Condition
  • Given R(A, B, C, D) and F A ? B, B ? C, C ?
    D
  •  Let D2R1(A,B, R2(B,C), R3(A, D), then FDs
    are not preserved.
  •  

52
Dependency Preservation Condition
  • Given R(A, B, C, D) and F A ? B, B ? C, C ?
    D
  •  Let D2R1(A,B, R2(B,C), R3(A, D), then FDs
    are not preserved.
  •  

53
Dependency Preservation Condition
  • We want to preserve the dependencies because each
    dependency in F represents a constraint on a
    database.

54
Dependency Preservation Condition
  • If one of the dependencies is not represented by
    the dependencies on some individual relation Ri
    of the decomposition, we will not be able to
    enforce this constraint by looking only at an
    individual relation, instead, to enforce the
    constraint, we will have to join two or more of
    the relations in the decomposition and then check
    that functional dependency hold in the result of
    the join operation. This is very inefficient.

55
(No Transcript)
56
Picture of MVD X -gt-gtY
X Y others equal exchange
57
MVD Rules
  • Every FD is an MVD.
  • If X -gtY, then swapping Y s between two tuples
    that agree on X doesnt change the tuples.
  • Therefore, the new tuples are surely in the
    relation, and we know X -gt-gtY.
  • Complementation If X -gt-gtY, and Z is all the
    other attributes, then X -gt-gtZ.

58
Fourth Normal Form
  • The redundancy that comes from MVDs is not
    removable by putting the database schema in BCNF.
  • There is a stronger normal form, called 4NF, that
    (intuitively) treats MVDs as FDs when it comes
    to decomposition, but not when determining keys
    of the relation.

59
4NF Definition
  • A relation R is in 4NF if whenever X
    -gt-gtY is a nontrivial MVD, then X is a
    superkey.
  • Nontrivial means that
  • Y is not a subset of X, and
  • X and Y are not, together, all the attributes.
  • Note that the definition of superkey still
    depends on FDs only.

60
BCNF Versus 4NF
  • Remember that every FD X -gtY is also an MVD, X
    -gt-gtY.
  • Thus, if R is in 4NF, it is certainly in BCNF.
  • Because any BCNF violation is a 4NF violation.
  • But R could be in BCNF and not 4NF, because
    MVDs are invisible to BCNF.

61
(No Transcript)
Write a Comment
User Comments (0)
About PowerShow.com