Closure - PowerPoint PPT Presentation

About This Presentation
Title:

Closure

Description:

Star Wars. Paramount. Paramount. Fox. studioName. Mike Meyers ... Star Wars. year. title. MovieStudioStar(title, year, length, studioName, starName, filmType) ... – PowerPoint PPT presentation

Number of Views:21
Avg rating:3.0/5.0
Slides: 29
Provided by: tson
Learn more at: https://www.cs.nmsu.edu
Category:

less

Transcript and Presenter's Notes

Title: Closure


1
Closure
  • The closure of B1Bk under the set of FDs S,
    denoted by B1Bk, is defined as follows
  • B1Bk B any relation satisfies S
  • will also satisfies B1Bk?B

2
Computing the closure
  • Given the set S and A1,,An
  • Compute A1,,An- denote this set by X
  • Step 1 X A1,,An
  • Step 2 find a FD B1Bk?B in S such that
    B1,,Bk ? X and B ?X, then XX?B
  • Step 3 repeat step 2 until nothing more can be
    added to X, then go to step 4
  • Step 4 return X

3
Example
  • S AB?C, BC?AD, D?E, CF?B
  • Compute A,B
  • Step 1 X A,B
  • Step 2 X X ?CA,B,C because AB?C
  • Step 3 back to step 2 X X ?D because BC?AD
  • Step 3 back to step 2 X X ?E because D?E
  • Step 3 back to step 2 nothing more
  • Step 3 go to step 4 return A,B,C,D,E

4
Correctness of closure algorithm
  • It computes true functional dependencies
  • proof show that if B belongs to A1,,An then
    A1An?B holds. By induction over the number of
    steps (n) used in adding an attribute B into the
    set X
  • n0 then B belongs to A1,,An and so A1An?B is
    a trivial functional dependency
  • n ? n1 if B is added to X in the step n1, then
    A1An?Bj for all j by inductive hypothesis this,
    together with B1Bk?B, implies that A1An?B
  • It computes all functional dependencies
  • proof show that if B does not belong to
    A1,,An then A1An?B does not hold. By
    constructing an instance I of the relation R such
    that the FD does not hold.

A in the closure Others 111 111
000 000 111 111 111 111
5
Simple questions
  • What is A1,,An if A1,,An is a key of the
    relation?
  • Can A1,,An A1,,An?
  • Does B1,,Bm? A1,,An imply B1,,Bm
    ?A1,,An?

6
Transitive Rules
  • Given A1An ? B1Bm
  • B1Bm ? C1Ck
  • then A1An ? C1Ck

7
Closing sets of FDs
  • Given a set of FDs we can derive some other FDs
    using the rules about FDs (e.g. combining,
    splitting, and transitive)
  • For a relation R, a set of FD is called a basis
    for R if all other FDs of R can be derived form
    it.
  • A basis is minimal if none of its proper subsets
    is a basis.

8
Projecting FDs
Bs
R
S
  • Given
  • R with a set of FDs F
  • S (a new relation) is obtained by removing the
    attributes B1,,Bm from R
  • Questions What are the FDs of S?
  • Answer if A1An?C1Ck is a FD of R and none of
    the Bs appears on the left or right side
    (B1,,Bm?A1,,An,C1,,Ck?) is a FD of S

9
Projecting - Example
  • Given R(A,B,C,D) with the FDs A?B, B?C, and C?D.
  • Remove the attribute B from R, we obtain a new
    relation S(A,C,D).
  • What are the FDs of S?
  • A?C?
  • A?D?
  • C?D?
  • We can compute this by
  • Compute all the closure of every subset of
    A,C,D
  • by using the FDs of R that do not contain B.

10
Homework
  • 3.5.1 Consider a relation with schema R(A,B,C,D)
    and FDs AB?C, C?D, and D?A.
  • What are all the nontrivial FDs that follow from
    the given FDs? List only the FDs with one
    attribute on the right? (5pt)
  • What are the keys of R?
    (5pt)
  • What are the superkeys but not keys?
    (5pt)
  • 3.5.3 Show that the following rule holds
    (5pt)
  • if A1An?B1Bm and C1Ck?D1Dt hold
  • then A1AnC1Ck? B1BmD1Dt also holds.

11
For those whole like fun
  • 3.5.4 Does the following hold
  • if A?B then B?A
  • if AB?C and A?C then B?C
  • 3.5.8 A set of attributes is closed if XX. What
    are the FDs of a relation R(A,B,C,D) if
  • all sets of four attributes are closed
  • the only closed sets are and A,B,C,D
  • the closed sets are , A,B, A,B,C,D
  • (note the cases are considered separate)
  • Stars try the exercises with stars.

12
Design of Relational Database Schema
title year length studioName starName filmType
Star Wars 1977 124 Fox Mark Hamill color
Star Wars 1977 124 Fox Harrison Ford color
Star Wars 1977 124 Fox Carrie Fisher color
Mighty Ducks 1991 104 Disney Emilio Estevez color
Waynes World 1992 95 Paramount Dana Carvey color
Waynes World 1992 95 Paramount Mike Meyers color
  • Some observations
  • value of studioName is the same in several
    tuples
  • value of filmType is also repeated

CAN WE AVOID THESE ANORMALIES?
  • What wrong with it?
  • redundancy ? store the same value
    unnecessary several time
  • update anormalies ? an update might require
    several changes
  • deletion anormalies ? losing information if
    delete a value

13
Possible ways to avoid anormalies (Intuition)
  • The bad way start again (Oh, no!)
  • The natural way try to decompose the given
    relation into two or more relations that
  • contain the same information
  • avoid the anormalies

14
Example
title year length studioName starName filmType
Star Wars 1977 124 Fox Mark Hamill color
Star Wars 1977 124 Fox Harrison Ford color
Star Wars 1977 124 Fox Carrie Fisher color
Mighty Ducks 1991 104 Disney Emilio Estevez color
Waynes World 1992 95 Paramount Dana Carvey color
Waynes World 1992 95 Paramount Mike Meyers color
title year length studioName filmType
Star Wars 1977 124 Fox color
Mighty Ducks 1991 104 Disney color
Waynes World 1992 95 Paramount color
title year starName
Star Wars 1977 Mark Hamill
Star Wars 1977 Harrison Ford
Star Wars 1977 Carrie Fisher
Mighty Ducks 1991 Emilio Estevez
Waynes World 1992 Dana Carvey
Waynes World 1992 Mike Meyers
MovieStudioStar(title, year, length, studioName,
starName, filmType) is decomposed into 2
relations MovieStudio(title, year, length,
studioName, filmType) and StarsIn(title, year,
starName)
15
Decomposition
  • Given a relation R with schema A1,,An. A
    decomposition of R into two relations S and T
    with schemas B1,,Bm and C1,,Ck,
    respectively, such that
  • A1,,An B1,,Bm ? C1,,Ck
  • The tuples in S are the projections onto
    B1,,Bm of all the tuples in R.
  • The tuples in T are the projections onto
    C1,,Ck of all the tuples in R.

16
Example Projections
title year length studioName starName filmType
Star Wars 1977 124 Fox Mark Hamill color
Star Wars 1977 124 Fox Harrison Ford color
Star Wars 1977 124 Fox Carrie Fisher color
Mighty Ducks 1991 104 Disney Emilio Estevez color
Waynes World 1992 95 Paramount Dana Carvey color
Waynes World 1992 95 Paramount Mike Meyers color
How do we come up with this decomposition?
title year length studioName filmType
Star Wars 1977 124 Fox color
Mighty Ducks 1991 104 Disney color
Waynes World 1992 95 Paramount color
title year starName
Star Wars 1977 Mark Hamill
Star Wars 1977 Harrison Ford
Star Wars 1977 Carrie Fisher
Mighty Ducks 1991 Emilio Estevez
Waynes World 1992 Dana Carvey
Waynes World 1992 Mike Meyers
MovieStudioStar(title, year, length, studioName,
starName, filmType) is decomposed into 2
relations MovieStudio(title, year, length,
studioName, filmType) and StarsIn(title, year,
starName)
17
Boyce-Codd Normal Form (BCNF)
  • BCNF a relation R is in BCNF iff whenever there
    is a nontrivial FD A1An?B for R, it is the case
    that A1,,An is a superkey for R.
  • Why this definition? Answer if a relation is in
    BCNF then there is no anormaly.
  • Example

MovieStudioStar(title, year, length, studioName,
starName, filmType) not in BCNF MovieStudio(title
, year, length, studioName, filmType) in BCNF
StarsIn(title, year, starName) in BCNF
18
Decomposition into BCNF
  • Suppose that we decompose a relation R into two
    relations S and T which are in BCNF. The
    requirements for S and T
  • S and T is a decomposition of R
  • it is possible to reconstruct R from S and T
  • Will every decomposition of R satisfy these two
    conditions?
  • What are the FDs of the new relations?

19
Algorithm
  • Given a relation R with the attributes A1,,An.
  • Step 1 For every nontrivial FD B1Bm?B if
    B1,,Bm is a superkey then returns R (no
    decomposition is needed)
  • Step 2 Takes a nontrivial FD B1Bm?B such that
    B1,,Bm is not a superkey, then decomposes R
    into two relations S and T with the following
    schema
  • Ss schema B1,,Bm
  • Ts schema B1,,Bm ? (A1,,An\B1,,Bm)
  • Repeat Step 12 for S and T until no
    decomposition is needed for every new relation
    return the set of new relations as the result

20
Example
  • The new movie relation with the following
    attributes title,year,studioName,president,presA
    ddress (we call this set ALL)
  • with the FDs title year?studioName,
    studioName?president, president?presAddress
  • Only one key title,year
  • studioName?president violated BCNF
  • Step 2 takes studioName?president, decomposes
    into
  • S with the schema studioNamestudioName,presid
    ent,presAddress
  • T with the schema
  • studioName,title,yearstudioName ? (ALL\
    studioName)
  • Check studioName,title,year is in BCNF (the
    first two FDs)
  • studioName,president,presAddress is not in
    BCNF
  • Continue with the decomposition of S using
    president?presAddress and we get the following
    two relation schemas president,presAddress and
    president,studioName both are in BCNF.
  • The final result studioName,title,year,
    president,presAddress,president,studioName

21
Recovering information from a decomposition
  • Suppose that R with the schema A1,,An is
    decomposed into two relations S and T according
    to the algorithm whose attributes are B1,,Bm
    and B1,,Bm? (A1,,An\B1,,Bm)
  • The tuples of R can be obtained by joining all
    possible pairs of S and T where B1,,Bm have
    the same values.

22
Recovering
the Bs
others
the rest of the closure
t (R)
t (S)
Projection
Join
t (T)
B1,,Bm
B1,,Bm\ B1,,Bm
A1,,An\B1,,Bm
23
Example Decomposition and Recovering
title year length studioName starName filmType
Star Wars 1977 124 Fox Mark Hamill color
Star Wars 1977 124 Fox Harrison Ford color
Star Wars 1977 124 Fox Carrie Fisher color
Mighty Ducks 1991 104 Disney Emilio Estevez color
Waynes World 1992 95 Paramount Dana Carvey color
Waynes World 1992 95 Paramount Mike Meyers color
title year length studioName filmType
Star Wars 1977 124 Fox color
Mighty Ducks 1991 104 Disney color
Waynes World 1992 95 Paramount color
title year starName
Star Wars 1977 Mark Hamill
Star Wars 1977 Harrison Ford
Star Wars 1977 Carrie Fisher
Mighty Ducks 1991 Emilio Estevez
Waynes World 1992 Dana Carvey
Waynes World 1992 Mike Meyers
MovieStudioStar(title, year, length, studioName,
starName, filmType) is not in BCNF is decomposed
into 2 relations that are in BCNF MovieStudio(tit
le, year, length, studioName, filmType) and
StarsIn(title, year, starName)
24
Some remarks
  • The algorithm will stop and output a set of BCNF
    relations.
  • Not every decomposition according to the
    algorithm is good
  • The FDs for the new relations are determined by
    projecting.
  • If a decomposition is based on FDs (according to
    the algorithm) then the recovering process will
    give us exactly the original relation.
  • If a decomposition is not based on FDs then we
    might not be able to recover the original
    relation from the new ones
  • Example R(A,B,C) with A?B and we decompose it
    into S(A,B) and T(B,C)

A B C
1 2 3
1 2 5
4 2 3
4 2 5
B C
2 3
2 5
A B C
1 2 3
4 2 5
A B
1 2
4 2
25
Third Normal Form (3NF)
  • So far if a relation is not in BCNF then
    anormalies arise.
  • Given a relation Bookings with the attributes
  • title name of the movie
  • theater name of the theater where the movie is
    being shown
  • city the city where the theater is located
  • (a tuple (m,t,c) represents the fact that movie
    m is shown at theater t in city c)

26
Bookings(title,theater,city)
  • The FDs of the relations
  • theater ? city
  • title city ? theater
  • theater?city violates the BCNF condition, why?
  • decomposition yields theater,city and
    theater,title
  • Consider the relations

Possible relations according to the FDs of each
schema
recovering
Violate the FD title city?theater
theater city
Guild Menlo
Park Menlo
theater title
Guild Net
Park Net
theater title city
Guild Net Menlo
Park Net Menlo
27
3NF
  • A relaxation of the BCNF condition a relation R
    is in 3NF if whenever there is a nontrivial FD
    A1An?B, either A1,,An is a superkey or B is
    a member of some key.
  • Bookings(title,theater,city) is in 3NF

28
Checking BCNF and 3NF
  • Given R(A,B,C,D) with FDs AB?C, C?D, D?A.
  • Question Indicate the BCNF violations and 3NF
    violations.
  • Steps in answering the question
  • Step 1 compute all nontrivial FDs (right side
    one att)
  • Step 2 find all keys
  • Step 3 find all the violations
  • Step 1 AB?C, C?D, D?A, AB?D, C?A, DB?C, AC?D
  • Step 2 Keys A,B, C,B, and D,B
  • Step 3
  • BCNF violation C?D, D?A, C?A, AC?D and their
    trivial extensions (e.g. CD?D, DA?A,)
  • 3NF violation none
Write a Comment
User Comments (0)
About PowerShow.com