Boyce-Codd NF - PowerPoint PPT Presentation

1 / 64
About This Presentation
Title:

Boyce-Codd NF

Description:

Boyce-Codd NF & Lossless Decomposition Professor Sin-Min Lee Armstrong s Axioms For computing the set of FDs that follow a given FD, the following rules called ... – PowerPoint PPT presentation

Number of Views:66
Avg rating:3.0/5.0
Slides: 65
Provided by: TySa
Learn more at: http://www.cs.sjsu.edu
Category:
Tags: bingo | boyce | codd

less

Transcript and Presenter's Notes

Title: Boyce-Codd NF


1
Boyce-Codd NF Lossless Decomposition
CS157A L16
  • Professor Sin-Min Lee

2
Armstrongs Axioms
  • For computing the set of FDs that follow a given
    FD, the
  • following rules called Armstrongs axioms are
    useful
  • Reflexivity If B ? A, then A ? B
  • Augmentation If A ? B, then A ? C ? B ? C
    Note also that if A ? B, then A ? C ? B for any
    set of attributes C.
  • Transitivity If A ? B and B ? C then A ? C

3
Example of 3NF but not BCNF
  • R(A B C D)
  • 1 2 1 2
  • 1 3 2 1
  • 2 1 1 1
  • 1 3 2 2
  • 2 1 3 2
  • Is this table BCNF?

4
  • FD graph of R is
  • A B C D
  • B ? A
  • AC ? B
  • BD ? A, C
  • CD ? A, B
  • Prime Attribute key attributes are B,C,D
  • B ? A and B is not a key, So R is not 3NF
  • R is not BCNF

5
Projecting FDs
  • Given a relation R (A,B,C,D) and F(R) A?B,
    B?C, C?D.
  • Suppose S is projected from R as S(A,C,D). What
    is F(S).
  • To compute F(S), start by computing the closures
    of all attributes
  • in S.
  • In R, A A?B, A?C, A?D
  • In S, A A?C, A?D
  • C C?D and
  • D D
  • Since A contains all attributes of S, it is not
    required to compute
  • (AC), (AD) or (ACD).

6
Inference Rules for FDs
A1, A2, , An ? B1, B2, , Bm
Splitting rule and Combining rule
Is equivalent to
A1 ... Am B1 ... Bm





A1, A2, , An ? B1 A1, A2, , An ? B2 . . . .
. A1, A2, , An ? Bm
7
Inference Rules for FDs(continued)
Trivial Rule
A1, A2, , An ? Ai
where i 1, 2, ..., n
A1 Am





Why ?
8
Inference Rules for FDs(continued)
Transitive Closure Rule
A1, A2, , An ? B1, B2, , Bm
If
and
B1, B2, , Bm ? C1, C2, , Cp
A1, A2, , An ? C1, C2, , Cp
then
Why ?
9
A1 Am B1 Bm C1 ... Cp





10
Example (continued)
1. name ? color 2. category ? department 3.
color, category ? price
Start from the following FDs
Infer the following FDs
Inferred FD Which Ruledid we apply ?
4. name, category ? name
5. name, category ? color
6. name, category ? category
7. name, category ? color, category
8. name, category ? price
11
Another Rule
Augmentation
A1, A2, , An ? B
If
then
A1, A2, , An , C1, C2, , Cp ? B
Augmentation follows from trivial rules and
transitivityHow ?
12
Problem infer ALL FDs
  • Given a set of FDs, infer all possible FDs
  • How to proceed ?
  • Try all possible FDs, apply all 3 rules
  • E.g. R(A, B, C, D) how many FDs are possible ?
  • Drop trivial FDs, drop augmented FDs
  • Still way too many
  • Better use the Closure Algorithm (next)

13
Closure of a set of Attributes
Given a set of attributes A1, , An The
closure, A1, , An , is the set of attributes
Bs.t. A1, , An ? B
name ? color category ? department color,
category ? price
Example
Closures name name, color
name, category name, category, color,
department, price color color
14
Closure Algorithm
Start with XA1, , An. Repeat until X doesnt
change do if B1, , Bn ? C is a FD
and B1, , Bn are all in X then
add C to X.
Example
name ? color category ? department color,
category ? price
name, category name, category,
color, department, price
15
Example
A, B ? C A, D ? E B ? D A, F ? B
R(A,B,C,D,E,F)
Compute A,B X A, B,
Compute A, F X A, F,

16
Using Closure to Infer ALL FDs
Example
A, B ? CA, D ? B B ? D
Step 1 Compute X, for every X
A A, B BD, C C, D D AB ABCD,
AC AC, AD ABCD ABC ABD ACD ABCD
(no need to compute why ?) BCD BCD, ABCD
ABCD
Step 2 Enumerate all FDs X ? Y, s.t. Y ? X and
X?Y ?
AB ? CD, AD?BC, ABC ? D, ABD ? C, ACD ? B
17
Problem Finding FDs
  • Approach 1 During Database Design
  • Designer derives them from real-world knowledge
    of users
  • Problem knowledge might not be available
  • Approach 2 From a Database Instance
  • Analyze given database instance and find all FDs
    satisfied by that instance
  • Useful if designers dont get enough information
    from users
  • Problem FDs might be artifical for the given
    instance

18
Find All FDs
Student Dept Course Room
Alice CSE C 020
Bob CSE C 020
Alice EE HW 040
Carol CSE DB 045
Dan CSE Java 050
Elsa CSE DB 045
Frank EE Circuits 020
Do all FDsmake sensein practice ?
19
Answer
Course ? Dept, Room Dept, Room ? Course Student,
Dept ? Course, Room Student, Course ? Dept,
Room Student, Room ? Dept, Course
Do all FDsmake sensein practice ?
20
Keys
  • A key is a set of attributes A1, ..., An s.t. for
    any other attribute B, we have A1, ..., An ? B
  • A minimal key is a set of attributes which is a
    key and for which no subset is a key
  • Note book calls them superkey and key

21
Computing Keys
  • Compute X for all sets X
  • If X all attributes, then X is a key
  • List only the minimal keys
  • Note there can be many minimal keys !
  • Example R(A,B,C), AB?C, BC?AMinimal keys AB
    and BC

22
Examples of Keys
  • Product(name, price, category, color)
  • name, category ? price
  • category ? color
  • Keys are name, category and all supersets
  • Enrollment(student, address, course, room, time)
  • student ? address
  • room, time ? course
  • student, course ? room, time
  • Keys are

23
Relational Schema Design(or Logical Schema
Design)
  • Main idea
  • Start with some relational schema
  • Find out its FDs
  • Use them to design a better relational schema

24
Data Anomalies
  • When a database is poorly designed we get
    anomalies
  • Redundancy data is repeated
  • Update anomalies need to change in several
    places
  • Delete anomalies may lose data when we dont want

25
Relational Schema Design
Example Persons with several phones
Name SSN PhoneNumber City
Fred 123-45-6789 206-555-1234 Seattle
Fred 123-45-6789 206-555-6543 Seattle
Joe 987-65-4321 908-555-2121 Westfield
SSN ? Name, City
but not SSN ? PhoneNumber
  • Anomalies
  • Redundancy repeat data
  • Update anomalies Fred moves to Bellevue
  • Deletion anomalies Joe deletes his phone
    number what is his city ?

26
Relation Decomposition
Break the relation into two
Name SSN PhoneNumber City
Fred 123-45-6789 206-555-1234 Seattle
Fred 123-45-6789 206-555-6543 Seattle
Joe 987-65-4321 908-555-2121 Westfield
Name SSN City
Fred 123-45-6789 Seattle
Joe 987-65-4321 Westfield
SSN PhoneNumber
123-45-6789 206-555-1234
123-45-6789 206-555-6543
987-65-4321 908-555-2121
  • Anomalies have gone
  • No more repeated data
  • Easy to move Fred to Bellevue (how ?)
  • Easy to delete all Joes phone number (how ?)

27
Relational Schema Design
Conceptual Model
Relational Model plus FDs























Normalization Eliminates anomalies
28
Decompositions in General
R(A1, ..., An, B1, ..., Bm, C1, ..., Cp)
R1(A1, ..., An, B1, ..., Bm)
R2(A1, ..., An, C1, ..., Cp)
R1 projection of R on A1, ..., An, B1, ..., Bm
R2 projection of R on A1, ..., An, C1, ..., Cp
29
Decomposition
  • Sometimes it is correct

Name Price Category
Gizmo 19.99 Gadget
OneClick 24.99 Camera
Gizmo 19.99 Camera
Name Price
Gizmo 19.99
OneClick 24.99
Gizmo 19.99
Name Category
Gizmo Gadget
OneClick Camera
Gizmo Camera
Lossless decomposition
30
Incorrect Decomposition
  • Sometimes it is not

Name Price Category
Gizmo 19.99 Gadget
OneClick 24.99 Camera
Gizmo 19.99 Camera
Whatsincorrect ??
Name Category
Gizmo Gadget
OneClick Camera
Gizmo Camera
Price Category
19.99 Gadget
24.99 Camera
19.99 Camera
Lossy decomposition
31
Decompositions in General
R(A1, ..., An, B1, ..., Bm, C1, ..., Cp)
R1(A1, ..., An, B1, ..., Bm)
R2(A1, ..., An, C1, ..., Cp)
If A1, ..., An ? B1, ..., Bm Then the
decomposition is lossless
Note dont need necessarily A1, ..., An ? C1,
..., Cp
Example name ? price, hence the first
decomposition is lossless
32
Normal Forms
  • First Normal Form all attributes are atomic
  • Second Normal Form (2NF) old and obsolete
  • Third Normal Form (3NF) this lecture
  • Boyce Codd Normal Form (BCNF) this lecture
  • Others...

33
(No Transcript)
34
  • R (J, K, L)
  • F (JK ? L, L ? K)
  • Two candidate keys JK and JL
  • R is in 3NF
  • JK ? L JK is a superkey
  • L ? K K is prime
  • BCNF decomposition yields
  • R1 (L,K), R2 (L,J)
  • testing for JK ? L requires a join
  • There is some redundancy in R

35
Boyce-Codd Normal Form
A simple condition for removing anomalies from
relations
A relation R is in BCNF if If A1, ..., An ?
B is a non-trivial dependency in R , then
A1, ..., An is a key for R
In English (though a bit vague) Whenever a
set of attributes of R is determining another
attribute, it should determine all the
attributes of R.
36
(No Transcript)
37
(No Transcript)
38
BCNF Decomposition Algorithm
Repeat choose A1, , Am ? B1, , Bn that
violates the BNCF condition split R into
R1(A1, , Am, B1, , Bn) and R2(A1, , Am,
others) continue with both R1 and R2Until
no more violations
Is there a 2-attribute relation that is not in
BCNF ?
As
Bs
Others
R1
R2
39
Example
Name SSN PhoneNumber City
Fred 123-45-6789 206-555-1234 Seattle
Fred 123-45-6789 206-555-6543 Seattle
Joe 987-65-4321 908-555-2121 Westfield
Joe 987-65-4321 908-555-1234 Westfield
What are the dependencies? SSN ? Name,
City What are the keys? SSN, PhoneNumber Is it
in BCNF?
40
Decompose it into BCNF
Name SSN City
Fred 123-45-6789 Seattle
Joe 987-65-4321 Westfield
SSN ? Name, City
  • Lets check anomalies
  • Redundancy ?
  • Update ?
  • Delete ?

SSN PhoneNumber
123-45-6789 206-555-1234
123-45-6789 206-555-6543
987-65-4321 908-555-2121
987-65-4321 908-555-1234
41
Summary of BCNF Decomposition
Find a dependency that violates the BCNF
condition
A1, A2, , An ? B1, B2, , Bm
Heuristics choose B , B , B as large as
possible
1
2
m
Continue until there are no BCNF violations left.
Decompose
Others
As
Bs
2-attribute relations are BCNF
R1
R2
42
Example Decomposition
Person(name, SSN, age, hairColor,
phoneNumber) SSN ? name, age age ? hairColor
Decompose in BCNF (in class) Step 1 find all
keys (How ? Compute S, for various sets
S) Step 2 now decompose
43
Other Example
  • R(A,B,C,D) A ? B, B ? C
  • Key AD
  • Violations of BCNF A ? B, A? C, A?BC
  • Pick A? BC split into R1(A,BC) R2(A,D)
  • What happens if we pick A ? B first ?

44
Lossless Decompositions
  • A decomposition is lossless if we can recover
  • R(A,B,C)
  • R1(A,B) R2(A,C)
  • R(A,B,C) should be the same
    as R(A,B,C)

Decompose
Recover
R is in general larger than R. Must ensure R
R
45
Lossless Decompositions
  • Given R(A,B,C) s.t. A?B, the decomposition into
    R1(A,B), R2(A,C) is lossless

46
3NF A Problem with BCNF
Unit Company
Product
FDs Unit ? Company Company, Product ?
Unit So, there is a BCNF violation, and we
decompose.
Unit Company
Unit ? Company
Unit Product
No FDs
Notice we loose the FD Company, Product ? Unit
47
So Whats the Problem?
Unit Company
Unit Product
Galaga99 UW Galaga99
databases Bingo UW
Bingo databases
No problem so far. All local FDs are
satisfied. Lets put all the data back into a
single table again (anomalies?)
Unit Company
Product
Galaga99 UW
databases Bingo UW
databases
Violates the dependency company, product -gt
unit!
48
Solution 3rd Normal Form (3NF)
A simple condition for removing anomalies from
relations
A relation R is in 3rd normal form if Whenever
there is a nontrivial dependency A1, A2, ..., An
? Bfor R , then A1, A2, ..., An is a key
for R, or B is part of a key.
Tradeoff BCNF no anomalies, but may lose some
FDs 3NF keeps all FDs, but may have some
anomalies
49
Purpose of Normalization
  • To reduce the chances for anomalies to occur in a
    database.
  • normalization prevents the possible corruption of
    databases stemming from what are called
    insertion anomalies," "deletion anomalies," and
    "update anomalies."

50
Insertion Anomaly
  • A failure to place a new database entry into all
    the places in the database where that new entry
    needs to be stored.
  • In a properly normalized database, a new entry
    needs to be inserted into only one place in the
    database

51
Deletion Anomaly
  • A failure to remove an existing database entry
    when it is time to remove that entry.
  • In a properly normalized database, an old,
    to-be-gotten-rid-of entry needs to be deleted
    from only one place in the database

52
Update anomaly
  • An update of a database involves modifications
    that may be additions, deletions, or both. Thus
    "update anomalies" can be either of the kinds of
    anomalies discussed above.

53
(No Transcript)
54
(No Transcript)
55
(No Transcript)
56
(No Transcript)
57
(No Transcript)
58
(No Transcript)
59
(No Transcript)
60
(No Transcript)
61
(No Transcript)
62
(No Transcript)
63
(No Transcript)
64
(No Transcript)
Write a Comment
User Comments (0)
About PowerShow.com