Normalization - PowerPoint PPT Presentation

About This Presentation
Title:

Normalization

Description:

Normalization cs3431 – PowerPoint PPT presentation

Number of Views:63
Avg rating:3.0/5.0
Slides: 25
Provided by: pcguest
Learn more at: http://web.cs.wpi.edu
Category:

less

Transcript and Presenter's Notes

Title: Normalization


1
Normalization
2
Why Normalization?
  • To remove potential redundancy in design
  • Redundancy causes several anomalies insert,
    delete and update
  • Redundancy wastes storage, and often slows down
    query processing

3
Insert Anomaly
Student
sNumber sName pNumber pName
s1 Dave p1 MM
s2 Greg p2 ER
Question Could we insert any professor ? Note
We cannot insert a professor who has no students.
Insert Anomaly We are not able to insert valid
value/(s)
4
Delete Anomaly
Student
sNumber sName pNumber pName
s1 Dave p1 MM
s2 Greg p2 ER
Question Can we delete a student and keep a
professor info ? Note We cannot delete a student
that is the only student of a professor.
Delete Anomaly We are not able to perform a
delete without losing some valid information.
Note In both cases, minimum cardinality of
Professor in the corresponding ER schema is 0
5
Update Anomaly
Student
sNumber sName pNumber pName
s1 Dave p1 MM
s2 Greg p1 MM
Question Can we simply update a professors
name ? Note To update the name of a professor,
we have to update in multiple tuples.
Update Anomaly To update a value, we have to
update multiple rows.
Update anomalies are due to redundancy.
Note the maximum cardinality of Professor in the
corresponding ER schema is
6
Normalization
  • Need a method to find dependencies between
    attributes
  • Functional dependencies
  • Need a method to remove such harmful
    dependencies, when they exist
  • Relational decomposition
  • Break R (A,B,C,D) into R1 (A, B) and R2 (B, C, D)

7
Keys Revisited
  • A key for a relation R (a1, a2, , an) is a set
    of attributes, K, that together uniquely
    determine the values for all attributes of R.
  • A key is minimal no subset of K is a key.
  • A superkey may not be minimal
  • A prime attribute an attribute that is part of a
    key

8
Keys Example
Student
sNumber sName address
1 Dave 144FL
2 Greg 320FL
Primary Key ltsNumbergt Candidate key
ltsNamegt Some superkeys ltsNumber, addressgt,
ltsNamegt, ltsNumbergt, ltsNumber, sNamegt,
ltsNumber, sName, addressgt Prime Attribute
sNumber, sName
9
Functional Dependencies (FDs)
Student
sNumber group address
1 DB 144FL
2 AI 320FL
Suppose we have the FD group? address That is,
there is a function from group to
address Meaning For any two rows in the
Student relation with the same value for group,
the value for address must be same.
10
FD and Keys
Student
sNumber group address
1 DB 144FL
2 AI 320FL
Primary Key ltsNumbergt FD group ? address
  • Questions
  • Does a key implies functional dependencies?
    Which ones ?
  • Does a functional dependency imply keys ? Which
    ones ?

Observation Any key (primary or candidate) or
superkey of a relation R functionally determines
all attributes of R.
11
Properties of FDs
  • Consider A, B, C, Z are sets of attributes
  • Reflexive (trivial FD) if A ? B, then A ? B
  • Transitive if A ? B, and B ? C, then A ? C
  • Augmentation if A ? B, then AZ ? BZ
  • Union if A ? B, A ? C, then A ? BC
  • Decomposition if A ? BC, then A ? B, A ? C
  • Note Sound and complete inference rules for FDs

12
Inferring FDs
  • Suppose we have
  • a relation R (A, B, C) and
  • functional dependencies A ? B, B ? C, C ? A
  • Questions
  • What is a key for R?
  • Should we split R into multiple relations?
  • We can infer A ? ABC, B ? ABC, C ? ABC.
  • Hence A, B, C are all keys.

13
Reasoning About FDs
  • An FD f is implied by a set of FDs F if f
    holds whenever all FDs in F hold.
  • Closure of F, denoted by F, is the set of all
    FDs that are implied by F.
  • Computing closure F of a set of FDs can be
    expensive.
  • Size of closure is exponential in attrs!

14
Reasoning About FDs
  • But given question
  • Is X ? Y in closure of a set of FDs
    F?
  • Fortunately, computing just attribute closure is
    sufficient (and linear time complexity)
  • Compute attribute closure of X, denoted X, wrt
    F
  • Set of all attributes A such that X ? A is in F
  • Check if Y is in X . If yes, then X ? Y in
    F.

15
Reasoning About FDs (Contd.)
  • Does F A B, B C, C D E
    imply A E?
  • Question
  • i.e, is A E in the closure F ?
  • Equivalent Question
  • Is E in the attribute closure ?

16
Algorithm for Inference of FDs
  • Computing the closure of set of attributes A1,
    A2, , An
  • Let X A1, A2, , An
  • If there exists a FD B1, B2, , Bm ? C,
    such that every Bi ? X, then X X ? C
  • Repeat step 2 until no more attributes can be
    added.
  • A1, A2, , An X

17
Inferring FDs Example 1
  • Consider R (A, B, C, D, E)
  • with FDs A ?? B, B ? C, CD ? E
  • Does A ? E? (Is A ? E in F ?)
  • Rephrase as Is E in A ?
  • Let us compute A
  • A A, B, C
  • Therefore, A ? E is false

18
Inferring FDs Example 2
  • Given R (A, B, C), and
  • FDs A ? B, B ? C, C ? A
  • What are possible keys for R ?
  • Compute the closure of attributes
  • A A, B, C
  • B A, B, C
  • C A, B, C
  • So keys for R are ltAgt, ltBgt, ltCgt

19
Decomposing Relations
StudentProf
FDs pNumber ? pName
20
Decomposition
  • Decomposition
  • Must be Lossless (no spurious tuples)

21
Decomposition Lossless Join
StudentProf
sNumber sName pNumber pName
s1 Dave p1 MM
s1 Dave p2 MM
s2 Greg p1 MM
s2 Greg p2 MM
Spurious Tuples
22
Normalization
  • Once decided, what is the algorithm for
    (lossless) decomposing?

23
Normalization Step Decompose
  • Consider relation R with set of attributes AR.
    Consider a FD A ? B
    (such that no other attribute in
    (AR A B) is functionally determined by A).
  • If A is not a superkey for R,
    we may decompose R as
  • Create R with attributes (AR B)
  • Create R with attributes A ? B
  • Key for R A
  • Foreign key R (A) references R (A)

24
Example Decomposition Revisited
StudentProf
sNumber sName pNumber pName
s1 Dave p1 MM
s2 Greg p2 MM
FDs pNumber ? pName
Student
Professor
sNumber sName pNumber
s1 Dave p1
s2 Greg p2
pNumber pName
p1 MM
p2 MM
FOREIGN KEY Student (PNum) references Professor
(PNum)
25
Normalization
  • How do I decide if I need to further decompose?
  • Once decided, what is the algorithm for
    decomposing?
Write a Comment
User Comments (0)
About PowerShow.com