Normalization - PowerPoint PPT Presentation

About This Presentation

Title:

Normalization

Description:

Normalization cs3431 – PowerPoint PPT presentation

Number of Views:63

Avg rating:3.0/5.0

Slides: 25

Provided by: pcguest

Learn more at: http://web.cs.wpi.edu

Category:

more less

Transcript and Presenter's Notes

Title: Normalization

1
Normalization
2
Why Normalization?

To remove potential redundancy in design
Redundancy causes several anomalies insert,
delete and update
Redundancy wastes storage, and often slows down
query processing

3
Insert Anomaly
Student
sNumber sName pNumber pName
s1 Dave p1 MM
s2 Greg p2 ER
Question Could we insert any professor ? Note
We cannot insert a professor who has no students.
Insert Anomaly We are not able to insert valid
value/(s)
4
Delete Anomaly
Student
sNumber sName pNumber pName
s1 Dave p1 MM
s2 Greg p2 ER
Question Can we delete a student and keep a
professor info ? Note We cannot delete a student
that is the only student of a professor.
Delete Anomaly We are not able to perform a
delete without losing some valid information.
Note In both cases, minimum cardinality of
Professor in the corresponding ER schema is 0
5
Update Anomaly
Student
sNumber sName pNumber pName
s1 Dave p1 MM
s2 Greg p1 MM
Question Can we simply update a professors
name ? Note To update the name of a professor,
we have to update in multiple tuples.
Update Anomaly To update a value, we have to
update multiple rows.
Update anomalies are due to redundancy.
Note the maximum cardinality of Professor in the
corresponding ER schema is
6
Normalization

Need a method to find dependencies between
attributes
Functional dependencies
Need a method to remove such harmful
dependencies, when they exist
Relational decomposition
Break R (A,B,C,D) into R1 (A, B) and R2 (B, C, D)

7
Keys Revisited

A key for a relation R (a1, a2, , an) is a set
of attributes, K, that together uniquely
determine the values for all attributes of R.
A key is minimal no subset of K is a key.
A superkey may not be minimal
A prime attribute an attribute that is part of a
key

8
Keys Example
Student
sNumber sName address
1 Dave 144FL
2 Greg 320FL
Primary Key ltsNumbergt Candidate key
ltsNamegt Some superkeys ltsNumber, addressgt,
ltsNamegt, ltsNumbergt, ltsNumber, sNamegt,
ltsNumber, sName, addressgt Prime Attribute
sNumber, sName
9
Functional Dependencies (FDs)
Student
sNumber group address
1 DB 144FL
2 AI 320FL
Suppose we have the FD group? address That is,
there is a function from group to
address Meaning For any two rows in the
Student relation with the same value for group,
the value for address must be same.
10
FD and Keys
Student
sNumber group address
1 DB 144FL
2 AI 320FL
Primary Key ltsNumbergt FD group ? address

Questions
Does a key implies functional dependencies?
Which ones ?
Does a functional dependency imply keys ? Which
ones ?

Observation Any key (primary or candidate) or
superkey of a relation R functionally determines
all attributes of R.
11
Properties of FDs

Consider A, B, C, Z are sets of attributes
Reflexive (trivial FD) if A ? B, then A ? B
Transitive if A ? B, and B ? C, then A ? C
Augmentation if A ? B, then AZ ? BZ
Union if A ? B, A ? C, then A ? BC
Decomposition if A ? BC, then A ? B, A ? C
Note Sound and complete inference rules for FDs

12
Inferring FDs

Suppose we have
a relation R (A, B, C) and
functional dependencies A ? B, B ? C, C ? A
Questions
What is a key for R?
Should we split R into multiple relations?
We can infer A ? ABC, B ? ABC, C ? ABC.
Hence A, B, C are all keys.

13
Reasoning About FDs

An FD f is implied by a set of FDs F if f
holds whenever all FDs in F hold.
Closure of F, denoted by F, is the set of all
FDs that are implied by F.
Computing closure F of a set of FDs can be
expensive.
Size of closure is exponential in attrs!

14
Reasoning About FDs

But given question
Is X ? Y in closure of a set of FDs
F?
Fortunately, computing just attribute closure is
sufficient (and linear time complexity)
Compute attribute closure of X, denoted X, wrt
F
Set of all attributes A such that X ? A is in F
Check if Y is in X . If yes, then X ? Y in
F.

15
Reasoning About FDs (Contd.)

Does F A B, B C, C D E
imply A E?
Question
i.e, is A E in the closure F ?
Equivalent Question
Is E in the attribute closure ?

16
Algorithm for Inference of FDs

Computing the closure of set of attributes A1,
A2, , An
Let X A1, A2, , An
If there exists a FD B1, B2, , Bm ? C,
such that every Bi ? X, then X X ? C
Repeat step 2 until no more attributes can be
added.
A1, A2, , An X

17
Inferring FDs Example 1

Consider R (A, B, C, D, E)
with FDs A ?? B, B ? C, CD ? E
Does A ? E? (Is A ? E in F ?)
Rephrase as Is E in A ?
Let us compute A
A A, B, C
Therefore, A ? E is false

18
Inferring FDs Example 2

Given R (A, B, C), and
FDs A ? B, B ? C, C ? A
What are possible keys for R ?
Compute the closure of attributes
A A, B, C
B A, B, C
C A, B, C
So keys for R are ltAgt, ltBgt, ltCgt

19
Decomposing Relations
StudentProf
FDs pNumber ? pName
20
Decomposition

Decomposition
Must be Lossless (no spurious tuples)

21
Decomposition Lossless Join
StudentProf
sNumber sName pNumber pName
s1 Dave p1 MM
s1 Dave p2 MM
s2 Greg p1 MM
s2 Greg p2 MM
Spurious Tuples
22
Normalization

Once decided, what is the algorithm for
(lossless) decomposing?

23
Normalization Step Decompose

Consider relation R with set of attributes AR.
Consider a FD A ? B
(such that no other attribute in
(AR A B) is functionally determined by A).
If A is not a superkey for R,
we may decompose R as
Create R with attributes (AR B)
Create R with attributes A ? B
Key for R A
Foreign key R (A) references R (A)

24
Example Decomposition Revisited
StudentProf
sNumber sName pNumber pName
s1 Dave p1 MM
s2 Greg p2 MM
FDs pNumber ? pName
Student
Professor
sNumber sName pNumber
s1 Dave p1
s2 Greg p2
pNumber pName
p1 MM
p2 MM
FOREIGN KEY Student (PNum) references Professor
(PNum)
25
Normalization