Title: Lecture 6: Schema refinement: Functional dependencies
1Lecture 6 Schema refinement Functional
dependencies
- www.cl.cam.ac.uk/Teaching/current/Databases/
2Recall Database design lifecycle
- Requirements analysis
- User needs what must database do?
- Conceptual design
- High-level description often using E/R model
- Logical design
- Translate E/R model into relational schema
- Schema refinement
- Check schema for redundancies and anomalies
- Physical design/tuning
- Consider typical workloads, and further optimise
3Todays lecture
- Why are some designs bad?
- Whats a functional dependency?
- Whats the theory of functional dependencies?
- (Next lecture How can we use this theory to
classify redundancy in relation design?)
4Not all designs are equally good
- Why is this design bad?
- Data(sid,sname,address,cid,cname,grade)
- Why is this one preferable? Student(sid,sname,a
ddress) Course(cid,cname) Enrolled(sid,cid,gra
de)
5An instance of our bad design
sid sname address cid cname grade
124 Britney USA 206 Database A
204 Victoria Essex 202 Semantics C
124 Britney USA 201 S/Eng I A
206 Emma London 206 Database B-
124 Britney USA 202 Semantics B
6Evils of redundancy
- Redundancy is the root of many problems
associated with relational schemas - Redundant storage
- Update anomalies
- Insertion anomalies
- Deletion anomalies
- LOW TRANSACTION THROUGHPUT
- In general, with higher redundancy, if
transactions are correct (no anomalies), then
they have to lock more objects thus causing
greater contention and lower throughput - (Aside Could having a dummy value, NULL, help?)
7Decomposition
- We remove anomalies by replacing the schema
- Data(sid,sname,address,cid,cname,grade)
- with Student(sid,sname,address)
Course(cid,cname) Enrolled(sid,cid,grade) - Note the implicit extra cost here
- Two immediate questions
- Do we need to decompose a relation?
- What problems might result from a decomposition?
8Functional dependencies
- Recall
- A key is a set of fields where if a pair of
tuples agree on a key, they agree everywhere - In our bad design, if two tuples agree on sid,
then they also agree on address, even though the
rest of the tuples may not agree
9Functional dependencies cont.
- We can say that sid determines address
- Well write this sid ? address
- This is called a functional dependency (FD)
- (Note An FD is just another integrity constraint)
10Functional dependencies cont.
- Wed expect the following functional dependencies
to hold in our Student database - sid ? sname,address
- cid ? cname
- sid,cid ? grade
- A functional dependency X ? Y is simply a pair of
sets (of field names) - Note the sloppy notation A,B ? C,D rather than
A,B ? C,D
11Formalities
- Given a relation RR(A1?1, , An?n), and X, Y
(?A1, , An), an instance r of R satisfies X?Y,
if - For any two tuples t1, t2 in R, if t1.Xt2.X then
t1.Yt2.Y - Note This is a semantic assertion. We can not
look at an instance to determine which FDs hold
(although we can tell if the instance does not
satisfy an FD!)
12Properties of FDs
- Assume that X ? Y and Y ? Z are known to hold in
R. Its clear that X ? Z holds too. - We shall say that an FD set F logically implies X
? Y, and write F X ? Y - e.g. X ? Y, Y ? Z X ? Z
- The closure of F is the set of all FDs logically
implied by F, i.e. - F _at_ X?Y F X?Y
- The set F can be big, even if F is small ?
13Closure of a set of FDs
- Which of the following are in the closure of our
Student FDs? - address?address
- cid?cname
- cid?cname,sname
- cid,sid?cname,sname
14Candidate keys and FDs
- If RR(A1?1, , An?n) with FDs F and X?A1, ,
An, then X is a candidate key for R if - X ? A1, ,An ? F
- For no proper subset Y?X is Y ? A1, ,An ? F
-
15Armstrongs axioms
- Reflexivity If Y?X then F \ X?Y
- (This is called a trivial dependency)
- Example sname,address?address
- Augmentation If F \ X?Y then
F \ X,W?Y,W - Example As cid?cname then cid,sid?cname,sid
- Transitivity If F \ X?Y and F \ Y?Z then F \ X?Z
- Example As sid,cid?cid and cid?cname, then
sid,cid?cname
16Consequences of Armstrongs axioms
- Union If F \ X?Y and F \ X?Z then F \
X?Y,Z - Pseudo-transitivity If F \ X?Y and F
\ W,Y?Z then F \ X,W?Z - Decomposition If F \ X?Y and Z?Y then F \ X?Z
- Exercise Prove that these are consequences of
Armstrongs axioms
17Proof of Union Rule
Suppose that F \ X?Y and F \ X?Z. By augmentation
we have F \ X?X,Y since X U X X.
Also by augmentation F \
X,Y?Z,Y Therefore, by transitivity we have
F \ X?Z,Y
QED
18Functional Dependencies Can be useful in
Algebraic Reasoning
Suppose R(A,B,C) is a relation schema with
dependency A?B, then
(This is called Heaths rule.)
19Proof of Heaths Rule
First show that
Suppose
then
and
Since
we have
20Proof of Heaths Rule (cont.)
In the other direction, we must show that
Suppose
Then there must exist records
and
There must also exist
so that
But the functional dependency tells us that
QED
Therefore, we have
21Equivalence
- Two sets of FDs, F and G, are said to be
equivalent if FG - For example
- (A,B?C), (A?B) and
- (A?C), (A?B)
- are equivalent
- F can be huge wed prefer to look for small
equivalent FD sets
22Minimal cover
- An FD set, F, is said to be minimal if
- Every FD in F is of the form X?A, where A is a
single attribute - For no X?A in F is F-X?A equivalent to F
- For no X?A in F and Z?X is
- (F-X?A)?Z?A equivalent to F
- For example, (A?C), (A?B) is a minimal cover
for (A,B?C), (A?B)
23More on closures
- FACT If F is an FD set, and X?Y?F then there
exists an attribute A?Y such that X?A?F
24Why Armstrongs axioms?
- Soundness
- If F \ X?Y is deduced using the rules, then X?Y
is true in any relation in which the dependencies
of F are true - Completeness
- If X?Y is is true in any relation in which the
dependencies of F are true, then F \ X?Y can be
deduced using the rules
25Soundness
- Consider the Augmentation rule
- We have X?Y, i.e. if t1.Xt2.X then t1.Yt2.Y
- If in addition t1.Wt2.W then it is clear that
t1.(Y,W)t2.(Y,W)
26Soundness cont.
- Consider the Transitivity rule
- We have X?Y, i.e. if t1.Xt2.X then t1.Yt2.Y
() - We have Y?Z, i.e. if t1.Yt2.Y then t1.Zt2.Z
() - Take two tuples s1 and s2 such that s1.Xs2.X
then from () s1.Ys2.Y and then from ()
s1.Zs2.Z
27Completeness
- Exercise
- (You may need the fact from slide 23)
28Attribute closure
- If we want to check whether X?Y is in a closure
of the set F, could compute F and check but
expensive ? - Cheaper We can instead compute the attribute
closure, X, using the following algorithm - Then F \ X?Y iff Y is a subset of X
- Try this with sid,sname?cname,grade
closure X repeat until no change if ?U?V?F,
where U?closure then closureclosure?V
29Preview of next lecture Goals of normalisation
- Decide whether a relation is in good form
- If it is not, then we will decompose it into a
set of relations such that - Each relation is in good form
- The decomposition has not lost any information
that was present in the original relation - The theory of this process and the notion of
good form is based on FDs
30Summary
- You should now understand
- Redundancy and various forms of anomalies
- Functional dependencies
- Armstrongs axioms
- Next lecture Schema refinement Normalisation