Functional Dependencies FDs and - PowerPoint PPT Presentation

1 / 27
About This Presentation
Title:

Functional Dependencies FDs and

Description:

... from (1) and (3) we have (5) t1[Z] = t2[Z] and from (2) and (5) we have t1[YZ] ... Considered to be part of the definition of relation. L.H.S of FD. R.H.S ... – PowerPoint PPT presentation

Number of Views:41
Avg rating:3.0/5.0
Slides: 28
Provided by: imadr
Category:

less

Transcript and Presenter's Notes

Title: Functional Dependencies FDs and


1
Lecture 24 (11/10/05)
  • Functional Dependencies (FDs) and
  • Normalization

2
Assignment
  • Will post up today after 600 PM

3
Inference Rules for FDs
  • Given a set of FDs F, we can infer additional FDs
    that hold whenever the FDs in F hold
  • Armstrong's inference rules
  • IR1. (Reflexive) If Y ? X, then X ? Y
  • (Generates trivial FDs)
  • E.g. SSN, ENAME ? ENAME
  • IR2. (Augmentation) If X ? Y, then XZ ? YZ (Note
    that XZ stands for X U Z)
  • E.g. SSN ? ENAME, then SSN, PNUMBER ? ENAME,
    PNUMBER
  • IR3. (Transitive) If X ? Y and Y ? Z, then X ? Z
  • E.g. SSN ? DNO and DNO ? DNAME then SSN ? DNAME

4
Inference Rules for FDs
  • IR1, IR2, IR3 form a sound and complete set of
    inference rules
  • Sound ? Any rule inferred using IR1, IR2 or IR3 a
    valid FD
  • Complete ? All possible FDs can be generated
    using them
  • Some additional inference rules that are useful
  • IR4. (Decomposition) If X ? YZ, then X ? Y and X
    ? Z
  • IR5. (Union) If X ? Y and X ? Z, then X ? YZ
  • IR6. (Pseudo-transitivity) If X ? Y and WY ? Z,
    then WX ? Z
  • Can be deduced from IR1, IR2, and IR3
    (completeness property)

5
Proofs - SKIP
  • IR1. (Reflexive) If Y ? X, then X ? Y
  • For any two tuples t1 and t2 with t1X t2X
    then t1Y t2Y because Y ? X
  • IR2. (Augmentation) If X ? Y, then XZ ? YZ
  • Proof by contradiction
  • If for two tuples t1 and t2 we have
  • (1) t1X t2X
  • (2) t1Y t2Y
  • (3) t1XZ t2XZ
  • (4) t1YZ ? t2YZ
  • Cant be true since from (1) and (3) we have (5)
    t1Z t2Z and from (2) and (5) we have t1YZ
    t2YZ which contradicts (4)

6
Proofs - SKIP
  • IR3. (Transitive) If X ? Y and Y ? Z, then X ? Z
  • For any two tuples t1 and t2 with t1X t2X
    then t1Y t2Y which implies that t1Z
    t2Z hence X ? Z holds
  • IR4. (Decomposition) If X ? YZ, then X ? Y and X
    ? Z
  • X ? YZ
  • YZ ? Y (Using IR1)
  • X ? Y (Using IR3)
  • Similary for X ? Z

7
Proofs - SKIP
  • IR5. (Union) If X ? Y and X ? Z, then X ? YZ
  • X ? Y
  • X ? Z
  • X ? XY (Using IR1)
  • XY ? YZ (Using IR2)
  • X ? YZ (Using IR3)
  • IR6. (Psuedotransitivity) If X ? Y and WY ? Z,
    then WX ? Z
  • X ? Y
  • WY ? Z
  • WX ? WY (Using IR2)
  • WX ? Z (Using IR3)

8
Closure of a Set of FDs
  • The closure of a set F of FDs is F which is the
    set of all FDs that can be inferred from F
  • F SSN ? ENAME, BDATE, ADDRESS DNUMBER,
    DNUMBER ? DNAME, DMGSSN
  • SSN ? DNAME, DMGSSN
  • SSN ? SSN
  • DNUMBER ? DNAME
  • Most of the times it is IMPOSSIBLE to list all
    FDs for a given situation
  • F is impossible to find sometimes

9
Closure of a Set of Attributes
  • Test if a certain FD is in F
  • Entailed by F
  • Closure of a set of attributes X with respect to
    F is the set X of all attributes that are
    functionally determined by X
  • I.e. Y such that X ? Y is true in F
  • X can be calculated by repeatedly applying the
    inferences using the FDs in F
  • READ Algorithm 10.1 p.310 in book

10
Closure of a Set of Attributes
  • Algorithm to compute X
  • XX
  • Repeat
  • oldXX
  • For each FD Y? Z in F do
  • If X ? Y then XX U Z
  • Until (X oldX) //i.e. until no change
    occurs
  • Using the attribute closure we can determine if
    any FD is in F (i.e. entailed by F)

11
Example - Computing Attribute Closure
X XF A A, D, E B
B D D, E AB A,
B, C, D, E
F AB ? C A ? D D ? E AC
? B
Is AB ? E entailed by F? Is D? C entailed by
F? What is a possible key for R?
12
In-class Exercise
  • F
  • SSN ? EName
  • PNumber ? PName, PLocation
  • SSN, PNumber?Hours
  • Find SSN, PNumber, SSN, PNumber
  • Key?

13
Equivalence of Sets of FDs
  • For two sets of FDs F and G, F is said to cover G
    if every FD in G can be inferred from F
  • i.e., if G ? F
  • Two sets of FDs F and G are equivalent if
  • every FD in F can be inferred from G, and
  • G covers F
  • every FD in G can be inferred from F
  • F covers G
  • F and G are equivalent if F G
  • There is an algorithm for checking equivalence of
    sets of FDs

14
Equivalence of Sets of FDs - SKIP
15
Minimal Sets of FDs
  • A set of FDs F is minimal if it satisfies the
    following
  • (1) Every dependency in F has a single attribute
    for its RHS
  • Minimal RHS
  • (2) We cannot replace any dependency X ? A in F
    with a dependency Y ? A, where Y proper subset of
    X ( Y ? X) and still have a set of dependencies
    that is equivalent to F
  • Minimal LHS
  • (3) We cannot remove any dependency from F and
    have a set of dependencies that is equivalent to
    F
  • All FDs are necessary
  • A minimal cover of a set of FDs E is a minimal
    set of FDs that equivalent to E
  • Algorithm 10.2 p.311 in book (SKIP)

16
Normalization of Relations
  • Normalization The process of decomposing
    unsatisfactory "bad" relations into smaller
    relations
  • By breaking up their attributes
  • Main purpose to achieve some desirable properties
  • Minimizing redundancy or update anomalies
  • A normal form (NF) is a condition that indicates
    the degree to which a relation has been
    normalized
  • The normal form of a relation refers to the
    highest normal form that a relation meets

17
Normalization of Relations
  • 1NF, 2NF, 3NF, BCNF based on keys and FDs of a
    relation schema
  • 4NF based on keys, multi-valued dependencies
    MVDs5NF based on keys, join dependencies JDs
    (Chapter 11)
  • On their own, NFs DO NOT guarantee good design
    (in theoretical terms)!
  • Additional properties may be needed
  • Lossless join (a must)
  • No spurious tuple generation problem
  • Dependency preservation (an advantage)
  • After decomposition, each FD is represented some
    relation

18
Practical Use of Normal Forms
  • Even though higher NFs means better
  • Database designers need not normalize to the
    highest possible normal form
  • Usually up to 3NF, BCNF or 4NF
  • De-normalization the process of storing the join
    of higher normal form relations as a base
    relationwhich is in a lower normal form
  • For performance reasons

19
Revision
  • A superkey of a relation schema R A1, A2,
    ...., An is a set of attributes S ? R with the
    property that no two tuples t1 and t2 in any
    legal relation state r of R will have t1S
    t2S
  • A key/candidate key K is a superkey with the
    additional property that removal of any attribute
    from K will cause K not to be a superkey any more
    (i.e. minimal)
  • If a relation schema has more than one key
  • One of the candidate keys is arbitrarily
    designated to be the primary key, and the others
    are called secondary keys
  • A prime attribute must be a member of some
    candidate key
  • A non-prime attribute is not a prime
    attributethat is, it is not a member of any
    candidate key

20
1NF First Normal Form
  • Disallows
  • Multi-valued attributes
  • E.g. DEPARTMENT (DNumber, .,DLOCATIONS)
  • Composite attributes
  • E.g. ADDRESS (Street, City, State, Zip)
  • Nested relations/compound attributes
    combinations of the above
  • An attribute PROJECT for every EMPLOYEE listing
    the project number and name
  • EMPLOYEE (SSN, ..,PROJECT (PNUMBER, PNAME))
  • (Defn.) Every attribute must be dependant on
    every key
  • The only attributes allowed in 1NF are single
    atomic (or indivisible) values
  • Otherwise, attributes cant depend on or cant be
    determined by the key
  • Considered to be part of the definition of
    relation

21
L.H.S of FD
R.H.S of FD
Better design?
22
Solutions
  • DLOCATIONS multi-valued attribute not
    dependent on DNUMBER (in b)
  • 3 Solutions
  • (1) Previous slide
  • Suffers from redundancy
  • (2) If maximum number is known (e.g. 3) then
    replace DLOCATIONS by 3 attributes Loc 1, Loc2,
    Loc3
  • Lot of nulls
  • Difficulty in querying Find all departments in
    St Cloud
  • (3) Remove DLOCATIONS from this relations and
    create a new relation LOCATION (DNUMBER,
    DLOCATION)
  • Best theoretical solution
  • Project off bad attributes and store in a new
    relation with the key

23
(No Transcript)
24
2NF Second Normal Form
  • Full functional dependency - a Full FD or FFD Y ?
    Z is an FD where removal of any attribute from Y
    means the FD does not hold any more
  • SSN, PNUMBER ? HOURS is a full FD since neither
    SSN ? HOURS nor PNUMBER ? HOURS hold
  • SSN, PNUMBER ? ENAME is not a full FD (it is
    called a partial dependency) since SSN ? ENAME
    also holds (an FFD)
  • (Defn.) A relation schema R is in second normal
    form (2NF)
  • If it in 1NF and
  • If every non-prime attribute A in R is fully
    functionally dependent on every key of R (i.e. No
    partial dependencies allowed for nonprime
    attributes)
  • R can be decomposed into 2NF relations via the
    process of 2NF normalization

25
Non-prime attributes are only associated with the
part of the key on which they are fully
functional dependent
26
Assume COUNTRY_NAME, LOT is another candidate
key
27
3NF Third Normal Form
  • Transitive functional dependency - a FD X?Z that
    can be derived from two FDs X?Y and Y?Z
  • SSN ? DMGRSSN is a transitive FD since
  • SSN ? DNUMBER
  • DNUMBER ? DMGRSSN
  • SSN ? ENAME is non-transitive since there is no
    set of attributes X where SSN?X and X?ENAME
Write a Comment
User Comments (0)
About PowerShow.com