Teknik Informatika Fakultas Teknik - PowerPoint PPT Presentation

1 / 87
About This Presentation
Title:

Teknik Informatika Fakultas Teknik

Description:

Relational Database Schema resulting from an ER-Relational Mapping ... Project Y2K for 23 hours. WORKS-ON [SSN, Proj, Hours] Informatika Unitomo, 31 Agust '03 ... – PowerPoint PPT presentation

Number of Views:1118
Avg rating:3.0/5.0
Slides: 88
Provided by: rahm4
Category:

less

Transcript and Presenter's Notes

Title: Teknik Informatika Fakultas Teknik


1
Module 5FunctionalDependencies
andNormalization
  • Teknik Informatika Fakultas Teknik
  • Universitas Dr. Soetomo Surabaya

2
Module 5 - Motivation
  • Relational Database Schema resulting from an
    ER-Relational Mapping exhibits good design
    characteristics
  • How can we measure the goodness or quality of
    the design ?
  • Use formal theories to distinguish good relation
    schemas from bad
  • Functional dependency information
  • Normalization techniques

3
Module 5 - Contents
  • Functional Dependencies and Normalization
  • Informal Design Guidelines
  • Functional Dependency
  • Normalization
  • Properties of Decomposition

4
Informal Design Guidelines
  • Informal measures of relational database
  • schema quality and design guidelines
  • Semantics of the attributes
  • Reducing the redundant values in tuples
  • Reducing the null values in tuples
  • Disallowing spurious tuples
  • Apply to schemas of base relations
  • (files that will be physically stored)

5
Semantics of the attributes
  • Grouping attributes to construct a relation
    implies that a particular meaning or semantics
    is associated with the attributes
  • Employee Smith, with unique id (SSN) 123, who
    lives
  • at 23 Queens Street, earns salary 34K and
    works in
  • Department Payroll
  • EMPLOYEE SSN, Name, Address, Salary, Dept
  • The employee with unique id (SSN) 123 Works-on
  • Project Y2K for 23 hours
  • WORKS-ON SSN, Proj, Hours

6
Guideline 1
  • Design each relation so that it is easy to
    explain its meaning
  • Do not combine attributes from multiple entity
    types and relationship types into a single
    relation
  • Employee Smith, with unique id (SSN) 123,
  • who lives at 23 Queens Street, earns salary
    34K and
  • works in Department Payroll that is managed
    by
  • Wyeth and is located in Perth
  • EMPLOYEE SSN, Name, Address, Salary, Dept,
    Manager, Location

7
Redundant values in tuples
  • One design goal is to minimize the storage space
    that base relations occupy
  • The way the grouping of attributes into relation
    schemas is done, has a significant effect on
    storage space
  • In addition, an incorrect grouping may cause
    update anomalies which may result in inconsistent
    data or even loss of data

8
Example with redundant values
  • Consider relation BigEmp,
  • which groups tuples from
  • two distinct entity types

9
Update Anomalies
  • Insertion Anomalies
  • Problems with inserting new tuples
  • Deletion Anomalies
  • Problems with deletion of tuples - may result
  • in loss of information
  • Modification Anomalies
  • Problems with modifying values of attributes
  • may result in inconsistent data

10
Insertion Anomaly 1
  • Insertion of new employee requires correct
    department name and location details or null
    values if the employee not assigned to any
    department yet

11
Insertion Anomaly 2
  • Insertion of new department that has no employees
    yet is not possible since EMP (primary key of
    BigEmp) can not be null

12
Deletion Anomalies
  • Deletion of all employees in Department D4
    results in loss of departments name and location

13
Modification Anomalies
  • If we change the location of the Shipping
    department, many tuples must be modified to keep
    the database consistent

14
Guideline 2
  • Design the base relation schema so that no
    insertion, deletion, or modification anomalies
    occur in the relations
  • If any do occur, ensure that all applications
    that access the database update the relations in
    such a way as to not compromise the integrity of
    the database

15
Null Values in Tuples
  • Tuples without data for certain attributes are
    said to contain null values
  • Example NULL value for dept in EMPLOYEE
  • relation (123,Jones, 12 Queen St, NULL, 34K)
  • When occurring in large numbers, null values
    represent a waste of physical disk space
  • More importantly, null values can introduce
    problems with the semantics of relations, and in
    specification of various operations such as join,
    aggregation

16
Semantics of Null Values
  • Nulls can have multiple interpretations
  • Attribute does not apply to this relation
  • Attribute value for this tuple is unknown
  • The value is known but not yet recorded yet
  • Having the same representation (NULL) for all
    possibilities above compromises the different
    meanings they may have

17
Guideline 3
  • As far as possible, avoid placing attributes in a
    base relation whose values may be null
  • If nulls are unavoidable, make sure that they
    apply in exceptional cases only and that they do
    not apply to a majority of tuples in the relation

18
Disallowing spurious tuples
  • The bifurcation, or general decomposition, of a
    large relation into a set of smaller ones can
    result in numerous problems
  • One problem involves the situation where
    information in the original relation is lost due
    to the decomposition
  • Ironically, the problem becomes apparent when
    the
  • smaller relations are joined back together
    and the
  • result is larger than the original relation,
    i.e. contains
  • spurious tuples

19
Example Decomposition
20
Join with Spurious Tuples
21
Guideline 4
  • Design the relation schemas so that they can be
    (relationally) joined with equality conditions on
    attributes that are either primary keys or
    foreign keys in a way that guarantees that no
    spurious tuples are generated
  • Do not have relations that contain matching
  • attributes other than foreign key - primary
    key
  • combinations
  • If such relations are unavoidable, do not join
  • them on such attributes, because the join may
  • contain spurious tuples

22
Module 5 - Contents
  • Informal Design Guidelines
  • This section presented informal
  • guidelines for good relational design
  • The following sections present formal
  • concepts and applied theory
  • Functional Dependency
  • Normalization
  • Properties of Decomposition

23
Functional Dependency
  • Single most important concept in relational
  • schema design
  • To understand the concept, we will ...
  • introduce, motivate and formally define the
  • term functional dependency
  • describe basic inference rules for functional
  • dependencies
  • use these inference rules to derive others and
  • to compute closures

24
What is Functional Dependency
  • A function dependency (FD) is a constraint
    denoted X ? Y between two sets of attributes X
    and Y from a relational schema R.
  • The FD specifies a restriction on the possible
    tuples that can form a relation instance r of R

25
The FD Constraint - Informally
  • Example A relation
  • DEPARTMENT
  • (DNO, DNAME, DLOC)
  • can have following FDs
  • FD1 DNO -gt DLOC
  • FD2 DNO -gt DNAME
  • Functional dependency X -gt Y holds if and only if
    whenever two tuples agree on their X-value, they
    must necessarily agree on their Y value

26
The FD Constraint - Formally
  • The FD constraint is that for any two tuples t1
    and t2 in the relation instance r that have
  • t1X t2X
  • we must also have
  • t1Y t2Y
  • This means that the values
  • of the Y component of a tuple depend on, or are
    determined by, the X component
  • The values of the X component of a tuple uniquely
    (or functionally) determine the values of the Y
    component

If X -gt Y holds, then Y is functionally dependent
on X X is termed the left-hand-side (LHS)
of the FD Y is termed the right-hand-side
(RHS) of the FD
27
What determines FDs
  • FD is a property of the meaning or semantics of
    the attributes
  • Humans use their understanding of these semantics
    to create and specify each FD, they cannot be
    accurately obtained from viewing a particular
    relation instance
  • However, if a FD does not hold in even one
    instance then the FD cannot hold on the schema

28
Relation at Time T1
  • Can you determine if the FD STATE -gt CODE holds
    true in the database from viewing the instance
    above?

29
Relation at Time T2
  • If the existence of the Smith tuple is legal,
    then it implies that the proposed FD
  • STATE -gt CODE is invalid !

30
Application of FDs
  • The main use of FDs is to further describe (if
    the semantics of the application warrant) a
    relation schema R by specifying constraints on
    its attributes that must hold at all times
  • Relation instances r(R), that satisfy the FD
    constraints are called legal extensions, legal
    instances or legal relation states

31
Inferred FDs
  • Designer specifies obvious FDs, however numerous
    other FDs can be inferred
  • Let F be a set of FDs on R
  • F Pnum -gt Pname, Plocation, Dnum,
  • Dnum -gt Dname, MgrSSN, MgrStartDate
  • FDs that can be inferred are
  • Pnum -gt MgrStartDate
  • Dnum -gt MgrSSN
  • Pnum -gt Pnum

32
Notation for Inferred FDs
  • The notation F X -gt Y denotes that the
  • FD X -gt Y is inferred from the set of
  • functional dependencies F
  • Example
  • F Pnum -gt MgrStartDate,
  • Dnum -gt MgrSSN,
  • Pnum -gt Pnum
  • where F Pnum -gt Pname, Plocation, Dnum,
  • Dnum -gt Dname, MgrSSN, MgrStartDate

33
Compact Notation for FDs
  • Pnum -gt PName, PLocation, Dnum
  • Dnum -gt DnaMe, MGrSSN, MgrStartDate
  • ESSN, DependenT_Name -gt BDate
  • Compact Notation
  • P -gt NLD
  • D -gt MGS
  • ET -gt B

34
Inference Rules
  • It is practically impossible to specify all
  • possible FDs that may hold
  • To infer additional FDs from a set of valid FDs
    we need a system of inference rules
  • There are 6 inference rules IR1 - IR6
  • IR1-IR3 are referred to as Armstrongs
  • Inference Rules
  • The set of all dependencies that can be
  • inferred from a given F is called the
  • closure of F, denoted by F

35
IR1 Reflexive Rule
  • If X ? Y then X -gt Y
  • A set of attributes always determines itself or
  • any of its subsets
  • If ESSN, Dependent_Name ? ESSN then
  • ESSN, Dependent_Name -gt ESSN holds

36
IR2 Augmentation Rule
  • X -gt Y XZ ? YZ
  • Adding the same set of attributes to both the
  • LHS RHS of a FD results in another
  • valid FD
  • If SSN -gt Ename then
  • SSN, Address -gt Ename, Address
  • holds

37
IR3 Transitive Rule
  • X -gt Y, Y -gt Z X -gt Z
  • FDs are transitive
  • If SSN -gt Dno
  • and Dno -gt Dlocation
  • then SSN -gt Dlocation holds

38
Armstongs Inference Rules
  • The rules IR1-IR3 are sound and complete.
  • Sound Given a set of FDs F holding on a relation
    schema R, any FD that we can infer from F by
    using IR1-IR3 holds in every legal relation
    instance
  • Complete Repeatedly applying IR1-IR3 generates
    all possible FDs that can be inferred from F

39
IR4 Decomposition Rule
  • X -gt YZ X -gt Y
  • We can remove attributes from the RHS of a
  • dependency, and decompose the FD
  • If SSN -gt Ename, Dno
  • then SSN -gt Ename and SSN -gt Dno
  • both hold

40
Proof of Decomposition Rule
  • We can use IR1-IR3 to prove IR4
  • 1 X -gt YZ Given
  • 2 YZ -gt Y Using IR1 plus YZ ? Y
  • 3 X -gt Y Using IR3 on results of 1 2

41
IR5 Union Rule
  • X -gt Y, X -gt Z X -gt YZ
  • We can union attributes from the RHS of a
  • dependency, and combine a set of FDs
  • into a single FD (reverse of IR4)
  • If SSN -gt Ename and SSN -gt Dno
  • then SSN -gt Ename, Dno
  • holds

42
Proof of Union Rule
  • We can use IR1-IR3 to prove IR5
  • 1 X -gt Y Given
  • 2 X -gt Z Given
  • 3 XX -gt XY Using IR2 on 1 (adding X)
  • 4 X -gt XY Since XXX (not using IRI-R3)
  • 5 XY -gt YZ Using IR2 on 2 (adding Y)
  • 6 X -gt YZ Using IR3 on 4 5

43
IR6 Pseudotransitive Rule
  • X -gt Y, WY -gt Z WX -gt Z
  • Represents a variant of IR3
  • If SSN -gt MgrSSN and
  • MgrSSN, Dependent_Name -gt
  • Relationship
  • then SSN, Dependent_Name -gt Relationship

44
Proof of Pseudotransitive Rule
  • We can use IR1-IR3 to prove IR6
  • 1 X -gt Y Given
  • 2 WY -gt Z Given
  • 3 WX -gt WYUsing IR2 on 1 (adding W)
  • 4 WX -gt Z Using IR3 on 3 2

45
Use of Inference Rules
  • Step 1 Database designers first specify the set
    of FDs F
  • that can easily be determined from
    semantics
  • Step 2 Armstrongs rules are used to infer
    additional FDs
  • that also hold
  • Determine set of attributes X that appear on a
    FDs LHS
  • Use Armstrongs rules to determine attributes
  • determined by X (called X)
  • X is the closure of X under the set F

46
Computing the Closure (X)
  • Algorithm to compute the closure of X under F
  • X X
  • repeat
  • old X X
  • for each FD Y -gt Z in F do
  • if Y ? X then X X? Z
  • until (old X X )

47
Example Closure Computation
  • X Pnumber
  • F Pnumber -gt Pname, Plocation, Dnum
  • Dnum -gt Dname, MgrSSN, MgrStartDate
  • Step 1 X Pnumber
  • Step 2 X Pnumber, Pname, Plocation, Dnum
  • Step 3 X Pnumber, Pname, Plocation, Dnum,
  • Dname, MgrSSN,
    MgrStartDate

48
Module 5 - Contents
  • Informal Design Guidelines
  • Functional Dependency
  • Functional dependencies are constraints forced
    on all
  • database instances
  • The designer specifies some intuitive FDs but
    others
  • can be inferred using rules
  • The closure of a set of FDs F is the set of
    all FDs F that
  • can be inferred from F, where as the
    closure of a attribute
  • X under F is the set of attributes X
    that are functionally
  • determined by X
  • Normalization
  • Properties of Decomposition

49
Normalization
  • Normalization is a process that aims at achieving
    better designed relational database schemas
    through the use of semantic information given by
  • Functional Dependencies
  • Primary Keys

50
The Normalization Process
  • The normalization process takes a relational
    schema through a series of tests to certify
    whether it satisfies certain conditions
  • The schemas that satisfy certain condition are
  • said to be in a given Normal Form.
  • Unsatisfactory schemas are decomposed by
  • breaking up their attributes into smaller
  • relations that possess desirable properties
  • (e.g., no anomalies)

51
Review of Key Concepts
  • Superkey - A set of attributes such that no two
  • tuples have the same values for these attributes
  • Key - A Minimal Superkey, called a Candidate
  • Key if more than one
  • Primary key - A selected candidate key
  • Secondary key - Remaining candidate keys
  • Prime Attribute - An attribute that is a member
    of
  • any candidate key
  • Non-prime attribute An attribute that is not a
  • member of any candidate key

52
First Normal Form (1NF)
  • A relation schema is in 1NF if domains of
    attributes
  • include only atomic (simple, indivisible) values
  • and the value of an attribute is a single value
  • from the domain of that attribute
  • 1NF disallows
  • having a set of values, a tuple of values, or
    a
  • combination of both as an attribute value for
    a
  • single tuple
  • relations within relations and relations as
  • attributes of tuples
  • Considered part of the formal definition of
    relation

53
Non-1NF Relation
Non atomic values
54
Relations in 1NF
Problem with this design Redundancy (Jones, 123)
55
Relations in 1NF
  • Problem with this design
  • Too many Null values
  • What if an order has gtmax (4) items ?

56
Second Normal Form (2NF)
A relation schema is in 2NF if it is in 1NF, and
every non-prime attribute is fully functionally
dependent on the primary key
  • A FD X -gt Y is termed partial
  • if some attribute can be
  • removed from X and the
  • dependency still holds
  • Formally, for some attribute
  • A? X, (X-A) -gt Y is true
  • A FD X -gt Y is termed full if
  • removal of any attribute
  • from X means that the FD
  • no longer holds
  • Formally, for any attribute
  • A ? X, (X-A) -gt Y is false

57
Non 2NF Relation
  • Relation DEP
  • DEP SSN, Dep_Name, Dep_Bdate,
  • Emp_Bdate
  • Functional Dependencies in DEP
  • FD1 SSN, Dep_Name -gt Dep_Bdate
  • FD2 SSN -gt Emp_Bdate

58
Relation in 2NF
  • 2NF relations can be produced by creating a
    decomposition where non-prime attributes are
    associated only with the part of the primary key
    on which they are fully dependent
  • Relations after decomposition of DEP
  • EMP SSN, Emp_Bdate
  • NEW-DEP SSN,Dep_Name, Dep_Bdate

59
Third Normal Form (3NF)
  • A relation schema is in 3NF if it is in 2NF,
  • and no non-prime attribute is transitively
  • dependent on the primary key
  • A FD X -gt Y is termed transitive if there is a
  • set of attributes Z that is not a subset of any
  • key and both X -gt Z and Z -gt Y hold

60
Non 3NF Relation
  • Relation EMP
  • EMPName, State, Code

Functional Dependencies in PERSON FD1 NAME -gt
STATE, CODE FD2 CODE -gt STATE Transitive
Dependency !
61
Relation in 3NF
  • 3NF relations can be produced from 2NF relations
    by creating a decomposition where no transitive
    dependencies exist

Relations after decomposition of EMP EMP-CODE
Name, Code STATE-CODE State, Code
62
Generalization of 2NF and 3NF
  • 2NF and 3NF respectively require that a schema
    should have neither partial nor transitive
    dependencies on the primary key to avoid update
    anomalies
  • However the dependencies could exist for
  • other candidate keys !
  • General definitions for 2NF and 3NF ...

63
General Definition of 2NF
  • A relation schema R is in 2NF if it is in 1NF,
  • and every non-prime attribute is fully
  • functionally dependent on every key of R
  • Property PID, Shire, Lot, Size, Tax_Rate,
    Price
  • FD1 PID -gt Shire, Lot, Size, Tax_Rate, Price
  • FD2 Shire, Lot -gt PID, Size, Tax_Rate, Price
  • FD3 Shire -gt Tax_Rate
  • FD4 Size -gt Price
  • What are the candidate keys of Property ?

64
Generalized 2NF Example
  • Property PID, Shire, Lot, Size, Tax_Rate,
    Price
  • Candidate keys are PID Shire, Lot
  • Is there a partial dependency on any key ?
  • FD1 PID -gt Shire, Lot, Size, Tax_Rate, Price
  • FD2 Shire, Lot -gt PID, Size, Tax_Rate , Price
  • FD3 Shire -gt Tax_Rate
  • If so, the relation is not in 2NF
  • 2NF Relations after decomposition
  • Property PID, Shire, Lot, Size, Price
  • ShireRates Shire, Tax_Rate

65
General Definition of 3NF
  • A relation schema R is in3NF if for every FD
  • X-gtA that holds on R, either
  • a) X is a superkey of R
  • OR
  • b) A is a prime attribute of R
  • This definition can be applied directly to test
    for 3NF without having to check for 2NF first

66
Generalised 3NF Example
  • Property PID, Shire, Lot, Size, Price
  • Primary key PID, Candidate key Shire, Lot
  • FD1 PID -gt Shire, Lot, Size, Price
  • FD2 Shire, Lot -gt PID, Size, Price
  • FD4 Size -gt Price
  • Does any FD fail the 3NF test?
  • If so, the relation is not in 3NF
  • 3NF Relations after decomposition
  • Property PID, Shire, Lot, Size
  • Cost Size, Price

67
Alternate Definition of 3NF
  • A relation schema R is in 3NF if every nonprime
    attribute of R is
  • a) Fully functionally dependent on every key of
    R
  • AND
  • b) Non-transitively dependent on every key of R

68
Boyce-Codd NF (BCNF)
  • A relation schema R is in BCNF if for every
  • FD X-gtA that holds on R, X is a superkey of R
  • A stricter normal form than 3NF
  • In practice, most 3NF relational schemas are
  • also in BCNF
  • BCNF is better than 3NF unless other
  • properties (to be discussed) are lost

69
BCNF Example
  • Property PID, Shire, Lot, Size
  • FD1 PID -gt Shire, Lot, Size
  • FD2 Shire, Lot -gt PID, Size
  • FD5 Size -gt Shire
  • Candidate keys are PID Shire, Lot
  • Size is not a super key
  • But, Shire is a prime attribute
  • Property is in 3NF but not in BCNF (due to FD5)

70
BCNF Example (cont)
  • BCNF Relations after decomposition
  • PropertyPID, Size, Lot
  • ShireLotSize Size, Shire
  • where
  • FD1 PID -gt Shire, Lot, Size
  • FD2 Shire, Lot -gt PID, Size
  • FD5 Size -gt Shire
  • Property and ShireLotSize are in BCNF

71
Limitations of Normalization
  • Isolation
  • Normal forms, when considered in isolation for
  • individual relations, do not guarantee a good
    design
  • Normalization must also consider additional
    properties
  • that the relational schemas, taken
    together, must
  • possess (loss less join and dependency
    preservation)
  • Performance
  • Ideally, relation schemas should be at least
    in 3NF.
  • However, for performance reasons, some
    designers
  • leave relations in lower forms

72
Module 5 - Contents
  • Informal Design Guidelines
  • Functional Dependency
  • Normalization
  • Properties of Decomposition
  • Dependency Preservation Property
  • Loss less (non additive) Join Property

73
Properties of Decompositions
  • Normalization must also consider
  • additional properties that the relational
  • schemas, taken together, must possess
  • Dependency Preservation Property
  • Loss less (non additive) Join Property

74
Dependency Preservation
  • Informally, each FD in F must either appear
    directly on one of the relation schema in the
    decomposition or can be inferred from the
    dependencies that appear in some
  • relation schema
  • FDs are user-specified constraints that
    capture
  • data semantics. The DBMS should enforce
    them
  • efficiently
  • The relational database schema must exhibit
  • the dependency preservation property
    for
  • this to occur

75
Formal Definition
  • A decomposition D R1,R2,,Rm of R is
    dependency preserving with respect to a set of
    functional dependencies F on R if the union of
    projections of F on each Ri in D is equivalent to
    F, that is
  • (p R1(F) ? p R2(F) ?p Rm(F) ) F
  • where the projection of F on Ri, denoted by
    pRi(F) is the set of dependencies X -gt Y in F,
    such that the attributes X ? Y are contained in
    Ri.

76
Decomposition with Lost FD
  • Property PID, Shire, Lot, Size decomposed to
  • Property PID, Size, Lot
  • ShireLotSize Size, Shire
  • FD1 PID -gt Shire, Lot, Size
  • FD2 Shire, Lot -gt PID, Size
  • FD3 Size -gt Shire
  • Property FP PID -gt Lot, Size
  • ShireLotSize FS Size -gt Shire
  • (FP ? FS) FD1,FD3
  • What about FD2?

77
Dep. Preserving Decomposition
  • EMP EMP, STATE, CODE decomposed to
  • Emp-Code(EMP,CODE)
  • State-Code(STATE,CODE)
  • FD1 EMP -gt STATE
  • FD2 EMP -gt CODE
  • FD3 CODE -gt STATE
  • Emp-Code FE EMP -gt CODE
  • State-Code FS CODE -gt STATE
  • (FE?FS)FD2,FD3, (FE?FS)FD1,FD2,FD3

78
Decomposition with Lost FD
  • EMP EMP, STATE, CODE decomposed to
  • Emp-Code(EMP,STATE)
  • State-Code(STATE,CODE)
  • FD1 EMP -gt STATE
  • FD2 EMP -gt CODE
  • FD3 CODE -gt STATE
  • Emp-Code FE EMP -gt STATE
  • State-Code FS CODE -gt STATE
  • (FE?FS)FD2,FD3, (FE?FS)FD1,FD3 F2?

79
Loss less (non-additive)Join
  • Loss less refers to the loss of information,
  • not to loss of tuples
  • A decomposition which is not loss less,
  • may result in spurious tuples after a join
  • is applied to the decomposed relations

80
Formal Definition
  • A decomposition D R1,R2,,Rm of R is
  • lossless with respect to a set of functional
  • dependencies F on R if for every state r of
  • R that satisfies F, the following holds
  • p R1(r) p R2(r) p Rm(r) r
  • where p R1(r) is the projection of r on Ri
  • and is a join

81
Decomposition with Loss
  • EMP EMP, STATE, CODE decomposed to
  • Emp-Code(EMP,STATE)
  • State-Code(STATE,CODE)
  • When joined together, it will result in spurious
    tuples
  • Information regarding the CODE for the EMP is lost

82
Join with Spurious Tuples
83
Relational Database Design
  • Design by Analysis
  • (Top Down Strategy)
  • 1. Start from attribute
  • groupings as obtained
  • from conceptual modeling
  • and mapping activities
  • 2. Analyze and decompose
  • until desirable properties
  • are met
  • (Covered in Modules 4-5)
  • Practical and widely used
  • approach
  • Design by Synthesis
  • (Bottom Up Strategy)
  • 1. Start from individual attributes and FDs
  • 2. Synthesize into a 3NF (or higher) relations
  • (Not Covered)
  • All possible FDs must be identified in advance.
    Even if a few FDs are missing, the schema can be
    suboptimal
  • Results are dependent on the order in which the
    synthesis algorithm receives the
    FDs(nondeterministic)

84
From UoD to Relational Schema
85
Module 5 - Review
Functional Dependency concepts and Normalization
techniques are formal theories to measure
thegoodness or quality of a relational schema
design resulting from ER-Relational mapping
  • Functional Dependencies
  • represent data semantics.
  • Some FDs can be specified by
  • database designers, others
  • can be inferred. These FDs
  • can be used to enforce
  • constraints on data that should
  • hold on all instances of a given
  • schema
  • Normalization techniques use FD and primary key
    information to restructure (decompose) relation
    schemas, such that the resulting relations
    possess desirable properties and are said to be
    in a given normal form. A higher NF is generally
    better, however other factors such as performance
    and decomposition need to be considered

86
Recommended Readings
  • Elmasri Navathe
  • Chapter 14,
  • Optional15(15.1.2, 15.1.3)

87
Next ...
  • Module 6
  • Relational Algebra
Write a Comment
User Comments (0)
About PowerShow.com