Advanced Issues in Table Normalization - PowerPoint PPT Presentation

1 / 29
About This Presentation
Title:

Advanced Issues in Table Normalization

Description:

Our non-key attribute definition: ... Hulk Hogan. Advanced Sports Marketing. 1. 401. SpMkg. Marketing. Marc Ekking. Marketing Strategy ... – PowerPoint PPT presentation

Number of Views:70
Avg rating:3.0/5.0
Slides: 30
Provided by: odinLcb
Category:

less

Transcript and Presenter's Notes

Title: Advanced Issues in Table Normalization


1
Advanced Issues in Table Normalization
  • Class 10
  • DSC 544/444
  • Fall 2007

2
Learning Goals
  • Extend definition of non-key attribute
  • Requires notion of a candidate key
  • Boyce-Codd normal form
  • Fourth normal form
  • Denormalization

3
Non-key attributes
  • Our non-key attribute definition
  • Any attribute other than those defining the
    tables primary key.
  • We will extend this definition.
  • Why?
  • Because if we dont, then we would sometimes
    create unnecessary tables when going to 2NF or
    3NF.
  • Consider the following example

4
Candidate Keys
  • Recall the Team table
  • Team(TeamName, TeamColor, CoachName, CoachPhone)
  • Beyond the implied PK dependencies(i.e.,
    TeamName ? TeamColor, CoachName,
    CoachPhone)lets assume the following
    dependencies hold

TeamName
TeamColor
CoachName
CoachPhone
5
Candidate Keys
  • What do the following dependencies imply?
  • TeamColor ? TeamName, CoachName, CoachPhone
  • CoachName ? TeamColor, TeamName , CoachPhone
  • CoachPhone ? TeamColor, CoachName, TeamName
  • They mean that there are alternative primary
    keys!
  • TeamColor
  • CoachName
  • CoachPhone

TeamName
TeamColor
CoachName
CoachPhone
6
Candidate Keys
  • What is a candidate key for a table?
  • An attribute (or attribute combination) that
    could serve as a PK for the table
  • i.e., it determines all other attributes
  • In our example, the table has 4 candidate keys
  • TeamName
  • TeamColor
  • CoachName
  • CoachPhone

7
Non-key attributes
  • We can now provide an extended definition of what
    is a non-key attribute.
  • Our generalized non-key attribute definition
  • Any attribute other than those defining the
    tables candidate keys.
  • The 2NF and 3NF definitions do not change!
  • 2NF requirement no non-key attribute should be
    determined by part of a key.
  • 3NF requirement satisfy 2NF, and no non-key
    attribute should be determined by another non-key
    attribute.

8
Applying the Generalized Definition
  • In our current example, what are the non-key
    attributes?
  • None!
  • So, do we have any 2NF or 3NF violations?
  • No!
  • Is there really any unnecessary data redundancy
    with this table design?Team(TeamName, TeamColor,
    CoachName, CoachPhone)
  • Not given our assumption of 1 coach per team

9
What was wrong with the old definition?
  • Without our more enlightened definition of a key
    (and thus non-key) attribute, we might do
    pointless things.
  • Not necessarily wrong, but pointless.
  • Example
  • We previously noted the dependency
  • CoachName ? CoachPhone and thought
  • oh no, a dependency between two non-key
    attributes bad!
  • We split the team table into 2 tables as a
    result.
  • But, we did not reduce data redundancy, we simply
    split the table in two.
  • Note we still might like to split Coach into a
    separate table, but the reason would not be
    related to data redundancy

10
A simple application
  • Consider the following example
  • Student(StudentNum, SSN, Name, Address)
  • Dependencies?
  • StudentNum ? SSN, Name, Address
  • SSN ? StudentNum , Name, Address
  • Candidate keys
  • ___________
  • ___________
  • Non-key attributes
  • Using prior definition ________ , _________ ,
    _________
  • Using generalized definition _______ , _______ ,
    ______

11
A simple application
  • Consider the following table design
  • Student(StudentNum, SSN, Name, Address)
  • Normal form level
  • Using prior definition _____
  • After fixing the implied 3NF violation, we get
  • Student(StudentNum, SSN)
  • ContactInfo(SSN, Name, Address)
  • Does this reduce the amount of data we are
    storing?
  • No, all we did is split the same data across 2
    tables
  • Using generalized definition _____

12
Learning Goals
  • Extend definition of non-key attribute
  • Requires notion of a candidate key
  • Boyce-Codd normal form
  • Fourth normal form
  • Denormalization

13
Boyce-Codd Normal Form
  • Why another normal form level?
  • There are some opportunities for improvement that
    are not caught by the 2NF and 3NF rules of thumb
  • For example, consider a Course Section table with
    the following determinants
  • (Department, Num, SecNum) the PK
  • Prefix ? Department
  • Aside
  • This info. implies that there are 2 candidate
    keys
  • (Department, Num, SecNum)
  • (Prefix, Num, SecNum)

14
Boyce-Codd Normal Form
  • Sample data for Course Section table
  • Because Prefix ? Department, we know that
    (Prefix, Num, SecNum) could also be a primary key
    for this table.

15
Boyce-Codd Normal Form
  • What normal form level is this table in?
  • Remember the sequence of questions to ask
    yourself (for any table!)
  • What are the non-key attributes?
  • Any partial dependencies (2NF violations)?
  • Any transitive dependencies (3NF violations)?
  • So, this table is in _____ normal form.

16
Boyce-Codd Normal Form
  • Is there a different design for the Course
    Section attributes that we might prefer? How
    about
  • Would this design reduce any data repetition? ___

Prefix
Num
SecNum
CourseName
Instructor
17
Boyce-Codd Normal Form
  • Boyce and Codd recognized this kind of table
    design that would have unnecessary data
    redundancy
  • not flagged by either the 2NF or 3NF rules!
  • So, they invented a stronger normal form rule.
  • Boyce-Codd normal form (BCNF) requirement
  • Every determinant must be a candidate key.
  • Any table that satisfies BCNF must be in 3NF
  • And therefore satisfies 2NF as well

18
Boyce-Codd Normal Form
  • Consider the prior table design
  • Recall the candidate keys
  • (Department, Num, SecNum)
  • (Prefix, Num, SecNum)
  • Is every determinant a candidate key? _____
  • Prefix ? Department

19
Learning Goals
  • Extend definition of non-key attribute
  • Requires notion of a candidate key
  • Boyce-Codd normal form
  • Fourth normal form
  • Denormalization

20
Fourth Normal Form
  • Another normal form?
  • Dont the 2NF, 3NF, and BCNF rules of thumb
    capture all the possible types of unnecessary
    data redundancy?
  • You wish!
  • Consider the following E-R diagram

Movie
Title
21
Fourth Normal Form
  • Here is one possible table design
  • Movie(Title, ActorName, WriterName)
  • Notice that this design can store any combination
    of movie titles with actor and writer names you
    desire, so it does work.
  • But, it is not a good design since there will be
    unnecessary data redundancy.
  • Consider a movie title for which there are 12
    actors but only 1 writer.
  • How many times will you be forced to repeat the
    writer name? ______________

22
Fourth Normal Form
  • OK, so the design is not very good.
  • But, what is its normal form level?
  • There are no non-key attributes, so
  • 2NF check ensure no non-key attribute depends on
    part of key
  • 3NF check ensure no dependencies between non-key
    attributes
  • We definitely pass those tests!
  • BCNF check that every determinant is a candidate
    key
  • Yes, we pass this test too, because the only
    determinant is our primary key
  • So the table seems great according to those
    concepts.

23
Fourth Normal Form
  • The Fourth Normal Form concept
  • No table should contain independent multi-valued
    dependencies.
  • This table does not pass 4NF
  • Movie(Title, ActorName, WriterName)
  • This table does pass 4NF
  • Rental(TripID, MemberName, EquipmentID)

24
Fourth Normal Form
  • Why does one pass and not the other?
  • Because for the Movie, the multi-valued
    dependencies are independent.
  • The following diagram works
  • For a movie title there is a list of actors and a
    list of writers
  • Is this examples diagram sufficient?

Movie
Title
Rental
TripID
25
Fourth Normal Form
  • Because MemberName and EquipmentID are
    interlinked, they are not called independent
    multi-valued dependencies.
  • So, the Rental table passes the 4NF test.
  • That is a good thing, because it naturally
    follows from the correct E-R diagram.

Rental
Trip
Equipment
Member
26
Learning Goals
  • Extend definition of non-key attribute
  • Requires notion of a candidate key
  • Boyce-Codd normal form
  • Fourth normal form
  • Denormalization

27
Denormalization
  • Higher normal form implies less data redundancy.
  • But at what cost?
  • More tables to deal with!
  • Denormalization purposeful violation of a normal
    form
  • Choosing to have fewer tables (incurring some
    data repetition)
  • Why might you do this?
  • If splitting the table would cause little gain in
    terms of reducing redundancy, and the split table
    would often need to be reconstituted (which
    requires processing time)

28
Denormalization
  • Example
  • Do we want to bother splitting the table to
    create a separate table listing prefixes and the
    associated depts?
  • Manager may decide dont bother.
  • Remember normal form rules help you identify
    data-redundancy concerns, but it is up to you to
    decide if you want to bother splitting a table to
    address the dependency you found. The rules are
    rules of thumb.

29
Summary
  • Normalization rules-of-thumb are more reliable if
    we define key attributes as those within any
    candidate key.
  • BCNF rule is stronger than 3NF and can sometimes
    catch problems not found by 3NF.
  • Check whether every determinant is a candidate
    key.
  • 4NF rule checks for a special type of problem
  • Independent multi-valued dependencies
  • Sometimes, we may choose to go to a lower normal
    form level (denormalization).
Write a Comment
User Comments (0)
About PowerShow.com