Title: Chapter 8 Normal Forms Based on Functional Dependencies
1Chapter 8Normal Forms Based on Functional
Dependencies
- Deborah Costa
- Oct 18, 2007
28.1 Normalization
- Data redundancy and the consequent modification
(insertion, deletion, and update) anomalies can
be traced to undesirable functional
dependencies in a relation schema - Desirable FD is any FD in a relation schema, R
where the determinant is a candidate key of R
this will not cause data redundancy. - Undesirable FD is where the determinant of an FD
in R is not a candidate key of R and this will
cause data redundancy.
3A little bit of the History
- Database Normalization was first proposed by
Edgar F. Codd. - Codd defined the first three Normal Forms, which
well look into, of the 7 known Normal Forms. - In order to do normalization we must know what
the requirements are for each of the three Normal
Forms that well go over. - One of the key requirements to remember is that
Normal Forms are progressive. That is, in order
to have 3rd NF we must have 2nd NF and in order
to have 2nd NF we must have 1st NF.
4Normalization Update Anomaly
- The same information can be expressed on multiple
records therefore updates to the table may
result in logical inconsistencies. - Example each record in an "Employees' Skills"
table might contain an Employee ID, Employee
Address, and Skill thus a change of address for
a particular employee will potentially need to be
applied to multiple records (one for each of his
skills). If the update is not carried through
successfullyif, that is, the employee's address
is updated on some records but not othersthen
the table is left in an inconsistent state.
Specifically, the table provides conflicting
answers to the question of what this particular
employee's address is. This phenomenon is known
as an update anomaly.
An update anomaly. Employee 519 is shown as
having different addresses on different records.
5Normalization Insertion Anomaly
- There are circumstances in which certain facts
cannot be recorded at all. For example, each
record in a "Faculty and Their Courses" table
might contain a Faculty ID, Faculty Name, Faculty
Hire Date, and Course Codethus we can record the
details of any faculty member who teaches at
least one course, but we cannot record the
details of a newly-hired faculty member who has
not yet been assigned to teach any courses. This
phenomenon is known as an insertion anomaly.
An insertion anomaly. Until the new faculty
member is assigned to teach at least one course,
his details cannot be recorded.
6Normalization Deletion Anomaly
- There are circumstances in which the deletion of
data representing certain facts necessitates the
deletion of data representing completely
different facts. The "Faculty and Their Courses"
table described in the previous example suffers
from this type of anomaly, for if a faculty
member temporarily ceases to be assigned to any
courses, we must delete the last of the records
on which that faculty member appears. This
phenomenon is known as a deletion anomaly.
A deletion anomaly. All information about Dr.
Giddens is lost when he temporarily ceases to be
assigned to any courses.
7Normalization (cont)
- In order to eliminate this problem with
undesirable FD is to somehow render the
undesirable FDs desirable and the process of
doing this is called normalization. - Normal Forms (NFs) provides a stepwise
progression towards the goal of a fully
normalized relation schema that is guaranteed to
be free of data redundancies that cause
modification anomalies from a functional
dependency perspective.
8Normalization (cont)
- A relation schema is said to be in a particular
normal form if it satisfies certain prescribed
criteria otherwise the relation is said to
violate the normal form. The violation of each of
these normal forms signals the presence of a
specific type of undesirable FD. - It is important to note that the normalization
process is anchored to the candidate key of a
relation schema, R. - We will use the primary key as the basis for
evaluating and normalizing a relation schema.
9First Normal Form (1NF)
- First Normal form imposes conditions sot that a
base relation which is physically stored as a
file does not contain records with a variable
number of fields. This is accomplished by
prohibiting multi-valued attributes, composite
attributes, and combinations thereof in a
relation schema. As a consequence the value of an
attribute in a tuple of a relation can be neither
a set of values, nor another tuple. Such
constraint in effect prevents relations from
containing other relations.
101NF Violation and Resolution Figure 8.1 pg 348
As you can see this is schema violates the 1NF
because there are multiple Artirst_nm associated
with an Album_no or the domain of Artist_nm does
not have atomic values. In fact by definition,
ALBUM is not even a relation.
111NF Violation and Resolution Figure 8.1 pg 348
In order to fix ALBUM we must expand the relation
so that there is a tuple for each (atomic)
Artist_nm for a given Album_no. The primary key
for this is Album_no, Artist_nm as we all
should hopefully know by now.
12Second Normal Form (2NF)
- The requirements to satisfy the 2nd NF
- All requirements for 1st NF must be met.
- Redundant data across multiple rows of a table
must be moved to a separate table. - The resulting tables must be related to each
other by use of foreign key.
132nd NF Example
- Only Candidate key is (Employee, Skill)
- Not in 2NF
- Current Work Location is dependent on Employee
- Can Cause an Anomaly
Updating Jones Work location for Typing and
Shorthand but not Whittling. Then asking What is
Jones current work location, can cause a
contradictory answer, because there are 2
different locations.
142nd NF Example
- Both tables are in 2NF
- Meets 1NF requirements
- No non-primary key attribute is dependent on part
of a key
15Third Normal Form (3NF)
- The requirements to satisfy the 3rd NF
- All requirements for 2nd NF must be met.
- Eliminate fields that do not depend on the
primary key - That is, any field that is dependent not only on
the primary key but also on another field must be
moved to another table
16Third Normal FormExample
Eliminate Columns Not Dependent On Key i.e. if a
column is in a relation, then it must be
dependent on the key.
17Third Normal FormExample
Move non-key-dependent attributes to a new table.
18Boyce-Codd Normal Form(BCNF)
- After Codd proposed the first three normal forms
in 1972, it was discover that the 3NF did not
satisfactorily handle a more general case of
undesirable functional dependencies. In other
words data redundancies and the consequent
modification anomalies due to functional
dependencies can persist even after a relation
schema is normalized to 3NF. - Use this mnemonic by Brian Moran to help you
remember - 3NF, but
- All functional dependencies imply the only whole
key. - "The key, the whole key, and nothing but the key,
so help me Codd.
198.2 The Motivating Exemplar Revisited
Normalization concepts have been presented by
analyzing 1NF, 2NF 3NF and BCNF in
isolation. However in practice normal form
violations rarely occur in isolation.
We can see from figure 8.8a that STOCK follows
1NF because there are not composite or
multi-valued attributes in it.
20Motivating Exemplar Revisited (cont)
Using Armstrongs axioms we get Store, Product
and Manager, Location, Product for candidate
keys, however we choose Store, Product as a
primary key. Now that we have the primary key
for STOCK we can see that fd1, fd2, fd3 and fd4
violates 2NF in STOCK fd6 violates 3NF in
STOCK. fd7 violates BCNF in STOCK To fix all of
the violations above we must decompose the
relational schema DR1 R2 R3 R4 R5
21Motivating Exemplar Revisited (cont)
- This section is very confusing in my opinion. So
for better understanding please read it more then
once. - After reading a couple of times we should be able
to know how to decompose the base relation schema
under investigation and know if our decomposition
is complete and correct without looking at the
same data. - A decomposition is complete when it is a
dependency-preserving lossless-join
decomposition. Preservation of FDs is a
verification process and is accomplished by
inspecting the decomposition to see if the union
of the FDs hold on individual relation schema of
D is a cover for F. This is demonstrated in
Section 8.1.5.1.You should also test for the
lossless-join property, the method for testing is
presented in Section 8.1.5.2.