Title: Aims and Objectives
1Aims and Objectives
- In this session you will be able to
- Explain normalisation.
- Explain the problems that exist when data is not
normalised. - List the fundamental concepts of normalisation.
- Describe the (three) steps of the normalisation
process. - Perform the normalisation process on a set of
data.
2Normalisation
- We are still in the Systems Analysis stage, where
we are modelling the real world of the system. - We have used DFDs to model the processes. We
have used ERDs to model the data. This top-down
approach modelled entity types and the
relationships between them. - There are attributes associated with each entity
type, but ERDs do not model these. Attributes
are lower level data and must be modelled too in
order to help design a computer database. - Normalisation is the technique used to do this.
3Normalisation
- Normalisation is a bottom-up approach that models
lower level - details about data. But why do we bother to use
normalisation? - By normalising data, we can ensure a structure
for the data in a - system which
- Makes the data easy to retrieve.
- Ensures that integrity of the can be maintained
(i.e. is consistent, up-do-date, and correct at
all times).
4Normalisation
- Lets take an entity Warehouse with the following
attributes
- There are problems with this data as it stands
- Update difficulties (repetition of rname, pdesc,
pcode - redundant). - Insertion difficulties (new product codes? new
regions?). - Deletion difficulties (delete H in W 10, and all
details about screwdrivers are lost too). - Retrieval is slow because entity will be so big.
5Normalisation
- These problems are called update anomalies (i.e.
problems with - deletion, insertion, editing). They result in
data that is difficult to - retrieve and whose integrity may be compromised.
- They are caused by certain kinds of relationships
between - attributes. These problem-causing relationships
are known as - undesirable Functional Dependencies.
- Undesirable Functional Dependencies must be
identified and - removed. The process of doing this is known as
Normalisation.
6Concepts in Normalisation
- Identifier or Key This allows data to be
uniquely identified. One or more attribute in
each entity must be assigned as the key. - Repeating groups Attributes which occur
repeatedly for a single occurrence of another
attribute/s. - Functional Dependencies (FDs) Relationships
between the attributes. There are three types of
FDs - full FD (essential),
- partial FD (problem-causing and undesirable),
- and transitive FD (problem-causing and
undesirable).
7Functional Dependency
Attribute B is fully functionally dependent on
attribute A, if knowing the value of A uniquely
determines the value of B
- For any particular value of A there is one and
only one corresponding value of B. This is
denoted as A ? B.
For example CUSTOMER(Account No, Name, Address,
DOB, Balance)
Name, Address, DOB and Balance are fully
functionally dependent on Account No which is
the key or identifier i.e. Account No ? Name,
Address, DOB, Balance.
8Functional Dependency
Sometimes there can be more than one attribute
involved in the key or identifier. For
example LINE ITEM (Order, Part,
Part-description, Qty, Price)
- Order alone will not determine any of the other
attribute types. - A combination of Order and Part will be
associated with single, unique values of the all
other attribute types. - Order, Part ? Part-description, Qty, Price.
9Partial Functional Dependency
LINE ITEM (Order, Part, Part-description, Qty,
Price)
- Full functional dependencies exist between
- Order, Part ? Part-description, Qty, Price.
- Part ? Part-description.
- However, to obtain a unique value for
Part-description, does not require both Order
and Part. It requires only Part. - Part-description is not fully functionally
dependent on the entire key of Order,Part. - Part-description is in fact only partially
dependent on the whole key Order,Part (i.e.
depends on part of the key).
10Transitive Functional Dependency
Lets take another example PURCHASE ORDER
(O,S, SName, SAddr, Date, TPrice.). The key
for what we see here is O which means O ? S,
SName, SAddr, Date, TPrice. This is true, but if
we think about it, dont supplier name and
address really depend on supplier number? Which
means S ? SName, SAddr. This in fact is a
transitive dependency (why cant it be a
partial dependency?), which means O ? S ?
SName, SAddr.
11Steps in Normalisation
- The steps involved in Normalisation are
- First Normal Form (1NF) remove repeating
groups. - Second Normal Form (2NF) remove partial
functional - dependencies.
- Third Normal Form (3NF) remove transitive
functional - dependencies.
12First Normal Form (1NF)
WAREHOUSE
(W,Loc,R,RName, (P,PDesc,Price,PCode,CDesc,Qty)
)
We have update anomalies here, caused by
repeating groups. We need to put the data into
1NF to help remove some of the anomalies. This
means we must form a new entity type from the
attributes involved in the repeating groups.
13First Normal Form
Remove all repeating groups.
After applying our 1NF rule we have
14First Normal Form
- The data looks like the following
15First Normal Form
- If the initial set of attributes does not have
any repeating groups then it is in 1NF. - After removal it must be possible to reconstitute
the original entity type. - One or more attribute types must be common to
both entity types after the split.
16Second Normal Form (2NF)
WAREHOUSE(W, Loc, R,Rname) STOCK(W,P, PDesc,
Price, PCode, CDesc,Qty)
We have update anomalies in the Stock entity,
caused by partial dependencies. We need to put
the data into 2NF to help remove some them. This
means we must form a new entity type from the
attributes involved in the partial dependency.
17Second Normal Form
All attributes types, which are not part of the
identifier, must be fully functionally dependent
on the whole of the identifier.
After applying our 2NF rule we have
WAREHOUSE (W, Loc, R, RName). STOCK (W,P,
Qty). PRODUCT (P, PDesc, Price, PCode,CDesc).
18Second Normal Form
- The data looks like the following
19Second Normal Form
- If the initial set of attributes does not have a
compound key or identifier (i.e. more than one
attribute in the key), then it is already in 2NF. - After removal it must be possible to reconstitute
the original entity type. One or more attribute
types must be common to the entity types after
the split. - There must be no partial dependencies.
20Third Normal Form (3NF)
We have update anomalies in the Warehouse and
Product entities, caused by transitive
dependencies. We need to put the data into 3NF
to help remove some them. This means we must
form new entity types from the attributes
involved in the transitive dependencies.
WAREHOUSE(W, Loc, R,Rname) STOCK(W,P,
Qty) PRODUCT(P, PDesc, Price, PCode,CDesc).
21Third Normal Form
All attributes which are not part of the key are
functionally dependent on the key, the whole key
and nothing but the key
After applying our 3NF rule we have
WAREHOUSE (W, Loc, R). STOCK (W,P,
Qty). PRODUCT (P, PDesc, Price,
PCode). REGION(R, RName). PRODUCT_CODE(P,
CDesc).
22(No Transcript)
23Third Normal Form
- After removal it must be possible to reconstitute
the original entity types. One or more attribute
types must be common to the entity types after
the split. - There must be no transitive dependencies.
24Summary
- You should now be able to
- Explain normalisation.
- Explain the problems that exist when data is not
normalised. - List the fundamental concepts of normalisation.
- Describe the (three) steps of the normalisation
process. - Perform the normalisation process on a set of
data.