Title: Normalisation
1Normalisation
- Objectives
- Understand functional dependencies and keys
revisited - Describe deletion, addition update anomalies
associated with poor database design. - Define 1st, 2nd, 3rd normal forms.
- Recognise the problems associated with tables
which are not in each of the above normal forms.
2Functional Dependencies and Keys
- Functional dependencies and keys are fundamental
to understanding the process of normalisation. - Key words to look out for are functionally
determines and functionally depends on. - Functionally Determines .. By knowing the value
of one particular attribute A, then you can
determine a unique value of attribute B - In the above situation, attribute A
functionally determines attribute B. - An example Your student ID functionally
determines your student record
3More on functional dependencies ..
- 2. Functionally Depends On .. A unique value of
the attribute B can only be found by knowing the
value of Attribute A. - In the above situation, the value of attribute B
functionally depends on the value of attribute A - An example student name functionally depends on
the the value of student ID
4A good check
- When stating either functionally determines
or functionally dependent on - . remember that only ONE value can determine
another value and vice versa that is only one
unique value can exist as a result of a key.
5Keys ..
- A primary key exists in a table if
- All the non-key attributes in the table are
functionally dependent on the key - No subcollection of the columns in the key also
has the above property.
6Consider the following tables
7Answer the following questions ref p. 105
- Is Order date functionally dependent on Order
Number? - Is Customer Number functionally dependent on
Order Number? - Does Order date functional determine the Order
number? - For the Part table, is On-Hand functionally
dependent on Part Description? - In the Order Line table, is the Number Ordered
dependent on Order Number? - In the Order Line table, is the Number Ordered
functionally dependent on the Part Number?
8Design issues for a database
- Associate an entity with one piece of data only
- Define as character fields unless items are used
in calculations - Carefully consider the size of character fields
not too large not too small - Do not store derived data on a record
- Use dates rather than numbers to represent
duration. - Aim to design your database correctly in the
first instance.
9Normalization
- TABLES are the basis structure of the relational
database and they need to be properly designed. - the process to achieve this is referred to as
normalisation - Normalisation attempts to efficiently assign
attributes to tables. - Normalisation will
- reduces data redundancies (does not eliminate it)
- help eliminate data anomalies.
10Data anomalies
- Modification anomaly
- If data is stored in more than one place in the
database and a modification is made in one table
but not another one then a modification anomaly
exists - Insertion anomaly
- When a new tuple is created an insertion anomaly
can occur if - The value of one attribute is not known
- An error occurs with the data entry
11Data Anomalies
- Deletion Anomaly
- Is where an inconsistency arises in the database
where data in a particular a record is deleted
resulting in information being no longer
available. - The DBMS will ensure that NO data anomalies will
occur. - SO WHAT IS THE ACTUAL PROCESS OF NORMALISATION?
12Normalisation
- Summary
- 1NF all attributes atomic - no repeating fields.
- 2NF all non-key attributes fully dependent
on whole primary key. - 3NF non transitive dependencies.
- BCNF every determinant is a candidate key.
- 4NF no multi-valued dependencies.
- DK/NF all constraints on relations are logical
consequences of domains and key.
ITC114 ITC423
13Normalisation
- Consider the following example
- Project Project Employee Employee Job
Chg/ Hours Total - Num Name No. Name Class
Hour Billed Charge - Golf 125 John Dig 12 10 120
102 Mary Cut 15 20 300
109 Peter Mow 5 15 75 - Tennis 102 Mary Cut 15 10 150
109 Peter Mow 5 20
100 - 3 Footy 101 Pam Chop 20 5
100 102 Mary Cut 15 30 45 - Data redundancies creates the following
anomalies - update anomalies
- addition (or insertion) anomalies
- deletion anomalies.
ITC114
14Normalization
- Data redundancy a potential problem with all
databases - it can also cause - wasted data and wasted disk space.
- potential integrity problems.
- impact on performance.
- additional management processing requirements
for data.
ITC114
15Normalization
- 1. First Normal form (1NF)
- A table is in 1NF if
- there are no repeating groups (multiple entries
for a single record) - each attribute is defined and is atomic.
- all key attributes are clearly defined.
ITC114
16- The NORMALIZATION Process
- 1. First Normal Form
- Project Project Employee Employee Job
Chg/ Hours TotalNum Name No.
Name Class Hour Billed
Charge1 Golf 123 John Dig
12 10 120 102 Mary
Cut 15 20 300
109 Peter
Mow 5 15 75
ITC114
17Normalization
1st Normal Form (1NF) example....
- 1.1 Remove Repeating Groups how do you do this?
- Project Project Employee Employee Job
Chg/ Hours Num Name No.
Name Class Hour Billed 1
Golf 123 John Dig 12 10
1 Golf 102 Mary Cut
15 20 1 Golf
109 Peter Mow 5 15
- 1.2 Create a primary key for this table which
key will uniquely identify all other non-key
attributes - Project Num Employee No.
- 1.3 Finally check each attribute is atomic.
- This table is now in 1NF but there are still
problems .......
ITC114
18Normalization
- 2. Second Normal form (2NF)
- A table is in 2NF if
- it is in 1NF.
- there are no partial dependencies.
- Partial Dependency
- A partial dependency occurs if any non key
attribute/s is/are only dependent on part of the
primary key.
ITC114
19Second Normal Form (2NF) example..
Normalization
- Project Project Employee Employee Job
Chg/ Hours Num Name No.
Name Class Hour Billed1 Golf 123
John Dig 12 10 1
Golf 102 Mary Cut 15
20 1 Golf 109
Peter Mow 5 15 - In the above
- Employee Name dependent on Employee No NOT
Project Num - Project Name dependent on Project Num NOT
Employee No.To be in 2NF we must remove some of
the attributes from the above table to produce X
new tables e.g.
ITC114
20Second Normal Form (2NF) example..
Normalization
- To determine the tables required ask the
question1. What primary key attributes are each
of the non-key attributes functionally dependent
on? - 2. For each new primary key that can be
identified - create a new table with the new key identified
as the primary keys for that table (but still
leave the key in the 1NF table. - remove the non-key attributes out of the 1NF
table that are functionally dependent on this new
key and put them into the new table created
above. - OUTCOME These steps will help you identify the
tables you need to produce - for all tables to
be in 2NF.
ITC114
21Second Normal Form (2NF) example..
Normalization
- Table 1 - Employee Table 2 ProjectEmployee
Employee Job Chg/ Project Project
No. Name Class Hour Num
Name 123 John Dig 12 1
Golf102 Mary Cut 15 2
Tennis109 Peter Mow
5 3 Footy - Table 3 - Assign table
- Project Employee HoursNum No.
Billed1 123 10 1 102
20 1 109 15
ITC114
22Dependency Diagram 2nd Normal Form
Normalization
Table name PROJECT
P_NAME
P_NO
Table name EMPLOYEE
E_NO
E_NAME
J_CLASS
CHG_HR
Table name ASSIGN
E_NO
P_NO
HOURS
ITC114
23Third Normal Form (3NF)
Normalization
- A Table is in 3NF
- if it is in 2NF and
- if it contains no transitive dependencies.
- Transitive dependency
- no non-key attribute can be functionally
dependent on another non-key attribute.
ITC114
243NF
- To remove a transitive dependency
- Identify all the non-key attributes causing a
transitive dependency from the table not in 3NF. - Identify the KEY in the transitive dependency.
- Leave a copy of this KEY in the original table
but put also place it into a new table - Finally remove all the non-key attributes
associated with the transitive dependency and
also place in the new table
25Third Normal Form (3NF) example...
Normalization
- Original Employee Table 1Employee Employee
Job Chg/ No. Name
Class Hour 123 John Dig
12 102 Mary Cut 15
109 Peter Mow 5 - Table 2 New Employee Table 4 - Charge
Employee Employee Job Job Chg/
No. Name Class Class Hour
123 John Dig Dig
12 102 Mary Cut Cut
15 109 Peter Mow Mow
5
ITC114
26Dependency Diagram 3rd Normal Form
Table name PROJECT
P_NAME
P_NO
Table name EMPLOYEE
E_NO
E_NAME
J_CODE
Table name JOB
J_CLASS
J_CODE
CHG_HR
Table name ASSIGN
E_NO
P_NO
HOURS
ITC114
27Denormalization
- Normalization is not the end of the story
....... - Other factors need to be considered
- physical processing requirements.
- practicalities e.g. is it worth having a lot of
tables. - Denormalization is common sense - we will see
this in the design section.
ITC114
28Consider the text example ..
291NF
302NF
- To be in 2NF each of the non-key attributes
MUST be functionally dependent on the PRIMARY KEY
not just part of it. - To do this we can identify all the functional
dependencies for the table in 1 NF - Order No. ? Order Date
- Part No. ? Part Description
- Order No., Part No. ? Number Ordered, Quoted
Price
312NF
32This is what we now end up with .
333NF
34Dependency diagram
353NF