Normalisation - PowerPoint PPT Presentation

1 / 42
About This Presentation
Title:

Normalisation

Description:

No two wards can have the same number or name ... Patient No. and Prescription date ? ... Patient No., Prescription date and Drug code together would provide what is ... – PowerPoint PPT presentation

Number of Views:40
Avg rating:3.0/5.0
Slides: 43
Provided by: your176
Category:

less

Transcript and Presenter's Notes

Title: Normalisation


1
Normalisation
  • delivered by John Stockwell

2
Objectives
  • At the end of this lecture you will be able to
  • describe the use of normalisation
  • describe how normalisation and E-R modelling may
    be used together
  • use the principles of normalisation to produce
    third normal form

3
Introduction
  • It is very important that we do not keep a data
    item in more than one place
  • The only exception to this rule is when we are
    using primary or foreign keys

4
Introduction
  • The reasons for this are
  • Data inconsistency may occur where we change the
    data in some places but not in others
  • It is simpler to change any data item if it only
    occurs in one place
  • It simplifies the use of queries to select data
    from the database

5
Normalisation
  • A series of steps that enable us to take a
    collection of data items and distribute them into
    related tables without duplicating values
  • The tables will contain data items relating to a
    particular concept within the problem domain

6
E-R diagrams and normalisation
  • Havent we just been doing this?
  • Yes
  • E-R diagrams use a top-down approach
  • Normalisation works from the bottom-up
  • Data items (known as attributes when drawing E-R
    diagrams) are taken and grouped together into
    tables

7
E-R diagrams and normalisation
  • How do these two techniques fit together?
  • Both of them may be used at different stages in
    the development of an information system
  • They are used to complement one another

8
E-R diagrams and normalisation
  • In a large project normalisation can be used on
    all the data items gathered together from
    existing documentation and forms
  • The resulting grouping may then be compared with
    the results of the E-R diagram developed from an
    analysis of the case study description
  • Any anomalies are then investigated and resolved

9
E-R diagrams and normalisation
  • Another way to use normalisation is to check that
    the tables we have derived are in the correct
    form before we move towards implementing our
    design
  • This is the way we will use normalisation in this
    module
  • For relatively simple data bases we may find we
    have built an E-R diagram that already
    corresponds to third normal form !

10
Producing third normal form
  • Three processes are to be carried out on the
    collection of data items
  • They produce first, second and then third normal
    form
  • This is usually sufficient for most purposes
  • Before normalising can take place it is essential
    that the designer completely understands the
    meaning of the data and any restrictions or
    business rules imposed upon it

11
Producing third normal form
  • Let us assume we have only one table
  • The example is chosen to demonstrate clearly each
    of the subsequent three processes
  • In reality we may need only some of the
    procedures or even none if the table we are
    testing is already in third normal form

12
Figure 1 Un-normalised table
13
Figure 1
  • Shows information relating to a patient in a
    hospital
  • The patient may be administered drugs with a
    certain dosage as part of a prescription
  • The prescription also includes the time for which
    the dosage should be given
  • The patient is placed in a ward in the hospital

14
Figure 1
  • It is important to understand what this data
    means
  • There may be more than one patient in any ward
  • Two patients may have the same surname and
    fore-name
  • No two wards can have the same number or name
  • More than one patient may receive a prescription
    on any one day
  • This prescription could be identical to another
    patients prescription

15
Figure 1
  • Any one patient may receive more than one
    prescription on any one day
  • A drug code is for one named drug only
  • If a patient is prescribed a drug on a particular
    day they cannot receive another prescription for
    the same drug on that day
  • Where there are spaces shown they indicate the
    same values as in the cells vertically above them

16
Pick a primary key
  • The table seems to be showing data about a
    patient, where they are and what is prescribed to
    them
  • It makes sense to pick a non-textual primary key
  • Use the smallest number of attributes possible
  • In this case we choose Patient No.

17
First Normal Form
  • Remove any repeating groups of attributes to
    separate relations
  • Pick keys for any relations identified
  • A repeating group is defined as any attribute or
    attributes that may occur with multiple values
    for a single value of the primary key attribute.
  • In this case we are looking for attributes or
    data items that have more than one possible value
    for a single value of Patient No.
  • Which attributes here may take on more than one
    value for a particular patient?

18
First Normal Form
  • Each patient may receive a number of
    prescriptions but may not be placed in more than
    one ward
  • Therefore the repeating attributes are
    Prescription date, Drug code, Drug name, Dosage
    and Length of treatment
  • These repeating attributes are placed in a group
    by themselves and have to have their own primary
    key

19
First Normal Form
  • This primary key will be a combination of the
    primary key of the un-normalised group and one or
    more other attributes
  • We need to choose the primary key of the
    repeating group so that a particular row may be
    uniquely identified
  • Which of the other attributes when combined with
    Patient No. would uniquely identify a row in the
    table ?

20
First Normal Form
  • Patient No. and Prescription date ?
  • (patient could have more than one prescription on
    a given date)
  • Patient No. and Drug code will also not quite do
    as a patient could be prescribed the same drug on
    different days
  • Patient No., Prescription date and Drug code
    together would provide what is called a compound
    primary key
  • (assuming that a patient cannot be prescribed the
    same drug more than once on a given day)

21
First Normal Form
  • The repeating attributes are separated from the
    other group and added to Patient No. to form a
    new group
  • So if we are given the Patient No., Prescription
    date, and Drug code the values of the other
    attributes Drug name, Dosage and Length of
    treatment will be automatically determined
  • The attributes left behind in the original table
    produce a row which is uniquely identifiable by a
    value of the primary key

22
Figure 2 Two tables in first Normal Form
23
Figure 2
  • If we know the Patient No. is 681 we can be sure
    of the surname, Forename, Ward No. and Ward name
  • Notice that we have the start of a relational
    database as the two tables are related through
    the common field or attribute Patient No.

24
Figure 4.2
  • It is a common mistake amongst beginners to
    forget to include the primary key of the
    non-repeating group in the repeating group
  • It is possible to have more than one repeating
    group and also a repeating group within a
    repeating group but examples of these types are
    not shown here

25
Second normal form
  • Represent the data in Second Normal Form by
    removing any attributes that only depend on part
    of the key
  • This only applies to groups having a compound key
  • Any group with a simple primary key (one
    attribute in the primary key) may be left as it is

26
Second normal form
  • We look at the groups with a compound key and
    decide if any of the attributes in the group
    depend on only part of the compound key. This is
    sometimes known as functional dependency or part
    key dependency
  • Where an attribute depends on only part of the
    key we remove it to another group which has as
    its primary key that part of the original
    compound key on which it was dependent

27
Second normal form
  • In our example we only consider the data items of
    the second table We consider all the data items
    including those that make up the compound key
  • In this case we consider each attribute in turn
    and find that Patient No., Prescription date and
    Drug code all depend on the whole of the compound
    key

28
Second normal form
  • However we see that Drug name does not depend on
    Patient No. or on Prescription date
  • The Drug name is independent of the patient to
    whom it is prescribed or the date on which it is
    prescribed

29
Second normal form
  • It is dependent on the Drug code only
  • The Dosage and Length of Treatment are not more
    dependent on part of the primary key compared
    with the whole key and so are left in their
    original group

30
Figure 3 Tables in Second Normal Form
31
Third Normal Form
  • Remove any attributes that are not directly
    dependent on any of the key attributes but are
    more dependent on a non-key attribute in that
    group
  • We look at each attribute in a group and ask Is
    it directly dependent on the key?

32
Third Normal Form
  • If it is not we look at the other non-key
    attributes in the group to find what it is
    dependent upon
  • When we have found such a relation we remove it
    to a different group with a suitable key
  • We also need to leave behind the attributes
    forming the key for the new group so the tables
    may be related to one another. This is in fact
    the foreign key that we have already met. Many
    beginners forget to do this !

33
Third Normal Form
  • Look at every non-key attribute in turn and ask
    if it really does depend on the key
  • The groups at the Second Normal form stage have
    an identity indicated by their primary key
  • The three tables shown in Figure 4.3 could be
    said to be representing
  • Patient
  • Prescription
  • Drug

34
Third Normal Form
  • Taking the attributes of the Patient table first
    we ask
  • Does Surname depend upon the patient ? Yes, it
    does.
  • Does Fore-name depend upon the patient ? Yes,
    it does.
  • Does Ward no. depend upon the patient ? No, it
    doesnt.

35
Third Normal Form
  • Does Ward no. depend upon any other attributes
    in this group? Yes, it depends upon Ward nameOf
    course ward name depends upon Ward No.
  • These two attributes are then removed to form a
    new group. The primary key chosen for this is
    Ward no. We must leave a copy of that key behind
    in the Patient table to form a foreign key.

36
Figure 4 Tables in third normal form
37
Tables in third normal form
  • The tables we now have could be called Patient,
    Ward, Prescription and Drug
  • The presentation here shows the normalisation
    procedure and at each step also shows a little
    sample data
  • In practice when performing normalisation we
    would only deal with the data item names
  • The SSADM way of doing this is given in Figure 5

38
Figure 5
  • UNF 1NF 2NF 3NF
  • Patient No. Patient No. Patient No. Patient
    No.
  • Surname Surname Surname Surname
  • Forename Forename Forename Forename
  • Ward No. Ward No. Ward No. Ward No.
  • Ward name Ward name Ward name
  • Prescription date Ward No.
  • Drug code Patient No. Patient No. Ward name
  • Drug name Prescription date Prescription date
  • Dosage Drug code Drug code Patient No.
  • Length of Drug name Dosage Prescription date
  • treatment Dosage Length of treatment Length
    of treatment
  • Length of treatment
  • Drug code Drug code Drug name Drug
    name

39
Figure 6
40
Use of Normalisation
  • Normalisation may be used to check that tables
    derived from our E-R diagram are in third normal
    form.

41
Summary
  • This section has dealt with
  • a description of the use of normalisation
  • how normalisation and E-R modelling may be used
    together
  • the principles of normalisation to produce Third
    Normal Form

42
References
  • Goodland M., Slater C., SSADM A Practical
    Approach, McGraw Hill, 1995
  • Britton Doake, Software System Development a
    gentle introduction,3e, McGraw Hill,2003, ISBN
    0077099745
  • Kendall K., Kendall J., Systems Analysis and
    Design, Prentice Hall, 2002 (fifth edition)
Write a Comment
User Comments (0)
About PowerShow.com