Normalization for better DB deign - PowerPoint PPT Presentation

1 / 48
About This Presentation
Title:

Normalization for better DB deign

Description:

MI375 Sean Eom Normalization Normalization is a process for creating several tables from an un-normalized relation (Table). The purpose of normalization is to reduce ... – PowerPoint PPT presentation

Number of Views:153
Avg rating:3.0/5.0
Slides: 49
Provided by: Sean65
Category:

less

Transcript and Presenter's Notes

Title: Normalization for better DB deign


1
Normalization for better DB deign
  • MI375
  • Sean Eom

2
Normalization
  • Normalization is a process for creating several
    tables from an un-normalized relation (Table).
    The purpose of normalization is to reduce data
    redundancies and eliminate the data anomalies.
  • The techniques of normalization now provide a
    foundation for logical data base design.

3
Data anomalies (An undesirable consequence of a
data modification) occurs with poorly designed
Tables.
  • Insertion anomaly facts (data) about two or more
    different entities must be added to a single row
    of a relation.
  • Deletion anomaly facts (data) about two or more
    different entities are lost when a single row of
    a relation is deleted.
  • Update anomaly

4
Insertion anomaly The table below contains info
about customers and their agents. To add new
customers to this table, we must enter the
corresponding agent data. By doing so, the
potential for creating data inconsistency is
great!
5
Modification anomaly Using The same table below,
if agent Leah Hahn married to John Doe
yesterday, we need to change her name to Leah
Doe. In this case, only three changes must be
made. In a large table, such change might occur
in thousands of records. By doing so, the
potential for creating data inconsistency is
great!
6
Deletion anomaly Using The same table below, if
agent Leah Hahn decides to quit the job because
John is a millionaire, we must delete all
fields in 3 records. In this case, only three
changes must be made. In a large table, such
change might occur in thousands of records. By
doing so, the potential for creating data
inconsistency is great!
7
Normalization You have seen various types of
data anomalies with poorly designed tables, lets
design a better table using normalization. .
  • Normalization is done through the analysis of
    functional dependencies between attributes (or
    data items or fields).
  • The attribute B is functionally dependent on the
    attribute A if each value in column A determines
    one and only one value in column B.

8
Steps in Normalization
  • User views are identified.
  • Each user view is converted to the form of an
    unnormalized relation. Any repeating groups are
    then removed from the unnormalized relation the
    result is a relation in first normal form (1NF).
  • Any partial dependencies are removed from these
    relations the result is a set of relations in
    second normal form (2NF).
  • Any transitive dependencies are removed, creating
    a set of relations in third normal form (3NF).

9
Example 1
  • Student_Activity REPORT

10
User View
  • The user view the data as if they were laid out
    in tabular form, which is an unnormalized
    relation (a relation that contains one or more
    repeating groups).
  • Student_Activity REPORT is an example of
    user-view (next slide).

11
User-view of student activity SID is a unique
student ID, Club is the name of a club, Cost is
the cost of joining a club, and AmtPaid is the
amount the student has paid to that club.
12
An un-normalized relation user view SID is a
unique student ID, Club is the name of a club,
Cost is the cost of joining a club, and AmtPaid
is the amount the student has paid to that club.

13
Remove repeating groups
Remove partial dependencies
Remove transitive dependencies
14
Unnormalized Relations
  • Student Activity REPORT is an example of an
    unnormalized relation.
  • An unnormalized relation is a relation that
    contains one or more repeating groups.
  • Repeating groups a group of several data
    entries associated with one attribute.
  • Derives its name from the fact that a group of
    multiple (related) entries can exist for any
    single key attribute occurrence

15
Conversion to First Normal Form
  • A relational table must not contain repeating
    groups. The user-view of student activity report
    contains a repeating group, the 3rd row from the
    bottom.
  • Repeating groups can be eliminated by adding the
    appropriate entry in at least the primary key
    column(s).

16
Normalized Relations (1NF)
17
  • 1NF Definition
  • The term first normal form (1NF) describes the
    tabular format in which
  • There are no repeating groups in the table.
  • (All attributes are dependent on the primary key)
  • (All the key attributes are defined.)
  • All relational tables satisfy the 1NF
    requirements.

18
From 1NF to 2NF (As we remove partial dependency
from 1NF table, the table becomes 2NF).
  • To do so, we need to draw a functional dependency
    diagram.
  • A ----? B (This notation is interpreted as
  • The attribute B is functionally dependent on the
    attribute A if each value in column A determines
    one and only one value in column B.
  • The attribute A determines only one value in
    Column B.

19
Functional Dependency analysis
20
Functional Dependency Diagram
21
Normalized Tables (2NF and 3NF)Each table is 3NF
table, which means it has no partial dependency
and transitive dependency.
22
Example 2
  • Example company that manages building projects
  • Charges its clients by billing hours spent on
    each contract
  • Hourly billing rate is dependent on employees
    position
  • Periodically, a report is generated that contains
    information displayed in Table 5.1

23
A Sample Report Layout
24
A Table in the Report Format
25
The Need for Normalization (continued)
  • Structure of data set in Figure 5.1 does not
    handle data very well
  • The table structure appears to work report is
    generated with ease
  • Unfortunately, the report may yield different
    results, depending on what data anomaly has
    occurred

26
Conversion to First Normal Form
  • Repeating group
  • Derives its name from the fact that a group of
    multiple (related) entries can exist for any
    single key attribute occurrence
  • Relational table must not contain repeating
    groups
  • Normalizing the table structure will reduce these
    data redundancies
  • Normalization is three-step procedure

27
Step 1 Eliminate the Repeating Groups
  • Present data in a tabular format, where each cell
    has a single value and there are no repeating
    groups
  • Eliminate repeating groups by eliminating nulls,
    making sure that each repeating group attribute
    contains an appropriate data value

28
Data Organization First Normal Form
29
Step 2 Identify the Primary Key
  • Primary key must uniquely identify attribute
    value
  • New key must be composed

30
Step 3 Identify all Dependencies
  • Dependencies can be depicted with the help of a
    diagram
  • Dependency diagram
  • Depicts all dependencies found within a given
    table structure
  • Helpful in getting birds-eye view of all
    relationships among a tables attributes
  • Use makes it much less likely that an important
    dependency will be overlooked

31
A Dependency Diagram First Normal Form (1NF)
32
First Normal Form
  • Tabular format in which
  • All key attributes are defined
  • There are no repeating groups in the table
  • All attributes are dependent on primary key
  • All relational tables satisfy 1NF requirements
  • Some tables contain partial dependencies
  • Dependencies based on only part of the primary
    key
  • Sometimes used for performance reasons, but
    should be used with caution
  • Still subject to data redundancies

33
Conversion to Second Normal Form
  • Relational database design can be improved by
    converting the database into second normal form
    (2NF)
  • Two steps

34
Step 1 Identify All Key Components
  • Write each key component on separate line, and
    then write the original (composite) key on the
    last line
  • Each component will become the key in a new table

35
Step 2 Identify the Dependent Attributes
  • Determine which attributes are dependent on which
    other attributes
  • At this point, most anomalies have been
    eliminated

36
Second Normal Form (2NF) Conversion Results
37
Second Normal Form
  • Table is in second normal form (2NF) if
  • It is in 1NF and
  • It includes no partial dependencies
  • No attribute is dependent on only a portion of
    the primary key

38
Conversion to Third Normal Form
  • Data anomalies created are easily eliminated by
    completing three steps

39
Step 1 Identify Each New Determinant
  • For every transitive dependency, write its
    determinant as a PK (Primary Key) for a new table
  • Determinant
  • Any attribute whose value determines other values
    within a row

40
Step 2 Identify the Dependent Attributes
  • Identify the attributes dependent on each
    determinant identified in Step 1 and identify the
    dependency
  • Name the table to reflect its contents and
    function

41
Step 3 Remove the Dependent Attributes from
Transitive Dependencies
  • Eliminate all dependent attributes in transitive
    relationship(s) from each table that has such a
    transitive relationship
  • Draw a new dependency diagram to show all tables
    defined in Steps 13
  • Check new tables and modified tables from Step 3
    to make sure that each has a determinant and does
    not contain inappropriate dependencies

42
Third Normal Form (3NF) Conversion Results
43
Third Normal Form
  • A table is in third normal form (3NF) if
  • It is in 2NF and
  • It contains no transitive dependencies

44
A transitive dependency occurs when one nonkey
attribute (such as Charge_Hour in Fig. 5.4) is
dependent on one or more nonkey attributes (such
as Job_Class).
PRIMARY KEY
A
B
  • In this case, there is a transitive dependency
    between the primary key and attributes A and B.
    Transitive dependencies result in insertion,
    deletion, and update anomalies, similar to those
    for partial dependencies.

45
Improving the Design
  • Table structures are cleaned up to eliminate the
    troublesome initial partial and transitive
    dependencies
  • Normalization cannot, by itself, be relied on to
    make good designs
  • It is valuable because its use helps eliminate
    data redundancies

46
The Completed Database
47
The Completed Database (continued)
48
Summary
  • Normalization is a table design technique aimed
    at minimizing data redundancies
  • First three normal forms (1NF, 2NF, and 3NF) are
    most commonly encountered
  • 2NF is better than 1NF 3NF is better than 2NF
  • For most business database design purposes, 3NF
    is highest we need to go in the normalization
    process
Write a Comment
User Comments (0)
About PowerShow.com