Design guidelines and Normalization - PowerPoint PPT Presentation

1 / 11
About This Presentation
Title:

Design guidelines and Normalization

Description:

Design Guidelines for Relation Schema. Discussed four informal measures of ... a property of the relation schema (intension) R, not of a particular legal ... – PowerPoint PPT presentation

Number of Views:180
Avg rating:3.0/5.0
Slides: 12
Provided by: reu1
Category:

less

Transcript and Presenter's Notes

Title: Design guidelines and Normalization


1
Design guidelines and Normalization
  • Informal design guidelines for Relation Schema
  • Functional dependencies
  • Normal forms

Dr. Reuven Bakalash
2
Design Guidelines for Relation Schema
  • Discussed four informal measures of quality of
    relation schema design
  • Semantics of the attributes
  • Reducing the redundant values in tuples
  • Reducing the null values in tuples
  • Disallowing the possibility of generating
    spurious tuples

Dr. Reuven Bakalash
3
Semantics of the Relation Attribute
  • Semantics, specifies how to interpret the
    attribute values. They should have a clear
    meaning for themselves and how they relate to one
    another (see the COMPANY schema)
  • Guideline 1
  • Design a relation schema such that it is easy to
    explain its meaning. If a relation schema
    corresponds to one entity type or one
    relationship type, the meaning tends to be clear.
  • So, do not combine attributes from entity type
    and relationship type to a single relation.

Dr. Reuven Bakalash
4
Semantics of the Relation Attribute
EMP_DEPT
(a)
EMP_PROJ
(b)
  • (a) and (b) have clear semantics
  • Both violate guideline 1. They mix attributes.
  • (a) represents a single employee but includes
    also information on department
  • (b) relates an employee to a project but also
    include employee and project information
  • They can be used as views, but they cause
    problems when used as base relations.

Dr. Reuven Bakalash
5
Information in tuples
  • Minimize the storage space that the base
    relations occupy

Compare the space used by two base relations
EMPLOYEE and DEPARTMENT in Fig. 14.2 with the
space of EMP_DEPT in Fig. 14.4 In 14.4 all
department information repeats for each employee.
Dr. Reuven Bakalash
6
Information in tuples
  • Update anomalies (insertion, deletion and
    modification anomalies)
  • Insertion anomalies (examples based on EMP_DEPT,
    Fig. 14.4).

To insert a new employee into EMP_DEPT we must to
enter the attributes of department correctly. In
14.2 we enter only dept. number in the new
tuple. It is difficult to enter a new department
that has no employees yet, since SSN is a primary
key.
  • Deletion anomalies

If we delete the tuple representing the last
employee in a department, we lost the information
concerning that department from database.
Dr. Reuven Bakalash
7
Information in tuples
  • Modification anomalies

In EMP_DEPT if we change, say, the manager of
dept 5, we must update the tuples of all
employees that work for this department.
  • Guideline 2
  • Design the base relations schemas so that no
    insertion, deletion, or modification anomalies
    are present in the relations.
  • If any anomalies are present, note them clearly
    and make sure that the programs that update the
    database will operate correctly.

Dr. Reuven Bakalash
8
Information in tuples
  • Null values in tuples
  • Nulls waste space and lead to problems of
    understanding the meaning. Nulls can have
    multiple interpretations, such as
  • The attribute does not apply to this tuple
  • The attribute value is unknown
  • The value is known but absent (not recorded yet)

Nulls also cause problems with COUNT and
SUM. Grouping many attributes in a fat relation
causes many nulls.
  • Guideline 3
  • As far as possible, avoid placing attributes in a
    base relation whose values may frequently be
    null.
  • If nulls are unavoidable, make sure that they
    apply in exceptional cases only and do not apply
    to a majority of tuples in a relation.

Dr. Reuven Bakalash
9
Information in tuples
  • Generation of spurious (false) tuples

Consider two tables EMP_LOCS and EMP_PROJ1 (Fig.
14.5) which will replace EMP_PROJ (Fig.
14.4b). This is a bad schema design because we
cant recover the original information from
EMP_PROJ. The result of NATURAL JOIN on EMP_LOCS
and EMP_PROJ1 is shown in Fig. 14.6. Spurious
tuples are generated! PLOCATION is neither a
primary key nor a foreign key.
  • Guideline 4
  • Design relation schemas so that they can be
    JOINed with equality conditions on attributes
    that are either primary keys or foreign keys in a
    way that guarantees that no spurious tuples are
    generated. Do not have relations that contain
    matching attributes other than foreign_key-primary
    _key combinations.
  • If such relations are unavoidable, do not join
    them on such attributes, because the join may
    produce spurious tuples.

Dr. Reuven Bakalash
10
Functional Dependencies
  • The following are functional dependencies
  • State, Driver_license_number ? SSN
  • SSN ? ENAME
  • PNUMBER ? PNAME, PLOCATION
  • SSN, PNUMBER ? HOURS
  • A FD, denoted X ? Y, between two sets of
    attributes X and Y that are subset of R specifies
    a constraint on the possible tuples that can form
    a relation state of R.
  • X functionally determines Y in R if and only if,
    whenever two tuples of r(R) agree on their
    X-value, they must necessarily agree on their
    Y-value
  • If there cant be more than one tuple with a
    given X-value in any relation instance r(R) -
    that is, X is a candidate key of R this implies
    that X ? Y for any subset of attributes Y of R
    (because the key constraint doesnt allow that
    two tuples will have the same value of X).
  • If X ? Y in R, this does not say whether or not Y
    ? X in R.

11
Functional dependencies
  • FD is a property of the relation schema
    (intension) R, not of a particular legal relation
    state (extension) r of R. Hence, an FD cannot be
    inferred automatically from a given relation
    extension r but must be defined by someone who
    knows the semantic of the attributes of R.

TEXT ? COURSE ? Cannot confirm unless we know
that it is true for all possible states of
TEACH Single counterexample disproves a
functional dependancy. COURSE ? TEXT is ruled out
See Figure 14.3 for diagrammatic notation
Write a Comment
User Comments (0)
About PowerShow.com