Design guidelines and Normalization - PowerPoint PPT Presentation

1 / 11

About This Presentation

Title:

Design guidelines and Normalization

Description:

Design Guidelines for Relation Schema. Discussed four informal measures of ... a property of the relation schema (intension) R, not of a particular legal ... – PowerPoint PPT presentation

Number of Views:180

Avg rating:3.0/5.0

Slides: 12

Provided by: reu1

Category:

more less

Transcript and Presenter's Notes

Title: Design guidelines and Normalization

1
Design guidelines and Normalization

Informal design guidelines for Relation Schema
Functional dependencies
Normal forms

Dr. Reuven Bakalash
2
Design Guidelines for Relation Schema

Discussed four informal measures of quality of
relation schema design
Semantics of the attributes
Reducing the redundant values in tuples
Reducing the null values in tuples
Disallowing the possibility of generating
spurious tuples

Dr. Reuven Bakalash
3
Semantics of the Relation Attribute

Semantics, specifies how to interpret the
attribute values. They should have a clear
meaning for themselves and how they relate to one
another (see the COMPANY schema)
Guideline 1
Design a relation schema such that it is easy to
explain its meaning. If a relation schema
corresponds to one entity type or one
relationship type, the meaning tends to be clear.
So, do not combine attributes from entity type
and relationship type to a single relation.

Dr. Reuven Bakalash
4
Semantics of the Relation Attribute
EMP_DEPT
(a)
EMP_PROJ
(b)

(a) and (b) have clear semantics
Both violate guideline 1. They mix attributes.
(a) represents a single employee but includes
also information on department
(b) relates an employee to a project but also
include employee and project information
They can be used as views, but they cause
problems when used as base relations.

Dr. Reuven Bakalash
5
Information in tuples

Minimize the storage space that the base
relations occupy

Compare the space used by two base relations
EMPLOYEE and DEPARTMENT in Fig. 14.2 with the
space of EMP_DEPT in Fig. 14.4 In 14.4 all
department information repeats for each employee.
Dr. Reuven Bakalash
6
Information in tuples

Update anomalies (insertion, deletion and
modification anomalies)
Insertion anomalies (examples based on EMP_DEPT,
Fig. 14.4).

To insert a new employee into EMP_DEPT we must to
enter the attributes of department correctly. In
14.2 we enter only dept. number in the new
tuple. It is difficult to enter a new department
that has no employees yet, since SSN is a primary
key.

Deletion anomalies

If we delete the tuple representing the last
employee in a department, we lost the information
concerning that department from database.
Dr. Reuven Bakalash
7
Information in tuples

Modification anomalies

In EMP_DEPT if we change, say, the manager of
dept 5, we must update the tuples of all
employees that work for this department.

Guideline 2
Design the base relations schemas so that no
insertion, deletion, or modification anomalies
are present in the relations.
If any anomalies are present, note them clearly
and make sure that the programs that update the
database will operate correctly.

Dr. Reuven Bakalash
8
Information in tuples

Null values in tuples

Nulls waste space and lead to problems of
understanding the meaning. Nulls can have
multiple interpretations, such as
The attribute does not apply to this tuple
The attribute value is unknown
The value is known but absent (not recorded yet)

Nulls also cause problems with COUNT and
SUM. Grouping many attributes in a fat relation
causes many nulls.

Guideline 3
As far as possible, avoid placing attributes in a
base relation whose values may frequently be
null.
If nulls are unavoidable, make sure that they
apply in exceptional cases only and do not apply
to a majority of tuples in a relation.

Dr. Reuven Bakalash
9
Information in tuples

Generation of spurious (false) tuples

Consider two tables EMP_LOCS and EMP_PROJ1 (Fig.
14.5) which will replace EMP_PROJ (Fig.
14.4b). This is a bad schema design because we
cant recover the original information from
EMP_PROJ. The result of NATURAL JOIN on EMP_LOCS
and EMP_PROJ1 is shown in Fig. 14.6. Spurious
tuples are generated! PLOCATION is neither a
primary key nor a foreign key.

Guideline 4
Design relation schemas so that they can be
JOINed with equality conditions on attributes
that are either primary keys or foreign keys in a
way that guarantees that no spurious tuples are
generated. Do not have relations that contain
matching attributes other than foreign_key-primary
_key combinations.
If such relations are unavoidable, do not join
them on such attributes, because the join may
produce spurious tuples.

Dr. Reuven Bakalash
10
Functional Dependencies

The following are functional dependencies
State, Driver_license_number ? SSN
SSN ? ENAME
PNUMBER ? PNAME, PLOCATION
SSN, PNUMBER ? HOURS
A FD, denoted X ? Y, between two sets of
attributes X and Y that are subset of R specifies
a constraint on the possible tuples that can form
a relation state of R.
X functionally determines Y in R if and only if,
whenever two tuples of r(R) agree on their
X-value, they must necessarily agree on their
Y-value
If there cant be more than one tuple with a
given X-value in any relation instance r(R) -
that is, X is a candidate key of R this implies
that X ? Y for any subset of attributes Y of R
(because the key constraint doesnt allow that
two tuples will have the same value of X).
If X ? Y in R, this does not say whether or not Y
? X in R.

11
Functional dependencies

FD is a property of the relation schema
(intension) R, not of a particular legal relation
state (extension) r of R. Hence, an FD cannot be
inferred automatically from a given relation
extension r but must be defined by someone who
knows the semantic of the attributes of R.

TEXT ? COURSE ? Cannot confirm unless we know
that it is true for all possible states of
TEACH Single counterexample disproves a
functional dependancy. COURSE ? TEXT is ruled out
See Figure 14.3 for diagrammatic notation

Write a Comment

User Comments (0)