Relational Data Model

About This Presentation

Title:

Relational Data Model

Description:

Star(name, street, city) 9. Set attribute ... interface Cartoon: Movie { relationship Set Star voices; }; interface MurderMystery: Movie ... – PowerPoint PPT presentation

Number of Views:135

Avg rating:3.0/5.0

Slides: 76

Provided by: dblab1

Category:

more less

Transcript and Presenter's Notes

Title: Relational Data Model

1
Relational Data Model

Overview
introduced by E. F. Codd (1970)
strong theoretical foundations
simplicity
numerous commercial systems
has a single data-modeling concept relation
a table of values (informally )
Each column in the table has a column header
called an attribute.
Each row is called a tuple.
loosely speaking, represents databases as a
collection of relations and constraints

2
Formal Relational Concepts (1)

Domain
A set of atomic (indivisible) values.
Attribute
A name to suggest the meaning that a domain plays
in a particular relation.
Each attribute Ai has a domain dom(Ai)
Relation schema
A relation name R and a set of attributes Ai that
define the relation
Denoted by R(A1, A2, ... , An)
(ex) Movie(title, year, length, filmType)
Degree of a relation
the number of attributes

3
Formal Relational Concepts (2)

Tuple t (of R(A1, A2, ,An) )
A (ordered) set of values t ltv1, v2, ... , vngt
where each value vi is an element of dom(Ai).
Also called an n-tuple.
Relation instances, r(R)
A set of tuples
r(R) t1, t2, ... , tm, or alternatively
r(R) ? dom(A1) ? dom(A2) ? ... ? dom(An)
Relational database schema
A set S of relation schemas that belong to the
same database.
S R1, R2, ... , Rn

4
Suppliers relation S
CITY
STATUS
NAME
S
Domain
Primary key
S SNAME STATUS CITY
S1 Smith 20
London S2 Jones 10
Paris S3 Blake 30
Paris S4 Clark
20 London S5 Adams
30 Athens
S
Relation
Cardinality
Tuples
Attributes
Degree
5
Characteristics of Relations

Ordering of tuples in a relation r(R)
The tuples are not considered to be ordered, even
though they appear to be in the tabular form
Ordering of attributes in a relation schema R
(and of values within each tuple) is immaterial
All values are considered atomic (indivisible).
Relational databases do not allow repeating
groups
repeating group a column or combination of
columns that contains several data values in row
A special null value is used to represent missing
value

6
Relational terminology
7
From ODL to relational schema

Basically one relation for each class, and one
attribute for each property
interface Movie attribute string
title attribute integer year attribute
integer length attribute enum Film color,
blackAndWhite filmtype
Movie(title, year, length, filmType)

8
Non-atomic attribute in class

Remember, relational model allows atomic values
to appear in relation
As for record structures, expand it making one
attribute of relation for each field of the
structure
interface Star attribute string
name attribute Struct Addr string street,
string city address
Star(name, street, city)

9
Set attribute

One approach is to make one tuple for each value
of a set-valued attribute
interface Star attribute string
name attribute SetltStruct Addr string street,
string citygt address
Star(name, street, city)

10
Other type constructor (1)

For a bag (multiset)
add to relation schema another attribute count
representing the number of times that each
element is a member of the bag
interface Star attribute string
name attribute BagltStruct Addr string street,
string citygt address

11
Other type constructor (2)

For a list, add a new attribute position,
indicating the position in the list
A fixed-length array can be represented by
attributes for each position in the array

12
Single-valued relationship

Relational model does not support the notion of a
pointer, so simulate the effect of pointers by
values that represent the related objects
Include key attribute of referenced class
interface Movie ... attribute enum Film
color, blackAndWhite filmtype relationship
Studio ownedBy inverse Studioowns //
assume studioName is key of the Studio class

13
Multivalued relationship

May represent a set of related objects by
creating one tuple for each value
Note there is redundancy

14
What if there is no key

ODL permits two objects in a class to have
exactly the same values for all properties
Invent a new attribute that can be used as key

15
Relationship and its inverse

In the ODL model, the relationship and its
inverse are both needed !
We cannot follow a pointer backwards
However, representing both relationship and its
inverse in relational schema is redundant !
WHY ???

16
Steps in Logical Database Design

using ER Model Approach
Major Steps
identify entity types
identify relationships between entity types
determine appropriate attributes for entity and
relationship types
convert ER diagram into the system dependent
model
Hierarchical model
Network model
Relational model
Object-oriented model
Normalization

17
Regular entity type

create a relation.
for composite attribute, may include only the
simple component.
choose a primary key.
Example
Movies(title, year, length, filmType)
Stars(name, address) or Stars(name, street, city)

18
Binary MN relationship type R

create a new relation S.
include all attributes of R in S.
include primary keys of the participating
relations as foreign key in S
Example
key of the Stars entity type starName
key of the Movies entity type title year
key of the Studios entity type StudioName
Owns(title, year, StudioName)
Stars-In(title, year, starName)

19
Regular binary 1N or 11 relationship R

For 11 or 1N relationship types, we may not
create a new relation optionally
Let S, T participating entity types (S N-side)
include the primary key of T as foreign key in S.
include all attributes of R in S.
Example
Movies(title, year, length, filmType, StudioName)
StudioName is a foreign key
Studios(name, address)
no schema for the Owns relationship

20
For multivalued attribute A

create a new relation R that includes the
multivalued attribute
include the primary key attribute K of the
relation that has A as an attribute
primary key combination of A and K
Department(number, name)
DeptLocation(number, location)

21
N-ary relationship type (1)

create a new relation S.
include primary keys of participating entity
types as foreign keys in S.
include all attributes of n-ary relationship.
primary key of S usually a combination of all
foreign keys
Contracts(starName, title, year, studioOfStar,
producingStudio)

22
N-ary relationship type (2)
Quantity
SName
ProName
SUPPLY
SUPPLIER
PROJECT
PART
PartNo
PROJECT
SUPPLIER
SName
ProName

PART
SUPPLY
PartNo
SName ProName PartNO
Quantity
FK
FK
FK
23
Weak entity type

create a relation R
include the primary key of the owner as foreign
key in R
primary key in R
primary key in the owner partial key of the
weak entity
Studios(name, addr)
Crews(number, StudioName)
Note, there is no separate relation for
identifying relationship type

24
Converting subclasses to relations

Recall the distinction
In ODL, an object belongs to exactly one class.
An object inherits properties from all its
superclasses but technically is not a member of
the superclasses
In ER model, an entity belongs to several entity
sets that are related by isa relationships.
Thus, the linked entities together represent the
object and give that object its properties -
attributes and relationships

25
Relational representation of ODL subclasses

Every subclass has its own relation that
represents all the properties of that subclass
including all its inherited properties
interface Cartoon Movie relationship
SetltStargt voices interface MurderMystery
Movie attribute string weapon interface
Cartoon-MurderMystery Cartoon, MurderMystery
Movie(title, , starName)Cartoon(title, ,
starName, voices)MurderMystery(title, ,
starName, weapon)Cartoon-MurderMystery(title, ,
starName, voices, weapon)

26
Representing ISA in the relational model

For each entity set, create a relation that has
attributes of that entity set alone as well as
key attributes of related entity sets
There is no relation created for an isa
relationship
Comparison
ODL translation keeps all properties of an object
together in one relation
ER translation repeats the key for an object once
for each of the entity sets or relationships to
which that entity belongs

Movies(title, year, length, filmType)
MurderMysteries(title, year, weapon)
Cartoons(title, year)
Voices(title, year, starName)
Why we need the Cartoon relation whose attributes
are a subset of the Movies relation
How to represent a silent cartoon movie without
Cartoon relation
What about Cartoon-MurderMystery ?

28
Using null values

interface Cartoon Movie relationship
SetltStargt voices interface MurderMystery
Movie attribute string weapon interface
Cartoon-MurderMystery Cartoon, MurderMystery
Movie(title, year, length, filmType, studioName,
starName, voice, weapon)
Use null value when not applicable !

29
Functional Dependency (FD)

Definition
For any two tuples t1 and t2 in R such that t1X
t2X, we must have t1Y t2Y where X and
Y are sets of attributes in relation R
The values of the X component of a tuple uniquely
(functionally) determine the value of the Y
component.
Notation X ? Y
FD is a property of the relation schema
(intension) of R that should hold all relation
instances (extensions) all the times
Example
every key K always functionally determines any
subset of attributes Y of R i.e. K ? Y

30
Example on FD

title year ? length
title year ? filmType
title year ? studioName
title year ? length filmType studioName
title year ? starName (false)

31
Key, superkey, primary key, candidate key

Superkey of a relation schema RA1, A2, ... An
is a set of attributes S with the property that
no two tuples t1 and t2 in any relation instance
of R will have t1S t2S
Candidate key a minimal superkey
Candidate key, Primary key
If a relation schema has more than one minimal
super key, each is called a candidate key
One of candidate keys are designated to be a
primary key
Example
(title, year, starName) key
(title, year, starName, length) superkey

32
Discovering keys for relations

If a relation comes from an entity set then the
key for the relation is the key attributes of
this entity set or class
If a relation R comes from a relationship
many-many keys of both connected entity sets are
the key attributes for R
many-one from entity set E1 to entity set E2 key
attributes of E1 are key attributes of R, but
those of E2 are not
one-one key attributes for either of the
connected entity sets are key attributes of R.
If a multiway relationship R has an arrow to
entity set E, then there is at least one key for
the corresponding relation that excludes the key
of E
Movies(title, year, length, filmType) Stars(name,
address)Owns(title, year, studioName) //
many-oneStars-in(title, year, starName) //
many-many

33
Keys for relations derived from ODL

If there is no key at all of an ODL class,
introduce an attribute that is a surrogate for
the object identifier of objects
There are certain cases in which the key
attributes for the class are not a key for the
relation
WHY ???
In general, if the relation for C represents
several multivalued relationships from C, then
the keys for all the classes that these
relationships connected to C must be added to the
key for C
The result is the key for C relation

34
Rules for FDs

splitting rule
We can replace a FD A1A2An ? B1B2Bm by a set of
FDs A1A2An ? Bi for i 1, m
combining rule
We can replace a set of FDs A1A2An ? Bi for i
1, m by a single FD A1A2An ? B1B2Bm
A1A2An ? B1B2Bm is
trivial if the Bs are a subset of the As
nontrivial if at least one the Bs is not among
the As
completely nontrivial if none of the Bs is also
one of the As
We can always remove from the right side of a DF
those attributes that appear on the left
title year ? year length // the second year
may be dropped

35
Armstrong's axioms (inference rules)

To infer new dependencies from a (given) set F of
dependencies i.e. F X ? Y
Let X,Y,Z be a set of attributes
Reflexive rule if Y ? X then X ? Y
Augmentation rule if X ? Y then XZ ? YZ
Transitive rule if X ? Y and Y ? Z then X ? Z
Def F (closure of F) is the set of all
dependencies which are logically implied by F
A set of inference rules is complete if given the
set F the rule allows us to determine all
dependencies in F
A set of rule is sound if using them we cannot
deduce any dependency not in F
Armstrong's axioms are sound and complete

36
Additional inference rules

Decomposition rule if X ? YZ then X ?
Y (proof) 1. X ? YZ (given) 2. YZ ? Y
(reflexive) 3. X ? Y (transitive)
Union rule if X ? Y and X ? Z then X ?
YZ (proof) 1. X ? Y (given) 2. X ? Z
(given) 3. X ? XY (augmentation on 1 with
X) 4. XY ? YZ (augmentation on 2 with Y) 5.
X ? YZ (transitive with 3 and 4)
Pseudotransitive rule if X ? Y and WY ? Z then
WX ? Z (proof) 1. X ? Y (given) 2. WY ? Z
(given) 3. WX ? WY (augmentation on 1 with
W) 4. WX ? Z (transitive with 2 and 3)

37
Closure of attributes

X (closure of X with respect to F) is the set of
attributes that are functionally determined by X
Algorithm (compute X) X Xrepeat oldX
X for each functional dependency Y ? Z
in F do if Y ? X then X X ?
Zuntil (oldX X )

38
Example of attribute closure (1)

R(A,B,C,D,E,G)
FDs 1. AB ? C 2. D ? EG 3. C ? A 4. BE ?
C 5. BC ? D 6. CG ? BD 7. ACD ? B 8. CE ?
AG
Let X BD, then X0 BD X1 BDEG (by 2) X2
BCDEG (by 4) X3 ABCDEG (by 3,5,6,8) X4
X3 Hence, X X4 ABCDEGX is a key

39
Example of attribute closure (2)

R(A,B,C,D,E,F)
FDs 1. AB ? C 2. BC ? AD 3. D ? E 4. CF ?
B
Let X AB, then X0 AB X1 ABC (by 1) X2
ABCD (by 2) X3 ABCDE (by 3) X4 X3 Hence,
X X4 ABCDEX is not a key

40
Minimal cover

Two sets of functional dependencies E and F are
equivalent if E F
A functional dependency fd in F is redundant if
(F - fd) F
F' is a nonredundant cover (or minimal cover,
minimal base) of F if F' F and F' contains
no redundant functional dependencies
Usually the closure of F (i.e. F) is too big to
handle, so use the previous algorithm to detect
redundant functional dependency
X ? Y follows from set of dependencies S if and
only if Y are in X
There exist several minimal covers for a set of
FDs !

41
Minimal cover example (1)

R(A,B,C,D,E,F)
S 1. AB ? C 2. BC ? AD 3. D ? E 4. CF ?
B
Does AB ? D follow from S ?
A,B ABCDE and D ? A,B
AB ? D follows from S
AB ? D is redundant
Does D ? A follow from S ?
D DE and A ? D
D ? A does not follow from S
D ? A is not redundant

42
Minimal cover example (2)

R(A,B,C)
S 1. A ? B 2. A ? C 3. B ? A 4. B ? C 5. C
? A 6. C ? B 7. AB ? C 8. BC ? A 9. AC ?
B
Minimal cover1 A ? B, B ? A, B ? C, C ? B
Minimal cover2 A ? B, B ? C, C ? A
Other minimal cover, too.
Issue How to get minimal cover ?

43
Normalization

one phase in database design
first proposed by E.F. Codd (1972)
a process during which unsatisfactory relation
schemas are decomposed by breaking into smaller
relation schemas that possess desirable
properties
utilizes functional dependency, multivalued
dependency and join dependency

44
Bad relational schema

Anomalies
Insertion anomalies
Cannot record filmType without starName
Deletion anomalies
If we delete the last star, we also lose the
movie info.
Modification (update) anomalies

45
Decomposing relations

Given a relation R(A1, A2, , An), we may
decompose R into two relations S(B1, B2, , Bm)
and T(C1, C2, , Ck)
A1, A2, , An B1, B2, , Bm ? C1, C2,
, Ck
Tuples of relation S is projections onto B1, B2,
, Bm of relation R
Similarly for relation T

46
Decomposition example
47
Normal forms

First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce/Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)

48
First normal form (1NF)

The only attribute values permitted are single
atomic (indivisible)
Not allow a set, list, bag, etc.
considered to be part of the formal definition of
a relations
nested relation concept (?)

49
Various functional dependencies

Prime and nonprime attribute
an attribute of a relation schema R is called a
prime attribute of R if it is a member of any
candidate key of R
an attribute is called nonprime if it is not a
prime attribute, i.e. not a member of any
candidate key of R
Full functional dependency
Y is said to be fully dependent on X if X ? Y and
Z \ ? Y for any X ? Z
Y is fully dependent on X if and only if
Y is functionally dependent on X, and -
not functionally dependent on any proper subset
of X
Partial functional dependency
Y is said to be partially dependent on X if some
attribute can be removed from X and the
dependency still holds
Transitive functional dependency

50
Second Normal Form (2NF)

A relation schema R is in 2NF if it is in 1NF and
every nonprime attribute A in R is fully
functionally dependent on every key of R
R(SSN, pNumber, hours, eName, pName,
pLocation)SSN pNumber ? hoursSSN ?
eNamepNumber ? pName pLocationdecomposed
intoR1(SSN, pNumber, hours)R2(SSN,
eName)R3(pNumber, pName, pLocation)

51
Example for 1NF and 2NF

FIRST(S, P, Status, City, Qty) S P ? Status
City Qty S ? City StatusCity ? Status
Anomalies
insertion cannot record City for a supplier
until he supplies something
modification City for a supplier appears many
times
deletion deletion of last tuple for S lose its
City
FIRST is not in 2NF, so decompose into
SECOND(S, Status, City) SP(S, P, Qty) S
? City StatusCity ? StatusS P ? Qty

52
Third Normal Form (3NF) (1)

A relation schema is in 3NF if it is in 2NF and
no nonprime attribute of R is transitively
dependent on any key
EMP_DEPT(SSN, Ename, Bdate, Addr, D, Dname,
Dmgrssn) FDs SSN ? Ename Bdate Addr D D ?
Dname Dmgrssn Since Dname and Dmgrssn are
transitively dependent on D (not in
3NF)ED1(SSN, Ename, Bdate, Addr, D)ED2(D,
Dname, Dmgrssn)

53
Third Normal Form (2)

Alternatively
A relation schema R is in 3NF if whenever a
nontrivial functional dependency X --gtA holds in
R, then either (a) X is a super key of R or (b) A
is a prime attribute of R.
Violation of (a) implies X is not a superset of
any key i.e.
X could be nonprime, or
in result, typical transitive dependency
X could be a proper subset of a key
in result, have a partial dependency, that is not
in 2NF !
Violation of (b) implies A is a nonprime
attribute

54
Example for 2NF and 3NF

SECOND(S, Status, City) SP(S, P, Qty) S ?
City StatusCity ? Status
Anomalies
insertion cannot record new Status for a city
without S
modification Status for a City appears in many
tuples
deletion delete only the second tuple for a
particular City
SECOND is not in 3NF, so decompose into
SC(S,City) CS(City, Status)

55
Boyce/Codd Normal Form (BCNF)

A relation schema R is in BCNF if whenever a
nontrivial functional dependency X ? A holds in
R, then X is a super key of R
Difference with 3NF
Drop the second condition in 3NF that allows A to
be prime if X is not a superkey

56
3NF and BCNF example (1)

MovieStudio(title, year, length, filmType,
studioName, studioAddr)title year ? length
filmType studioNamestudioName ? studioAddr
Key title, yearHence, MovieStudio is not 3NF
Decompose intoMovieStudio1(title, year, length,
filmType, studioName)MovieStudio2(studioName,
studioAddr)Then we get a schema in BCNF.

57
BCNF example (2)
58
All binary relations are in BCNF

Let A and B are all attributes
Consider all possible cases, here there are
totally 4 cases
no nontrivial FD at all
A, B is a key, so in BCNF
A ? B holds, B ? A does not hold
A is a key, so in BCNF
B ? A holds, A ? B does not hold similarly
A ? B holds, B ? A holds
A and B are keys, so in BCNF
Note that such dependencies are plausible

59
Why 3NF, not BCNF ???

Bookings(title, theater, city) theater ?
city title city ? theater
Two candidate keys title city, title theater
Bookings is not in BCNF, so decompose
intotheater, city and theater, title
Consider following two instances
When two relations are joined, title city ?
theater does not hold !

60
Recovering info. from decomposition

We need to make sure that projections of the
original tuples can be joined again to produce
all and only the original tuples
Now recover the original relation with join
operation
There are two spurious tuplesAttribute B is not
a key in either relation

61
Lossless (non-additive) join

To prevent spurious tuples from being generated
when a natural join is applied
Decomposition D R1, , Rm of R has the
lossless join property wrt a set of FDs if for
every legal state r of R ?ltR1gt(r) ? ? ?ltRmgt(r)
r

62
Property of lossless join

D R1, R2 of R has the lossless join property
wrt a set of FDs if (R1 ? R2) ? (R1 R2)
or (R1 ? R2) ? (R2 R1)That is, (R1 ? R2) is
a key in R1 or R2
ExamplePCZ(phone, company, zip) phone ?
company zip ? company Decomposed
into PC(phone, company) ZC(zip, company)Since
company ( PC ? ZC) is not a key, lossy join

63
Projecting FDs

How we can find (new) FDs relevant to decomposed
relation schema ?
Suppose relation R is decomposed into relation S
and other relation, and F is a set of FDs known
to hold for R
Let X be a set of attributes that is contained in
the set of attributes of S
Compute X
For each attribute B such that
B is an attribute of S
B is in X
B is not in X
Then, FD X ? B holds in S

64
Example 3.39

R(A, B, C, D), A ? B, B ? C
Let S(A,C) be a decomposed relation of R
Need to compute the closure of each subset of
A,C
Compute A
A ABC
C is in S
so A ? C holds for S
Compute C
C C, no new FD
Compute AC
AC ABC, no new FD
Hence, A ? C is the only non-trivial FD for S

65
Example 3.40

R(A, B, C, D, E), A ? D, B ? E, DE ? C
Let S(A, B, C) be a decomposed relation of R
Need to compute the closure of each subset of
A, B, C
Compute A AD, no new FD
Compute B BE, no new FD
Compute C C, no new FD
Compute AB ABCDE, so AB ? C holds for S
Compute BC BCE, no new FD
Compute AC ACD, no new FD
Compute ABC ABCDE, no new FD
Hence, AB ? C is the only nontrivial FD for S

66
Dependency preservation

A decomposition D R1, , Rm of R is
dependency-preserving wrt a set F of FDs if
(?F(R1) ? ? ?F(Rm)) Fwhere ?F(Ri)
denotes a set of FDs X ? Y in F such that all
attributes in X ? Y are contained in Ri
We do not want FDs to be lost in the
decomposition
Always possible to have a dependency-preserving
decomposition D such that each Ri in D is in 3NF
Not always possible to find a decomposition that
preserves dependencies into BCNF

67
Multivalued dependency example (1)

EMP(eName, pName, depName) Smith X,Y John,
Anna
Must repeat every combination due to 1NF
Two independent one-many relationships are mixed
in the same relation
eName --gtgt pNameeName --gtgt depName

68
Multivalued dependency example (2)

interface Star attribute string
name attribute SetltStruct Addr string street,
string citygt address relationship SetltMoviegt
starredIn inverse Moviestars
Note that there are no nontrivial FDs, hence it
is in BCNF

69
Multivalued dependency example (3)

A1A2 An --gtgt B1B2 Bm holds ifFor each pair
of tuples t and u of relation R that agree on all
the As, we can find in R some tuple v that
agrees
With both t and u on the As
With t on the Bs and
With u on all attributes of R that are not among
the As or Bs
In previous example,
first tuple t
second tuple u
then, the third tuple of the previous instance
becomes v

70
Multivalued dependency definition

Let X, Y be sets of attributes in RLet Z be
compliment of X ? Y
Multivalued dependency (MVD) X --gtgt Y X
multidetermines Y holds in R if and only if each
X-value in R is associated with a set of Y-values
in a way that does not depend on Z-values
Formally, a MVD from X to Y, X --gtgt Y exists in R
iff Yxz Yxz for each X, Z, Z such that Yxz
and Yxz are nonemptywhere Yxz y? ltx,y,zgt ?
R

71
MVD and FD-MVD rules

Complementation
If X --gtgt Y, then X --gtgt T X Y
Augmentation
If X --gtgt Y and V ? W, XW --gtgt YV
Transitivity
If X --gtgt Y and Y --gtgtZ, then X --gtgt Z Y
If X ? Y, then X --gtgt Y (i.e. an FD is a special
case of an MVD)
Coalescence rule
If X--gt Y, Z ? Y and for some W disjoint from Y
we have W ? Z, then X ? Z

72
Fourth normal form (4NF)

An MVD A1A2 An --gtgt B1B2 Bm for a relation R
is trivial if
As ? Bs or
As ? Bs are all attributes of R
A relation R is in 4NF if whenever A1A2 An
--gtgt B1B2 Bm is a nontrivial MVD, A1, A2, ,
An is a superkey

73
4NF example

Star(name, street, city, title, year)name --gtgt
street city
Star relation is not in 4NF,hence decompose
into Star1(name, street, city) Star2(name,
title, year)
Note that both new relations are in 4NF

74
Relationship among normal forms (1)