Title: CPSC 310 Database Systems
1CPSC 310 Database Systems
- Lecturer Anxiao (Andrew) Jiang
- Lecture Two Relational Data Model
2The Relational Data Model
- Tables
- Schemas
- Conversion from E/R to Relations
Slides by Jeffrey Ullman
3A Relation is a Table
Attributes (column headers)
- name manf
- Snickers MM/Mars
- Twizzlers Hershey
- Candies
Tuples (rows)
4Schemas
- Relation schema relation name and attribute
list. - Optionally types of attributes.
- Example Candies(name, manf) or Candies(name
string, manf string) - Database collection of relations.
- Database schema set of all relation schemas in
the database.
5Why Relations?
- Very simple model.
- Often matches how we think about data.
- Abstract model that underlies SQL, the most
important database language today.
6From E/R Diagrams to Relations
- Entity set -gt relation.
- Attributes -gt attributes.
- Relationships -gt relations whose attributes are
only - The keys of the connected entity sets.
- Attributes of the relationship itself.
7Entity Set -gt Relation
- Relation Candies(name, manf)
name
manf
Candies
8Relationship -gt Relation
name
name
addr
manf
Con- sumers
Candies
Likes
2
1
Favorite
Buddies
Likes(consumer, candy)
Favorite(consumer, candy)
Buddies(name1, name2)
9Combining Relations
- OK to combine into one relation
- The relation for an entity-set E
- The relations for many-one relationships of which
E is the many. - Example Consumers(name, addr) and
Favorite(consumer, candy) combine to make
Consumer1(name, addr, favCandy).
10Risk with Many-Many Relationships
- Combining Consumers with Likes would be a
mistake. It leads to redundancy, as
name addr candy Sally 123 Maple
Twizzler Sally 123 Maple Kitkat
11Handling Weak Entity Sets
- Relation for a weak entity set must include
attributes for its complete key (including those
belonging to other entity sets), as well as its
own, nonkey attributes. - A supporting relationship is redundant and yields
no relation (unless it has attributes).
12Example
name
name
Logins
Hosts
At
location
billTo
Hosts(hostName, location) Logins(loginName,
hostName, billTo) At(loginName, hostName,
hostName2)
13Subclasses Three Approaches
- Object-oriented One relation per subset of
subclasses, with all relevant attributes. - Use nulls One relation entities have NULL in
attributes that dont belong to them. - E/R style One relation for each subclass
- Key attribute(s).
- Attributes of that subclass.
14Example
Candies
name
manf
isa
Choc- olates
color
15Object-Oriented
name manf Twizzler Hershey Candies name
manf color Snickers MM/Mars light Chocolates
Good for queries like find the color of
chocolate candies made by MM/Mars.
16E/R Style
name manf Twizzler Hershey Snickers
MM/Mars Candies name color Snickers
light Chocolates
Good for queries like find all candies
(including chocolates) made by MM/Mars.
17Using Nulls
name manf color Twizzler Hershey
NULL Snickers MM/Mars
dark Candies
Saves space unless there are lots of attributes
that are usually NULL.
18CPSC 310 Database Systems
- Lecturer Anxiao (Andrew) Jiang
- Lecture Three
- Functional Dependencies
19Functional Dependencies
- Meaning of FDs
- Keys and Superkeys
- Inferring FDs
Slides by Jeffrey Ullman
20Functional Dependencies
- X -gt A is an assertion about a relation R that
whenever two tuples of R agree on all the
attributes of X, then they must also agree on the
attribute A. - Say X -gt A holds in R.
- Convention , X, Y, Z represent sets of
attributes A, B, C, represent single
attributes. - Convention no set formers in sets of attributes,
just ABC, rather than A,B,C .
21Example
- Consumers(name, addr, candiesLiked, manf,
favCandy) - Reasonable FDs to assert
- name -gt addr
- name -gt favCandy
- candiesLiked -gt manf
22Example Data
name addr candiesLiked manf favCandy Janeway
Voyager Twizzlers Hershey Smarties Janeway Voyag
er Smarties Nestle Smarties Spock
Enterprise Twizzlers Hershey Twizzlers
23FDs With Multiple Attributes
- No need for FDs with gt 1 attribute on right.
- But sometimes convenient to combine FDs as a
shorthand. - Example name -gt addr and name -gt favCandy become
name -gt addr favCandy - gt 1 attribute on left may be essential.
- Example store candy -gt price
24Keys of Relations
- K is a superkey for relation R if K
functionally determines all of R. - K is a key for R if K is a superkey, but no
proper subset of K is a superkey.
25Example
- Consumers(name, addr, candiesLiked,
manf, favCandy) - name, candiesLiked is a superkey because
together these attributes determine all the other
attributes. - name -gt addr favCandy
- candiesLiked -gt manf
26Example, Cont.
- name, candiesLiked is a key because neither
name nor candiesLiked is a superkey. - name doesnt -gt manf candiesLiked doesnt -gt
addr. - There are no other keys, but lots of superkeys.
- Any superset of name, candiesLiked.
27E/R and Relational Keys
- Keys in E/R concern entities.
- Keys in relations concern tuples.
- Usually, one tuple corresponds to one entity, so
the ideas are the same. - But --- in poor relational designs, one entity
can become several tuples, so E/R keys and
Relational keys are different.
28Example Data
name addr candiesLiked manf favCandy Janeway
Voyager Twizzlers Hershey Smarties Janeway Voyag
er Smarties Nestle Smarties Spock Enterprise
Twizzlers Hershey Twizzlers
Relational key name candiesLiked But in E/R,
name is a key for Consumers, and candiesLiked is
a key for Candies. Note 2 tuples for Janeway
entity and 2 tuples for Twizzlers entity.
29Where Do Keys Come From?
- Just assert a key K.
- The only FDs are K -gt A for all attributes A.
- Assert FDs and deduce the keys by systematic
exploration. - E/R model gives us FDs from entity-set keys and
from many-one relationships.
30More FDs From Physics
- Example no two courses can meet in the same
room at the same time tells us - hour room -gt course.
31Inferring FDs
- We are given FDs X1 -gt A1, X2 -gt A2,, Xn -gt An
, and we want to know whether an FD Y -gt B must
hold in any relation that satisfies the given
FDs. - Example If A -gt B and B -gt C hold, surely A -gt
C holds, even if we dont say so. - Important for design of good relation schemas.
32Inference Test
- To test if Y -gt B, start by assuming two tuples
agree in all attributes of Y. - Y
- 0000000. . . 0
- 00000?? . . . ?
33Inference Test (2)
- Use the given FDs to infer that these tuples
must also agree in certain other attributes. - If B is one of these attributes, then Y -gt B is
true. - Otherwise, the two tuples, with any forced
equalities, form a two-tuple relation that proves
Y -gt B does not follow from the given FDs.
34Closure Test
- An easier way to test is to compute the closure
of Y, denoted Y . - Basis Y Y.
- Induction Look for an FD whose left side X is a
subset of the current Y . If the FD is X -gt A,
add A to Y . - Continue until Y cannot be changed.
35Y
36Finding All Implied FDs
- Motivation normalization, the process where we
break a relation schema into two or more schemas. - Example ABCD with FDs AB -gtC, C -gtD, and D
-gtA. - Decompose into ABC, AD. What FDs hold in ABC ?
- Not only AB -gtC, but also C -gtA !
37Why?
ABCD
ABC
a1b1c
a2b2c
Thus, tuples in the projection with equal Cs
have equal As C -gt A.
38Basic Idea
- Start with given FDs and find all nontrivial
FDs that follow from the given FDs. - Nontrivial left and right sides disjoint.
- Restrict to those FDs that involve only
attributes of the projected schema.
39Simple, Exponential Algorithm
- For each set of attributes X, compute X .
- Add X -gtA for all A in X - X.
- However, drop XY -gtA whenever we discover X -gtA.
- Because XY -gtA follows from X -gtA in any
projection. - Finally, use only FDs involving projected
attributes.
40A Few Tricks
- No need to compute the closure of the empty set
or of the set of all attributes. - If we find X all attributes, so is the
closure of any superset of X.
41Example
- ABC with FDs A -gtB and B -gtC. Project onto
AC. - A ABC yields A -gtB, A -gtC.
- We do not need to compute AB or AC .
- B BC yields B -gtC.
- C C yields nothing.
- BC BC yields nothing.
42Example --- Continued
- Resulting FDs A -gtB, A -gtC, and B -gtC.
- Projection onto AC A -gtC.
- Only FD that involves a subset of A,C .
43A Geometric View of FDs
- Imagine the set of all instances of a particular
relation. - That is, all finite sets of tuples that have the
proper number of components. - Each instance is a point in this space.
44Example R(A,B)
(1,2), (3,4)
(5,1)
(1,2), (3,4), (1,3)
45An FD is a Subset of Instances
- For each FD X -gt A there is a subset of all
instances that satisfy the FD. - We can represent an FD by a region in the space.
- Trivial FD an FD that is represented by the
entire space. - Example A -gt A.
46Example A -gt B for R(A,B)
A -gt B
(1,2), (3,4)
(5,1)
(1,2), (3,4), (1,3)
47Representing Sets of FDs
- If each FD is a set of relation instances, then a
collection of FDs corresponds to the
intersection of those sets. - Intersection all instances that satisfy all of
the FDs.
48Example
A-gtB
B-gtC
CD-gtA
49Implication of FDs
- If an FD Y -gt B follows from FDs X1 -gt
A1,,Xn -gt An , then the region in the space of
instances for Y -gt B must include the
intersection of the regions for the FDs Xi -gt Ai
. - That is, every instance satisfying all the FDs
Xi -gt Ai surely satisfies Y -gt B. - But an instance could satisfy Y -gt B, yet not be
in this intersection.
50Example
B-gtC
A-gtC
A-gtB