Title: CS 453: Database Systems
1CS 453 Database Systems
Chapter 5 Relational Algebra
2Chapter outline
- Introduction
- Six basic operators
- Union ?
- Difference
- Selection s
- Projection P
- Cartesian Product ?
- Renaming r
- Derived or auxiliary or additional operators
- Intersection ?, complement
- Joins (natural, theta join, equi-join,
semi-join) - Outer Join (Left, Right and Full)
- Division
3Relational Algebra
- It is a procedural language, I.e., sequence of
operations is explicitly specified in the
language - Formalism for creating new relations from
existing ones - Six basic operations select, project, union, set
difference, Cartesian product, renameE.g., w r
? s, w r s, w r ? s v where r, s, v are
relations - Each operation takes one or more relations as
input and give a relation as output (a property
known as closure allowing us to nest relational
algebraic expressions) - Contrast relational algebra expressions with
arithmetic expressions (, -, x, /) the former
operates on relations and the latter on numbers
4Operators Relational Algebra
- Six basic operators
- Selection s Selects a subset of rows from
relation - Projection P Deletes unwanted columns from
relation - Cartesian Product ? Allows us to combine two
relations - Renaming r Renaming of relations and their
attributes - Union ? Tuples in relation 1 and in relation 2
- Difference Tuples in relation 1, but not in
relation 2 - Derived or auxiliary or additional operators
- Intersection ?, complement
- Joins (natural, theta join, equi-join,
semi-join) - Outer Join (Left, Right and Full)
- Division
5Core Relational Algebra
- Limited expressive power (subset of possible
queries) - Good optimizer possible
- Rich enough language to express enough useful
things - A small set of operators that allow us to
manipulate relations in limited but useful ways.
The operators are - 1. Union, intersection, and difference the usual
set operators - But the relation schemas must be the same
- 2. Selection Picking certain rows from a
relation - 3. Projection Picking certain columns
- 4. Products and joins Composing relations in
useful ways - 5. Renaming of relations and their attributes
- Additional operations Not essential, but (very!)
useful - Since each operation returns a relation,
operations can be composed! (Algebra is closed)
6Finiteness Relational Algebra
- ? SELECT
- p PROJECT
- X CARTESIAN PRODUCT
- ? UNION
- SET-DIFFERENCE
- ? SET-INTERSECTION
- ? THETA-JOIN
- NATURAL JOIN
- DIVISION or QUOTIENT
7Select Operation
- Notation ?P (r)
- Defined as ?P (r) t t ? r ? P(t) and
the relation schema of r is unchanged - Read as t is a tuple of r and t satisfies the
predicate P - P is a formula of the formattribute name op
attribute name (e.g. salary gt expense) - or attribute name op constant (e.g. salary gt
10000) - where op is one of ? ? ? ? ? ?
- Predicates can be connected by ? (and) ? (or) ?
(not) - Example P (salary gt 10000) ? (age lt 25)
8Select Operation Example
9Project Operation
- Notation ?X (r)where X A1, A2, ..., Ak is
any subset of the schema of r - Defined as ?X (r) t X t ? r
- After the projection, duplicates are assumed to
be eliminated automatically since a relation is a
set!Not exactly implemented in real DMBSs
duplicates are not eliminated until explicitly
told to do so - Order of attributes in the projection list can be
arbitrary. (One use of projection is to rearrange
attributes) - Note in general the relation schema changes
10Project Operation Example
11Union Operation
- Notation r ? s
- Defined as r ? s t (t ? r) ? (t ? s)
- For r ? s to be valid (r and s must be
compatible), I.e., - r, s must have the same number of attributes
- domains must be compatible pair-wise between r
and s - R(EmpName, Age) S(StudName, Age)T(ProdID,
Name) U(StudName, Age, CGA) - Compatible relations R and S, R and ? StudName,
Age U, etc - Incompatible relations R and T, R and U, etc
12Union Operation Example
13Set Difference Operation
- Notation r ? s
- Defined as r ? s t (t ? r) ? (t ? s)
- For r ? s to be valid
- r, s must have the same number of attributes
- domains must be compatible pair-wise between r
and s
14Set Difference Operation Example
15Set Intersection Operation
What about Intersection ?
- It is a derived operator
- r ? s r (r s)
- Also expressed as a join (will see later)
- Example
- UnionizedEmployees ? RetiredEmployees
16Set Intersection Operation Example
17Cartesian-Product Operation
- Notation r ? s
- Defined as r ? s t q (t ? r) ? (q ? s)
- Assume that attributes of r and s are disjoint
(I.e., r and s dont have attributes with the
same name) - If they are not disjoint, then make them disjoint
by renaming the attributes concerned
18Cartesian-Product Operation Example
19Rename Operation
- Notation ?x?y (r)
- X and Y are list of same number of attributes
- Allows us to rename attributes in a relation
schema - Example
- Given R A, B then ?A?C (r) returns r with
R C, B - This only affects the schema R but not the
relation r
20Join Operation
- Cartesian product is rarely used by its own it
is the basis of the join operation, which is more
popular - Let R (A, B, C, D ) and
- S ( E, B, D )
- r joins s has the result schema T(A, B, C, D, E)
and is defined byr s ?r.A,r.B,r.C,r.D,s.E
(?r.Bs.B ? r.Ds.D (r ? s)) - Where
- The selection sC checks equality of all common
attributes - The projection eliminates the duplicate common
attributes - This is also called natural join
- Join is not primitive
Q What is the join result if (a) R S and (b) R
and S do not overlap?
21Natural-Join Operation Example
r
s
22Natural-Join Operation
- Given the schemas R(A, B, C, D), S(A, C, E), what
is the schema of R ? S ? - Given R(A, B, C), S(D, E), what is R ? S ?
- Given R(A, B), S(A, B), what is R ? S ?
23Theta-Join Operation
- Natural Join
- Notation R1 R2
- Meaning is equivalent to R1 R2 PA (?C (R1
? R2)) - Theta-Join
- Notation R R1 C R2
- Meaning is equivalent to R ?C (R1 ? R2)
- Here q can be any condition
- ? can be lt gt ? ???
- If equal (), then it is an?EQUI-JOIN
- A join that involves a predicate
24Theta-Join Operation Example
r
s
25Another Theta-Join Operation Example
- Sells Bars
- BarInfo Sells Sells.BarBars.Name Bars
26Equi-join Operation
- A theta join where q is an equality
- Equi-Join
- Notation R R1 AB R2
- Meaning is equivalent to R ?AB (R1 ? R2)
- Most useful join in practice
- Example
- Employee SSN SSN Dependents
27Semi-join Operation
- Semi-Join
- Notation R ? S P R.A1,,R.An (R ? S)
- Meaning is equivalent to
- R ? S P R.A1,,R,An (PA (?C (R1 ? R2)))
- Where A1, , An are the attributes in R
- Example
- Employee ? Dependents
28Semi-Join Operation Example
29Outer Join Operation
- An extension of the join operation that avoids
loss of information - Computes the join and then adds tuples from one
relation that do not match tuples in the other
relation to the result of the join - Uses null values in left- or right- outer join
- null signifies that the value is unknown or does
not exist - All comparisons involving null are false by
definition - May give rise to some peculiarity. Eg (A 1) ?
(A ? 1) does not return all tuples in a relation
with the attribute A! - We need the special truth value unknown and
three-valued logic (true, false, unknown) to
handle nulls (not in our scope)
30Outer Join Operation Example
31Outer Join Operation Example
- Join returns only the matching (or good) tuples
- The fact that loan L-260 has no borrower is not
explicit in the result - Hayes has borrowed an non-existent loan L-155 is
also undetected
32Left Outer Join Operation Example
- Keep the entire left relation (Loan) and fill in
informationfrom the right relation, use null if
information is missing
33Right and Full Outer Join Operation Example
34Division Operation
- Notation r ? s
- Suited to queries that include the phrase for
all. - Let r and s be relations over schemas R and S
respectively, where R (A1,, Am, B1,,
Bn) S (B1,, Bn)The result of r ? s is a
relation over the schema (R S) (A1,, Am) - r ? s t (t ? ?R-S (r)) ? (?u ? s, tu
? r)
35Further Discussion Alternative Definition of
Division
- Not supported as a primitive operator, but useful
for expressing queries like - Find sailors who have reserved all boats
- Let A have 2 fields x and y B have only field y
- A/B
- i.e., A/B contains all x tuples (sailors) such
that for every y tuple (boat) in B, there is an
xy tuple in A - Or If the set of y values (boats) associated
with an x value (sailor) in A contains all y
values in B, then x value is in A/B
36Division Operation Example
r ? s t (t ? ? R-S(r)) ? (?u ? s, tu ? r)
r
s
The result consists of attribute A only but not
all of the 5 values. How to find out?u 1, 2
Check if ?u ? s ( tu ? r )
t ? ? R-S( r )
r ? s
check ? and d
37Another Division Operation Example
A/B
B1
B2
B3
A/B1
A/B2
A/B3
A
38Another Division Operation Example
Relations r, s
r ? s
A
B
C
a
?
A
?
?
A
39Properties of Division Operation
- Let q r ? sThen q is the largest relation
satisfying q ? s ? r
Relation r, s
r
s
q
40Assignment Operation
- The assignment operation(?) provides a convenient
way to express complex queries by storing
intermediate results into temporary relations - Assignment must always be made to a temporary
relation variable. - Example Write r ? s as temp1 ? ? R-S (r )
temp2 ? ? R-S((temp1 ? s) - ? R-S,S(
r)) result temp1 - temp2 - The result to the right of the ? is assigned to
the relation variable on the left of the ? - May use variable in subsequent expressions
41Expressing A/B Using Basic Operators
- Division is not essential op just a useful
shorthand - (Also true of joins, but joins are so common that
systems implement joins specially) - Idea For A/B, compute all x values that are not
disqualified by some y value in B - x value is disqualified if by attaching y value
from B, we obtain an xy tuple that is not in A -
- 1 r ? s ? R-S (r ) ? ? R-S ((? R-S (r ) ? s) ?
? R-S,S (r )) - 2
-
- 3 temp1 ? ? R-S (r ) temp2 ? ? R-S((temp1 ?
s) - ? R-S,S( r)) result temp1 - temp2
Disqualified x values
A/B
42Finally RA has Limitations !
- Cannot compute transitive closure
- Find all direct and indirect relatives of Fred
- Cannot express in RA !!! Need to write C program