Relational Algebra - PowerPoint PPT Presentation

1 / 25
About This Presentation
Title:

Relational Algebra

Description:

CIS 550 Database & Information Systems. September 21, 2004. Some content courtesy of Susan Davidson & Raghu Ramakrishnan ... Very Different, Formalism ... – PowerPoint PPT presentation

Number of Views:60
Avg rating:3.0/5.0
Slides: 26
Provided by: zack4
Category:

less

Transcript and Presenter's Notes

Title: Relational Algebra


1
Relational Algebra Calculus
  • Zachary G. Ives
  • University of Pennsylvania
  • CIS 550 Database Information Systems
  • September 21, 2004

Some slide content courtesy of Susan Davidson
Raghu Ramakrishnan
2
Administrivia
  • Homework 1 due Thursday

3
A Set of Logical Operations The Relational
Algebra
  • Six basic operations
  • Projection ?? (R)
  • Selection ?? (R)
  • Union R1 R2
  • Difference R1 R2
  • Product R1 R2
  • (Rename) ??-gtb (R)
  • And some other useful ones
  • Join R1 ?? R2
  • Semijoin R1 ?? R2
  • Intersection R1 Å R2
  • Division R1 R2

4
Data Instance for Operator Examples
STUDENT
COURSE
Takes
PROFESSOR
Teaches
5
Mini-Quiz
  • Try writing queries for these
  • The names of students named Bob
  • The names of students expecting an A
  • The names of students in Amir Roths 501 class
  • The sids and names of students not enrolled in
    any courses

6
The Big Picture SQL to Algebra toQuery Plan to
Web Page
Web Server / UI / etc
Query Plan anoperator tree
Execution Engine
Optimizer
Storage Subsystem
SELECT FROM STUDENT, Takes, COURSE
WHERE STUDENT.sid Takes.sID AND
Takes.cID cid
7
Optimization Is Based on Algebraic Equivalences
  • Relational algebra has laws of commutativity,
    associativity, etc. that imply certain
    expressions are equivalent in semantics
  • They may be different in cost of evaluation!

?c Ç d(R) ?c(R) ?d(R)
?c (R1 R2) R1 ?c R2
?c Ç d (R) ?c (?d (R))
  • Query optimization finds the most efficient
    representation to evaluate (or one thats not bad)

8
Switching Gears An Equivalent, ButVery
Different, Formalism
  • Codd invented a relational calculus that he
    proved was equivalent in expressiveness to the
    rel. algebra
  • Based on a subset of first-order logic
    declarative, without an implicit order of
    evaluation
  • Tuple relational calculus
  • Domain relational calculus
  • More convenient for certain kinds of
    manipulations
  • The database uses the relational algebra
    internally
  • But query languages (e.g., SQL) are mostly based
    on the relational calculus

9
Domain Relational Calculus
  • Queries have form
  • ltx1,x2, , xngt p
  • Predicate Boolean expression over x1,x2, , xn
  • Precise operations depend on the domain and query
    language may include special functions, etc.
  • Assume the following at minimum
  • ltxi,xj,gt ? R X op Y X op const const op X
  • where op is ?, ?, ?, ?, ?, ?
  • xi,xj, are domain variables

domain variables
predicate
10
More Complex Predicates
  • Starting with these atomic predicates, build up
    new predicates by the following rules
  • Logical connectives If p and q are predicates,
    then so are p ? q, p ? q, ?p, and p ? q
  • (xgt2) ? (xlt4)
  • (xgt2) ? ?(xgt0)
  • Existential quantification If p is a predicate,
    then so is ?x.p
  • ?x. (xgt2) ?(xlt4)
  • Universal quantification If p is a predicate,
    then so is ?x.p
  • ?x.xgt2
  • ?x. ?y.ygtx

11
Some Examples
  • Faculty ids
  • Subjects for courses with students expecting a
    C
  • All course numbers for which there exists a
    smaller course number

12
Logical Equivalences
  • There are two logical equivalences that will be
    heavily used
  • p ? q ? ?(p ? q) (Whenever p is true, q must
    also be true.)
  • ?x. p(x) ? ??x. ?p(x) (p is true for all x)
  • The second can be a lot easier to check!
  • Example
  • The highest course number offered

13
Terminology Free and Bound Variables
  • A variable v is bound in a predicate p when p is
    of the form ?v or ?v
  • A variable occurs free in p if it occurs in a
    position where it is not bound by an enclosing ?
    or ?
  • Examples
  • x is free in x gt 2
  • x is bound in ?x. x gt y

14
Can Rename Bound Variables Only
  • When a variable is bound one can replace it with
    some other variable without altering the meaning
    of the expression, providing there are no name
    clashes
  • Example ?x. x gt 2 is equivalent to ?y. y gt 2
  • Otherwise, the variable is defined outside our
    scope

15
Safety
  • Pitfall in what we have done so far how do we
    interpret
  • ltsid,namegt ?ltsid,namegt ? STUDENT
  • Set of all binary tuples that are not students
    an infinite set (and unsafe query)
  • A query is safe if no matter how we instantiate
    the relations, it always produces a finite answer
  • Domain independent answer is the same regardless
    of the domain in which it is evaluated
  • Unfortunately, both this definition of safety
    and domain independence are semantic conditions,
    and are undecidable

16
Safety and Termination Guarantees
  • There are syntactic conditions that are used to
    guarantee safe formulas
  • The definition is complicated, and we wont
    discuss it you can find it in Ullmans
    Principles of Database and Knowledge-Base Systems
  • The formulas that are expressible in real query
    languages based on relational calculus are all
    safe
  • Many DB languages include additional features,
    like recursion, that must be restricted in
    certain ways to guarantee termination and
    consistent answers

17
Mini-Quiz
  • How do you write
  • Which students have taken more than one course
    from the same professor?

18
Translating from RA to DRC
  • Core of relational algebra ?, ?, ?, x, -
  • We need to work our way through the structure of
    an RA expression, translating each possible form.
  • Let TRe be the translation of RA expression e
    into DRC.
  • Relation names For the RA expression R, the DRC
    expression is ltx1,x2, , xngt ltx1,x2, , xngt
    ? R

19
Selection TR?? R
  • Suppose we have ??(e), where e is another RA
    expression that translates as
  • TRe ltx1,x2, , xngt p
  • Then the translation of ?c(e) is
  • ltx1,x2, , xngt p??where ? is obtained
    from ? by replacing each attribute with the
    corresponding variable
  • Example TR?12 ?4gt2.5R (if R has arity 4)
    is
  • ltx1,x2, x3, x4gt lt x1,x2, x3, x4gt ? R ?
    x1x2 ? x4gt2.5

20
Projection TR?i1,,im(e)
  • If TRe ltx1,x2, , xngt p then
    TR?i1,i2,,im(e) ltx i1,x i2, , x im gt ?
    xj1,xj2, , xjk.p, where xj1,xj2, , xjk are
    variables in x1,x2, , xn that are not in x i1,x
    i2, , x im
  • Example With R as before,?1,3 (R)ltx1,x3gt?
    x2,x4. ltx1,x2, x3,x4gt ?R

21
Union TRR1 ? R2
  • R1 and R2 must have the same arity
  • For e1 ? e2, where e1, e2 are algebra expressions
  • TRe1ltx1,,xngtp and TRe2lty1,yngtq
  • Relabel the variables in the second
  • TRe2lt x1,,xngtq
  • This may involve relabeling bound variables in q
    to avoid clashes
  • TRe1?e2ltx1,,xngtp?q.
  • Example TRR1 ? R2 lt x1,x2, x3,x4gt
    ltx1,x2, x3,x4gt?R1 ? ltx1,x2, x3,x4gt?R2

22
Other Binary Operators
  • Difference The same conditions hold as for union
  • If TRe1ltx1,,xngtp and TRe2lt
    x1,,xngtq
  • Then TRe1- e2 ltx1,,xngtp??q
  • Product
  • If TRe1ltx1,,xngtp and TRe2lt
    y1,,ymgtq
  • Then TRe1? e2 ltx1,,xn, y1,,ym gt p?q
  • Example TRR?S ltx1,,xn, y1,,ym gt
    ltx1,,xngt ?R ? lty1,,ym gt ?S

23
What about the Tuple Relational Calculus?
  • Weve been looking at the Domain Relational
    Calculus
  • The Tuple Relational Calculus is nearly the same,
    but variables are at the level of a tuple, not an
    attribute
  • Q 9 S ? COURSES, 9 T 2 Takes (S.cid T.cid Æ
    Q.cid S.cid Æ Q.exp-grade T.exp-grade)

24
Limitations of the Relational Algebra / Calculus
  • Cant do
  • Aggregate operations
  • Recursive queries
  • Complex (non-tabular) structures
  • Most of these are expressible in SQL, OQL, XQuery
    using other special operators
  • Sometimes we even need the power of a
    Turing-complete programming language

25
Summary
  • Can translate relational algebra into relational
    calculus
  • DRC and TRC are slightly different syntaxes but
    equivalent
  • Given syntactic restrictions that guarantee
    safety of DRC query, can translate back to
    relational algebra
  • These are the principles behind initial
    development of relational databases
  • SQL is close to calculus query plan is close to
    algebra
  • Great example of theory leading to practice!
Write a Comment
User Comments (0)
About PowerShow.com