Lecture 4: Relational algebra - PowerPoint PPT Presentation

About This Presentation
Title:

Lecture 4: Relational algebra

Description:

Consider the simple case, where relation A has fields x and y, and relation B has field y ... Naked Gun 33 1/3. Naked Gun 2. Naked Gun 2. Naked Gun. sequel ... – PowerPoint PPT presentation

Number of Views:37
Avg rating:3.0/5.0
Slides: 35
Provided by: gmb
Category:

less

Transcript and Presenter's Notes

Title: Lecture 4: Relational algebra


1
Lecture 4Relational algebra
  • www.cl.cam.ac.uk/Teaching/current/Databases/

2
Todays lecture
  • Whats the (core) relational algebra?
  • How can we write queries using the relational
    algebra?
  • How powerful is the relational algebra?

3
Relational query languages
  • Query languages allow the manipulation and
    retrieval of data from a database
  • The relational model supports simple, powerful
    query languages
  • Strong formal foundation
  • Allows for much (provably correct) optimisation
  • NOTE Query languages are not (necessarily)
    programming languages

4
Formal relational query languages
  • Two formal query languages
  • Relational Algebra
  • Simple operational model, useful for expressing
    execution plans
  • Relational Calculus
  • Logical model (declarative), useful for
    theoretical results
  • Both languages were introduced by Codd in a
    series of papers
  • They have equivalent expressive power
  • They are the key to understanding SQL query
    processing!

5
Preliminaries
  • A query is applied to relation instances, and the
    result of a query is also a relation instance
  • Schema of relations are fixed (cf. types)
  • The query will then execute over any valid
    instance
  • The schema of the result can also be determined

6
Example relation instances
  • A database of boats, sailors, and reservations

S2
R1
S1
B1
7
Core relational algebra
  • Five basic operator classes
  • Selection
  • Selects a subset of rows
  • Projection
  • Picking certain columns
  • Renaming
  • Renaming attributes
  • Set theoretic operations
  • The familiar operations union, intersection,
    difference,
  • Products and joins
  • Combining relations in useful ways

8
Selection
  • Selects rows that satisfy a condition, written
  • R1 ?c(R2)
  • where c is a condition involving the attributes
    of R2, e.g.
  • ?ratinggt8(S2)
  • returns the relation instance

9
Selection cont.
  • Note
  • The schema of the result is exactly the same as
    the schema of the input relation instance
  • There are no duplicates in the resulting relation
    instance (why?)
  • The resulting relation instance can be used as
    the input for another relational algebra
    operator, e.g.
  • ?snameJulia(?ratinggt8(S2))

10
Projection
  • Deletes fields that are not in the
  • projection list
  • R1?A(R2)
  • where A is a list of attributes from the
  • schema of R2, e.g.
  • ?sname,rating(S2)
  • returns the relation instance

11
Projection cont.
  • Note
  • Projection operator has to eliminate duplicates
    (why?)
  • Aside Real systems dont normally perform
    duplicate elimination unless the user explicitly
    asks for it (why not?)

12
Renaming
  • R1 ?AB(R2)
  • Returns a relation instance identical to R2
    except that field A is renamed B
  • For example, ?snamenom(S1)

13
Familiar set operations
  • We have the familiar set-theoretic operators,
    e.g. ?, ?, -
  • There is a restriction on their input relation
    instances they must be union compatible
  • Same number of fields
  • Same field names and domains
  • E.g. S1?S2 is valid, but S1?R1 is not

14
Cartesian products
  • A?B
  • Concatenate every row of A with every row of B
  • What do we do if A and B have some field names in
    common?
  • Several choices, but well simply assume that the
    resulting duplicate field names will have the
    suffix 1 and 2

15
Example
  • S1?R1

16
Theta join
  • Theoretically, it is a derived operator
  • R1 Vc R2 _at_ ?c(R1?R2)
  • E.g., S1 Vsid.1ltsid.2R1

17
Theta join cont.
  • The result schema is the same as for a
    cross-product
  • Sometimes this operator is called a conditional
    join
  • Most commonly the condition is an equality on
    field names, e.g. S1 Vsid.1sid.2R1

18
Equi- and natural join
  • Equi-join is a special case of theta join where
    the condition is equality of field names, e.g. S1
    Vsid R1
  • Natural join is an equi-join on all common fields
    where the duplicate fields are removed. It is
    written simply A V B

19
Natural join cont.
  • Note that the common fields appear only once in
    the resulting relation instance
  • This operator appears very frequently in
    real-life queries
  • It is always implemented directly by the query
    engine (why?)

20
Division
  • Not a primitive operator, but useful to express
    queries such as
  • Find sailors who have reserved all the boats
  • Consider the simple case, where relation A has
    fields x and y, and relation B has field y
  • A/B is the set of xs (sailors) such that for
    every y (boat) in B, there is a row (x,y) in A

21
Division cont.
  • Can you code this up in the relational algebra?

22
Division cont.
  • Can you code this up in the relational algebra?

xs that are disqualified ?x((?x(A) ? B)
A) Thus ?x(A)-?x((?x(A) ? B) A)
23
Example 1
  • Find names of sailors whove reserved boat 103
  • Solution 1 ?sname(?bid103(Reserves) V Sailors)
  • Solution 2 ?sname(?bid103(Reserves V Sailors))
  • Which is more efficient?

Query optimisation
24
Example 2
  • Find names of sailors whove reserved a red boat

25
Example 2
  • Find names of sailors whove reserved a red boat

?sname(?colourred(Boats) V Reserves V Sailors)
Better ?sname(?sid(?bid(?colourred(Boats)) V
Reserves) V Sailors)
26
Example 3
  • Find sailors whove reserved a red or a green boat

27
Example 3
  • Find sailors whove reserved a red or a green boat

let T ?colourred?colourgreen(Boats) in
?sname(T V Reserves V Sailors)
28
Example 4
  • Find sailors whove reserved a red and a green
    boat

29
Example 4
  • Find sailors whove reserved a red and a green
    boat

NOTE Cant just trivially modify last solution!
let T1 ?sid (?colourred(Boats) V Reserves)
T2 ?sid (?colourgreen(Boats) V
Reserves) in ?sname((T1 ? T2) V Sailors)
30
Example 5
  • Find the names of sailors whove reserved at
    least two boats

let T ?sid.1sid (?sid.1,sname,bid (Sailors V
Reserves)) in ?sname.1 (?sid.1sid.2?bid.1?b
id.2(T ? T))
31
Example 6
  • Find the names of sailors whove reserved all
    boats

let T ?sid,bid (Reserves) / ?bid (Boats) in
?sname(T V Sailors)
32
Computational limitations
  • Suppose we have a relation SequelOf of movies and
    their immediate sequels
  • We want to compute the relation isFollowedBy

33
Computational limitations
  • We could compute
  • ?fst,thd(?moviefst,sequelsnd(SequelOf)
  • V ?moviesnd,sequelthd(SequelOf)
    )
  • This provides us with sequels-of-sequels
  • We could write three joins to get sequels-of-
    sequels-of-sequels and union the results
  • What about Friday the 13th (9 sequels)? ?
  • In general we need to be able to write an
    arbitrarily large union
  • The relational algebra needs to be extended to
    handle these sorts of queries

34
Summary
  • You should now understand
  • The core relational algebra
  • Operations and semantics
  • Union compatibility
  • Computational limitations of the relational
    algebra
  • Next lecture Relational calculus
Write a Comment
User Comments (0)
About PowerShow.com