Dr' Alexandra I' Cristea - PowerPoint PPT Presentation

1 / 74
About This Presentation
Title:

Dr' Alexandra I' Cristea

Description:

http://www.cl.cam.ac.uk/Teaching/2003/Databases/codd.pdf. Reprint ... based on Silberschatz, Korth, Sudarshan Database System Concepts. 5th Edition, Chapter 2 ... – PowerPoint PPT presentation

Number of Views:52
Avg rating:3.0/5.0
Slides: 75
Provided by: acri5
Category:
Tags: aic | alexandra | cristea

less

Transcript and Presenter's Notes

Title: Dr' Alexandra I' Cristea


1
CS 319 Theory of Databases C4
  • Dr. Alexandra I. Cristea
  • http//www.dcs.warwick.ac.uk/acristea/

2
(provisionary) Content
  • Generalities DB
  • Integrity constraints (FD revisited)
  • Relational Algebra (revisited)
  • Query optimisation
  • Tuple calculus
  • Domain calculus
  • Query equivalence
  • LLJ, DP and applications
  • Temporal Data
  • The Askew Wall

3
previous FD
4
Relational model
  • E.F. Codd check wikipedia!
  • Papers
  • http//www.cl.cam.ac.uk/Teaching/2003/Databases/co
    dd.pdf
  • Reprint at A relational Model of Data for Large
    Shared Data Banks
  • Relational Completeness of Data Base Sublanguages

5
Relational Model
  • Structure of Relational Databases
  • Fundamental Relational-Algebra-Operations
  • Additional Relational-Algebra-Operations

based on Silberschatz, Korth, Sudarshan Database
System Concepts 5th Edition, Chapter 2
6
Example of a Relation Instance
7
Basic Structure
  • Formally, given sets D1, D2, . Dn a relation r
    is a subset of D1 x D2 x x DnThus,
    a relation is a set of n-tuples (a1, a2, , an)
    where each ai ? Di

8
Example
  • If
  • customer_name Jones, Smith, Curry, Lindsay,
    / Set of all customer names /
  • customer_street Main, North, Park, / set
    of all street names/
  • customer_city Harrison, Rye, Pittsfield,
    / set of all city names /
  • Then r (Jones, Main, Harrison),
    (Smith, North, Rye),
    (Curry, North, Rye),
    (Lindsay, Park, Pittsfield) is a relation
    over
  • customer_name x customer_street x
    customer_city

9
Attribute Types
  • Each attribute of a relation has a name
  • The set of allowed values for each attribute is
    called the domain of the attribute
  • Attribute values are (normally) required to be
    atomic that is, indivisible
  • E.g. the value of an attribute can be an account
    number, but cannot be a set of account numbers
  • Domain is said to be atomic if all its members
    are atomic
  • The special value null is a member of every
    domain
  • The null value causes complications in the
    definition of many operations
  • We shall ignore the effect of null values in our
    main presentation and consider their effect later

10
Relation Schema
  • A1, A2, , An are attributes
  • R (A1, A2, , An ) is a relation schema
  • Example
  • Customer_schema (customer_name,
    customer_street, customer_city)
  • r(R) denotes a relation r on the relation schema
    R
  • Example
  • customer (Customer_schema)

11
Relation Instance
  • The current values (relation instance) of a
    relation are specified by a table
  • An element t of r is a tuple, represented by a
    row in a table

attributes (or columns)
customer_name
customer_street
customer_city
Jones Smith Curry Lindsay
Main North North Park
Harrison Rye Rye Pittsfield
tuples (or rows)
customer
12
Relations are Unordered
  • Order of tuples is irrelevant (tuples may be
    stored in an arbitrary order)
  • Example account relation with unordered tuples

13
Database
  • A database consists of multiple relations
  • Information about an enterprise is broken up into
    parts, with each relation storing one part of
    the information
  • Storing all information as a single relation
    such as bank(account_number, balance,
    customer_name, ..)results in
  • repetition of information
  • e.g.,if two customers own an account (What gets
    repeated?)
  • the need for null values
  • e.g., to represent a customer without an account
  • Normalization theory deals with how to design
    relational schemas

14
The customer Relation
15
The depositor Relation
16
Keys
  • Let K ? R
  • K is a superkey of R if values for K are
    sufficient to identify a unique tuple of each
    possible relation r(R)
  • Example customer_name, customer_street and
    customer_name are both superkeys
    of Customer, if no two customers can possibly
    have the same name

17
Keys (Cont.)
  • K is a candidate key if K is minimal
  • Example customer_name superkey no subset
    of it is a superkey.
  • Primary key a candidate key chosen as the
    principal means of identifying tuples within a
    relation

18
Foreign Keys
  • A relation schema may have an attribute that
    corresponds to the primary key of another
    relation. The attribute is called a foreign key.

19
Query Languages
  • Language in which user requests information from
    the database.
  • Categories of languages
  • Procedural
  • Non-procedural, or declarative
  • Pure languages
  • Relational algebra
  • Tuple relational calculus
  • Domain relational calculus
  • Pure languages form underlying basis of query
    languages that people use.

20
Relational Algebra
  • Procedural language
  • Six basic operators
  • select ?
  • project ?
  • union ?
  • set difference
  • Cartesian product x
  • rename ?
  • The operators take one or two relations as
    inputs and produce a new relation as a result.

21
Select Operation Example
  • Relation r

A
B
C
D
? ? ? ?
? ? ? ?
1 5 12 23
7 7 3 10
  • ?AB D gt 5 (r)

A
B
C
D
? ?
? ?
1 23
7 10
22
Select Operation
  • Notation ? p(r)
  • p is called the selection predicate
  • Defined as ?p(r) t t ? r and p(t)
  • Where p is a formula in propositional calculus
    consisting of terms connected by ? (and), ?
    (or), ? (not)Each term is one of
  • ltattributegt op ltattributegt or ltconstantgt
  • where op is one of , ?, gt, ?. lt. ?
  • Example of selection ? branch_namePerryridge
    (account)

23
Project Operation Example
  • Relation r

A
B
C
? ? ? ?
10 20 30 40
1 1 1 2
A
C
A
C
?A,C (r)
? ? ? ?
1 1 1 2
? ? ?
1 1 2

24
Project Operation
  • Notation where A1, A2 are attribute names and r
    is a relation name.
  • The result is defined as the relation of k
    columns obtained by erasing the columns that are
    not listed
  • Duplicate rows removed from result, since
    relations are sets
  • Example To eliminate the branch_name attribute
    of account ?account_number, balance
    (account)

25
Union Operation Example
  • Relations r, s

A
B
A
B
? ? ?
1 2 1
? ?
2 3
s
r
A
B
? ? ? ?
1 2 1 3
  • r ? s

26
Union Operation
  • Notation r ? s
  • Defined as r ? s t t ? r or t ? s
  • For r ? s to be valid.
  • 1. r, s must have the same arity (same number
    of attributes)
  • 2. The attribute domains must be compatible
    (example 2nd column of r deals with the
    same type of values as does the 2nd column
    of s)
  • Example to find all customers with either an
    account or a loan
  • ?customer_name (depositor) ? ?customer_name
    (borrower)

27
Set Difference Operation Example
  • Relations r, s

A
B
A
B
? ? ?
1 2 1
? ?
2 3
s
r
  • r s

A
B
? ?
1 1
28
Set Difference Operation
  • Notation r s
  • Defined as
  • r s t t ? r and t ? s
  • Set differences must be taken between compatible
    relations.
  • r and s must have the same arity
  • attribute domains of r and s must be compatible

29
Cartesian-Product Operation Example
A
B
C
D
E
  • Relations r, s

? ?
1 2
? ? ? ?
10 10 20 10
a a b b
r
s
A
B
C
D
E
  • r x s

? ? ? ? ? ? ? ?
1 1 1 1 2 2 2 2
? ? ? ? ? ? ? ?
10 10 20 10 10 10 20 10
a a b b a a b b
30
Cartesian-Product Operation
  • Notation r x s
  • Defined as
  • r x s t q t ? r and q ? s
  • Assume that attributes of r(R) and s(S) are
    disjoint. (That is, R ? S ?).
  • If attributes of r(R) and s(S) are not disjoint,
    then renaming must be used.

31
Composition of Operations
  • Can build expressions using multiple operations
  • Example ?AC(r x s)
  • r x s
  • ?AC(r x s)

A
B
C
D
E
? ? ? ? ? ? ? ?
1 1 1 1 2 2 2 2
? ? ? ? ? ? ? ?
10 10 20 10 10 10 20 10
a a b b a a b b
A
B
C
D
E
? ? ?
? ? ?
10 10 20
a a b
1 2 2
32
Rename Operation
  • Allows us to refer to results of RA expressions
    to refer to a relation by more than one name.
  • Example ? x (E)
  • returns the expression E under the name X
  • expression E under name X, with attributes
    renamed to A1 , A2 , ., An .

33
  • Revise basic operators!

34
Relational Algebra Operators thus
  • Procedural language with six basic operators
  • select ?
  • project ?
  • union ?
  • set difference
  • Cartesian product x
  • rename ?
  • The operators take one or two relations as
    inputs and produce a new relation as a result.

35
Banking Example
  • branch (branch_name, branch_city, assets)
  • customer (customer_name, customer_street,
    customer_city)
  • account (account_number, branch_name, balance)
  • loan (loan_number, branch_name, amount)
  • depositor (customer_name, account_number)
  • borrower (customer_name, loan_number)

36
(No Transcript)
37
Example Queries
  • Find all loans of over 1200
  • ?amount gt 1200 (loan)

Find the loan number for each loan of an amount
greater than 1200
  • ?loan_number (?amount gt 1200 (loan))

Find the names of all customers who have a loan,
an account, or both, from the bank
  • ?customer_name (borrower) ? ?customer_name
    (depositor)

38
Example Queries
  • Find the names of all customers who have a loan
    at the Perryridge branch.

?customer_name (?branch_namePerryridge
(?borrower.loan_number loan.loan_number(borrower
x loan)))
Find the names of all customers who have a loan
at the Perryridge branch but do not have an
account at any branch of the bank.
?customer_name (?branch_name Perryridge
(?borrower.loan_number loan.loan_number(borrowe
r x loan))) ?customer_name(de
positor)
39
Example Queries
  • Find the names of all customers who have a loan
    at the Perryridge branch.
  • Query 1
  • ?customer_name (?branch_name Perryridge
  • (?borrower.loan_number loan.loan_number
    (borrower x loan)))
  • Query 2
  • ?customer_name(?loan.loan_number
    borrower.loan_number ( (?branch_name
    Perryridge (loan)) x borrower))

40
Example Queries
  • Find the largest account balance
  • Strategy
  • Find those balances that are not the largest
  • Rename account relation as d so that we can
    compare each account balance with all others
  • Use set difference to find those account balances
    that were not found in the earlier step.
  • The query is

?balance(account) - ?account.balance
(?account.balance lt d.balance (account x rd
(account)))
41
Formal Definition
  • A basic expression in the relational algebra
    consists of either one of the following
  • A relation in the database
  • A constant relation
  • Let E1 and E2 be relational-algebra expressions
    the following are all relational-algebra
    expressions
  • E1 ? E2
  • E1 E2
  • E1 x E2
  • ?p (E1), P is a predicate on attributes in E1
  • ?s(E1), S is a list consisting of some of the
    attributes in E1
  • ? x (E1), x is the new name for the result of E1

42
Additional Operations
  • We define additional operations that do not add
    any power to the
  • relational algebra, but that simplify common
    queries.
  • Set intersection
  • Natural join
  • Division
  • Assignment

43
Set-Intersection Operation
  • Notation r ? s
  • Defined as
  • r ? s t t ? r and t ? s
  • Assume
  • r, s have the same arity
  • attributes of r and s are compatible
  • Note r ? s r (r s)

44
Set-Intersection Operation Example
  • Relation r, s
  • r ? s

A B
A B
? ? ?
1 2 1
? ?
2 3
s
r
A B
? 2
45
Natural-Join Operation
  • Notation r s
  • Let r and s be relations on schemas R and S
    respectively. Then, r s is a relation on
    schema R ? S obtained as follows
  • Consider each pair of tuples tr from r and ts
    from s.
  • If tr and ts have the same value on each of the
    attributes in R ? S, add a tuple t to the
    result, where
  • t has the same value as tr on r
  • t has the same value as ts on s

46
Natural joint example
  • Example
  • R (A, B, C, D)
  • S (E, B, D)
  • Result schema (A, B, C, D, E)
  • r s is defined as
  • ?r.A, r.B, r.C, r.D, s.E (?r.B s.B ? r.D s.D
    (r x s))

47
Natural Join Operation Example
  • Relations r, s

B
D
E
A
B
C
D
1 3 1 2 3
a a a b b
? ? ? ? ?
? ? ? ? ?
1 2 4 1 2
? ? ? ? ?
a a b a b
r
s
A
B
C
D
E
? ? ? ? ?
1 1 1 1 2
? ? ? ? ?
a a a a b
? ? ? ? ?
48
Division Operation
r ? s
  • Notation
  • Suited to queries that include the phrase for
    all.
  • Let r and s be relations on schemas R and S
    respectively where
  • R (A1, , Am , B1, , Bn )
  • S (B1, , Bn)
  • The result of r ? s is a relation on schema
  • R S (A1, , Am)
  • r ? s t t ? ? R-S (r) ? ? u ? s ( tu ?
    r )
  • Where tu means the concatenation of tuples t and
    u to produce a single tuple

49
Division Operation Example
A
B
B
? ? ? ? ? ? ? ? ? ? ?
1 2 3 1 1 1 3 4 6 1 2
  • Relations r, s

1 2
s
A
  • r ? s

r
? ?
50
Another Division Example
  • Relations r, s

A
B
C
D
E
D
E
? ? ? ? ? ? ? ?
a a a a a a a a
? ? ? ? ? ? ? ?
a a b a b a b b
1 1 1 1 3 1 1 1
a b
1 1
s
r
  • r ? s

A
B
C
? ?
a a
? ?
51
Division Operation (Cont.)
  • Property
  • Let q r ? s
  • Then q is the largest relation satisfying q x s
    ? r
  • Definition in terms of the basic algebra
    operationLet r(R) and s(S) be relations, and let
    S ? R
  • r ? s ?R-S (r ) ?R-S ( ( ?R-S (r ) x s )
    ?R-S,S(r ))
  • To see why
  • ?R-S,S (r) simply reorders attributes of r
  • ?R-S (?R-S (r ) x s ) ?R-S,S(r) ) gives those
    tuples t in ?R-S (r ) such that for some tuple
    u ? s, tu ? r.

52
Assignment Operation
  • The assignment operation (?) provides a
    convenient way to express complex queries.
  • Write query as a sequential program consisting
    of
  • a series of assignments
  • followed by an expression whose value is
    displayed as a result of the query.
  • Assignment must always be made to a temporary
    relation variable.
  • Example Write r ? s as
  • temp1 ? ?R-S (r ) temp2 ? ?R-S ((temp1 x s
    ) ?R-S,S (r )) result temp1 temp2
  • The result to the right of the ? is assigned to
    the relation variable on the left of the ?.
  • May use variable in subsequent expressions.

53
Bank Example Queries
  • Find the names of all customers who have a loan
    and an account at the bank.

?customer_name (borrower) ? ?customer_name
(depositor)
  • Find the name of all customers who have a loan at
    the bank and the loan amount

?customer_name, loan_number, amount (borrower
loan)
54
Bank Example Queries
  • Find all customers who have an account from at
    least the Downtown and the Uptown branches.
  • Query 1
  • ?customer_name (?branch_name Downtown
    (depositor account )) ?
  • ?customer_name (?branch_name Uptown
    (depositor account))
  • Query 2
  • ?customer_name, branch_name (depositor
    account) ? ?temp(branch_name)
    ((Downtown ), (Uptown ))
  • Note that Query 2 uses a constant relation.

55
Bank Example Queries
  • Find all customers who have an account at all
    branches located in Brooklyn city.

?customer_name, branch_name (depositor
account) ? ?branch_name (?branch_city
Brooklyn (branch))
56
Library Case
reservation
book
author
name
department
Initials
title
name
publisher
date
year
cancelled
copy
borrow
name
department
department
from
cpYear
to
present
57
Library database
  • book ( ISBN, title, publisher, year )
  • author (ISBN, initials, name )
  • copy (barcode, ISBN, department, cpYear, present
    )
  • reservation (name, department, ISBN, date,
    cancelled )
  • borrow (name, department, barcode, from, to )

58
Library Questions (RA)
  • List all the authors of books of which the
    library has a copy that has never been borrowed.
  • List all the authors of books that have never
    been borrowed.
  • List all the authors of which no book has ever
    been borrowed.

59
(simple) Employee database
  • employee(person_name, street, city)
  • works(person_name, company_name, salary)
  • company(company_name, city)
  • manages(person_name, manager_name)

60
Exercise 2.1.c
  • Find the names of all employees who earn more
    than every employee of SBC.
  • The more than every... clause cannot be
    expressed directly in the relational algebra.
  • For all other employees there is an employee of
    SBC earning more. This can be described using a
    selection on a cartesian product (of works with
    works).
  • We then use the difference to find the correct
    result.

61
Solution 2.1.c
  • ?person-name (W)
  • - ?person-name( (?w.salaryltw2.salary
    w2.company-name SBC (W x ?W2(W) ))

62
Exercise 2.5.e
  • Find all companies located in every city in which
    SBC is located.

63
Case 1 1 city per company
  • ?C.company-name (
  • (?C.citySBC.city
  • (?C (company ) x (?SBC.company-nameSBC (?SBC
    (company ) ) )
  • Trivial!

64
Case 2 multiple cities per company
  • Find all companies located in every city in which
    SBC is located.
  • The every city... clause cannot be expressed
    directly in the relational algebra (except for
    division).
  • Lets see

65
Case 2a multiple cities division
  • All cities where SBC is located
  • SBCCity ?city (?company-nameSBC (company ) )
  • Is SBCCity constant?
  • Yes!
  • So we can use it on the right hand side of the
    division.
  • companies located in every city in which SBC is
    located
  • company ? SBCCity
  • Still quite neat!!

66
Case 2b multiple cities no division
  • Find all companies located in every city in which
    SBC is located.
  • The every city... clause cannot be expressed
    directly in the relational algebra (except for
    division).
  • For the other companies there is a city in which
    SBC is located and the other company is not.
  • The cities where a company is not located are all
    the cities except the ones where the company is
    located.
  • We thus need 2 times a difference in this query.
  • Can also be formulated using a difference operator

67
Case 2b multiple cities no division
  • For the other companies there is a city in which
    SBC is located and the other company is not
    IntRes.
  • E ?company-name (company ) x SBCCity
  • IntRes E company
  • We want not (IntRes)
  • ?company-name (company ) - ?company-name (IntRes)

68
  • What is the meaning of
  • SBCCity ?

69
Recognizing Types of Queries
  • Identify the type of the following queries, and
    afterwards also translate them to the algebra
    (PSJ, union, intersection, difference, division)
  • Give the name of customers that have a loan with
    a branch where they also have an account.
  • Give the name of customers who have a loan at a
    branch where they do not have an account.
  • Give the name of customers who have a loan at
    every branch where they have an account.
  • Give the name of customers who have loans only at
    branches where they have an account.

70
Reading Queries
  • 5. ?customer-name(?balancegtamount(
  • account ?? depositor ?? borrower ?? loan))
  • 6. ?branch-name(branch) ?
  • ?branch-name(?branch-city ? customer-city(
  • branch ?? account ?? depositor ?? customer))
  • 7. ?X.customer-name


    (?X.account-number Y.account-number ?
    X.customer-name ? Y.customer-name
    (?X(depositor) ? ?Y(depositor)))

71
Beer Database
  • visits(drinker, bar)
  • serves(bar, beer)
  • likes(drinker, beer).

72
Beer questions with a difference
  • Give all drinkers that visit bars that dont
    serve any beer they like
  • Give all drinkers that only visit bars that serve
    a beer they like
  • Give all drinkers that only visit bars that serve
    no beer they like
  • Give all drinkers that only visit bars that serve
    all beers they like (and maybe other beers as
    well)
  • Give all drinkers that only visit bars that only
    serve beers they like (and thus serve nothing
    else)

73
Summary
  • We have learned RA
  • We have learned to perform simple and more
    complex queries in RA

74
to follow Query optimisation
Write a Comment
User Comments (0)
About PowerShow.com