The relational algebra - PowerPoint PPT Presentation

1 / 31
About This Presentation
Title:

The relational algebra

Description:

R1 R2 is the relation containing all tuples that appear only in both R1 and R2. ... Different from a Cartesian Product (R1 X R2) because includes a sigma predicate, ... – PowerPoint PPT presentation

Number of Views:92
Avg rating:3.0/5.0
Slides: 32
Provided by: thebutl
Category:

less

Transcript and Presenter's Notes

Title: The relational algebra


1
Introduction
  • The relational algebra
  • The relational calculus
  • Query languages

2
Set operations
  • Both operands must have the same number of
    corresponding domains of tuples (i.e., they are
    union-compatible), as does the result
  • Union
  • R1 U R2 is the relation containing all tuples
    that appear in R1 and R2.
  • If R1 and R2 have I and J tuples, respectively,
    it will have a maximum of (I J) tuples
  • Set Difference
  • R1 - R2 is the relation containing all tuples of
    R1 that do not appear in R2.
  • If R1 and R2 have I and J tuples, respectively,
    it will have a maximum of (I) tuples
  • Intersection
  • R1 ? R2 is the relation containing all tuples
    that appear only in both R1 and R2.
  • It can be derived from the set difference
    R1-(R1-R2).
  • Cartesian Product
  • R1 X R2 is the relation containing all possible
    pairs from the two relations
  • If R1 has I tuples and M attributes and R2 has J
    tuples and N attributes, the resulting relation
    has (I J) tuples and (M N) attributes
  • The result is always a relation
  • Involves more than one relation

3
Join Operation ?
  • Combines attributes of two relations into one.
  • Tuples in R1 are related to tuples in R2
  • Different from a Cartesian Product (R1 X R2)
    because includes a sigma predicate, (i.e., a
    Select on a Cartesian Product)
  • Essential to relational algebra, but
    memory-intensive in the practical world, so
    vendors try to optimize for these using query
    optimization

4
?
  • Theta join
  • R ? ltjoin conditiongt S
  • ltjoin conditiongt can be ltconditiongt and
    ltconditiongt and and ltconditiongt
  • ltconditiongt A ? B predicate
  • Result contains the sum of the degrees of the two
    relations
  • Equijoin (a type of theta join)
  • Equality comparisons only for theta
  • Natural join
  • If two relations being joined have one attribute
    (domain) name in common, assume that the single
    attribute in common is the one being compared to
    see if a new tuple will be inserted in the result
  • Result contains the sum of the degrees of the two
    relations minus the duplicate attributes after
    the first
  • Outer join (left, right, full) to retain
    additional tuples from one or more of the sets

5
?
  • Join of relations R (a,b),S(b,c),T(c,d) examples
  • Natural
  • M(a,R.b,c) lt- R ? S
  • Theta
  • N(b,S.c,T.c,d) lt- S ?S.cT.c T (Equijoin)
  • O(b,S.c,T.c,d) lt- S ?S.cltT.c T (Less than)
  • Outer
  • N(b,S.c,T.c,d) lt- S S.cT.c T (Left)
  • N(b,S.c,T.c,d) lt- S S.cT.c T (Right)
  • N(b,S.c,T.c,d) lt- S S.cT.c T (Full)
  • Added to manage distributed transactions
  • Semi
  • P(b,S.c) lt- S S.cltT.c T

6
Join Operation ?
  • Combines attributes of two relations into one.
  • Tuples in R1 are related to tuples in R2
  • Different from a Cartesian Product (R1 X R2)
    because includes a sigma predicate, (i.e., a
    Select on a Cartesian Product)
  • Essential to relational algebra, but
    memory-intensive in the practical world, so
    vendors try to optimize for these using query
    optimization

7
?
  • Theta join
  • R ? ltjoin conditiongt S
  • ltjoin conditiongt can be ltconditiongt and
    ltconditiongt and and ltconditiongt
  • ltconditiongt A ? B predicate
  • Result contains the sum of the degrees of the two
    relations
  • Equijoin (a type of theta join)
  • Equality comparisons only for theta
  • Natural join
  • If two relations being joined have one attribute
    (domain) name in common, assume that the single
    attribute in common is the one being compared to
    see if a new tuple will be inserted in the result
  • Result contains the sum of the degrees of the two
    relations minus the duplicate attributes after
    the first
  • Outer join (left, right, full) to retain
    additional tuples from one or more of the sets

8
?
  • Join of relations R (a,b),S(b,c),T(c,d) examples
  • Natural
  • M(a,R.b,c) lt- R ? S
  • Theta
  • N(b,S.c,T.c,d) lt- S ?S.cT.c T (Equijoin)
  • O(b,S.c,T.c,d) lt- S ?S.cltT.c T (Less than)
  • Outer
  • N(b,S.c,T.c,d) lt- S S.cT.c T (Left)
  • N(b,S.c,T.c,d) lt- S S.cT.c T (Right)
  • N(b,S.c,T.c,d) lt- S S.cT.c T (Full)
  • Added to manage distributed transactions
  • Semi
  • P(b,S.c) lt- S S.cltT.c T

9
Division Operation
  • A relation over the attributes in R1 defined over
    the attributes in R3 that match the combination
    of every tuple in R2, where R3 is the set of
    attributes in R1 but not R2
  • Restated For R1R2, compute all x values in R1
    that are not disqualified by some y value in R2
    (i.e., x tuple is disqualified if, by attaching y
    tuple from R2, we obtain an xy tuple that is not
    in R1)
  • Restated If x is in R1 and R2 ,and y is only in
    R1, then R1/R2 produces all tuples containing y
    from R1 that have a corresponding x for all x
    values in R2
  • p ltcgt(R1) p ltcgt((R2 X (p ltcgt(R1))) R1)

Show all salespeople calling on companies having
more then 10 billion in revenue R lt-
?Result(pEmpID,Comp(Contacts))(pComp(sRevgt10(Comp
anies))) Company Contacts / Companiesgt10billion
-gt R EmpID Comp. Comp. Result 56 IBM GE 56
76 GE IBM 56 GE 62 GM
10
Division - explored
A
B1
A/B1
  • Man Frm
  • Lilly tab
  • Lilly cap
  • Lilly syr
  • Lilly sup
  • SKB tab
  • SKB cap
  • GSK cap
  • PF cap
  • PF sup

Frm cap
Man Lilly,SKB,GSK,PF
B2
A/B2
Man Lilly,PF
Frm cap,sup
A/B3
B3
Frm tab,cap,sup
Man Lilly
T1 lt- Attributes of A not in Bx T2 lt- Cartesian
Product of Bx and T1 T3 lt- Set difference of T2
and A T4 lt- Attributes of T3 not in Bx T5 lt- T1
T4
11
Cover
  • All relational algebra operations can be
    expressed from this set of operations
  • s
  • p
  • U
  • -
  • X
  • But inconvenient to leave out intersection
  • ?
  • Also useful to have
  • Rename (rho ?)
  • Grouping (?)
  • Aggregation (F)

12
Relational algebra enhancements
  • Not derivable from basic operations
  • Aggregation (FA)
  • COUNT, SUM, AVG, MIN, MAX
  • Must consider how they use null values
  • Grouping (?A) or (GAF)
  • Allows congregation of relation tuples into
    groups according to common values in attributes

13
Relational algebra queries
  • Expression simplification is an important query
    optimization technique, which can affect the
    running time of queries by an order of magnitude
    or more.
  • early "selection" reduces the number of tuples
  • early "projection" reduces the number of domains

14
Relational Operations
  • Logical equivalences follow the following laws
  • Identity, Domination, Idempotent, Double negation
  • Commutative
  • R1 n R2 ? R2 n R1
  • Associative
  • (R1 n R2) n R3 ? R1 n (R2 n R3)
  • Distributive
  • R1 U (R2 n R3) ? (R1 U R2) n (R1 U R3)
  • R1 n (R2 U R3) ? (R1 n R2) U (R1 n R3)
  • Can be applied to Union, Intersection,
    Cross-Product, Joins
  • Not applicable to Set Difference or Division
  • Select commutes with Project only if Select
    attributes are in Project relation

15
Relational Operations
  • Transformations
  • Double negation
  • (s(S)) ? s(S)
  • Nesting cascade
  • sc2(sc1(S)) ? sc1(S) ? sc2(S)
  • More equivalencies
  • R ?c S ? sc(R X S)
  • pa(sc (S)) ? sc(pa (S)) if c ? a
  • sc(R X S) ? sc(R) X S if c attributes ? R
  • sc(R ? S) ? sc(R) ? S if c attributes ? R
  • sc1(sc2(sc3(R X S))) ? sc1(sc2(R) X sc3(S)) if
    attributes are members of their respective
    relation

16
Relational Algebra
  • Precedence is relatively unset and vendors vary
  • In general
  • Unary
  • Muliplicative
  • Additive
  • Best practice
  • Parenthetical

17
The Relational Calculus
  • High-level
  • Declarative non-procedural
  • From a branch of symbolic logic known as
    predicate calculus
  • A predicate is a truth-valued function with
    arguments
  • Replace arguments with values to obtain a
    proposition
  • Determine if proposition is true or false
  • Two forms
  • Tuple relational calculus (another Codd creation)
  • Domain relational calculus
  • Declares what is to be retrieved, not how to
    retrieve it
  • Requires a well-formed formula
  • Identical expressive power with relational
    algebra
  • The basis for a relationally complete language
    (i.e., a language is relationally complete if any
    query expressed by the relational calculus can
    also be expressed by the language)

18
Tuple Relational Calculus
  • Tuple Relational Calculus - variables range over
    tuples in a relation (tuple variables)
  • T F(T)
  • Interpreted as Find the set of all tuples T such
    that the formula F is true
  • F is a well-formed formula defining the
    predicate multiple predicates are connected
    using AND (/\), OR(\/) and NOT( or )
  • T on the right is the set of all tuple variables
    containing values that make F true
  • The left-hand T is free, the right becomes bound
    by conditions
  • Can be unsafe (t (Employees(t))), which
    yields all tuples in the universe not in the
    Employees set of tuples, so the concept of
    domain of a tuple relational calculus
    expression was established (an method of
    creating a (domain) range variable was proposed
    by Date to first constrain to the range of the
    expression).

19
Tuple Relational Calculus
  • Tuple Relational Calculus - variables range over
    tuples in a relation (tuple variables)
  • Tuple Relational Calculus S.x1,S.x2,,S.xn
    p(S.x1,S.x2,S.xm) m gt n
  • Generalization
  • Find the set of all tuples S such that F(S) is
    true SF(S)
  • F is a wff (well-formed formula)
  • Can be thought of as
  • What is retrieved predicate(s) to be
    satisfied
  • With free variables on the left (within the
    domain of the expression) and bound variables on
    the right

Example Single relation List the names of all
managers who earn more than 25,000. S.Name
Staff(S) /\ S.position manager /\ Salary gt
25000 Multiple relations List names of staff
who manage properties for rent in
Glasgow S.Name Staff(S) /\ ?(P)(PropertyForRen
t(P) /\ (P.staffno S.staffno) /\ P.city
Glasgow) Compare with relational
algebra List all cities where there is a branch
office but no properties for rent TRC B.city
Branch(B) /\ ((?P)(PropertyForRent(P) /\ B.city
P.city)) RA ?city (Branch) - ?city
(PropertyForRent) and SQL SELECT DISTINCT
City FROM Branch WHERE City NOT IN (SELECT City
FROM PropertyForRent)
20
Tuple Relational Calculus
  • To be a well-formed formula, it must be made up
    of one or more combinations of the following
    predicate calculus atoms using logical operators
    /\, \/ or
  • R(Sl), where R is a relation and Sl is a tuple
    variable
  • Sl.aeT Sm.af where S represents tuple variables
    ae represents attributes of a relation over which
    Sl ranges S represents tuple variables af
    represents attributes of a relation over which Sm
    ranges and T (theta) represents comparison
    operators (lt, lt, gt, gt, , ltgt).
  • Sl.aeT c where S represents tuple variables a
    represents attributes of a relation over which S
    ranges c represents a constant from the domain
    of a T (theta) represents comparison operators
    (lt, lt, gt, gt, , ltgt).

21
Domain Relational Calculus
  • Domain Relational Calculus variables range over
    the domains of attributes
  • d1,d2,dn F(d1,d2,dm)
  • d1,d2,dn are domain variables
  • Predicate requires finding a tuple containing a
    value in each domain that satisfies the
    proposition

Example Find the names of managers who earn more
than 25000 N (?N,pos,sal)(Staff(N,pos,sal) /\
posmanager /\ sal gt 25000 Example List
all cities where there is either a branch office
or a property for rent DRC city(Branch(bN,st,c
ity,pc)\/(PropertyForRent(pN,st1,city,pc1,rms))
RA ?city (Branch) ? ?city (PropertyForRent)
22
Domain Relational Calculus
  • To be a well-formed formula, it must be made up
    of one or more combinations of the following
    predicate calculus atoms using logical operators
    /\, \/ or
  • Of the form R(s1,s2,s3,,sn), where R is a
    relation name of degree n and sn is a domain
    variable
  • sl T sm where s represents domain variables a
    represents attributes of a relation over which S
    ranges T (theta) represents comparison operators
    (lt, lt, gt, gt, , ltgt) for comparing values from
    comparable domains
  • sl T c where s represents a domain variable c
    represents a constant from the domain of sl T
    (theta) represents comparison operators (lt, lt, gt,
    gt, , ltgt) for comparing values from domain sl
    with c.

23
The Relational Calculus
  • Find the names and ages of all students with a
    gpa above 3
  • TRC
  • P ?S?Students(S.gpagt3?P.nameS.name ?
    P.ageS.age)
  • DRC
  • ltN,AgtltI,N,T,Agt ?Students ? T gt 3
  • Binding variables using quantifiers
  • ? - there is at least one set of tuple values
    (existential)
  • ? - all tuple values in a set (universal)
  • DeMorgans law can be applied, for example
  • (?X)(F(X)) (?X)((F(X))
  • (?X)(F(X)) (?X)((F(X))
  • (?X)(F1(X) /\ F2(X)) (?X)((F1(X)) \/
    (F2(X)))
  • (?X)(F1(X) /\ F2(X)) (?X)((F1(X)) \/
    (F2(X)))

24
Relationally complete
BranchID LoanNumber Loan Value 1 1 1000 1
2 3000 2 1 1200 3 1 1500
  • Consider the following
  • Bank Branches give loans
  • Each loan has a loan number
  • Each loan has a value
  • Show the branches and loan numbers with a loan
    value greater than 1200
  • DRCltl,b,agt ltl,b,agt ? loan ? a gt 1200
  • TRCt t ? loan ? a gt 1200
  • RA sagt1200Loans
  • Show the loan number for any loans values greater
    than 1200
  • DRCltlgt ?b,a (ltl,b,agt ? loan ? a gt 1200
  • TRCt ?s ? loan(t.l s.l ? s.a gt 1200
  • RA pl(sagt1200Loans)

25
Query By Example
  • The foundation for Query By Example (QBE) is
    considered to primarily be domain relational
    calculus
  • A graphical interface where the user enters
    values for what is wanted
  • Originally develop by IBM in the 1970s
  • Now in almost all DBMSs
  • Example
  • Show the name of all customers less than 18
    years old who reserved a tape on 10/02/02
  • An example using 2 tables

26
DDL
  • CREATE
  • SCHEMA
  • DOMAIN
  • TABLE
  • VIEW
  • INDEX
  • ALTER
  • DOMAIN
  • TABLE
  • DROP
  • SCHEMA
  • DOMAIN
  • TABLE
  • VIEW
  • INDEX

27
CREATE
  • Example 1 CREATE TABLE TableName (
  • ColumnName1 DataType NOT NULL,
  • ColumnName2 DataType,
  • ColumnName3 DataType DEFAULT Yes,
  • ColumnName4 DataType CHECK (VALUE IN (SELECT
    col003 FROM table2)),
  • PRIMARY KEY pkName (ColumnName1,ColumnName2),
  • FOREIGN KEY fkName (ColumnName2) REFERENCES
    Table5 (col006) ON DELETE NO ACTION ON UPDATE
    CASCADE
  • Example 2 CREATE TABLE TableName (
  • ColumnName1 DataType CONSTRAINT pkName PRIMARY
    KEY,
  • ColumnName2 DataType CONSTRAINT fkName
    REFERENCES Table5 (col006) ON DELETE NO ACTION ON
    UPDATE CASCADE nnName NOT NULL,
  • ColumnName3 DataType CONSTRAINT dfName DEFAULT
    Yes,
  • ColumnName4 DataType CONSTRAINT ckName CHECK
    (VALUE IN (SELECT col003 FROM table2))

28
ALTER
  • ALTER TABLE TableName ADD COLUMN (ColumnName
    DataType)
  • ALTER TABLE TableName ALTER COLUMN (ColumnName
    DataType)
  • ALTER TABLE TableName DROP COLUMN ColumnName

29
DROP
  • DROP TABLE TableName

30
INDEX
  • CREATE INDEX IndexName ON TableName (ColumnName1,
    ColumnName2) DESC
  • DROP INDEX IndexName

31
Views
  • CREATE VIEW ViewName AS
  • SELECT col001, col005 FROM TableName WHERE
    col007 value
  • DROP VIEW ViewName
  • Views are updatable only if
  • DISTINCT, GROUP BY, multiple tables and nested
    tables referenced in the FROM clause are not
    included
  • Every attribute is the sole pointer to only a
    single column in the base table
  • The integrity constraints of the base table are
    not violated
  • View materialization may improve query speed, but
    delays updates
Write a Comment
User Comments (0)
About PowerShow.com