Title: Relational Algebra and Relational Calculus
 1Chapter 6
- Relational Algebra and Relational 
 Calculus
2Introduction
- Relational algebra and relational calculus are 
 formal query languages of the relational model.
- Relational algebra is a procedural language. 
- Relational calculus is a non-procedural language. 
- Both are equivalent to one another. 
- Both have formal strong foundation on logic. 
- Query languages ! programming languages 
- The result of an operation is a new relation. 
- Using one or more input relations. 
- Operation does not change original relations.
3Relational Algebra
- Other characteristics of Relational algebra 
- Set language All tuples are manipulated without 
 looping.
- Output from one operation can become input to 
 another operation.
- It allows operations to be nested, just as in 
 arithmetic. This property is called closure.
- The sequence of relational algebra operations 
 forms a relational algebra expression
- The result is also a relation.
4Relational Algebra Operations
- Unary relational operations Selection, 
 Projection, and Rename.
- Set theory operations Union, Intersection, 
 Difference, and Cartesian Product.
- Binary relational operations Join, and Division. 
- Additional relational operations Outer Joins, 
 Outer Union, and Semijoin.
5Relational Algebra Operations 
 6Relational Algebra Operations 
 7COMPANY database schema 
 8COMPANY database state 
 9Selection (or Restriction)
- ?predicate (R) 
- It selects a subset of the tuples (rows) of R 
 that satisfy the specified condition (predicate).
- No duplicates in the result (Why?) 
- The selection condition acts as a filter. 
- Tuples that make the condition false are filtered 
 out.
- Schema of result is identical to the schema of 
 the input relation.
-  Degree(Result)  Degree(R) 
-  Cardinality(Result)  Cardinality(R)
10Example - Selection (or Restriction)
- List the employees whose department is 4 
-  ?dno  4 (Employee) 
- List all employees whose salary is greater than 
 30,000
-  ?salary gt 10000 (Employee) 
11Selection (contd.)
- SELECTION Operation Properties 
- The Selection operation ? ltselection 
 conditiongt(R) produces a relation S that has the
 same schema (same attributes) as R
- Selection ? is commutative 
- ? ltcondition1gt(? lt condition2gt (R))  ? 
 ltcondition2gt (? lt condition1gt (R))
- Because of commutativity property, a cascade 
 (sequence) of Selection operations may be applied
 in any order
- ?ltcond1gt(?ltcond2gt (?ltcond3gt (R))  ?ltcond2gt 
 (?ltcond3gt (?ltcond1gt ( R)))
- A cascade of Selection operations may be replaced 
 by a single selection with a conjunction of all
 the conditions
- ?ltcond1gt(?lt cond2gt (?ltcond3gt(R))  ? ltcond1gt AND 
 lt cond2gt AND lt cond3gt(R)))
- The number of tuples in the result of a Selection 
 is less than (or equal to) the number of tuples
 in the input relation R.
12Projection
- ?col1, . . . , coln(R) 
- It defines a relation that contains a vertical 
 subset of R, extracting the values of specified
 attributes.
- It creates a vertical partitioning of R. 
- It eliminates duplicates. Why? 
- Some DBMSs do not eliminate duplicates, unless 
 the user asks for it.
- Example List each employees first and last name 
 and salary
-  ?lname, fname, salary(Employee)
13Projection (contd.)
- Projection Operation Properties 
- The number of tuples in the result of projection 
 ?ltlistgt(R) is always less or equal to the number
 of tuples in R.
- If the list of attributes includes a key of R, 
 then the number of tuples in the result of
 Projection is equal to the number of tuples in R.
- Projection is not commutative. 
- ? ltlist1gt (? ltlist2gt (R) )  ? ltlist1gt (R) as 
 long as ltlist2gt contains the attributes in
 ltlist1gt.
-  Degree(Result)  Degree(R) 
-  Cardinality(Result)  Cardinality(R) 
14Applying Selection and Projection 
 15Single relational expression
- To retrieve the first name, last name, and salary 
 of all employees who work in department number 5
- We can write a single relational expression 
- ?lname, fname, salary(?dno  5(Employee)) 
- Or a sequence of relational operations, giving a 
 name to each intermediate relations
- Dept5Emps  ?dno  5(Employee) 
- Result  ?lname, fname, salary(Dept5Emps)
16The Rename operation
- Denoted by ? (rho), it is used to 
- assign names to intermediate relations, or 
- rename a relation and its attributes. 
- General form 
- ?S (B1, B2, , Bn )(R) changes both 
- the relation name to S, and 
- the column (attribute) names to B1, B1, ..Bn 
- ?S(R) changes 
- the relation name only to S 
- ?(B1, B2, , Bn )(R) changes 
- the column (attribute) names only to B1, B1, ..Bn
17The Rename operation (contd.)
- For convenience, we also use a shorthand for 
 renaming attributes in an intermediate relation
- If we write 
- RESULT ? ? FNAME, LNAME, SALARY (DEP5_EMPS) 
- RESULT will have the same attribute names as 
 DEP5_EMPS (same attributes as EMPLOYEE)
- If we write 
- RESULT (F, M, L, S, B, A, SX, SAL, SU, DNO) ? 
- ? RESULT (F,M,L.S.B,A,SX,SAL,SU, DNO)(DEP5_EMPS) 
- The 10 attributes of DEP5_EMPS are renamed to F, 
 M, L, S, B, A, SX, SAL, SU, DNO, respectively.
18Binary Union operation
- R ? S 
- Union of two relations R and S defines a relation 
 that contains all the tuples of R, or S, or both
 R and S, duplicate tuples being eliminated.
- R and S must be union compatible. 
- Same number of attributes 
- Corresponding attributes have the same or 
 compatible domain (type compatible).
- Schema of the result is 1st relation schema.
19Example - Union
- To retrieve the social security numbers of all 
 employees who either work in department 5
 (RESULT1 below) or directly supervise an employee
 who works in department 5 (RESULT2 below)
- We can use the UNION operation as follows 
- DEP5_EMPS ? ?dno5 (EMPLOYEE) 
- RESULT1 ? ? SSN(DEP5_EMPS) 
- RESULT2(SSN) ? ?SUPERSSN(DEP5_EMPS) 
- RESULT ? RESULT1 ? RESULT2 
- The union operation produces the tuples that are 
 in either RESULT1 or RESULT2 or both
20Example of Union (contd.) 
 21Set Difference (Minus or Except)
- R  S 
- Defines a relation consisting of the tuples that 
 are in relation R, but not in S.
- R and S must be union compatible. 
- Schema of the result is 1st relation schema. 
- Same definition applies to all the set operations.
22Intersection
- R ? S 
- Defines a relation consisting of the set of all 
 tuples that are in both R and S.
- R and S must be union compatible. 
- Expressed using basic operations 
-  R ? S  R  (R  S) 
23Examples  Set Theory operations 
 24Properties of Union, intersect, and difference
- Notice that both union and intersection are 
 commutative operations that is
- R ? S  S ? R, and R ? S  S ? R 
- Both union and intersection can be treated as 
 n-ary operations applicable to any number of
 relations as both are associative operations
- R ? (S ? T)  (R ? S) ? T 
- (R ? S) ? T  R ? (S ? T) 
- The minus operation is not commutative that is 
 in general
- R  S ? S  R
25Cartesian (or cross) product
- R X S 
- Defines a relation that is the concatenation of 
 every tuple of relation R with every tuple of
 relation S.
- Each row of R is paired with each row of S. 
- Relations do not have to be type compatible. 
- Result schema has one field for each field of R 
 and S.
26Cartesian Product (contd.)
- Generally, CROSS PRODUCT is not a meaningful 
 operation
- Can become meaningful when followed by other 
 operations
- Example (not meaningful) 
- FEMALE_EMPS ? ? sexF(EMPLOYEE) 
- EMPNAMES ? ? fname, lname, SSN (FEMALE_EMPS) 
- EMP_DEPENDENTS ? EMPNAMES x DEPENDENT 
- EMP_DEPENDENTS will contain every combination of 
 EMPNAMES and DEPENDENT
- whether or not they are actually related
27Example  Cartesian Product 
 28Join Operation
- Join is a derivative of Cartesian product. 
- Equivalent to performing a Selection, using join 
 predicate as selection formula, over Cartesian
 product of the two operand relations.
- Very important and of the most difficult 
 operation
- Hard to implement efficiently in an RDBMS. 
- It allows us combine related tuples from various 
 relations.
- General form (called theta join)  R ltjoin 
 conditiongt S
- Where R and S can be any relations. 
- Join condition can be any general Boolean 
 expression.
29Join Operation (contd.)
- Example Suppose that we want to retrieve the 
 name of the manager of each department.
- To get the managers name, we need to combine 
 each DEPARTMENT tuple with the EMPLOYEE tuple
 whose SSN value matches the MGRSSN value in the
 department tuple.
- We do this by using the join operation. 
- DEPT_MGR ? DEPARTMENT MGRSSNSSN EMPLOYEE 
- MGRSSNSSN is the join condition. 
- Combines each department record with the employee 
 who manages the department
- The join condition can also be specified as 
 DEPARTMENT.MGRSSN EMPLOYEE.SSN
30Example of the Join Operation
DEPT_MGR ? DEPARTMENT MGRSSNSSN 
EMPLOYEE 
 31Some properties of Join
- Consider the following JOIN operation 
- R(A1, A2, . . ., An) S(B1, B2, 
 . . ., Bm)
-  R.AiS.Bj 
- Result is a relation Q with degree n  m 
 attributes
- Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that 
 order.
- The resulting relation state has one tuple for 
 each combination of tuplesr from R and s from S,
 but only if they satisfy the join condition
 rAisBj
- Hence, if R has nR tuples, and S has nS tuples, 
 then the join result will generally have less
 than nR  nS tuples.
- Only related tuples (based on the join condition) 
 will appear in the result
32Other Join Operations
- In addition to the Theta Join 
- Equijoin (a particular type of Theta Join) 
- Natural join 
- Outer join 
- Semijoin
33Equijoin
- A theta join where join condition contains only 
 equality ().
- The result of an equijoin always have one or more 
 pair of attributes (with identical names or not)
 that have identical values in every tuple.
- The join for the creation of DEPT_MGR was an 
 equijoin.
- Same degree and cardinality of a Theta join 
-  Degree(Result)  Degree(R)  Degree(S) 
-  Card(Result)  Card(R) x Card(S)
34Natural Join
- R  S 
- An Equijoin of the two relations R and S over all 
 common attributes x. One occurrence of each
 common attribute is eliminated from the result.
- If attributes have different names, a renaming 
 operations is applied first.
35Example - Natural Join
- To apply a natural join on the DNUMBER attributes 
 of DEPARTMENT and DEPT_LOCATIONS, it is
 sufficient to write
-  DEPT_LOCS ? DEPARTMENT  DEPT_LOCATIONS 
- Only attribute with the same name is DNUMBER 
- An implicit join condition is created based on 
 this attribute
- DEPARTMENT.DNUMBERDEPT_LOCATIONS.DNUMBER 
- Another example Q ? R(A,B,C,D)  S(C,D,E) 
- The implicit join condition includes each pair of 
 attributes with the same name, ANDed together
- R.CS.C AND R.D.S.D 
- Result keeps only one attribute of each such 
 pair
- Q(A,B,C,D,E)
36Another example of Natural Join 
 37Complete set of Relational Operations
- The set of operations including Selection ?, 
 Projection ? , Union ?, Difference -, and
 Cartesian Product X is called a complete set
 because any other relational algebra expression
 can be expressed by a combination of these five
 basic operations.
- For example 
- R ? S  (R ? S )  ((R - S) ? (S - R)) 
- R ltjoin conditiongtS  ? ltjoin conditiongt (R 
 X S)
38Division
- R(x,y) ? S(y) 
- Not supported as a primitive operator 
- Let R have 2 fields, x and y S has only field y 
-  R(x,y) ? S(y) is the set of all x values in R 
 such that the y
-  values associated with an x value in R 
 contains all y
-  values in S. 
- In general, x and y can be any lists of fields. 
- Expressed using basic operations 
-  T1 ? ?x(R) 
-  T2 ? ?x((S X T1)  R) 
-  T ? T1  T2
39Example - Division 
 40Query Tree Notation
- Query Tree 
- An internal data structure to represent a query. 
- Standard technique for estimating the work 
 involved in executing the query, the generation
 of intermediate results, and the optimization of
 execution.
- Nodes stand for operations like selection, 
 projection, join, renaming, division, .
- Leaf nodes represent base relations. 
- A tree gives a good visual feel of the complexity 
 of the query and the operations involved.
- Algebraic Query Optimization consists of 
 rewriting the query or modifying the query tree
 into an equivalent tree.
- (see Chapter 15) 
41Example of Query Tree 
 42Outer joins
- Use Outer joins if you want to display rows in 
 the result that do not have matching values in
 the join column.
- Outer joins Left ( ), Right ( ), and Full 
 ( )
- Left Outer Join R S 
- (Left) outer join is a join in which tuples from 
 R that do not have matching values in common
 columns of S are also included in result
 relation.
- Attribute of S in the unmatched tuples are filled 
 with null values.
43Outer joins (contd.)
- Right outer join is similar to left outer join. 
- It keeps unmatching tuples of the right relation 
 in the result of R S.
- R attributes in the unmatched tuples are filled 
 with null values.
- The full outer joins keeps all unmatching tuples 
 of the left and right relations of the operation.
- It is a combination of a left- and a right outer 
 join.
44Example - Left Outer join
- Temp ?(Employee) SSNMGR_SSN ( Department) 
 Result ? ?fname,minit,lname,dname(Department)
45Outer Union Operations
- OUTER UNION Operations 
- It was developed to take the union of tuples from 
 two relations if the relations are not type
 compatible.
- This operation will take the union of tuples in 
 two relations R(X, Y) and S(X, Z) that are
 partially compatible, meaning that only some of
 their attributes, say X, are type compatible.
- The attributes that are type compatible are 
 represented only once in the result, and those
 attributes that are not type compatible from
 either relation are also kept in the result
 relation T(X, Y, Z).
46Outer Union (contd.)
- Example Outer union of relations Student(name, 
 SSN, department, advisor) Instructor(name, SSN,
 department, rank)
- Tuples are matched by the common attributes 
 name, SSN, and department.
- If a student is also an instructor, both advisor 
 and rank will have values otherwise, one of
 these attributes will be null.
- Resulting schema 
- Stu_Or_Inst(name, SSN, department, advisor, rank)
47Semijoin
- R FS 
- Defines a relation that contains the tuples of R 
 that participate in the join of R with S.
- Result schema is the schema of the first 
 relation.
- Can rewrite Semijoin using Projection and Join 
-  R F S  ?R.A(R F S) 
- Example List employees data for employees who 
 work at the Research department
-  Employee dnodnumber (sdnameResearch( 
 Department))
48Examples of queries in RA Procedural form
- Q1 Retrieve the name and address of all 
 employees who work for the Research department.
-  RESEARCH_DEPT ? ? DNAMEResearch (DEPARTMENT) 
-  RESEARCH_EMPS ? (RESEARCH_DEPT DNUMBER 
 DNOEMPLOYEEEMPLOYEE)
-  RESULT ? ? FNAME, LNAME, ADDRESS (RESEARCH_EMPS) 
- Q6 Retrieve the names of employees who have no 
 dependents.
-  ALL_EMPS ? ? SSN(EMPLOYEE) 
-  EMPS_WITH_DEPS(SSN) ? ? ESSN(DEPENDENT) 
-  EMPS_WITHOUT_DEPS ? (ALL_EMPS - EMPS_WITH_DEPS) 
-  RESULT ? ? LNAME, FNAME (EMPS_WITHOUT_DEPS  
 EMPLOYEE)
49Examples of queries in RA Single expression
- As a single expression, these queries become 
- Q1 Retrieve the name and address of all 
 employees who work for the Research department.
-  ? Fname, Lname, Address (s Dname Research 
- (DEPARTMENT DnumberDno(EMPLOYEE)) 
- Q6 Retrieve the names of employees who have no 
 dependents.
-  ? Lname, Fname((? Ssn (EMPLOYEE) - ? Ssn (? 
 Essn (DEPENDENT)))  EMPLOYEE)
50Relational Calculus
- Two versions tuple relational calculus (TRC) and 
 domain relational calculus (DRC).
- Calculus uses variables, constants, operators 
 (comparison and logical), and quantifiers.
- TLC variables range over tuples (rows). 
- DRC variables range over domain elements 
 (columns).
- Expressions in relational calculus are called 
 formulas (or predicates).
- Usually an answer to a formula is a set of tuples 
 that make the formula evaluate to true.
51Tuple Relational Calculus
- Query has the form t  p(t) 
- t is a tuple variable and p(t) is a formula. 
- It finds the set of all tuples t such that p(t) 
 is true.
- Tuple variable is a variable that ranges over a 
 named relation ie., variable whose only
 permitted values are tuples of the relation.
- Example Specify range of a tuple variable t of 
 the Employee relation as Employee(t)
-   t  Employee(t)  ? Get all Employee tuples
52Example - Tuple Relational Calculus
- Example To find the first and last names of all 
 employees whose salary is above 50,000, we can
 write the following tuple calculus expression
- t.fname, t.lname  Employee(t) AND 
 t.salarygt50000
- The first and last name (Projection ?fname, 
 lname) of each EMPLOYEE tuple t that satisfies
 the condition t.salarygt50000 (Selection ? SALARY
 gt50000) will be retrieved.
- Queries are evaluated on instances of Employee.
53Quantifiers in Relational calculus
- Can use two quantifiers to tell how many 
 instances the formula applies to
- Existential quantifier  (there exists) 
- Universal quantifier " (for all) 
- Tuple variables qualified by " or  are called 
 bound variables, otherwise called free variables.
- Only free variables should appear to the left of 
 the bar .
- Otherwise, the answer is either True or False. 
- If F(t) is a formula, then so are (t)(F(t)) and 
 ("t)(F(t)), where t is a tuple variable.
54Quantifiers (contd.) 
- The formula (? t)(F) is true if the formula F 
 evaluates to true for some (at least one) tuple
 assigned to free occurrences of t in F otherwise
 (? t)(F) is false.
- Example Retrieve the name and address of all 
 employees who work for the Research department
-  t.fname,t.lname,t.address  Employee(t) Ù 
 (d) (Department(d) Ù (d.name Research) Ù
 (d.dnumber  t.dno))
55Quantifiers (contd.)
- The formula (? t)(F) is true if the formula F 
 evaluates to true for every tuple (in the
 universe) assigned to free occurrences of t in F
 otherwise (? t)(F) is false.
-  (? t) is used in in statements about every 
 instance
- Example For all Project tuples, the location is 
 not Houston.
-  (? t)(p.plocation ? Houston) 
- Can also use (p) (p.plocation  Houston) 
 which means There are no locations in Houston.
-  ("t) (P(t)) ? (t) (P(t)) 
56Quantifiers (contd.)
- Find the names of employees who work on all the 
 projects controlled by department number 5. The
 query can be
- e.lname, e.fname  Employee(e) Ù ( (? 
 x)((Project(x)) V (x.dnum5) V
-  ( (? w)(Works_On(w) Ù w.ESSNe.SSN Ù 
 x.pnumberw.pno))))
- Exclude from the universal quantification all 
 tuples that we are not interested in by making
 the condition true for all such tuples.
- The first tuples to exclude (by making them 
 evaluate automatically to true) are those that
 are not in the relation R of interest.
- In query above, using the expression 
 (Project(x)) inside the universally quantified
 formula evaluates to true all tuples x that are
 not in the Project relation.
- Then we exclude the tuples we are not interested 
 in from R itself. The expression not(x.dnum5)
 evaluates to true all tuples x that are in the
 project relation but are not controlled by
 department 5.
- Finally, we specify a condition that must hold on 
 all the remaining tuples in R.
-  ( (? w)(Works_On(w) Ù w.ESSNe.SSN Ù 
 x.Pnumberw.pno)