Title: Module 6 Relational Algebra
1Module 6Relational Algebra
- Teknik Informatika Fakultas Teknik
- Universitas Dr. Soetomo Surabaya
2database Modules
- Module 1 Database Systems
- Module 2 Relational Model
- Module 3 Entity Relationship Model
- Module 4 ER to Relational Mapping
- Module 5 FDs and Normalization
- Module 6 Relational Algebra
- Module 7 SQL
- Module 8 Database Systems Architecture
3Module 6 - Motivation
- A data model must also provide a set of
operations to manipulate the data
- The relational data model provides a means of
defining the database structure and constraints
Find the names and departments of all employees
who earn more than 55K Increment the salary of
all employees in the printing department by
10 What is the address of employee Jones
The basic set of relational model
operations constitute the Relational Algebra
4Module 6 - Contents
- Relational Algebra
- What is a Relational Query
- Relational Query Languages
- Relational Algebra Operations
- Query Formulation in Relational Algebra
- Exercises in Relational Algebra
5What is a Relational Query
- Data in a relational database can be manipulated
in the following ways - INSERT New tuples may be inserted
- DELETE Existing tuples may be deleted
- UPDATE Values of attributes in existing
tuples - may be changed
- RETRIEVE Attributes of specific tuples,
entire - tuples, or even entire relations may be
retrieved - Relational Query Languages should provide all of
the above
6Relational Query Languages
- Relational Queries are formulated in Relational
- Query Languages
- Relational Algebra (RA)
- Formal query language for a relational
database - Structured Query Language (SQL)
- Comprehensive, commercial query language with
- widely accepted international standard
- Query by Example (QBE)
- Commercial, graphical query language with
- minimum syntax
7SQL and Relational Algebra
- SQL
- Declarative language
- Users specify what the result of the query
should be, DBMS decides operations and orderof
execution - Operations
- Provides commands to create and modify
database structure and constraints (DDL) - Provides commands to insert, delete, update
and retrieve (DML)
- RA
- Procedural language
- Algebraic expressions specify an order of
operations ie. How the query will be processed - Operations
- Provides operators, that enable a user to
specify retrieval requests only
8Module 6 - Contents
- Relational Algebra
- What is a Relational Query
- Relational Query Languages
- Relational Algebra Operations
- Query Formulation in Relational Algebra
- Exercises in Relational Algebra
9Relational Algebra Operations
- Relational algebra operations are applied on
- relations
- Result of relational algebra operations are also
- relations, i.e the algebra operations produce
- new relations from old
- A sequence of relational algebra operations
- forms a relational algebra expression, whose
- result will also be a relation
10Types of RA Operations
- Set operations from mathematical set theory
(Applicable because each relation is also a set
of tuples) - UNION
- INTERSECTION
- DIFFERENCE
- CARTESIAN PRODUCT
- Operations developed specifically for RDBs
- SELECT
- PROJECT
- JOIN
- DIVISION
11Operators and Notation
- Specific Database
- Operators
- Select s
- Project ?
- Join ?
- Division
- Traditional Set
- Operators
- Intersection n
- Union ?
- Difference
- Cartesian Product
12Understanding RA Operations
- SELECT
- PROJECT
- Assignment and Naming
- UNION
- INTERSECTION
- DIFFERENCE
- Properties of operators
- CARTESIAN PRODUCT
- JOIN
- DIVISION
13Select
- s lt selection condition gt ( lt relation name gt )
- Select those rows which satisfy a given condition
This operation is also called restriction
Selected Tuples
14Select Example
- 1. List all details of employees working in
- department 4?
- EMPLOYEE Ename, SSN, DOB, Address, Sex,
- Salary, SuperSSN, Dno
- s Dno 4 (EMPLOYEE)
15Select Example
- 2. List all details of employees earning more
than 30000? - EMPLOYEE Ename, SSN, DOB, Address, Sex,
- Salary, SuperSSN, Dno
- s Salary gt 30000 (EMPLOYEE)
16Select Example
- 3. List all details about employees who work in
- department 4 and earn over 25000, or work
- in department 5 and earn over 30000?
- EMPLOYEE Ename, SSN, DOB, Address, Sex,
- Salary, SuperSSN, Dno
- s (Dno 4 ? Salary gt 25000) ?
- (Dno 5 ? Salary gt 30000)
(EMPLOYEE)
17Project
- ? lt attribute list gt (lt relation name gt)
-
- Produce a new relation with only some of the
attributes of the original relation. Duplicate
tuples are eliminated in the result relation
Duplicated Tuples
18Project Example
- 4. For each employee, list their name, date of
birth and salary. - EMPLOYEE Ename, SSN, Bdate, Address, Sex,
- Salary, SuperSSN, Dno
- ? Ename,Bdate,Salary (EMPLOYEE)
19Project Example
- 5. List the salaries paid to employees in each
department and the department number. - EMPLOYEE Ename, SSN, DOB, Address, Sex,
- Salary, SuperSSN, Dno
- ? Dno, Salary (EMPLOYEE)
20Handling Complex Queries
- Formulation of complex queries may require
- several relational algebra operations one after
- the other
- Operations can be written as a single
relational - algebra expression by nesting the operations
- Operations can be applied one at a time by
- creating intermediate result relations
- Intermediate Results have to be assigned to
- temporary relations which must be named
21Relation Assignment and Naming
- Relation Assignment
- Result Relation ? Relational Expression
- Relation Naming
- TEMP ?
- Attribute (re)Naming
- TEMP (dept, emp-salary) ?
22Assignment Example
- 6. Create a new relation named RESULT, containing
each employee and their date of birth. Label the
resulting columns with Employee and DOB. - EMPLOYEE Ename, SSN, Bdate, Address, Sex,
- Salary, SuperSSN, Dno
- RESULT(Employee,DOB ) ? ? Ename,Bdate (EMPLOYEE)
23Assignment Example
- 7. List the names and salaries of all employees
who work for department 5 - EMPLOYEE Ename, SSN, Bdate, Address, Sex,
- Salary, SuperSSN, Dno
- ? Ename,Salary ( s Dno 5 (EMPLOYEE ) )
- EMPS - DEP5 ? s Dno 5 (EMPLOYEE )
- RESULT ? ? Ename,Salary (DEP5 - EMPS)
Query with Expression
Query with Intermediate Relations
24Understanding RA Operations
- SELECT
- PROJECT
- Assignment and Naming
- UNION
- INTERSECTION
- DIFFERENCE
- Properties of operators
- CARTESIAN PRODUCT
- JOIN
- DIVISION
25Basic Set Operators
- Relation is a set of tuples (no duplicates)
- Set theory, and hence elementary set
- operators also apply
- to relations
- UNION
- INTERSECTION
- DIFFERENCE
- CARTESIAN
- PRODUCT
A
B
A
B
A
B
26Union Compatibility in Relations
- Two relations R(A1, A2, ..., An) and S(B1, B2,
- ..., Bn) are union compatible iff
- They have the same degree n, (number of
- columns)
- Their columns have corresponding domains, i.e
- dom(Ai) dom(Bi) for 1 i n
- Applies to union, intersection and difference
27 Union Compatibility
- Although domains need to correspond they do
- not have to have the same name
- WORKS_ON ESSN, Pno, Hours
- WORKED_ON Employee, Project, Duration
- where dom (ESSN) dom (Employee)
- dom (Pno) - dom (project)
- dom (Hours) dom (Duration)
28Union
- R1 ? R2
- Produces a relation that includes all tuples that
appear only in R1, or only in R2, or in both R1
and R2 - Duplicate Tuples are eliminated
- R1 and R2 must be union compatible
29Union Example
- 8.Identify the employees who both work on
projects and also have dependents - WORKS_ON ESSN, PNo, Hours
- DEPENDENT ESSN, Dep_Name, Sex, DOB,
- Relationship
- WORKS_ON ? DEPENDENT
- The relations are not UNION compatible !
30Union Example
- 9.List the ESSNs of employees who either have
dependents or work on projects. - WORKS_ON ESSN, PNo, Hours
- DEPENDENT ESSN, Dep_Name, Sex, DOB,
- Relationship
- ? ESSN ( DEPENDENT ) ? ? ESSN (WORKS_ON )
31Intersection
- R1 n R2
- Produces a relation that includes the tuples that
appear - in both R1 and R2.
- R1 and R2 must be union compatible.
32Intersection Example
- 10. List the ESSNs of employees who have
- dependents and work on projects.
- WORKS_ON ESSN, PNo, Hours
- DEPENDENT ESSN, Dep_Name, Sex, DOB,
- Relationship
- ? ESSN ( DEPENDENT ) n ? ESSN (WORKS_ON )
33Difference
- R1 - R2
- Produces a relation that includes all the tuples
that - appear in R1, but do not appear in R2.
- R1 and R2 must be union compatible.
34Difference Example
- 11. List the ESSNs of employees who have
- dependents but do not work on projects.
- WORKS_ON ESSN, PNo, Hours
- DEPENDENT ESSN, Dep_Name, Sex, DOB,
- Relationship
- ? ESSN ( DEPENDENT ) ? ESSN (WORKS_ON )
35Properties of Operators
- Commutative and Associative Operators
- Precedence among operators in relational
- algebra expressions
- De Morgans Laws
36Commutative and Associative
37Operator Precedence
- , ?, lt, gt, ,
- not
- and
- or
- s, ?
- n, ?, , , , ?
- Operators performed left to right in the
expression - ( ) can be used to alter operator precedence,
that is operations in ( ) will be performed
before even if they have a lower precedence order
38Precedence Example
- List all employees who are male, and either
- earn less than 40000 or work for deptment
5. - EMPLOYEE Ename, SSN, DOB, Address, Sex,
- Salary, SuperSSN,
Dnum - ? Ename (s (Sex M and (Salary lt 40000 or Dnum
5)) (EMPLOYEE )) - How does the above solution differ from the
following? - ? Ename (s (Sex M and Salary lt 40000 or Dnum
5) (EMPLOYEE ))
39De Morgans Laws
- ( p ? q ) p ? q
- ( p ? q ) p ? q
- where p and q are predicates, e.g Agegt20,
- DeptResearch, e.g
- (Salary gt 40000) ? (Dept Research) )
-
- (Salary gt 40000) ? (Dept Research)
40DeMorgans Law Example
- 13. List all projects which are neither located
in - Brisbane, nor controlled by department 4.
- PROJECT PName, PNo, Plocation, Dnum
- ? Pname(s not (Plocation Brisbane) and not
(Dnum4) (PROJECT)) -
- ? Pname (s not (Plocation Brisbane or Dnum4)
(PROJECT)) -
- ? Pname (s not (Plocation ltgt Brisbane and Dnum
ltgt 4) (PROJECT))
41Cartesian Product
- R1 R2
- Also known as a cross-product or cross-join R1
and R2 need NOT be union compatible - The result of R1 (A1, A2, An) x R2 (B1, B2,
Bm) is a relation Q with n m attributes Q (A1,
A2, An, B1, B2, Bm) in that order - Q has one tuple for each combination of tuples
from R1 and R2, thus if R1 has r tuples and R2
has t tuples, then Q will have r t tuples
42Cartesian Product Example
43Cartesian Product Example
- 14. For each female employee, list the names
- of all of her dependents.
- EMPLOYEE Ename,SSN,DOB,Address,Sex,Salary,SuperSS
N, Dno - DEPENDENT ESSN, DepName, Sex, DOB, Relationship
- FEMALE_EMPS ? s Sex F (EMPLOYEE)
- EMP_NAMES ? ? Ename, SSN (FEMALE_EMPS)
- EMP_DEPEND ? EMP_NAMES DEPENDENT
- ACTUAL_DEPEND ? s SSN ESSN (EMP_DEPEND)
- RESULT ? ? Ename, DepName (ACTUAL_DEPEND)
44Understanding RA Operations
- SELECT
- PROJECT
- Assignment and Naming
- UNION
- INTERSECTION
- DIFFERENCE
- Properties of operators
- CARTESIAN PRODUCT
- JOIN
- DIVISION
45Join Operations
- A Join is similar to Cartesian Product, but only
selected pairs of tuples appear in the result - It is used to combine related tuples from two
relations into a single tuple in a new relation.
This is needed when information is contained in
more than one relation - There are three types of Join Operations
- Thieta-Join
- Equi-Join
- Natural Join
46Thieta-Join
- R1 ? lt join conditiongt R2
- A join condition(s) is of the form A ? B,
where A ? R1 and B ? R2, - and ? is one of , ¹, lt, , gt,
47Thieta-Join Example
- 15. For each employee, list all the employees who
- earn more (than the first employee).
- EMPLOYEE Ename, SSN, DOB, Address, Sex, Salary,
- SuperSSN, Dno
- DEPARTMENT DName, DNumber, MgrSSN, MgrStart
- A ? EMPLOYEE
- B ? EMPLOYEE
- RESULT ? ? A.Ename, B.Ename (A? A.Salary lt
B.Salary B)
48Equi-Join
- R1 ? lt join conditiongt R2
- Specialization of Join
- Join condition only has equality comparisons only
49Equi-Join Example
- 16. List the names of the managers of each
- department.
- EMPLOYEE Ename, SSN, DOB, Address, Sex, Salary,
- SuperSSN, Dno
- DEPARTMENT DName, DNumber, MgrSSN, MgrStart
- DEPT_MGR ? DEPARTMENT ? MGRSSN SSN EMPLOYEE
- RESULT ? ? Ename (DEPT_MGR)
50Natural Join
- R1 R2
- Similar to equi-join except that the attributes
that are used for the join are those that have
the same name in each relation - Consequently, they are not explicitly specified
- The duplicate column is eliminated
51Natural Join Example
52Natural Join Example
- 17. What is the result schema of the following
- query? What attributes is the join
performed - on?
- DEPARTMENT DName, DNumber, MgrSSN, MgrStart
- DEPT_LOCS DNumber, Dlocation
- DEPARTMENT DEPT_LOCS
53Natural Join Example
- 18. What is the difference between the results of
- the following queries? What attributes are
the - joins performed on?
- EMPLOYEE Ename, SSN, DOB, Address, Sex, Salary,
SuperSSN, Dno - DEPARTMENT DName, DNumber, MgrSSN, MgrStart
- EMPLOYEE DEPARTMENT
- and
- EMP(MgrSSN ,Dnumber) ? ? SSN,Dno (EMPLOYEE)
- RESULT ? EMP DEPARTMENT
54Division
- R1 R2
- Result relation contains columns in R1, but not
in R2 - Relations R1 and R2 must be division compatible,
i.e last n columns of R1 must be identically
named to columns in R2, where n is the degree of
R2 - The result relation contains tuples t, such that
a value in t appears in R1, in combination with
every tuple in R2
55Division Example
56Division Example
- 22. Retrieve the names of employees who work on
- all projects that John Smith works on.
- EMPLOYEE Ename, SSN, DOB, Address, Sex, Salary,
SuperSSN, Dno - WORKS_ON ESSN, Pno, Hours
- SMITH ? s EnameJohn Smith (EMPLOYEE)
- SMITH_PNOS ? ? Pno (WORKS_ON ? ESSNSSN SMITH)
- SMITH_PNOS ? ? ESSN, Pno (WORKS_ON)
- SMITH_PNOS ? SSN_PNOS SMITH_PNOS
- RESULT ? ? Ename (SSNS EMPLOYEE)
57Module 6 - Contents
- Relational Algebra
- What is a Relational Query
- Relational Query Languages
- Relational Algebra Operations
- Query Formulation in Relational Algebra
- Exercises in Relational Algebra
58Query Formulation in RA
- Understand what the English query means
- Identify which relations, tuples (SELECT) and
attributes (PROJECT) that will be required for
the query - Identify the relationships between required
relations and accordingly which binary operators
can be used - (JOIN, PRODUCT, UNION, DIVISION, )
- Formulate the query keeping in mind operator
properties (Commutative/Associative, Order
precedence, De Morgans Laws)
59Which RA Operator to use ?
- Use unary operators SELECT / PROJECT when
choosing tuples / attributes respectively froma
single relation - Use binary operators UNION, PRODUCT, JOIN, when
defining the relationship between 2 or more
relations
- SELECT ?
- PROJECT ?
- UNION ?
- INTERSECTION
- DIFFERENCE ?
- CARTESIAN PRODUCT ?
- JOIN
- DIVISION
s ? ? Complete Set of Operations
60Complete Set of RA Operators
- It has been proved that s, ?, ?, , is a
complete set of RA operators - Each remaining relational algebra operator can be
expressed as a sequence of operations from this
set - These remaining operators have been defined
- primarily for convenience !
61Expressing other operators
- Intersection
- R n S ( R ? S ) (( R S ) ? ( S R )
- (Thieta/Equi) Join
- R ltconditiongt S s ltconditiongt ( R S)
- Natural Join
- R1 (B1, A2, A3, . . . An) ? ? (A1, A2, A3, . .
. An) R - R S ? (B1, A2, A3, ... An, B2, ... Bm) s
ltR.B1 S.B1gt( R1S)
62Expressing other operators
- Division
- T1 ? ? Y ( R )
- T2 ? ? Y ( ( S T1 ) R )
- R S T1 T2
63Module 6 - Contents
- Relational Algebra
- What is a Relational Query
- Relational Query Languages
- Relational Algebra Operations
- Query Formulation in Relational Algebra
- Exercises in Relational Algebra
64Relational Algebra Exercises
- These exercises use the Company database as an
example to illustrate relational algebra queries
that require the use of multiple relational
algebra operators - EMPLOYEE Ssn, Fname, Mit, Lname, Dob, Address,
Sex, - Salary, Dno, SuperSSN
- DEPARTMENT Dnumber, Dname, MGRSSN,MgrStart
- PROJECT Pno, PName, Plocation, DNum
- DEPENDENT ESSN,DepName, Sex, DOB, Relationship
- WORKS_ON ESSN, PNo, Hours
- DEPT_LOCS DNumber, DLocation
65RA Exercise
- 23. Retrieve the name and address of all
employees - who work for the Research Department.
- EMPLOYEE Ename, SSN, DOB, Address, Sex, Salary,
SuperSSN, Dno - DEPARTMENT Dname, Dnumber, MgrSSN, MgrStart
- RESEARCH_DEPT ? s DnameResearch (DEPARTMENT)
- RESEARCH_DEPT_EMPS ? (RESEARCH_DEPT? DnumberDno
EMPLOYEE) - RESULT ? ? Ename,Address (RESEARCH_DEPT_EMPS)
66RA Exercise
- 24. For every project located in Ipswich, list
the project - number, the controlling department number,
and the - department managers name, address birth
date. - EMPLOYEE Ename, SSN, DOB, Address, Sex, Salary,
SuperSSN, Dno - DEPARTMENT Dname, Dnumber, MgrSSN, MgrStart
- PROJECT PName, Pnumber, Plocation, Dnum
- IPSWICH_PROJS ? s Plocationipswich (PROJECT)
- CONTR_DEPT ? (IPSWICH_PROJS ? DnumDnumber
DEPARTMENT) - PROJ_DEPT_MGR ? (CONTR_DEPT? MgrSSNSSN EMPLOYEE)
- RESULT ? ? Pnumber,Dnum,Ename,Address,Bdate
(PROJ_DEPT_MGR)
67RA Exercise
- 25. Find the names of employees who work on all
- projects controlled by department 5.
- EMPLOYEE Ename, SSN, DOB, Address, Sex, Salary,
SuperSSN, Dno - PROJECT PName, Pnumber, Plocation, Dnum
- WORKS_ON ESSN, Pno, Hours
- DEPT5_PROJS(Pno) ? ? Pnumber (s Dnum5 (PROJECT))
- EMP_PROJ(SSN,Pno) ? ? ESSN,Pno (WORKS_ON)
- RESULT_EMP_SSNS ? EMP_PROJ DEPT5_PROJS
- RESULT ? ? Ename (RESULT_EMP_SSNS EMPLOYEE )
68RA Exercise
- 26. List project numbers for projects that
involve an employee - whose name is Smith, either as a worker or
as a manager of - the department that controls the project.
- EMPLOYEE Ename, SSN, DOB, Address, Sex, Salary,
SuperSSN, Dno - PROJECT PName, Pnumber, Plocation, Dnum
- DEPARTMENT Dname, Dnumber, MgrSSN, MgrStart
- WORKS_ON ESSN, Pno, Hours
- SMITHS(ESSN) ? ? SSN (s EnameSmith (EMPLOYEE))
- SMITH_WORKER_PROJS ? ? Pno (WORKS_ON SMITHS)
- MGRS ? ? Ename,Dnumber (EMPLOYEE ? SSNMgrSSN
DEPARTMENT) - SMITH_MGRS ? s EnameSmith (MGRS)
- SMITH_MANAGED_DEPTS(Dnum) ? ? Dnumber
(SMITH_MGRS) - SMITH_MGR_PROJS(Pno) ? ? Pnumber
(SMITH_MANAGED_DEPTS PROJECT) - RESULT ? SMITH_WORKER_PROJS ? SMITH_MGR_PROJS
69RA Exercise
- 27. Retrieve the names of employees who have no
- dependents.
- EMPLOYEE Ename, SSN, DOB, Address, Sex, Salary,
SuperSSN, Dno - DEPENDENT ESSN, Dep_Name, Sex, DOB,
Relationship - ALL_EMPS ? ? SSN (EMPLOYEE)
- EMPS_WITH_DEPS(SSN) ? ? ESSN (DEPENDENT)
- EMPS_WITHOUT_DEPS ? ( ALL_EMPS EMPS_WITH_DEPS)
- RESULT ? ? Ename (EMPS_WITHOUT_DEPS EMPLOYEE )
70RA Exercise
- 28. List the names of managers who have at least
- one dependent.
- EMPLOYEE Ename, SSN, DOB, Address, Sex, Salary,
SuperSSN, Dno - DEPARTMENT Dname, Dnumber, MgrSSN, MgrStart
- DEPENDENT ESSN, Dep_Name, Sex, DOB,
Relationship - MGR(SSN) ? ? MgrSSN (DEPARTMENT)
- EMPS_WITH_DEPS(SSN) ? ? ESSN (DEPENDENT)
- MGRS_WITH_DEPS ? (MGRS n EMPS_WITH_DEPS)
- RESULT ? ? Ename (MGRS_WITH_DEPS EMPLOYEE)
71Module 6 - Review
- Relational algebra gives the theoretical
foundations for Relational Query Languages - Relational algebra operations operate on
entire relations, - and produce results which are also
relations - Relational algebra expressions, consisting of
a sequence - of relational algebra operators, specify a
high-level - procedure to achieve a query result
- However, relational algebraic query formulation
is procedural, and therefore focuses on how a
query result can be achieved - Declarative query languages, e.g., SQL, allow the
user to specify what info the user wants rather
than how the result is to be obtained
72Recommended Readings
- Elmasri Navathe
- Chapter 7
73Next ...
- Module 7
- Structured Query
- Language (SQL)