Title: Information Retrieval from Relational Databases
1Information Retrieval from Relational Databases
2Chapter Learning Objectives
- Identify and explain the purpose of the three
primary relational algebra operators - Identify and explain the primary components of a
Structured Query Language (SQL) statement - Identify the relational algebra operations
achieved by a given SQL statement - Create a SQL statement to retrieve requested
information from a relational database - Examine a SQL statement and the tables to which
it will be applied and identify the query result - Find errors in a SQL statement
- Create a Microsoft Query-by-Example (QBE) to
retrieve information from relational tables - Examine a Microsoft Access QBE query and the
tables to which it applies and identify the query
result - Find errors in a Microsoft Access QBE query
3Examples of Needs for Multiple Views of One Data
Set
- Cash-basis versus Accrual Accounting
- Weighted Average versus FIFO or LIFO
- Double-Declining Balance Depreciation versus
Straight Line - Foreign Currency Translation
4Querying/Information Retrieval
- Necessary components for effective querying
- A database that is well-designed (e.g., fully
relational) - A query developer who understands the table
structures and the nature of the data in the
tables - A query developer who understands the desired
query output - A query developer who has good logic and
reasoning skills - A query developer who knows the querying
language used to retrieve information from the
enterprise database
5Three Query Languages
- Relational Algebra
- Three main operators Select, Project, Join
- Provides the conceptual basis for SQL and QBE
- Structured Query Language (SQL)
- The user enters commands according to a
pre-defined syntax to retrieve desired data. - Query By Example (QBE)
- The user starts with a sample of the table(s)
columns and marks the fields he or she wants to
include in the answer. Defaults are available for
summarizing and manipulating the data.
6Relational Algebra
- Select
- includes only certain rows in its answer
- Project
- includes only certain columns in its answer
- Join
- combines two or more tables on the basis of one
or more common attributes
7Example Tables (Incomplete Enterprise
Database)from Dunn McCarthy (2004) working
paper
8Relational Algebra SELECT
- Find the cash receipts from Customer 2
(keeping all the details of those cash receipts)
9Relational Algebra PROJECT
Find the customer number, name, and salesperson
number for all customers
10Join Types
- Inner join
- includes only the records from both tables that
have the exact same values in the fields that are
joined - I.e.,
- Outer join
- includes all records from one table, and matches
those records from the other table for which
values in the joined fields are equal - I.e.,
Left Outer Join
Right Outer Join
11Relational Algebra Inner Join
Find all details of customers who have a
salesperson and all available details of each
customers salesperson
12Relational Algebra Left Outer Join
Find all details of all sales and the cash
receipt number and amount applied of any cash
receipts related to those sales
13SQL (Structured Query Language)
- Each query statement follows the same
structureSELECT attribute name(s)FROM table
name(s)WHERE criteria is met
14SQL Statements and Relational Algebra
- SELECT isolates columns
- relational algebras project
- FROM identifies the table(s) involved
- if gt1 table, helps accomplish relational
algebras join - WHERE isolates rows
- relational algebras select
- also helps accomplish relational algebras join
15SQL and Relational Algebra SELECT
Find the cash receipts from Customer 2
(keeping all the details of those cash receipts)
16SQL and Relational Algebra PROJECT
Find the customer number, name, and employee
number for all customers
17SQL and Relational Algebra Inner Join
Find all details of customers who have a
salesperson and all available details of each
customers salesperson
18SQL and Relational Algebra Outer Join
Find all details of all sales and the cash
receipt number and amount applied of any cash
receipts related to those sales
19Mathematical Comparison Operators
- Queries may include mathematical comparison
operators such as - equal to
- lt less than
- lt less than or equal to
- gt greater than
- gt greater than or equal to
- ltgt not equal to
- Included in the WHERE clause
- For text fields, A lt B lt C, etc.
20SQL Mathematical Comparison Operators
- Select Account, BalanceFrom CashWhere
Balancegt50000
21SQL Mathematical Comparison Operators on
Character Attributes
- Select Sale, AmountFrom SaleWhere Employee
ltgt E-10
22Queries with Logical Operators
- Queries may include logical operators AND, OR,
and NOT - AND accomplishes a set intersection answer
includes all instances that meet BOTH conditions - OR accomplishes a set union answer includes all
instances that meet one condition and all
instances that meet the other condition - NOT identifies instances that do not meet one or
more conditions
23Queries with Special Operators
- BETWEEN is used to define the range limits.
- The end points of the range are included
Select Sale, Amount, DateFrom SaleWhere Date
BETWEEN 7/1 and 7/31
24Queries with Special Operators
- IS NULL is used to retrieve attributes for which
the value is null.
Select From CashWhere Balance IS NULL
25Queries with Special Operators
- EXISTS is used to retrieve attributes for which
the value is not null.
Select From CashWhere Balance EXISTS
26Aggregation Functions in Queries
- An aggregation function summarizes the data
values within a field (column) - COUNT - the number of rows that contain a given
value in the field - AVERAGE - the mean value of all rows included in
the answer - SUM - the sum of all rows included in the answer
- MIN - the minimum attribute value for the field
- MAX - the maximum attribute value for the field
27Queries with Horizontal Calculations
- Horizontal calculations combine values from
different fields for each row - Should NOT be included in the same query as an
aggregation function
28Querying Summary
- Querying provides the power of the relational
database model - Querying requires organized thinking and logic
- You have to understand the structure of the
database tables and the nature of the data in
those tables. - You must identify which table(s) are needed for
each query, and determine the appropriate
manipulations that need to be made in the
appropriate sequence - Remember to separate horizontal calculations from
vertical aggregations - Comprehensive testing of queries is crucial
before releasing queries for use by general users