Title: IELM 511: Information System design
1IELM 511 Information System design
Introduction
Part 1. ISD for well structured data relational
and other DBMS
Info storage (modeling, normalization) Info
retrieval (Relational algebra, Calculus, SQL) DB
integrated APIs
ISD for systems with non-uniformly structured data
Basics of web-based IS (www, web2.0, ) Markups,
HTML, XML Design tools for Info Sys UML
Part III (one out of)
APIs for mobile apps Security, Cryptography IS
product lifecycles Algorithm analysis, P, NP, NPC
2Agenda
Relational Algebra
Relational Calculus
Structured Query Language (SQL)
DB APIs
3Recall our Bank DB design
BRANCH( b-name, city, assets)
CUSTOMER( cssn, c-name, street, city, banker,
banker-type)
LOAN( l-no, amount, br-name)
PAYMENT( l-no, pay-no, date, amount)
EMPLOYEE( e-ssn, e-name, tel, start-date, mgr-ssn)
ACCOUNT( ac-no, balance)
SACCOUNT( ac-no, int-rate)
CACCOUNT( ac-no, od-amt)
BORROWS( cust-ssn, loan-num)
DEPOSIT( c-ssn, ac-num, access-date)
DEPENDENT( emp-ssn, dep-name)
4Background Algebra
What is an algebra ?
Study of systems of mathematical objects and
operations defined on the objects
Examples of algebras
Integers, with operations , -, , /,
Real numbers, with operations , -, , /,
Vectors, with operations , -, ?, ,
Booleans, with operations ?, ?, ?,
5Relational Algebra
Relational Algebra objects instances of
relational schemas (namely, tables) operations
s, P, , set-theoretic operations ?, ?, -,
Key concepts Operator arguments Arguments
of operators are instances of schemas
(table) Operation closure The outcome of the
operator is an instance of schema Expressions
A sequence of operations can be written as an
expression Operator precedence The sequence
of application of operations in an expression is
fixed.
Compare these concepts to those in other algebras
6Relational Algebra select, s
Notation in remainder, we will refer to an
instance of a schema as a table
s unary operator, input one table output
table
LOAN
loan_number amount branch_name
L17 1000 Downtown
L23 2000 Redwood
L15 1500 Pennyridge
L93 500 Mianus
L11 900 Round Hill
L16 1300 Pennyridge
samount gt 1200(LOAN)
loan_number amount branch_name
L23 2000 Redwood
L15 1500 Pennyridge
L16 1300 Pennyridge
7Relational Algebra select, s
q conditions of s operator - Denote the
criterion for selection of a given tuple -
Must be evaluated one tuple at a time - Must
evaluate to true or false - Output set
of tuples for which q-conditions are true
LOAN
s(amount gt 1200) ? (branch_name Pennyridge)
(LOAN)
loan_number amount branch_name
L17 1000 Downtown
L23 2000 Redwood
L15 1500 Pennyridge
L93 500 Mianus
L11 900 Round Hill
L16 1300 Pennyridge
loan_number amount branch_name
L15 1500 Pennyridge
L16 1300 Pennyridge
8Relational Algebra project, P
P unary operator, input one table output
table
Plist of attributes (TABLE)
Ploan_number, amount (LOAN)
LOAN
loan_number amount branch_name
L17 1000 Downtown
L23 2000 Redwood
L15 1500 Pennyridge
L93 500 Mianus
L11 900 Round Hill
L16 1300 Pennyridge
loan_number amount
L17 1000
L23 2000
L15 1500
L93 500
L11 900
L16 1300
9Relational Algebra project, P
Project returns a set of tuples the number of
rows may be smaller that input
Example Find the names of all branches that have
given loans
Pbranch_name (LOAN)
LOAN
loan_number amount branch_name
L17 1000 Downtown
L23 2000 Redwood
L15 1500 Pennyridge
L93 500 Mianus
L11 900 Round Hill
L16 1300 Pennyridge
branch_name
Downtown
Redwood
Pennyridge
Mianus
Round Hill
10Relational Algebra combining operations
Example Find the names of all branches that have
given loans larger than 1200
Pbranch_name ( s(amount gt 1200) (LOAN))
LOAN
loan_number amount branch_name
L17 1000 Downtown
L23 2000 Redwood
L15 1500 Pennyridge
L93 500 Mianus
L11 900 Round Hill
L16 1300 Pennyridge
branch_name
Redwood
Pennyridge
11Relational Algebra combining operations
Note expressions impose a sequence in which
operations are perfromed
Example Find the names of all branches that have
given loans larger than 1200
X ( s(amount gt 1200) (LOAN))
Y Pbranch_name (X)
Y
LOAN
X
loan_number amount branch_name
L17 1000 Downtown
L23 2000 Redwood
L15 1500 Pennyridge
L93 500 Mianus
L11 900 Round Hill
L16 1300 Pennyridge
branch_name
Redwood
Pennyridge
loan_number amount branch_name
L23 2000 Redwood
L15 1500 Pennyridge
L16 1300 Pennyridge
12Relational Algebra join,
Join is useful when the information required is
in two (or more) tables. Tables are sets of
tuples, and the join of two tables produces a
cartesian product of the two sets
Background (set theory) cartesian product, A B
(x, y) x ? A, y ? B Example A 1, 2,
3 , B a, s A B (1, a), (1, s), (2,
a), (2, s), (3, a), (3, s)
13Relational Algebra join,
BORROWS
LOAN
customer loan_no
111-12-0000 L17
222-12-0000 L23
333-12-0000 L15
444-00-0000 L93
666-12-0000 L17
111-12-0000 L11
999-12-0000 L17
777-12-0000 L16
loan_number amount branch_name
L17 1000 Downtown
L23 2000 Redwood
L15 1500 Pennyridge
L93 500 Mianus
L11 900 Round Hill
L16 1300 Pennyridge
5 columns
Cartesian product, BORROWS LOAN
customer loan_no loan_number amount branch_name
111-12-0000 L17 L17 1000 Downtown
111-12-0000 L17 L23 2000 Redwood
111-12-0000 L17 L15 1500 Pennyridge
777-12-0000 L16 L16 1300 Pennyridge
48 rows
14Relational Algebra join,
Usually, a cartesian product produces several
tuples with un-related information. q-join
specifies a q-condition (same as a selection
criterion) to restrict the output of a join to
meaningful tuples only.
Example Find the loan no, amount and branch name
for all customers.
BORROWS loan_no loan_number LOAN
5 columns
customer loan_no loan_number amount branch_name
111-12-0000 L17 L17 1000 Downtown
222-12-0000 L23 L23 2000 Redwood
333-12-0000 L15 L15 1500 Pennyridge
444-00-0000 L93 L93 500 Mianus
666-12-0000 L17 L17 1000 Downtown
111-12-0000 L11 L11 900 Round Hill
999-12-0000 L17 L17 1000 Downtown
777-12-0000 L16 L16 1300 Pennyridge
8 rows Why ?
15Relational Algebra dot-notation in join,
Two tables being joined may have the same
attribute name (possibly denoting two different
things). To distinguish the columns in the
q-join, the names of attributes use dot-notation
The following are all equivalent
C BORROWS loan_no loan_number LOAN
C BORROWS BORROWS.loan_no LOAN.loan_number
LOAN
A BORROWS B LOAN C A A.loan_no
B.loan_number B
16Relational Algebra set theoretic operations, ?
Since a table is a set of tuples, it is possible
to make a union of two tables. BUT we require
closure (union of two tables should be a
table). ? Union is defined for two tables with
identical schemas.
Example Find the names of customers who have
either a deposit, or a loan with the bank
name
Jones
Smith
Hayes
Curry
Turner
Williams
Adams
Johnson
Brooks
Lindsay
A Pcustomer (BORROWS) ? Pc_ssn
(DEPOSIT) RESULT Pname (A A.customer
CUSTOMER.ssn CUSTOMER )
17Relational Algebra set theoretic operations, ?
Other set theoretic operations can be applied
with same rules.
Example Find the names of customers who have
both, a deposit and a loan with the bank
A Pcustomer (BORROWS) ? Pc_ssn
(DEPOSIT) RESULT Pname (A A.customer
CUSTOMER.ssn CUSTOMER )
RESULT
customer
111-12-0000
222-12-0000
333-12-0000
444-00-0000
666-12-0000
999-12-0000
777-12-0000
customer
111-12-0000
222-12-0000
333-12-0000
c_ssn
888-12-0000
222-12-0000
333-12-0000
555-00-0000
111-12-0000
000-12-0000
name
Jones
Smith
Hayes
?
18Relational Algebra set theoretic operations, -
Other set theoretic operations (same rules).
Example Find the names of customers who have a
loan but no deposits.
A Pcustomer (BORROWS) - Pc_ssn
(DEPOSIT) RESULT Pname (A A.customer
CUSTOMER.ssn CUSTOMER )
RESULT
customer
111-12-0000
222-12-0000
333-12-0000
444-00-0000
666-12-0000
999-12-0000
777-12-0000
customer
888-12-0000
555-12-0000
000-12-0000
c_ssn
888-12-0000
222-12-0000
333-12-0000
555-00-0000
111-12-0000
000-12-0000
name
Johnson
Turner
Lindsay
-
19Relational Algebra set theoretic operations,
Set division extends the meaning of integer
division, in the sense that it cancels away
common multiples. It is useful in answering for
all queries.
Example Do all the loan officers have the same
manager ? A solution Find the ssn of the person
who manages all the loan officers.
A Pbanker (sb_typeLO (CUSTOMER) ) B
Pmgr_ssn, e_ssn (EMPLOYEE) RESULT B A
A
B
mgr_ssn e_ssn
321-32-4321 111-22-3333
111-22-3333 333-11-4444
111-22-3333 123-45-6789
321-32-4321 555-66-8888
888-99-9999 987-65-4321
777-77-7777 888-99-9999
777-77-7777 321-32-4321
null 777-77-7777
RESULT
banker
333-11-4444
123-45-6789
mgr_ssn
111-22-3333
Note for this example, we have to specify
that the common divisor in B is e_ssn.
20Relational Algebra set theoretic operations,
Generic definition of
Attribute restrictions A B is defined only for
A( R, C) and B( C), where R, C are sets of
attributes. Output The output contains each
tiR such that ? tuples tjC ? B, ? a tuple, t
? A in which tC tjC and tR tiR.
common attribute set, C
OUTPUT
attribute set, R
r1 rm
c1 ck
r1 rm c1 ck
t1 tn
21Relational Algebra concluding remarks
RA provides a formal language to get information
from the database RA can potentially answer any
query, as long as the query pertains to exactly
one row of some table derivable using
expressions. Limitations of RA aggregation and
summary information Examples find the average
amount of assets in the branches find the total
assets of the bank, RA is procedural, namely,
an expression of RA specifies a step by step
procedure for computing the result.
22Relational Calculus (RC)
Background what is a calculus ? RC is based on
a formal system in logic, first order predicate
calculus (fopc) A formal system has a set
of symbols rules about how the symbols can
be arranged in well formed formulae (wff) a
(logical) mechanism to derive if a wff is
true/false. additionally, fopc allows wff
with variables and quantifiers (?, ?).
A query in RC takes the form t P(t)
Meaning the set of all tuples, t, for which
some Proposition, P(t) is true. P is also called
a predicate.
23Relational Calculus (RC) examples
1. Report the loans that exceed 1200 t t
? LOAN ? tamount gt 1200
2. Find the names of customers who took a loan
from the Pennyridge branch. tname ? s ?
BORROWS ? scustomer tssn ? ? u ? LOAN ?
uloan_number sloan_no ?
ubranch_name Pennyridge
24Relational Calculus (RC) remarks
RC is non-procedural any way that the predicate
P can be evaluated is valid. RC is the formal
basis for Structured Query Language (SQL) SQL is
the de facto standard language for all RDBMSs In
terms of functionality (i.e. the power to get
some information from any DB) RA and RC are
equivalent). Namely, any query that can be
written in RC has an equivalent RA formula, and
vice versa. Advantage of RC (over RA)
conceptually, it is better to allow the user
to define the logic of the query, but leave the
procedure for computing it to the program why ?.
25Bank tables..
BRANCH
EMPLOYEE
branch_name city assets
Downtown Brooklyn 9000000
Redwood Palo Alto 2100000
Pennyridge Horseneck 1700000
Mianus Horseneck 400000
Round Hill Horseneck 8000000
Pownal Bennington 300000
North Town Rye 3700000
Brighton Brooklyn 7100000
e_ssn e_name tel start_date mgr_ssn
111-22-3333 Jones 12345 Nov-2005 321-32-4321
333-11-4444 Smith 54321 Mar-1998 111-22-3333
123-45-6789 Lee 54321 Mar-1998 111-22-3333
555-66-8888 Turner 55555 Aug-2002 321-32-4321
987-65-4321 Jones 87621 Mar-1995 888-99-9999
888-99-9999 Chan 87654 Feb-1980 777-77-7777
321-32-4321 Adams 77777 Feb-1990 777-77-7777
777-77-7777 Black 99111 Jan-1980 null
26CUSTOMER
DEPOSIT
ssn name street city banker b_type
111-12-0000 Jones Main Harrison 321-32-4321 CRM
222-12-0000 Smith North Rye 321-32-4321 CRM
333-12-0000 Hayes Main Harrison 321-32-4321 CRM
444-12-0000 Curry North Rye 333-11-4444 LO
555-12-0000 Turner Putnam Stamford 888-99-9999 DO
666-12-0000 Williams Nassau Princeton 333-11-4444 LO
777-12-0000 Adams Spring Pittsfield 123-45-6789 LO
888-12-0000 Johnson Alma Palo Alto 888-99-9999 DO
999-12-0000 Brooks Senator Brooklyn 123-45-6789 LO
000-12-0000 Lindsay Park Pittsfield 888-99-9999 DO
c_ssn ac_num accessDate
888-12-0000 A101 Jan 1, 09
222-12-0000 A215 Feb 1, 09
333-12-0000 A102 Feb 28, 09
555-00-0000 A305 Mar 10, 09
888-12-0000 A201 Mar 1, 98
111-12-0000 A217 Mar 1, 09
000-12-0000 A101 Feb 25, 09
BORROWS
customer loan_no
111-12-0000 L17
222-12-0000 L23
333-12-0000 L15
444-00-0000 L93
666-12-0000 L17
111-12-0000 L11
999-12-0000 L17
777-12-0000 L16
LOAN
loan_number amount branch_name
L17 1000 Downtown
L23 2000 Redwood
L15 1500 Pennyridge
L93 500 Mianus
L11 900 Round Hill
L16 1300 Pennyridge
27References and Further Reading
Silberschatz, Korth, Sudarshan, Database Systems
Concepts, McGraw Hill
Next SQL and DB APIs