Title: Database Management Systems
1- Database Management Systems
2What is a DBMS
- Database management systems
- Provide efficient and secure access to large
amounts of data. - Address problems such as
- How to store the data efficiently
- How to query data efficiently
- How to update the data securely (by multiple
users) - Contrast with using file systems for the same task
3Relational Databases
- Based on the relational model
- Separates the logical view from the physical view
of the data.
4Querying a Database
- Find all the students who have taken SYSC3001 in
Winter 2009. - S(tructured) Q(uery) L(anguage)
- select E.name
- from Enroll E
- where E.courseSYSC3001 and
- E.termWinter 2009
- Query processor figures out how to answer the
query efficiently.
5Database Industry
- Relational databases are a great success of
theoretical ideas. - Big 3 DBMS companies are among the largest
software companies in the world. - IBM (with DB2) and Microsoft (SQL Server,
Microsoft Access) are also important players. - 20B industry
- Challenged by object-oriented DBMS.
6Functionality of a DBMS
- Storage management
- Abstract data model
- High level query and data manipulation language
- Efficient query processing
- Transaction (concurrency) processing
- Resiliency recovery from crashes
- Interface with programming languages
7Why Use a DBMS?
- Data independence and efficient access.
- Reduced application development time.
- Data integrity and security.
- Uniform data administration
- Concurrent access and recovery from crashes.
8The Study of DBMS
- Several aspects
- Modeling and design of databases
- Database programming querying and update
operations - Database implementation
- DBMS study cuts across many fields of Computer
Science OS, languages, software engineering, AI,
Logic, multimedia, theory...
9Database Design
- Why do we need it?
- Agree on structure of the database before
deciding on a particular implementation. - Consider issues such as
- What entities to model?
- How entities are related?
- What constraints exist in the domain?
- How to achieve good designs?
10Database Design Formalisms
- Object Definition Language (ODL)
- Closer in spirit to object-oriented models
- Entity/Relationship model (E/R)
- More relational in nature.
- Both can be translated (semi-automatically) to
relational schemas (with varying amount of pain). - New comers UML and XML
11Entity / Relationship Diagrams
Objects entities Classes
entity sets Attributes are the names
of roles played by some domain (a set of atomic
values)in a relation (a table of values or file
of records). Relationships are associations
among entities.
Product
address
buys
12 name
category
name
price
makes
Company
Product
stockprice
buys
employs
Person
name
ssn
address
13Multi-way Relationships
How do we model a purchase relationship between
buyers, products and stores?
Product
Purchase
Store
Person
14Roles in Relationships
What if we need an entity set twice in one
relationship?
Product
Purchase
Store
buyer
salesperson
Person
15Attributes on Relationships
date
Product
Purchase
Store
Person
16The Relational Data Model
Database Model (ODL, E/R, UML)
Relational Schema
Physical storage
Complex file organization and index structures.
ODL definitions Diagrams (E/R, UML)
Tables row names attributes rows tuples
17Terminology
Product
18More Terminology
19More on Tuples
20Updates
The database maintains a current database
state. Updates to the data 1) add a
tuple 2) delete a tuple 3) modify an
attribute in a tuple Updates to the data happen
very frequently. Updates to the schema
relatively rare. Rather painful.
21From E/R Diagrams to Relational Schema
- relationships are already independent
entities - only atomic types exist in the E/R
model. Entity sets
relations Relationships
relations Special care for weak entity sets
existence depends on existence of another entity.
Example Dependent of Employee .
22 name
category
name
price
makes
Company
Product
Stock price
buys
employs
Person
name
ssn
address
23Entity Sets to Relations
name
category
price
Product
Product Name
Category Price iPod
gadgets
150
24Relationships to Relations
Start Year
name
category
name
makes
Company
Product
Stock price
Relation MAKES (watch out for attribute name
conflicts) Product-name
Product-Category Company-name Starting-year
iPod gadgets
Apple 1981
25Problems in Designing Schema
Name SIN Phone Number
Address
Fred 123-321-99 (201)555-1234 1234
Main ...
Fred 123-321-99 (206)572-4312 1234
Main ... Joe 909-438-44 (908)464-0028
987 Baseline ... Joe 909-438-44
(212)555-4000 987 Baseline ...
Problems - redundancy - update anomalies
- deletion anomalies Image a Book entity Name
ISBN Publisher Phone Address
26Relation Decomposition
Break the relation into two relations
Name SIN Address
Fred 123-321-99 1234 Main
St. ....
Joe 909-438-44 987
Baseline ....
Name Phone Number
Fred (201) 555-1234
Fred (206) 572-4312 Joe
(908) 464-0028 Joe (212) 555-4000
27Anomalies
- The updated programs will not operate correctly.
- Examples EMP_DEPT relation
- EName SIN BDate ADDR DNumber DName DMgrSIN
- Insertion anomalies It is difficult to insert a
new department that has no employees as yet in
the EMP_DEPT relation. - Deletion anomalies If we delete from the
EMP_DEPT an employee tuple that happens to
represent the last employee working for a
particular department, the information concerning
that department is lost from the database. - Update anomalies In EMP_DEPT relation, if we
want to change the value of one of the attributes
of a particular department, say the manager of
department 5, we must update the tuples of all
employees who work in that department otherwise,
the database will become inconsistent.
28Decompositions in General
Let R be a relation with attributes
A , A , A
1
2
n
Create two relations R1 and R2 with attributes
B , B , B
C , C , C
1
2
m
1
2
l
Such that
?
B , B , B
C , C , C
A , A , A
1
2
m
1
2
l
1
2
n
And -- R1 is the projection of R on
-- R2 is the projection of R on
B , B , B
1
2
m
C , C , C
1
2
l
29Boyce-Codd Normal Form
A simple condition for removing anomalies from
relations A relation R is in BCNF if and only
if Whenever there is a nontrivial
dependency for R , it is the case that
a super-key for R.
B
A , A , A
1
2
n
A , A , A
1
2
n
In English (though a bit vague) Whenever a
set of attributes of R is determining another
attribute, should determine all the
attributes of R.
30Example
Name SIN Phone
Number Addr
Fred 123-321-99 (201)
555-1234 ..
Fred 123-321-99 (206)
572-4312 . Joe
909-438-44 (908) 464-0028 . Joe
909-438-44 (212) 555-4000
.
What are the dependencies? What are the
keys? Is it in BCNF?
31And Now?
Name SSN Addr
Fred 123-321-99
Joe 909-438-44
Name Phone Number
Fred (201) 555-1234
Fred (206) 572-4312 Joe
(908) 464-0028 Joe (212) 555-4000
32More Examples
- EMP_DEPT
- ENAME SIN BDATE ADDR DNUM DNAME
DMGRSIN - Whats wrong?
- How to decompose? Functional dependency.
- Decompose EMP_DEPT into
- EMP
- ENAME SIN BDATE ADDR DNUM
- DEPT
- DNUM DNAME DMGRSIN
33More Examples (contd)
- Example
- EMP_PROJ
- SIN PNUMBER HOURS ENAME PNAME
PLOCATOIN -
- Can be decomposed into
- EP1
- SIN PNUMBER HOURS
- EP2
- SIN ENAME
- EP3
- PNUMBER PNAME PLOCATOIN
34More Examples (contd)
- EMP
- ENAME Proj_NAME Dep_NAME
- Smith X john
- Smith y anna
- Smith x anna
- Smith y john
- Brown w jim
- Brown x jim
- Brown y jim
- Brown z jim
- Brown w Joan
- Brown x joan
- Brown y joan
- Brown z joan
- Brown w bob
- Brown x bob
- Brown y bob
- Brown z bob
35More Examples (contd)
- EMP_PROJECTS
- ENAME Proj_NAME
- Smith x
- Smith y
- Brown w
- Brown x
- Brown y
- Brown z
- EMP_DEPENDENTS
- ENAME Dep_NAME
- Smith anna
- Smith john
- Brown jim
- Brown joan
- Brown bob
36SQL Introduction
Standard language for querying and manipulating
data Structured Query
Language
Many standards out there SQL92, SQL2,
SQL3. Vendors support various subsets of these,
but all of what well be talking about. Basic
form (many many more bells and whistles in
addition) Select attributes From
relations (possibly multiple, joined) Where
conditions (selections)
37Selections
SELECT FROM
Company WHERE countryUSA AND
stockPrice gt 50 You can use
attribute names of the relation(s) used in the
FROM. comparison operators , ltgt,
lt, gt, lt, gt apply arithmetic
operations stockprice2 operations
on strings (e.g., for concatenation).
lexicographic order on strings.
pattern matching s LIKE p special
stuff for comparing dates and times.
38Projections
Select only a subset of the attributes
SELECT name, stock price
FROM Company WHERE
countryUSA AND stockPrice gt 50
Rename the attributes in the resulting table
SELECT name AS company,
stockprice AS price FROM
Company WHERE countryUSA AND
stockPrice gt 50
39Ordering the Results
SELECT name, stock price
FROM Company WHERE
countryUSA AND stockPrice gt 50
ORDERBY country, name
Ordering is ascending, unless you specify the
DESC keyword. Ties are broken by the second
attribute on the ORDERBY list, etc.
40Joins
SELECT name, store
FROM Person, Purchase WHERE
namebuyer AND citySeattle
AND
productgizmo Product ( name, price,
category, maker) Purchase (buyer, seller,
store, product) Company (name, stock price,
country) Person (name, phone number, city)
41Disambiguating Attributes
Find names of people buying telephony products
SELECT Person.name FROM
Person, Purchase, Product WHERE
Person.namebuyer
AND productProduct.name
AND Product.categorytelephony Produc
t ( name, price, category, maker) Purchase
(buyer, seller, store, product) Person( name,
phone number, city)
42Tuple Variables
Find pairs of companies making products in the
same category
SELECT product1.maker, product2.maker
FROM Product AS product1, Product AS
product2 WHERE
product1.categoryproduct2.category
AND product1.maker ltgt
product2.maker
Product ( name, price, category, maker)
43Union, Intersection, Difference
(SELECT name FROM Person WHERE
CitySeattle) UNION (SELECT name FROM
Person, Purchase WHERE buyer name AND
store The Bon)
Similarly, you can use INTERSECT and EXCEPT. You
must have the same attribute names (otherwise
rename).
44Subqueries
SELECT Purchase.product FROM Purchase WHERE
buyer (SELECT name
FROM Person WHERE
social-security-number 123 - 45 - 6789)
In this case, the subquery returns one value. If
it returns more, its a run-time error.
45Subqueries Returning Relations
Find companies who manufacture products bought by
Joe Blow.
SELECT Company.name FROM Company,
Product WHERE Company.namemaker
AND Product.name IN
(SELECT product FROM
Purchase WHERE buyer
Joe Blow)
You can also use s gt ALL R
s gt ANY R
EXISTS R
46Conditions on Tuples
SELECT Company.name FROM Company,
Product WHERE Company.namemaker
AND (Product.name,price) IN
(SELECT product, price)
FROM Purchase
WHERE buyer Joe Blow)
47Correlated Queries
Find movies whose title appears more than once.
SELECT title FROM Movie AS Old WHERE year
lt ANY (SELECT
year FROM Movie
WHERE title
Old.title)
Movie (title, year, director, length)
Movie titles are not unique (titles may reappear
in a later year).
Note scope of variables
48Removing Duplicates
SELECT DISTINCT Company.name FROM
Company, Product WHERE Company.namemaker
AND (Product.name,price) IN
(SELECT product, price)
FROM Purchase
WHERE buyer Joe Blow)
49Conserving Duplicates
The UNION, INTERSECTION and EXCEPT operators
operate as sets, not bags.
(SELECT name FROM Person WHERE
CitySeattle) UNION ALL (SELECT name
FROM Person, Purchase WHERE buyername
AND storeThe Bon)
50Aggregation
SELECT Sum(price) FROM Product WHERE
manufacturerToyota SQL supports several
aggregation operations SUM, MIN, MAX, AVG,
COUNT Except COUNT, all aggregations apply to
a single attribute
SELECT Count() FROM Purchase
51Grouping and Aggregation
Usually, we want aggregations on certain parts of
the relation. Find how much we sold of every
product SELECT product, Sum(price) FROM
Product, Purchase WHERE Product.name
Purchase.product GROUPBY Product.name
1. Compute the relation (I.e., the FROM and
WHERE). 2. Group by the attributes in the
GROUPBY 3. Select one tuple for every group (and
apply aggregation) SELECT can have (1) grouped
attributes or (2) aggregates.
52HAVING Clause
Same query, except that we consider only
products that had at least 100 buyers.
SELECT product, Sum(price) FROM
Product, Purchase WHERE Product.name
Purchase.product GROUPBY Product.name HAVING
Count(buyer) gt 100
HAVING clause contains conditions on aggregates.
53Modifying the Database
We have 3 kinds of modifications insertion,
deletion, update.
Insertion general form -- INSERT INTO
R(A1,., An) VALUES (v1,., vn) Insert a new
purchase to the database INSERT INTO
Purchase(buyer, seller, product, store)
VALUES (Joe, Fred, wakeup-clock-espress
o-machine,
The Sharper Image)
If we dont provide all the attributes of R, they
will be filled with NULL.
We can drop the attribute names if were
providing all of them in order.
54More Interesting Insertions
INSERT INTO PRODUCT(name) SELECT
DISTINCT product FROM Purchase WHERE
product NOT IN (SELECT
name FROM Product)
The query replaces the VALUES keyword. Note the
order of querying and inserting.
55Deletions
DELETE FROM PURCHASE WHERE seller
Joe AND product Brooklyn
Bridge Factoid about SQL there is no way to
delete only a single
occurrence of a tuple that appears twice
in a relation.
56Updates
UPDATE PRODUCT SET price price/2 WHERE
Product.name IN (SELECT
product FROM Sales
WHERE Date today)
57Defining Views
Views are relations, except that they are not
physically stored. They are used mostly in order
to simplify complex queries and to define
conceptually different views of the database to
different classes of users. View purchases of
telephony products CREATE VIEW
telephony-purchases AS SELECT product, buyer,
seller, store FROM Purchase, Product WHERE
Purchase.product Product.name
AND Product.category telephony
58A Different View
CREATE VIEW Seattle-view AS SELECT
buyer, seller, product, store FROM
Person, Purchase WHERE Person.city
Seattle AND
Person.name Purchase.buyer
We can later use the views SELECT
name, store FROM Seattle-view,
Product WHERE Seattle-view.product
Product.name AND
Product.category shoes
Whats really happening when we query a view??
59What is a Transaction?
- Any action that reads from and/or writes to a
database may consist of - Simple SELECT statement to generate a list of
table contents - A series of related UPDATE statements to change
the values of attributes in various tables - A series of INSERT statements to add rows to one
or more tables - A combination of SELECT, UPDATE, and INSERT
statements
60What is a Transaction? (continued)
- A logical unit of work that must be either
entirely completed or aborted - Successful transaction changes the database from
one consistent state to another - One in which all data integrity constraints are
satisfied - Most real-world database transactions are formed
by two or more database requests - The equivalent of a single SQL statement in an
application program or transaction
61Evaluating Transaction Results
- Not all transactions update the database
- SQL code represents a transaction because
database was accessed - Improper or incomplete transactions can have a
devastating effect on database integrity - Some DBMSs provide means by which user can define
enforceable constraints based on business rules - Other integrity rules are enforced automatically
by the DBMS when table structures are properly
defined, thereby letting the DBMS validate some
transactions
62Transaction Properties
- Atomicity
- Requires that all operations (SQL requests) of a
transaction be completed - Durability
- Indicates permanence of databases consistent
state
63Transaction Properties (continued)
- Serializability
- Ensures that the concurrent execution of several
transactions yields consistent results - Isolation
- Data used during execution of a transaction
cannot be used by second transaction until first
one is completed
64Transaction Management with SQL
- ANSI has defined standards that govern SQL
database transactions - Transaction support is provided by two SQL
statements - COMMIT permanent change to a DB
- ROLLBACK undo a change to a DB up to the COMMIT
point - ANSI standards require that, when a transaction
sequence is initiated by a user or an application
program, - it must continue through all succeeding SQL
statements until one of four events occurs
65The Transaction Log
- Stores
- A record for the beginning of transaction
- For each transaction component (SQL statement)
- Type of operation being performed (update,
delete, insert) - Names of objects affected by the transaction (the
name of the table) - Before and after values for updated fields
- Pointers to previous and next transaction log
entries for the same transaction - The ending (COMMIT) of the transaction
66A Transaction Log