Title: Database Management Systems
1Database Management Systems
- Entity-Relationship Model
2Database Design
- Why do we need it?
- Agree on structure of the database before
deciding on a particular implementation. - Consider issues such as
- What entities to model
- How entities are related
- What constraints exist in the domain
- How to achieve good designs
3Database Design Formalisms
- 1. Object Definition Language (ODL)
- Closer in spirit to object-oriented models
- 2. Entity/Relationship model (E/R)
- More relational in nature.
- Both can be translated (semi-automatically) to
relational schemas - ODL to OO-schema direct transformation (C or
Smalltalk based system).
4Purpose of E/R Model
- The E/R model allows us to sketch the design of a
database informally. - Designs are pictures called entity-relationship
diagrams. - Fairly mechanical ways to convert E/R diagrams to
real implementations like relational databases
exist.
5Entity / Relationship Diagrams
Product
Entities Attributes Relationships between
entities
address
buys
6 name
category
name
price
makes
Company
Product
stockprice
buys
employs
Person
name
ssn
address
7University Example
- A college contains many departments
- Each department can offer any number of courses
- Many instructors can work in a department
- An instructor can work only in one department
- For each department there is a Head
- An instructor can be head of only one department
- Each instructor can take any number of courses
- A course can be taken by only one instructor
- A student can enroll for any number of courses
- Each course can have any number of students
8Modeling
- A database can be modeled as
- a collection of entities,
- relationship among entities.
- An entity is an object that exists and is
distinguishable from other objects. - Example person, company, event, plant
- Entities have attributes
- Example people have names and addresses
- An entity set is a set of entities of the same
type that share the same properties. - Example set of all persons, companies, trees,
holidays
9Entity Sets instructor and student
instructor_ID instructor_name
student-ID student_name
10ER Case Study
Banks Database
- Each bank has a unique name.
- Each branch has a number, name, address (number,
street, city), and set of phones. - Customer includes their name, set of address
(P.O. Box, city, zip code, country), set of
phones, and social security number. - Accounts have numbers, types (e.g. saving,
checking) and balance. Other branches might use
the same designation for accounts. So to name an
account uniquely, we need to give both the branch
number to which this account belongs to and the
account number. - Not all bank customers must own accounts and a
customer may have at most 5 accounts in the bank. - An account must have only one customer.
- A customer may have many accounts in different
branches. - Â
11Relationship Sets
- A relationship is an association among several
entities - Example 44553 (Ahmed) advisor 22222
(Hassan) student entity relationship set
instructor entity - A relationship set is a mathematical relation
among n ? 2 entities, each taken from entity sets - (e1, e2, en) e1 ? E1, e2 ? E2, , en ?
Enwhere (e1, e2, , en) is a relationship - Example
- (44553,22222) ? advisor
12Relationship Set advisor
13Relationship Sets (Cont.)
- An attribute can also be property of a
relationship set. - For instance, the advisor relationship set
between entity sets instructor and student may
have the attribute date which tracks when the
student started being associated with the advisor
14What is a Relation ?
- A mathematical definition
- if A, B are sets, then a relation R is a subset
of A x B - A1,2,3, Ba,b,c,d,
- R (1,a), (1,c), (3,b)
- - makes is a subset of Product x Company
15Relationships
- Relationships indicate a meaningful connection
between two entity types - Relationships may have attributes, but they
cannot have key attributes. - Identifying relationships connect a weak entity
type to some other entity type - indicates where the weak entity gets a key to
complete its own partial key
16Multiplicity of E/R Relations
- one-one
- many-one
- many-many
17Mapping Cardinalities
One to many
One to one
Note Some elements in A and B may not be mapped
to any elements in the other set
18Mapping Cardinalities
Many to one
Many to many
Note Some elements in A and B may not be mapped
to any elements in the other set
19One-to-One Relationship
- one-to-one relationship between an instructor and
a student - an instructor is associated with at most one
student via advisor - and a student is associated with at most one
instructor via advisor
20One-to-Many Relationship
- one-to-many relationship between an instructor
and a student - an instructor is associated with several
(including 0) students via advisor - a student is associated with at most one
instructor via advisor,
21Many-to-Many Relationship
- An instructor is associated with several
(possibly 0) students via advisor - A student is associated with several (possibly 0)
instructors via advisor
22Alternative Notation for Cardinality Limits
- Cardinality limits can also express participation
constraints
23Multi-way Relationships
How do we model a purchase relationship between
buyers, products and stores?
Can still model as a mathematical set (how ?)
24Relational Roles
- It is sometimes convenient to name an
entitys role in a relationship. - particularly useful in recursive relationships
- removes ambiguity in direction of relationship
25Roles in Relationships
What if we need an entity set twice in one
relationship?
Product
Purchase
Store
buyer
salesperson
Person
26Data Modeling Case Study
- The following is description by a pharmacy owner
- Ahmed Hassan catches a cold and what he
suspects is a flu virus. He makes an appointment
with his family doctor who confirm his diagnosis.
The doctor prescribes an antibiotic and nasal
decongestant tablets. - Ahmed leaves the doctor's office and drives to
his local drug store. The pharmacist packages
the medication and types the labels for pill
bottles. - The label includes information about customer,
the doctor who prescribe the drug, the drug
(e.g., Penicillin), when to take it, and how
often, the content of the pill (250 mg), the
number of refills, expiration date, and the date
of purchase." - Please develop a data model for the entities and
relationships within the context of pharmacy.
27Attributes
- An entity is represented by a set of attributes,
that is descriptive properties possessed by all
members of an entity set. - Example
- instructor (ID, name, street, city,
salary ) course (course_id, title, credits) - Domain the set of permitted values for each
attribute - Attribute types
- Simple and composite attributes.
- Single-valued and multivalued attributes
- Example multivalued attribute phone_numbers
- Derived attributes
- Can be computed from other attributes
- Example age, given date_of_birth
28Sets and Derived Attributes
- Multivalued attributes ? double lined oval
- multivalued set valued
- that there may be more than one value for the
attribute. - Derived attributes ? dashed line ovals
- the attribute is computed from other data
29Composite Attributes
- Composite attributes ? tree
- composed of other attributes.
- used for a set of related attributes, when the
set is not a conceptual entity - the composite doesnt have identity it doesnt
have a key
Address
City
ZipCode
State
Street
CompositeAttribute
30Composite Attributes
31(No Transcript)
32Attributes on Relationships
date
Product
Purchase
Store
Person
33Converting Multi-way Relationships to Binary
ProductOf
date
Product
Purchase
StoreOf
Store
BuyerOf
Person
34ER Case Study
Television Series Database
A Television network wishes to create a database
to keep track of its TV series. A television
series has one or more episode. Television series
identified by name and season number, and
includes their production company name and
Num_of_Episodes ( i.e. total number of episodes
in a specific season of a series ). Episode of
a specific season of a series is identified by
episode number and has a title and a length. No
episode can exist without a corresponding
television series. Also each episode has only one
writer. A writer is identified by name, and also
has birth date and a literary agency that
represents him or her. An actor appears as a
performer in a television series or a guest star
on an episode. An actor is identified by name
and also has a nationality and birth date. An
actor plays a particular character in a
television series or episode.
35Keys in E/R Diagrams
- Every entity must have a key
name
category
price
Product
36Attributes and Keys
- Key attributes must be unique for each entity
- Keys are used to identify particular entities
- Partial keys are only partially unique
- used for weak entity types
37Keys
- A super key of an entity set is a set of one or
more attributes whose values uniquely determine
each entity. - A candidate key of an entity set is a minimal
super key - ID is candidate key of instructor
- course_id is candidate key of course
- Although several candidate keys may exist, one of
the candidate keys is selected to be the primary
key.
38 Keys in E/R Diagrams
name
category
Underline
price
Product
No formal way to specify multiple keys in
E/R diagrams
Person
name
ssn
address
39Entity With Composite, Multivalued, and Derived
Attributes
40E-R Diagrams
- Rectangles represent entity sets.
- Diamonds represent relationship sets.
- Attributes listed inside entity rectangle
- Underline indicates primary key attributes
41 name
category
name
price
makes
Company
Product
stockprice
buys
employs
Person
name
ssn
address
42Relationship Sets with Attributes
43Entity Set to Relation
name
category
price
Product
Product(name, category, price) name
category
price gizmo
gadgets 19.99
44 Classroom Design Exercise
- Imagine we are creating a database for a dorm,
which includes a cooperative kitchen. - We want to record certain information about each
resident. What? - Not all residents belong to the kitchen coop.
Those that do interact in various ways - They take turns at various jobs preparer,
cleanup, buyer (for supplies). No one should
have two jobs on one day. - They may or may not be vegetarian. Each meal must
have at least one vegetarian entreé. - They pay fees to the coop.
- For each meal, there is a menu. Each menu item
requires certain ingredients, which must be on
hand.
45Constraints in E/R Diagrams
Finding constraints is part of the modeling
process. Commonly used constraints Keys
social security number uniquely identifies a
person. Single-value constraints a person
can have only one father. Referential
integrity constraints if you work for a company,
it
must exist in the database. Other
constraints peoples ages are between 0 and 150.
46Other Constraints
makes
lt100
Company
Product
What does this mean ?
47Entity Types
- Entity types ?boxes Weak entity type ? double
box
DEPENDENT
Weak Entity
48Weak Entity Sets
Entity sets are weak when their key comes from
other classes to which they are related.
affiliation
University
Team
number
sport
name
49Weak Entity Sets
- Sometimes an E.S. E s key comes not (completely)
from its own attributes, but from the keys of one
or more E.S.s to which E is linked by a
supporting many-one relationship. - Called a weak E.S.
- Represented by putting double rectangle around E
and a double diamond around each supporting
relationship. - Many-one-ness of supporting relationship
(includes 1-1) essential. - With many-many, we wouldn't know which entity
provided the key value. - Exactly one also essential, or else we might
not be able to extract key attributes by
following the supporting relationship.
50Example Logins (Email Addresses)
- Login name user name host name, e.g.,
ark_at_soe.ucsc.edu. - A login entity corresponds to a user name on a
particular host, but the passwd table doesnt
record the host, just the user name, e.g., ark. - Key for a login the user name at the host
(which is unique for that host only) the IP
address of the host (which is unique globally).
name
name
_at_
_at_
Logins
Hosts
51Example Schema
DependentOf
WorksOn
52Design Principles
- Setting client has (possibly vague) idea of what
he/she wants. You must design a database that
represents these thoughts and only these
thoughts. - Avoid redundancy
- saying the same thing more than once.
- Wastes space and encourages inconsistency.
- Example
- Good
name
addr
name
ManfBy
product
Manfs
53Example
- Bad repeats manufacturer address for each beer
they manufacture. - Bad manufacturers name said twice.
name
manf
product
Manf addr
name
addr
name
manf
ManfBy
product
Manfs
54Entity Sets Vs. Attributes
- You may be unsure which concepts are worthy of
being entity sets, and which are handled more
simply as attributes. - Especially tricky for the class design project,
since there is a temptation to create needless
entity sets to make project larger. - Wrong
- Right
name
name
ManfBy
Cars
Manfs
name
manf
Cars
55 Intuitive Rule for E.S. Vs. Attribute
- Make an entity set only if it either
- Is more than a name of something i.e., it has
nonkey attributes or relationships with a number
of different entity sets, or - Is the many in a many-one relationship.
56 Don't Overuse Weak E.S.
- There is a tendency to feel that no E.S. has its
entities uniquely determined without following
some relationships. - However, in practice, we almost always create
unique ID's to compensate social-security
numbers, VIN's, etc. - The only times weak E.S.'s seem necessary are
when - We can't easily create such ID's e.g., no one is
going to accept a species ID as part of the
standard nomenclature (species is a weak E.S.
supported by membership in a genus). - There is no global authority to create them,
e.g., crews and studios.
57Notation Summary
58(No Transcript)
59(No Transcript)
60Equivalent Schema defined in UML
61Design an ER schema for the following enterprise
62(No Transcript)
63University Example
- A college contains many departments
- Each department can offer any number of courses
- Many instructors can work in a department
- An instructor can work only in one department
- For each department there is a Head
- An instructor can be head of only one department
- Each instructor can take any number of courses
- A course can be taken by only one instructor
- A student can enroll for any number of courses
- Each course can have any number of students
64Steps in ER Modeling
- Step 1 Identify the Entities
- DEPARTMENT
- STUDENT
- COURSE
- INSTRUCTOR
65Steps in ER Modeling
- Step 2 Find the relationships
- One course is enrolled by multiple students and
one student enrolls for multiple courses, hence
the cardinality between course and student is
Many to Many. - The department offers many courses and each
course belongs to only one department, hence the
cardinality between department and course is One
to Many. - One department has multiple instructors and one
instructor belongs to one and only one department
, hence the cardinality between department and
instructor is one to Many. - Each department there is a Head of department
and one instructor is Head of department
,hence the cardinality is one to one . - One course is taught by only one instructor, but
the instructor teaches many courses, hence the
cardinality between course and instructor is
many to one.
66Steps in ER Modeling
- Step 3 Identify the key attributes
- Deptname is the key attribute for the Entity
Department, as it identifies the - Department uniquely.
- Course (CourseId) is the key attribute for
Course Entity. - Student (Student Number) is the key attribute
for Student Entity. - Instructor Name is the key attribute for
Instructor Entity. - Step 4 Identify other relevant attributes
- For the department entity, the relevant
attribute is location - For course entity, course name, duration,
prerequisite - For instructor entity, room, telephone
- For student entity, student name, date of birth
67Steps in ER Modeling
- Step 5 Draw complete E-R diagram with all
attributes including Primary Key
68E-R Diagram for a University Enterprise
69Case Study 2
- Design a DB representing cities, counties, and
states in the US - For states, record the name, population, and
state capital (a city). - For counties, record the name, the population,
and the located state. - For cities, record the name, the population, the
located state and the located county. - Uniqueness assumptions
- Names of states are unique.
- Names of counties are unique within a state
(e.g., 26 states have Washington Counties). - Cities are unique only within a state (e.g.,
there are 24 Springfields among the 50 states). - Some counties and cities have the same name, even
within a state (e.g., Los Angeles). - All cities are located within a single county
70Design 1 bad
Co. Popu.
Co. name
Popu.
Located
cities
states
name
Ci. Popu.
Ci. name
capital
Problem County Population is repeated for each
city.
71Design 2 good
Co. Popu.
Co. name
Popu.
name
Located
counties
states
Belongs-to
capitals
cities
Ci. name
Ci. Popu.
72Case Study 3
- Design a DB consistent with the following facts.
- Trains are either local trains or express trains,
but never both. - A train has a unique number and an engineer.
- Stations are either express stops or local stops,
but never both. - A station has a unique name and an address.
- All local trains stop at all stations.
- Express trains stop only at express stations.
- For each train and each station the train stops
at, there is a time.
73Design 1 bad
number
type
time
name
addr
StopsAt
trains
stations
engineer
type
Problem does not capture the constraints that
express trains only stop only at express stations
and local trains stop at all local stations
74Design 2 good
Lname
address
local stations
time
Lnumber
engineer
StopsAt2
local trains
Enumber
engineer
express trains
time
StopsAt1
express stations
Ename
address
75Case Study 4 (Pine Valley Furniture Company
- 1. The company sells a number of different
furniture products. - These products are grouped into several product
lines. - The identifier for a product is Product_ID, while
the identifier for a product line is
Product_Line_ID. - Referring to the customer invoice, we identify
the following additional attributes for product
Product_Description, Product_Finish, and
Unit_Price. - Another attribute for product line is
Product_Line_Name. - A product line may group any number of products,
but must group at least one product. - Each product must belong to exactly one product
line.
76Case Study 4
- 2. Customers submit orders for products The
identifier for an order is Order-ID, and another
attribute is Order_Date. - A customer may submit any number of orders, but
need not submit any orders. - Each order is submitted by exactly one customer.
- The identifier for a customer is Customer_ID.
- Other attributes include Customer_Name and
Customer_Address.
77Case Study4
- 3. A given customer order must request at least
one product. - Any product sold by Pine Valley Furniture may not
be requested on any order, or may be requested on
one or more orders. - An attribute associated with each order and
product is Quantity, which is the number of units
requested.
78Case Study4
- 4. Pine Valley Furniture has established sales
territories for its customers. - Each customer does business in one or more of
these sales territories. - The identifier for a sales territory is
Territory_ID. - A sales territory may have any number of
customers, or may not have any customers doing
business.
79Case Study4
- 5. Pine Valley Furniture Company has several
salespersons. The identifier for a salesperson is
Salesperson_ID. - Other attributes include Salesperson_Name,
Salesperson_Telephone, and Salesperson_Fax. - A salesperson serves exactly one sales territory.
- Each sales territory is served by one or more
salespersons.
80Case Study4
- 6. Each product is assembled from one or more raw
materials. - The identifier for the raw material entity is
Material_ID. - Other attributes include Unit_of_Measure and
Unit_Price. - Each raw material may be assembled into one or
more products.
81Case Study4
- 7. Raw materials are supplied by vendors.
- The identifier for a vendor is Vendor_ID.
- Other attributes include Vendor_Name and
Vendor_Address. - Each raw material can be supplied by one or more
vendors. - A vendor may supply any number of raw materials,
or may not supply any raw materials to Pine
Valley Furniture. - An attribute of the relationship between vendor
and raw material is Unit_Price
82Case Study4
- 8. Pine Valley Furniture has established a number
of work centers. - The identifier for a work center is
Work_Center_ID. - Another attribute is Location. Each product is
produced in one or more work centers. - A work center may be used to produce any number
of products, or may not be used to produce any
products.
83Case Study4
- 9. The company has over 100 employees.
- The identifier for employee is Employee_ID.
- Other attributes are Employee_Name,
EmployeeAddress, and Skill. - An employee may have more than one skill. And
Each skill can be mastered by many employees or
none.
84Case Study4
- 10 Each employee works in one or more work
centers. - A work center must have at least one employee
working in that center, but may have any number
of employees.
85Case Study4
- 11. Each employee has exactly one supervisor.
- An employee who is a supervisor may supervise any
number of employees, but not all employees are
supervisors.
86Library Case Study
- When a library first receives a book from a
publisher it is sent, together with the
accompanying delivery note, to the library desk.
Here the delivery note is checked against a file
of books ordered. - If no order can be found to match the note, a
letter of enquiry is sent to the publishers. If a
matching order is found, a catalogue note is
prepared from the details on the validated
delivery note. - The catalogue note, together with the book, is
sent to the registration department. The
validated delivery note is sent to the accounts
department where it is stored. - On receipt of an invoice from the publisher, the
accounts department checks its store of delivery
notes. If the corresponding delivery note is
found then an instruction to pay the publishers
is made, and subsequently a cheque is sent. If no
corresponding delivery note is found, the invoice
is stored in a pending file.Â
87A Case Study
Conference centre booking system A conference
centre takes bookings from clients who wish to
hold courses or conferences at the centre. When
clients make bookings they specify how many
people are included in the booking, and of
these, how many will be resident during the
booking, and how many will require catered or
non-catered accommodation at the centre. The
centre contains a number of facilities which may
be required by clients making bookings as
follows A. There are 400 bedrooms for clients
who will be resident during the Course or
conference. B. A maximum of 250 catered people
can be handled at any one time. C. Six main
lecture theatres providing seating for 200
people. D. Twenty seminar rooms each able to
accommodate 25 people. E. Video conference
facilities. The video conference facilities
consist of four separate video conference
networks. Each video conference network has a
large screen based in one of the main lecture
theatres, along with 3 satellite screens each of
which is based in one of the seminar rooms.
Draw an entity relationship diagram for the
case, stating any assumptions you deem necessary.
88Shipping company example
- The London and Ireland Shipping Company PLC
(LISC) was founded in 1852 and owns a fleet of
cargo ships. The company had historically run
passenger liners, but recent policy decisions
involved the sale of all passenger-carrying
vessels. The company currently has 14 vessels,
including one oil tanker and one tugboat
operating out of Liverpool. Most of the vessels
are registered in Liberia for tax reasons. - Each ship has one or more holds divided into
spaces. The holds are defined by steel bulkheads
and the spaces are defined by shelf racks or
other physical dividers. Sister ships, built by
the same shipbuilders and to the same designs
have similar names, such as Pride of Ireland,
Queen of Ireland, Song of Ireland and Warrior of
Ireland. Sister ships also have identical cargo
storage facilities. - LISC issues contracts to agents for one or more
manifests (lists of cargo items to be shipped).
LISC's charges for cargo carried are based on the
number of spaces the cargo requires for storage.
The types of cargo typically carried by LISC
include grain, coal and ores (carried only in
ships equipped with bulk cargo holds). They also
transport sacked grain, heavy cases, containers
(which may be carried on deck), pallets and so on.
89Shipping company example
- Cargo items may take up less than one space in a
hold, or one or more spaces, depending on the
size of the item. A space may therefore contain
several small cargo items. - The ships owned by LISC are kept as busy and as
full as possible, in order to maximise the
profits that each vessel makes and minimise
running operating costs. LISC's ships ply most
of the seas of the world, but tend to operate
mainly in the Mediterranean, the North and Mid
Atlantic and the Indian Ocean. Different ships
require different crew complements. - LISC intends to create a computer based
information system that will be able to perform
the following tasks - record the voyages of each ship with the start
and end ports. - record the cargo held by a ship on each voyage
- keep records of their employees and the ships
they are assigned to - producing invoices for agents and customers
- keep a record of customers' payments on
invoices - analyse the efficiency of use of cargo space
and of percentage wasted cargo space for ships
voyages
90Film Club Case Study Film Club UK is a
company that owns or leases a number of small
cinemas in the UK. They have commissioned a
database designer to design a database solution
to enable them to maintain details about their
cinemas and the films that they show. Note that
it is possible to have two cinemas in the same
location with the same name (there used to be two
Odeons in Newcastle). It is also possible to have
different films with the same title (for example,
different versions of a Shakespeare play).
Films are scheduled for one or more showings
at a cinema within a season. Season details
are to be notified in advance of the dates and
times of showings, takings, etc. to be notified
later. Any one film may have more than one
season at any one cinema (for example, a cinema
showing The Snowman each Christmas). At
present, all cinemas are single-screen.