Title: CMSC424: Database Design
1CMSC424 Database Design
- Instructor Amol Deshpande
- amol_at_cs.umd.edu
2Data Modeling
- Goals
- Conceptual representation of the data
- Reality meets bits and bytes
- Must make sense, and be usable by other people
- Today
- Entity-relationship Model
- Relational Model
3Motivation
- Youve just been hired by Bank of America as
their DBA for their online banking web site. - You are asked to create a database that monitors
- customers
- accounts
- loans
- branches
- transactions,
- Now what??!!!
4 Database Design Steps
Entity-relationship Model Typically used
for conceptual database design
Conceptual Data Model
Logical Data Model
Relational Model Typically used for
logical database design
Physical Data Model
5Entity-Relationship Model
- Two key concepts
- Entities
- An object that exists and is distinguishable from
other objects - Examples Bob Smith, BofA, CMSC424
- Have attributes (people have names and addresses)
- Form entity sets with other entities of the same
type that share the same properties - Set of all people, set of all classes
- Entity sets may overlap
- Customers and Employees
6Entity-Relationship Model
- Two key concepts
- Relationships
- Relate 2 or more entities
- E.g. Bob Smith has account at College Park Branch
- Form relationship sets with other relationships
of the same type that share the same properties - Customers have accounts at Branches
- Can have attributes
- has account at may have an attribute start-date
- Can involve more than 2 entities
- Employee works at Branch at Job
7ER Diagram Starting Example
- Rectangles entity sets
- Diamonds relationship sets
- Ellipses attributes
8Rest of the class
- Details of the ER Model
- How to represent various types of
constraints/semantic information etc. - Design issues
- A detailed example
9Next Relationship Cardinalities
- We may know
- One customer can only open one account
- OR
- One customer can open multiple accounts
- Representing this is important
- Why ?
- Better manipulation of data
- Can enforce such a constraint
- Remember If not represented in conceptual model,
the domain knowledge may be lost
10Mapping Cardinalities
- Express the number of entities to which another
entity can be associated via a relationship set - Most useful in describing binary relationship sets
11Mapping Cardinalities
- One-to-One
- One-to-Many
- Many-to-One
- Many-to-Many
has
customer
account
has
customer
account
has
customer
account
has
customer
account
12Mapping Cardinalities
- Express the number of entities to which another
entity can be associated via a relationship set - Most useful in describing binary relationship
sets - N-ary relationships ?
13Next Types of Attributes
- Simple vs Composite
- Single value per attribute ?
- Single-valued vs Multi-valued
- E.g. Phone numbers are multi-valued
- Derived
- If date-of-birth is present, age can be derived
- Can help in avoiding redundancy, enforcing
constraints etc
14Types of Attributes
15Types of Attributes
- multi-valued (double ellipse)
- derived (dashed ellipse)
age
cust-name
cust-id
customer
date-of-birth
cust-street
phone no.
cust-city
16Types of Attributes
age
cust-name
cust-id
customer
date-of-birth
cust-street
phone no.
cust-city
month
day
year
Composite Attribute
17Next Keys
- Key set of attributes identifying individual
entities or relationships
18Entity Keys
Possible Keys cust-id cust-name,
cust-city, cust-street cust-id, age
cust-name ?? Probably not. Domain knowledge
dependent !!
date-of-birth
cust-name
cust-id
customer
age
cust-street
phone no.
cust-city
19Entity Keys
- Superkey
- any attribute set that can distinguish entities
- Candidate key
- a minimal superkey
- Cant remove any attribute and preserve key-ness
- cust-id, age not a superkey
- cust-name, cust-city, cust-street is
- assuming cust-name is not unique
- Primary key
- Candidate key chosen as the key by DBA
- Underlined in the ER Diagram
20Entity Keys
- cust-id is a natural primary key
- Typically, SSN forms a good primary key
- Try to use a candidate key that rarely changes
- e.g. something involving address not a great idea
date-of-birth
cust-name
cust-id
customer
age
cust-street
phone no.
cust-city
21Relationship Set Keys
- What attributes are needed to represent a
relationship completely and uniquely ? - Union of primary keys of the entities involved,
and relationship attributes - cust-id, access-date, account number describes
a relationship completely
22Relationship Set Keys
- Is cust-id, access-date, account number a
candidate key ? - No. Attribute access-date can be removed from
this set without losing key-ness - In fact, union of primary keys of associated
entities is always a superkey
23Relationship Set Keys
- Is cust-id, account-number a candidate key ?
- Depends
access-date
number
cust-id
has
customer
account
24Relationship Set Keys
- Is cust-id, account-number a candidate key ?
- Depends
access-date
number
cust-id
has
customer
account
- If one-to-one relationship, either cust-id or
account-number sufficient - Since a given customer can only have one account,
she can only participate in one relationship - Ditto account
25Relationship Set Keys
- Is cust-id, account-number a candidate key ?
- Depends
access-date
number
cust-id
has
customer
account
- If one-to-many relationship (as shown),
account-number is a candidate key - A given customer can have many accounts, but at
most one account holder per account allowed
26Relationship Set Keys
- General rule for binary relationships
- one-to-one primary key of either entity set
- one-to-many primary key of the entity set on the
many side - many-to-many union of primary keys of the
associate entity sets - n-ary relationships
- More complicated rules
27Next Data Constraints
- Representing semantic data constraints
- We already saw constraints on relationship
cardinalities
28Participation Constraint
- Given an entity set E, and a relationship R it
participates in - If every entity in E participates in at least one
relationship in R, it is total participation - partial otherwise
29Participation Constraint
access-date
cust-name
number
cust-id
has
customer
account
cust-street
cust-city
balance
Total participation
30Cardinality Constraints
How many relationships can an entity participate
in ?
access-date
number
cust-id
has
customer
account
0..
1..1
Minimum - 0 Maximum no limit
Minimum - 1 Maximum - 1
31Next Recursive Relationships
- Sometimes a relationship associates an entity set
to itself
32Recursive Relationships
emp-name
emp-id
works-for
manager
employee
worker
emp-street
emp-city
- Must be declared with roles
33Next Weak Entity Sets
- An entity set without enough attributes to have a
primary key - E.g. Transaction Entity
- Attributes
- transaction-number, transaction-date,
transaction-amount, transaction-type - transaction-number may not be unique across
accounts
34Weak Entity Sets
- A weak entity set must be associated with an
identifying or owner entity set - Account is the owner entity set for Transaction
35Weak Entity Sets
Still need to be able to distinguish between
different weak entities associated with the same
strong entity
number
trans-date
trans-number
has
account
Transaction
trans-type
balance
trans-amt
36Weak Entity Sets
Discriminator A set of attributes that can be
used for that
number
trans-date
trans-number
has
account
Transaction
trans-type
balance
trans-amt
37Weak Entity Sets
- Primary key
- Primary key of the associated strong entity
discriminator attribute set - For Transaction
- account-number, transaction-number
38Next Specialization
- Consider entity person
- Attributes name, street, city
- Further classification
- customer
- Additional attributes customer-id, credit-rating
- employee
- Additional attributes employee-id, salary
- Note similarities to object-oriented programming
39Specialization Example
40Finally Aggregation
- No relationships between relationships
- E.g. Associate account officers with has
account relationship set
has
customer
account
?
account officer
employee
41Finally Aggregation
- Associate an account officer with each account ?
- What if different customers for the same account
can have different account officers ?
has
customer
account
?
account officer
employee
42Finally Aggregation
has
customer
account
account officer
employee
43More
- Read Chapter 2 for
- Specialization/Aggregation details
- Different types of specializations etc
- Generalization opposite of specialization
- Lower- and higher-level entities
- Attribute inheritance
44E/R Data ModelDesign Issue 1 Entity Sets vs.
Attributes
- An Example Employees can have multiple phones
(b)
(a)
Uses
Employee
vs
Employee
Phone
no
loc
To resolve, determine how phones are used 1. Can
many employees share a phone? (If yes, then
(b)) 2. Can employees have multiple phones?
(if yes, then (b), or (a) with multivalued
attributes) 3. Else (a), perhaps with
composite attributes
Employee
phone
loc
no
45E/R Data ModelDesign Issue 2 Entity Sets vs.
Relationship Sets
An Example How to model bank loans
Loans
Borrows
Customer
Branch
Customer
Loan
vs
(a)
amt
lno
(b)
- To resolve, determine how loans are issued
- 1. Can there be more than one customer per loan?
- If yes, then (a). Otherwise, loan info must be
replicated for each customer (wasteful, potential
update anomalies) - 2. Is loan a noun or a verb?
- Both, but more of a noun to a bank. (hence (a)
probably more appropriate)
46E/R Data ModelDesign Issue 3 N-ary vs Binary
Relationship Sets
An Example Works_At
Ternary
Works_at
Employee
Branch
Choose n-ary when possible! (Avoids
redundancy, update anomalies)
Dept
(Joe, Moody, Acct) Î Works_At
vs
Binary
WAE
WAB
Branch
WA
Employee
WAD
(Joe, w3) Î WAE (Moody, w3) Î WAB (Acct, w3) Î WAD
Dept
47Example Design
- We will model a university database
- Main entities
- Professor
- Projects
- Departments
- Graduate students
- etc
48(No Transcript)
49proj-number
SSN
sponsor
name
project
professor
start
area
budget
rank
SSN
dept-no
name
name
grad
dept
age
office
degree
homepage
50(No Transcript)
51Time ()
Major
52Time ()
Major
And so on
53Summary
- Entity-relationship Model
- Intuitive diagram-based representation of domain
knowledge, data properties etc - Two key concepts
- Entities
- Relationships
- We also looked at
- Relationship cardinalities
- Keys
- Participation Constraints
54Summary
- Details unimportant
- Key idea We can represent many data properties
and constraints conceptually using this - Read Chapter 2
- Assignment will require you to do this anyway !