Title: Review of the EntityRelationship Model
1Review of the Entity-Relationship Model
- Slides courtesy of Amol Deshpande
- material from ch. 2 of
- Korth Silberschatz Database System Concepts,
2Data Modeling
- Goals
- Conceptual representation of the data
- Reality meets bits and bytes
- Must make sense, and be usable by other people
- Review
- Entity-relationship Model
- Relational Model
3Motivation
- Youve just been hired by Bank of America as
their DBA for their online banking web site. - You are asked to create a database that monitors
- customers
- accounts
- loans
- branches
- transactions,
- Now what??!!!
4 Database Design Steps
Entity-relationship Model Typically used
for conceptual database design
Conceptual Data Model
Logical Data Model
Relational Model Typically used for
logical database design
Physical Data Model
5Entity-Relationship Model
- Two key concepts
- Entities
- An object that exists and is distinguishable from
other objects - Examples Bob Smith, BofA, CMSC424
- Have attributes (people have names and addresses)
- Form entity sets with other entities of the same
type that share the same properties - Set of all people, set of all classes
- Entity sets may overlap
- Customers and Employees
6Entity-Relationship Model
- Two key concepts
- Relationships
- Relate 2 or more entities
- E.g. Bob Smith has account at College Park Branch
- Form relationship sets with other relationships
of the same type that share the same properties - Customers have accounts at Branches
- Can have attributes
- has account at may have an attribute start-date
- Can involve more than 2 entities
- Employee works at Branch at Job
7ER Diagram Starting Example
- Rectangles entity sets
- Diamonds relationship sets
- Ellipses attributes
8Review Roadmap
- Details of the ER Model
- How to represent various types of
constraints/semantic information etc. - Design issues
- A detailed example
9Relationship Cardinalities
- We may know
- One customer can only open one account
- OR
- One customer can open multiple accounts
- Representing this is important
- Why ?
- Better manipulation of data
- Can enforce such a constraint
- Remember If not represented in conceptual model,
the domain knowledge may be lost
10Mapping Cardinalities
- Express the number of entities to which another
entity can be associated via a relationship set - Most useful in describing binary relationship sets
11Mapping Cardinalities
- One-to-One
- One-to-Many
- Many-to-One
- Many-to-Many
has
customer
account
has
customer
account
has
customer
account
has
customer
account
12Types of Attributes
- Simple vs Composite
- Single value per attribute ?
- Single-valued vs Multi-valued
- E.g. Phone numbers are multi-valued
- Derived
- If date-of-birth is present, age can be derived
- Can help in avoiding redundancy, enforcing
constraints etc
13Types of Attributes
14Types of Attributes
- multi-valued (double ellipse)
- derived (dashed ellipse)
age
cust-name
cust-id
customer
date-of-birth
cust-street
phone no.
cust-city
15Types of Attributes
age
cust-name
cust-id
customer
date-of-birth
cust-street
phone no.
cust-city
month
day
year
Composite Attribute
16Next Keys
- Key set of attributes identifying individual
entities or relationships
17Entity Keys
Possible Keys cust-id cust-name,
cust-city, cust-street cust-id, age
cust-name ?? Probably not. Domain knowledge
dependent !!
date-of-birth
cust-name
cust-id
customer
age
cust-street
phone no.
cust-city
18Entity Keys
- Superkey
- any attribute set that can distinguish entities
- Candidate key
- a minimal superkey
- Cant remove any attribute and preserve key-ness
- cust-id, age not a superkey
- cust-name, cust-city, cust-street is
- assuming cust-name is not unique
- Primary key
- Candidate key chosen as the key by DBA
- Underlined in the ER Diagram
19Entity Keys
- cust-id is a natural primary key
- Typically, SSN forms a good primary key
- Try to use a candidate key that rarely changes
- e.g. something involving address not a great idea
date-of-birth
cust-name
cust-id
customer
age
cust-street
phone no.
cust-city
20Relationship Set Keys
- What attributes are needed to represent a
relationship completely and uniquely ? - Union of primary keys of the entities involved,
and relationship attributes - cust-id, access-date, account number describes
a relationship completely
21Relationship Set Keys
- Is cust-id, access-date, account number a
candidate key ? - No. Attribute access-date can be removed from
this set without losing key-ness
22Relationship Set Keys
- Is cust-id, account-number a candidate key ?
- Depends
access-date
number
cust-id
has
customer
account
23Relationship Set Keys
- Is cust-id, account-number a candidate key ?
- Depends
access-date
number
cust-id
has
customer
account
If one-to-one relationship, either cust-id or
account-number sufficient Since a given
customer can only have one account, she can only
participate in one relationship Ditto account
24Relationship Set Keys
- Is cust-id, account-number a candidate key ?
- Depends
access-date
number
cust-id
has
customer
account
If one-to-many relationship (as shown),
account-number is a candidate key A given
customer can have many accounts, but at most one
account holder per account allowed
25Relationship Set Keys
- General rule for binary relationships
- one-to-one primary key of either entity set
- one-to-many primary key of the entity set on the
many side - many-to-many union of primary keys of the
associate entity sets - n-ary relationships
- More complicated rules
26Data Constraints
- Representing semantic data constraints
- We already saw constraints on relationship
cardinalities
27Participation Constraint
- Given an entity set E, and a relationship R it
participates in - If every entity in E participates in at least one
relationship in R, it is total participation - partial otherwise
28Participation Constraint
access-date
cust-name
number
cust-id
has
customer
account
cust-street
cust-city
balance
Total participation
29Cardinality Constraints
How many relationships can an entity participate
in ?
access-date
number
cust-id
has
customer
account
0..
1..1
Minimum - 0 Maximum no limit
Minimum - 1 Maximum - 1
30Recursive Relationships
- Sometimes a relationship associates an entity set
to itself
31Recursive Relationships
emp-name
emp-id
works-for
manager
employee
worker
emp-street
emp-city
- Must be declared with roles
32Weak Entity Sets
- An entity set without enough attributes to have a
primary key - E.g. Transaction Entity
- Attributes
- transaction-number, transaction-date,
transaction-amount, transaction-type - transaction-number may not be unique across
accounts
33Weak Entity Sets
- A weak entity set must be associated with an
identifying or owner entity set - Account is the owner entity set for Transaction
34Weak Entity Sets
Still need to be able to distinguish between
different weak entities associated with the same
strong entity
number
trans-date
trans-number
has
account
Transaction
trans-type
balance
trans-amt
35Weak Entity Sets
Discriminator A set of attributes that can be
used for that
number
trans-date
trans-number
has
account
Transaction
trans-type
balance
trans-amt
36Weak Entity Sets
- Primary key
- Primary key of the associated strong entity
discriminator attribute set - For Transaction
- account-number, transaction-number
37Specialization
- Consider entity person
- Attributes name, street, city
- Further classification
- customer
- Additional attributes customer-id, credit-rating
- employee
- Additional attributes employee-id, salary
- Note similarities to object-oriented programming
38Specialization Example
39Aggregation
- No relationships between relationships
- E.g. Associate account officers with has
account relationship set
has
customer
account
?
account officer
employee
40Aggregation
- Associate an account officer with each account ?
- What if different customers for the same account
can have different account officers ?
has
customer
account
?
account officer
employee
41Aggregation
has
customer
account
account officer
employee
42More
- Read Chapter 2 for
- Specialization/Aggregation details
- Different types of specializations etc
- Generalization opposite of specialization
- Lower- and higher-level entities
- Attribute inheritance
43E/R Data ModelDesign Issue 1 Entity Sets vs.
Attributes
- An Example Employees can have multiple phones
(b)
(a)
Uses
Employee
vs
Employee
Phone
no
loc
To resolve, determine how phones are used 1. Can
many employees share a phone? (If yes, then
(b)) 2. Can employees have multiple phones?
(if yes, then (b), or (a) with multivalued
attributes) 3. Else (a), perhaps with
composite attributes
44E/R Data ModelDesign Issue 2 Entity Sets vs.
Relationship Sets
An Example How to model bank loans
Loans
Borrows
Customer
Branch
Customer
Loan
vs
(a)
amt
lno
(b)
- To resolve, determine how loans are issued
- 1. Can there be more than one customer per loan?
- If yes, then (a). Otherwise, loan info must be
replicated for each customer (wasteful, potential
update anomalies) - 2. Is loan a noun or a verb?
- Both, but more of a noun to a bank. (hence (a)
probably more appropriate)
45E/R Data ModelDesign Issue 3 N-ary vs Binary
Relationship Sets
An Example Works_At
Ternary
Works_at
Employee
Branch
Choose n-ary when possible! (Avoids
redundancy, update anomalies)
Dept
(Joe, Moody, Acct) Î Works_At
vs
Binary
WAE
WAB
Branch
WA
Employee
WAD
(Joe, w3) Î WAE (Moody, w3) Î WAB (Acct, w3) Î WAD
Dept
46Example Design
- We will model a university database
- Main entities
- Professor
- Projects
- Departments
- Graduate students
- etc
47(No Transcript)
48proj-number
SSN
sponsor
name
project
professor
start
area
budget
rank
SSN
dept-no
name
name
grad
dept
age
office
degree
homepage
49(No Transcript)
50Time ()
Major
51Time ()
Major
And so on
52Summary
- Entity-relationship Model
- Intuitive diagram-based representation of domain
knowledge, data properties etc - Two key concepts
- Entities
- Relationships
- Additional Details
- Relationship cardinalities
- Keys
- Participation Constraints
53 Database Design Steps
Entity-relationship Model Typically used
for conceptual database design
Conceptual Data Model
Logical Data Model
Relational Model Typically used for
logical database design
Physical Data Model
54Review Entity-Relationship Model
Entity set
E1
R
Relationship set
Attribute (primary key if underlined)
a
55Thoughts
- Nothing about actual data
- How is it stored ?
- No talk about the query languages
- How do we access the data ?
- Semantic vs Syntactic Data Models
- Remember E/R Model is used for conceptual
modeling - Many conceptual models have the same properties
- They are much more about representing the
knowledge than about database storage/querying
56Thoughts
- Basic design principles
- Faithful
- Must make sense
- Satisfies the application requirements
- Models the requisite domain knowledge
- If not modeled, lost afterwards
- Avoid redundancy
- Potential for inconsistencies
- Go for simplicity
- Typically an iterative process that goes back and
forth
57Relational Data Model
Introduced by Ted Codd (late 60s early 70s)
- Before Network Data Model (Cobol as DDL,
DML) - Very contentious Database Wars (Charlie
Bachman vs. Mike Stonebraker)
Relational data model contributes
- Separation of logical, physical data models (data
independence) - Declarative query languages
- Formal semantics
- Query optimization (key to commercial success)
58Key Abstraction Relation
Account
Terms
Why called Relations?
59Why Called Relations?
Mathematical relations
- Given sets R 1, 2, 3, S 3, 4
- R ? S (1, 3), (1, 4), (2, 3), (2, 4), (3, 3),
(3, 4) - A relation on R, S is any subset (?) of R ? S
(e.g (1, 4), (3, 4))
Account Í Branches ? Accounts ? Balances
(Downtown, A-101, 500), (Brighton, A-201,
900), (Brighton, A-217, 500)
60Relations
Account
Relational database semantics defined in terms of
mathematical relations
61Relations
Account
Terms
- Schema (e.g. Acct_Schema (bname, acct_no,
balance))
62Definitions
- Relation Schema (or Schema)
- A list of attributes and their domains
- We will require the domains to be atomic
- E.g. account(account-number, branch-name,
balance) - Relation Instance
- A particular instantiation of a relation
with actual values - Will change with time
Programming language equivalent A variable (e.g.
x)
Programming language equivalent Value of a
variable
63Rest of the Class
- Converting from an E/R diagram to a relational
schema - Remember We still use E/R models for conceptual
modeling of the database - Relational Algebra
- Data retrieval language
64E/R Diagrams ? Relations
- Convert entity sets into a relational schema with
the same set of attributes
Customer
Customer_Schema(cname, ccity, cstreet)
cname
ccity
cstreet
assets
bcity
bname
Branch_Schema(bname, bcity, assets)
Branch
65E/R Diagrams ? Relations
- Convert relationship sets also into a relational
schema - Remember A relationship is completely described
by primary keys of associate entities and its own
attributes
balance
acct-no
Account_Schema(acct-no, balance)
Account
access-date
Depositor_Schema(cname, acct-no, access-date)
Depositor
Customer
Customer_Schema(cname, ccity, cstreet)
Well Not quite. We can do better. It depends on
the relationship cardinality
ccity
cname
cstreet
66E/R Diagrams ? Relations
- Say One-to-Many Relationship from Customer to
Account - ? Many accounts per customer
balance
acct-no
Account_Schema(acct-no, balance, cname,
access-date)
Account
access-date
Depositor
Customer
Customer_Schema(cname, ccity, cstreet)
ccity
cname
Exactly same information, fewer tables
cstreet
67E/R Diagrams ? Relations
E1
a1
an
68E/R Diagrams ? Relations
E1
a1
an
Not the whole story for Relationship Sets
69E/R Diagrams ? Relations
R
70E/R Diagrams ? Relations
R
R
71E/R Diagrams ? Relations
R
R
R
72E/R Diagrams ? Relations
R
R
R
R
73Translating E/R Diagrams to Relations
assets
bcity
balance
bname
acct_no
Acct-Branch
Account
Branch
Loan-Branch
Depositor
Borrower
Customer
Loan
ccity
cname
lno
amt
cstreet
Q. How many tables does this get translated into?
A. 6 (account, branch, customer, loan, depositor,
borrower)
74Bank Database
75Bank Database
76E/R Diagrams Relations
77E/R Diagrams Relations
Emp
Emp-Phones
78E/R Diagrams Relations
79E/R Diagrams Relations
80E/R Diagrams Relations
- Subclasses example
- Method 1
- Account (acct_no, balance)
- SAccount (acct_no, interest)
- CAccount (acct_no, overdraft)
- Method 2
- SAccount (acct_no, balance, interest)
- CAccount (acct_no, balance, overdraft)
Q When is method 2 not possible?
A When subclassing is partial
81Keys and Relations
As in the E/R Model
- 1. Superkeys
- set of attributes of table for which every row
has distinct set of values - 2. Candidate keys
- minimal superkeys
- 3. Primary keys
- DBA-chosen candidate keys
e.g., Branch (bname, bcity, assets) Þ