Title: DATA MODELLING CONCEPTS
1COMP 332PRINCIPLES OFDATABASE DESIGN
2DATA MODELLING CONCEPTS OUTLINE
- Data Modelling
- Models and Databases
- Abstraction
- Entity-relationship Model
- Entities
- Attributes
- Generalization
- Relationships
- Constraints
- Aggregation
3WHY DO WE BUILD MODELS? BCN 2
- Test a physical entity before building it
- Why is this important?
- Communicate with customers
- Why is this important?
- Visualize the final product
- Why is this important?
- Reduce the complexity of a problem
- Why is this important?
4DATA MODELLING FOR A DATABASE
Disk bits, bytes, etc.
Files records, fields, etc.
LogicalModel relations, attributes
concrete
abstract
- need to consider two aspects of modelling
- data organization data access
5DATA MODELS AND DATA MODELLING BCN 2.3
- model an abstraction that hides unnecessary
details, emphasizes important details - data model an abstraction of data
- data modelling the process of describing some
part of the world using a data model
- A data model consists of
- data structure types specifies data
organization - integrity constraints restricts valid data
organization - operations specifies data access
A data model is a tool used to specify
theconceptual / logical organization of data
6ABSTRACTION IN DATA MODELLING BCN 2.1
- mental process of selecting some characteristics
and properties and excluding others that are not
relevant
- name, address, sex, hobbies, ...
- abstraction is always for some purpose
- many different abstractions of the same thing are
possible - all abstractions are incomplete descriptions of
reality
- We do not need completeness, just modelling
adequacy!
7QUESTION?
If you had to describe a person as an employee of
a company, what is the most relevant information
would you include?
- information about trips taken.
- information about hobbies.
- information about boy/girl friends.
- salary information.
- telephone billing information.
- information about grades.
8QUESTION?
If you had to describe a person as a student of
HKUST, what is the most relevant information
would you include?
- information about trips taken.
- information about hobbies.
- information about boy/girl friends.
- salary information.
- telephone billing information.
- information about grades.
9TYPES OF ABSTRACTIONS
- classification group similar instances of things
IS_MEMBER_OF
IS_MEMBER_OF
Michael Chang
Martina Hingas
San San
Anson Chan
Pete Sampras
- pick out common properties and ignore unique
properties
10TYPES OF ABSTRACTIONS
- aggregation group dissimilar sets of things
IS_PART_OF
account
type
balance
branch
- ignore differences among the parts and
concentrateon the fact that they form the whole
11TYPES OF ABSTRACTIONS
- generalization group similar sets of things
- superclass, subclass / supertype, subtype
Vehicle
IS_A
Truck
Bus
Van
Car
- note difference between classification and
generalization - classification applied to individual instances
of things - generalization applied to sets of things
12TYPES OF ABSTRACTIONS (contd)
- Note we can combine different types of
abstraction when modeling
student
generalization
undergrad
classification
John
. . .
Sam
Tina
aggregation
. . .
name
student id
13ER MODEL ENTITY INSTANCE 2.1.1
- a concept, abstraction, or thing with crisp
boundaries and meaning for the problem at hand
- Has identity existence is not dependent on its
properties - (does a persons name give the person
existence?) - Entity instance purpose
- promote understanding of the real world
- provide a practical basis for implementation
- How to identify/select entity instances?
- There is often no one correct representation
for a given problem! - The nature of the problem and experience
important in selecting.
14ER MODEL ENTITY (TYPE) 2.1.1
- a description of a set of entity instances that
have common - properties relationships - semantics
. . .
Beethoven
Brahms
Bach
15ER MODEL ENTITY (contd)
- often appear as nouns in problem descriptions
- instances of an entity share a common semantic
purpose
Why entities?
How are entities, entity instances and
abstraction related?
16ER MODEL ATTRIBUTE
a description of data values held by an entity
- name unique within an entity
- type the domain from which an attribute can take
its values - continuous ? discrete
Employee
hkid
hkid name sex age entity
A071983
male
22
instance
John
- identifier(key) versus descriptor attributes
How are entities, attributes and abstraction
related?
17ER MODEL NOTATION
complexattribute
hkid
name
skill
multi-valuedattribute
Note data can often have a duality of
representation entity ltgt attribute
Alternate notation Employee (hkid, name,
address, skill) BUT loses some semantics!
18ER MODEL GENERALIZATION
- a relationship between an entity and one or more
refined versions
generalization
Musicalinstrument
instrument-type
Windinstrument
Stringinstrument
Percussioninstrument
size
discriminator
Violin
Cello
Viola
specialization
discriminator an attribute of enumeration type
that indicates which property of an entity is
being abstracted by a generalization
19ER MODEL INHERITANCE
the assumption of properties by a subtype from
its supertype
- promotes reusability, reduces redundancy
- only define properties in one place
- subtype may add new properties (attributes)
- single inheritance versus multiple inheritance
- inheritance conflicts may arise
20ER MODEL INHERITANCE
- conflict resolution by
- user-defined order
- predefined order
- redefine names
amount
account
service-charge
interest-rate
statement-fee
statement-fee
Checking Savings
service-charge
statement-fee
interest-rate
21MULTIPLE INHERITANCE EXAMPLE
id
University member
U
U
id
id
Student
Employee
id
id
id
id
U
U
id ?
? id
TA
22MULTIPLE INHERITANCE EXAMPLE
id
id
Student
Employee
id
id
id
id
U
U
id ?
? id
TA
23MULTIPLE INHERITANCE EXAMPLE
Vehicle
U
U
weight
displacement
size
size
Land vehicle
Water vehicle
U
U
U
U
Car
Amphibious vehicle
Boat
name conflict
24ER MODEL RELATIONSHIP INSTANCE 2.1.1
a physical or conceptual connection between
entity instances
Teaches
Works-for
25ER MODEL RELATIONSHIP (TYPE)
- a description of a set of relationships with
common semantics
Teaches
Course
Person
ER diagram
- often appear as verbs in
- problem descriptions
- conceptually, relationships are inherently
bi-directional
- a relationship is not the same as a pointer!
26ER MODEL RELATIONSHIP (contd)
- there can be several relationships between the
same two entities
Person
Works-for
Attended
Works-for
Attended
University
- or even with the same entity
Person
Manages
Married-to
How are relationships, relationship instancesand
abstraction related?
27RELATIONSHIP DEGREE
- unary (reflexive)relates an entity to itself
- binaryrelates two entities
- ternaryrelates three entities
More on this later!
- n-ary - relates any number of entities
- In practice, the vast majority of relationships
are binary
28RELATIONSHIP RELATIONSHIP ATTRIBUTE
What about an attribute like grade?
A
B
A-
C
A
B
29RELATIONSHIP RELATIONSHIP ATTRIBUTE (contd)
- Possibility 1 use many attributes for grade
(a) in Student
(b) in Course
30RELATIONSHIP RELATIONSHIP ATTRIBUTE (contd)
- Possibility 2 use a multi-valued attribute for
grade
(a) in Student
(b) in Course
31RELATIONSHIP RELATIONSHIP ATTRIBUTE (contd)
- Possibility 3 use a relationship attribute
32RELATIONSHIP RELATIONSHIP ATTRIBUTE
- When to associate an attribute with a
relationship?
?
salary
name
hkid
address
name
address
- Usually needed only for many to many
relationships!
33RELATIONSHIP ROLE NAME
- role one end of a relationship
employee
employer
- for binary relationships there are two roles
- often appear as nouns in problem descriptions
34RELATIONSHIP ROLE NAME
role one end of a relationship
Boss
Worker
Boss
- It is necessary to use role names when a
relationshiprelates instances from the same
entity
35DATA MODEL CONCEPTS AND INSTANCES
ExtentionalInstance(s), Data
IntentionalDefinition, Meta-data
36DATA MODEL (INTEGRITY) CONSTRAINTS
- constraint a logical restriction or property of
data that
- for any set of data values, we can determine
whether the constraint is true or false - we expect to be true always
- we can enforce
- adds additional semantics (meaning) to data
- inherent part of the data structure type
definition of the data model - e.g., every attribute must be atomic
(single-valued) - explicit specified independently on data
structure types or a property of the data - e.g., enrollment quota for a course
37DATA MODEL (INTEGRITY) CONSTRAINTS
- on attributes salary is between 0 and 100,000
- on entities every manager is also an employee
- on relationships every employee works in at most
one department
Manager
Employee
38ATTRIBUTE CONSTRAINTS
set of positive integers from 0 to 9999
set of character strings of length 20 or less
. . .
set of positive integers up to 999 999
. . .
hkid
name
salary
address
employee-id
job-class
Employee
39ATTRIBUTE CONSTRAINTS
X
employee-id salary
Suppose we are told that no two employees have
the same employee-id
40FUNCTIONAL DEPENDENCY (FD)
- Let X, Y be sets of attributes and f a
time-varying function from X to Y. - fX Y
- is a functional dependency if at any point in
time, for a given value of x in X there will be
at most one value of y in Y - X is called the determinant of the FD
41FUNCTIONAL DEPENDENCY (FD) (contd)
- X and Y can be sets of attributes X ºX1UX2UUXn
X1X2Xn
item
quantity
employee-id
customer-id
item
date
date
Order
customer-id, item, date quantity
- If any Xi is not needed to determine Y, then Y is
only partially dependent on X - employee-id, name salary
otherwise, Y is fully dependent on X
42APPLYING FDs TO DATA MODELLING
Given
X
Y
Z
X
X Y X Z
E1
- Value of X unique in E1 gt X is a candidate key
of E1 - (a minimal set of attributes that uniquely
identifies an entity instance) - Usually choose one candidate key as the primary
key
- Does every entity have a primary key?
43APPLYING FDs TO DATA MODELLING (contd)
discriminator attribute(s) that uniquely
identifies an instance given a key value for the
strong entity
employee-id
employee-id
name
name
name
address
Has
Dependent
Employee
hkid
age
salary
job-class
- internal key only attributes of the entity
itself are required to uniquely identify an
instance of the entity - strong entity
- external key attributes of one or more related
entities are needed to uniquely identify an
instance of the entity - weak entity
44APPLYING FDs TO DATA MODELLING (contd)
- Forming entities
- Suppose we just start with FDs
X U X V X Y
W Z W T
X
U
V
Y
W
Z
T
E1
E2
45ENTITY CONSTRAINTS GENERALIZATION COVERAGE
2.2.1
overlapping
disjoint
46ENTITY CONSTRAINTS GENERALIZATION COVERAGE
(contd)
complete (total)
incomplete (partial)
47ENTITY CONSTRAINTS GENERALIZATION COVERAGE
(contd)
overlapping, incomplete
overlapping, complete
48ENTITY CONSTRAINTS GENERALIZATION COVERAGE
(contd)
disjoint, complete
disjoint, incomplete
49RELATIONSHIP CONSTRAINTS CONNECTIVITY 2.1.3
- specifies the number of instances of a
relationship in which an entity may participate
(relationship cardinality)
Teaches
(0,1)
(1,1)
Instructor
Course
- For a given course, how many instructors can
teach it? - assume each course can be taught by one and only
one instructor
- For a given instructor, how many courses can he
teach? - assume an instructor does not have to teach any
course and may teach at most one course in a
given semester
50RELATIONSHIP CONSTRAINTS CONNECTIVITY
(c,d)
(a,b)
E1
E2
R
max-card(E1,R)
min-card(E1,R)
- minimum cardinality (min-card)
- min-card(E1,R) minimum number of relationships
in which each entity of E1 can participate in R
min-card(E1,R) 0 optional participation min-ca
rd(E1,R) gt 0 mandatory participation
- maximum cardinality (max-card)
- max-card(E1,R) maximum number of relationships
in which each entity of E1 can participate in R
51RELATIONSHIP CONSTRAINTS CONNECTIVITY
- special cardinalities
- max-card N an unlimited upper bound (?)
- max-card(E1,R) 1 and max-card(E2,R) 1
- one-to-one relationship (11)
- max-card(E1,R) 1 and max-card(E2,R) N
ormax-card(E1,R) N and max-card(E2,R) 1 - one-to-many relationship (1N)
- max-card(C1,A) N and max-card(C2,A) N
- many to many relationship (NM)
52QUESTION?
?
?
10..40
1..5
- A student must enroll in at least one course and
can enroll in at most five courses
- A course must have at least ten students enrolled
in it and cannot have more than forty students
enrolled in it.
53ER MODEL EXAMPLE RELATIONSHIPS
11 relationship
(0,1)
(1,1)
1
1
optional
mandatory
Capital-of
54ER MODEL EXAMPLE RELATIONSHIPS
1N relationship
(1,N)
(0,1)
N
1
mandatory
optional
Assigned-to
55ER MODEL EXAMPLE RELATIONSHIPS
NM relationship
(0,N)
(0,N)
N
M
optional
optional
Works-on
56N-ARY RELATIONSHIP CONNECTIVITY
- defined with respect to the other n-1 ends
- given a ternary relationship among classes (A, B,
C), the connectivity of the C end states how many
C instances may appear in relationship with a
particular pair of (A, B) instances
0..1
? 0 or 1 professor
- for each (course, student)
- a student will not take the same course from more
than one professor, but a student may take more
than one course from one professor and a
professor may teach more than one course
- for each (student, professor)
? many courses
- for each (course, professor)
? many students
57QUESTION?
0..3
0..1
? 0, 1, 2 or 3 days
? 0 or 1 classrooms
? many courses
58N-ARY RELATIONSHIP CONNECTIVITY CONSTRAINTS
- connectivity (max-card) can be expressed as a
functional dependency
FDs emp dept dept emp
FDs emp dept
59TERNARY RELATIONSHIP CONNECTIVITY
- 1 if only one instance of an entity can be
related with one instance of each of the other
two associated entities - e.g., if E2, E3 E1, then max-card E1 is 1
N if more than one instance of an entity can be
related with one instance of each of the other
two associated entities
60TERNARY RELATIONSHIP CONNECTIVITY (contd)
not allowed since for a given project, P, and
notebook, A, there is more than one technician
61TERNARY RELATIONSHIP CONNECTIVITY (contd)
not allowed since the same employee, X, works on
the same project, N, at different locations
62TERNARY RELATIONSHIP CONNECTIVITY (contd)
not allowed since the same engineer and project
combination, (N, Y), have two managers
63TERNARY RELATIONSHIP CONNECTIVITY (contd)
all combinations are allowed
64CONNECTIVITY CONSTRAINTS WEAK ENTITY 2.1
65CONNECTIVITY CONSTRAINTS WEAK ENTITY (contd)
- a weak entity may be related to more than one
strong entity, but may depend on only some of
these for its identity
Employee
1
How to show which entity it is dependent on?
Has
Has
N
Attends
Dependent
N
1
School
66AGGREGATION/COMPOSITION RELATIONSHIP
- a special type of relationship in which there is
a part-of relationship between one entity and
another entity
- existence of a component may depend on the
existence of the aggregate entity of which it is
a part mandatory existence (called composition)
Course-offering
Course
A
- a component may also exist independent of the
aggregate entity of which it is a part optional
existence(called aggregation)
Disk
Computer
A
67AGGREGATION/COMPOSITION RELATIONSHIP
- sometimes it is not clear whether the parts
should be related with composition or aggregation
Car
- Which aggregation relationship to use is a matter
of the database designers interpretation of
reality and/or the requirements of the application
68WHEN TO USE AGGREGATION/COMPOSITION?
- Would you use the phrase part of to describe
the relationship or name it Has? (But be
CAREFUL! Not all Has relationships are
aggregationsgt) - Are operations on the whole automatically applied
to the part(s) composition? - Are some attribute values propagated from the
whole to all or some of the parts? - Is there an intrinsic asymmetry to the
relationship where one entity is subordinate to
the other(s)?
- It is not wrong to use a relationship rather than
aggregation!(When in doubt, use a relationship!)
69EXAMPLE ENTITY-RELATIONSHIP DIAGRAM (ERD)
N
1
Department
Division
A
1
1
Is- managed-by
Has
N
1
1
boss
Is- occupied-by
Manager-of
1
N
Employee
Office
N
worker
N
1
N
Uses
P
Manages
Works- on
Language
N
M
Project
M
70SUMMARY ER DIAGRAM CONSTRUCTS
entity
relationship
Is- managed-by
1
1
Department
Employee
(a) one-to-one
Has
1
N
Department
Employee
(b) one-to-many
Works- on
N
M
Employee
Project
(c) many-to-many
71SUMMARY ER DIAGRAM CONSTRUCTS
Enrolled- in
(10,40)
(1,5)
Student
Course
(d) explicit cardinality
Works- on
N
M
Employee
Project
(e) relationship attribute
time
(f) recursive (with role names)
N
1
Division
Department
(g) aggregation
A
72SUMMARY ER DIAGRAM CONSTRUCTS
(h) ternary relationship
Uses
N
M
Employee
Project
P
Language
(i) generalization