Title: D2: Modeling and Design of Databases
1(No Transcript)
2D2 Modeling and Design of Databases
- Prof. dr. Bart Kuijpers
- Email bart.kuijpers_at_luc.ac.be
- (best way to reach me)
- Address Limburgs Universitair Centrum,
- Dept. WNI, Universitaire Campus,
- B-3590 Diepenbeek
- My web page http//alpha.luc.ac.be/lucp1265/
- Course web page http//alpha.luc.ac.be/lucp1265/
db.html
3D2 Modeling and Design of Databases
- Part 3
- Basics of the Relational Model
- From ODL to Relational Design
- From E/R to Relational Design
- Converting Subclasses
- Part 4
- Functional Dependencies
- Rules about Functional Dependencies
- Design of Relational Database Schemas
- Weak Entity Sets
- Normal Forms
- Multivalued Dependencies
4D2 Modeling and Design of Databases
- Part 3
- Basics of the Relational Model
- From ODL to Relational Design
- From E/R to Relational Design
- Converting Subclasses
5The Relational Data Model
- OO and E/R approaches to data modeling are useful
to describe the structure of the data - Todays implementations are based on the
relational model (RM) - single data modeling concept the relation
(2-dim. table) - supports SQL (structured query language)
- Translation of ODL and E/R into RM
- Design theory of its own (normalization based on
functional dependencies)
6Basics of the Relational Model
- One way to represent data a 2-dimensional table
called relation. - Example recall the ODL class Movie (simple
version). - interface Movie
- attribute string title
- attribute integer year
- attribute integer length
- attribute enum Film color,blackAndWhite
filmType -
7Basics of the RM attributes
- Table relation.
- Column headers attributes.
- See attributes in ODL or E/R.
8Basics of the RM schema
- Relation schema name(attributes) other
structure info., e.g., keys, other constraints. - E.g. Movie(title, year, length, filmType)
- Order of attributes is arbitrary, but in practice
we need to assume the (standard) order given in
the relation schema. - Relational database schema collection of
relation schemas.
9Basics of the RM tuples - instance
- Row of table tuple
- Also written as (Casablanca,1942,102,
blackAndWhite) - tuple has a component for each attribute in
schema - Objects have identity, tuples not (?no doubles!
additional ID needed) - Relation instance is current set of rows for a
relation schema.
10Basics of the RM domains
- Each component in a tuple is atomic (first
normal form1NF). - E.g., integer, string and not record, structure,
array, - ODL attributes such as
- Struct Addrstring street, string city address
- cannot be translated directly into RM.
- With each attribute a domain (elementary type)
of atomic values is associated. - Example
- with title string is associated
- with year integer is associated
11Basics of RM equivalent representations
- Schemas are sets of attributes (not lists).
- Tuples are sets of components (not lists).
- Instances are sets of tuples (not lists)
- ? After permutation of rows and columns the
relations remains the same! (permute values and
attributes)
12Basics of RM equivalent representations
- Formal notion of a tuple
- a function attributes ? values
- title ? Star Wars
- year ? 1977
- length ? 121
- filmType ? color
- (Star Wars, 1977, 121,color) and
- (1977, 121,color, Star Wars) are the same object.
13Relational Data Model summary
Relation as table Rows tuples Columns
components Names of columns attributes Relation
name set of attribute names schema REL
(A1,A2,...,An)
- Set theoretic
- Domain set of values
- like a data type
- Cartesian product (or product)
- D1 ??D2 ??... ? Dn
- n-tuples (V1,V2,...,Vn)
- s.t., V1 ??D1, V2 ??D2,...,Vn ??Dn
- Relationsubset of cartesian product of one or
more domains - FINITE only empty set allowed
- Tuples members of a relation inst.
- Arity number of domains
- Components values in a tuple
- Domains corresp. with attributes
- Cardinality number of tuples
A1 A2 A3 ... An a1 a2 a3 an b1 b2
a3 cn a1 c2 b3 bn . . . x1 v2 d3
wn
Attributes
C a r d i n a l i t y
Tuple
Component
Arity
14Relation Example
Domain of Relation N A T N1 A1 T1 N1
A1 T2 N1 A1 T3 . . . N1 A1 T7 N1 A2 T1 N1
A3 T1 N2 A1 T1
- name address tel
- 5 3 7
- Cardinality of domain
- Domains
- N A T
- N1 A1 T1
- N2 A2 T2
- N3 A3 T3
- N4 T4
- N5 T5
- T6
- T7
Arity 3 Cardinality lt5x3x7 of relation
Attribute
Component
Tuple µ
Domain
15Schema versus Instance
- DB instances change continuously
- (e.g., movies are added, deleted, changed,)
- The schema is stable
- (attributes change almost never)
- A RDB instance is the set of tuples that are
now in the DB - When designing the DB only the schema is
important (the structure of the data/DB) - We only imagine typical instances to help us with
the design - Intentional level schema
- Extensional level instances
16Why Relations?
- Very simple model.
- Often a good match for the way we think about our
data. - Abstract model that underlies SQL, the most
important language in DBMSs today. - But SQL uses bags while the abstract relational
model is set-oriented.
17Relational Design
- Creation of a DB
- Design phase (on paper, which information,
relationships, constraints, ) - Implementation phase (real RDBMS)
- It is easier to start from ODL or E/R and later
convert to RM - RM has only one concept (relation)
- E/R and ODL have complementary concepts and are
more flexible (constraints, )
18Relational Design
- Design in ODL or E/R (schemaconstraints)
- ?
- implementation in a RDBMS
- Simplest approach (not always best) convert each
ODL class or E/R entity set to a relation and
each relationship to a relation. - Class/Entity Set ? Relation
- Relationship ? Relation
19D2 Modeling and Design of Databases
- Part 3
- Basics of the Relational Model
- From ODL to Relational Design
- From E/R to Relational Design
- Converting Subclasses
20From ODL attributes to RM attributes
- Assume all properties of a class are (atomic)
attributes (no relationships, methods) - interface Movie
- attribute string title
- attribute integer year
- attribute integer length
- attribute enum Film color,blackAndWhite
filmType - We create a relation with schema
- Movie(title, year, length, filmType)
- I.e., one class ? one relation
- one attribute ? one attribute
- Atomic attributes, domains, tuples
21Non-atomic attributes of ODL classes
- Attributes in ODL may be complex (Set, Record,
enumeration, List, Bag, Array, ) - Recordmake a new attribute for each entry in the
record (create new names if necessary) - For interface Star
- attribute string name
- attribute Struct Addr string street, string
city address - we create a relation with schema
- Star(name, street, city)
22Non-atomic attributes of ODL classes
- Enumeration alias for initial fragment of
integers (?integers) - Dates ?strings (e.g., 9/9/99)
- Sets per member of the set we create one tuple
- For interface Star
- attribute string name
- attribute SetltStruct Addr string street,
string citygt address - we create a relation with schema
- Star(name, street, city)
23Non-atomic attributes of ODL classes
- Sets
- interface Star
- attribute string name
- attribute SetltStruct Addr string street,
string citygt address - attribute Date birthDate
- Problems
- Non-key values may be repeated!! (redundancy)
- Harrison Ford has an empty set of addresses and
does not appear - Several set-attributes the number of tuples
muliplies
24Non-atomic attributes of ODL classes
- Bags add a count-attribute (in RM we cannot add
duplicates) - Lists add a position-attribute
- Arrays we repeat the attributes fixed length
- interface Star
- attribute string name
- attribute ArrayltStruct Addr string street,
string city,2gt address
25Single-valued relationships
- interface Movie
- attribute string title attribute integer year
- attribute integer length
- attribute enum Film color,blackAndWhite
filmType - relationship SetltStargt stars inverse
StarstarredIn - relationship Studio ownedBy inverse
Studioowns - interface Studio
- attribute string name
- attribute string address
- relationship SetltMoviegt owns inverse
MovieownedBy - First approach we create a relation with schema
- Movie(title, year, length, filmType,StName,StAddr
ess,Stowns) - Studio contains a relationship owns danger of
circular redefinition! - Redundancy.
26Single-valued relationships
- ODL class may contain relationships to other
classes - interface Movie
- attribute string title attribute integer year
- attribute integer length
- attribute enum Film color,blackAndWhite
filmType - relationship SetltStargt stars inverse
StarstarredIn - relationship Studio ownedBy inverse
Studioowns - interface Studio
- attribute string name
- attribute string address
- relationship SetltMoviegt owns inverse
MovieownedBy - We create a relation with schema
- Movie(title,year,length,filmType,StudioName)
- name is a key of Studio
27Multi-valued relationships
- one-many or many-many relationships
- interface Movie
- attribute string title attribute integer year
- attribute integer length
- attribute enum Film color,blackAndWhite
filmType - relationship SetltStargt stars inverse
StarstarredIn - relationship Studio ownedBy inverse
Studioowns - interface Star
- attribute string name
- attribute Struct Addr string street, string
city address - relationship SetltMoviegt starredIn inverse
Moviestars - Find a key to represent each related object
- Create a tuple for each value (see
set-attributes redundancy) - We take the schema
- Movie(title,year,length,filmType,studioName,starNa
me)
28Multi-valued relationships
- Remark
- If class C has multi-valued relationships
R1,R2,,Rk, - then R(C) has
- attributes for attributes of C
- attributes representing keys for single-valued
relationships - attributes representing keys for multi-valued
relationships - suppose an object in C corresponds to ni objects
via Ri, - then for each object in C there are n1 x n2 xx
nk objects in R(C). - (x,y1,z1) (x,y1,z2) (x,y1,z3)
- (x,y2,z1) (x,y2,z2) (x,y2,z3)
29And if there are no keys?
- In ODL we can have two objects with identical
values (objects have an OID). - Invent a new key-attribute OID
30A relationship and its inverse
- Some problems
- ODL?RM each relationship is translated twice.
- In ODL both a necessary we cannot follow a
pointer backwards. - In RM no pointers, but associated key values
- (this is slower in main memory!--indexing)
- When a relationship is many-one represent it
only in the many-relation (avoid redundancy) - Normalization is another solution (see further).
31D2 Modeling and Design of Databases
- Part 3
- Basics of the Relational Model
- From ODL to Relational Design
- From E/R to Relational Design
- Converting Subclasses
32From E/R to Relational Design
ODL
Relational Model
E/R
- Important differences between ODL and E/R
- In E/R relationships are a separate concept,
rather than being embedded as a property of a
class. - In ODL, attributes can be of set-type.
- If set-type in E/R, then do not use attribute but
entity set to model this! - In E/R, relationships may have attributes.
33From Entity Sets to Relations
- An entity set that is not weak, is translated
into a relation with the same name and
attributes. - E.g.,
- Movie(title, year, length, filmType)
- Star(name, address) or Star(name, street, city)
Stars-in
Owns
Star
Movies
Studios
title
length
name
year
address
address
name
filmType
34From Relationships to Relations
- E/R relationships are also translated to
relations - For each entity set involved in R, take key
attribute(s) as part of schema - If the relationship has attributes, add them to
the schema - If an entity set appears more than once in a
relationship, rename its attributes to avoid
doubles and for clarity! - E.g., Owns(title, year, studioName)
Stars-in
Owns
Studios
Star
Movies
title
length
name
year
address
address
name
filmType
35From Relationships to Relations
- If an entity set appears more than once in a
relationship, rename its attributes to avoid
doubles! name - E.g., Contracts(title, year,starName,
- studioOfStar,producingStudio)
Studio
Producing studio
Contracts
Studio of star
Star
Movies
36Handling weak entity sets
- If there is a weak entity sets W we do the
following - differently
- Attributes of W plus key attributes of other
entity sets that contribute to the key of W
(double-diamond many-one). - Any relationship in which W appears must use as a
key for W all of its attributes including those
of the other entity sets that contribute to Ws
key - Double-diamond relationships from W to another
entity set do not need to be converted - (this information is already in the relation for
W). -
37Handling weak entity sets
name
number
address
Unit-of
Studios
Crews
- Studios(name, address)
- Crews(number,studioName)
- Unit-of(number,studioName,name)?
Unit-of(number,name) - are the same (many-one!)
- (Disney crew 3, Disney) ? (3, Disney, Disney)
38Handling weak entity sets
name
number
address
Unit-of
Studios
Crews
- Studios(name, address)
- Crews(number,studioName)
- Unit-of(number,studioName,name)?
Unit-of(number,name) - are the same (many-one!)
- (Disney crew 3, Disney) ? (3, Disney, Disney)
39Handling weak entity sets
- Contracts(starName, studioName, title, year,
salary) - Relations for Movie-of ,
- Star-of and
- Studio-of are superfluous
address
name
Studios
Studio of star
Star-of
Movie-of
Movies
Stars
Contracts
length
name
salary
title
year
address
filmType
40D2 Modeling and Design of Databases
- Part 3
- Basics of the Relational Model
- From ODL to Relational Design
- From E/R to Relational Design
- Converting Subclasses
41Subclass Structures to Relations
- Differences between E/R and ODL
- In ODL an object belongs to exactly one class.
It inherits properties from superclass. - In E/R an object may belong to several entity
sets related by isa.
42From ODL Subclass to Relations
- Every subclass its own relation
- It has all properties of this subclass including
inherited properties - E.g.,
- interface MurderMysteryMovieattribute string
weapon - interface CartoonMovierelationship SetltStargt
voices - Movie(title,year,length,filmType,studioName,starNa
me) - Cartoon(title,year,length,filmType,StudioName,star
Name, - voice)
- MurderMystery(title,year,length,filmType,StudioNam
e,starName,weapon) - Cartoon-MurderMystery(title,year,length,filmType,
- StudioName,starName,voice, weapon)
43Representing isa in the RM
- A hierarchy is populated by entities related by
isas. - No relation is created for the isa relationship.
- For each entity set, a separate relation with its
own attributes plus key attributes of related
attribute sets. - E.g.,
Movies
year
isa
title
Roger Rabbit
isa
length
filmType
Voices
Cartoons
Murder- Mysteries
weapon
44Representing isa in the RM
- Movie(title,year,length,filmType)
- Cartoon(title,year)
- MurderMysteries(title,year,weapon)
- Voices(title,year,starName)
- Cartoon-MurderMysteries
Scattered information!!
Movies
year
isa
title
Roger Rabbit
isa
length
filmType
Voices
Cartoons
Murder- Mysteries
weapon
45Representing isa in the RM
- Movie(title,year,length,filmType)
- Cartoon(title,year)
- MurderMysteries(title,year,weapon)
- Voices(title,year,starName)
- Cartoon-MurderMysteries
in voices!?
Movies
year
isa
title
Roger Rabbit
isa
length
filmType
Voices
Cartoons
Murder- Mysteries
weapon
46Subclass Structures to Relations
- Differences between E/R and ODL
- In ODL all properties of an object together in
one relation we have to search 4 relations to
find a movie object! - Movie(title,year,length,filmType,studioName,starNa
me) - Cartoon(title,year,length,filmType,StudioName,
- starName,voice)
- MurderMystery(title,year,length,filmType,StudioNam
e,starName,weapon) - Cartoon-MurderMystery(title,year,length,filmType,
- StudioName,starName,voice, weapon)
47Subclass Structures to Relations
- Differences between E/R and ODL
- In ODL all properties of an object together in
one relation (? we have to search 4 relations to
find a movie object!) - In E/R a key of an entity is repeated once for
every entity set and relationship it belongs to
(?scattered information).
48Using NULL values to combine relations
- ODL Information of a hierarchy can be given in
one relation - NULL there is no appropriate value for
this attribute - E.g.,
- Movie(title,year,length,filmType,StudioName,
- starName,voice, weapon)
49RM Design Exercise
- Design a database for a bank (bank, customers and
accounts) - A bank has a name, address.
- A customer has a name, address, phone, SSNr.
- An account has a number, a type (current,
saving), a balance. - Record which banks have which customers
- Record which customers have which accounts
- Add a bank director (can be customer)
- Convert from ODL to RM
- Convert from E/R to RM.
50Summary of Part 3
- Relations are tables representing information.
- Columns are headed by attributes each attribute
has a domain. - Rows are called tuples that have components.
- Schema relation name attribute names.
- Converting ODL to RM.
- Attribute for each attribute
- Key-attributes for relationships
- Many-many relationships (explosion of tuples)
- Converting E/R to RM (scattered information weak
entity sets) - Converting subclasses to relations.