Title: Database Conceptual and Logical Design
1Database Conceptual and Logical Design
- Zachary G. Ives/Grigoris Karvounarakis
- University of Pennsylvania
- CIS 550 Database Information Systems
- October 3, 2007
Some slide content courtesy of Susan Davidson
Raghu Ramakrishnan
2Now How Do We Get the Database in the First
Place?
- Database design theory!
- Neat outcome we can actually prove that we have
optimal design, in a manner of speaking - But first we need to understand how to visualize
in pretty pictures
3Databases AnonymousA 6-Step Program
- Requirements Analysis what data, apps, critical
operations - Conceptual DB Design high-level description of
data and constraints typically using ER model - Logical DB Design conversion into a schema
- Schema Refinement normalization (eliminating
redundancy) - Physical DB Design consider workloads, indexes
and clustering of data - Application/Security Design
4Entity-Relationship Diagram(based on our running
example)
Underlined attributes are keys
PROFESSORS
fid
name
relationship set
Teaches
entity set
semester
Takes
STUDENTS
COURSES
serno
subj
cid
sid
name
exp-grade
attributes (recall these have domains)
5Conceptual Design Process
- What are the entities being represented?
- What are the relationships?
- What info (attributes) do we store about each?
- What keys integrity constraints do we have?
STUDENTS
Takes
name
exp-grade
sid
6Translating Entity Sets toLogical Schemas SQL
DDL
Fairly straightforward to generate a schema
CREATE TABLE STUDENTS (sid INTEGER,
name VARCHAR(15) PRIMARY KEY (sid) )
CREATE TABLE COURSES (serno INTEGER,
subj VARCHAR(30), cid CHAR(15),
PRIMARY KEY (serno) )
7Translating Relationship Sets
- Generate schema with attributes consisting of
- Key(s) of each associated entity (foreign keys)
- Descriptive attributes
CREATE TABLE Takes (sid INTEGER, serno
INTEGER, exp-grade CHAR(1), PRIMARY
KEY (?), FOREIGN KEY (serno) REFERENCES
COURSES, FOREIGN KEY (sid) REFERENCES
STUDENTS)
8 OK, But What about Connectivityin the E-R
Diagram?
- Attributes can only be connected to entities or
relationships - Entities can only be connected via relationships
- As for the edges, lets consider kinds of
relationships and integrity constraints
Teaches
PROFESSORS
COURSES
(warning the book has a slightly different
notation here!)
9Logical Schema Design
- Roughly speaking, each entity set or relationship
set becomes a table (not always be the case see
Monday) - Attributes associated with each entity set or
relationship set become attributes of the
relation the key is also copied (ditto with
foreign keys in a relationship set)
10Binary Relationships Participation
- Binary relationships can be classified as 11,
1Many, or ManyMany, as in
1n
mn
11
111Many (1n) Relationships
- Placing an arrow in the many ? one direction,
i.e. towards the entity thats refd via a
foreign key - Suppose profs teach multiple courses, but may not
have taught yet - Suppose profs must teach to be on the roster
Teaches
PROFESSORS
COURSES
Partial participation (0 or more)
Teaches
PROFESSORS
COURSES
Total participation (1 or more)
12Many-to-Many Relationships
- Many-to-many relationships have no arrows on
edges - The relationship set relation has a key that
includes the foreign keys, plus any other
attributes specified as key
Takes
COURSES
STUDENTS
13Examples
- Suppose courses must be taught to be on the
roster - Suppose students must have enrolled in at least
one course
14Representing 1n Relationships in Tables
CREATE TABLE Teaches( fid INTEGER, serno
CHAR(15), semester CHAR(4), PRIMARY KEY
(serno), FOREIGN KEY (fid) REFERENCES
PROFESSORS, FOREIGN KEY (serno) REFERENCES
COURSES)
CREATE TABLE Teaches_Course( serno INTEGER,
subj VARCHAR(30), cid CHAR(15), fid
CHAR(15), name CHAR(40), PRIMARY KEY
(serno), FOREIGN KEY (fid) REFERENCES
PROFESSORS)
Or embed relationship in many entity set
1511 Relationships
- If you borrow money or have credit, you might
get - What are the table options?
Describes
Borrower
CreditReport
ssn
rid
delinquent?
debt
name
16Roles Labeled Edges
- Sometimes a relationship connects the same
entity, and the entity has more than one role - This often indicates the need for recursive
queries
Includes
qty
Assembly
Subpart
id
Parts
name
17DDL for Role Example
CREATE TABLE Parts (Id INTEGER, Name
CHAR(15), PRIMARY KEY (ID) ) CREATE
TABLE Includes (Assembly INTEGER,
Subpart INTEGER, Qty INTEGER,
PRIMARY KEY (Assemb, Sub), FOREIGN KEY
(Assemb) REFERENCES Parts, FOREIGN KEY (Sub)
REFERENCES Parts)
18Roles vs. Separate Entities
Married
Husband
Wife
id
id
name
name
Married
What is the difference between these two
representations?
Husband
Wife
id
Person
name
19ISA Relationships Subclassing(Structurally)
- Inheritance states that one entity is a special
kind of another entity subclass should be
member of base class
id
People
name
ISA
Employees
salary
20But How Does this Translateinto the Relational
Model?
- Compare these options
- Two tables, disjoint tuples
- Two tables, disjoint attributes
- One table with NULLs
- Object-relational databases
21Weak Entities
- A weak entity can only be identified uniquely
using the primary key of another (owner) entity. - Owner and weak entity sets in a one-to-many
relationship set, 1 owner many weak entities - Weak entity set must have total participation
Feeds
People
Pets
name
weeklyCost
name
species
ssn
22Translating Weak Entity Sets
- Weak entity set and identifying relationship set
are translated into a single table when the
owner entity is deleted, all owned weak entities
must also be deleted
CREATE TABLE Feed_Pets ( name VARCHAR(20),
species INTEGER, weeklyCost REAL, ssn
CHAR(11) NOT NULL, PRIMARY KEY (pname, ssn),
FOREIGN KEY (ssn) REFERENCES People, ON
DELETE CASCADE)
23N-ary Relationships
- Relationship sets can relate an arbitrary number
of entity sets
Student
Project
IndepStudy
Advisor
24Summary of ER Diagrams
- One of the primary ways of designing logical
schemas - CASE tools exist built around ER (e.g. ERWin,
PowerBuilder, etc.) - Translate the design automatically into DDL, XML,
UML, etc. - Use a slightly different notation that is better
suited to graphical displays - Some tools support constraints beyond what ER
diagrams can capture - Can you get different ER diagrams from the same
data?
25Schema Refinement Design Theory
- ER Diagrams give us a start in logical schema
design - Sometimes need to refine our designs further
- Theres a system and theory for this
- Focus is on redundancy of data
- Lets briefly touch on one key concept in
preparation for Mondays lecture on normalization
26Not All Designs are Equally Good
- Why is this a poor schema design?
- And why is this one better?
Stuff(sid, name, cid, subj, grade)
Student(sid, name) Course(cid, subj) Takes(sid,
cid, exp-grade)
27Focus on the Bad Design
- Certain items (e.g., name) get repeated
- Some information requires that a student be
enrolled (e.g., courses) due to the key
28Functional DependenciesDescribe Key-Like
Relationships
- A key is a set of attributes where
- If keys match, then the tuples match
- A functional dependency (FD) is a generalization
- If an attribute set determines another, written A
! Bthen if two tuples agree on A, they must
agree on Bsid ! Address - What other FDs are there in this data?
- FDs are independent of our schema design choice
29Formal Definition of FDs
- Def. Given a relation scheme R (a set of
attributes) and subsets X,Y of R - An instance r of R satisfies FD X ? Y if, for
any two tuples t1, t2 2 r, t1X t2X
implies t1Y t2Y - For an FD to hold for scheme R, it must hold for
every possible instance of r - (Can a DBMS verify this? Can we determine this
by looking at an instance?)
30General Thoughts on Good Schemas
- We want all attributes in every tuple to be
determined by the tuples key attributes - What does this say about redundancy?
- But
- What about tuples that dont have keys (other
than the entire value)? - What about the fact that every attribute
determines itself? - Stay tuned for Monday!