Title: Database Conceptual and Logical Design
1Database Conceptual and Logical Design
- Zachary G. Ives
- University of Pennsylvania
- CIS 550 Database Information Systems
- October 2, 2004
Some slide content courtesy of Susan Davidson
Raghu Ramakrishnan
2How Do I Talk to the DB?
- Generally, apps are in a different (host)
language with embedded SQL statements - Static SQLJ, embedded SQL in C
- Runtime ODBC, JDBC, ADO, OLE DB,
- Typically, predefined mappings between host
language types and SQL types (e.g., VARCHAR ?
String or char)
3Embedded SQL in C
EXEC SQL BEGIN DECLARE SECTION int sid char
name20 EXEC SQL END DECLARE SECTION EXEC SQL
INSERT INTO STUDENT VALUES (sid, name) EXEC
SQL SELECT name, age INTO sid, name FROM
STUDENT WHERE sid lt 20
4The Impedance Mismatch and Cursors
- SQL is set-oriented it returns relations
- Theres no relation type in most languages!
- Solution cursor thats opened, read
- DECLARE sinfo CURSOR FOR SELECT sid, name
FROM STUDENTOPEN sinfowhile () FETCH
sinfo INTO sid, name CLOSE sinfo
5JDBC Dynamic SQL for Java
- Roughly speaking, a Java version of ODBC
- Youll likely use this in the course project
- See Chapter 6 of the text for more infoimport
java.sql.Connection conn DriverManager.getCon
nection()PreparedStatement stmt
conn.prepareStatement(SELECT FROM
STUDENT)ResultSet rs stmt.executeQuery ()
while (rs.next()) sid rs.getInteger(1)
6Database-Backed Web Sites
- We all know traditional static HTML web sites
Web Browser
Web-Server
HTTP-Request GET ...
7DB Access with JavaApplets and Server Processes
BrowserJVM
Java Applet
TCP/UDP IP
Java-Server-Process
JDBC Driver manager
JDBC-Driver
JDBC-Driver
JDBC-Driver
Sybase
Oracle
...
8Java Applets Discussion
- Advantages
- Can take advantage of client processing
- Platform independent assuming standard java
- Disadvantages
- Requires JVM on client self-contained
- Inefficient loading can take a long time ...
- Resource intensive Client needs to be state of
the art - Restrictive can only connect to server where
applet was loaded from (for security can be
configured)
9Server Pages (P) and Servlets(IIS, Tomcat, )
Web Server
Web Server
HTTP Request
File-System
Load File
HTML
HTML?
File
HTML File
10ASP/JSP/PHP Versus Servlets
- The goal combine direct HTML (or XML) output
with program code thats executed at the server - The code is responsible for generating more
HTML, e.g., to output the results of a database
table as HTML table elements - How might I do this?
- HTML with embedded code (P)
- Code that prints out HTML (Servlets)
11Now How Do We Get the Database in the First
Place?
12Databases AnonymousA 6-Step Program
- Requirements Analysis what data, apps, critical
operations - Conceptual DB Design high-level description of
data and constraints typically using ER model - Logical DB Design conversion into a schema
- Schema Refinement normalization (eliminating
redundancy) - Physical DB Design consider workloads, indexes
and clustering of data - Application/Security Design
13Entity-Relationship Diagram(based on our running
example)
Underlined attributes are keys
PROFESSORS
fid
name
relationship set
Teaches
entity set
semester
Takes
STUDENTS
COURSES
serno
subj
cid
sid
name
exp-grade
attributes (recall these have domains)
14Conceptual Design Process
- What are the entities being represented?
- What are the relationships?
- What info (attributes) do we store about each?
- What keys integrity constraints do we have?
STUDENTS
Takes
name
exp-grade
sid
15Translating Entity Sets toLogical Schemas SQL
DDL
Fairly straightforward to generate a schema
CREATE TABLE STUDENTS (sid INTEGER,
name VARCHAR(15) PRIMARY KEY (sid) )
CREATE TABLE COURSES (serno INTEGER,
subj VARCHAR(30), cid CHAR(15),
PRIMARY KEY (serno) )
16Translating Relationship Sets
- Generate schema with attributes consisting of
- Key(s) of each associated entity (foreign keys)
- Descriptive attributes
CREATE TABLE Takes (sid INTEGER, serno
INTEGER, exp-grade CHAR(1), PRIMARY
KEY (?), FOREIGN KEY (serno) REFERENCES
COURSES, FOREIGN KEY (sid) REFERENCES
STUDENTS)
17 OK, But What about Connectivityin the E-R
Diagram?
- Attributes can only be connected to entities or
relationships - Entities can only be connected via relationships
- As for the edges, lets consider kinds of
relationships and integrity constraints
Teaches
PROFESSORS
COURSES
(warning the book has a slightly different
notation here!)
18Logical Schema Design
- Roughly speaking, each entity set or relationship
set becomes a table (not always be the case see
Thursday) - Attributes associated with each entity set or
relationship set become attributes of the
relation the key is also copied (ditto with
foreign keys in a relationship set)
19Binary Relationships Participation
- Binary relationships can be classified as 11,
1Many, or ManyMany, as in
1n
mn
11
201Many (1n) Relationships
- Placing an arrow in the many ? one direction,
i.e. towards the entity thats refd via a
foreign key - Suppose profs teach multiple courses, but may not
have taught yet - Suppose profs must teach to be on the roster
Teaches
PROFESSORS
COURSES
Partial participation (0 or more)
Teaches
PROFESSORS
COURSES
Total participation (1 or more)
21Many-to-Many Relationships
- Many-to-many relationships have no arrows on
edges - The relationship set relation has a key that
includes the foreign keys, plus any other
attributes specified as key
Takes
COURSES
STUDENTS
22Examples
- Suppose courses must be taught to be on the
roster - Suppose students must have enrolled in at least
one course
23Representing 1n Relationships in Tables
CREATE TABLE Teaches( fid INTEGER, serno
CHAR(15), semester CHAR(4), PRIMARY KEY
(serno), FOREIGN KEY (fid) REFERENCES
PROFESSORS, FOREIGN KEY (serno) REFERENCES
Teaches)
CREATE TABLE Teaches_Course( serno INTEGER,
subj VARCHAR(30), cid CHAR(15), fid
CHAR(15), when CHAR(4), PRIMARY KEY
(serno), FOREIGN KEY (fid) REFERENCES
PROFESSORS)
Or embed relationship in many entity set
2411 Relationships
- If you borrow money or have credit, you might
get - What are the table options?
Describes
Borrower
CreditReport
ssn
rid
delinquent?
debt
name
25Roles Labeled Edges
- Sometimes a relationship connects the same
entity, and the entity has more than one role - This often indicates the need for recursive
queries
Includes
qty
Assembly
Subpart
id
Parts
name
26DDL for Role Example
CREATE TABLE Parts (Id INTEGER, Name
CHAR(15), PRIMARY KEY (ID) ) CREATE
TABLE Includes (Assembly INTEGER,
Subpart INTEGER, Qty INTEGER,
PRIMARY KEY (Assemb, Sub), FOREIGN KEY
(Assemb) REFERENCES Parts, FOREIGN KEY (Sub)
REFERENCES Parts)
27Roles vs. Separate Entities
Married
Husband
Wife
id
id
name
name
Married
What is the difference between these two
representations?
Husband
Wife
id
Person
name
28ISA Relationships Subclassing(Structurally)
- Inheritance states that one entity is a special
kind of another entity subclass should be
member of base class
id
People
name
ISA
Employees
salary
29But How Does this Translateinto the Relational
Model?
- Compare these options
- Two tables, disjoint tuples
- Two tables, disjoint attributes
- One table with NULLs
- Object-relational databases
30Weak Entities
- A weak entity can only be identified uniquely
using the primary key of another (owner) entity. - Owner and weak entity sets in a one-to-many
relationship set, 1 owner many weak entities - Weak entity set must have total participation
Feeds
People
Pets
name
weeklyCost
name
species
ssn
31Translating Weak Entity Sets
- Weak entity set and identifying relationship set
are translated into a single table when the
owner entity is deleted, all owned weak entities
must also be deleted
CREATE TABLE Feed_Pets ( name VARCHAR(20),
species INTEGER, weeklyCost REAL, ssn
CHAR(11) NOT NULL, PRIMARY KEY (pname, ssn),
FOREIGN KEY (ssn) REFERENCES Employees,
ON DELETE CASCADE)
32N-ary Relationships
- Relationship sets can relate an arbitrary number
of entity sets
Student
Project
IndepStudy
Advisor
33Summary of ER Diagrams
- One of the primary ways of designing logical
schemas - CASE tools exist built around ER (e.g. ERWin,
PowerBuilder, etc.) - Translate the design automatically into DDL, XML,
UML, etc. - Use a slightly different notation that is better
suited to graphical displays - Some tools support constraints beyond what ER
diagrams can capture - Can you get different ER diagrams from the same
data?
34Schema Refinement Design Theory
- ER Diagrams give us a start in logical schema
design - Sometimes need to refine our designs further
- Theres a system and theory for this
- Focus is on redundancy of data
- Lets briefly touch on one key concept in
preparation for Thursdays lecture on
normalization
35Not All Designs are Equally Good
- Why is this a poor schema design?
- And why is this one better?
Stuff(sid, name, cid, subj, grade)
Student(sid, name) Course(cid, subj) Takes(sid,
cid, exp-grade)
36Focus on the Bad Design
- Certain items (e.g., name) get repeated
- Some information requires that a student be
enrolled (e.g., courses) due to the key
sid name cid subj exp-grade
1 Sam 570 AI B
23 Nitin 550 DB A
45 Jill 505 OS A
1 Sam 505 OS C
37Functional DependenciesDescribe Key-Like
Relationships
- A key is a set of attributes where
- If keys match, then the tuples match
- A functional dependency (FD) is a generalization
- If an attribute set determines another, written A
! Bthen if two tuples agree on A, they must
agree on Bsid ! Address - What other FDs are there in this data?
- FDs are independent of our schema design choice
38Formal Definition of FDs
- Def. Given a relation scheme R (a set of
attributes) and subsets X,Y of R - An instance r of R satisfies FD X ? Y if, for
any two tuples t1, t2 2 r, t1X t2X
implies t1Y t2Y - For an FD to hold for scheme R, it must hold for
every possible instance of r - (Can a DBMS verify this? Can we determine this
by looking at an instance?)
39General Thoughts on Good Schemas
- We want all attributes in every tuple to be
determined by the tuples key attributes - What does this say about redundancy?
- But
- What about tuples that dont have keys (other
than the entire value)? - What about the fact that every attribute
determines itself? - Stay tuned for Thursday!