CS586 Fall 2004 - PowerPoint PPT Presentation

1 / 53
About This Presentation
Title:

CS586 Fall 2004

Description:

Today we'll learn about how to construct an Entity-Relationship Diagram. Later we will learn how to record a data model using the SQL language so a DBMS can use it ... – PowerPoint PPT presentation

Number of Views:98
Avg rating:3.0/5.0
Slides: 54
Provided by: sidki
Category:
Tags: construct | cs586 | fall | how | to

less

Transcript and Presenter's Notes

Title: CS586 Fall 2004


1
CS586 Fall 2004
Day 4
  • Introduction to Database Management Systems
  • Sid Kitchel

2
Plan of the Day
  • Questions
  • Data Modeling
  • The Entity-Relationship Model
  • The Relational Model

3
Modeling
  • What is modeling?
  • Producing a controlled representation of a real
    world object or process
  • The representation is usually an abstraction or a
    miniaturization of the real thing
  • Types of modeling
  • Physical
  • Model train or model aircraft
  • Engineering prototype or analog computer
  • Operational
  • Mathematical models
  • Data
  • Choosing and recording data that represents
    aspects of a real world object or process

4
Data Modeling
  • What is data modeling?
  • Producing a controlled representation of a real
    world object or process through data
  • The representation is done by
  • Choosing elements of data from the real world
    thing that characterize it and are useful for
    applications about the thing
  • Discovering the natural relationships of the
    clusters of modeling data
  • For example
  • Suppose Im helping the PSU basketball coach
    prepare to recruit next years freshman players
  • Id want to collect basic info on players like
    name, address, phone number, high school
  • Also I need to collect stats like field goal
    percentage, rebounds, points scored, etc. by
    playing year
  • The data models the players and data has
    relationships

5
Data Modeling Documentation
  • Data modeling produces what?
  • We have to record our ideas about a proper data
    model
  • In this course, we will do it in two ways
  • Entity-Relationship Diagrams (ERD)
  • Database schemas
  • Today well learn about how to construct an
    Entity-Relationship Diagram
  • Later we will learn how to record a data model
    using the SQL language so a DBMS can use it

6
Overview of Database Development
  • Applications may have to use data in the
    database
  • The data may be already stored there or it could
    be new
  • The application design and the database design
    must agree to be successful
  • Often the designer should be aware that data will
    be shared and design beyond the current
    application and for the future
  • To succeed the developer must think both in terms
    of how to make an application operate and how to
    use data appropriately

7
Phases of Development
Collect Requirements And Analyze
Functional Spec.
Database Requirements
Designing
Data Modeling
Data Model Spec.
ERD
Application Design Doc.
Create the schema
Implement the Design
Application Code
Database Schema
8
Steps Toward a Database Design
  • Learn what the user wants to do in an
    application
  • Learn about the real-world objects and processes
    that the application is about
  • Understand the measurable in them
  • Select the set of data items that can
    successfully represent those real world objects
    and processes in the database
  • Capture both objects and their relationships
  • This is the essence of data modeling
  • Make sure that the database requirements satisfy
    the data needs of the functional specification
  • Add to the database requirements any likely data
    elements that will be needed by future
    applications or future investigations dont
    forget metadata

9
Lets Create an Example
  • You are in a university nowso lets use that
  • Lets imagine you are at a large university
  • You are participating in developing a new class
    scheduling application
  • You have to create a database design for the
    project
  • How do you start???
  • How about requirements gathering?
  • What people/things are needed for class
    scheduling?
  • What operations or activities happen?

10
The Fruits of Analysis
  • What objects are in the university world that are
    involved in scheduling?
  • People are involved students and faculty
  • Courses exist and are described in the catalog
  • Sections occur in time and space
  • Colleges are broken into departments
  • How do objects relate and interact?
  • Sections must be able to accept assignment of
    students to them and be an instance of a course
  • Courses are given by departments
  • Students have departments as their majors and
    minors

11
Representing the Object
  • Objects need to be described
  • This is done through object attributes that tell
    about an object in data
  • Objects need to be identified
  • One or more attributes must act as a key that
    uniquely identifies the object
  • Objects must be independent or owned
  • Independent objects must be uniquely identified
  • Owned objects use a concatenated key the
    owners ID plus some owned object attribute that
    allows the object to be distinguished from the
    other owned objects

12
Capturing Relationships
  • Which objects participate?
  • There is a need to identify the objects
  • There is a need to identify how the participate
    totally or partially
  • What is the nature of the relationship?
  • There is a need to show the cardinality
  • One to one
  • One to many
  • Many to many
  • There is a need to show whether objects are
    independent or owned

13
Relationship Cardinalities
  • One to one
  • If there is a transcript object showing
    graduation, then it will be one to one with the
    student object
  • One to many
  • For each single department, there will be a
    group of students in it
  • Many to many
  • A student may take multiple sections, while
  • a single section will hold multiple students

14
Overview of Database Design
  • Conceptual design (Entity-Relationship Model is
    used at this stage and is an example of a
    Semantic Data Model.)
  • What are the entities and relationships in your
    enterprise?
  • What information about these entities and
    relationships should we store in the database?
  • What are the integrity constraints or business
    rules that hold for the data and its
    relationships?
  • An instance of data modeling in the ER Model can
    be represented pictorially using ER diagrams.
  • More practically you can map an ER diagram into a
    relational schema.

15
ER Model Basics
  • Entity Real-world object distinguishable from
    other objects. An entity is described (in the DB)
    using a set of attributes.
  • Entity Set A collection of similar entities.
    E.g., all employees.
  • All entities in an entity set have the same set
    of attributes. (Until we consider ISA
    hierarchies, anyway!)
  • Each entity set has a key.
  • Each attribute has a domain.

16
name
ER Model Basics (Cont.)
ssn
lot
since
Employee
name
dname
subor-dinate
super-visor
ssn
lot
budget
did
Reports_To
Works_In
Department
Employee
  • Relationship Association among two or more
    entities. E.g., Jill works in the Pharmacy
    department.
  • Relationship Set Collection of similar
    relationships.
  • An n-ary relationship set R relates n entity
    sets E1 ... En each relationship in R involves
    entities e1 ? E1, ... , en ? En
  • The same entity set could participate in
    different relationship sets, or in different
    roles in same set.

17
Key Constraints
budget
did
  • Consider Works_In An employee can work in many
    departments a dept can have many employees.
  • In contrast, each dept has at most one manager,
    according to the key constraint on Manages.

Departments
1-to-1
1-to Many
Many-to-1
Many-to-Many
Relationship set cardinalities
18
Weak Entities
  • A weak entity can be identified uniquely only by
    considering the primary key of another (owner)
    entity.
  • Owner entity set and weak entity set must
    participate in a one-to-many relationship set
    (one owner, many weak entities).
  • Weak entity set must have total participation in
    this identifying relationship set.

name
cost
pname
age
ssn
lot
Dependents
Policy
Employees
19
Alternate Representations
middle
last
first
weak entity
composite attribute
name
R
E1
E2
E2 participates totally in R
1
N
E1
E2
R
relationship set cardinality
derived attribute
multivalued attribute
20
More examples
  • The following slides show examples using the
    alternate representation
  • The alternate representation is often used
  • Also computer or web based ER tools often use
    even simpler representations
  • You might see

Student
SSN Name Address Enroll_date
21
Cardinalities
1
n
Faculty
teaches
Class
1
1
Chair
Department
heads
m
n
takes
Student
Class
22
Offering and Enrollment
Is this correct??
1
Faculty
teaches
n
m
n
takes
Student
Class
Does this model how PSU does scheduling?
23
Offering and Enrollment 2
Is this correct??
1
Faculty
teaches
n
m
n
takes
Student
Section
1
n
has
Course
24
Ternary Relationship
1
Faculty
teaches
n
1
isbn
author
m
n
assigned
Textbook
Section
publisher
1
n
has
Course
title
25
The Beginning
  • This paper is concerned with the application of
    elementary relation theory to systems which
    provide shared access to large banks of formatted
    data.
  • It provides a means of describing data with its
    natural structure only that is, without
    superimposing any additional structure for
    machine representation purposes.
  • Accordingly, it provides a basis for a high
    level data language which will yield maximal
    independence between programs on the one hand and
    machine representations and organization of data
    on the other. A further advantage of the
    relational view is that it forms a sound basis
    for treating derivability, redundancy, and
    consistency of relations
  • -- E.F. Codd, A Relational Model of Data for
    Large Shared Data Banks, CACM, no. 6 (June
    1970), 13377-387.

26
Why Study the Relational Model?
  • Most widely used model.
  • Vendors IBM, Informix, Microsoft, Oracle,
    Sybase, etc.
  • Legacy systems in older models
  • e.g., IBMs IMS, IDS, IDMS, ADABAS
  • Recent competitor object-oriented model
  • GemStone/S, ObjectStore, Versant, Ontos
  • A synthesis emerging object-relational model
  • Informix Universal Server, UniSQL, O2, Oracle, DB2

27
Big Picture
  • The Relational Model has a mathematical
    foundation
  • It offers a fundamental change in architecture
    and implementation

Network
Hierarchical
Relational
28
Relational Database Definitions
  • Relational database a set of relations
  • Relation made up of 2 parts
  • Instance a table, with rows and columns. Rows
    cardinality, fields degree / arity.
  • Schema specifies name of relation, plus name
    and type of each column.
  • E.G. Student (sid string, name string, login
    string, age integer, gpa
    real).
  • You can think of a relation as a set of rows or
    tuples (i.e., all rows are distinct).

29
Relational Database Definitions (more formally)
  • Domain a set of atomic values each domain has a
    name, a datatype, and format a domain is
    semantically unified.
  • Relation r, is a set of n-tuples denoted by r
    t1, t2,, tm in which each ti is a list of
    values ti
  • and each vi is a member of
    domi, the domain associated with attribute ai

30
Relational Terminology
  • Relational theory and relational practice do not
    talk the same

Theory talk Relation Tuple Attribute name Attr
ibute value
Domain
What is it? Main storage unit Repeating subunit
Aspect of the subunit Data chunk Possible
values

DBMS Practice Table Row Column name / field nam
e
Data value / field value Data types
31
Example Instance of Students Relation
  • Cardinality 3, degree 5, all rows distinct
  • Do all columns in a relation instance have to
  • be distinct?

32
Relational Query Languages
  • A major strength of the relational model
    supports simple, powerful querying of data.
  • Queries can be written intuitively, and the DBMS
    is responsible for efficient evaluation.
  • The key precise semantics for relational
    queries.
  • Allows the optimizer to extensively re-order
    operations, and still ensure that the answer does
    not change.

33
Integrity Constraints (ICs)
  • IC a condition that must be true for any
    instance of the database e.g., domain
    constraints.
  • ICs are specified when schema is defined.
  • ICs are checked when relations are modified.
  • A legal instance of a relation is one that
    satisfies all specified ICs.
  • The DBMS should not allow illegal instances.
  • If the DBMS checks ICs, stored data is more
    faithful to real-world meaning.
  • Avoids data entry errors, too!when you use check
    constraints.

34
Primary Key Constraints
  • A set of fields is a key for a relation if
  • 1. No two distinct tuples can have the same
    values in all key fields, and
  • 2. This is not true for any subset of the key.
  • If condition 2 is false? You have a superkey.
  • If theres 1 key for a relation, one of the keys
    is chosen (by the DBA) to be the primary key.
  • Each of these keys is often referred to as a
    candidate key, because they are a candidate for
    being primary.
  • E.g., sid is a key for Student. (What about
    name?) The set sid, gpa is a superkey.

35
Primary and Candidate Keys in SQL
  • Possibly many candidate keys (specified using
    UNIQUE), one of which is chosen as the primary
    key.

CREATE TABLE Enrolled (sid CHAR(20)
cid CHAR(20), grade CHAR(2),
PRIMARY KEY (sid,cid) )
  • For a given student and course, there is a
    single grade. vs. Students can take only one
    course, and receive a single grade for that
    course further, no two students in a course
    receive the same grade.
  • Used carelessly, an IC can prevent the storage of
    tuples that arise in practice!

vs.
CREATE TABLE Enrolled (sid CHAR(20)
cid CHAR(20), grade CHAR(2),
PRIMARY KEY (sid), UNIQUE (cid, grade)
)
36
Foreign Keys, Referential Integrity
  • Foreign key Set of fields in one relation that
    is used to refer to a tuple in another
    relation. (Must correspond to primary key of the
    second relation.) Like a logical pointer.
  • E.g. sid is a foreign key referring to
    Student
  • Enrolled(sid string, cid string, grade
    string)
  • If all foreign key constraints are enforced,
    referential integrity is achieved, i.e., no
    dangling references.
  • Can you name a data model w/o referential
    integrity?
  • Links in HTML!

37
Foreign Keys in SQL
CREATE TABLE Enrolled (sid CHAR(20),
cid CHAR(20), grade CHAR(2),
PRIMARY KEY (sid,cid), FOREIGN KEY (sid
) REFERENCES Students )
  • Only students listed in the Student relation
    should be allowed to enroll for courses.

Enrolled
Student
38
Enforcing Referential Integrity
  • Consider Student and Enrolled sid in Enrolled
    is a foreign key that references Student.
  • What should be done if an Enrolled tuple with a
    non-existent student id is inserted? (Reject
    it!)
  • What should be done if a Student tuple is
    deleted?
  • Also delete all Enrolled tuples that refer to
    it.
  • Disallow deletion of a Student tuple that is
    referred to.
  • Set sid in Enrolled tuples that refer to it to a
    default sid.
  • (In SQL, also Set sid in Enrolled tuples that
    refer to it to a special value null, denoting
    unknown or inapplicable.)
  • Similar if primary key of Student tuple is
    updated.

39
Referential Integrity in SQL
  • SQL/92 and SQL1999 support all 4 options on
    deletes and updates.
  • Default is NO ACTION (delete/update is
    rejected)
  • CASCADE (also delete all tuples that refer to
    deleted tuple)
  • SET NULL / SET DEFAULT (sets foreign key value
    of referencing tuple)

CREATE TABLE Enrolled (sid CHAR(20),
cid CHAR(20), grade
CHAR(2), PRIMARY KEY (sid,cid), FOREIGN
KEY (sid) REFERENCES Students ON DELETE
CASCADE ON UPDATE SET DEFAULT )
40
Where do ICs Come From?
  • ICs are based upon the semantics of the
    real-world enterprise that is being described in
    the database relations.
  • We can check a database instance to see if an IC
    is violated, but we can NEVER infer that an IC is
    true by looking at an instance.
  • An IC is a statement about all possible
    instances!
  • From example, we know name is not a key, but the
    assertion that sid is a key is given to us.
  • Key and foreign key ICs are the most common more
    general ICs supported too.

41
Logical DB Design ER to Relational
  • Entity sets to tables

CREATE TABLE Employee
(ssn CHAR(11),
name CHAR(20),
lot INTEGER,
PRIMARY KEY (ssn))
42
Relationship Sets to Tables
  • In translating a relationship set to a relation,
    attributes of the relation must include
  • Keys for each participating entity set (as
    foreign keys).
  • This set of attributes forms a superkey for the
    relation.
  • All descriptive attributes.

CREATE TABLE Works_In( ssn CHAR(11),
did INTEGER, since DATE, PRI
MARY KEY (ssn, did), FOREIGN KEY (ssn)
REFERENCES Employees, FOREIGN KEY (did)
REFERENCES Departments)
43
Review Key Constraints
  • Each dept has at most one manager, according to
    the key constraint on Manages.

budget
did
Department
Translation to relational model?
Many-to-Many
1-to-1
1-to Many
Many-to-1
44
Translating ER Diagrams with Key Constraints
CREATE TABLE Manages( ssn CH
AR(11), did INTEGER, sinc
e DATE, PRIMARY KEY (did),
FOREIGN KEY (ssn) REFERENCES Employee,
FOREIGN KEY (did) REFERENCES Departments)
  • Map relationship to a table
  • Note that did is the key now!
  • Separate tables for Employees and Departments.
  • Since each department has a unique manager, we
    could instead combine Manages and Departments.

CREATE TABLE Dept_Mgr( did
INTEGER, dname CHAR(20), bud
get REAL, ssn
CHAR(11), since DATE, PR
IMARY KEY (did), FOREIGN KEY (ssn) REFERENCES
Employee)
45
Review Participation Constraints
  • Does every department have a manager?
  • If so, this is a participation constraint the
    participation of Departments in Manages is said
    to be total (vs. partial).
  • Every did value in Department table must appear
    in a row of the Manages table (with a non-null
    ssn value!)

since
since
name
name
dname
dname
ssn
lot
budget
did
budget
did
Department
Employee
Manages
Works_In
since
46
Participation Constraints in SQL
  • We can capture participation constraints
    involving one entity set in a binary
    relationship, but little else (without resorting
    to CHECK constraints).

CREATE TABLE Dept_Mgr( did
INTEGER, dname CHAR(20), budge
t REAL, ssn CHA
R(11) NOT NULL, since DATE,
PRIMARY KEY (did), FOREIGN KEY (ssn) REFER
ENCES Employee, ON DELETE NO ACTION)
47
Weak Entities
  • A weak entity can be identified uniquely only by
    considering the primary key of another (owner)
    entity.
  • Owner entity set and weak entity set must
    participate in a one-to-many relationship set (1
    owner, many weak entities).
  • Weak entity set must have total participation in
    this identifying relationship set.

name
cost
pname
age
ssn
lot
Dependent
Policy
Employee
48
Translating Weak Entity Sets
  • Weak entity set and identifying relationship set
    are translated into a single table.
  • When the owner entity is deleted, all owned weak
    entities must also be deleted.

CREATE TABLE Dep_Policy ( pname CHAR(20),
age INTEGER, cost
REAL, ssn CHAR(11) NOT NULL, PRIMARY KEY
(pname, ssn), FOREIGN KEY (ssn) REFERENCES E
mployee, ON DELETE CASCADE)
49
Review Binary vs.Ternary Relationships
pname
age
Dependent
Covers
  • What are the additional constraints in the 2nd
    diagram?

Bad design
pname
age
Dependent
Purchaser
Better design
50
Binary vs. Ternary Relationships (Cont.)
CREATE TABLE Policy ( policyid INTEGER,
cost REAL, ssn CHA
R(11) NOT NULL, PRIMARY KEY (policyid). F
OREIGN KEY (ssn) REFERENCES Employee,
ON DELETE CASCADE)
  • The key constraints allow us to combine Purchaser
    with Policies and Beneficiary with Dependents.
  • Participation constraints lead to NOT NULL
    constraints.
  • What if Policies is a weak entity set?

CREATE TABLE Dependent ( pname CHAR(20
), age INTEGER, policyid
INTEGER, PRIMARY KEY (pname, policyid).
FOREIGN KEY (policyid) REFERENCES Policy,
ON DELETE CASCADE)
51
Views
  • A view is just a relation, but we store a
    definition, rather than a set of tuples.

CREATE VIEW YoungActiveStudent (name, grade)
AS SELECT S.name, E.grade FROM Studen
t S, Enrolled E WHERE S.sid E.sid
AND S.age
  • Views can be dropped using the DROP VIEW
    command.
  • How to handle DROP TABLE if theres a view on the
    table?
  • DROP TABLE command has options to let the user
    specify this.

52
Views and Security
  • Views can be used to present necessary
    information (or a summary), while hiding details
    in underlying relation(s).
  • Given YoungStudent, but not Student or Enrolled,
    we can find students who are enrolled, but not
    the cids of the courses they are enrolled in.

53
Relational Model Summary
  • A tabular representation of data.
  • Simple and intuitive, currently the most widely
    used.
  • Integrity constraints can be specified by the
    DBA, based on application semantics. DBMS checks
    for violations.
  • Two important ICs primary and foreign keys
  • In addition, we always have domain constraints.
  • Powerful and natural query languages exist.
  • Rules to translate ER to relational model
Write a Comment
User Comments (0)
About PowerShow.com