Functional Dependencies FDs and - PowerPoint PPT Presentation

1 / 28
About This Presentation
Title:

Functional Dependencies FDs and

Description:

(7) List the transaction number and the total value of every transaction ... No hobby information? ... A hobby can be separately supplied or deleted. Spurious ... – PowerPoint PPT presentation

Number of Views:38
Avg rating:3.0/5.0
Slides: 29
Provided by: imadr
Category:

less

Transcript and Presenter's Notes

Title: Functional Dependencies FDs and


1
Lecture 23 (11/08/05)
  • Functional Dependencies (FDs) and
  • Normalization

2
Questionnaire
  • 18/20 answered
  • Most are pleased with the course so far ?
  • Learned so far
  • 7 Fair Amount, 10 Much, 1 Great Deal
  • Met Expectations
  • 1 Not very well, 7 Adequately, 8 Well, 2 Very
    Well
  • Positives shared by most
  • Class interactivity
  • Fast feedback
  • Slides
  • Deadlines/Assignments posted early on website

3
Questionnaire
  • Concerns (comments)
  • Exams
  • Long (will shorten)
  • Matching is too difficulty (none anymore)
  • Better idea of what to expect before exam
  • More daily tasks
  • Smaller but more frequent assignments
  • Read Chapter
  • MySQL (Not the best DBMS---Limited (all are))
  • Simple so that you focus on learning SQL not
    overwhelmed with functionality/GUI
  • Oracle, Postgres
  • Open source ? available for you to download
  • Easy setup JDBC
  • Most free relational datasets online use MySQL
  • Large support community good support website

4
Questionnaire
  • Stop calling on people by name to answer
    questions
  • More interactivity/participation ( attendance ?
    5 of score)
  • NOT INTENDED TO TEST YOUR KNOWLEDGE OR PUT YOU ON
    THE SPOT
  • Select when no one raises hand
  • Pass if you dont feel like answering
  • (better if you raise your hand and answer wrong
    than me calling on you and answer right!)
  • In-class group activities
  • Balance both
  • Cant cover as much/more time
  • Text book --- dry, long

5
Assignment 3
  • (7) List the transaction number and the total
    value of every transaction amounting to more than
    200
  • SELECT T. Trans_Number, SUM(P.Number_OrderedP.Quo
    ted_Price) AS ValFROM Trans T, Trans_Part P
    WHERE T.Trans_Number P.Trans_NumberGROUP BY
    T.Trans_NumberHaving Valgt200
  • (9) Find the customer number, last name, and
    first name for every customer who did not make a
    transaction on September 5, 2002

6
Assignment 3
  • SELECT Customer_number, Last, FirstFROM CUSTOMER
    CWHERE NOT EXISTS (SELECT FROM TRAN T
    WHERE C.Customer_number T.Customer_number and
    T.Trans_Date Like 09/05/200)
  • Use except?
  • Is there any for the QUOTED_PRICE column in table
    TRANS_PART or is it a repetition of the
    UNIT_PRICE column in table PART?

7
The Database Design Process
  • Conceptual design
  • Use a data model language to come up with an
    accurate, high-level description of the system
    requirements
  • Words (unstructured) ? Diagrams
  • Logical design
  • Map the resulting EERD into a set of relations
  • Diagram ? Relations
  • Physical design
  • Use DDL on some DBMS to create tables
    corresponding to your relations

8
The Database Design Process
  • Limitations of E-R Designs
  • The EER model provides a set of guidelines
  • Does not result in a unique database schema
  • Does not provide a formal way of evaluating
    alternatives
  • Relies largely on the common sense of the
    designer
  • Here we try to answer
  • What are the criteria for "good" base relations? 
  • Meaningful grouping of attributes
  • When designing a relation schema, how to decide
    which attributes to include?
  • So far, attributes are grouped to form the
    relation schema by using the common sense of the
    database designer

9
The Database Design Process
  • First discuss informal guidelines for good
    relational design
  • Then we discuss formal concepts of functional
    dependencies and normal forms
  • - 1NF (First Normal Form)
  • - 2NF (Second Normal Form)
  • - 3NF (Third Normal Form)
  • - BCNF (Boyce-Codd Normal Form) (Hopefully)
  • Additional types of dependencies, further normal
    forms, relational design algorithms by synthesis
    are discussed in Chapter 11

10
Relation Schema Informal Measures
  • We have some informal measures
  • Semantics of the attributes
  • Reducing the redundancy values in tuples
  • Disallowing the possibility of generating
    spurious tuples
  • Reducing null values
  • Not always independent of one another

11
Semantics of the Relation Attributes (1)
  • Any grouping of attributes to form a relation
    schema must portray a certain real-world meaning
  • Each tuple in a relation should represent one
    entity or relationship instance
  • Guideline 1 Design a relation schema so that it
    is easy to explain its meaning
  • Semantics of attributes should be easy to
    interpret
  • Attributes of different entities should not be
    mixed
  • Only foreign keys should be used to refer to
    other entities

12
Relations that violate Guideline 1 by intermixing
attributes from different relations
13
Relation schemas that abide by Guideline 1
14
Redundant Information (2)
  • One goal of schema design is to reduce redundancy
  • Information is stored redundantly wasting storage
  • Problems with update anomalies
  • Modification anomalies
  • Insertion anomalies
  • Deletion anomalies
  • Mixing attributes of multiple entities may cause
    the above problems

15
  • Update anomalies
  • Modification Anomalies
  • Update PNAME from ProductY to
    Customer-Accounting
  • Insert Anomalies
  • Insert a new employee not assigned to known
    project
  • Insert a new project with no working employees
  • Delete Project
  • Delete PNUMBER2
  • Delete the sole employee of a project

16
Modification Anomalies
  • Consider the relation
  • EMP_PROJ ( Emp, Proj, No_hours, Ename, Pname,
    Plocation)
  • Modification Anomaly
  • Changing the name of project number P2 from
    Project Y to Customer-Accounting
  • May cause this update to be made for all
    employees working on project P2 otherwise the DB
    will become inconsistent

17
Modification Anomalies
  • Consider the relation
  • EMP_PROJ( Emp, Proj, No_hours, Ename, Pname,
    Plocation)
  • Insert Anomaly Cannot insert a project unless
    an employee is assigned to
  • Inversely - Cannot insert an employee unless
    he/she is assigned to a project.
  • Delete Anomaly When a project is deleted ?
    delete all the employees who work on the project
  • Alternately, if an employee is the sole employee
    on a project, deleting that employee would result
    in deleting the corresponding project

18
Guideline to Redundant Information in Tuples and
Update Anomalies
  • Guideline 2 Design a schema that does not suffer
    from the insertion, deletion and update anomalies
  • If there are any present, then note them so that
    applications can be made to take them into
    account
  • Might need to break the guidelines to improve
    performance for certain queries
  • Imagine that we always access employee
    information only with department information
  • ? The design EMP_PROJ (Emp, Proj, No_hours,
    Ename, Pname, Plocation) might be could for such
    cases

19
Example
ER Model
SSN Name Address
Hobby 1111 Joe 123 Main biking,
hiking
Relational Model (SSN, Hobby, Name, Address)
20
Example
  • Redundancy leads to anomalies
  • A change in Address must be made in several
    places
  • Suppose a person gives up all hobbies. Do we
  • Set Hobby attribute to null? No, since Hobby is
    part of key
  • Delete the entire row? No, since we lose other
    information in the row
  • No hobby information?
  • Hobby value must be supplied for any inserted row
    since Hobby is part of key

21
Decomposition
  • Solution use two relations to store Person
    information
  • Person1 (SSN, Name, Address)
  • Hobbies (SSN, Hobby)
  • People with/without hobbies can now be described
  • No update anomalies
  • Name and address stored once
  • A hobby can be separately supplied or deleted

22
Spurious Tuples (3)
  • Bad designs for a relational database (or bad
    decompositions) may result in erroneous results
    for certain JOIN operations
  • Any decomposition MUST have the "lossless join"
    property
  • No spurious tuples should be generated by doing a
    natural-join of any decomposed relations
  • Person1 (SSN, Name, Address)
  • Hobbies (SSN, Name)
  • Here, loss relates to loss of information

23
Spurious Tuples (3)
  • Suppose we replace
  • EMP_PROJ ( SSN, PNUMBER, ENAME, PNAME, PLOCATION,
    HOURS) by
  • EMP_PROJ1(SSN, PNUMBER, PNAME, PLOCATION, HOURS)
    ANDEMP_LOCS (ENAME, PLOCATION)
  • Guideline 3 The relations should be designed to
    satisfy the lossless join condition
  • Avoid relations that contain matching attributes
    that are not (foreign key, primary key)
    combinations

24
Null Values in Tuples (4)
  • Guideline 4 Relations should be designed such
    that their tuples will have as few NULL values as
    possible
  • Make sure only NULLs are exceptional cases
  • If many attributes do not apply to all tuples in
    the relation, we end up with many nulls
  • Waste space
  • Ambiguity in meaning
  • Attribute not applicable or invalid
  • Value known to exist, but unavailable
  • Attribute value unknown (may or may not exist)
  • Difficulty specifying JOIN operations (inner or
    outer joins)
  • Attributes that are NULL frequently could be
    placed in separate relations (with the primary
    key)

25
Functional Dependencies
  • Functional dependencies (FDs) are used to specify
    formal measures of the "goodness" of a
    relational database design
  • FDs are constraints that are derived from the
    meaning and interrelationships of the data
    attributes
  • An FD is a constraint between two sets of
    attributes X and Y
  • X ? Y holds if whenever two tuples have the same
    value for X, they must have the same value for Y
  • For any two tuples t1 and t2 in any relation
    instance r(R) If t1Xt2X, then t1Yt2Y

26
Functional Dependencies
  • X ? Y A set of attributes X functionally
    determines a set of attributes Y (or Y is
    functionally determined by X) if the value of X
    determines a unique value for Y
  • X ? Y in R specifies a constraint on all
    relation instances r(R)
  • FDs are derived from the real-world constraints
    on the attributes
  • Property of the intention of the database
  • An FD is a property of the attributes in the
    schema R
  • The constraint must hold on every relation
    instance r(R)
  • Can NEVER be deduced from an extension
  • E.g. if in some case, all people having the same
    first name are registered for the same course,
    can we deduce that name ? course?

27
Examples of FD Constraints
  • EMP_PROJ (SSN, PNUMBER, ENAME, PNAME, PLOCATION,
    HOURS)
  • social security number determines employee name
  • SSN ? ENAME
  • project number determines project name and
    location
  • PNUMBER ? PNAME, PLOCATION
  • employee SSN and project number determines the
    hours per week that the employee works on the
    project
  • SSN, PNUMBER ? HOURS

28
More on FD Constraints
  • Definition of a relation (If K is a key of R)
  • K functionally determines all attributes in R
  • If X?Y is true, does that make Y?X true?
  • Some FDs are always true regardless of the
    relation in which they occur
  • State, Driver_License_Number ? SSN
  • Zip ? City, State
Write a Comment
User Comments (0)
About PowerShow.com