Derek Shigemi - PowerPoint PPT Presentation

1 / 32
About This Presentation
Title:

Derek Shigemi

Description:

Dangling tuples tuples that disappear when computing a join. ... A tuple t of relation r1 is a dangling tuple if t is not in the relation. ? Ri ( r1 r2 ... – PowerPoint PPT presentation

Number of Views:49
Avg rating:3.0/5.0
Slides: 33
Provided by: dereks45
Category:

less

Transcript and Presenter's Notes

Title: Derek Shigemi


1
Chapter 7.9-7.10
  • Derek Shigemi

2
Overview
  • More Normal Forms beyond fourth normal form.
  • Different ways to form a database.
  • Denormalization.
  • Inadequate database design.

3
More Normal Forms
  • Fourth normal form is not the best normal form
    for all databases. Multivalued dependencies can
    help eliminate repetition.
  • Join dependencies are types of constraints that
    generalize multivalued dependencies.

4
More Normal Forms (cont.)
  • Project-join normal form (PJNF) or Fifth Normal
    form uses Join dependencies.
  • Domain-key normal form has more general
    constraints.

5
More Normal Forms (cont.)
  • The problem with these two types of normal form
  • They have generalized constraints that are hard
    to reason.
  • No set of rules for reasoning about the
    constraints.
  • These normal forms are used rarely.

6
Overall Database Design Process
  • This section discusses how to produce a database
    with normalization.
  • Let R be a schema. Then R could have been found
    in these possible ways.
  • R could have been converted from an E-R diagram
    to a set of tables.
  • R could have been a single table with all
    attributes and then broken up into smaller
    relations through normalization.

7
Overall Database Design (cont.)
  • 3. R could have been some ad hoc design of
    relations, which are tested to verify that it
    satisfies the desired normal form.

8
E-R Model and Normalization
  • When converting over an E-R Model to tables, the
    tables generated from the E-R diagram should not
    need further normalization.
  • This only works if the entities are correctly
    made in the E-R diagram.

9
E-R Model and Normalization (cont.)
  • An example of when normalization would be
    required is if there is an employee entity that
    had attributes department-number and
    department-address.

10
E-R Model
  • Here is the E-R Model of the above.

Department-number
Employee-name
Department-address
Employee
11
E-R Model to Table
  • The table of the E-R Model
  • There is a function dependency
    department-number -gt department-address.

12
E-R Model to Table (cont.)
  • If the E-R diagram were done correctly, it would
    have department as a separate entity and a
    relationship between employee and department.
  • Functional dependencies can be used to find poor
    E-R design.

13
E-R Model to Table (cont.)
  • The normalization can be done in either the
    E-R model or after the conversion from the E-R to
    the relations.

Employee-name
Employee-street
Department-number
Department-address
Works
Employee
Department
14
The Universal Relation Approach
  • The next approach to database design is to start
    out with a single relation schema with all the
    attributes, and decomposing it with a
    lossless-join decomposition.

15
The Universal Relation Approach
  • Eg. A customer wants to get a lab account.
  • First we list out all the required fields.
  • Customer-info
  • Computerlab-info
  • ComputerAccount-info

16
The Universal Relation Approach
  • Next we can use first normalization on the list
    of attributes, making everything atomic.
  • Customer-id
  • Customer-name
  • Customer-street
  • Customer-city
  • Customer-zip
  • Computerlab-name
  • Computerlab-area
  • ComputerAccount-id
  • ComputerAccount-password

17
The Universal Relation Approach
  • Next use second normal form to separate the
    attributes into tables.
  • The Customer table could have
  • Customer-id (primary key)
  • Customer-name
  • Customer-street
  • Customer-city
  • Customer-zip
  • ComputerAccount-id (foreign key)

18
The Universal Relation Approach
  • The Computerlab table could have
  • Computerlab-name (primary key)
  • Computerlab-area
  • Customer-id (foreign key)
  • The ComputerAccount table could have
  • ComputerAccount-id (primary key)
  • ComputerAccount-password

19
The Universal Relation Approach
  • Then you can go to third normal form and so on
    till you reach the normal form that you want.

20
The Universal Relation Approach (cont.)
  • Dangling tuples tuples that disappear when
    computing a join.
  • Let r1(R1), r2(R2), , rn(Rn) be a set of
    Relations. A tuple t of relation r1 is a
    dangling tuple if t is not in the relation
  • ? Ri ( r1 r2 rn ).

21
The Universal Relation Approach (cont.)
 
 
  • Eg. Decomposition of loan-info

 
 
 
 
 
22
The Universal Relation Approach (cont.)
  • This is an example of when data is recorded for
    the loan, but the rest will be filled in later.
  • If we join the relations with a natural join, we
    loss all the tables corresponding to loan L-58.
  • Therefore we get dangling tuples with this schema.

23
The Universal Relation Approach (cont.)
  • Universal relation is the relation
  • r1 r2 rn,
  • since it involves all the attributes in the
    universe defined by
  • R1 U R2 U U Rn.
  • For the above example, we need to use null
    values to make the it into a universal relation.

24
The Universal Relation Approach (cont.)
  • Unique-role assumption
  • Each attribute name has a unique meaning in the
    database
  • Preferable to reusing of the same name in
    multiple roles.

25
Denormalization
  • Database designers sometimes choose a schema that
    has redundant information.
  • They use redundancy to improve performance.
  • The penalty of not using a normalized schema is
    the extra coding to keep the data consistent.
  • Denormalization is the process of taking a
    normalized schema and making it non-normalized.

26
Denormalization (cont.)
  • Eg. Display an account holders name along with
    the account number and balance.
  • We can make a relation containing all the
    attributes of account and depositor.
  • This displays the information faster.
  • Uses up space because of the redundant
    information.

27
Denormalization (cont.)
  • Or we could use the normalized schema, and
    additionally store the join of account and
    depositor as a materialized view.
  • Using materialized views has space and time
    overheads.
  • Up-to-date by the database and not the
    programmer.

28
Other Design Issues
  • There are some aspects of database design that
    are not addressed by normalization, and can lead
    to bad database design.
  • Eg. A company database, where they want to store
    earnings of companies in different years.
  • earnings(company-id, year, amt)
  • Functional dependency is
  • company-id, year -gt amt
  • This relation can be used to store earnings.

29
Other Design Issues (cont.)
  • Another method is to use multiple relations, each
    storing the earnings for a different year.
  • If we have relations for 2000,2001,2002 then they
    will be
  • earnings-2000, earnings-2001, earnings-2003 from
    the schema(company-id, earnings).
  • Functional dependency is company-id-gtearnings.
  • The problem is when a new year rolls around,
    someone would have to create a new relation and
    new queries.

30
Other Design Issues (cont.)
  • Yet another method is that we can have
  • company-year(company-id, earnings-2000,
  • earnings-2001,
  • earnings-2002)
  • Function Dependency is company-id to any other
    attribute.
  • The problem is very much the same as the previous
    method where someone would have to modify the
    relation and make more queries next year.

31
Other Design Issues (cont.)
  • Crosstabs are representations with one column for
    each value of an attribute.
  • Crosstabs are used in
  • Spreadsheets
  • Reports
  • Data analysis tools
  • Crosstabs are not good for
  • Database design
  • The company-year(company-id, earnings-2000,
  • earnings-2001, earnings-2003)
  • is an example of a crosstab.

32
Bibliography
  • Silberschatz,Korth,Sudarshan Database System
    Concepts, Fourth Edition McGraw-Hill Companies,
    Inc.(2002) pages 293-297.
  • Ullman MySQL Peachpit Press(2003) pages 39-47.
Write a Comment
User Comments (0)
About PowerShow.com