Database Design - PowerPoint PPT Presentation

1 / 72
About This Presentation
Title:

Database Design

Description:

Databases (DB) integrated collections of stored data that are centrally ... integrity automatically after schema designer identifies primary and foreign keys ... – PowerPoint PPT presentation

Number of Views:135
Avg rating:3.0/5.0
Slides: 73
Provided by: valu84
Category:
Tags: database | design

less

Transcript and Presenter's Notes

Title: Database Design


1
Chapter 12
  • Database Design

2
Learning Objectives
  • Describe the differences and similarities between
    relational and object-oriented database
    management systems
  • Design a relational database schema based on an
    entity-relationship diagram
  • High Level
  • Detailed
  • Design an object database schema based on a class
    diagram

3
Learning Objectives (continued)
  • Design a relational schema to implement a hybrid
    object-relational database
  • Describe the different architectural models for
    distributed databases

4
Overview
  • This chapter describes design of relational and
    OO data models
  • Developers transform conceptual data models into
    detailed database models
  • Entity-relationship diagrams (ERDs) for
    traditional analysis
  • Class diagrams for object-oriented (OO) analysis
  • Detailed database models are implemented with
    database management system (DBMS)

5
Databases and Database Management Systems
  • Databases (DB) integrated collections of stored
    data that are centrally managed and controlled
  • Database management system (DBMS) system
    software that manages and controls access to
    database
  • Databases described by a schema description of
    structure, content, and access controls
  • XML Data and Schema are integrated

6
XML Example
7
Database Models
  • Impacted by technology changes since 1960s
  • Model types
  • Hierarchical
  • Network
  • Relational
  • Object-oriented
  • Most current systems use relational and (rarely)
    object-oriented data models

8
Components of a DB and DBMS
Virtually all actions in a relational DB managed
through Structured Query Language
9
Important DBMS Capabilities
  • Simultaneous access by multiple users and
    applications
  • Access to data without application programs (via
    a query language)
  • Organizational data management with uniform
    access and content controls

10
Objectives of SQL
  • Ideally, database language should allow user to
  • create the database and relation structures
  • perform insertion, modification, deletion of
    data
  • perform simple and complex queries.
  • Should perform these tasks with minimal user
    effort and command structure/syntax must be easy
    to learn.
  • Should be portable.
  • The relational model provides Structured Query
    Language (SQL) to meet the above objectives

11
SQL Benefits
  • SQL is relatively easy to learn
  • it is non-procedural - you specify what
    information you require, rather than how to get
    it
  • it is essentially free-format.
  • Relatively easy to learn and use but also very
    powerful
  • Consists of standard English words
  • CREATE TABLE Staff(staffNo VARCHAR(5),
  • lName VARCHAR(15),
  • salary DECIMAL(7,2))
  • INSERT INTO Staff VALUES ('SG16', 'Brown',
    8300)

12
SQL
  • SQL has 3 major components
  • A Data Definition Language (DDL) for defining
    database structure.
  • A Data Manipulation Language (DML) for retrieving
    and updating data.
  • Commands to control access to data (security)

13
Relational Databases
  • Relational database management system (RDBMS)
    organizes data into tables or relations
  • Tables are two dimensional data structures
  • Tuples rows or records represents a specific
    occurrence of an entity
  • Fields columns or attributes
  • Tables have primary key field(s) that can be used
    to identify unique records
  • Keys relate tables to each other foreign keys

14
Partial Display of Relational Database Table
15
Database design
  • High Level logical
  • Detailed physical

16
High Level Design
  • Create table for each entity type
  • Choose or invent primary key for each table
  • Add foreign keys to represent one-to-many
    relationships
  • Create new tables to represent many-to-many
    relationships

17
High Level Design
  • Define referential integrity constraints
  • Evaluate schema quality and make necessary
    improvements. normalization
  • Choose appropriate data types and value
    restrictions (if necessary) for each field
  • Valid values
  • Valid ranges

18
RMO Entity-Relationship Diagram
19
Relationship Between Data in Two Tables
20
Representing Relationships
  • Relational databases use foreign keys to
    represent relationships
  • One-to-many relationship
  • Add primary key field of one entity type as
    foreign key in table that represents many
    entity type
  • Many-to-many relationship
  • Use the primary key field(s) of both entity types
  • Use (or create) an associative entity table to
    represent relationship

21
Entity Tables with Primary Keys (Figure 12-7)
22
Represent One-to-Many Relationships by Adding
Foreign Keys (in italics) (Figure 12-8)
23
Enforcing Referential Integrity
  • Consistent relational database state
  • Every foreign key value also exists as a primary
    key value
  • DBMS enforces referential integrity automatically
    after schema designer identifies primary and
    foreign keys

24
DBMS Referential Integrity Enforcement
  • When rows containing foreign keys are created
  • DBMS ensures that value also exists as a primary
    key in a related table
  • When row is deleted
  • DBMS ensures no foreign keys in related tables
    have same value as primary key of deleted row
  • When primary key value is changed
  • DBMS ensures no foreign key values in related
    tables contain the same value

25
RI in DDL
RI is defined when defining the foreign keys
associated with a table
26
Evaluating Schema Quality
  • High-quality data model has
  • Uniqueness of table rows and primary keys
  • Ease of implementing future data model changes
    (flexibility and maintainability)
  • Lack of redundant data (database normalization)
  • Database design is not objective or
    quantitatively measured it is experience and
    judgment based

27
Normalization
  • A good data model
  • Is simple all attributes in an entity describe
    only that entity.
  • Is non-redundant. All attributes other than
    foreign keys describe one entity
  • Is flexible and adaptive.
  • Normalization is a data analysis technique that
    organizes data attributes such that they are
    grouped to form non-redundant, stable, flexible,
    and adaptive entities.

28
Database Normalization
  • Normal forms minimize data redundancy
  • First normal form (1NF) no repeating fields or
    groups of fields
  • Functional dependency one-to-one relationship
    between the values of two fields
  • 2NF in 1NF and if each non-key element is
    functionally dependent on entire primary key
  • 3NF in 2NF and if no non-key element is
    functionally dependent on any other non-key
    element

29
First Normal Form
The first normal form states that each field is
granular and there are no repeating groups
AND
30
Second/Third Normal Form
The second normal form states that each field in
a multiple field primary key table must be
directly related to the entire primary key. Third
normal form is the same as second normal form
except that it only refers to tables that have a
single field as their primary key. 
31
Exercise
  • Define what would need to be done place data in
    first normal form (look carefully at the data)
  • Place this data in 3rd normal form

32
Detailed Design
  • Design physical representation
  • Analyze transactions
  • Choose file organizations
  • Choose indexes
  • Estimate disk space requirements
  • Design user views
  • Design security mechanisms
  • Consider the introduction of controlled
    redundancy
  • Monitor and tune the operational system

33
Analyze Transactions
  • Assume that a transaction use case
  • Use this information to identify the parts of the
    database that may cause performance problems.
  • To select appropriate file organizations and
    indexes, also need to know high-level
    functionality of the transactions, such as
  • attributes that are updated in an update
    transaction
  • criteria used to restrict rows that are retrieved
    in a query.

34
Analyze Transactions
  • Often not possible to analyze all expected
    transactions, so investigate most important
    ones use Paretto Principle
  • To help identify which transactions to
    investigate, can use
  • transaction/relation cross-reference matrix,
    showing relations that each transaction accesses,
    and/or
  • Access frequency analysis.

35
Cross-referencing transactions and relations
Which relations require the most attention?
36
(2) Determine frequency information
  • We need to identify the key transactions that
    access those relations
  • Key information includes
  • Data volumes anticipated in each relation
  • Average and maximum number of times per period
    that a transaction executes
  • Peak access times for each transaction

Why is peak access time important to know?
37
(2) Determine frequency information
  • Peak times are key to understanding if
    transactions conflict that is, have similar peak
    times. If so, potential for performance issues
    is greater.

38
Transaction usage map for some sample transactions
39
Choose File Organizations
  • Purpose To determine an efficient file
    organization for each table.
  • Determines how records physically ordered on disk
  • File organizations type unordered, ordered,
    hash.
  • Rule of Thumb Use DBMS default organization

40
Choose Indexes
  • Purpose To determine whether adding indexes
    will improve the performance of the system.
  • Need to consider both primary and secondary
    indexes
  • Primary Index how data will be physically
    ordered on disk
  • Secondary Index additional indexes created to
    speed access to to data

41
Why do indexes improve performance?
  • Without an index, need to scan tables
  • Index table with very few attributes.
  • Use the index to find the record(s) that meet
    query criteria.

42
Estimate Disk Space Requirements
  • Purpose To estimate the amount of disk space
    that will be required by the database.
  • We need to do this step to ensure that we have
    sufficient hardware and disk capacity both now
    and in the future to manage the database.
  • To calculate storage, need to consider
  • Amount of data required by each table (width of
    each row count of rows)
  • Indexes to be maintained
  • Log files to be maintained
  • Growth over a defined time period (1 2 years)

43
Design User Views
  • Purpose To design the user views that were
    identified during the requirements stage of the
    relational database application lifecycle.
  • On multi-user systems, views are often used to
    manage security, reduce complexity for users

44
Other Detailed Design Tasks
  • Design security mechanisms
  • Consider the introduction of controlled
    redundancy
  • Monitor and tune the operational system

45
Object-Oriented Databases
  • Direct extension of OO design and programming
    paradigm
  • ODBMS stores data as objects
  • Direct support for method storage, inheritance,
    nested objects, object linking, and
    programmer-defined data types
  • Object Definition Language (ODL)
  • Standard language for describing structure and
    content of an object database

46
Designing Object Databases
  • Determine which classes require persistent
    storage
  • Define persistent classes
  • Represent relationships among persistent classes
  • Choose appropriate data types and value
    restrictions (if necessary) for each field

47
Representing Classes
  • Transient classes
  • Objects exist only during lifetime of program or
    process
  • Examples view layer window, pop-up menu
  • Persistent classes
  • Objects not destroyed when program or process
    ceases execution. State must be remembered.
  • Exist independently of program or process
  • Examples customer information, employee
    information

48
Representing Relationships
  • Object identifiers
  • Used to identify objects uniquely
  • Physical storage address or reference
  • Relate objects of one class to another
  • ODBMS uses attributes containing object
    identifiers to find objects that are related to
    other objects
  • Similar to..?
  • Keyword relationship can be used to declare
    relationships between classes

49
Representing Relationships (continued)
  • Advantages include
  • ODBMS assumes responsibility for determining
    connection among objects
  • ODBMS assumes responsibility for maintaining
    referential integrity
  • Type of relationships
  • 11, 1M, MM (one-to-one, one-to-many,
    many-to-many)
  • Association class used with MM

50
RMO Domain Model Class Diagram (Figure 12-15)
51
One-to-One Relationship Represented with
Attributes Containing Object Identifiers
52
One-to-Many Relationship Between Customer and
Order Classes
53
One-to-Many Relationship Represented with
Attributes Containing Object Identifiers
54
Generalization Hierarchy within the RMO Class
Diagram (Figure 12-21)
55
Object DBMS Realities (from Wikipedia)
  • Object databases based on persistent programming
    acquired a niche in application areas such as
  • engineering and spatial databases,
    telecommunications, and scientific areas such as
    high energy physics and molecular biology.
  • They have made little impact on mainstream
    commercial data processing,

56
Object DBMS Realities (from Wikipedia)
  • ODBMS suggest that pointer-based techniques are
    optimized for very specific "search routes" or
    viewpoints. However, for general-purpose queries
    on the same information, pointer-based techniques
    will tend to be slower and more difficult to
    formulate than relational.
  • Other things that work against ODBMS
  • lack of interoperability with a great number of
    tools/features that are taken for granted in the
    SQL world including but not limited to industry
    standard connectivity, reporting tools, OLAP
    tools and backup and recovery standards.
  • Lack a formal mathematical foundation, unlike the
    relational model,

57
Hybrid Object-Relational Database Design
  • RDBMS (hybrid DBMS) used to store object
    attributes and relationships
  • Design complete relational schema and
    simultaneously design equivalent set of classes
  • Mismatches between relational data and OO
  • Class methods cannot be directly stored or
    automatically executed
  • Relationships are restricted compared to ODBMS
  • ODBMS can represent wider range of data types

58
Classes and Attributes
  • Designers store classes and object attributes in
    RDBMS by table definition
  • Relational schema can be designed based on class
    diagram
  • Table is created for each class
  • Fields of each table same as attributes of class
  • Row holds attribute values of single object
  • Key field is chosen for each table

59
Relationships
  • Relationships are represented with foreign keys
  • Foreign key values serve same purpose as object
    identifiers in ODBMS
  • 1M relationship add primary key field of class
    on one side of the relationship to table
    representing class on many side
  • MM relationship create new table that contains
    primary key fields of related class tables and
    attributes of the relationship itself

60
Data Access Classes
  • OO design based on a three-layer architecture
  • Data access classes are implementation bridge
    between data stored in program objects and data
    in relational database
  • Methods add, update, find, and delete fields and
    rows in table or tables that represent the class
  • Methods encapsulate logic needed to copy data
    values from problem domain class to database and
    vice versa

61
Interaction Among a Domain Class, a Data Access
Class, and the DBMS(Figure 12-25)
62
Data Types
  • Storage format and allowable content of program
    variable, object state variable, or database
    field or attribute
  • Primitive data types directly implemented
  • Memory address (pointer), Boolean, integer, and
    so on
  • Complex data types user-defined
  • Dates, times, audio streams, video images, URLs

63
Relational DBMS Data Types
  • Designer must choose appropriate data type for
    each field in relational database schema
  • Choice for many fields is straightforward
  • Names and addresses use a set of fixed- or
    variable-length character arrays
  • Inventory quantities can use integers
  • Item prices can use real numbers
  • Complex data types (DATE, LONG, LONGRAW)

64
Subset of Oracle RDBMS Data Types
65
Object DBMS Data Types
  • Use set of primitive and complex data types
    comparable to RDBMS data types
  • Schema designer can create new data types and
    associated constraints
  • Classes are complex user-defined data types that
    combine traditional concept of data with
    processes (methods) to manipulate data

66
Distributed Databases
  • Rare for all organizational data to be stored in
    a single database in one location
  • Different information systems in an organization
    are developed at different times
  • Parts of an organizations data may be owned and
    managed by different units
  • System performance is improved when data is near
    primary applications

67
Single Database Server Architecture (Figure
12-27)
68
Replicated Database Server Architecture(Figure
12-28)
69
Partitioned Database Server Architecture
70
Advantages of DDBMSs
  • Reflects organizational structure
  • Improved shareability and local autonomy
  • Improved availability
  • Improved reliability
  • Improved performance
  • Economics
  • Modular growth

71
Disadvantages of DDBMSs
  • Complexity
  • Cost
  • Security
  • Integrity control more difficult
  • Lack of standards
  • Lack of experience
  • Database design more complex

72
Functions of a DDBMS
  • Expect DDBMS to have at least the functionality
    of a DBMS.
  • Also to have following functionality
  • Extended communication services.
  • Extended Data Dictionary.
  • Distributed query processing.
  • Extended concurrency control.
  • Extended recovery services.
Write a Comment
User Comments (0)
About PowerShow.com