Databases and Database Design - PowerPoint PPT Presentation

1 / 35
About This Presentation
Title:

Databases and Database Design

Description:

Attribute ' ... Attribute of an entity whose values uniquely determine its occurrences ... An attribute is functionally dependent on X only if each of its values is ... – PowerPoint PPT presentation

Number of Views:41
Avg rating:3.0/5.0
Slides: 36
Provided by: william502
Category:

less

Transcript and Presenter's Notes

Title: Databases and Database Design


1
Databases and Database Design
  • William A. Yasnoff, MD, PhD
  • Oregon Health Division

2
Databases Database Design
  • Introduction
  • Relational Databases
  • Database Design Data Models

3
Introduction
  • Database large collection of information
  • Stored on hard disk
  • Retrieval
  • 100,000 items X 10 ms 17 minutes
  • Need to organize data to improve retrieval speed
  • How do you organize your paper files?

4
Introduction (continued)
  • Data Organization
  • sorting data phonebook in alpha v. random order
  • tabs at key points dictionary with tabs for
    each letter
  • direct pointers to information index
  • table of key element value and address of other
    data
  • e.g. name mailbox number --gt letters in
    mailbox

5
Relational Database Model
6
Relational Database Model
  • Developed by EF Codd, CJ Date (70s)
  • Table Entity Relation
  • Table row tuple instance
  • Table column attribute
  • Table linkage by values
  • Entity-Relationship Model

7
Attributes
  • Key
  • uniquely identify row/tuple
  • may be one or more attributes
  • Non-key
  • other properties of instance
  • dependent on key

8
Retrieval in RDMS
  • SQL
  • Select ...
  • From ...
  • optional can be computed
  • Where ...
  • Query by example
  • Fill in the blanks
  • WONDER

9
Advantages of RDMS
  • Very strong theoretical basis
  • storage
  • retrieval
  • Easy to implement
  • Conflicts and anomalies can be avoided
  • Intuitive appeal
  • Easy retrieval

10
Relational Model Summary
  • popular method of organizing data
  • strong theoretical properties facilitate
    retrieval
  • relation table
  • attribute column
  • tuple row
  • key attributes that uniquely identify each
    row

11
Relational Model Summary 2
  • Allows complex data relationships with multiple
    tables
  • patient id, patient demographics
  • patient id, patient visit date, blood pressure
  • Structured Query Language (SQL) retrieval
  • Select patient _name where blood_lead_level gt 10

12
Database Design Data Models
13
Data Models
  • Definition A representation of the data and and
    data relationships of an activity
  • data case report of serious E. Coli illness
  • data E. Coli serotype O157
  • relationship case report contains serotype
    (semantic relationship)
  • Database Design Development of a quality data
    model

14
Data Model Quality
  • Correctness
  • Conceptual (concepts represented properly)
  • real world representation
  • Syntactic (relationships represented properly)
  • real world language
  • Completeness (wholeness)
  • Conceptual (all concepts represented)
  • Syntactic (all relationships represented)

15
Entity
  • Any distinguishable object that is to be
    represented in a database C.J. Date
  • Properties
  • within scope of model
  • single concept
  • a set of distinguishable instances
  • satisfies normalization rules
  • Also called a relation

16
Entity Guidelines
  • Relationship with at least one other entity
  • Unique, descriptive name
  • Class or set of things (not just one)
  • Single meaning (no homonyms)
  • No synonyms (two entities describing same class)

17
Attribute
  • a type or characteristic of an entity (e.g.
    gender is an attribute of the entity patient)
  • Domain the set of values from which an
    attribute may be selected (e.g. the domain of the
    attribute gender is male, female)
  • An entity typically has many attributes

18
Attribute Characteristics
  • Key
  • value uniquely identifies entity
  • e.g. Lab test ID is key attribute of specimen
  • Non-Key
  • value does not uniquely identify entity
  • e.g. author does not uniquely identify
    publication

19
Attribute Characteristics
  • Atomic individual data value (one and only one
    fact)
  • Description complete and clear definition
  • e.g. professional privilege date The date on
    which a health care professional is granted
    privilege to practice in a particular health care
    facility, establishing the providers eligibility
    for patient care assignments and liability
    coverage

20
Attribute Guidelines
  • Unique name
  • no plurals, possessives, articles, conjunctions,
    verbs, or prepositions
  • Clear, complete, unambiguous description
  • Atomic (no positional information)
  • Domain with 2 or more values
  • Originates in only one entity

21
Primary Key
  • Attribute of an entity whose values uniquely
    determine its occurrences
  • birth certificate number, person name, birth
    date, mother name, physician name
  • facility, patient name, physician name, date,
    temperature, pulse, blood pressure

22
Primary Key Characteristics
  • Stable does not change over time
  • Minimal fewest attributes necessary
  • Factless no hidden information
  • Definitive value always exists
  • Accessible available when data created
  • Unique absolutely no duplicates

23
Relationship
  • Semantic contains, is part of, belongs to
  • One-to-one
  • serotype of an organism
  • immunization status of a child
  • One-to-many
  • antibiotic resistances of an organism
  • vaccines administered to a child

24
Relationship Guidelines
  • No circular references (e.g. health plans -gt
    markets -gt products -gt health plans should be
    health plans -gt health plan markets -gt market
    products lt- health plan products lt- health
    plans)
  • Single relationship between two entities
  • No recursive relationships

25
Single Relationship of Entities
  • Problem employee lt--gt job assignment lt--gt job
    double circular
  • Solution person --gt job assignment lt-- job
    person includes status as employee or
    contractor

26
Recursive Relationships
  • Problem supervisor lt--gt employee may be
    circular
  • Solution person person ID --gt
    employee-supervisor relationship employee ID,
    supervisor ID
  • multiple supervisory roles for each person

27
Normalization
  • Formalization of common sense rules of
    information organization
  • An attribute is functionally dependent on X only
    if each of its values is determined by the value
    of X (X may be composite)
  • Example DOB is functionally dependent on
    Drivers License number

28
Key Normalization Concept
  • Functional dependence of each entity must be
  • based on entire primary key
  • NOT based on any other attributes

29
Benefits of Normalization
  • Aids in database design, integration
  • Ensures precise capture of business logic
  • Minimizes redundancy
  • Minimizes need for null values
  • Prevents
  • information loss
  • unintentional results

30
Summary of Normalization
  • One Fact in One Place

31
Common DB Design Errors
  • Multiple instances in same row of table, e.g.
    first_value, second_value, third_value Problem
    what to do with 4th value?
  • Same data item repeated in multiple places, e.g.
    address appears in two different tables Problem
    how to keep two values synchronized?

32
Data Model Quality
  • Correctness
  • Conceptual (concepts represented properly)
  • real world representation
  • Syntactic (relationships represented properly)
  • real world language
  • Completeness (wholeness)
  • Conceptual (all concepts represented)
  • Syntactic (all relationships represented)

33
Database Design Pearls
  • Accommodate all data needed
  • Correct relationships between data items
  • No duplicate representation
  • Anticipated retrievals use indexes
  • Meet confidentiality requirements

34
References - 1
  • Reingruber MC Gregory WW The Data Modeling
    Handbook (New York John Wiley Sons, 1994)
  • Montgomery SL Object-Oriented Information
    Engineering (Boston AP Professional/Harcourt
    Brace, 1994)

35
References - 2
  • Codd EF The Relational Model for Database
    Management (Reading, MA Addison-Wesley, 1990)
  • Date CJ An Introduction to Database Systems, 5th
    ed. (Reading, MA Addison-Wesley, 1990)
  • Duncan KA Health Information and Health Reform
    (San Francisco Jossey-Bass, 1994)
Write a Comment
User Comments (0)
About PowerShow.com