Title: Principles of Database Design
 1Principles of Database Design
NLM/MBL Medical Informatics 
 2Session Outline
- Why learn this? 
- Database Principles and Paradigms 
- Principles of Relational Database Design 
- System design and building methods 
- Exercise Transforming flat files to tables
3Why Learn about Database Design?
- Vendors will sell you on user interfaces, but the 
 power and flexibility is in the data model
- Evaluating and comparing products 
- Communicating with vendors and IT support staff 
- Building your own databases
4What is a Database?
- An organized collection of information 
- Computer-based representation 
- Systematic, automated retrieval 
- Systematic, automated symbol manipulation
5Historical Evolution of Databases
- Dedicated files created  maintained by 
 application software (sequential, random access)
- Database Management Systems (DBMSs)
6Hierarchical Databases
Lab Results
Serum Na
5/30/96
PtSmith
Advantages efficient storage and I/O, rapid 
access via predetermined data hierarchies Disadva
ntages difficult to view/retrieve data from 
other perspectives, hard to modify underlying 
structure 
 7Information Network Databases
Database as Hypertext
Advantages Can model complex many-to-many 
relationships as well as hierarchies and simple 
lists Disadvantages difficult to predict  
control effects of transitive relationships 
recursion I/O intensive, potential to become 
incomprehensible 
 8Relational Databases
Rows  Columns with inter-table references
Patient
Lab_test
Pt-UI
Lname Fname
Pt-UI
Testname Date
12345 Smith Elmer 12346 Jones 
 Barbara 12347 Clark 
Arthur 12348 Jones Casey 12349 
Sample Steve
12345 Serum_Na 5/30/96 42353 CBC 
 5/30/96 47756 ESR 
5/30/96 12348 HBsAg 5/30/96 34523 
Amylase 5/30/96
Advantages Understandable, permits variety of 
logical aggregation or views of data elements, 
structure easily modifiable, new elements 
generally do not break existing 
programs Disadvantages I/O intensive, 1 logical 
record may  many physical records, relational 
integrity is a constant concern  must be under 
software control 
 9Object-Oriented Databases
- Multiple data types including text, graphics, 
 sound, signals, etc.
- Encapsulation of data  programs 
- Interprocess messaging e.g., Print Yourself
Advantages applications programs consist of high 
level commands  functions which do not need to 
know the underlying data organization 
modularity, reusability and portability between 
systems Disadvantages early in 
commercialization CPU intensive few standards 
for query  object sharing 
 10Fundamental Assertions about Systems Design
- The Data Model is the most critical aspect of 
 system design and function
- Data Models should reflect real world objects and 
 their relationships to ensure durability
- A correct Data Model subserves and outlasts 
 applications, including many not anticipated at
 system start-up
11Object-oriented Systems designBasic Concepts
- The World contains Things e.g., Collies, 
 Terriers, Bloodhounds
- We develop abstractions of things called 
 objects e.g., dog
- We group objects by criteria which represent the 
 abstract object as an empty table
Dog Name Breed Favorite 
Food Birthdate 
 12Basic Concepts, contd
- Empty tables can be filled in to represent the 
 real world things from which the object was
 abstracted
Dog Name Breed Favorite 
Food Birthdate
Boris St. Bernard Canned 
 Jan 81 Fifi 
Poodle Dry 
May 92 Fido Pomeranian 
Canned Apr 87 
 13Basic Concepts, contd
- There are Relationships between objects which are 
 attributes of those objects
Dog Name License Owner Name Lic. Date
Relationship OWNS Dog Owner OWNS Dogs 
 14Objects
- All of the real-world things in the set (the 
 instances) have the same characteristics
- All instances conform to the same rules
15Types of Objects (ie., types of tables)
- Tangible Things e.g., book 
- Roles e.g., doctor, patient, supervisor 
- Incidents (events, occurences) e.g., ordering of 
 a lab test
- Interactions (bind two or more other objects via 
 a transaction) e.g., Purchase relates Buyer to
 Seller
- Specifications (definition tables of tangible 
 things)
16Table Notation
Graphical Form
Textual Form Patient_Admissions 
(Pt_ID, Date_Adm, Time_Adm, Unit, Room)
Patient_Admissions
 Pt_ID -Date_Adm -Time_Adm -Unit -Room 
 17Formalisms for Tables
- Rule 1 One instance of an object has exactly 
 one value for each attribute (i.e, only one data
 element at each row-column intersection no
 repeating groups, no true holes in table)
- Rule 2 Attributes must contain no internal 
 structure
Name Age-Sex Smith 38-F Jones 
 22-M Clark 18-M
Not OK
If Rules 1 and 2 are obeyed, the data model is in 
First Normal Form 
 18Formalisms for Tables, contd
- Rule 3 Every attribute should represent a 
 characteristic of the entire object, not a
 characteristic of a limited part of the object
Not OK
Attribute of hospital staff appointment, 
not committee
Hospital Committee Membership  Person Name  
Committee Name -Date committee term expires 
OK 
 19Relationships
- A relationship is the abstraction of a set of 
 associations that hold systematically between
 different kinds of real world things
- Patient OCCUPIES bed 
- Library CONTAINS books 
- Specimen IS ASSAYED by Lab Method 
- Most relationships may be stated in the inverse 
 also
- Library LENDS book 
- Book IS LENT BY Library
20Relationship Types
has
Governor
One-to-One 
State
governs
owns
Dog Owner
One-to-Many
Dog
is owned by
writes
Author
Many-to-Many
Book
is written by 
 21Modeling Many-to-Many Relationships
DRUG generic name - other attributes
DRUG MANUFACTURER  manufacturer name - other 
attributes 
LICENSE  manufacturer name  generic name - date 
licensed 
 22Overall System Design Process
- Build the Entity-Relationship diagram for all 
 defined objects (tables), including an Object
 Specification Document
- Create a State Transition Model which describes 
 changes to objects based on events or
 transactions
- Create a Data Flow diagram which models the 
 information elements which cause State
 Transitions
Recommended for multi-programmer projects 
 23Exercise Devise a Relational Model for MEDLINE 
citations 
 24Sample MEDLINE citation
UI - 90134185 AU - Greenes RA  Shortliffe EH TI 
- Medical Informatics. An Emerging academic 
discipline and institutional priority MH - 
Hospital Information Systems Career Choice 
Medical Informatics/EDUCATION/TRENDS PT - 
JOURNAL ARTICLE REVIEW TUTORIAL EM - 9005 AB - 
Information management constitutes a major 
activity of the health care profession. 
Currently a number of forces are focusing 
attention on this function... AD - Department of 
Radiology, Brigham and Womens Hosp., 
Boston, MA 02115 SO - JAMA 1990 Feb 23 
263(8)1114-20 
 25The Bottom Line in Database Design
- The Data Model is the most critical aspect of 
 system design and function
- Data Models should reflect real world objects and 
 their relationships to ensure durability
- A correct Data Model subserves and outlasts 
 applications, including many not anticipated at
 system start-up
26Questions?