ARCH3: Database Design, a Practical Guide - PowerPoint PPT Presentation

1 / 62
About This Presentation
Title:

ARCH3: Database Design, a Practical Guide

Description:

ARCH-3: Database Design A Practical Guide. If you thought this talk was going to ... Music store. Buys compact disc recordings from distributors. Has inventory ... – PowerPoint PPT presentation

Number of Views:52
Avg rating:3.0/5.0
Slides: 63
Provided by: gusbjr
Category:

less

Transcript and Presenter's Notes

Title: ARCH3: Database Design, a Practical Guide


1
ARCH-3 Database Design, a Practical Guide
  • Click to add subtitle

Gus Björklund
Wizard, Progress Software Corporation
2
Rules are made to be broken
To every rule, there is an exception!
3
If you thought this talk was going to be about
indexing
It isnt.
4
Topics
  • Theory
  • What is Database Design
  • Basic Elements
  • Representing the Model as Tables
  • Practice
  • An Example
  • Some Other Topics

5
First, some theory
6
What do we mean by database design?
  • A process for defining a model of a subset of the
    real world, then representing it as data in
    tables in a relational databaseAt least, thats
    the definition we will use for the purpose of
    this talk.

7
Basic Elements
What do we put in our model?
  • Just 3 Things
  • Entities
  • Attributes
  • Relationships

The entity-relationship model was invented by
Peter Chen in 1976. See http//bit.csc.lsu.edu/c
hen/chen.html
8
Basic ElementsEntities
  • Can be thought of as nouns
  • People
  • author, composer, performer, seller, buyer
  • Places
  • home, IP address, URL, destination, factory,
    store
  • Things
  • song, recording, instrument, car, invoice

Is telephone number a place or a thing?
9
Basic ElementsAttributes
Entities have attributes
  • Can be thought of as adjectives (but only
    loosely)
  • Length
  • Color
  • Horsepower
  • Part number
  • Song Title
  • Publication Date
  • Size
  • Fabric
  • Owner

Is telephone number a attribute or an entity?
10
Basic Elementsrelationships
Entities are connected by relationships
  • Can be thought of as verbs
  • has a
  • owns
  • contains
  • supervises
  • performs
  • called
  • sold
  • purchased
  • proved

11
Basic elementsrelationships have attributes too
In May, 1995, Andrew Wiles published a proof of
Fermats Last Theorem
12
What goes in an entity
  • Identifying attributes
  • Must be able to uniquely identify the entity
  • Can have more than one way to id
  • Id can be composite
  • Descriptive attributes
  • the values you need to keep track of
  • generally should be simple, not complex

13
What to include in your model
  • The things your application has to keep track of
  • Telephones, wires, switches
  • The actions your application or its users perform
  • Make calls, send telephone bills, collect
    payments
  • Some attributes of the things and actions
  • Originating number, date and time of call,
    duration, called number
  • Keep it simple
  • Be accurate
  • Keep it up to date

14
What to include in your model
  • Consider the goals of the system
  • Everything you include should be there for a
    reason you can state
  • in no more than two sentences
  • Everything should have a clear name
  • if you cant name it, it doesnt belong

15
What to leave out of your model
  • The real world has properties that dont matter
    (to your application)
  • The real world has relationships that dont
    matter
  • Things happen in the real world that dont matter
  • Keep it simple
  • If you cant say why you need it, leave it out

16
Logical vs Physical Data Models
  • Logical entities often require multiple tables to
    represent them
  • Tables can be thought of as logical or physical
  • It depends on your point of view
  • There is also the physical storage database
    layout
  • storage areas
  • data extents
  • disks
  • etc.
  • We arent going to talk about the physical
    database layout
  • We will talk about tables

17
Mapping Your Model to a Database
Simply put,
  • Entities become tables
  • Identifiers become indexes
  • Attributes become columns
  • Data types pick appropriate
  • Relationships become tables or foreign keys

18
In theory, there is no difference betweentheory
and practice, but in practice there is.Jan van
de Snepscheut
19
Now for some practice.
20
An example
  • Music store
  • Buys compact disc recordings from distributors
  • Has inventory
  • Allows customers to search for what they want
  • Maybe in an in-store kiosk or on the web
  • Sells compact discs to customers

21
What should we do first?
22
Activities
  • We buy discs from a distributor
  • Orders are sent to a distributor
  • Orders are delivered to the store
  • Orders may be cancelled
  • We sell discs to customers in sales transactions
  • Customers buy discs in sales transactions
  • Customers search for what they want to buy

Which of these must be remembered by the system?
23
What do we need to keep track of
  • Discs we have
  • Discs we sold
  • Discs we know about and can get
  • Discs we have ordered
  • Information needed to do our income tax
  • what we paid for stock
  • when we bought it
  • what we sold it for
  • when we sold it

24
Disc entities
  • UPC Code 8697-07416-2
  • Manufacturer Sony BMG
  • Cost to us 2.00
  • Price charged 17.95
  • Tax charged 0.80
  • Date purchased March 19, 2007
  • Date sold June 9, 2007

25
Disc table might look like this
26
Whats wrong?
  • Is upc a unique identifier?
  • Might have bought from a distributor
  • Have no information about what is on the disc
  • How do customers search?
  • Dont know when disc was made
  • Could be more than one tax jurisdiction
  • provincial tax, city tax
  • Dont know if disc is on order
  • Dont know who bought it
  • Duplicated data
  • Etc., etc.

27
Disc entities take 2
  • UPC Code 8697-07416-2
  • Manufacturer Sony BMG
  • Distributor Bobs Wholesale CDs
  • Cost to us 2.00
  • Price charged 17.95
  • Tax charged 0.80
  • Date ordered March 19, 2007
  • Date received March 20, 2007
  • Date sold June 9, 2007
  • Disc Title The Essential Joshua Bell
  • Artist Joshua Bell
  • Track 1 Danse Russe
  • Track 2 Violin Concerto in E Minor
  • Track 3 Nocturne in C-sharp Minor
  • etc.

28
Example Now Whats wrong?
  • This is getting messy
  • Activities combined with discs attributes
  • Have duplicated information
  • How many tracks can there be?
  • What if there is more than one artist?
  • Dont have all the information a customer might
    want to use to search

29
Discs revisited
  • Discs have titles
  • Discs have pictures on the cover
  • Discs contain tracks
  • Discs are made by manufacturers
  • Discs are purchased from distributors
  • Discs are ordered from distributors
  • Discs are delivered to the store
  • Discs are sold to customers

30
Discs contain tracks
  • Tracks contain songs
  • Tracks occur in order
  • Tracks have a duration
  • Songs are performed in performances
  • Songs have performers (usually)
  • Songs have composers
  • Songs have names (titles)
  • Songs have a key (but not always)
  • Performances are done by performers
  • Performers can be groups (bands, orchestras,
    etc.)
  • Performances are performed in a location or venue

31
We seem to need these entities
  • Discs
  • Manufacturers
  • Distributors
  • Orders
  • Customers
  • Inventory
  • Tracks
  • Songs
  • Performers
  • Groups ?

32
Songs have names (titles).Are names properties
of songs?Or are they entities related to
songs?Or are they something else?
33
Song data (track 1)
34
Song data (track 2)
35
Performance data
36
Performance data take 2
37
Performer data
38
Performance to Performer Relationship
39
Performance data take 3
40
Track to Performance Relationship
41
Relationships
track
performance
one to one
performer
performance
performance
disc
performer
track
performance
track
one to many
many to many
track
42
What happened to Songs?
43
Relationships (take 2)
song
performance
track
song
one to many
performance
one to one
performance
disc
performer
track
performance
track
performance
one to many
performer
track
performance
many to many
44
What aboutbusiness entities?
45
Should you use arrays?
46
Indexes
  • Enforce uniqueness
  • Make searches faster
  • Allow fast retrieval of entity by any of its
    identities

47
What indexes do we needfor the music store
database?
48
What should we do next ?
49
Other Topics
  • Normalization
  • Unique keys
  • Word indexes
  • Naming
  • Customisation

50
Normalization
  • It mostly means Dont duplicate data
  • Properties should be simple
  • have only one value
  • are necessary
  • are not derived data
  • dont repeat
  • Complicated properties are often entities in
    their own right
  • For example, addresses might be

51
Unique keys
  • EVERY table must have a unique key
  • Every row needs a unique identifier
  • that never changes even if data moved to another
    database (i.e. if you replicate)
  • Often, users dont need to see it
  • Use a UUID or a sequence or maybe datetime
  • Unique key is the ONLY way to identify rows
    unambiguously
  • ROWIDs are temporary and can change
  • Use the same method throughout
  • Youll be glad you did

52
Word indexes
  • Can be used to hold multiple status or attribute
    values
  • Conflicts with normalisation
  • Flexible
  • Easy to add new ones
  • Queries are fast
  • Example
  • Category classical, violin, orchestral, concerto

53
Naming
Good names are crucial to understanding
  • Table and column names should have clear meanings
    everyone can understand
  • GL01262 vs dateEntered
  • Names with dashes cause inconvenience with SQL
  • order-date
  • Booleans should be named for truth value
  • backOrdered
  • No double negations
  • notOutOfStock

54
Making tables customizable
We will look at 3 ways
  • Spare columns
  • Separate table with spare columns
  • Separate table with name/value pairs

55
Spare columns
What data types should you use? How many spare
columns? Wasted columns when not used How do you
know what each spare got used for? How do you
know how many unused spares you have?
56
Separate table with spare columns
57
Separate table with spare columns
58
Separate table with name/value pairs
59
Modeling Tools
  • Enterprise Architect
  • Power Designer
  • ConceptDraw
  • Erwin
  • Rational

Pencil and paper !
Blackboard !
60
Summary
  • Understand the requirements
  • Leave out what is not needed
  • Review the design with stakeholders
  • Evolve the design as changes come up
  • Test to make sure it works
  • Can it do everything that is needed?
  • Does it perform adequately?

61
Where to learn more
  • Papers
  • Wiles, A. "Modular elliptic curves and Fermat's
    Last Theorem, Annals of Mathematics 141 (3)
    443-551
  • Chen, P. The Entity-Relationship Model --
    Toward a Unified View of Data, ACM TODS Vol 1,
    No 1, 1976
  • Wikipedia articles to start from
  • entity-relationship model
  • data model
  • Books
  • Teorey, Lightstone, Nadeau Database Modeling
    and Design, Morgan Kaufmann.

62
Questions
?
Write a Comment
User Comments (0)
About PowerShow.com