Title: DATA BASE DESIGN
1DATA BASE DESIGN
- Database design is a complex process.
- It is carried out by a specialist data base
administration group or team. - It is crucial to systems development because
- data resources are shared by many processes.
- it must be organised in a way which is flexible
and adaptable. - data is more permanent than processes.
2DATA BASE DESIGN
- Database design can be seen as a three step
process in the SDLC - 1. Conceptual data modelling
- (Analysis phase)
- 2. Logical database design
- (conceptual model ? logical data description for
the DBMS) - 3. Physical database design
- (how the database will be organised and accessed
on the storage devices)
3Database Design Data ModellingReference
Whitten Bentley, Chap. 12
- Data Modelling
- Describes and documents data that must be stored
in a database. - Many different modelling standards are in use.
- The different standards vary in the symbols they
use, not the concepts that they model. - Also known as information modelling.
4Data modelling
- The Logical model
- Shows what data the business system uses.
- Identifies the business constraints on the data.
- Is implementation independent.
- Is part of the requirements specification during
the Analysis stage. - Entity-Relationship diagrams (ERDs) are an
example of a popular logical modelling technique
5Data modelling
- The Physical Model
- Shows how the system is physically and
technically implemented. - Prepared during the Design stage.
- Includes
- Attributes Domains are defined
- Distribution of data is determined
- Database architecture is selected
- Media are selected
- Indexes are defined
- Security permissions are determined
6Data modelling v process modelling
- Data models help analysts to quickly identify
business vocabulary better than process models. - Data models are almost always built more quickly
than process models. - Process modellers often get hung up on
unnecessary detail. - Data models for existing and proposed systems are
far less likely to be thrown away as the system
development proceeds.
7Data Modelling vs Object Modelling
- Data Modelling (relational)
- Data models focus on data which needs to be
stored. - Process models are developed separately from data
models. - Object Modelling
- Object modelling combines processes with the data
- Object modelling models persistent objects as
well as transient or temporary objects.
8Data Modelling vs Object Modelling (cont)
- Most databases are designed on the relational
model. - Most systems are developed using the object model
- BUT
- The database design remains relational
- In this hybrid environment a common approach to
handle the differences is - Each persistent class is mapped to a table
9Class Diagram
Becomes
Relational
10ERD definitions
- Entity - something about which we want to store
data. - Relationship - an association that exists
between one or more entities. - Cardinality - the complexity, or multiplicity of
each relationship. - Degree - the number of entities that can
participate in the relationship.
11ERD definitions
- Attribute - a property or characteristic of an
entity. - Data type - defines what class of data can be
stored in an attribute. - Key - one or more attributes that uniquely
identify an instance of an entity.
12Sample ERD
13Developing a logical model
- Entities are discovered and defined.
- A context data model is built
- Shows fundamental entities and relationships.
- Business rules are identified.
- A key-based model is built, which eliminates
non-specific relationships and adds associative
entities. - All entities in this model are given keys.
14Developing a logical model
- A fully attributed model is built, showing all
the attributes to be stored in the system. - A fully described model is built, which defines
each attributes domain (ie. its properties and
legitimate values ).
15Data dictionary
- The data dictionary is an alphabetic list of all
the data items and their definitions. - The data dictionary should be developed at the
same time as the ERD or Class Diagram. - eg.
- campus-code
- unique identification for a record in the
Campus table 1-letter string, in uppercase - department-code
- unique identification for a record in the
Department table 3-letter string, in uppercase
16Data dictionary cont
- Participant
- table of people who have enrolled in classes
_at_participant-id participant-firstname - participant-surname preferred name
participant-gender participant-type - (campus-code) (department-code)
participant-phone (participant-fax)
participant-email
17Data dictionary cont
- participant-email electronic mail address of
this person 50-character string - participant-fax facsimile number of this
person 15-character string - participant-firstname the first given name of
this person in full (ie. Robert, not
Bob)20-character string - participant-gender the gender of this person
Male Female - participant-id unique identification number
for a record in the Participant table
automatically generated by the system when the
record is created 8-character string, minimum
length 8 - (note the alphabetical order of attributes for
the table)
18Data dictionary cont
- participant-password the password used to
verify this person when they access the system
8-character string, minimum of 4-characters - participant-phone telephone number of this
person 15-character string - participant-surname the family name of this
person 30-character string - participant-type where does this person come
from? Academic staff General staff External
staff Student
19Developing the physical model
- Database design transforms the logical data model
into a physical data model (or database schema). - The technical capabilities and limitations of the
database technology are identified and resolved. - Performance requirements are defined.
- Normalisation is used to ensure adaptability,
flexibility and efficiency of stored data. - Construction of the physical model is dependent
on samples of data, facts and information
supplied by users. - Users must verify that the model is correct.
20A Method for Database Design
- Review the logical data model.
- Create a table for each entity.
- Create fields for each attribute.
- Create an index for each primary and secondary
key. - Create an index for each subsetting criterion.
- Designate foreign keys for relationships.
- Define data types, sizes, null settings, domains,
and defaults for each attribute. - Create or combine tables to implement supertype/
subtype structures. - Evaluate and specify referential integrity
constraints.
21Data architecture
- Files, Tables and Databases that store data
- Technology used to store database engine or file
- The distribution of the data
- The organisational structure which manages the
data resources. - responsible for the database technology
- design and construction, security, backup and
recovery, and performance tuning.
22- Common components of data architectures
- Operational (or production) databases, which
support day-to-day operations. - Data warehouses, containing read-only information
extracted from the production databases. - Personal and work group (or departmental)
databases.
23Files versus Database
24Typical DBMS Architecture
25Database management system (DBMS)
- A DBMS is specialised software that is used to
create, access, control and manage the database. - The core of a DBMS is the database engine, which
executes specific commands to create database
structures, and to create, read, update, and
delete records in the database. - Data definition language (DDL) is used by the
DBMS to physically establish the record types,
fields, and structural relationships, and to
define views of the database. - A data manipulation language (DML) is used to
create, read, update and delete records in the
database, and to navigate between different
records. - The DBMS and the DML hide details concerning how
the records are physically organised and
allocated to the disk.
26Relational databases
- Data is stored in tables where relationships
between data in different tables are supported
(through links). - The DDL and DML of most relational databases is
called SQL (structured query language). - Nearly all relational DBMSs support the SQL
language standards (eg. Oracle, IBMs Database
Manager, MS SQL Server, Sybase for large
systems, and MS Access, Foxpro, Paradox, dBASE
for desktop PCs). - Require data to be normalised to prepare the data
model for implementation as a simple,
non-redundant, flexible and adaptive database.
27Relational Database schema
- Database is defined in terms of tables, keys,
indexes, and integrity rules. - Each entity is implemented as a physical table.
- Each table has a primary key, and a index based
on that key. - Each table may have zero or more secondary keys,
and an index based on each of these keys. - A foreign key can be used to match the primary
key in a related table to allow the tables to be
joined. - Attributes are implemented as fields.
28Common Database Terms
- Definitions
- Field - (column or attribute) the smallest unit
of meaningful data that can be stored. - Record - (or row) a collection of related
fields. - Table - collection of similar records.
- Index - a list to assist with table searches or
enforce key integrity. - Database - a collection of inter-related tables.
- Data is a resource that must be controlled and
managed.
29Defining the fields
- Attribute - which data element is it?
- Field name - how will it be identified in the
table? - Data type - string, numeric, boolean, date, ?
- Field size - how many characters, digits,
decimal places? - Null? - can the field be left empty?
- Domain - what range of values are legal?
- Default - what value is initially offered to the
user? - any other relevant information about the
attribute.
30MS-Access example
- Participant table definition
31Indexes
- An index is a way of ordering the records in the
table. - Indexes are used for
- rapid access to data
- searching for a specific record in a table.
- displaying records in a report in a particular
order. - looking up a related table for the record which
matches a foreign key value. - A table may have many indexes, but should have at
least one based on the primary key (ie. the
primary index). - This key is used to preserve uniqueness of the
key
32Indexes (cont)
- Searching for a record without an index requires
a sequential search. - On average, half the records will need to be
searched. - Searching for a record with an index uses a
binary search. - Greatly reduces the number of records that need
to be searched. - If a query will return a large number of records,
using indexes may slow the return of the results.
33MS-Access example
- Indexes for the Participant table
34Data and referential integrity
- Key integrity
- every table should have a primary key (which may
be concatenated). - no two records in a table can have the same
primary key value. - a primary key value must never be Null.
- Domain integrity
- no field can take a value outside its range of
legal values. - Referential integrity
- a foreign key value in one table must have a
matching primary key value in the related table. - record deletion rules must consider this.
35Estimating database capacity
- Database administrator will need to calculate the
amount of disk space required for the database. - Calculation involves
- summing the field sizes to get the record size.
- forecasting the number of records that will be in
the table after a given period of time. - multiplying the number of records by the record
size to get the table size - summing the table sizes to get the database size.
- Adding in a factor (eg. 50) to allow for indexes
and other overheads.
36The Next Generation of Database Design
- Some applications still use network or
heirarchical database technology - but their use
is declining. - Relational database technology (RDBMS) is widely
used in contemporary information systems. - Object database management systems (OODBMS) are
emerging
37Object database management systems
- Object database management systems store true
objects, that is, encapsulated data and the
processes for each object. - Because relational database management systems
are so widely used, this change will happen
slowly. - ODBMS are unproven in business applications
- Object technology is being biuilt into some
relational DBMSs - New object DBMSs have been developed to for the
transition between relational and object models.