Title: Database Design
1 Introduction
- The chapter will address the following questions:
- What are the similarities and differences between conventional files and modern, relational databases?
- What are fields, records, files, and databases? What are some examples of each?
- What is a modern data architecture that includes files, operational databases, data warehouses, personal databases, and work group databases?
- What are the similarities and differences between the roles of systems analyst, data administrator, and database administrator as they relate to databases?
- What is the architecture of a database management system?
2 Introduction
- The chapter will address the following questions:
- How does a relational database implement entities, attributes, and relationships from a logical data model?
- How do you normalize a logical data model to remove impurities that can make a database unstable, inflexible, and non-scaleable?
- How do you transform a logical data model into a physical, relational database schema?
- How do you generate SQL code to create the database structures in a schema?
3 Conventional Files Versus the Database
- Introduction
- All information systems create, read, update, and delete data. This data is stored in files and databases.
- Files are collections of similar records.
- Databases are collections of interrelated files.
- The key word is interrelated.
- The records in each file must allow for relationships (think of them as pointers) to the records in other files.
- In the file environment, data storage is built around the applications that will use the files.
- In the database environment, applications will be built around the integrated database.
5 Conventional Files Versus the Database
- The Pros and Cons of Conventional Files
- Pros
- Conventional files are relatively easy to design and implement because they are normally based on a single application or information system.
- Historically, another advantage of conventional files has been processing speed.
- Cons
- Duplication of data items in multiple files is normally cited as the principal disadvantage of file-based systems.
- A significant disadvantage of files is their inflexibility and non-scaleability.
6 Conventional Files Versus the Database
- The Pros and Cons of Conventional Files
- As legacy file-based systems and applications become candidates for reengineering, the trend is overwhelmingly in favor of replacing them with database systems and applications.
7 Conventional Files Versus the Database
- The Pros and Cons of Database
- Pros
- The principal advantage of a database is the ability to share the same data across multiple applications and systems.
- Database technology offers the advantage of storing data in flexible formats.
- Databases allow the use of the data in ways not originally specified by the end-users (data independence).
- The database scope can even be extended without impacting existing programs that use it.
- New fields and record types can be added to the database without affecting current programs.
8 Conventional Files Versus the Database
- The Pros and Cons of Database
- Cons
- Database technology is more complex than file technology.
- Special software, called a database management system (DBMS), is required.
- A DBMS is still somewhat slower than file technology.
- Database technology requires a significant investment.
- The cost of developing databases is higher because analysts and programmers must learn how to use the DBMS.
- In order to achieve the benefits of database technology, analysts and database specialists must adhere to rigorous design principles.
- Another potential problem with the database approach is the increased vulnerability inherent in the use of shared data.
9 Conventional Files Versus the Database
- Database Design in Perspective
- To fully exploit the advantages of database technology, a database must be carefully designed.
- The end product is called a database schema, a technical blueprint of the database.
- Database design translates the data models that were developed for the system users during the definition phase into data structures supported by the chosen database technology.
- Subsequent to database design, system builders will construct those data structures using the language and tools of the chosen database technology.
11 Database Concepts for the Systems Analyst
- Fields
- Fields are common to both files and databases.
- A field is the implementation of a data attribute.
- Fields are the smallest unit of meaningful data to be stored in a file or database.
- There are four types of fields that can be stored: primary keys, secondary keys, foreign keys, and descriptive fields.
- Primary keys are fields whose values identify one and only one record in a file.
- Secondary keys are alternate identifiers for a record in a file or database.
- A single file in a database may have only one primary key, but it may have several secondary keys.
12 Database Concepts for the Systems Analyst
- Fields
- There are four types of fields that can be stored: primary keys, secondary keys, foreign keys, and descriptive fields. (continued; see the sketch below)
- Foreign keys are pointers to the records of a different file in a database.
- Foreign keys are how the database links the records of one type to those of another type.
- Descriptive fields are any other fields that store business data.
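- To make the four field types concrete, here is a minimal SQL sketch. The CUSTOMER and CUST_ORDER tables and all column names are hypothetical, and exact syntax varies by DBMS.

    -- Hypothetical tables; names are illustrative only.
    CREATE TABLE CUSTOMER (
        CUSTOMER_NUMBER  INTEGER      NOT NULL,   -- primary key
        TAX_ID           CHAR(11),                -- secondary (alternate) key
        CUSTOMER_NAME    VARCHAR(40),             -- descriptive field
        CREDIT_RATING    CHAR(2),                 -- descriptive field
        PRIMARY KEY (CUSTOMER_NUMBER),
        UNIQUE (TAX_ID)
    );

    CREATE TABLE CUST_ORDER (
        ORDER_NUMBER     INTEGER      NOT NULL,   -- primary key
        CUSTOMER_NUMBER  INTEGER      NOT NULL,   -- foreign key pointing to CUSTOMER
        ORDER_DATE       DATE,                    -- descriptive field
        PRIMARY KEY (ORDER_NUMBER),
        FOREIGN KEY (CUSTOMER_NUMBER) REFERENCES CUSTOMER (CUSTOMER_NUMBER)
    );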
13 Database Concepts for the Systems Analyst
- Records
- Fields are organized into records.
- Like fields, records are common to both files and databases.
- A record is a collection of fields arranged in a predefined format.
- During systems design, records will be classified as either fixed-length or variable-length records.
- Most database systems impose a fixed-length record structure, meaning that each record instance has the same fields, same number of fields, and same logical size.
- Variable-length record structures allow different records in the same file to have different lengths.
- Database systems typically disallow (or, at least, discourage) variable-length records.
14 Database Concepts for the Systems Analyst
- Records
- When a computer program reads a record from a database, it actually retrieves a group or block of records at a time.
- This approach minimizes the number of actual disk accesses.
- A blocking factor is the number of logical records included in a single read or write operation (from the computer's perspective). A block is sometimes called a physical record.
- Today, the blocking factor is usually determined and optimized by the chosen database technology, but a qualified database expert may be allowed to fine-tune that blocking factor for performance.
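- As a simple illustration (the numbers are hypothetical): if the physical block size is 4,096 bytes and each logical record is 200 bytes, the blocking factor is 4,096 / 200 = 20 records per block (rounded down), so a single disk access retrieves up to 20 logical records.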
15 Database Concepts for the Systems Analyst
- Files and Tables
- Similar records are organized into groups called files.
- A file is the set of all occurrences of a given record structure.
- In database systems, a file corresponds to a set of similar records, usually called a table.
- A table is the relational database equivalent of a file.
- Some of the types of files and tables include:
- Master files or tables contain records that are relatively permanent.
- Once a record has been added to a master file, it remains in the system indefinitely.
- The values of fields for the record will change over its lifetime, but the individual records are retained indefinitely.
16 Database Concepts for the Systems Analyst
- Files and Tables
- Some of the types of files and tables include: (continued)
- Transaction files or tables contain records that describe business events.
- The data describing these events normally has a limited useful lifetime.
- In information systems, transaction records are frequently retained on-line for some period of time.
- Subsequent to their useful lifetime, they are archived off-line.
- Document files and tables contain stored copies of historical data for easy retrieval and review without the overhead of re-generating the document.
17 Database Concepts for the Systems Analyst
- Files and Tables
- Some of the types of files and tables include: (continued)
- Archival files and tables contain master and transaction file records that have been deleted from on-line storage.
- Records are rarely deleted; they are merely moved from on-line storage to off-line storage.
- Archival requirements are dictated by government regulation and the need for subsequent audit or analysis.
- Table look-up files contain relatively static data that can be shared by applications to maintain consistency and improve performance.
18 Database Concepts for the Systems Analyst
- Files and Tables
- Some of the types of files and tables include: (continued)
- Audit files are special records of updates to other files, especially master and transaction files.
- They are used in conjunction with archive files to recover lost data.
- Audit trails are typically built into better database technologies.
19 Database Concepts for the Systems Analyst
- Databases
- Databases provide for the technical implementation of entities and relationships.
- The history of information systems has led to one inescapable conclusion: data is a resource that must be controlled and managed!
- Out of necessity, database technology was created so an organization could maintain and use its data as an integrated whole instead of as separate data files.
20 Database Concepts for the Systems Analyst
- Databases
- Data Architecture
- A business data architecture comprises the files and databases that store all of the organization's data, the file and database technology used to store the data, and the organization structure set up to manage the data resource.
- Operational databases have been developed to support day-to-day operations and business transaction processing for major information systems.
21 Database Concepts for the Systems Analyst
- Databases
- Data Architecture
- Many information systems shops hesitate to give end-users access to operational databases, because the volume of unscheduled reports and queries could overload the computers and hamper business operations.
- To remedy that problem, data warehouses were developed.
- Data warehouses store data that is extracted from the production databases and conventional files.
- Fourth-generation programming languages, query tools, and decision support tools are then used to generate reports and analyses off these data warehouses.
22 Database Concepts for the Systems Analyst
- Databases
- Data Architecture
- Personal computer and local network database technology has rapidly matured to allow end-users to develop personal and departmental databases.
- These databases may contain unique data, or they may import data from conventional files, operational databases, and/or data warehouses.
23 Database Concepts for the Systems Analyst
- Databases
- Data Architecture
- To manage the enterprise-wide data resource, a staff of database specialists may be organized around the following administrators:
- A data administrator is responsible for data planning, definition, architecture, and management.
- One or more database administrators are responsible for the database technology, database design and construction, security, backup and recovery, and performance tuning.
25 Database Concepts for the Systems Analyst
- Databases
- Database Architecture
- Database architecture refers to the database technology, including the database engine, database management utilities, database CASE tools for analysis and design, and database application development tools.
- The control center of a database architecture is its database management system.
- A database management system (DBMS) is specialized computer software, available from computer vendors, that is used to create, access, control, and manage the database.
- The core of the DBMS is often called its database engine. The engine responds to specific commands to create database structures, and then to create, read, update, and delete records in the database.
26 Database Concepts for the Systems Analyst
- Databases
- Database Architecture
- A systems analyst, or database analyst, designs the structure of the data in terms of record types, the fields contained in those record types, and the relationships that exist between record types.
- These structures are defined to the database management system using its data definition language.
- The data definition language (or DDL) is used by the DBMS to physically establish those record types, fields, and structural relationships.
- Additionally, the DDL defines views of the database. Views restrict the portion of a database that may be used or accessed by different users and programs.
- DDLs record the definitions in a permanent data repository.
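- As a rough illustration of DDL, the following sketch creates one hypothetical table and a view that restricts which columns users can see; the table, view, and column names are made up, and syntax details vary by DBMS.

    -- DDL: establish a record type (table) and its fields
    CREATE TABLE EMPLOYEE (
        EMPLOYEE_ID    INTEGER       NOT NULL,
        EMPLOYEE_NAME  VARCHAR(40)   NOT NULL,
        DEPARTMENT     VARCHAR(20),
        SALARY         DECIMAL(9,2),
        PRIMARY KEY (EMPLOYEE_ID)
    );

    -- DDL: a view that exposes only part of the table (no SALARY column)
    CREATE VIEW EMPLOYEE_DIRECTORY AS
        SELECT EMPLOYEE_ID, EMPLOYEE_NAME, DEPARTMENT
        FROM EMPLOYEE;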
28 Database Concepts for the Systems Analyst
- Databases
- Database Architecture
- Some data dictionaries include formal, elaborate software that helps database specialists track metadata (the data about the data), such as record and field definitions, synonyms, data relationships, validation rules, help messages, and so forth.
- The database management system also provides a data manipulation language to access and use the database in applications.
- A data manipulation language (or DML) is used to create, read, update, and delete records in the database, and to navigate between different records and types of records. The DBMS and DML hide the details concerning how records are organized and allocated to the disk.
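- For illustration, the four basic DML operations (create, read, update, delete) might look like this in SQL against the hypothetical EMPLOYEE table sketched earlier; syntax varies by DBMS.

    INSERT INTO EMPLOYEE (EMPLOYEE_ID, EMPLOYEE_NAME, DEPARTMENT, SALARY)
        VALUES (1001, 'Pat Jones', 'MARKETING', 52000.00);       -- create

    SELECT EMPLOYEE_ID, EMPLOYEE_NAME                            -- read
        FROM EMPLOYEE
        WHERE DEPARTMENT = 'MARKETING';

    UPDATE EMPLOYEE                                              -- update
        SET SALARY = 55000.00
        WHERE EMPLOYEE_ID = 1001;

    DELETE FROM EMPLOYEE                                         -- delete
        WHERE EMPLOYEE_ID = 1001;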
29 Database Concepts for the Systems Analyst
- Databases
- Database Architecture
- Many DBMSs don't require the use of a DDL to construct the database, or a DML to access the database.
- They provide their own tools and commands to perform those tasks. This is especially true of PC-based DBMSs.
- Many DBMSs also include proprietary report-writing and inquiry tools to allow users to access and format data without directly using the DML.
- Some DBMSs include a transaction processing monitor (or TP monitor) that manages on-line accesses to the database and ensures that transactions that impact multiple tables are fully processed as a single unit.
30 Database Concepts for the Systems Analyst
- Databases
- Relational Database Management Systems
- There are several types of database management systems, and they can be classified according to the way they structure records.
- Early database management systems organized records in hierarchies or networks implemented with indexes and linked lists.
- Relational databases implement data in a series of tables that are related to one another via foreign keys.
- Files are seen as simple two-dimensional tables, also known as relations.
- The rows are records.
- The columns correspond to fields.
33 Database Concepts for the Systems Analyst
- Databases
- Relational Database Management Systems
- Both the DDL and DML of most relational databases are called SQL (which stands for Structured Query Language).
- SQL supports not only queries, but complete database creation and maintenance.
- A fundamental characteristic of relational SQL is that commands return a set of records, not necessarily just a single record (as in non-relational database and file technology).
34 Database Concepts for the Systems Analyst
- Databases
- Relational Database Management Systems
- High-end relational databases also extend the SQL language to support triggers and stored procedures.
- Triggers are programs embedded within a table that are automatically invoked by updates to another table.
- Stored procedures are programs embedded within a table that can be called from an application program.
- Both triggers and stored procedures are reusable because they are stored with the tables themselves.
- This eliminates the need for application programmers to create the equivalent logic within each application that uses the tables.
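- A rough sketch of each, using PostgreSQL-flavored syntax (trigger and stored procedure syntax differs noticeably between DBMS products, and all table, function, and procedure names here are hypothetical):

    -- Trigger: automatically write an audit row whenever EMPLOYEE is updated.
    -- EMPLOYEE_AUDIT is a hypothetical audit table.
    CREATE FUNCTION log_employee_update() RETURNS trigger AS $$
    BEGIN
        INSERT INTO EMPLOYEE_AUDIT (EMPLOYEE_ID, CHANGED_AT)
            VALUES (NEW.EMPLOYEE_ID, CURRENT_TIMESTAMP);
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER employee_update_audit
        AFTER UPDATE ON EMPLOYEE
        FOR EACH ROW EXECUTE FUNCTION log_employee_update();

    -- Stored procedure: reusable logic callable from application programs,
    -- e.g. CALL give_raise(1001, 2500.00);
    CREATE PROCEDURE give_raise(emp_id INTEGER, amount DECIMAL)
    LANGUAGE SQL
    AS $$
        UPDATE EMPLOYEE SET SALARY = SALARY + amount WHERE EMPLOYEE_ID = emp_id;
    $$;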
35 Data Analysis for Database Design
- What is a Good Data Model?
- A good data model is simple.
- As a general rule, the data attributes that describe an entity should describe only that entity.
- A good data model is essentially non-redundant.
- This means that each data attribute, other than foreign keys, describes at most one entity.
- A good data model should be flexible and adaptable to future needs.
- We should make the data models as application-independent as possible to encourage database structures that can be extended or modified without impact to current programs.
36 Data Analysis for Database Design
- Data Analysis
- The technique used to improve a data model in preparation for database design is called data analysis.
- Data analysis is a process that prepares a data model for implementation as a simple, non-redundant, flexible, and adaptable database. The specific technique is called normalization.
- Normalization is a technique that organizes data attributes such that they are grouped together to form stable, flexible, and adaptive entities.
37 Data Analysis for Database Design
- Data Analysis
- Normalization is a three-step technique that places the data model into first normal form, second normal form, and third normal form.
- An entity is in first normal form (1NF) if there are no attributes that can have more than one value for a single instance of the entity.
- An entity is in second normal form (2NF) if it is already in 1NF, and if the values of all non-primary key attributes are dependent on the full primary key, not just part of it.
- An entity is in third normal form (3NF) if it is already in 2NF, and if the values of its non-primary key attributes are not dependent on any other non-primary key attributes.
38 Data Analysis for Database Design
- Normalization Example
- First Normal Form
- The first step in data analysis is to place each entity into 1NF.
43 Data Analysis for Database Design
- Normalization Example
- Second Normal Form
- The next step of data analysis is to place the entities into 2NF.
- It is assumed that you have already placed all entities into 1NF.
- 2NF looks for an anomaly called a partial dependency, meaning an attribute(s) whose value is determined by only part of the primary key.
- Entities that have a single-attribute primary key are already in 2NF.
- Only those entities that have a concatenated key need to be checked (see the sketch below).
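- A minimal sketch of removing a partial dependency, using a hypothetical ORDERED_PRODUCT entity whose concatenated key is (ORDER_NUMBER, PRODUCT_NUMBER); PRODUCT_DESCRIPTION depends on PRODUCT_NUMBER alone, so it is moved into its own table. All names and types are illustrative only.

    -- Not yet in 2NF: PRODUCT_DESCRIPTION depends on PRODUCT_NUMBER alone,
    -- which is only part of the concatenated key (ORDER_NUMBER, PRODUCT_NUMBER).
    CREATE TABLE ORDERED_PRODUCT_1NF (
        ORDER_NUMBER        INTEGER NOT NULL,
        PRODUCT_NUMBER      INTEGER NOT NULL,
        QUANTITY_ORDERED    INTEGER,
        PRODUCT_DESCRIPTION VARCHAR(40),   -- partial dependency
        PRIMARY KEY (ORDER_NUMBER, PRODUCT_NUMBER)
    );

    -- In 2NF: the partially dependent attribute moves to its own PRODUCT table.
    CREATE TABLE PRODUCT (
        PRODUCT_NUMBER      INTEGER NOT NULL,
        PRODUCT_DESCRIPTION VARCHAR(40),
        PRIMARY KEY (PRODUCT_NUMBER)
    );

    CREATE TABLE ORDERED_PRODUCT (
        ORDER_NUMBER        INTEGER NOT NULL,
        PRODUCT_NUMBER      INTEGER NOT NULL,   -- also a foreign key to PRODUCT
        QUANTITY_ORDERED    INTEGER,
        PRIMARY KEY (ORDER_NUMBER, PRODUCT_NUMBER)
    );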
45 Data Analysis for Database Design
- Normalization Example
- Third Normal Form
- Entities are assumed to be in 2NF before beginning 3NF analysis.
- Third normal form analysis looks for two types of problems: derived data and transitive dependencies.
- In both cases, the fundamental error is that non-key attributes are dependent on other non-key attributes.
- Derived attributes are those whose values can either be calculated from other attributes, or derived through logic from the values of other attributes.
- A transitive dependency exists when a non-key attribute is dependent on another non-key attribute (other than by derivation).
- Transitive analysis is only performed on those entities that do not have a concatenated key.
46 Data Analysis for Database Design
- Normalization Example
- Third Normal Form
- Third normal form analysis looks for two types of problems: derived data and transitive dependencies. (continued)
- A transitive dependency exists when a non-key attribute is dependent on another non-key attribute (other than by derivation).
- This error usually indicates that an undiscovered entity is still embedded within the problem entity.
- Transitive analysis is only performed on those entities that do not have a concatenated key.
- An entity is said to be in third normal form if every non-primary key attribute is dependent on the primary key, the whole primary key, and nothing but the primary key (see the sketch below).
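- A minimal sketch of removing a transitive dependency, using a hypothetical order entity in which CUSTOMER_NAME depends on the non-key attribute CUSTOMER_NUMBER rather than on the primary key ORDER_NUMBER; the embedded customer entity is split out. All names and types are illustrative only.

    -- Not yet in 3NF: CUSTOMER_NAME is transitively dependent on ORDER_NUMBER
    -- through the non-key attribute CUSTOMER_NUMBER.
    CREATE TABLE CUST_ORDER_2NF (
        ORDER_NUMBER     INTEGER     NOT NULL,
        ORDER_DATE       DATE,
        CUSTOMER_NUMBER  INTEGER     NOT NULL,
        CUSTOMER_NAME    VARCHAR(40),            -- transitive dependency
        PRIMARY KEY (ORDER_NUMBER)
    );

    -- In 3NF: the embedded customer entity becomes its own table, and the
    -- order keeps only the foreign key CUSTOMER_NUMBER.
    CREATE TABLE CUSTOMER_3NF (
        CUSTOMER_NUMBER  INTEGER     NOT NULL,
        CUSTOMER_NAME    VARCHAR(40),
        PRIMARY KEY (CUSTOMER_NUMBER)
    );

    CREATE TABLE CUST_ORDER_3NF (
        ORDER_NUMBER     INTEGER     NOT NULL,
        ORDER_DATE       DATE,
        CUSTOMER_NUMBER  INTEGER     NOT NULL,   -- foreign key to CUSTOMER_3NF
        PRIMARY KEY (ORDER_NUMBER)
    );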
49 Data Analysis for Database Design
- Normalization Example
- Simplification by Inspection
- When several analysts work on a common application, it is not unusual to create problems that won't be taken care of by normalization.
- These problems are best solved through simplification by inspection, a process wherein a data entity in 3NF is further simplified by such efforts as addressing subtle data redundancy.
50 Data Analysis for Database Design
- Normalization Example
- CASE Support for Normalization
- Most CASE tools can only normalize to first normal form.
- They accomplish this in one of two ways.
- They look for many-to-many relationships and resolve those relationships into associative entities.
- They look for attributes specifically described as having multiple values for a single entity instance.
- It is exceedingly difficult for a CASE tool to identify second and third normal form errors.
- That would require the CASE tool to have the intelligence to recognize partial and transitive dependencies.
51 File Design
- Introduction
- Most fundamental entities from the data model would be designed as master or transaction records.
- The master files are typically fixed-length records.
- Associative entities from the data model are typically joined into the transaction records to form variable-length records (based on the one-to-many relationships).
- Other types of files (not represented in the data model) are added as necessary.
- Two important considerations of file design are file access and organization.
- The systems analyst usually studies how each program will access the records in the file (sequentially or randomly), and then selects an appropriate file organization.
52 Database Design
- Introduction
- The design of any database will usually involve the DBA and database staff.
- They will handle the technical details and cross-application issues.
- It is useful for the systems analyst to understand the basic design principles for relational databases.
53 Database Design
- Goals and Prerequisites to Database Design
- The goals of database design are as follows:
- A database should provide for the efficient storage, update, and retrieval of data.
- A database should be reliable; the stored data should have high integrity to promote user trust in that data.
- A database should be adaptable and scaleable to new and unforeseen requirements and applications.
54 Database Design
- Goals and Prerequisites to Database Design
- The data model may have to be divided into multiple data models to reflect database distribution and database replication decisions.
- Data distribution refers to the distribution of either specific tables, records, and/or fields to different physical databases.
- Data replication refers to the duplication of specific tables, records, and/or fields to multiple physical databases.
- Each sub-model or view should reflect the data to be stored on a single server.
55 Database Design
- The Database Schema
- The design of a database is depicted as a special model called a database schema.
- A database schema is the physical model or blueprint for a database. It represents the technical implementation of the logical data model.
- A relational database schema defines the database structure in terms of tables, keys, indexes, and integrity rules.
- A database schema specifies details based on the capabilities, terminology, and constraints of the chosen database management system.
56 Database Design
- The Database Schema
- Transforming the logical data model into a physical relational database schema: rules and guidelines
- Each fundamental, associative, and weak entity is implemented as a separate table.
- The primary key is identified as such and implemented as an index into the table.
- Each secondary key is implemented as its own index into the table.
- Each foreign key will be implemented as such.
- Attributes will be implemented with fields.
- These fields correspond to columns in the table.
57 Database Design
- The Database Schema
- Transforming the logical data model into a physical relational database schema: rules and guidelines (continued)
- The following technical details must usually be specified for each attribute (see the sketch below):
- Data type. Each DBMS supports different data types, and terms for those data types.
- Size of the field. Different DBMSs express precision of real numbers differently.
- NULL or NOT NULL. Must the field have a value before the record can be committed to storage?
- Domains. Many DBMSs can automatically edit data to ensure that fields contain legal data.
- Default. Many DBMSs allow a default value to be automatically set in the event that a user or programmer submits a record without a value.
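- A minimal sketch of how those attribute details might be declared in SQL DDL; the table and column names are hypothetical, and the exact data types and domain syntax depend on the chosen DBMS.

    CREATE TABLE MEMBER_ORDER (
        ORDER_NUMBER   INTEGER       NOT NULL,                    -- data type; NOT NULL
        ORDER_DATE     DATE          DEFAULT CURRENT_DATE,        -- default value
        ORDER_STATUS   CHAR(1)       NOT NULL
                       CHECK (ORDER_STATUS IN ('N', 'S', 'C')),   -- domain of legal values
        ORDER_TOTAL    DECIMAL(7,2),                              -- size and precision
        PRIMARY KEY (ORDER_NUMBER)
    );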
58 Database Design
- The Database Schema
- Transforming the logical data model into a physical relational database schema: rules and guidelines (continued)
- Supertype/subtype entities present additional options as follows:
- Most CASE tools do not currently support object-like constructs such as supertypes and subtypes.
- Most CASE tools default to creating a separate table for each entity supertype and subtype.
- If the subtypes are of similar size and data content, a database administrator may elect to collapse the subtypes into the supertype to create a single table.
- Evaluate and specify referential integrity constraints.
59 Database Design
- Data and Referential Integrity
- There are at least three types of data integrity that must be designed into any database: key integrity, domain integrity, and referential integrity.
- Key Integrity
- Every table should have a primary key (which may be concatenated).
- The primary key must be controlled such that no two records in the table have the same primary key value.
- The primary key for a record must never be allowed to have a NULL value.
60 Database Design
- Data and Referential Integrity
- Domain Integrity
- Appropriate controls must be designed to ensure that no field takes on a value that is outside of the range of legal values.
- Referential Integrity
- A referential integrity error exists when a foreign key value in one table has no matching primary key value in the related table.
61 Database Design
- Data and Referential Integrity
- Referential Integrity
- Referential integrity is specified in the form of deletion rules as follows:
- No restriction. Any record in the table may be deleted without regard to any records in any other tables.
- Delete: Cascade. A deletion of a record in the table must be automatically followed by the deletion of matching records in a related table.
- Delete: Restrict. A deletion of a record in the table must be disallowed until any matching records are deleted from a related table.
62 Database Design
- Data and Referential Integrity
- Referential Integrity
- Referential integrity is specified in the form of deletion rules as follows: (continued; see the SQL sketch below)
- Delete: Set Null. A deletion of a record in the table must be automatically followed by setting any matching keys in a related table to the value NULL.
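- In SQL, these deletion rules are typically expressed as ON DELETE options on the foreign key constraint. A minimal sketch using a hypothetical SALES_ORDER table that references the CUSTOMER table sketched earlier (only one rule would be chosen per foreign key):

    CREATE TABLE SALES_ORDER (
        ORDER_NUMBER     INTEGER NOT NULL,
        CUSTOMER_NUMBER  INTEGER,
        PRIMARY KEY (ORDER_NUMBER),
        FOREIGN KEY (CUSTOMER_NUMBER) REFERENCES CUSTOMER (CUSTOMER_NUMBER)
            ON DELETE CASCADE    -- or: ON DELETE RESTRICT / ON DELETE SET NULL
    );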
63 Database Design
- Roles
- Some database shops insist that no two fields have exactly the same name.
- This presents an obvious problem with foreign keys.
- A role name is an alternate name for a foreign key that clearly distinguishes the purpose that the foreign key serves in the table (see the sketch below).
- The decision to require role names or not is usually established by the data or database administrator.
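- For example (a hypothetical sketch), a SHIPMENT table might carry two foreign keys to the same WAREHOUSE table; role names distinguish the purpose each foreign key serves. The WAREHOUSE table and all names are assumed for illustration.

    CREATE TABLE SHIPMENT (
        SHIPMENT_NUMBER          INTEGER NOT NULL,
        ORIGIN_WAREHOUSE_ID      INTEGER NOT NULL,   -- role name for WAREHOUSE_ID
        DESTINATION_WAREHOUSE_ID INTEGER NOT NULL,   -- role name for WAREHOUSE_ID
        PRIMARY KEY (SHIPMENT_NUMBER),
        FOREIGN KEY (ORIGIN_WAREHOUSE_ID)      REFERENCES WAREHOUSE (WAREHOUSE_ID),
        FOREIGN KEY (DESTINATION_WAREHOUSE_ID) REFERENCES WAREHOUSE (WAREHOUSE_ID)
    );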
64 Database Design
- Database Prototypes
- Prototyping is not an alternative to carefully thought-out database schemas.
- On the other hand, once the schema is completed, a prototype database can usually be generated very quickly.
- Most modern DBMSs include powerful, menu-driven database generators that automatically create the DDL and generate a prototype database from that DDL.
- A database can then be loaded with test data that will prove useful for prototyping and testing outputs, inputs, screens, and other system components.
65 Database Design
- Database Capacity Planning
- A database is stored on disk.
- The database administrator will want an estimate of disk capacity for the new database to ensure that sufficient disk space is available.
- Database capacity planning can be calculated with simple arithmetic as follows:
- For each table, sum the field sizes. This is the record size for the table.
- For each table, multiply the record size times the number of entity instances to be included in the table. This is the table size.
66 Database Design
- Database Capacity Planning
- Database capacity planning can be calculated with simple arithmetic as follows: (continued)
- Sum the table sizes. This is the database size.
- Optionally, add a slack capacity buffer (e.g., 10 percent) to account for unanticipated factors or inaccurate estimates above. This is the anticipated database capacity.
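- A small worked example with made-up numbers: if a CUSTOMER table has a 200-byte record size and 50,000 expected rows, its table size is 200 x 50,000 = 10,000,000 bytes (about 10 MB); if an ORDER table contributes another 30 MB, the database size is roughly 40 MB, and a 10 percent slack buffer brings the anticipated database capacity to about 44 MB.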
67 Database Design
- Database Structure Generation
- CASE tools are frequently capable of generating SQL code for the database directly from a CASE-based database schema.
- This code can be exported to the DBMS for compilation.
- Even a small database model can require 50 pages or more of SQL data definition language code to create the tables, indexes, keys, fields, and triggers.
- Clearly, a CASE tool's ability to automatically generate syntactically correct code is an enormous productivity advantage.
- Furthermore, it almost always proves easier to modify the database schema and re-generate the code than to maintain the code directly.
68 The Next Generation of Database Design
- Introduction
- Relational database technology is widely deployed and used in contemporary information system shops.
- One new technology is slowly emerging that could ultimately change the landscape dramatically: object database management systems.
- The heir apparent to relational DBMSs, object database management systems store true objects, that is, encapsulated data and all of the processes that can act on that data.
- Because relational database management systems are so widely used, we don't expect this change to happen quickly.
- It is expected that the established relational DBMS vendors will either build object technology into their existing relational DBMSs, or they will create new object DBMSs and provide for the transition between relational and object models.
69 Summary
- Introduction
- Conventional Files Versus the Database
- Database Concepts for the Systems Analyst
- Data Analysis for Database Design
- File Design
- Database Design
- The Next Generation of Database Design