Title: Database Management
1Database Management
- COMP 630 R1FF
- Winter 2005
- Instructor Rama Gudhe
2Course Logistics
- Instructor information
- Roster
- Materials
- Communications
- Grading policy
- Submission return policy
- Academic integrity
3Module 1The Database Environment
4Data vs. Information
- Data
- Raw facts building blocks of information
- Unprocessed information
- Information
- Data processed to reveal meaning
- Accurate, relevant, and timely information is key
to good decision making - Good decision making is key to survival in global
environment
5Sales per Employee for Each of ROBCORs Two
Divisions
6Basic Definitions
- Database A collection of related data.
- Data Known facts that can be recorded and have
an implicit meaning. - Database Management System (DBMS) A software
package/ system to facilitate the creation and
maintenance of a computerized database.
7Traditional File Processing
- Program-data dependence
- Data redundancy (duplication of data)
- Limited data sharing
- Excessive program maintenance
8Database Management System
- A DBMS is a data storage and retrieval system
which permits data to be stored non-redundantly
while making it appear to the user as if the data
is well-integrated.
9Data Models
- Data Model A set of concepts to describe the
structure of a database, and certain constraints
that the database should obey. - Data Model Operations Operations for specifying
database retrievals and updates by referring to
the concepts of the data model. Operations on the
data model may include basic operations and
user-defined operations.
10The Evolution of Data Models
- Hierarchical
- Network
- Relational
- Object oriented
11The Hierarchical ModelEvolution
- GUAM (Generalized Update Access Method)
- Based on the recognition that the many smaller
parts would come together as components of still
larger components - Information Management System (IMS)
- Worlds leading mainframe hierarchical database
system in the 1970s and early 1980s
12Hierarchical Model
- Implemented in a joint effort by IBM and North
American Rockwell around 1965. Resulted in the
IMS family of systems. - Other system based on this model System 2k (SAS
inc.) - Corresponds to a number of natural hierarchically
organized domains - e.g., assemblies in
manufacturing, personnel organization in
companies - Language is simple uses constructs like GET, GET
UNIQUE, GET NEXT, GET NEXT WITHIN PARENT etc. - Navigational and procedural nature of processing
- Little scope for "query optimization"
13The Hierarchical ModelCharacteristics
- Basic concepts form the basis for subsequent
database development - Limitations lead to a different way of looking at
database design - Basic concepts show up in current data models
- Best understood by examining manufacturing process
14A Hierarchical Structure
15Hierarchical StructureCharacteristics
- Each parent can have many children
- Each child has only one parent
- Tree is defined by path that traces parent
segments to child segments, beginning from the
left - Hierarchical path
- Ordered sequencing of segments tracing
hierarchical structure
16The Hierarchical Model
- Advantages
- Conceptual simplicity
- Database security
- Data independence
- Database integrity
- Efficiency
17The Hierarchical Model (continued)
- Disadvantages
- Complex implementation
- Difficult to manage
- Lacks structural independence
- Complex applications programming and use
- Implementation limitations
- Lack of standards
18Network Model
- Network Model the first one to be implemented by
Honeywell in 1964-65 (IDS System). Adopted
heavily due to the support by CODASYL (CODASYL -
DBTG report of 1971). - Later implemented in a large variety of systems -
IDMS (Cullinet - now CA), DMS 1100 (Unisys),
IMAGE (HP), VAX -DBMS (Digital Equipment Corp). - Network Model is able to model complex
relationships and represents semantics of
add/delete on the relationships - Can handle most situations for modeling using
record types and relationship types - Database contains a complex array of pointers
that thread through a set of records. Little
scope for automated "query optimization
19The Network Model
- Created to
- Represent complex data relationships more
effectively - Improve database performance
- Impose a database standard
- Conference on Data Systems Languages (CODASYL)
- American National Standards Institute (ANSI)
- Database Task Group (DBTG)
20Crucial Database Components
- Schema
- Conceptual organization of entire database as
viewed by the database administrator - Subschema
- Defines database portion seen by the
application programs that actually produce the
desired information from data contained within
the database - Data Management Language (DML)
- Define data characteristics and data structure in
order to manipulate the data
21Data Management Language Components
- Schema Data Definition Language (DDL)
- Enables database administrator to define schema
components - Subschema DDL
- Allows application programs to define database
components that will be used - DML
- Manipulates database contents
22Network ModelBasic Structure
- Resembles hierarchical model
- Collection of records in 1M relationships
- Set
- Relationship
- Composed of at least two record types
- Owner
- Equivalent to the hierarchical models parent
- Member
- Equivalent to the hierarchical models child
23A Network Data Model
24The Network Data Model
- Advantages
- Conceptual simplicity
- Handles more relationship types
- Data access flexibility
- Promotes database integrity
- Data independence
- Conformance to standards
25The Network Data Model (continued)
- Disadvantages
- System complexity
- Lack of structural independence
26The Relational Model
- Developed by Codd (IBM) in 1970
- Non-procedural language for accessing data
- Conceptually simple
- Computers lacked power to implement the
relational model - Today, microcomputers can run sophisticated
relational database software
27The Relational ModelBasic Structure
- Relational Database Management System (RDBMS)
- Performs same basic functions provided by
hierarchical and network DBMS systems, plus other
functions - Most important advantage of the RDBMS is its
ability to let the user/designer operate in a
human logical environment
28The Relational ModelBasic Structure (continued)
- Table (relations)
- Matrix consisting of a series of row/column
intersections - Related to each other by sharing a common entity
characteristic - Relational schema
- Visual representation of relational databases
entities, attributes within those entities, and
relationships between those entities
29Relational Table
- Stores a collection of related entities
- Resembles a file
- Relational table is purely logical structure
- How data are physically stored in the database is
of no concern to the user or the designer - This property became the source of a real
database revolution
30A Relational Schema
31Linking Relational Tables
32The Relational Model
- Advantages
- Structural independence
- Improved conceptual simplicity
- Easier database design, implementation,
management, and use - Ad hoc query capability
- Powerful database management system
33The Relational Model (continued)
- Disadvantages
- Substantial hardware and system software overhead
- Can facilitate poor design and implementation
34Object Data Models
- Object-Relational Models Most Recent Trend.
Started with Informix Universal Server.
Exemplified in the latest versions of Oracle-10i,
DB2, and SQL Server etc. systems. - Object-oriented Data Model(s) several models
have been proposed for implementing in a database
system. One set comprises models of persistent
O-O Programming Languages such as C (e.g., in
OBJECTSTORE or VERSANT), and Smalltalk (e.g., in
GEMSTONE). Additionally, systems like O2, ORION
(at MCC - then ITASCA), IRIS (at H.P.- used in
Open OODB).
35Internet, Intranet, and Extranet Databases
- Web-enabled
- Examples reflect B2C, B2B
- Large number of potential users
36Characteristics of the Database Approach
- Data independence/reduced maintenance
- Improved data sharing
- Increased application development productivity
- Enforcement of standards
- Improved data quality with constraints
- Better data accessibility/ responsiveness
- Security, backup/recovery, concurrency
37Section IIThree-Schema Architecture
- Module 1 The Database Environment
38Three-Schema Architecture
- Proposed to support DBMS characteristics of
- Program-data independence.
- Support of multiple views of the data.
39Three-Schema Architecture
- Internal schema at the internal level to describe
physical storage structures and access paths.
Typically uses a physical data model. - Conceptual schema at the conceptual level to
describe the structure and constraints for the
whole database for a community of users. Uses a
conceptual or an implementation data model. - External schemas at the external level to
describe the various user views. Usually uses the
same data model as the conceptual level.
40Three-Schema Architecture
- Mappings among schema levels are needed to
transform requests and data. Programs refer to an
external schema, and are mapped by the DBMS to
the internal schema for execution.
41The three-schema architecture
42Data Independence
- Logical Data Independence The capacity to change
the conceptual schema without having to change
the external schemas and their application
programs. - Physical Data Independence The capacity to
change the internal schema without having to
change the conceptual schema.
43Data Independence
- When a schema at a lower level is changed, only
the mappings between this schema and higher-level
schemas need to be changed in a DBMS that fully
supports data independence. The higher-level
schemas themselves are unchanged. Hence, the
application programs need not be changed since
they refer to the external schemas.
44Section IIIDatabase Management System (DBMS)
- Module 1 The Database Environment
45Typical DBMS Functionality
- Define a database in terms of data types,
structures and constraints - Construct or load the database
- Manipulating the database querying, generating
reports, insertions, deletions and modifications
to its content - Concurrent Processing and Sharing by a set of
users and programs yet, keeping all data valid
and consistent
46Typical DBMS Functionality
- Other features
- Protection or Security measures to prevent
unauthorized access - Active processing to take internal actions on
data - Presentation and Visualization of data
47Classification of DBMSs
- Based on the data model used
- Traditional Relational, Network, Hierarchical.
- Emerging Object-oriented, Object-relational.
- Other classifications
- Single-user (typically used with micro-
computers) vs. multi-user (most DBMSs). - Centralized (uses a single computer with one
database) vs. distributed (uses multiple
computers, multiple databases)
48Classification of DBMSs
- A Distributed Database Systems (DDBMS) can have
the actual database and DBMS software distributed
over many sites, connected by a computer network.
49Variations of Distributed Environments
- Homogeneous DDBMS
- Heterogeneous DDBMS
- Federated or Multidatabase Systems
50Simplified database system environment.
51Main Characteristics of the Database Approach
- Self-describing nature of a database system A
DBMS catalog stores the description of the
database. The description is called meta-data).
This allows the DBMS software to work with
different databases. - Insulation between programs and data Called
program-data independence. Allows changing data
storage structures and operations without having
to change the DBMS access programs.
52Main Characteristics of the Database Approach
- Data Abstraction A data model is used to hide
storage details and present the users with a
conceptual view of the database. - Support of multiple views of the data Each user
may see a different view of the database,
which describes only the data of interest to
that user.
53Main Characteristics of the Database Approach
- Sharing of data and multiuser transaction
processing allowing a set of concurrent users
to retrieve and to update the database.
Concurrency control within the DBMS guarantees
that each transaction is correctly executed or
completely aborted. OLTP (Online Transaction
Processing) is a major part of database
applications.
54Database Users
- Users may be divided into those who actually use
and control the content (called Actors on the
Scene) and those who enable the database to be
developed and the DBMS software to be designed
and implemented (called Workers Behind the
Scene).
55Actors on the scene
- Database administrators responsible for
authorizing access to the database, for
coordinating and monitoring its use, acquiring
software, and hardware resources, controlling its
use and monitoring efficiency of operations. - Database Designers responsible to define the
content, the structure, the constraints, and
functions or transactions against the database.
They must communicate with the end-users and
understand their needs. - End-users they use the data for queries, reports
and some of them actually update the database
content.
56Advantages of Using the Database Approach
- Controlling redundancy in data storage and in
development and maintenance efforts. - Sharing of data among multiple users.
- Restricting unauthorized access to data.
- Providing persistent storage for program Objects
- Providing Storage Structures for efficient Query
Processing
57Advantages of Using the Database Approach
- Providing backup and recovery services.
- Providing multiple interfaces to different
classes of users. - Representing complex relationships among data.
- Enforcing integrity constraints on the database.
58Additional Implications of Using the Database
Approach
- Potential for enforcing standards this is very
crucial for the success of database applications
in large organizations Standards refer to data
item names, display formats, screens, report
structures, meta-data (description of data) etc. - Reduced application development time incremental
time to add each new application is reduced.
59Additional Implications of Using the Database
Approach
- Flexibility to change data structures database
structure may evolve as new requirements are
defined. - Availability of up-to-date information very
important for on-line transaction systems such as
airline, hotel, car reservations. - Economies of scale by consolidating data and
applications across departments.
60Extending Database Capabilities
- New functionality is being added to DBMSs
- Scientific Applications
- Image Storage and Management
- Audio and Video data management
- Data Mining
- Spatial data management
- Time Series and Historical Data Management
- The above gives rise to new research and
development in incorporating new data types,
complex data structures, new operations and
storage and indexing schemes in database systems.
61 When not to use a DBMS
- Main inhibitors (costs) of using a DBMS
- High initial investment and possible need for
additional hardware. - Overhead for providing generality, security,
concurrency control, recovery, and integrity
functions. - When a DBMS may be unnecessary
- If the database and applications are simple, well
defined, and not expected to change. - If there are stringent real-time requirements
that may not be met because of DBMS overhead. - If access to data by multiple users is not
required.
62Database System Utilities
- To perform certain functions such as
- Loading data stored in files into a database.
Includes data conversion tools. - Backing up the database periodically on tape.
- Reorganizing database file structures.
- Report generation utilities.
- Performance monitoring utilities.
- Other functions, such as sorting, user
monitoring, data compression, etc.
63Component modules of a DBMS and their
interactions.
64Section IVRelational Model
- Module 1 The Database Environment
65Relational Databases
- Uses rows and columns to model data (spreadsheet)
- Columns represent attributes or
relationships between entities - Rows represent instances of the relations
66Implementing the RDBMS
- Implementing the RDBMS goal is organization wide
integration - Doing several isolated stand along applications
introduces same problems as file processing
systems - Need for historical or summary information drives
data warehouse data mart concept
67Enterprise Data Model
- Enterprise data model is a graphical model that
shows the high-level entities for the
organization and associations among those
entities. - Entity is an object or concept that is
important to the business. - Relationship is the association that describe
how the entities are related.
68Assignment Requirement
- All work is to be word processed
- Retain a copy for your own reference
- Turn in a hardcopy in classroom
- Late policy 1 week late, 90 maximum more than
1 week late, 70 maximum
69Assignment 1-1
- Some advantages of using database approach are
listed. Which one is the most significant? - One paragraph giving your choice and a
justification - Another paragraph comparing the limitation of
file system approach in the same aspect - Cite any references with APA format