Title: Data and Database Administration
1Data and Database Administration
2Chapter Goals
- DA and DBA major functions/changing roles
- Functions and components of DBMS
- DB security and techniques to enhance
- DB recovery facilities
- Concurrency control pessimistic v. optimistic
- Data dictionaries and information repositories
3Ineffective Administration
- Multiple definitions and inconsistent
representation of data elements - Missing key data elements that allow for
integration - Low data quality
- Inadequate familiarity with existing data,
including what data are available, where, and
what they mean.
4Definitions
- Data Administration A high-level function that
is responsible for the overall management of data
resources in an organization, including
maintaining corporate-wide definitions and
standards. - Database Administration A technical function
that is responsible for physical database design
and for dealing with technical issues such as
security enforcement, database performance, and
backup and recovery.
5Data Administration Functions
- Data policies, procedures, standards.
- Planning understanding needs.
- Data conflict (ownership) resolution.
- Internal marketing of DA concepts.
- Managing the data repository.
6Database Administration Functions
- Selection of hardware and software.
- Managing data security, privacy, and integrity.
- Data backup and recovery.
- Fig. 13-1 is a list of DA and DBA functions.
7Changing Admin. Approaches
- Variation from one org. to another
- Blending of DA and DBA roles
- Quinlan
- DB Planning--Improve tech. Selection
- DB Analysis--Work on physical in parallel
- DB Design--Prioritize application transactions
- DB Implementation--modularize for rapid
implementation - Operation and Maintenance--review to see that
quality has not been compromised
8Modeling Enterprise Data
- Information Systems Architecture
- Similar to other complex product development and
manufacturing processes. - Zachman--ISA is that set of descriptive
representations (models) that are relevant for
describing an enterprise such that that
enterprise can be produced to managements
requirements and maintained over the period of
its useful life.
9Planning for Databases
- Design occurs within the constraints of the
enterprise information system. - What data, what processes, which locations, which
work units and people, how will it fit with
significant business functions? - Must include continuing use of existing system.
- Initial assessment, operational feasibility,
formal review.
10Data Security
- Data Security Protection of the data against
accidental or intentional loss, destruction, or
misuse.
11Possible locations of data security threats
12Threats to Data Security
- Accidental losses attributable to
- Human error.
- Software failure.
- Hardware failure.
- Theft and fraud.
- Improper data access
- Loss of privacy (personal data).
- Loss of confidentiality (corporate data).
- Loss of data integrity.
- Loss of availability (through, e.g. sabotage).
13Data Management Software Security Features
- Views or subschemas.
- Authorization rules.
- User-defined procedures.
- E.g. trigger an authorization procedure which
asks additional identification questions. - Encryption.
- Authentication schemes.
- Backup, journaling, checkpoint facilities
14Authorization matrix
15Backup and RecoveryBasic Recovery Facilities
- Backup Facilities
- Journalizing Facilities
- Transaction Log - time, records, input values.
- Database Change Log - before after images.
- Checkpoint Facility
- Quiet period for synchronizing files and journals
- Restart point after a failure.
- Recovery Manager
16Database audit trail
17Recovery and Restart Procedures
- Switch - Mirrored databases.
- Restore/Rerun - Reprocess transactions against
the backup. - Transaction Integrity - Commit or abort all
transaction changes. - Backward Recovery (Rollback) - Apply before
images. Fig. 13-7. - Forward Recovery (Roll Forward) - Apply after
images (preferable to restore/rerun.)
18Basic recovery techniques (a) Rollback
19(b) Rollforward
20Types of Database Failure
- Aborted transactions.
- Incorrect data.
- Backward recovery
- Compensating transactions
- Restart from checkpoint before error occurred
- System failure (database intact.)
- Database destruction
21Concurrency Control
- Concurrency Control - The process of managing
simultaneous operations against a database so
that data integrity is maintained and the
operations do not interfere with each other in a
multi-user environment. - Lost Update Problem - Fig. 13-8 13-9.
- Locking out other users--pessimistic
- Versioning--optimistic
22Figure 13-8 Lost Update Problem
23Figure 13-9 Updates with locking concurrency
control
24Concurrency Control Techniques
- Serializability - Procedures that process
transactions as if one transaction if finished
before starting another. Achieved through
different means. - Locking Mechanisms
- Locking level (Granularity)
- Database
- Table
- Block or page
- Record
- Field
25Concurrency Control Techniques
- Types of locks
- Shared lock - Read but not update permitted.
Used when just reading to prevent another user
from placing an exclusive lock on the record. - Exclusive lock - No access permitted. Used when
preparing to update.
26Deadlock
- Deadlock - Transactions wait for each other to
release needed resources. Fig. 13-10, 11. - Deadlock prevention
- Lock all records required at the beginning of a
transaction - Two-phase locking protocol
- Growing phase.
- Shrinking phase.
- May be difficult to determine all needed
resources in advance.
27Figure 13-10 Problem of Deadlock
28Figure 13-11 Another example of deadlock
29Deadlock
- Deadlock Resolution
- Allow deadlocks to occur.
- Mechanisms for detecting and breaking them.
- Resource usage matrix.
30Concurrency Control Techniques
- Versioning
- Optimistic approach to concurrency control.
- Assumption is that simultaneous updates will be
infrequent. - Each transaction can attempt an update as it
wishes. - The system will reject an update when it senses a
conflict.
31Figure 13-12 The use of versioning
32Managing Data Quality
- Data Steward - Liaisons between IT and business
units. - Five Data Quality Issues
- Security policy and disaster recovery.
- Personnel controls.
- Physical access controls.
- Maintenance controls (hardware software.)
- Data protection and privacy.
33Data Dictionaries and Repositories
- Passive data dictionary.
- Active data dictionary.
- System catalog.
- Repository (CASE tools).
- Information Repository Dictionary System (IRDS) -
standards for dictionary storage and access.
34More on the Repository
- Information repository
- Stores metadata that describe an organizations
data and data processing resources. Manages the
total information processing environment. - Information Repository Dictionary System
- A computer software tool that is used to manage
and control access to the information repository.
35Repository System Components
- Information Model--Schema for information stored
in system - Repository Engine--Manages objects
- Object management
- Dynamic extensibility
- Relationship management
- Version management
- Configuration management
- Repository Database--Stores objects
36Figure 13-13 Three components of repository
system architecture