Title: Database Design and Distributed Systems Architecture
1Database Designand Distributed Systems
Architecture
2Where and when to find me
- Mahmood Moussavi Phone (403) 220-6231 Office
Location ICT-541 Email moussam_at_ucalgary.ca - Office Hours
- Monday, Wednesday, Thursday
- 1300 1400 P.M.
- By Appointment
3Course Focus
- To gain experience in database development using
data modelling and database design methodology. - To gain an understanding of Relational Models and
Languages - To gain experience in Relational Database
Management Systems (RDBMS) - To gain an understanding of emerging trends.
- To gain an understanding of distributed database
system architectures
4Calendar Description
- This course covers principles and concepts of
relational database design, including problem
analysis, data modeling, database design, data
retrieval, and data manipulation, using typical
development methodologies and tools. This course
also introduces concepts and basics of
distributed database systems architecture, and
covers topics such as design and implementation
of distributed database systems.
5Textbook and Other Resources
- Required Textbook
- Database Systems A Practical Approach to Design,
Implementation and Management Third Edition.
Thomas Connolly, Carolyn Begg. Addison Wesley. - Any Oracle textbook or resources
- Other Recommended books
- Database System Concepts, Fourth Edition,
Silberschatz, Korth, and Sudarshan. McGraw-Hill. - Database and Transaction Processing, An
Application Oriented Approach, Philip M. Lewis
Arthur Bernstein, Michael Kifer. Addison Wesley.
6Lecture Topics
- Because this is the first exposure of the
students to the databases systems, this course
begins with basic concepts, leads naturally into
the use of a development tool (Oracle), design,
and finally introduces some basic elements of
distributed database systems.
7Some of the Major Topics
- Introduction to DBMSÂ
- Advantages and disadvantages of database approach
- Components of database management system
- Database concepts and architecture
- Data Modelling
- Analysis and design of database system.
8Some of the Major Topics
- Query languages relational algebra, and
relational calculus. - Various ways of retrieving the required data form
one or more tables. - Introduction to entity relationship models
- SQL commands Data Manipulation (DML)
- SQL commands Data Definition (DDL).
9Some of the Major Topics
- SQL commands insert, drop, or modify data in a
table. - Basic SQL administrative commands such as
- grant or revoke the privileges.
- Introduction to procedures and functions.
- Entity integrity and referential integrity.
- Advanced topics on entity relationship models
- Closer look at the database analysis and design,
and normalization of database.
10Some of the Major Topics
- Storage and Indexing (B tree)
- Miscellaneous Topics
- Query Processing and Optimisation
- Data Warehousing, Data Mining etc.
- Basic elements of distrusted database systems
architecture. - Thin client concept, server side programming, and
interfacing databases
11Assignments
- There will be a scheduled lab every Wednesday,
starting the week of January 24th. - A student's overall assignment mark is based on
all - lab assignments
- final project Â
- Lab assignments must be dropped in the assignment
box(s) designated for this course, located on the
second floor of the ICT building.
12Grading Scheme
- Quizzes 10
- Assignments and projects 20
- Mid-session test(s) 30
- Final exam 40
13Introduction
- What is a database system
- A collection of related data
- An integral part of our day-to-day life.
- Examples in the
- Supermarkets
- Travel agencies
- Libraries
- E-Commerce
- University Registration Systems
- Etc
14File-Based Systems
- Collection of application programs that perform
services for the end users (e.g. reports). - Each program defines and manages its own data.
15Limitations of File-Based Approach
- Separation and isolation of data
- Each program maintains its own set of data.
- Users of one program may be unaware of
potentially useful data held by other programs. - Duplication of data
- Same data is held by different programs.
- Can lead to loss of data integrity
- Wasted space.
16Limitations of File-Based Approach
- Data Dependence
- The structure of the data file is defined in the
application program. - Changes in data can lead to changes to all
applications. - Incompatible file formats
- Programs are written in different languages, and
so cannot easily access each others files. - A file produced by C many not be usable for COBOL
- Any new requirement needs a new program.
17Database Approach
- Shared collection of logically related data (and
a description of this data), designed to meet the
information needs of an organization. - Is a Single, large repository of data which can
be used simultaneously by many users.
18Database Approach
- The database holds not only the organization of
data but also a description of this data (data
about data). - The description of data is known as System
Catalog, or Data Dictionary, or Meta Data. - Definition of data is separated from the
application program. Similar to separation of
objects public interface from its implementation
in object-oriented programming.
19Data Orientation Evolution
80s - Today
20Database Management System (DBMS)
- A Database Management System (DBMS) is a
software that enables users to define, create,
and maintain the database.
21DBMS Environment
- Hardware
- Can range from a PC to a network of computers.
- Software
- DBMS, operating system, network software (if
necessary) and also the application programs. - Data
- Used by the organization and a description of
this data called the schema. - Procedures
- Instructions and rules that should be applied to
the design and use of the database and DBMS. - People
- Data Administrator (DA)
- Database Administrator (DBA)
- Database Designers (Logical and Physical)
- Application Programmers
- End Users (naive and sophisticated)
22Components of Database Management System?
- The common functions of DBMS are
- Storage Manager
- Retrieval and Update
- Data Storage (Data, Metadata)
- Primary
- Secondary
- Disk
- Tape
- A User-Accessible Catalog.
- Transaction Manager.
- Concurrency Control Services.
DDL command
DML command
Query Processor
Transaction Manger
Storage Manger
23Database Management System (DBMS)
- A DBMS should also include
- A security system (prevents unauthorized access).
- An integrity system (maintains consistency of
data) . - A concurrency control system (allows shared
access). - A recovery control system (recovers the data base
to previous consistent state in case of failure)
. - A user-accessible catalog (provides access to
metadata).
24Summary
- To ensure the integrity of the data, we require
that the database systems maintain the following
properties, known as ACID - Atomicity the system must ensure that either the
transaction runs to completion (commits) or, if
it does not complete, it has no effect at all
(aborts) - Consistency Rules of the enterprise generally
limit the occurrence of certain real-world
events. - Student cannot register for a course if has not
completed the course prerequisites. - Isolation Deals with the execution of multiple
transactions concurrently. - Durability The system must ensure that once a
transaction commits its effect on the database
state is not lost in spite of subsequent failures.
25Summary
- Other requirements include
- High Throughput many users gt many
transactions/sec - Low Response Time on-line gt users are waiting
- Long Lifetime complex systems are not easily
replaced - Must be designed so they can be easily extended
as the needs of the enterprise change - Security sensitive information must be carefully
protected since system is accessible to many
users - Authentication, authorization, encryption
26Data and Its Structure
- Data is actually stored as bits, but it is
difficult to work with data at this level. - It is convenient to view data at different levels
of abstraction. - Schema Description of data at some level.
- Each level has its own schema.
- We will be concerned with three schemas
physical, conceptual, and external.
27ANSI-SPARC Three-Level Architecture
- External Level
- The users view of the database.
- Conceptual Level
- The community view of the database
- Internal Level
- Physical representation of the database on the
computer. - Describes how the data is stored in the database.
28What is the external level(View)
- A view mechanism
- Provides users with only the data they want or
need to use. - Reduce complexity
- Provide a level of security
- Provide a mechanism to customize the appearance
of the database - Present a consistent, unchanging picture of the
structure of the database, even if the underlying
database is changed.
29ANSI-SPARC Three-Level Architecture
30ANSI-SPARC Objectives
- Two Types of Data Independence
- Logical Data Independence
- Refers to immunity of external schemas to changes
in conceptual schema. - Conceptual schema changes (add/remove), should
not require changes to external schema or
rewrites of application programs. - Physical Data Independence
- Refers to immunity of conceptual schema to
changes in the internal schema. - Internal schema changes (e.g. using different
file organizations, storage structures/devices),
should not require change to conceptual or
external schemas.
31Multi-User DBMS Architectures
- Teleprocessing
- Traditional architecture.
- Single mainframe with a number of terminals
attached. - Trend is now towards downsizing.
- File-server
- File-server is connected to several workstations
across a network. - Client-server
- Server holds the database and the DBMS.
32Teleprocessing Topology
33File-Server
- Database resides on file-server.
- DBMS and applications run on each workstation.
- Disadvantages include
- Significant network traffic.
- Copy of DBMS on each workstation.
- Concurrency, recovery and integrity control more
complex.
34Client-Server Architecture
- Client manages user interface and runs
applications. - Advantages include
- wider access to existing databases
- increased performance
- possible reduction in hardware costs
- reduction in communication costs
- increased consistency.
35Transaction Processing Monitor as middle tier of
a three-tier client-server architecture
- Program that controls data transfer between
clients and servers in order to provide a
consistent environment, particularly for Online
Transaction Processing (OLTP).