Title: Course of Database and System Analysis
1Course of Database and System Analysis
2 3Introduction
- Database System it is a computerized system
whose overall purpose is to store information and
to allow users to retrieve and update that
information on demand. - Data refer to what actually stored in the
database. - Information it refer to the meaning of that data
as understood by some users. - Single-User System is a system in which at most
one user can access the database at any given
time. - Multi-User System is a system in which many
users can access the database at the same time. - A major objective of Multi-User Systems is to
allow each user to behave as if he or she were
working with a single-user system instead. - The data in the database will be both integrated
and shared. -
4Database Management System(DBMS)
Database Management System (DBMS)
Database
Application Programs
End users
Figure (1) Simplified Picture of a database
system
5Database Management System(DBMS)
- Collection of interrelated data
- Is a layer of software between the data as
physically stored and users of system - Manages very large amounts of data.
- Set of programs to access the data
- DBMS contains information about a particular
enterprise - DBMS provides an environment that is both
convenient and efficient to use. - Database Applications
- Banking all transactions
- Airlines reservations, schedules
- Universities registration, grades
- Sales customers, products, purchases
- Manufacturing production, inventory, orders,
supply chain - Human resources employee records, salaries, tax
deductions - Databases touch all aspects of our lives
6Why Use a DBMS?
- Data independence and efficient access.
- Reduced application development time.
- Data integrity and security.
- Uniform data administration.
- Concurrent access, recovery from crashes.
7Data Models
- A data model is a collection of concepts for
describing data. - A schema is a description of a particular
collection of data, using the given data model. - The relational model of data is the most widely
used model today. - Main concept relation, basically a table with
rows and columns. - Every relation has a schema, which describes the
columns, or fields.
8Levels of Abstraction
- Many views, single conceptual (logical) schema
and physical schema. - Views describe how users see the data.
- Conceptual schema defines logical structure
- Physical schema describes the files and indexes
used.
9Example University Database
- Conceptual schema
- Students (sid string, name string, login
string, - age integer, gpareal)
- Courses (cid string, cnamestring,
creditsinteger) - Enrolled (sidstring, cidstring, gradestring)
- Physical schema
- Relations stored as unordered files.
- Index on first column of Students.
- External Schema (View)
- Course_info(cidstring,enrollmentinteger)
10Instance of Students Relation
- Students( sid string, name string, login
string, - age integer, gpa real )
- sid name login age gpa
- 53666 Jones jones_at_cs 18 3.4
- 53688 Smith smith_at_ee 18 3.2
- 53650 Smith smith_at_math 19 3.8
11Databases
- Database
- Is a collection of persistent data that is used
by the application systems of some given
enterprise. - Or A set of information held in a computer.
- A database is a collection of data, typically
describing the activities of one or more related
organizations. For example, a university database
might contain information about the following - Entities such as students, faculty, courses, and
classrooms. - Relationships between entities, such as students'
enrollment in courses, faculty - teaching courses, and the use of rooms for
courses. - Databases are useful
- Many computing applications deal with large
amounts of information. - Database systems give a set of tools for storing,
searching and managing this information.
12- Database System Architecture
13Database System Architecture
- A three-level architecture
- Internal level For systems designers
- Conceptual level For database designers and
administrators - External level For database users
14Database System Architecture
External Schemas
External/Conceptual Mappings
Conceptual Schema
Conceptual/Internal Mapping
Internal Schema
15Conceptual Level
- Deals with the organisation of the entire
database content - Abstractions are used to remove unnecessary
details of the internal level - Used by DBAs and application programmers
- Conceptual Schema
- e.g.,
- CREATE TABLE
- Employee (Name VARCHAR(25),
- Salary REAL, Department VARCHAR(10))
16External Level
- Provides a view of the database tailored to a
user - Parts of the data may be hidden
- Used by end users and application programmers
- External Schema
- e.g.,
- Create View myView as SELECT Name FROM
Employee - Internal Level
- Deals with physical storage of data
- Structure of records on disk - files, pages,
blocks - Indexes and ordering of records
- Used by database system programmers
- Internal Schema
- e.g.,
- RECORD EMP LENGTH44
- HEADER BYTE(5) OFFSET0
- NAME BYTE(25) OFFSET5
- SALARY FULLWORD OFFSET30
- DEPT BYTE(10) OFFSET34
17Mappings
- Mappings translate information from one level to
the next. - External/Conceptual
- Conceptual/Internal
- These mappings provide some data independence.
- Logical data independence
- Conceptual level changes shouldnt affect all
external levels. - Physical data independence
- Changes to internal level shouldnt affect
conceptual level.
18Database Users
- Database systems programmer
- Writes the database software itself
- Database Administrator (DBA)
- The technical person responsible for implementing
the data administrators decisions. Or designs
manages the database system. - End users
- Use the database system to achieve some goal.
- Application developers
- Write software to allow end users to interface
with the database system
19- Overview of Storage and Indexing
20Data on External Storage
- Disks Can retrieve random page at fixed cost
- But reading several consecutive pages is much
cheaper than reading them in random order - Tapes Can only read pages in sequence
- Cheaper than disks used for archival storage
- File organization Method of arranging a file of
records on external storage. - Record id (rid) is sufficient to physically
locate record - Indexes are data structures that allow us to find
the record ids of records with given values in
index search key fields - Architecture Buffer manager stages pages from
external storage to main memory buffer pool. File
and index layers make calls to the buffer
manager.
21Basic Concepts
- Maximum CPU utilization obtained with
multiprogramming - CPUI/O Burst Cycle Process execution consists
of a cycle of CPU execution and I/O wait - CPU burst distribution
22Alternating Sequence of CPU And I/O Bursts
23Alternative File Organizations
- Many alternatives exist, each ideal for some
situations, and not so good in others - Heap (random order) files Suitable when typical
access is a file scan retrieving all records. - Sorted Files Best if records must be retrieved
in some order, or only a range of records is
needed. - Indexes Data structures to organize records via
trees or hashing. - Like sorted files, they speed up searches for a
subset of records, based on values in certain
(search key) fields - Updates are much faster than in sorted files.
24Indexes
- An index on a file speeds up selections on the
search key fields for the index. - Any subset of the fields of a relation can be the
search key for an index on the relation. - Search key is not the same as key (minimal set of
fields that uniquely identify a record in a
relation). - An index contains a collection of data entries,
and supports efficient retrieval of all data
entries k with a given key value k. - Given data entry k, we can find record with key
k in at most one disk I/O. (Details soon )
25B Tree Indexes
26- Storing Data Disks and Files
27Disks and Files
- DBMS stores information on (hard) disks.
- This has major implications for DBMS design!
- READ transfer data from disk to main memory
(RAM). - WRITE transfer data from RAM to disk.
- Both are high-cost operations, relative to
in-memory operations, so must be planned
carefully!
28Why Not Store Everything in Main Memory
- Costs too much. 1000 will buy you either 128MB
of RAM or 7.5GB of disk today. - Main memory is volatile. We want data to be
saved between runs. (Obviously!) - Typical storage hierarchy
- Main memory (RAM) for currently used data.
- Disk for the main database (secondary storage).
- Tapes for archiving older versions of the data
(tertiary storage).
29Disks
- Secondary storage device of choice.
- Main advantage over tapes random access vs.
sequential. - Data is stored and retrieved in units called disk
blocks or pages. - Unlike RAM, time to retrieve a disk page varies
depending upon location on disk. - Therefore, relative placement of pages on disk
has major impact on DBMS performance!
30Components of a Disk
31The Entity-Relationship Model
32Modeling
- A database can be modeled as
- a collection of entities,
- relationship among entities.
- An entity is an object that exists and is
distinguishable from other objects. - Example specific person, company, event, plant
- Entities have attributes
- Example people have names and addresses
- An entity set is a set of entities of the same
type that share the same properties. - Example set of all persons, companies, trees,
holidays
33Entity Sets customer and loan
- customer_id customer_ customer_
customer_ loan_ amount
name street
city number
34Relationship Sets
- A relationship is an association among several
entities - Example Hayes depositor
A-102customer entity relationship
set account entity - A relationship set is a mathematical relation
among n ? 2 entities, each taken from entity sets - (e1, e2, en) e1 ? E1, e2 ? E2, , en ?
Enwhere (e1, e2, , en) is a relationship -
- Example
- (Hayes, A-102) ? depositor
35Relationship Set borrower
36Relationship Sets (Cont.)
- An attribute can also be property of a
relationship set. - For instance, the depositor relationship set
between entity sets customer and account may have
the attribute access-date
37Degree of a Relationship Set
- Refers to number of entity sets that participate
in a relationship set. - Relationship sets that involve two entity sets
are binary (or degree two). Generally, most
relationship sets in a database system are
binary. - Relationship sets may involve more than two
entity sets. - Relationships between more than two entity sets
are rare. Most relationships are binary. (More
on this later.)
- Example Suppose employees of a bank may have
jobs (responsibilities) at multiple branches,
with different jobs at different branches. Then
there is a ternary relationship set between
entity sets employee, job, and branch
38Attributes
- An entity is represented by a set of attributes,
that is descriptive properties possessed by all
members of an entity set. -
- Domain the set of permitted values for each
attribute - Attribute types
- Simple and composite attributes.
- Single-valued and multi-valued attributes
- Example multivalued attribute phone_numbers
- Derived attributes
- Can be computed from other attributes
- Example age, given date_of_birth
Example customer (customer_id,
customer_name, customer_street,
customer_city ) loan (loan_number, amount )
39Composite Attributes
40Mapping Cardinality Constraints
- Express the number of entities to which another
entity can be associated via a relationship set. - Most useful in describing binary relationship
sets. - For a binary relationship set the mapping
cardinality must be one of the following types - One to one
- One to many
- Many to one
- Many to many
41Mapping Cardinalities
One to one
One to many
Note Some elements in A and B may not be mapped
to any elements in the other set
42Mapping Cardinalities
Many to one
Many to many
Note Some elements in A and B may not be mapped
to any elements in the other set
43Revision on Normalisation
44Purpose of Normalisation
- To avoid redundancy by storing each fact within
the database only once. - To put data into a form that conforms to
relational principles (e.g., single valued
attributes, each relation represents one entity)
- no repeating groups. - To put the data into a form that is more able to
accurately accommodate change. - To avoid certain updating anomalies.
- To facilitate the enforcement of data
constraints.
45Redundancy and Data Anomalies
Redundant data is where we have stored the same
information more than once. i.e., the redundant
data could be removed without the loss of
information.
46Repeating Groups
A repeating group is an attribute (or set of
attributes) that can have more than one value for
a primary key value.
47Functional Dependency
Formal Definition Attribute B is functionally
dependant upon attribute A (or a collection of
attributes) if a value of A determines a single
value of attribute B at any one time.
Formal Notation A ? B This should be read as
A determines B or B is functionally dependant
on A. A is called the determinant and B is
called the object of the determinant.
48Functional Dependency
Partial Functional Dependency This is the
situation that exists if it is necessary to only
use a subset of the attributes of the composite
determinant to identify its object uniquely.
49Functional Dependency
Compound Determinants If more than one attribute
is necessary to determine another attribute in an
entity, then such a determinant is termed a
composite determinant.
Full Functional Dependency Only of relevance
with composite determinants. This is the
situation when it is necessary to use all the
attributes of the composite determinant to
identify its object uniquely.
50Transitive Dependency
Definition A transitive dependency exists when
there is an intermediate functional dependency.
Formal Notation If A ? B and B ? C, then it can
be stated that the following transitive
dependency exists A ? B ? C
51Stages of Normalisation
52 Structured Query Language SQL
53Introduction to SQL
- SQL is a standard language for accessing and
manipulating databases - What is SQL?
- SQL stands for Structured Query Language
- SQL lets you access and manipulate databases
- SQL is an ANSI (American National Standards
Institute) standard
54What Can SQL do?
- SQL can execute queries against a database
- SQL can retrieve data from a database
- SQL can insert records in a database
- SQL can update records in a database
- SQL can delete records from a database
- SQL can create new databases
- SQL can create new tables in a database
- SQL can create stored procedures in a database
- SQL can create views in a database
- SQL can set permissions on tables, procedures,
and views
55SQL Syntax
- Database Tables
- A database most often contains one or more
tables. Each table is identified by a name (e.g.
"Customers" or "Orders"). Tables contain records
(rows) with data. - Below is an example of a table called "Persons"
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
The table above contains three records (one for
each person) and five columns (P_Id, LastName,
FirstName, Address, and City).
56SQL Statements
- Most of the actions you need to perform on a
database are done with SQL statements. - The following SQL statement will select all the
records in the "Persons" table - SELECT FROM Persons
57SQL DML and DDL
- SQL can be divided into two parts The Data
Manipulation Language (DML) and the Data
Definition Language (DDL) - The query and update commands form the DML part
of SQL - SELECT - extracts data from a database
- UPDATE - updates data in a database
- DELETE - deletes data from a database
- INSERT INTO - inserts new data into a database
- The DDL part of SQL permits database tables to be
created or deleted. It also define indexes
(keys), specify links between tables, and impose
constraints between tables. The most important
DDL statements in SQL are - CREATE DATABASE - creates a new database
- ALTER DATABASE - modifies a database
- CREATE TABLE - creates a new table
- ALTER TABLE - modifies a table
- DROP TABLE - deletes a table
- CREATE INDEX - creates an index (search key)
- DROP INDEX - deletes an index
58SQL SELECT Statement
- The SELECT statement is used to select data from
a database. - The result is stored in a result table, called
the result-set. - SQL SELECT Syntax
- SELECT column_name(s( FROM
table_name - and
- SELECT FROM table_name
59An SQL SELECT Example
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
Now we want to select the content of the columns
named "LastName" and "FirstName" from the table
above. We use the following SELECT statement
SELECT LastName,FirstName FROM
Persons The result-set will look like this
LastName FirstName
Hansen Ola
Svendson Tove
Pettersen Kari
60SELECT Example
- Now we want to select all the columns from the
"Persons" table. - We use the following SELECT statement
- SELECT FROM Persons
- Tip The asterisk () is a quick way of selecting
all columns! - The result-set will look like this
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
61SQL SELECT DISTINCT Statement
- In a table, some of the columns may contain
duplicate values. This is not a problem, however,
sometimes you will want to list only the
different (distinct) values in a table. - The DISTINCT keyword can be used to return only
distinct (different) values. - SQL SELECT DISTINCT Syntax
- SELECT DISTINCT
column_name(s( FROM table_name - SELECT DISTINCT Example
- The "Persons" table
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes
3 Pettersen Kari Storgt 20 Stavanger
62SQL SELECT DISTINCT Statement
- Now we want to select only the distinct values
from the column named "City" from the table
above. - We use the following SELECT statement
- SELECT DISTINCT City FROM
Persons - The result-set will look like this
City
Sandnes
Stavanger
63SQL WHERE Clause
- The WHERE clause is used to filter records.
- The WHERE clause is used to extract only those
records that fulfill a specified criterion. - SQL WHERE Syntax
- SELECT column_name(s)
FROM table_name WHERE
column_name operator value - WHERE Clause Example
- From the "Persons" table we want to select only
the persons living in the city "Sandnes" - We use the following SELECT statement
- SELECT FROM Persons
WHERE City'Sandnes - The result-set will look like this
P_Id LastName FirstName Address City
1 Hansen Ola Timoteivn 10 Sandnes
2 Svendson Tove Borgvn 23 Sandnes