Title: Database Technologies
1Database Technologies
Yelena Yesha
2Contents
- Database Overview
- DBMS
- Relational Data Model
- SQL
- Web-Database Connectivity
3What is a Database
- A system that stores data
- persistent Exists beyond the immediate use
- Centralized storage
- Single or multiple users
4Advantages
- Reduces redundancy
- Reduces inconsistency
- Shared
- Data representation standards can be enforced
- Enables security restrictions
- Integrity maintained
- Valid cross references between records
- Allows data-independent applications
- Applications ignorant of how data is stored
5DBMS
- Database Management System
- Examples
- Oracle
- DB2
- Microsoft SQL Server
- Sybase
- DBMS Users
- End users
- Programmers
- Database administrations
6DBMS Features
- Optimize Queries
- Manage memory
- Control concurrent data access
7Advanced DBMS Features
- Integrity maintenance
- Entity Integrity ( foreign key references)
- Referential Integrity (foreign key references)
- Cascading deletes
- Set Null or default
- No action (command is rejected)
- Access Control
- Grant Revoke
- Access to tables
- Capability to perform operations (Insert, update,
delete, etc.)
8Data Independence and Access Efficiency
- DBMS allows to avoid rewriting all access
routines every time the data format changes or
data is added/modified/deleted - insulate applications from data storage details.
- Logical independence protection from changes in
logical structure of data - Physical independence protection from changes
in physical structure of data - DBMS maintains data structures and implements
algorithms allowing to avoid linear search - indexing search in O(log n)
- fast access even on complex data queries
9Concurrency Control and Data Integrity
- DBMS insures semantically correct access to the
same data by concurrent applications - two programs accessing the same data at the same
time can result in an inconsistent update - implement sharing in a controlled manner
- Data semantics may require certain constraints to
be satisfied - DBMS guarantees that application programs comply
with the constraints when adding/modifying the
data
10Reliability and Security
- DBMS provides techniques for recovery from
software and hardware failures - guarantee survival of the data across
catastrophes. - DBMS prevents unauthorized users from
accessing/modifying data or denying service to
other users.
11Data Distribution and Heterogeneity
- Centralization is the enemy of scalability
- a vast number of modern applications are
distributed - Data sharing in a distributed environment is a
challenge - Heterogeneity applies to networks, hardware,
operating systems, programming languages, data
formats, etc. - Distributed applications must mask the
differences - Need distributed data management
12Categories of Data Models
- High-level or conceptual
- entities, attributes, relationships
- Representational or implementation or logical
- relational, network hierarchical,
object-oriented, object-relational - Physical or low-level
- data storage
133-schema Architecture
- Physical level description of a database
- how things are stored on disk
- files, record structures,
- indices,
- data structures for disk blocks,
- methodology for dealing with too long records,
etc. - Conceptual level description of a database
- The description of application data (its schema)
using one of the traditional data models.
143-Schema Architecture (cont'd)
- View-level description of a database
- What users of a particular application see
- their own customized schema, e.g., for payroll,
for the ticket agent, for a simulation program. - Multiple levels
- helps with data independence
- helps with maintenance.
- Many views, single logical and physical schema.
- Levels of abstraction give data independence.
15The Entity-Relational Model
- Entity a distinguishable object.
- Entity set a set of entities all of the same
type. - Attribute a single property of an entity
- simple vs composite
- single-valued vs multi-valued
- stored vs derived
- null values.
- Domain set of values permitted for that
attribute.
16The E-R Model (contd)
- Relationship an association between two or more
entities. - Relationship set a set of relationships all of
the same type - There is no correct schema for a batch of data.
Which schema is best depends on the application. - Many basic data modelling choices depend on an
understanding of the application.
17Data Model
- Data model notation for describing data, plus a
set of operations used to manipulate that data. - a set of primitives for defining the structure of
a DB - a set of operations for specifying the retrievals
and updates on a DB - relational, hierarchical, network,
object-oriented.
18The Relational Model (Codd 1970)
- The relational data model is the most important
data model currently existing. - Value-oriented, i.e., allows operations on
relations whose results are relations, thus
enables to combine operations. - As opposed to object-oriented models, in which
- Operations cannot be applied to the result of
other operations - The result of an operation may be a new data
type, and operations may not be available for
this type.
19Domain and Relation
- A domain is a set of atomic values.
- A relation is a finite subset of the Cartesian
product of a finite list of domains - relation is a set of tuples
- order of tuples is irrelevant and
- no relation has 2 identical tuples
- each tuple value is atomic
- no composite attributes
- no multi-valued attributes.
20Relational Model (contd)
- Everything is represented by relations
- Formally Given sets D1, D2, ....Dn (not
necessarily distinct), a relation R ? D1 X D2 X
...X Dn - Di 's are the domains and n is the arity
(degree) of R - elements of R are called tuples
- number of tuples in R is the cardinality of R
- relational data model helps to view a relation as
a table - Observe the following properties
21Relational Model (contd)
- Everything is represented by relations
- Given sets D1, D2, ....Dn (not necessarily
distinct), a relation R ? D1 X D2 X ...X Dn - Di 's are the domains and n is the arity (degree)
of R - elements of R are called tuples
- number of tuples in R is the cardinality of R.
- Relational data model helps to view a relation as
a table - each row represents a tuple (record)
- each column represents an attribute (field).
- Properties
- no two rows are identical
- the ordering of tuples is unimportant
- the ordering of columns is important.
22E-R to Relations (i.e., defining relations)
- Done using DDL (Data Definition Language)
- Name whole database schema
- Declare domains for attributes
- Define relations
- name
- attribute names and domains
- primary and other keys
- foreign keys
23Translating from E-R
- Represent entity set E by a relation whose
attributes are all the E-R attributes of E. Then
each tuple represents one entity of E. - To represent relation R between entity sets E1,
, Ek, create relation R with key attributes of
E1, , key attributes of Ek, as attributes
(rename duplicates). Each tuple of the relation
represents one combination of entities that are
related to one another. - You might have some redundant relations, which
you can delete.
24Schema Normalization
- Formal theory of database design
- based on grouping attributes in a particular way
using attribute dependencies to achieve good
schemas - 1NF, 2NF, 3NF, BCNF, 4NF,
- Goal
- dont store redundant information
- can represent everything (otherwise, the schema
is useless!)
25Query and Update Languages
- DDL data definition language
- used by DBA
- to define schemas, create views, create indices
- DML data manipulation language
- used by sophisticated casual user
- to query data or
- update data
26Relational Query Languages
- Query languages allow manipulation and retrieval
of data from a database. - Relational model supports simple, powerful query
languages - strong formal foundation based on logic
- allows for optimization.
- Two mathematical languages form the basis for
relational languages (e.g., SQL) and for
implementation - Relational Algebra More operational, useful for
representing execution plans - Relational Calculus Lets users describe what
they want, rather than how to compute it
(non-operational, declarative). - Basic operations
- selection, projection, cross-product,
set-difference, union, intersection, join,
division
27SQL
- Originally Structured Query Language
- Common language for interacting with relational
databases - Syntax
- SELECT column_name
- FROM table_name
- WHERE where_clause
- SQL is now the query language for IBM's DB2 and
the de-facto standard on most commercial RDBMS. - SQL is a comprehensive language providing
statements for data definition, query and update.
28Using SQL
- Interactively
- Via a command line program
- Embedded SQL
- Embedded complete SQL statement in code
- Precompiler modifies the source code
- Dynamic SQL
- Embedded in scripts or other programs
- Dynamically interpreted
- Variables bound at run time
29SQL Schema
- EMP(Name,SSN,DNO,BirthPlace)
- DEPT(DName,DNO,MGRSSN)
- PROJECT(PName,PNO,PLocation,DNum)
- WORKSON(ESSN,PNO,Hours)
- CREATE SCHEMA 'COMPANY'
- CREATE TABLE EMP
- (
- EName name_dom NOT NULL,
- SSN CHAR(9) NOT NULL,
- DNO INTEGER NOT NULL,
- BirthPlace city_dom,
- PRIMARY KEY(SSN),
- FOREIGN KEY (DNO) REFERENCES DEPT (DNO)
- )
30Data Definition Language
- DDL is used to define the (schema of) database
- to create a database schema
- to create a domain
- to create, drop. alter a table
- to create, remove an index defunct in SQL2
- to create or drop a view
- to define integrity constraints
- to define access privileges to users (Oracle
CONNECT, RESOURCE, DBA) - to GRANT or REVOKE privileges ON/TO object/user
- SQL2 supports multiple schemas
- CREATE SCHEMA name AUTHORIZATION user
- CREATE SCHEMA EMPLOYEE AUTHORIZATION yesha
31Web Server to Database Connectivity
Database Server
HTTP
DB Protocol
Web Server
Web Browser
CGI
- Easy to implement
- HTTP server dependent
- - Lack of user access control
- - Lack of presentation graphics
32Web Server to Database Connectivity (contd)
Gateway Server
DB protocol
TCP/IP
Database Server
Applet
HTTP
Web Server
Web Browser
- - Higher development complexity
- - True session based database access
- - User access control through database server
- - Presentation graphics
33Web-Database Connectivity Middleware
- Overview of the Web database connectivity, and
focus on the middleware of the Web database - Trinity Three tires of Web Database use to
build a Web database - Web Server
- Web application server (middleware)
- Database
- (see picture)
34How a user interacts with a Web Database
- In a Web browser, a user submits a request to the
Web server. - The Web server passes it onto the middleware
- The middleware writes the request in SQL queries
and sends it to a back-end database. - The data retrieved are handed back to the
middleware - The middleware generates a Web page for the data
- The Web server sends the Web page to the browser
- The browser displays the Web page in front of the
user