Introduction to Databases - PowerPoint PPT Presentation

1 / 29
About This Presentation
Title:

Introduction to Databases

Description:

Introduction to Databases Simplicity is the ultimate sophistication. Leonardo da Vinci Class Outline What is data and why is it important? What is a database ... – PowerPoint PPT presentation

Number of Views:2243
Avg rating:3.0/5.0
Slides: 30
Provided by: Luig75
Category:

less

Transcript and Presenter's Notes

Title: Introduction to Databases


1
Introduction to Databases
Simplicity is the ultimate sophistication.Leona
rdo da Vinci
2
Class Outline
  • What is data and why is it important?
  • What is a database and database schema?
  • What is a database management system?
  • What is a database application?
  • What are the levels of database representation?
  • What were the limitations of the systems that led
    to the development of the current relational
    database systems?
  • What are various types of database systems?
  • What do databases have to do with HEALTH SCIENCE?

3
Databases versus The Rest
4
Principles of Information Resource Management
  • Organizational resources flow into and out of the
    organization
  • Two types of major organizational resources
    Physical resources, Conceptual resources (data
    information)
  • As scale of organization grows, it becomes
    increasingly difficult to manage by observation
    (i.e., reliance on conceptual resources)
  • Conceptual resources can be managed just like
    physical resources or assets (e.g., employees,
    , equipment, etc.)
  • Management of data information means getting it
    before its needed, protecting it, assuring
    quality, and getting rid of it when no longer
    required
  • Management of data information can be achieved
    only through organizational commitment

Adapted from McFadden, F.R. Hoffer, J.A.
(1994). Modern Database Management. Redwood
City, CABenjamin/Cummings Publishing (p. 6)
5
Information is a major organizational resource
(2003) 0.5 with rare mutation(2004) 5 with
rare mutation
Information (organized data)
Dr. Jill conducted 100 CFTS (2004)experiments of
100 patients of which 500 had a rare mutation
6
What is a Database?
  • Organized collection of related information or
    data stored on a computer disk for easy,
    efficient use

data
information
7
In the beginning(in the 1950s)
There were no databases. Just file (or data
processing) systems.
  • File systems were typically organized by function
    (use)
  • The first data management systems performed
    clerical tasks (transactional processing) such as
    order entry processing, payroll, work scheduling.
  • e.g., files for patients (file folder analogy)
    each record for a single patient another file
    for appointment/ billing information

Name Jane Doe Address 123 Easy
St. City London Phone 455-0897
Date Sept 14, 1955 Time 200 p.m. Patient Jane
Doe, 455-0897 OHIP 123456789
8
Limitations of Data File Systems
CustomerprocessingApplication
Customerfile
OrderprocessingApplication
Orderfile
  • Worked adequately if data collection needs were
    relatively small.
  • Problems arose as data files, information needs,
    and reporting requirements grow in complexity due
    to
  • Extensive programming - use of third-generation
    languages (e.g., COBOL, FORTRAN) in which the
    programmer must specify what is be done as well
    as how it is to be done

9
Limitations of Data File Systems
  • Poor mechanisms for sharing data across
    organization - files are often incompatible with
    one another (separate, isolated data)
  • Data redundancy - duplicate information in two
    or more files
  • Program/ data dependence - if the file structure
    changed, ALL programs using the file had to be
    modified - time-consuming
  • Lack of flexibility - could not do ad hoc queries
    or reports required separate programs for every
    report or query
  • Poor security - difficult to program, therefore,
    often omitted
  • Difficulty of representing data in the users
    perspective

10
Historical Roots of Database Systems
  • Developed to overcome limitations of file
    systems, developed initially on mainframe
    computers in late 60s and early 70s - a typical
    early DBMS cost 100,000 (many are still in use)
  • First general databases were created for General
    Electric Company (GEC) - Integrated Data Store
    (IDS), designed to run on GEC machines B.F.
    Goodrich ported IDS to IBM 360 - became dominant
    until 1980s
  • As PCs gained popularity (1980s), single-user,
    personal databases developed at present, most
    database technology is used in workgroups

11
What is a Database Management System (DBMS)?
  • A set of programs used to define,administer, and
    process the database and its applications
    conveniently and efficiently
  • Program (or collection of programs) that enables
    users to create the database. The DBMS manages
    the storage and retrieval of data, and provides
    the user with certain functionalities to
    guarantee that the data will be logically
    organized and consistently applied.

DBMS
DatabaseApplication
Database

User(s)
(e.g., Oracle, dBase, Access, Paradox)
12
Features of a DBMS
DBMS
developer
DBMS Engine
users
13
What is a Database Application?
Database
  • A computer program that performs a specific task
    of practical value in a business situation
  • An interface that allows the user to enter and
    manipulate data User can request abstract views
    of data
  • Created by database designers and developers
    using a DBMS program or a programming language

DBMS
Database application
14
Major Components of a Database Application
5. Program - used to automate a database
15
Evolution of Database Models
Hierarchical
still in use in many older (1970s)
legacysystems very few new databasesreferred
to navigational systems
Network
the vast majority currently use this, therefore,
our courses focus is here
Relational
Semantic
Very few new databases are being created using
Object-Oriented Programming (not many ODBMS for
businesses to implement this model)
Object-Relational
Object-Oriented
16
The Relational Database Model
Genetics Lab
Gel Images
Experiments
Results
Patients
Gene
  • represented by tables (like spreadsheets)
  • tables are NOT linked with physical pointers
  • unlike earlier systems, all three types of
    relationships can be represented
  • accommodates the design of larger databases that
    involve complex relationships and intricate
    manipulations

17
Evaluation of the Relational database model
But 1 problem still is
  • Advantages
  • mechanisms for minimizing data redundancy and
    inconsistency
  • logical database design is separated from
    physical aspects
  • relatively program-data independent
  • management of data for access, manipulation, and
    security
  • flexible mechanisms for generating reports and
    queries
  • program development and maintenance costs are
    reduced
  • data can be accessed in a multiplicity of ways
    within and amongst organizations
  • Disadvantages
  • ease of use - many untrained people create and
    use databases without considering its design -
    usually incorporate many errors

18
Comparison of Database models
  • File Systems
  • data dependence
  • structural dependence
  • demands upon programmer
  • Hierarchical, Network DBMS
  • data independence
  • structural dependence
  • demands upon programmer
  • Relational DBMS
  • data independence
  • structural independence
  • demands upon computer

19
Types of Database Systems
  • Centralized (single site)
  • microcomputer (desktop)
  • legacy mainframe/ mini computer (1 CPU)
  • client/server architecture (gt1 CPU)
  • Distributed
  • gt1 site, requires network
  • not widely adapted yet due to many problems

20
Three levels of Database Representation
Primary focus of the lectures of this course is
the conceptual level because the creation of a
database begins with its design the focus of the
laboratories is the internal and external level.
21
The Database System Environment
  • Hardware - physical devices
  • computer, peripherals, network devices
  • Software
  • DBMS (manages the database)
  • operating systems software (manages hardware
    software)
  • application programs (user access and manipulate
    database)
  • People
  • system administrators (manage general operations)
  • database designers (architects of database
    structure)
  • database administrators (ensure the database is
    functioning)
  • systems analysts programmers (design
    implement database)
  • end users (use application programs)
  • Procedures - rules of the company governing use
    of data
  • Data

22
Focus of this course
  • Lectures
  • Conceptual design of databases determining their
    purpose, developing a model, identifying the
    tables that are required, designing normalized
    tables and identifying their relationship to one
    another.
  • Laboratories
  • Implement a database at the internal and external
    level create databases (tables) and database
    applications (queries, forms, reports, programs)
    using a typical microcomputer relational database
    management system, MS Access 2003.

23
Better Definition of a Database
  • A collection of users data, organized logically
    and managed by a unifying set of principles,
    procedures, and functionalities, which help
    guarantee the consistent application and
    interpretation of that data
  • (a) organized collection of related information
    or data stored on a computer disk for easy,
    efficient use represented in tabular format

24
Better Definition of a Database (cont'd)
  • (b) A database is self-describing (metadata or
    system catalogues or data dictionary)
  • A database contains a description of its own
    structure (e.g., the names of all the tables, the
    names and types of data in each column in all the
    tables)

Kroenke, D.M., Database Processing Fundamentals,
Design Implementation, Prentice Hall, 1998
25
Better Definition of a Database (cont'd)
  • (c) Indexes are stored with the database
  • Data accessed from a source table for sorting and
    searching is time-consuming without a pointer
    system, which improves performance and
    accessibility of the database
  • The overhead cost of indexing is that each time
    data is updated, all indexes must also be
    updated, therefore, reserve index for cases in
    which they are needed

(d) Application Metadata - stores structure and
format of application components not all DBMS
support this feature
26
Table
Users view their data in two-dimensional tables.
  • table file relation

27
Field
The fields within records contain data. Data
within a field must be of the same data type.
Each field within a table must have a unique
name. Order of fields is unimportant.
  • column field attribute

28
Record
  • A record is a group of related fields of
    information about a single instance of one object
    or event in a database.
  • Tables consist of zero, one, or more records.
  • Order of rows is unimportant.
  • row record tuple

29
Database Schema
  • Database schema defines databases structure,
    tables, relationships, domains, and constraint
    rules
  • Tables
  • EXPERIMENT (ExprId, GeneID, ExprName, DoctorID)
  • GENE (GeneID, GeneName, GeneDescription)
  • DOCTOR (DoctorID, DocFName, DocLName)
  • Relationships
  • Each experiment involves one and only one gene
  • Each doctor is in charge of one or more
    experiments
  • Domains (set of values in a column)
  • Physical description (e.g., set of integers 0 lt x
    lt 99999)
  • Constraints (business rules)
  • Gene Name cannot be left blank
Write a Comment
User Comments (0)
About PowerShow.com