Title: Database Systems
1Database Systems
Chapter 7
- Basic Data Management Concepts
- Organizing Data in a Database
- Database Management Systems
- Using Database Systems in Organizations
- Database Trends
- Managing Databases
2The Value of Databases
- Databases and Database Management Systems (DBMS)
transform large quantities of data into specific
and valuable information for accomplishing some
goal.
3Database Management System (DBMS)
- A DBMS consists of a group of programs that
manipulate the database and provide an interface
between the database and the user or the database
and application programs.
[Figure labels: Secure Access, Front End, Back End]
4Database
- A collection of data organized to meet users'
needs.
5Database Fields
- Fields are set to hold specific types of data.
6Database
A Database is a collection of files/tables
7Database Hierarchy
8Keys and Primary Key
- Key: A field in a record that is used to identify the record.
- Primary key: A field that uniquely identifies a record.
- A primary key field prevents duplicate records from occurring in a table.
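The duplicate-prevention point can be tried out with a minimal sketch using Python's built-in sqlite3 module. The `customer` table, its columns, and the sample rows are hypothetical and chosen only for illustration; other DBMSs enforce primary keys in the same spirit but report the error differently.

```python
import sqlite3

# In-memory database; the table and its rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (card_number INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO customer VALUES (1001, 'Ada')")

# A second record with the same primary key value is rejected by the DBMS.
try:
    conn.execute("INSERT INTO customer VALUES (1001, 'Grace')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(duplicate_rejected, row_count)  # True 1
```

The table still holds only the first record, because the second insert never takes effect.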
9Primary Keys
Which field would act as the best primary key?
10Primary Keys
12Simple but Restrictive DBMS
13The Database Approach to Data Management
14 7.2 Organizing Data in a Database
15The Relational Model
- In a relational database, tables are linked
(related) through common fields.
16Relation Types
- One-to-many
  - Most typical
  - Makes use of primary key
- One-to-one
- Many-to-many
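A one-to-many relation linked through a common field might look like the following sketch, again using Python's sqlite3 module. The `department` and `employee` tables and the shared `dept_id` field are assumptions made up for this example: one department relates to many employees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT
);
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT,
    dept_id  INTEGER REFERENCES department(dept_id)  -- the common field
);
INSERT INTO department VALUES (1, 'Sales'), (2, 'Research');
INSERT INTO employee VALUES (10, 'Ada', 1), (11, 'Grace', 1), (12, 'Alan', 2);
""")

# Follow the common field to list each department with its employees.
rows = conn.execute("""
    SELECT d.dept_name, e.emp_name
    FROM department AS d
    JOIN employee AS e ON e.dept_id = d.dept_id
    ORDER BY d.dept_name, e.emp_name
""").fetchall()
print(rows)  # [('Research', 'Alan'), ('Sales', 'Ada'), ('Sales', 'Grace')]
```

The primary key of `department` appears in `employee` as an ordinary field, which is how the "one" side is referenced from the "many" side.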
17Data Analysis
- Data analysis is a process that involves evaluating data to identify problems with the content of a database.
- Consider what would happen if CardNumber were not a primary key, and two or more customers had the same CardNumber.
- Data integrity refers to the accuracy of the data in a database.
GIGO, or Garbage In Garbage Out, refers to the
fact that inaccurate data entered in a database
will result in inaccurate information produced
from the database.
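The CardNumber scenario can be sketched as a simple data analysis query. In this hypothetical table CardNumber is deliberately not declared as a primary key, so the duplicate slips in and has to be found after the fact; the table and sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No PRIMARY KEY here, so the DBMS accepts duplicate card numbers.
conn.execute("CREATE TABLE customer (CardNumber INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1001, "Ada"), (1002, "Grace"), (1001, "Alan")],
)

# Report every CardNumber shared by more than one customer.
duplicates = conn.execute("""
    SELECT CardNumber, COUNT(*)
    FROM customer
    GROUP BY CardNumber
    HAVING COUNT(*) > 1
""").fetchall()
print(duplicates)  # [(1001, 2)]
```

Any report keyed on CardNumber would now mix two customers together, which is the GIGO problem in miniature.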
18 7.3 Database Management Systems
19Creating a Database
- A schema is an outline of the logical and
physical structure of the data and relationships
among the data in the database.
20Creating a Database
- A data dictionary provides a detailed description
of all data used in the database.
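As a rough illustration, SQLite keeps the schema text in its `sqlite_master` catalog, and `PRAGMA table_info` returns per-field details that resemble a very small data dictionary. The `employee` table below is a hypothetical example, and a real data dictionary would normally carry more (descriptions, owners, allowed values).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_id             INTEGER PRIMARY KEY,
        emp_name           TEXT NOT NULL,
        job_classification TEXT
    )
""")

# The schema: the stored definition of the table's structure.
schema_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'employee'"
).fetchone()[0]

# A minimal data dictionary: one entry per field.
# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
data_dictionary = [
    {"field": name, "type": col_type, "required": bool(notnull), "primary_key": bool(pk)}
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(employee)"
    )
]

print(schema_sql)
for entry in data_dictionary:
    print(entry)
```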
21Database Strengths
- The power of a database and DBMS lies in the
user's ability to manipulate the data to turn up
useful information.
- Data can be sifted, sorted and queried through
the use of data manipulation languages.
22Data Manipulation Language
- A Data Manipulation Language (DML) is a specific language provided with the DBMS that allows people and other database users to access, modify, and make queries about data contained in the database, and to generate reports.
- Structured Query Language (SQL): The most popular DML.
- SELECT * FROM EMPLOYEE WHERE JOB_CLASSIFICATION = 'C2'
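The query on this slide, written with the `*`, `=`, and quoted value that SQL syntax requires, can be run against a small table as a sketch. Only the table name EMPLOYEE, the field JOB_CLASSIFICATION, and the value C2 come from the slide; the NAME field and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (NAME TEXT, JOB_CLASSIFICATION TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [("Ada", "C2"), ("Grace", "B1"), ("Alan", "C2")],  # hypothetical rows
)

# Select every field of every employee whose job classification is C2.
matches = conn.execute(
    "SELECT * FROM EMPLOYEE WHERE JOB_CLASSIFICATION = 'C2' ORDER BY NAME"
).fetchall()
print(matches)  # [('Ada', 'C2'), ('Alan', 'C2')]
```

The ORDER BY clause is added here only so the result order is predictable.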
23 7.4 Using Database Systems in Organizations
24The data deluge
- The Machinery Moves on
  - Moore's law: processing capacity doubles every 18 months (CPU, cache, memory)
  - Its more aggressive cousin: disk storage capacity doubles every 9 months
- The Demand is exploding
  - Every business is an eBusiness
  - Scientific Instruments and Moore's law
  - Government
  - The Internet: the ubiquity of the Web
- The Talent Shortage
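To see how far apart the two doubling rates drift, here is a small worked calculation. It takes the 18-month and 9-month doubling periods from the slide at face value; the three-year horizon is an arbitrary choice for illustration.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Return how many times capacity multiplies over `months`."""
    return 2 ** (months / doubling_period_months)

months = 36  # three years, chosen only as an example horizon

processing = growth_factor(months, 18)  # two doublings
disk = growth_factor(months, 9)         # four doublings

print(processing, disk)  # 4.0 16.0
```

Under these assumptions, storage capacity grows four times as much as processing capacity over three years, so the ability to collect data outpaces the ability to process it.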
25Data Stores
- Data Warehouse: A database that holds important information from a variety of sources.
- Data Mart: A small data warehouse, often developed for a specific person or purpose.
- Data Mining: The process of extracting information from a data warehouse.
- Connecting the dots
26Databases Data Warehouses
Operational Databases
27What Is a Hypercube?
Create multi-dimensional cubes of information
that summarize transactional data across a
variety of dimensions.
OLAP vs. OLTP
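The idea of summarizing transactions across dimensions can be sketched in plain Python. The three dimensions (product, region, quarter), the sales figures, and the `roll_up` helper are all hypothetical; a real OLAP tool would precompute and store such summaries rather than rescan the transactions.

```python
from collections import defaultdict

# Hypothetical transactions: (product, region, quarter, sales amount).
transactions = [
    ("Widget", "East", "Q1", 100),
    ("Widget", "West", "Q1", 150),
    ("Gadget", "East", "Q1", 200),
    ("Widget", "East", "Q2", 120),
    ("Gadget", "West", "Q2", 80),
]

DIMENSIONS = ("product", "region", "quarter")

def roll_up(rows, keep):
    """Total the sales amount over every dimension not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in positions)
        totals[key] += row[3]
    return dict(totals)

by_product = roll_up(transactions, ("product",))
by_region_quarter = roll_up(transactions, ("region", "quarter"))
grand_total = roll_up(transactions, ())

print(by_product)          # {('Widget',): 370, ('Gadget',): 280}
print(by_region_quarter)
print(grand_total)         # {(): 650}
```

Each call produces one face or slice of the cube: the same transactions viewed along a different combination of dimensions.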
28What is Data Mining?
- Finding interesting structure in data
  - Structure refers to statistical patterns, predictive models, hidden relationships
  - Interesting?
- Examples of tasks addressed by Data Mining
  - Predictive Modeling (classification, regression)
  - Segmentation (Data Clustering)
  - Affinity (Summarization)
    - relations between fields, associations, visualization
- An Example
29Data Mining and Databases
- Many interesting analysis queries are difficult to state precisely
- Examples
  - Which records represent fraudulent transactions?
  - Which households are likely to prefer a Ford over a Toyota?
  - Who's a good credit risk in my customer DB?
- Yet the database contains the information
  - good/bad customer, profitability
  - did/did not respond to mailout/survey/...
30Example market basket Transactions
- Bread, Milk
- Bread, Diapers, Beer, Eggs
- Milk, Diapers, Beer, Cola
- Bread, Milk, Diapers, Beer
- Bread, Milk, Diapers, Cola
- What pattern can you see?
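One way to look for a pattern in these five transactions is to count how often items appear together. The sketch below uses the baskets exactly as listed on the slide; the measure called "confidence" here (the share of baskets containing one item that also contain another) is a common affinity-analysis measure that this example introduces, not one named on the slide.

```python
from collections import Counter
from itertools import combinations

# The five market basket transactions from the slide.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in transactions:
    item_counts.update(basket)
    # Sort so each pair is always stored in the same order.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]

print(pair_counts[("Beer", "Diapers")])  # 3
print(confidence("Beer", "Diapers"))     # 1.0
print(confidence("Diapers", "Beer"))     # 0.75
```

In this data every basket with Beer also contains Diapers, while three of the four baskets with Diapers contain Beer.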
31A more systematic approach a Decision Tree
[Figure: decision tree. Labels: All 1615 patients; Split 1: Age; Systolic BP; terminal node]
32Visualization is Important
- Factory food example from this week's New York
Times
33The myths
- Companies have built up some large and impressive data warehouses
- Data mining is pervasive nowadays
- Large corporations know how to do it
- There are tools and applications that discover
valuable information in enterprise databases
34The truths
- Data is a shambles
  - Most data mining efforts end up not benefiting from existing data infrastructure
- Corporations care a lot about data, and are obsessed with customer behavior and understanding it
  - They talk a lot about it
- An extremely small number of businesses are successfully mining data
  - The successful efforts are one-off, lucky strikes