Title: Data and Knowledge Management
Chapter 8
- Data and Knowledge Management
Learning Objectives
- When you finish this chapter, you will:
- Know the difference between traditional file organization methods and the database approach.
- Know how database management systems are used to construct databases, populate them with data, and manipulate the data to produce information.
- Be familiar with the different database models and the advantages and disadvantages of each model.
Learning Objectives
- Know the most important features and operations of a relational database.
- Understand how databases are changing business operations across industries and what impact they might have on our personal lives.
- Understand the concepts of data warehousing and data mining and their use in business.
- Recognize the need for knowledge storage and management and be able to give examples of the ways knowledge is managed in organizations.
Managing Digital Data
- The Traditional File Approach
- Different pieces of information are stored as a string of bytes; there are no labels or categorizations (i.e., a flat file)
- Advantages
- Efficient use of space
- Disadvantages
- Program/Data Dependency
- High Data Redundancy
- Low Data Integrity
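A minimal sketch of program/data dependency in a flat file. The field names, widths, and sample values are hypothetical and chosen only for illustration. Because the file carries no labels, the byte offsets live in the program (`LAYOUT`), so any change to the file layout forces a change to every program that reads it.

```python
# Hypothetical fixed-width layout: the program, not the file, knows where each field starts.
RECORD_WIDTH = 24
LAYOUT = {"last_name": (0, 10), "department": (10, 14), "salary": (14, 24)}


def format_record(last_name: str, department: str, salary: int) -> str:
    """Pack one record into a 24-character string with no labels or separators."""
    return f"{last_name:<10}{department:<4}{salary:>10}"


def parse_record(raw: str) -> dict:
    """Slice one fixed-width record into named fields using LAYOUT."""
    return {name: raw[start:end].strip() for name, (start, end) in LAYOUT.items()}


# The "file" is just one long string of bytes.
flat_file = format_record("Smith", "4530", 24000) + format_record("Jones", "4530", 31000)

records = [
    parse_record(flat_file[i:i + RECORD_WIDTH])
    for i in range(0, len(flat_file), RECORD_WIDTH)
]
print(records)
```

Note that the department number is repeated in every record, which is the kind of redundancy the database approach aims to reduce.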
Flat File Layouts
Data Redundancy
Moving to Databases
- Maintain and manipulate data about entities
- Entity: any object chosen to collect data about
- Field: one piece of information about an entity
- Fields can hold text, numbers, pictures, sounds, and video clips
- Record: several fields related to the same entity
- File: a collection of related records
Database Management System (DBMS)
- The program used to build databases, populate them with data, and manipulate the data
- Queries: request data from specified fields
- Security: giving users different views of the data addresses security concerns
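A small sketch of how a view can limit what a user sees, using Python's built-in `sqlite3` module. The table, column, and view names are hypothetical. SQLite itself has no user accounts, so this only illustrates the idea of a restricted view; a multi-user DBMS would additionally grant each user access to the view rather than to the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_id     INTEGER PRIMARY KEY,
        last_name  TEXT NOT NULL,
        department INTEGER NOT NULL,
        salary     INTEGER NOT NULL
    );
    INSERT INTO employee VALUES (1, 'Smith', 4530, 24000);
    INSERT INTO employee VALUES (2, 'Jones', 4530, 31000);

    -- A restricted view that leaves out the salary column.
    CREATE VIEW employee_directory AS
        SELECT emp_id, last_name, department FROM employee;
""")

# A query against the view can only see the columns the view exposes.
cursor = conn.execute("SELECT * FROM employee_directory")
directory_columns = [column[0] for column in cursor.description]
directory_rows = cursor.fetchall()
print(directory_columns)
print(directory_rows)
```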
Securing Different Data Views
Traditional Files vs. Databases: Pros and Cons
- Database Advantages
- Reduced data redundancy
- Application/data independence
- Better control
- Greater flexibility
- Traditional File Advantages
- Simplicity
- Efficiency
- Customization
- Traditional File Disadvantages
- Creates data redundancy and application-data dependence
- Does not support as tight control over data currency, accuracy, and integrity as the database approach
- Provides less flexibility in data maintenance
Database Models
Database model: the general logical structure in which records are stored within a database.
Database Models
- The Hierarchical Model
- Records are related hierarchically -- each category is a subcategory of the next level up
- One-to-many relationships (parent-child)
- Disadvantages of hierarchical databases
- To retrieve a record, a user must start at the root and navigate the hierarchy.
- If a link is broken, the entire branch is lost.
- Requires considerable data redundancy.
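The two navigation disadvantages above can be sketched with a small tree. The `Node` class, the `find_path` helper, and the company/department/employee names are all hypothetical, not part of any particular hierarchical DBMS.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One record in the hierarchy; each record hangs under a single parent."""
    name: str
    children: List["Node"] = field(default_factory=list)


def find_path(root: Node, target: str) -> Optional[List[str]]:
    """Search downward from the root and return the path of names to target."""
    if root.name == target:
        return [root.name]
    for child in root.children:
        sub_path = find_path(child, target)
        if sub_path is not None:
            return [root.name] + sub_path
    return None


sales = Node("Sales", [Node("Jones"), Node("Lee")])
accounting = Node("Accounting", [Node("Smith")])
company = Node("Company", [sales, accounting])

# Retrieval always starts at the root and walks down the hierarchy.
path_before = find_path(company, "Jones")
print(path_before)

# Breaking the single link from the root to Sales makes the whole branch unreachable.
company.children.remove(sales)
path_after = find_path(company, "Jones")
print(path_after)
```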
Database Models
- The Network Model
- Allows a record to be linked to more than one parent
- Supports many-to-many (M:M) relationships
- Advantage of the network model
- Reduced data redundancy
- Disadvantages of the network model
- Complicated to build and difficult to maintain
- Difficult to navigate
- Relationship Spaghetti
Database Models
- The Relational Model
- Consists of tables; links among entities are maintained with foreign keys
- Advantages of relational databases
- Same advantages as a network database without the complications
- Easier to conceptualize and maintain
- Virtually all DBMSs offered for microcomputers accommodate the relational model
Database Models
- Keys in a Relational Database
- Key: a field whose values identify records
- Either for display or for processing
- Primary Key
- Unique key
- Linking
- To link records from one table with records of another table, the tables must have one field in common
- The repeated field is a primary key in one table and a foreign key in the other table
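A sketch of linking two tables through a shared field, using Python's built-in `sqlite3` module. The `department` and `employee` tables and their columns are hypothetical. `dept_id` is the primary key of `department` and a foreign key in `employee`; SQLite only enforces foreign keys when `PRAGMA foreign_keys = ON` is set on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,   -- primary key: identifies each department
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id    INTEGER PRIMARY KEY,
        last_name TEXT NOT NULL,
        dept_id   INTEGER NOT NULL REFERENCES department(dept_id)  -- foreign key
    );
    INSERT INTO department VALUES (4530, 'Accounting');
    INSERT INTO employee VALUES (1, 'Smith', 4530);
""")

# An employee pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Jones', 9999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(orphan_rejected, employee_count)
```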
Other Database Models
- The Object-Oriented Structure
- An object consists of both data and the procedures necessary to manipulate the data (encapsulation)
- Affords maintenance of data along with the applications that process them
- Entity-Relationship Diagrams
- Conceptual (logical) blueprint of a database
- Graphical representation of all entity relationships
An Entity-Relationship Diagram
Components of Database Management Systems
- The Schema
- Describes the structure of the database
- Types of data that fields can hold (alphanumeric, numeric, date)
- Building a database happens after the schema is defined
- The Data Dictionary (metadata, i.e., "data about the data")
- Maintains all information supplied by the developer when constructing the schema
- Record names and types, file names, field names and types, relationships, roles and responsibilities
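As a rough illustration, SQLite keeps a small data dictionary of its own that can be queried after the schema is defined. The `employee` table and its columns are hypothetical; `sqlite_master` and `PRAGMA table_info` are SQLite's built-in catalog facilities, and other DBMSs expose their metadata differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Defining the schema: field names and the type of data each field can hold.
conn.execute("""
    CREATE TABLE employee (
        emp_id    INTEGER PRIMARY KEY,
        last_name TEXT,
        hire_date TEXT,
        salary    NUMERIC
    )
""")

# The catalog table lists every table the database knows about.
table_names = [
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, dflt_value, pk).
field_types = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(employee)")}

print(table_names)
print(field_types)
```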
Components of Database Management Systems
- Data Definition Language (DDL)
- Subprogram used to construct the schema
- Commands and protocols
- Can be presented in a series of interactive forms for the designer to complete
- Something most of us will never directly encounter
Components of Database Management Systems
- Data Manipulation Language (DML)
- Software used to query the database
- Some require sophisticated commands
- FROM EMPLOYEE LIST LAST_NAME DEPARTMENT SALARY WHERE DEPARTMENT = 4530 AND SALARY < 25000
- Sometimes this is hidden from the user; Query by Example is used instead
- This is the most common interface we will encounter
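For comparison, the sample query above might look like this in SQL, run here through Python's built-in `sqlite3` module. The table definition and the three sample rows are assumptions made only so the query has something to return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (last_name TEXT, department INTEGER, salary INTEGER);
    INSERT INTO employee VALUES ('Smith', 4530, 24000);
    INSERT INTO employee VALUES ('Jones', 4530, 31000);
    INSERT INTO employee VALUES ('Lee',   2210, 22000);
""")

# List last name, department, and salary for department 4530 employees earning under 25000.
rows = conn.execute(
    "SELECT last_name, department, salary "
    "FROM employee "
    "WHERE department = ? AND salary < ?",
    (4530, 25000),
).fetchall()
print(rows)
```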
Relational Operations
- A relational operation creates a temporary table used to manipulate data
- Select: select records that meet certain criteria
- Project: select columns from a particular table
- Join: combine data from multiple tables to form a new table
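The three operations can be sketched in plain Python, treating a table as a list of dictionaries. The helper names (`select`, `project`, `join`) and the sample tables are hypothetical; a real DBMS implements these operations far more efficiently. Each helper returns a new temporary table and leaves its inputs unchanged.

```python
employees = [
    {"emp_id": 1, "last_name": "Smith", "dept_id": 4530},
    {"emp_id": 2, "last_name": "Jones", "dept_id": 2210},
    {"emp_id": 3, "last_name": "Lee", "dept_id": 4530},
]
departments = [
    {"dept_id": 4530, "dept_name": "Accounting"},
    {"dept_id": 2210, "dept_name": "Sales"},
]


def select(table, predicate):
    """Keep only the records (rows) that meet the criteria."""
    return [row for row in table if predicate(row)]


def project(table, columns):
    """Keep only the named columns of every record."""
    return [{column: row[column] for column in columns} for row in table]


def join(left, right, key):
    """Combine records from two tables wherever the shared key field matches."""
    return [{**l_row, **r_row} for l_row in left for r_row in right if l_row[key] == r_row[key]]


in_accounting = select(employees, lambda row: row["dept_id"] == 4530)
names_only = project(employees, ["last_name"])
joined = join(employees, departments, "dept_id")
print(in_accounting)
print(names_only)
print(joined)
```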
A Join Table
Relational Operations
- Structured Query Language (SQL), pronounced "see-kwul"
- International standard DDL and DML for relational DBMSs
- Advantages of using SQL
- Users do not need to learn different DDLs and DMLs
- Easy to remember / intuitive commands
- SQL can be embedded in widely used 3rd generation languages (such as COBOL), increasing efficiency and effectiveness
- Programmers are not forced to rewrite statements, since SQL statements are portable between operating systems
Popular DBMSs
Database Architecture
- Database Architecture refers to both the physical and logical layouts
- Distributed Databases (geographically remote sites)
- Replication
- Full copy of the entire database is stored at all sites
Replicated Database
Database Architecture
- Distributed Databases (geographically remote sites)
- Fragmentation
- Parts of the database are stored where they are most often accessed
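A toy comparison of the two distribution strategies. The site names, the customer records, and the rule "store each customer at the site of its region" are assumptions made for illustration; real distributed DBMSs decide placement from actual access patterns.

```python
SITES = ["New York", "London"]

customers = [
    {"id": 1, "name": "Smith", "region": "New York"},
    {"id": 2, "name": "Jones", "region": "London"},
    {"id": 3, "name": "Lee", "region": "London"},
]

# Replication: every site holds a full copy of the database.
replicated = {site: list(customers) for site in SITES}

# Fragmentation: each site holds only the records it is assumed to access most often.
fragmented = {
    site: [customer for customer in customers if customer["region"] == site]
    for site in SITES
}

for site in SITES:
    print(site, len(replicated[site]), len(fragmented[site]))
```

Replication stores more data in total but lets any site answer any query locally; fragmentation stores each record once but may require a remote lookup.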
Fragmented Database
Database Architecture
- Shared Resource and Client/Server Systems
- Four basic client/server models (access remotely / process locally)
- Applications run at a server
- Applications run on local PCs
- Applications run on both the local PCs and the server
- Applications and key elements of the database are split between the PCs and the server
Web Databases
- Databases on the Web
- Catalogs
- Libraries
- Directories
- Client lists and profiles
- Points to consider (when linking a database to the Internet)
- Which application to use
- How to ensure Web surfers do not interfere with database updates
- How to maintain security
Data Warehousing
- Data warehouse
- Collection of data that supports management decision making
- From Database to Data Warehouse
- Hardware
- Data and Software
- Phases in Building a Data Warehouse
- Extraction Phase
- Cleansing Phase
- Loading Phase
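The three phases can be sketched as three small functions. Everything specific here is an assumption for illustration: the source rows, the cleansing rules (trim whitespace, normalize capitalization, drop rows with a missing salary), and the `warehouse_employee` table name. The slides name the phases but do not prescribe how each is carried out.

```python
import sqlite3

# Hypothetical raw rows as they might arrive from an operational system.
source_rows = [
    ("  smith ", "4530", "24000"),
    ("Jones", "4530", ""),          # missing salary
    ("LEE", "2210", "22000"),
]


def extract():
    """Extraction phase: pull the raw rows out of the source system."""
    return list(source_rows)


def cleanse(rows):
    """Cleansing phase: normalize names, convert types, drop incomplete rows."""
    cleaned = []
    for name, department, salary in rows:
        if not salary.strip():
            continue
        cleaned.append((name.strip().title(), int(department), int(salary)))
    return cleaned


def load(conn, rows):
    """Loading phase: write the cleansed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS warehouse_employee "
        "(last_name TEXT, department INTEGER, salary INTEGER)"
    )
    conn.executemany("INSERT INTO warehouse_employee VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, cleanse(extract()))
loaded = warehouse.execute(
    "SELECT last_name, department, salary FROM warehouse_employee ORDER BY last_name"
).fetchall()
print(loaded)
```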
Data Mining
- Selecting, exploring, and modeling data to
discover unknown relationships
Knowledge Management
- Where to find information
- Gathering, organizing, sharing, analyzing, and disseminating knowledge to improve an organization's performance
- Transfer knowledge into databases
- Filter and separate the most relevant knowledge
- Organize knowledge in databases that either
- Allow other employees to easily access the knowledge
- Push specific knowledge to employees based on their prespecified needs
Ethical and Societal Issues: A Too-Risky Info Highway
- Out of Hand -- Out of Control
- DBMSs allow organizations to collect, maintain, and sell vast amounts of private personal data easily.
- The Web: A Source of Data Collection
- Many consumers provide information daily without being aware of it.
Ethical and Societal Issues: A Too-Risky Info Highway
- Our Finances Exposed
- Companies share private financial information with other organizations.
- The Upside
- Database technology enables better and faster services.
- Makes markets more competitive.