Data and Knowledge Management - PowerPoint PPT Presentation

About This Presentation
Title:

Data and Knowledge Management

Description:

Information Processing. Compare - in kind and in time ... Changes uploaded to central computer. Agenda. Information processing. Database ... – PowerPoint PPT presentation

Number of Views:17
Avg rating:3.0/5.0
Slides: 68
Provided by: SBA52
Learn more at: https://www.csus.edu
Category:

less

Transcript and Presenter's Notes

Title: Data and Knowledge Management


1
Chapter 10
  • Data and Knowledge Management

2
Agenda
  • Information processing
  • Database
  • Data Administrator
  • The DBMS
  • Distributing data
  • Data warehousing and data mining

3
Data
  • Set of discrete, objective facts about events
  • Business - structured records of transactions
  • Little relevance or purpose

4
Information
  • Message with sender and receiver
  • Meant to change way receiver perceives something
  • Have an impact on his judgment / behavior

5
Data Processing
  • Contextualize - why was data gathered?
  • Categorize - what are its key components?
  • Calculate - analyze mathematically
  • Condense - summarize in more concise form

6
Information Processing
  • Compare - in kind and in time
  • Consequences - how used in decisions / actions
  • Connections - relation to other information
  • Conversation - what other people think about this
    information

7
Agenda
  • Information processing
  • Database
  • Data Administrator
  • The DBMS
  • Distributing data
  • Data warehousing and data mining

8
Database
  • Element
  • Types
  • Structure
  • Models
  • Creation
  • Topology

9
Element
  • Bit, byte, field, record, file, database
  • Entity, attribute, key field
  • Relation
  • Class, object

10
Database Types
  • Business database
  • Geographical information database
  • Knowledge database / deductive database
  • Multimedia database
  • Data warehouse
  • Data marts
  • Multimedia and hypermedia database
  • Object-oriented database

11
Database Structure
  • Data definition language
  • Schema subschema
  • Data Manipulation language
  • Structured Query Language (SQL)
  • Query By Example (QBE)
  • Data dictionary

12
Database Models
  • Hierarchical
  • One to many
  • TPS or routine MIS
  • Network
  • Many to many
  • TPS or routine MIS
  • Relational
  • Normalization
  • Ad hoc reports or DSS
  • Object-oriented
  • E-commerce

13
Database Creation
  • Conceptual design
  • Logical view
  • Entity-relationship (ER) diagram
  • Normalization

14
Entity Relationship Diagram
  • Entity object or concept
  • Relationship meaning association between
    objects
  • Attribute property of an object
  • Simple Composite
  • Single-valued multi-valued
  • Derived
  • Key
  • Primary key
  • Foreign key

15
Normalization
  • A technique for identifying a true primary key
    for a relation
  • Types
  • First normal form not repeating group
  • Second normal form every non-primary-key
    attribute is fully functionally dependent on the
    entire primary key
  • Third normal form no transit dependency

16
Structured Query Language
  • Select
  • Join

17
SQL DML - SELECT
  • SELECT DISTINCTALL col-expr AS
    newname,...
  • FROM table-name alias ,...
  • WHERE condition
  • GROUP by colm , colm
  • HAVING condition
  • ORDER BY colm , colm

18
SQL DML - SELECT
  • SELECT attributes (or calculations , -, /, )
  • FROM relation
  • SELECT DISTINCT attributes
  • FROM relation

19
Examples
  • SELECT stuname FROM student
  • SELECT stuid, stuname, credit FROM student
  • SELECT stuid, stuname, credit10 FROM
    student
  • SELECT DISTINCT major FROM student

20
SQL DML - SELECT
  • SELECT attributes (or wild card)
  • FROM relation
  • WHERE condition

21
Examples
  • SELECT FROM student
  • SELECT stuname, major, credit FROM
    student WHERE stuid S114
  • SELECT FROM faculty WHERE dept
    MIS

22
SELECT - WHERE condition
  • AND OR
  • NOT IN
  • NOT IN BETWEEN
  • IS NULL IS NOT NULL
  • LIKE '' multiple characters
  • LIKE _ single characters

23
Examples
  • SELECT FROM faculty WHERE dept
    MIS AND rank full professor
  • SELECT FROM faculty WHERE dept
    MIS OR rank full professor
  • SELECT FROM faculty WHERE dept
    MIS NOT rank full professor

24
  • SELECT FROM class WHERE room LIKE
    B_S
  • SELECT FROM class WHERE room NOT
    LIKE BUS
  • SELECT productid, productname FROM
    inventory WHERE onhand BETWEEN 50 and
    100

25
  • SELECT companyid, companyname FROM
    company WHERE companyname BETWEEN G
    AND K
  • SELECT productid, productname FROM
    inventory WHERE onhand NOT BETWEEN 50
    and 100
  • SELECT companyid, companyname FROM
    company WHERE companyname NOT BETWEEN
    G AND K

26
  • SELECT facname FROM faculty WHERE
    dept IN (MIS, ACT)
  • SELECT facname FROM faculty WHERE
    rank NOT IN (assistant, lecture)
  • SELECT customername FROM customer WHERE
    emailadd IS NOT NULL

27
  • SELECT customername FROM customer WHERE
    creditlimit IS NULL

28
SELECT - aggregate functions
  • COUNT ()
  • COUNT
  • SUM
  • AVG
  • MIN
  • MAX

29
Examples
  • SELECT COUNT() FROM student
  • SELECT COUNT(major) FROM student
  • SELECT COUNT(DISTINCT major) FROM student

30
  • SELECT COUNT(stuid), SUM(credit),
    AVG(credit), MAX(credit),
    MIN(credit) FROM student

31
SELECT - GROUP
  • GROUP BY
  • HAVING

32
Examples
  • SELECT major, AVG(credit) FROM
    student GROUP BY major HAVING COUNT()
    gt 2
  • SELECT course, COUNT(stuid) FROM
    enrollment GROUP BY course HAVING
    COUNT() gt 2

33
  • SELECT major, AVG(credit) FROM
    student WHERE major IN (MIS, ACT)
    GROUP BY major HAVING COUNT() gt 2

34
SELECT - ORDER BY
  • ORDER BY
  • ORDER BY ... DESC

35
Examples
  • SELECT facname, rank FROM faculty
  • ORDER BY facname
  • SELECT facname, rank FROM faculty
  • ORDER BY rank DESC, facname

36
SELECT - JOIN Tables
  • Multiple tables in FROM clause
  • MUST have join conditions!!!

37
Examples
  • SELECT stuname, grade FROM student,
    enrollment
  • WHERE student.stuid enrollment.stuid

38
  • SELECT enrollment.course, stuname,
    major FROM class, enrollment, student WHERE
    class.course enrollment.course AND
    enrollment.stuid student.stuid AND
    facid F114 ORDER BY enrollment.course

39
SUBQUERY, EXIST, NOT EXIST
  • SELECT s.stuname, major FROM student
    s WHERE EXIST (SELECT FROM
    enrollment e WHERE s.stuid
    e.stuid)

40
  • SELECT s.stuname, major FROM student
    s WHERE NOT EXIST (SELECT
    FROM enrollment e WHERE s.stuid
    e.stuid)

41
Database Creation
  • Physical design
  • Physical view
  • Data topology (organization)
  • Centralized
  • Distributed database
  • Replicated database
  • Partitioned
  • Organization access method
  • Sequential file
  • Indexed sequential file
  • Direct or random file
  • Security
  • Logical, physical, and transmitting

42
Selection Criteria
  • User needs (type of application)
  • Compatibility
  • Portability
  • Reliability
  • Cost
  • Feature
  • Performance
  • Vendors support
  • Others?

43
Agenda
  • Information processing
  • Database
  • Data Administrator
  • The DBMS
  • Distributing data
  • Data warehousing and data mining

44
Data Administrator
  • Clean up data definitions
  • Control shared data
  • Manage distributed data
  • Maintain data quality

45
Clean Up Definitions
  • Synonyms / aliases
  • Standard data definitions
  • Names and formats
  • Data Dictionary
  • Active
  • Integrated

46
Control Shared Data
  • Local - used by one unit
  • Shared - used by two or more activities
  • Impact of proposed program changes on shared data
  • Program-to-data element matrix

47
Manage Distributed Data
  • Geographically dispersed
  • Whether shared data or not
  • Different levels of detail
  • Different management levels

48
Maintain Data Quality
  • Put owners in charge of data
  • Verify data accuracy and quality
  • Purge old data

49
Agenda
  • Information processing
  • Database
  • Data Administrator
  • The DBMS
  • Distributing data
  • Data warehousing and data mining

50
The DBMS
  • Data Base Management System software that
    permits a firm to
  • Centralize data
  • Manage them efficiently
  • Provide access to applications
  • Such as payroll, inventory

51
DBMS Components
  • Data Definition Language (DDL)
  • Data Manipulation Language (DML)
  • Inquiry Language (IQL)
  • Teleprocessing Interface (TP)

52
Definitions
  • Views
  • Physical - how stored
  • Logical - how viewed and used by users
  • Schema - Overall logical layout of records and
    fields in a database
  • Subschema Individual users logical portion of
    database (view)

53
Agenda
  • Information processing
  • Database
  • Data Administrator
  • The DBMS
  • Distributing data
  • Data warehousing and data mining

54
Distributing Data
  • Centralized files
  • Fragemented files
  • Distribute data without duplication
  • Users unaware of where data located

55
Distributing Data
  • Replicated files
  • Data duplicated
  • One site has master file
  • Problem with data synchronization
  • Decentralized files
  • Local data autonomy

56
Distributing Data
  • Distributed files
  • Client / server systems
  • Stored centrally
  • Portion downloaded to workstation
  • Workstation can change data
  • Changes uploaded to central computer

57
Agenda
  • Information processing
  • Database
  • Data Administrator
  • The DBMS
  • Distributing data
  • Data warehousing and data mining

58
Data Warehousing
  • Collect large amounts of data from multiple
    sources over several years
  • Classify each record into multiple categories
  • Age
  • Location
  • Gender

59
Data Warehousing
  • Rapidly select and retrieve by multiple
    dimensions
  • All females in Chicago under 25 years of age
  • Provide tailored, on-demand reports
  • Data mart a replicated subset of the data
    warehouse
  • A functional or regional area

60
Data Mining
  • Fitting models to, or determining patterns from,
    warehoused data
  • Purposes
  • Analyze large amount of data
  • Find critical points of knowledge
  • Perform automatic analyses

61
Data Mining Terms
  • Data Visualization
  • Drill-down Analysis
  • Hierarchical structure
  • Leads to increasing level of detail
  • Expert System (ES) methodology
  • e.g., neural networks

62
Applications
  • Finance - fraud detection
  • Stock Market - forecasting
  • Real estate - property evaluation
  • Airlines - customer retention
  • Retail - customer targeting

63
Data Mining Example
  • What type customers are buying specific
    products?
  • When are the times customers will most
    likely shop?
  • What types of products can be sold together?

64
Points to Remember
  • Information processing
  • Database
  • Data Administrator
  • The DBMS
  • Distributing data
  • Data warehousing and data mining

65
Discussion Questions
  • How can a database help an organization?
  • Why normalization is very important for building
    a database?
  • Do you see any problem of the database in your
    organization?

66
Discussion Questions
  • What kind of database model is most suitable for
  • School?
  • Department store?
  • Police?
  • Some organizations are hesitant to distribute
    data. These organizations feel that they may lose
    control.
  • Do they lose control? Why?
  • Could you suggest a good tactic?
  • Could Data Mining pose a threat to individual
    privacy?
  • Why or why not?
  • If so, how can we mitigate that threat?
  • Do the advantages outweigh the disadvantages?

67
Assignment
  • Review chapters 10
  • Read chapter 8, 9, and 11
  • Group assignment
  • Research paper
Write a Comment
User Comments (0)
About PowerShow.com