Prof. Ray Larson - PowerPoint PPT Presentation

About This Presentation
Title:

Prof. Ray Larson

Description:

Lecture 11: Intro to Database Design SIMS 202: Information Organization and Retrieval Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 ... – PowerPoint PPT presentation

Number of Views:220
Avg rating:3.0/5.0
Slides: 71
Provided by: ValuedGate1644
Category:

less

Transcript and Presenter's Notes

Title: Prof. Ray Larson


1
Lecture 11 Intro to Database Design
SIMS 202 Information Organization and Retrieval
  • Prof. Ray Larson Prof. Marc Davis
  • UC Berkeley SIMS
  • Tuesday and Thursday 1030 am - 1200 pm
  • Fall 2003
  • http//www.sims.berkeley.edu/academics/courses/is2
    02/f03/

2
Lecture Overview
  • Review
  • MediaStreams
  • Databases and Database Design
  • Database Life Cycle
  • ER Diagrams
  • Discussion
  • Next Time/Readings

3
Lecture Overview
  • Review
  • MediaStreams
  • Databases and Database Design
  • Database Life Cycle
  • ER Diagrams
  • Discussion
  • Next Time/Readings

4
Streams vs. Clips
5
Stream-Based Representation
  • Makes annotation pay off
  • The richer the annotation, the more numerous the
    possible segmentations of the video stream
  • Clips
  • Change from being fixed segmentations of the
    video stream, to being the results of retrieval
    queries based on annotations of the video stream
  • Annotations
  • Create representations which make clips, not
    representations of clips

6
Keywords vs. Semantic Descriptors
dog, biting, Steve
7
Why Keywords Dont Work
  • Are not a semantic representation
  • Do not describe relations between descriptors
  • Do not describe temporal structure
  • Do not converge
  • Do not scale

8
Natural Language vs. Visual Language
Jack, an adult male police officer, while walking
to the left, starts waving with his left arm, and
then has a puzzled look on his face as he turns
his head to the right he then drops his facial
expression and stops turning his head,
immediately looks up, and then stops looking up
after he stops waving but before he stops
walking.
9
After Capture Media Streams
10
Media Streams Features
  • Key features
  • Stream-based representation (better segmentation)
  • Semantic indexing (what things are similar to)
  • Relational indexing (who is doing what to whom)
  • Temporal indexing (when things happen)
  • Iconic interface (designed visual language)
  • Universal annotation (standardized markup schema)
  • Key benefits
  • More accurate annotation and retrieval
  • Global usability and standardization
  • Reuse of rich media according to content and
    structure

11
Video Retrieval In Media Streams
  • Same interface for annotation and retrieval
  • Assembles responses to queries as well as finds
    them
  • Query responses use semantics to degrade
    gracefully

12
Lecture Overview
  • Review
  • MediaStreams
  • Databases and Database Design
  • Database Life Cycle
  • ER Diagrams
  • Discussion
  • Next Time/Readings

13
What is a Database?
14
Files and Databases
  • File A collection of records or documents
    dealing with one organization, person, area or
    subject (Rowley)
  • Manual (paper) files
  • Computer files
  • Database A collection of similar records with
    relationships between the records (Rowley)
  • Bibliographic, statistical, business data,
    images, etc.

15
Database
  • A Database is a collection of stored operational
    data used by the application systems of some
    particular enterprise (C.J. Date)
  • Paper Databases
  • Still contain a large portion of the worlds
    knowledge
  • File-Based Data Processing Systems
  • Early batch processing of (primarily) business
    data
  • Database Management Systems (DBMS)

16
Why DBMS?
  • History
  • 50s and 60s all applications were custom built
    for particular needs
  • File based
  • Many similar/duplicative applications dealing
    with collections of business data
  • Early DBMS were extensions of programming
    languages
  • 1970 - E.F. Codd and the Relational Model
  • 1979 - Ashton-Tate and first Microcomputer DBMS

17
File Based Systems
File
Application
Toys
Addresses
Delivery List
Naughty
Coal Estimation
Toys
Nice
Just what asked for
18
From File Systems to DBMS
  • Problems with file processing systems
  • Inconsistent data
  • Inflexibility
  • Limited data sharing
  • Poor enforcement of standards
  • Excessive program maintenance

19
DBMS Benefits
  • Minimal data redundancy
  • Consistency of data
  • Integration of data
  • Sharing of data
  • Ease of application development
  • Uniform security, privacy, and integrity controls
  • Data accessibility and responsiveness
  • Data independence
  • Reduced program maintenance

20
Terms and Concepts
  • Data independence
  • Physical representation and location of data and
    the use of that data are separated
  • The application doesnt need to know how or where
    the database has stored the data, but just how to
    ask for it
  • Moving a database from one DBMS to another should
    not have a material effect on application program
  • Recoding, adding fields, etc. in the database
    should not affect applications

21
Database Environment
22
Database Components
23
Types of Database Systems
  • PC databases
  • Centralized database
  • Client/server databases
  • Distributed databases
  • Database models

24
PC Databases
E.g. Access FoxPro Dbase Etc.
25
Centralized Databases
26
Client Server Databases
27
Distributed Databases
Location B
Location C
computer
computer
Homogeneous Databases
computer
Location A
28
Distributed Databases
Client
Heterogeneous Or Federated Databases
Remote Comp.
Local Network
Database Server
Comm Server
Remote Comp.
Client
29
Terms and Concepts
  • A database application is an application
    program (or set of related programs) that is used
    to perform a series of database activities
  • Create
  • Add new data to the database
  • Read
  • Read current data from the database
  • Update
  • Update or modify current database data
  • Delete
  • Remove current On behalf of database users

30
Terms and Concepts
  • Enterprise
  • Organization
  • Entity
  • Person, Place, Thing, Event, Concept...
  • Attributes
  • Data elements (facts) about some entity
  • Also sometimes called fields or items or domains
  • Data values
  • Instances of a particular attribute for a
    particular entity

31
Terms and Concepts
  • Key
  • An attribute or set of attributes used to
    identify or locate records in a file
  • Primary Key
  • An attribute or set of attributes that uniquely
    identifies each record in a file

32
Terms and Concepts
  • Models
  • (1) Levels or views of the Database
  • Conceptual, logical, physical
  • (2) DBMS types
  • Relational, Hierarchic, Network, Object-Oriented,
    Object-Relational

33
Models (1)
More later on this
34
Data Models(2) History
  • Hierarchical Model (1960s and 1970s)
  • Similar to data structures in programming
    languages

35
Data Models(2) History
  • Network Model (1970s)
  • Provides for single entries of data and
    navigational links through chains of data.

36
Data Models(2) History
  • Relational Model (1980s)
  • Provides a conceptually simple model for data as
    relations (typically considered tables) with
    all data visible

37
Data Models(2) History
  • Object Oriented Data Model (1990s)
  • Encapsulates data and operations as Objects

38
Data Models(2) History
  • Object-Relational Model (1990s)
  • Combines the well-known properties of the
    Relational Model with such OO features as
  • User-defined datatypes
  • User-defined functions
  • Inheritance and sub-classing
  • All of the major enterprise DBMS systems are now
    Object-Relational or incorporate
    Object-Relational features

39
Lecture Overview
  • Review
  • MediaStreams
  • Databases and Database Design
  • Database Life Cycle
  • ER Diagrams
  • Discussion
  • Next Time/Readings

40
Database System Life Cycle
41
Design
  • Determination of the needs of the organization
  • Development of the Conceptual Model of the
    database
  • Typically using Entity-Relationship diagramming
    techniques
  • Construction of a Data Dictionary
  • Development of the Logical Model

42
Physical Creation
  • Development of the Physical Model of the Database
  • Data formats and types
  • Determination of indexes, etc.
  • Load a prototype database and test
  • Determine and implement security, privacy and
    access controls
  • Determine and implement integrity constraints

43
Conversion
  • Convert existing data sets and applications to
    use the new database
  • May need programs, conversion utilities to
    convert old data to new formats

44
Integration
  • Overlaps with Phase 3
  • Integration of converted applications and new
    applications into the new database

45
Operations
  • All applications run full-scale
  • Privacy, security, access control must be in
    place
  • Recovery and Backup procedures must be
    established and used

46
Growth, Change, and Maintenance
  • Change is a way of life
  • Applications, data requirements, reports, etc.
    will all change as new needs and requirements are
    found
  • The Database and applications and will need to be
    modified to meet the needs of changes

47
Another View of the Life Cycle
Integration 4
Operations 5
Design 1
Physical Creation 2
Conversion 3
Growth, Change 6
48
Lecture Overview
  • Review
  • MediaStreams
  • Databases and Database Design
  • Database Life Cycle
  • ER Diagrams
  • Discussion
  • Next Time/Readings

49
Database Design Process
50
Entity
  • An Entity is an object in the real world (or even
    imaginary worlds) about which we want or need to
    maintain information
  • Persons (e.g. customers in a business,
    employees, authors)
  • Things (e.g. purchase orders, meetings, parts,
    companies)

Employee
51
Attributes
  • Attributes are the significant properties or
    characteristics of an entity that help identify
    it and provide the information needed to interact
    with it or use it (this is the Metadata for the
    entities)

52
Relationships
  • Relationships are the associations between
    entities
  • They can involve one or more entities and belong
    to particular relationship types

53
Relationships
54
Types of Relationships
  • Concerned only with cardinality of relationship

1
1
n
1
n
m
Chen ER notation
55
Other Notations
Crows Foot
56
Other Notations
IDEFIX Notation
57
More Complex Relationships
1/1/1
1/n/n
n/n/1
SSN
Project
Date
1
4(2-10)
Manages
1
Is Managed By
n
58
Weak Entities
  • Owe existence entirely to another entity

59
Supertype and Subtype Entities
60
Many to Many Relationships
Employee
61
Lecture Overview
  • Review
  • MediaStreams
  • Databases and Database Design
  • Database Life Cycle
  • ER Diagrams
  • Discussion
  • Next Time/Readings

62
Questions Brooke Maury
  • Discussion Questions on Hoffer McFadden
  • The relational database model has remained fairly
    static since its inception in the 1970s. Is this
    evidence of its strength as an organizational
    model or an indication of its inflexibility?  

63
Questions Brooke Maury
  • If the goal of the relational database model is
    to encode a conceptual design into a logical
    design, is it possible that improved technology
    and the development of new modeling techniques
    will supplant the RDBMS? Specifically, what
    impact will XML and the development of document
    engineering have on organizing information in
    multiple normalized tables?
  • Conversely, what does the relational model have
    that would be lost if a conceptual design was
    encoded in another model?

64
Questions Brooke Maury
  • (Next time?) The drive to develop the RDBM was in
    part motivated by a need to minimize the space
    required and improve the performance of database
    systems by removing redundancies. What impact
    will very inexpensive data storage and computing
    power have on the relational database model and
    the third normal form especially?

65
Questions Shane Ahern
  • Discussion Questions for "Logical Database Design
    and the Relational Model"
  • Is the normalization process described really
    necessary? When I design a database schema, I
    find that by thinking of tables in terms of they
    entities they represent (employees, sales,
    events), I avoid most of the problems of
    normalization that the process seeks to address
    (i.e. salesperson and region in Sales table,
    salesperson is clearly a distinct entity from
    sales). If the formal process described in the
    article is not followed, are there potential
    pitfalls that might lead to problems with your
    database schema?

66
Questions Shane Ahern
  • The article points out that "the relational model
    does not yet directly support supertype/subtype
    relationships." Once the tables in a relational
    database have been decomposed to third normal
    form, the database is efficient from systems
    point-of-view, but the tables no longer represent
    a representation of the data that is intuitive to
    humans. The object-oriented model more accurately
    mirrors the way we think about the concepts that
    we wish to store in databases. So perhaps
    object-oriented database systems are worth
    considering. What about XML databases?

67
Questions Arthur Law
  • The three models that we have been presented
    with, Entity Relationship Model, NIAM Model, and
    Object Oriented Model all enforce a specific
    thought process in the organization and
    relationship between items in a database. With
    all of our recent discussion of computers
    understanding natural language are these methods
    now out of date with how we should be organizing
    information? Should we use artificial
    intelligence or learning algorithms to
    statistically determine the relationship between
    entities or is there still value in using these
    models?

68
Questions Arthur Law
  • Each model is approximately one decade apart in
    development and a quick Google search shows that
    companies are using databases with one of the
    three models. However, as new models arise there
    doesn't seem too much interest in migrating from
    one data model to another. Which makes sense
    given that an organization using a given model
    probably finds that it works. Now with the
    proliferation of XML, we see more information
    being shared between organizations, so are we
    fated for an expensive and lengthy translation
    process between databases? Or should all DB
    administrators be responsible for upgrading to
    the latest model?

69
Lecture Overview
  • Review
  • MediaStreams
  • Databases and Database Design
  • Database Life Cycle
  • ER Diagrams
  • Database Design
  • Discussion
  • Next Time/Readings

70
Next Time
  • Database Design Normalization and SQL
  • Readings (no additional DBMS readings)
  • Additional Questions/ or revisit some of todays
    discussion questions in the light of the next
    lecture?
Write a Comment
User Comments (0)
About PowerShow.com