Schema Evolution - PowerPoint PPT Presentation

About This Presentation
Title:

Schema Evolution

Description:

If certain attribute specification is dropped, then the resulting description ... Keep the attribute in the generalization if more relevant ... – PowerPoint PPT presentation

Number of Views:20
Avg rating:3.0/5.0
Slides: 21
Provided by: cobasep
Learn more at: http://web.cs.ucla.edu
Category:

less

Transcript and Presenter's Notes

Title: Schema Evolution


1
Schema Evolution
2
Schema Evolution
  • Ref Alex Borgida K. E. Williamson.
    Accommodating Exceptions in Databases, and
    Refining the Schema by Hearing From Them. VLDB,
    1985.
  • SCHEMA
  • User
  • Describes the domain of discoveries of the
    database
  • Check the correctness of the data entry
  • DBMS
  • Storage efficiency (store extra attributes and
    non-unique keys, attribute domains constrain
    violations)
  • Faster retrieval
  • Semantic query optimization
  • Integrity constraints - used to enforce
    additional conditions not captured by the
    restrictive language of class definition
  • All employees have a supervisor
  • Supervisor wage gt Employee wage

3
  • Exceptions In Database
  • Some employees make more than their supervisor.
  • Some employees may earn a foreign degree, which
    is not even close to the American ones.
  • How To Live With Exceptions
  • Storage and access efficiency.
  • Semantic computation (employee is a consultant).
  • Sharing (US dollars are now Swiss francs).
  • Validating of future updates.
  • Use object concept to represent exceptional
    information so that such information can be
    maintained
  • Alert the user when exceptional information is
    being manipulated as well as when constraints are
    violated

4
  • Learning From Exceptions
  • Modifying integrity constraints to apply in more
    restricted circumstances.
  • Adding new attributes to a class definition.
  • Defining a new class and placing it in the
    current sub-hierarchy.
  • Using Empirical Generalization For Schema
    Refinement
  • DBA keeps track of violations and exceptions to
    constraints in the schema.
  • When sufficient evidence is accumulated, it
    suggests one or more alternative changes to the
    schema.

5
The Generalization Algorithm
  • Introduce/expand range
  • Introduce/generalize class
  • Supervisor Accountant -gt Supervisor Employee
  • Drop attribute specification
  • Research (age 30, Ph.D.) and Research (age 35,
    MS)
  • -gt Research (age 30-35, MS-Ph.D.)

6
Refinement to the Generalization Process
  • If certain attribute specification is dropped,
    then the resulting description becomes overly
    generalized.
  • If no longer detects some of the errors it
    detects earlier.
  • Need heuristic for determining when an attribute
    constraint is essential.
  • Notion of RELEVANCE
  • Keep the attribute in the generalization if more
    relevant
  • Discard the attributes in the generalization if
    less relevant

7
  • Integrity constraints must be refined to
    disregard the contradictions raised by exception
    value.
  • Machine learning to find commonalities among the
    elements of a set of objects, such as exception
    to a rule encountered so far and characterize
    these in the form of a class description.
  • Use such characterization to adjust schema.
  • Changing restriction on property value range
  • Modifying integrity constraints
  • Define new classes
  • Shortcomings
  • Heuristic nature of the RELEVANCE function.
  • Inability to handle negation and disjunction.

8
A Proposal For Automatic Schema Evolution
  • Predefine Schema templates
  • Monitoring the exception frequency of constraints
  • Switch over to new schema when exception
    frequency reaches a certain pre-specified level
  • New schema is selected based on the exception
    characteristics

9
  • Ref Q. Li and D. McLeod. Object Flavor
    Evolution Through Learning in an Object-Oriented
    Database System. Proceedings of the 2nd
    International Conference on Expert Database
    Systems, 1989, pp. 469-495.
  • PKM Personal Knowledge Model
  • A simple object-oriented model
  • Based on a small number of simple concepts
  • Stepwise develop more complex and abstract
    database models
  • Level 1 Atomic Object - Non-decomposable units of
    information at a certain stage (e.g., John
    Smith).
  • Level 2 Open Atomic Set - A collection of atomic
    objects, members are not fixed (e.g., P
    names).

10
  • Closed Atomic Set - A collection of atomic
    objects, members are fixed e.g., sex, age (gt 0,
    integer)
  • Mapping - Mapping from one object to another
  • - Domain range mapping (e.g.,
    has sender mapping from letter to person)
  • Social - Concept describes relationship (e.g.,
    Jon has GPA 3.75)
  • Procedure - Operation/methods (e.g., method for
    creating or destroying objects)

11
  • Level 3 Open Social Set - classified a
    collection of objects (e.g., applicants)
  • Closed Social Set (e.g., courses)
  • Composed Mapping - virtual objects
    derived by composing other objects
  • Composed Procedure - virtual object derived
    by grouping a collection of
    other procedure objects

12
Evolution System Architecture
  • ICE Intelligent Concept Evolver
  • How, what when to evolve
  • PKM Personal Knowledge Manager
  • LFI Learning From Instruction (passive learning)
  • Initiated by user
  • Acquire knowledge from user
  • Dialogue-based
  • Involved some inference
  • LFE Learning From Exceptions (passive learning)
  • Learning is triggered by
  • Violations of constraints
  • Definition of the object in the DB
  • Involved more inference than LFI

13
Evolution System Architecture (contd)
  • LFO Learning From Observation (active learning)
  • Check if any evolution should be conducted
  • Discovery of target evolution
  • Creation of classification criteria for set
    object and similar tasks
  • LFO is automatic, more inference than LFI and LFE

14
Examples
  • 1) Find GPA of all students in the third world
    country. Since third world country is not
    defined in the schema, ICE conducts LFI (learning
    from instruction)
  • Acquire the concept of third world country given
    the member of the country and ask the user to
    identify if they are a third world country.
  • ICE learning all the third world countries in the
    DB introduces a new object countries-class with
    member third world, second world, and first
    world. Countries evolved to countries class.
  • LFO (learning from observation) is triggered to
    find mapping cardinality constraint of has
    class, also determines if it is a many-to-one
    mapping.
  • 2) CS 10 -gt CS 10a, 10b, 10c
  • CS 10 violation! ICE calls LFE and LFI
  • LFI learns there are 10a, 10b, 10c (same course
    name, unit number, instructor, and students)
  • Students of each section should be proper subset
    of original student set.

15
  • 3) Honor student has GPA gt 3.75
  • Honor students can also be selected based on
    recommendation letters from Professors but GPA lt
    3.75
  • ICEs LFO (learning from observation) learning
    that every instance satisfies a generalization
    condition GPA gt 3.70
  • ICE changes the honor students condition from
    GPAgt3.75 to 3.70

16
(No Transcript)
17
Examples Of Distributed Intelligent Database
Systems
18
Planning Scheduling Initiative (PI)
  • Sponsored by the Defense Advanced Research
    Project Agency (DARPA) Rome Laboratories (RL)
  • Goal
  • Promote automated planning technology
  • Smooth transition of the planning technology to
    operational planning and scheduling problems
  • Approach
  • Tier 1 Independent research effects
  • develops automated reasoning capabilities that
    are relevant to PI
  • Tier 2 Integrated Feasibility Demos
  • Merging efforts in Tier 1 into a single system
  • and single operational problem
  • Sizes Involved more than 20 contractors

19
Operational Problems Focus
  • Transportation Planning
  • Provide transportation planner with an ability to
    rapidly generate/evaluate/modify transportation
    plan
  • Initial version used in DESERT SHIELD
  • Features
  • Interconnected with a large number of
    Heterogeneous DBs (thousands)
  • Distributed, Multi-Agent Architecture
    (delegation, negotiation)
  • Knowledge-based (representative, acquisition)
  • Dynamic
  • Common Prototype Environment
  • Visionary Demo

20
Database Requirements
  • Distributed
  • Heterogeneous
  • Intelligent (KB/DB)
  • Temporal Evolutionary
  • Multimedia
  • Security
Write a Comment
User Comments (0)
About PowerShow.com