Web-Base Management Systems - PowerPoint PPT Presentation

Provided by: aaron
Learn more at: https://dsf.berkeley.edu
Transcript and Presenter's Notes

1
Web-Base Management Systems
  • Aaron Brown and David Oppenheimer
  • CS294-7
  • February 11, 1999

2
Introduction
  • Online data is stored in both databases
    (relational) and web sites (hypertext)
  • Need single framework to manage both types of
    data and present integrated views
  • Solution: Web-Base Management Systems (WBMSs)
  • Two challenges:
  • 1) querying and extracting structure from
    semi-structured web data, transforming it, and
    presenting custom views
  • 2) mapping structured database data to the web
    (adding navigational access paths, redundancy,
    ...)
  • To address these challenges, we need a data model
    that maps between relational and hypertextual
    models

3
ARANEUS Data Models
4
ARANEUS Data Model
  • ADM: logical data model for web hypertexts
  • Based on page schemes and navigational access
    paths
  • Page scheme: logical structure shared by a set
    of pages
  • Like a class
  • Web page: instance of a page scheme
  • Like an object with an identifier (URL) and
    attributes
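
The class/object analogy on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not ARANEUS code: the `PageScheme` and `WebPage` names, the example URL, and the attribute values are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PageScheme:
    """Logical structure shared by a set of pages (plays the role of a class)."""
    name: str
    attributes: tuple  # attribute names every instance must supply


@dataclass
class WebPage:
    """One page: an instance of a page scheme, identified by its URL."""
    scheme: PageScheme
    url: str
    values: dict = field(default_factory=dict)

    def __post_init__(self):
        # Reject pages that do not follow their scheme.
        missing = [a for a in self.scheme.attributes if a not in self.values]
        if missing:
            raise ValueError(f"{self.url} is missing attributes: {missing}")


# Hypothetical scheme and page, for illustration only.
author_scheme = PageScheme("AuthorPage", ("Name", "WorkList"))
page = WebPage(author_scheme, "http://example.org/authors/1",
               {"Name": "A. Author", "WorkList": ["Paper 1", "Paper 2"]})
```

The URL acts as the object identifier, and the scheme check in `__post_init__` is one possible way to enforce that every page of a scheme carries the same attributes.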

5
ADM Example Fragment
6
Adding Structure to HTML
  • [Diagram. Labels: EDITOR, ULIXES, PENELOPE; Relational, ADM, HTML, ER, NCM; Structure, Navigational access]

7
EDITOR: Structuring HTML
  • EDITOR starts with an existing ADM scheme
  • Generated by inspection of web site
  • EDITOR maps web page text to attributes of an ADM
    page scheme
  • Wrapping a web page
  • Imposes structure on web pages
  • EDITOR uses a procedural language to guide the
    wrapping process
  • Each page seen as object with extraction methods
  • One method for each attribute of page
  • Method accesses the page's HTML source, extracts
    value of corresponding attribute
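
A rough sketch of the "page as object with one extraction method per attribute" idea follows. It uses Python regular expressions as a stand-in for EDITOR's own procedural language, which is not shown in the slides; the `AuthorPageWrapper` class, the HTML layout, and the attribute names are assumptions for illustration.

```python
import re


class AuthorPageWrapper:
    """Wraps one page's HTML source; one extraction method per attribute."""

    def __init__(self, html):
        self.html = html

    def name(self):
        # Assumes the author's name is the page's first <h1> element.
        match = re.search(r"<h1>(.*?)</h1>", self.html, re.S)
        return match.group(1).strip() if match else None

    def work_list(self):
        # Assumes each work is rendered as one <li> element.
        return [item.strip() for item in re.findall(r"<li>(.*?)</li>", self.html, re.S)]


SAMPLE_HTML = """<html><body>
<h1> A. Author </h1>
<ul><li>First paper</li><li>Second paper</li></ul>
</body></html>"""

wrapper = AuthorPageWrapper(SAMPLE_HTML)
record = {"Name": wrapper.name(), "WorkList": wrapper.work_list()}
```

Each method reads only the HTML source and returns the value of one attribute, so the resulting `record` has the shape the page scheme prescribes.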

8
Querying ADM-Structured Hypertext
  • [Diagram. Labels: EDITOR, ULIXES, PENELOPE; Relational, ADM, HTML, ER, NCM; Structure, Navigational access]

9
ULIXES: A Navigational Query Language
  • Language for defining relational views over
    hypertext that follows an ADM scheme
  • Based on navigational expressions (path
    expressions)
  • DEFINE TABLE statement creates relational views
    based on page schemes
  • local materialized view (tuples) or
  • virtual view
  • user can then pose SQL queries across multiple
    views
  • optimizer chooses optimal navigation path through
    site to satisfy query
  • fetches hypertext pages and extracts attributes
    via EDITOR wrappers
  • cost metric is number of HTML page fetches
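
The page-fetch cost metric can be illustrated with a toy plan chooser. This is a sketch under stated assumptions, not the ULIXES optimizer: the plan names, the intermediate page schemes `ConferenceListPage`, `ConferencePage`, and `PaperPage`, and all page counts are invented for the example.

```python
def plan_cost(plan):
    """Cost of a navigation plan = total number of HTML pages it fetches."""
    return sum(pages for _step, pages in plan)


def choose_plan(plans):
    """Return the name of the cheapest plan under the page-fetch metric."""
    return min(plans, key=lambda name: plan_cost(plans[name]))


# Hypothetical plans and page counts for the same query.
plans = {
    "via_author_search": [("AuthorSearchPage", 1), ("AuthorPage", 1)],
    "via_conference_list": [("ConferenceListPage", 1), ("ConferencePage", 25),
                            ("PaperPage", 400)],
}
best = choose_plan(plans)
```

Here the author-search path costs 2 fetches against 426 for the conference-list path, so it is selected. A real optimizer would have to estimate these counts rather than receive them as inputs.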

10
ULIXES Example
  • DEFINE TABLE VLDBPapers (Authors, Title,
    Reference)
  • AS AuthorSearchPage.NameForm.Submit ->
    AuthorPage.WorkList
  • IN DBLPScheme
  • USING AuthorPage.WorkList.Authors,
    AuthorPage.WorkList.Title,
    AuthorPage.WorkList.Reference
  • WHERE AuthorSearchPage.NameForm.Name =
    'Leonardo Da Vinci' AND
    AuthorPage.WorkList.Reference LIKE '%VLDB%'
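
To make the semantics of the statement concrete, the following Python sketch evaluates the same view over an in-memory stand-in for the site. It only mimics the navigation, projection, and filter steps; the `AUTHOR_PAGES` data (titles, references) is made up, and no real site is contacted.

```python
# Hypothetical stand-in for the pages reached by submitting the name form.
AUTHOR_PAGES = {
    "Leonardo Da Vinci": {
        "WorkList": [
            {"Authors": "L. Da Vinci", "Title": "Paper A", "Reference": "VLDB 1996"},
            {"Authors": "L. Da Vinci", "Title": "Paper B", "Reference": "SIGMOD 1997"},
        ]
    }
}


def vldb_papers(name, pattern="VLDB"):
    """Follow the form to the author's page, then project and filter WorkList."""
    author_page = AUTHOR_PAGES.get(name)  # AuthorSearchPage.NameForm.Submit
    if author_page is None:
        return []
    return [
        (w["Authors"], w["Title"], w["Reference"])  # USING clause
        for w in author_page["WorkList"]
        if pattern in w["Reference"]  # substring test stands in for LIKE
    ]


rows = vldb_papers("Leonardo Da Vinci")
```

The returned tuples correspond to rows of the `VLDBPapers` table, which could then be stored locally (materialized view) or recomputed on demand (virtual view).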

11
Generating ADM from existing DB
  • [Diagram. Labels: EDITOR, ULIXES, PENELOPE; Relational, ADM, HTML, ER, NCM; Structure, Navigational access]

12
The ARANEUS Design Methodology
  • Database Conceptual Design (ER)
  • Hypertext Conceptual Design (NCM)
  • Database Logical Design (relational)
  • Hypertext Logical Design (ADM)
  • DB Mapping (PENELOPE) and Page Design (HTML)
  • Web Site Generation

13
Database Conceptual Model
  • Starting point for database design
  • Conceptual description of a domain
  • Represents essential properties of data
    abstractly
  • Entity-Relationship Model
  • Based on entities and relationships among
    entities
  • Rectangles: entity sets
  • Associated attributes are connected with lines
  • Diamonds: relationship sets
  • Lines connect entity sets via relationship sets

14
ER Example
15
Hypertext Conceptual Design
  • ER not suitable for modeling hypertext
  • no directed paths (links)
  • hypertext access paths not modeled (web page
    hierarchies)
  • no way to group related entities into a single
    macroentity
  • Navigational Conceptual Model (NCM) describes
    these conceptual properties of hypertext
  • macroentities (groups of related ER entities)
    model hypertext nodes
  • associated with simple (atomic) or complex
    (structured) attributes, either mono- or
    multi-valued
  • directed relationships model links (may be
    bidirectional)
  • union nodes model link targets that can be of
    different types
  • aggregations model hierarchical access paths

16
Mapping ER to NCM Example
17
Mapping NCM to ADM
  • 1) macroentity -> one or more pages
  • single-valued attribute -> ADM simple attribute
  • multi-valued attribute -> ADM list
  • 2) directed relationship -> link to another page
    scheme
  • anchor: a descriptive key of target macroentity
  • reference: URL of target page scheme
  • 3) aggregation node -> ADM unique page scheme
  • unique page scheme: page scheme with only one
    instance
  • 4) long lists -> forms
  • list items retrieved through program running on
    server
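
Rules 1 to 3 above are mechanical enough to sketch as functions. The dictionary layout, the `"Page"` naming suffix, and the example macroentities below are assumptions chosen for the illustration; rule 4 (long lists become forms) is omitted because it depends on a server-side program.

```python
def macroentity_to_page_scheme(name, attributes, links=()):
    """Apply rules 1 and 2 to one macroentity.

    attributes: {attr_name: "single" | "multi"}
    links: iterable of (link_name, target_macroentity)
    """
    scheme = {"name": name + "Page", "unique": False, "attributes": {}}
    for attr, arity in attributes.items():
        # Rule 1: single-valued -> simple attribute, multi-valued -> list.
        scheme["attributes"][attr] = "TEXT" if arity == "single" else "LIST"
    for link_name, target in links:
        # Rule 2: directed relationship -> link to the target page scheme.
        scheme["attributes"][link_name] = {"LINK TO": target + "Page"}
    return scheme


def aggregation_to_page_scheme(name, members):
    """Rule 3: an aggregation node becomes a unique page scheme."""
    return {
        "name": name + "Page",
        "unique": True,
        "attributes": {"To" + m: {"LINK TO": m + "Page"} for m in members},
    }


author = macroentity_to_page_scheme(
    "Author", {"Name": "single", "Papers": "multi"}, [("ToConference", "Conference")]
)
home = aggregation_to_page_scheme("Home", ["Author", "Conference"])
```

The `unique` flag marks a page scheme with exactly one instance, such as a home page that aggregates links to the other schemes.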

18
Mapping NCM to ADM Example
19
The ARANEUS Design Methodology
  • Database Conceptual Design (ER)
  • Hypertext Conceptual Design (NCM)
  • Database Logical Design (relational)
  • Hypertext Logical Design (ADM)
  • DB Mapping (PENELOPE) and Page Design (HTML)
  • Web Site Generation

20
Generating web site from ADM and DB
  • [Diagram. Labels: EDITOR, ULIXES, PENELOPE; Relational, ADM, HTML, ER, NCM; Structure, Navigational access]

21
Hypertext Views of DB Data
  • Given a database and an ADM scheme for it
  • database may be local
  • derived from design methodology
  • uses derived ADM scheme
  • composed from one or more remote sites
  • derived from integrated relational view produced
    by one or more ULIXES queries
  • uses new ADM scheme concocted to match integrated
    view
  • PENELOPE language used to integrate ADM and DB in
    a generated hypertext
  • PENELOPE description = ADM augmented with URLs
    and references to database fields

22
PENELOPE Description
  • Query: reorganize (Da Vinci's VLDB) papers based
    on year

DEFINE PAGE YearPage
AS URL URL(<Year>)
   Year: TEXT <Year>
   WorkList: LIST OF (Authors: TEXT <Authors>
                      Title: TEXT <Title>
                      Reference: TEXT <Reference>
                      ToRefPage: LINK TO ConferencePage UNION JournalPage <ToRefPage>)
FROM DaVinciPapers

DEFINE PAGE DaVinciYearsPage UNIQUE
AS URL result.html
   YearList: LIST OF (Year: TEXT <Year>
                      ToYearPage: LINK TO YearPage (URL(<Year>)))
FROM DaVinciPapers

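
What the two DEFINE PAGE statements above ask for can be approximated in Python: one page per year, plus a single index page linking to them. This is a hedged sketch, not PENELOPE itself. The `DA_VINCI_PAPERS` rows, the `year_<n>.html` URL pattern, and the HTML layout are assumptions, and the `ToRefPage` link is left out for brevity.

```python
from collections import defaultdict
from html import escape

# Hypothetical rows standing in for the DaVinciPapers relation.
DA_VINCI_PAPERS = [
    {"Year": 1996, "Authors": "L. Da Vinci", "Title": "Paper A", "Reference": "VLDB 1996"},
    {"Year": 1996, "Authors": "L. Da Vinci", "Title": "Paper B", "Reference": "VLDB 1996"},
    {"Year": 1997, "Authors": "L. Da Vinci", "Title": "Paper C", "Reference": "VLDB 1997"},
]


def year_url(year):
    """Plays the role of URL(<Year>): derive a page URL from a field value."""
    return f"year_{year}.html"


def generate_site(rows):
    """Return {url: html} with one YearPage per year plus the unique index page."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["Year"]].append(row)

    pages = {}
    for year, works in sorted(by_year.items()):
        # WorkList: one list item per paper published in that year.
        items = "".join(
            f"<li>{escape(w['Authors'])}: {escape(w['Title'])} ({escape(w['Reference'])})</li>"
            for w in works
        )
        pages[year_url(year)] = f"<h1>{year}</h1><ul>{items}</ul>"

    # DaVinciYearsPage is UNIQUE: exactly one instance, at a fixed URL.
    links = "".join(f'<li><a href="{year_url(y)}">{y}</a></li>' for y in sorted(by_year))
    pages["result.html"] = f"<ul>{links}</ul>"
    return pages


site = generate_site(DA_VINCI_PAPERS)
```

Grouping by `Year` is what turns a flat relation into the nested `WorkList` of each `YearPage`, and reusing `year_url` for both the page URL and the link target keeps the generated hypertext consistent.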
23
Derived Hypertext View
24
Resulting Web Pages
25
Retrospective
  • Exceptions during wrapping
  • Logically homogeneous pages may be physically
    heterogeneous
  • Different ways of laying out the same information
  • Errors masked by browsers
  • ULIXES syntax is difficult for beginners
  • Alternatives
  • Fill out forms corresponding to pre-determined
    ULIXES queries
  • Developed POLYPHEMUS query interface
  • User selects path for query by clicking on
    graphical representation of ADM page schemes
  • Push vs. Pull
  • Either supported; hybrid model preferred
  • Dealing with updates
  • each DB update generates a mixed transaction that
    updates both the DB and any pushed (static) HTML
    pages
  • Managing internal sites
  • PENELOPE-generated HTML includes description of
    page scheme and tags attributes
  • Like XML but uses HTML comments
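
The last point, tagging attributes inside HTML comments, can be sketched as a round trip: tag values when generating a page, then recover them without a hand-written wrapper. The comment syntax used here (`<!-- Name -->` and `<!-- /Name -->`) is an assumption for illustration; the slides do not show the format PENELOPE actually emits.

```python
import re
from html import escape, unescape


def tag_attribute(name, value):
    """Wrap a rendered value in HTML comments naming its ADM attribute."""
    return f"<!-- {name} -->{escape(value)}<!-- /{name} -->"


def extract_attributes(html):
    """Recover {attribute: value} from pages tagged by tag_attribute."""
    pattern = re.compile(r"<!-- (\w+) -->(.*?)<!-- /\1 -->", re.S)
    return {name: unescape(value) for name, value in pattern.findall(html)}


page_html = (
    "<!-- scheme: YearPage -->"
    "<h1>" + tag_attribute("Year", "1996") + "</h1>"
    "<p>" + tag_attribute("Title", "Paper A & B") + "</p>"
)
recovered = extract_attributes(page_html)
```

Because browsers ignore comments, the page renders normally while remaining machine-readable, which makes wrapping an internally generated site much simpler than wrapping an arbitrary external one.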

26
Conclusion
  • ARANEUS provides database-like functionality for
    mixed web/relational DB data
  • More to be filled in later...