Title: Functions of a Web Warehouse
1Functions of a Web Warehouse
- Kai Cheng, Yahiko Kambayashi, Seok Tae Lee
- Graduate School of Informatics, Kyoto University, Japan
- and Mukesh Mohania
- Western Michigan University, USA
2Table of Contents
- Survival from Information Explosion
- Warehouse-Mediated Content Delivery
- Community-Oriented Web Warehouses
- Technical Issues
- Warehouse Enhanced Web Caching
- Related Work
- Concluding Remarks
3Survival from Information Explosion
- Web Traffic Doubled Every 3-6 Months
- Exponential Growth of the Web
- 1 Billion Pages, January 2000
- 2 Billion Pages, June 2000
- 100 Times Increase in the Next 2 Years
Information Overload for both Nets and Users
4Scale up the Web and Internet
- More Bandwidth
- Never Keep Pace with the Traffic Growth
- More Server Capacity
- How to Deal with Hot-Spots ?
- Site Replication
- Only Benefit Replicated Servers
5Our Approach
- Tame the Chaotic Information Streams: Saving Redundant Data Transfers
- Unite the Individual Users: Sharing Findings and Efforts of Each Other
6Warehouse-Mediated Content Delivery
- QoS: Server and Network Overloaded
- Personalized Services: Unrealistic
- Information Hunting: Difficult
7Indirect Content Delivery
Web Warehouse
8Community-Oriented Web Warehousing
The Community of Users: People with Special Information Needs/Interests
Sharing and Contribution
9Examples of User Community
10Real/Cyber Communities
(a) Real Communities Dependent on Location
(b) Cyber Communities Independent of Location
11Technical Issues
- Functions of a Web Warehouse
- Web Caching vs. Web Warehousing
- Data Warehousing vs. Web Warehousing
- Dynamic Hierarchical Web Warehouses
12Functions of a Web Warehouse
- Buffering
- Transformation
- Transcoding
- Summarizing
- Content Analysis
- Notification
Transform
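The functions listed on this slide can be sketched as a small pipeline. This is only an illustration under stated assumptions: the class `WebWarehouse`, its methods, the keyword rule, and the example URL are hypothetical names chosen here, not part of the presented system. Transcoding is omitted, and summarizing is reduced to truncation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

Callback = Callable[[str, str], None]


@dataclass
class WebWarehouse:
    """Toy warehouse (hypothetical): buffer, transform, analyze, notify."""
    store: Dict[str, str] = field(default_factory=dict)            # buffering: url -> text
    subscribers: Dict[str, List[Callback]] = field(default_factory=dict)

    def subscribe(self, keyword: str, callback: Callback) -> None:
        # Register a user's interest in a keyword.
        self.subscribers.setdefault(keyword.lower(), []).append(callback)

    @staticmethod
    def summarize(text: str, max_words: int = 8) -> str:
        # Transformation (summarizing): keep only the first few words.
        return " ".join(text.split()[:max_words])

    @staticmethod
    def keywords(text: str) -> Set[str]:
        # Content analysis: naive set of lower-cased words longer than 3 letters.
        words = (w.strip(".,;:!?").lower() for w in text.split())
        return {w for w in words if len(w) > 3}

    def ingest(self, url: str, text: str) -> str:
        self.store[url] = text                      # buffering
        summary = self.summarize(text)              # transformation
        for kw in self.keywords(text):              # content analysis
            for cb in self.subscribers.get(kw, []):
                cb(url, summary)                    # notification
        return summary


received = []
wh = WebWarehouse()
wh.subscribe("skiing", lambda url, s: received.append((url, s)))
wh.ingest("http://example.org/a",
          "Skiing conditions in Nagano are excellent this week for all visitors")
```

After the call to `ingest`, the page is buffered in `wh.store` and the subscriber interested in "skiing" has received the URL together with the shortened text.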
13Research Program
Warehousing
Transformation
Content Analysis
Web Caching
14From Web Caching to Web Warehousing
Items      | Web Caching  | Web Warehousing
Object     | Data         | Information
Objective  | Reusing      | Sharing
Storage    | Bounded      | Bound-Free
Population | Responses    | Web View
Model      | FS Dependent | Hypermedia
15From Data Warehousing to Web Warehousing
Items        | Data WH              | Web WH
1 Objective  | Decision Support     | Information Sharing
2 Model      | RDB/OORDB            | Hypermedia
3 Population | View Materialization | Resource Discovery, Content Localization
4 Resource   | Operational Data     | Web Documents
5 Data Type  | Structured           | Semi-/Un-structured
6 Tie to Web | DWH ? Web            | WWH ? Web
16Warehouse as Shared Information Repository
- Real Communities:
- Centralized Management of Warehouses
- Unicast Data Transfer
- Cyber Communities:
- Distributed Management of Warehouse
- Multicast Data Transfer
17Hierarchy of Web Warehouses
Sports
HP Design
Skiing
Tennis
Mr. A, Ms. C, Mrs. D
Mr. A, Mr. D, ..
18Dynamic Formation of Web Warehouses (Split)
Skiing
Tennis
B
A
19Dynamic Formation of Web Warehouses (Union)
Painting
Drawing
A
B
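The split and union operations on the two preceding slides can be sketched as operations on a topic-to-members map. This is a minimal illustration, assuming each warehouse records which members follow which topic. The names `Warehouse`, `split`, `union`, and the parent name "Art" are hypothetical. The topics and the members A and B are taken from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Warehouse:
    name: str
    # topic -> members interested in that topic
    interests: Dict[str, Set[str]] = field(default_factory=dict)

    def members(self) -> Set[str]:
        # All members of the warehouse, regardless of topic.
        return set().union(*self.interests.values())


def split(parent: Warehouse) -> Dict[str, Warehouse]:
    """Split: form one child warehouse per topic of the parent."""
    return {t: Warehouse(t, {t: set(m)}) for t, m in parent.interests.items()}


def union(name: str, a: Warehouse, b: Warehouse) -> Warehouse:
    """Union: merge two warehouses into one that covers both topic sets."""
    merged: Dict[str, Set[str]] = {}
    for wh in (a, b):
        for topic, m in wh.interests.items():
            merged.setdefault(topic, set()).update(m)
    return Warehouse(name, merged)


sports = Warehouse("Sports", {"Skiing": {"A", "B"}, "Tennis": {"B"}})
children = split(sports)

art = union("Art",
            Warehouse("Painting", {"Painting": {"A"}}),
            Warehouse("Drawing", {"Drawing": {"B"}}))
```

Here `children` holds separate Skiing and Tennis warehouses, while `art` serves both A and B from a single merged warehouse.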
20Current Status: Content-Sensitive Caching
Warehousing
Web Caching
21Content-Sensitive Cache Replacement Policy
- Cache Replacement: Keep? Replace?
- Traditional Caching
- Long-Time Observation → Replacement Decision
- 60% One-Access Objects: How to Differentiate?
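The share of one-access objects in a request trace can be measured with a few lines of code. The sketch below is illustrative only: the function name `one_access_share` and the sample trace are made up here, and the result says nothing about the figure quoted on the slide.

```python
from collections import Counter
from typing import Iterable


def one_access_share(requests: Iterable[str]) -> float:
    """Fraction of distinct objects that were requested exactly once."""
    counts = Counter(requests)
    if not counts:
        return 0.0
    singles = sum(1 for c in counts.values() if c == 1)
    return singles / len(counts)


# Hypothetical trace: "/a" is requested three times, all others once.
trace = ["/a", "/b", "/a", "/c", "/d", "/a", "/e"]
share = one_access_share(trace)
```

A policy that decides on access history alone sees the same history, a single request, for every one-access object, which is why the slide asks how to differentiate them.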
22LRU-SP: Content-Sensitive Size-Adjusted Popularity-Aware LRU
- Daily Indexing
- Cache Content → Indices
- Indices → Popular Topics
- How Similar?
- New Document vs. Popular Topics
- Benefit/Size Model
- Observed Popularity, Inherent Popularity
- Implement this Model
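One way to read the benefit/size idea on this slide is sketched below. It is not the LRU-SP algorithm itself: recency is left out, and the scoring rule (observed hits plus inherent popularity, divided by size), the cosine similarity against a popular-topic vector, and all names (`ContentSensitiveCache`, `Entry`, `cosine`) are assumptions made for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Dict


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Entry:
    size: int
    hits: int          # observed popularity
    inherent: float    # inherent popularity: similarity to popular topics


class ContentSensitiveCache:
    def __init__(self, capacity: int, popular_topics: Counter):
        self.capacity = capacity
        self.popular_topics = popular_topics   # e.g. rebuilt by daily indexing
        self.entries: Dict[str, Entry] = {}
        self.used = 0

    def score(self, e: Entry) -> float:
        # Assumed benefit/size model: (observed + inherent) / size.
        return (e.hits + e.inherent) / e.size

    def access(self, url: str, text: str) -> None:
        if url in self.entries:
            self.entries[url].hits += 1
            return
        size = max(len(text), 1)
        if size > self.capacity:
            return  # object can never fit; do not cache it
        inherent = cosine(Counter(text.lower().split()), self.popular_topics)
        # Evict the lowest-scoring entries until the new object fits.
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda u: self.score(self.entries[u]))
            self.used -= self.entries.pop(victim).size
        self.entries[url] = Entry(size, 1, inherent)
        self.used += size


cache = ContentSensitiveCache(60, Counter({"skiing": 3, "snow": 2}))
cache.access("/ski", "skiing snow report today")
cache.access("/stocks", "stock market news daily")
cache.access("/tennis", "tennis open results")
```

In this run, `/ski` and `/stocks` have each been accessed once when `/tennis` arrives, so observed popularity alone cannot separate them. The inherent-popularity term keeps `/ski`, which matches the popular topics, and evicts `/stocks`.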
23Related Work
- LSAM's Proxy Cache (Push)
- Multicast-Based Virtual Cache
- Affinity Groups and Push Channels
- INTELSAT's Wormhole Content Delivery
- Warehouse-Kiosk Model
- Satellite-Based Delivery Platform
24Concluding Remarks
- Proposed to Cope with the Scaling Problems by Web Warehouse-Mediated Content Delivery
- Discussed the Basic Functions of a Web Warehouse: Buffering, Transformation, Notification and Content Analysis
- Introduced our Current Work: Warehouse-Enhanced Web Caching