Multidimensional Databases - PowerPoint PPT Presentation

About This Presentation
Title:

Multidimensional Databases

Description:

Immersidata (e.g., haptic). User profiles & aggregation/clustering. ISI'02. Storing multidimensional data (matrix vs. relations). Indexing multidimensional data ...

Slides: 14
Provided by: rolfes
Learn more at: https://infolab.usc.edu
Transcript and Presenter's Notes


1
Multidimensional Databases
  • Challenge: representation for efficient storage,
    indexing, and querying
  • Examples (time-series, images)
  • New multidimensional data sets and approaches
  • Graphs (e.g., road networks)
  • Immersidata (e.g., haptic)
  • User profiles and aggregation/clustering

2
Challenges
  • Storing multidimensional data (matrix vs.
    relations)
  • Indexing multidimensional data (R-tree)
  • Queries
  • Search for similar objects (similarity
    search) [ICDE00, ICME00]
  • Spatial and temporal queries [IDEAS00, ACM-GIS01,
    KAIS02]
  • Multidimensional data mining
  • Aggregation [EDBT02, PODS02]
  • Clustering [ACM-MMj02]
  • Classification [INFORMS02]
  • Finding outliers [SSDBM01]
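
The "matrix vs. relations" storage choice from the first bullet can be illustrated with a minimal Python sketch. The first two rows reuse values from the Market-Relation example later in the deck; the third row, the helper names, and the dense list-of-lists layout are assumptions made for illustration only, not the presenter's implementation.

```python
# Relation: one tuple per fact. The third row is hypothetical.
relation = [
    ("LA", "Shoes", 21500),
    ("NY", "Jacket", 28700),
    ("LA", "Jacket", 12000),
]

# Dimension dictionaries map attribute values to matrix coordinates.
stores = sorted({row[0] for row in relation})
products = sorted({row[1] for row in relation})
store_idx = {s: i for i, s in enumerate(stores)}
product_idx = {p: j for j, p in enumerate(products)}

# Dense matrix: one cell per (store, product) pair, 0 where no tuple exists.
matrix = [[0] * len(products) for _ in stores]
for store, product, sale in relation:
    matrix[store_idx[store]][product_idx[product]] += sale


def sale_from_relation(store, product):
    """Scan every tuple; cost grows with the number of rows."""
    return sum(s for st, p, s in relation if st == store and p == product)


def sale_from_matrix(store, product):
    """Direct cell lookup; fast, but empty cells still occupy space."""
    return matrix[store_idx[store]][product_idx[product]]
```

The relation stores only the facts that exist, while the matrix trades space for positional access, which is the tension the bullet points at.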

3
Stock Prices
[Figure: stock price time series S1 ... Sn]
4
More Similarity Search and Clustering
Shapes [ICDE99, ICME00]
5
On-Line Analytical Processing (OLAP)
Market-Relation
  • Multidimensional data sets
  • Dimension attributes (e.g., Store, Product, Date)
  • Measure attributes (e.g., Sale, Price)
  • Range-sum queries
  • Average sale of shoes in CA in 2001
  • Number of jackets sold in Seattle in Sep. 2001
  • Tougher queries
  • Covariance of sale and price of jackets in CA in
    2001 (correlation)
  • Variance of price of jackets in 2001 in Seattle
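
One way to read the "tougher queries" above is that average, variance, and covariance can all be derived from plain range-sums (of 1, x, x*x, and x*y) over the same selection. The sketch below assumes that reading; the `Row` type, the `state` field, the sample rows, and all function names are hypothetical, and it scans the relation directly rather than using any pre-computation.

```python
from dataclasses import dataclass


@dataclass
class Row:
    store: str
    state: str
    product: str
    year: int
    sale: float
    price: float


# Hypothetical sample data, not taken from the slides.
rows = [
    Row("LA", "CA", "Jacket", 2001, 10.0, 1.0),
    Row("SF", "CA", "Jacket", 2001, 20.0, 3.0),
    Row("Seattle", "WA", "Jacket", 2001, 30.0, 2.0),
    Row("LA", "CA", "Shoes", 2001, 40.0, 5.0),
]


def range_sum(rows, predicate, value):
    """Sum value(row) over the rows selected by predicate."""
    return sum(value(r) for r in rows if predicate(r))


def average(rows, predicate, f):
    n = range_sum(rows, predicate, lambda r: 1)
    return range_sum(rows, predicate, f) / n


def covariance(rows, predicate, f, g):
    """Population covariance E[fg] - E[f]E[g], built only from range-sums."""
    n = range_sum(rows, predicate, lambda r: 1)
    sum_f = range_sum(rows, predicate, f)
    sum_g = range_sum(rows, predicate, g)
    sum_fg = range_sum(rows, predicate, lambda r: f(r) * g(r))
    return sum_fg / n - (sum_f / n) * (sum_g / n)


def variance(rows, predicate, f):
    return covariance(rows, predicate, f, f)


def jackets_ca_2001(r):
    return r.product == "Jacket" and r.state == "CA" and r.year == 2001
```

For example, `covariance(rows, jackets_ca_2001, lambda r: r.sale, lambda r: r.price)` needs four range-sums over the same region, so any structure that answers range-sums quickly can answer these queries as well.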

Store Location  Product  Date     Sale    Price
LA              Shoes    Jan. 01  21,500  85.99
NY              Jacket   June 01  28,700  45.99
. . .           . . .    . . .    . . .   . . .

[Figure: query plan computing Avg(sale) over σ(d <in> 2001), σ(s <in> CA), and σ(p = shoe) applied to Market-Relation. Too slow!]
6
Example Solution (Pre-computation): Prefix-sum
[Agrawal et al. 1997]
[Figure: Age vs. Salary grid with the source Age/Salary table]
  • Issues
  • Measure attribute should be pre-selected
  • Aggregation function should be pre-selected
    (sum or count)
  • Updates are expensive (need re-computation)
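
The pre-computation idea and the update issue listed above can be sketched for the two-dimensional case as follows. This is a generic textbook-style prefix-sum array with inclusion-exclusion, written under the assumption of a small dense grid of measure values; the function names and the zero-padded layout are choices made here, not details from the slides.

```python
def build_prefix_sums(grid):
    """Return P where P[i][j] is the sum of grid[0..i-1][0..j-1].

    P carries an extra zero row and column so range queries need no
    special cases at the borders.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    P = [[0] * (n_cols + 1) for _ in range(n_rows + 1)]
    for i in range(n_rows):
        for j in range(n_cols):
            P[i + 1][j + 1] = grid[i][j] + P[i][j + 1] + P[i + 1][j] - P[i][j]
    return P


def range_sum(P, r1, c1, r2, c2):
    """Sum of grid[r1..r2][c1..c2] (inclusive) from four prefix-sum cells."""
    return P[r2 + 1][c2 + 1] - P[r1][c2 + 1] - P[r2 + 1][c1] + P[r1][c1]


grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
P = build_prefix_sums(grid)
```

Each query touches four cells regardless of the range size. Changing a single value such as `grid[0][0]`, however, affects every `P[i][j]` with `i, j >= 1`, which is why the slide lists updates as expensive.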

[Figure: Result computed from regions I, II, III, and IV]
7
Spatial and Temporal Data
Complex Queries
[ACM-GIS01, VLDB01]
  • Data types
  • A point: <latitude, longitude, altitude> or <x,
    y, z>
  • A line-segment: <x1, y1, x2, y2>
  • A line: sequence of line-segments
  • A region: a closed set of lines
  • Moving point: <x, y, t> (e.g., car, train, ...)
  • Changing region: <region, value, t> (e.g.,
    changing temperature of a county)
  • Queries
  • Rivers <intersect> Countries
  • Hospitals <in> Cities
  • Taxi <within> 5km of Home <in the next> 10 min
  • Experiments <overlap> BrainR
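
As a rough illustration of the moving-point type and the "within ... in the next ..." style of query, the sketch below treats a track as a list of <x, y, t> samples and filters them by distance and time window. The class name, the units (km and minutes), the planar Euclidean distance, and the sample track are all assumptions; a real system would likely interpolate between samples and use an index.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MovingPoint:
    x: float  # km, assumed planar coordinates
    y: float  # km
    t: float  # minutes


def within(sample, center, radius):
    """True if the sample lies within radius of center (Euclidean distance)."""
    return math.hypot(sample.x - center[0], sample.y - center[1]) <= radius


def within_in_the_next(track, center, radius, now, horizon):
    """Samples inside the circle whose timestamp falls in [now, now + horizon]."""
    return [
        s for s in track
        if now <= s.t <= now + horizon and within(s, center, radius)
    ]


# Hypothetical taxi track and home location.
home = (0.0, 0.0)
taxi = [
    MovingPoint(10.0, 0.0, 0.0),
    MovingPoint(4.0, 3.0, 5.0),
    MovingPoint(3.0, 0.0, 12.0),
]
hits = within_in_the_next(taxi, home, radius=5.0, now=0.0, horizon=10.0)
```

Here only the second sample qualifies: the first is too far away and the third falls outside the 10-minute window.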

[Visual99]
8
Spatial and Temporal Data: Queries
  • Data types
  • A point: <latitude, longitude, altitude> or <x,
    y, z>
  • A line-segment: <x1, y1, x2, y2>
  • A line: sequence of line-segments
  • A region: a closed set of lines
  • Moving point: <x, y, t> (e.g., objects, car,
    train, ...)
  • Queries
  • Molecules <intersect> Microbes
  • Train-stations <in> Cities
  • Round objects <within> 5cm of Hand <in the next>
    10 s
  • Number of distractions in <south-east> of subject

9
Spatial and Temporal Data: Queries
  • K Nearest Neighbor queries: find the k nearest
    objects to a query point (e.g., the 5 closest
    hospitals to my car)
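
A k-nearest-neighbor query can be sketched as a linear scan, shown below. This is deliberately the brute-force baseline, not the R-tree-based approach an indexed system would use; the hospital names and coordinates are hypothetical, and distances are planar Euclidean.

```python
import heapq
import math


def k_nearest(objects, query, k):
    """Return the k (name, location) pairs closest to query.

    Scans every object once; an index such as an R-tree would avoid
    visiting most of them.
    """
    return heapq.nsmallest(k, objects, key=lambda obj: math.dist(obj[1], query))


# Hypothetical hospital locations and car position.
hospitals = [
    ("A", (0.0, 1.0)),
    ("B", (5.0, 5.0)),
    ("C", (2.0, 0.0)),
    ("D", (-3.0, 0.0)),
]
car = (0.0, 0.0)
closest_two = k_nearest(hospitals, car, k=2)
```

With this data the two closest hospitals to the car are A (distance 1) and C (distance 2).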

10
Immersidata and Mining Queries
[CIKM01, UACHI01]
11
Immersidata and Mining Queries
A dynamic sign, e.g., ASL colors
Subject-1
12
User Profiles Clustering: Offline Processes
PPED Similarity Measure and Clustering
Favorite Features (Rock: High, Classical: Low, Pop: Low, Rap: High)
Voting
13
User Profiles Clustering: Online Processes
Current User's Profile