Title: Polaris: A System for Query, Analysis,
1 Polaris: A System for Query, Analysis, and Visualization of Relational Databases
- Chris Stolte
- May 29th, 2002
2 Motivation
- Large multi-dimensional databases have become very common
- corporate data warehouses
- Amazon, Walmart, ...
- scientific projects
- Human Genome Project
- Sloan Digital Sky Survey
- Need effective tools for exploration and analysis of these databases
3 Existing Tools: Charts
- typically provide a gallery of charts
- hard to iteratively explore
- simple charts can display few dimensions
4 Existing Tools: Pivot Tables
- common interface to data warehouses
- simple interface based on drag-and-drop
- generate text tables from databases
5 Polaris: Extending Pivot Tables
- generate rich table-based graphical displays rather than tables of text
- single conceptual model for both graphs and tables
- preserve ability to rapidly construct displays
6 Polaris Design Goals
- Two main design goals
- Interactive analysis and exploration versus static visualization
- Simple, consistent interface
7 Design Goal: Analysis and Exploration
- Want to extract meaning from data
- Process of hypothesis, experiment, and discovery
- Path of exploration is unpredictable
8 UI Requirements for Exploration
- Data-dense displays: display both many tuples and many dimensions
- Multiple display types: different displays suited to different tasks
- Exploratory interfaces: rapidly change data transformations and views
9 Design Goal: Simple, Consistent UI
- Excel Pivot tables provide a simple interface for building text-based tables
- Graphs require multiple steps and different interfaces and conceptual models
- Want to unify tables, graphs, and database queries in one interface
10 Polaris Demo
11 Display Types
- Gantt charts of events for a parallel graphics application on a 32-processor SGI machine.
- Flights between major airports in the USA.
- Source code colored by cache misses for a parallel graphics application.
- Major wars and the births of well-known scientists as a timeline.
12 Polaris Formalism
- UI interpreted as visual specification (in XML) that defines
- table configuration
- type of graphic in each pane
- encoding of data as visual properties of marks
- data transformations
- Specification then compiled into queries and drawing commands to generate visualization
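The four parts of the specification listed above can be pictured with a small sketch. The slides do not show the actual Polaris XML schema, so every element and attribute name below is hypothetical and chosen only to mirror the four bullet points; the field names are borrowed from the later table algebra examples.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: element and attribute names are illustrative
# assumptions, not the real Polaris specification format.
SPEC_XML = """
<specification>
  <table columns="Quarter" rows="ProductType x Profit" />
  <pane mark="bar" />
  <encoding property="color" field="ProductType" />
  <transform field="Profit" aggregate="SUM" />
</specification>
"""

spec = ET.fromstring(SPEC_XML)

# Pull out the four parts named on the slide.
table_config = spec.find("table").attrib        # table configuration
graphic_type = spec.find("pane").get("mark")    # type of graphic in each pane
encodings = [e.attrib for e in spec.findall("encoding")]    # visual encodings
transforms = [t.attrib for t in spec.findall("transform")]  # data transformations
```

Because such a specification is plain data, a separate step is free to decide how to turn it into queries and drawing commands.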
13 Design Decision: Use a Formalism
- Why a formalism?
- unification: unify tables and graphs
- expressiveness: build visualizations designers did not think of
- interface simplicity: clearly defined semantics and operations
- code simplicity: composable language versus monolithic objects
- declarative: can state what, not how; allows for optimization, etc.
14 Example specification
15 Specifying Table Configurations
- Interface: define table configuration by dropping fields on shelves
- Formalism: shelf content interpreted as expressions in table algebra
- Can express extremely wide range of table configurations
16 Specifying Table Configurations
- Operands are the database fields
- each operand interpreted as a set
- quantitative and ordinal fields interpreted differently
- Four operators
- concatenation (+), cross (x), nest (/), dot (.)
17 Table Algebra: Operands
- Ordinal fields: interpret domain as a set that partitions table into rows and columns
- Quarter = {(Qtr1), (Qtr2), (Qtr3), (Qtr4)}
- Quantitative fields: treat domain as single-element set and encode spatially as axes
- Profit = {(Profit[-410, 650])}
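The two set interpretations can be sketched in a few lines of Python. This is a minimal illustration, not the Polaris implementation: the helper names `ordinal_set` and `quantitative_set`, the sample rows, and the choice of first-seen order for ordinal values are all assumptions made for the example.

```python
# Hypothetical sample rows; field names follow the slide's example.
rows = [
    {"Quarter": "Qtr1", "Profit": -410},
    {"Quarter": "Qtr2", "Profit": 120},
    {"Quarter": "Qtr3", "Profit": 650},
    {"Quarter": "Qtr4", "Profit": 300},
]

def ordinal_set(rows, field):
    """One single-element entry per distinct value, in first-seen order."""
    seen = []
    for row in rows:
        if row[field] not in seen:
            seen.append(row[field])
    return [(value,) for value in seen]

def quantitative_set(rows, field):
    """A single entry carrying the field name and its value range (one axis)."""
    values = [row[field] for row in rows]
    return [((field, min(values), max(values)),)]

quarter = ordinal_set(rows, "Quarter")    # four entries, one per quarter
profit = quantitative_set(rows, "Profit")  # one entry: the Profit axis
```

An ordinal field yields one entry per member, so it splits the table into that many rows or columns, while a quantitative field always yields a single entry that becomes an axis.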
18 Concatenation (+) operator
- Ordered union of set interpretations
Profit + Sales = {(Profit[-310, 620]), (Sales[0, 1000])}
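As a rough sketch of the ordered union, assuming each set interpretation is held as a Python list of tuples (a representation chosen for this example, with the hypothetical helper name `concat`):

```python
def concat(a, b):
    """Ordered union: all entries of a, then entries of b not already present."""
    return a + [entry for entry in b if entry not in a]

# Quantitative fields as single-entry sets: (name, low, high) describes an axis.
profit = [(("Profit", -310, 620),)]
sales = [(("Sales", 0, 1000),)]

result = concat(profit, sales)
```

The result keeps the operand order, so the Profit axis is laid out before the Sales axis.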
19 Cross (x) operator
- Cross-product of set interpretations
Quarter x ProductType = {(Qtr1, Coffee), (Qtr1, Tea), (Qtr2, Coffee), (Qtr2, Tea), (Qtr3, Coffee), (Qtr3, Tea), (Qtr4, Coffee), (Qtr4, Tea)}
ProductType x Profit
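A minimal sketch of the cross operator, again assuming sets are stored as lists of tuples; the helper name `cross` is hypothetical, and the Profit range is reused from the concatenation slide purely for illustration.

```python
def cross(a, b):
    """Cross-product: pair every entry of a with every entry of b, in order."""
    return [left + right for left in a for right in b]

quarter = [("Qtr1",), ("Qtr2",), ("Qtr3",), ("Qtr4",)]
product_type = [("Coffee",), ("Tea",)]
profit = [(("Profit", -310, 620),)]  # quantitative field: a single axis entry

by_quarter_and_type = cross(quarter, product_type)  # 4 x 2 = 8 entries
type_with_axis = cross(product_type, profit)        # one Profit axis per type
```

Crossing an ordinal field with a quantitative one repeats the same axis once for each ordinal member.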
20 Nest (/) operator
- Quarter x Month
- would create twelve entries for each quarter, e.g., (Qtr1, December)
- Quarter / Month
- would only create three entries per quarter
- based on tuples in database not semantics
- can be expensive to compute
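The difference between cross and nest can be sketched as follows. The rows and helper names are assumptions for this example; the point is only that nest keeps the combinations that actually occur in the data, which is why it needs a pass over the tuples.

```python
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

# Hypothetical fact rows: one per month, tagged with the quarter it falls in.
rows = [{"Quarter": f"Qtr{i // 3 + 1}", "Month": m} for i, m in enumerate(MONTHS)]

def ordinal_set(rows, field):
    """Distinct values of a field, in first-seen order, as one-element tuples."""
    seen = []
    for row in rows:
        if row[field] not in seen:
            seen.append(row[field])
    return [(value,) for value in seen]

def cross(a, b):
    """Every combination, whether or not it occurs in the data."""
    return [left + right for left in a for right in b]

def nest(rows, outer, inner):
    """Only the (outer, inner) pairs that occur together in some row."""
    pairs = []
    for row in rows:
        pair = (row[outer], row[inner])
        if pair not in pairs:
            pairs.append(pair)
    return pairs

crossed = cross(ordinal_set(rows, "Quarter"), ordinal_set(rows, "Month"))
nested = nest(rows, "Quarter", "Month")
print(len(crossed), len(nested))  # 48 12
```

With this data, `("Qtr1", "December")` appears in the crossed result but not in the nested one.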
21 Dot (.) operator: Hierarchies
- Many data warehouses have hierarchical dimensions
- Time: Year, Month, Day
- Location: Country, State, Region
- Dot (.) works like Nest (/) except it exploits the defined hierarchies
- based on semantics not tuples in database
- Demo
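One way to picture the dot operator is as a lookup in declared hierarchy metadata rather than a scan of the data. The sketch below assumes a hypothetical Quarter-to-Month hierarchy stored as a dictionary; the names `QUARTER_MONTH` and `dot` are illustrative only.

```python
# Hypothetical hierarchy metadata, declared once for the dimension.
QUARTER_MONTH = {
    "Qtr1": ["January", "February", "March"],
    "Qtr2": ["April", "May", "June"],
    "Qtr3": ["July", "August", "September"],
    "Qtr4": ["October", "November", "December"],
}

def dot(hierarchy):
    """Parent.child entries taken from the hierarchy, with no access to rows."""
    return [(parent, child)
            for parent, children in hierarchy.items()
            for child in children]

entries = dot(QUARTER_MONTH)
```

Since no fact rows are consulted, the result is the same even for an empty database, and a month with no recorded data still gets an entry.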
22 Formalism
- Can mix graph types in single visualization
23 Polaris Formalism
- Remainder of formalism defined in papers
- specification of different graph types
- encoding of data as retinal properties of marks in graphs
- data transformations
- translation of visual specification into SQL queries
Relevant papers:
- Chris Stolte, Diane Tang, and Pat Hanrahan. Query, Analysis, and Visualization of Hierarchically Structured Data using Polaris. Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 2002.
- Chris Stolte, Diane Tang, and Pat Hanrahan. Polaris: A System for Query, Analysis and Visualization of Multi-dimensional Relational Databases (extended paper). IEEE Transactions on Visualization and Computer Graphics, Vol. 8, No. 1, January 2002.
24 Generating Queries
- Database queries automatically generated from specification.
- Multiple queries required if level-of-detail varies.
- Algebraic manipulation can be used to determine minimal set of queries.
- Current interpreters can generate SQL, MDX, or Rivet queries.
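To make the first two bullets concrete, here is a hedged sketch of what generating SQL from a specification might look like. It is not the Polaris query generator: the function `build_query`, the `sales` table, and its rows are assumptions, and the sketch covers only the simple case of grouping by ordinal fields and aggregating one measure. Identifiers are interpolated directly into the SQL text, so they must come from a trusted specification.

```python
import sqlite3

def build_query(table, group_fields, measure, aggregate="SUM"):
    """Group by the ordinal fields and aggregate the quantitative measure."""
    columns = ", ".join(group_fields)
    return (f"SELECT {columns}, {aggregate}({measure}) "
            f"FROM {table} GROUP BY {columns} ORDER BY {columns}")

# Hypothetical in-memory table used only to exercise the generated SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (Quarter TEXT, ProductType TEXT, Profit INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("Qtr1", "Coffee", 10), ("Qtr1", "Coffee", 5),
     ("Qtr1", "Tea", -3), ("Qtr2", "Coffee", 7)],
)

# Two levels of detail need two different queries.
detailed = conn.execute(
    build_query("sales", ["Quarter", "ProductType"], "Profit")).fetchall()
coarse = conn.execute(
    build_query("sales", ["Quarter"], "Profit")).fetchall()
conn.close()
```

The two calls differ only in their grouping fields, which is the sense in which a visualization with varying level of detail requires more than one query.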
25 Related Visualization Projects
- Formalisms for Graphics
- Wilkinson's Grammar of Graphics
- Bertin's Semiology of Graphics
- Mackinlay's APT
- Visual Exploration of Databases
- DeVise, Visage, DataSplash/Tioga-2
- Table-based Visualizations
- Table Lens, Spreadsheet for Visualization
26 Summary
- Exploratory visualization versus presentation
- Multiple display types: different questions require different visualizations
- Polaris: a novel interface for rapidly constructing table-based graphical displays from multi-dimensional relational databases
- Formalisms: powerful declarative tool for specifying complex graphics and tables