Final Presentation by Visualization team - PowerPoint PPT Presentation

Provided by: Pau1141

1
Final Presentation by Visualization team
  • Team members
  • Haibo Liu
  • Robertas Baronas
  • Jonathan Krentel
  • Zaixia Zhang

2
Introduction, Requirements and Website
  • By Haibo Liu

3
Final Presentation
  • Background
  • Requirements
  • Input Files for First Iteration
  • Web site

4
Background
  • Radviz
  • Visualization: n dimensions → 2 dimensions
  • Our project: data table → graph
  • Terms
  • target/anchor
  • division
  • attribute/numeric attribute/non-numeric
    attribute
  • data point/graph point
  • normalization
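
As a hedged illustration of the mapping from n dimensions to 2 dimensions, the sketch below follows the commonly described RadViz scheme: anchors are spaced evenly on a unit circle, and each data point is drawn at the average of the anchor positions, weighted by its normalized attribute values. The function names `anchor_positions` and `radviz_point`, and the choice to place a row whose weights are all zero at the center, are assumptions made for this example and are not taken from the team's code.

```python
import math


def anchor_positions(n_anchors):
    """Place n_anchors evenly on the unit circle, starting at angle 0."""
    return [
        (math.cos(2 * math.pi * i / n_anchors), math.sin(2 * math.pi * i / n_anchors))
        for i in range(n_anchors)
    ]


def radviz_point(normalized_values):
    """Map one row of normalized values (each in [0, 1]) to a 2-D graph point."""
    anchors = anchor_positions(len(normalized_values))
    total = sum(normalized_values)
    if total == 0:
        # Assumption: a row with no weight on any anchor sits at the center.
        return (0.0, 0.0)
    x = sum(v * ax for v, (ax, _) in zip(normalized_values, anchors)) / total
    y = sum(v * ay for v, (_, ay) in zip(normalized_values, anchors)) / total
    return (x, y)
```

A row whose only non-zero value belongs to the first anchor lands on that anchor; a row with equal values on all anchors lands near the center.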

5
Requirements(1)
  • Input file formats: comma-separated
    txt/Excel/Oracle/Access/XML
  • Target attribute
  • Anchors: only numeric attributes/25
  • Data: 100/missing value/dirty value/dirty
    attribute
  • Web download version
  • Early prototype: finished!

6
Requirements(2)
  • User: opens files
  • User: selects target/anchors
  • Our system: offers some useful information and
    suggestions
  • User: plays with the graph
  • User: saves a session/graph or prints

7
Input Files for 1st Iteration
  • Names File
  • Data File

8
Names File
  • Description of data file
  • Format
  • Non-numeric: name, nominal
  • Example: country, nominal
  • Numeric
  • 1. name, numeric
  • Example: number of cylinders, numeric
  • 2. name, numeric, unit
  • Example: price, numeric, dollar
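
The names-file format above could be read with something like the following minimal sketch. It assumes one attribute description per line; the function name `parse_names_line` and the error handling are hypothetical and are not the project's FileProcessor.

```python
def parse_names_line(line):
    """Parse one names-file line into a (name, kind, unit) tuple.

    kind is "nominal" or "numeric"; unit is None unless a numeric
    attribute carries a third field.
    """
    fields = [field.strip() for field in line.split(",")]
    if len(fields) < 2 or fields[1] not in ("nominal", "numeric"):
        raise ValueError(f"unrecognized attribute description: {line!r}")
    name, kind = fields[0], fields[1]
    unit = fields[2] if kind == "numeric" and len(fields) > 2 else None
    return name, kind, unit
```

For example, `parse_names_line("price, numeric, dollar")` yields the name `price`, the kind `numeric`, and the unit `dollar`.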

9
Why?
  • Sometimes it is hard for our system to tell whether a
    value is numeric or non-numeric.
  • Example: telephone number or bus number.
  • So the user has to tell us before using our system.

10
Web Site
  • www.cs.umb.edu/visualization

11
Testing later
  • Later, I will show the tests of our system,
    mainly of FileProcessor.

12
System Design, Architecture, Implementation
  • By Jonathan Krentel

13
Design Concerns
  • Adaptability to unforeseen new view needs
  • Easy maintenance of view consistency
  • Restrained but flexible exposure of internally
    held data
  • Hot-spot performance when plotting data points
  • Data structure divorce
  • Avoidance of data redundancy
  • Flexibility to potential new data sources

18
Design Decisions
  • Internally held data exposed via Focus
  • User selections held, managed in Model
  • Public getters, package setters and constructors
  • Publicly available Division objects
  • Objectification of system components
  • Row based internal data model
  • Heavy referencing, light construction
  • Abstraction of data processing

19
Scenarios
  • Please see handout

20
Open Concerns
  • Data structure marriage
  • Attribute data management
  • DivisionSet exception handling
  • GroupSet home
  • Performance hot spots
  • Dirty data handling
  • Object-verb command structure
  • Graph Architecture Outline

21
Schedule, Features, Databases, Testing
  • By Zaixia Zhang

22
Schedule
  • Old
  • 10/29/03 - 11/24/03: 26 days
  • 11/25/03 - 12/30/03: 35 days (includes holiday)
  • 01/01/04 - 02/15/04: 45 days
  • 02/16/04 - 04/30/04: 75 days
  • Revised
  • 10/29/03 - 12/16/03: 48 days (includes holiday)
  • 12/17/03 - 01/25/04: 40 days (includes holiday)
  • 01/26/04 - 03/15/04: 50 days
  • 03/16/04 - 04/30/04: 45 days

23
Schedule
  • Old
  • Revised

24
Features (1)
  • Deal with missing values and dirty values
  • Handle equal values between minimum and maximum
    for a given attribute
  • Report statistical information about the user's data
    and suggest anchors for the user to select
  • Display outer big circle
  • Compute and display anchors and data points
  • Set different colors for data points, for targets
    with non-numeric and with numeric data types
  • User can select target and anchors, and add anchors
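
The first two features can be sketched as a min-max normalization step. Everything specific here is an assumption for illustration: a "dirty" or missing value is taken to be anything that does not parse as a finite number, such entries are kept as `None`, and an attribute whose minimum equals its maximum is mapped to 0.5. The slides do not say how the system itself resolves these cases.

```python
import math


def normalize_column(raw_values):
    """Min-max normalize one attribute; None marks missing or dirty entries."""
    parsed = []
    for raw in raw_values:
        try:
            value = float(raw)
        except (TypeError, ValueError):
            value = None
        if value is not None and not math.isfinite(value):
            value = None  # treat NaN and infinities as dirty too
        parsed.append(value)

    clean = [v for v in parsed if v is not None]
    if not clean:
        return parsed  # nothing usable in this attribute
    lo, hi = min(clean), max(clean)
    if lo == hi:
        # Assumption: a constant attribute carries no information, so use the midpoint.
        return [None if v is None else 0.5 for v in parsed]
    return [None if v is None else (v - lo) / (hi - lo) for v in parsed]
```

Guarding the `lo == hi` case avoids a division by zero when every clean value of an attribute is the same.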

25
Features (2)
  • Table view display
  • User can input a value range for each division after
    selecting the target; the target range is displayed in color
  • User can move anchor points with the mouse
  • Anchor locations are recomputed and anchors are
    reordered
  • The locations of data points are recomputed and
    data points are re-plotted
  • Add and remove anchors by mouse dragging

26
Features (3)
  • Display a menu system (File, Tool, Help) with
    submenus
  • Save, create, or open a session
  • Display the original data table when the user
    requests it
  • Compute and display the t-test value for every
    two groups and the correlation value for every two
    numeric attributes
  • User can set the base value for correlation
  • Display data point and anchor information
    when the user right-clicks a data
    point or anchor
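
One possible reading of the correlation feature is sketched below: compute the Pearson correlation for every pair of numeric attributes and keep only the pairs whose absolute value reaches a user-chosen base value. Treating the "base value" as a reporting threshold is an assumption, as are the names `pearson` and `correlated_pairs`; the t-test is not covered here. The columns are assumed to be non-empty and of equal length.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists, or None if undefined."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant attribute has no defined correlation
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(columns, base=0.0):
    """Return {(name_a, name_b): r} for attribute pairs with |r| >= base."""
    result = {}
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if r is not None and abs(r) >= base:
            result[(a, b)] = r
    return result
```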

27
Features (4)
  • User can select an area of the graph and zoom in
  • Draw a histogram to show the range of the target
    attribute
  • Sort attributes in the table with original values
  • Blinking/jingle data points
  • Access different data sources
  • Design a nice web page, put our system on it, and
    make it downloadable

28
UCI Repository of Databases
  • Contains wide set of different databases
  • Tree structured, table structured
  • What are we interested in?
  • Table structured data sets
  • Mixed with numeric/non-numeric attributes
  • A large enough number of attributes
  • A sufficient number of instances

29
Data Source Collection
  • Data set collection
  • Full or part of real data set from UCI repository
  • Data type transfer
  • Continuous, integer → numeric
  • Boolean, nominal → nonnumeric
  • Unit → add one if the attribute has one, otherwise skip
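
The data type transfer above can be sketched as a small lookup that turns a UCI attribute type into a names-file line. The table `UCI_TO_NAMES_TYPE` and the function `names_file_line` are hypothetical names for this example, and writing non-numeric attributes with the keyword `nominal` is an assumption based on the names-file format shown earlier in the presentation.

```python
# Assumed mapping from UCI attribute types to the two names-file kinds.
UCI_TO_NAMES_TYPE = {
    "continuous": "numeric",
    "integer": "numeric",
    "boolean": "nominal",
    "nominal": "nominal",
}


def names_file_line(name, uci_type, unit=None):
    """Build one names-file line, adding the unit only for numeric attributes."""
    kind = UCI_TO_NAMES_TYPE[uci_type.strip().lower()]
    if kind == "numeric" and unit:
        return f"{name}, numeric, {unit}"
    return f"{name}, {kind}"
```

An unknown UCI type raises `KeyError`, which keeps unmapped attributes from silently entering a data set.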

30
Testing
  • Test strategy
  • Test by use cases
  • Unit test by developer
  • Integration test by two developers
  • System test by specified tester
  • Test cases (7)
  • Test equal values, missing/dirty data, accuracy,
    special cases, capacity, color distribution

31
Testing
  • Bug reports and fixes
  • Report by bug report form (one bug per form)
  • Fixed by developer or tester
  • Test assumption
  • Input data is comma-separated and follows our
    format specification
  • The user first selects the target, then selects anchors
  • Cannot handle Boolean-type attributes

32
Test Case 4: Accuracy Test
  • Plotting algorithm (RADVIZ approach)
  • Data segment

33
Test Case 4: Accuracy Test
  • By Excel
  • By PlottingAlgorithm

34
Test Case 5: Special Case
  • No anchor is selected
  • One anchor is selected
  • 2 anchors are selected
  • 25 anchors are selected

35
Test Case 6: System Capacity
  • Data source
  • case6/housing-names.txt (14 attributes, one of
    them is nominal)
  • case6/housing-data.txt (number of instances: 506)
  • Database: housing
  • No missing values.

36
Test Case 7: Color Distribution
  • Data source
  • case6/bignames.txt (9 attributes, 4 of them are
    nominal, car name is unique)
  • case6/bigdata.txt (number of instances: 392)
  • Database: auto-mpg
  • No missing values.

37
Thanks!
  • More questions?