Project Update - PowerPoint PPT Presentation

About This Presentation
Title: Project Update

Description: Project Update. Matt Williams. XML Document Visualization and Retrieval. Background. XML vs Web Doc ... Jesse Grosjean et al. TreeMaps ?? Ben Shneiderman ...

Slides: 12

Transcript and Presenter's Notes

Title: Project Update


1
Project Update
XML Document Visualization and Retrieval
  • Matt Williams

2
Background
  • XML vs. Web documents
  • XML adds explicit structure
  • Can we take advantage of this structure when
    searching for documents?

<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>

3
Information Retrieval
  • Standard Information Retrieval (IR)
  • tf-idf
  • tf: frequency of a term in a document
  • idf: inverse document frequency
  • Based on the number of documents containing the term
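As a rough illustration of tf-idf weighting, the sketch below scores terms in a toy collection. The documents, the whitespace tokenizer, and the log(N/df) form of idf are assumptions chosen for brevity; the slides do not say which variant was used.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = raw count of the term in the document
    idf = log(N / df), where df is the number of documents containing the term
    (one common variant; assumed here, not taken from the slides)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: count each term at most once per document.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Hypothetical toy collection.
docs = [
    "xml syntax and xml elements",
    "html syntax",
    "information retrieval for xml",
]
weights = tf_idf(docs)
```

A term that is frequent in one document but rare across the collection gets a high weight; a term that appears in every document gets a weight of zero under this variant.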

4
Information Retrieval
  • A fair bit of previous work on adding structure
    to IR queries.
  • Examples
  • XIRQL: Fuhr and Großjohann
  • //book/chapter[heading cw "InfoVis"]
  • XXL: Theobald and Weikum
  • Select Z From Index
    Where zoos.animal.cougar As Z
  • But
  • What if we are unsure of the structure?
  • What if we have variability in the structure?
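The difference between a query that assumes a known structure and one that does not can be sketched with Python's standard `xml.etree.ElementTree`. This is neither XIRQL nor XXL; the sample document, the `<heading>` element, and the helper names `contains_word` and `search` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the <heading> element is assumed for illustration.
DOC = """
<book>
  <chapter><heading>InfoVis basics</heading><para>Zooming and treemaps</para></chapter>
  <chapter><heading>XML Syntax</heading><para>InfoVis is not mentioned in the heading</para></chapter>
</book>
"""

def contains_word(element, word):
    """True if the element's own text contains the word (case-insensitive)."""
    tokens = (element.text or "").lower().split()
    return word.lower() in tokens

def search(root, path, word):
    """Return tags of elements matched by `path` whose text contains `word`."""
    return [el.tag for el in root.findall(path) if contains_word(el, word)]

root = ET.fromstring(DOC)

# Structure known: only look inside chapter headings.
strict = search(root, "./chapter/heading", "InfoVis")

# Structure unknown: fall back to any descendant element.
loose = search(root, ".//*", "InfoVis")
```

The strict query misses the second chapter, where the word appears only in a paragraph; the loose query finds it but gives up the precision that the path provided.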

5
Information Retrieval
  • My goal is to provide an interface for exploring an XML
    collection when only limited information about it is available
  • Meta-schema information: element index
  • Visual clustering: multidimensional scaling
  • Visual queries: element selection
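One way to picture an element index with element selection is a map from each element path to the documents that contain it. The sketch below is a minimal version under that assumption; the collection, the path format, and the function names are hypothetical and do not describe how eXist stores its index.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical collection with variable structure.
COLLECTION = {
    "doc1": "<book><title>A</title><chapter><para>x</para></chapter></book>",
    "doc2": "<book><title>B</title><section><para>y</para></section></book>",
    "doc3": "<article><title>C</title><para>z</para></article>",
}

def element_paths(element, prefix=""):
    """Yield the root-to-node path of every element in the tree."""
    path = f"{prefix}/{element.tag}"
    yield path
    for child in element:
        yield from element_paths(child, path)

def build_element_index(collection):
    """Map each distinct element path to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, xml_text in collection.items():
        for path in element_paths(ET.fromstring(xml_text)):
            index[path].add(doc_id)
    return index

def select(index, *paths):
    """Return the documents that contain every selected element path."""
    sets = [index.get(p, set()) for p in paths]
    return set.intersection(*sets) if sets else set()

index = build_element_index(COLLECTION)
hits = select(index, "/book/title", "/book/chapter/para")
```

Listing the keys of `index` shows a user which elements exist before any query is written, which is the point of exposing the index when the structure is not known in advance.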

6
Related Work
  • Visual Information Seeking
  • HomeFinder / Periodic Table: Ahlberg and
    Shneiderman

7
Related Work
  • Galaxies
  • Wise et al.
  • Visual Web Retrieval
  • Lighthouse - Leuski

8
Related Work
  • ZUIs: Pad, Jazz, and Piccolo
  • Ben Bederson
  • SpaceTree
  • Jesse Grosjean et al.
  • TreeMaps ??
  • Ben Shneiderman

9
Multidimensional Scaling
  • Document Similarity
  • Dimensionality reduction: from a full-dimensional
    distance measure to a 2-dimensional distance measure
  • Problems: speed?
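The idea of reducing full-dimensional document distances to a 2-D layout can be sketched as follows. This is a plain gradient-descent minimization of raw stress, one of several MDS formulations and not necessarily the one used in the project; the document vectors, learning rate, and iteration count are assumptions.

```python
import math
import random

def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length vectors."""
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
    return [[dist(p, q) for q in points] for p in points]

def mds_2d(target, iterations=500, learning_rate=0.05, seed=0):
    """Place n points in 2-D so their distances approximate `target`.

    Gradient descent on raw stress: sum over pairs of (d_ij - target_ij)^2.
    """
    rng = random.Random(seed)
    n = len(target)
    coords = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    for _ in range(iterations):
        grads = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                dx = coords[i][0] - coords[j][0]
                dy = coords[i][1] - coords[j][1]
                d = math.hypot(dx, dy) or 1e-9  # avoid division by zero
                # Gradient of (d - target)^2 with respect to point i.
                scale = 2 * (d - target[i][j]) / d
                grads[i][0] += scale * dx
                grads[i][1] += scale * dy
        for i in range(n):
            coords[i][0] -= learning_rate * grads[i][0]
            coords[i][1] -= learning_rate * grads[i][1]
    return coords

def stress(coords, target):
    """Sum of squared differences between layout and target distances."""
    layout = pairwise_distances(coords)
    n = len(target)
    return sum((layout[i][j] - target[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

# Hypothetical 4-dimensional document vectors forming two loose clusters.
vectors = [[1, 0, 0, 0], [0.9, 0.1, 0, 0], [0, 0, 1, 0], [0, 0, 0.9, 0.1]]
target = pairwise_distances(vectors)
layout = mds_2d(target)
```

Each iteration visits every pair of points, so the cost grows quadratically with the number of documents, which is one concrete form of the speed concern raised on the slide.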

10
Test Environment
  • eXist: open-source native XML database
  • Wolfgang M. Meier
  • http://exist-db.org/
  • I am working on a front end to the
    database that provides:
  • A Selectable Element Index
  • Interactive Results That Dynamically Cluster and
    Zoom

11
Thus Far
  • Lots of Learning!!
  • XML Databases
  • Multidimensional Scaling
  • XML Queries
  • XML Information Retrieval
  • Zoomable Interfaces
  • Treemaps
  • Added basic GUI to eXist
  • Added a Service to offer the element Index as
    part of the API