The Use of Provenance in Information Retrieval

Provided by: craymond3

1
The Use of Provenance in Information Retrieval
  • Simone Stumpf
  • Erin Fitzhenry
  • Tom Dietterich

2
Defining Provenance
  • To us, provenance concerns:
      • The origin of content within documents
      • The relationships between documents

3
Why focus on Provenance for Information Retrieval?
  • People remember the relationships between
    documents!
  • Episodic vs. Semantic Memory
  • Studies:
      • Blanc-Brude & Scapin (2007)
      • Gonçalves & Jorge (2004)
  • No need to formulate keyword queries
  • Other common document attributes are often
    inaccurately remembered (Blanc-Brude & Scapin,
    2007):
      • Title (20% false recall)
      • Size (53.8% false recall)
      • Time (47.6% false recall)

4
Example Use Case: Where did I save that again?
  • I got an email from Tom
  • I saved the attachment
  • And I pasted some information from the attachment
    into a PowerPoint document
  • Where did that presentation go??

5
Requirements for Tracking and Visualizing
Provenance
  • Instrument all important document provenance
    events
  • Provenance events are NOT automatically captured
    by Windows
  • Develop a UI enabling users to locate documents
    via the provenance relationships they remember
  • Integrate the UI into the Windows Desktop

6
Capturing Provenance Events with TaskTracer
  • TaskTracer is a Personal Information Management
    system
  • User defines a hierarchy of Projects or
    Activities
  • As the user works, TaskTracer automatically tags
    (according to task/project):
      • Files
      • Folders
      • Email Messages
      • Email Contacts
      • Web pages

7
Instrumenting TaskTracer to Capture Provenance Events
  • TaskTracer already instruments many desktop
    events:
      • Open, Save, SaveAs, Close
      • EmailArrived, Email Open, Email Close
      • Open URL, Close URL, Follow Hyperlink
  • Idea: Extend existing instrumentation to cover
    key provenance events:
      • CopyPaste, SaveAs, FileCopy/Rename
      • AttachmentAdd, AttachmentOpen, AttachmentSave,
        EmailForward, EmailReply
      • FileDownload, FileUpload
      • Coming soon

8
Instrumenting TaskTracer to capture Provenance
Events (cont.)
  • Database of document-to-document provenance
    relationships
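
A database like the one named on this slide could be sketched as a single table of directed edges, one row per captured event. The following is only an illustration in Python using the standard sqlite3 module, not TaskTracer's actual schema: the table name, column names, helper functions, document identifiers, and timestamps are all assumptions made up for the example. Only the event names are taken from the previous slide.

```python
import sqlite3

# Hypothetical schema (an assumption, not TaskTracer's real one):
# each row links a source document to a target document via one event.
SCHEMA = """
CREATE TABLE provenance (
    id        INTEGER PRIMARY KEY,
    event     TEXT NOT NULL,   -- e.g. 'AttachmentSave', 'CopyPaste'
    source    TEXT NOT NULL,   -- document the content came from
    target    TEXT NOT NULL,   -- document the content went to
    timestamp TEXT NOT NULL    -- ISO 8601 time of the event
)
"""


def open_store():
    """Create an in-memory database holding the provenance table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def record_event(conn, event, source, target, timestamp):
    """Store one document-to-document provenance relationship."""
    conn.execute(
        "INSERT INTO provenance (event, source, target, timestamp) "
        "VALUES (?, ?, ?, ?)",
        (event, source, target, timestamp),
    )


def related(conn, document):
    """Return (event, other_document) for every edge touching `document`."""
    rows = conn.execute(
        "SELECT event, target FROM provenance WHERE source = ? "
        "UNION ALL "
        "SELECT event, source FROM provenance WHERE target = ?",
        (document, document),
    )
    return rows.fetchall()


# Made-up example data: an email attachment is saved, then pasted from.
conn = open_store()
record_event(conn, "AttachmentSave", "email:from-tom", "file:report.doc",
             "2000-01-01T09:00:00")
record_event(conn, "CopyPaste", "file:report.doc", "file:slides.ppt",
             "2000-01-01T09:30:00")
neighbours = related(conn, "file:report.doc")
```

Storing edges in both directions of lookup (as source and as target) lets a query start from whichever document the user happens to remember.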

9
Developing a User Interface: TaskTrail
  • A tool for visualizing provenance

[Screenshot of TaskTrail with callouts: User's query; Mouse over for details; Click to expand; Double-click to open]

10
Integrating TaskTrail into the Windows UI
  • Launch a query by right-clicking on an item
    within Windows Explorer, Outlook, or TaskExplorer
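
One way to picture a query launched from a selected item is a breadth-first walk over the stored provenance links, starting at that item. The sketch below is an assumption about how such a lookup could work, not a description of TaskTrail's implementation; the function name, the `max_hops` parameter, and the document identifiers are hypothetical. The example data mirrors the use case from slide 4 (email, saved attachment, presentation).

```python
from collections import deque


def reachable_documents(edges, start, max_hops=3):
    """Breadth-first search over provenance links, treated as undirected.

    Returns a dict mapping each reachable document to its hop distance
    from `start` (the start item itself is excluded).
    """
    # Build an adjacency map so links can be followed in either direction.
    neighbours = {}
    for source, target in edges:
        neighbours.setdefault(source, set()).add(target)
        neighbours.setdefault(target, set()).add(source)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        doc = queue.popleft()
        if distance[doc] == max_hops:
            continue  # do not expand beyond the hop limit
        for other in sorted(neighbours.get(doc, ())):
            if other not in distance:
                distance[other] = distance[doc] + 1
                queue.append(other)

    del distance[start]
    return distance


# Made-up identifiers following the slide 4 scenario.
edges = [
    ("email:from-tom", "file:attachment.doc"),
    ("file:attachment.doc", "file:presentation.ppt"),
]
found = reachable_documents(edges, "email:from-tom")
```

Starting from the remembered email, the lost presentation turns up two hops away, which is the kind of path a user would follow in the visualization.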

14
Research Questions
  • Does TaskTrail help users find documents more
    quickly than other methods?
  • How should the provenance graph be laid out?
  • What kind of provenance events do users
    accurately recall?
  • How large are the provenance graphs?
  • What patterns exist (if any) in terms of the
    succession of provenance events?
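
The last question, about patterns in the succession of provenance events, could be explored by counting how often one event type directly follows another in a log. This is a minimal sketch of that idea in Python, not an analysis the authors describe; the function name and the sample log are made up, and only the event names come from slide 7.

```python
from collections import Counter


def succession_counts(events):
    """Count how often each event is immediately followed by another."""
    return Counter(zip(events, events[1:]))


# Hypothetical event log, in the order the events were captured.
log = ["AttachmentSave", "CopyPaste", "SaveAs", "AttachmentSave", "CopyPaste"]
counts = succession_counts(log)
```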

15
User Studies: Formative
  • Observational Study (planned)
      • What provenance-related actions do users perform?
        Which of those do they remember?
      • Observe 12 participants in their workplaces
      • Record provenance-related actions performed
      • Interview participants after 1 week to see what
        they remember:
          • Free Recall
          • Cued Recall
      • How do users lay out their documents according to
        what they remember?

16
User Studies: Summative
  • TaskTrail Study at Intel (in progress)
      • 4 participants (so far) are using TaskTracer for
        at least 1 month each
      • Then they will use TaskTrail to locate their own
        documents
  • Measures of success:
      • Do users locate more documents using TaskTrail?
      • Do users locate documents more quickly using
        TaskTrail?
      • Do users prefer using TaskTrail?

17
Provenance-related User Studies are Hard!
  • Must be done 'in the wild'
  • Involves:
      • Long time-scales, which increase chances that:
          • Participants will drop out
          • Situation on site will change
      • Potentially sensitive information:
          • Emails to/from users not participating in the
            study
          • Documents regarding trade secrets
      • Installation of some event-tracking software:
          • Software installation/maintenance can introduce
            compatibility, scheduling, and other problems

18
Summary: TaskTrail
  • TaskTrail
  • Instruments desktop provenance relationships
  • Allows user to query by right-clicking objects
  • User can browse visualization of provenance
    relationships to find desired documents
  • Exploits human episodic memory to help users find
    documents