Appendix - PowerPoint PPT Presentation


Transcript and Presenter's Notes

Title: Appendix


1
Appendix
2
Main Variables in Adaptive Information Retrieval
  • Query
  • User
  • Time of request
  • Information sources

3
1st-Order Markov Model
  • Current action depends on the history H(U) only
    through the last observed action A
  • $\theta_{0,k}$ is the probability of observing the
    first action of the history H(U) under mixture
    component k,
  • $\theta_{(h \to h+1),k}$ is the probability of observing a
    transition from action h to h+1 in the history
    for that component. Note that the Markov model
    depends only on the bigrams.
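
The slide's own equation did not survive in this transcript. As a sketch only, the standard first-order Markov factorization consistent with the parameters described above can be written as follows. The indexing of the history as actions $A_0, \dots, A_n$ is an assumption made here for illustration.

```latex
% Assumed notation: the history H(U) is the action sequence A_0, A_1, ..., A_n.
\[
  P(A_0, A_1, \dots, A_n \mid k)
    = \theta_{0,k} \prod_{h=0}^{n-1} \theta_{(h \to h+1),k}
\]
% The next action depends on the history only through the last observed action:
\[
  P(A_{n+1} \mid H(U), k) = P(A_{n+1} \mid A_n, k)
\]
```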

4
Maximum Entropy Model Definition
  • The maximum entropy objective function leads to a
    model with the following components:
  • $F_s(A, H(U))$ are the feature indicator functions. S
    is the total number of bigrams and triggers for
    action A.
  • $\lambda_{s,k}$ are the maxent model parameters for
    component k.
  • If s corresponds to bigram (a, b), then
    $F_s(A, H(U)) = 1$ iff $A = a$ and $A_{prev} = b$, and
    $F_s(A, H(U)) = 0$ otherwise.
  • $Z_{\lambda,k}(H(U))$ is a normalization constant.
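
The model equation itself is missing from the transcript. The usual exponential form of a conditional maximum entropy model, written with the quantities named on this slide, would look like the sketch below. The explicit sum defining the normalization constant is an assumption based on the standard definition, not something the slide states.

```latex
% Standard conditional maxent form, using the slide's F_s, S and Z.
\[
  P(A \mid H(U), k)
    = \frac{\exp\!\left( \sum_{s=1}^{S} \lambda_{s,k} \, F_s(A, H(U)) \right)}
           {Z_{\lambda,k}(H(U))}
\]
% Assumed normalization: sum over all candidate actions A'.
\[
  Z_{\lambda,k}(H(U))
    = \sum_{A'} \exp\!\left( \sum_{s=1}^{S} \lambda_{s,k} \, F_s(A', H(U)) \right)
\]
```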

5
Experimental Setup
  • 3 months of CiteSeer data were gathered and
    sessionized based on an inactivity time
    criterion (10 minutes)
  • Robots were identified and eliminated based on
    the number of CiteSeer accesses
  • Queries were extracted from the sessionized data
  • Query types were identified by the following
    procedure:
      • The 1st query in a session is always a new query
      • The current query Qcurrent is a new query if it
        doesn't have any terms in common with the
        previous query Qprev (excluding stop words and
        logical operators)
      • Qcurrent is an old query if it's exactly the
        same as Qprev
      • Qcurrent is a modified query if it has at least
        one term in common with Qprev but is not
        identical to it
      • Qcurrent is a document query if documents were
        being searched, and a citation query if
        citations were being searched.
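
The sessionization and query-typing rules above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names, the stop-word list, whitespace tokenization, and the choice to start a new session only when the gap strictly exceeds 10 minutes are all assumptions.

```python
# Illustrative stop list (an assumption); it also covers the logical
# operators AND / OR / NOT once queries are lower-cased.
STOP_WORDS = {"a", "an", "the", "of", "in", "for", "on", "to", "and", "or", "not"}
SESSION_GAP_SECONDS = 10 * 60  # inactivity criterion from the slide


def sessionize(events, gap=SESSION_GAP_SECONDS):
    """Split one user's time-ordered (timestamp_seconds, item) pairs into sessions.

    A new session starts when the time since the previous event exceeds `gap`.
    """
    sessions = []
    last_ts = None
    for ts, item in events:
        if last_ts is None or ts - last_ts > gap:
            sessions.append([])
        sessions[-1].append(item)
        last_ts = ts
    return sessions


def content_terms(query):
    """Lower-cased query terms with stop words and logical operators removed."""
    return {t for t in query.lower().split() if t not in STOP_WORDS}


def query_type(current, previous=None):
    """Label a query as 'new', 'old' or 'modified' relative to the previous one.

    `previous` is None for the first query of a session.
    """
    if previous is None:
        return "new"
    if current.strip() == previous.strip():
        return "old"
    if content_terms(current) & content_terms(previous):
        return "modified"
    return "new"


if __name__ == "__main__":
    print(sessionize([(0, "q1"), (300, "q2"), (1000, "q3")]))
    print(query_type("neural networks training", "neural networks"))
```

The document/citation distinction is left out because it depends on which search was invoked, which is a property of the log record rather than of the query text.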

6
CiteSeer Users' Document Request Patterns
  • Document request logs are valuable because they
      • bear contextual information
      • facilitate finding similar users and similar
        documents
      • indicate users' accustomed search actions.
  • How frequently does the user request new
    documents?
  • Does (s)he ever request documents (s)he has seen
    in previous sessions?
  • When the user views a new document, is (s)he done
    with it in 1 request, or does (s)he request the
    new document frequently in that session?

7
Document Request Patterns - Motivation
  • This study focuses on the 3rd question, which we
    believe could play an important role in
    recommendation strategies and user interface
    design
  • If the user hardly ever requests documents viewed
    in old sessions, should we bother recommending
    them at all?
  • If the user mostly searches citations, should we
    make citation search that user's default
    search?
  • If the user frequently revisits newly found
    documents within a session, should we have links
    to those documents on every page during that
    session?
  • If the user requests old documents more
    frequently, what is the reason? Are we showing the
    same results? Should new documents have a
    higher priority in recommendations to this user?

8
Document Request Patterns
  • We identify the type of each session rather than
    only the type of each document request.
    It's straightforward to cluster the users once we
    identify the sessions.
  • Document Request Types
      • New Document: the user's first request of this
        document
      • This Session Document: the document was
        requested earlier in this session
      • Old Document: the user had requested this
        document in previous sessions
  • Session Types
      • New Documents Session: all documents requested
        are new documents
      • Old Documents Session: all documents requested
        in this session were visited in previous
        sessions
      • New And This Session Documents Session: the
        documents requested were never viewed in
        previous sessions, but at least one of the new
        documents was requested more than once in this
        session
      • New And Old Documents Session: documents were
        either old or new
      • All Document Types Session: this session has at
        least two requests to a new document and at
        least one to an old document.
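
The three document request types can be illustrated with a short Python sketch. The function name and label strings are hypothetical, and the precedence used here is an assumption: a request is checked against the current session first, so a document seen in an earlier session and requested twice in this one is labeled "Old" and then "ThisSession".

```python
def label_requests(sessions):
    """Label every document request of one user.

    `sessions` is a list of sessions in chronological order, each a list of
    document IDs in request order. Returns a parallel list of label lists.
    """
    seen_before = set()  # documents requested in previous sessions
    labeled = []
    for docs in sessions:
        seen_now = set()  # documents requested earlier in this session
        labels = []
        for doc in docs:
            if doc in seen_now:
                labels.append("ThisSession")
            elif doc in seen_before:
                labels.append("Old")
            else:
                labels.append("New")
            seen_now.add(doc)
        labeled.append(labels)
        seen_before |= seen_now
    return labeled


if __name__ == "__main__":
    # Two toy sessions (made-up IDs, not CiteSeer data).
    print(label_requests([["a", "b", "a"], ["b", "c"]]))
    # [['New', 'New', 'ThisSession'], ['Old', 'New']]
```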

9
Session Types - Example
  • Input
  • <sessionID userID documentID1 documentID2 ...>
  • 1 uid1 475572 475572
  • 2 uid1 102844 102844
  • 3 uid1 23039 23039 263905 263905 430232 34069
    34069 123882 87389 87389 470739 89795 89795
    317832 18670 506897 94170 506428 506428 501759
    286736 286736 99777 99777 40383 391244
  • 4 uid1 209066 209066
  • 5 uid1 84571
  • 6 uid1 496615 496615
  • 7 uid1 525188 501003
  • 8 uid1 501621 501003
  • 9 uid1 186821 186821
  • 10 uid1 88085 88085
  • 11 uid1 15714 15714
  • 12 uid1 25344 25344
  • Output
  • <uid session1Type session2Type ...>
  • uid1 New New New New NewANDOld New New New New
  • (The first 3 sessions were excluded; therefore
    sessions 4 through 12 are in the final output.)
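
A session-type classifier can be sketched in Python and run on the input above. This is an illustration under stated assumptions, not the authors' procedure: the function names and the labels other than "New" and "NewANDOld" are hypothetical, and the excluded first 3 sessions are still used as history. Applied literally, the written definitions would label session 4 (209066 209066) as a New And This Session Documents Session, whereas the slide's output shows New. Collapsing immediately repeated document IDs is one assumption under which the sketch reproduces the slide's output, so it is exposed as a switch.

```python
from itertools import groupby

# Input lines from the slide: sessionID userID documentID1 documentID2 ...
LOG = """\
1 uid1 475572 475572
2 uid1 102844 102844
3 uid1 23039 23039 263905 263905 430232 34069 34069 123882 87389 87389 470739 89795 89795 317832 18670 506897 94170 506428 506428 501759 286736 286736 99777 99777 40383 391244
4 uid1 209066 209066
5 uid1 84571
6 uid1 496615 496615
7 uid1 525188 501003
8 uid1 501621 501003
9 uid1 186821 186821
10 uid1 88085 88085
11 uid1 15714 15714
12 uid1 25344 25344
"""


def parse_log(text):
    """Return each session's document IDs in log order (single user assumed)."""
    return [line.split()[2:] for line in text.splitlines() if line.strip()]


def collapse(doc_ids):
    """Drop immediately repeated IDs: [a, a, b, a] -> [a, b, a]."""
    return [doc for doc, _ in groupby(doc_ids)]


def session_type(doc_ids, seen_before):
    """Map one non-empty session to a session type label."""
    if not doc_ids:
        raise ValueError("a session must contain at least one request")
    counts = {}
    for doc in doc_ids:
        counts[doc] = counts.get(doc, 0) + 1
    new_docs = [d for d in counts if d not in seen_before]
    has_new = bool(new_docs)
    has_old = len(new_docs) < len(counts)
    new_repeated = any(counts[d] > 1 for d in new_docs)
    if has_new and has_old:
        return "All" if new_repeated else "NewANDOld"
    if has_new:
        return "NewANDThisSession" if new_repeated else "New"
    return "Old"


def classify_user(sessions, skip=3, collapse_repeats=True):
    """Label every session after the first `skip`, which only build history."""
    seen, labels = set(), []
    for index, docs in enumerate(sessions):
        if collapse_repeats:
            docs = collapse(docs)
        if index >= skip:
            labels.append(session_type(docs, seen))
        seen.update(docs)
    return labels


if __name__ == "__main__":
    print("uid1", " ".join(classify_user(parse_log(LOG))))
    # uid1 New New New New NewANDOld New New New New
```

With `collapse_repeats=False`, the first reported session comes out as NewANDThisSession instead, which is why the treatment of repeated requests matters when interpreting the slide's output.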

10
Document Request Behavior - Results
  • [Figure: Only New Doc Requesters]
11
Conclusions and Future Work
  • Users differ widely in how they use CiteSeer
      • Some view only new documents, some hardly ever
        do so, whereas some like to spend more time on
        newly found documents
      • Some search only documents, some mostly search
        citations, and some relentlessly modify their
        queries to find the right documents.
  • This study has shown that by looking at the query
    formulations and the types of documents
    requested we can identify bottlenecks for
    different user groups
  • Our goal is to use the proposed methods to
    provide personalized recommendations to CiteSeer
    users, and to automatically adapt the
    design of the user interface to better suit their
    needs