Evaluation of Mixed Initiative Systems


Slides: 17
Provided by: wadr9
Learn more at: http://lalab.gmu.edu

Transcript and Presenter's Notes


1
Evaluation of Mixed Initiative Systems
  • Michael J. Pazzani
  • University of California, Irvine
  • National Science Foundation

2
Overview
  • Evaluation
  • Micro-level Modules
  • Macro-level Behavior of System Users
  • Caution: Don't lose sight of the goal in
    evaluation
  • National Science Foundation
  • CISE (Re)organization
  • Funding for Mixed Initiative Systems
  • Tip on writing better proposals: Evaluate

3
Evaluation
  • Micro level
  • Does the module (machine learning, user modeling,
    information retrieval and visualization, etc.)
    work properly?
  • Has been responsible for measurable progress in
    most specialized domains of intelligent systems
  • Relatively easy to do using well-known metrics:
    error rate, precision, recall, time and space
    complexity, goodness of fit, ROC curves
  • Builds upon long history in 'hard' sciences and
    engineering
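
As an illustration of the micro-level metrics listed above, here is a minimal sketch that computes error rate, precision, and recall for a binary "relevant / not relevant" module. The function names and the label lists are hypothetical and made up for this example; they are not from the presentation.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = relevant)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def micro_metrics(y_true, y_pred):
    """Return error rate, precision, and recall; 0.0 when a denominator is empty."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "error_rate": (fp + fn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels, purely for illustration (not data from the slides).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
metrics = micro_metrics(y_true, y_pred)
print(metrics)
```

Metrics like these describe the module in isolation; they say nothing yet about whether users of the whole system behave differently.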

4
Evaluation
  • Macro level
  • Does the complex system, involving a user and a
    machine, work as desired?
  • Builds upon history in human (and animal)
    experimentation, not always taught in (or
    respected by) engineering schools
  • Allows controlled experiments comparing two
    systems (or one system with two variations)
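
One common way to analyze a controlled experiment comparing two systems is a two-proportion z-test on a binary outcome such as "the user read a story". The sketch below assumes that analysis; the presentation does not say which statistical test was used, and the counts are invented for illustration.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    Returns (z, p_value), where z > 0 means system B has the higher rate.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 90 of 300 users succeed with system A, 120 of 300 with B.
z, p_value = two_proportion_z_test(90, 300, 120, 300)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

Randomly assigning users to the two systems (or variations) is what lets a difference like this be attributed to the system rather than to the users.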

5
Adaptive Personalization
6
Micro: Evaluating the Hybrid User Model
7
Micro: Speed to Effectiveness
Initially, AIS is as effective as a static system
in finding relevant content. After only one
usage, the benefits of AdaptiveInfo's Intelligent
Wireless Specific Personalization are clear;
after three sessions, even more so; and after 10
sessions, the full benefits of Adaptive
Personalization are realized.
8
Macro: Probability a Story is Read
40% probability a user will read one of the top 4
stories selected by an editor, but a 64% chance
they'll read one of the top 4 personalized
stories: the AIS user is 60% more likely to
select a story than a non-AIS user.
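
Reading the slide's figures as percentages (40% for editor-selected stories, 64% for personalized ones), the "60 more likely" claim is a relative increase, not a difference in percentage points. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def relative_lift(baseline_rate, treatment_rate):
    """Relative increase of treatment_rate over baseline_rate (0.6 means 60% more)."""
    return (treatment_rate - baseline_rate) / baseline_rate


# Rates taken from the slide: 40% (editor-selected) vs. 64% (personalized).
lift = relative_lift(0.40, 0.64)
print(f"relative lift: {lift:.0%}")
```

The absolute difference is 24 percentage points; dividing it by the 40% baseline gives the 60% relative figure quoted on the slide.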
9
Macro: Increased Page Views
After looking at 3 or more screens of headlines,
users read 43% more of the personally selected
news stories, clearly showing AIS's ability to
dramatically increase stickiness of a wireless
web application.
10
Macro: Readership and Stickiness
20% more LA Times users who receive personalized
news return to the wireless site 6 weeks after
the first usage.
11
Cautions
  • Optimizing a micro-level evaluation may have
    little impact on the macro level. It may even
    have a counter-intuitive effect.
  • If personalization causes a noticeable delay, it
    may decrease readership
  • Don't lose sight of the goal.
  • The metrics are just approximations of the goal.
  • Optimizing the metric may not optimize the goal.

12
R&D within the NSF Organization
13
CISE Directorate 2004
  • Computing & Communications Foundations
  • Computer & Networks Systems
  • Information and Intelligent Systems (IIS)
  • Deployed Infrastructure

14
Information and Intelligent Systems Programs
  • Information and Data Management
  • Artificial Intelligence and Cognitive Science
  • Human Language and Communication
  • Robotics and Computer Vision
  • Digital Society and Technologies
  • Human-Computer Interaction
  • Universal Access
  • Digital Libraries
  • Science and Engineering Informatics

15
Types of proposals/awards
  • IIS regular proposals ($250-600K, 3 years):
    deadline 12/12
  • CAREER program ($400-500K, 5 years): late July
  • REU & RET supplements ($10-30K, 1 year): 3/1
  • Information Technology Research (ITR): probably
    Feb

16
NSF Merit Review Criteria
  • Looking for important, innovative, achievable
    projects
  • Criterion 1: What is the intellectual merit and
    quality of the proposed activity?
  • Criterion 2: What are the broader impacts of the
    proposed activity?
  • NSF will return a proposal without review if the
    single-page proposal summary does not address
    each criterion in separate statements
  • An evaluation plan at both the micro and macro
    levels is essential, using metrics that you
    propose (and your peers believe are appropriate)