CS376 Evaluation

1
Evaluation Methods
Jeffrey Heer, 28 April 2009
2
Project Abstracts
  • For final version (due online Fri 5/1 @ 7am)
  • Flesh out concrete details. What will you build?
    If running an experiment, what factors will you
    vary and what will you measure? What are your
    hypotheses and why? Provide rationale!
  • Need to add study recruitment plan and related
    work sections (see http://cs376/project.html).
  • Iterate more than once! Stop by office hours to
    discuss.

3
What is Evaluation?
  • Something you do at the end of a project to show
    it works... so you can publish it.
  • Part of the design-build-evaluate iterative
    design cycle
  • A way a discipline validates the knowledge it
    creates... and a reason papers get rejected.

4
Establishing Research Validity
  • "Methods for establishing validity vary
    depending on the nature of the contribution. They
    may involve empirical work in the laboratory or
    the field, the description of rationales for
    design decisions and approaches, applications of
    analytical techniques, or proof of concept
    system implementations."
    - CHI 2007 Website

5
Evaluation Methods
  • http://www.usabilitynet.org/tools/methods.htm

6
(No Transcript)
7
What to evaluate?
  • Enable previously difficult/impossible tasks
  • Improve task performance or outcome
  • Modify/influence behavior
  • Improve ease-of-use, user satisfaction
  • User experience
  • Sell more widgets
  • What is the motivating research goal?

8
(No Transcript)
9
(No Transcript)
10
(No Transcript)
11
UbiFit (Consolvo et al.)
12
Momento (Carter et al.)
13
Evaluation Methods in HCI
  • Inspection (Walkthrough) Methods
  • Observation, User Studies
  • Experience Sampling
  • Interviews and Surveys
  • Usage Logging
  • Controlled Experimentation
  • Fieldwork, Ethnography
  • Mixed-Methods Approaches

14
Proof by Demonstration
  • Prove feasibility by building prototype system
  • Demonstrate that the system enables task
  • Small user study may add little insight

15
Inspection Methods
  • Often called discount usability techniques
  • Expert review of user interface design
  • Heuristic Evaluation (Nielsen, useit.com/papers/heuristic)
  • Visibility of system status
  • Match between system and real world
  • User control and freedom
  • Consistency and standards
  • Error prevention
  • Recognition over recall
  • Flexibility and efficiency of use
  • Aesthetic and minimalist design
  • Help users recognize, diagnose, and recover from
    errors
  • Help and documentation

16
How many evaluators?
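
A minimal sketch, assuming this slide refers to the Nielsen-Landauer
problem-discovery model commonly used to answer this question: the fraction of
usability problems found by i evaluators is 1 - (1 - L)^i, where L is the
per-evaluator detection rate, roughly 0.31 in Nielsen's data.

    # Nielsen-Landauer problem-discovery curve (assumed context for this slide)
    lam = 0.31  # per-evaluator problem detection rate reported by Nielsen
    for i in range(1, 11):
        found = 1 - (1 - lam) ** i
        print(f"{i} evaluators -> {found:.0%} of problems found")
    # With lam = 0.31, five evaluators already uncover roughly 84% of problems,
    # the basis of the common "about five evaluators" rule of thumb.
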
17
Usability Testing
  • Observe people interacting with prototype
  • May include
  • Providing tasks (e.g., easy, medium, hard)
  • Talk-aloud protocol (users' verbal reports)
  • Usage logging (a minimal logging sketch follows below)
  • Pre/post study surveys
  • NASA-TLX workload assessment survey
  • QUIS (Questionnaire for User Interaction Satisfaction)
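
A minimal sketch of the usage-logging bullet above, assuming a hypothetical
prototype instrumented to record each user action with a timestamp; the class
name, event labels, and CSV format are illustrative, not from the slides.

    # Illustrative usage logger for a usability session: timestamped events
    # let you compute task completion times and error counts after the study.
    import csv, time

    class SessionLog:
        def __init__(self, path, participant_id):
            self.file = open(path, "a", newline="")
            self.writer = csv.writer(self.file)
            self.pid = participant_id

        def event(self, task, action):
            self.writer.writerow([time.time(), self.pid, task, action])
            self.file.flush()

    log = SessionLog("session_p01.csv", participant_id="P01")
    log.event(task="medium", action="task_start")
    log.event(task="medium", action="error:wrong_dialog")
    log.event(task="medium", action="task_complete")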

18
Wizard-of-Oz Techniques
19
Controlled Experiments
  • What are the important concerns?

20
Controlled Experiments
  • Measure response of dependent variables to
    manipulation of independent variables.
  • Within or between-subjects design
  • Change independent variables within or across subjects
  • Randomization, replication, blocking
  • Learning effects
  • Choice of measure and statistical tests
  • t-test, ANOVA, chi-squared (χ²), non-parametric tests (a sketch follows below)
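
A minimal sketch of the measures-and-tests bullet: a hypothetical experiment
comparing task-completion time (the dependent variable) across two interface
conditions (the independent variable). The data are simulated purely for
illustration; no real results are implied.

    # Simulated between-subjects comparison of task times under two interface
    # conditions (hypothetical data, illustration only).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    ui_a = rng.normal(loc=12.0, scale=3.0, size=20)  # seconds per task, condition A
    ui_b = rng.normal(loc=10.5, scale=3.0, size=20)  # seconds per task, condition B

    t, p = stats.ttest_ind(ui_a, ui_b)  # between-subjects: independent-samples t-test
    print(f"t = {t:.2f}, p = {p:.3f}")
    # For a within-subjects design (each participant sees both conditions),
    # use a paired test instead: stats.ttest_rel(ui_a, ui_b).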

21
Experimental Desiderata
  • P-value: the probability that the results are due
    to chance
  • Type I error: accepting a spurious result
  • Bonferroni's principle: if you run enough
    significance tests, you'll eventually get lucky
    (simulated in the sketch below)
  • Type II error: mistakenly rejecting a real result
  • Inappropriate measure or test?
  • Statistical vs. practical significance
  • N = 1000, p < 0.001, avg Δt = 0.12 sec.
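
A minimal simulation of Bonferroni's principle from the list above: twenty
significance tests run on pure noise (no true effect anywhere), so every
uncorrected "significant" hit is a Type I error, while a Bonferroni-corrected
threshold usually yields none. The test count and sample sizes are illustrative.

    # Bonferroni's principle, simulated: 20 t-tests on pure noise, so any
    # "significant" result is a false positive (Type I error).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n_tests, alpha = 20, 0.05
    pvals = [stats.ttest_ind(rng.normal(size=30), rng.normal(size=30)).pvalue
             for _ in range(n_tests)]

    print("uncorrected hits:", sum(p < alpha for p in pvals))            # often >= 1 by luck
    print("Bonferroni hits: ", sum(p < alpha / n_tests for p in pvals))  # usually 0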

22
Internal Validity
  • Internal validity: is a causal relation between
    two variables properly demonstrated?
  • Confounds: is there another factor at play?
  • Selection (bias): appropriate subject population?
  • Experimenter bias: researcher actions

23
External Validity
  • External validity: do the results generalize to
    other situations or populations?
  • Subjects: do subjects' aptitudes interact with
    the independent variables?
  • Situation: time, location, lighting, duration

24
Ecological Validity
  • The degree to which the methods, materials and
    setting of the study approximate the real-life
    situation under investigation.
  • Flight simulator vs. flying a plane
  • Simulated community activity vs. open web

25
(No Transcript)
26
(No Transcript)
27
Next Time: Distributed Cognition
  • "The Power of Representation," in Things That Make
    Us Smart, Donald Norman, 1993, pp. 43-76.
  • "On Distinguishing Epistemic from Pragmatic
    Action," Cognitive Science, 1994, pp. 513-549,
    David Kirsh and Paul Maglio