Title: Eye-Based Interaction in Graphical Systems: Theory & Practice
1 Eye-Based Interaction in Graphical Systems: Theory & Practice
- Andrew Duchowski
- Computer Science
- Clemson University
- andrewd_at_cs.clemson.edu
- Roel Vertegaal
- Computing and Information Science
- Queen's University
- roel_at_acm.org
2 Overview
- Basics of visual attention, human vision, eye movements, and signal analysis
- Eye tracking hardware specifications
- Video eye tracker integration
- Principal eye tracking system modes
- interactive or diagnostic
- Example systems and potential applications
3Course Schedule Part I
- Introduction to the Human Visual System (HVS)
- Neurological substrate of the HVS
- Physiological and functional descriptions
- Visual Perception
- Spatial, temporal, color vision
- Eye Movements
- Saccades, fixations, pursuits, nystagmus
4Course Schedule Part II
- Part II Eye tracking systems
- The eye tracker
- early developments
- video-based eye trackers
- system use
- Integration issues
- application design
- calibration
- data collection / analysis
5 Course Schedule: Part III / Demo
- Part III: Potential applications
- VR, Human Factors
- Collaborative systems, Advertising
- Psychophysics, Displays
- Demonstration: GAZE Groupware System
6 Eye-Based Interaction in Graphical Systems: Theory & Practice
- Part I
- Introduction to the Human Visual System
7 A Visual Attention
- "When the things are apprehended by the senses, the number of them that can be attended to at once is small, 'Pluribus intentus, minor est ad singula sensus'" (William James)
- Latin translation: "Many filtered into few for perception"
- Visual scene inspection is performed minutatim (piecemeal), not in toto
8 A.1 Visual Attention: chronological review
- Qualitative historical background: a dichotomous theory of attention, the "what" and "where" of (visual) attention
- Von Helmholtz (ca. 1900): mainly concerned with eye movements to spatial locations, the "where", i.e., attention as an overt mechanism (eye movements)
- James (ca. 1900): defined attention mainly in terms of the "what", i.e., attention as a more internally covert mechanism
9 A.1 Visual Attention: chronological review (cont'd)
- Broadbent (ca. 1950): defined attention as a selective filter, based on auditory experiments; generally agrees with Von Helmholtz's "where"
- Deutsch and Deutsch (ca. 1960): rejected the selective filter in favor of importance weightings, generally corresponding to James' "what"
- Treisman (ca. 1960): proposed a unified theory of attention, an attenuation filter (the "where") followed by dictionary units (the "what")
10 A.1 Visual Attention: chronological review (cont'd)
- Main debate at this point: is attention parallel (the "where") or serial (the "what") in nature?
- Gestalt view: recognition is a holistic process (e.g., the Kanizsa figure)
- Theories advanced through early recordings of eye movements
11 A.1 Visual Attention: chronological review (cont'd)
- Yarbus (ca. 1967): demonstrated sequential, but variable, viewing patterns over particular image regions (akin to the "what")
- Noton and Stark (ca. 1970): showed that subjects tend to fixate identifiable regions of interest containing informative details; coined the term "scanpath" to describe eye movement patterns
- Scanpaths helped cast doubt on the Gestalt hypothesis
12 A.1 Visual Attention: chronological review (cont'd)
- Fig. 2: Yarbus' early scanpath recordings
- trace 1: examine at will
- trace 2: estimate wealth
- trace 3: estimate ages
- trace 4: guess previous activity
- trace 5: remember clothing
- trace 6: remember position
- trace 7: time since last visit
13 A.1 Visual Attention: chronological review (cont'd)
- Posner (ca. 1980): proposed the attentional "spotlight", a covert mechanism independent of eye movements (akin to the "where")
- Treisman (ca. 1986): once again unified the "what" and "where" dichotomy by proposing Feature Integration Theory (FIT), describing attention as a "glue" which integrates features at particular locations to allow holistic perception
14 A.1 Visual Attention: chronological review (cont'd)
- Summary: the "what" and "where" dichotomy provides an intuitive sense of the attentional, foveo-peripheral visual mechanism
- Caution: the what/where account is probably overly simplistic and is but one theory of visual attention
15 B Neurological Substrate of the Human Visual System (HVS)
- Any theory of visual attention must address the fundamental properties of early visual mechanisms
- Examination of the neurological substrate provides evidence of the limited information capacity of the visual system, a physiological reason for an attentional mechanism
16 B.1 The Eye
- Fig. 3: The eye, "the world's worst camera"
- suffers from numerous optical imperfections...
- ...yet is endowed with several compensatory mechanisms
17B.1 The Eye (contd)
18 B.1 The Eye (cont'd)
- Imperfections
- spherical aberrations
- chromatic aberrations
- curvature of field
- Compensations
- iris: acts as a stop
- focal lens: sharp focus
- curved retina: matches the curvature of field
19 B.2 The Retina
- Retinal photoreceptors constitute the first stage of visual perception
- Photoreceptors are transducers, converting light energy to electrical impulses (neural signals)
- Photoreceptors are functionally classified into two types: rods and cones
20 B.2 The Retina: rods and cones
- Rods: sensitive to dim and achromatic light (night vision)
- Cones: respond to brighter, chromatic light (day vision)
- Retinal construction: about 120M rods and 7M cones, arranged concentrically
21 B.2 The Retina: cellular makeup
- The retina is composed of 3 main layers of different cell types (a 3-layer sandwich)
- Surprising fact: the retina is inverted; photoreceptors are found in the bottom layer (furthest away from incoming light)
- Connection bundles between layers are called plexiform or synaptic layers
22 B.2 The Retina: cellular makeup (cont'd)
- Fig. 5: The retinocellular layers (w.r.t. incoming light)
- ganglion layer
- inner synaptic (plexiform) layer
- inner nuclear layer
- outer synaptic (plexiform) layer
- outer layer
23 B.2 The Retina: cellular makeup (cont'd)
- Fig. 5 (cont'd): The neuron
- all retinal cells are types of neurons
- certain neurons mimic a digital gate, firing when the activation level exceeds a threshold
- rods and cones are specific types of dendrites
24 B.2 The Retina: retinogeniculate organization (from the outside in, w.r.t. the cortex)
- Outer layer: rods and cones
- Inner layer: horizontal cells, laterally connected to photoreceptors
- Ganglion layer: ganglion cells, connected (indirectly) to horizontal cells; project via the myelinated pathways to the Lateral Geniculate Nuclei (LGN) in the thalamus
25 B.2 The Retina: receptive fields
- Receptive fields: collections of interconnected cells within the inner and ganglion layers
- Field organization determines the impulse signature of cells, based on cell types
- Cells may depolarize due to light increments (+) or decrements (-)
26 B.2 The Retina: receptive fields (cont'd)
- Fig. 6: Receptive fields
- signal profile resembles a Mexican hat
- receptive field sizes vary concentrically
- color-opposing fields also exist
27 B.3 Visual Pathways
- Retinal ganglion cells project to the LGN along two major pathways, distinguished by morphological cell types: α and β cells
- α cells project to the magnocellular (M-) layers
- β cells project to the parvocellular (P-) layers
- Ganglion cells are functionally classified into three types: X, Y, and W cells
28 B.3 Visual Pathways: functional response of ganglion cells
- X cells: sustained stimulus, location, and fine detail
- innervate along both M- and P-projections
- Y cells: transient stimulus, coarse features, and motion
- innervate along only the M-projection
- W cells: coarse features and motion
- project to the Superior Colliculus (SC)
29B.3 Visual Pathways (contd)
- Fig.7 Optic tract and radiations (visual
pathways) - The LGN is of particular clinical importance
- M- and P-cellular projections are clearly visible
under microscope - Axons from M- and P-layers of the LGN terminate
in area V1
30B.3 Visual Pathways (contd)
- Table.1 Functional characteristics of ganglionic
projections
31B.4 The Occipital Cortex and Beyond
- Fig.8 The brain and visual pathways
- the cerebral cortex is composed of numerous
regions classified by their function
32B.4 The Occipital Cortex and Beyond (contd)
- M- and P- pathways terminate in distinct layers
of cortical area V1 - Cortical cells (unlike center-surround ganglion
receptive fields) respond to orientation-specific
stimulus - Pathways emanating from V1 joining multiple
cortical areas involved in vision are called
streams
33 B.4 The Occipital Cortex and Beyond: directional selectivity
- Cortical Directional Selectivity (CDS) of cells
in V1 contributes to motion perception and
control of eye movements - CDS cells establish a motion pathway from V1
projecting to areas V2 and MT (V5) - In contrast, Retinal Directional Selectivity
(RDS) may not contribute to motion perception,
but is involved in eye movements
34 B.4 The Occipital Cortex and Beyond: cortical cells
- Two consequences of the visual system's motion-sensitive, single-cell organization
- due to motion sensitivity, the eyes are never perfectly still (instead a tiny jitter, termed microsaccades, is observed); if the eyes were artificially stabilized, the image would fade!
- due to single-cell organization, the representation of natural images is quite abstract: there is no retinal buffer
35 B.4 The Occipital Cortex and Beyond: two attentional streams
- Dorsal stream
- V1, V2, MT (V5), MST, Posterior Parietal Cortex
- sensorimotor (motion, location) processing
- the attentional where?
- Ventral (temporal) stream
- V1, V2, V4, Inferotemporal Cortex
- cognitive processing
- the attentional what?
36 B.4 The Occipital Cortex and Beyond: three attentional regions
- Posterior Parietal Cortex (dorsal stream)
- disengages attention
- Superior Colliculus (midbrain)
- relocates attention
- Pulvinar (thalamus colocated with LGN)
- engages, or enhances, attention
37C Visual Perception (with emphasis on
foveo-peripheral distinction)
- Measurable performance parameters may often (but
not always!) fall within ranges predicted by
known limitations of the neurological substrate - Example visual acuity may be estimated by
knowledge of density and distribution of the
retinal photoreceptors - In general, performance parameters are obtained
empirically
38 C.1 Spatial Vision
- Main parameters sought: visual acuity, contrast sensitivity
- Dimensions of retinal features are measured in terms of the scene projected onto the retina, in units of degrees visual angle: A = 2 arctan(S / 2D), where S is the object size and D is the distance to the object (see the sketch below)
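- A minimal sketch of this computation (hypothetical helper; S and D must be in the same units):
    #include <math.h>
    /* Visual angle (in degrees) subtended by an object of size S at distance D. */
    double visual_angle_deg(double S, double D)
    {
        return 2.0 * atan(S / (2.0 * D)) * 180.0 / M_PI;
    }
    /* e.g., a 1 cm target viewed at about 57 cm subtends roughly 1 degree */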
39 C.1 Spatial Vision: visual angle
40 C.1 Spatial Vision: common visual angles
- Table 2: Common visual angles
41 C.1 Spatial Vision: retinal regions
- Visual field: roughly 180° horizontal × 130° vertical
- Fovea centralis (foveola): highest acuity
- 1.3° visual angle, 25,000 cones
- Fovea: high acuity (at 5°, acuity drops to 50%)
- 5° visual angle, 100,000 cones
- Macula: within the useful acuity region (to about 30°)
- 16.7° visual angle, 650,000 cones
- Hardly any rods in the foveal region
42 C.1 Spatial Vision: visual angle and receptor distribution
- Fig. 10: Retinotopic receptor distribution
43 C.1 Spatial Vision: visual acuity
- Fig. 11: Visual acuity at various eccentricities and light levels
- at photopic (day) light levels, acuity is fairly constant within the central 2°
- acuity drops off linearly out to about 5°, and sharply (exponentially) beyond that
- at scotopic (night) light levels, acuity is poor at all eccentricities
44 C.1 Spatial Vision: measuring visual acuity
- Acuity roughly corresponds to receptor distribution in the fovea, but not necessarily in the periphery
- Due to various contributing factors (synaptic organization and later-stage neural elements), effective relative visual acuity is generally measured by psychophysical experimentation
45 C.2 Temporal Vision
- Visual response to motion is characterized by two distinct facts: persistence of vision (POV) and the phi phenomenon
- POV essentially describes the human temporal sampling rate
- Phi describes the threshold above which humans detect apparent movement
- Both facts are exploited in media to elicit motion perception
46 C.2 Temporal Vision: persistence of vision
- Fig. 12: Critical Fusion Frequency (CFF)
- a stimulus flashing at about 50-60 Hz appears steady
- CFF explains why flicker is not seen when viewing a sequence of still images
- cinema: 24 fps × 3 = 72 Hz, due to a 3-bladed shutter
- TV: 60 fields/s, interlaced
47 C.2 Temporal Vision: phi phenomenon
- The phi phenomenon explains why motion is perceived in cinema, TV, and graphics
- Besides the necessary flicker rate (60 Hz), the illusion of apparent, or stroboscopic, motion must be maintained
- Similar to old-fashioned neon signs with stationary bulbs
- Minimum rate: about 16 frames per second
48 C.2 Temporal Vision: peripheral motion perception
- Motion perception is not homogeneous across
visual field - Sensitivity to target motion decreases with
retinal eccentricity for slow motion... - higher rate of target motion (e.g., spinning
disk) is needed to match apparent velocity in
fovea - but, motion is more salient in periphery than in
fovea (easier to detect moving targets than
stationary ones)
49 C.2 Temporal Vision: peripheral sensitivity to direction of motion
- Fig.13 Threshold isograms for peripheral rotary
movement - periphery is twice as sensitive to
horizontal-axis movement as to vertical-axis
movement - (numbers in diagram are rates of pointer movement
in rev./min.)
50 C.3 Color Vision: cone types
- foveal color vision is facilitated by three types of cone photoreceptors
- a good deal is known about foveal color vision; relatively little is known about peripheral color vision
- of the roughly 7,000,000 cones, most are packed tightly into the central 30° region
- Fig. 14: Spectral sensitivity curves of the cone photoreceptors
51 C.3 Color Vision: peripheral color perception fields
- blue and yellow fields are larger than red and green fields
- most sensitive to blue (up to 83°); red up to 76°, green up to 74°
- chromatic fields do not have definite borders; sensitivity gradually and irregularly drops off over a 15-30° range
- Fig. 15: Visual fields for monocular color vision (right eye)
52C.4 Implications for Design of Attentional
Displays
- Need to consider distinct characteristics of
foveal and peripheral vision, in particular - spatial resolution
- temporal resolution
- luminance / chrominance
- Furthermore, gaze-contingent systems must match
dynamics of human eye movement
53D Taxonomy and Models of Eye Movements
- Eye movements are mainly used to reposition the
fovea - Five main classes of eye movements
- saccadic
- smooth pursuit
- vergence
- vestibular
- physiological nystagmus
- (fixations)
- Other types of movements are non-positional
(adaptation, accommodation)
54D.1 Extra-Ocular Muscles
- Fig.16 Extrinsic muscles of the eyes
- in general, eyes move within 6 degrees of freedom
(6 muscles)
55D.1 Oculomotor Plant
- Fig.17 Oculomotor system
- eye movement signals emanate from three main
distinct regions - occipital cortex (areas 17, 18, 19, 22)
- superior colliculus (SC)
- semicircular canals (SCC)
56D.1 Oculomotor Plant (contd)
- Two pertinent observations
- eye movement system is, to a large extent, a
feedback circuit - controlling cortical regions can be functionally
characterized as - voluntary (occipital cortexareas 17, 18, 19, 22)
- involuntary (superior colliculus, SC)
- reflexive (semicircular canals, SCC)
57D.2 Saccades
- Rapid eye movements used to reposition fovea
- Voluntary and reflexive
- Range in duration from 10ms - 100ms
- Effectively blind during transition
- Deemed ballistic (pre-programmed) and stereotyped
(reproducible)
58 D.2 Saccades: modeling
- Fig. 18: Linear moving average filter model, x_t = sum_k g_k s_(t-k)
- s_t: input (pulse), x_t: output (step), g_k: filter coefficients
- e.g., Haar filter {1, -1} (a sketch follows below)
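- A minimal sketch of the moving average model under the notation above (hypothetical arrays; with the Haar filter g = {1, -1}, the output is simply the sample-to-sample difference, i.e., a crude velocity estimate):
    /* x[t] = sum_k g[k] * s[t-k] */
    void moving_average_filter(const double *s, double *x, int n, const double *g, int m)
    {
        for (int t = 0; t < n; t++) {
            x[t] = 0.0;
            for (int k = 0; k < m && t - k >= 0; k++)
                x[t] += g[k] * s[t - k];
        }
    }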
59D.3 Smooth Pursuits
- Involved when visually tracking a moving target
- Depending on range of target motion, eyes are
capable of matching target velocity - Pursuit movements are an example of a control
system with built-in negative feedback
60 D.3 Smooth Pursuits: modeling
- Fig. 19: Linear, time-invariant filter model
- s_t: target position, x_t: (desired) eye position, h: filter
- retinal receptors give an additive velocity error
61 D.4 Nystagmus
- Conjugate eye movements characterized by a sawtooth-like time course pattern (pursuits interspersed with saccades)
- Two types (virtually indistinguishable)
- Optokinetic: compensates for retinal movement of the target
- Vestibular: compensates for head movement
- May be possible to model with a combination of saccade/pursuit filters
62 D.5 Fixations
- Possibly the most important type of eye movement for attentional applications
- 90% of viewing time is devoted to fixations
- duration: 150 ms to 600 ms
- Not technically eye movements in their own right; rather, characterized by miniature eye movements
- tremor, drift, microsaccades
63D.6 Eye Movement Analysis
- Two significant observations
- only three types of eye movements are mainly
needed to gain insight into overt localization of
visual attention - fixations
- saccades
- smooth pursuits (to a lesser extent)
- all three signals may be approximated by linear,
time-invariant (LTI) filter systems
64 D.6 Eye Movement Analysis: assumptions
- Important point: it is assumed that observed eye movements disclose evidence of overt visual attention
- it is possible to attend to objects covertly (without moving the eyes)
- Linearity: although practical, this assumption is an operational oversimplification of neuronal (non-linear) systems
65 D.6 Eye Movement Analysis: goals
- goal of analysis is to locate regions where
signal average changes abruptly - fixation end, saccade start
- saccade end, fixation start
- two main approaches
- summation-based
- differentiation-based
- both approaches rely on empirical thresholds
Fig.20 Hypothetical eye movement signal
66 D.6 Eye Movement Analysis: denoising
- Fig. 21: Signal denoising; reduce noise due to
- eye instability (jitter), or worse, blinks
- removal is possible based on device characteristics (e.g., a blink registers as (0,0))
67 D.6 Eye Movement Analysis: summation-based
- Dwell-time fixation detection depends on
- identification of a stationary signal (fixation), and
- the size of the time window specifying the range of duration (and hence a temporal threshold)
- Example: position-variance method (see the sketch below)
- determine whether M of N points lie within a certain distance D of the mean (μ) of the signal
- the values M, N, and D are determined empirically
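- A sketch of the position-variance test (M, N, D are the empirical parameters; x[], y[] hold the N mapped POR samples of the current window; names are illustrative):
    #include <math.h>
    /* Returns 1 if at least M of the N samples lie within distance D of the window mean. */
    int is_fixation(const double *x, const double *y, int N, int M, double D)
    {
        double mx = 0.0, my = 0.0;
        for (int i = 0; i < N; i++) { mx += x[i]; my += y[i]; }
        mx /= N; my /= N;
        int count = 0;
        for (int i = 0; i < N; i++) {
            double dx = x[i] - mx, dy = y[i] - my;
            if (sqrt(dx * dx + dy * dy) <= D) count++;
        }
        return count >= M;
    }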
68 D.6 Eye Movement Analysis: differentiation-based
- Velocity-based saccade/fixation detection
- the velocity calculated over a signal window is compared to a threshold
- if velocity > threshold, then saccade; else fixation
- Example: velocity detection method (see the sketch below)
- use short Finite Impulse Response (FIR) filters to detect saccades (may be possible in real time)
- assuming a symmetrical velocity profile, this can be extended to velocity-based prediction
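- A sketch of the velocity-threshold test (simple finite difference over the sample period dt; the threshold is an empirical value in the same units as the computed velocity):
    #include <math.h>
    /* Label each sample: 1 = saccade, 0 = fixation. */
    void detect_saccades(const double *x, const double *y, int n, double dt, double vthresh, int *label)
    {
        label[0] = 0;
        for (int i = 1; i < n; i++) {
            double vx = (x[i] - x[i-1]) / dt;
            double vy = (y[i] - y[i-1]) / dt;
            label[i] = (sqrt(vx * vx + vy * vy) > vthresh) ? 1 : 0;
        }
    }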
69D.6 Eye Movement Analysis (contd)
(a) position-variance
(b) velocity-detection
- Fig.22 Saccade/fixation detection
70 D.6 Eye Movement Analysis: example
- Fig. 23: FIR filter velocity-detection method based on idealized saccade detection
- 4 conditions on measured acceleration
- acceleration > threshold A
- acceleration > threshold B
- sign change
- duration threshold
- thresholds derived from empirical values
71 D.6 Eye Movement Analysis: example (cont'd)
- Amplitude thresholds A, B derived from expected peak saccade velocities of about 600°/s
- Duration thresholds Tmin, Tmax derived from expected saccade durations of 120 ms to 300 ms
- Fig. 24: FIR filters for saccade detection
72 Eye-Based Interaction in Graphical Systems: Theory & Practice
- Part II
- Eye Tracking Systems
73 E The Eye Tracker
- Two broad applications of eye movement monitoring techniques
- measuring the position of the eye relative to the head
- measuring the orientation of the eye in space, or the point of regard (POR), used to identify fixated elements in a visual scene
- Arguably, the most widely used apparatus for measuring the POR is the video-based corneal reflection eye tracker
74E.1 Brief Survey of Eye Tracking Techniques
- Four broad categories of eye movement
methodologies - electro-oculography (EOG)
- scleral contact lens/search coil
- photo-oculography (POG) or video-oculography
(VOG) - video-based combined pupil and corneal reflection
75E.1 Brief Survey of Eye Tracking Techniques
(contd)
- First method for objective eye movement
measurements using corneal reflection reported in
1901 - Techniques using contact lenses to improve
accuracy developed in 1950s (invasive) - Remote (non-invasive) trackers rely on visible
features of the eye (e.g., pupil) - Fast image processing techniques have facilitated
real-time video-based systems
76 E.1 Brief Survey of Eye Tracking Techniques: EOG
- the most widely used method some 20 years ago (still used today)
- similar to electro-mechanical motion capture
- measures eye movements relative to head position
- not generally suitable for POR measurement (unless the head is also tracked)
- Fig. 25: EOG measurement
- relies on measurement of the skin's potential differences, using electrodes placed around the eye
77 E.1 Brief Survey of Eye Tracking Techniques: Scleral Contact Lens / Search Coil
- Fig.26 Scleral coil
- search coil embedded in contact lens and
electromagnetic field frames
- possibly most precise
- similar to electromagnetic position/orientation
trackers used in motion-capture
78 E.1 Brief Survey of Eye Tracking Techniques: Scleral Contact Lens / Search Coil (cont'd)
- highly accurate, but limited measurement range (about 5°)
- measures eye movements relative to head position
- not generally suitable for POR measurement (unless the head is also tracked)
- Fig. 27: Example of scleral suction ring insertion
- most intrusive method
- insertion of the lens requires care
- wearing the lens causes discomfort
79 E.1 Brief Survey of Eye Tracking Techniques: POG / VOG
- Fig. 28: Example of POG / VOG methods and devices
- a wide variety of techniques based on measurement of distinguishable ocular features (similar to optical motion capture)
- pupil: apparent shape
- limbus: position of the iris-sclera boundary
- infra-red: corneal reflection of a directed light source
80 E.1 Brief Survey of Eye Tracking Techniques: Video-Based Combined Pupil and Corneal Reflection
- Fig.29 Table-mounted (remote) video-based eye
tracker - compute POR, usually in real-time
- utilize relatively cheap video cameras and image
processing hardware - can also allow limited head movement
81 E.1 Brief Survey of Eye Tracking Techniques: Video-Based Combined Pupil and Corneal Reflection (cont'd)
- Fig.30 Head-mounted video-based eye tracker
- essentially identical to table-mounted systems,
but with miniature optics
- most suitable for (graphical) interactive
systems, e.g., VR - binocular systems also available
82 E.1 Brief Survey of Eye Tracking Techniques: Corneal Reflection
- Two points of reference on the eye are needed to
separate eye movements from head movements, e.g., - pupil center
- corneal reflection of nearby, directed light
source (IR) - Positional difference between pupil center and
corneal reflection changes with eye rotation, but
remains relatively constant with minor head
movements
83 E.1 Brief Survey of Eye Tracking Techniques: Corneal Reflection (cont'd)
- Fig.31 Purkinje images
- corneal reflections are known as the Purkinje
images, or reflections - front surface of cornea
- rear surface of cornea
- front surface of lens
- rear surface of lens
- video-based trackers typically locate the first
Purkinje image
84 E.1 Brief Survey of Eye Tracking Techniques: Corneal Reflection (cont'd)
- Purkinje images appear as small white dots in close proximity to the (dark) pupil
- tracker calibration is achieved by measuring the user gazing at properly positioned grid points (usually 5 or 9)
- the tracker interpolates the POR on a perpendicular screen in front of the user
- Fig. 32: Pupil and Purkinje images as seen by the eye tracker's camera
85 E.1 Brief Survey of Eye Tracking Techniques: Corneal Reflection (cont'd)
- DPI trackers measure rotational and translational
eye movements - 1st and 4th reflections move together through
same distance upon eye translation, but separate
upon eye rotation - highly precise
- used to be expensive and difficult to set up
- Fig.33 Dual-Purkinje image (DPI) eye tracker
- so-called generation-V trackers measure the 1st
and 4th Purkinje images
86 F Integration Issues and Requirements
- Integration of an eye tracker into a graphics system chiefly depends on
- delivery of the proper graphics video stream to the tracker
- subsequent reception of the tracker's 2D gaze data
- Gaze data (x- and y-coordinates) are typically either stored by the tracker or sent to the graphics host via serial cable
- Discussion focuses on video-based eye trackers
87 F Integration Issues and Requirements (cont'd)
- The video-based tracker's main advantages over other systems
- relatively non-invasive
- fairly accurate (to about 1° over a 30° field of view)
- for the most part, not difficult to integrate
- Main limitation: sampling frequency, typically limited to the video frame rate, about 60 Hz
88F Integration Issues and Requirements (contd)
- Fig.34 Virtual Reality Eye Tracking (VRET) Lab
at Clemson - integration description based on VRET lab
equipment - two systems described
- table-mounted, monocular system
- HMD-fitted, binocular system
89 F Integration Issues and Requirements: VRET lab equipment
- SGI Onyx2 InfiniteReality graphics host
- dual-rack, dual-pipe, 8 MIPS R10000 CPUs
- 3 GB RAM, 0.5 GB texture memory
- ISCAN eye tracker
- table-mounted pan/tilt camera monocular unit
- HMD-fitted binocular unit
- Virtual Research V8 HMD
- Ascension 6 Degree-Of-Freedom (DOF) Flock Of
Birds (FOB) d.c. electromagnetic head tracker
90 F Integration Issues and Requirements: preliminaries
- Primary requirements
- knowledge of the video format required by the tracker (e.g., NTSC, VGA)
- knowledge of the data format returned by the tracker (e.g., byte order, codes)
- Secondary requirements: tracker capabilities
- fine-grained cursor control and readout?
- transmission of the tracker's operating mode along with gaze data?
91 F Integration Issues and Requirements: objectives
- Scene alignment
- required for calibration, display, and data mapping
- use the tracker's fine cursor to measure the graphics display dimensions; it is crucial that graphically displayed calibration points are aligned with those displayed by the eye tracker
- Host/tracker synchronization
- required for generation of the proper graphics display, i.e., calibration or stimulus
- use the tracker's operating mode data
92 F.1 System Installation
- Primary wiring considerations
- video cables: imperative that the graphics host generate a video signal in the format expected by the eye tracker
- example problem: graphics host generates a VGA signal (e.g., as required by the HMD), eye tracker expects NTSC
- serial line: a comparatively simple serial driver, typically facilitated by data specifications provided by the eye tracker vendor
93F.1 System Installation (contd)
- HMD driven by VGA
- switchbox controls video between monitors and HMD
- 2 VGA-NTSC converters
- TV driven by NTSC
- Fig.35 Video signal wiring diagram for the VRET
lab at Clemson
94 F.1 System Installation: lessons learned at Clemson
- Various video switches were needed to control
graphics video and eye camera video - Custom VGA cables (13W3-HD15) were needed to feed
monitors, HMD, and tracker - Host VGA signal had to be re-configured
(horizontal sync not sync-on-green) - Switchbox had to be re-wired (missing two lines
for pins 13 and 14!)
95F.2 Application Program Requirements
- Two example applications
- 2D image-viewing program (monocular)
- VR gaze-contingent environment (binocular)
- Most important common requirement
- mapping eye tracker coordinates to the application program's reference frame
- Extra requirements for VR
- head tracker coordinate mapping
- gaze vector calculation
96 F.2.1 Eye Tracker Screen Coordinate Mapping: general
- The eye tracker returns the user's POR relative to the tracker's screen reference frame, e.g., a 512×512 pixel plane
- Tracker data must be mapped to the dimensions of the application screen
- In general, to map x' in [a, b] to the range [c, d]: x = c + (x' - a)(d - c)/(b - a)
97 F.2.1 Eye Tracker Screen Coordinate Mapping: to the 3D viewing frustum
- Fig. 36: Eye tracker to VR mapping
- note the eye tracker origin is at the top-left
98 F.2.1 Eye Tracker Screen Coordinate Mapping: to the 3D viewing frustum (cont'd)
- to convert eye tracker coordinates (x', y') to graphics coordinates (x, y) for a window of width w and height h (with tracker coordinates in [0, 512]): x = x' (w/512), y = (512 - y')(h/512)
- the term (512 - y') handles the y-coordinate flip, so that the eye tracker screen is mapped to the bottom-left origin of the viewing frustum
- if the dimensions of the graphics window are static, e.g., 640×480, the above equation can be hardcoded
99 F.2.1 Eye Tracker Screen Coordinate Mapping: to the 2D image plane
- Conversion of eye tracker coordinates (x', y') to 2D image plane coordinates (x, y) is handled similarly
- For example, if the viewable image plane has dimensions 600×450: x = x' (600/512), y = (512 - y')(450/512)
- Note the above mapping assumes eye tracker coordinates are in the range [0, 512]
- In practice, usable coordinates depend on the location of the application window on the eye tracking screen
100 F.2.1 Eye Tracker Screen Coordinate Mapping: to the 2D image plane (cont'd)
- use the eye tracker's fine cursor movement to measure the application window's extents
- calculate the mapping from the measured extents, e.g., for a 600×450 window (see the sketch below)
- Fig. 37: Application window measurement
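- A sketch of the resulting mapping, assuming window extents (wleft, wtop, wright, wbottom) measured in the tracker's 512×512 coordinates and a 600×450 application window (names are illustrative):
    /* Map a tracker POR (xp, yp) into the 600x450 window; the y-axis is flipped
       because the tracker origin is at the top-left. */
    void map_por_to_window(double xp, double yp,
                           double wleft, double wtop, double wright, double wbottom,
                           double *x, double *y)
    {
        *x = (xp - wleft) * 600.0 / (wright - wleft);
        *y = (wbottom - yp) * 450.0 / (wbottom - wtop);
    }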
101F.2.2 Mapping Flock Of Birds Tracker Coordinates
- For VR applications, position and orientation of
the head is required (obtained from head tracker,
e.g., FOB) - The tracker reports 6 Degree-Of-Freedom (DOF)
information regarding sensor position and
orientation - Orientation is given in terms of Euler angles
102F.2.2 Mapping Flock Of Birds Tracker Coordinates
(contd)
Table 3 Euler angle names
- Euler angles roll, pitch, and yaw are represented
by R, E, A, respectively - each describes rotation angle about one axis
103 F.2.2 Mapping Flock Of Birds Tracker Coordinates (cont'd)
- Euler angles are described by the familiar homogeneous rotation matrices
- the composite 4×4 matrix contains all rotations in one
104 F.2.2 Mapping Flock Of Birds Tracker Coordinates (cont'd)
- in VR, the composite transformation matrix returned by the head tracker is used to transform an arbitrary directional vector w = [x y z 1] to align it with the current sensor (head) orientation
- this formulation is used to align the initial view vector, up vector, and eventually the gaze vector with the current head-centric reference frame
- note that the FOB matrix may be shifted by 1
105 F.2.2 Mapping Flock Of Birds Tracker Coordinates (cont'd)
- e.g., transforming the initial view vector v = [0 0 -1 1]
- e.g., transforming the initial up vector u = [0 1 0 1]
- the gaze vector is transformed similarly (see the sketch below)
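- A sketch of one possible implementation of such a transform (e.g., the transformVectorToHead call in the VR main loop shown later), using successive axis rotations; the exact axis order and signs follow the FOB documentation, and the tracker's own composite matrix can be used directly when available:
    #include <math.h>
    static void rot_x(double a, double v[3]) { double c = cos(a), s = sin(a);
        double y = v[1]*c - v[2]*s, z = v[1]*s + v[2]*c; v[1] = y; v[2] = z; }
    static void rot_y(double a, double v[3]) { double c = cos(a), s = sin(a);
        double x = v[0]*c + v[2]*s, z = -v[0]*s + v[2]*c; v[0] = x; v[2] = z; }
    static void rot_z(double a, double v[3]) { double c = cos(a), s = sin(a);
        double x = v[0]*c - v[1]*s, y = v[0]*s + v[1]*c; v[0] = x; v[1] = y; }
    /* Align a direction (e.g., the initial view vector {0,0,-1}) with the sensor orientation. */
    void transform_vector_to_head(double yaw, double pitch, double roll, double v[3])
    {
        rot_y(yaw, v);    /* azimuth   */
        rot_x(pitch, v);  /* elevation */
        rot_z(roll, v);   /* roll      */
    }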
106F.2.3 3D Gaze Point and Vector Calculation
- The gaze point calculation in 3-space depends on
only the relative positions of the two eyes in
the horizontal axis - Parameters of interest here are the 3D virtual
(world) coordinates of the gaze point, (xg, yg,
zg) - These coordinates can be determined from
traditional stereo geometry
107F.2.3 3D Gaze Point and Vector Calculation
(contd)
- Fig.39 Basic binocular geometry
- helmet position is the origin, (xh, yh, zh)
- helmet view vector is the optical (viewer-local)
z-axis - helmet up vector is the (viewer-local) y-axis
- eye tracker provides instantaneous viewer-local
gaze coordinates (mapped to viewing frustum)
108 F.2.3 3D Gaze Point and Vector Calculation (cont'd)
- given instantaneous binocular gaze coordinates (xl, yl) and (xr, yr) at focal distance f along the viewer-local z-axis, the gaze point (xg, yg, zg) can be derived parametrically:
- xg = (1 - s) xh + s (xl + xr)/2
- yg = (1 - s) yh + s (yl + yr)/2
- zg = (1 - s) zh + s f
- where the interpolant s is given by s = b/(xl - xr + b), with b the interocular (baseline) distance
109 F.2.3 3D Gaze Point and Vector Calculation (cont'd)
- the gaze point can be expressed parametrically as a point on a ray with origin (xh, yh, zh), the helmet position, emanating along a vector scaled by the parameter s
- in vector notation, g = h + s v, where h is the head position, v is the central view vector, and s is the scale parameter defined previously
- note the view vector here is not the view vector given by the head tracker
110 F.2.3 3D Gaze Point and Vector Calculation (cont'd)
- the view vector related to the gaze vector is obtained by subtracting the helmet position from the midpoint of the eye-tracked coordinates at the focal distance to the near view plane: v = m - h = ((xl + xr)/2 - xh, (yl + yr)/2 - yh, f - zh)
- where m denotes the left and right eye coordinate midpoint
111 F.2.3 3D Gaze Point and Vector Calculation (cont'd)
- to transform the vector v to the proper (instantaneous) head orientation, this vector should be normalized and then transformed by the orientation matrix returned by the head tracker
- the transformed vector v gives the gaze direction (ray)
- using the helmet position h and gaze direction v, we can express the gaze vector via a parametric representation of a ray with linear interpolant t: p(t) = h + t v
112F.2.4 Virtual Fixation Coordinates
- The gaze vector can be used in VR to calculate
virtual fixation coordinates - Fixation coordinates are obtained via traditional
ray/polygon intersection calculations, as used in
ray tracing - The fixated object of interest (polygon) is the
one closest to the viewer which intersects the ray
113 F.2.4 Virtual Fixation Coordinates: ray/plane intersection
- The intersection of the ray with the polygons in the scene is computed via a parametric representation of the ray: r(t) = ro + t rd
- where ro defines the ray's origin (a point) and rd defines the ray direction (a vector)
- For gaze, use ro = h, the head position, and rd = v, the gaze direction vector
114 F.2.4 Virtual Fixation Coordinates: ray/plane intersection (cont'd)
- Recall the plane equation Ax + By + Cz + D = 0, where A² + B² + C² = 1, i.e., (A, B, C) defines the unit plane normal N
- Calculate the ray/plane intersection: t = -(N · ro + D)/(N · rd)
- Find the closest ray/plane intersection to the viewer, where t > 0
115 F.2.4 Virtual Fixation Coordinates: ray/plane intersection (cont'd)
- possible divide-by-zero: need to check for this (if the denominator is close to 0, the ray and plane don't intersect)
- if the dot product is greater than 0, the surface is hidden from the viewer (use this to speed up the code)
- Fig. 40: Ray/plane geometry
- N is actually -N, to calculate the angle between the ray and the face normal
116 F.2.4 Virtual Fixation Coordinates: ray/plane intersection (cont'd)
- the parameter t defines the point of intersection along the ray at the plane defined by N
- if t > 0, the point of intersection p is given by p = ro + t rd
- this only gives the intersection of the ray and the (infinite!) plane
- need to test whether p lies within the confines of the polygonal face
- Fig. 41: Ray/plane intersection algorithm (a sketch follows below)
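- A minimal sketch of the intersection test described above (hypothetical types; N = (A, B, C) is the unit plane normal and D the plane offset):
    #include <math.h>
    typedef struct { double x, y, z; } Vec3;
    static double dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
    /* Returns 1 and writes t and the hit point p if the ray ro + t*rd (t > 0) meets the plane. */
    int ray_plane(Vec3 ro, Vec3 rd, Vec3 N, double D, double *t, Vec3 *p)
    {
        double denom = dot(N, rd);
        if (fabs(denom) < 1e-9) return 0;   /* ray (nearly) parallel to plane */
        *t = -(dot(N, ro) + D) / denom;
        if (*t <= 0.0) return 0;            /* intersection behind the viewer */
        p->x = ro.x + *t * rd.x;
        p->y = ro.y + *t * rd.y;
        p->z = ro.z + *t * rd.z;
        return 1;
    }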
117 F.2.4 Virtual Fixation Coordinates: point-in-polygon
- for each edge
- calculate the plane perpendicular to the polygon, passing through the edge's two vertices: N' = N × (B - A)
- calculate the new plane's equation
- test point p to see if it lies above or below the new plane
- if p is above all such planes, p is inside the polygon (see the sketch below)
- Fig. 42: Point-in-polygon problem
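- A sketch of the edge-plane test, assuming a convex planar polygon with vertices in consistent counter-clockwise order (Vec3 and dot as in the previous sketch):
    static Vec3 sub(Vec3 a, Vec3 b) { Vec3 c = { a.x-b.x, a.y-b.y, a.z-b.z }; return c; }
    static Vec3 cross(Vec3 a, Vec3 b) {
        Vec3 c = { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x }; return c; }
    /* Returns 1 if p lies inside the polygon (vertices verts[0..n-1], face normal N). */
    int point_in_polygon(Vec3 p, const Vec3 *verts, int n, Vec3 N)
    {
        for (int i = 0; i < n; i++) {
            Vec3 A = verts[i], B = verts[(i + 1) % n];
            Vec3 Nprime = cross(N, sub(B, A));   /* plane through edge AB, perpendicular to polygon */
            if (dot(Nprime, sub(p, A)) < 0.0)    /* p falls below this edge plane: outside */
                return 0;
        }
        return 1;
    }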
118F.3 System Calibration and Usage
- Most video-based eye trackers require calibration
- Usually composed of simple stimuli (dots,
crosses, etc.) displayed sequentially at far
extents of viewing window - Application program displaying stimulus must be
able to draw calibration stimulus at appropriate
locations and at appropriate time
119F.3 System Calibration and Usage (contd)
- Fig.43 Usual graphics draw routine augmented by
mode-sensitive eye tracking code
120 F.3 System Calibration and Usage (cont'd)
- the calibration stimulus is displayed in both RESET and CALIBRATE states; this facilitates initial alignment of the application window (the default calibration dot is at the center)
- the stimulus scene (e.g., image or VE) is only displayed if the display condition is satisfied; this can be used to limit the duration of display
- for VR applications, the draw routine may be preceded by left or right viewport calls (for stereoscopic displays)
- the main program loop is responsible for 1) reading eye (and head) tracker data, 2) mapping coordinates, 3) starting/stopping timers, and 4) recording or acting on gaze coordinates
121F.3 System Calibration and Usage (contd)
while(1) {
    getEyeTrackerData(x,y); mapEyeTrackerData(x,y);
    switch(eye tracker state) {
    case RUN:
        if(!starting) start timer;
        displayStimulus = 1;
        if(timer() > DURATION) displayStimulus = 0;
        else storeData(x,y);
        break;
    case RESET: case CALIBRATE:
        starting = 0;
    }
    redraw();
}
- Fig.44 Main loop (2D imaging application)
122F.3 System Calibration and Usage (contd)
while(1) {
    getHeadTrackerData(eye,dir,upv);
    getEyeTrackerData(xl,yl,xr,yr);
    mapEyeTrackerData(xl,yl,xr,yr);
    s = b/(xl - xr + b);                            // linear gaze interpolant
    h = (eyex, eyey, eyez);                         // head position
    v = ((xl+xr)/2 - xh, (yl+yr)/2 - yh, f - zh);   // central view vector
    transformVectorToHead(v);                       // multiply v by FOB matrix
    g = h + s*v;                                    // calculate gaze point
    switch(eye tracker state) { ... }
    redraw();
}
- Fig.45 Main loop (VR application)
123 F.3 System Calibration and Usage (cont'd)
- once the application program has been developed, the system is ready for use; the general manner of usage requires the following steps
- move the application window to align it with the eye tracker's default (central) calibration dot
- adjust the eye tracker's pupil and corneal reflection thresholds
- calibrate the eye tracker
- reset the eye tracker and run (the program displays the stimulus and stores data)
- save the recorded data
- optionally re-calibrate
124 F.4 Data Collection and Analysis
- Data collection is fairly straightforward: store point-of-regard info along with a timestamp
- Use a linked list, since the number of samples may be large
- Fig. 46: 2D imaging POR data structure (a sketch follows below)
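- A minimal sketch of such a record (a singly linked list node; field names are illustrative):
    /* One point-of-regard sample. */
    typedef struct PORNode {
        double x, y;            /* mapped gaze coordinates (add z and head pose for VR) */
        double t;               /* timestamp, e.g., ms since stimulus onset */
        struct PORNode *next;
    } PORNode;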
125 F.4 Data Collection and Analysis (cont'd)
- For VR applications, the data structure is similar, but will require a z-component
- May also store head position
- Analysis follows the eye movement analysis models presented previously
- Goals: 1) eliminate noise, 2) identify fixations
- Final point: label stored data appropriately; with many subjects, experiments tend to generate LOTS of data
126F.4 Data Collection and Analysis (contd)
- Fig.47 Example of 3D gaze point in VR
- calculated gaze point of user in art gallery
environment - raw data, blinks removed
127 Eye-Based Interaction in Graphical Systems: Theory & Practice
- Part III
- Potential Gaze-Contingent Applications
128 G Applications: Introduction
- A wide variety of eye tracking applications exist, each class increasingly relying on advanced graphical techniques
- Psychophysics, Human Factors
- Advertising, Displays
- Virtual Reality, HCI, Collaborative Systems
- Two broad categories: diagnostic or interactive
129 H Psychology, Psychophysics, and Neuroscience
- Applications range from basic research in vision science to investigation of visual exploration in aesthetics (e.g., perception of art)
- Examples
- psychophysics: spatial acuity, contrast sensitivity, ...
- perception: reading, natural scenery, ...
- neuroscience: cognitive loads, with fMRI, ...
130H Psychology, Psychophysics, and Neuroscience
(contd)
(a) aesthetic group
(b) semantic group
- Fig.48 Perception of art
- small but visible differences in scanpaths
- similar sets of fixated image features
131 I Ergonomics and Human Factors
- Applications range from usability studies to testing the effectiveness of cockpit displays
- Examples
- evaluation of tool icon groupings
- comparison of gaze-based and mouse interaction
- organization of click-down menus
- testing electronic layout of pilots' visual flight rules
- testing simulators for training effectiveness
132I Ergonomics and Human Factors (contd)
- Fig.49 Virtual aircraft cargo-bay environment
- examination of visual search patterns of experts
during aircraft inspection tasks - 3D scanpaths gaze/wall intersection points
133J Marketing / Advertising
- Applications range from assessing ad
effectiveness (copy testing) in various media
(print, images, video, etc.) to disclosure
research (visibility of fine print) - Examples
- eye movements over print media (e.g., yellow
pages) - eye movements over TV ads, magazines, ...
134J Marketing / Advertising (contd)
- Fig.50 Scanpaths over magazine ads
135K Displays
- Applications range from perceptually-based image
and video display design to estimation of
corrective display functions (e.g., gamma, color
spaces, etc.) - Examples
- JPEG/MPEG (no eye tracking per se, but
perceptually based, e.g., JPDs) - gaze-contingent displays (e.g., video-telephony,
) - computer (active) vision
136K Displays (contd)
(a) Haar HVS reconstruction
(b) wavelet acuity mapping
- Fig.51 Gaze-based foveo-peripheral image coding
- 2 Regions Of Interest (ROIs)
- smooth degradation (wavelet interpolation)
137L Graphics and Virtual Reality
- Applications range from eye-slaved foveal Region
Of Interest (ROI) VR simulators to
gaze-contingent geometric modeling - Examples
- flight simulators (peripheral display
degradation) - driving simulators (driver testing)
- gaze-based dynamic Level Of Detail modeling
- virtual terrains
138L Graphics and Virtual Reality (contd)
- Fig.52 Gaze-contingent Martian terrain
- subdivided quad mesh
- per-block LOD
- resolution level based on viewing direction and
distance
139L Graphics and Virtual Reality (contd)
140L Graphics and Virtual Reality (contd)
141L Graphics and Virtual Reality (contd)
142L Graphics and Virtual Reality (contd)
143M Human-Computer Interaction and Collaborative
Systems
- Applications range from eye-based interactive
systems to collaboration - Examples
- intelligent gaze-based informational displays (text scroll window synchronized to gaze)
- self-disclosing display where digital characters responded to the user's gaze (e.g., blushing)
- multiparty VRML environments
144M Human-Computer Interaction and Collaborative
Systems (contd)
- multiparty tele-conferencing and document sharing system
- images rotate to show gaze direction (who is talking to whom)
- document lightspot (a deictic "look at this" reference)
- Fig.53 GAZE Groupware display
Fig.54 GAZE Groupware interface
145 Eye-Based Interaction in Graphical Systems: Theory & Practice
- For further information
- http://www.vr.clemson.edu/eyetracking
- SIGGRAPH course notes
- Eye Tracking Research & Applications Symposium
146 Don't forget to attend
- Eye Tracking Research & Applications Symposium 2000
- November 6th-8th, 2000, Palm Beach Gardens, FL, USA
- Sponsored by ...
- With corporate sponsorship from ...
- http://www.vr.clemson.edu/eyetracking/et-conf/
147 Eye-Based Interaction in Graphical Systems: Theory & Practice
- Demonstration
- GAZE Groupware System