Title: Objective Evaluation of Subjective Decisions
1 Objective Evaluation of Subjective Decisions
- Mel Siegel, Huadong Wu
- Robotics Institute, School of Computer Science
- Carnegie Mellon University, Pittsburgh PA 15232, USA
- SCIMA-2003: Soft Computing Techniques in Instrumentation, Measurement and Related Applications
- Brigham Young University, Provo UT, USA
- 2003 May 17
2 outline
- background: problem of sensor fusion for context-aware computing
- approach: development of an adaptive weighted Dempster-Shafer (D-S) algorithm
- issue (the talk's title): objective evaluation of subjective decisions
- meta-issue: is it really an issue?
- discussion: receiver operating characteristic
- closing the loop: ROC ↔ D-S ?
3 background
- context detection for HCI
- e.g., your cell phone could ring louder if it could know it is in your briefcase
- context detection requires subjective evaluation of ordinary sensor signals
- sensor fusion required when we have multiple detectors, none of them very good
- sequence of algorithms culminates in an adaptively weighted Dempster-Shafer method
4 Focus-of-Attention decision by fusion of video and audio data
5 sensor fusion alternatives
1. complementary
3. cooperative
- Parametric template, Figures of merit, Syntactic pattern recognition
- Logical template, AI rule-based reasoning, Heuristic inference, Neural network
2. competitive
- Classic Inference
-- sensor i: P_i(x detected | x appeared); simple and effective for x vs. not-x problems
-- a priori knowledge and pdf are required to combine multiple sensor outputs; a priori assessments are not used; does not have enough reasoning power
- Voting Fusion
-- associates pdf with confidence estimation, and provides a way to predict the result probabilities of their boolean combinations
-- though a big improvement over the Classic Inference method, still not powerful enough to reason at fine granularity
- Bayesian Network
-- likelihood of a hypothesis is updated using a previous likelihood estimation and additional evidence
-- cannot distinguish between lack of belief and disbelief; cannot address a problem like "it's likely either user A or user B"
- Fuzzy Logic
-- no pdf required, very cheap in computation
-- it doesn't make sense that a person is assigned 0.6 membership of user A, 0.7 membership of user B, and 0.9 membership of either user A or B
- Neural Network
-- flexible, powerful, no pdf needed, cheap computational cost in the classification process
-- local minima problem; results cannot be easily explained; not suitable for dynamic configuration of sensors
6 our problem: Bayes can't do it
[figure: head pan (left, straight, right), sensor noise, observed pan (left, straight, right)]
7 approach: the Dempster-Shafer method
- a theory of evidence
- allows belief and plausibility
- quantifies both knowledge and ignorance
- a generalization/extension of Bayesian inference network
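A minimal sketch of how belief and plausibility bracket a hypothesis, assuming a two-element frame {L, R} (head turned left or right) and reusing the mass values L 0.3, R 0.6, LR 0.1 that appear on the next slide. The function names are hypothetical, chosen only for this illustration.

```python
def belief(masses, hypothesis):
    """Sum the masses of all non-empty focal sets contained in the hypothesis."""
    return sum(m for s, m in masses.items() if s and s <= hypothesis)


def plausibility(masses, hypothesis):
    """Sum the masses of all focal sets that intersect the hypothesis."""
    return sum(m for s, m in masses.items() if s & hypothesis)


L, R = frozenset("L"), frozenset("R")
LR = L | R  # "left or right": mass placed here expresses ignorance

# Mass assignment for one sensor; the masses sum to 1.
masses = {L: 0.3, R: 0.6, LR: 0.1}

bel_L = belief(masses, L)        # evidence committed to L
pl_L = plausibility(masses, L)   # evidence that does not contradict L
print(f"Bel(L) = {bel_L:.2f}, Pl(L) = {pl_L:.2f}")
```

The gap between Bel(L) and Pl(L) is the mass left on LR, which is how the method keeps "lack of belief" separate from "disbelief".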
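8 sensor fusion using classical Dempster-Shafer Theory of Evidence
          | L 0.3      | R 0.6      | LR 0.1
L 0.4     | L 0.4x0.3  | F 0.4x0.6  | L 0.4x0.1
R 0.5     | F 0.5x0.3  | R 0.5x0.6  | R 0.5x0.1
LR 0.1    | L 0.1x0.3  | R 0.1x0.6  | LR 0.1x0.1
(one sensor's masses run across the top, the other's down the side; each cell is the product of its row and column masses, assigned to the intersection of the two sets; F marks conflicting cells, where L and R have an empty intersection)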
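The table can be worked through numerically with Dempster's rule of combination. The sketch below takes each cell as the product of its row and column masses, pools the products by set intersection, and renormalizes by the non-conflicting mass. The function name `combine` and the variable names are assumptions made for this illustration.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule: multiply masses, pool by set intersection, renormalize."""
    pooled = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            pooled[common] = pooled.get(common, 0.0) + wa * wb
        else:
            conflict += wa * wb  # the "F" cells: L and R cannot both hold
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    fused = {s: w / (1.0 - conflict) for s, w in pooled.items()}
    return fused, conflict


L, R = frozenset("L"), frozenset("R")
LR = L | R

top = {L: 0.3, R: 0.6, LR: 0.1}    # masses across the top of the table
side = {L: 0.4, R: 0.5, LR: 0.1}   # masses down the side of the table

fused, conflict = combine(top, side)
print(f"conflict = {conflict:.2f}")
for name, s in (("L", L), ("R", R), ("LR", LR)):
    print(f"m({name}) = {fused[s]:.4f}")
```

With these numbers the conflicting mass is 0.24 + 0.15 = 0.39, so the surviving products are divided by 0.61.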
9 extension of Dempster-Shafer: evidence weighted by sensors' reliabilities
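The slide does not spell out the weighting formula. One common way to weight evidence by reliability is Shafer's discounting, sketched here as a stand-in rather than as the talk's actual scheme: each mass is scaled by the sensor's reliability and the remainder is moved to the whole frame (ignorance). The reliability value 0.8 and the function name are assumptions.

```python
def discount(masses, reliability, frame):
    """Shafer discounting: scale masses by reliability, move the rest to the frame."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    out = {s: reliability * m for s, m in masses.items() if s != frame}
    out[frame] = reliability * masses.get(frame, 0.0) + (1.0 - reliability)
    return out


L, R = frozenset("L"), frozenset("R")
LR = L | R

sensor = {L: 0.3, R: 0.6, LR: 0.1}
weighted = discount(sensor, reliability=0.8, frame=LR)  # 0.8 is an assumed value
for name, s in (("L", L), ("R", R), ("LR", LR)):
    print(f"m({name}) = {weighted[s]:.2f}")
```

A less reliable sensor thus ends up saying "I don't know" more often, so it pulls the fused result less strongly.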
10 further extension of Dempster-Shafer: weights change according to performance history
- overcomes sensor drift problem!
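How the weights follow performance history is not given on the slide. A minimal sketch, assuming each sensor's weight is an exponentially smoothed record of whether its past decisions later proved correct; the class name, the initial weight, and the smoothing rate are all hypothetical.

```python
class AdaptiveWeight:
    """Tracks one sensor's weight as a running average of past correctness."""

    def __init__(self, initial=0.5, rate=0.2):
        self.value = initial   # current weight, kept in [0, 1]
        self.rate = rate       # how quickly old history is forgotten

    def update(self, was_correct):
        """Move the weight toward 1 after a correct decision, toward 0 otherwise."""
        target = 1.0 if was_correct else 0.0
        self.value += self.rate * (target - self.value)
        return self.value


steady = AdaptiveWeight()
drifting = AdaptiveWeight()
for step in range(20):
    steady.update(True)
    drifting.update(step < 10)   # correct at first, then wrong once it drifts

print(f"steady sensor weight:   {steady.value:.3f}")
print(f"drifting sensor weight: {drifting.value:.3f}")
```

A weight of this kind could serve as the reliability applied to that sensor's evidence, so a sensor that has drifted loses influence without manual recalibration.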
11 an arbitrary effectiveness measure
12 generalizing via a simulation ...
[figure: head pan (left, straight, right), sensor noise, observed pan (left, straight, right)]
13 ... yields an intriguing result when sensor precisions are very different
14 the issue ...
- objective evaluation of subjective decisions
- a meta-issue: is it really an issue?
15 objective vs. (?) subjective
- in medicine the distinction is sharp
- subjective means what the patient tells the physician about his/her complaint, what he/she thinks is the problem, etc.
- objective means what the physician observes (and his/her instruments report) about the condition of the patient
- statisticians talk about rational gambling
- but in most contexts it feels fuzzier ...
16
- and even physicians make subjective decisions
- whose quality we can evaluate objectively!
                                          | patient really has SARS | patient really doesn't have SARS
physician says patient has SARS           | TRUE POSITIVE           | FALSE POSITIVE
physician says patient doesn't have SARS  | FALSE NEGATIVE          | TRUE NEGATIVE
17 receiver operating characteristic
- originally developed for target analysis
- considers ratio of signal to signal-plus-noise vs. the discriminator level set
- adopted and extensively developed in the medical diagnostic test community
- TP, TN ↔ signal; FP, FN ↔ noise
- most physicians understand a test's sensitivity TP/(TP+FN) and specificity TN/(TN+FP) vs. the chosen cut point of the test
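The two ratios can be computed directly from the four cells of the decision table. The tallies below are hypothetical, made up for 100 patients at a single cut point.

```python
def sensitivity(tp, fn):
    """Fraction of truly positive cases the test catches: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of truly negative cases the test clears: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical tallies, not data from the talk.
tp, fn, tn, fp = 18, 2, 72, 8

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Both numbers belong to one cut point only; moving the cut point changes all four tallies and therefore both ratios.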
18 ROC
- (dotted) ideal
- (dashed) useless
- reliable
- (b) typical
-- increasing cut point increases TPs (good) and FPs (bad)
-- decreasing cut point increases TNs (good) and FNs (bad)
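A ROC curve is traced by sweeping the cut point and recording one (false positive rate, true positive rate) pair per setting. The sketch below uses made-up scores and assumes a case is called positive when its score falls below the cut point, so that a higher cut point calls more cases positive; with the opposite convention the trends reverse. The function name `roc_points` is hypothetical.

```python
def roc_points(scores, truths, cuts):
    """Return one (false positive rate, true positive rate) pair per cut point."""
    positives = sum(truths)
    negatives = len(truths) - positives
    points = []
    for cut in cuts:
        # Assumed convention: a score below the cut point is called positive.
        calls = [s < cut for s in scores]
        tp = sum(c and t for c, t in zip(calls, truths))
        fp = sum(c and not t for c, t in zip(calls, truths))
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical data: truly positive cases tend to have lower scores.
truths = [True, True, True, True, False, False, False, False]
scores = [1, 2, 3, 5, 4, 6, 7, 8]
cuts = [0, 2.5, 4.5, 5.5, 9]

points = roc_points(scores, truths, cuts)
for cut, (fpr, tpr) in zip(cuts, points):
    print(f"cut {cut:>4}: FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

An ideal test would reach the top-left corner (FPR 0, TPR 1) at some cut point; a useless one stays on the diagonal where TPR equals FPR.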
19 closing the loop? ...
- Dempster-Shafer: belief, plausibility
- evidence that supports X
-- fever
-- white tongue
-- headache
- evidence that rules out X
-- no virus detected
-- had disease once before
-- over age 55
- ROC: TP, TN, FN, FP, cut point
20 conclusions / questions
- adaptive weighted D-S seems to contribute an incremental but real improvement in appropriate sensor fusion applications
- objective/subjective distinction is fuzzy
- maybe ROC and related cut point analysis techniques can help us set neural net, fuzzy system, etc., parameters that are now set either arbitrarily or iteratively (hence slowly)
- is the apparent connection between D-S and ROC superficial, or real at some deep level?