Title: Comments on WP
1. Comments on WP 3
Working Paper No. 13, 21 November 2005
STATISTICAL COMMISSION and UN ECONOMIC COMMISSION FOR EUROPE, CONFERENCE OF EUROPEAN STATISTICIANS
STATISTICAL OFFICE OF THE EUROPEAN COMMUNITIES (EUROSTAT)
WORLD HEALTH ORGANIZATION (WHO)
Joint UNECE/WHO/Eurostat Meeting on the Measurement of Health Status (Budapest, Hungary, 14-16 November 2005)
Session 3
Discussant: Ian McDowell, University of Ottawa, Canada
2. Clarify Purpose: Description or evaluation?
Design implications of each:
- Descriptive
  - Broad ranging. Goal: to classify groups
  - Themes of interest to people in general (quality of life, etc.); issues of public concern
  - To debate: emphasize modifiable themes?
  - To debate: profile rather than index?
- Evaluative
  - Content tailored to intervention; usually not comprehensive
  - Needs to be sensitive to change produced by particular intervention
  - Focused and fine-grained: select indicators that sample densely from relevant level of severity; unidimensional
  - ? Emphasis on summary score
Discussion point: does the proposed instrument need to serve as an evaluative measure?
3. Purpose, Performance and Capacity
[Diagram. Labels: Analytic purposes; Descriptive purposes; Potential; Unmet needs; Capacity (with any aids); Environment; Needs that have been met; Current picture; Performance; Capacity (without aids)]
4. Parsimony, Sensitivity and Specificity
- These are in tension! Need for brevity implies:
  - If goal is to have broad coverage of domains (descriptive measure), there can only be few items in each
  - To achieve breadth within a domain in few items, we need to use generic items (e.g., the infamous "can you cut your toenails?")
  - This can achieve sensitivity as a screen, but at cost of low specificity: cannot classify type of condition
  - Will also lose interpretability and unidimensionality
- Point 38: the WP discussion of physical function illustrates choice between measuring overall vs. specific functions. Do we care whether it's knee pain, or muscle weakness, or balance that limits walking ability?
5. Unidimensionality (point 11)
- IRT goal of unidimensionality is hard to apply in many areas of health measurement. Some topics are hierarchical; symptoms of depression (e.g.) are not, so in IRT analyses, depression or anxiety scales often do not meet the unidimensionality criterion
- Unidimensionality is chiefly important for clinical interpretation and maybe evaluation; not the issue here. Surveys focus on how bad it is, not what it is
- If instrument will be scored as an index, the issue of unidimensionality becomes irrelevant, as all the items are combined and it's impossible to visualize the person's disability anyway
- There is an inherent tension between using generic, screening-type items (e.g., IADLs) and unidimensionality
- Many functions involve more than one body system (e.g., recognizing a face across the street), so are not unidimensional
6. The Time Frame Debate
- WP 1 says "present"; WP 3 much broader (and varied)
- If sample is large, could use "yesterday" to get prevalence, but will not tell incidence, or duration of condition
- Duration requires additional questions, as does change
- Width of time window not very important: average is just calculated over a shorter or longer time
- Suggest one week (to capture week-ends, etc.) or else yesterday (as today is incomplete)
[Diagram: cases A, B and C shown against a sampling window, marked "Problem!"]
Change only captured if additional questions asked, so can't distinguish A from B
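The point about A and B can be sketched with two invented respondents. The daily scores, the names `person_a` and `person_b`, and the 0-10 severity scale are all assumptions made for illustration; they are not taken from WP 3 or read off the slide. The sketch shows that a stable state and a worsening state can give the same window average, so the average alone does not reveal change.

```python
from statistics import mean

# Hypothetical daily severity scores (0 = none, 10 = worst) over a
# one-week sampling window. Both series are invented for illustration.
person_a = [5, 5, 5, 5, 5, 5, 5]   # stable condition
person_b = [2, 3, 4, 5, 6, 7, 8]   # steadily worsening condition


def window_average(scores):
    """Average severity over the whole sampling window."""
    return mean(scores)


def change_over_window(scores):
    """Crude change indicator: last day minus first day.

    This stands in for the additional questions a survey would need
    to ask in order to capture change.
    """
    return scores[-1] - scores[0]


avg_a, avg_b = window_average(person_a), window_average(person_b)
chg_a, chg_b = change_over_window(person_a), change_over_window(person_b)

print(f"A: average={avg_a}, change={chg_a}")
print(f"B: average={avg_b}, change={chg_b}")
```

Both respondents average 5 over the week, but only the second has changed. Last-minus-first is just one possible change indicator; a real survey would more likely ask the respondent directly.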
7. Time Window and Response Shift
- (Point 13) Larger time windows, and phrasing in terms of "usual", can face issue of response shift (recalibration of person's view of what is normal)
- "Usual" phrasing seems most problematic: may miss chronic disabilities (cf. criticism of GHQ); cannot record incidence, maybe not even prevalence
[Diagram: Response Shift. Curves: "Perception of usual function" and "Actual trajectory". Note: typical delay varies according to a range of factors]
8. Continuous States vs. Episodic Events
- Mobility limitations often endure. By contrast, pain, anxiety or marital disputes are commonly episodic
- Averaging over broad time-window can be an issue for the episodic events (point 15), because:
  - Averaging episodes raises issue of frequency vs. intensity of events (see next slide)
- In general, time averaging is less of an issue for capacity than for performance, because capacity is enduring, while performance may fluctuate
- However, the notion of capacity is hard to apply to pain, anxiety and depression (in which wording a question in capacity terms tends to approximate performance)
9. Combining Severity and Frequency (e.g., anxiety questions, point 76; pain, point 97)
[Diagram: two patterns over time compared ("versus"), marked "?"]
- Risk of trying to do too much. The problem of summarizing frequency and severity grows with increasing length of retrospection. If "yesterday" is used, you need only ask about severity
- The term "level" ("How would you describe your level of anxiety?") is unclear: presumably some combination of severity and frequency of episodes, but how does respondent combine these?
- Options. PhD level: "We want you to judge the overall amount of pain, considering both intensity and frequency, you have experienced." Simpler: "How bad was your pain? Mild, moderate, severe"
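A small sketch may make the combination problem concrete. Everything in it is assumed for illustration: the two pain diaries, the 1-3 severity coding, the function name `summarize`, and in particular the rule of multiplying frequency by mean severity, which is only one of many ways a respondent might combine the two.

```python
# Hypothetical pain diaries over the same recall period.
# Each entry is the severity (1 = mild, 2 = moderate, 3 = severe) of
# one episode; days without pain are not listed.
few_severe = [3, 3]                 # two severe episodes
many_mild = [1, 1, 1, 1, 1, 1]      # six mild episodes


def summarize(episodes):
    """Return (frequency, mean severity, frequency * mean severity)."""
    frequency = len(episodes)
    mean_severity = sum(episodes) / frequency if frequency else 0.0
    # Assumed combination rule; respondents may well use a different one.
    burden = frequency * mean_severity
    return frequency, mean_severity, burden


print(summarize(few_severe))
print(summarize(many_mild))
```

Under this rule both diaries yield the same combined value, although one describes rare severe pain and the other frequent mild pain. A single "level" rating cannot separate the two patterns.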
10. Response options: Frequency vs. Difficulty (point 30)
- For chronic conditions, evidently intensity responses are more appropriate
- For fluctuating conditions (insomnia, depression), frequency seems most appropriate
- If brief recall periods, use intensity responses
- For longer-term recall, use frequency
- Also, need to decide on relative vs. absolute responses. E.g., "do you have difficulty keeping up with people your own age?"
- Likewise, do we specify "level ground" for walking, or "where you live"? The first is close to disability and may not be relevant to them; the second (handicap) will be relevant but may make direct comparisons difficult
11. Discuss Structure of Overall Instrument
- Can it be made dynamic? Item banking, tailored responses, computer administration or using skip patterns. Some examples:
  - Cella: http://outcomes.cancer.gov/conference/irt/cella_et_al.pdf
  - www.amIhealthy.com
  - Ware JE et al. Item banking and the improvement of health status measures. Quality of Life Newsletter 2004; Fall (Special Issue): 2-5.
  - Bjorner JB et al. Using item response theory to calibrate the Headache Impact Test (HIT) to the metric of traditional headache scales. Qual Life Res 2003; 12: 981-1002
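As a rough illustration of the skip-pattern idea only (not of IRT-based item banking, which needs calibrated item parameters), here is a minimal sketch. The item identifiers, item wordings and the `administer` function are hypothetical and are not drawn from WP 3 or from the instruments cited above.

```python
# Hypothetical mini-instrument: a generic screening item gates more
# specific follow-up items. Item wording and identifiers are invented.
ITEMS = {
    "walk_screen": "Do you have any difficulty walking?",
    "walk_distance": "Can you walk 500 metres without resting?",
    "walk_cause": "Is the difficulty mainly due to pain, weakness or balance?",
    "pain_screen": "Did you have any pain yesterday?",
    "pain_severity": "How bad was your pain? Mild, moderate, severe",
}

# Skip pattern: follow-ups are asked only if the screen is answered "yes".
FOLLOW_UPS = {
    "walk_screen": ["walk_distance", "walk_cause"],
    "pain_screen": ["pain_severity"],
}


def administer(answers):
    """Return the item ids asked, in order, given a respondent's answers.

    `answers` maps item id -> response; only the screening responses
    affect routing in this sketch.
    """
    asked = []
    for screen, follow_ups in FOLLOW_UPS.items():
        asked.append(screen)
        if answers.get(screen) == "yes":
            asked.extend(follow_ups)
    return asked


# A respondent with no walking difficulty but some pain skips the
# walking follow-ups and is asked three items instead of five.
asked = administer({"walk_screen": "no", "pain_screen": "yes"})
for item_id in asked:
    print(ITEMS[item_id])
```

The design choice is the usual one for skip patterns: generic screens keep the instrument short for most respondents, while the specific follow-ups recover some of the specificity for those who screen positive.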
12. Reference for upper level of function
- Best possible function
- Compared to your potential
- Compared to average person of your age
- Without difficulty
- To adjust for age or not?
13. Prosthetics, Analgesics, etc. (points 20-25)
- Rocks and hard places
- Without aids approximates impairment; with aids, disability
- But this distinction is hard to make in ICF: activity and participation both sound like performance rather than capacity
- Not quite clear why eye glasses are singled out for inclusion, while walking sticks apparently are not
- Asking an amputee about mobility without his prosthesis seems artificial (point 21)
- Likewise, if they are taking effective analgesics, it's hard for them to report pain without (points 24 and 25)
- If purpose is to indicate health states in this nation, suggest the approach of "using any aids you normally use"
- Suggest not relying on use of analgesics as way to indicate severity (point 22), because availability will vary greatly
14. Visual Analogue Scales
- In clinical settings, VAS and NRS pain ratings intercorrelate highly. Verbal scales correlate with both, but less closely
- VAS is visual, so implies use of paper and pencil
- If used in telephone format, VAS reduces to a NRS, so why not just use NRS?
- Less educated and older patients appear to find NRS easier than VAS, so these have been endorsed for use in cancer trials (Moinpour et al., J Natl Cancer Inst 1989; 81: 485-495)
- The FLIC began with VAS, but changed to 6-pt NRS
- However, the VAS can be very responsive (e.g., Hagen et al., J Rheumatol 1999; 26: 1474-1480). But do we need responsiveness?
- Many alternative formats, including graphic rating scale (Dalton et al., Cancer Nurs 1998; 21: 46-49) or box scale (Jensen et al., Clin J Pain 1998; 14: 343-349). See also Cella and Perry, Psychol Rep 1986; 59: 827-833, and Scott and Huskisson, Pain 1976; 2: 175-184.
15. Anxiety and Depression
- Trying to discriminate between these may focus attention on the trees rather than the forest
- Unitary theory sees A and D as expressions of the same pathology; the opposing perspective sees them as fundamentally different, while the compromise is to view them as having common roots but different expressions (Brown et al., J Abnorm Psychol 1998; 107: 179-192).
- Anxiety suggests arousal and an attempt to cope with a situation; depression suggests lack of arousal and withdrawal: the NE and SE quadrants of the diagram (next slide)
- An anxious person might say "That terrible event is not my fault but it may happen again, and I may not be able to cope with it but I've got to be ready to try." A depressed person might say "That terrible event may happen again and I won't be able to cope with it, and it's probably my fault anyway so there's really nothing I can do." (Barlow DH. The nature of anxiety: anxiety, depression, and emotional disorders. In: Rapee RM, Barlow DH, eds. Chronic anxiety: generalized anxiety disorder and mixed anxiety-depression. New York: Guilford, 1991: 1-28)
16. A circumplex model of affect
[Diagram: circumplex with eight poles, each with example descriptors; Anxiety and Depression are also marked]
- High positive affect: active, elated, excited
- Strong engagement: aroused, astonished, concerned
- High negative affect: distressed, fearful, hostile
- Unpleasantness: sad, lonely, withdrawn
- Low positive affect: sluggish, dull, drowsy
- Disengagement: inactive, still, quiet
- Low negative affect: relaxed, calm, placid
- Pleasantness: content, happy, satisfied
17. Emotions and Affect: scattered thoughts
- How to fit affect within capacity / performance distinction? Many anxiety questions use either state or performance wordings ("How severe was your anxiety?" or "Did anxiety limit your daily activities?")
- Why try to distinguish anxiety and depression?
- Not completely clear why we need both positive and negative affect (point 68): if time frame correctly chosen, they should not be orthogonal
- Phrase such as "upset" or "distressed" may capture general affect quite well
- Stress may also be pertinent: cf. DASS of Lovibond (Manual for the Depression Anxiety Stress Scales. Sydney: Psychology Foundation, 1995)