Title: Identifying and Selecting Measures for Health Disparities Research
1 Identifying and Selecting Measures for Health
Disparities Research
- Anita L. Stewart, Ph.D.
- University of California, San Francisco
- Clinical Research with Diverse Communities
- EPI 222, Spring
- May 8, 2007
2 Organization of Class 5
- How to select measures for your own studies of diverse populations
- Measurement issues in research with diverse populations
4 Selecting Measures for Your Own Study: The Problem
- You are beginning a study
- You know the concepts (variables) of interest
- Question: Which measure of ________ should I use?
- A popular measure
- One that a colleague used successfully
- Create your own
5 Inappropriate Measures Can Result in
- Conceptual inadequacy
- Measuring the wrong concept for your study
- Poor data quality (e.g., missing data)
- Poor variability
- Poor reliability and validity
- Inability to detect true associations
- e.g., no measured change in outcome when change occurred
6 Two Types of Considerations in Selecting Measures
- Contextual: factors unrelated to specific measurement tools
- Characteristics of target population
- Goals of research
- Practical constraints
- Psychometric: properties of measures in your context
7 Basic Steps in Selecting Appropriate Measures
- 1. Specify context
- 2. Define concept for your study
- 3. Locate potential measures for consideration
- 4. Review potential measures for
- a) conceptual match to your definition
- b) adequate psychometric properties in your target group
- 5. Pretest potential measures in your target group
- 6. Choose best ones based on pretest results OR
- 7. Adapt if necessary to address problems
8 Step 1: Specify Context
- Research question and how concept fits research (outcome, predictor, covariate)
- Nature of target population (health, age, SES, race/ethnicity, literacy)
- Practical constraints (time, personnel, budget, respondent burden)
9 Step 2: Define Each Concept for Your Study
- Define each concept from your perspective, taking into account your
- study questions
- target population
- For outcomes
- describe how the intervention or independent variables might affect it
- describe specific types of changes you expect
10 Step 3: Locate Potential Measures
- Identify candidate measures for all domains or concepts in your framework
- For health outcomes
- generic or condition-specific profiles of multiple domains OR measures of single domains
- Redundancy OK for now
- Do NOT develop your own questions unless it is absolutely necessary
11 Locating Measures
- Compendia
- Web
- Organizations
- National surveys
- Large research studies
- Many other sources
12 Locating Measures: Compendia
- Compendia of measures (reviewed)
- Books that compile various measures and review their characteristics (e.g., McDowell and Newell)
- Web sources of compendia
- HaPI (Health and Psychosocial Instruments)
13 Locating Measures: Organizations
- RAND publishes measures, scoring manuals, and lists citations
- Measures of quality of care, patient satisfaction
- Measures of health-related quality of life
- Generic and disease-specific
- All Medical Outcomes Study measures
- www.rand.org/health/
- Go to surveys and tools
14 Inter-University Consortium for Political and Social Research
- http://www.icpsr.umich.edu/
- Maintains archive of social science data
- Membership-based organization: over 500 member colleges/universities
- UCSF is a member
- Can search website using keywords to locate studies, data, and questionnaires
15 MacArthur Research Network on Socioeconomic Status and Health
- Has measures in several domains
- Psychosocial
- Social and physical environment
- Socioeconomic status (SES)
- SES across the lifecourse
- MacArthur Network on SES and Health
- www.macses.ucsf.edu/research/overview.htm
16 Examples of Psychosocial Measures
- Anxiety
- Coping
- Depression
- Discrimination
- Hostility
- Optimism/pessimism
- Personal control
- Psychological stress
- Purpose in life
- Self-esteem
- Social support
- Vitality and vigor
http://www.macses.ucsf.edu/Research/wgps.htm
17 Examples of Environmental Measures
- Individual level
- Economic status
- Occupation
- Education
- Sociodemographic characteristics
- Neighborhood level
- Residential segregation
- Physical environment
- Stress in environment
- Availability of healthy foods
http://www.macses.ucsf.edu/Research/Social%20Environment/chapters.html
18 Ottawa Health Research Institute: Goals
- Explore ways to help patients make "tough" healthcare decisions (multiple options, uncertain outcomes, benefits/harms that people value differently)
- Measure and understand decision support needs of people and patients, particularly disadvantaged groups, and their providers
- e.g., decisional conflict, decision self-efficacy, stages of decision making
- www.decisionaid.ohra.ca/eval.html
19 Locating Measures: National and State Surveys
- National Center for Health Statistics
- Surveys and data collection systems
- NHIS, NHANES, HHANES, BRFSS, CHIS, MEPS
- Entire surveys can be downloaded
- Seldom includes measurement information
20 National Center for Health Statistics
- National Health Interview Survey
- National Health and Nutrition Examination Survey
- National Health Care Survey
- Ambulatory health care data (NAMCS)
- National Home and Hospice Care Survey
- National Survey of Family Growth
- National Maternal and Infant Health Survey
- Longitudinal Studies of Aging (LSOA)
- www.cdc.gov/nchs/express.htm
21 Locating Measures: National Agencies
- Agency for Healthcare Research and Quality (AHRQ)
- National Quality Measures Clearinghouse
- Consumer Assessment of Health Plans Survey (CAHPS)
- National Healthcare Quality Report
- National Healthcare Disparities Report
22 Locating Measures: National Agencies
- National Institutes of Health
- NCI has a measurement initiative
- Health Information National Trends Survey (HINTS)
- Working group compiled cancer screening questions, identified best ones, conducted extensive pretesting, cognitive interviewing
- Measures are on the NCI web site
23 Locating Measures: National Agencies
- US Dept. of Veterans Affairs, National Center for PTSD
- Has assessment section of web site
- Variety of instruments to measure stress and trauma exposure
- www.ncptsd.va.gov/ncmain/assessment/
24 Locating Measures: Large Studies and Centers
- Large research studies on similar topic
- Health ABC, CARDIA, WHI, EPESE
- Projects/Centers
- Toolkit to measure end of life care (TIME)
- Stanford Patient Education Research Center
- Michigan Diabetes Research and Training Center
- Resource Centers for Minority Aging Research
25 Locating Measures: Bibliographic Searches
- Published measurement articles
- Medline searches
- MeSH headings or keywords
- health status indicators
- outcome assessment (health care)
- psychometrics (methods)
- questionnaires
26 Locating Measures: Finding Authors of Measures
- Published research using measure you are interested in
- Unpublished measures often described in methods
- Authors may provide measures
27 Step 4: Review Potential Measures for
- Conceptual appropriateness, relevance
- in your study
- in target group
- Psychometric adequacy in target group or groups
- Practicality
- Acceptability
- To respondents and interviewers
28 Conceptual Relevance
- Example: you are interested in reports of perceived discrimination in the health care setting
- In reviewing measures of discrimination, most are about
- Discrimination over the lifecourse
- Discrimination in various life settings (work, school)
- Not relevant for your purpose
29 Concept Depicted as a Measurement Model
- Measurement model
- the dimensional structure of a measure
- how the items relate to the construct
- Can be depicted as a list or visually
30 Measurement Models
- Physical Functioning (4 items)
- Psychological Distress (7 items)
31 Measurement Model (List format)
- Physical Functioning defined in terms of
- Walking
- Climbing stairs
- Bending
- Reaching
32 Measurement Model (Visual format)
[Figure: Physical Functioning as a single construct with four indicators: Walking, Climbing Stairs, Bending, Reaching]
33 Measurement Model (List format)
- Psychological distress
- Depression
- Sad
- Lost interest
- Can't get going
- Anxiety
- Restless
- Nervous
34 Measurement Model (Visual format)
[Figure: Psychological Distress with two dimensions: Depression (Sad, Lost interest, Can't get going) and Anxiety (Restless, Nervous)]
35 Psychometric Adequacy for Your Study
- In samples similar to yours
- good variability
- low percent of missing data
- good reliability
- good validity
- As an outcome for your planned intervention
- responsive, sensitive to change in similar population
- able to detect expected magnitude of change
36 Good Variability
- All (or nearly all) scale levels are represented
- Distribution approximates bell-shaped normal
- No floor or ceiling effects
- Floor or ceiling effect: scores bunched at either end
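A minimal sketch of how these variability checks might be run on pretest data, assuming integer scale scores. The function names, the example scores, and the 0-10 scale range are hypothetical and do not come from any particular instrument.

```python
from collections import Counter


def floor_ceiling(scores, scale_min, scale_max):
    """Return (percent at lowest possible score, percent at highest possible score)."""
    n = len(scores)
    if n == 0:
        raise ValueError("scores must not be empty")
    pct_floor = 100.0 * sum(1 for s in scores if s == scale_min) / n
    pct_ceiling = 100.0 * sum(1 for s in scores if s == scale_max) / n
    return pct_floor, pct_ceiling


def unused_levels(scores, scale_min, scale_max):
    """Return the integer scale levels that no respondent used."""
    counts = Counter(scores)
    return [level for level in range(scale_min, scale_max + 1) if counts[level] == 0]


# Hypothetical pretest scores on a 0-10 scale (illustration only)
scores = [10, 10, 9, 10, 8, 10, 7, 10, 10, 6]

pct_floor, pct_ceiling = floor_ceiling(scores, 0, 10)
print("Percent at floor:", pct_floor)
print("Percent at ceiling:", pct_ceiling)
print("Scale levels never used:", unused_levels(scores, 0, 10))
```

A large share of respondents at the top score, with the lower half of the scale unused, would suggest a ceiling effect in this sample.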
37 Reliability
- Extent to which an observed score is free of random error
- Population-specific reliability increases with
- sample size
- variability in scores (dispersion)
- a person's level on the scale
38 Reliability Coefficient
- Typically ranges from .00 to 1.00
- Higher values indicate better reliability
- Types of reliability tests
- Internal-consistency
- Test-retest
- Inter-rater
- Intra-rater
39 Internal Consistency Reliability: Cronbach's Alpha
- Requires multiple items measuring same construct
- Extent to which items measure same construct (same latent variable)
- It is a function of
- Number of items
- Average correlation among items
- Variability in your sample
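A minimal sketch of the standard Cronbach's alpha computation, alpha = k/(k-1) * (1 - sum of item variances / variance of total score), assuming complete data (no missing items). The standardized form k*r/(1 + (k-1)*r) is included to show the dependence on number of items (k) and average inter-item correlation (r). The response data and function names are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def standardized_alpha(k, mean_r):
    """Alpha implied by k items with average inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


# Hypothetical responses: 5 respondents x 3 items (illustration only)
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 4],
    [3, 3, 2],
]

alpha = cronbach_alpha(responses)
print("Cronbach's alpha:", round(alpha, 3))

# Same average correlation, more items: alpha rises
print("4 items, r = .30:", round(standardized_alpha(4, 0.30), 3))
print("10 items, r = .30:", round(standardized_alpha(10, 0.30), 3))
```

Because alpha depends on the variance of scores in the sample at hand, the same items can yield a different alpha in a more homogeneous group.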
40 Minimum Standards for Internal Consistency Reliability
- For group comparisons (e.g., regression, correlational analyses)
- .70 or above is minimum
- .80 is optimal
- above .90 is unnecessary
- For individual assessment (e.g., treatment decisions)
- .90 or above (.95) is preferred
JC Nunnally, Psychometric Theory, McGraw-Hill, 1994
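One way to apply these cutoffs when screening candidate measures is sketched below. The function name and the "group"/"individual" labels are hypothetical; the thresholds are the minimums stated on this slide.

```python
# Minimum internal-consistency values from the slide above
MINIMUM_ALPHA = {
    "group": 0.70,       # group comparisons (regression, correlational analyses)
    "individual": 0.90,  # individual assessment (treatment decisions)
}


def meets_reliability_standard(alpha, purpose):
    """Return True if alpha reaches the minimum for the stated purpose."""
    if purpose not in MINIMUM_ALPHA:
        raise ValueError("purpose must be 'group' or 'individual'")
    return alpha >= MINIMUM_ALPHA[purpose]


# A hypothetical reported alpha of .75
print(meets_reliability_standard(0.75, "group"))
print(meets_reliability_standard(0.75, "individual"))
```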
41 Validity
- Does a measure (or instrument) measure what it is supposed to measure?
- And does a measure NOT measure what it is NOT supposed to measure?
42 Validation of Measures is an Iterative, Lengthy Process
- Validity is not a property of the measure
- validity is a property of a measure for a particular purpose and sample
- validation studies for one purpose and sample may not serve another purpose or sample
- Accumulation of evidence
- Different samples
- Longitudinal designs
43 Three Major Forms of Measurement Validity
- Content
- Criterion
- Construct
44 Construct Validity: Basics
- A process of answering the following questions
- What is the hypothesis?
- What are the results?
- Do the results support (confirm) the hypothesis?
45 Construct Validity: NOTE
- Sometimes the hypothesis is that the measure will NOT be correlated with certain other measures, or will be less correlated with some than with others
- THUS, observing a low or non-significant correlation can confirm construct validity
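A minimal sketch of this kind of hypothesis check, assuming the hypothesis is "the new measure correlates more strongly with a conceptually related measure than with an unrelated one." All variable names and data are hypothetical, and a real validation study would also consider sample size and statistical significance.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def supports_hypothesis(r_related, r_unrelated):
    """True if the related measure correlates more strongly than the unrelated one."""
    return abs(r_related) > abs(r_unrelated)


# Hypothetical scores for six respondents (illustration only)
new_distress_scale = [2, 5, 3, 8, 6, 1]
established_depression_scale = [3, 6, 3, 9, 5, 2]   # conceptually related
height_cm = [170, 165, 168, 171, 166, 169]          # conceptually unrelated

r_related = pearson_r(new_distress_scale, established_depression_scale)
r_unrelated = pearson_r(new_distress_scale, height_cm)

print("r with related measure:", round(r_related, 2))
print("r with unrelated measure:", round(r_unrelated, 2))
print("Results support hypothesis:", supports_hypothesis(r_related, r_unrelated))
```

Here the low correlation with the unrelated measure is part of the confirming evidence, not a failure.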
46 Limited Data on Psychometric Properties of Many Measures
- Not easy to find this information
- Many studies do not report any psychometric properties
- They assume the properties from the original study carry over
47 Limited Data on Psychometric Properties of Many Measures (cont.)
- Especially in diverse populations
- Few studies test measures across diverse groups
- Even when diverse groups are included in research
- sample sizes usually too small to conduct measurement studies by subgroups
48 Review Measures for Practicality
- Method of administration appropriate for your study
- Costs of administration within study resources
- Scoring rules clearly documented
- Measure available at cost you can afford
- You are allowed to adapt it if necessary
- Translations available if needed
49 Practical - Scoring
- Know ahead of time how to score items
- Count of correct answers? Summated scale? Weighted?
- Are scoring instructions or computer scoring programs available?
- Can scoring programs be purchased from developers?
- Do you have a scoring codebook?
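A sketch of what a documented scoring rule for a summated scale might look like in code. The item names, the 1-5 response range, the reverse-scored item, and the "at least half the items answered, then prorate" missing-data rule are all assumptions for illustration; always follow the developer's scoring manual for a real instrument.

```python
def score_summated_scale(responses, reverse_items=(), item_min=1, item_max=5,
                         min_answered_fraction=0.5):
    """Score a summated scale from a dict of item name -> response (or None if missing).

    Returns a prorated sum (mean of answered items times number of items),
    or None if too few items were answered. The proration rule is an
    assumption, not any instrument's official rule.
    """
    answered = {}
    for item, value in responses.items():
        if value is None:
            continue  # missing response
        if not item_min <= value <= item_max:
            raise ValueError(f"{item}: response {value} outside {item_min}-{item_max}")
        if item in reverse_items:
            value = item_min + item_max - value  # reverse-code so high = more
        answered[item] = value

    if len(answered) < min_answered_fraction * len(responses):
        return None  # too much missing data to score

    mean_item = sum(answered.values()) / len(answered)
    return mean_item * len(responses)


# Hypothetical respondent: q2 is reverse-scored, q3 is missing
respondent = {"q1": 4, "q2": 2, "q3": None, "q4": 5}
print(score_summated_scale(respondent, reverse_items={"q2"}))
```

Writing the rule down this explicitly before data collection makes it easier to confirm that reverse-scored items and missing-data handling match the codebook.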
50 Review Measures for Availability of Translations if Needed
- If you need the questionnaire in another language, are there translations available?
- Official (published and tested)
- Unofficial (by some other researcher)
- If not, you have to conduct translations
- Use state-of-the-art methods
51 Review Measures for Acceptability
- Acceptability to target population
- respondent burden (length, time needed, distress)
- culturally sensitive
- Acceptability to interviewers
- interviewer burden
- amount of training needed
52 Respondent Burden
- Diverse populations may have more difficulty with instruments, take longer to complete
- Perceived burden
- a function of item difficulty, distress due to content, perceived value of survey, expectations of length
- as important as time burden
53 Step 5: Pretest Potential Measures in Your Target Population
- Select best measures for all concepts in your conceptual framework
- existing instrument in its entirety
- subscales of relevant domains (e.g., only those that meet your needs)
54 Pretest
- Pretesting essential for priority measures (e.g., outcomes)
- Pretest is to identify
- problems with method of administration
- unacceptable respondent burden
- problems with questions or response choices
- Hard to understand, complex, vague
- words and phrases that do not mean to the target population what you intended
55 Types of Pretests
- General pretest, small (N = 10)
- Cognitive interviewing (N = 5-10 each group)
- Large pretest (N = 100)
- test measurement properties prior to major study
56 Conduct Pretests in All Diverse Groups Being Included in Your Study
- Important to recruit people from each of your target populations
- Won't learn anything if you just recruit friends, persons easy to recruit
57 Organization of Class 5
- How to select measures for your own studies of diverse populations
- Measurement issues in research with diverse populations
58 The Measurement Problem in Studies of Diverse Populations
- Measurement goal: identify measures that can be used across all diverse groups in your study, and
- are sensitive to diversity
- have minimal bias between groups
- Most self-reported measures were developed and tested in mainstream, well-educated groups
59 Typical Sequence of Developing New Self-Report Measures
- 1. Develop concept
- 2. Create item pool
- 3. Pretest/revise
- 4. Field survey
- 5. Psychometric analyses
- 6. Final measures
64 Extra Steps in Sequence of Developing New Self-Report Measures for Diverse Groups
Obtain perspectives of diverse groups
Develop concept
Create item pool
.. to reflect these perspectives
.. in all diverse groups
Pretest/revise
Field survey
.. in all diverse groups
Measurement studies across groups
Psychometric analyses
Final measures
65 Extra Steps in Sequence of Developing New Self-Report Measures for Diverse Groups
Obtain perspectives of diverse groups
Develop concept
Create item pool
.. to reflect these perspectives
.. in all diverse groups
Pretest/revise
Field survey
.. in all diverse groups
If results are non-equivalent
Psychometric analyses
Final measures
66 Measurement Adequacy vs. Measurement Equivalence
- Making group comparisons requires conceptual and psychometric adequacy and equivalence
- Adequacy: within a diverse group
- concepts are appropriate
- psychometric properties meet minimal criteria
- Equivalence: between diverse groups
- conceptual and psychometric properties are comparable
67 Conceptual and Psychometric Adequacy and Equivalence
- Conceptual
- Adequacy in 1 group: concept meaningful within one group
- Equivalence across groups: concept equivalent across groups
- Psychometric
- Adequacy in 1 group: psychometric properties meet minimal standards within one group
- Equivalence across groups: psychometric properties invariant (equivalent) across groups
69 Conceptual Adequacy in One Group
- Is concept relevant, meaningful, and acceptable in that group?
- Traditional research
- Conceptual adequacy means simply defining a concept
- Mainstream population assumed
- Minority and health disparities research
- Mainstream concepts may be inadequate
- Concept should correspond to how a particular group thinks about it
70 Example of Inadequate Concept
- Patient satisfaction typically conceptualized in mainstream populations in terms of, e.g.,
- access, technical care, communication, continuity, interpersonal style
- In minority and low income groups, additional relevant domains include, e.g.,
- discrimination by health professionals
- sensitivity to language barriers
72 Psychometric Adequacy in a Single Diverse Group
- Measures meet minimal psychometric criteria in (new) group
- Measures have similar measurement properties in diverse group as in original mainstream groups on which the measures were developed
73 Psychometric Adequacy in Any Group
- Minimal standards within a group
- Sufficient variability
- Minimal missing data
- Adequate reliability/reproducibility
- Evidence of construct validity
- Evidence of responsiveness to change
- Basic classical test theory approach
74 Why Not Use Culture-Specific Measures?
- Measurement goal is to identify measures that can be used across all groups, yet maintain sensitivity to diversity and have minimal bias
- Most health disparities studies require comparing mean scores across diverse groups
- need comparable measures
75 Group Comparisons Are the Most Problematic
- Disparities studies involve comparing mean levels of health or determinants
- Requires conceptual and psychometric equivalence; otherwise
- potential true differences may be obscured
- observed group differences may be inaccurate
76 Issues Concerning Group Comparisons
- Observed mean differences across groups in a measure can be due to
- culturally- or group-mediated differences in true score (true differences) -- OR --
- bias: systematic differences between group observed scores not attributable to true scores
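A small numeric illustration of this point, assuming the simplest possible model: observed score = true score + a constant group-level bias. The scores and the size of the bias are hypothetical; real bias is rarely a single constant.

```python
from statistics import mean


def observed_scores(true_scores, bias):
    """Observed score = true score + constant measurement bias (simplifying assumption)."""
    return [t + bias for t in true_scores]


# Two hypothetical groups with identical true scores
true_group_a = [10, 12, 14, 16]
true_group_b = [10, 12, 14, 16]

# Assume the measure is unbiased in group A and reads 2 points low in group B
observed_a = observed_scores(true_group_a, bias=0)
observed_b = observed_scores(true_group_b, bias=-2)

true_gap = mean(true_group_a) - mean(true_group_b)
observed_gap = mean(observed_a) - mean(observed_b)

print("True mean difference:", true_gap)
print("Observed mean difference:", observed_gap)
```

The observed gap here is produced entirely by bias. From observed scores alone, it cannot be told apart from a true difference of the same size.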
78 Conceptual Equivalence
- Is the concept relevant, familiar, acceptable to all diverse groups being studied?
- Is the concept defined the same way in all groups?
- all relevant domains included (none missing)
- interpreted similarly
- Is the concept appropriate for all diverse groups?
79 Bias - A Special Concern
- Measurement bias in any one group may make group comparisons invalid
- Bias can be due to group differences in
- meaning of concepts or items
- extent to which measures represent a concept
- cognitive processes of responding
- use of response scales
- appropriateness of data collection methods
80 Effects of Bias on Depression: Example in Chinese Respondents
- Three sources of bias tend to lower observed score
- tendency not to express negative feelings
- meaning of word "depression" in Chinese is more severe than for Whites
- Comparing groups: assume true level of depression is the same in both groups
- Observed scores lower in Chinese group due to these biases
81 Example of Effect of Biased Items
- 5 CES-D items administered to Black and White men
- 1 item subject to differential item functioning (bias)
- 5-item scale including the biased item suggested that Black men had higher levels of somatic symptoms than White men (p < .01)
- 4-item scale excluding biased item showed no differences between Black and White men
S Gregorich, Med Care, 2006;44:S78-S94.
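The pattern in this example can be sketched with made-up numbers: compare group means on the full scale and again with the suspect item removed. The data below are hypothetical and are not the CES-D data from the cited article. The sketch also does not test for differential item functioning; it only shows how one item can drive a group difference.

```python
from statistics import mean


def scale_scores(rows, exclude=()):
    """Sum each respondent's item scores, skipping item positions listed in exclude."""
    return [sum(v for i, v in enumerate(row) if i not in exclude) for row in rows]


# Hypothetical 5-item responses; the item at position 4 is the suspect item
group_1 = [[1, 0, 1, 1, 2], [0, 1, 0, 1, 2], [1, 1, 1, 0, 3]]
group_2 = [[1, 0, 1, 1, 0], [0, 1, 0, 1, 1], [1, 1, 1, 0, 0]]

diff_5_items = mean(scale_scores(group_1)) - mean(scale_scores(group_2))
diff_4_items = (mean(scale_scores(group_1, exclude={4}))
                - mean(scale_scores(group_2, exclude={4})))

print("Mean difference, 5-item scale:", diff_5_items)
print("Mean difference, 4-item scale:", diff_4_items)
```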
82 Summary
- Selecting best measures is critical to validity of research
- Very little published information on measurement properties in diverse groups
- Raises issues of conceptual and psychometric adequacy and equivalence
- Pretesting is the most important thing you can do
83 Summary (cont.)
- Methods described here are ideal
- Impractical for most researchers
- Apply these methods to your most important measures
- e.g., outcomes, key independent variables
- Keep learning
- Good, appropriate measures remain the foundation of excellent research
84 Want More?
- Epi 225: Measurement in Clinical Research
- Fall 2007
- Thursday 3:30-5, China Basin
- See handout