Quantitative Methods

1
Quantitative Methods
  • Introductory Overview & How-To Guide
  • by
  • Steven B. Auerbach, MD, MPH, FAAP
  • Health Resources and Services Administration
  • Deputy Director & Medical Epidemiologist
  • Office of Data Analysis
  • NorthEast Cluster (Regions I, II, III)
  • Room 3337, 26 Federal Plaza, New York, NY 10278
  • T 212-264-2550
  • F 212-264-2673
  • E sauerbach_at_hrsa.gov

2
What We Will Cover
  • Why you need to do Quantitative Methods
  • How & why it is feasible & practical
  • Reasons to do Quantitative Methods
  • Getting started organizationally
  • Ethics & IRBs
  • Format of Protocol
  • Types of Study Designs
  • Questionnaires and Measurement
  • Subjects & Sampling
  • Measurement Error, Validity & Reliability
  • Association & Causation
  • Descriptive Statistics
  • Inferential Statistics
  • References & Resources: Software, Books, Web Sites

3
Data Competencies
  • Ability to develop primary data-sets.
  • Ability to utilize secondary data-sets.
  • Ability to conduct statistical analyses.
  • Ability to use computer systems and packages.
  • The ability to conduct needs assessments.
  • The ability to develop program evaluation and
    research designs.
  • The ability to conduct economic analysis,
    cost-effectiveness/utility/benefit.
  • The ability to develop, and maintain, quality
    assurance, monitoring, tracking, and management
    information systems.

4
Ecology of Illness & Health

5
Why it is important for you to do
  • Need analysis in community health and primary
    care settings to find out what works in the real
    world.
  • Most biomedical research is set in university
    medical centers, and is too selective and too
    controlled (the idealized "ivory tower") to be
    directly applicable in real-world settings.
  • Much public health research is too
    large-scale and population-based, too broad to be
    directly applicable, programmatically or
    clinically, in real-world community
    health or primary care settings.
  • E.g., do the Guidelines or Best Practices
    established from research in other settings
    apply, and can they really be implemented, in your
    setting?

6
Why it is useful for you to do
  • Collect your own performance data.
  • Market yourself to policy makers, payers and
    patients.
  • Increased strength and autonomy of your
    organization.
  • Promotes horizontal linkages with similar
    organizations (network).
  • Promote vertical links with universities,
    foundations, advocacy groups.
  • Learn skills useful in a variety of applications;
    capacity building for organization and staff:
    grant writing, analytic thinking, etc.
  • Document that a given program is important,
    working, worth funding.
  • Document that program is cost-effective.
  • Document providing good care and outcomes.
  • Document effect of managed care.
  • Only way to know the program works & outcomes are
    good is to MEASURE it!
  • Best way to communicate your work & success is to
    PUBLISH it!

7
Doable even given limited resources & other
priorities
  • Academics, Industry & Others want to work with
    you
  • They need access to data, interesting questions,
    and possible answers.
  • You need funding, technical assistance, warm
    bodies to do the work.
  • Get Free Help
  • Medical, Nursing, Public Health, Economics,
    Sociology, Psychology
  • Profs & Grad Students: let their needs and skills
    work for you.
  • Make it Pay
  • Linkage, grants, cooperative agreements with a
    variety of third parties: Federal Agencies (NIH,
    CDC, AHRQ, HRSA), Disease Associations,
    Foundations, Drug Companies, Managed Care &
    Insurance Companies, as well as Universities &
    Academics.

8
Research and Publish are not dirty words
  • Might as well do it better & more formally so
    that its quality and meaning are better.
  • Turn the program assessment, QA/QI, you need to
    do anyway into an intervention or outcome study.
  • If you make it research and publish, it can pay for
    itself with additional funding to do your own
    performance data, program assessment, outcomes.
  • One person's assessment is another's research:
    Publish.
  • Publishing is how we document & communicate
    success to others.
  • Add to generalizable knowledge.
  • Promotes your program and yourself to others
  • Raise your profile to policy makers, payers,
    patients and providers
  • Publishing gets attention and funding.
  • Easier and better recruitment and retention.

9
Range of Reasons to do
  • Descriptive -- Describe a Phenomenon with
    Rates, Frequencies & Distributions
  • Needs Assessment
  • Quality Assurance
  • Satisfaction
  • Incidence and Prevalence, Vital Statistics
  • Knowledge, Attitudes, Beliefs and Behaviors
    (KABBS)
  • Explanatory -- Relationship Between Phenomena:
    Association & Correlation
  • Disease Risk: Case-Control and Cohort Studies
    of Classic Epidemiology
  • Predictive -- Implement Intervention, Program,
    Treatment, then Measure Effect
  • Quality Improvement
  • Program Evaluation
  • Clinical Outcome Measures
  • Cost Effectiveness and Cost Benefit
  • Disease Prevention
  • Health Promotion

10
Identify the Topic
  • Originate: from Personal Experience
  • from Societal Trends
  • from Professional Trends
  • from Previously Published Research
  • from Theory
  • Interesting: What about this topic is of interest
    to me and other investigators?
  • What about this topic is relevant to my
    immediate setting (boss, payers, community,
    colleagues, patients)?
  • Important: What about this topic is of broader
    social and professional importance?
  • How is it Novel, Important, Significant?
  • Is it new, or does it at least confirm/refute/extend
    prior findings?
  • Does it pass the "So What?" Test: once you get the
    result, Yes or No, now so what?
  • Answerable: Can it be known and answered
    concretely, as phrased, in principle?
  • Feasible: Knowable within the constraints of
    time, time frame to get results, actual
    setting, personnel, competing agendas, lost
    opportunity costs, ethics

11
Make it Useful
  • How did fact that Care occurred specifically in
    this setting make a difference?
  • How did fact that Study occurred specifically in
    this setting make a difference?
  • How does doing the study This Way make a
    difference?
  • What benefit does the Organization get?
  • What benefit do the participating Staff get?
  • What benefit do the Participant Subjects get?
  • What benefit does the Community at-large get?
  • Who are the potential Funders, and what benefit
    do they get?
  • How does doing this Study this Way, further
    support the other agendas of the organization?

12
Make it Doable
  • Health Departments, Community-based
    Organizations, NGOs
  • Collecting & Using Data, doing studies, are not
    the priority Agenda.
  • May not have adequate day-to-day and long-term
    incentives for doing.
  • May not have adequate support for doing.
  • Yet, still must do it.
  • Creative Tension between Doable and Useful
  • Useful, interesting, worth the effort
  • Quality work, results truly mean what they say
    (poor quality politics)
  • But not too burdensome given limits of staff,
    time, money, other priorities
  • Doable to success, given limitations.

13
Multiple Constituents & Agendas
  • Multiple differing agendas of people &
    institutions who are de facto involved
  • Who are ALL the participants?
  • Who is this going to help?
  • Who is being imposed upon, or potentially
    otherwise affected?
  • Direct
  • the Investigators
  • the Subjects
  • Others in Organization: Colleagues, Boss & Staff
  • Funder
  • those higher up with Responsibility or Authority
    (e.g., Politician).
  • Indirect
  • Family of Participants
  • Non-Participating Clients & Patients
  • Other Payers & Funders (cost-shifting, overhead)
  • Those affected by what didn't get done instead
    (lost-opportunity cost)

14
Format for Planning & Protocol
  • Objectives: What are the question(s) to be
    answered?
  • Background & Significance: Why are these
    questions interesting & important?
  • In general? In your specific setting? What
    is already known and not known?
  • Design: What is the study design & time frame?
  • Write out the process, step-by-step. Also
    Flow Chart as branching diagram
  • What are you going to do? How are you going
    to do it?
  • Subjects: Person, Place & Time Setting,
    Selection Criteria, Sampling Design.
  • Who are the subjects and how are they to be
    selected?
  • Variables: Predictors, Outcomes, Covariates &
    Confounders
  • What are the actual measurements to be made?
  • What are the data sources, from where &
    whom, and how are they to be obtained?
  • Statistics: Sample Size & Power Calculations.
  • How will data be Handled, Entered, Cleaned?
    Descriptive & Inferential?
  • Dummy Tables: how you will present the
    variables, just with results blank.

15
and for Presentation & Manuscript
  • Results: Present the numeric results without
    comment.
  • Tables, Graphs, Charts, Maps as well as Text.
  • Conclusions: What do the results mean (taken at
    face value)?
  • Should reference back to the original stated
    Objective(s).
  • Discussion: Strengths and Limitations of the
    Study
  • Place the study into specific context: your
    setting and those most like it.
  • Place the study into broad general context:
    Other Programs, US, Global
  • Should reference back to Background

16
Operations Manual
  • Protocol is WHAT you are going to do.
  • Operations Manual is HOW you are going to do it.
  • Instructions for others, for when (as if) you are
    not there.
  • Prior written instructions to yourself to avoid
    cheating (bias) once you are in the middle of it.
  • How to DO the protocol: operationalizes the protocol
    for other people doing it.
  • Organization & Policies: Who is doing what,
    where, reporting to whom
  • Instructions, Procedures and Rules for subjects,
    field workers & Investigators
  • Instructions, Procedures and Rules for each step
    in the process
  • Recruitment of Subjects
  • Definition of each variable
  • Handling of intervention
  • Data collection, quality control, handling,
    checks, entry, cleaning.

17
Ethics & IRBs, 1/2
  • It is research if it has the intent or result of
    increasing generalizable knowledge.
  • If just for internal assessment, then it is not
    research.
  • If you are going to publish, then it is research,
    and must be submitted to the IRB.
  • May be Exemptible, Expeditable, or require Full
    Review
  • But only the IRB can decide, not you: the investigator
    cannot exempt self, must submit.
  • Exempt: IRB Chairperson agrees it meets criteria, no
    other review.
  • Surveys, Interviews, Abstraction of existing records,
    Subjects over 18
  • Data collected in such a way that subjects
    cannot be identified
  • Truly anonymous without any back-linkage
    possible.
  • Not a Sensitive Subject: Sex, Drugs, Crime
  • Response cannot lead to legal liability,
    financial loss, decreased employability.
  • Expedite: IRB Chairperson may review themselves,
    no wait for committee.
  • Subjects are over 18, procedure is
    non-invasive and routine normal practice.

18
Ethics & IRBs, 2/2
  • Submit to IRB → Clear IRB +/- Revision → Pilot →
    Revise → Full Study
  • Submission to IRB includes full protocol,
    operations manual, data instruments.
  • Specify risks & benefits, selection procedure,
    written informed consent process.
  • Do not fear the IRB: consider it an expert consult to
    improve your project.
  • Re-think again
  • Is this Ethical? Doable? Worth doing?
  • Will you be able to answer the question, one way or
    the other?
  • Golden Rule still applies: do unto others as
    you would have them do unto you!
  • Would you participate, or let your child or other
    family participate?
  • If Intervention
  • What if you or your family were blindly
    randomized & got the intervention?
  • What if blindly randomized and happened to not
    get the intervention?

19
Defining Study Question
  • Focus & Commit: at each step, for each task,
    commit and put into writing
  • Research Topic: Lit search, Medline at
    http://gateway.nlm.nih.gov/gw/Cmd
  • Experts in area, either internal and/or
    external, for advisory panel.
  • What are the specific definable questions that
    the study will specifically answer?
  • Are you observing an existing situation, or
    measuring the result of an intervention?
  • Explain issue to be addressed, and goals of study,
    in a short paragraph.
  • State as a series of single-sentence questions.
  • State as hypothesis & null hypothesis: what would
    and would not be true.
  • State as short declarative phrases: "To
    determine" or "To measure"
  • What is the change in an outcome Y, if this X is
    done?
  • Operationalize study question into actual
    variables to be measured
  • Written case definition for each variable
  • If intervention, how is independent variable
    (intervention) manipulated & measured?
  • What are the actual sources of data & how
    gathered (e.g., interview, self-report)?

20
Types of Variable by Study Question
  • These are the actual things to be measured
  • They represent the Study Question and the Answer
  • Predictor or Independent Variable
  • If interventional then the intervention is the
    predictor or independent variable
  • Outcome or Dependent Variable
  • Does the outcome variable(s) actually being
    measured really correspond to and answer the
    original study question?
  • Might be intermediate or process or proxy for
    real outcome of interest.
  • Potential Covariates & Confounders
  • Analogous to weeds: these are the outcomes
    other than the one of interest to you.
  • Need to consider issues of Measurement Error,
    Reliability & Validity

21
Types of Variables by Measurement
  • Character or Categorical or Alphanumeric
    Variables
  • Represent mutually exclusive categories, labels
  • Might code as number
  • Two kinds with important real differences:
    Nominal & Ordinal
  • Nominal
  • There is no inherent rank order
  • e.g., Sex, Nationality, Blood type, Yes/No
  • Inherently statistically & methodologically
    weakest, least power.
  • Ordinal
  • Mutually exclusive categories, but with a rank
    order.
  • Represents relative position, but not real
    quantity or interval.
  • Cannot actually define distance between.
  • e.g., Excellent → Good → Fair → Poor or
    Private → Captain → General
  • Likert Scales: 1=Strongly Agree, 2=Agree,
    3=Neutral, 4=Disagree, 5=Strongly Disagree
  • Inherently intermediate in statistical and
    methodologic power.
  • Do not pre-code from numeric to ordinal if you
    don't have to (age to age-group).

22
Types of Variables by Measurement
  • Numeric
  • Really a number, can perform mathematical
    operations such as average.
  • Not a category that happens to be coded as a
    number.
  • Interval: distance between numbers has real
    meaning & intervals are equal.
  • Strongest statistically and methodologically.
  • Two kinds, Interval & Ratio, but not handled
    differently in general practice.
  • Interval
  • Numeric but does not have a true zero.
  • Can rescale: order & relative interval same, but
    values & place of zero different
  • e.g., Calendar date (Christian, Jewish, Muslim),
    Temperature (C or F)
  • Ratio
  • Numeric Measure with a true zero
  • e.g., Age, Weight, Blood pressure, Hgb, CD4 count

23
Open Ended Questions
  • Generally avoided since cannot analyze an unsorted
    and uncoded list.
  • May be useful for pilot or focus group to
    determine range & content of answers.
  • Software to help code & analyze, e.g., TextSmart,
    QSR NUD*IST
  • Qualitative Methods is its own field.

24
Closed Ended Questions
  • If can be numeric, then best to collect as
    original number (e.g., age).
  • Number has greater power and truer meaning.
  • Can always re-code later (e.g., age-group)
  • But if collected pre-coded, power is lost
    forever, cannot retrieve original real value.
  • If must be character, then pre-code to mutually
    exclusive categories.
  • For data entry: separate code for Missing, Don't
    Know, Didn't Answer, Refused.
  • If included in actual questionnaire, will have
    higher rate of such non-answers.
  • Clarity, Simplicity, Neutrality, Defined time
    frame
  • Avoid double-barreled questions with actual or
    implied AND or OR
  • PreTest and Revise
  • Use standard categories and intervals (e.g.,
    census race groups, NCHS age-groups)
  • Whenever possible use, or incorporate elements
    from, previously used, preferably widely
    used and standardized questionnaires.

25
Scales
  • Likert: Summative
  • 1=Strongly agree, 2=Agree, 3=Neutral, 4=Disagree,
    5=Strongly disagree
  • Single answer to each question
  • Can add together to score
  • Decision choice as to whether to include
    mid-point neutral or not (1-4)
  • Classic example of ordinal variable
  • Guttman: Cumulative
  • Series of statements expressed in ordered
    intensity of characteristic
  • Ask to agree or disagree, Yes/No with each
    question in the series.
  • Score is total number of items agreed to for
    each series.
  • Answers should be internally consistent
  • If agree with one level, then should agree with
    all lower levels
  • a. Smoking can cause illness
  • b. Smoking is an important cause of illness
  • c. Smoking is a very important cause of illness
  • d. Smoking is the most important cause of
    illness
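
A minimal Python sketch of scoring one Guttman series and checking internal consistency, offered as an illustration only. The function names and the two example respondents are made up here; answers are assumed to be listed from the mildest statement (a) to the strongest (d).

```python
def guttman_score(answers):
    """Score is the number of statements agreed to (True = agree)."""
    return sum(answers)

def is_consistent(answers):
    """Consistent if no agreement appears after the first disagreement."""
    seen_disagree = False
    for agreed in answers:
        if agreed and seen_disagree:
            return False
        if not agreed:
            seen_disagree = True
    return True

# Statements a-d from the smoking example, mildest first
respondent_1 = [True, True, True, False]   # agrees with a, b, c
respondent_2 = [True, False, True, False]  # agrees with c but not b

print(guttman_score(respondent_1), is_consistent(respondent_1))
print(guttman_score(respondent_2), is_consistent(respondent_2))
```

Inconsistent patterns such as the second respondent are worth flagging during data cleaning, since the score alone hides them.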

26
Standardized Questionnaires
  • Comprehensive health questionnaires web site:
    http://www.qlmed.org
  • General Health Status: SF-36, WHO-Qual, many
    others for specific conditions
  • Behaviors: BRFSS (adult) & YRBS (child) from CDC
  • http://www.cdc.gov/nccdphp/surveil.htm
  • Morbidity & Utilization: NHIS, NHANES and others
    from NCHS
  • http://www.cdc.gov/nchs/products/catalogs/sitemap.htm
  • Patient Satisfaction: PEERS from NACHC, BPHC,
    Managed Care Co., etc.
  • http://www.bphc.hrsa.gov/quality/
  • Compendium Books
  • Measuring Health: A Guide to Rating Scales and
    Questionnaires, McDowell & Newell, Oxford
    University Press.
  • Measuring Health: A Review of Quality of Life
    Measurement Scales, Bowling, Open University
    Press.
  • Measuring Disease: A Review of Disease-Specific
    Quality of Life Measurement Scales, Bowling, Open
    University Press.

27
Computerizing Variables
  • Every Variable Defined by its Type, Length &
    Name
  • Database: Access, FileMaker Pro, Approach,
    Paradox, dBase, FoxPro
  • Spreadsheet: Excel, 123, Quattro Pro
  • Statistical: > $900: SPSS, SAS
  • $500-900: Systat, Stata, Statistica,
    DataDesk, JMP
  • $200-500: NCSS, StatMost, GB-Stat,
    Statistix
  • Free: EpiInfo from CDC (EpiTable only
    in DOS)
  • Complex Sample: Sudaan, WesVar (from Westat).
    Some Stata, SAS, EpiInfo
  • Small Numbers: StatXact & LogXact from Cytel
  • DoEpi (CDC): Free tutorial to learn methods &
    EpiInfo; earn free CME & CEU.
  • Free SPSS Class:
    http://www.shef.ac.uk/scharr/spss/index.html

28
Data Entry & Analysis
  • Data Entry: Manual key punch
  • Scan & OCR
  • Remote by fax, email, internet
  • Data Cleaning: Duplicate entry validation
  • Re-check of outliers
  • Descriptive Stats: Measures of frequency, rates &
    proportions.
  • Central tendency (Average) & Dispersion
    (Standard Deviation).
  • Inferential Stats: Measures of Association &
    magnitude of effect (OR, RR, R).
  • Tests of Significance: probability result occurred
    by chance (P, CI).
  • Presentation: Tables, Graphs, Maps (graph with
    geographic ordinates), Charts.
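
A short Python sketch of two of the measures of association named above, the relative risk (RR) and the odds ratio (OR), computed from a 2x2 table. This is an illustration only: the counts are invented, the function names are made up for the example, and the confidence interval uses the usual log-scale approximation (Woolf method), which is an assumption on our part since the slides do not say which method to use.

```python
import math

def risk_ratio(a, b, c, d):
    """RR from a 2x2 table: a, b = exposed with/without outcome; c, d = unexposed."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """OR from the same 2x2 table (cross-product ratio)."""
    return (a * d) / (b * c)

def odds_ratio_ci95(a, b, c, d):
    """Approximate 95% CI for the OR, computed on the log scale."""
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio(a, b, c, d))
    return math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se)

# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed have the outcome
rr = risk_ratio(30, 70, 10, 90)
or_ = odds_ratio(30, 70, 10, 90)
low, high = odds_ratio_ci95(30, 70, 10, 90)
print(round(rr, 2), round(or_, 2), round(low, 2), round(high, 2))
```

A confidence interval that excludes 1 corresponds to an association unlikely to be due to chance alone at the 5% level.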

29
Correct Statistic for Study Design, Sample
Method, Variable Type
  • Is the Variable Continuous, Ordinal,
    Nominal (>2-way) or Dichotomous?
  • If Continuous, are Values Normally Distributed &
    Heterogeneous?
  • Use Non-Parametric for Ordinal & Non-Normal
    Continuous
  • Are the Measurements Independent or not (paired,
    matched or repeated measures)?
  • Are Measurements Equal time contribution or
    Censored (survival time)?
  • Use of Life Table Methods: Log Rank,
    Kaplan-Meier, Cox
  • Sampling Method: Design Effect (DE) = 1 if Simple
    Random Sampling & normal methods.
  • If not SRS then Cluster Correlated: Account for
    Weighting & Design Effect
  • DE < 1 if Stratified sampling (increased
    power)
  • DE > 1 if Cluster or Multi-Stage sampling (less
    power)
  • Special Methods: Variance Estimation methods:
    Taylor, Replication, Bootstrap
  • Alternate method: Hierarchical or Multilevel
    Modeling
  • http://www.fas.harvard.edu/stats/survey-soft/survey-soft.html
  • Small N/Sparse Distribution: Exact Methods.
    LogXact/StatXact, see http://www.cytel.com
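
A minimal Python sketch of how a design effect (DE) is typically applied, offered as an illustration and not taken from the original slides. It assumes the common approximation DE = 1 + (m - 1) x ICC for cluster samples, where m (average cluster size) and ICC (intra-cluster correlation) are hypothetical inputs; all function names are made up for this example.

```python
def cluster_design_effect(avg_cluster_size, icc):
    """Approximate design effect for a cluster sample (assumed formula)."""
    return 1 + (avg_cluster_size - 1) * icc

def effective_sample_size(n, design_effect):
    """Size of the simple random sample that would give similar precision."""
    return n / design_effect

def inflate_for_design(n_srs, design_effect):
    """Sample size needed under the complex design to match an SRS of n_srs."""
    return n_srs * design_effect

# Hypothetical numbers: 21 interviews per census tract, ICC of 0.05
de = cluster_design_effect(avg_cluster_size=21, icc=0.05)
n_eff = effective_sample_size(1000, de)
print(de, n_eff)
```

With these made-up inputs DE comes out near 2, so 1000 clustered interviews carry roughly the information of 500 simple random ones.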

30
Subjects: Who & How Selected
  • Population → Sample Frame → Sampling Method →
    Study Sample
  • Population: What is the setting the subjects
    come from?
  • Who are they meant to represent?
  • Sample Frame: Must have written objective
    inclusion & exclusion criteria to define
    who is potentially eligible.
  • Sampling: Written protocol for how to select actual
    participants from sample frame.
  • Data Sources: Chart review, Lab, Questionnaire,
    Interview, Examination
  • Recruitment: Who is going to actually recruit,
    explain study, consent process.
  • Post-Recruit: How to handle refusal and
    non-response
  • Follow-up: Who does it, how, and for how
    long
  • What to do with drop-out, attrition, loss
    to follow-up

31
Truth & Data
[Figure: flow diagram leading from Truth in the Universe to Truth in the Study. Labels: Research Question, Target Population, Accessible Population, Study Plan, Intended Sample, Actual Sample, Study Subjects, Inclusion Criteria, Exclusion Criteria, Measures; Specify clinical, demographic, temporal and geographic characteristics; Suited to Answer?; Represent Target; External Validity; Internal Validity.]

32
Sample Size
  • Must first Answer or Make Assumption Regarding:
  • What is the possible range for the expected
    magnitude of the effect?
  • What is the expected average or frequency, and
    variance?
  • How big a difference is important to be able to
    show?
  • i.e., What is the smallest difference that is
    clinically/programmatically significant?
  • Can then determine sample size, significance
    level (alpha) & power (1 - beta)
  • Additional factors to consider in choice of
    sample or study population size
  • Is sample size feasible, over what period of
    time & how many sites?
  • Is it an effective use of my available
    population? comprehensive? representative?
  • Can I analyze subgroups; do I need to
    over-sample or weight subgroups?
  • Allow extra for ineligibles, refusals,
    non-response & missing data, drop-outs.
  • Always better off with a smaller, but more
    representative & less biased, sample!!!

33
Quick & Dirty Sample Size Calculations
  • for Difference in Means
  • Guesstimate plausible/possible minimum & maximum
    values.
  • Difference between maximum and minimum is the
    Range.
  • Assuming normal distribution, then 67% of Range
    = +/- 1 Standard Deviation.
  • N = 16 x (Standard Deviation)^2 / (Difference in Means)^2
  • Where (Standard Deviation)^2 = Variance
  • Difference in Means = Smallest difference you
    want to be able to detect.
  • N = Sample size in each group (total = 2N),
    assuming equal numbers and equal variance
    in each group.
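
The rule of thumb for a difference in means can be sketched in a few lines of Python. This is an illustration only: the function name and the example numbers are made up. In the standard derivation the factor 16 gives the size of each group for roughly 80% power at a two-sided alpha of 0.05, which is the reading assumed here.

```python
import math

def n_per_group_for_means(sd, smallest_difference):
    """Rule-of-thumb N = 16 x SD^2 / (difference in means)^2, rounded up."""
    n = 16 * sd ** 2 / smallest_difference ** 2
    return math.ceil(round(n, 9))  # round first to guard against float noise

# Hypothetical example: SD guessed at 15 units, want to detect a 5-unit difference
n_each = n_per_group_for_means(sd=15, smallest_difference=5)
print(n_each, "per group,", 2 * n_each, "in total")
```

Halving the difference you want to detect quadruples the required sample size, which is why the "smallest important difference" deserves careful thought.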

34
Quick & Dirty Sample Size Calculations
  • for Difference in Proportions
  • N = 16 x P x (1 - P) / (P1 - P2)^2
  • Where P1 & P2 are the two proportions (e.g., 0.45
    for 45%) for the two groups.
  • P is the average of the two: P = (P1 + P2) / 2
  • P1 - P2 = Smallest difference in proportions you
    want to be able to detect.
  • N = Sample size in each group (total = 2N),
    assuming equal N in each group.
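
A matching Python sketch for the difference-in-proportions rule of thumb, again as an illustration only. The function name and the example proportions are hypothetical; the formula is the one on this slide, with N read as the size of each group.

```python
import math

def n_per_group_for_proportions(p1, p2):
    """Rule-of-thumb N = 16 x P x (1 - P) / (P1 - P2)^2, with P the average."""
    p_avg = (p1 + p2) / 2
    n = 16 * p_avg * (1 - p_avg) / (p1 - p2) ** 2
    return math.ceil(round(n, 9))  # round first to guard against float noise

# Hypothetical example: expect 50% in one group and 25% in the other
n_each = n_per_group_for_proportions(0.50, 0.25)
print(n_each, "per group,", 2 * n_each, "in total")
```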

35
Sampling Methods: Non-Probability
  • Non-Probability - Do not know chance for any
    given unit to be selected
  • Convenience: Selected because available, cheap &
    easy.
  • Consecutive: Select in order from list or line.
  • Most Similar / Dissimilar Cases: Selected because
    thought to be similar/not.
  • Typical Cases: Select cases known to be
    available, useful and not extreme.
  • Critical Cases: Selected because considered
    essential.
  • Snowball: Included members identify additional
    members.
  • Quota: Select sample to yield predetermined
    proportions of some variables.

36
Sampling: Simple Probability
  • Probability Sampling
  • Each unit has specified, measured, and known
    chance for selection
  • Simple Random: Each unit has an equal probability
    of being chosen.
  • Enumerate list, then choose with random
    numbers.
  • Systematic: Same, except random start, then
    select at equal intervals
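
Both methods can be sketched with Python's standard `random` module. This is an illustration under simple assumptions: the sample frame is a fully enumerated list, the function names are made up, and the chart numbers are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Each unit on the enumerated list has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def systematic_sample(frame, n, seed=None):
    """Random start, then every k-th unit, where k = len(frame) // n."""
    rng = random.Random(seed)
    step = len(frame) // n
    start = rng.randrange(step)
    return [frame[start + i * step] for i in range(n)]

# Hypothetical sample frame: chart numbers 1 to 100
frame = list(range(1, 101))
srs = simple_random_sample(frame, 10, seed=1)
systematic = systematic_sample(frame, 10, seed=1)
print(sorted(srs))
print(systematic)
```

Fixing the seed makes the draw reproducible, which helps when the sampling procedure has to be documented in the operations manual.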

37
Complex Probability Sampling
  • Stratified: Assign each potential member to
    group/stratum by characteristic (sex, race, age),
    then select a simple random sample from
    within each stratum.
  • Better than SRS and should be done more
    often, if have info needed, because:
  • Can choose ratio between subgroups with known or
    desired frequency.
  • Can get desired weighting and over-sample for
    small subgroup analysis
  • Increased power, since decreased overall sample
    random error, since exact within each stratum.
  • Cluster: Assign each unit to a group called a
    cluster (e.g., geographic cluster),
    randomly select some clusters; all
    members of selected clusters are included.
  • Done because less expensive, easier, need
    less prior information, but:
  • Increases overall sample random error
  • all units in excluded clusters have no chance of
    being selected.
  • all units in selected clusters have 100% chance
    of being selected.
  • units within given cluster are more likely to be
    more homogeneous.
  • units between clusters are more likely to differ.
  • MultiStage: Extension of Cluster method. Clusters
    are selected as in cluster example,
    then simple random sample of units from
    within selected cluster.
  • Cluster selection may be done at more
    than one stage.

38
Stratified
General Population made up of Squares, Donuts and
Lambdas; they come in two colors, Pinks (58.3%) and
Greens (41.7%).
[Figure: the general population drawn as rows of pink and green shapes]
Stratify into Greens & Pinks, then can choose
exact proportional number of each.
[Figure: the same shapes grouped by color]
  • Set target N from within Green & Pink, so
    assured to get exactly 58.3% & 41.7%.
  • Then simple random sample from within each
    stratum of Color.
  • If know prior proportion for squares, donuts &
    lambdas, could do second-level stratification for
    Shape
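
A minimal Python sketch of stratified sampling with proportional allocation, as an illustration only. The population of 70 pink and 50 green units is invented to reproduce the 58.3% / 41.7% split, and the function name is made up.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n_total, seed=None):
    """Proportional allocation, then a simple random sample within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        # Rounding can make the total differ slightly from n_total in general
        k = round(n_total * len(members) / len(units))
        sample.extend(rng.sample(members, k))
    return sample

# Made-up population: 70 pink (58.3%) and 50 green (41.7%) units
population = [("pink", i) for i in range(70)] + [("green", i) for i in range(50)]
sample = stratified_sample(population, lambda u: u[0], n_total=24, seed=7)
pink_n = sum(1 for color, _ in sample if color == "pink")
green_n = len(sample) - pink_n
print(pink_n, green_n)
```

Unlike a simple random sample of 24, the color split here is fixed in advance rather than left to chance.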

39
Cluster
  • Don't have a list of persons or households for
    whole State.
  • Do have a list for all Census Tracts.
  • Therefore, randomly select 50 census tracts.
  • Then interview ALL households within those 50
    census tracts, to represent all households in
    the State.
  • Unlike SRS or Stratified, can do even if do
    not have any listing of, or enumeration for, all
    households in the State (which in reality you do
    not).

40
Multi-Stage
  • Don't have a list of persons or households for
    whole State.
  • Have list for all Counties, Census Tracts in
    Counties, Blocks in Tracts.
  • Randomly select 20 Counties
  • Randomly select 10 Census Tracts from each of
    those 20 selected Counties
  • Randomly select 5 Census Blocks from each of
    those 200 Census Tracts
  • Pick one household from each of the 1000
    Census Blocks to interview.
  • Can and should use known population data to
    make it more representative of the State's true
    population distribution: more likely to select a
    high-population county than a low-population
    county; select proportionally more Tracts from
    high-population counties than from low-population
    counties.
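
The 20 x 10 x 5 x 1 selection above can be sketched in Python as follows. This is an illustration only: the nested sample frame is invented, the function name is made up, and every stage uses equal-probability selection for simplicity, so it does not include the population weighting recommended in the last bullet.

```python
import random

def multistage_sample(frame, n_counties, n_tracts, n_blocks, seed=None):
    """frame: {county: {tract: {block: [households]}}} (hypothetical structure)."""
    rng = random.Random(seed)
    chosen = []
    for county in rng.sample(sorted(frame), n_counties):
        tracts = frame[county]
        for tract in rng.sample(sorted(tracts), n_tracts):
            blocks = tracts[tract]
            for block in rng.sample(sorted(blocks), n_blocks):
                # One household picked at random from each selected block
                chosen.append(rng.choice(blocks[block]))
    return chosen

# Made-up frame: 30 counties x 15 tracts x 8 blocks x 4 households
frame = {
    f"county{c}": {
        f"tract{t}": {
            f"block{b}": [f"c{c}-t{t}-b{b}-h{h}" for h in range(4)]
            for b in range(8)
        }
        for t in range(15)
    }
    for c in range(30)
}
households = multistage_sample(frame, 20, 10, 5, seed=42)
print(len(households))
```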

41
Measurement Error
  • Sources of Error - Subjects, Instruments,
    Observers, Methods
  • Random Error: Due to probability, by chance
    alone. Control Statistically.
  • Depends upon distribution: variance, range,
    sample size.
  • Systematic Error aka Bias: Methodologic error
    → control for methodologically
  • Reliability: Degree of consistency in measurement
    of a variable.
  • aka Reproducibility: Does measurement of
    variable give same result when done
    repeatedly? Re-done over different situations?
  • Validity: Degree to which item measures what it
    is intended to measure.
  • Internal Validity: Result accurately measures the
    truth in the study group.
  • External Validity aka Generalizability:
    Study group accurately represents population
    wish to generalize to.
  • Can be Measured: Standardized questionnaire means
    measures of reliability & validity (kappa &
    correlation coefficients) are reported, not just
    that it was used before.

42
Types of Systematic Error
  • Selection Bias: Sample not representative of
    Population by parameter of interest.
  • Inclusion or exclusion criteria, Refusal,
    Referral, Health worker, screwed-up sampling
    procedure, differential loss to follow-up
  • Misclassification / Systematic error in
    measurement
  • Information Bias: Recall or Interviewer. Exposure
    info varies with outcome status
  • Contamination: Control got the intervention
    (e.g., MR-FIT)
  • Suspicion, Social desirability
  • Confounding: Distortion of relationship between
    exposure and outcome of interest.
  • Caused by association of exposure with a 3rd
    factor, which is also associated with the
    outcome of interest.
  • Operationally defined: when statistical
    adjustment for 3rd factor alters the estimated
    magnitude of effect of association between
    putative exposure & outcome.
  • Control by Stratify, Match, Restriction,
    Covariance & Multivariate.

43
Interaction
  • Effect Modifier: 3rd factor which is antecedent
    to cause.
  • Modifies magnitude of effect between exposure &
    outcome.
  • e.g., Age for many conditions
  • Not the same as confounding, since Cause
    → Effect is true.
  • Contingent: 3rd factor is intermediate between
    cause and effect
  • Modifies the magnitude of the effect on
    outcome
  • Is itself affected by the cause.
  • Not the same as confounding, since Cause →
    Effect is true.

44
Confounding
[Figure: diagram linking Match Carrying, Smoking, and Lung Cancer]
  • Appears that Match Carrying causes Lung
    Cancer.
  • Confounder is Smoking, the true cause for both
    Match Carrying & Lung Cancer.
  • Smoking is the 3rd factor that is truly causal
    for the other two.
  • It is false that match carrying causes lung
    cancer, so this is confounding
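
The match-carrying example can be made concrete with a small Python sketch that compares the crude relative risk with the relative risk inside each smoking stratum. All counts are invented for illustration, and the function names are made up; stratification is one of the control methods listed earlier.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)

# Invented counts: (lung cancer cases, people) for match carriers and non-carriers
smokers = {"carriers": (90, 900), "non_carriers": (10, 100)}
non_smokers = {"carriers": (1, 100), "non_carriers": (9, 900)}

def stratum_rr(stratum):
    return risk_ratio(*stratum["carriers"], *stratum["non_carriers"])

# Crude analysis ignores smoking and pools the two strata
crude_rr = risk_ratio(90 + 1, 900 + 100, 10 + 9, 100 + 900)
rr_smokers = stratum_rr(smokers)
rr_non_smokers = stratum_rr(non_smokers)
print(round(crude_rr, 2), rr_smokers, rr_non_smokers)
```

In these made-up data the crude relative risk is well above 1, yet it is 1 within both strata: the apparent effect of match carrying disappears once smoking is held constant.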

45
Effect Modification
[Figure: Risk for Heart Attack with Oral Contraceptive Use, by smoking level: Non-Smoker (RR = 1), 1 Pack/day (RR = 7), 2 Cigarette Packs/day (RR = 39)]
The amount of cigarette smoking is an effect
modifier: it modifies the magnitude of the effect
of oral contraceptive use on the risk for heart
attacks. Relationships are true, so this is effect
modification, not confounding.

46
Contingent/Intermediate Variable
[Figure: causal diagram with Activity, Genetics, Diet, Serum Lipids, and Heart Disease]
  • Serum Lipids are contingent or intermediate
    variable between Diet & Heart Disease.
  • Diet has true causal effect on serum lipids &
    serum lipids have true causal effect on heart
    disease.
  • and there are also many other causal
    interactions going on.

47
Complex Interaction
Poverty is truly causal for both exposure to lead
and child development problems (via other
mechanisms such as diet, education, exposure to
violence, etc.). Lead exposure is truly directly
causal for child development problems. Child
development problems are truly causal for lead
exposure (pica behavior). Lead exposure can
cause poverty, and child development problems can
cause poverty!
48
Types of Validity
  • Face Validity: Test item appears to test what it
    is supposed to
  • Expert opinion says so: "Hey, it's subjectively
    plausible"
  • Content Validity: Test item reflects full content
    of domain being measured
  • Item contains all important concepts,
    behaviors and elements
  • Test item IS the same thing as the concept
  • Criterion Validity: Test item can substitute for
    another (harder to measure) item
  • Easier/cheaper screening test item in place of
    gold standard Dx.
  • Concurrent validity if measures made at same
    time.
  • Predictive validity if substitute used to
    predict.
  • e.g., Parent's report of vaccine status; PSA
    for biopsy
  • Construct Validity: Theoretical construct
    measured by instrument (reify), e.g., IQ, QoL
  • e.g., Severity of Illness score does correlate with
    probability of dying

49
Types of Reliability
  • Correlation: Degree of association between two
    sets of data
  • Vary together, not necessarily in agreement,
    e.g., height and shoe size
  • Inter-Rater: Two raters score same test and both
    give same grade.
  • Intra-Rater: Same rater gives same score when
    same test is repeated
  • Test-Retest: Same subject gives same test result
    when test is repeated
  • Equivalence: Different forms of the test give the
    same result
  • Homogeneity: Internal consistency; different
    items test the same characteristic
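
The distinction between correlation and agreement can be sketched numerically. The ratings below are hypothetical, and Cohen's kappa is offered as one common chance-corrected measure of inter-rater agreement; it is an assumption of this sketch, not something the slide names.

```python
from collections import Counter
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: do two sets of scores vary together?"""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def percent_agreement(x, y):
    """Fraction of items on which the two raters give the same score."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


def cohen_kappa(rater1, rater2):
    """Agreement beyond what chance alone would produce."""
    n = len(rater1)
    observed = percent_agreement(rater1, rater2)
    counts1, counts2 = Counter(rater1), Counter(rater2)
    expected = sum(counts1[k] * counts2[k] for k in counts1) / n ** 2
    return (observed - expected) / (1 - expected)


# HYPOTHETICAL scores: rater B is always exactly 2 points above rater A.
scores_a = [1, 2, 3, 4, 5]
scores_b = [3, 4, 5, 6, 7]
print(pearson_r(scores_a, scores_b))          # perfectly correlated
print(percent_agreement(scores_a, scores_b))  # yet they never agree

# HYPOTHETICAL pass/fail grades from two raters on the same 10 tests.
grades_1 = ["pass", "pass", "fail", "pass", "fail",
            "pass", "fail", "fail", "pass", "pass"]
grades_2 = ["pass", "pass", "fail", "fail", "fail",
            "pass", "fail", "pass", "pass", "pass"]
print(round(cohen_kappa(grades_1, grades_2), 3))
```

The first pair shows scores that vary together perfectly without ever matching, which is why a high correlation alone does not establish reliability.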

50
Regression to Mean
  • On retest, by chance alone, a result could move
    toward the mean or toward the extreme.
  • The more extreme the result, the greater the
    probability it will move in the direction of the mean
  • Not a matter of truth.
  • Just from the probability of direction of change
    under the bell curve.
  • If you select a sub-group with poor results and do
    nothing → Retest → Will improve.
  • Hence: Program has bad results at baseline, so
    do QI intervention
  • Do assessment and result shows
    improvement
  • But likely to have improved even if did
    nothing!
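
A small simulation can make this concrete. It is a sketch under stated assumptions: each observed score is a stable true value plus independent normal measurement noise, and the means, standard deviations, and the "worst 10%" cut-off are all hypothetical choices.

```python
import random
from statistics import mean


def simulate_retest(n=10000, true_sd=10.0, noise_sd=10.0, seed=0):
    """Select the worst 10% at baseline, do nothing, then retest.

    Returns (baseline mean, retest mean) for the selected sub-group.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50, true_sd) for _ in range(n)]
    # Two measurements of the same unchanged subjects.
    test1 = [t + rng.gauss(0, noise_sd) for t in true_scores]
    test2 = [t + rng.gauss(0, noise_sd) for t in true_scores]
    # Pick the sub-group with the poorest baseline results.
    cutoff = sorted(test1)[n // 10]
    worst = [i for i in range(n) if test1[i] <= cutoff]
    baseline = mean(test1[i] for i in worst)
    retest = mean(test2[i] for i in worst)
    return baseline, retest


baseline, retest = simulate_retest()
print(f"Worst decile at baseline: {baseline:.1f}")
print(f"Same subjects on retest, no intervention: {retest:.1f}")
```

The selected group's retest mean moves back toward the overall mean even though nothing was done, so a control group is needed to separate this from a real intervention effect.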

51
Alternate Explanations for Results
  • If there was an observed association, it could be
    due to:
  • Chance, Random: Type I Error, found an association
    when there really wasn't one.
  • Bias, Systematic Error: such as confounding,
    selection bias, recall bias, etc.
  • True, but the association is really reversed:
    apparent Effect → Cause
  • True, and association is indeed causal: Cause →
    Effect
  • If there was no observed association, it could be
    due to:
  • Chance, Random: Type II Error, didn't find an
    association when there really was one.
  • Bias, Systematic Error: such as confounding,
    selection bias, recall bias, etc.
  • True: There really is no causal relationship.

52
Assessing Causation
  • Strength of Association, Magnitude of Effect
  • Consistency
  • Specificity
  • Temporal Relationship: Cause comes before Effect
  • Biological Gradient, Dose Response
  • Biological Plausibility, Theoretical Coherence
  • Hierarchy of Design: RCT > Cohort > Case-Control
    > Ecologic
  • These are not hard and fast rules
  • There are exceptions to all (except causal
    order?) of them.

53
How to Reduce Measurement Error
  • Written protocols and operations manual
  • Standardization and training
  • Blinding
  • Objective Measures
  • Automation
  • Repeated Measures
  • Quality Control Checks

54
Control Threats to Validity
  • Random Assignment: Assignment gives each
    subject an equal and independent chance to be
    placed in any group.
  • Matching: Intervention and Control groups are
    equated on one or more potential confounding
    variables before study.
  • Blocking: Build potential confounder into
    design as an independent variable.
  • Creates blocks (groups) of subjects that
    are homogeneous for the different levels of
    the potential confounding variable.
  • Homogeneous Subjects: Choose subjects who have the
    same value of the potential confounder.
  • Makes value of potential confounder an
    inclusion/exclusion criterion.
  • Subject as Own Control: Expose each subject to all
    levels of the independent variable
  • Subjects only compared to self: repeated
    measures design
  • Statistical Modeling: Stratification, Analysis of
    Covariance, Multivariate Analysis
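
Random assignment and blocking can be sketched in a few lines. The function names, the subject records, and the age-group blocking variable are hypothetical. Note that the blocked version shuffles within each block and alternates groups, which balances group sizes but means assignments within a block are no longer fully independent of each other.

```python
import random

GROUPS = ("intervention", "control")


def simple_randomize(subject_ids, groups=GROUPS, seed=None):
    """Each subject gets an equal, independent chance of any group."""
    rng = random.Random(seed)
    return {sid: rng.choice(groups) for sid in subject_ids}


def blocked_randomize(subjects, block_key, groups=GROUPS, seed=None):
    """Randomize separately within each level of a potential confounder."""
    rng = random.Random(seed)
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_key(subject), []).append(subject["id"])
    assignment = {}
    for ids in blocks.values():
        rng.shuffle(ids)  # random order within the block
        for position, sid in enumerate(ids):
            assignment[sid] = groups[position % len(groups)]
    return assignment


# HYPOTHETICAL subjects: 12 people in 3 age groups of 4.
subjects = [
    {"id": i, "age_group": group}
    for i, group in enumerate(["young"] * 4 + ["middle"] * 4 + ["old"] * 4)
]
assignment = blocked_randomize(subjects, lambda s: s["age_group"], seed=1)
for subject in subjects:
    print(subject["id"], subject["age_group"], assignment[subject["id"]])
```

Simple randomization can leave a small study unbalanced on the confounder by chance; blocking guarantees that each age group contributes equally to both arms.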

55
Hypothesis Testing: Random Error
[Figure: two panels, "Truth is no difference" and "Truth is they are different", each showing measured H1 and H2]
  • Type I error (α): Difference detected when the
    truth is that there is no difference.
  • Occurs α of the time (significance level).
    Typically 5%.
  • Type II error (β): No difference detected when the
    truth is that there is a difference.
  • Occurs β of the time. Power = 1 - β.
    Typically 20%.
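
The typical values of α = 5% and β = 20% translate directly into a sample size. The sketch below uses the standard normal-approximation formula for comparing two group means; the assumptions of a two-sided test, equal group sizes, and an effect expressed in standard-deviation units are mine, not the slide's.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to compare two means.

    effect_size: difference in means divided by the standard deviation.
    alpha: accepted Type I error rate (two-sided).
    power: 1 - beta, the chance of detecting a true difference.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# HYPOTHETICAL effect sizes, in standard-deviation units.
for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: about {n_per_group(d)} per group")
```

Smaller effects need much larger samples, and demanding more power or a stricter α raises the required size further.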
56
Study Designs - Observational
  • Observe effects of an experiment of nature.
  • What has occurred or will occur naturally without
    intervention.
  • Researcher does not control or affect the
    intervention: passive.
  • This is the field of Classic Epidemiology.
  • Ecologic
  • Cross-Sectional (Survey)
  • Case-Control
  • Cohort

57
Study Designs - Interventional
  • Intervention is under control of the Researcher.
  • Measure how change in the Independent (Risk,
    Predictor) variable, which is controlled by
    the Researcher (the Intervention), affects the
    Dependent (Outcome) variable.
  • Intervention can be anything: Drug trial,
    Surgical technique, Use of new device, Quality
    improvement program, Patient or Provider
    education program, Office protocol or tool,
    etc.
  • Experimental and Quasi-Experimental Designs:
  • Individual Randomized Controlled Trial
  • Group or Community Randomized Controlled Trial
  • Non-Equivalent PreTest/PostTest/Control, and
    Variants
  • Time Series
  • Repeated Measures
  • Factorial

58
Elements of Experimental Control
  • Manipulation of Variables: Predictor/Independent
    variable is an intervention
  • It is deliberately manipulated by the study
  • Control Group: Experimental (Intervention)
    group is compared to a non-Intervention group
  • May be placebo or standard care
  • May be self (repeated measures)
  • Random Assignment: Each subject has an equal
    chance of assignment to either the
    intervention or non-intervention group
  • Assignment is independent of any
    attribute
  • Blinding: Subjects don't know which group
    they are in
  • Persons measuring outcomes don't know.
  • May have a 3rd party who does know,
    monitoring results, who stops the trial when it
    reaches a significant endpoint (good or bad).

59
Controlling Inter-Subject Differences
  • Randomize: Probability balances distribution of
    inter-subject differences across groups.
    Works even for unmeasured or unknown differences
  • Homogeneity: Select subjects that are homogeneous
    for the extraneous variable
  • e.g., Inclusion criteria: study only white
    males, ages 45-54.
  • Matching: For each person with the extraneous
    characteristic(s) in the intervention group,
    select a person with the same value of the
    characteristic(s) for the control group. For
    every black female 30-39 years old, select
    another.
  • Blocking: Divide subjects into groups by the
    extraneous variable, treating it as another
    independent variable. Creates homogeneous
    blocks of subjects for different values of
    the variable.
  • e.g., Divide subjects into 3 age-groups. In
    addition to the Rx, age is now a 2nd
    independent variable with 3 levels.
  • Self Control: Each subject is exposed to all
    levels of intervention: repeated measures.
  • Statistical: Select extraneous variable as a
    covariate, and use it to adjust scores in a
    stratified or multivariate analysis.

60
One-Group PreTest/PostTest
  • N1 O1 X O2
  • Key: N = Study Population
  • O = Observation (measurement)
  • X = Intervention
  • Simplest Interventional Study
  • Measure outcome of interest and potential
    confounders and other covariates
  • Once at beginning, before the intervention.
  • Once at the end, after the intervention.
  • However, ability to infer that any change is due
    to the intervention is limited
  • No Control: Change could be due to other factors
    that changed over time.
  • Change could be due to doing the study and making
    measurements, not the intervention itself
    (behavior change due to fact of study).

61
Multigroup PreTest/PostTest
  • N1 O1 X O2
  • N2 O1 X O2
  • N3 O1 X O2
  • Somewhat stronger than one-group, since changes
    other than the official intervention may be
    unique to one site, and hence are partially
    distinguishable.
  • If there is the same trend across numerous
    different sites, then it is more likely to be due
    to the official intervention than if it occurs only
    at one site.
  • Other factors that change with time may still
    apply to all sites: time, seasonality (if start
    and finish times are not staggered), social
    changes (managed care, public awareness, practice
    guidelines)
  • Still doesn't control for possible effect of just
    doing the study and taking measurements.

62
PreTest/PostTest/Control
  • N1 O1 X O2
  • N2 O1   O2
  • Assignment of subjects to 2 or more groups, either
    intervention or not.
  • Having a control group makes the study much more
    powerful, since other changes or trends over
    time, including time itself, occur in control
    groups also.
  • Also partially controls for the effect of just
    doing the study and making measurements
  • Equivalent: If assignment is both random and at
    the individual level: RCT
  • Non-Equivalent: Assignment is at the group level, or
    not random at the same level as measures.
  • e.g., All patients at one site get the intervention;
    another site is control.
  • Results are measured in individuals,
    but assignment is by site.
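
One simple way to use the control group is to subtract its pre-to-post change from the intervention group's change, often called a difference-in-differences estimate. That name and the scores below are assumptions of this sketch; the slide only describes the design.

```python
from statistics import mean


def mean_change(pre, post):
    """Average change from PreTest (O1) to PostTest (O2) in one group."""
    return mean(post) - mean(pre)


def difference_in_differences(int_pre, int_post, ctl_pre, ctl_post):
    """Change in the intervention group beyond the change in controls."""
    return mean_change(int_pre, int_post) - mean_change(ctl_pre, ctl_post)


# HYPOTHETICAL outcome scores.
intervention_pre = [60, 62, 58, 64]
intervention_post = [70, 71, 69, 74]
control_pre = [61, 59, 63, 61]
control_post = [65, 64, 66, 65]

effect = difference_in_differences(
    intervention_pre, intervention_post, control_pre, control_post
)
print("Change with intervention:",
      mean_change(intervention_pre, intervention_post))
print("Change in controls:", mean_change(control_pre, control_post))
print("Estimated effect of intervention:", effect)
```

Part of the raw improvement in the intervention group also appears in the controls, so only the remainder is attributed to the intervention. With non-equivalent groups this still assumes both groups would have followed the same trend without it.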

63
Early/Late Design
  • N1 O1 X O2 O3
  • N2 O1 O2 X O3
  • Variant of PreTest/PostTest/Control
  • Initial non-Intervention control group gets the
    intervention, but at a later period
  • If measurements occur in both early intervention
    and late intervention sites over all 3 time
    periods, as shown above, then this becomes even
    more powerful
  • Can measure if there is anything unique to time
    periods.
  • Measure if effect in first group wanes over
    time.
  • Becomes variant of repeated measures and time
    series designs.
  • Very useful method for getting control sites or
    persons to agree to participate, since they will
    eventually get the intervention (assumed to be
    good).

64
Solomon Four Group Design
  • O1 X O2: PreTest/PostTest, with Intervention
  • O1   O2: PreTest/PostTest, without Intervention
  •    X O2: PostTest only, with Intervention
  •      O2: PostTest only, without Intervention
  • Adds the two groups without the baseline measure.
  • Extra control for potential effect of baseline
    observation, measuring, study.

65
Non-Equivalent PreTest/PostTest/Control
  • Not an Individual Randomized Controlled Trial
    if either or both of 2 limitations apply:
  • Non-Equivalence of Intervention and Control
    Groups
  • Selection and/or Assignment of site or patients
    is not random.
  • There are inherent differences between sites that
    cannot be randomized.
  • Subjects are not assigned within groups randomly.
  • Group Assignment versus Individual Level
    Measures
  • Assignment (random or not) and application of
    intervention is at the group (e.g., clinic site)
    level
  • But the Outcomes (and confounders, covariates)
    are measured at the individual (e.g.,
    patient) level.
  • Overlap with Group- or Community-Randomized
    Controlled Trial
  • Not as strong or as pure as true Individual
    RCT, but
  • It is still a legitimate, accepted, published,
    real study method
  • Can still compare pretest values to
    measure, test for, and partially control for
    initial non-equivalence, at least for known and
    measured outcomes, confounders and other
    covariates.

66
Non-Equivalent PreTest/PostTest/Control
  • N1 O1 X O2
  • N2 O1   O2
Randomized Controlled Trial
  • N → R → O1 X O2
  • N → R → O1   O2
67
Individual Randomized Controlled Trial
  • General Population → Sample Frame → Sample → Randomize
  • Randomize → Intervention, or Non-Intervention (Control)
  • In each arm: Lost-to-follow-up
  • Stopped Intervention: Include if Intent to Treat
    design
  • In each arm: Measure Outcome
68
Repeated Measures
  • O1a X O1b
  • Prior designs compared 2 or more independent
    individuals or separate groups.
  • May compare same individuals or group to
    itself, before vs. after intervention
  • Each subject acts as its own control (before)
    to intervention (after).
  • Cross-Over Design is an extension of this.
  • Self as own control automatically controls for
    even unknowable confounders.
  • Useful if not able to randomize from large
    enough or homogeneous group.
  • A different statistical test is needed for this
    situation of non-independence, but this is beneficial:
  • Inherent decrease in sample variance if the same
    people are compared to themselves, rather than
    to another group of people, which increases power
    of study.
  • For the same sample size, a smaller magnitude of
    effect will be statistically significant,
  • or one needs only a smaller sample size to detect
    a given magnitude of effect.
  • Limits: May be usable only if there are no
    practice effects or carry-over effects (no memory).
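
The gain in power from comparing subjects to themselves can be sketched by computing two test statistics on the same data. The blood-pressure-like numbers are hypothetical; the paired t statistic and the unequal-variance two-sample t statistic are standard formulas, chosen here for illustration.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(before, after):
    """t statistic on within-subject differences (repeated measures)."""
    diffs = [b - a for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def independent_t(x, y):
    """t statistic treating the two sets as separate groups."""
    standard_error = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / standard_error


# HYPOTHETICAL measurements on the same 5 subjects.
before = [120, 135, 150, 165, 180]
after = [115, 131, 144, 161, 174]

print(f"Paired t: {paired_t(before, after):.2f}")
print(f"Independent-groups t: {independent_t(before, after):.2f}")
```

Subjects differ widely from each other but each drops by a similar amount, so the paired statistic is large while the independent-groups statistic is small for the same mean change.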

69
Time Series
  • Multiple measurements before and after the
    Intervention: extends PreTest/PostTest.
  • One Group: O1 O2 O3 O4 X O5 O6 O7 O8
  • Multi-Group, same Intervention introduced at
    different times, staggered (early, late, later):
  • O1 O2 X O3 O4 O5 O6 O7 O8
  • O1 O2 O3 O4 X O5 O6 O7 O8
  • O1 O2 O3 O4 O5 O6 X O7 O8
  • Multi-Group, different Interventions introduced at
    the same time:
  • O1 O2 O3 O4 X1 O5 O6 O7 O8
  • O1 O2 O3 O4 X2 O5 O6 O7 O8
  • O1 O2 O3 O4 X3 O5 O6 O7 O8
  • Withdrawal: O1 O2 X1 O3 O4 O5 X0 O6 O7 O8
  • Multiple Rx: O1 O2 X1 O3 O4 O5 X2 O6 O7 O8
  • Same statistical tests as repeated measures,
    non-independence.
  • Special time-series methods take into
    account varying lengths of intervals, cycles and
    trends (e.g., seasonality), etc.

70
Factorial Designs
  • More than one Intervention being introduced.
  • Hence 2 or more Independent variables.
  • Groups assigned to different combinations at
    different levels for each intervention.
  • Can measure interactive effects of different
    interventions on outcomes.
  • Treatment by Gender (Female, Male) and Race (Black, White):
  • Treatment | Female | Male | Black | White
  • Diuretic
  • Low Dose | S1 | S7 | S12 | S18
  • Med Dose | S2 | S8 | S13 | S19
  • High Dose | S3 | S9 | S14 | S20
  • ACE Inhibitor
  • Low Dose | S4 | S10 | S15 | S21
  • Med Dose | S5 | S11 | S16 | S22
  • High Dose | S6 | S12 | S17 | S22

71
Factorial Designs
  • More than one Intervention being introduced.
  • Hence 2 or more Independent variables.
  • Groups assigned to different combinations at
    different levels for each intervention.
  • Can measure interactive effects of different
    interventions on outcomes.
  • Diuretic (rows) by ACE Inhibitor (columns):
  • Diuretic | ACE Low | ACE Med | ACE High
  • Low Dose | S1 | S4 | S7
  • Med Dose | S2 | S5 | S8
  • High Dose | S3 | S6 | S9
  • where each Sx is a different sample group,
    randomly selected from the whole
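
The 3 x 3 grid above can be generated rather than typed. This is a sketch only: the function names and the pool of 90 subject IDs are hypothetical, and the S1 to S9 numbering is chosen to follow the grid (ACE Inhibitor dose changes across columns, Diuretic dose down rows).

```python
import random
from itertools import product

DOSES = ("Low", "Med", "High")


def factorial_cells(doses=DOSES):
    """Label every combination of ACE Inhibitor and Diuretic dose."""
    return {
        f"S{i}": {"ace_inhibitor": ace, "diuretic": diuretic}
        for i, (ace, diuretic) in enumerate(product(doses, doses), start=1)
    }


def allocate(subject_ids, cells, seed=None):
    """Randomly split subjects into equal-sized sample groups, one per cell."""
    rng = random.Random(seed)
    ids = list(subject_ids)
    rng.shuffle(ids)
    names = list(cells)
    return {name: ids[i::len(names)] for i, name in enumerate(names)}


cells = factorial_cells()
groups = allocate(range(90), cells, seed=7)  # 90 HYPOTHETICAL subjects
for name, combo in cells.items():
    print(name, combo, "n =", len(groups[name]))
```

Because every combination of doses has its own randomly selected group, the effect of one drug can be compared across levels of the other, which is what allows interactive effects to be measured.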

72
Observational
  • The study is Observational if it is just
    describing the existing situation.
  • Not controlling the intervention; observing
    the effects of a Natural Experiment.
  • Try to determine which are the important
    Independent Predictors, Risk or Protective
    factors, for the Dependent Outcome(s) of interest,
    which have occurred, are occurring, or will occur
    anyway.
  • Such Observational methods are the subject of
    Classic Epidemiology
  • Even more so than with Quasi-Experimental
    Designs, it is difficult to determine whether an
    apparent Association or Correlation is in fact
    causal
  • Notation: N = Study Group or Population
  • E = Exposed; Ē = Not Exposed
  • D = Has Outcome (Disease); D̄ = Doesn't
    have Outcome

73
Ecologic
[Diagram: group N1 with aggregate counts of exposed and not exposed (E), and of with and without outcome (D)]

74
Ecologic
  • Measure association between Exposure and Outcome
    based on comparison of aggregate group data.
  • Relationship is between the groups, not
    individuals, which can lead to the Ecologic Fallacy.
  • Know the count, or rate, of exposure for each
    group or area.
  • Separately, also know the count, or rate, of
    the outcome in each group.
  • But do NOT know if the individuals with the
    outcome are the individuals with the exposure.
  • Done just by making associations between existing
    aggregate data.
  • Cheap, easy and may be useful for initial
    hypothesis generation.
  • Very weak ability to draw inference of
    causality when an association is,
    or is not, found.

75
Cross-Sectional or Survey
[Diagram: population N classified at one point in time by exposure (E) and outcome (D) status]

76
Cross-Sectional or Survey
  • e.g., most Surveys, record abstraction QA,
    Polls
  • Exposure and Outcome are measured at the same
    time, and only once.
  • Can measure rates of outcome if population
    surveyed is representative.
  • Can measure Association, but link to Causation
    is weak.
  • Since measuring both exposure and outcome at the same
    time, do not know which came first, therefore
    cannot know causal direction.
  • Recall bias, since both exposure and outcome
    depend on recall
  • Incidence-Prevalence bias, since cases that die
    quickly, or where evidence for exposure
    disappears quickly, are missed.
  • Confounders may not be equally distributed
    among the groups.
  • Useful for hypothesis generating and some
    preliminary testing.

77
Case-Control
[Diagram: cases (D) and controls (not D), each traced back to exposed (E) or not exposed]
78
Case-Control
  • Select individuals on the basis of whether they
    have the Outcome (Cases) or not (Controls).
  • Then measure if they have the predictor/exposure
    of interest or not, retrospectively.
  • Analysis: Are people with the disease more
    likely to have the exposure than people without?
  • Validity and Generalizability depend upon selection
    characteristics of the control group: Cases
    are easy, Controls are hard.
  • Cannot measure prevalence of disease since not a
    natural population
  • People were selected for inclusion on the basis
    of having the disease or not.
  • Ratio of persons with and without the disease is
    artificial, determined by the researcher.
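
Because the ratio of cases to controls is set by the researcher, the usual measure of association here is the odds ratio rather than a risk. The sketch below uses hypothetical counts, and the confidence interval uses the common log-based (Woolf) approximation, which is an assumption of this example.

```python
from math import exp, log, sqrt


def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Odds of exposure among cases divided by odds among controls."""
    return (exp_cases * unexp_controls) / (unexp_cases * exp_controls)


def odds_ratio_ci(exp_cases, unexp_cases, exp_controls, unexp_controls,
                  z=1.96):
    """Approximate 95% confidence interval on the log odds scale."""
    estimate = odds_ratio(exp_cases, unexp_cases,
                          exp_controls, unexp_controls)
    standard_error = sqrt(1 / exp_cases + 1 / unexp_cases
                          + 1 / exp_controls + 1 / unexp_controls)
    return (exp(log(estimate) - z * standard_error),
            exp(log(estimate) + z * standard_error))


# HYPOTHETICAL study: 100 cases and 100 controls.
counts = dict(exp_cases=40, unexp_cases=60,
              exp_controls=20, unexp_controls=80)

estimate = odds_ratio(**counts)
low, high = odds_ratio_ci(**counts)
print(f"Odds ratio: {estimate:.2f} (95% CI {low:.2f} to {high:.2f})")
```

An odds ratio above 1 with an interval that excludes 1 suggests cases were more likely to have had the exposure, subject to the selection and recall issues listed for this design.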

79
Case-Control
  • Advantages
  • Can be done in real time: no wait for the onset
    of disease at some unknown time in future.
  • When disease is rare.
  • Where there is a long lag time between exposure
    and disease.
  • Limits in making assumption of causality from
    association include:
  • Rely on recall or existing records to measure
    prior exposure, where information may be
    from long ago, missing, never recorded, or
    inaccurate.
  • Recall bias more likely, where person with
    disease is more likely to recall exposure than
    healthy person.
  • Subject to confounding by unmeasured or unknown
    exposure factors.

80
Cohort
[Diagram: cohort N split into exposed (E) and not exposed, each followed to outcome (D) or no outcome]
81
Cohort
  • Select individuals belonging to a natural/common
    population group (the cohort)
  • Exposure measured at earlier point in time,
    before the disease has occurred.
  • Analysis: Rate or Risk of onset of the disease
    among those with the prior exposure, compared to
    those who did not have the exposure.
  • Classically Prospective
  • Measure exposures and wait (e.g., Framingham)
  • Can be Retrospective
  • Measures of exposures of interest were already done
    in the cohort at an earlier time, and later the
    investigator follows up to determine who subsequently
    got outcomes (e.g., Harvard Class Health).
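
The cohort analysis described above compares the risk of disease onset between exposed and unexposed members. The sketch below uses hypothetical counts; the risk difference and the log-based confidence interval are standard additions that the slide does not mention.

```python
from math import exp, log, sqrt


def cohort_risks(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk of onset in exposed and unexposed cohort members."""
    return cases_exp / n_exp, cases_unexp / n_unexp


def relative_risk_ci(cases_exp, n_exp, cases_unexp, n_unexp, z=1.96):
    """Relative risk with an approximate 95% confidence interval."""
    risk_exp, risk_unexp = cohort_risks(cases_exp, n_exp,
                                        cases_unexp, n_unexp)
    rr = risk_exp / risk_unexp
    standard_error = sqrt(1 / cases_exp - 1 / n_exp
                          + 1 / cases_unexp - 1 / n_unexp)
    return (rr,
            exp(log(rr) - z * standard_error),
            exp(log(rr) + z * standard_error))


# HYPOTHETICAL cohort: 200 exposed and 200 unexposed, followed forward.
risk_exp, risk_unexp = cohort_risks(30, 200, 10, 200)
rr, low, high = relative_risk_ci(30, 200, 10, 200)

print(f"Risk if exposed: {risk_exp:.2f}, if unexposed: {risk_unexp:.2f}")
print(f"Risk difference: {risk_exp - risk_unexp:.2f}")
print(f"Relative risk: {rr:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Unlike the case-control design, the risks themselves are meaningful here because the cohort is a natural population followed forward from exposure.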

82
Cohort
  • Advantages
  • Since exposure is definitely measured prior to
    the onset of disease, can determine causal
    direction.
  • Less subject to recall and other bias.
  • More easily measure and control for potential
    confounders.
  • Can measure true prevalence and incidence of
    exposures and diseases or other outcome.
  • Disadvantages
  • Takes too long /or Expen