1: Measurement and Sampling

Provided by: sjsu
Learn more at: https://www.sjsu.edu

1
1: Measurement and Sampling
  • What is biostatistics?
  • What is measurement?
  • How do we sample populations?

2
HS 167 Logistics
  • Syllabus materials (text, lab workbook,
    calculator)
  • Calendar and assignments are on
    www.sjsu.edu/biostat → click HS167 (become
    familiar with Web site)
  • Exam 1: 10/9; Exam 2: 11/13; Final: Thur 12/13,
    2:45
  • Lab 0 and Lab 1 (Tu and We labs may have
    additional time to complete Lab 1)
  • Text (reading): pp. 1-10, 15-19 (note
    vocabulary on p. 11)
  • Exercises 1.1-1.6, 1.8, 1.9, 2.1-2.3,
    2.11-2.13 due at beginning of next lecture
  • Yahoo group: send email to
    hs167-F07-subscribe@yahoogroups.com
  • Academic integrity (do your own work)
      • Odd-numbered exercises and lab work: OK to get
        help from friends
      • Even-numbered exercises and exams: do NOT get
        help from friends
  • How to get a good grade
      • Attend all classes and labs (attendance required)
      • Stay on task
      • Read text (listen to Nancy)
      • Do Lab HWs diligently
      • Do not cut corners

3
Biostatistics
  • is not merely a compilation of computational
    techniques
  • is a way of learning from data
  • is concerned with many elements of study
    design and analysis (not just computations)
  • requires more judgment than math (pay attention
    to vocabulary)
  • is statistics applied to biological and health
    problems

4
Biostatistics involves
  • A data detective element
      • Uncovering patterns and clues
      • This is a combination of exploratory data
        analysis (EDA) and descriptive statistics
  • A data judge element
      • Confirmation of clues
      • This often requires inferential methods

5
Measurement
  • Measurement: the assigning of numbers and codes
    according to prior-set rules
  • Three types of statistical measurements:
      • Categorical: classify observations into named
        (nominal) categories
          • e.g., HIV classified as positive or negative
      • Ordinal: ranked categories
          • e.g., OPINION ranked 5 = strongly agree, 4 =
            agree, 3 = neutral, and so on
      • Quantitative: numbers with equal spacing
          • e.g., AGE in years
          • e.g., BLOOD_PRESSURE in mm Hg
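
The three measurement types can be sketched in Python as follows. This is only an illustration: the codes 5, 4, and 3 come from the slide, while the labels for codes 2 and 1 and the list of ages are assumptions made up for the example.

```python
from enum import IntEnum


class Opinion(IntEnum):
    """Ordinal codes for OPINION. Codes 5, 4, 3 are from the slide;
    the labels for 2 and 1 are assumed for illustration."""
    STRONGLY_DISAGREE = 1  # assumed label
    DISAGREE = 2           # assumed label
    NEUTRAL = 3
    AGREE = 4
    STRONGLY_AGREE = 5


# Categorical: named categories with no inherent order.
HIV_CATEGORIES = {"positive", "negative"}
hiv_status = "positive"

# Ordinal: the order of the codes is meaningful, so sorting makes sense,
# but the distance between adjacent codes is not assumed to be equal.
responses = [Opinion.AGREE, Opinion.NEUTRAL, Opinion.STRONGLY_AGREE]
ranked = sorted(responses)

# Quantitative: equal spacing, so arithmetic such as a mean is meaningful.
ages = [27, 31, 45]  # hypothetical AGE values in years
mean_age = sum(ages) / len(ages)

print(hiv_status in HIV_CATEGORIES)
print([r.name for r in ranked])
print(round(mean_age, 1))
```

Sorting is reasonable for ordinal codes and averaging is reasonable for quantitative values; neither operation means much for a categorical variable such as HIV status.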

6
Illustrative Example: Weight Change and Heart
Disease
Source: Willett et al., 1995
  • Goal: to determine the effect of weight change on
    coronary heart disease (CHD) risk
  • 115,818 women 30 to 55 years of age, free of CHD
  • Body mass index (BMI, kg/m²) determined at entry
    to study
  • Body weight determined as of age 18
  • Subjects followed for 14 years
  • Number of CHD onsets (fatal and nonfatal) counted
    (1,292 cases)
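
BMI is weight in kilograms divided by the square of height in meters, which is why its unit is kg/m². A minimal sketch of that calculation is shown here; the function name and the example weight and height are hypothetical and are not taken from the study.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2: weight divided by height squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical subject (not from the study): 65 kg, 1.65 m tall.
example_bmi = bmi(65.0, 1.65)
print(round(example_bmi, 1))
```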

7
Illustrative Example (cont.)
Variables
  • Categorical
      • Smoker or nonsmoker
      • Family history of heart disease (yes or no)
  • Ordinal
      • Non-smoker, light smoker, moderate smoker, heavy
        smoker
  • Quantitative
      • BMI (kg/m²)
      • Age (years)
      • Weight at present
      • Weight at age 18

8
Variable, Value, Observation
  • Observation: the unit upon which measurements
    are made
      • Can be an individual (e.g., a person)
      • Can be an aggregate of individuals (e.g., a
        region)
  • Variable: the generic thing we measure
      • e.g., AGE of a person
      • e.g., HIV status of a person
  • Value: a realized measurement
      • e.g., 27
      • e.g., positive

9
Data Structure (Forms)
Observation 1
Data Collection Form
  Var1 (ID): 1
  Var2 (AGE): 27
  Var3 (SEX): F
  Var4 (HIV): Y
  Var5 (KAPOSISARC): Y
  Var6 (REPORTDATE): 4/25/89
  Var7 (OPPORTUNIS): N
Observation 2
Observation 3
Observation 4
10
U.S. Census Form
11
Data Structure (Table)
  • Observations → rows
  • Variables → columns
  • Values → cells
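
One way to picture the rows, columns, and cells layout is a list of records, sketched below under some assumptions. The single row holds the values of Observation 1 from the data collection form; the helper names `column` and `cell` are hypothetical, and the values of the other observations are not given in the slides, so they are left out.

```python
# Variables are the columns of the table.
VARIABLES = ["ID", "AGE", "SEX", "HIV", "KAPOSISARC", "REPORTDATE", "OPPORTUNIS"]

# Each observation is one row; here only Observation 1 is known.
table = [
    {"ID": 1, "AGE": 27, "SEX": "F", "HIV": "Y",
     "KAPOSISARC": "Y", "REPORTDATE": "4/25/89", "OPPORTUNIS": "N"},
]


def column(rows, variable):
    """Return all values recorded for one variable (one column)."""
    return [row[variable] for row in rows]


def cell(rows, obs_index, variable):
    """Return a single value: the cell at one row and one column."""
    return rows[obs_index][variable]


print(column(table, "AGE"))   # the AGE column
print(cell(table, 0, "HIV"))  # the HIV value for the first observation
```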
12
Illustrative Example: Cigarette Consumption and
Lung Cancer
Variables
  • country: name of country/region
  • cig1930: per capita cigarette consumption, 1930
  • mortalit: lung cancer deaths per 100,000 in 1950
Note: The units of observation in this data set are
regions (not people)
13
Data Quality
  • An analysis is only as good as its data
  • GIGO: garbage in, garbage out
  • Does a variable measure what it purports to?
  • Validity: freedom from systematic error
  • Objectivity: seeing things as they are without
    making them conform to your worldview
  • Discussion on avoiding bias when questioning,
    e.g., consider the word "jam"

14
Ethos: Which do you choose?
Blackburn, S. (2005). Truth. Oxford University Press.
Frankfurt, H. G. (2005). On Bullshit. Princeton
University Press.
The difference is intention and method: BS has a
predetermined outcome. Truth is earnest in its
intent and does not bend the facts to a
predetermined outcome.
15
Truth Versus Perception
"I cannot give any scientist of any age any better
advice than this: the intensity of the conviction
that a hypothesis is true has no bearing on
whether it is true or not." Peter Medawar
(1915-1987)
Plato's Allegory of the Cave: We observe shadows
on the wall. The truth lies outside.
16
Two Types of Statistical Studies
  • Surveys: quantify population characteristics
      • e.g., % of population that is overweight
      • e.g., expected life span
  • Comparative studies: determine relationships
    between variables
      • e.g., relationship between weight gain and heart
        disease risk
      • e.g., relationship between alcohol consumption
        and esophageal cancer risk
  • We start by considering survey sampling

17
Sampling for a Survey
  • We seldom (if ever) study an entire population
  • Take a subset (sample) of the population
  • Use characteristics of the sample to infer
    population characteristics
  • Select a probability sample
      • Chance determines which individuals are selected
  • Avoid non-probability samples
      • Discuss volunteer bias as an example

18
Simple Random Sample (SRS)
  • SRS (definition): every possible sample of a given
    size from the population has the same probability
    of being selected
      • This is the most basic type of probability sample
  • SRSs have sampling independence
      • Selection of one individual does not influence
        selection of any other
  • SRSs can be done with replacement or without
    replacement (both methods are usually valid)
  • Sampling fraction = n / N = probability of
    selection, where
      • n = sample size
      • N = population size
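
The link between the SRS definition and the sampling fraction can be checked with a small calculation, sketched here for sampling without replacement. The values N = 10 and n = 3 and the seed are arbitrary assumptions chosen for illustration.

```python
import math
import random

N = 10  # hypothetical population size
n = 3   # hypothetical sample size

# Sampling fraction: probability that any given individual is selected.
sampling_fraction = n / N

# Without replacement there are C(N, n) possible samples, and under an
# SRS each of them is equally likely.
possible_samples = math.comb(N, n)
prob_each_sample = 1 / possible_samples

# A given individual belongs to C(N-1, n-1) of those samples, so the
# share of samples containing that individual equals n / N.
share_containing_one = math.comb(N - 1, n - 1) / possible_samples

rng = random.Random(167)             # arbitrary seed for repeatability
frame = list(range(1, N + 1))        # individuals numbered 1..N
srs_without = rng.sample(frame, n)   # without replacement: no repeats
srs_with = rng.choices(frame, k=n)   # with replacement: repeats possible

print(sampling_fraction, possible_samples, share_containing_one)
print(srs_without, srs_with)
```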

19
SRS Method
  • Compile census listing (sampling frame)
      • Individuals numbered 1, 2, . . ., N
  • Generate n random numbers between 1 and N
      • Can be done with random number generator (lab) or
        with table of random digits
  • Select individuals based on random number list

You will take a SRS in lab this week
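
The SRS method above can be sketched in a few lines of Python. This is not the lab's procedure, only an illustration: the function name `simple_random_sample`, the 20-person sampling frame, and the seed are all hypothetical, and the draw is made without replacement.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw an SRS of n individuals, without replacement, from a frame."""
    N = len(frame)
    if not 0 <= n <= N:
        raise ValueError("n must be between 0 and N")
    rng = random.Random(seed)
    # Generate n distinct random numbers between 1 and N.
    numbers = sorted(rng.sample(range(1, N + 1), n))
    # Select the individuals whose listing number was drawn.
    return [frame[i - 1] for i in numbers]


# Hypothetical sampling frame: individuals numbered 1..20.
frame = [f"person_{i:02d}" for i in range(1, 21)]
sample = simple_random_sample(frame, 5, seed=2007)
print(sample)
```

Passing a seed makes the draw repeatable, which can help when checking work; leaving it as None gives a fresh sample on each run.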