Title: Introduction to Inferential Statistics
1. Introduction to Inferential Statistics
2. Inferential statistics
- So far we've assessed relationships between variables in two ways:
  - Categorical variables: tables and proportions (percentages)
  - Continuous variables: scattergrams and simple correlation (r)
- Inferential statistics are an extension of these procedures
- They provide far more precise assessments of relationships
  - Examples: Higher rank → more stress; higher income → less crime (e.g., a correlation of r = -.6 implies r² = .36; a short code sketch follows)
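To make the r and r² idea concrete, here is a minimal sketch (my own illustration, not from the slides) that computes a simple correlation with SciPy; the rank and stress numbers are invented for demonstration only.

```python
import numpy as np
from scipy import stats

# Hypothetical data: rank (1 = lowest) and a stress score for eight people
rank = np.array([1, 2, 3, 4, 5, 6, 7, 8])
stress = np.array([30, 34, 31, 38, 40, 37, 44, 46])

r, p_value = stats.pearsonr(rank, stress)   # simple correlation (r)
r_squared = r ** 2                          # proportion of shared variance (r^2)
print(f"r = {r:.2f}, r^2 = {r_squared:.2f}, p = {p_value:.3f}")
```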
3. Using inferential statistics
- Examples of inferential statistics (a brief code sketch follows this list)
  - Categorical variables: Chi-Square (χ²)
  - Combination of a categorical dependent and a continuous independent variable: difference between the means test (t statistic)
  - Continuous variables: correlation and regression (r and r²) can be used inferentially; the b statistic is generated through regression analysis
  - Combination of nominal and continuous variables: logistic regression, which generates b and exp(b) (odds ratio) statistics
- Requirements
  - Must use probability sampling techniques (e.g., random sampling)
  - Parametric inferential statistics, including r, r², b and t: variables must be continuous and approximately normally distributed in the population
  - Non-parametric statistics: variables need not be normally distributed. We will cover one, Chi-Square (χ²).
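As a rough illustration of the tests listed above, the sketch below runs a Chi-Square test, a difference-between-means test, and a logistic regression on invented data using SciPy and statsmodels; all variable names and values are my own assumptions, not from the lecture.

```python
import numpy as np
from scipy import stats
import statsmodels.api as sm

# Chi-Square (χ²): two categorical variables summarized as a 2x2 frequency table
observed = np.array([[20, 30],
                     [35, 15]])
chi2, p_chi, dof, expected = stats.chi2_contingency(observed)

# Difference between means (t): a dichotomous IV defines two groups, DV is continuous
rng = np.random.default_rng(0)
group_a = rng.normal(50, 10, 40)
group_b = rng.normal(55, 10, 40)
t_stat, p_t = stats.ttest_ind(group_a, group_b)

# Logistic regression: dichotomous DV, continuous IV -> b and exp(b) (odds ratio)
x = rng.normal(0, 1, 200)
y = (rng.random(200) < 1 / (1 + np.exp(-x))).astype(int)
logit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
b = logit.params[1]            # slope for x
odds_ratio = np.exp(b)         # exp(b)

print(f"chi2 = {chi2:.2f} (p={p_chi:.3f}), t = {t_stat:.2f} (p={p_t:.3f}), "
      f"b = {b:.2f}, exp(b) = {odds_ratio:.2f}")
```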
4. General procedure
- Types of hypotheses
  - Working hypothesis: what a regular hypothesis is called
  - Null hypothesis: a fixed presumption that any observed relationship between two variables is caused by chance
- Draw one or more samples and code the independent and dependent variables
- Use a test statistic (e.g., r) to assess the hypothesized relationship
- The computer calculates a coefficient for the test statistic (e.g., r = .21)
- These coefficients are the sum of two components:
  - Systematic variance: the actual, systematic relationship between the variables
  - Error variance: an apparent relationship caused by sampling error. The size of this component can be precisely calculated and shrinks as sample size increases (see the simulation sketch below).
- The big question: once we remove the error component, is there enough of a real relationship left to reject the null hypothesis?
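The following simulation sketch (my own illustration, not part of the slides) shows the error-variance idea: when two variables are actually unrelated, the correlations produced purely by sampling error shrink as the sample size grows.

```python
import numpy as np

rng = np.random.default_rng(0)

for n in (25, 100, 1000):
    # 2,000 samples drawn from a population in which the true r is 0
    chance_rs = [abs(np.corrcoef(rng.normal(size=n), rng.normal(size=n))[0, 1])
                 for _ in range(2000)]
    print(f"n = {n:4d}: typical chance |r| ~ {np.mean(chance_rs):.3f}")
```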
5. Test statistics and the null hypothesis
- To reject the null hypothesis of no relationship, the test statistic coefficient (e.g., r = .7) must remain sufficiently large after subtracting sampling error
- How much room is required? Enough to yield a probability of less than five in one hundred (< .05) that the relationship between the variables was produced by chance.
- If the computer decides that the coefficient is sufficiently large, it will award at least one asterisk (*). The relationship between the variables is statistically significant and the null hypothesis (no relationship) is FALSE.
- If the coefficient is too small, no asterisk is awarded. The association between the variables is deemed non-significant and the null hypothesis is TRUE. Working hypotheses that depend on this relationship must be rejected.
- For significant relationships, one to three asterisks usually appear next to the test statistic's coefficient (e.g., .25*, .36**, .41***). More asterisks mean greater confidence that a relationship is systematic, not the product of chance:
  - * ("good"): probability less than 5 in 100 that a coefficient was produced by chance (p < .05)
  - ** ("better"): probability less than 1 in 100 that a coefficient was produced by chance (p < .01)
  - *** ("best"): probability less than 1 in 1,000 that a coefficient was produced by chance (p < .001)
- Instead of asterisks, the actual probability that a coefficient was produced by chance is sometimes given, usually in a column labeled "p"
- Again, significant relationships are denoted by p's less than .05 (a small helper illustrating the asterisk convention appears below)
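A small illustrative helper (my own naming, not from the slides) that maps a p-value to the conventional asterisks described above:

```python
def significance_stars(p: float) -> str:
    """Return the conventional asterisks for a p-value (illustrative helper)."""
    if p < 0.001:
        return "***"   # less than 1 in 1,000 ("best")
    if p < 0.01:
        return "**"    # less than 1 in 100 ("better")
    if p < 0.05:
        return "*"     # less than 5 in 100 ("good")
    return ""          # non-significant

for p in (0.20, 0.04, 0.008, 0.0004):
    print(p, significance_stars(p) or "(n.s.)")
```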
6. Some statistics used for testing relationships
Procedure: Correlation
  Level of measurement: All variables continuous
  Statistic: r
  Interpretation: Range -1 to +1, with 0 meaning no relationship. For example, .35 denotes a moderately strong positive relationship.

Procedure: Regression
  Level of measurement: All variables continuous
  Statistics: r², R², b
  Interpretation: r² is the proportion of change in the dependent variable accounted for by change in the independent variable; R² denotes the cumulative effect of multiple independent variables; b is the unit change in the dependent variable caused by a one-unit change in the independent variable.

Procedure: Logistic regression
  Level of measurement: DV nominal dichotomous; IVs nominal or continuous
  Statistics: b and exp(B)
  Interpretation: Don't try to interpret b directly. exp(B) gives the odds that the DV will change if the IV changes one unit (or, if the IV is dichotomous, changes its state). Range 0 to infinity; 1 denotes even odds, meaning no relationship. Higher than 1 means a positive relationship, lower than 1 a negative relationship. Use a percentage to describe the likelihood of the effect.

Procedure: Chi-Square
  Level of measurement: All variables categorical, not ordinal
  Statistic: χ²
  Interpretation: Reflects the difference between observed and expected frequencies. Use a table to determine whether the coefficient is sufficiently large to reject the null hypothesis.

Procedure: Difference between means
  Level of measurement: IV dichotomous, DV continuous
  Statistic: t
  Interpretation: Reflects the magnitude of the difference. Use a table to determine whether the coefficient is sufficiently large to reject the null hypothesis.
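To connect the regression row above to output one might actually see, here is a brief sketch using statsmodels with invented income and crime data; the variable names and coefficient values are assumptions for illustration only.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
income = rng.normal(50, 10, 200)                    # hypothetical IV
crime = 80 - 0.6 * income + rng.normal(0, 5, 200)   # hypothetical DV

model = sm.OLS(crime, sm.add_constant(income)).fit()
b = model.params[1]           # unit change in DV per one-unit change in IV
r_squared = model.rsquared    # proportion of variance accounted for
print(f"b = {b:.2f}, R^2 = {r_squared:.2f}")
```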
7. A caution on hypothesis testing
- Probability statistics are the most common way to evaluate relationships, but they have been criticized for suggesting misleading results. (Click here for a summary of the arguments.)
- We normally use p values to accept or reject null hypotheses. But the actual meaning is more subtle.
- Formally, p < .05 means that, if an association between variables was tested an infinite number of times, a test statistic coefficient as large as the one actually obtained (say, an r of .3) would come up less than five times in a hundred if the null hypothesis of no relationship was actually true. (The simulation sketched below illustrates this definition.)
- For our purposes, as long as we keep in mind the inherent sloppiness of social science, and the difficulties of accurately quantifying social science phenomena, it's sufficient to use p-values to accept or reject null hypotheses.
- We should always be skeptical of findings of significance, particularly when very large samples are involved, as even weak relationships will tend to be statistically significant. (More on this later.)
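The simulation below (an illustration I've added, not from the slides) mirrors that formal definition: with no true relationship in the population, it counts how often a sample r at least as large as .3 turns up by chance alone.

```python
import numpy as np

rng = np.random.default_rng(42)
n, trials, observed_r = 50, 10_000, 0.3

hits = 0
for _ in range(trials):
    x = rng.normal(size=n)
    y = rng.normal(size=n)   # unrelated to x, so the null hypothesis is true
    if abs(np.corrcoef(x, y)[0, 1]) >= observed_r:
        hits += 1

print(f"Chance of an |r| >= {observed_r} under the null: {hits / trials:.3f}")
```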
8. Examples of tables from articles, panels 1-12
9. Panel 1
Hypothesis: Alcohol consumption → Victimization
Method: Logistic regression
Statistics: b and odds ratio (exp b)
Richard B. Felson and Keri B. Burchfield, "Alcohol and the Risk of Physical and Sexual Assault Victimization," Criminology 42:4 (2004)
10. Panel 2
Hypothesis: Black race and related factors → Distrust of police
Method: Logistic regression
Statistic: b (called the "Estimate")
Elaine B. Sharp and Paul E. Johnson, "Accounting for Variation in Distrust of Local Police," Justice Quarterly 26:1 (2009)
11. Panel 3
Hypothesis: Race and class → Satisfaction with police
Method: Logistic regression
Statistics: b and exp b (odds ratio)
Yuning Wu, Ivan Y. Sun and Ruth A. Triplett, "Race, Class or Neighborhood Context: Which Matters More in Measuring Satisfaction With Police?," Justice Quarterly 26:1 (2009)
12. Panel 4
Hypothesis: Low self-control → More contact with police
Method: Logistic regression
Statistics: b and exp b (odds ratio)
Kevin M. Beaver, Matt DeLisi, Daniel P. Mears and Eric Stewart, "Low Self-Control and Contact with the Criminal Justice System in a Nationally Representative Sample of Males," Justice Quarterly 26:4 (2009)
13. Panel 5
Hypothesis: Gender and race of victim → Imposition of death sentence
Method: Logistic regression
Statistics: b (coefficient) and odds ratio (exp b)
Marian R. Williams, Stephen Demuth and Jefferson E. Holcomb, "Understanding the Influence of Victim Gender in Death Penalty Cases: The Importance of Victim Race, Sex-Related Victimization, and Jury Decision Making," Criminology 45:4 (2007)
14. Panel 6
Hypothesis: Academic performance → Delinquency
Method: Tobit regression (best when the DV has a zero value for a large proportion of cases)
Statistic: b
Richard B. Felson and Jeremy Staff, "Explaining the Academic Performance-Delinquency Relationship," Criminology 44:2 (2006)
15. Panel 7
Hypothesis: Strains of imprisonment → Recidivism
Method: Logistic regression
Statistics: B and exp B (odds ratio)
Shelley Johnson Listwan, Christopher J. Sullivan, Robert Agnew, Francis T. Cullen and Mark Colvin, "The Pains of Imprisonment Revisited: The Impact of Strain on Inmate Recidivism," Justice Quarterly 30:1 (2013)
16. Panel 8
Hypothesis: Father's incarceration → Son's delinquency
Method: Logistic regression
Statistic: Odds ratio (standard error in parentheses)
Michael E. Roettger and Raymond R. Swisher, "Associations of Fathers' History of Incarceration With Sons' Delinquency and Arrest Among Black, White and Hispanic Males in the United States," Criminology 49:4 (2011)
17. Panel 9
Hypothesis: Officer and driver race → Vehicle search
Method: Logistic regression
Statistics: Odds ratio (standard error in parentheses)
Jeff Rojek, Richard Rosenfeld and Scott Decker, "Policing Race: The Racial Stratification of Searches in Police Traffic Stops," Criminology 50:4 (2012)
18. Panel 10
Brian D. Johnson and Stephanie M. Dipietro, "The Power of Diversion: Intermediate Sanctions and Sentencing Disparity Under Presumptive Guidelines," Criminology 50:3 (2012)
19. Panel 11
Hypothesis: Child abuse and neighborhood factors → Child's subsequent violent behavior
Method: Logistic regression
Statistic: b (coefficient)
Emily M. Wright and Abigail A. Fagan, "The Cycle of Violence in Context: Exploring the Moderating Roles of Neighborhood Disadvantage and Cultural Norms," Criminology 51:2 (2013)
20. Panel 12
Hypothesis: Marriage → Desistance from crime
Method: HLM (similar to logistic regression)
Statistic: b (coefficient; log odds can be computed)
Bianca E. Bersani and Elaine Eggleston Doherty, "When the Ties That Bind Unwind: Examining the Enduring and Situational Processes of Change Behind the Marriage Effect," Criminology 51:2 (2013)