P1246990939QubTa - PowerPoint PPT Presentation

Slides: 46
Provided by: vaness6
Transcript and Presenter's Notes

1
Properties of Estimators
2
OLS review
Ordinary Least Squares chooses the line that
minimizes the sum of squared errors (residuals)
around it. The standard error of the regression is
roughly the typical deviation of the observations
from that line.
3
[Figure: Political Tolerance plotted against Education, with the slope, the mean, and the residuals labeled]
4
Residuals review
  • Residuals of an OLS analysis (errors around the
    regression line) have a mean of zero.
  • This is true by construction: when the model
    includes a constant, minimizing the sum of squared
    residuals forces them to sum to zero.
  • We also assume that they are normally
    distributed.
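
The zero-mean property of the residuals can be checked numerically. The sketch below fits a least-squares line by hand and averages the residuals. The `education` and `tolerance` values are invented for illustration and are not data from the slides, and `ols_fit` is a hypothetical helper name.

```python
from statistics import mean

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data, invented for illustration only
education = [0, 1, 2, 3, 4, 5, 6]
tolerance = [1, 1, 2, 2, 4, 3, 5]

intercept, slope = ols_fit(education, tolerance)

# Residual = observed value minus predicted value (signed, not squared)
residuals = [yi - (intercept + slope * xi)
             for xi, yi in zip(education, tolerance)]

print("slope:", round(slope, 3))
print("mean of residuals:", round(abs(mean(residuals)), 10))
```

The mean comes out as zero up to floating-point rounding whenever the model includes an intercept, whatever data are used.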

5
Regression results review
------------------------------------------------------------------------------
       happy |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
    prestg80 |  -.0380391   .0209348    -1.82   0.103                 -.518061
       _cons |   3.330371   .8050567     4.14   0.003                        .
------------------------------------------------------------------------------
6
Residuals are variables
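For each observation, the residual is the signed
vertical distance between the observed value and
the regression line (observed minus predicted).
OLS squares these distances when it minimizes them.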
7
Residuals and OLS
  • They are therefore assumed to follow a normal
    distribution with a mean of zero.
  • The standard deviation is not necessarily 1, but
    it is assumed to be constant across all values of
    x.
  • Foreshadowing: if this assumption does not hold,
    you are not advised to use OLS.

8
What is the question that we ask in scientific
analysis?
  • Are we wrong about our theory?
  • Or how likely is it that we are wrong about our
    theory?
  • Is there a non-zero relationship?
  • How much better than the mean have we done in
    predicting the dependent variable from the
    independent variable?

9
The Null Hypothesis
  • The null hypothesis is that the relationship is
    zero, that the slope is zero, that we are doing
    no better than the mean.
  • We are trying to reject the null hypothesis.

10
Confidence in point estimates
  • We have a point estimate of y for each value of
    x
  • The set of predicted values is a variable
  • The predicted values trace out the regression
    line, but its slope is estimated only from our
    sample.
  • We do not yet know whether it holds in the
    population.

11
Error in estimation
  • So, we know that there is error in our estimate.
    We put bounds around that estimate.
  • So, to reject the null hypothesis, the interval
    between the lower and upper bounds of our
    estimate must not contain zero.

12
Strange question to ask
  • How likely is it that the true value from the
    population is zero? (not different from the mean
    of y)
  • How likely is it that the true value of the slope
    is NOT zero?

13
A Caveat
  • Standardization review
  • Z scores
  • Normal distribution
  • Standard normal distribution

14
Standardized variable review
  • Z scores are linear transformations of variables
  • Z score(x) = (x - mean of x) / (standard deviation
    of x)

15
Z scores
  • Z scores always have
  • a mean of zero
  • a standard deviation of 1
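
A short sketch of standardization, showing that the resulting z scores have a mean of zero and a standard deviation of 1. The `values` list is hypothetical, and `z_scores` is an illustrative helper name rather than anything from the slides.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize: subtract the mean, then divide by the sample std. dev."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical data, invented for illustration only
values = [2, 4, 4, 5, 7, 8]
z = z_scores(values)

print("mean of z:", round(abs(mean(z)), 10))
print("std. dev. of z:", round(stdev(z), 10))
```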

16
Histogram of Happiness
17
Creating a zscore
. sum happy

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       happy |        11    1.909091     .700649          1          3

. generate zhappy = happy - 1.9/.7

. sum zhappy happy

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      zhappy |        11   -.8051948    .7006491  -1.714286   .2857143
       happy |        11    1.909091     .700649          1          3
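
The summary statistics reported for zhappy (mean -.8051948, standard deviation .7006491) are what the `generate` expression yields when it is evaluated without parentheses, because division binds more tightly than subtraction. The sketch below compares that expression with the parenthesized z-score formula. The `happy` values are an assumption: they are reconstructed from the deck's frequency table (3, 6, and 2 observations) and the reported minimum of 1 and maximum of 3.

```python
from statistics import mean, stdev

# Assumed data: 3 ones, 6 twos, 2 threes (reconstructed, not given directly)
happy = [1] * 3 + [2] * 6 + [3] * 2

# Expression as typed: 1.9/.7 is evaluated first, then subtracted
as_typed = [h - 1.9 / .7 for h in happy]

# Intended z-score formula: subtract the mean first, then divide
parenthesized = [(h - 1.9) / .7 for h in happy]

for name, v in (("as typed", as_typed), ("parenthesized", parenthesized)):
    print(name, round(mean(v), 7), round(stdev(v), 7),
          round(min(v), 6), round(max(v), 6))
```

Only the parenthesized version has a mean near zero and a standard deviation near 1. It is not exact, because 1.9 and .7 are rounded versions of the sample mean and standard deviation.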

18
Frequency of zhappy
. tab zhappy

     zhappy |      Freq.     Percent        Cum.
------------+-----------------------------------
  -1.285714 |          3       27.27       27.27
   .1428571 |          6       54.55       81.82
   1.571429 |          2       18.18      100.00
------------+-----------------------------------
      Total |         11      100.00

19
Histogram of z score of happiness
20
Descriptives Syntax
sum vars = happy zhappy.
21
What is the correlation between zhappy and happy?
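
One way to check the answer is to compute the correlation directly. Because zhappy is a linear transformation of happy with a positive multiplier, the correlation should be 1. The `happy` values below are an assumption, reconstructed from the deck's frequency table (3, 6, and 2 observations at the values 1, 2, and 3), and `pearson_r` is a hypothetical helper name.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient, computed from its definition."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Assumed data: 3 ones, 6 twos, 2 threes (reconstructed, not given directly)
happy = [1] * 3 + [2] * 6 + [3] * 2
zhappy = [(h - 1.9) / .7 for h in happy]

r = pearson_r(happy, zhappy)
print("correlation:", round(r, 10))
```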
23
Normal distribution review
24
  • Approximately 68 percent of the area under a
    normal curve lies within one standard deviation
    of the mean, that is, between the mean minus one
    standard deviation and the mean plus one
    standard deviation.

25
  • Approximately 95% of the area lies within 2
    standard deviations of the mean.

26
  • Approximately 99.7% lies within 3 standard
    deviations of the mean.
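
These three areas can be reproduced from the standard normal cumulative distribution function. A minimal sketch using Python's standard library; `area_within` is an illustrative helper name.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The "2 standard deviations" rule is a rounding of the exact 95% cutoff, which is about 1.96.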

27
Standard normal distribution
28
Attributes of standard normal
  • Mean is zero
  • Standard deviation is 1
  • 68% of the area lies between -1 and 1
  • 95% of the area lies between -2 and 2
  • 99.7% of the area lies between -3 and 3

29
95% confidence interval
  • Generally, we want to be at least 95% confident
    that the interval around our estimate does not
    include zero.
  • So, to be 95% confident, the slope must lie at
    least two standard errors away from the mean of
    the curve, which is zero.

30
Review: Central limit theorem
  • The central limit theorem is based on a theory of
    repeated samples
  • A 95% confidence interval means that if this
    process of estimation were repeated in 100
    samples from the same population, about 95 of
    the resulting intervals would contain the true
    population value and about 5 would not.

31
We are trying to reject the hypothesis that the
relationship is zero
  • So, the further the slope lies from zero, the
    more confident we are that it is not zero.
  • We know that the area under the normal curve
    beyond 2 standard deviations above zero (the
    mean) is approximately 2.5% of the area of the
    curve.
  • We also know that the area beyond 2 standard
    deviations in the other direction is another
    2.5% of the area of the curve.

32
T statistic
If the slope falls outside the range of 2 standard
errors from 0, then we can say that we are 95%
confident that the relationship is not zero.
33
Formula for t
  • t = slope / standard error
  • If t is at least 2 in absolute value, then the
    slope is at least two standard deviations from
    the mean of the curve, which is zero (why is it
    0?), and we are 95% confident that the
    relationship is NOT zero
  • Significance is a transformation of the t
    statistic into a tail probability, based on the
    theory of the normal curve.
  • Also known as probability values (p).
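
As a worked example, the t statistic for prestg80 can be recomputed from the coefficient and standard error in the regression output. The p value in this sketch uses the normal curve, as the slides describe. That is an approximation: the regression output reports 0.103, which comes from a t distribution with 9 degrees of freedom, so the normal-curve value is somewhat smaller.

```python
from statistics import NormalDist

# Values copied from the regression output for prestg80
slope = -0.0380391
std_err = 0.0209348

t = slope / std_err
print("t:", round(t, 2))

# Two-sided tail probability under the standard normal curve
# (normal approximation; not the t-distribution value in the output)
p_normal = 2 * (1 - NormalDist().cdf(abs(t)))
print("approximate p:", round(p_normal, 3))
```

Since |t| is below 2, the slope lies within two standard errors of zero and the null hypothesis is not rejected at the 95% level.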

34
How confident are we?
  • If the slope falls within two standard
    deviations from zero, then we have a difficult
    time saying that we are confident.
  • Because we can compute the probability of seeing
    a slope this far from zero in repeated samples
    if the population relationship were zero, we can
    state how confident we are.

35
t = 1
  • Approximately 68 percent of the area under a
    normal curve lies within one standard deviation
    of the mean.
  • If t = 1, then we are 68% confident.
  • That is not very confident.

36
t = 3
Approximately 99.7% lies within 3 standard
deviations of the mean. If t = 3, then the
theory (from which theorem?) is that if we
repeated samples, 99.7% of the time, the sample
slope would not be zero.
37
One tailed versus two tailed test
[Figure: normal curve with 95% of the area in the middle and 2.5% in each tail]
You can use theory to rule out one of the areas
covering 2.5%. If you know the slope should be
positive, then you can cross out the 2.5% on the
left. Then you are 97.5% confident that the
relationship is not zero.
38
One tailed versus two tailed test
[Figure: normal curve with 95% of the area in the middle and 2.5% in each tail]
You can use theory to rule out one of the areas
covering 2.5%. If you know the slope should be
negative, then you can cross out the 2.5% on the
right. Then you are 97.5% confident that the
relationship is not zero.
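
The tail arithmetic behind one-tailed and two-tailed tests can be sketched with the standard normal curve. The variable names are illustrative. The last line shows a related point the slides do not state: if only 95% confidence is wanted in a one-tailed test, a smaller cutoff suffices.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Two-tailed test at 95%: 2.5% in each tail
two_tailed_cutoff = z.inv_cdf(0.975)
area_one_tail = 1 - z.cdf(two_tailed_cutoff)

# Ruling out one tail at the same cutoff leaves 97.5% confidence
one_tailed_confidence = 1 - area_one_tail

# Cutoff that puts all 5% in a single tail
one_tailed_cutoff = z.inv_cdf(0.95)

print(round(two_tailed_cutoff, 3), round(area_one_tail, 3))
print(round(one_tailed_confidence, 3), round(one_tailed_cutoff, 3))
```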
39
Defining the meaning of 95% confidence
If a certain interval is a 95% confidence
interval, then we can say that if we repeated the
procedure of drawing random samples and computing
confidence intervals over and over again, 95% of
those confidence intervals would include the true
value from the population. This is not to say that
we are 95% confident that the true value lies
between the upper and lower bound of this one
interval.
40
Defining the meaning of 95% confidence
  • Instead, I am 95% confident that a confidence
    interval covers the true value from the
    population, based not on this single confidence
    interval from this single test,
  • but rather
  • as a result of what would happen were I to repeat
    the process of drawing samples and doing this
    test over and over again.
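
The repeated-sampling interpretation can be illustrated by simulation. The sketch below makes several simplifying assumptions that are not from the slides: it estimates a population mean rather than a regression slope, the population is normal with a known standard deviation, and every constant is hypothetical.

```python
import math
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population and sampling design
TRUE_MEAN, SIGMA, N, REPS = 5.0, 2.0, 30, 2000

# Half-width of a 95% interval when sigma is known
half_width = 1.96 * SIGMA / math.sqrt(N)

covered = 0
for _ in range(REPS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    centre = mean(sample)
    # Does this sample's interval contain the true value?
    if centre - half_width <= TRUE_MEAN <= centre + half_width:
        covered += 1

coverage = covered / REPS
print(f"{coverage:.3f} of the intervals covered the true mean")
```

Each individual interval either contains the true value or does not. The 95% describes the share of intervals that do across many repetitions.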

41
Happiness and occupational prestige
. regr happy prestg80

      Source |       SS       df       MS              Number of obs =      11
-------------+------------------------------           F(  1,     9) =    3.30
       Model |  1.31753739     1  1.31753739           Prob > F      =  0.1026
    Residual |  3.59155351     9  .399061502           R-squared     =  0.2684
-------------+------------------------------           Adj R-squared =  0.1871
       Total |  4.90909091    10  .490909091           Root MSE      =  .63171

------------------------------------------------------------------------------
       happy |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    prestg80 |  -.0380391   .0209348    -1.82   0.103     -.085397    .0093187
       _cons |   3.330371   .8050567     4.14   0.003     1.509207    5.151536
------------------------------------------------------------------------------
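
The summary figures in this output can be recomputed from the sums of squares, the coefficient, and the standard error. In the sketch below, every input is copied from the output except `t_crit`, which is an outside value: the 97.5th percentile of a t distribution with 9 degrees of freedom, taken from a standard t table.

```python
import math

# Copied from the regression output
ss_model, ss_resid, ss_total = 1.31753739, 3.59155351, 4.90909091
df_model, df_resid, df_total = 1, 9, 10
coef, std_err = -0.0380391, 0.0209348

r_squared = ss_model / ss_total
ms_resid = ss_resid / df_resid
f_stat = (ss_model / df_model) / ms_resid
root_mse = math.sqrt(ms_resid)
adj_r_squared = 1 - (1 - r_squared) * df_total / df_resid

# Assumption: critical value from a t table (9 df, two-sided 95%)
t_crit = 2.262157
lower = coef - t_crit * std_err
upper = coef + t_crit * std_err

print(round(r_squared, 4), round(adj_r_squared, 4), round(f_stat, 2))
print(round(root_mse, 5), round(lower, 6), round(upper, 7))
```

The interval for prestg80 runs from a negative lower bound to a positive upper bound, so it contains zero. That agrees with a t statistic smaller than 2 in absolute value.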
42
Effect of Index of Signals on the Number of Cases
on the U.S. Supreme Court Agenda, 1953-1995
[Figure: estimate with the upper and lower bounds of the 95% confidence interval, by Lag Year (1 to 6); values labeled in the figure: 4.62, 3.85, 2.11, 1.27, 1.19, 1.34]
43
The Effect of Supreme Court Signals on Amicus
Briefs at Courts of Appeals
44
[Figure: point estimate of the slope with the upper and lower bounds of the 95% confidence interval]
45
[Figure: point estimate of the slope with the upper and lower bounds of the 95% confidence interval]