Title: On World Poverty: Causal Graphs from the 1990s
1 On World Poverty: Causal Graphs from the 1990s
- David A. Bessler
- Texas A&M University
- January 2003
2 Outline
II. Scatter Plots on Measures of
Poverty and Related Variables
III. Causal Modeling
IV. Directed Graphs
V. Regressions and Front Door and Back Door Paths
VI. Summary and Discussion
3 Measures of Poverty
- Alternatives are discussed in Sen, Poverty and Famines, Oxford Press, 1981.
- Economic measures, e.g., % of population living on one or two dollars per day or less.
- Biological measures, e.g., deficits in calorie intake.
4 A Short List of Literature on Causes and Effects of Poverty
- Agricultural Income (Mellor, 2000)
- Freedom (Sachs and Warner, 1997)
- Income (Sen, 1981)
- Income Inequality (Sen, 1981; Miller and Ruby, 1971)
- Child Mortality (Belete, et al., 1977)
5 Literature, Continued
- Birth Rate (Sen, 1981)
- Rural Population (Rivers, et al., 1976)
- Foreign Aid (World Bank, 2000)
- Life Expectancy (Rowntree, 1901)
- Illiteracy (Huffman, 1989)
- International Trade (Bhagwati, 1996)
6 Data Sources
- World Bank Development Indicators: % of population living on one and two dollars per day or less, 80 countries.
- Heritage Foundation: Index of Economic and Political Freedom on 80 countries.
- FAO: % of population that is under-nourished.
7 Table 1. Countries Studied
21 Figure 12. Scatter Plot of % Living on $2/Day or Less and Relative Importance of International Trade, Eighty Low Income Countries, mid-1990s Data. (Vertical axis: < $2/day, 0 to 100.)
22 Directed Acyclic Graphs
- Recently Papineau (1985) has uncovered an asymmetry in causal relations which may prove to be every bit as helpful as Granger's (Suppes') time sequence in causal systems.
23 Motivation
- Oftentimes we are uncertain about which variables are causal in a modeling effort.
- Theory may tell us what our fundamental causal variables are in a controlled system; however, it is common that our data are not collected in a controlled environment.
- In fact, we are rarely involved with the collection of our data.
24 Use of Theory
- Theory is a good potential source of information about the direction of causal flow. However, theory usually invokes the ceteris paribus condition to achieve results.
- Data are usually observational (non-experimental), and thus the ceteris paribus condition may not hold. We may never know whether it holds, because of unknown variables operating on our system (see Malinvaud's econometrics text).
25 Observational Data
- When no experimental control is present in the generation of our data, the data are said to be observational (non-experimental). Such data are usually secondary: collected not explicitly for our purpose but for some other primary purpose.
26 Experimental Methods
- If we do not know the "true" system, but have an approximate idea that one or more variables operate on that system, then experimental methods can yield appropriate results.
- Experimental methods work because they use randomization (random assignment of subjects to alternative treatments) to account for any additional variation associated with the unknown variables on the system.
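The role of randomization can be illustrated with a small simulation. This is only a sketch under assumed numbers: the unknown variable `u`, the treatment effect of 2.0, and the self-selection rule are all hypothetical and do not come from the study.

```python
import random
from statistics import mean

def difference_in_means(randomized, n=20000, true_effect=2.0, seed=0):
    """Treated-minus-control mean outcome under one assignment scheme."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                  # unknown variable acting on the system
        if randomized:
            t = rng.random() < 0.5           # coin flip, independent of u
        else:
            t = u + rng.gauss(0, 1) > 0      # subjects self-select on u
        y = true_effect * t + 3.0 * u + rng.gauss(0, 1)
        (treated if t else control).append(y)
    return mean(treated) - mean(control)

observational = difference_in_means(randomized=False)
experimental = difference_in_means(randomized=True)
print(f"true effect 2.0, observational {observational:.2f}, randomized {experimental:.2f}")
```

With random assignment the unknown variable is balanced across the two groups, so the simple difference in means should land near the assumed effect; with self-selection it absorbs part of the influence of `u`.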
27 Directed Graphs Can Be Used To Represent Causation with Observational Data
- Directed graphs help us assign causal flows to a set of observational data.
- The problem under study and theory suggest that certain variables ought to be related, even if we do not know exactly how.
- With observational data we don't know the "true" system that generated our data.
28 Causal Models Are Well Represented By Directed Graphs
- One reason for studying causal models, represented here as X → Y, is to predict the consequences of changing the effect variable (Y) by changing the cause variable (X). The possibility of manipulating Y by way of manipulating X is at the heart of causation.
- Hausman (1998, page 7) writes: "Causation seems connected to intervention and manipulation: One can use causes to wiggle their effects."
29 We Need More Than Algebra To Represent Cause
- Linear algebra is symmetric with respect to the equal sign. We can re-write y = a + bx as x = -a/b + (1/b)y. Either form is legitimate for representing the information conveyed by the equation.
- A preferred representation of causation would be the sentence x → y, or the words "if you change x by one unit you will change y by b units, ceteris paribus." The algebraic statement suggests a symmetry that does not hold for causal statements.
30 Arrows Move Information
- An arrow placed with its base at X and head at Y indicates X causes Y: X → Y.
- By the words "X causes Y" we mean that one can change the values of Y by changing the values of X.
- Arrows indicate a productive or genetic relationship between X and Y.
- Causal statements are asymmetric: X → Y is not consistent with Y → X.
31 A Causal Fork
- For three variables X, Y, and Z, we illustrate "X causes Y and Z" as Y ← X → Z.
- Here the unconditional association between Y and Z is non-zero, but the conditional association between Y and Z, given knowledge of the common cause X, is zero: common causes screen off associations between their joint effects.
32 An Example of a Causal Fork
- X is the event, the patient smokes.
- Y is the event, the patient (a light-skinned person) has yellow fingers.
- Z is the event, the patient has lung cancer.
- P(Z | Y) > P(Z): here yellow fingers are helpful in forecasting whether a patient has lung cancer.
- P(Z | Y, X) = P(Z | X): here, if we add the information on whether he/she smokes, the influence of yellow fingers disappears.
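The screening-off property of the fork can be checked by exact enumeration. The sketch below assumes hypothetical probabilities for smoking, yellow fingers, and lung cancer (none of them come from the slides) and computes the conditional probabilities named above.

```python
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
P_X = 0.30                                  # patient smokes
P_Y_GIVEN_X = {True: 0.60, False: 0.05}     # yellow fingers, given smoking status
P_Z_GIVEN_X = {True: 0.15, False: 0.01}     # lung cancer, given smoking status

def joint(x, y, z):
    """P(X=x, Y=y, Z=z) for the fork Y <- X -> Z."""
    px = P_X if x else 1 - P_X
    py = P_Y_GIVEN_X[x] if y else 1 - P_Y_GIVEN_X[x]
    pz = P_Z_GIVEN_X[x] if z else 1 - P_Z_GIVEN_X[x]
    return px * py * pz

def prob(event, given=lambda x, y, z: True):
    """P(event | given), summing the joint over all eight outcomes."""
    outcomes = list(product([True, False], repeat=3))
    num = sum(joint(*o) for o in outcomes if event(*o) and given(*o))
    den = sum(joint(*o) for o in outcomes if given(*o))
    return num / den

p_z = prob(lambda x, y, z: z)
p_z_given_y = prob(lambda x, y, z: z, lambda x, y, z: y)
p_z_given_x = prob(lambda x, y, z: z, lambda x, y, z: x)
p_z_given_xy = prob(lambda x, y, z: z, lambda x, y, z: x and y)

print(f"P(Z) = {p_z:.3f}, P(Z|Y) = {p_z_given_y:.3f}")
print(f"P(Z|X) = {p_z_given_x:.3f}, P(Z|Y,X) = {p_z_given_xy:.3f}")
```

Because Y and Z depend only on X in this construction, any choice of probabilities with a genuine effect of X on both gives P(Z | Y) different from P(Z) while P(Z | Y, X) equals P(Z | X).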
33 An Inverted Fork
- Illustrate "X and Z cause Y" as X → Y ← Z.
- Here the unconditional association between X and Z is zero, but the conditional association between X and Z, given the common effect Y, is non-zero.
- Common effects do not screen off the association between their joint causes.
34 The Causal Inverted Fork: An Example
- Let Y be the event that my car won't start.
- Let Z be the event that my gas tank is empty.
- Let X be the event that my battery is dead.
- My battery being dead and my gas tank being empty are independent: P(X | Z) = P(X).
- Knowing that my car is out of gas and that it won't start gives me some information about my battery: P(X | Y, Z) < P(X | Y).
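The car example can also be worked through numerically. This sketch assumes hypothetical fault probabilities and a deliberately simple rule (the car fails to start exactly when the battery is dead or the tank is empty); neither assumption comes from the slides.

```python
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
P_X = 0.05   # battery is dead
P_Z = 0.10   # gas tank is empty

def joint(x, y, z):
    """P(X=x, Y=y, Z=z) for the inverted fork X -> Y <- Z."""
    px = P_X if x else 1 - P_X
    pz = P_Z if z else 1 - P_Z           # X and Z are independent causes
    wont_start = x or z                  # assumption: either fault alone stops the car
    py = 1.0 if y == wont_start else 0.0
    return px * pz * py

def prob(event, given=lambda x, y, z: True):
    """P(event | given), summing the joint over all eight outcomes."""
    outcomes = list(product([True, False], repeat=3))
    num = sum(joint(*o) for o in outcomes if event(*o) and given(*o))
    den = sum(joint(*o) for o in outcomes if given(*o))
    return num / den

p_x_given_z = prob(lambda x, y, z: x, lambda x, y, z: z)
p_x_given_y = prob(lambda x, y, z: x, lambda x, y, z: y)
p_x_given_yz = prob(lambda x, y, z: x, lambda x, y, z: y and z)

print(f"P(X) = {P_X:.3f}, P(X|Z) = {p_x_given_z:.3f}")
print(f"P(X|Y) = {p_x_given_y:.3f}, P(X|Y,Z) = {p_x_given_yz:.3f}")
```

A car that won't start raises the probability of a dead battery; learning that the tank is empty "explains away" the failure and pulls that probability back down.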
35 The Literature on Such Causal Structures Has Been Advanced in the Last Decade Under the Label of Artificial Intelligence
- Pearl, Causality, Cambridge Press, 2000
- Spirtes, Glymour and Scheines, Causation, Prediction and Search, MIT Press, 2000
- Glymour and Cooper, editors, Computation, Causation and Discovery, MIT Press, 1999
36 Causal Inference Engine
- PC Algorithm
- 1. Form a complete undirected graph connecting every variable with all other variables.
- 2. Remove edges through tests of zero correlation and partial correlation.
- 3. Direct edges which remain after all possible tests of conditional correlation. Use screening-off characteristics to accomplish edge direction.
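Steps 1 and 2 can be sketched in a few lines. This is a simplified illustration, not the full PC algorithm and not the software used in the study: it conditions on at most one variable, tries every other variable as the conditioning variable (PC proper restricts to adjacent ones), uses Fisher's z test as an assumed test of zero (partial) correlation, and leaves out the edge-direction step. The simulated data form a hypothetical fork Y ← X → Z.

```python
import math
import random
from itertools import combinations

def corr(a, b):
    """Sample correlation coefficient."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def partial_corr(data, i, j, k=None):
    """Correlation of i and j, optionally conditional on a single variable k."""
    r_ij = corr(data[i], data[j])
    if k is None:
        return r_ij
    r_ik, r_jk = corr(data[i], data[k]), corr(data[j], data[k])
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))

def p_value(r, n, n_cond):
    """Two-sided p-value of Fisher's z test for a zero (partial) correlation."""
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - n_cond - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def skeleton(data, alpha=0.05):
    names = sorted(data)
    n = len(data[names[0]])
    edges = set(combinations(names, 2))        # step 1: complete undirected graph
    for i, j in sorted(edges):                 # step 2a: zero-order correlations
        if p_value(partial_corr(data, i, j), n, 0) > alpha:
            edges.discard((i, j))
    for i, j in sorted(edges):                 # step 2b: first-order partial correlations
        for k in names:
            if k in (i, j):
                continue
            if p_value(partial_corr(data, i, j, k), n, 1) > alpha:
                edges.discard((i, j))
                break
    return edges

rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(2000)]
data = {
    "X": x,
    "Y": [0.8 * xi + rng.gauss(0, 1) for xi in x],   # X -> Y
    "Z": [0.8 * xi + rng.gauss(0, 1) for xi in x],   # X -> Z
}
edges = skeleton(data)
print(sorted(edges))
```

In this setup the Y, Z edge survives the zero-order test (the two are correlated through X) and should usually be removed once X is conditioned on; at the 5% level the test will wrongly keep it in roughly one sample in twenty.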
37 Assumptions (for PC algorithm to give the same causal model as a random assignment experiment)
- 1. Causal Sufficiency
- 2. Causal Markov Condition
- 3. Faithfulness
- 4. Normality
38 Causal Sufficiency
- No two included variables (X and Y in diagram) are caused by a common omitted variable (Z).
39 Causal Markov Condition
- The data on our variables are generated by a Markov property, which says we need only condition on parents:
P(W, X, Y, Z) = P(W) P(X | W) P(Y) P(Z | X, Y)
40 Faithfulness
- There are no cancellations of parameters, e.g.
A = b1 B + b3 C
C = b2 B
It is not the case that -b2 b3 = b1.
So deep parameters b1, b2 and b3 do not form combinations that cancel each other (economists know this as a version of the Lucas Critique).
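A short simulation shows why such a cancellation would mislead correlation-based edge removal. The parameter values and the added noise terms are assumptions made for illustration; the slide's equations are written without error terms.

```python
import math
import random

def corr(a, b):
    """Sample correlation coefficient."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def simulate_corr_ab(b1, b2, b3, n=20000, seed=0):
    """Correlation of A and B when A = b1 B + b3 C + noise and C = b2 B + noise."""
    rng = random.Random(seed)
    a_vals, b_vals = [], []
    for _ in range(n):
        b = rng.gauss(0, 1)
        c = b2 * b + rng.gauss(0, 1)
        a = b1 * b + b3 * c + rng.gauss(0, 1)
        a_vals.append(a)
        b_vals.append(b)
    return corr(a_vals, b_vals)

faithful = simulate_corr_ab(b1=1.0, b2=0.5, b3=2.0)
unfaithful = simulate_corr_ab(b1=-1.0, b2=0.5, b3=2.0)   # b1 = -b2 * b3: exact cancellation

print(f"corr(A, B) without cancellation: {faithful:.3f}")
print(f"corr(A, B) with b1 = -b2 b3:     {unfaithful:.3f}")
```

In the second case B causes A both directly and through C, yet the two routes offset and the sample correlation is close to zero, so a test of zero correlation would drop a real edge.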
42 Table 2. Edges Removed
Column headings: Edge Removed; Partial Correlation; Corr.; Prob.
46 Figure: Directed graph for the < $2/day poverty measure, with edges signed (+) or (-). Nodes: Agricultural Income/Person, Illiteracy, Unfreedom, Gini, GDP/Person, Birthrate, Child Mort, < $2/day, Foreign Aid, Pop Rural, Int. Trade, Malnourished, Life Expectancy.
47 Figure: Directed graph for the < $1/day poverty measure, with edges signed (+) or (-). Nodes: Agricultural Income/Person, Illiteracy, Unfreedom, Gini, GDP/Person, Birthrate, Child Mort, < $1/day, Foreign Aid, Pop Rural, Int. Trade, Under Nourished, Life Expectancy.
48 Rising Tide Lifts All Boats? Regressions Based on $1/day Graph
- $1/Day = 27.45 (2.65) - .004 (.001) GDP/Person, R2 = .60
- (std. errors in parentheses)
- Here merely regressing $1/day on GDP/Person gives us the expected negative and significant estimate!
- Notice from the graph, however, that no line connects GDP and $1/day. We removed the edge by conditioning on Child Mortality.
- $1/Day = 2.75 (2.82) - .0004 (.001) GDP/Person + .237 (.022) Child Mort, R2 = .84
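The change in significance can be read off the reported numbers by forming rough t-ratios (coefficient divided by standard error). The sketch below uses the coefficients and standard errors from this slide; the |t| > 2 cutoff is a common rule of thumb assumed here, the Child Mort coefficient is taken to be positive, and the reported standard errors are rounded, so the ratios are approximate.

```python
def t_ratio(coefficient, std_error):
    """Rough t-ratio: estimated coefficient divided by its standard error."""
    return coefficient / std_error

# Coefficients and standard errors as reported on the slide.
simple = {"GDP/Person": (-0.004, 0.001)}
conditioned = {"GDP/Person": (-0.0004, 0.001), "Child Mort": (0.237, 0.022)}

for label, model in (("GDP only", simple), ("GDP + Child Mort", conditioned)):
    for name, (coef, se) in model.items():
        t = t_ratio(coef, se)
        verdict = "significant" if abs(t) > 2 else "not significant"
        print(f"{label}: {name} t = {t:.1f} ({verdict} by the |t| > 2 rule of thumb)")
```

On these numbers the GDP/Person ratio falls from about 4 in absolute value to well under 1 once Child Mortality enters, which matches the removal of the GDP edge in the $1/day graph.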
49 Rising Tide Lifts All Boats? Regressions Based on $2/day Graph
- $2/Day = 57.96 (3.39) - .007 (.001) GDP/Person, R2 = .81
- Here regressing $2/day on GDP/Person gives us the expected negative and significant estimate!
- Notice from the $2/day graph that we have a connection between GDP and $2/day. So conditioning on Child Mortality does not eliminate GDP as an actor in explaining $2/day.
- $2/Day = 28.42 (4.22) - .0033 (.001) GDP/Person + .287 (.034) Child Mort, R2 = .91
50 Regression Analysis: Backdoor and Front Door Paths
- The previous results on the rising tide argument are generalized as necessary conditions for estimating the magnitude of the effect of a causal variable.
- To estimate the effect of X on Y using regression analysis, one must block any backdoor path from X to Y via the ancestors of X. We block backdoor paths by conditioning on one or more ancestors of X.
- To estimate the effect of X on Y using regression analysis, one must not condition on descendants of X. One must not block the front door path.
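Both rules can be seen in a simulation on a hypothetical graph W → X → M → Y with W → Y, where W is an ancestor of X (the backdoor) and M is a descendant of X on the front door path. The variable names, coefficients, and sample size are assumptions for illustration; the assumed total effect of X on Y is 0.5. Regression coefficients are obtained by residualizing on the controls, which is equivalent to ordinary least squares with a constant.

```python
import random

def center(v):
    m = sum(v) / len(v)
    return [vi - m for vi in v]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def residual(v, controls):
    """Residual of v after least-squares projection on a constant and the controls."""
    basis = []
    for c in controls:
        c = center(c)
        for b in basis:                       # Gram-Schmidt: orthogonalize the controls
            k = dot(c, b) / dot(b, b)
            c = [ci - k * bi for ci, bi in zip(c, b)]
        basis.append(c)
    v = center(v)
    for b in basis:
        k = dot(v, b) / dot(b, b)
        v = [vi - k * bi for vi, bi in zip(v, b)]
    return v

def coef_on_x(y, x, controls=()):
    """OLS coefficient on x in a regression of y on x, the controls, and a constant."""
    rx, ry = residual(x, controls), residual(y, controls)
    return dot(rx, ry) / dot(rx, rx)

rng = random.Random(0)
n = 20000
w = [rng.gauss(0, 1) for _ in range(n)]                 # ancestor of X
x = [wi + rng.gauss(0, 1) for wi in w]                  # W -> X
m = [xi + rng.gauss(0, 1) for xi in x]                  # X -> M
y = [0.5 * mi + 2.0 * wi + rng.gauss(0, 1)              # M -> Y and W -> Y
     for mi, wi in zip(m, w)]

naive = coef_on_x(y, x)                  # backdoor through W left open
adjusted = coef_on_x(y, x, [w])          # backdoor blocked by conditioning on W
overadjusted = coef_on_x(y, x, [w, m])   # front door blocked by conditioning on M

print(f"assumed effect 0.5: naive {naive:.2f}, adjusted {adjusted:.2f}, "
      f"overadjusted {overadjusted:.2f}")
```

Leaving the backdoor open inflates the estimate, conditioning on the ancestor recovers the assumed effect, and adding the descendant drives the coefficient toward zero.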
51 Front Door Path: Consider the Effect of Agricultural Income on < $2/day
- From above we have the following causal chain:
- Ag Income/Person → GDP/Person → $2/Day
- Since GDP/Person is caused by Ag Income/Person, we cannot have GDP/Person in the regression equation to measure the effect of Agricultural Income/Person on $2/Day: do not block the front door!
- Biased Regression: $2/Day = 57.99 (3.60) - .0007 (.0014) Ag Inc. - .0068 (.0018) GDP, R2 = .37
- Unbiased Regression: $2/Day = -51.73 (4.34) - .0038 (.0018) Ag Inc., R2 = .23
52 Backdoor Paths: Consider the Effect of GDP/Person on < $2/Day
- We have the following sub-graph linking GDP/Person, Un-Freedom, $2/Day, Birth Rate and Gini.
- The front door path would suggest that we regress $2/Day on GDP/Person. But there exists a backdoor path, through Un-Freedom to Gini and Birth Rate. We must block the backdoor path by conditioning on either Un-Freedom, Gini or Birth Rate.
53 Comparison of $2/Day on GDP Regressions
- Biased Regression (fails to block the backdoor)
- $2/Day = 57.98 (3.62) - .0077 (.001) GDP/Per, R2 = .37
- Unbiased Regression (blocks the backdoor)
- $2/Day = 4.97 (3.62) - .0031 (.001) GDP/Per + 1.635 (.148) Birth Rt, R2 = .71
54 Conclusions
- Illiteracy, Freedom, Income Inequality, and Agricultural Income are exogenous movers of Poverty.
- We are not able to direct causal flow among our four exogenous variables.
- Foreign Aid appears not to be a mover of Poverty.
55 Caution
- Our methods assume
- Causal Sufficiency
- Markov Property
- Faithfulness
- Normality
- Failure of any of these may change results.
- A dynamic representation of poverty should be pursued. This will require a richer data set.
56 Acknowledgements
- Motivation for the study: Aysen Tanyeri-Abur, FAO
- Motivation on our study of Directed Graphs: Clark Glymour, CMU; Judea Pearl, UCLA
- PowerPoint Presentation: Todd D. Bessler, COB, TAMU