Title: A Flexible Mathematical Programming Model to Estimate Multiregional Input-Output Accounts
1A Flexible Mathematical Programming Model to
Estimate Multiregional Input-Output Accounts
- Zhi Wang, Bureau of Economic Analysis, US Department of
Commerce
- Patrick Canning, Economic Research Service, US Department of
Agriculture
2Presentation Outline
- Motivations and Objectives
- Literature Review and Basic Ideas
- Model Specifications and Major Properties
- Experiment Design
- Testing Results
- Conclusions and Limitations
- Apply to the U.S. -- the Large Dimension Case
3Motivations
- There are tremendous disparities in economic
development among different regions in large
developing countries. Globalization has
different impacts on the urban and coastal areas
compared to the rural and inland less developed
regions. How to maintain balanced development to
reduce income inequalities between different
parts of the country and achieve a high rate of
economic growth simultaneously is one of the
pressing policy challenges faced by governments
in most large developing countries today.
- A major obstacle in conducting policy analysis
for regional economic development is the lack of
consistent, reliable regional data, especially
data on interregional trade.
4Motivations
- Despite decades of effort, regional data
analogous to national input-output accounts and
international trade accounts remain unavailable,
even for well-defined sub-national regions in
many developed countries. Regional economists
have to develop various non-survey methods to
estimate these data.
- In the past two decades, well-developed
mathematical procedures have emerged to estimate
unknown data based on limited prior information
subject to a set of linear constraints
(constrained matrix balancing), but they have not
been widely used in regional economic analysis.
5Objectives
- This research intends to bridge these gaps. We have
three objectives:
  - To develop and implement a formal model to
  estimate inter-regional, inter-industry
  transaction flows based on incomplete regional data
  - To evaluate the model's performance using
  real-world data
  - To apply the model in a large dimension case
  using publicly available economic and
  transportation statistics describing the US
  economy in 1997
6Literature Review: Constrained matrix-balancing
problem
- It involves the computation of the best estimate
of an unknown matrix from a given matrix, with
some prior information to constrain the solution.
It is a core mathematical structure in diverse
applications:
  - Estimating input-output tables and inter-regional
  trade flows in regional science (today's
  presentation)
  - Balancing of social/national accounts in
  economics (ongoing research)
  - Estimating interregional migration in demography
  - Analysis of voting patterns in political science
  - Treatment of census data and estimation of
  contingency tables in statistics
  - Estimation of transition probabilities in
  stochastic modeling
  - Projection of traffic within telecommunication
  and transportation networks
7Literature Review: Application to interregional
trade flows
- Batten, David F. "The Interregional Linkages
Between National and Regional Input-Output
Models." International Regional Science Review,
7(1), 53-67, 1982.
- Batten, David F., and D. Martellato. "Classical
versus Modern Approaches to Interregional
Input-Output Analysis." Annals of Regional
Science, 19, 1-15, 1985.
- Golan, Amos, George Judge, and Sherman Robinson.
"Recovering Information from Incomplete or
Partial Multi-sectoral Economic Data." The Review
of Economics and Statistics, 76(3), 541-549, 1994.
- Robinson, Sherman, Andrea Cattaneo, and Moataz
El-Said. "Updating and Estimating a Social
Accounting Matrix Using Cross Entropy Methods."
Economic Systems Research, 13(1), 47-64, 2001.
- Canning, Patrick, and Zhi Wang. "A Flexible
Mathematical Programming Model to Estimate
Interregional Input-Output Accounts." Journal of
Regional Science, August 2005.
8Basic Ideas of the Framework
- To apply this methodology as a data
reconciliation tool, three pieces of information
are needed:
  - Initial estimates of the same economic variables
  from different sources (in national economic
  accounts, estimates of the same variables can
  often be obtained from income, expenditure or
  production data).
  - An accounting framework and other constraints
  (demands have to equal supplies, components
  have to sum to totals, ...).
  - Reliability information on the initial estimates
  (standard error, ranking index, ...).
9Problems of Proportional Adjustment
10Problems of Proportional Adjustment(Cont.)
11Notation Conventions
- Domestic deliveries
- Intermediate demand
- Gross output
- Value added
- Final demand
- Exports
- Imports
- Counterparts of national totals are written without
superscripts
- Variables with a bar denote initial estimates for
that variable
- An additional w before the variable indicates
the reliability measure for that variable
- Superscripts denote regions, subscripts denote
products
12The Estimation Problems
- Given an $n \times m^2$ non-negative array $\bar{D}$ and an
$n^2 \times m$ non-negative array $\bar{Z}$, determine a
non-negative array $D$ and a non-negative
array $Z$ that are close to $\bar{D}$ and $\bar{Z}$ such that
the accounting identities hold.
- In other words, modify a given set of prior
inter-regional and inter-industry transaction
estimates, by minimizing an objective
function, so that they satisfy known accounting constraints.
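The objective function is not reproduced in this text. As a hedged sketch only, a weighted quadratic form consistent with the notation slide (bars for initial estimates, a leading w for reliability measures) could be written as below. The element symbols $d_i^{sr}$ and $z_{ij}^{r}$ for the arrays $D$ and $Z$, and the reading of $wd$ and $wz$ as variances of the priors, are assumptions and are not taken from the slides.

```latex
\min_{D,\,Z \,\ge\, 0}\; S
  = \frac{1}{2}\sum_{i}\sum_{s}\sum_{r}
      \frac{\left(d_i^{sr}-\bar d_i^{sr}\right)^2}{wd_i^{sr}}
  + \frac{1}{2}\sum_{i}\sum_{j}\sum_{r}
      \frac{\left(z_{ij}^{r}-\bar z_{ij}^{r}\right)^2}{wz_{ij}^{r}}
\quad \text{subject to the accounting identities.}
```

Under this reading, a prior with a large $w$ (low reliability) is cheap to move, while a prior with a small $w$ is held close to its initial value.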
13Basic Accounting Identities in a National System
of Economic Regions
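The identities themselves are not reproduced in this text. A plausible multiregional form, offered only as an illustration, is sketched below. All symbols are assumptions chosen to match the notation list: $x$ gross output, $d$ domestic deliveries, $z$ intermediate demand, $v$ value added, $y$ final demand, $e$ exports, $m$ imports, with superscripts for regions and subscripts for products.

```latex
\begin{aligned}
x_i^{s} &= \sum_{r} d_i^{sr} + e_i^{s}
  && \text{(output of product $i$ in region $s$ is delivered domestically or exported)}\\
\sum_{s} d_i^{sr} + m_i^{r} &= \sum_{j} z_{ij}^{r} + y_i^{r}
  && \text{(supply of product $i$ in region $r$ equals its intermediate and final use)}\\
x_j^{r} &= \sum_{i} z_{ij}^{r} + v_j^{r}
  && \text{(output of sector $j$ equals its inputs plus value added)}\\
\sum_{r} x_i^{r} &= x_i
  && \text{(regional values sum to the national total; likewise for the other variables)}
\end{aligned}
```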
14The Batten Interregional IO Model (1982)
15Theoretical Properties
- Statistical interpretations underlying the model
differ when different reliability weights are
used
- The quadratic and entropy objective functions are
equivalent in the neighborhood of initial
estimates
- In all but the trivial case, posterior estimates
derived from the entropy or quadratic loss minimand
will always be closer to the unknown, true values
than the associated initial estimates.
- The choice of weights in the objective function
has very important impacts on the estimation
results
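The local equivalence of the two objective functions can be checked with a second-order Taylor expansion. The sketch below uses a generic entry $x_k$ with prior $\bar{x}_k > 0$; the specific cross-entropy form shown is an assumption, since the slides do not reproduce the objective functions.

```latex
x\ln\frac{x}{\bar x}
  = (x-\bar x) + \frac{(x-\bar x)^2}{2\,\bar x}
    + O\!\left(\frac{(x-\bar x)^3}{\bar x^{2}}\right)
\quad\Longrightarrow\quad
\sum_k \Bigl[x_k\ln\frac{x_k}{\bar x_k} - (x_k-\bar x_k)\Bigr]
  \approx \frac{1}{2}\sum_k \frac{(x_k-\bar x_k)^2}{\bar x_k}
```

Near the priors, the entropy measure therefore behaves like a quadratic loss whose weights are proportional to the priors themselves. The linear term is a constant whenever the constraints fix the total of the entries, so it does not affect the minimizer in that case.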
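16Why Are Balanced Estimates Better?
- $\bar{D}$: initial estimates
- $W$: variance matrix of the initial estimates $\bar{D}$
- $A$: coefficient matrix of all linear constraints, $AD = 0$
- $D$: the BLUE (best linear unbiased estimator) subject to
the constraints
- $D$ will never be worse than $\bar{D}$: its variance is
equal or smaller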
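A small numerical sketch can make this argument concrete. It assumes the standard constrained generalized-least-squares result, D = D̄ − W Aᵀ (A W Aᵀ)⁻¹ A D̄ with posterior variance W − W Aᵀ (A W Aᵀ)⁻¹ A W, which is a textbook formula and not copied from the slide. It further assumes a diagonal W. The function names and all numbers are hypothetical.

```python
def solve(M, rhs):
    """Solve M y = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][c] * y[c] for c in range(i + 1, n))
        y[i] = (aug[i][n] - s) / aug[i][i]
    return y


def balance(prior, var, A):
    """Adjust `prior` so that A x = 0, weighting by the variances `var`.

    Assumes W = diag(var). Returns (posterior, posterior_variances).
    """
    k, n = len(A), len(prior)
    # A W A^T
    AWAt = [[sum(A[p][j] * var[j] * A[q][j] for j in range(n))
             for q in range(k)] for p in range(k)]
    # Constraint violations of the prior: A * prior
    resid = [sum(A[p][j] * prior[j] for j in range(n)) for p in range(k)]
    lam = solve(AWAt, resid)
    post = [prior[j] - var[j] * sum(A[p][j] * lam[p] for p in range(k))
            for j in range(n)]
    post_var = []
    for j in range(n):
        col = [A[p][j] * var[j] for p in range(k)]  # j-th column of A W
        z = solve(AWAt, col)
        post_var.append(var[j] - sum(col[p] * z[p] for p in range(k)))
    return post, post_var


# Hypothetical data: two components, and two estimates of their total
# taken from different sources. Constraints: c1 + c2 = t_a and t_a = t_b.
prior = [60.0, 45.0, 100.0, 108.0]
var = [4.0, 1.0, 25.0, 9.0]
A = [[1.0, 1.0, -1.0, 0.0],
     [0.0, 0.0, 1.0, -1.0]]

post, post_var = balance(prior, var, A)
print(post)
print(post_var)
```

In this sketch every posterior variance is no larger than the corresponding prior variance, and the entry with the smallest prior variance moves the least.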
17Empirical Advantages
- Flexibility
  - This model permits a wider variety and volume of
  information to be brought to bear on the
  estimation process than is possible with scaling
  methods such as RAS
- Incorporation of data reliabilities in a systematic
way
  - The weights in the objective function reflect the
  relative reliability of a given set of priors.
  Entries with higher reliability should undergo
  less adjustment than entries with lower
  reliability.
- Smaller model dimension than the Batten model
  - Has NG^2(N-1) fewer variables and the same
  constraints. This increases estimation efficiency
  and facilitates the computation process.
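For comparison, the RAS scaling method mentioned above can be sketched in a few lines. This is a generic textbook version of biproportional scaling, not code from the presentation; the function name and the numbers are hypothetical.

```python
def ras(prior, row_totals, col_totals, tol=1e-10, max_iter=1000):
    """Biproportional (RAS) scaling of a non-negative matrix to given margins."""
    m = [list(row) for row in prior]
    for _ in range(max_iter):
        # R step: scale each row to its target total.
        for i, target in enumerate(row_totals):
            s = sum(m[i])
            if s > 0:
                m[i] = [v * target / s for v in m[i]]
        # S step: scale each column to its target total.
        for j, target in enumerate(col_totals):
            s = sum(row[j] for row in m)
            if s > 0:
                for row in m:
                    row[j] *= target / s
        # Columns now match their targets; stop once rows match as well.
        if max(abs(sum(row) - t) for row, t in zip(m, row_totals)) < tol:
            break
    return m


# Hypothetical prior flows with one structural zero.
prior = [[0.0, 20.0], [30.0, 40.0]]
balanced = ras(prior, row_totals=[25.0, 75.0], col_totals=[35.0, 65.0])
print(balanced)
```

RAS uses only the prior matrix and the row and column totals, and it keeps zero cells at zero. It has no place for cell-specific reliabilities or for additional linear constraints, which is the limitation the flexibility point refers to.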
18Testing Data Set
- Version 4 of the Global Trade Analysis Project
(GTAP) database was first aggregated into a
4-region, 10-sector data set.
- Then 3 of the 4 regions (the United States,
European Union and Japan) were further aggregated
into a single open economy which engages in both
inter-regional trade among its 3 internal regions
and international trade with the rest of the world.
- The model was used to replicate the underlying
inter-continental trade flows among Japan, the EU and
the United States as well as each individual
country's input-output account.
19Experiment Design
- Experiment 1
  - The three regions' weighted average I-O flows
  and distorted inter-regional trade data in the
  GTAP were used as initial estimates.
- Experiment 2
  - The region-specific I-O flows are assumed to be
  constant. The inter-regional shipments in the
  first experiment were re-estimated.
- Experiment 3
  - The inter-regional shipments are known with
  certainty. The three regions' weighted average
  I-O flows were used as priors to estimate the
  region-specific I-O flows.
20Experiment Design (Cont.)
- Experiment 4
  - David F. Batten's model was used to estimate the
  inter-regional shipments and individual region
  I-O flows.
- Solutions from both models are compared with the
true inter-regional trade and inter-sector I-O
flow data in the GTAP data set.
21Measures to evaluate test results: Mean absolute
percentage error (MAPE)
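The formula on this slide is not reproduced in the text. As a minimal sketch, the common definition of MAPE (the average of absolute errors relative to the true values, in percent) is shown below; the presentation may have used a weighted or aggregated variant. The function name and the sample numbers are hypothetical.

```python
def mape(estimates, true_values):
    """Mean absolute percentage error, in percent.

    Cells whose true value is zero are skipped, since the percentage
    error is undefined there (an assumption of this sketch).
    """
    pairs = [(e, t) for e, t in zip(estimates, true_values) if t != 0]
    if not pairs:
        raise ValueError("no nonzero true values to compare against")
    return 100.0 * sum(abs(e - t) / abs(t) for e, t in pairs) / len(pairs)


# Hypothetical true and estimated trade flows.
true_flows = [100.0, 200.0, 50.0]
estimated_flows = [110.0, 180.0, 50.0]
error = mape(estimated_flows, true_flows)
print(round(error, 3))
```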
22Estimation Results: Mean absolute percentage error
of inter-regional trade (percent difference from
true trade data)
23Estimation Results (Cont.): Mean absolute percentage
error of inter-regional trade (percent difference
from true trade data)
24Estimation Results (Cont.): Mean absolute percentage
error of inter-sector flows (percent difference
from true IO data)
25Observations
- In all experiments except the Batten model, most
of the mean absolute percentage errors are about
4-7 percent of the true data. In contrast,
recovering the individual regions' input-output
flows from national average values had only
limited success.
- When there is no additional information that can
be incorporated into the estimation framework, a
more detailed model may not perform any better
than a simpler model. However, the accuracy is
improved by a more detailed model when more
detailed data are available.
26Observations(Cont.)
- The marginal accuracy gained from actual
individual regional I-O flows is significant in
estimating inter-regional trade flows using the
IRIO model, but quite small in the MRIO version.
In contrast, the marginal value of accurate
inter-regional shipment data is rather small in
estimating individual regional I-O coefficients
under both versions of the model.
- However, caution is needed before drawing a
firm conclusion because the particular data set
used to test the model in this paper may have
skewed the results. Because the United States, EU
and Japan are all large economies, their demand
for intermediates is largely met by their own
production.
27Conclusions
- This paper developed a mathematical model to
estimate inter-regional trade patterns and I-O
accounts based on an inter-regional accounting
framework and initial estimates of inter-regional
shipments in a national system of economic
regions.
- The model is quite flexible in its data
requirements and has desirable theoretical and
empirical properties.
- The model performed remarkably well in
identifying the true patterns of inter-regional
trade from highly distorted initial estimates of
inter-regional shipments.
28Limitations
- Based on the data set aggregated from the GTAP
data, tests show that the current model is
limited in its ability to improve the IO
transaction estimates of individual regions from
national averages.
- Continuing research on the true underlying causes
is needed to further enhance the model's capacity
as an estimating and reconciliation tool in
building inter-regional production and trade
accounts.
29Estimating a U.S. Multiregional Input-Output
Account
- Apply the Mathematical Programming Model to a
Large Dimension Case
- Data preparation for the United States
30[Flowchart: data preparation, national account and source data]
- 1997 Detailed Benchmark Input-Output Account (483 x 494)
- Convert to C by C: 1997 Detailed Benchmark Input-Output
Account (483 x 483)
- Expand farm sectors, using USDA production, cost, and
utilization data, plus other ERS value added statistics:
ERS expanded Detailed Input-Output Account (494 x 494)
- Source data, each followed by a "concordance optimization" step:
  - Economic, Ag, and Govt. Census, NASS, ARMS, APHIS,
  ERS value added data
  - U.S. customs data (SITC) by ports and detailed
  data by customs districts (HS)
  - BLS-CES BEA St. Inc. tangible wealth
  Census 2000 Gov GSA procurement ancillary
31[Flowchart: data preparation, state-level estimates]
- State estimates of gross output, value added, and
wage bill
- State estimates of imports and exports
- State estimates of household and
govt. consumption and investment
- Commodity flow survey
- USDA transportation stats
- Monopolistic comp. model
- Processing steps shown: concordance, aggregation,
optimization; concordance, product mix
- State-to-state flow estimates for goods and
services (51 regions, 94 sectors)
- State estimates of inter-sectoral transactions
(51 regions, 94 sectors)
- Unbalanced 51-region, 94-sector multi-regional
input-output account: all initial data for the
final mathematical programming model
32Specifications for Model to Reconcile Consumption
and Saving Statistics
- Data available from three different sources
(observed statistics)
  - Bureau of Labor Statistics (BLS): Consumer
  Expenditure Survey
    - rgexp0_ik: household (HH) expenditure by commodity and US
    region
    - szexp0_is: HH expenditure by commodity and
    family size
    - incexp0_in: HH expenditure by commodity and
    family income group
    - rginc0_k: HH disposable income by US region
    - szinc0_s: HH disposable income by family size
    - incinc0_n: HH disposable income by family income
    group
  - Census Bureau: Population Census
    - Number of HHs by size, income group and
    state
  - Bureau of Economic Analysis (BEA): Disposable
  income data
    - sinc0_r: Disposable income by state
33Specifications of Model to Reconcile Consumption
and Saving Statistics (Cont.)
- Dimension of data
  - i: Commodity categories, i = 1, 2, ..., 75
  - s: Households by family size, s = 1, 2, ..., 5
  - n: Households by income, n = 1, 2, ..., 7
  - r: State, r = 1, 2, ..., 51
  - k: Region, k = Northeast, Mid-West, South,
  West
- Unobservable statistics estimated from the model
  - HH expenditure by commodity and state
- A Two Stage Quadratic Programming Model to
Reconcile the data
34Estimates of Consumer Savings and Expenditures
by Commodities
- Disposable income by Region and Income Group from
BLS survey is held constant in the adjustment
- Percentage adjustments are small for disposable
income by state from BEA and for expenditure by
family size from BLS survey.
- Percent adjustment from BEA state disposable
income
  AL 1.621, AK -1.337, AZ 0.280, AR 2.137, CA -0.894, CO -0.652, CT -1.093, DE -0.613,
  DC 0.733, FL 0.633, GA -0.066, HI -1.211, ID 0.894, IL -0.585, IN 0.174, IA 0.685,
  KS 0.393, KY 1.711, LA 1.734, ME 1.135, MD -1.198, MA -0.725, MI -0.293, MN -0.605,
  MS 2.199, MO 0.916, MT 2.562, NE 0.663, NV -0.397, NH -0.791, NJ -1.329, NM 1.914,
  NY -0.095, NC 0.636, ND 1.938, OH 0.348, OK 1.755, OR 0.369, PA 0.500, RI 0.238,
  SC 0.980, SD 1.694, TN 1.167, TX 0.195, UT -1.092, VT 0.444, VA -0.498, WA -0.431,
  WV 2.928, WI -0.136, WY 1.067
- Percent adjustment from expenditure by family
size from BLS survey
          size1    size2    size3    size4    size5
  SAVE   -6.185   -4.481   -4.085   -4.541   -7.304
35Specifications for Model to Fill the Missing
Value in Commodity Flow Survey Data
- Data available in CFS
  - x0_isr: State-to-state shipment by commodity at the
  2-digit SCTG level, with missing values
  - sx0_ir: Shipment of commodity i by state of origin
  at 3-digit SCTG (table 5)
  - st0_sr: Total outbound shipment from state s to
  state r at 3-digit SCTG (table 7)
  - dt0_sr: Total inbound shipment from state s to
  state r at 3-digit SCTG (table 8)
  - us0_i: Total shipment of commodity i in the
  United States at 2 and 3 digit (US report table
  5)
  - wsx_ir: Variance of sx0_ir (Appendix)
  - wst_sr: Variance of st0_sr (Appendix)
  - wdt_sr: Variance of dt0_sr (Appendix)
- dx0_ir (shipment of commodity i by state of
destination) and x0_isr at 3-digit SCTG are
completely missing.
- First fill the missing x0_isr at the 2-digit SCTG
level, then fill the missing values of dx0_ir at the
3-digit SCTG level, and finally fill the missing
x0_isr at the 3-digit SCTG level.
36Service Sector Trade Flows
- Treyz, F., and J. Bumgardner. "Monopolistic
Competition Estimates of Interregional Trade
Flows in Services." In H. Kohno, P. Nijkamp, and
J. Poot, eds. Regional Cohesion and Competition
in the Age of Globalization. Edward Elgar, 2000.
- Monopolistically competitive service sector with
economies-of-scale technologies
- Consumer and producer demands for services are
characterized by preference for variety, à la
Dixit-Stiglitz (1977)
- Free entry and exit drive profits to zero and
yield uniform firm sizes
- Locations of production and demand markets are
pre-determined
- cif prices reflect market and non-market costs of
overcoming distance
- Given location of production and demand, demand
elasticities, and distance costs, solve for the
set of fob prices that minimize the cost of
serving each market; this produces a unique
spatial equilibrium in service trade flows
37Specifications for Model to Fill the Missing
Value in Commodity Flow Survey Data