Title: Scanner Data in CPI Research and Compilation
- Guia Alcausin
- Michael Anderson
- Jonathan Khoo
- Ken Tallis
2. Opportunities Available through Scanner Data Sets
- Improving current CPI methods and practices: continuing to collect price data directly, but using scanner-based research to tune the design of the collection, index construction methods and data treatments (such as editing and imputation)
- Data substitution: ceasing direct price collection for some segments of the CPI, and using scanner data instead
- Data augmentation: using both directly collected and scanner data to compile segments of the CPI
3. It Is Not Cheap! Costs Include
- Acquiring the data: either drawing data directly from individual stores or chains, or purchasing data from a commercial clearing house
- Redesigning compilation practices
- Retraining price statisticians in the new compilation practices
- Reworking the mathematics of CPI construction, such as the microindex formulae, the aggregation tree and the aggregation formulae
- Redeveloping computer systems
- Understanding the effects of all these changes on the published CPI and explaining them to users
4. Nine Research Themes in ABS Work
- Guiding current CPI design and practice
  - Theme 1: Aggregation tree and index formulae
  - Theme 2: Sample allocation
  - Theme 3: Treatment of discontinuities
- Data substitution and data augmentation
  - Theme 4: Frequency and treatment of discontinuities and quality changes
  - Theme 5: Volatility
  - Theme 6: An altered economics of CPI compilation?
  - Theme 7: Subsampling the scanner data?
  - Theme 8: What is the true cardinality of a scanner dataset?
  - Theme 9: A different theoretical foundation needed?
5. Some Features of the Australian CPI
- The Australian CPI is built up from:
  - around 1,000 elementary aggregates for each of the eight capital cities, which are combined to form
  - around 90 expenditure classes, which are combined to form
  - around a dozen expenditure groups, which are combined to form
  - the all-groups CPI
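The layered structure above can be pictured as a weighted tree. What follows is a minimal sketch, not the ABS production method: the node names, weights and index values are invented for illustration, and each parent is assumed to be the weight-share arithmetic mean of its children.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a hypothetical CPI aggregation tree."""
    name: str
    weight: float                    # illustrative base-period expenditure weight
    index: Optional[float] = None    # set only on leaves (elementary aggregates)
    children: List["Node"] = field(default_factory=list)


def roll_up(node: Node) -> float:
    """Return the node's index: its own value for a leaf, otherwise the
    weighted arithmetic mean of its children's indexes."""
    if not node.children:
        if node.index is None:
            raise ValueError(f"leaf {node.name!r} has no index value")
        return node.index
    total_weight = sum(child.weight for child in node.children)
    return sum(child.weight * roll_up(child) for child in node.children) / total_weight


# Invented example. "dairy" is collapsed to a single leaf for brevity.
bread = Node("white bread", weight=3.0, index=104.0)
cakes = Node("cakes", weight=1.0, index=100.0)
bakery = Node("bread and cereal products", weight=4.0, children=[bread, cakes])
dairy = Node("dairy", weight=6.0, index=102.0)
food = Node("food", weight=10.0, children=[bakery, dairy])
all_groups = Node("all groups", weight=10.0, children=[food])

print(roll_up(bakery))       # (3*104 + 1*100) / 4
print(roll_up(all_groups))   # (4*103 + 6*102) / 10
```

Keeping the tree explicit makes it easy to experiment with where commodity boundaries are drawn, since moving a leaf to another parent only changes the `children` lists.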
6. Some Features of the Australian CPI
- Data to construct weights collected at five-yearly intervals
- Quarterly series
- Price data collected quarterly or more often
  - using handheld devices or directly from outlets
  - purposive not probability sample
  - 10,000 price observations
- Geometric means used at lowest level of aggregation
- Laspeyres index at higher levels
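For reference, the two formulae named above can be written in standard index-number notation. The symbols are assumptions, not taken from the slides: $p_i^{t}$ is the price of item $i$ in period $t$, period $0$ is the base period, $n$ is the number of sampled items in an elementary aggregate, and $q_k^{0}$ is a base-period quantity. The first expression is the geometric-mean (Jevons) microindex; the second is the Laspeyres index, shown also in its equivalent weighted-relative form.

```latex
P_J^{0,t} = \prod_{i=1}^{n} \left( \frac{p_i^{t}}{p_i^{0}} \right)^{1/n}
\qquad
P_L^{0,t} = \frac{\sum_{k} p_k^{t} q_k^{0}}{\sum_{k} p_k^{0} q_k^{0}}
          = \sum_{k} w_k^{0} \, \frac{p_k^{t}}{p_k^{0}},
\qquad
w_k^{0} = \frac{p_k^{0} q_k^{0}}{\sum_{j} p_j^{0} q_j^{0}}
```

In the weighted-relative form, the price relatives at higher levels would typically be replaced by the indexes of the level below, with $w_k^{0}$ taken from the periodically collected expenditure weights.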
7. Some Key Properties of Scanner Data
- Provide a (quasi) census of purchase transactions
- Include data on both prices and quantities
- Show observations in almost continuous time
- Provide weighting data at the same frequency as used for the price data
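Because each scanner record carries both a price and a quantity, expenditure weights can be derived for every period rather than only at infrequent weight updates. A minimal sketch, assuming an invented record layout of (week, item, unit price, quantity sold):

```python
from collections import defaultdict

# Hypothetical scanner records: (week, item, unit price, quantity sold).
records = [
    (1, "brand A 500g", 2.00, 30),
    (1, "brand B 500g", 3.00, 20),
    (2, "brand A 500g", 1.50, 60),   # brand A on special in week 2
    (2, "brand B 500g", 3.00, 10),
]


def expenditure_shares(rows):
    """Return {week: {item: share of that week's total expenditure}}."""
    spend = defaultdict(lambda: defaultdict(float))
    for week, item, price, quantity in rows:
        spend[week][item] += price * quantity
    shares = {}
    for week, by_item in spend.items():
        total = sum(by_item.values())
        shares[week] = {item: value / total for item, value in by_item.items()}
    return shares


shares = expenditure_shares(records)
for week in sorted(shares):
    print(week, shares[week])
```

In this invented data the weight of brand A rises in the week it is discounted, which is the kind of short-run shift that only data with quantities can reveal.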
8. Properties of the Experimental Scanner Data Set
- Obtained from AC Nielsen
- 65-week period (first 13 weeks used as base period)
- 19 grocery commodities
- Variables (quantities, prices, commodity-brand, size and packaging)
- Four supermarket chains (over 80% of grocery sales)
9. Theme 1: Aggregation Tree and Index Formulae
- Three key questions being asked:
  - (i) Can a better understanding of substitution between items guide our drawing of commodity boundaries at the lower levels of the CPI aggregation tree?
  - (ii) What index formula should be used at each level of aggregation?
  - (iii) Under what conditions can a unit value index be validly used in CPI compilation?
10. Theme 1: Aggregation Tree and Index Formulae
- Three key questions being asked:
  - (i) Can a better understanding of substitution between items guide our drawing of commodity boundaries at the lower levels of the CPI aggregation tree?
    - (Work in progress)
Theme 1 (continued)
- (ii) What index formula should be used at each level of aggregation?
  - (Confirms geometric mean is best performing microindex)
- (iii) Under what conditions can a unit value index be validly used in CPI compilation?
  - (Unit values calculated over items tend to cause appreciable bias; OK to compute over outlets for same item)
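A unit value is total expenditure divided by total quantity. The sketch below, using invented numbers, illustrates one way the finding on unit values can arise: pooling different items lets a shift in the purchase mix move the unit value even when no price changes, whereas pooling outlets for the same item gives the average price paid for one homogeneous product.

```python
def unit_value(sales):
    """Unit value = total expenditure / total quantity for (price, quantity) pairs."""
    total_quantity = sum(q for _, q in sales)
    return sum(p * q for p, q in sales) / total_quantity


# Invented data: two DIFFERENT items whose prices do not change between periods.
# Only the quantity mix shifts toward the cheaper item.
period0_items = [(4.00, 50), (2.00, 50)]   # (price, quantity): premium, budget
period1_items = [(4.00, 20), (2.00, 80)]

uv_index_items = unit_value(period1_items) / unit_value(period0_items)
print(uv_index_items)    # below 1 although every price relative equals 1

# Invented data: the SAME item sold at two outlets, both raising price by 5%.
period0_outlets = [(2.00, 50), (2.20, 50)]
period1_outlets = [(2.10, 50), (2.31, 50)]

uv_index_outlets = unit_value(period1_outlets) / unit_value(period0_outlets)
print(uv_index_outlets)  # close to 1.05, matching the common price movement
```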
11. Theme 2: Sample Allocation
- Conclusion: greater gains in index quality from increasing the number of items in sample than from increasing the number of stores
12. Theme 3: Treatment of Discontinuities in Traditional Data Sources
- Question: what is the best way of dealing with missing price observations (e.g. gaps, quality changes)?
- Possibilities: using matching observations only, imputing missing data, carrying forward the last observation, hedonics
- (Work in progress)
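Three of the possibilities listed above can be compared on a toy example. This is a sketch under stated assumptions: the prices are invented, a geometric-mean microindex is used throughout, and "impute" is taken to mean moving the missing price by the average movement of the matched items, which is only one of several imputation rules. Hedonics is not shown.

```python
import math


def jevons(relatives):
    """Geometric mean of a list of price relatives."""
    return math.exp(sum(math.log(r) for r in relatives) / len(relatives))


# Invented prices for three items; None marks a missing observation.
base = {"a": 2.00, "b": 4.00, "c": 5.00}
current = {"a": 2.20, "b": 4.40, "c": None}

# 1. Matched observations only: drop the item with the gap.
matched = [current[k] / base[k] for k in base if current[k] is not None]
index_matched = jevons(matched)

# 2. Carry forward: treat the missing price as unchanged since the last observation.
carried = {k: current[k] if current[k] is not None else base[k] for k in base}
index_carried = jevons([carried[k] / base[k] for k in base])

# 3. Impute: move the missing price by the average movement of the matched items.
imputed = {k: current[k] if current[k] is not None else base[k] * index_matched
           for k in base}
index_imputed = jevons([imputed[k] / base[k] for k in base])

print(index_matched, index_carried, index_imputed)
```

With this particular imputation rule the imputed index equals the matched-only index, while carrying forward pulls the index toward no change; the three treatments separate further once gaps persist over several periods.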
13. Theme 4: Frequency and Treatment of Discontinuities and Quality Changes in Scanner Data
14. Theme 5: Volatility
- Finding: scanner data has increased volatility of indexes (when introduced straight into our existing compilation practices)
- Does this reflect real-world volatility?
- Or is it a flaw in the conceptual framework for dealing with high-frequency data?
- Change to scanner-based index compilation procedures?
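One simple way to put a number on the volatility discussed above is the standard deviation of period-to-period percentage changes. The sketch below is illustrative only: the weekly series is invented (a slow upward drift with a recurring price special), and the choice of measure and of three-week averaging are assumptions, not the ABS approach.

```python
import statistics


def pct_changes(series):
    """Period-to-period percentage changes of an index series."""
    return [100.0 * (later / earlier - 1.0)
            for earlier, later in zip(series, series[1:])]


def volatility(series):
    """Population standard deviation of the percentage changes."""
    return statistics.pstdev(pct_changes(series))


def block_means(series, size):
    """Average consecutive, non-overlapping blocks of `size` observations."""
    return [statistics.mean(series[i:i + size])
            for i in range(0, len(series) - size + 1, size)]


# Invented weekly index: every third week carries a deep special.
weekly = [100, 90, 100, 101, 91, 101, 102, 92, 102, 103, 93, 103]
smoothed = block_means(weekly, 3)

print(volatility(weekly), volatility(smoothed))
```

The sketch only measures the difference; whether the high-frequency movements are real-world volatility or an artefact of the framework is the open question the slide raises.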
15. Theme 6: An Altered Economics of CPI Compilation
- Cost structure would change dramatically; would need to rethink design
- Much larger data set
- Higher computer transaction costs
- Purchase costs (some field costs)
- Reworking of mathematics of CPI construction
- Change in compilation practices
- Retraining of price statisticians
- Redevelopment of computer systems
- Objective of theme: to understand change in cost components
16. Theme 7: Subsampling the Scanner Data
- Objective: to reduce costs of using the scanner data
- (Work in progress)
17. Theme 8: What is the True Cardinality of the Data Set?
- Some evidence of supermarket chains using common price schedules for significant parts of the CPI basket
- Exploring the possibility of using price schedules for parts of the CPI compilation
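If stores within a chain share a price schedule, the number of independent price observations is smaller than the number of stores. A minimal sketch of how that could be checked, assuming invented store prices and treating a "schedule" as an identical price vector over a fixed list of items:

```python
# Invented prices by (chain, store) for the same three items, in a fixed item order.
store_prices = {
    ("chain X", "store 1"): (2.00, 3.50, 1.20),
    ("chain X", "store 2"): (2.00, 3.50, 1.20),
    ("chain X", "store 3"): (2.00, 3.50, 1.20),
    ("chain Y", "store 1"): (2.10, 3.40, 1.20),
    ("chain Y", "store 2"): (2.10, 3.45, 1.20),
}


def distinct_schedules(prices_by_store):
    """Group stores that share an identical price vector."""
    groups = {}
    for store, prices in prices_by_store.items():
        groups.setdefault(prices, []).append(store)
    return groups


groups = distinct_schedules(store_prices)
print(len(store_prices), "stores,", len(groups), "distinct price schedules")
```

Exact matching is the strictest possible test; real data would probably need a tolerance for local specials and rounding.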
18. Theme 9: A Different Theoretical Foundation Needed?
- Our theoretical tools (mainly the existing corpus of Cost of Living index theory) are not fully adequate for the economic behaviours (search, shopping and inventory behaviour) that are incorporated in high-frequency data
- (Triplett 2000)
19. Conclusions
- Scanner data has considerable potential to assist with CPI compilation
- Consensus conclusions are emerging for some research questions
- Costs are not trivial
- Research is expensive
- Meta analysis would be useful to generalise research findings. Need a home for relevant research documents and links. ABS is prepared to do this through its Ottawa Group secretariat activities