Title: Slide 1
1 Quality reporting in a short-term business survey based on administrative data
M. Carla Congia (congia_at_istat.it), Fabio Rapiti (rapitifa_at_istat.it)
ISTAT - Italy
European Conference on Quality in Official Statistics
Session on Quality reporting
Rome, 8-11 July 2008
2 Outline
Quality reporting
- The Italian Oros Survey
- Quality issues in using administrative data
- Peculiarities of data quality assessment
- Oros quality indicators and reporting
- Final remarks
Q2008 - Rome, 8-11 July 2008
3 The Oros Survey
- Since 2003 the Oros survey has released quarterly indicators on gross wages and total labour cost per full-time equivalent (FTE), covering enterprises of all sizes in the private non-agricultural sector (sections C to K of NACE Rev. 1.1)
- It is based on extensive use of administrative data (National Social Security Institute - INPS) combined with survey data on large firms with more than 500 employees (Monthly Large Enterprise Survey)
- Provisional estimates, based on the provisional population, are released with a 70-day delay
- Final estimates are produced after 5 quarters on the basis of the whole population and complete, updated information
- It also meets the requirements of the European regulations
  - STS - Short-Term Statistics
  - LCI - Labour Cost Index (hourly labour cost index)
4 The administrative source
- National Social Security Institute - INPS
  - All Italian firms in the private sector with at least one employee have to pay monthly social security contributions to INPS (roughly 1.3 million employers and 12 million employees)
- DM10 form
  - The Monthly Declaration is a highly detailed grid where information on total employment, wage bills, paid days, overtime hours and social contributions is identified by specific administrative codes (about 5,000 valid codes)
  - Each DM10 is spread over several records (8 on average)
- Data capturing
  - Every firm transmits the DM10 to INPS monthly in electronic format, no later than 30 days after the reference period
  - The complete set of raw declarations is then forwarded to Istat 35 days after the end of the reference period (about 10 million records each month)
5 The strategy for exploiting administrative data
A constraint became an opportunity
At first INPS could not aggregate the DM10 data into the format required for Oros purposes within the very tight schedule. So the Istat strategy became "Catch what you can as quick as you can":
- from a typical "one collection for one single output/product" approach
- to a focus on the whole data source - the wage and contribution system
- Advantages
  - the microdata are exactly those sent by firms, which allows more direct control of the aggregation/translation process
  - a lot of information is available for many other statistical purposes
- Disadvantages
  - a complex preliminary phase of checks and computations within each single DM10 is needed to derive the target variables at micro level
  - a lot of the data is not necessarily useful for short-term objectives
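The computation inside a single DM10 can be pictured with a minimal Python sketch. It assumes each record of a declaration is a (code, amount) pair and that a lookup table maps administrative codes to target variables. The codes and the mapping below are hypothetical and do not reproduce real DM10 codes or the actual Oros translation scheme.

```python
from collections import defaultdict

# Hypothetical mapping from administrative codes to statistical variables.
# Real DM10 codes and their meanings are NOT reproduced here.
CODE_TO_VARIABLE = {
    "A01": "employees",
    "A02": "employees",
    "W01": "wages",
    "C01": "contributions",
}


def translate_declaration(records):
    """Sum the amounts of one declaration's records into target variables.

    Each record is a (code, amount) pair. Codes missing from the mapping
    are returned separately so they can be checked, not silently dropped.
    """
    totals = defaultdict(float)
    unknown = []
    for code, amount in records:
        variable = CODE_TO_VARIABLE.get(code)
        if variable is None:
            unknown.append(code)
        else:
            totals[variable] += amount
    return dict(totals), unknown


# One illustrative declaration spread over several records.
declaration = [
    ("A01", 10),
    ("A02", 2),
    ("W01", 25000.0),
    ("C01", 8000.0),
    ("Z99", 1),
]
variables, unknown_codes = translate_declaration(declaration)
```

Returning the unmapped codes separately reflects the idea that unknown codes deserve a check before the micro-level variables are used.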
6 Quality issues in using INPS administrative data
- The Oros challenge is to produce short-term indicators by processing
  - a huge quantity of very detailed microdata
  - within a very tight schedule
  - while coping with the frequent changes in the basic INPS metadata
    - enterprises have to use the DM10 form to take advantage of labour cost reduction policies, and these contribution laws change continuously
- After preliminary studies, INPS data were considered suitable for Oros purposes, but statisticians still have
  - no quarterly ex-ante control over the quality of the raw administrative data
- Only a complex quality-oriented production process can ensure
  - ex-post quality
  - coping with unusual problems
7 Quality issues in using INPS administrative data
- Fragmented and insufficient INPS metadata -> In-house metadata database
- Highly disaggregated raw data -> Preliminary checks and accurate translation into statistical variables
- Integration with LE Survey data -> Checks to avoid double counting
- Continuous legislation changes -> Final key checks - macroediting
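A check against double counting when two sources are integrated can be sketched as follows. This is only an illustration under stated assumptions: both sources are keyed by a common firm identifier, and the LE survey value is preferred when a firm appears in both. The identifiers, values and the precedence rule are hypothetical and are not taken from the Oros procedure.

```python
def integrate(inps_units, le_units):
    """Combine two sources so that each firm is counted once.

    Both arguments map a firm identifier to its wage bill. When a firm
    appears in both sources the LE survey value is kept (an assumption
    made for this sketch) and the overlap is reported for review.
    """
    overlap = sorted(set(inps_units) & set(le_units))
    combined = dict(inps_units)
    combined.update(le_units)  # LE values overwrite INPS values
    return combined, overlap


# Illustrative data: firm F3 is present in both sources.
inps = {"F1": 100.0, "F2": 250.0, "F3": 900.0}
le = {"F3": 950.0, "F4": 1200.0}

combined, overlap = integrate(inps, le)
total_wage_bill = sum(combined.values())
```

The list of overlapping identifiers is the part that would feed a manual check; the total is computed only after each firm appears once.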
8 Peculiarities of data quality assessment
- For the quality assessment of administrative data, Eurostat recommends producing a source-specific report and a product-specific one
- In the Oros case the non-conventional use of administrative data implies that the two reports overlap, while new approaches to administrative data quality assessment are explored empirically
- Oros practice has been developed trying
  - to find better tools to assess quality
  - to manage the measurement of rather new indicators on
    - efficient and stable data capturing
    - completeness and consistency of metadata
    - stable translation/retrieval of target statistical variables
    - correct integration with LE survey data
  - to produce quality indicators quarterly along the whole production process
  - to meet both Istat and Eurostat requests on quality reporting
9 Oros quality reporting: an overview
[Diagram: the Oros quality reporting tools, classified as PROCESS or PRODUCT oriented and as QUALITATIVE or QUANTITATIVE]
- Survey Documentation and Methodological Handbook
- Metadata in SDDS
- Oros PR explanatory notes
- SIDI information system for survey documentation
- Istat Quality Report
- LCI Quality Report
- Oros Process Monitoring Report
- Quarterly LCI meta information
10 Survey Documentation and Methodological Handbook
- Initial basic quality assessment of the INPS administrative source to evaluate its suitability for the production of quarterly labour market indicators
  - Concepts and definitions of variables and population
  - Translation scheme of administrative information into statistical variables
  - Coverage
  - Reference time
  - Accuracy
  - Stability over time
- It also covers, of course, the survey methods and the description of the whole production process
11 Metadata in SDDS format
- Metadata in Special Data Dissemination Standard (SDDS) format, used to deliver information to the IMF
  - Base page: data, access by the public, integrity and quality
  - Summary methodology statements: key features enabling users to assess the suitability of the data for their purposes
- entirely qualitative and compiled once; it is updated following relevant changes in the methodology
- compiled for the 3 outputs and different users; efforts to systematize
  - Oros: ConIstat (short-term indicators TS database on the Istat website)
  - Oros: Eurostat
  - LCI: Eurostat
  - STS: Eurostat
12 Process Monitoring Report 1
Quantitative indicators to keep quality continuously under control and improve it along the whole Oros production process.
Some of them are also warning indicators: they signal decisive problems or detect sources of error.
Main quality indicators for some key steps of the process:
- Data capturing
  - Number of monthly records
  - Number of DM10 forms
  - Time lag between scheduled and actual delivery dates
- Metadata database updating
  - Date of last update of DM10 metadata on the INPS website
  - Number of new and expired DM10 codes by type
  - Rate of new DM10 codes to include/exclude
  - Number of official INPS acts to analyse
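Two of the indicators above lend themselves to a short illustration in Python: the number of new and expired codes, obtained by comparing two snapshots of the code list, and the time lag between scheduled and actual delivery dates. The function names, codes and dates are hypothetical and serve only as a sketch of the arithmetic.

```python
from datetime import date


def code_changes(previous_codes, current_codes):
    """Return (new, expired) code sets between two metadata snapshots."""
    previous, current = set(previous_codes), set(current_codes)
    return current - previous, previous - current


def delivery_lag_days(scheduled, actual):
    """Days between scheduled and actual delivery (positive means late)."""
    return (actual - scheduled).days


# Illustrative snapshots of the list of valid codes (made-up codes).
new, expired = code_changes(
    {"A01", "A02", "B10"},
    {"A01", "B10", "C20", "C21"},
)

# Illustrative delivery dates.
lag = delivery_lag_days(date(2008, 5, 5), date(2008, 5, 8))
```

With these inputs two codes are new, one has expired and the delivery is three days late. A breakdown by type would need a code-to-type table, which is not assumed here.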
13 Process Monitoring Report 2
- Preliminary checks on administrative data
  - DM10 codes error rate = Number of impossible codes / Total number of codes
  - DM10 codes edit rate = Number of codes changed by editing / Number of impossible codes
  - Rate of duplicate units = Number of duplicate units / Total number of units
- Micro editing
  - Edit rate = Number of units edited / Total number of units in scope for the item
  - Total contribution to key estimates from edited values = Total weighted quantity for edited values / Total weighted quantity for all final values
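The ratios above are simple to compute once the counts are available. The following Python sketch shows the edit rate and the contribution of edited values to a weighted total. The unit layout (weight, final value, edited flag) and all figures are illustrative assumptions, not Oros data.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def edited_contribution(units):
    """Share of the weighted total that comes from edited values.

    Each unit is a dict with 'weight', 'value' (final value) and 'edited'.
    """
    total = sum(u["weight"] * u["value"] for u in units)
    edited = sum(u["weight"] * u["value"] for u in units if u["edited"])
    return ratio(edited, total)


# Illustrative units in scope for one item.
units = [
    {"weight": 1.0, "value": 100.0, "edited": False},
    {"weight": 2.0, "value": 50.0, "edited": True},
    {"weight": 1.0, "value": 200.0, "edited": False},
]

edit_rate = ratio(sum(u["edited"] for u in units), len(units))
contribution = edited_contribution(units)

# Illustrative counts only: 40 impossible codes out of 5,000 codes.
code_error_rate = ratio(40, 5000)
```

Here one unit in three is edited, but it accounts for a quarter of the weighted total, which shows why the two indicators are reported separately.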
14 Process Monitoring Report 3
- Integration with LE survey data
  - Number of units manually checked due to record linkage problems (i.e. mergers or split-ups recorded at different times)
- Macroediting
  - Number of suspicious aggregates identified automatically by TERROR or through graphical checks
  - Number of outliers treated at micro or macro level
  - Total contribution to the estimates from treated values
  - Length of the homogeneous time series
15 Istat Quality Report
- Still experimental: Oros has been involved in the pilot test
- quality indicators within the framework of a qualitative report consistent with the Eurostat quality components
- disseminated within the System on Quality (SIQual), available on the Istat website
- external-user oriented
- subset of standard quality indicators, appropriately chosen from those available in the Information System for Survey Documentation (SIDI)
  - Response rate
  - Indicators on the revision policy (MR, MAR)
  - Timeliness for provisional data release
  - Timeliness for definitive data release
  - Length of the homogeneous time series
- description of non-sampling error, relevance, accessibility
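As an illustration of the revision indicators, the sketch below assumes that MR and MAR denote the mean revision and the mean absolute revision, with a revision defined as the final estimate minus the provisional one. The function name and the growth rates are made up for the example.

```python
def revision_indicators(provisional, final):
    """Return (MR, MAR) for paired provisional and final estimates.

    A revision is defined here as final minus provisional, so a positive
    mean revision means provisional figures were revised upwards on average.
    """
    if len(provisional) != len(final) or not provisional:
        raise ValueError("need two non-empty series of equal length")
    revisions = [f - p for p, f in zip(provisional, final)]
    n = len(revisions)
    mean_revision = sum(revisions) / n
    mean_absolute_revision = sum(abs(r) for r in revisions) / n
    return mean_revision, mean_absolute_revision


# Illustrative year-on-year growth rates (percent) for four quarters.
provisional = [2.0, 3.5, 1.0, 4.0]
final = [2.5, 3.0, 1.0, 4.5]

mr, mar = revision_indicators(provisional, final)
```

MR shows whether revisions have a systematic direction, while MAR shows their typical size. Both are needed because revisions of opposite sign cancel out in MR.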
16 LCI Quality Report
- Required by Eurostat to evaluate the quality of the national LCIs used to produce the European aggregate index; the LCI was established with a "harmonization of output", not a "harmonization of input", approach
- the LCI QR has been produced annually since 2004
- standard structure based on the Eurostat dimensions of quality, with a further aspect: completeness
- main standard quality indicators used
  - Revision policy (MR, MAR)
  - Timeliness for provisional data release
- description of the method for compiling hours worked (the LCI denominator)
Quarterly LCI meta information
- Standard template, mainly qualitative and release-specific
  - Changes in the labour market (collective agreements, laws) which have an impact on wages and labour cost
  - Reasons for revisions in NSA, WDA and SA data
17 Final remarks
- The innovative quarterly use of administrative data in Oros makes it necessary to monitor particular aspects of quality that are not usually considered in the standard quality assessment approach suggested by Eurostat
- Several specific indicators to assess the quality of the process, in particular the metadata updating and the translation/aggregation of raw INPS data, have been implemented, but they need to be more systematized
- These specific indicators are essential from the producer's point of view, but they could also be used to report to users on the quality of some key issues
- On the other hand, the Oros survey satisfies the internal (SIDI, SIQual) and external (Eurostat) requests for standard quality reports
- A better integration of all the reviewed quality reporting tools is desirable but only partially achievable