Title: Imputation and Editing of Income from the Administrative File in Israel
1. Imputation and Editing of Income from the
Administrative File in Israel's Censuses
prepared by Orly Furman and Dmitri Romanov
- UNITED NATIONS
- ECONOMIC COMMISSION FOR EUROPE
- CONFERENCE OF EUROPEAN STATISTICIANS
- Work Session on Statistical Data Editing
- Ljubljana, May 2011
2. Use of Administrative Income Files in Israel's Censuses
Questionnaire
- 1995 Census: employment; salary and self-employment income (net and gross)
- 2008 Census: employment
Reference period
- 1995 Census: September 1995
- 2008 Census: December 2008; calendar year 2008
Administrative income files
- 1995 Census: consistency checks of reportage on salary; imputation of 8.6% of records when data were missing
- 2008 Census: imputation of salary and self-employment income for all records; consistency checks of reportage on employment
3. Administrative Income File Used in the 1995 Census
- Source: National Insurance Institute (NII).
- Coverage: 1.965 million employee posts, with months of work and annual salary and wage for 1.815 million employees, as reported to the NII.
Measure                  Adm. file    Adm. file for the    Reported by the      Diff. census vs. adm. file
                                      20% census sample    20% census sample    for the census sample (%)
Average wage, NIS        4,706        4,763                3,978                -16.5
Employees (thousands)    1,815.1      1,705.0              1,566.5              -8.1
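The difference column can be reproduced from the two sample columns if it is read as the relative difference, in percent, between the census-reported figure and the administrative figure for the same sample. That reading is an assumption, since the slide does not state the formula, and the function name below is illustrative only. A minimal check:

```python
def relative_diff_pct(census_value: float, admin_value: float) -> float:
    """Percentage difference of the census figure relative to the administrative figure.

    Assumed formula; the slide only labels the column as a census-minus-admin difference.
    """
    return (census_value - admin_value) / admin_value * 100.0


# Figures taken from the two census-sample columns of the table above.
avg_wage_diff = relative_diff_pct(3978, 4763)
employees_diff = relative_diff_pct(1566.5, 1705.0)

print(round(avg_wage_diff, 1))   # -16.5
print(round(employees_diff, 1))  # -8.1
```

Both results match the published column, which supports reading it as a percentage relative to the administrative value for the sample.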
4. Amendments to the Salary Data in the 1995 Census
Treatment/amendment                                   Percentage of total
Non-amended value                                     69.0
Gross salary imputed by regression from net salary    21.0
Imputation of data from adm. file                     8.6
Editing of irregularities (division by 100/10)        1.2
Other editing                                         0.2
Total                                                 100.0
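The "division by 100/10" edit suggests salaries reported at the wrong scale, for example with extra digits. The slide does not describe how such records were detected, so the sketch below is only one plausible rule: compare the reported value with a reference value (here assumed to be the monthly salary from the administrative file) and accept a division by 10 or 100 if it brings the two within a tolerance. The function name, the reference value and the 25% tolerance are all assumptions for illustration.

```python
def correct_scale_error(reported: float, reference: float, tolerance: float = 0.25) -> float:
    """Return the reported salary, divided by 10 or 100 if that matches the reference.

    Hypothetical rule: a divisor is accepted when the scaled value lies within
    `tolerance` (relative) of the reference. Otherwise the value is left unchanged.
    """
    if reference <= 0:
        return reported
    for divisor in (1, 10, 100):
        candidate = reported / divisor
        if abs(candidate - reference) / reference <= tolerance:
            return candidate
    return reported


# A value reported 100 times too large is scaled back; a plausible value is kept.
scaled = correct_scale_error(450000, 4700)
kept = correct_scale_error(4500, 4700)
unmatched = correct_scale_error(20000, 4700)
```

Trying the divisor 1 first means a value that already agrees with the reference is never altered, and a value that matches at no scale is passed through for other treatment.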
5. Reporting Salary in the Census: Rounding
[Figure: Distribution of September salary reported in the census]
6. Reporting Salary in the Census: Confounding Net and Gross
[Figure: Deviation of the September salary reported in the census from the gross and net monthly salary per job in the administrative file, by annual salary percentiles, as a percentage of gross calculated salary]
7. Imputing Salary in the Census: Challenge of Multiple Jobs
[Figure: Distribution of values imputed on the basis of the administrative file for employees who held one job (left) and more than one job (right) in 1995]
8. Administrative Income Files Used in the 2008 Census
- Source: Tax Authority.
- Coverage: employee posts, months of work and annual salary and wage of employees, and business income of self-employed individuals.
- Usage: imputation of earnings (salary and business income of the self-employed), conditional on workforce status as reported in the census and on occurrence in the administrative income files.
- Challenge: treatment of inconsistencies between the two sources, due to misreporting in the census and/or omissions in the administrative file.
9. Identification of Cases in Which the Census Data
and the Administrative Data Do Not Coincide
10. Discrepancies between the Census Data and the Administrative Data
- Group A: Individuals found to have work income in 2008 according to the administrative income database, but who, according to the census, did not belong to the annual workforce.
- Group B: Individuals who reported in the census that they belonged to the annual workforce, but were not found to have work income in the administrative databases.
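The two groups amount to the off-diagonal cells of a cross-classification of census workforce status against presence in the administrative income files. A minimal sketch of that classification follows; the function and parameter names are hypothetical and not taken from the census processing system.

```python
from typing import Optional


def discrepancy_group(in_census_workforce: bool, has_admin_work_income: bool) -> Optional[str]:
    """Classify a linked record by the kind of discrepancy between the two sources.

    Returns "A" or "B" as defined on the slide, or None when the sources agree.
    """
    if has_admin_work_income and not in_census_workforce:
        return "A"  # income in the administrative file, not in the census workforce
    if in_census_workforce and not has_admin_work_income:
        return "B"  # in the census workforce, no income in the administrative file
    return None


# Illustrative records: (in census workforce, has administrative work income).
records = [(False, True), (True, False), (True, True), (False, False)]
groups = [discrepancy_group(census, admin) for census, admin in records]
print(groups)  # ['A', 'B', None, None]
```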
11. Analysis of Cases in Group A
- 67% of individuals in this group are in the primary working age group (19 to 65). According to the income tax data, 51% worked less than half a year in 2008. This reinforces the hypothesis that under-reporting of employment in the census is connected to irregular employment over the year.
- For 43% of this group, a record exists in the administrative income database for December 2008.
- 74% of the individuals who had work income in December but did not report employment in the census were in the primary working age group. For two thirds of them, the income database shows ongoing employment in 2008 lasting more than six months. This indicates a high probability of inaccurate reporting in the census with respect to labour market non-participation.
12. Analysis of Cases in Group B
Work status                    Distribution as reported    Absent from the income
                               in the census (%)           database (% of cell)
Total                          100.0                       12.3
Employees                      86.3                        10.7
Self-employed, not employing   8.3                         16.5
Self-employed, employing       4.4                         12.7
Cooperative members            0.1                         25.6
Kibbutz members                0.8                         75.6
Unpaid family members          0.1                         51.2
14. Analysis of Cases in Group B
- The working hypothesis is that the absence of information on employees and the self-employed is due to late or failed reporting by employers and self-employed individuals to the income tax authority.
- Accordingly, the employer of an employee who was absent from the 2008 income database should be examined, to check whether the employee was active in the preceding year.
- The examination shows that more than 50% worked in 2007 and held employee jobs. 80% of these worked for employers that reported in 2007 but did not report in 2008.
15. Algorithm of Income Imputation
- Group: Found to have work income according to the income database but do not belong to the workforce according to the census.
- Income recording method: Work months and salary imputed as per the income database.
- % of total imputed cases: 61.7
- % of cases reported in census: 7.9
16. Algorithm of Income Imputation (cont.)
- Group: Belong to the workforce according to the census but found not to have work income according to the income database; found to have been employed in 2007 by employers that did not report in 2008.
- Income recording method: The individual's salary for 2007 was imputed, adjusted for the average salary increase in the industry.
- % of total imputed cases: 15.2
- % of cases reported in census: 1.9
17. Algorithm of Income Imputation (cont.)
- Group: Belong to the workforce according to the census but found not to have work income according to the income database; found to be self-employed individuals who reported in 2007 but did not report in 2008.
- Income recording method: Income was imputed for holders of active files in 2007, adjusted for the average income increase in the industry.
- % of total imputed cases: 2.6
- % of cases reported in census: 0.3
18. Algorithm of Income Imputation (cont.)
- Group: Belong to the workforce according to the census but found not to have work income according to the income database; military personnel, housekeepers, caretakers, and individuals with unknown occupation.
- Income recording method: Income was imputed from the ongoing survey, according to the average income in defined estimation cells.
- % of total imputed cases: 3.6
- % of cases reported in census: 0.5
19. Algorithm of Income Imputation (cont.)
- Group: Individuals who reported in the census that they had worked but who do not belong to the above-mentioned groups.
- Income recording method: Income was imputed based on average income in the defined estimation cells, according to the number of months worked as reported in the census.
- % of total imputed cases: 16.9
- % of cases reported in census: 2.1
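The five groups above can be read as a decision cascade over each linked record. The sketch below is illustrative only: the field names, the method labels and the occupation codes are hypothetical, and the order in which the Group B sub-cases are tested simply follows the slide order, which the slides do not confirm as the actual priority.

```python
from dataclasses import dataclass

# Hypothetical occupation labels standing in for the categories named on the slides.
SURVEY_OCCUPATIONS = {"military", "housekeeper", "caretaker", "unknown"}


@dataclass
class Record:
    in_census_workforce: bool
    has_admin_income_2008: bool
    employee_2007_employer_silent_2008: bool = False
    self_employed_active_file_2007: bool = False
    occupation: str = "other"


def imputation_method(r: Record) -> str:
    """Pick an income source for one record, following the slide order (assumed)."""
    if r.has_admin_income_2008:
        return "admin_2008"  # includes Group A: months and salary from the income file
    if not r.in_census_workforce:
        return "none"  # no work income expected from either source
    if r.employee_2007_employer_silent_2008:
        return "salary_2007_adjusted"
    if r.self_employed_active_file_2007:
        return "business_income_2007_adjusted"
    if r.occupation in SURVEY_OCCUPATIONS:
        return "survey_cell_average"
    return "cell_average_by_census_months"


def adjust_2007_income(income_2007: float, industry_growth_rate: float) -> float:
    """Carry a 2007 amount forward by the industry's average growth (rate as a fraction)."""
    return income_2007 * (1.0 + industry_growth_rate)


method = imputation_method(
    Record(in_census_workforce=True, has_admin_income_2008=False,
           employee_2007_employer_silent_2008=True)
)
print(method)  # salary_2007_adjusted
```

Putting the administrative file first reflects the overall design described in the slides, where census-reported status is only used to decide what to do when the administrative record is missing.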
20. The Bottom Line
- All in all, discrepancies between the census reportage and the administrative source were treated, and income information from the administrative file was amended, in only 12.7% of the cases that reported employment in the 2008 census. In the remaining 87.3%, the data on earnings of employees and the self-employed were imputed from the administrative file.
- In contrast, income data as reported in the traditional 1995 census had to be amended or imputed in 29.6% of cases.
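These shares can be cross-checked against the figures on the earlier slides. The sketch below assumes that all the numbers are percentages and that the 1995 figure combines the two imputation rows of the 1995 amendments table (21.0 and 8.6). Neither reading is stated explicitly on the slides.

```python
# Shares of cases reported in the 2008 census, one per imputation group (slides 15-19).
treated_2008 = [7.9, 1.9, 0.3, 0.5, 2.1]
# Shares of total imputed cases for the same five groups.
share_of_treated = [61.7, 15.2, 2.6, 3.6, 16.9]

total_treated = round(sum(treated_2008), 1)       # cases with a treated discrepancy
untreated = round(100.0 - total_treated, 1)       # cases taken from the file as is
total_share = round(sum(share_of_treated), 1)     # should cover all treated cases
amended_1995 = round(21.0 + 8.6, 1)               # regression-imputed plus file-imputed

print(total_treated, untreated, total_share, amended_1995)  # 12.7 87.3 100.0 29.6
```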
21. Thank you!