Title: Working with Asian and Pacific Islander Data
1 Working with Asian and Pacific Islander Data
I. Coding Issues with Major Asian Groups
Francis P. Boscoe, Ph.D., New York State Cancer Registry
2 Specified Asian race groups in the NAACCR Data Standards
- Chinese (04)
- Japanese (05)
- Filipino (06)
- Korean (08)
- Asian Indian/Pakistani (09)
- Vietnamese (10)
- Laotian (11)
- Hmong (12)
- Kampuchean (13)
- Thai (14)
- Asian, not otherwise specified (96)
3 Pie chart showing share of population of Asian groups
6 major groups: 89%
Source: 2000 Census, SF2
4 Top 5 cancers among Asian subgroups: Males, 2000-2004
- Filipino (total cases 11,429)
- Prostate 30.3%
- Lung 18.1%
- Colon and Rectum 12.9%
- Non-Hodgkin Lymphoma 5.2%
- Liver 4.2%
- Japanese (8,502)
- Prostate 28.1%
- Colon and Rectum 17.1%
- Lung 12.4%
- Stomach 6.7%
- Bladder 5.7%
- Chinese (14,850)
- Prostate 21.2%
- Lung 15.9%
- Vietnamese (4,983)
- Lung 18.5%
- Prostate 15.5%
- Liver 15.1%
- Colon and Rectum 11.3%
- Stomach 5.7%
- Korean (4,471)
- Colon and Rectum 14.8%
- Stomach 14.1%
- Prostate 13.7%
- Lung 13.6%
- Liver 10.2%
Source: CINA vol. 3
5 Top 5 cancers among Asian subgroups: Females, 2000-2004
- Filipino (total cases 13,391)
- Breast 35.7%
- Colon and Rectum 9.2%
- Lung 8.6%
- Corpus and Uterus, NOS 6.9%
- Thyroid 6.9%
- Japanese (9,839)
- Breast 32.5%
- Colon and Rectum 15.9%
- Lung 9.8%
- Corpus and Uterus, NOS 5.3%
- Stomach 5.2%
- Chinese (14,283)
- Breast 28.5%
- Colon and Rectum 14.1%
- Vietnamese (4,469)
- Breast 24.0%
- Lung 11.6%
- Colon and Rectum 10.8%
- Thyroid 7.0%
- Cervix 6.0%
- Korean (5,276)
- Breast 25.8%
- Colon and Rectum 13.6%
- Stomach 9.2%
- Lung 8.8%
- Thyroid 5.2%
Source: CINA vol. 3
6 Kwong et al. 2005: Cancer incidence and mortality rates among Chinese, Filipino, Japanese, Korean, and Vietnamese in California varied greatly.
Chu and Chu 2005: Cancer incidence and mortality among Asian Indian, Chinese, Filipino, Japanese, Korean, Vietnamese, Hawaiian and Samoan populations exhibited variation and change over time.
(Both in Cancer 2005; 104(12 suppl.).)
7 NAPIIA algorithm
- Method for reassigning cases with code 96 (Asian, NOS) to a more specific race group using name and birthplace
- Version 1 is complete
- Expected to be part of 2008 Call for Data
8 Research question: Are the incidence data accurate and reliable enough to publish rates for specific Asian groups, at least on a nationwide basis?
As a way of testing this question, cancer cases diagnosed in New York State between 1996 and 2004 with a single race code of Chinese, Japanese, Filipino, Korean, Asian Indian/Pakistani, or Vietnamese (n = 19,290) were assessed to see if there was supporting evidence for the race code.
9
- Supporting evidence included first name, surname at birth, birth place, and whether the case was coded as Asian in the NYS hospital inpatient database.
- Cases with no supporting evidence, or where the only supporting evidence was the hospital inpatient database, were flagged as suspicious.
- Cases with birth places in areas of negligible Asian population (e.g., Eastern Europe, South America, Middle East) were also flagged as suspicious.
- Suspicious cases were subsequently manually reviewed to see if a more appropriate race code could be chosen.
10 First name and surname: NAPIIA name list was used (for women, the birth name was given precedence where available).
Birth place: Followed NAPIIA rules, with some additional acceptable combinations (e.g., birth place of Guyana and race of Asian Indian).
Examples of suspicious cases:
- John Chooying, birthplace New York, race = Chinese
- Tae Kim, birthplace unknown, race = Chinese
- Walter Parker, birthplace New Jersey, race = Asian Indian
Examples are fictional, for illustrative purposes.
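The flagging rule described on slides 9 and 10 can be sketched in a few lines of Python. This is only an illustration of the logic as stated on the slides, not the registry's actual code: the `Case` record and all of its field names are hypothetical, and deciding whether a name or birthplace "supports" a race code (the NAPIIA name list and birthplace rules) is assumed to have been done upstream.

```python
from dataclasses import dataclass


@dataclass
class Case:
    # Hypothetical fields; a real registry record is laid out differently.
    first_name_supports: bool      # first name is consistent with the coded race
    birth_surname_supports: bool   # surname at birth is consistent with the coded race
    birthplace_supports: bool      # birthplace is consistent with the coded race
    inpatient_asian: bool          # coded as Asian in the hospital inpatient database
    birthplace_negligible_asian: bool = False  # e.g., Eastern Europe, South America, Middle East


def is_suspicious(case: Case) -> bool:
    """Return True if the case should be sent to manual review."""
    # Evidence other than the inpatient database. A case supported only by
    # the inpatient database is treated the same as a case with no evidence.
    independent_evidence = (
        case.first_name_supports
        or case.birth_surname_supports
        or case.birthplace_supports
    )
    return (not independent_evidence) or case.birthplace_negligible_asian
```

Note that `inpatient_asian` never changes the outcome under this reading of the rule; it is kept in the record only because the slides list it as a source of supporting evidence.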
13 Asian Indian/Pakistani = 09, Unknown race = 99 (most NAACCR variables: unknown = 9)
White = 01, Vietnamese = 10
A transposition error rate of 1 in 4000 would be sufficient to account for the results seen.
57 of the Chinese cases miscoded as Japanese came from a single facility.
14
- This analysis implies that Asian rates may be 5% too high, but misclassification works both ways.
- Checked for cases coded to white, black, other, or unknown whose names and birthplaces suggest Asian race.
- 1,126 were found. Most commonly these were cases coded as white but with distinctively Chinese, Asian Indian, or Filipino names and birthplaces in the corresponding countries.
- The gross misclassification errors cancel out: overall Asian rates are accurate to within 1%.
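A rough back-of-the-envelope check shows how the two error flows could offset each other. It rests on two assumptions that the slides do not state explicitly: that the "5% too high" figure applies to the 19,290 single-race cases examined, and that it can be treated as an exact share rather than a rounded estimate.

```python
# Figures taken from the slides.
single_race_asian_cases = 19_290
missed_asian_cases = 1_126      # coded white/black/other/unknown, but apparently Asian

# Assumption: "5% too high" is read as an exact share of the 19,290 cases.
overcount_share = 0.05

wrongly_coded_asian = single_race_asian_cases * overcount_share
net_change = missed_asian_cases - wrongly_coded_asian
net_change_pct = 100 * net_change / single_race_asian_cases

print(f"Cases coded Asian in error (approx.): {wrongly_coded_asian:.1f}")
print(f"Net change in Asian case count:       {net_change:+.1f}")
print(f"Net change as a share of cases:       {net_change_pct:+.2f}%")
```

Under these assumptions the net change is a fraction of one percent, which is consistent with the statement that overall Asian rates are accurate to within 1%.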
16 Selected cancer counts among Asian subgroups in NYS, 1996-2004
BEFORE CLEANUP
- Chinese males
- Lung 917
- Colorectal 780
- Prostate 677
- Liver 532
- Stomach 412
- Asian Indian males
- Prostate 541
- Colorectal 190
- Lung 184
- NHL 110
- Vietnamese females
- Breast 73
AFTER CLEANUP
- Chinese males
- Lung 930
- Colorectal 804
- Prostate 702
- Liver 540
- Stomach 421
- Asian Indian males
- Prostate 441
- Lung 173
- Colorectal 168
- NHL 105
- Vietnamese females
- Breast 49
- Cervix 26
- Colorectal 21
- Lung 15
- Stomach 11
- Uterus 9
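The size of the cleanup effect is easier to see as a percent change per site. The sketch below assumes the AFTER CLEANUP figures follow the same subgroup order as the BEFORE list (Chinese males, Asian Indian males, Vietnamese females). Only sites that have both a before and an after count in the transcript are included.

```python
def percent_change(before: int, after: int) -> float:
    """Percent change from the before-cleanup count to the after-cleanup count."""
    return 100.0 * (after - before) / before


# (subgroup, site) -> (count before cleanup, count after cleanup)
counts = {
    ("Chinese males", "Lung"): (917, 930),
    ("Chinese males", "Colorectal"): (780, 804),
    ("Chinese males", "Prostate"): (677, 702),
    ("Chinese males", "Liver"): (532, 540),
    ("Chinese males", "Stomach"): (412, 421),
    ("Asian Indian males", "Prostate"): (541, 441),
    ("Asian Indian males", "Colorectal"): (190, 168),
    ("Asian Indian males", "Lung"): (184, 173),
    ("Asian Indian males", "NHL"): (110, 105),
    ("Vietnamese females", "Breast"): (73, 49),
}

changes = {key: round(percent_change(*pair), 1) for key, pair in counts.items()}

for (group, site), change in changes.items():
    print(f"{group:20s} {site:11s} {change:+.1f}%")
```

On this reading, Chinese male counts rise slightly at every listed site, while Asian Indian male prostate counts and Vietnamese female breast counts fall sharply.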
17 Conclusions and recommendations
- Data coding problems have little impact on overall Asian cancer rates.
- Rates for Asian Indians and Vietnamese are artificially high (10% and 28%, respectively) because of code confusion.
- For cancer sites more typically associated with whites (e.g., prostate, lung), Asian Indian and Vietnamese rates are even more in error.
- Rates for Chinese and Koreans are 3% low, largely driven by the Asian Indian miscodes.
- Rates for Japanese and Filipinos are 7-9% low, driven by Asian Indian miscodes and other problems unrelated to the codes.
18 Conclusions and recommendations
- These findings are based only on NYS data, but have been corroborated anecdotally by New Jersey, Texas, Louisiana and Alaska.
- The problem could be minimized through the creation of new codes:
- Asian Indian (say, 15)
- Vietnamese (say, 16)
19 Conclusions and recommendations
- There are other two-digit codes with leading zeros that are transposable with other valid codes in the NAACCR Data Standards, for example:
- Follow-up source central
- 03 = DMV registration, 30 = Hospital inpatient/outpatient
- 09 = HMO file, 99 = unknown source
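A simple screen for this kind of collision can be run over any list of two-digit codes. The sketch below is illustrative only. The two helper functions are hypothetical, and the code sets contain only the codes named in these slides, not the full NAACCR code lists. Two patterns are checked: digit reversals (01 and 10, 03 and 30), and leading-zero codes ending in the "unknown" digit (09 and 99). The second rule is an assumed reading of how 09 comes to be confused with the one-digit "unknown" value 9.

```python
def reversal_pairs(codes):
    """Pairs of distinct valid codes that are digit reversals of each other."""
    valid = set(codes)
    return sorted(
        (code, code[::-1])
        for code in valid
        if code < code[::-1] and code[::-1] in valid
    )


def unknown_collisions(codes, unknown="99"):
    """Leading-zero codes whose second digit matches the 'unknown' digit."""
    valid = set(codes)
    if unknown not in valid:
        return []
    return sorted(
        (code, unknown)
        for code in valid
        if code != unknown and code[0] == "0" and code[1] == unknown[0]
    )


# Only the codes mentioned in the slides; the real code lists are longer.
race_codes = {"01", "04", "05", "06", "08", "09", "10",
              "11", "12", "13", "14", "96", "99"}
follow_up_source_codes = {"03", "09", "30", "99"}

print(reversal_pairs(race_codes), unknown_collisions(race_codes))
print(reversal_pairs(follow_up_source_codes),
      unknown_collisions(follow_up_source_codes))

# Moving Asian Indian to 15 and Vietnamese to 16, as suggested in the slides,
# removes both collisions from this partial race code set.
proposed_race_codes = (race_codes - {"09", "10"}) | {"15", "16"}
print(reversal_pairs(proposed_race_codes), unknown_collisions(proposed_race_codes))
```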
20 Casefinding source: the best designed two-digit code of them all