Author manuscript; available in PMC: 2014 Sep 9.
Published in final edited form as: J Natl Cancer Inst. 2006 Sep 20;98(18):1335–1338. doi: 10.1093/jnci/djj363
Elizabeth B Lamont, James E Herndon II, Jane C Weeks, I Craig Henderson, Craig C Earle, Richard L Schilsky, Nicholas A Christakis, for the Cancer and Leukemia Group B
PMCID: PMC4158031; NIHMSID: NIHMS614815; PMID: 16985253
The publisher's version of this article is available at J Natl Cancer Inst
This article has been corrected. See the correction in volume 100 on page 70.
This article has been corrected. See the correction on page 1584.
Abstract
To determine the accuracy with which Medicare claims data measure disease-free survival in elderly Medicare beneficiaries with cancer, we performed a criterion validation study. We merged gold-standard clinical trial data of 45 elderly patients with node-positive breast cancer who were treated on the Cancer and Leukemia Group B (CALGB) adjuvant breast trial 9344 with Centers for Medicare and Medicaid Services (CMS) data files and compared the results of a CMS-based algorithm with the CALGB disease-free survival information to determine sensitivity and specificity. For 5-year disease-free survival, the sensitivity of the CMS-based algorithm was 100% (95% confidence interval [CI] = 81% to 100%), the specificity was 97% (95% CI = 83% to 100%), and the area under the receiver operator curve was 97% (95% CI = 90% to 100%). For 2-year disease-free survival, the test characteristics were less favorable: sensitivity was 83% (95% CI = 36% to 100%), specificity was 95% (95% CI = 83% to 100%), and area under the receiver operator curve was 84% (95% CI = 66% to 100%).
The elderly are numerically under-represented (1–3) and possibly physiologically misrepresented (4) in clinical trials of cancer chemotherapy such that trial participants are potentially younger and healthier than the average elderly cancer patient. The general population of elderly Americans therefore may not experience the same benefits and toxicities of chemotherapy as trial participants. Nevertheless, clinicians need information about the expected benefits and toxicities of chemotherapy in the elderly. A solution to this problem is to supplement clinical trial results with observational results. Prior research has shown that Centers for Medicare and Medicaid Services (CMS) data can be used to accurately measure chemotherapy use (5,6), but the extent to which it can be used to measure outcomes that are traditionally reported in clinical trials, including conventional survival and toxicity endpoints, is unknown.
Disease-free survival is one of the most common survival metrics in adjuvant chemotherapy trials and is defined as the time from enrollment to the first of two events: cancer relapse or death from any cause. Disease-free survival incorporates the morbidity associated with recurrent disease that overall survival does not and is thus most relevant to those cancers with long periods between recurrence and death (e.g., local or regionally advanced cancers of the breast, prostate, colon, and rectum). Indeed, the Food and Drug Administration has accepted disease-free survival as a regulatory endpoint that demonstrates clinical benefit for adjuvant therapy.
We are unaware of previous attempts to measure disease-free survival through Medicare claims. This absence is not due to an inability to measure survival—the accuracy of CMS vital status information is well established (7). Rather, it is the result of ambiguity regarding the ability of claims to capture cancer recurrence. As a preliminary study to determine the accuracy with which Medicare claims data capture disease-free survival in elderly breast cancer patients, we developed an algorithm for measuring both disease recurrence and death using Medicare data files and compared the results with an external gold-standard measure of disease-free survival, Cancer and Leukemia Group B (CALGB) clinical trial data.
The model cohort included all patients aged ≥ 65 years (n = 52) who consented, enrolled, and were treated on CALGB trial 9344, “Doxorubicin dose escalation, with or without Taxol, as part of the CA adjuvant regimen for node positive breast cancer” between January 1, 1995, and December 31, 1997. These 52 patients represented only 5.5% (52 of 944) of the full enrollment during this period. Although the percentage of elderly enrollees was small, the trial was chosen because the absolute number of elderly patients enrolled was among the largest of all phase 3 trials during that period.
We then carefully linked the patients’ CALGB clinical trial data (e.g., disease-free survival information) to their CMS Medicare claims files (i.e., denominator, Carrier, Outpatient, and MedPAR files) from enrollment through December 31, 2000 (8). We were able to match 51 (98%) of the 52 patients to Medicare files, a rate consistent with previous studies (9). Six patients were excluded because they were enrolled in health maintenance organizations and their claims were not processed through CMS (n = 3) or because they were not enrolled in Medicare part B (n = 3). Thus, the final analytic sample contained 45 patients.
This study was approved by the University of Chicago and Massachusetts General Hospital institutional review boards and conducted in compliance with their regulations. Data quality was ensured by careful review of data by CALGB Statistical Center staff and by the study chairperson. Statistical analyses were approved and confirmed by CALGB statisticians. All analyses were two-sided and performed using STATA version 8 SE; P<.05 was considered statistically significant.
We developed and ultimately refined a clinically intuitive algorithm to measure disease-free survival that required screening patients’ Medicare claims from the calendar date of their enrollment on CALGB 9344 forward through their last Medicare claim up to December 31, 2000 (the last day of available claims files), for evidence of cancer relapse or death. In clinical medicine, cancers are described according to primary anatomic sites (i.e., where the tumor originates) and secondary anatomic sites (i.e., location of tumor spread). The primary anatomic site is well documented in Surveillance, Epidemiology, and End Results (SEER) (10) and may be reasonably well documented in CMS files (11). However, whether (or when) the tumor has spread to anatomically distinct sites is not reported or measured by SEER, and it is not known whether such events are captured reliably in CMS claims. Because CMS claims use International Classification of Diseases 9th Revision–Clinical Modification (ICD-9-CM) diagnostic codes that do have distinct values for primary and secondary cancer sites, it is at least theoretically possible that CMS claims distinguish primary anatomic malignancy sites from secondary sites of primary cancer spread.
In CMS files, we identified cancer relapse by evaluating MedPAR, Carrier, and Outpatient files for ICD-9-CM diagnostic codes indicating secondary sites of cancer (Table 1) and then dated the relapse by “claim through date.” From the universe of ICD-9-CM secondary sites of cancer codes, we omitted the code 198.81 indicating “secondary malignant neoplasm of the breast” to avoid erroneous overcoding of patients’ primary breast cancer as “relapse” and ultimately omitted the code 198.89 “secondary malignant neoplasm of unspecified site” because of its high false-positive rate. Date of death was obtained from CMS denominator files. The first of the two possible events (i.e., relapse or death) was chosen to represent disease-free survival according to CMS claims. Individuals without either event were censored at the date of their last CMS claim with a maximum date of December 31, 2000.
Table 1.
Algorithm for measuring disease-free survival in Centers for Medicare and Medicaid Services (CMS) claims files
Code type | Value | Description |
---|---|---|
ICD-9 Dx* | 197.0 | Secondary malignant neoplasm of the lung |
ICD-9 Dx* | 197.1 | Secondary malignant neoplasm of the mediastinum |
ICD-9 Dx* | 197.2 | Secondary malignant neoplasm of the pleura |
ICD-9 Dx* | 197.3 | Secondary malignant neoplasm of other respiratory organs |
ICD-9 Dx* | 197.4 | Secondary malignant neoplasm of the small intestine, including duodenum |
ICD-9 Dx* | 197.5 | Secondary malignant neoplasm of the large intestine and rectum |
ICD-9 Dx* | 197.6 | Secondary malignant neoplasm of the retroperitoneum and peritoneum |
ICD-9 Dx* | 197.7 | Secondary malignant neoplasm of the liver |
ICD-9 Dx* | 197.8 | Secondary malignant neoplasm of the other digestive organs and spleen |
ICD-9 Dx* | 198.0 | Secondary malignant neoplasm of the kidney |
ICD-9 Dx* | 198.1 | Secondary malignant neoplasm of other urinary organs |
ICD-9 Dx* | 198.2 | Secondary malignant neoplasm of the skin |
ICD-9 Dx* | 198.3 | Secondary malignant neoplasm of the brain and spinal cord |
ICD-9 Dx* | 198.4 | Secondary malignant neoplasm of the other parts of the nervous system |
ICD-9 Dx* | 198.5 | Secondary malignant neoplasm of the bone and bone marrow |
ICD-9 Dx* | 198.6 | Secondary malignant neoplasm of the ovary |
ICD-9 Dx* | 198.7 | Secondary malignant neoplasm of the adrenal gland |
ICD-9 Dx* | 198.8 | Secondary malignant neoplasm of other sites |
ICD-9 Dx* | 198.82 | Secondary malignant neoplasm of the genital organs |
Death indicator† | Date of death | Field contains date of death of Medicare beneficiary |
*ICD-9 = International Classification of Diseases 9th Revision codes applied to CMS ambulatory and hospital files (i.e., Carrier, Outpatient, MedPAR).
†CMS denominator file.
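The claims-screening rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code (the analyses were run in STATA): the `Claim` record, its field names, and the function `dfs_from_claims` are hypothetical, the code set is copied from Table 1 as printed, and two behaviors are assumptions made here rather than details stated in the text (deaths after the cutoff are ignored, and a patient with no claims in the window is censored at enrollment).

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Tuple

# Secondary-site diagnosis codes as printed in Table 1.
RELAPSE_CODES = frozenset({
    "197.0", "197.1", "197.2", "197.3", "197.4", "197.5", "197.6", "197.7",
    "197.8", "198.0", "198.1", "198.2", "198.3", "198.4", "198.5", "198.6",
    "198.7", "198.8", "198.82",
})


@dataclass(frozen=True)
class Claim:
    """One claim record; field names are hypothetical, not CMS column names."""
    through_date: date
    dx_codes: Tuple[str, ...]


def dfs_from_claims(
    enrolled: date,
    claims: Iterable[Claim],
    death_date: Optional[date] = None,
    cutoff: date = date(2000, 12, 31),
) -> Tuple[int, date, int]:
    """Return (event flag, end date, days from enrollment to end date)."""
    # Only claims between enrollment and the last day of available files count.
    window = [c for c in claims if enrolled <= c.through_date <= cutoff]

    # Relapse is dated by the "claim through date" of any claim carrying
    # a secondary-site code.
    event_dates = [
        c.through_date for c in window if RELAPSE_CODES.intersection(c.dx_codes)
    ]
    # Assumption: a death recorded after the cutoff is not counted.
    if death_date is not None and death_date <= cutoff:
        event_dates.append(death_date)

    if event_dates:
        end = min(event_dates)  # first of relapse or death
        return 1, end, (end - enrolled).days

    # No event: censor at the last claim (assumption: at enrollment if none).
    end = max((c.through_date for c in window), default=enrolled)
    return 0, end, (end - enrolled).days
```

Called with a patient's enrollment date, claims, and denominator-file death date, the function returns an event flag of 1 with the earliest relapse or death date, or 0 with the censoring date.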
We compared CMS disease-free survival information with the gold standard of CALGB disease-free survival information (censored at December 31, 2000) to calculate sensitivity, specificity, and area under the receiver operator curve of the disease-free survival indicator during the entire follow-up period and at traditional trial–censoring points (i.e., 2 years and 5 years). We defined sensitivity as the proportion of the patients known (according to CALGB data) to have either died or relapsed who were correctly identified through CMS claims as such. We defined specificity as the proportion of the patients known (according to CALGB data) to be both alive and without disease relapse who were correctly identified through CMS claims as such. The global performance of the measures was summarized by the area under the receiver operator curve (graph of sensitivity versus 1 – specificity); the greater the area under the receiver operator curve (maximum 1.00), the better the discriminatory accuracy of the measure.
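As a minimal sketch of these two definitions, the following Python function computes sensitivity and specificity from paired per-patient event flags (1 = relapsed or died, 0 = alive without relapse). The function name and the toy flags are hypothetical and are not study data; the area under the curve and the confidence intervals are left out because the text does not state how they were computed.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    gold: Sequence[int], test: Sequence[int]
) -> Tuple[float, float]:
    """Compare test flags (e.g., CMS) against gold-standard flags (e.g., CALGB)."""
    if len(gold) != len(test):
        raise ValueError("gold and test must have one flag per patient")

    tp = sum(1 for g, t in zip(gold, test) if g == 1 and t == 1)
    fn = sum(1 for g, t in zip(gold, test) if g == 1 and t == 0)
    tn = sum(1 for g, t in zip(gold, test) if g == 0 and t == 0)
    fp = sum(1 for g, t in zip(gold, test) if g == 0 and t == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one gold-standard event and one non-event")

    # Sensitivity: share of gold-standard events that the test also flags.
    # Specificity: share of gold-standard non-events that the test also clears.
    return tp / (tp + fn), tn / (tn + fp)


# Toy flags for eight hypothetical patients (not study data).
gold_flags = [1, 1, 1, 0, 0, 0, 0, 0]
cms_flags = [1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(gold_flags, cms_flags)
```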
The sample was exclusively female, the mean age was 69.6 years (standard deviation ± 4.2 years), and 82% were white, 11% black, and 7% Hispanic. During the entire follow-up period (i.e., a maximum of 2162 days in the CALGB data source and 2142 days in the CMS data source), 15 (33%) of the 45 patients had relapsed or had died according to CMS claims. By comparison, the gold-standard source of survival information noted, as a first event, relapse or death in 14 (31%) of 45 (P = .84) (Supplementary Figure, available at http://jncicancerspectrum.oxfordjournals.org/jnci/content/vol98/issue18). The 5-year disease-free survival sensitivity (100%, 95% confidence interval [CI] = 81% to 100%), specificity (97%, 95% CI = 83% to 100%), and area under the receiver operator curve (97%, 95% CI = 90% to 100%) were favorable, each being more favorable than 2-year disease-free survival (sensitivity = 83%, 95% CI = 36% to 100%; specificity = 95%, 95% CI = 83% to 100%; and area under the receiver operator curve = 84%, 95% CI = 66% to 100%) (Table 2).
Table 2.
Test characteristics of the Centers for Medicare and Medicaid Services (CMS) disease-free survival algorithm*
Censoring Period | Sensitivity (95% CI) | Specificity (95% CI) | Area under ROC (95% CI) |
---|---|---|---|
2 y | 83% (36% to 100%) | 95% (83% to 100%) | 84% (66% to 100%) |
5 y | 100% (81% to 100%) | 97% (83% to 100%) | 97% (90% to 100%) |
*The test characteristics were estimated by comparing CMS files with gold-standard clinical trial data pertaining to 45 elderly women with lymph node–positive breast cancer treated on the adjuvant chemotherapy trial CALGB 9344. CI = confidence interval; ROC = receiver operator curve.
We compared results of the CMS algorithm to the gold-standard data source with respect to the three components of disease-free survival (i.e., relapse, death, and censoring) at 5 years. According to the gold-standard data source, 86% (12 of 14) of the disease-free survival failures were the result of relapsed disease and 14% (2 of 14) were the result of death (Table 3). The ICD-9-CM codes for “secondary malignant neoplasm” applied to CMS data correctly categorized 11 of these 12 relapsed patients in CALGB data (sensitivity = 92%, 95% CI = 66% to 100%) and correctly categorized 31 of the 33 nonrelapsed patients (specificity = 94%, 95% CI = 82% to 99%).
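The relapse-level figures in this paragraph follow directly from the reported counts, and the arithmetic can be checked with a short Python sketch. Variable names are illustrative; the only inputs are the counts stated in the text (11 of 12 relapsed patients and 31 of 33 nonrelapsed patients correctly categorized).

```python
# Counts taken from the text: relapse according to the CALGB gold standard.
true_pos, gold_relapsed = 11, 12        # relapsed patients flagged by CMS codes
true_neg, gold_not_relapsed = 31, 33    # nonrelapsed patients left unflagged

false_neg = gold_relapsed - true_pos        # relapses missed by the codes
false_pos = gold_not_relapsed - true_neg    # nonrelapsed patients flagged

sensitivity_pct = round(100 * true_pos / gold_relapsed)
specificity_pct = round(100 * true_neg / gold_not_relapsed)

# Patients carrying a secondary-site code in CMS data under these counts.
cms_flagged = true_pos + false_pos

print(sensitivity_pct, specificity_pct, cms_flagged)
```

The rounded percentages reproduce the 92% and 94% reported above, and the 13 patients flagged under these counts agree with the CMS "Relapsed" row of Table 3, although that row counts first events and so need not match in general.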
Table 3.
Five-year disease-free survival (DFS) censoring proportions for Cancer and Leukemia Group B (CALGB) versus Centers for Medicare and Medicaid Services (CMS) data
Censoring category | CMS data | CALGB gold-standard data |
---|---|---|
DFS = 1 | | |
Relapsed | 13 | 12 |
Dead | 2 | 2 |
DFS = 0 | | |
Censored | 30 | 31 |
Total | 45 | 45 |
We measured the patient-level difference in the gold-standard and CMS algorithm time variables. The median difference in CALGB compared with CMS time was 40 days, and the inter-quartile range was 12–86 days. The mean difference was 99 days (95% CI = 4.5 to 193 days).
This preliminary validation study of Medicare claims shows that for elderly Medicare beneficiaries with histories of lymph node–positive breast cancer who were treated on a randomized phase 3 CALGB adjuvant chemotherapy trial, contemporaneous Medicare claims files reflect subsequent 5-year disease-free survival with a high degree of accuracy. The algorithm appears to also distinguish relapse from death within the domain of disease-free survival, which suggests that the secondary anatomic site codes may capture cancer recurrence. Taken with results of prior work documenting the ability of CMS claims to accurately measure certain anticancer therapies (6,12,13), these results suggest that CMS data capture some key elements of the clinical trial paradigm, and thus existing CMS-based data sources (e.g., SEER-Medicare) may be leveraged to yield clinical information regarding the effectiveness of adjuvant cancer treatments in the general population of elderly with solid tumors. Future research will focus on confirming these findings in a larger and more diverse patient sample.
Acknowledgments
Notes
The study was supported, in part, by grants from the National Cancer Institute (CA93892 to E. Lamont), the American Cancer Society (institutional grant to the University of Chicago Cancer Center), and the Department of Medicine, University of Chicago. The research for CALGB 9344 was supported, in part, by National Cancer Institute grants CA31946 to the CALGB, Richard L. Schilsky, MD, Chairman; CA33601 to the CALGB Statistical Center, Durham, NC; CA32291 to Dana Farber Cancer Institute; CA47577 to Duke University Medical Center, Durham, NC; CA60138 to University of California at San Francisco, San Francisco, CA; and CA41287 to University of Chicago Medical Center, Chicago, IL.
We are grateful to Jeannette Dowell of the CALGB Statistical Center for assembly and preparation of the CALGB data and Laurie Meneades of the Department of Health Care Policy at Harvard Medical School for preparation of the CMS data. We appreciate methodologic advice from Drs Diane Lauderdale, Will Manning, and Ron Thisted of the Department of Health Studies at the University of Chicago.
Funding to pay the Open Access publication charges for this article was provided by the CALGB Foundation.
Footnotes
The sponsor had no role in the study design, data collection and analysis, interpretation of the data, or the preparation of the manuscript.
The following institutions participated in CALGB 9344: CALGB Statistical Office, Durham, NC—Stephen George, PhD, supported by CA33601; Baptist Cancer Institute CCOP, Memphis, TN—Lee S. Schwartzberg, MD, supported by CA71323; Christiana Care Health Services, Inc, CCOP, Wilmington, DE—Irving M. Berkowitz, DO, supported by CA45418; Community Hospital–Syracuse CCOP, Syracuse, NY—Jeffrey Kirshner, MD, supported by CA45389; Dana Farber Cancer Institute, Boston, MA—George P. Canellos, MD, supported by CA32291; Dartmouth Medical School–Norris Cotton Cancer Center, Lebanon, NH—L. Herbert Maurer, MD, supported by CA04326; Duke University Medical Center, Durham, NC—Jeffrey Crawford, MD, supported by CA47577; Eastern Cooperative Oncology Group, Philadelphia, PA—Robert L. Comis, MD, Chairman; Green Mountain Oncology Group CCOP, Bennington, VT—H. James Wallace Jr. MD, supported by CA35091; Kaiser Permanente CCOP, San Diego, CA—Jonathan A. Polikoff, MD, supported by CA45374; Long Island Jewish Medical Center, Lake Success, NY—Marc Citron, MD, supported by CA11028; Massachusetts General Hospital, Boston, MA—Michael L. Grossbard, MD, supported by CA12449; Mount Sinai Medical Center CCOP–Miami, Miami Beach, FL—Enrique Davila, MD, supported by CA45564; Mount Sinai School of Medicine, New York, NY—James F. Holland, MD, supported by CA04457; North Central Cancer Treatment Group, Rochester, MN—Michael J. O’Connell, MD, Chairman, supported by CA25224; North Shore University Hospital CCOP, Manhasset, NY—Vincent Vinciguerra, MD, supported by CA35279; North Shore University Hospital, Manhasset, NY—Daniel R. Budman, MD, supported by CA35279; Rhode Island Hospital, Providence, RI—Louis A. Leone, MD, supported by CA08025; Roswell Park Cancer Institute, Buffalo, NY—Ellis Levine, MD, supported by CA02599; South New Jersey CCOP, Camden, NJ—Jack Goldberg, MD, supported by CA54697; Southeast Cancer Control Consortium Inc. CCOP, Goldsboro, NC—James N. 
Atkins, MD, supported by CA45808; Southern Nevada Cancer Research Foundation CCOP, Las Vegas, NV—John Ellerton, MD, supported by CA35421; Southwest Oncology Group, San Antonio, TX—Charles Coltman, MD, Chairman; St Michael’s Medical Center Tri-County CCOP, Paterson, NJ—Arnold D. Rubin, MD, supported by CA60247; SUNY Health Science Center at Syracuse, Syracuse, NY—Stephen L. Graziano, MD, supported by CA21060; University of Alabama Birmingham, Birmingham, AL—Robert Diasio, MD, supported by CA47545; University of California at San Diego, San Diego, CA— Stephen L. Seagren, MD, supported by CA11789; University of California at San Francisco, San Francisco, CA—Alan P. Venook, MD, supported by CA60138; University of Chicago Medical Center, Chicago, IL—Nicholas J. Vogelzang, MD, supported by CA41287; University of Illinois at Chicago, Chicago, IL—Jeffrey A. Sosman, MD, supported by CA74811; University of Iowa Hospitals, Iowa City, IA—Gerald H. Clamon, MD, supported by CA47642; University of Maryland Cancer Center, Baltimore, MD—David Van Echo, MD, supported by CA31983; University of Massachusetts Medical Center, Worcester, MA—F. Marc Stewart, MD, supported by CA37135; University of Minnesota, Minneapolis, MN—Bruce A. Peterson, MD, supported by CA16450; University of Missouri/Ellis Fischel Cancer Center, Columbia, MO—Michael C. Perry, MD; supported by CA12046; University of Nebraska Medical Center, Omaha, NE—Anne Kessinger, MD, supported by CA77298; University of North Carolina at Chapel Hill, Chapel Hill, NC—Thomas C. Shea, MD, supported by CA47559; University of Tennessee Memphis, Memphis, TN—Harvey B. Niell, MD, supported by CA47555; Vermont Cancer Center, Burlington, VT—Hyman B. Muss, MD, supported by CA77406; Virginia Commonwealth University MB CCOP, Richmond, VA—John D. Roberts, MD, supported by CA52784; Wake Forest University School of Medicine, Winston-Salem, NC—David D. Hurd, MD, supported by CA03927; Walter Reed Army Medical Center, Washington, DC—John C. 
Byrd, MD, supported by CA26806; Washington University School of Medicine, St Louis, MO—Nancy L. Bartlett, MD, supported by CA77440; and Weill Medical College of Cornell University, New York, NY — Ted P. Szatrowski, MD, supported by CA07968.
References
1. Trimble EL, Carter CL, Cain D, Freidlin B, Ungerleider RS, Friedman MA. Representation of older patients in cancer treatment trials. Cancer. 1994;74(Suppl):2208–14. doi: 10.1002/1097-0142(19941001)74:7+<2208::aid-cncr2820741737>3.0.co;2-#.
2. Murthy VH, Krumholz HM, Gross CP. Participation in cancer clinical trials: race-, sex-, and age-based disparities. JAMA. 2004;291:2720–6. doi: 10.1001/jama.291.22.2720.
3. Hutchins LF, Unger JM, Crowley JJ, Coltman CA Jr, Albain KS. Underrepresentation of patients 65 years of age or older in cancer-treatment trials. N Engl J Med. 1999;341:2061–7. doi: 10.1056/NEJM199912303412706.
4. Lewis JH, Kilgore ML, Goldman DP, Trimble EL, Kaplan R, Montello MJ, et al. Participation of patients 65 years of age or older in cancer clinical trials. J Clin Oncol. 2003;21:1383–9. doi: 10.1200/JCO.2003.08.010.
5. Lamont EB, Lauderdale DS, Schilsky RL, Christakis NA. Construct validity of medicare chemotherapy claims: the case of 5FU. Med Care. 2002;40:201–11. doi: 10.1097/00005650-200203000-00004.
6. Lamont EB, Herndon JE 2nd, Weeks JC, Henderson IC, Lilenbaum R, Schilsky RL, et al. Criterion validity of Medicare chemotherapy claims in Cancer and Leukemia Group B breast and lung cancer trial participants. J Natl Cancer Inst. 2005;97:1080–3. doi: 10.1093/jnci/dji189.
7. Cowper DC, Kubal JD, Maynard C, Hynes DM. A primer and comparative review of major US mortality databases. Ann Epidemiol. 2002;12:462–8. doi: 10.1016/s1047-2797(01)00285-x.
8. Henderson IC, Berry DA, Demetri GD, Cirrincione CT, Goldstein LJ, Martino S, et al. Improved outcomes from adding sequential Paclitaxel but not from escalating Doxorubicin dose in an adjuvant chemotherapy regimen for patients with node-positive primary breast cancer. J Clin Oncol. 2003;21:976–83. doi: 10.1200/JCO.2003.02.063.
9. Hatten J. Medicare’s common denominator: the covered population. Health Care Financ Rev. 1980;2:53–64.
10. Zippin C, Lum D, Hankey BF. Completeness of hospital cancer case reporting from the SEER Program of the National Cancer Institute. Cancer. 1995;76:2343–50. doi: 10.1002/1097-0142(19951201)76:11<2343::aid-cncr2820761124>3.0.co;2-#.
11. Nattinger AB, Laud PW, Bajorunaite R, Sparapani RA, Freeman JL. An algorithm for the use of Medicare claims data to identify women with incident breast cancer. Health Serv Res. 2004;39(Pt 1):1733–49. doi: 10.1111/j.1475-6773.2004.00315.x.
12. Virnig BA, Warren JL, Cooper GS, Klabunde CN, Schussler N, Freeman J. Studying radiation therapy using SEER-Medicare-linked data. Med Care. 2002;40(Suppl):IV-49–54. doi: 10.1097/00005650-200208001-00007.
13. Cooper GS, Virnig B, Klabunde CN, Schussler N, Freeman J, Warren JL. Use of SEER-Medicare data for measuring cancer surgery. Med Care. 2002;40(Suppl):IV-43–8. doi: 10.1097/00005650-200208001-00006.