Lung cancer is the leading cause of cancer death worldwide. Accurate prediction of lung cancer risk is of value for individuals, clinicians, and researchers. The aims of this study were to characterize the associations between pulmonary function and sputum DNA image cytometry (SDIC) and lung cancer, and their contributions to risk prediction. During 1990 to 2007, 2,596 high-risk individuals were enrolled and followed prospectively for development of lung cancer (n = 139; median follow-up 7.7 years) in trials at the British Columbia Cancer Agency. At baseline, an epidemiologic questionnaire was administered, sputum was collected for aneuploidy measurement, and spirometry was obtained. Multivariable logistic models were prepared that included known lung cancer predictors (model 1), that additionally included percent of predicted forced expiratory volume in 1 second (FEV1%; model 2), and that additionally included SDIC (model 3). Prediction was assessed by evaluating discrimination [receiver operator characteristic area under the curve (ROC AUC)] and calibration. Net reclassification indices (NRI) were calculated with cutoff points for 8-year risks identifying low, intermediate, and high risk at 1.5% and 3%. Lung cancer risk increased with decline in FEV1%, but did so more for men than for women (interaction P < 0.001). SDIC demonstrated a dose–response with lung cancer (P = 0.022). The ROC AUCs for models 1, 2, and 3 were 0.718 (95% CI: 0.671–0.765), 0.767 (95% CI: 0.725–0.809), and 0.773 (95% CI: 0.732–0.815), respectively. Model 2 versus 1 had a NRI of 12.6% (P < 0.0001) and model 3 versus 2 had a NRI of 3.1% (P = 0.059). Spirometry and SDIC data substantially and minimally improved lung cancer prediction, respectively. Cancer Prev Res; 4(4); 552–61. ©2011 AACR.
Lung cancer is the leading cause of cancer deaths in North America and world-wide (1–3). Accurate estimation of lung cancer risk is of importance to individuals, clinicians, researchers, and health system administrators. Estimation of high risk may help motivate some individuals to quit smoking or remain former smokers. Knowledge of an individual's high risk could be used by clinicians to promote smoking cessation programs, and increase monitoring or application of early detection programs. Researchers may attempt to improve study efficiency by selectively enrolling high-risk individuals into lung cancer screening or chemoprevention trials. Healthcare officials use prediction models to estimate population burden of disease and the requirements to handle such a burden. Recently, the National Cancer Institute announced that the National Lung Screening Trial (4) found that low-dose computed tomography (CT) screening statistically significantly reduced lung cancer mortality by 20% in high-risk individuals (5). Cost-effective adoption of CT lung cancer screening programs will require identification and application of such programs to high-risk individuals.
Several lung cancer risk prediction models have been proposed (6–11), but none have incorporated data on pulmonary function or sputum DNA image cytometry (SDIC). An association between airflow obstruction and increased lung cancer risk has been recognized since the 1980s (12). Several studies have found an association between reduced pulmonary function or forced expiratory volume in 1 second (FEV1) and lung cancer incidence or mortality or both (13–23). The exact nature of the relationship between FEV1 and lung cancer is not well understood. It is unclear whether mild reduction in FEV1 increases lung cancer risk, whether there is a threshold effect, and whether there is a difference in effect between men and women. It is also not known whether reduced pulmonary function has an independent effect or just reflects smoking exposure. To date, no study has evaluated to what extent pulmonary function data improve lung cancer risk prediction.
Several studies indicate that abnormal sputum cells, in particular identifying genetic alterations associated with malignant transformation, may be useful in identifying individuals at high risk of lung cancer (24–28). Several studies have reported that SDIC and automated SDIC improve sensitivity of detection of lung cancer over conventional pathology-based cytology and suggest that SDIC might be useful for early detection of lung cancer (24, 29–35). The extent to which SDIC data can improve lung cancer risk prediction is unclear. In the current study, we evaluate SDIC because it is relatively inexpensive and rapid and can be adapted to large-scale population screening.
The aims of the current study were (i) to describe the independent associations between pulmonary function and SDIC and lung cancer, and (ii) to evaluate to what extent inclusion of these factors improves lung cancer risk prediction. It should be noted that factors that are significantly associated with lung cancer may nonetheless not contribute substantially to lung cancer prediction. Association is measured with effect estimates, such as the odds ratio (OR), and prediction is often measured by estimating discrimination (accuracy of classification) and calibration (correspondence between probabilities predicted by the model and observed probabilities). Generally, only factors with strong associations with the outcome substantially improve prediction when included in models (36). Thus, for this study we evaluated both association and prediction.
Between December 14, 1990 and May 31, 2007, 2,596 current and former smokers above 40 years of age who had smoked 20 pack-years or more were enrolled as part of several National Cancer Institute-sponsored lung cancer chemoprevention trials (U01-CA-96109, N01-CN-85188, N01-CN65030 and P01 CA096964) and biomarker studies at the British Columbia Cancer Agency (BCCA). Participants were included in the studies if they did not have a history of cancer except nonmelanoma skin cancer, localized prostate cancer, carcinoma in situ of the cervix, or superficial bladder cancer with conclusion of treatment more than 6 months prior to enrollment. Individuals with uncontrolled intercurrent illness such as symptomatic congestive heart failure, unstable angina pectoris, cardiac arrhythmia, or severe chronic obstructive pulmonary disease requiring supplemental oxygen were excluded from the study. The participants were recruited into the study through the community outreach network of the public relations department of the BCCA using television programs, radio broadcasts, and local newspapers, and through local physicians and dentists. Upon enrollment, subjects completed an epidemiologic questionnaire (see Supplementary Appendix S1) to document their sociodemographic data, occupational exposure to asbestos, smoking history, medical history, medication use, and family history of lung cancer. A former smoker was defined as a person who had not smoked for at least 1 year. Self-reported current smoking status was verified by exhaled carbon monoxide and urinary cotinine measurements.
Baseline spirometry was conducted using a flow-sensitive spirometer (Presto Flash Portable Spirometer Version 1.2, Spacelab Burdick Inc.) in accordance with the American Thoracic Society recommendations (37, 38). To estimate lung function, both FEV1 and forced vital capacity (FVC) were used. The results were recorded in liters (L) and as a percent of predicted (FEV1%) based on age, height, and sex using standardized prediction equations (39). Three measures of pulmonary function were evaluated: FEV1%, FVC, and FEV1: FVC ratio. Of these, the latter 2 did not significantly enhance the prediction model (data not shown). Only the results for FEV1% are presented here.
A sputum sample was obtained from each participant using simultaneous high-frequency chest wall oscillation and inhalation of 3% hypertonic saline from an ultrasonic nebulizer for 12 minutes as described previously (40, 41). The subjects were instructed to cough intermittently during the induction procedure and for at least 2 hours afterward to produce sputum samples. The sputum samples were fixed in 50% ethanol and each sample was cytospun onto a glass slide and DNA was stained with Feulgen-thionin. We used an automated, high-resolution image cytometer (Cyto-Savant™ system from Oncometrics Inc.) to measure the DNA content of at least 3,000 epithelial cells per sample (30, 42). Diploid DNA had a DNA index of 1.0. The number of cells with a DNA index greater than 1.2 was recorded.
Between 1990 and 2000, a chest radiograph was obtained in all subjects. Since April 2000, a low-dose spiral CT scan was obtained (24, 43). The participants were followed prospectively through personal contact or through the British Columbia Cancer Registry to determine whether they developed lung cancer. By law, pathology laboratories in the province of British Columbia are required to report all newly diagnosed cancers to this Registry. Ethics approval for this study was provided by the University of British Columbia Research Ethics Board.
Statistics used to describe the study population include Fisher's exact test for categorical data, t test for comparison of continuous data between 2 groups, and nonparametric test of trend for ordinal data. Aims of the study were evaluated using multivariable logistic regression models for the outcome lung cancer, including both non–small cell and small cell lung cancers. Base models, excluding and including pulmonary function data or SDIC data or both, were compared. In modeling, a priori predictors of lung cancer (base model covariates) were forced into models and included age, socioeconomic status (estimated by education in 7 levels), body mass index (BMI, weight in kilograms/height in meters squared), family history of lung cancer, smoking pack-years, smoking status (current vs. former), and smoking quit-time. “Pack-years smoked” was calculated as the average number of cigarettes smoked per day divided by 20, multiplied by the number of years smoked. Quit-time was calculated by subtracting the age at which the subject last smoked a cigarette from the age of enrolment into the study. These base predictor covariates come from a validated risk prediction model developed using Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO) data (44). Interactions and nonlinear effects of predictors were evaluated, with the latter being assessed using restricted cubic splines (45).
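The derived covariates defined above can be sketched in a few lines of Python. This is only an illustration of the stated definitions; the function and argument names are hypothetical and are not taken from the study's Stata or R code.

```python
def pack_years(cigarettes_per_day: float, years_smoked: float) -> float:
    """Average cigarettes per day divided by 20, multiplied by years smoked."""
    return cigarettes_per_day / 20.0 * years_smoked


def quit_time(age_at_enrolment: float, age_last_smoked: float) -> float:
    """Age at enrolment minus the age at which the last cigarette was smoked."""
    return age_at_enrolment - age_last_smoked


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


# Illustrative values only (not study data).
print(pack_years(30, 30))    # 30 cigarettes/day for 30 years
print(quit_time(60, 52))     # quit at age 52, enrolled at age 60
print(round(body_mass_index(80, 1.75), 1))
```

For example, a participant who smoked 30 cigarettes per day for 30 years has 30/20 × 30 = 45 pack-years, which exceeds the 20 pack-year enrollment threshold.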
Significant independent associations between FEV1% or SDIC and lung cancer risk were evaluated by the likelihood ratio test (LRT). Overall model performance was evaluated with the R2 statistic (Nagelkerke's pseudo-R2). Models’ abilities to discriminate were assessed using the concordance or C-statistic or its equivalent, the receiver operator characteristic area under the curve (ROC AUC; ref 46). The ability of FEV1% or SDIC to improve discrimination was tested by evaluating whether there was a statistically significant difference in ROC AUCs between nested predictive models including and excluding FEV1% or SDIC. Statistical differences in AUCs were evaluated by the nonparametric method described by DeLong and colleagues (47). Model calibration was assessed by the Hosmer–Lemeshow goodness-of-fit test, and by evaluating how much the slope of the calibration line (plotting the model predicted probabilities versus the observed probabilities) deviated from the ideal of 1. The mean absolute error and 90th percentile absolute error were statistics used to appraise calibration, with error referring to the difference between the observed values and the bias-corrected calibrated values. “Optimism” or overfit-corrected estimates of model predictive performance (internal validation) were prepared using bootstrap methods applying the programs calibrate and validate in Harrell's Design and Hmisc packages in R using 200 resamplings (45, 48). In addition, risk stratification table analysis was used to calculate the net reclassification index (NRI) (49, 50). Cutoff points for defining risk were set on the model-predicted probability (P): P < 0.015 for low risk, P ≥ 0.015 to P < 0.030 for intermediate risk, and P ≥ 0.030 for high risk. It was thought that these risks are generally in the range that might be useful in selecting individuals into a lung cancer screening program.
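The risk stratification analysis can be illustrated with a minimal Python sketch of a three-category NRI using the cutoff points stated above (0.015 and 0.030). The function names and the toy probabilities are hypothetical and are not study data. The sketch computes only the point estimates, not the P values for each component.

```python
from bisect import bisect_right

# Cutoff points from the text: low < 0.015 <= intermediate < 0.030 <= high.
CUTOFFS = (0.015, 0.030)


def risk_category(probability, cutoffs=CUTOFFS):
    """Return 0 (low), 1 (intermediate), or 2 (high) for a predicted risk."""
    return bisect_right(cutoffs, probability)


def net_reclassification(old_probs, new_probs, is_case):
    """Return (net for cases, net for noncases, overall NRI).

    Cases are credited for moving to a higher category under the new model,
    noncases for moving to a lower category.
    """
    up_cases = down_cases = up_non = down_non = 0
    n_cases = n_non = 0
    for p_old, p_new, case in zip(old_probs, new_probs, is_case):
        move = risk_category(p_new) - risk_category(p_old)
        if case:
            n_cases += 1
            up_cases += move > 0
            down_cases += move < 0
        else:
            n_non += 1
            up_non += move > 0
            down_non += move < 0
    nri_cases = (up_cases - down_cases) / n_cases
    nri_noncases = (down_non - up_non) / n_non
    return nri_cases, nri_noncases, nri_cases + nri_noncases


# Toy example (illustrative values only): 2 cases, 4 noncases.
old = [0.020, 0.040, 0.020, 0.020, 0.010, 0.040]
new = [0.035, 0.040, 0.010, 0.010, 0.010, 0.020]
case = [True, True, False, False, False, False]
print(net_reclassification(old, new, case))
```

In this toy example one of two cases moves up a category and three of four noncases move down, so both components are positive. A negative component for cases would indicate that the new model moves more cases down than up.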
For comparative purposes, a Cox proportional hazards model was prepared and compared with the final logistic regression model. All presented P values are 2 sided. Models, statistics, and figures were prepared using Stata/MP 11.1 (StataCorp; College Station) and R (version 2.11.0; ref 48) statistical programs.
The study population is described in Table 1. The mean age was 58.2 years. The population included slightly more males than females (52.4% vs. 47.6%). The most common educational category achieved by participants was completion of high school (28.1%). Approximately 11.9% of subjects had a family history of lung cancer. The mean BMI was 26.9. Slightly more participants were former than current-smokers (54.2% vs. 45.8%). The mean pack-years smoked and smoking duration were 47.2 pack-years and 37.3 years, respectively. The mean overall quit-time was 4.7 years and in former smokers was 8.7 years. The mean FEV1% was 87.5. The mean and median number of abnormal cells observed in the sputum were 8.25 (SD 16.77) and 5 (25–75% interquartile range 2–10), respectively. During the study follow-up (median 7.7 years, interquartile range 5.0–9.9 years), 139 lung cancers were detected (incidence rate = 74.5 per 10,000 person-years). In univariate analysis, compared with noncases, cases were significantly older, had lower education, lower BMI, were more likely to be current smokers, had greater pack-years of smoking, longer duration of smoking, shorter average quit-times, lower FEV1%, and more abnormal cells in their sputum (Table 1).
A baseline model excluding FEV1% and SDIC is presented in model 1, Table 2. Adjusted for relevant covariates (predictors in model 1), FEV1% was significantly and independently associated with lung cancer risk (model 2, Table 2). A statistically significant interaction was present between FEV1% and sex with regard to lung cancer risk (LRT for interaction P < 0.0001). Lower FEV1% increased the risk of lung cancer in men and women, but did so more strongly in men than in women (Fig. 1). The possibility that the relationship between FEV1% and lung cancer risk was nonlinear was evaluated using restricted cubic splines. The nonlinear component had a P = 0.560, indicating that the linear expression for FEV1% in the logistic model was adequate.
Model predictive performance statistics are presented in Table 2. The multivariable base model (model 1, Table 2) and the base model that in addition includes FEV1% and the FEV1%–sex interaction (model 2, Table 2) have ROC AUCs of 0.718 (95% CI: 0.671–0.765) and 0.767 (95% CI: 0.725–0.809), respectively (Figure 2). The P value testing the hypothesis that the 2 areas are equal is 0.0001, indicating significant improvement in discrimination.
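As a reminder of what the ROC AUC measures, the following Python sketch computes the C-statistic directly from its definition: the proportion of case-noncase pairs in which the case received the higher predicted probability, with ties counted as one half. The function name and toy values are hypothetical. The sketch does not implement the DeLong test or the confidence intervals reported here.

```python
def c_statistic(probs, is_case):
    """Concordance statistic, equivalent to the ROC AUC for a binary outcome."""
    case_probs = [p for p, c in zip(probs, is_case) if c]
    noncase_probs = [p for p, c in zip(probs, is_case) if not c]
    concordant = 0.0
    for p_case in case_probs:
        for p_non in noncase_probs:
            if p_case > p_non:
                concordant += 1.0   # case ranked above noncase
            elif p_case == p_non:
                concordant += 0.5   # tie counts as half
    return concordant / (len(case_probs) * len(noncase_probs))


# Toy example (illustrative values only): 2 cases, 3 noncases.
probs = [0.05, 0.03, 0.03, 0.01, 0.02]
case = [True, True, False, False, False]
print(c_statistic(probs, case))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from noncases. The pairwise loop is quadratic in sample size and is meant only to make the definition concrete.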
The calibration slope (bootstrap adjusted for optimism) and mean and 90th percentile absolute error (Table 2) indicate that models 1 and 2 are well calibrated. For the model including FEV1% and the FEV1%–sex interaction, the calibration slope was slightly higher and the 90th percentile error was slightly smaller, suggesting that the calibration for model 2 was at least as good as for model 1.
A risk stratification table analysis (Table 3, model 1 to model 2) was conducted comparing the base model 1 to the model that additionally included FEV1% interacting with sex (model 2). Model 2 had a NRI of 12.6% (P < 0.0001). A net of 3.6% of cases were reclassified into a higher risk category by model 2 (P = 0.366), and a net of 9.0% of noncases were reclassified into a lower risk category by model 2 (P < 0.0001). Model 2 placed cases into the high-risk group more often than model 1 (89.2% vs. 84.9%) and placed noncases into the low-risk group more often than model 1 (13.0% versus 8.0%). The risk stratification analysis indicates that FEV1% does contribute substantially to the prediction of lung cancer, in addition to the base predictors.
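The overall NRI is the sum of its two components, the net proportion of cases moved to a higher risk category and the net proportion of noncases moved to a lower risk category. With the values reported above:

```latex
\mathrm{NRI}_{2\text{ vs }1}
  = \underbrace{0.036}_{\text{cases, net upward}}
  + \underbrace{0.090}_{\text{noncases, net downward}}
  = 0.126
```

Most of the improvement therefore comes from noncases being moved to lower risk categories.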
When model 2 was analyzed including pulmonary function data which were collected more than 1 to more than 6 years prior to diagnosis or last follow-up, the association between FEV1% and lung cancer as measured by LRT was present for up to 5 years before diagnosis, but not strongly past 5 years (Table 4).
Sputum DNA image cytometry
In the fully adjusted model, including tertile groupings of SDIC (model 3, Table 2), a dose–response was observed: OR(2nd tertile vs. 1st) = 1.57 (95% CI: 0.98–2.53; P = 0.060), and OR(3rd tertile vs. 1st) = 1.74 (95% CI: 1.09–2.78; P = 0.019). SDIC when modeled as a single 3-level variable had an OR of 1.30 (95% CI: 1.04–1.63; P = 0.022) per 1 level change.
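As a rough consistency check, and assuming the single 3-level variable acts linearly on the log-odds scale, the per-level OR implies an OR for a two-level difference (3rd vs. 1st tertile) of:

```latex
\mathrm{OR}_{\text{3rd vs 1st}} \approx 1.30^{2} = 1.69
```

This is close to the 1.74 estimated when the tertiles are modeled as separate categories.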
When SDIC was excluded from and added to model 2, the ROC AUCs are 0.767 (95% CI: 0.725–0.809) and 0.773 (95% CI: 0.732–0.815; Table 2). The P value testing the hypothesis that the 2 AUCs are equal is 0.191. Thus, although SDIC was significantly associated with lung cancer risk, inclusion of this variable in the prediction model did not significantly improve the discrimination when assessed by ROC AUC. Calibrations of models 2 and 3 were comparable and high when assessed by calibration slope and absolute errors (Table 2).
Model 3 versus 2 led to net reclassification of 1.4% of cases into a lower risk category (P = 0.317), and 4.5% of noncases into a lower risk category (P < 0.0001). Models 2 and 3 placed cases into the high-risk group with the same sensitivity (89.2%), and model 3 placed noncases into the low-risk group more often than model 2 (15.2% vs. 13.0%). The risk stratification analysis indicates that SDIC adds modestly to prediction overall, and does significantly improve prediction model specificity, possibly at the cost of sensitivity.
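The components of this NRI can be checked from the case counts given in the Discussion (3 cases moved to a lower category and 1 to a higher category, out of 139 cases). Because cases moving down count against the new model, the case component is negative:

```latex
\mathrm{NRI}_{\text{cases}} = \frac{1 - 3}{139} \approx -0.014,
\qquad
\mathrm{NRI}_{3\text{ vs }2} = -0.014 + 0.045 = 0.031
```

This matches the overall NRI of 3.1% reported for model 3 versus model 2.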
When model 3 was analyzed including sputum cytology data that were collected more than 1 to more than 6 years prior to diagnosis or last follow-up, the association between SDIC and lung cancer disappeared for sputum cytology data that were collected more than a year prior to diagnosis (Table 4).
Logistic regression models formatted with beta coefficients and model constant, which enable computation of individual probabilities, for models 2 and 3 (Table 2) are presented in Supplementary Appendix S2 and S3. A base model which in addition includes SDIC, but excludes FEV1%, is presented in the Supplementary Appendix S4. A Cox proportional hazards regression model analogous to logistic regression model 3 (Table 2) is presented in Supplementary Appendix S5. The direction and magnitude of effect estimates are generally consistent between the logistic and Cox models.
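The way beta coefficients and a model constant yield an individual probability can be sketched as follows. The coefficient values and covariate names below are placeholders invented for illustration. They are NOT the values in Supplementary Appendix S2 or S3 and must not be used for risk estimation.

```python
import math


def predicted_probability(intercept, coefficients, covariates):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear_predictor = intercept + sum(
        beta * covariates[name] for name, beta in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# PLACEHOLDER coefficients (hypothetical, not from the paper's appendices).
intercept = -6.0
coefficients = {
    "age": 0.05,
    "fev1_pct": -0.02,
    "male": 0.5,
    "fev1_pct_x_male": -0.01,  # FEV1%-by-sex interaction term
}

# Hypothetical participant: 60-year-old man with FEV1% of 80.
participant = {"age": 60, "fev1_pct": 80, "male": 1, "fev1_pct_x_male": 80 * 1}

p = predicted_probability(intercept, coefficients, participant)
print(f"predicted probability: {p:.4f}")
```

The resulting probability can then be compared with the 0.015 and 0.030 cutoff points to assign a risk category. The interaction term is entered as the product of FEV1% and the sex indicator, which is one common way of coding such a term.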
Several possible risk factors for lung cancer did not approach significance when added to the full model 3: history of adult pneumonia (P = 0.592), emphysema (P = 0.964), or bronchitis (P = 0.854). Occupational asbestos exposure was not significantly associated with lung cancer (OR = 1.19, 95% CI: 0.80–1.77; P = 0.397) and inclusion of asbestos exposure in models did not improve prediction. For example, model 3 had an ROC AUC of 0.773 and when including asbestos had a decline in ROC AUC to 0.771.
The current study found that FEV1% and SDIC were both significantly associated with lung cancer risk. Importantly, the effect of FEV1% was modified by gender such that the impact of reduced lung function on lung cancer risk was greater in men than in women. Regarding prediction, addition of FEV1% and/or SDIC to models did not substantially impact calibration. Addition of FEV1% to the base model significantly improved discrimination as assessed by ROC AUC and by NRI. Addition of SDIC to model 2 led to modest improvement in ROC AUC from 0.767 to 0.773 (P = 0.191). The NRI from risk stratification analysis was 3.1% (P = 0.059). The positive NRI was entirely due to improvement in model specificity: SDIC led to greater net reclassification of noncases into lower risk categories. Both models 2 and 3 had the same sensitivities in that they classified 123 of 139 cases in the high-risk category (89.2%). However, model 3, compared with model 2, reclassified 3 cases to a lower risk category while reclassifying only 1 case into a higher risk category. The net reclassification probability of cases for model 3 was in the wrong direction (−0.014, P = 0.317). Although this change could have been due to chance, it is worrisome, because a risk prediction model whose primary use might be identifying individuals for lung cancer screening should optimize sensitivity; that is, it should avoid missing true cases.
Past studies suggested a link between FEV1% and lung cancer (13–23). The current study confirms this relationship and characterizes it further. Our study indicates that no threshold effect exists as the test for nonlinear effect was not significant, and that the association is independent of smoking variables. That the FEV1%-lung cancer association appears to be present for up to 5 years prior to lung cancer diagnosis, suggests that FEV1% may be a predictor useful for early detection.
Wasswa-Kintu and colleagues conducted a meta-analysis of FEV1% and lung cancer risk, which included the pooled results from 4 studies (51). Their meta-analysis was limited to studies that were prospective, population based, large in size (≥5,000 participants), and adjusted for cigarette smoking status. They found that FEV1% was associated with increased risk of lung cancer as it dropped below 100% predicted, and the effect was greater in women than in men. Compared with the highest quintile of FEV1% (more than 100%), the lowest quintile of FEV1% (≤70%) was associated with a 2.23-fold increase (95% CI: 1.73–2.86) in risk for lung cancer in men and a 3.97-fold increase (95% CI: 1.93–8.25) in women. The findings of the current study are in contrast to the meta-analysis. Several study differences may explain this. The populations in the meta-analysis were mostly drawn from the general population, not from high-risk smokers, as in our study. Furthermore, in the meta-analysis, only relative but not absolute probabilities were computed and compared.
Our study has several strengths. The study was prospective, which minimizes selection and recall biases that case–control study designs are vulnerable to. The sample was population based and thus more representative than hospital or medical system-based studies. The sample size and number of outcomes were large enough to provide adequate power for most of our study aims and for predictive modeling with reduced likelihood of overfitting models.
Our analysis used modern statistical approaches. This is the first study to evaluate nonlinear effects for FEV1% using restricted cubic splines, a method considered to be superior to alternative quadratic models, because it can describe diverse and rapidly changing patterns (52). Application of this method allowed us to statistically test for nonlinear or threshold effects. We used bootstrap methods to correct measures of predictive performance for optimism or overfitting of models. This method is considered superior to split sample and cross-classification methods of validation (45, 53).
Our study has limitations. We cannot draw any definitive conclusions on the predictive utility of SDIC at this time. Although the NRI for adding SDIC to model 2 suggested improvement that approached statistical significance, the benefit appeared to come solely from improved specificity at the possible expense of sensitivity. The added predictive value of SDIC in lung cancer risk prediction models requires further study and its routine inclusion in models cannot be recommended at this time. SDIC or aneuploidy is thought to reflect genetic damage that has accumulated in the lungs (25, 54). In this study, SDIC as measured by quantitative image cytometry was significantly associated with lung cancer risk, but did not strongly improve prediction. It may be that alternative methods of assessing genetic alterations in sputum cells, such as fluorescence in situ hybridization (FISH; ref 27), alone or in combination, will be superior. Such methods need to be evaluated concurrently in large prospective studies.
In the models presented here, smoking exposure was carefully adjusted for, and included pack-years smoked, current smoking status, and smoking quit-time. The effect of FEV1% appears to be independent of carefully adjusted smoking variables. This suggests that FEV1% is not just adjusting for residual confounding due to inadequate adjustment for smoking. Does FEV1% just reflect tissue damage and inflammation, which might be more directly causally linked with lung carcinogenesis? Many studies have demonstrated that emphysema and chronic obstructive pulmonary disease are risk factors for lung cancer (55, 56). Of these factors, in the current study, only emphysema was associated with lung cancer in unadjusted analysis (OR = 2.22, 95% CI: 1.17–3.78; P = 0.012). In the fully adjusted model, none of emphysema, bronchitis, or adult pneumonia, which were significant in other studies (8–10), approached significance (P values of 0.916, 0.792, and 0.510, respectively). Thus, in this study, FEV1% data predominated over specific lung diseases thought to be linked with increased lung cancer risk. Objective assessment of lung function may therefore be a more accurate measure than subjective recall or interpretation of a history of diagnosis of chronic obstructive lung disease or pneumonia.
The current study indicates that FEV1% is an important contributor to lung cancer risk prediction. Spirometric measurement of pulmonary function should be part of the standard of care in evaluating smokers for the presence and severity of chronic obstructive pulmonary disease (57, 58). The current study shows that a relatively inexpensive, simple, and noninvasive test, namely, spirometry (FEV1%) significantly improves lung cancer risk prediction.
Future research needs to validate the current study findings in large well-designed studies in diverse populations. The FEV1%-lung cancer association is currently under study in the Pan-Canadian Early Lung Cancer Detection Study (59) and the National Lung Screening Trial (4), and we are planning to validate and elaborate on the current study finding in those study data. The current study raises important questions. What is the mechanism explaining how FEV1% is related to lung cancer? If FEV1% is in the direct causal pathway independent of smoking and other important factors, then will improving FEV1% lead to reduction in lung cancer risk, when other factors are held constant? Should FEV1% be considered an intermediate marker in lung cancer chemoprevention trials?
In conclusion, our study demonstrates that pulmonary function data, in particular FEV1%, add significantly to lung cancer risk prediction. Upon further validation, our findings will lead to the encouragement of the collection of spirometric data in smokers and adoption of such data into complete lung cancer risk prediction models. SDIC improved prediction to a lesser extent. Further study of this biomarker in large prospective studies is recommended. Our study illustrates the importance of evaluating the incremental value of biomarkers over and above what can be achieved using a prediction model based on sociodemographic and smoking data alone.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
This study was supported by the funding provided by U.S. NIH and National Cancer Institute, and British Columbia Cancer Foundation (Canada).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
NOTE: Supplementary material for this article is available at Cancer Prevention Research Online (http://cancerprevres.aacrjournals.org/).
- Received August 3, 2010.
- Revision received January 12, 2011.
- Accepted January 12, 2011.
- ©2011 American Association for Cancer Research.