TY - JOUR
T1 - A data-mining framework for large scale analysis of dose-outcome relationships in a database of irradiated head and neck cancer patients
AU - Robertson, Scott P.
AU - Quon, Harry
AU - Kiess, Ana P.
AU - Moore, Joseph A.
AU - Yang, Wuyang
AU - Cheng, Zhi
AU - Afonso, Sarah
AU - Allen, Mysha
AU - Richardson, Marian
AU - Choflet, Amanda
AU - Sharabi, Andrew
AU - McNutt, Todd R.
N1 - Publisher Copyright: © 2015 American Association of Physicists in Medicine.
PY - 2015/7/1
Y1 - 2015/7/1
N2 - Purpose: To develop a hypothesis-generating framework for automatic extraction of dose-outcome relationships from an in-house, analytic oncology database. Methods: Dose-volume histograms (DVH) and clinical outcomes have been routinely stored to the authors' database for 684 head and neck cancer patients treated from 2007 to 2014. Database queries were developed to extract outcomes that had been assessed for at least 100 patients, as well as DVH curves for organs-at-risk (OAR) that were contoured for at least 100 patients. DVH curves for paired OAR (e.g., left and right parotids) were automatically combined and included as additional structures for analysis. For each OAR-outcome combination, only patients with both OAR and outcome records were analyzed. DVH dose points, D(Vt), at a given normalized volume threshold Vt were stratified into two groups based on severity of toxicity outcomes after treatment completion. The probability of an outcome was modeled at each Vt = [0%,1%,⋯,100%] by logistic regression. Notable OAR-outcome combinations were defined as having statistically significant regression parameters (p < 0.05) and an odds ratio of at least 1.05 (5% increase in odds per Gy). Results: A total of 57 individual and combined structures and 97 outcomes were queried from the database. Of all possible OAR-outcome combinations, 17% resulted in significant logistic regression fits (p < 0.05) having an odds ratio of at least 1.05. Further manual inspection revealed a number of reasonable models based on either reported literature or proximity between neighboring OARs. The data-mining algorithm confirmed the following well-known OAR-dose/outcome relationships: dysphagia/larynx, voice changes/larynx, esophagitis/esophagus, xerostomia/parotid glands, and mucositis/oral mucosa. Several surrogate relationships, defined as OAR not directly attributed to an outcome, were also observed, including esophagitis/larynx, mucositis/mandible, and xerostomia/mandible. Conclusions: Prospective collection of clinical data has enabled large-scale analysis of dose-outcome relationships. The current data-mining framework revealed both known and novel dosimetric and clinical relationships, underscoring the potential utility of this analytic approach in hypothesis generation. Multivariate models and advanced, 3D dosimetric features may be necessary to further evaluate the complex relationship between neighboring OAR and observed outcomes.
KW - dose-outcome modeling
KW - head and neck cancer
KW - large-scale analytics
KW - toxicity
UR - http://www.scopus.com/inward/record.url?scp=84933059854&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84933059854&partnerID=8YFLogxK
U2 - 10.1118/1.4922686
DO - 10.1118/1.4922686
M3 - Article
C2 - 26133630
AN - SCOPUS:84933059854
SN - 0094-2405
VL - 42
SP - 4329
EP - 4337
JO - Medical physics
JF - Medical physics
IS - 7
ER -