Deep reinforcement learning for automated radiation adaptation in lung cancer

doi:10.1002/mp.12625

Full metadata record

DC Field	Value	Language
dc.contributor.author	Tseng, Huan-Hsin	en_US
dc.contributor.author	Luo, Yi	en_US
dc.contributor.author	Cui, Sunan	en_US
dc.contributor.author	Chien, Jen-Tzung	en_US
dc.contributor.author	Ten Haken, Randall K.	en_US
dc.contributor.author	El Naqa, Issam	en_US
dc.date.accessioned	2019-04-02T05:59:58Z	-
dc.date.available	2019-04-02T05:59:58Z	-
dc.date.issued	2017-12-01	en_US
dc.identifier.issn	0094-2405	en_US
dc.identifier.uri	http://dx.doi.org/10.1002/mp.12625	en_US
dc.identifier.uri	http://hdl.handle.net/11536/147861	-
dc.description.abstract	Purpose: To investigate deep reinforcement learning (DRL) based on historical treatment plans for developing automated radiation adaptation protocols for non-small cell lung cancer (NSCLC) patients that aim to maximize tumor local control at reduced rates of radiation pneumonitis grade 2 (RP2). Methods: In a retrospective population of 114 NSCLC patients who received radiotherapy, a three-component neural network framework was developed for DRL of dose fractionation adaptation. Large-scale patient characteristics included clinical, genetic, and imaging radiomics features in addition to tumor and lung dosimetric variables. First, a generative adversarial network (GAN) was employed to learn patient population characteristics necessary for DRL training from a relatively limited sample size. Second, a radiotherapy artificial environment (RAE) was reconstructed by a deep neural network (DNN) utilizing both original and GAN-generated synthetic data to estimate the transition probabilities for adaptation of personalized radiotherapy patients' treatment courses. Third, a deep Q-network (DQN) was applied to the RAE for choosing the optimal dose in a response-adapted treatment setting. This multicomponent reinforcement learning approach was benchmarked against real clinical decisions that were applied in an adaptive dose escalation clinical protocol. In that protocol, 34 patients were treated based on avid PET signal in the tumor and constrained by a 17.2% normal tissue complication probability (NTCP) limit for RP2. The uncomplicated cure probability (P+) was used as a baseline reward function in the DRL. Results: Taking our adaptive dose escalation protocol as a blueprint for the proposed DRL (GAN + RAE + DQN) architecture, we obtained an automated dose adaptation estimate for use at approximately 2/3 of the way into the radiotherapy treatment course. By letting the DQN component freely control the estimated adaptive dose per fraction (ranging from 1 to 5 Gy), the DRL automatically favored dose escalation/de-escalation between 1.5 and 3.8 Gy, a range similar to that used in the clinical protocol. The same DQN yielded two patterns of dose escalation for the 34 test patients, but with different reward variants. First, using the baseline P+ reward function, individual adaptive fraction doses of the DQN had similar tendencies to the clinical data with an RMSE = 0.76 Gy, but adaptations suggested by the DQN were generally lower in magnitude (less aggressive). Second, by adjusting the P+ reward function with higher emphasis on mitigating local failure, better matching of doses between the DQN and the clinical protocol was achieved with an RMSE = 0.5 Gy. Moreover, the decisions selected by the DQN seemed to have better concordance with patients' eventual outcomes. In comparison, the traditional temporal difference (TD) algorithm for reinforcement learning yielded an RMSE = 3.3 Gy due to numerical instabilities and lack of sufficient learning. Conclusion: We demonstrated that automated dose adaptation by DRL is a feasible and promising approach for achieving results similar to those chosen by clinicians. The process may require customization of the reward function if individual cases are to be considered. However, development of this framework into a fully credible autonomous system for clinical decision support would require further validation on larger multi-institutional datasets. © 2017 American Association of Physicists in Medicine	en_US
dc.language.iso	en_US	en_US
dc.subject	adaptive radiotherapy	en_US
dc.subject	deep learning	en_US
dc.subject	lung cancer	en_US
dc.subject	reinforcement learning	en_US
dc.title	Deep reinforcement learning for automated radiation adaptation in lung cancer	en_US
dc.type	Article	en_US
dc.identifier.doi	10.1002/mp.12625	en_US
dc.identifier.journal	MEDICAL PHYSICS	en_US
dc.citation.volume	44	en_US
dc.citation.spage	6690	en_US
dc.citation.epage	6705	en_US
dc.contributor.department	電機工程學系	zh_TW
dc.contributor.department	Department of Electrical and Computer Engineering	en_US
dc.identifier.wosnumber	WOS:000417919000055	en_US
dc.citation.woscount	0	en_US
Appears in Collections:	Articles