Actor-Critic Deep Reinforcement Learning for Solving Job Shop Scheduling Problems

doi:10.1109/ACCESS.2020.2987820

Full metadata record

DC Field	Value	Language
dc.contributor.author	Liu, Chien-Liang	en_US
dc.contributor.author	Chang, Chuan-Chin	en_US
dc.contributor.author	Tseng, Chun-Jan	en_US
dc.date.accessioned	2020-07-01T05:21:15Z	-
dc.date.available	2020-07-01T05:21:15Z	-
dc.date.issued	2020-01-01	en_US
dc.identifier.issn	2169-3536	en_US
dc.identifier.uri	http://dx.doi.org/10.1109/ACCESS.2020.2987820	en_US
dc.identifier.uri	http://hdl.handle.net/11536/154323	-
dc.description.abstract	In the past decades, many optimization methods have been devised and applied to the job shop scheduling problem (JSSP) to find the optimal solution. Many methods assume that the scheduling results are applied to static environments, but real-world environments are always dynamic. Moreover, unexpected events such as machine breakdowns and material problems may occur and adversely affect the initial job schedule. This work views JSSP as a sequential decision-making problem and proposes to use deep reinforcement learning to cope with it. The combination of deep learning and reinforcement learning avoids the handcrafted features used in traditional reinforcement learning, and it is expected that the combination will make the whole learning phase more efficient. Our proposed model comprises an actor network and a critic network, both including convolution layers and a fully connected layer. The actor network learns how the agent should behave in different situations, while the critic network helps the agent evaluate the value of the state and returns that evaluation to the actor network. This work proposes a parallel training method, combining asynchronous updates with deep deterministic policy gradient (DDPG), to train the model. The whole network is trained in parallel on a multi-agent environment, and different simple dispatching rules are considered as actions. We evaluate our proposed model on more than ten instances from a well-known benchmark problem library, the OR-Library. The evaluation results indicate that our method is competitive on static JSSP benchmark problems and achieves a good balance between makespan and execution time in dynamic environments. The scheduling score of our method is 91.12% on static JSSP benchmark problems and 80.78% in dynamic environments.	en_US
dc.language.iso	en_US	en_US
dc.subject	Job shop scheduling	en_US
dc.subject	Machine learning	en_US
dc.subject	Benchmark testing	en_US
dc.subject	Dynamic scheduling	en_US
dc.subject	Learning (artificial intelligence)	en_US
dc.subject	Training	en_US
dc.subject	Optimization	en_US
dc.subject	Job shop scheduling problem (JSSP)	en_US
dc.subject	deep reinforcement learning	en_US
dc.subject	actor-critic network	en_US
dc.subject	parallel training	en_US
dc.title	Actor-Critic Deep Reinforcement Learning for Solving Job Shop Scheduling Problems	en_US
dc.type	Article	en_US
dc.identifier.doi	10.1109/ACCESS.2020.2987820	en_US
dc.identifier.journal	IEEE ACCESS	en_US
dc.citation.volume	8	en_US
dc.citation.spage	71752	en_US
dc.citation.epage	71762	en_US
dc.contributor.department	工業工程與管理學系	zh_TW
dc.contributor.department	Department of Industrial Engineering and Management	en_US
dc.identifier.wosnumber	WOS:000530814400002	en_US
dc.citation.woscount	0	en_US
Appears in Collections:	Articles
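
The abstract describes treating simple dispatching rules as the agent's actions. The following is a minimal Python sketch of that idea only, not the authors' implementation: the two rules (SPT, LPT), the toy two-job instance, and the names `RULES`, `run_schedule`, and `choose_rule` are all assumptions made for illustration. The `choose_rule` callback stands in for a trained actor network, and the simulator simply appends each chosen operation at the earliest time both its job and its machine are free.

```python
from typing import Callable, Dict, List, Tuple

# A job is an ordered list of (machine_id, processing_time) operations.
Job = List[Tuple[int, int]]
# A candidate is (job_index, processing_time_of_its_next_operation).
Candidate = Tuple[int, int]


def spt(candidates: List[Candidate]) -> Candidate:
    """Shortest processing time first; ties go to the lower job index."""
    return min(candidates, key=lambda c: (c[1], c[0]))


def lpt(candidates: List[Candidate]) -> Candidate:
    """Longest processing time first; ties go to the lower job index."""
    return max(candidates, key=lambda c: (c[1], -c[0]))


# Hypothetical action set: each action is one dispatching rule.
RULES: Dict[str, Callable[[List[Candidate]], Candidate]] = {
    "SPT": spt,
    "LPT": lpt,
}


def run_schedule(jobs: List[Job], choose_rule: Callable[[int], str]) -> int:
    """Build a schedule one operation at a time and return its makespan.

    choose_rule(step) returns the name of the rule to apply at that step;
    in a learned setting this choice would come from the policy.
    """
    next_op = [0] * len(jobs)        # index of each job's next operation
    job_ready = [0] * len(jobs)      # time at which each job is free again
    machine_ready: Dict[int, int] = {}  # time at which each machine is free
    step = 0
    while True:
        candidates = [
            (j, jobs[j][next_op[j]][1])
            for j in range(len(jobs))
            if next_op[j] < len(jobs[j])
        ]
        if not candidates:
            break
        job, duration = RULES[choose_rule(step)](candidates)
        machine = jobs[job][next_op[job]][0]
        start = max(job_ready[job], machine_ready.get(machine, 0))
        job_ready[job] = machine_ready[machine] = start + duration
        next_op[job] += 1
        step += 1
    return max(job_ready)


# Toy instance (assumed): two jobs, two machines.
jobs: List[Job] = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]
print(run_schedule(jobs, lambda step: "SPT"))
print(run_schedule(jobs, lambda step: "LPT"))
```

On this toy instance the two fixed rules give different makespans, which is the kind of gap a learned policy that switches rules per step would try to exploit.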