Path Planning for Multi-Arm Manipulators Using Deep Reinforcement Learning: Soft Actor–Critic with Hindsight Experience Replay

Path planning for multi-arm manipulators is a complicated, high-dimensional problem, so generating paths effectively and quickly for arbitrarily given start and goal locations of the end effector is not easy. In particular, for deep reinforcement learning-based path planning, high-dimensio...

Bibliographic Details
Main Authors:	Evan Prianto, MyeongSeop Kim, Jae-Han Park, Ji-Hun Bae, Jung-Su Kim
Format:	Article
Language:	English
Published:	MDPI AG 2020-10-01
Series:	Sensors
Subjects:	path planning; multi-arm manipulators; reinforcement learning; Soft Actor-Critic (SAC); Hindsight Experience Replay (HER); collision avoidance
Online Access:	https://www.mdpi.com/1424-8220/20/20/5911

id	doaj-aa0d5f02b1fe49e2b2bcf2dd60120d9c
record_format	Article
spelling	doaj-aa0d5f02b1fe49e2b2bcf2dd60120d9c; 2020-11-25T03:36:56Z; eng; MDPI AG; Sensors; 1424-8220; 2020-10-01; 20; 5911; 5911; 10.3390/s20205911; Evan Prianto (0), MyeongSeop Kim (1), Jae-Han Park (2), Ji-Hun Bae (3), Jung-Su Kim (4); (0) Department of Electrical and Information Engineering, Research Center for Electrical and Information Technology, Seoul National University of Science and Technology, Seoul 01811, Korea; (1) Department of Electrical and Information Engineering, Research Center for Electrical and Information Technology, Seoul National University of Science and Technology, Seoul 01811, Korea; (2) Applied Robot R&D Department, Korea Institute of Industrial Technology (KITECH), Ansan 15588, Korea; (3) Applied Robot R&D Department, Korea Institute of Industrial Technology (KITECH), Ansan 15588, Korea; (4) Department of Electrical and Information Engineering, Research Center for Electrical and Information Technology, Seoul National University of Science and Technology, Seoul 01811, Korea
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Evan Prianto, MyeongSeop Kim, Jae-Han Park, Ji-Hun Bae, Jung-Su Kim
author_facet	Evan Prianto, MyeongSeop Kim, Jae-Han Park, Ji-Hun Bae, Jung-Su Kim
author_sort	Evan Prianto
title	Path Planning for Multi-Arm Manipulators Using Deep Reinforcement Learning: Soft Actor–Critic with Hindsight Experience Replay
title_sort	path planning for multi-arm manipulators using deep reinforcement learning: soft actor–critic with hindsight experience replay
publisher	MDPI AG
series	Sensors
issn	1424-8220
publishDate	2020-10-01
description	Path planning for multi-arm manipulators is a complicated, high-dimensional problem, so generating paths effectively and quickly for arbitrarily given start and goal locations of the end effector is not easy. In particular, for deep reinforcement learning-based path planning, the high dimensionality makes it difficult for existing reinforcement learning-based methods to explore efficiently, which is crucial for successful training. The recently proposed soft actor–critic (SAC) is well known for its good exploration ability, owing to the entropy term in its objective function. Motivated by this, this paper proposes a SAC-based path planning algorithm. Hindsight experience replay (HER) is also employed for sample efficiency, and configuration space augmentation is used to deal with the complicated configuration space of the multi-arm manipulators. Both simulation and experimental results are given to show the effectiveness of the proposed algorithm. Comparison with existing results demonstrates that the proposed method outperforms existing methods.
topic	path planning; multi-arm manipulators; reinforcement learning; Soft Actor-Critic (SAC); Hindsight Experience Replay (HER); collision avoidance
url	https://www.mdpi.com/1424-8220/20/20/5911

Similar Items