Motion Planning of Robot Manipulators for a Smoother Path Using a Twin Delayed Deep Deterministic Policy Gradient with Hindsight Experience Replay

To improve the performance of robot systems in the manufacturing industry, it is essential to develop motion and task planning algorithms. In particular, the motion plan should be generated automatically so that the robot can cope with varied working environments. Although PRM (Probabilisti...

Bibliographic Details
Main Authors:	MyeongSeop Kim, Dong-Ki Han, Jae-Han Park, Jung-Su Kim
Format:	Article
Language:	English
Published:	MDPI AG 2020-01-01
Series:	Applied Sciences
Subjects:	motion planning; probabilistic roadmap (PRM); reinforcement learning; policy gradient; hindsight experience replay (HER)
Online Access:	https://www.mdpi.com/2076-3417/10/2/575

id	doaj-aebab02e84c240f6916ed787647a386d
record_format	Article
spelling	doaj-aebab02e84c240f6916ed787647a386d | 2020-11-25T00:35:16Z | eng | MDPI AG | Applied Sciences | 2076-3417 | 2020-01-01 | 10 | 2 | 575 | 10.3390/app10020575 | app10020575 | MyeongSeop Kim (0), Dong-Ki Han (1), Jae-Han Park (2), Jung-Su Kim (3) | (0) Department of Electrical and Information Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea; (1) Department of Electrical and Information Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea; (2) Robotics R&D Group, Korea Institute of Industrial Technology (KITECH), Ansan 15588, Korea; (3) Department of Electrical and Information Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	MyeongSeop Kim, Dong-Ki Han, Jae-Han Park, Jung-Su Kim
author_facet	MyeongSeop Kim, Dong-Ki Han, Jae-Han Park, Jung-Su Kim
author_sort	MyeongSeop Kim
title	Motion Planning of Robot Manipulators for a Smoother Path Using a Twin Delayed Deep Deterministic Policy Gradient with Hindsight Experience Replay
publisher	MDPI AG
series	Applied Sciences
issn	2076-3417
publishDate	2020-01-01
description	To improve the performance of robot systems in the manufacturing industry, it is essential to develop motion and task planning algorithms. In particular, the motion plan should be generated automatically so that the robot can cope with varied working environments. Although PRM (Probabilistic Roadmap) provides feasible paths when the start and goal positions of a robot manipulator are given, the resulting path may not be smooth enough, which can make the robot system inefficient. This paper proposes a motion planning algorithm for robot manipulators based on the twin delayed deep deterministic policy gradient (TD3), a reinforcement learning algorithm tailored to MDPs (Markov Decision Processes) with continuous actions. Because path planning for a robot manipulator is an MDP with sparse rewards, hindsight experience replay (HER), which can handle such problems, is combined with TD3 to improve sample efficiency. The proposed algorithm is applied to 2-DOF and 3-DOF manipulators, and the resulting paths are shown to be smoother and shorter than those designed by PRM.
topic	motion planning; probabilistic roadmap (PRM); reinforcement learning; policy gradient; hindsight experience replay (HER)
url	https://www.mdpi.com/2076-3417/10/2/575
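
The description field states that manipulator path planning is an MDP with sparse reward and that HER can deal with such a problem. As an illustration only, the sketch below shows the goal-relabeling idea behind HER for a 2-DOF joint-space episode. It is not the authors' implementation: the `Transition` type, the `sparse_reward` shape (0 within a tolerance, -1 otherwise), the tolerance value, and the "future" relabeling strategy with `k` extra goals per step are all assumptions made for this example, and the TD3 update itself is omitted.

```python
import math
import random
from typing import List, NamedTuple, Optional, Tuple

Vec = Tuple[float, ...]


class Transition(NamedTuple):
    """One step of an episode (hypothetical layout for this sketch)."""
    state: Vec       # joint angles before the action
    action: Vec      # joint-angle increment applied
    next_state: Vec  # joint angles after the action
    goal: Vec        # goal configuration the reward refers to
    reward: float


def sparse_reward(achieved: Vec, goal: Vec, tol: float = 0.05) -> float:
    """Assumed sparse reward: 0 when within tol of the goal, else -1."""
    return 0.0 if math.dist(achieved, goal) <= tol else -1.0


def her_relabel(episode: List[Transition], k: int = 4,
                rng: Optional[random.Random] = None) -> List[Transition]:
    """Return the episode plus k relabeled copies of every transition.

    Each copy swaps the goal for a configuration actually reached at the
    same or a later step ("future" strategy) and recomputes the reward,
    so a failed episode still yields some zero-reward samples.
    """
    rng = rng or random.Random(0)
    out: List[Transition] = []
    for t, tr in enumerate(episode):
        out.append(tr)
        for _ in range(k):
            future = episode[rng.randint(t, len(episode) - 1)]
            new_goal = future.next_state
            out.append(tr._replace(
                goal=new_goal,
                reward=sparse_reward(tr.next_state, new_goal)))
    return out


# A failed 5-step episode: joint 1 creeps toward a goal it never reaches.
goal = (1.0, 1.0)
states = [(0.1 * i, 0.0) for i in range(6)]
episode = [Transition(s, (0.1, 0.0), s2, goal, sparse_reward(s2, goal))
           for s, s2 in zip(states, states[1:])]
buffer = her_relabel(episode, k=4)
```

Every original transition in this episode carries reward -1, so it offers no signal about what reaching a goal looks like. The relabeled copies treat configurations the arm did reach as goals, which gives an off-policy learner such as TD3 some successful samples to learn from.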

Similar Items