Motion Planning of Robot Manipulators for a Smoother Path Using a Twin Delayed Deep Deterministic Policy Gradient with Hindsight Experience Replay
In order to enhance performance of robot systems in the manufacturing industry, it is essential to develop motion and task planning algorithms. Especially, it is important for the motion plan to be generated automatically in order to deal with various working environments. Although PRM (Probabilisti...
Main Authors: | MyeongSeop Kim, Dong-Ki Han, Jae-Han Park, Jung-Su Kim |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-01-01 |
Series: | Applied Sciences |
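Subjects: | motion planning; probabilistic roadmap (PRM); reinforcement learning; policy gradient; hindsight experience replay (HER) |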
Online Access: | https://www.mdpi.com/2076-3417/10/2/575 |
id |
doaj-aebab02e84c240f6916ed787647a386d |
---|---|
record_format |
Article |
spelling |
doaj-aebab02e84c240f6916ed787647a386d; 2020-11-25T00:35:16Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2020-01-01; 10; 2; 575; 10.3390/app10020575; app10020575
Motion Planning of Robot Manipulators for a Smoother Path Using a Twin Delayed Deep Deterministic Policy Gradient with Hindsight Experience Replay
MyeongSeop Kim (0); Dong-Ki Han (1); Jae-Han Park (2); Jung-Su Kim (3)
Department of Electrical and Information Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea; Department of Electrical and Information Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea; Robotics R&D Group, Korea Institute of Industrial Technology (KITECH), Ansan 15588, Korea; Department of Electrical and Information Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea
In order to enhance performance of robot systems in the manufacturing industry, it is essential to develop motion and task planning algorithms. Especially, it is important for the motion plan to be generated automatically in order to deal with various working environments. Although PRM (Probabilistic Roadmap) provides feasible paths when the starting and goal positions of a robot manipulator are given, the path might not be smooth enough, which can lead to inefficient performance of the robot system. This paper proposes a motion planning algorithm for robot manipulators using a twin delayed deep deterministic policy gradient (TD3) which is a reinforcement learning algorithm tailored to MDP with continuous action. Besides, hindsight experience replay (HER) is employed in the TD3 to enhance sample efficiency. Since path planning for a robot manipulator is an MDP (Markov Decision Process) with sparse reward and HER can deal with such a problem, this paper proposes a motion planning algorithm using TD3 with HER. The proposed algorithm is applied to 2-DOF and 3-DOF manipulators and it is shown that the designed paths are smoother and shorter than those designed by PRM.
https://www.mdpi.com/2076-3417/10/2/575
motion planning; probabilistic roadmap (prm); reinforcement learning; policy gradient; hindsight experience replay (her) |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
MyeongSeop Kim, Dong-Ki Han, Jae-Han Park, Jung-Su Kim |
author_sort |
MyeongSeop Kim |
title |
Motion Planning of Robot Manipulators for a Smoother Path Using a Twin Delayed Deep Deterministic Policy Gradient with Hindsight Experience Replay |
publisher |
MDPI AG |
series |
Applied Sciences |
issn |
2076-3417 |
publishDate |
2020-01-01 |
description |
To improve the performance of robot systems in the manufacturing industry, it is essential to develop motion and task planning algorithms. In particular, the motion plan should be generated automatically so that the robot can cope with varied working environments. Although a PRM (Probabilistic Roadmap) provides feasible paths when the starting and goal positions of a robot manipulator are given, the path might not be smooth enough, which can lead to inefficient performance of the robot system. This paper proposes a motion planning algorithm for robot manipulators using a twin delayed deep deterministic policy gradient (TD3), a reinforcement learning algorithm tailored to MDPs (Markov Decision Processes) with continuous actions. In addition, hindsight experience replay (HER) is employed with TD3 to improve sample efficiency: path planning for a robot manipulator is an MDP with sparse rewards, and HER is designed to handle such problems. The proposed algorithm is applied to 2-DOF and 3-DOF manipulators, and the resulting paths are shown to be smoother and shorter than those produced by PRM. |
topic |
motion planning; probabilistic roadmap (PRM); reinforcement learning; policy gradient; hindsight experience replay (HER) |
url |
https://www.mdpi.com/2076-3417/10/2/575 |
_version_ |
1725309389113393152 |
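
The abstract in this record says that HER is used because path planning for a manipulator is an MDP with sparse reward. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the state is a tuple of joint angles, that the reward is 0 inside a tolerance of the goal and -1 otherwise, and that goals are relabeled with the "future" strategy from the HER literature. The names `sparse_reward`, `her_relabel`, `tol`, and `k` are hypothetical and chosen only for illustration.

```python
import math
import random


def sparse_reward(achieved, goal, tol=0.05):
    """Assumed sparse reward: 0.0 within `tol` of the goal, otherwise -1.0."""
    return 0.0 if math.dist(achieved, goal) <= tol else -1.0


def her_relabel(episode, k=4, rng=None):
    """Return the original transitions plus k hindsight copies of each.

    `episode` is a list of (state, action, achieved_next, goal) tuples.
    Each hindsight copy replaces the goal with a configuration that was
    actually reached at the same or a later step of the same episode.
    """
    rng = rng or random.Random(0)
    transitions = []
    for t, (state, action, achieved_next, goal) in enumerate(episode):
        # Original transition, scored against the goal that was requested.
        reward = sparse_reward(achieved_next, goal)
        transitions.append((state, action, reward, achieved_next, goal))
        for _ in range(k):
            # "Future" strategy: pick a step index in [t, end of episode).
            future = rng.randrange(t, len(episode))
            new_goal = episode[future][2]
            new_reward = sparse_reward(achieved_next, new_goal)
            transitions.append((state, action, new_reward, achieved_next, new_goal))
    return transitions


# A two-step episode for a hypothetical 2-DOF arm that never reaches its goal.
episode = [
    ((0.0, 0.0), (0.1, 0.1), (0.1, 0.1), (1.0, 1.0)),
    ((0.1, 0.1), (0.1, 0.1), (0.2, 0.2), (1.0, 1.0)),
]
transitions = her_relabel(episode, k=2)
```

In this sketch every original transition carries a reward of -1, while the hindsight copies of the final step are scored against the configuration the arm actually reached and so receive 0. Those non-negative rewards are what an off-policy learner such as TD3 could sample from the replay buffer even when no episode reaches the requested goal.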