Reinforcement Q-Learning Control With Reward Shaping Function for Swing Phase Control in a Semi-active Prosthetic Knee

In this study, we investigated a control algorithm for a semi-active prosthetic knee based on reinforcement learning (RL). Model-free reinforcement Q-learning control with a reward shaping function was proposed as the voltage controller of a magnetorheological damper based on the prosthetic knee. Th...

Bibliographic Details
Main Authors: Yonatan Hutabarat, Kittipong Ekkachai, Mitsuhiro Hayashibe, Waree Kongprawechnon
Format: Article
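Language: English
Published: Frontiers Media S.A. 2020-11-01
Series: Frontiers in Neurorobotics
Subjects: reinforcement learning, reward shaping, Q-learning, semi-active prosthetic knee, magnetorheological damper
Online Access: https://www.frontiersin.org/articles/10.3389/fnbot.2020.565702/full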
id doaj-8744f26ffb404316bdf29f63fe414fb6
record_format Article
spelling doaj-8744f26ffb404316bdf29f63fe414fb6 2020-12-08T08:39:16Z
volume 14
doi 10.3389/fnbot.2020.565702
affiliations Yonatan Hutabarat: Neuro-Robotics Laboratory, Graduate School of Biomedical Engineering, Tohoku University, Sendai, Japan
Kittipong Ekkachai: Smart Machine and Mixed Reality (SMR) Laboratory, National Electronics and Computer Technology Center (NECTEC), Pathum Thani, Thailand
Mitsuhiro Hayashibe: Neuro-Robotics Laboratory, Graduate School of Biomedical Engineering, Tohoku University, Sendai, Japan; Department of Robotics, Graduate School of Engineering, Tohoku University, Sendai, Japan
Waree Kongprawechnon: School of Information Computer and Communication Technology (ICT), Sirindhorn International Institute of Technology (SIIT), Thammasat University, Pathum Thani, Thailand
collection DOAJ
language English
format Article
sources DOAJ
author Yonatan Hutabarat
Kittipong Ekkachai
Mitsuhiro Hayashibe
Waree Kongprawechnon
title Reinforcement Q-Learning Control With Reward Shaping Function for Swing Phase Control in a Semi-active Prosthetic Knee
publisher Frontiers Media S.A.
series Frontiers in Neurorobotics
issn 1662-5218
publishDate 2020-11-01
description In this study, we investigated a control algorithm based on reinforcement learning (RL) for a semi-active prosthetic knee. We proposed model-free Q-learning control with a reward shaping function as the voltage controller for the magnetorheological damper of the prosthetic knee. The reward function was designed as a function of a performance index that accounts for the trajectory of the subject-specific knee angle. We compared the proposed reward function with a conventional single reward function under the same random initialization of the Q-matrix. We trained the control algorithm to adapt to several walking speed datasets under one control policy and then compared its performance with that of other control algorithms. The proposed reward function outperformed the conventional single reward function in terms of the normalized root mean squared error and also showed a faster convergence trend. Furthermore, our control strategy converged within the desired performance index and could adapt to several walking speeds. The proposed control structure also performed better overall than user-adaptive control, and at some walking speeds it performed better than the neural network predictive control from existing studies.
topic reinforcement learning
reward shaping
Q-learning
semi-active prosthetic knee
magnetorheological damper
url https://www.frontiersin.org/articles/10.3389/fnbot.2020.565702/full