Online Dual-Network-Based Adaptive Dynamic Programming for Solving Partially Unknown Multi-Player Non-Zero-Sum Games With Control Constraints

In this article, a novel online method based on neural networks (NNs) is developed for multi-player non-zero-sum (NZS) differential games of nonlinear, partially unknown continuous-time (CT) systems with control constraints. The issue of multi-player NZS games with saturated actuators is elaborately an...


Bibliographic Details
Main Authors:	Pengda Liu, Huaguang Zhang, Chong Liu, Hanguang Su
Format:	Article
Language:	English
Published:	IEEE 2020-01-01
Series:	IEEE Access
Subjects:	Adaptive critic designs; adaptive dynamic programming; control constraints; multi-player; non-zero-sum games
Online Access:	https://ieeexplore.ieee.org/document/9214826/

id	doaj-20b7b256f6694715ae265db5d8a3482c
record_format	Article
spelling	doaj-20b7b256f6694715ae265db5d8a3482c; 2021-03-30T03:38:33Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; 8; 182295; 182306; 10.1109/ACCESS.2020.3029171; 9214826; Online Dual-Network-Based Adaptive Dynamic Programming for Solving Partially Unknown Multi-Player Non-Zero-Sum Games With Control Constraints; Pengda Liu (0) https://orcid.org/0000-0003-0154-3755; Huaguang Zhang (1) https://orcid.org/0000-0002-0647-4050; Chong Liu (2) https://orcid.org/0000-0001-9842-6955; Hanguang Su (3) https://orcid.org/0000-0003-1356-4158; College of Information Science and Engineering, Northeastern University, Shenyang, China; College of Information Science and Engineering, Northeastern University, Shenyang, China; College of Information Science and Engineering, Northeastern University, Shenyang, China; College of Information Science and Engineering, Northeastern University, Shenyang, China; In this article, a novel online method based on neural networks (NNs) is developed for multi-player non-zero-sum (NZS) differential games of nonlinear, partially unknown continuous-time (CT) systems with control constraints. Multi-player NZS games with saturated actuators are analyzed in detail, and the unknown dynamics are learned by an identifier NN. Unlike the standard identifier-actor-critic framework of adaptive dynamic programming (ADP), the proposed method uses only identifier networks and critic networks for all players to solve the coupled Hamilton-Jacobi (HJ) equations for multi-player NZS games, which can simplify the algorithm and save computing resources. Moreover, a tuning law based on gradient descent is designed for each critic network. To remove the requirement for an initial stabilizing control, a novel stability term is designed to ensure system stability during the training phase of the critic NNs. By means of the Lyapunov approach, it is proven that the system states, the critic network weight estimation errors, and the obtained control are all uniformly ultimately bounded (UUB). Finally, two numerical examples are simulated to illustrate the validity of the developed method for multi-player NZS games with control constraints.; https://ieeexplore.ieee.org/document/9214826/; Adaptive critic designs; adaptive dynamic programming; control constraints; multi-player; non-zero-sum games
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Pengda Liu, Huaguang Zhang, Chong Liu, Hanguang Su
author_facet	Pengda Liu, Huaguang Zhang, Chong Liu, Hanguang Su
author_sort	Pengda Liu
title	Online Dual-Network-Based Adaptive Dynamic Programming for Solving Partially Unknown Multi-Player Non-Zero-Sum Games With Control Constraints
publisher	IEEE
series	IEEE Access
issn	2169-3536
publishDate	2020-01-01
description	In this article, a novel online method based on neural networks (NNs) is developed for multi-player non-zero-sum (NZS) differential games of nonlinear, partially unknown continuous-time (CT) systems with control constraints. Multi-player NZS games with saturated actuators are analyzed in detail, and the unknown dynamics are learned by an identifier NN. Unlike the standard identifier-actor-critic framework of adaptive dynamic programming (ADP), the proposed method uses only identifier networks and critic networks for all players to solve the coupled Hamilton-Jacobi (HJ) equations for multi-player NZS games, which can simplify the algorithm and save computing resources. Moreover, a tuning law based on gradient descent is designed for each critic network. To remove the requirement for an initial stabilizing control, a novel stability term is designed to ensure system stability during the training phase of the critic NNs. By means of the Lyapunov approach, it is proven that the system states, the critic network weight estimation errors, and the obtained control are all uniformly ultimately bounded (UUB). Finally, two numerical examples are simulated to illustrate the validity of the developed method for multi-player NZS games with control constraints.
topic	Adaptive critic designs; adaptive dynamic programming; control constraints; multi-player; non-zero-sum games
url	https://ieeexplore.ieee.org/document/9214826/


Similar Items