Intelligent Trajectory Design for Secure Full-Duplex MIMO-UAV Relaying Against Active Eavesdroppers: A Model-Free Reinforcement Learning Approach

Bibliographic Details
Main Authors: Milad Tatar Mamaghani, Yi Hong
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9311243/
Description
Summary: Unmanned aerial vehicle (UAV) assisted wireless communication has recently been recognized as a promising component of future wireless networks. In particular, UAVs can serve as relays to establish or improve network connectivity, thanks to their flexible mobility and likely line-of-sight channel conditions. However, this also exposes the network to more serious security threats from potential adversaries, particularly active eavesdroppers. To combat active eavesdroppers, we propose an artificial-noise beamforming-based secure transmission scheme for a full-duplex UAV relaying scenario. In the considered scheme, a UAV relay equipped with multiple antennas securely serves multiple ground users in the presence of randomly located active eavesdroppers. We formulate a novel average system secrecy rate (ASSR) maximization problem under quality of service (QoS) and mission time constraints. Because the environment's dynamics are unavailable and the model is complex, the ASSR optimization problem is too hard to solve with conventional optimization methods; we therefore develop model-free reinforcement learning-based algorithms, namely Q-learning, SARSA, Expected SARSA, Double Q-learning, and SARSA(λ), to solve the problem efficiently without substantial UAV-network data exchange. Using the proposed algorithms, we maximize the ASSR by finding an optimal UAV trajectory and a proper resource allocation. Simulation results demonstrate that all the proposed learning-based algorithms can train the UAV relay to learn the environment through iterative interactions and thereby intelligently find an optimal trajectory. In particular, we find that the proposed SARSA(λ)-based algorithm with λ = 0.1 outperforms the others in terms of the ASSR.
ISSN: 2169-3536
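
Illustrative note: the summary names SARSA(λ) with λ = 0.1 as the best-performing of the listed algorithms. For readers unfamiliar with the method, the following is a minimal sketch of generic textbook tabular SARSA(λ) with accumulating eligibility traces. It is not the article's implementation. The one-dimensional corridor environment, the reward values, the step cap, and every name and hyperparameter other than λ = 0.1 are hypothetical stand-ins chosen only to keep the example self-contained; the article's UAV state space, actions, and ASSR-based reward are not reproduced here.

```python
import random


def sarsa_lambda(n_states=6, n_episodes=200, alpha=0.1, gamma=0.95,
                 lam=0.1, epsilon=0.1, seed=0):
    """Tabular SARSA(lambda) on a toy corridor (hypothetical environment).

    States are 0..n_states-1, the agent starts at 0, and the episode ends
    on reaching the last state. Action 0 moves left, action 1 moves right.
    """
    rng = random.Random(seed)
    moves = (-1, +1)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    def choose(state):
        # Epsilon-greedy action selection with random tie-breaking.
        if rng.random() < epsilon:
            return rng.randrange(2)
        best = max(q[state])
        return rng.choice([a for a in range(2) if q[state][a] == best])

    for _ in range(n_episodes):
        # Eligibility traces are reset at the start of every episode.
        traces = [[0.0, 0.0] for _ in range(n_states)]
        state = 0
        action = choose(state)
        for _ in range(100):  # assumed step cap per episode
            next_state = min(max(state + moves[action], 0), goal)
            done = next_state == goal
            reward = 1.0 if done else -0.01  # assumed toy reward
            next_action = choose(next_state)
            # On-policy TD target: bootstrap from the action actually chosen.
            target = reward
            if not done:
                target += gamma * q[next_state][next_action]
            delta = target - q[state][action]
            traces[state][action] += 1.0  # accumulating trace
            for s in range(n_states):
                for a in range(2):
                    q[s][a] += alpha * delta * traces[s][a]
                    traces[s][a] *= gamma * lam  # decay every trace
            if done:
                break
            state, action = next_state, next_action
    return q


if __name__ == "__main__":
    for state, (left, right) in enumerate(sarsa_lambda()):
        print(f"state {state}: Q(left)={left:.3f}  Q(right)={right:.3f}")
```

With a small λ such as 0.1, the traces decay quickly, so each temporal-difference error mostly updates the most recently visited state-action pair and the behavior stays close to one-step SARSA. Whether that explains the result reported in the summary is not something this sketch can show.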