Spectrum resource allocation for high-throughput satellite communications based on behavior cloning

Bibliographic Details
Published in: Tongxin xuebao
Main Authors: QIN Hao, LI Shuangyi, ZHAO Di, MENG Haowei, SONG Bin
Format: Article
Language: Chinese
Published: Editorial Department of Journal on Communications 2024-05-01
Subjects:
Online Access: http://www.joconline.com.cn/thesisDetails#10.11959/j.issn.1000-436x.2024100
Description
Summary: In high-throughput multi-beam satellite systems, the dimensionality of the spectrum resource allocation problem grows sharply with the number of satellite beams and service users, which makes the solution complexity rise exponentially. To address this challenge, a two-stage algorithm combining behavior cloning (BC) with deep reinforcement learning (DRL) is proposed. In the first stage, the policy network is pretrained by behavior cloning on existing decision data from satellite operations; imitating expert behavior reduces blind exploration and accelerates convergence. In the second stage, the policy network is further optimized with proximal policy optimization (PPO), and a convolutional block attention module (CBAM) is used to better extract user traffic features, improving overall performance. Simulation results show that the proposed algorithm outperforms the benchmark algorithms in convergence speed and stability, and also achieves better system delay, average system satisfaction, and spectrum efficiency.
ISSN: 1000-436X
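
Illustrative sketch (not part of the catalog record and not the authors' implementation): the two stages named in the summary can be illustrated with a toy example. Stage 1 below fits a policy to expert (state, action) pairs by minimizing cross-entropy, which is the usual form of behavior cloning; `ppo_clip_term` shows the standard PPO clipped surrogate for a single sample, which stage 2 would maximize. The linear softmax policy, the synthetic "expert", and all function names are assumptions made for illustration; the paper's network uses CBAM and real operational data, neither of which is modeled here.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(W, state):
    """Action probabilities of a linear softmax policy; W[a] scores action a."""
    return softmax([sum(w * s for w, s in zip(row, state)) for row in W])


def behavior_cloning(states, expert_actions, n_actions, lr=0.5, epochs=100):
    """Stage 1 (assumed form): gradient descent on the mean cross-entropy
    between the policy and the expert's recorded actions."""
    dim, n = len(states[0]), len(states)
    W = [[0.0] * dim for _ in range(n_actions)]
    for _ in range(epochs):
        grad = [[0.0] * dim for _ in range(n_actions)]
        for s, a_star in zip(states, expert_actions):
            p = policy_probs(W, s)
            for a in range(n_actions):
                err = p[a] - (1.0 if a == a_star else 0.0)
                for j in range(dim):
                    grad[a][j] += err * s[j] / n
        for a in range(n_actions):
            for j in range(dim):
                W[a][j] -= lr * grad[a][j]
    return W


def ppo_clip_term(new_logp, old_logp, advantage, eps=0.2):
    """Stage 2 building block: PPO clipped surrogate for one sample."""
    ratio = math.exp(new_logp - old_logp)
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped * advantage)


# Hypothetical toy data: 4 traffic features per state; the synthetic expert
# picks whichever of the first three features is largest.
random.seed(0)
states = [[random.gauss(0, 1) for _ in range(4)] for _ in range(300)]
expert_actions = [max(range(3), key=lambda a: s[a]) for s in states]

W = behavior_cloning(states, expert_actions, n_actions=3)
greedy = [max(range(3), key=lambda a: policy_probs(W, s)[a]) for s in states]
accuracy = sum(g == a for g, a in zip(greedy, expert_actions)) / len(states)
```

In a full pipeline the cloned weights would initialize the policy before PPO updates begin, so that early rollouts start near expert behavior rather than from random exploration.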