Relay selection scheme based on deep reinforcement learning in wireless sensor networks
Affiliation:1. GEOMAR Helmholtz Centre for Ocean Research, Düsternbrooker Weg 20, 24105 Kiel, Germany;2. Alfred-Wegener-Institute, Helmholtz Centre for Polar and Marine Research, Bussestraße 24, 27570 Bremerhaven, Germany;1. Fundamental Aspects of Materials and Energy, Faculty of Applied Sciences, Delft University of Technology, Mekelweg 15, 2629 JB Delft, The Netherlands;2. Department of Materials Science and Engineering, Delft University of Technology, Mekelweg 2, 2628 CD Delft, The Netherlands;3. Novel Aerospace Materials Group, Faculty of Aerospace Engineering, Delft University of Technology, Kluyverweg 1, 2629 HS Delft, The Netherlands
Abstract: Cooperative communication improves the spectrum utilization of wireless communication systems without additional equipment while also ensuring transmission reliability, and it has increasingly become a research focus in wireless sensor networks (WSNs). Because relay selection is crucial to cooperative communication, this paper proposes two relay selection schemes based on deep reinforcement learning (DRL) for cooperative communication in WSNs: the Deep-Q-Network Based Relay Selection Scheme (DQN-RSS) and the Proximal Policy Optimization Based Relay Selection Scheme (PPO-RSS). Both are compared with the commonly used Q-learning relay selection scheme (Q-RSS) and with a random relay selection scheme. First, the cooperative communication process in WSNs is modeled as a Markov decision process, and the DRL algorithm is trained according to the outage probability and the mutual information (MI). The best relay is then selected adaptively from multiple candidate relays without knowledge of the instantaneous channel state information (CSI). Because Q-RSS converges slowly in a high-dimensional state space, the DRL algorithm is used to handle the high-dimensional state space and speed up learning. The experimental results show that, under the same conditions, the random relay selection scheme always performs worst. Compared with Q-RSS, the two proposed schemes greatly reduce the number of iterations and converge faster, which reduces the computational complexity and overhead incurred by the source node in selecting the best relay.
In addition, the two proposed schemes achieve lower outage probability, lower energy consumption, and larger system capacity. In particular, PPO-RSS has higher reliability and practicability.
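To make the Markov-decision-process framing concrete, the following is a minimal sketch of a tabular Q-learning relay selector in the spirit of the Q-RSS baseline; it is not the paper's DQN-RSS or PPO-RSS implementation. Everything specific here is an assumption for illustration: the Rayleigh-fading channel model, the decode-and-forward mutual information formula, the choice of state (the previously selected relay), the reward (MI when no outage occurs, zero otherwise), and all names and hyperparameters. The agent never sees instantaneous CSI; it only observes the reward after choosing a relay.

```python
import math
import random


def simulate_link(mean_snr_sr, mean_snr_rd, rng):
    """Return the mutual information (bit/s/Hz) of one two-hop transmission.

    Assumption: Rayleigh fading, so each hop's instantaneous SNR is
    exponentially distributed, and decode-and-forward relaying, so the
    weaker hop limits the rate. The factor 0.5 reflects the two time slots.
    """
    snr_sr = rng.expovariate(1.0 / mean_snr_sr)  # source -> relay
    snr_rd = rng.expovariate(1.0 / mean_snr_rd)  # relay -> destination
    return 0.5 * math.log2(1.0 + min(snr_sr, snr_rd))


def train_q_rss(mean_snrs, rate=1.0, episodes=20000,
                alpha=0.05, gamma=0.5, eps=0.1, seed=0):
    """Tabular Q-learning over relays; returns (Q-table, empirical outage rate).

    mean_snrs: list of (mean source-relay SNR, mean relay-destination SNR),
               one pair per candidate relay (hypothetical inputs).
    rate:      target rate; an outage occurs when MI falls below it.
    """
    rng = random.Random(seed)
    n = len(mean_snrs)
    # Assumed state: index of the previously selected relay.
    q = [[0.0] * n for _ in range(n)]
    state = 0
    outages = 0
    for _ in range(episodes):
        # Epsilon-greedy action selection (no instantaneous CSI is used).
        if rng.random() < eps:
            action = rng.randrange(n)
        else:
            action = max(range(n), key=lambda a: q[state][a])
        mi = simulate_link(*mean_snrs[action], rng)
        outage = mi < rate
        outages += int(outage)
        reward = 0.0 if outage else mi
        next_state = action
        # Standard Q-learning temporal-difference update.
        td_target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
        state = next_state
    return q, outages / episodes


if __name__ == "__main__":
    relays = [(2.0, 2.0), (20.0, 20.0), (5.0, 5.0)]  # illustrative mean SNRs
    q_table, outage_rate = train_q_rss(relays)
    best = max(range(len(relays)), key=lambda a: sum(row[a] for row in q_table))
    print("preferred relay:", best, "empirical outage rate:", round(outage_rate, 3))
```

The Q-table in this sketch has one entry per state-action pair, so it grows quickly as the state description becomes richer. That growth is the slow-convergence problem the abstract attributes to Q-RSS, and it is the reason the proposed schemes replace the table with a neural network.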
Keywords: Wireless sensor networks; Cooperative communications; Relay selection; Deep reinforcement learning
This article has been indexed in ScienceDirect and other databases.