Similar Literature
20 similar documents found (search time: 15 ms)
1.
We have studied massive MIMO hybrid beamforming (HBF) for millimeter-wave (mmWave) communications, where the transceivers have only a few radio frequency chains (RFCs) compared to the number of antenna elements. We propose a hybrid beamforming design to improve the system’s spectral, hardware, and computational efficiencies, where finding the precoding and combining matrices is formulated as an optimization problem with practical constraints. The analog phase shifters impose a unit-modulus constraint, making the problem non-convex and incurring unaffordable computational complexity. Advanced deep reinforcement learning techniques handle non-convex problems effectively in many domains; we therefore recast this non-convex hybrid beamforming optimization in a reinforcement learning framework and solve it with deep reinforcement learning techniques that use experience replay to maximize the spectral and learning efficiencies in highly uncertain wireless environments. We developed a twin-delayed deep deterministic (TD3) policy gradient-based hybrid beamforming scheme to overcome Q-learning’s substantial overestimation. We first assumed complete channel state information (CSI) to design our beamformers and then relaxed this assumption by proposing a deep reinforcement learning-based channel estimation method. We reduced hybrid beamforming complexity using soft target double deep Q-learning to exploit mmWave channel sparsity, constructing the analog precoder by selecting the channel’s dominant paths. We have demonstrated that the proposed approaches improve the system’s spectral and learning efficiencies compared to prior studies, and that deep reinforcement learning is a versatile technique that can unleash the power of massive MIMO hybrid beamforming in mmWave systems for next-generation wireless communication.
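The unit-modulus constraint above means every analog phase-shifter coefficient must satisfy |w| = 1. A common generic heuristic (a sketch only, not the paper’s TD3 scheme) projects an unconstrained precoder onto this set by keeping only the phase of each entry:

```python
import cmath
import random

def project_unit_modulus(W):
    """Project a complex matrix onto the unit-modulus set: w -> exp(j*angle(w))."""
    return [[cmath.exp(1j * cmath.phase(w)) if w != 0 else 1.0 + 0j
             for w in row] for row in W]

random.seed(0)
# unconstrained digital-style precoder with arbitrary complex entries
W = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]
     for _ in range(2)]
F = project_unit_modulus(W)
# every projected entry is now realizable by a phase shifter: |f| == 1
assert all(abs(abs(f) - 1.0) < 1e-12 for row in F for f in row)
```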

2.
Cyber–physical systems (CPS) have been widely employed as wireless control networks. One special type of CPS, developed from wireless networked control systems (WNCS), includes two communication links, uplink and downlink transmission, which together form a closed loop. When such CPS are deployed for time-sensitive applications such as remote control, the uplink and downlink propagation delays are non-negligible, yet existing studies on CPS/WNCS usually ignore them. To achieve the best balance between uplink and downlink transmissions under such circumstances, we propose a heuristic framework to obtain the optimal scheduling strategy that minimizes the long-term average control cost. We model the optimization problem as a Markov decision process (MDP) and give sufficient conditions for the existence of an optimal scheduling strategy. We propose a semi-predictive framework to eliminate the impact of the coupling between uplink and downlink data packets, and then obtain a lookup table-based optimal offline strategy and a neural network-based suboptimal online strategy. Numerical simulation shows that the scheduling strategies obtained by this framework bring significant performance improvements over existing strategies.
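The MDP formulation can be illustrated with value iteration on a toy two-state, two-action scheduling problem. All transition probabilities and costs below are made-up placeholders, not the paper’s model:

```python
# Toy MDP: state = link condition, action 0 = schedule uplink, 1 = downlink.
# P[s][a] = list of (next_state, probability); cost[s][a] = one-step cost.
P = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(1, 0.8), (0, 0.2)]},
    1: {0: [(0, 0.7), (1, 0.3)], 1: [(1, 0.6), (0, 0.4)]},
}
cost = {0: {0: 1.0, 1: 2.0}, 1: {0: 2.5, 1: 0.5}}
gamma = 0.95                       # discount factor

V = {0: 0.0, 1: 0.0}
for _ in range(500):               # Bellman backup until convergence
    V = {s: min(cost[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                for a in (0, 1)) for s in (0, 1)}
policy = {s: min((0, 1), key=lambda a: cost[s][a] +
                 gamma * sum(p * V[s2] for s2, p in P[s][a]))
          for s in (0, 1)}
# with these numbers, the cheap action is optimal in each state
assert policy == {0: 0, 1: 1}
```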

3.
4.
A policy iteration algorithm of adaptive dynamic programming (ADP) is developed to solve the optimal tracking control for a class of discrete-time chaotic systems. By system transformations, the optimal tracking problem is transformed into an optimal regulation one. The policy iteration algorithm for discrete-time chaotic systems is first described. Then, the convergence and admissibility properties of the developed policy iteration algorithm are presented, which show that the transformed chaotic system can be stabilized under an arbitrary iterative control law and the iterative performance index function simultaneously converges to the optimum. By implementing the policy iteration algorithm via neural networks, the developed optimal tracking control scheme for chaotic systems is verified by a simulation.
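Policy iteration alternates policy evaluation and policy improvement, starting from an admissible (stabilizing) control law. A minimal sketch on a scalar discrete-time linear-quadratic regulator (stand-in dynamics; the paper works with transformed chaotic systems and neural-network implementations):

```python
# Scalar system x[k+1] = a*x + b*u with stage cost q*x^2 + r*u^2.
a, b, q, r = 1.2, 1.0, 1.0, 1.0
K = 1.0                 # initial admissible gain: |a - b*K| = 0.2 < 1
for _ in range(50):
    # policy evaluation: cost-to-go P of u = -K*x solves a Lyapunov equation
    P = (q + r * K * K) / (1.0 - (a - b * K) ** 2)
    # policy improvement: greedy gain w.r.t. the evaluated cost
    K = a * b * P / (r + b * b * P)
# every iterate remains stabilizing, and K converges to the optimal gain
assert abs(a - b * K) < 1.0
```

For these numbers the algebraic Riccati equation gives P = (1.44 + sqrt(6.0736)) / 2 ≈ 1.9522, so K converges to about 0.7935.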

5.
An intelligent solution method is proposed to achieve real-time optimal control for continuous-time nonlinear systems using a novel identifier-actor-optimizer (IAO) policy learning architecture. In this IAO-based policy learning approach, a dynamical identifier is developed to approximate the unknown part of the system dynamics using deep neural networks (DNNs). Then, an indirect-method-based optimizer is proposed to generate high-quality optimal actions for system control, considering both the constraints and the performance index. Furthermore, a DNN-based actor is developed to approximate the obtained optimal actions and return good initial guesses to the optimizer. In this way, traditional optimal control methods and state-of-the-art DNN techniques are combined in the IAO-based optimal policy learning method. Compared to reinforcement learning algorithms with actor-critic architectures, which suffer from difficult reward design and low computational efficiency, the IAO-based optimal policy learning algorithm enjoys fewer user-defined parameters, higher learning speeds, and steadier convergence in solving complex continuous-time optimal control problems (OCPs). Simulation results of three space flight control missions substantiate the effectiveness of this IAO-based policy learning strategy and illustrate the performance of the developed DNN-based optimal control method for continuous-time OCPs.

6.
We consider an intelligent reflecting surface (IRS)-assisted wireless powered communication network (WPCN) in which a multi-antenna power beacon (PB) sends a dedicated energy signal to a wireless powered source. The source first harvests energy and then uses this harvested energy to send an information signal to a destination, where external interference may also be present. For the considered system model, we formulate an analytical problem whose objective is to maximize the throughput by jointly optimizing the energy harvesting (EH) time and the IRS phase-shift matrices. The optimization problem is high-dimensional and non-convex, so a good-quality solution can be obtained by invoking a state-of-the-art algorithm such as the genetic algorithm (GA). The performance of the GA is generally remarkable; however, it incurs a high computational complexity. To this end, we propose a deep unsupervised learning (DUL) based approach in which a neural network (NN) is trained very efficiently, since the time-consuming task of labeling a data set is not required. Numerical examples show that our proposed approach achieves a better performance–complexity trade-off: it is not only several times faster but also provides almost the same or even higher throughput compared to the GA.
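The throughput objective that such an unsupervised network would be trained to maximize (as its negated loss) can be sketched for a simplified single-reflection model with no direct link; all channel values are random placeholders. In this special case the optimal IRS phases are known in closed form, which gives a handy sanity check:

```python
import cmath
import math
import random

random.seed(1)
N = 8   # number of reflecting elements (illustrative size)
h = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]  # PB->IRS
g = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]  # IRS->dest

def rate(theta, sigma2=0.1):
    """Toy throughput: log2(1 + |sum_k h_k e^{j theta_k} g_k|^2 / sigma2)."""
    s = sum(h[k] * cmath.exp(1j * theta[k]) * g[k] for k in range(N))
    return math.log2(1.0 + abs(s) ** 2 / sigma2)

# closed-form optimum for this simplified model: align all reflected paths
aligned = [-(cmath.phase(h[k]) + cmath.phase(g[k])) for k in range(N)]
random_phases = [random.uniform(0, 2 * math.pi) for _ in range(N)]
assert rate(aligned) >= rate(random_phases)   # triangle inequality guarantees it
```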

7.
It is desirable to combine the expressive power of deep learning with Gaussian processes (GPs) in one expressive Bayesian learning model. Deep kernel learning showed success in using a deep network for feature extraction with a GP as the function model. Recently, it was suggested that, albeit trained with the marginal likelihood, the deterministic nature of the feature extractor might lead to overfitting, and that replacement with a Bayesian network seemed to cure it. Here, we propose the conditional deep Gaussian process (DGP), in which the intermediate GPs in the hierarchical composition are supported by hyperdata and the exposed GP remains zero-mean. Motivated by the inducing points in sparse GPs, the hyperdata also play the role of function supports, but are hyperparameters rather than random variables. Following our previous moment-matching approach, the marginal prior of the conditional DGP is approximated by a GP carrying an effective kernel. Thus, as in empirical Bayes, the hyperdata are learned by optimizing the approximate marginal likelihood, which depends on the hyperdata implicitly via the kernel. We show equivalence with deep kernel learning in the limit of dense hyperdata in the latent space; however, the conditional DGP and the corresponding approximate inference enjoy the benefit of being more Bayesian than deep kernel learning. Preliminary extrapolation results demonstrate the expressive power gained from the depth of the hierarchy by exploiting the exact covariance and hyperdata learning, in comparison with GP kernel composition, DGP variational inference, and deep kernel learning. We also address the non-Gaussian aspect of our model as well as a way of upgrading to full Bayesian inference.

8.
In this paper, heterogeneous cellular networks (HCNs) with base stations (BSs) powered by both renewable energy sources and grid power are considered. Based on a techno-economic analysis, we demonstrate that by controlling both the transmit power and the stored-energy usage of BSs, energy costs can be effectively reduced. Specifically, we propose a two-stage BS operation scheme in which an optimization subproblem and a control subproblem are solved at the respective stages. In the first subproblem, the transmit power of the BSs is adjusted while the quality of service (QoS) experienced by users is preserved. In the second subproblem, we consider the strategic scheduling of the renewable energy used to power the BSs: harvested energy may be reserved in the battery for future use to minimize the cost of on-grid power that varies in real time. We propose (1) an optimization approach built on a lattice model with a method to handle the outage-rate constraint, and (2) a control algorithm based on nonlinear model predictive control (NMPC) theory, to solve the two subproblems respectively. Simulation results include a collection of case studies that demonstrate how operators can manage energy-harvesting BSs to reduce their electricity costs.
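The second-stage idea, reserving cheap harvested energy for expensive price periods, can be sketched with a greedy rule. The prices, load, and harvest profile below are invented for illustration, and the paper uses NMPC rather than this simple heuristic:

```python
# Greedy stored-energy schedule against time-varying grid prices (toy numbers).
prices = [0.10, 0.08, 0.30, 0.25, 0.12, 0.40]   # $/kWh per slot (made up)
harvest = [1.0, 3.0, 0.5, 0.0, 2.0, 0.5]        # renewable kWh per slot
load = 2.0                                      # kWh the BS draws every slot
cap, soc = 4.0, 0.0                             # battery capacity, state of charge

avg = sum(prices) / len(prices)
cost = 0.0
for p, h in zip(prices, harvest):
    soc = min(cap, soc + h)                     # store harvested energy
    use = min(soc, load) if p > avg else 0.0    # discharge in pricey slots only
    soc -= use
    cost += p * (load - use)                    # buy the remainder from the grid

baseline = sum(p * load for p in prices)        # no battery at all
assert cost < baseline                          # the battery cuts the bill
```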

9.
Song Ruizhuo, Wei Qinglai, Chinese Physics B, 2017, 26(3): 030505
We develop an optimal tracking control method for chaotic systems with unknown dynamics and disturbances. The method allows the optimal cost function and the corresponding tracking control to update synchronously. According to the tracking error and the reference dynamics, the augmented system is constructed, and the optimal tracking control problem is defined. Policy iteration (PI) is introduced to solve the min-max optimization problem. An off-policy adaptive dynamic programming (ADP) algorithm is then proposed to find the solution of the tracking Hamilton–Jacobi–Isaacs (HJI) equation online, using only measured data and without any knowledge of the system dynamics. A critic neural network (CNN), an action neural network (ANN), and a disturbance neural network (DNN) are used to approximate the cost function, the control, and the disturbance. The weights of these networks compose the augmented weight matrix, which is proven to be uniformly ultimately bounded (UUB). The convergence of the tracking error system is also proven. Two examples show the effectiveness of the proposed synchronous solution method for the chaotic system tracking problem.

10.
In this paper, we investigate an intelligent reflecting surface (IRS)-assisted mobile edge computing (MEC) network under physical-layer security, where users can partially offload confidential and compute-intensive tasks to a computing access point (CAP) with the help of the IRS. We consider an eavesdropping environment, in which an eavesdropper steals information from the communication. For the considered MEC network, we first design a secure data transmission rate to ensure physical-layer security. We then formulate the optimization target as minimizing the system cost, linearized over the latency and energy consumption (ENCP). Furthermore, we employ a deep deterministic policy gradient (DDPG) to optimize the system performance by allocating the offloading ratio, wireless bandwidth, and computational capability to users. Finally, to assess the impact of different resources, we treat the DDPG-based optimization strategy as one criterion and design further criteria with different resource-allocation schemes. Simulation results demonstrate that our proposed criterion outperforms the others.
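A sketch of a linearized latency-energy (ENCP) cost for one user under partial offloading. Every parameter value is a placeholder, and the trade-off weight `lam` and the parallel local/offload timing model are assumptions for illustration, not the paper’s exact formulation:

```python
# Task: D input bits, C CPU cycles; a fraction rho is offloaded to the CAP.
D, C = 1e6, 1e9
f_loc, f_cap, r = 1e9, 10e9, 20e6     # local/edge CPU speed (Hz), secure rate (b/s)
k, p_tx = 1e-28, 0.5                  # local energy coefficient, transmit power (W)
lam = 0.5                             # latency vs. energy weight (assumed)

def encp(rho):
    t_loc = (1 - rho) * C / f_loc                 # local computing time
    t_off = rho * D / r + rho * C / f_cap         # secure upload + edge compute
    latency = max(t_loc, t_off)                   # the two run in parallel
    energy = k * (1 - rho) * C * f_loc ** 2 + p_tx * rho * D / r
    return lam * latency + (1 - lam) * energy

# grid search over the offloading ratio (a DDPG agent would learn this choice)
best_rho = min((i / 100 for i in range(101)), key=encp)
assert 0.0 < best_rho < 1.0           # partial offloading beats both extremes here
```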

11.
Dong Lijing, Chai Senchun, Zhang Baihai, Chinese Physics B, 2014, 23(1): 010508
We explore the tracking problem of a maneuvering target. Tracking agents with third-order kinematics can communicate with each other via a wireless network. The communication network topology is arbitrary rather than switching among several fixed topologies. The information shared among agents consists of position, velocity, and acceleration. Sufficient conditions for the tracking strategy are proposed. Finally, a numerical example demonstrates the effectiveness of the proposed tracking strategy.

12.
Computational efficiency is a direction worth considering in mobile edge computing (MEC) systems, yet the computational efficiency of UAV-assisted MEC systems is rarely studied. In this paper, we maximize the computational efficiency of the MEC network by optimizing offloading decisions and UAV flight paths and by reasonably allocating users’ charging and offloading time. Deep reinforcement learning is used to optimize the resources of a UAV-assisted MEC system in a complex urban environment, and users’ computation-intensive tasks are offloaded to the UAV-mounted MEC server, so that the overload in the whole system can be alleviated. We design a framework algorithm that can quickly adapt task-offloading decisions and resource allocation to changing wireless channel conditions in complex urban environments. The optimal offloading decisions, mapped from state space to action space, are generated through deep reinforcement learning, and each user’s charging time and offloading time are rationally allocated to maximize the weighted sum computation rate. Finally, a radio map is used to optimize the UAV trajectory and further improve the overall weighted sum computation rate of the system. Simulation results show that the proposed DRL+TO framework algorithm can significantly improve the weighted sum computation rate of the whole MEC system and save time. The MEC system resource optimization scheme proposed in this paper is thus feasible and performs better than other benchmark schemes.

13.
Timely status updates are critical in remote control systems such as autonomous driving and the industrial Internet of Things, where timeliness requirements are usually context dependent. Accordingly, the Urgency of Information (UoI) has been proposed beyond the well-known Age of Information (AoI) by further including context-aware weights that indicate whether the monitored process is in an emergency. However, the optimal updating and scheduling strategies in terms of UoI remain open. In this paper, we propose a UoI-optimal updating policy for timely status information under a resource constraint. We first formulate the problem as a constrained Markov decision process and prove that the UoI-optimal policy has a threshold structure. When the context-aware weights are known, we propose a numerical method based on linear programming; when the weights are unknown, we further design a reinforcement learning (RL)-based scheduling policy. The simulation reveals that the threshold of the UoI-optimal policy increases as the resource constraint tightens. In addition, the UoI-optimal policy outperforms the AoI-optimal policy in terms of average squared estimation error, and the proposed RL-based updating policy achieves near-optimal performance without advance knowledge of the system model.
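The threshold structure can be illustrated on a toy age process with a deterministic channel (a sketch, not the paper’s UoI model): updating only when the age crosses a threshold trades update frequency against staleness, so a tighter resource constraint pushes the threshold up and the average age with it.

```python
def simulate(threshold, horizon=10_000):
    """Update the status only when the age reaches the threshold."""
    age, updates, age_sum = 0, 0, 0
    for _ in range(horizon):
        age += 1
        if age >= threshold:        # threshold policy: transmit now
            age = 0
            updates += 1
        age_sum += age
    return updates / horizon, age_sum / horizon

freq3, avg3 = simulate(3)
freq5, avg5 = simulate(5)
# a higher threshold uses fewer transmissions but suffers a larger average age
assert freq5 < freq3 and avg5 > avg3
```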

14.
This article examines a multiuser intelligent reflecting surface (IRS)-aided mobile edge computing (MEC) system, where multiple edge nodes (ENs) with powerful computing resources at the network edge help execute the computing tasks offloaded by users over wireless channels. We evaluate the system performance using the metric of communication and computing delay. To enhance the system performance by reducing the network delay, we jointly optimize the task-unpacking design and the wireless bandwidth allocation, where the task-unpacking optimization is solved using the deep deterministic policy gradient (DDPG) algorithm. For the bandwidth allocation, we propose three analytical solutions: criterion I performs an equal bandwidth allocation, criterion II allocates bandwidth based on the transmission data rate, and criterion III allocates bandwidth based on the transmission delay. We finally provide simulation results showing that the proposed optimization of task unpacking and bandwidth allocation is effective in decreasing the network delay.
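The three bandwidth-allocation criteria can be written as simple rules. The spectral efficiencies and task sizes below are illustrative, and interpreting criterion III as equalizing per-user transmission delay is an assumption on our part:

```python
B_total = 10e6                      # total bandwidth, Hz (made up)
se = [2.0, 4.0, 1.0]                # spectral efficiency per user, bits/s/Hz
data = [4e6, 4e6, 4e6]              # offloaded bits per user

eq = [B_total / len(se)] * len(se)                 # criterion I: equal shares
prop = [B_total * s / sum(se) for s in se]         # criterion II: by data rate
w = [d / s for d, s in zip(data, se)]              # criterion III: weight by d/s
dly = [B_total * wi / sum(w) for wi in w]          #   so all delays equalize

def delays(B):
    """Per-user transmission delay d / (B * se), in seconds."""
    return [d / (b * s) for d, b, s in zip(data, B, se)]

d3 = delays(dly)
# criterion III yields (numerically) identical delays across users,
# which also minimizes the worst-case delay versus equal allocation
assert max(d3) - min(d3) < 1e-9 * max(d3)
assert max(d3) < max(delays(eq))
```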

15.
Future communication networks must address spectrum scarcity to accommodate the extensive growth of heterogeneous wireless devices. Efforts are underway to address spectrum coexistence, enhance spectrum awareness, and bolster authentication schemes. Wireless signal recognition is becoming increasingly significant for spectrum monitoring, spectrum management, and secure communications, among others. Consequently, comprehensive spectrum awareness on the edge has the potential to serve as a key enabler for the emerging beyond-5G (fifth generation) networks. State-of-the-art studies in this domain have (i) only focused on a single task, modulation or signal (protocol) classification, which in many cases is insufficient information for a system to act on, (ii) considered either radar or communication waveforms (a homogeneous waveform category), and (iii) not addressed edge deployment during the neural network design phase. In this work, for the first time in the wireless communication domain, we exploit the potential of a deep neural network based multi-task learning (MTL) framework to simultaneously learn modulation and signal classification tasks while considering heterogeneous wireless signals such as radar and communication waveforms in the electromagnetic spectrum. The proposed MTL architecture benefits from the mutual relation between the two tasks to improve the classification accuracy as well as the learning efficiency with a lightweight neural network model. We additionally include experimental evaluations of the model with over-the-air collected samples and demonstrate first-hand insight on model compression, along with a deep learning pipeline for deployment on resource-constrained edge devices. We demonstrate significant computational, memory, and accuracy improvements of the proposed model over two reference architectures.
In addition to modeling a lightweight MTL model suitable for resource-constrained embedded radio platforms, we provide a comprehensive heterogeneous wireless signals dataset for public use.

16.
Indoor location-aware services are booming in daily life and business activities, making the demand for precise indoor positioning systems thrive. The identification of line-of-sight (LOS) versus non-line-of-sight (NLOS) conditions is critical for wireless indoor time-of-arrival-based localization methods. Ultra-wideband (UWB) is considered low cost among the many wireless positioning systems; it can resolve multipath and has high penetration ability. This contribution addresses the UWB NLOS/LOS identification problem in multiple environments. We propose a LOS/NLOS identification method using a convolutional neural network in parallel with a gated recurrent unit, named the Indoor NLOS/LOS Identification Neural Network. The convolutional neural network extracts spatial features of the UWB channel impulse response data, while the gated recurrent unit, an effective building block for deep recurrent neural networks, extracts temporal features. By integrating squeeze-and-excitation blocks into these architectures, we can assign weights to channel-wise features. We simulated UWB channel impulse response signals in residential, office, and industrial scenarios based on the IEEE 802.15.4a channel model report. The presented network was tested on the simulation scenarios and on an open-source real-world measured dataset. Our method solves the NLOS identification problem for multiple indoor environments and is thus more versatile than networks that work in only one scenario. Popular machine learning and deep learning methods are compared against ours. The test results show that the proposed network outperforms the benchmark methods on both the simulated and the measured datasets.

17.
Yuan Jianhua, Huang Kai, Hong Husheng, Chen Qing, Li Shang, Journal of Applied Optics, 2020, 41(1): 194-201
The development of the aerospace industry and of new energy technologies has given small electric unmanned aerial vehicles (UAVs) high application value in modern warfare, scientific research, and other fields. Laser wireless power transmission can effectively solve the short endurance of small electric UAVs and greatly improve their operating efficiency. Based on the structure and principle of a UAV laser power supply system, and targeting the characteristics of laser wireless power supply for small electric UAVs, a maximum power point tracking optimization method is proposed: an optimized control algorithm combining the constant voltage (CV) method with the firefly algorithm (FA). After the laser is projected onto the photovoltaic panel of the UAV, the maximum power point is tracked during laser wireless charging, improving laser utilization and charging stability. Numerical simulations verify the accuracy and applicability of the proposed algorithm.
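A minimal firefly-algorithm sketch for maximum power point tracking on a toy P-V curve. The curve shape and all FA parameters are illustrative placeholders, not values from the paper:

```python
import math
import random

def pv_power(v):
    """Toy photovoltaic P-V curve with a maximum power point near 17.5 V."""
    return max(0.0, 100.0 - (v - 17.5) ** 2)

random.seed(2)
V = [random.uniform(0.0, 22.0) for _ in range(8)]   # firefly positions (volts)
gbest = max(V, key=pv_power)                        # incumbent operating point
init_power = pv_power(gbest)
alpha, beta0, gamma = 0.5, 1.0, 0.1                 # step, attractiveness, decay
for _ in range(60):
    for i in range(len(V)):
        for j in range(len(V)):
            if pv_power(V[j]) > pv_power(V[i]):     # move i toward brighter j
                beta = beta0 * math.exp(-gamma * (V[i] - V[j]) ** 2)
                V[i] += beta * (V[j] - V[i]) + alpha * random.uniform(-0.5, 0.5)
                V[i] = min(max(V[i], 0.0), 22.0)
                if pv_power(V[i]) > pv_power(gbest):
                    gbest = V[i]
# the incumbent never gets worse; in practice it settles near the MPP
assert pv_power(gbest) >= init_power
```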

18.
Battery energy storage technology is an important part of industrial parks for ensuring a stable power supply, but its rough charging and discharging mode makes it difficult to meet application requirements of energy saving, emission reduction, cost reduction, and efficiency increase. As a classic method of deep reinforcement learning, the deep Q-network is widely used to solve the problem of user-side battery energy storage charging and discharging, and in some scenarios its performance has reached the level of a human expert. However, the update of sample priorities in experience memory often lags behind the update of the Q-network parameters. In response to the need for lean management of battery charging and discharging, this paper proposes an improved deep Q-network that updates the priorities of sequence samples and improves the training performance of the deep neural network, reducing the cost of charging and discharging actions and the energy consumption in the park. The proposed method considers factors such as real-time electricity price, battery status, and time. The energy consumption state, the charging and discharging behavior, the reward function, and the neural network structure are designed to support flexible scheduling of charging and discharging strategies and ultimately to optimize the battery energy storage benefits. The proposed method solves the priority-update lag problem and improves the utilization efficiency and learning performance of the experience pool samples. Electricity price data from the United States and from some regions of China are selected for simulation experiments. Experimental results show that, compared with the traditional algorithm, the proposed approach achieves better performance under both electricity price systems, thereby greatly reducing the cost of battery energy storage and providing a stronger guarantee for the safe and stable operation of battery energy storage systems in industrial parks.
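A minimal proportional prioritized replay buffer shows the mechanism in question: `update()` re-prioritizes sampled transitions right after a network update, which is exactly where the lag the paper targets arises. This is a sketch of the generic technique, not the paper’s improved scheme:

```python
import random

class PrioritizedReplay:
    """Proportional prioritized experience replay (minimal sketch)."""

    def __init__(self, eps=0.01):
        self.data, self.prio, self.eps = [], [], eps

    def add(self, transition, td_error):
        self.data.append(transition)
        self.prio.append(abs(td_error) + self.eps)

    def sample(self, k):
        # sample with probability proportional to stored priority
        total = sum(self.prio)
        weights = [p / total for p in self.prio]
        idx = random.choices(range(len(self.data)), weights=weights, k=k)
        return idx, [self.data[i] for i in idx]

    def update(self, idx, td_errors):
        # re-prioritize immediately after the Q-network update, so stored
        # priorities do not lag behind the current network
        for i, e in zip(idx, td_errors):
            self.prio[i] = abs(e) + self.eps

random.seed(3)
buf = PrioritizedReplay()
buf.add(("s0", "charge", 1.0, "s1"), td_error=5.0)
buf.add(("s1", "hold", 0.0, "s2"), td_error=0.1)
idx, batch = buf.sample(100)
# the high-TD-error transition dominates the sampled batch
assert idx.count(0) > idx.count(1)
```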

19.
We present a general machine learning based scheme to optimize experimental control. The method utilizes a neural network to learn the relation between the control parameters and the control goal, from which the optimal control parameters can be obtained. The main challenge of this approach is that labeled data obtained from experiments are not abundant, and the central idea of our scheme is to use active learning to overcome this difficulty. As a demonstration, we apply our method to control evaporative cooling experiments in cold atoms. We first tested the method with simulated data and then applied it to real experiments. We demonstrate that our method can successfully reach the best performance within hundreds of experimental runs. The method does not require prior knowledge of the experimental system and is universal for experimental control in different systems.
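The active-learning loop can be sketched with a stand-in acquisition rule, querying the candidate farthest from any labeled point, in place of the paper’s network-based criterion; the `experiment()` function below is a hypothetical stand-in for one costly experimental run:

```python
def experiment(x):                    # hypothetical experimental run
    return -(x - 0.3) ** 2            # control goal to maximize (made up)

pool = [i / 100.0 for i in range(101)]          # candidate control parameters
labeled = {0.0: experiment(0.0), 1.0: experiment(1.0)}   # two seed runs
for _ in range(20):                   # far fewer runs than labeling the full pool
    # acquisition: pick the most "uncertain" candidate, here the one
    # farthest from every labeled point
    x = max((p for p in pool if p not in labeled),
            key=lambda p: min(abs(p - q) for q in labeled))
    labeled[x] = experiment(x)        # run the costly experiment only there
best_x = max(labeled, key=labeled.get)
assert abs(best_x - 0.3) < 0.05       # optimum located with ~20 queries
```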

20.
Transient stability and steady-state (small-signal) stability in power grids are reviewed. Transient stability concepts are illustrated with simple examples; in particular, we consider three methods for computing the region of attraction: time simulations, extended Lyapunov functions, and the sum-of-squares optimization method. We discuss steady-state stability in power systems and present an example of feedback control via a communication network for the 10-unit, 39-bus New England test system.
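The first of the three methods, time simulation, can be sketched on a toy one-dimensional system dx/dt = -x + x^3, whose exact region of attraction for the stable equilibrium at the origin is (-1, 1): simulate from a grid of initial conditions and keep those that settle at the equilibrium.

```python
def converges(x0, dt=0.01, steps=5000):
    """Forward-Euler simulation of dx/dt = -x + x**3 from x0."""
    x = x0
    for _ in range(steps):
        x += dt * (-x + x ** 3)
        if abs(x) > 10.0:             # trajectory escaped: not in the region
            return False
    return abs(x) < 1e-3              # settled at the origin

grid = [i / 10.0 for i in range(-15, 16)]     # initial conditions in [-1.5, 1.5]
roa = [x0 for x0 in grid if converges(x0)]
# the estimate recovers (-1, 1) up to the grid resolution
assert min(roa) == -0.9 and max(roa) == 0.9
```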
