Similar literature
 20 similar articles found (search time: 249 ms)
1.
Computational efficiency is a direction worth considering in mobile edge computing (MEC) systems. However, the computational efficiency of UAV-assisted MEC systems is rarely studied. In this paper, we maximize the computational efficiency of the MEC network by optimizing offloading decisions and UAV flight paths and by reasonably allocating users’ charging and offloading time. Deep reinforcement learning is used to optimize the resources of a UAV-assisted MEC system in a complex urban environment, and users’ computation-intensive tasks are offloaded to the UAV-mounted MEC server, so that task overload across the whole system can be alleviated. We design a framework algorithm that can quickly adapt task offloading decisions and resource allocation to changing wireless channel conditions in complex urban environments. The optimal offloading decisions, mapping the state space to the action space, are generated through deep reinforcement learning, and then each user’s charging time and offloading time are allocated to maximize the weighted sum computation rate. Finally, the UAV trajectory is optimized using the radio map to improve the overall weighted sum computation rate of the system. Simulation results show that the proposed DRL+TO framework algorithm can significantly improve the weighted sum computation rate of the whole MEC system and save time. The proposed MEC resource optimization scheme is therefore feasible and outperforms the other benchmark schemes.

2.
An intelligent solution method is proposed to achieve real-time optimal control for continuous-time nonlinear systems using a novel identifier-actor-optimizer (IAO) policy learning architecture. In this IAO-based policy learning approach, a dynamical identifier is developed to approximate the unknown part of the system dynamics using deep neural networks (DNNs). Then, an indirect-method-based optimizer is proposed to generate high-quality optimal actions for system control, considering both the constraints and the performance index. Furthermore, a DNN-based actor is developed to approximate the obtained optimal actions and return good initial guesses to the optimizer. In this way, traditional optimal control methods and state-of-the-art DNN techniques are combined in the IAO-based optimal policy learning method. Compared to reinforcement learning algorithms with actor-critic architectures, which suffer from hard reward design and low computational efficiency, the IAO-based optimal policy learning algorithm enjoys fewer user-defined parameters, higher learning speeds, and steadier convergence properties in solving complex continuous-time optimal control problems (OCPs). Simulation results of three space flight control missions are given to substantiate the effectiveness of this IAO-based policy learning strategy and to illustrate the performance of the developed DNN-based optimal control method for continuous-time OCPs.

3.
Machine learning research has been able to solve problems in multiple domains, and it represents an open area of research for solving optimisation problems. Optimisation problems can be solved using a metaheuristic algorithm, which can find a solution in a reasonable amount of time. However, finding a metaheuristic algorithm with a configuration suited to a given set of optimisation problems is itself time-consuming. The proposal described in this article is an approach that automatically creates metaheuristic algorithms for a given set of optimisation problems. These metaheuristic algorithms are created by modifying their logical structure through an evolutionary process. This process employs an extension of the reinforcement learning approach that considers multiple agents in their environment, and a learning agent composed of an analysis process and a process that modifies the algorithms. In the experiments performed, the approach succeeded in creating a metaheuristic algorithm that solved different continuous-domain optimisation problems. The implications of this work are immediate because they describe a basis for generating metaheuristic algorithms through online evolution.

4.
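In local imaging inspection, differences in the contour or placement of complex parts introduce an angle between the part and the coordinate axes of the imaging plane, so the extracted symmetric point sets may contain asymmetric points or lack symmetric counterparts. Measuring assembly coaxiality with the traditional Hough transform or with fitting methods then incurs large errors. To address this problem, a local imaging inspection algorithm for assembly coaxiality is proposed: the upper and lower edge point sets of the image are extracted and, combined with the Hough line transform, the parameter-space points obtained by projecting pairs from the two point sets into Hough space are counted, and the point with the maximum accumulated count is searched for; the symmetry axis corresponding to this point is the optimal symmetry axis. Simulation results show that the method extracts the optimal symmetry axis with high accuracy, with a coaxiality error of only 0.0027. The local imaging inspection method for assembly coaxiality is therefore effective and feasible.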
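The voting step in this abstract can be illustrated with a small Hough accumulator. This is only a sketch under stated assumptions, not the authors' algorithm: it assumes upper and lower edge points are paired by index, that each pair votes with its midpoint, and that lines are sampled at whole degrees; the names `hough_best_line` and `symmetry_axis` are hypothetical.

```python
import math
from collections import Counter


def hough_best_line(points, rho_step=1.0):
    """Return (theta_deg, rho, votes) for the strongest line through `points`.

    Lines use the normal form rho = x*cos(theta) + y*sin(theta),
    with theta sampled at whole degrees in [0, 180).
    """
    votes = Counter()
    for x, y in points:
        for theta_deg in range(180):
            theta = math.radians(theta_deg)
            rho = x * math.cos(theta) + y * math.sin(theta)
            # Quantize rho into accumulator bins of width rho_step.
            votes[(theta_deg, round(rho / rho_step))] += 1
    (theta_deg, rho_bin), count = max(votes.items(), key=lambda kv: kv[1])
    return theta_deg, rho_bin * rho_step, count


def symmetry_axis(upper_edge, lower_edge):
    """Estimate a symmetry axis from paired edge points (assumed pairing by index)."""
    midpoints = [((xu + xl) / 2, (yu + yl) / 2)
                 for (xu, yu), (xl, yl) in zip(upper_edge, lower_edge)]
    return hough_best_line(midpoints)


if __name__ == "__main__":
    upper = [(x, 8) for x in range(0, 100, 10)]
    lower = [(x, 2) for x in range(0, 100, 10)]
    print(symmetry_axis(upper, lower))
```

Because the peak is found by counting votes, a few unpaired or asymmetric points lower the count at the true axis but do not move it, which is the robustness the abstract relies on.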

5.
An Unmanned Aerial Vehicle (UAV) can greatly reduce manpower in agricultural plant protection tasks such as watering, sowing, and pesticide spraying. It is essential to develop a Decision-making Support System (DSS) for UAVs to help them choose the correct action in each state according to the policy. In an unknown environment, formulating rules by hand to help UAVs choose actions is not applicable, and obtaining the optimal policy through reinforcement learning is a feasible solution. However, experiments show that existing reinforcement learning algorithms cannot obtain the optimal policy for a UAV in the agricultural plant protection environment. In this work we propose an improved Q-learning algorithm based on similar state matching, and we prove theoretically that, in the agricultural plant protection environment, a UAV following the policy learned by the proposed algorithm has a greater probability of choosing the optimal action than one following the classic Q-learning algorithm. The proposed algorithm is implemented and tested on evenly distributed datasets based on real UAV parameters and real farm information. The performance evaluation of the algorithm is discussed in detail. Experimental results show that the proposed algorithm can efficiently learn the optimal policy for UAVs in the agricultural plant protection environment.
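For reference, the classic tabular Q-learning baseline that this abstract compares against can be sketched as follows. This is not the proposed similar-state-matching variant, whose details the abstract does not give; the toy corridor environment, the function names, and all hyperparameter values are assumptions made for illustration.

```python
import random


def q_learning(n_states, n_actions, step, episodes=500, alpha=0.1,
               gamma=0.9, epsilon=0.1, max_steps=100, seed=0):
    """Classic tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: r + gamma * max_a' Q(s', a').
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def corridor_step(state, action, goal=4):
    """Toy 1-D corridor (hypothetical): action 1 moves right, action 0 moves left."""
    next_state = min(goal, state + 1) if action == 1 else max(0, state - 1)
    done = next_state == goal
    return next_state, (1.0 if done else 0.0), done


if __name__ == "__main__":
    table = q_learning(n_states=5, n_actions=2, step=corridor_step)
    for s, row in enumerate(table[:4]):
        print(s, [round(v, 3) for v in row])
```

In this toy setting the learned table prefers moving right in every non-goal state, since the only reward sits at the right end of the corridor.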

6.
Space exploration is a hot topic in the application field of mobile robots. Proposed solutions have included the frontier exploration algorithm, heuristic algorithms, and deep reinforcement learning. However, these methods cannot complete space exploration in a timely manner in a dynamic environment. This paper models the space exploration problem of mobile robots based on the decision-making process of the Soar cognitive architecture, and three space exploration heuristic algorithms (HAs) are further proposed based on the model to improve the exploration speed of the robot. Experiments are carried out in the Easter environment, and the results show that the HAs improve the exploration speed of the Easter robot by at least a factor of 2.04 over the original algorithm in Easter, verifying the effectiveness of the proposed robot space exploration strategy and the corresponding HAs.

7.
This article considers a backscatter-aided wireless powered mobile edge computing (BC-aided WPMEC) network, in which the task data of each Internet of Things (IoT) device can be computed locally or offloaded to the MEC server via backscatter communications, and designs a resource allocation scheme that maximizes the weighted sum computation bits (WSCB) of all the IoT devices. To this end, by optimizing the mobile edge computing (MEC) server’s transmit power, the IoT devices’ power reflection coefficients, local computing frequencies and time, the time allocation between energy harvesting and task offloading, and the binary offloading decision at each IoT device, we formulate a WSCB maximization problem, which is a non-convex mixed-integer programming problem. To solve it, proof by contradiction and the monotonicity of the objective function are used to determine the optimal local computing time of each IoT device and the optimal transmit power of the MEC server, and time-sharing relaxation (TSR) is adopted to handle the integer variables; together these simplify the original problem. Then, we decouple the simplified problem into two sub-problems by means of the block coordinate descent (BCD) technique, and each sub-problem is transformed into a convex one by introducing auxiliary variables. Based on this, we design a two-stage alternative (TSA) optimization algorithm to solve the formulated WSCB problem. Computer simulations validate that the TSA algorithm converges quickly and demonstrate that the proposed scheme achieves a higher WSCB than existing schemes.

8.
As a non-deterministic polynomial hard (NP-hard) problem, the shortest common supersequence (SCS) problem is normally solved by heuristic or metaheuristic algorithms. One type of metaheuristic algorithm that performs relatively well on SCS problems is the chemical reaction optimization (CRO) algorithm. Several CRO-based proposals exist; however, they face such problems as unstable molecular population quality, uneven distribution, and local optimum (premature) solutions. To overcome these problems, we propose a new approach for the search mechanism of CRO-based algorithms. It combines the opposition-based learning (OBL) mechanism with the previously studied improved chemical reaction optimization (IMCRO) algorithm. This upgraded version is dubbed OBLIMCRO. In its initialization phase, the opposite population is constructed from a random population based on OBL; then, the initial population is generated by selecting the molecules with the lowest potential energy from the random and opposite populations. In the iterative phase, reaction operators create new molecules, after which the final population update is performed. Experiments show that the average running time of OBLIMCRO is more than 50% less than the average running time of CRO_SCS and of its baseline algorithm, IMCRO, on the deoxyribonucleic acid (DNA) and protein datasets.
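The initialization phase described in this abstract follows the general opposition-based learning pattern, which can be sketched as below. The sketch assumes real-valued molecules with box bounds and the common opposite-point rule `lo + hi - x`; the paper's molecules encode supersequences, so its actual opposite operator is likely different. `obl_initial_population` and the example energy function are hypothetical.

```python
import random


def obl_initial_population(size, bounds, energy, seed=0):
    """Build an initial population with opposition-based learning (OBL).

    bounds: list of (lo, hi) per dimension (assumed continuous encoding).
    energy: function mapping a candidate to its potential energy (lower is better).
    """
    rng = random.Random(seed)
    # Step 1: random population.
    randoms = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(size)]
    # Step 2: opposite population, reflecting each coordinate within its bounds.
    opposites = [[lo + hi - x for x, (lo, hi) in zip(ind, bounds)]
                 for ind in randoms]
    # Step 3: keep the `size` candidates with the lowest potential energy.
    return sorted(randoms + opposites, key=energy)[:size]


if __name__ == "__main__":
    box = [(-5.0, 5.0)] * 3
    shifted_sphere = lambda v: sum((x - 1.0) ** 2 for x in v)
    for individual in obl_initial_population(4, box, shifted_sphere):
        print([round(x, 2) for x in individual], round(shifted_sphere(individual), 2))
```

Selecting from the union of both populations means the starting set is never worse, in terms of its best energy, than the random population alone.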

9.
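郭贵松, 林彬, 杨夏, 张小虎. 《应用光学》 (Journal of Applied Optics), 2022, 43(2): 257-268
Computer vision methods are increasingly applied to the study of zebrafish group behavior. However, because zebrafish deform considerably while swimming and frequently occlude one another, detecting them accurately and robustly remains very challenging. To solve this problem, a fish-school detection algorithm based on zebrafish image features is proposed. First, by analyzing the target characteristics, detection of the fish head and tail is proposed in place of whole-fish detection, which solves the failure of traditional whole-fish detection when fish cross and occlude one another. Then, the training set is constructed automatically from zebrafish image features, avoiding the time-consuming and laborious manual annotation required by deep learning. Validation on real zebrafish videos shows that, compared with existing algorithms, the proposed method achieves better results in annotation rate, recall (R), and occlusion detection rate (ODR). In terms of annotation performance, the proposed automatic annotation method reaches a total annotation rate of 87.40%; in terms of training-set quality, automatic annotation combined with manual correction reduces annotation time by 93.11% compared with manual annotation, and the mean average precision (mAP) reaches 79.80%; in terms of object detection, at a target occlusion rate of 42.72%, the detection algorithm achieves a recall of 82.0% and an occlusion detection rate of 58.02%.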

10.
《中国物理 B》 (Chinese Physics B), 2021, 30(10): 100505-100505
Many problems in science, engineering, and real life are related to combinatorial optimization. However, many combinatorial optimization problems are NP-hard, and their globally optimal solutions are usually difficult to find. Therefore, algorithms that search for the globally optimal or a near-optimal solution of combinatorial optimization problems have attracted great attention. As a typical combinatorial optimization problem, the traveling salesman problem (TSP) often serves as a touchstone for novel approaches. It has been found that natural systems, particularly brain nervous systems, work in the critical region between order and disorder, namely, on the edge of chaos. In this work, an algorithm for combinatorial optimization problems is proposed based on neural networks on the edge of chaos (ECNN). The algorithm is then applied to TSPs of 10 cities, 21 cities, 48 cities, and 70 cities. The results show that the ECNN algorithm has a strong ability to drive the networks away from local minima. Compared with the transiently chaotic neural network (TCNN), the stochastic chaotic neural network (SCNN) algorithms, and other optimization algorithms, much higher rates of globally optimal and near-optimal solutions are obtained with the ECNN algorithm. To conclude, our algorithm provides an effective way of solving combinatorial optimization problems.

11.
In the field of reinforcement learning, we propose a Correct Proximal Policy Optimization (CPPO) algorithm based on a modified penalty factor β and relative entropy, in order to address the robustness and stationarity problems of traditional algorithms. First, a policy evaluation mechanism is established through the policy distribution function. Second, the state space function is quantified by introducing entropy, whereby an approximate policy is used to approximate the real policy distribution, and kernel function estimation and the calculation of relative entropy are used to fit the reward function of complex problems. Finally, through comparative analysis on classic test cases, we demonstrate that the proposed algorithm is effective, converges faster, and performs better than the traditional PPO algorithm, and that the relative entropy measure can show the differences. In addition, it uses information from complex environments more efficiently to learn policies. Our paper also explains the rationality of the policy distribution theory, the proposed framework balances iteration steps, computational complexity, and convergence speed, and we introduce an effective performance measure based on the relative entropy concept.
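As background for the penalty factor β mentioned in this abstract, the adaptive KL-penalty rule from the original PPO formulation can be sketched as follows. This is the standard rule, not CPPO's modified β, which the abstract does not specify; the function names are hypothetical and the policies are assumed to be discrete distributions with strictly positive probabilities in `q`.

```python
import math


def kl_divergence(p, q):
    """Relative entropy KL(p || q) for discrete distributions (q assumed > 0)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def adapt_beta(beta, kl, kl_target):
    """Standard PPO adaptive penalty: halve or double beta around the KL target."""
    if kl < kl_target / 1.5:
        return beta / 2  # policy moved too little, relax the penalty
    if kl > kl_target * 1.5:
        return beta * 2  # policy moved too much, tighten the penalty
    return beta


def penalized_objective(surrogate, kl, beta):
    """KL-penalized surrogate: mean(ratio * advantage) minus beta * KL."""
    return surrogate - beta * kl


if __name__ == "__main__":
    old_policy = [0.5, 0.5]
    new_policy = [0.6, 0.4]
    kl = kl_divergence(old_policy, new_policy)
    print(kl, adapt_beta(1.0, kl, kl_target=0.01))
```

The relative entropy plays two roles here: it is the penalty term in the objective and the signal that drives the update of β between policy iterations.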

12.
To resolve coherent/incoherent, distributed/compact, and multipole aerodynamic-sound sources with phased-array pressure data, a new source-detection algorithm is developed based on L1 generalized inverse techniques. To extract each coherent signal, a cross spectral matrix is decomposed into eigenmodes. Subsequently, the complex source-amplitude distribution that recovers each eigenmode is solved using generalized inverse techniques with reference solutions which include multipoles as well as a monopole. Namely, the source distribution consisting of pre-defined source types is solved as an L1 norm problem using iteratively re-weighted least squares (IRLS). The capabilities of the proposed algorithm are demonstrated using various benchmark problems to compare the results with several existing beam-forming algorithms, and it is found that distributed sources as well as dipoles with arbitrary orientation can be identified regardless of coherency with another source. The resolution is comparable to existing deconvolution techniques, such as DAMAS or CLEAN, and the computational cost is only several times more than that of DAMAS2. The proposed algorithm is also examined using previous model-scale test data taken in an open-jet wind-tunnel for a study on jet-flap interaction, and some indication of dipole radiation is discerned near the flap edge.

13.
When an unmanned aerial vehicle (UAV) performs tasks such as power patrol inspection, water quality detection, and field scientific observation, its limited computing capacity and battery power prevent it from completing the tasks efficiently. An effective remedy is to deploy edge servers near the UAV, so that the UAV can offload some of its computationally intensive and real-time tasks to them. In this paper, a mobile edge computing offloading strategy based on reinforcement learning is proposed. Firstly, the Stackelberg game model is introduced to model the UAV and edge nodes in the network, and a utility function is used to maximize the offloading revenue. Secondly, as the problem is a mixed-integer non-linear programming (MINLP) problem, we introduce the multi-agent deep deterministic policy gradient (MADDPG) algorithm to solve it. Finally, the effects of the number of UAVs and the total computing resources on the total revenue of the UAVs are evaluated through simulation experiments. The experimental results show that, compared with other algorithms, the proposed algorithm improves the total benefit of the UAVs more effectively.

14.
Recently, deep reinforcement learning (RL) algorithms have achieved significant progress in the multi-agent domain. However, training for increasingly complex tasks is time-consuming and resource-intensive. To alleviate this problem, efficient leveraging of historical experience is essential, yet it is under-explored in previous studies because most existing methods fail to achieve this goal in a continuously dynamic system owing to their complicated design. In this paper, we propose a method for knowledge reuse called “KnowRU”, which can be easily deployed in the majority of multi-agent reinforcement learning (MARL) algorithms without requiring complicated hand-coded design. We employ the knowledge distillation paradigm to transfer knowledge among agents, shortening the training phase for new tasks while improving the asymptotic performance of agents. To empirically demonstrate the robustness and effectiveness of KnowRU, we perform extensive experiments on state-of-the-art MARL algorithms in collaborative and competitive scenarios. The results show that KnowRU outperforms recently reported methods and not only accelerates the training phase but also improves the training performance, emphasizing the importance of the proposed knowledge reuse for MARL.

15.
Optimal sensor placement plays a key role in structural health monitoring of spatial lattice structures. This paper considers the problem of locating sensors on a spatial lattice structure with the aim of maximizing the data information so that structural dynamic behavior can be fully characterized. Based on the criterion of optimal sensor placement for modal tests, an improved genetic algorithm is introduced to find the optimal placement of sensors. The modal strain energy (MSE) and the modal assurance criterion (MAC) have been taken as the fitness function, respectively, so that three placement designs were produced. A decimal two-dimensional array coding method is proposed in place of the binary coding method to code the solution. A forced mutation operator is introduced when identical genes appear via the crossover procedure. A computational simulation of a 12-bay plain truss model has been implemented to demonstrate the feasibility of the three optimal algorithms above. The optimal sensor placements obtained using the improved genetic algorithm are compared with those obtained by the existing genetic algorithm using the binary coding method. Further, a comparison criterion based on the mean square error between the finite element method (FEM) mode shapes and the Guyan expansion mode shapes identified by the data-driven stochastic subspace identification (SSI-DATA) method is employed to demonstrate the advantages of the different fitness functions. The results show that the innovations in the genetic algorithm proposed in this paper can enlarge the gene storage and improve the convergence of the algorithm. More importantly, the three optimal sensor placement methods all provide reliable results and identify the vibration characteristics of the 12-bay plain truss model accurately.
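The modal assurance criterion used as a fitness function here has a standard closed form for real mode shapes, MAC(φi, φj) = (φiᵀφj)² / ((φiᵀφi)(φjᵀφj)). The sketch below evaluates it on mode shapes restricted to a candidate sensor set; using the largest off-diagonal MAC value as the quantity to minimize is an assumption about how the fitness is built, and both function names are hypothetical.

```python
def mac(phi_i, phi_j):
    """Modal assurance criterion between two real mode-shape vectors (0 to 1)."""
    dot = sum(a * b for a, b in zip(phi_i, phi_j))
    norm_i = sum(a * a for a in phi_i)
    norm_j = sum(b * b for b in phi_j)
    return dot * dot / (norm_i * norm_j)


def max_off_diagonal_mac(mode_shapes, sensor_dofs):
    """Largest MAC between distinct modes, seen only at the chosen sensor DOFs.

    A low value means the selected sensors can tell the modes apart.
    Assumes every mode has a nonzero component at some selected DOF.
    """
    reduced = [[shape[d] for d in sensor_dofs] for shape in mode_shapes]
    return max(mac(reduced[i], reduced[j])
               for i in range(len(reduced))
               for j in range(i + 1, len(reduced)))


if __name__ == "__main__":
    modes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
    print(max_off_diagonal_mac(modes, sensor_dofs=[0, 1]))
```

A genetic algorithm would call a function like `max_off_diagonal_mac` once per candidate placement, so keeping it cheap matters more than keeping it general.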

16.
Poker has been considered a challenging problem in both artificial intelligence and game theory because it is characterized by imperfect information and uncertainty, which are shared by many realistic problems such as auctioning, pricing, cyber security, and operations. However, it is not yet clear that playing an equilibrium policy in multi-player games would be wise, and it is infeasible to validate theoretically whether a policy is optimal. Therefore, designing an effective optimal policy learning method has more realistic significance. This paper proposes an optimal policy learning method for multi-player poker games based on Actor-Critic reinforcement learning. Firstly, this paper builds the Actor network to make decisions with imperfect information and the Critic network to evaluate policies with perfect information. Secondly, this paper proposes novel multi-player poker policy update methods: the asynchronous policy update algorithm (APU) and the dual-network asynchronous policy update algorithm (Dual-APU), for multi-player multi-policy scenarios and multi-player sharing-policy scenarios, respectively. Finally, this paper uses the most popular six-player Texas hold ’em poker to validate the performance of the proposed optimal policy learning method. The experiments demonstrate that the policies learned by the proposed methods perform well and gain steadily compared with the existing approaches. In sum, policy learning methods for imperfect information games based on Actor-Critic reinforcement learning perform well on poker and can be transferred to other imperfect information games. Training with perfect information and testing with imperfect information in this way is an effective and explainable approach to learning an approximately optimal policy.

17.
This paper presents an error analysis of numerical algorithms for solving the convective continuity equation using flux-corrected transport (FCT) techniques. The nature of numerical errors in Eulerian finite-difference solutions to the continuity equation is analyzed. The properties and intrinsic errors of an “optimal” algorithm are discussed and a flux-corrected form of such an algorithm is demonstrated for a restricted class of problems. This optimal FCT algorithm is applied to a model test problem and the error is monitored for comparison with more generally applicable algorithms. Several improved FCT algorithms are developed and judged against both standard flux-uncorrected transport algorithms and the optimal algorithm. These improved FCT algorithms are found to be four to eight times more accurate than standard non-FCT algorithms, nearly twice as accurate as the original SHASTA FCT algorithm, and approach the accuracy of the optimal algorithm.

18.
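Edge contour features of infrared face images are valuable for infrared face detection, recognition, and related applications. To address the false edges that arise when extracting edge contours from infrared face images, an edge contour extraction method based on an improved Canny algorithm is proposed. First, a guided filter augmented with a "dynamic threshold constraint factor" replaces the Gaussian filter of the original algorithm, overcoming the original algorithm's uneven filtering and its loss of weak edge features in infrared face images. Next, the non-maximum suppression of the original algorithm is improved by adding four gradient directions to those originally computed, making the interpolation in non-maximum suppression finer than in the original algorithm. Finally, the OTSU algorithm is improved by constructing a gray-gradient mapping function to determine the optimal threshold, removing the original algorithm's reliance on empirically chosen thresholds. Experimental results show that, compared with filtering by the original Canny algorithm, the image filtered by the proposed method improves the signal-to-noise ratio by 34.40% and the structural similarity by 21.66%; in the final infrared face edge contour extraction experiment, the figure of merit is higher than that of the other methods compared, demonstrating the superiority of the improved algorithm for infrared face edge contour extraction.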
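The automatic threshold selection that this abstract builds on is the classic OTSU method, sketched below on a flat list of gray levels. This is the unmodified baseline, not the paper's gray-gradient mapping variant, which the abstract does not detail; the function name and the convention that the threshold `t` puts values `<= t` in the lower class are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Classic OTSU: pick the gray level that maximizes between-class variance.

    pixels: iterable of integer gray levels in [0, levels).
    Returns t such that values <= t form one class and values > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the lower class
    sum0 = 0  # gray-level sum of the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


if __name__ == "__main__":
    dark = [20, 22, 25, 30] * 10
    bright = [180, 190, 200, 210] * 10
    print(otsu_threshold(dark + bright))
```

In a Canny pipeline, a threshold chosen this way typically replaces the hand-tuned high threshold of the hysteresis step, which is the limitation the abstract says it removes.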

19.
Intelligent reflecting surfaces (IRSs) are anticipated to provide a reconfigurable propagation environment for next-generation communication systems. In this paper, we investigate a downlink IRS-aided multi-carrier (MC) non-orthogonal multiple access (NOMA) system, where the IRS is deployed specifically to help blocked users establish communication with the base station (BS). To maximize the system sum rate under network quality-of-service (QoS), rate fairness, and successive interference cancellation (SIC) constraints, we formulate a problem for the joint optimization of IRS elements, sub-channel assignment, and power allocation. The formulated problem is mixed non-convex. Therefore, a novel three-stage algorithm is proposed for the optimization of IRS elements, sub-channel assignment, and power allocation. First, the IRS elements are optimized using an iterative algorithm based on the bisection method. Then, the sub-channel assignment problem is solved using a one-to-one stable matching algorithm. Finally, the power allocation problem is solved for the given sub-channel assignment and optimal number of IRS elements using a Lagrangian dual-decomposition method. Moreover, to demonstrate the low complexity of the proposed resource allocation scheme, we provide a complexity analysis of the proposed algorithms. The simulation results illustrate the various factors that affect the optimal number of IRS elements and the superiority of the proposed resource allocation approach in terms of network sum rate and user fairness. Furthermore, we analyze the proposed approach against a new performance metric called computational efficiency (CE).

20.
Edge computing can deliver network services with low latency and real-time processing by providing cloud services at the network edge. Edge computing has a number of advantages such as low latency, locality, and network traffic distribution, but the associated resource management has become a significant challenge because of its inherent hierarchical, distributed, and heterogeneous nature. Various cloud-based network services such as crowd sensing, hierarchical deep learning systems, and cloud gaming each have their own traffic patterns and computing requirements. To provide a satisfactory user experience for these services, resource management that comprehensively considers service diversity, client usage patterns, and network performance indicators is required. In this study, an algorithm that simultaneously considers computing resources and network traffic load when deploying servers that provide edge services is proposed. The proposed algorithm generates candidate deployments based on factors that affect traffic load, such as the number of servers, server location, and client mapping according to service characteristics and usage. A final deployment plan is then established using a partial vector bin packing scheme that considers both the generated traffic and computing resources in the network. The proposed algorithm is evaluated using several simulations that consider actual network service and device characteristics.
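To make the vector bin packing idea concrete, the sketch below packs services with two resource demands (computing and traffic) onto identical servers using plain first-fit decreasing. It is a generic heuristic swapped in for illustration, not the paper's partial vector bin packing scheme; the function name, the ordering key, and the example numbers are all assumptions.

```python
def first_fit_decreasing(items, capacity):
    """Pack 2-D items (cpu, traffic) into servers of identical capacity.

    items: list of (cpu, traffic) demands.
    capacity: (cpu_capacity, traffic_capacity) of every server.
    Returns a list of servers, each a list of item indices.
    """
    def size(i):
        # Order by the larger normalized demand across the two dimensions.
        return max(items[i][0] / capacity[0], items[i][1] / capacity[1])

    order = sorted(range(len(items)), key=size, reverse=True)
    servers = []  # each entry: [cpu_load, traffic_load, [item indices]]
    for i in order:
        cpu, traffic = items[i]
        if cpu > capacity[0] or traffic > capacity[1]:
            raise ValueError(f"item {i} does not fit on an empty server")
        for server in servers:
            # Place the item on the first server where BOTH dimensions fit.
            if (server[0] + cpu <= capacity[0]
                    and server[1] + traffic <= capacity[1]):
                server[0] += cpu
                server[1] += traffic
                server[2].append(i)
                break
        else:
            servers.append([cpu, traffic, [i]])
    return [server[2] for server in servers]


if __name__ == "__main__":
    demands = [(4, 1), (1, 4), (3, 3), (2, 2)]
    print(first_fit_decreasing(demands, capacity=(5, 5)))
```

The example shows why both dimensions must be checked together: a compute-heavy service and a traffic-heavy service can share one server even though neither could share with a balanced one.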
