Similar literature
 20 similar articles found (search time: 140 ms)
1.
A domain decomposition method is developed for the numerical solution of nonlinear parabolic partial differential equations in any space dimension, based on the probabilistic representation of solutions as an average of suitable multiplicative functionals. Such a direct probabilistic representation requires generating a number of random trees, whose role is that of the realizations of stochastic processes used in the linear problems. First, only a few values of the sought solution inside the space-time domain are computed (by a Monte Carlo method on the trees). An interpolation is then carried out in order to approximate interfacial values of the solution inside the domain. Thus, a fully decoupled set of sub-problems is obtained. The algorithm is suited to massively parallel implementation, enjoying arbitrary scalability and fault tolerance properties. Pruning the trees is shown to increase the efficiency of the algorithm appreciably. Numerical examples conducted in 2D, including some for the KPP equation, are given.

2.
We perform a proof-of-concept implementation of the massively parallel algorithm [P. M. Lushnikov, Opt. Lett. 27, 939 (2002)] for simulation of dispersion-managed wavelength-division-multiplexed optical fiber systems. Linear scalability of the algorithm with the number of computer cores is demonstrated. An exact result for the accuracy of the implemented algorithm is derived analytically, confirmed numerically, and compared with the accuracy of the standard split-step algorithm.

3.
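The development of massively parallel processing requires parallel applications to scale well. Taking a two-dimensional electromagnetic plasma particle-in-cell parallel code as an example, the application of near-optimal scalability analysis is described. Given the known performance of a small-scale system, near-optimal scalability analysis indicates how many processors it is "reasonable" to run a larger-scale system on.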

4.
A new multi-block hybrid compact–WENO finite-difference method for the massively parallel computation of compressible flows is presented. In contrast to earlier methods, our approach breaks the global dependence of compact methods by using explicit finite-difference methods at block interfaces and is fully conservative. The resulting method is fifth- and sixth-order accurate for the convective and diffusive fluxes, respectively. The impact of the explicit interface treatment on the stability and accuracy of the multi-block method is quantified for the advection and diffusion equations. Numerical errors increase slightly as the number of blocks is increased. It is also found that the maximum allowable time steps increase with the number of blocks. The method demonstrates excellent scalability on up to 1264 processors.

5.
赵振国  李光荣  童杰  徐刚  周海京 《强激光与粒子束》2018,30(8):083001-1-083001-5
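This paper introduces JEMS-CDS-System, an in-house parallel code for computing the multiphysics effects of intense electromagnetic pulses. The code uses the time-domain finite element method and is built on JAUMIN, a parallel adaptive structured-mesh support framework; it offers high parallel performance and strong scalability, and supports dynamic load balancing. Test cases show that, for the electro-thermal-stress failure process of a bonding wire, the computed maximum temperature and von Mises equivalent stress agree well with COMSOL results. Parallel thermal-stress coupling computations for a SiP power amplifier module on the Tianhe-2 high-performance computing platform show that the code attains a parallel efficiency of 38.1% on 1024 CPU cores.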

6.
徐小文  莫则尧 《计算物理》2007,24(4):387-394
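A scalability analysis is carried out for the parallel computation of the algebraic multigrid (AMG) algorithm, one of the most effective iterative methods today for solving large sparse systems of linear algebraic equations. A methodology for analyzing the scalability of parallel computations is given, intended to analyze and guide the design and optimization of parallel iterative algorithms and their implementation techniques, and it is applied to the parallel AMG algorithm. The analysis shows that the average stencil size of the grid operators and the algorithmic efficiency of the iteration limit the parallel performance of the AMG setup phase and iterative solution phase, respectively, making them two key problems that this class of algorithms urgently needs to solve.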

7.
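To enable large-scale numerical simulation of the nozzle flow field in a chemical oxygen-iodine laser (COIL), a three-dimensional multi-block parallel COIL simulation code was developed. It adopts the governing equations and numerical algorithms of the VICON code and uses the patch, the basic data structure of the JASMIN framework, together with a parallel algorithm for stitching multi-block structured grids. Numerical experiments demonstrate the correctness and scalability of the parallel code: for a case with 4.5 million grid cells on 2048 processor cores, the speedup exceeds 420.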
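The speedup figure in this abstract can be turned into a parallel efficiency with one line of arithmetic. The sketch below assumes the speedup is measured against a single-core run, which the abstract does not state; the function name `parallel_efficiency` is illustrative only.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Return speedup / cores, i.e. efficiency relative to ideal linear scaling.

    Assumption: the speedup is relative to a one-core baseline. If the
    baseline used several cores, divide by (cores / baseline_cores) instead.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


if __name__ == "__main__":
    # Figures from the abstract: speedup above 420 on 2048 cores,
    # so this value is a lower bound under the single-core assumption.
    print(f"{parallel_efficiency(420, 2048):.1%}")
```

Under that assumption, a speedup of 420 on 2048 cores corresponds to an efficiency of at least about 20.5%.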

8.
Lushnikov PM 《Optics letters》2002,27(11):939-941
An efficient numerical algorithm is presented for massively parallel simulations of dispersion-managed wavelength-division-multiplexed optical fiber systems. The algorithm is based on a weak nonlinearity approximation and independent parallel calculations of fast Fourier transforms on multiple central processor units (CPUs). The algorithm allows one to implement numerical simulations M/2 times faster than a direct numerical simulation by a split-step method, where M is the number of CPUs in a parallel network.

9.
A 3D parallel adaptive mesh refinement (AMR) scheme is described for solving the partial-differential equations governing ideal magnetohydrodynamic (MHD) flows. This new algorithm adopts a cell-centered upwind finite-volume discretization procedure and uses limited solution reconstruction, approximate Riemann solvers, and explicit multi-stage time stepping to solve the MHD equations in divergence form, providing a combination of high solution accuracy and computational robustness across a large range in the plasma β (β is the ratio of thermal to magnetic pressure). The data structure naturally lends itself to domain decomposition, thereby enabling efficient and scalable implementations on massively parallel supercomputers. Numerical results for MHD simulations of magnetospheric plasma flows are described to demonstrate the validity and capabilities of the approach for space weather applications.

10.
A method for computing the numerical solution of Vlasov-type equations on massively parallel computers is presented. In contrast with Particle In Cell methods, which are known to be noisy, the method is based on a semi-Lagrangian algorithm that approximates the Vlasov equation on a phase-space grid. As this kind of method requires a huge computational effort, the simulations are carried out on parallel machines. To that purpose, we present a local cubic spline interpolation method based on a domain decomposition, with each subdomain devoted to one processor. Hermite boundary conditions between the domains, using ad hoc reconstruction of the derivatives, provide a good approximation of the global solution. The method is applied to various physical configurations, which demonstrate the capability of the numerical scheme.

11.
Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.

12.
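Parallel computation of the forward problem in fluorescence molecular tomography   Cited by: 2 (self-citations: 0, citations by others: 2)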
邹玮  王加俊  冯大淦 《光学学报》2007,27(3):443-450
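In fluorescence molecular tomography, the two forward equations corresponding to the excitation light and the emission light normally have to be solved serially. To address this, a parallel algorithm that solves the two diffusion equations simultaneously is proposed. The idea is to decouple the coupled equations by introducing a multiplier matrix, which makes parallel computation possible. Two-dimensional numerical simulations were performed with the finite element method, and the results were compared comprehensively with those of the serial method and of the parallel algorithm proposed by Ralf B. Schulz et al. The experiments show that the algorithm is suitable for Stokes shifts of any size and is therefore more widely applicable, and that it improves both the speed and the accuracy of solving the forward problem, which benefits fast and accurate solution of the overall fluorescence molecular tomography problem.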

13.
Parallel molecular dynamics computation based on a "block-cell" data structure   Cited by: 5 (self-citations: 0, citations by others: 5)
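A scalable parallel algorithm based on a "block-cell" data structure was developed for large-scale, non-uniform molecular dynamics simulations. It uses a space-filling curve to convert the three-dimensional domain decomposition into a one-dimensional load-balancing problem, which is then solved with a measurement-based multilevel averaging weight method to keep the load balanced across processors. On 500 CPUs of an MPP parallel machine, simulating a three-dimensional metal micro-jetting model with 2.1×10^8 particles, the algorithm achieved a speedup of 420.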
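The idea of turning a three-dimensional decomposition into a one-dimensional load-balancing problem can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the abstract does not say which space-filling curve is used, so a Morton (Z-order) curve is assumed here, and a simple prefix-sum split stands in for the paper's measurement-based multilevel averaging weight method. The names `morton3d` and `partition_cells` are hypothetical.

```python
def morton3d(ix: int, iy: int, iz: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of three cell indices into a Z-order key."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def partition_cells(weights: dict, nproc: int) -> dict:
    """Assign each cell to a rank by cutting the curve into equal-weight pieces.

    `weights` maps (ix, iy, iz) to a positive cost, for example a measured
    time or particle count. Cells are ordered along the curve, and each cell
    goes to the rank whose share of the total weight contains its midpoint.
    """
    ordered = sorted(weights, key=lambda cell: morton3d(*cell))
    total = float(sum(weights.values()))
    owner, running = {}, 0.0
    for cell in ordered:
        midpoint = running + 0.5 * weights[cell]
        owner[cell] = min(nproc - 1, int(midpoint * nproc / total))
        running += weights[cell]
    return owner


if __name__ == "__main__":
    # Uniform 4 x 4 x 4 block of cells split across 4 ranks.
    cells = {(i, j, k): 1.0 for i in range(4) for j in range(4) for k in range(4)}
    owner = partition_cells(cells, 4)
    loads = [sum(1 for r in owner.values() if r == rank) for rank in range(4)]
    print(loads)
```

Because neighbouring keys on the curve tend to be neighbouring cells in space, each rank receives a contiguous and reasonably compact region, which is what makes the one-dimensional split usable as a domain decomposition.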

14.
We address the failure in scalability of large-scale parallel simulations that are based on (semi-)implicit time-stepping and hence on the solution of linear systems on thousands of processors. We develop a general algorithmic framework based on domain decomposition that removes the scalability limitations and leads to optimal allocation of available computational resources. It is a non-intrusive approach as it does not require modification of existing codes. Specifically, we present here a two-stage domain decomposition method for the Navier–Stokes equations that combines features of discontinuous and continuous Galerkin formulations. At the first stage the domain is subdivided into overlapping patches, and within each patch a C0 spectral element discretization (second stage) is employed. The solution within each patch is obtained separately by applying an efficient parallel solver. Proper inter-patch boundary conditions are developed to provide solution continuity, while a Multilevel Communicating Interface (MCI) is developed to provide efficient communication between the non-overlapping groups of processors of each patch. The overall strong scaling of the method depends on the number of patches and on the scalability of the standard solver within each patch. This dual path to scalability provides great flexibility in balancing accuracy with parallel efficiency. The accuracy of the method has been evaluated in solutions of steady and unsteady 3D flow problems, including blood flow in the human intracranial arterial tree. Benchmarks on BlueGene/P, CRAY XT5 and Sun Constellation Linux Cluster have demonstrated good performance on up to 96,000 cores, solving up to 8.21B degrees of freedom in an unsteady flow problem. The proposed method is general and can potentially be used with other discretization methods or in other applications.

15.
A probabilistic representation for initial value semilinear parabolic problems based on generalized random trees has been derived. Two different strategies have been proposed, both requiring the generation of suitable random trees combined with a Padé approximant for accurately approximating a given divergent series. Such series are obtained by summing the partial contributions to the solution coming from trees with an arbitrary number of branches. The new representation greatly expands the class of problems amenable to probabilistic solution, and was used successfully to develop a generalized probabilistic domain decomposition method. Such a method has been shown to be suited for massively parallel computers, enjoying full scalability and fault tolerance. Finally, a few numerical examples are given to illustrate the remarkable performance of the algorithm, comparing the results with those obtained with a classical method.

16.
A parallel implementation of the electromagnetic dual-primal finite element tearing and interconnecting algorithm (FETI-DPEM) is designed for general three-dimensional (3D) electromagnetic large-scale simulations. As a domain decomposition implementation of the finite element method, the FETI-DPEM algorithm provides fully decoupled subdomain problems and an excellent numerical scalability, and thus is well suited for parallel computation. The parallel implementation of the FETI-DPEM algorithm on a distributed-memory system using the message passing interface (MPI) is discussed in detail along with a few practical guidelines obtained from numerical experiments. Numerical examples are provided to demonstrate the efficiency of the parallel implementation.

17.
A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as is its ability to efficiently handle arbitrary (non-power-of-2) numbers of block rows and processors. A comparison with a state-of-the-art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magneto-hydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited.

18.
陈再高  王建国  王玥  乔海亮  郭伟杰  张殿辉 《物理学报》2013,62(16):168402-168402
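An optimization design method for high-power microwave sources based on particle simulation and a parallel genetic algorithm is proposed. The output power of the high-power microwave device, as simulated by the fully electromagnetic particle simulation code UNIPIC, serves as the fitness function, and a floating-point-encoded genetic algorithm is used to optimize the device. With this algorithm, the position and height of the Bragg reflector of a relativistic backward-wave oscillator were encoded as floating-point numbers, and a global optimization of these parameters was carried out on a supercomputer, yielding the globally optimal Bragg reflector parameters. Keywords: parallel genetic algorithm; relativistic backward-wave oscillator; particle simulation; high-power microwave source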
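A floating-point-encoded genetic algorithm of the kind named in this abstract can be sketched in a few lines. Everything below is an assumption for illustration: the selection, crossover and mutation operators are common textbook choices rather than the authors', the run is serial rather than parallel, and `toy_power` is a hypothetical stand-in for an expensive UNIPIC simulation of output power.

```python
import random


def toy_power(params):
    """Hypothetical fitness: peaks at position 0.3, height 0.7 (not UNIPIC)."""
    position, height = params
    return -((position - 0.3) ** 2 + (height - 0.7) ** 2)


def real_coded_ga(fitness, bounds, pop_size=20, generations=30, seed=0):
    """Maximize `fitness` over real-valued genes constrained to `bounds`."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: max(2, pop_size // 2)]  # truncation selection
        children = [ranked[0][:]]                  # elitism: keep the best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            w = rng.random()                       # arithmetic crossover weight
            child = []
            for x, y, (lo, hi) in zip(a, b, bounds):
                gene = w * x + (1.0 - w) * y + rng.gauss(0.0, 0.05 * (hi - lo))
                child.append(min(hi, max(lo, gene)))  # clip to the bounds
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    best, history = real_coded_ga(toy_power, [(0.0, 1.0), (0.0, 1.0)])
    print(best, history[-1])
```

In a parallel setting the fitness evaluations within one generation are independent, so they are the natural unit of work to distribute when each evaluation is a full particle simulation.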

19.
Based on the Weierstrass elliptic function equation, a new Weierstrass semi-rational expansion method and its algorithm are presented. The main idea of the method is to reduce the problem of solving soliton equations to that of solving a corresponding set of nonlinear algebraic equations. With the aid of Maple, we choose the modified KdV equation, the (2+1)-dimensional KP equation, and the (3+1)-dimensional Jimbo-Miwa equation to illustrate our algorithm. As a consequence, many types of new doubly periodic solutions are obtained in terms of the Weierstrass elliptic function. Moreover, the corresponding new Jacobi elliptic function solutions and solitary wave solutions are also presented as simple limits of the doubly periodic solutions.

20.
王羽  欧阳洁  杨斌鑫 《物理学报》2010,59(10):6757-6763
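The unsteady Poiseuille flow of a fractional Oldroyd-B viscoelastic fluid between two parallel plates is studied using the Stehfest algorithm for numerical inversion of the Laplace transform. First, the validity of the Stehfest algorithm is verified by comparing the numerical solution with an approximate analytical solution. Second, the Stehfest algorithm is applied to plane Poiseuille flow, revealing velocity overshoot and stress overshoot in fractional viscoelastic plate flow and showing that these phenomena depend markedly on the order of the fractional derivative. The numerical results also show that the integer-order constitutive equation is merely a special case of the fractional-order one, which is more widely applicable.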
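For readers unfamiliar with the Stehfest algorithm, the sketch below shows the standard Gaver-Stehfest formula, f(t) ≈ (ln 2 / t) Σ V_k F(k ln 2 / t) for k = 1..N with N even. It is a generic illustration checked on the textbook pair F(s) = 1/(s + 1), f(t) = exp(-t); it is not the fractional Oldroyd-B transform from the paper, and the choice N = 12 is an assumption that typically suits double precision.

```python
from math import exp, factorial, log


def stehfest_coefficients(n: int) -> list:
    """Return the Stehfest weights V_1..V_n for an even number of terms n."""
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even integer")
    half = n // 2
    coeffs = []
    for k in range(1, n + 1):
        total = 0.0
        for j in range((k + 1) // 2, min(k, half) + 1):
            numerator = j ** half * factorial(2 * j)
            denominator = (factorial(half - j) * factorial(j) * factorial(j - 1)
                           * factorial(k - j) * factorial(2 * j - k))
            total += numerator / denominator
        coeffs.append((-1) ** (k + half) * total)
    return coeffs


def stehfest_invert(transform, t: float, n: int = 12) -> float:
    """Approximate f(t) from its Laplace transform F(s), sampled on the real axis."""
    if t <= 0:
        raise ValueError("t must be positive")
    a = log(2.0) / t
    weights = stehfest_coefficients(n)
    return a * sum(v * transform(k * a) for k, v in enumerate(weights, start=1))


if __name__ == "__main__":
    approx = stehfest_invert(lambda s: 1.0 / (s + 1.0), 1.0)
    print(approx, exp(-1.0))
```

The weights alternate in sign and grow quickly with N, so increasing N beyond what the floating-point precision supports makes the result worse rather than better; this is why the method is usually validated against a known solution first, as the abstract describes.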
