Similar literature
 Found 20 similar articles; search took 15 ms
1.
We present a mathematical framework for constructing and analyzing parallel algorithms for lattice kinetic Monte Carlo (KMC) simulations. The resulting algorithms have the capacity to simulate a wide range of spatio-temporal scales in spatially distributed, non-equilibrium physicochemical processes with complex chemistry and transport micro-mechanisms. Rather than focusing on constructing the stochastic trajectories exactly, our approach relies on approximating the evolution of observables, such as density, coverage and correlations. More specifically, we develop a spatial domain decomposition of the Markov operator (generator) that describes the evolution of all observables according to the kinetic Monte Carlo algorithm. This domain decomposition corresponds to a decomposition of the Markov generator into a hierarchy of operators and can be tailored to specific hierarchical parallel architectures such as multi-core processors or clusters of Graphics Processing Units (GPUs). Based on this operator decomposition, we formulate parallel fractional-step kinetic Monte Carlo algorithms by employing the Trotter Theorem and its randomized variants; these schemes (a) are partially asynchronous on each fractional-step time window, and (b) are characterized by their communication schedule between processors. The proposed mathematical framework allows us to rigorously justify the numerical and statistical consistency of the proposed algorithms, showing the convergence of our approximating schemes to the original serial KMC. The approach also provides a systematic evaluation of different processor communication schedules. We carry out a detailed benchmarking of the parallel KMC schemes using available exact solutions, for example in Ising-type systems, and we demonstrate the capabilities of the method to simulate complex spatially distributed reactions at very large scales on GPUs. Finally, we discuss workload balancing between processors and propose a re-balancing scheme based on probabilistic mass transport methods.
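For orientation, the Trotter product formula on which fractional-step schemes of this kind rest can be written as below. This is the standard two-operator form only; the symbols \mathcal{L}_1, \mathcal{L}_2 and \Delta t are notation chosen here for illustration, and the abstract's actual hierarchy of operators and its randomized variants are not reproduced.

```latex
% Assumption: the generator splits as \mathcal{L} = \mathcal{L}_1 + \mathcal{L}_2
e^{t(\mathcal{L}_1+\mathcal{L}_2)}
  = \lim_{n\to\infty}\left(e^{\frac{t}{n}\mathcal{L}_1}\,e^{\frac{t}{n}\mathcal{L}_2}\right)^{n},
\qquad
e^{\Delta t\,\mathcal{L}} \approx e^{\Delta t\,\mathcal{L}_1}\,e^{\Delta t\,\mathcal{L}_2}
\quad\text{(one fractional step, local error } O(\Delta t^{2})\text{)}.
```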

2.
A new compact scheme is presented for computing wave propagation problems and the Navier–Stokes equations. A combined compact difference scheme is developed for non-periodic problems (called NCCD henceforth) that simultaneously evaluates first and second derivatives, improving an existing combined compact difference (CCD) scheme. Following the methodologies in Sengupta et al. [T.K. Sengupta, S.K. Sircar, A. Dipankar, High accuracy schemes for DNS and acoustics, J. Sci. Comput. 26 (2) (2006) 151–193], stability and dispersion relation preservation (DRP) property analysis is performed here for general CCD schemes for the first time, emphasizing their utility in uni- and bi-directional wave propagation problems, which are relevant to acoustic wave propagation. We highlight: (a) specific points in parameter space that give rise to the least phase and dispersion errors for non-periodic wave problems; (b) the solution error of CCD/NCCD schemes in solving the Stommel ocean model (an elliptic p.d.e.); and (c) the effectiveness of the NCCD scheme in solving the Navier–Stokes equations for the benchmark lid-driven cavity problem at high Reynolds numbers, showing that the present method is capable of providing very accurate solutions using far fewer points than existing solutions in the literature.

3.
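Parallel Computation of Molecular Dynamics Based on a "Block-Cell" Data Structure   Total citations: 5, self-citations: 0, citations by others: 5
A scalable parallel algorithm based on a "block-cell" data structure is developed for large-scale, non-uniform molecular dynamics simulations. It uses a space-filling curve to convert the three-dimensional domain decomposition into a one-dimensional load-balancing problem, which is then solved with a measurement-based multilevel averaging weight method to keep the load balanced across processors. On 500 CPUs of an MPP parallel machine, simulating a three-dimensional metal micro-jet model containing 2.1×10^8 particles, the algorithm achieves a speedup of 420.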
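The idea of turning a three-dimensional decomposition into a one-dimensional load-balancing problem can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which space-filling curve is used, so a Morton (Z-order) curve is assumed here, and a simple prefix-sum cut stands in for the measurement-based multilevel method. The names `morton_key` and `partition_cells` are hypothetical.

```python
from itertools import accumulate


def morton_key(ix, iy, iz, bits=10):
    """Interleave the bits of three non-negative cell indices (Z-order key).

    Assumption: a Morton curve; the abstract does not name its curve.
    """
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def partition_cells(cells, weights, nprocs):
    """Assign each cell to a processor.

    Cells are ordered along the curve, then the resulting 1D sequence of
    weights is cut into nprocs contiguous pieces of roughly equal total
    weight. Returns a list owner[i] giving the processor of cells[i].
    """
    order = sorted(range(len(cells)), key=lambda i: morton_key(*cells[i]))
    prefix = list(accumulate(weights[i] for i in order))
    total = prefix[-1]
    owner = [0] * len(cells)
    for pos, i in enumerate(order):
        # The midpoint of this cell's weight interval decides its processor.
        mid = prefix[pos] - 0.5 * weights[i]
        owner[i] = min(nprocs - 1, int(mid * nprocs / total))
    return owner


if __name__ == "__main__":
    grid = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
    owners = partition_cells(grid, [1.0] * len(grid), 4)
    print([owners.count(p) for p in range(4)])
```

In a real simulation the weights would be measured per-cell costs (for example particle counts or timings), so that dense regions are split across more processors than sparse ones.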

4.
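A Parallel Adaptive Finite Element Algorithm for Numerical Simulation of Flows   Total citations: 1, self-citations: 0, citations by others: 1
周春华 (Zhou Chunhua), 《计算物理》 (Chinese Journal of Computational Physics), 2006, 23(4): 412-418
A parallel mesh-adaptive finite element algorithm based on error estimation is presented for numerical flow simulation. First, using the local a posteriori error estimates obtained on the initial mesh as weights, the initial mesh is partitioned by recursive spectral bisection so that the total error in each subdomain is approximately equal, which addresses load balancing. The mesh within each subdomain is then adapted independently, with the error values as the criterion. Finally, a mortar-element-based domain decomposition method is applied to solve the N-S equations on the non-matching meshes. The error estimation for the finite element solution of the N-S equations under domain decomposition extends the error estimation method for the generalized Stokes problem. To verify the reliability of the method, numerical results for classical incompressible flow test cases are given.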

5.
In this paper, a lattice Boltzmann (LB) scheme for convection diffusion on irregular lattices is presented, which is free of any interpolation or coarse-graining step. The scheme is derived using the axiom that the velocity moments of the equilibrium distribution equal those of the Maxwell–Boltzmann distribution. The axiom holds for both Bravais and irregular lattices, implying a single framework for LB schemes for all lattice types. By solving benchmark problems we have shown that the scheme is indeed consistent with convection diffusion. Furthermore, we have compared the performance of the LB schemes with that of finite difference and finite element schemes. The comparison shows that the LB scheme performs similarly to the one-step second-order Lax–Wendroff scheme: it has little numerical diffusion, but a slight dispersion error. By changing the relaxation parameter ω, the dispersion error can be balanced by a small increase in the numerical diffusion.

6.
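CHAP3D is a general-purpose Lagrangian elastic-plastic hydrodynamics analysis code developed independently by the Institute of Applied Physics and Computational Mathematics in Beijing. This article describes two parallel contact algorithms used in CHAP3D, both designed for multiprocessor clusters and based on static dual domain decomposition. The first assigns each complete pair of contact surfaces to a single processor and establishes communication between the computational domain and the contact domain. Its advantage is simplicity: a processor that holds contact surfaces can directly use the serial contact search algorithm and the serial contact force coupling algorithm. The second is based on master-surface partitioning: the master surface regions of all contact surfaces are decomposed across all processors. Two kinds of communication must be established, one between the computational domain and the contact domain and one among the processors within the contact domain. This second algorithm is load balanced and has good parallel efficiency and scalability. Numerical examples show that both parallel contact algorithms can simulate many different types of contact problems well.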

7.
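Computational methods for the Boltzmann model equation are studied for three-dimensional flows in all flow regimes, and a gas-kinetic coupled iterative numerical scheme that solves directly for the molecular velocity distribution function is established. Based on an analysis of variable dependencies, data communication and parallel scalability, a domain decomposition parallelization strategy is used to build a parallel scheme for the gas-kinetic numerical algorithm, yielding a gas-kinetic parallel algorithm for three-dimensional flows past bodies in all flow regimes. Test cases of three-dimensional flows past a sphere and a reentry capsule, at high and low Mach numbers and in different flow regimes, are set up and computed with large-scale parallel High Performance Fortran (HPF). The results are compared with relevant experimental data and theoretical predictions to reveal complex flow phenomena and flow mechanisms in the different regimes. The study shows that the developed gas-kinetic parallel algorithm has very good parallel independence, essentially achieves linear speedup, and exhibits good parallel scalability.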

8.
A single-parameter family of self-adjoint compact difference (SACD) schemes is developed for discretizing the Laplacian operator in self-adjoint form. The developed implicit scheme is formally second-order accurate (with respect to truncation error) with a free parameter α, which helps control the numerical properties in the spectral plane. The SACD scheme is analyzed in the spectral plane for its resolution properties for both periodic and non-periodic problems using matrix spectral analysis [T.K. Sengupta, G. Ganeriwal, S. De, Analysis of central and upwind schemes, J. Comput. Phys. 192 (2) (2003) 677–694]. The major objective here is to identify the advantages of the new scheme over the traditional explicit second-order CD2 scheme in discretizing the Laplacian operator in self-adjoint form. This operator appears in the Navier–Stokes equations and in other transport equations and boundary value problems (bvp) expressed in orthogonal co-ordinate systems, either in the physical or in the transformed plane. We also compare the developed method with higher order compact schemes for non-uniform grids. To demonstrate the accuracy of the SACD scheme we have tested it for: (i) a bi-directional wave propagation problem, given by the second-order wave equation, and (ii) an elliptic bvp, as in the Stommel ocean model for the stream function. These examples help infer the properties of the SACD scheme when solving different types of partial differential equations. Most importantly, the effects of grid stretching and of the choice of the free parameter (α) are investigated here. We also compare the present implicit compact method with the explicit compact method known as the higher order compact (HOC) method. In addition, the practical applications of the SACD scheme are explored by solving the Navier–Stokes equations for the cases of: (a) a flow inside a lid-driven cavity and (b) the receptivity and instability of an external adverse pressure gradient flow over a flat plate. In the former, the unsteadiness of the flow is captured, and in the latter, the receptivity of the flow is studied in causing flow instability by triggering Tollmien–Schlichting waves. The new scheme shows a marked improvement over the explicit scheme for low Reynolds number steady flow in the lid-driven cavity. For the flow in the same geometry at higher Reynolds numbers, the efficacy of the scheme is established by showing the formation of a triangular vortex and secondary vortical structures. The presented scheme is capable of expressing the diffusion operator accurately, as shown by the capturing of instability waves inside the shear layer.

9.
A class of high-order compact (HOC) exponential finite difference (FD) methods is proposed for solving one- and two-dimensional steady-state convection–diffusion problems. The newly proposed HOC exponential FD schemes are non-oscillatory, yield highly accurate approximate solutions, and are suitable for convection-dominated problems. The O(h^4) compact exponential FD schemes developed for the one-dimensional (1D) problems produce a diagonally dominant tridiagonal system of equations, which can be solved with the tridiagonal Thomas algorithm. For the two-dimensional (2D) problems, O(h^4 + k^4) compact exponential FD schemes are formulated on the nine-point 2D stencil, and the line iterative approach with the alternating direction implicit (ADI) procedure leads to diagonally dominant tridiagonal matrix equations, which can be solved with the one-dimensional tridiagonal Thomas algorithm at a considerable saving in computing time. To validate the present HOC exponential FD methods, three linear and nonlinear problems, mostly with boundary or internal layers where sharp gradients may appear due to high Peclet or Reynolds numbers, are solved numerically. Comparisons are made between analytical solutions and numerical results for the currently proposed HOC exponential FD methods and some previously published HOC methods. The present HOC exponential FD methods produce excellent results for all test problems. It is shown that, besides excellent computational accuracy, efficiency and stability, the present method has the advantage of better scale resolution. The method developed in this article is easy to implement and has been applied to obtain numerical solutions of the lid-driven cavity flow problem governed by the 2D incompressible Navier–Stokes equations using the stream function-vorticity formulation.
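The Thomas algorithm mentioned above is the standard forward-elimination and back-substitution solver for tridiagonal systems. A minimal sketch follows; the function name `thomas_solve` and its argument layout are choices made here, not taken from the article, and the sketch does no pivoting, which is acceptable for the diagonally dominant systems the abstract describes.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system A x = rhs without pivoting.

    Row i of A reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1].
    lower[0] and upper[-1] are ignored. All four lists have length n.
    Assumes the matrix is diagonally dominant, so no pivot is zero.
    """
    n = len(diag)
    c = [0.0] * n  # modified upper-diagonal coefficients
    d = [0.0] * n  # modified right-hand side

    # Forward elimination.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # Model system with rows (-1, 2, -1); the exact solution is [1, 2, 3, 4].
    sol = thomas_solve(
        [0.0, -1.0, -1.0, -1.0],
        [2.0, 2.0, 2.0, 2.0],
        [-1.0, -1.0, -1.0, 0.0],
        [0.0, 0.0, 0.0, 5.0],
    )
    print(sol)
```

The cost is linear in the number of unknowns, which is why an ADI line sweep that reduces a 2D problem to a sequence of such solves is cheap per iteration.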

10.
A reactivity computation consists of computing the highest eigenvalue of a generalized eigenvalue problem, for which an inverse power algorithm is commonly used. Very fine models are difficult to treat with our sequential solver, which is based on the simplified transport equations, in terms of memory consumption and computational time. A first implementation of a Lagrangian-based domain decomposition method leads to poor parallel efficiency because of an increase in the number of power iterations [1]. In order to obtain a high parallel efficiency, we improve the parallelization scheme by changing the location of the loop over the subdomains in the overall algorithm and by benefiting from the characteristics of the Raviart–Thomas finite element. The new parallel algorithm still allows us to locally adapt the numerical scheme (mesh, finite element order). However, it can be significantly optimized for the matching grid case. The good behavior of the new parallelization scheme is demonstrated for the matching grid case on several hundred nodes for computations based on a pin-by-pin discretization.

11.
We address the failure in scalability of large-scale parallel simulations that are based on (semi-)implicit time-stepping and hence on the solution of linear systems on thousands of processors. We develop a general algorithmic framework based on domain decomposition that removes the scalability limitations and leads to optimal allocation of available computational resources. It is a non-intrusive approach, as it does not require modification of existing codes. Specifically, we present here a two-stage domain decomposition method for the Navier–Stokes equations that combines features of discontinuous and continuous Galerkin formulations. At the first stage the domain is subdivided into overlapping patches, and within each patch a C^0 spectral element discretization (second stage) is employed. The solution within each patch is obtained separately by applying an efficient parallel solver. Proper inter-patch boundary conditions are developed to provide solution continuity, while a Multilevel Communicating Interface (MCI) is developed to provide efficient communication between the non-overlapping groups of processors of each patch. The overall strong scaling of the method depends on the number of patches and on the scalability of the standard solver within each patch. This dual path to scalability provides great flexibility in balancing accuracy with parallel efficiency. The accuracy of the method has been evaluated in solutions of steady and unsteady 3D flow problems, including blood flow in the human intracranial arterial tree. Benchmarks on BlueGene/P, CRAY XT5 and Sun Constellation Linux Cluster have demonstrated good performance on up to 96,000 cores, solving up to 8.21B degrees of freedom in an unsteady flow problem. The proposed method is general and can potentially be used with other discretization methods or in other applications.

12.
A parallel approach to solving three-dimensional viscous incompressible fluid flow problems using discontinuous pressure finite elements and a Lagrange multiplier technique is presented. The strategy is based on non-overlapping domain decomposition methods, and Lagrange multipliers are used to enforce continuity at the boundaries between subdomains. The novelty of the work is the coupled approach for solving the velocity–pressure–Lagrange multiplier algebraic system of the discrete Navier–Stokes equations by a distributed-memory parallel ILU(0) preconditioned Krylov method. A penalty function on the interface constraint equations is introduced to avoid the failure of the ILU factorization algorithm. To ensure portability of the code, a message-passing distributed-memory model with MPI is employed. The method has been tested on different benchmark cases, such as the lid-driven cavity and pipe flow, with unstructured tetrahedral grids. It is found that the partition algorithm and the order of the physical variables are central to parallelization performance. A speed-up in the range of 5–13 is obtained with 16 processors. Finally, the algorithm is tested on an industrial case using up to 128 processors. Compared with the literature, the speed-ups obtained on distributed and shared memory computers are found to be very competitive.

13.
We develop a parallel Jacobi–Davidson approach for finding a partial set of eigenpairs of large sparse polynomial eigenvalue problems with application in quantum dot simulation. A Jacobi–Davidson eigenvalue solver is implemented based on the Portable, Extensible Toolkit for Scientific Computation (PETSc). The eigensolver thus inherits PETSc’s efficient and varied parallel operations, linear solvers and preconditioning schemes, as well as its ease of use. The parallel eigenvalue solver is then used to solve higher degree polynomial eigenvalue problems arising in numerical simulations of three-dimensional quantum dots governed by Schrödinger’s equations. We find that the parallel restricted additive Schwarz preconditioner in conjunction with a parallel Krylov subspace method (e.g. GMRES) can solve the correction equations, the most costly step in the Jacobi–Davidson algorithm, very efficiently in parallel. The overall performance is also quite satisfactory. We have observed near perfect superlinear speedup by using up to 320 processors. The parallel eigensolver can find all target interior eigenpairs of a quintic polynomial eigenvalue problem with more than 32 million variables within 12 minutes by using 272 Intel 3.0 GHz processors.

14.
A new numerical scheme is proposed for solving Hamilton’s equations that possesses the property of symplecticity. Just as in all symplectic schemes known to date, in this scheme the conservation laws of momentum and angular momentum are satisfied exactly. A property that distinguishes this scheme from known schemes is proved: in the new scheme, the energy conservation law is satisfied for a system of linear oscillators. The new numerical scheme is implicit and has third-order accuracy with respect to the integration step. An algorithm is presented by which the accuracy of the scheme can be increased to the fifth and higher orders. Exact and numerical solutions to the two-body problem, calculated by known schemes and by the scheme proposed here, are compared.

15.
Gas combustion, solid combustion and frontal polymerization are characterized by stiff fronts that propagate with nonlinear dynamics. The multiple-scale phenomena under consideration lead to very intense computations that require parallel computing in order to reduce the elapsed time of the computation. We develop a methodology to build, on the MIMD architecture, a parallel numerical method based on the property of the solution, i.e. a stiff quasi-planar two-dimensional combustion front. We illustrate our methodology using two models of the combustion process. The first is a thermo-diffusive model of a two-step chemical reaction exhibiting two transition layers. The second is a thermo-diffusive model of a one-step chemical reaction coupled with a hydrodynamical model using the stream function–vorticity formulation of the Navier–Stokes equations written in the Boussinesq approximation. This methodology makes use of efficient domain decomposition methods, combined with asymptotic analytical qualitative results to adapt the interface position, to resolve the transition layer(s) of the solution accurately, and of operator splitting to take advantage of the quasi-planar property of the frontal process. It then provides three complementary levels of parallelism. The first level is based on the domain decomposition and is thus a priori limited to the number of transition layers in the problem. The second is based on an explicit parallelism in the direction orthogonal to the front propagation. The third is based on the spread of equations over subnetworks of processors. The parallel implementation using the message passing library concept on the Paragon and iPSC860 MIMD computers is discussed. An efficient parallel algorithm to solve the space-periodic stream function in the second model, based on Fourier mode decomposition combined with the first and second levels of parallelism, is provided. The direct numerical simulation provided by our numerical method allows us to explore the physical parameter space of the combustion process in order to understand the mechanism of instabilities. Some examples of hydrodynamical and thermal instabilities are given.

16.
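A Parallel Iterative Algorithm for Three-Dimensional Transport Problems Based on Geometric Domain Decomposition   Total citations: 1, self-citations: 1, citations by others: 0
For the implicit difference equations of transport in three-dimensional Cartesian coordinates, a parallel iterative algorithm based on geometric domain decomposition is studied, and error estimates for the serial and parallel iterations are given. The related numerical results are analyzed and compared.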

17.
A method for computing the numerical solution of Vlasov-type equations on massively parallel computers is presented. In contrast with Particle-In-Cell methods, which are known to be noisy, the method is based on a semi-Lagrangian algorithm that approximates the Vlasov equation on a grid of phase space. As this kind of method requires a huge computational effort, the simulations are carried out on parallel machines. To that purpose, we present a local cubic spline interpolation method based on a domain decomposition, with each subdomain devoted to a processor. Hermite boundary conditions between the domains, using ad hoc reconstruction of the derivatives, provide a good approximation of the global solution. The method is applied to various physical configurations, which demonstrate the capability of the numerical scheme.

18.
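李长峰 (Li Changfeng), 袁益让 (Yuan Yirang), 《计算物理》 (Chinese Journal of Computational Physics), 2007, 24(2): 239-246
An efficient domain decomposition difference scheme for parabolic equations is presented, which improves computational efficiency. A second-order upwind difference scheme is used for the first-order term, and explicit and implicit difference schemes are used at the interior boundary points and within the subdomains, respectively. Under a relatively weak stability condition, an error estimate in the discrete l^2 norm is obtained. Finally, a numerical example is given to verify the practicality of the method.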

19.
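Parallel Domain Decomposition Methods for the Neutron Transport Equation in Two-Dimensional Cylindrical Geometry   Total citations: 1, self-citations: 1, citations by others: 0
The influence of different domain decomposition methods and priority insertion algorithms on the parallel efficiency of the Sn discontinuous finite element equations for the neutron transport equation in two-dimensional cylindrical geometry is analyzed. A square domain decomposition method based on the minimum surface-to-volume ratio and a priority insertion algorithm along the radial direction are given, and a new algorithm is formed by combining the two. The new algorithm is better suited to parallel computation with the Sn discontinuous finite element method for the transport equation in two-dimensional cylindrical geometry. Numerical experiments show that, on a large domestically built parallel machine with relatively high communication latency, the new algorithm still achieves good parallel performance on several hundred CPUs and scales better than existing methods.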

20.
Many complex fluid motions are driven by physical processes of instability, transition and turbulence that depend on nonlinear mechanisms. Here, we solve the flow past cylinder(s) using single-block structured and overset grids by solving the Navier–Stokes equations in two dimensions. The suitability of a compact scheme for discretizing the convection and diffusion terms is investigated first by looking at the relevant numerical properties. Also, for the overset grid method, we identify the method that gives the best results in minimizing interpolation error at sub-domain boundaries for an analytical test function. We provide extensive comparisons with experimental and other computational results for flow past a single cylinder, utilizing both single-block structured and Chimera (overset) grids. Apart from showing the instability of this flow as calculated by these methods, we also compare the vorticity and velocity data computed on these two grids by employing proper orthogonal decomposition (POD). We have analyzed and developed an overset grid method with a compact scheme that does not need any filtering to control error. This has been ascertained by performing POD analysis. To show that the developed method is capable of handling complex geometries, we have computed the flow past two cylinders in a side-by-side arrangement. The results capture the known flow characteristics for this arrangement well using relatively few grid points.
