Found 20 similar documents (search time: 140 ms)
1.
2.
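A finite element parallel algorithm for the unsteady Navier-Stokes equations based on fully overlapping domain decomposition  Total citations: 1, self-citations: 0, citations by others: 1
Based on a fully overlapping domain decomposition technique, three parallel finite element algorithms are proposed for solving the unsteady Navier-Stokes equations. The basic idea is to first apply a fully overlapping domain decomposition in space; each processor then independently and in parallel solves the resulting ordinary differential equations in time t with the backward Euler scheme. The nonlinear convection term is treated with a semi-implicit scheme and with a fully implicit scheme, respectively. The subproblem assigned to each processor is a global problem defined on the entire solution domain, but the vast majority of its degrees of freedom come from the subdomain that the processor is responsible for, which makes the algorithms simple to implement and low in communication requirements. Numerical examples verify the effectiveness of the algorithms and their good parallel performance.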
3.
4.
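For the oblique frictional contact problem of hydrogels used in soft robots, a numerical contact mechanics model is established, and nonlinear behaviors such as large local contact deformation and friction effects during oblique contact of the soft hydrogel material are analyzed. Based on a hyperelastic constitutive model, an updated free energy function for the hydrogel is derived. A contact computation strategy is given, and two examples are computed numerically: a rigid spherical indenter in normal contact with the hydrogel and in oblique contact with the hydrogel. The applicability of classical Hertz contact theory and the influence of different friction coefficients on the stress distribution in the contact zone and on the contact state are discussed. The results show that the material nonlinearity of the hydrogel and the geometric nonlinearity caused by large deformation make classical Hertz contact theory no longer applicable. In oblique contact, an increase in the friction coefficient leads to a redistribution of stress inside the hydrogel: the location of maximum stress shifts from below the contact surface to the contact surface itself, and two main high-stress regions appear, one inside the hydrogel and one on its surface. In addition, the study finds that when the friction coefficient is small (μ<0.05), all contact points in normal contact are in the limit state between static and sliding friction, whereas in oblique contact part of the contact surface always remains in a stable static friction state.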
5.
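Computing the temperature field and infrared radiant exitance of a tank by combining a domain decomposition algorithm with the Monte Carlo method  Total citations: 3, self-citations: 0, citations by others: 3
This paper combines a domain decomposition algorithm with the Monte Carlo method to compute the temperature field and infrared radiant exitance of a tank. Using the Monte Carlo method to compute the radiative transfer coefficients makes it possible to account for complex radiative properties of surfaces, such as specular reflection, anisotropic emission, and anisotropic reflection, and to account directly for complex geometric features of surfaces, such as mutual occlusion and the projected area in the direction of solar incidence. Introducing radiative transfer coefficients isolates the difficult part of the computation: in the time domain (computation steps) the whole problem can be decomposed into several subproblems that are processed in parallel, and in the space domain (computational region) the tank is decomposed into several subregions, which reduces the problem size and allows parallel computation on multiple processors. It also reduces the difficulty of programming the Monte Carlo method and shortens the computation time.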
6.
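Three-dimensional electromagnetic particle simulation is based on the finite-difference time-domain (FDTD) algorithm and the particle-in-cell (PIC) method. Following the characteristics of FDTD and PIC, the parallel algorithm was designed around one basic idea: the whole simulation region is divided into multiple subregions, each computing process simulates one subregion, and subregion boundary data are exchanged by message passing. The factors affecting parallel speedup were analyzed. The algorithm was implemented in the 3D electromagnetic particle simulation software CHIPIC3D and its correctness was verified. Finally, the parallel version of CHIPIC3D was used to simulate two typical high-power microwave source devices, a magnetically insulated line oscillator and a relativistic klystron, demonstrating that the parallel algorithm can achieve...
Keywords:
electromagnetic particle simulation
finite-difference time-domain
parallel computing
high-power microwave source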
7.
8.
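A one-dimensional domain decomposition parallelization method for incomplete factorization preconditioners of block tridiagonal linear systems  Total citations: 1, self-citations: 0, citations by others: 1
For block tridiagonal linear systems, incomplete factorization is one of the most effective preconditioners, but it is inherently a serial computation and is hard to parallelize efficiently. Based on a one-dimensional overlapping domain decomposition, the upper and lower triangular factors obtained from local incomplete factorizations are each combined separately to construct a class of global parallel incomplete-factorization-type preconditioners. Two concrete implementation approaches are given, one of which is based on exchanging the components corresponding to all overlapping parts. Then, after a careful analysis of the computation involved, an implementation is given that requires communicating only the components on a single grid line, which greatly reduces the communication volume; moreover, the communication does not grow as the overlap increases. This parallelization method can be applied to any incomplete-factorization-type preconditioner for block tridiagonal linear systems. Experimental results show that the proposed parallelization method is generally superior to the additive Schwarz parallelization method.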
9.
10.
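Based on two-grid discretization and domain decomposition techniques, three parallel finite element algorithms are proposed for solving the unsteady Navier-Stokes equations. The basic idea is, at each time step, to solve the nonlinear problem on the coarse grid by Oseen iteration and then to solve Oseen, Newton, and Stokes linear problems, respectively, in parallel on the fine grid to correct the coarse-grid solution. The spatial variables are discretized by finite elements and the time variable by the backward Euler scheme. Numerical experiments verify the effectiveness of the algorithms.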
11.
Over the last few decades, the computational demands of massive particle-based simulations for both scientific and industrial purposes have been continuously increasing. Hence, considerable efforts are being made to develop parallel computing techniques on various platforms. In such simulations, particles move freely within a given space, so on a distributed-memory system, load balancing, i.e., assigning an equal number of particles to each processor, is not guaranteed. Shared-memory systems, by contrast, achieve better load balancing for particle models, but suffer from the intrinsic drawback of memory access competition, particularly during (1) pairing of contact candidates from among neighboring particles and (2) force summation for each particle. Here, novel algorithms are proposed to overcome these two problems. For the first problem, the key is a pre-conditioning process during which particle labels are sorted by the label of the cell in the domain to which the particles belong. Then, a list of contact candidates is constructed by pairing the sorted particle labels. For the second problem, a table comprising the list indexes of the contact candidate pairs is created and used to sum the contact forces acting on each particle for all contacts according to Newton’s third law. With just these methods, memory access competition is avoided without additional redundant procedures. The parallel efficiency and compatibility of these two algorithms were evaluated in discrete element method (DEM) simulations on four types of shared-memory parallel computers: a multicore multiprocessor computer, a scalar supercomputer, a vector supercomputer, and a graphics processing unit. The computational efficiency of a DEM code was found to be drastically improved with our algorithms on all but the scalar supercomputer. Thus, the developed parallel algorithms are useful on shared-memory parallel computers with sufficient memory bandwidth.
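The two steps described in the abstract above (sorting particle labels by cell label, then summing forces through a per-particle table of pair indexes) can be illustrated with a minimal serial sketch. This is not the authors' implementation: the function names `build_contact_candidates` and `sum_forces`, the 2D setting, and the uniform square cells are all assumptions made for illustration, and no actual threading is shown.

```python
def build_contact_candidates(positions, cell_size):
    """Return sorted (i, j) pairs, i < j, of particles in the same or adjacent 2D cells."""
    # Cell label of each particle (hypothetical uniform square cells).
    cells = [(int(x // cell_size), int(y // cell_size)) for x, y in positions]
    # Sort particle labels by cell label so each cell owns a contiguous range.
    order = sorted(range(len(positions)), key=lambda i: cells[i])
    start, end = {}, {}
    for k, i in enumerate(order):
        start.setdefault(cells[i], k)
        end[cells[i]] = k + 1
    # Pair every particle with the particles of its own and neighbouring cells.
    pairs = []
    for i in order:
        cx, cy = cells[i]
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                c = (cx + dx, cy + dy)
                if c not in start:
                    continue
                for k in range(start[c], end[c]):
                    j = order[k]
                    if i < j:  # keep each pair once
                        pairs.append((i, j))
    return sorted(pairs)


def sum_forces(n, pairs, pair_forces):
    """Sum per-pair forces onto particles; pair_forces[k] is the force on pairs[k][0]."""
    # table[p] lists (pair index, sign); the sign applies Newton's third law.
    table = [[] for _ in range(n)]
    for k, (i, j) in enumerate(pairs):
        table[i].append((k, 1.0))
        table[j].append((k, -1.0))
    # Each particle only reads pair forces and writes its own total,
    # so no two particles ever write to the same memory location.
    totals = []
    for p in range(n):
        fx = sum(s * pair_forces[k][0] for k, s in table[p])
        fy = sum(s * pair_forces[k][1] for k, s in table[p])
        totals.append((fx, fy))
    return totals
```

In a shared-memory setting, the loop over `p` in `sum_forces` is the part that could be distributed across threads, since each iteration writes only its own entry.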
12.
13.
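This paper introduces some of the parallel computing methods used in the NEPTUNE software. A two-level "block / grid patch" parallel domain decomposition method allows the computation to scale to thousands of processor cores. Structured grids are generated in parallel with an adaptive technique based on the complex geometric features, and invalid grid cells are removed from the original regular region, which greatly reduces storage and parallel execution time. Building on the classical Boris and SOR iterative methods, and using red-black ordering and geometric constraints, a parallel method for solving the Poisson equation on irregular regions is proposed. With these methods, a parallel efficiency of 51.8% is obtained on 1,024 processor cores when NEPTUNE is used to simulate a MILO device.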
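As a rough illustration of the red-black ordering mentioned in the abstract above, the sketch below applies red-black SOR to the 5-point discretization of the Poisson equation -∇²u = f with zero Dirichlet boundary values. It is not NEPTUNE's method: it assumes a regular square grid (the irregular regions and geometric constraints of the abstract are not modelled), and the function name `red_black_sor` and its parameters are hypothetical.

```python
def red_black_sor(f, h, omega=1.5, sweeps=200):
    """Red-black SOR for -laplace(u) = f on a square n x n grid, u = 0 on the boundary."""
    n = len(f)
    u = [[0.0] * n for _ in range(n)]
    for _ in range(sweeps):
        # Colour 0 ("red") points have i + j even, colour 1 ("black") odd.
        # Points of one colour depend only on points of the other colour,
        # so every update within a colour could be done in parallel.
        for colour in (0, 1):
            for i in range(1, n - 1):
                for j in range(1, n - 1):
                    if (i + j) % 2 != colour:
                        continue
                    # Gauss-Seidel value from the 5-point stencil.
                    gs = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                 + u[i][j - 1] + u[i][j + 1]
                                 + h * h * f[i][j])
                    # Over-relaxation step.
                    u[i][j] += omega * (gs - u[i][j])
    return u
```

For example, calling `red_black_sor(f, h)` with `f` a 9 x 9 list of ones and `h = 1/8` approximates the solution of -∇²u = 1 on the unit square. The relaxation factor `omega` and the fixed sweep count are illustrative choices; a production solver would normally stop on a residual tolerance.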
14.
Giorgos Arampatzis Markos A. Katsoulakis Petr Plecháč Michela Taufer Lifan Xu 《Journal of computational physics》2012,231(23):7795-7814
We present a mathematical framework for constructing and analyzing parallel algorithms for lattice kinetic Monte Carlo (KMC) simulations. The resulting algorithms have the capacity to simulate a wide range of spatio-temporal scales in spatially distributed, non-equilibrium physiochemical processes with complex chemistry and transport micro-mechanisms. Rather than focusing on constructing exactly the stochastic trajectories, our approach relies on approximating the evolution of observables, such as density, coverage, correlations and so on. More specifically, we develop a spatial domain decomposition of the Markov operator (generator) that describes the evolution of all observables according to the kinetic Monte Carlo algorithm. This domain decomposition corresponds to a decomposition of the Markov generator into a hierarchy of operators and can be tailored to specific hierarchical parallel architectures such as multi-core processors or clusters of Graphical Processing Units (GPUs). Based on this operator decomposition, we formulate parallel Fractional step kinetic Monte Carlo algorithms by employing the Trotter Theorem and its randomized variants; these schemes (a) are partially asynchronous on each fractional step time-window, and (b) are characterized by their communication schedule between processors. The proposed mathematical framework allows us to rigorously justify the numerical and statistical consistency of the proposed algorithms, showing the convergence of our approximating schemes to the original serial KMC. The approach also provides a systematic evaluation of different processor communicating schedules. We carry out a detailed benchmarking of the parallel KMC schemes using available exact solutions, for example, in Ising-type systems and we demonstrate the capabilities of the method to simulate complex spatially distributed reactions at very large scales on GPUs. Finally, we discuss work load balancing between processors and propose a re-balancing scheme based on probabilistic mass transport methods.
15.
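Accurate and efficient solution of the three-dimensional multigroup neutron diffusion equation is the foundation of nuclear power reactor core design and fuel management. Solving the equation with the finite difference method is simple, accurate, and mature; however, its computational and storage costs are both large, which severely limits the problem size and range of application. Based on large-scale parallel computing, this paper studies the finite difference method for the three-dimensional multigroup neutron diffusion equation: a central finite difference scheme discretizes the equation; spatial domain decomposition under the MPI parallel programming model provides large-scale parallelism; and a multigroup, multi-domain coupled PGMRES algorithm is used for parallel acceleration. The ParaFiDi code was developed on a cluster server and verified with several benchmark problems, including IAEA3D and PHWR. Numerical results show that ParaFiDi has high accuracy and computational efficiency.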
16.
We present two parallel implementations of the bond fluctuation model on graphics processors that outperform an equivalent implementation on a single CPU processor by a factor of up to 50. The first algorithm is a parallelized version of an accelerated MC method published earlier in [S. Nedelcu, J.-U. Sommer, Single chain dynamics in polymer networks: a Monte Carlo study, J. Chem. Phys. 130 (2009) 204902]. In this first algorithm we use the parallel domain decomposition technique to avoid monomer collisions. In contrast, in the second algorithm we associate each monomer with a parallel process, so that moves of all monomers in the system are attempted simultaneously. In both cases, only monomer moves that result in allowed bonds and preserve lattice occupancy are accepted. To validate the correctness of the GPU algorithms we simulated monodisperse polymer melts at monomer number density 0.5 and compared static and dynamical properties with standard CPU implementations. We found good agreement between the CPU and the GPU results, which demonstrates the equivalence of the serial and parallel implementations. The influence of higher monomer number density is discussed.
17.
18.
19.
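Based on the "federated computing" of the JASMIN framework, two serial programs, the radiation hydrodynamics code RH2D and the particle transport code Sn2D, are coupled as independent "federates". The resulting integrated program RHSn2D can simulate multiphysics coupled problems in parallel on thousands of processors. The federates in RHSn2D each have their own mesh partitioning and parallel algorithms, and the framework hides the parallel data transfer between federates. Examples show that for an application problem of the given size (90,720 mesh cells, 100 patches for radiation hydrodynamics, 2,835 patches for particle transport, 48 Sn directions, 16 groups), RHSn2D achieves a parallel efficiency of 36% on 1,024 processors.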
20.
Saez-Landete J Salcedo-Sanz S Rosa-Zurera M Alonso J Bernabeu E 《Optics letters》2005,30(20):2724-2726
A new technique for the generation of optical reference signals with optimal properties is presented. In grating measurement systems a reference signal is needed to achieve an absolute measurement of the position. The optical signal is the autocorrelation of two codes with binary transmittance. For a long time, the design of this type of code has required great computational effort, which limits the size of the code to approximately 30 elements. Recently, the application of the dividing rectangles (DIRECT) algorithm has allowed the automatic design of codes up to 100 elements. Because of the binary nature of the problem and the parallel processing of the genetic algorithms, these algorithms are efficient tools for obtaining codes with particular autocorrelation properties. We design optimum zero reference codes with arbitrary length by means of a genetic algorithm enhanced with a restricted search operator.