Similar literature
 20 similar articles found (search time: 140 ms)
1.
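A parallel contact algorithm is developed for explicit nonlinear finite element analysis. The bucket-sort global search method is parallelized by domain partitioning; each processor uses vectors of bucket numbers to detect mutual overlap and potential node-to-surface contact pairs. Contact pairs are divided into three classes according to their data-communication characteristics, and a communication strategy is designed for each class. Numerical examples show that the parallel algorithm achieves high speedup and parallel efficiency and scales well.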
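
The bucket-based global search mentioned in this abstract can be illustrated with a minimal serial sketch. It is not the authors' parallel algorithm: it pairs points rather than nodes and surface segments, and the function name `bucket_candidates` and the parameter `cell` are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import product


def bucket_candidates(points, cell):
    """Return index pairs (i, j), i < j, whose points share a bucket or lie in adjacent buckets."""
    buckets = defaultdict(list)
    for idx, p in enumerate(points):
        # Bucket number of a point: integer cell coordinates along each axis.
        key = tuple(int(c // cell) for c in p)
        buckets[key].append(idx)

    pairs = set()
    for key, members in buckets.items():
        # Visit the bucket itself and every neighbouring bucket.
        for offset in product((-1, 0, 1), repeat=len(key)):
            neighbor = tuple(k + o for k, o in zip(key, offset))
            for i in members:
                for j in buckets.get(neighbor, ()):
                    if i < j:
                        pairs.add((i, j))
    return sorted(pairs)


points = [(0.1, 0.1), (0.9, 0.2), (5.0, 5.0), (1.1, 0.1)]
print(bucket_candidates(points, cell=1.0))
```

Only points in the same or neighbouring buckets are ever compared, which is what keeps the global search cheap; in a domain-partitioned setting the bucket numbers would be the data exchanged between processors to find overlapping regions.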

2.
尚月强  何银年 《计算物理》2011,28(2):181-187
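Based on a fully overlapping domain decomposition technique, three parallel finite element algorithms are proposed for the unsteady Navier-Stokes equations. The basic idea is to first apply a fully overlapping domain decomposition in space; each processor then independently, and in parallel, solves the resulting ordinary differential equations in time t with the backward Euler scheme. The nonlinear convection term is treated with a semi-implicit scheme and a fully implicit scheme, respectively. The subproblem assigned to each processor is a global problem defined on the entire solution domain, but the vast majority of its degrees of freedom come from the subdomain that processor is responsible for, so the algorithms are simple to implement and need little communication. Numerical examples verify the effectiveness of the algorithms and their good parallel performance.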
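
The two treatments of the convection term named in this abstract can be written out for the time-discrete problem. The block below is a sketch in standard notation, not copied from the paper: the symbols $u$, $p$, $\nu$, $f$ and $\Delta t$ are assumptions, and the finite element discretization in space is omitted.

```latex
% Backward Euler, semi-implicit: convection linearized about the previous time level
\begin{align}
\frac{u^{n+1}-u^{n}}{\Delta t} - \nu\,\Delta u^{n+1} + (u^{n}\cdot\nabla)\,u^{n+1} + \nabla p^{n+1} &= f^{n+1},
& \nabla\cdot u^{n+1} &= 0, \\
% Backward Euler, fully implicit: convection evaluated at the new time level
\frac{u^{n+1}-u^{n}}{\Delta t} - \nu\,\Delta u^{n+1} + (u^{n+1}\cdot\nabla)\,u^{n+1} + \nabla p^{n+1} &= f^{n+1},
& \nabla\cdot u^{n+1} &= 0.
\end{align}
```

The semi-implicit form leads to one linear problem per time step, while the fully implicit form leaves a nonlinear problem that must be solved iteratively at each step.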

3.
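Based on the SIMPLE algorithm on multi-block unstructured grids, domain decomposition, and MPI parallel programming, an implementation scheme for parallel computational fluid dynamics is presented. A "coarse-grained serial, fine-grained parallel" approach to domain decomposition and mesh partitioning is proposed, and the relations for transferring variables across the interfaces between subdomains are derived from the principle of flux conservation across internal-boundary mesh interfaces. Two classes of flow problems were computed on the 魔方 (Magic Cube) computer using 500 processor cores. The study shows that the parallel and serial results agree well with the benchmark solutions, and that communication time and cache hit rate significantly affect the parallel speedup.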

4.
陈康  沈煜年 《物理学报》2021,(12):162-172
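For the frictional oblique contact problem of hydrogels used in soft robots, a numerical contact-mechanics model is established, and nonlinear behaviors such as large local contact deformation and friction effects during oblique contact of the soft hydrogel material are analyzed. An updated free-energy function for the hydrogel is derived from a hyperelastic constitutive model. A contact computation strategy is given, and two examples are computed numerically: a rigid spherical indenter in normal contact with the hydrogel and in oblique contact with it. The applicability of classical Hertz contact theory and the influence of the friction coefficient on the stress distribution in the contact zone and on the contact state are discussed. The results show that the material nonlinearity of the hydrogel, together with the geometric nonlinearity caused by large deformation, makes classical Hertz contact theory inapplicable. In oblique contact, increasing the friction coefficient redistributes the stress inside the hydrogel: the location of maximum stress moves from below the contact surface onto the contact surface, and two main high-stress regions appear, one inside the hydrogel and one on its surface. In addition, when the friction coefficient is small (μ<0.05), all contact points in the normal-contact case are in the limiting state between static and sliding friction, whereas in the oblique-contact case part of the contact surface always remains in stable static friction.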

5.
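This paper combines the domain decomposition algorithm with the Monte Carlo method to compute the temperature field and infrared radiant exitance of a tank. Computing the radiative transfer coefficients by the Monte Carlo method makes it possible to account for complex radiative properties of surfaces, such as specular reflection, anisotropic emission, and anisotropic reflection, and to account directly for complex geometric features, such as mutual occlusion and the projected area along the direction of solar incidence. Introducing the radiative transfer coefficients isolates the difficult part of the computation: in the time domain (the computational steps) the whole problem can be split into several subproblems that are processed in parallel, and in the spatial domain (the computational region) the tank is decomposed into several subdomains, which reduces the problem size and allows multiple processors to compute in parallel. It also makes the Monte Carlo method easier to program and shortens the computation time.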

6.
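Research on parallel computing for three-dimensional electromagnetic particle simulation   Cited by: 3 (self-citations: 0, citations by others: 3)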
廖臣  刘大刚  刘盛纲 《物理学报》2009,58(10):6709-6718
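Three-dimensional electromagnetic particle simulation is based on the finite-difference time-domain (FDTD) algorithm and the particle-in-cell (PIC) method. Given the characteristics of FDTD and PIC, the whole simulation region can be divided into several subregions, with each compute process simulating one subregion and the boundary data of the subregions exchanged by message passing to achieve parallel computation. Following this basic idea, a parallel algorithm is designed and the factors affecting parallel speedup are analyzed. The algorithm is implemented in the three-dimensional electromagnetic particle simulation software CHIPIC3D and its correctness is verified. Finally, the parallel version of CHIPIC3D is applied to two typical high-power microwave source devices, a magnetically insulated line oscillator and a relativistic klystron, showing that the parallel algorithm can achieve ... Keywords: electromagnetic particle simulation; finite-difference time-domain; parallel computing; high-power microwave sources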
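
The idea of splitting the region into subregions and exchanging only boundary data can be sketched on a toy problem. The example below is an assumption-laden illustration, not CHIPIC3D code: it uses a one-dimensional three-point stencil in place of FDTD/PIC, and plain list reads stand in for the messages a real code would send (for example with MPI). The names `serial_step` and `decomposed_step` are hypothetical.

```python
def serial_step(u, alpha):
    """One explicit three-point stencil update with fixed end values."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


def decomposed_step(u, alpha, nparts):
    """The same update, performed subdomain by subdomain after a ghost-cell exchange."""
    n = len(u)
    bounds = [(k * n // nparts, (k + 1) * n // nparts) for k in range(nparts)]

    # "Exchange": each subdomain receives one ghost value from each neighbour.
    local = []
    for lo, hi in bounds:
        left = u[lo - 1] if lo > 0 else None   # would arrive as a message
        right = u[hi] if hi < n else None      # would arrive as a message
        local.append((left, u[lo:hi], right))

    # Independent local updates; this loop is the part that runs in parallel.
    result = []
    for left, own, right in local:
        padded = ([] if left is None else [left]) + own + ([] if right is None else [right])
        start = 0 if left is None else 1
        new_own = own[:]
        for j in range(len(own)):
            p = j + start
            if 0 < p < len(padded) - 1:  # skip the fixed global end points
                new_own[j] = padded[p] + alpha * (padded[p - 1] - 2 * padded[p] + padded[p + 1])
        result.extend(new_own)
    return result


u0 = [float(i * i % 7) for i in range(12)]
print(decomposed_step(u0, 0.25, 3) == serial_step(u0, 0.25))
```

Each subdomain needs only one value from each neighbour per step, so the communication volume scales with the size of the subdomain boundary rather than with its interior.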

7.
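For the distributed-parameter sliding algorithm on three-dimensional sliding interfaces, a smoothing method for three-dimensional sliding surfaces is proposed. Using only local information such as the coordinates and normals of the element-face vertices, it constructs a three-dimensional smooth surface that is G1 continuous (tangent-plane continuous) across patch boundaries and C1 continuous within each patch; this method provides an accurate geometric representation of the actual sliding surface. A C1-continuous normal pressure field and a C1-continuous surface mass density field are constructed on the smooth surface. Contact points are then computed on the smooth sliding interface by Newton-Raphson iteration, and the contact constraints and contact computation are applied according to the distributed-parameter sliding algorithm. Numerical examples show that the smooth contact sliding surface reduces the oscillation of sliding nodes and improves convergence.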

8.
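For block tridiagonal linear systems, incomplete factorization is one of the most effective preconditioners, but it is inherently a serial computation and hard to parallelize effectively. Based on a one-dimensional overlapping domain decomposition, the upper and lower triangular factors obtained from local incomplete factorizations are each combined separately to construct a class of global parallel preconditioners of incomplete-factorization type. Two concrete implementations are given, one of which exchanges the components corresponding to all overlapping parts. Then, after a careful analysis of the computations involved, an implementation is given that communicates the components on only a single grid line, which greatly reduces the communication volume; the communication does not grow with the amount of overlap. This parallelization approach applies to any incomplete-factorization preconditioner for block tridiagonal linear systems. Experimental results show that it generally outperforms the additive Schwarz parallelization.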

9.
A multistep algorithm for domain decomposition of the diffusion equation   Cited by: 1 (self-citations: 1, citations by others: 0)
盛志明  崔霞  刘兴平 《计算物理》2011,28(6):825-830
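Interior-boundary values are computed in multiple steps by the fractional step method, which improves the domain decomposition algorithm for the two-dimensional diffusion equation and yields a new parallel algorithm with a relaxed stability condition. The interior-boundary point values are computed with a fractional-step discretization that uses a large spatial step. The accuracy of the algorithm is comparable to that of the implicit scheme. Compared with the original algorithm, the stability condition is relaxed by a factor of q (where q is the number of fractional-step interior-boundary computations performed between two adjacent time steps). The convergence of the algorithm is rigorously proved using the discrete maximum principle. On a parallel computer, ...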

10.
丁琪  尚月强 《计算物理》2020,37(1):10-18
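Based on two-grid discretization and domain decomposition, three parallel finite element algorithms are proposed for the unsteady Navier-Stokes equations. The basic idea is, at each time step, to solve the nonlinear problem on a coarse grid by Oseen iteration and then to solve linear Oseen, Newton, and Stokes problems, respectively, in parallel on a fine grid to correct the coarse-grid solution. The spatial variables are discretized by finite elements and the time variable by the backward Euler scheme. Numerical experiments verify the effectiveness of the algorithms.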

11.
Over the last few decades, the computational demands of massive particle-based simulations for both scientific and industrial purposes have been continuously increasing. Hence, considerable efforts are being made to develop parallel computing techniques on various platforms. In such simulations, particles move freely within a given space, so on a distributed-memory system, load balancing, i.e., assigning an equal number of particles to each processor, is not guaranteed. Shared-memory systems achieve better load balancing for particle models, but suffer from the intrinsic drawback of memory access competition, particularly during (1) pairing of contact candidates from among neighboring particles and (2) force summation for each particle. Here, novel algorithms are proposed to overcome these two problems. For the first problem, the key is a pre-conditioning process during which particle labels are sorted by the label of the cell in the domain to which the particles belong. Then, a list of contact candidates is constructed by pairing the sorted particle labels. For the second problem, a table comprising the list indexes of the contact candidate pairs is created and used to sum the contact forces acting on each particle for all contacts according to Newton’s third law. With just these methods, memory access competition is avoided without additional redundant procedures. The parallel efficiency and compatibility of these two algorithms were evaluated in discrete element method (DEM) simulations on four types of shared-memory parallel computers: a multicore multiprocessor computer, a scalar supercomputer, a vector supercomputer, and a graphics processing unit. The computational efficiency of a DEM code was found to be drastically improved with our algorithms on all but the scalar supercomputer. Thus, the developed parallel algorithms are useful on shared-memory parallel computers with sufficient memory bandwidth.
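
The two steps described in this abstract, sorting particle labels by cell label and summing forces through a per-particle index table, can be sketched as follows. This is a serial, one-dimensional illustration under stated assumptions, not the paper's implementation; the names `candidate_pairs` and `sum_forces` and the scalar pair forces are hypothetical.

```python
def candidate_pairs(xs, cell_size):
    """Pair each particle with later particles in its own cell and all particles in the next cell."""
    cells = [int(x // cell_size) for x in xs]
    # Pre-conditioning step: particle labels sorted by cell label.
    order = sorted(range(len(xs)), key=lambda i: (cells[i], i))
    pairs = []
    for pos, i in enumerate(order):
        q = pos + 1
        # In the sorted list, the same cell and the next cell are contiguous.
        while q < len(order) and cells[order[q]] <= cells[i] + 1:
            pairs.append((i, order[q]))
            q += 1
    return pairs


def sum_forces(n, pairs, pair_forces):
    """Gather forces per particle through a table of (pair index, sign) entries."""
    table = [[] for _ in range(n)]
    for k, (i, j) in enumerate(pairs):
        table[i].append((k, +1.0))   # force on i from contact k
        table[j].append((k, -1.0))   # equal and opposite force on j
    # Each particle only reads pair_forces and writes its own total,
    # so no two particles ever write to the same memory location.
    return [sum(sign * pair_forces[k] for k, sign in table[p]) for p in range(n)]


xs = [0.2, 2.5, 0.7, 1.1]
pairs = candidate_pairs(xs, cell_size=1.0)
print(pairs)
print(sum_forces(len(xs), pairs, [1.0, 2.0, 3.0, 4.0]))
```

Writing the summation as a gather over the table, instead of scattering each pair force into two particle totals, is the design choice that removes the write conflicts a shared-memory scatter would need atomics or locks to resolve.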

12.
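Parallel Monte Carlo computation of steady-state particle transport is successful because particle random walks are independent, so the simulated particles can be divided equally among the processors. For time-dependent problems, however, each time step involves communication of the scattering source and the geometric mesh, which severely limits the parallel scale and makes the parallelization unscalable. Two algorithms are studied: adaptive allocation of processors, which improves the speedup and processor utilization, and stratified Monte Carlo sampling, which greatly reduces the volume of scattering-source communication between processors. Parallel scalability is markedly improved and an ideal speedup is obtained.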

13.
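Some of the parallel computing methods used in the NEPTUNE software are introduced. A two-level "block/patch" parallel domain decomposition allows the computation to scale to thousands of processor cores. Structured meshes are generated in parallel with adaptive techniques based on complex geometric features, and invalid cells are removed from the original regular region, which greatly reduces storage and parallel execution time. Building on the classical Boris and SOR iterative methods, and using red-black ordering and geometric constraints, a parallel method for solving the Poisson equation on irregular regions is proposed. With these methods, a parallel efficiency of 51.8% is obtained on 1 024 processor cores when NEPTUNE is used to simulate a MILO device.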
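
Red-black ordering for SOR, which this abstract builds on, can be shown on a small regular grid. The sketch below is not NEPTUNE code: it assumes a square grid, the standard five-point discretization of the Poisson equation, and no geometric constraints for irregular regions; `red_black_sor` and its parameters are hypothetical names.

```python
def red_black_sor(u, f, h, omega=1.5, sweeps=200):
    """Red-black SOR for the five-point discretization of -Laplace(u) = f.

    u holds boundary values on its edges and an initial guess inside;
    it is updated in place and returned.
    """
    n, m = len(u), len(u[0])
    for _ in range(sweeps):
        for color in (0, 1):  # 0: "red" points, 1: "black" points
            for i in range(1, n - 1):
                for j in range(1, m - 1):
                    if (i + j) % 2 == color:
                        # Neighbours of a point all have the other colour, so
                        # every point of one colour can be updated independently.
                        gs = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                     + u[i][j - 1] + u[i][j + 1]
                                     + h * h * f[i][j])
                        u[i][j] += omega * (gs - u[i][j])
    return u


n, h = 9, 1.0 / 8
exact = [[(i + j) * h for j in range(n)] for i in range(n)]  # harmonic test solution x + y
u = [[exact[i][j] if i in (0, n - 1) or j in (0, n - 1) else 0.0 for j in range(n)]
     for i in range(n)]
f = [[0.0] * n for _ in range(n)]
red_black_sor(u, f, h)
print(max(abs(u[i][j] - exact[i][j]) for i in range(n) for j in range(n)))
```

Because the update of a red point reads only black points and vice versa, each half-sweep has no data dependencies within it, which is what makes the ordering attractive for parallel execution.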

14.
We present a mathematical framework for constructing and analyzing parallel algorithms for lattice kinetic Monte Carlo (KMC) simulations. The resulting algorithms have the capacity to simulate a wide range of spatio-temporal scales in spatially distributed, non-equilibrium physiochemical processes with complex chemistry and transport micro-mechanisms. Rather than focusing on constructing the stochastic trajectories exactly, our approach relies on approximating the evolution of observables, such as density, coverage, correlations and so on. More specifically, we develop a spatial domain decomposition of the Markov operator (generator) that describes the evolution of all observables according to the kinetic Monte Carlo algorithm. This domain decomposition corresponds to a decomposition of the Markov generator into a hierarchy of operators and can be tailored to specific hierarchical parallel architectures such as multi-core processors or clusters of Graphical Processing Units (GPUs). Based on this operator decomposition, we formulate parallel fractional step kinetic Monte Carlo algorithms by employing the Trotter Theorem and its randomized variants; these schemes (a) are partially asynchronous on each fractional step time-window, and (b) are characterized by their communication schedule between processors. The proposed mathematical framework allows us to rigorously justify the numerical and statistical consistency of the proposed algorithms, showing the convergence of our approximating schemes to the original serial KMC. The approach also provides a systematic evaluation of different processor communication schedules. We carry out a detailed benchmarking of the parallel KMC schemes using available exact solutions, for example, in Ising-type systems, and we demonstrate the capabilities of the method to simulate complex spatially distributed reactions at very large scales on GPUs.
Finally, we discuss work load balancing between processors and propose a re-balancing scheme based on probabilistic mass transport methods.
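
The operator splitting behind the fractional step schemes in this abstract can be stated compactly. The block below is a sketch for a generator split into two parts; the symbols $\mathcal{L}$, $\mathcal{L}_1$, $\mathcal{L}_2$, $\Delta t$ and $T$ are notation assumed here, and the paper's hierarchy of operators and randomized variants are not reproduced.

```latex
% Generator split into two groups of sub-lattices: L = L_1 + L_2.
% One fractional step (Lie splitting) over a time window \Delta t:
\begin{equation}
e^{\Delta t\,\mathcal{L}} = e^{\Delta t\,(\mathcal{L}_1 + \mathcal{L}_2)}
  = e^{\Delta t\,\mathcal{L}_1}\, e^{\Delta t\,\mathcal{L}_2} + O(\Delta t^{2}),
\end{equation}
% Trotter product formula: repeating the fractional steps recovers the exact evolution.
\begin{equation}
e^{T\mathcal{L}} = \lim_{n\to\infty}
  \left( e^{\frac{T}{n}\mathcal{L}_1}\, e^{\frac{T}{n}\mathcal{L}_2} \right)^{n}.
\end{equation}
```

If each of $\mathcal{L}_1$ and $\mathcal{L}_2$ acts on sub-lattices that do not interact with one another, the evolution under each factor can be simulated on separate processors without communication inside the window $\Delta t$.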

15.
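Accurate and efficient solution of the three-dimensional multigroup neutron diffusion equation is the foundation of nuclear power reactor core design and fuel management. Solving the equation by the finite difference method is simple, accurate, and mature; however, the method's computational and storage costs are both large, which greatly limits its problem size and range of application. This paper studies the finite difference method for the three-dimensional multigroup neutron diffusion equation on the basis of large-scale parallel computing: the neutron diffusion equation is discretized with a central finite difference scheme; large-scale parallel computation is achieved by spatial domain decomposition under the MPI parallel programming model; and a multigroup, multi-domain coupled PGMRES algorithm is used for parallel acceleration. The ParaFiDi code was developed on a cluster server and verified with several benchmark problems, including IAEA3D and PHWR. Numerical results show that ParaFiDi achieves high accuracy and computational efficiency.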

16.
We present two parallel implementations of the bond fluctuation model on graphics processors that outperform an equivalent implementation on a single CPU processor by a factor of up to 50. The first algorithm is a parallelized version of an accelerated MC method published earlier in [S. Nedelcu, J.-U. Sommer, Single chain dynamics in polymer networks: a Monte Carlo study, J. Chem. Phys. 130 (2009) 204902]. In this first algorithm we use the parallel domain decomposition technique to avoid monomer collisions. In contrast, in the second algorithm we associate each monomer with a parallel process, so that moves of all monomers in the system are attempted simultaneously. In both cases, only monomer moves that result in allowed bonds and preserve lattice occupancy are accepted. To validate the correctness of the GPU algorithms we simulated monodisperse polymer melts at monomer number density 0.5 and compared static and dynamical properties with standard CPU implementations. We found good agreement between the CPU and the GPU results, which demonstrates the equivalence of the serial and parallel implementations. The influence of higher monomer number density is discussed.

17.
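Based on the OpenMP standard, parallel algorithms are designed for the electromagnetic field computation, the particle motion solve, and the charge density and current density updates in the particle simulation method. The performance of the parallel algorithms is tested and analyzed on a multicore computer. Based on the analysis, OpenMP parallel computing is implemented in the three-dimensional parallel particle simulation software CHIPIC3D and applied to an extended interaction oscillator, in both an OpenMP parallel simulation and a hybrid OpenMP/MPI parallel simulation. The simulation results show that the parallel algorithms are correct and achieve high speedup.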

18.
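On a cluster parallel system, a parallel Monte Carlo simulation of Al thin-film deposition on a Si substrate of 100×100×100 atoms is implemented. An effective parallel strategy of overlapping domain decomposition and asynchronous communication is adopted, and a sensible partition of the domain is combined with the topological and geometric mechanism of space filling in film deposition, with emphasis on reducing communication cost and improving parallel performance. This greatly shortens the simulation time of thin-film deposition and thus provides a more efficient means of simulating thin-film deposition by computer and of predicting the deposition of thin-film materials.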

19.
任健  魏军侠  曹小林 《计算物理》2012,29(2):205-212
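Based on the "federated computing" of the JASMIN framework, two serial codes, the radiation hydrodynamics code RH2D and the particle transport code Sn2D, are coupled as independent "federation members"; the resulting integrated code RHSn2D can simulate coupled multiphysics problems in parallel on thousands of processors. The members of RHSn2D each have their own mesh partitioning and parallel algorithm, and the framework hides the parallel data transfer between members. An example shows that for an application-scale problem (90 720 mesh cells, 100 patches for radiation hydrodynamics, 2 835 patches for particle transport, 48 Sn directions, 16 groups), RHSn2D reaches a parallel efficiency of 36% on 1 024 processors.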

20.
Optimal design of optical reference signals by use of a genetic algorithm   Cited by: 2 (self-citations: 0, citations by others: 2)
A new technique for the generation of optical reference signals with optimal properties is presented. In grating measurement systems a reference signal is needed to achieve an absolute measurement of the position. The optical signal is the autocorrelation of two codes with binary transmittance. For a long time, the design of this type of code has required great computational effort, which limits the size of the code to approximately 30 elements. Recently, the application of the dividing rectangles (DIRECT) algorithm has allowed the automatic design of codes up to 100 elements. Because of the binary nature of the problem and the parallel processing of the genetic algorithms, these algorithms are efficient tools for obtaining codes with particular autocorrelation properties. We design optimum zero reference codes with arbitrary length by means of a genetic algorithm enhanced with a restricted search operator.
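
The quantity such a search would evaluate, the autocorrelation of a binary transmittance code, is easy to compute directly. The sketch below is an illustration only: the abstract does not state the fitness function, so the ratio of the largest secondary peak to the central peak used here is an assumed figure of merit, and the function names are hypothetical.

```python
def autocorrelation(code):
    """Aperiodic autocorrelation of a 0/1 code for shifts 0 .. len(code) - 1."""
    n = len(code)
    return [sum(code[i] * code[i + s] for i in range(n - s)) for s in range(n)]


def secondary_peak_ratio(code):
    """Largest off-centre autocorrelation value divided by the central peak."""
    ac = autocorrelation(code)
    return max(ac[1:]) / ac[0]


code = [1, 1, 0, 1]  # 1 = transparent slit, 0 = opaque
print(autocorrelation(code))
print(secondary_peak_ratio(code))
```

A lower ratio means the central peak stands out more clearly from the secondary maxima, which is the kind of property an optimizer such as a genetic algorithm could be asked to minimize over candidate codes.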
