Similar literature
1.
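For the computational problems of finite element analysis, starting from an existing simple parallel algorithm that uses a global communication scheme, the core algorithms involved are redesigned and optimized using sparse data structures and local communication, which effectively reduces both the number of processors involved in communication and the communication volume. In addition, non-blocking communication is used, and computation that does not depend on communication is separated out and moved ahead, so that computation overlaps with communication and the communication overhead is effectively hidden. Experimental results show that the optimized algorithm is a clear improvement over the existing one, especially for sparse-matrix dense-vector multiplication and the assembly of element contributions. The improvement also becomes more pronounced as the number of tasks increases.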

2.
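Building a structural finite element model of a spacecraft is the basis for accurate flight simulation and for structural fault monitoring and diagnosis during the in-orbit flight phase. Using a simplified beam model of a slender vehicle, a new CUDA (Compute Unified Device Architecture)-based algorithm is proposed for generating the finite element stiffness matrices and assembling the global stiffness matrix. Exploiting the symmetry of the beam element matrices and the GPU hardware architecture, a parallel generation algorithm is proposed and then improved. To reduce assembly time effectively, a coloring algorithm is used during assembly, and a non-zero entry assembly strategy based on GPU (Graphics Processing Unit) shared memory is proposed. Comparisons of test cases on different computing platforms verify the speed of the new algorithm. Numerical examples show that the proposed algorithm is efficient and, for models within a certain computational scale, can meet the real-time requirements of fast computation and diagnosis.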
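The abstract does not say which coloring scheme is used. As a rough illustration only, the sketch below assumes a plain greedy element coloring in which elements that share a node receive different colors, so that all elements of one color could be assembled concurrently without write conflicts. The function name `color_elements` and the six-element beam chain are hypothetical.

```python
def color_elements(elements):
    """Greedy coloring: elements that share a node get different colors.

    elements: list of node-index tuples, one tuple per element.
    Returns a list with one color index per element.
    """
    node_colors = {}  # node -> set of colors used by elements touching it
    colors = []
    for nodes in elements:
        used = set()
        for n in nodes:
            used |= node_colors.get(n, set())
        c = 0
        while c in used:  # smallest color not used by any neighbour
            c += 1
        colors.append(c)
        for n in nodes:
            node_colors.setdefault(n, set()).add(c)
    return colors


# Hypothetical chain of 2-node beam elements: element i joins nodes i and i+1.
beam_elements = [(i, i + 1) for i in range(6)]
beam_colors = color_elements(beam_elements)
print(beam_colors)  # → [0, 1, 0, 1, 0, 1]
```

For a simple beam chain two colors suffice, so assembly could proceed in two conflict-free passes; general meshes need more colors.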

4.
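Three-dimensional large-scale finite difference grid generation underpins three-dimensional finite difference computation, and grid generation efficiency is an active research topic. Traditional staircase finite difference grid generation methods are mainly the ray-penetration method and the slicing method. Building on the traditional serial ray-penetration method, this paper proposes a parallel staircase finite difference grid generation algorithm based on GPU (graphics processing unit) parallel computing. The parallel algorithm uses a batched data transfer strategy, so that the problem size it can handle does not depend on the amount of GPU memory, balancing data transfer efficiency against grid generation scale. To reduce the volume of data transferred, the algorithm generates the ray origin coordinates independently inside each GPU thread, which further improves its execution efficiency and degree of parallelism. Numerical experiments show that the parallel algorithm is far more efficient than the traditional ray-penetration method. Finally, a finite difference computation example confirms that the parallel algorithm can meet the needs of large-scale numerical simulation of complex models.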

5.
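Study of a three-dimensional mesoscale mechanics numerical model for dam concrete   Total citations: 5, self-citations: 1, citations by others: 4
At the mesostructural level, dam concrete is treated as a composite material made up of aggregate, hardened cement mortar, and the bonding interface between them, and a three-dimensional mesoscale mechanics numerical model of dam concrete is established. The model reflects the damage evolution of concrete and its mesoscale constituent phases under load, and also accounts for the strain-rate strengthening effect under dynamic loading. A solution method for the numerical model is given, and a serial program that runs on an ordinary PC has been written. Loading can be controlled by either load or displacement. To reduce the number of degrees of freedom to be solved, a scale-separation method is applied so that the mixture of the smallest aggregate and the hardened cement mortar is mechanically equivalent to a single composite medium. Static and dynamic (impact) flexural-tension numerical calculations on wet-screened and three-graded concrete specimens verify that the computational method and program are correct and effective. In addition, starting from the serial program, the storage scheme for the stiffness matrix was optimized and the conjugate gradient (CG) method preconditioned with dual-threshold incomplete Cholesky factorization (ICT) was adopted, completing the conversion to a parallel program that runs on a Sun Fire 6800 server and greatly improving computational efficiency.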

6.
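The paper applies the fourth-order implicit Gauss-Legendre symplectic Runge-Kutta method to the equations of linear structural dynamics and optimizes the algorithm. For a dynamic initial value problem with n degrees of freedom, elimination first yields a system of n linear algebraic equations. Because its coefficient matrix is sparse, symmetric, and positive definite, the system is solved with the preconditioned conjugate gradient method, with the preconditioner obtained from an incomplete Cholesky factorization of the coefficient matrix. Compared with the central difference method, the Newmark-β method, and the Runge-Kutta method, the proposed method gives higher accuracy without a significant increase in computational cost.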
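For reference, the fourth-order member of the Gauss-Legendre family is the two-stage method. The abstract does not reproduce the paper's own formulation, so the block below only records the standard Butcher tableau and the generic update for a first-order system y' = f(t, y), with step size h and stage values k_1, k_2 as assumed notation.

```latex
\begin{array}{c|cc}
\tfrac12 - \tfrac{\sqrt3}{6} & \tfrac14 & \tfrac14 - \tfrac{\sqrt3}{6} \\[2pt]
\tfrac12 + \tfrac{\sqrt3}{6} & \tfrac14 + \tfrac{\sqrt3}{6} & \tfrac14 \\ \hline
 & \tfrac12 & \tfrac12
\end{array}
\qquad
\begin{aligned}
k_i &= f\Bigl(t_n + c_i h,\; y_n + h \sum_{j=1}^{2} a_{ij} k_j\Bigr), \quad i = 1, 2,\\
y_{n+1} &= y_n + \tfrac{h}{2}\,(k_1 + k_2).
\end{aligned}
```

Because both stages are coupled, each step requires a linear solve when f is linear, which is presumably where the preconditioned conjugate gradient solver described in the abstract comes in.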

7.
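To address the long computation time of the element-free Galerkin (EFG) method, the stiffness matrix is assembled node pair by node pair and the sparse linear system, stored in CSR format, is solved with the conjugate gradient method. A GPU-accelerated parallel algorithm for the EFG method is proposed in which essential boundary conditions are imposed by the penalty method; a unified format for the stiffness matrix and the penalty stiffness matrix is given, together with a flowchart of the GPU-accelerated parallel algorithm. A GPU program was written on the CUDA platform, the performance of the proposed algorithm was tested, analyzed, and compared through numerical examples on an NVIDIA GeForce GTX 660 graphics card, and the factors affecting the speedup were examined. The results confirm the feasibility of the proposed algorithm, which achieves a maximum speedup of 17 times while maintaining computational accuracy; the solution of the linear system has the decisive influence on the speedup.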
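To make the CSR-plus-conjugate-gradient combination concrete, here is a minimal serial sketch in plain Python. It is not the paper's GPU implementation: the function names, the tolerance, and the 3x3 test matrix are all assumptions chosen for illustration, and no preconditioner or penalty terms are included.

```python
def csr_matvec(data, indices, indptr, x):
    """y = A @ x for a matrix stored in CSR form (data, indices, indptr)."""
    y = [0.0] * (len(indptr) - 1)
    for i in range(len(y)):
        s = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            s += data[k] * x[indices[k]]
        y[i] = s
    return y


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def conjugate_gradient(data, indices, indptr, b, tol=1e-10, max_iter=1000):
    """Unpreconditioned CG for a symmetric positive definite CSR matrix."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0 initially
    p = list(r)          # search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:   # converged: residual norm below tolerance
            break
        Ap = csr_matvec(data, indices, indptr, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Hypothetical SPD test matrix [[4,-1,0],[-1,4,-1],[0,-1,4]] in CSR form.
data = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
indices = [0, 1, 0, 1, 2, 1, 2]
indptr = [0, 2, 5, 7]
b = [3.0, 2.0, 3.0]      # equals A @ [1, 1, 1]
x = conjugate_gradient(data, indices, indptr, b)
```

The row loop in `csr_matvec` is the part that maps naturally onto one GPU thread per row, which is consistent with the abstract's observation that the linear solve dominates the achievable speedup.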

8.
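Time-reversal imaging offers accurate localization and simple operation, and this paper applies it to damage detection in concrete structures. The transmitted signals and damage-scattered signals of each transducer element are extracted to construct the transfer matrix of ultrasonic wave propagation, and singular value decomposition of this matrix yields singular vectors containing the damage information. Using the multiple signal classification (MUSIC) algorithm, internal damage in concrete structures is imaged from both numerically simulated data and experimentally measured data, achieving accurate damage localization, and the imaging results are compared with those of the migration imaging method. This work explores the feasibility of applying time-reversal imaging to practical engineering detection of internal damage in concrete structures, and provides a theoretical reference for non-destructive testing personnel in the qualitative or quantitative analysis of internal defects in concrete structures.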

9.
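Based on the dynamic substructure modal synthesis method, this paper proposes a corresponding parallel algorithm. The algorithm divides the whole structure into multiple independent substructures; multiple CPUs then independently and simultaneously compute the branch modes of each substructure and carry out the branch modal transformations. The interface stiffness and mass matrices are then assembled serially and the reduced global equations are solved. Finally, the algorithm returns to each substructure to compute the nodal displacement modes (φ) corresponding to (ω²), a step that is again performed independently and simultaneously on each CPU. The method was implemented on the ELXSI-6400 parallel computer at Xi'an Jiaotong University; the results show that it saves computation time effectively and is a method for dynamic analysis of large structures.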

10.
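A scalar damage elastoplastic constitutive model for concrete   Total citations: 2, self-citations: 2, citations by others: 0
The degradation of material properties under load is the microscopic damage mechanism of concrete structures; macroscopically, it appears as a reduction in structural stiffness and load-carrying capacity. The paper derives an elastoplastic scalar damage constitutive model based on irreversible thermodynamic processes and gives a time-discretized yield criterion. Using a stress update algorithm based on the backward Euler method, namely the closest-point projection method with a two-step return mapping, the plastic parameter and algorithmic stiffness tensor that satisfy the convergence assumption of the iteration are derived, and the Jacobian matrix of the constitutive integration algorithm for spatial beam elements is given. The model is applied to simulate a load-capacity test of a simply supported concrete beam, and comparison with the computed data demonstrates the soundness and effectiveness of the model and algorithm.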

11.
Preconditioning techniques based on incomplete Gaussian elimination for large, sparse, non-symmetric matrix systems are described. A certain level of fill-in may be specified in the incomplete factorizations. All methods considered may be applied to matrices with arbitrary sparsity patterns, for instance those associated with the general preprocessor algorithms or adaptive mesh techniques. The preconditioners have been combined with five conjugate gradient-like methods and tested on finite element discretized scalar convection-diffusion equations in 2D and 3D. It is found from numerical experiments that an amount of fill-in corresponding to about 50% of the number of original non-zero matrix entries is the optimal choice for this class of preconditioners. The preconditioners show almost no sensitivity to grid distortion. In problems with significantly variable coefficients or anisotropy the preconditioners stabilize the basic iterative schemes in addition to reducing the computational work substantially, mostly by more than 90%. The modified preconditioning technique, where fill-in is added on the main diagonal, performs in general better than the standard incomplete LU factorization, but is inferior to the latter in 3D problems and for matrix systems with complicated sparsity patterns.
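As a point of reference for the fill-in discussion, the sketch below shows the zero-fill case, ILU(0), in which every entry outside the original sparsity pattern is dropped. It is an illustration under stated assumptions, not the paper's scheme (which allows a specified level of fill-in): dense list-of-lists storage is used for readability, and `ilu0`, `multiply_factors`, and the 3x3 matrix are hypothetical.

```python
def ilu0(A):
    """ILU(0): Gaussian elimination restricted to the non-zero pattern of A.

    Returns one matrix holding L (strict lower part, unit diagonal implied)
    and U (upper part including the diagonal).
    """
    n = len(A)
    LU = [row[:] for row in A]
    pattern = [[A[i][j] != 0.0 for j in range(n)] for i in range(n)]
    for i in range(1, n):
        for k in range(i):
            if not pattern[i][k]:
                continue
            LU[i][k] /= LU[k][k]
            for j in range(k + 1, n):
                if pattern[i][j]:  # update only entries inside the pattern
                    LU[i][j] -= LU[i][k] * LU[k][j]
    return LU


def multiply_factors(LU):
    """Form the product L @ U from the packed factors."""
    n = len(LU)
    prod = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else LU[i][k]
                prod[i][j] += l_ik * LU[k][j]
    return prod


# Hypothetical matrix whose exact LU factors would fill positions (1,2) and (2,1).
A = [[4.0, 1.0, 1.0],
     [1.0, 4.0, 0.0],
     [1.0, 0.0, 4.0]]
LU = ilu0(A)
product = multiply_factors(LU)
```

The product of the incomplete factors reproduces A on its non-zero pattern and differs only where fill-in was discarded (here at positions (1,2) and (2,1)); allowing more fill-in shrinks that error at the cost of storage and work, which is the trade-off the abstract quantifies.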

12.
Deformable components in multibody systems are subject to kinematic constraints that represent mechanical joints and specified motion trajectories. These constraints can, in general, be described using a set of nonlinear algebraic equations that depend on the system generalized coordinates and time. When the kinematic constraints are augmented to the differential equations of motion of the system, it is desirable to have a formulation that leads to a minimum number of non-zero coefficients for the unknown accelerations and constraint forces in order to be able to exploit efficient sparse matrix algorithms. This paper describes procedures for the computer implementation of the absolute nodal coordinate formulation for flexible multibody applications. In the absolute nodal coordinate formulation, no infinitesimal or finite rotations are used as nodal coordinates. The configuration of the finite element is defined using global displacement coordinates and slopes. By using this mixed set of coordinates, beam and plate elements can be treated as isoparametric elements. As a consequence, the dynamic formulation of these widely used elements using the absolute nodal coordinate formulation leads to a constant mass matrix. It is the objective of this study to develop computational procedures that exploit this feature. In one of these procedures, an optimum sparse matrix structure is obtained for the deformable bodies using the QR decomposition. Using the fact that the element mass matrix is constant, a QR decomposition of a modified constant connectivity Jacobian matrix is obtained for the deformable body. A constant velocity transformation is used to obtain an identity generalized inertia matrix associated with the second derivatives of the generalized coordinates, thereby minimizing the number of non-zero entries of the coefficient matrix that appears in the augmented Lagrangian formulation of the equations of motion of the flexible multibody systems. 
An alternate computational procedure based on Cholesky decomposition is also presented in this paper. This alternate procedure, which has the same computational advantages as the one based on the QR decomposition, leads to a square velocity transformation matrix. The computational procedures proposed in this investigation can be used for the treatment of large deformation problems in flexible multibody systems. They also have the advantages of the algorithms based on the floating frame of reference formulations since they allow for easy addition of general nonlinear constraint and force functions.

13.
Two commonly used preconditioners were evaluated for parallel solution of linear systems of equations with high condition numbers. The test cases were derived from topology optimisation applications in multiple disciplines, where material distribution finite element methods were used. Because in this optimisation method the equations rapidly become ill-conditioned due to the disappearance of a large number of elements from the design space as the optimisation progresses, it is shown that the choice of a suitable preconditioner becomes crucial. In an earlier work the conjugate gradient (CG) method with a Block-Jacobi preconditioner was used, in which the number of CG iterations increased rapidly with the increasing number of processors. Consequently, the parallel scalability of the method deteriorated fast due to the increasing loss of interprocessor information among the increased number of processors. By replacing the Block-Jacobi preconditioner with a sparse approximate inverse preconditioner, it is shown that the number of iterations to converge became independent of the number of processors. Therefore, the parallel scalability is improved.

14.
Yakoub, R. Y.; Shabana, A. A. Nonlinear Dynamics, 1999, 20(3): 267-282
In a previous publication, procedures that can be used with the absolute nodal coordinate formulation to solve the dynamic problems of flexible multibody systems were proposed. One of these procedures is based on the Cholesky decomposition. By utilizing the fact that the absolute nodal coordinate formulation leads to a constant mass matrix, a Cholesky decomposition is used to obtain a constant velocity transformation matrix. This velocity transformation is used to express the absolute nodal coordinates in terms of the generalized Cholesky coordinates. The inertia matrix associated with the Cholesky coordinates is the identity matrix, and therefore, an optimum sparse matrix structure can be obtained for the augmented multibody equations of motion. The implementation of a computer procedure based on the absolute nodal coordinate formulation and Cholesky coordinates is discussed in this paper. Numerical examples are presented in order to demonstrate the use of Cholesky coordinates in the simulation of the large deformations in flexible multibody applications.
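The reason a Cholesky factor yields an identity inertia matrix can be shown in a few lines. The symbols below (e for the absolute nodal coordinates, p for the Cholesky coordinates, M for the constant mass matrix, L for its lower-triangular Cholesky factor, B for the velocity transformation) are assumed notation for this sketch and may differ from the paper's.

```latex
\begin{aligned}
M &= L L^{\mathsf T}, \qquad
T = \tfrac12\, \dot e^{\mathsf T} M \dot e
  = \tfrac12\, \bigl(L^{\mathsf T}\dot e\bigr)^{\mathsf T} \bigl(L^{\mathsf T}\dot e\bigr),\\
p &= L^{\mathsf T} e
  \;\Longrightarrow\;
  e = B\,p, \quad B = L^{-\mathsf T},\\
B^{\mathsf T} M B &= L^{-1}\,\bigl(L L^{\mathsf T}\bigr)\,L^{-\mathsf T} = I,
\qquad
T = \tfrac12\, \dot p^{\mathsf T} \dot p .
\end{aligned}
```

Since M is constant, L and B are computed once before the simulation, and B is square, in line with the square velocity transformation matrix that the Cholesky-based procedure is said to produce.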

15.
Variants of the bi-conjugate gradient (Bi-CG) method are used to resolve the problem of slow convergence in CFD when it is applied to complex flow field simulation using higher-order turbulence models. In this study the Navier-Stokes and Reynolds stress transport equations are discretized with an implicit, total variation diminishing (TVD), finite volume formulation. The preconditioning technique of incomplete lower-upper (ILU) factorization is incorporated into the conjugate gradient squared (CGS), bi-conjugate gradient stable (Bi-CGSTAB) and transpose-free quasi-minimal residual (TFQMR) algorithms to accelerate convergence of the overall iterative methods. Computations have been carried out for separated flow fields over transonic bumps, supersonic bases and supersonic compression corners. By comparisons of the convergence rate with each other and with the conventional approximate factorization (AF) method it is shown that the Bi-CGSTAB method gives the most efficient convergence rate among these methods and can reduce the CPU time by a factor of 2.4–6.5 as compared with the AF method. Moreover, the AF method may yield somewhat different results from variants of the Bi-CG method owing to the factorization error which introduces a higher level of convergence criterion.

16.
The effects of reordering the unknowns on the convergence of incomplete factorization preconditioned Krylov subspace methods are investigated. Of particular interest is the resulting preconditioned iterative solver behavior when adaptive mesh refinement and coarsening (AMR/C) are utilized for serial or distributed parallel simulations. As representative schemes, we consider the familiar reverse Cuthill–McKee and quotient minimum degree algorithms applied with incomplete factorization preconditioners to CG and GMRES solvers. In the parallel distributed case, reordering is applied to local subdomains for block ILU preconditioning, and subdomains are repartitioned dynamically as mesh adaptation proceeds. Numerical studies for representative applications are conducted using the object‐oriented AMR/C software system libMesh linked to the PETSc solver library. Serial tests demonstrate that global unknown reordering and incomplete factorization preconditioning can reduce the number of iterations and improve serial CPU time in AMR/C computations. Parallel experiments indicate that local reordering for subdomain block preconditioning associated with dynamic repartitioning because of AMR/C leads to an overall reduction in processing time. Copyright © 2011 John Wiley & Sons, Ltd.
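To illustrate what reverse Cuthill–McKee reordering does, the following is a simplified pure-Python sketch, not the libMesh or PETSc implementation. It assumes a symmetric adjacency structure and starts each component from a minimum-degree node rather than searching for a pseudo-peripheral one; `reverse_cuthill_mckee`, `bandwidth`, and the scrambled path graph are hypothetical.

```python
from collections import deque


def reverse_cuthill_mckee(adj):
    """Return an RCM ordering for a symmetric graph.

    adj: dict mapping each node to a list of its neighbours.
    """
    degree = {v: len(adj[v]) for v in adj}
    visited = set()
    order = []
    # Handle every connected component, lowest-degree start node first.
    for start in sorted(adj, key=lambda v: (degree[v], v)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Visit unvisited neighbours in order of increasing degree.
            for w in sorted(adj[v], key=lambda u: (degree[u], u)):
                if w not in visited:
                    visited.add(w)
                    queue.append(w)
    return order[::-1]  # reversing the Cuthill-McKee order gives RCM


def bandwidth(adj, order):
    """Largest index distance between connected nodes under an ordering."""
    pos = {v: i for i, v in enumerate(order)}
    return max((abs(pos[v] - pos[w]) for v in adj for w in adj[v]), default=0)


# Hypothetical path graph 0-3-1-4-2 with deliberately scrambled labels.
adj = {0: [3], 3: [0, 1], 1: [3, 4], 4: [1, 2], 2: [4]}
natural = [0, 1, 2, 3, 4]
rcm = reverse_cuthill_mckee(adj)
print(bandwidth(adj, natural), bandwidth(adj, rcm))  # → 3 1
```

A smaller bandwidth keeps the non-zeros near the diagonal, which tends to reduce the entries an incomplete factorization has to discard; this is one plausible reason reordering improves preconditioner quality in studies such as this one.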

17.
Standard preconditioners such as incomplete LU decomposition perform well when used with conjugate gradient-like iterative solvers such as GMRES for the solution of elliptic problems. However, efficient computation of convection-dominated problems requires, in general, the use of preconditioners tuned to the particular class of fluid-flow problems at hand. This paper presents three such preconditioners. The first is applied to the finite element computation of inviscid (Euler equations) transonic and supersonic flows with shocks and uses incomplete LU decomposition applied to a matrix with extra artificial dissipation. The second preconditioner is applied to the finite difference computation of unsteady incompressible viscous flow; it uses incomplete LU decomposition applied to a matrix to which a pseudo-compressible term has been added. The third method and application are similar to the second, only the LU decomposition is replaced by Beam-Warming approximate factorization. In all cases, the results are in very good agreement with other published results and the new algorithms are found to be competitive with others; it is anticipated that the efficiency and robustness of conjugate-gradient-like methods will render them the method of choice as the difficulty of the problems that they are applied to is increased.

18.
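Fast solution methods for finite element analysis   Total citations: 17, self-citations: 0, citations by others: 17
Chen Pu, Sun Shuli, Yuan Mingwu. 力学学报 (Lixue Xuebao), 2002, 34(2): 216-222
Based on the characteristics of the finite element equations of structural analysis, a cell-sparse indexed storage scheme is proposed under the concept of super-equations of the stiffness matrix and its factors. Compared with the traditional sparse indexed storage scheme, it reduces disk and memory usage by about 30%. The scheme also reduces the number of index operations. Combined with a two-way loop unrolling technique, a fast sparse direct static solver suited to multi-dimensional finite element analysis is developed. Engineering examples show that the proposed scheme significantly improves the efficiency of the direct solution method in both storage and speed.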
