142 results found; search took 562 ms
81.
H. W. Zhang J. Zhu X. Wang W. Zhang 《International Journal of Computational Fluid Dynamics》2019,33(10):393-406
ABSTRACT In this paper, the OpenACC heterogeneous parallel programming model is successfully applied to modification and acceleration of the three-dimensional Tokamak magnetohydrodynamical code (CLT). Through combination of OpenACC and MPI technologies, CLT is further parallelised by using multiple-GPUs. Significant speedup ratios are achieved on NVIDIA TITAN Xp and TITAN V GPUs, respectively, with very few modifications of CLT. Furthermore, the validity of the double precision calculations on the above-mentioned two graphics cards has also been strictly verified with m/n = 2/1 resistive tearing mode instability in Tokamak.
82.
Quantum supercharger library: Hyper‐parallel integral derivatives algorithms for ab initio QM/MM dynamics
C. Alicia Renison Kyle D. Fernandes Kevin J. Naidoo 《Journal of computational chemistry》2015,36(18):1410-1419
This article describes an extension of the quantum supercharger library (QSL) to perform quantum mechanical (QM) gradient and optimization calculations as well as hybrid QM and molecular mechanical (QM/MM) molecular dynamics simulations. The integral derivatives are, after the two‐electron integrals, the most computationally expensive part of the aforementioned calculations/simulations. Algorithms are presented for accelerating the one‐ and two‐electron integral derivatives on a graphical processing unit (GPU). It is shown that a Hartree–Fock ab initio gradient calculation is up to 9.3X faster on a single GPU compared with a single central processing unit running an optimized serial version of GAMESS‐UK, which uses the efficient Schlegel method for ‐ and ‐orbitals. Benchmark QM and QM/MM molecular dynamics simulations are performed on cellobiose in vacuo and in a 39 Å water sphere (45 QM atoms and 24843 point charges, respectively) using the 6‐31G basis set. The QSL can perform 9.7 ps/day of ab initio QM dynamics and 6.4 ps/day of QM/MM dynamics on a single GPU in full double precision. © 2015 Wiley Periodicals, Inc.
83.
《Journal of computational chemistry》2017,38(17):1552-1559
Kinetic energy density functionals (KEDFs) approximate the kinetic energy of a system of electrons directly from its electron density. They are used in electronic structure methods that lack direct access to orbitals, for example, orbital‐free density functional theory (OFDFT) and certain embedding schemes. In this contribution, we introduce libKEDF, an accelerated library of modern KEDF implementations that emphasizes nonlocal KEDFs. We discuss implementation details and assess the performance of the KEDF implementations for large numbers of atoms. We show that using libKEDF, a single computing node or (GPU) accelerator can provide easy computational access to mesoscale chemical and materials science phenomena using OFDFT algorithms. © 2017 Wiley Periodicals, Inc.
84.
85.
Vladimir Prokofev 《International Journal for Numerical Methods in Fluids》2018,86(8):519-540
This paper describes a pressure correction method for single‐ and multilayer open flow models. The method does not require any complex procedures to solve the discretization of the Poisson equation and is distinguished by a high computational efficiency. The algorithm can easily be adapted to irregular meshes and parallelized. Parabolic interpolation of the pressure profile is used for the free surface. The discretization of the Poisson equation is written in a matrix form, allowing its usage also in the case of basic function expansion of the depth pressure profile. The paper presents the results of algorithm verification where experimental data sensitive to the numerical dissipation of the calculation model was used. Iteration convergence is high including problems with dry‐bed flooding. The complete described technique of pressure correction is implemented in OpenCL on the GPU. Computation time for a test problem solved using CPU and GPU is compared.
86.
High‐speed compressible turbulent flows typically contain discontinuities and have been widely modeled using Weighted Essentially Non‐Oscillatory (WENO) schemes due to their high‐order accuracy and sharp shock capturing capability. However, such schemes may damp the small scales of turbulence and result in inaccurate solutions in the context of turbulence‐resolving simulations. In this connection, the recently developed Targeted Essentially Non‐Oscillatory (TENO) schemes, including adaptive variants, may offer significant improvements. The present study aims to quantify the potential of these new schemes for a fully turbulent supersonic flow. Specifically, DNS of a compressible turbulent channel flow with M = 1.5 and Reτ = 222 is conducted using OpenSBLI, a high‐order finite difference computational fluid dynamics framework. This flow configuration is chosen to decouple the effect of flow discontinuities and turbulence and focus on the capability of the aforementioned high‐order schemes to resolve turbulent structures. The effect of the spatial resolution in different directions and coarse grid implicit LES are also evaluated against the WALE LES model. The TENO schemes are found to exhibit significant performance improvements over the WENO schemes in terms of the accuracy of the statistics and the resolution of the three‐dimensional vortical structures. The sixth‐order adaptive TENO scheme is found to produce comparable results to those obtained with nondissipative fourth‐ and sixth‐order central schemes and reference data obtained with spectral methods. Although the most computationally expensive scheme, it is shown that this adaptive scheme can produce satisfactory results if used as an implicit LES model.
87.
88.
Ji Xu Huabiao Qi Xiaojian Fang Liqiang Lu Wei Ge Xiaowei Wang Ming Xu Feiguo Chen Xianfeng He Jinghai Li 《Particuology》2011,9(4):446-450
Real-time simulation of industrial equipment is a huge challenge nowadays. The high performance and fine-grained parallel computing provided by graphics processing units (GPUs) bring us closer to our goals. In this article, an industrial-scale rotating drum is simulated using simplified discrete element method (DEM) without consideration of the tangential components of contact force and particle rotation. A single GPU is used first to simulate a small model system with about 8000 particles in real-time, and the simulation is then scaled up to industrial scale using more than 200 GPUs in a 1D domain-decomposition parallelization mode. The overall speed is about 1/11 of the real-time. Optimization of the communication part of the parallel GPU codes can speed up the simulation further, indicating that such real-time simulations have not only methodological but also industrial implications in the near future.
89.
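A GPU-parallelized algorithm for real-time equilibrium reconstruction of HL-2A plasmas is described. It mainly comprises parallel treatment of the Grad-Shafranov (G-S) equation, solution of tridiagonal systems, computation of the magnetic flux on the grid boundary, and parallel acceleration of a series of matrix multiplications. After parallelization, one iteration on a 129×129 grid takes about 575 μs.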
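For readers unfamiliar with the tridiagonal solve named in the abstract above, the following is a minimal serial sketch of the Thomas algorithm in Python. It is an illustration only: the abstract does not say which tridiagonal method the authors run on the GPU, and the function name `solve_tridiagonal` and its argument layout are assumptions made here.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the (serial) Thomas algorithm.

    All four lists have length n. lower[0] and upper[n-1] are ignored.
    Assumes no pivoting is needed (e.g. a diagonally dominant matrix).
    This is an illustrative sketch, not the method from the abstract.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Example: the 3x3 system with stencil (-1, 2, -1) and rhs (1, 0, 1).
solution = solve_tridiagonal(
    [0.0, -1.0, -1.0], [2.0, 2.0, 2.0], [-1.0, -1.0, 0.0], [1.0, 0.0, 1.0]
)
```

The forward sweep is inherently sequential along one line, so GPU codes typically either solve many independent lines at once or switch to a parallel variant such as cyclic reduction; which of these the HL-2A code uses is not stated in the abstract.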
90.
The prediction of the penetration of three-dimensional (3D) shaped charge into steel plates is a challenging task. In this paper, the smoothed particle hydrodynamics (SPH) method is applied to simulate the jet formation generated by the shaped charge detonation and its damage to steel plates. The Jones–Wilkins–Lee (JWL) equation of state (EOS), Tillotson EOS, and elastic–perfectly plastic constitutive model were incorporated into SPH for the modeling of explosive detonation and dynamic behavior of metal material. The compute unified device architecture (CUDA) parallel programming interface has been employed in SPH to improve the computational efficiency of SPH. Firstly, the constitutive models and EOSs are validated by 3D TNT slab detonation and aluminum–aluminum (Al–Al) high velocity impact. Then the jet formation of the shaped charge detonation and its penetration into the steel plates are investigated using the graphics processing unit (GPU)-accelerated SPH methodology. The numerical results of these test cases are compared against the published experimental data or analytical result, which shows that the GPU-accelerated SPH methodology is capable of tackling the 3D shaped charge detonation and penetration involving millions of particles with high computational efficiency.
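As a reader's aid, the JWL equation of state mentioned in the abstract above is commonly written in the form p = A(1 − ω/(R1·V))·exp(−R1·V) + B(1 − ω/(R2·V))·exp(−R2·V) + ω·E/V, where V is the relative volume and E the internal energy per unit initial volume. The Python sketch below evaluates that standard form. The function name `jwl_pressure` is an assumption, and the parameter values in the example are placeholders chosen for illustration, not values taken from the paper.

```python
import math


def jwl_pressure(v, e, a, b, r1, r2, omega):
    """Pressure from the standard JWL form.

    v      relative volume (current volume / initial volume), v > 0
    e      internal energy per unit initial volume
    a, b   pressure-like fitting constants (same units as the result)
    r1, r2, omega  dimensionless fitting constants
    """
    term_a = a * (1.0 - omega / (r1 * v)) * math.exp(-r1 * v)
    term_b = b * (1.0 - omega / (r2 * v)) * math.exp(-r2 * v)
    return term_a + term_b + omega * e / v


# Placeholder constants for illustration only (NOT from the paper).
pressure = jwl_pressure(v=1.0, e=7.0, a=370.0, b=3.0, r1=4.0, r2=1.0, omega=0.3)
```

At large V the two exponential terms vanish and the expression reduces to ω·E/V, the ideal-gas-like tail of the expansion.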