
1.

Fast calculation of HELAS amplitudes using graphics processing unit (GPU)

K. Hagiwara J. Kanzaki N. Okamura D. Rainwater T. Stelzer 《The European Physical Journal C - Particles and Fields》2010,66(3-4):477-492

We use the graphics processing unit (GPU) for fast calculations of helicity amplitudes of physics processes. As our first attempt, we compute $u\overline{u}\rightarrow n\gamma$ (n=2 to 8) processes in pp collisions at $\sqrt{s}=14$ TeV by transferring the MadGraph generated HELAS amplitudes (FORTRAN) into newly developed HEGET (HELAS Evaluation with GPU Enhanced Technology) codes written in CUDA, a C-platform developed by NVIDIA for general purpose computing on the GPU. Compared with the usual CPU programs, we obtain a 40–150 times better performance on the GPU.

2.

Calculation of HELAS amplitudes for QCD processes using graphics processing unit (GPU)

K. Hagiwara J. Kanzaki N. Okamura D. Rainwater T. Stelzer 《The European Physical Journal C - Particles and Fields》2010,70(1-2):513-524

We use a graphics processing unit (GPU) for fast calculations of helicity amplitudes of quark and gluon scattering processes in massless QCD. New HEGET (HELAS Evaluation with GPU Enhanced Technology) codes for gluon self-interactions are introduced, and a C++ program to convert the MadGraph generated FORTRAN codes into HEGET codes in CUDA (a C-platform for general purpose computing on GPU) is created. Because of the proliferation of the number of Feynman diagrams and the number of independent color amplitudes, the maximum number of final state jets we can evaluate on a GPU is limited to 4 for pure gluon processes (gg→4g), or 5 for processes with one or more quark lines such as $q\overline{q}\rightarrow 5g$ and qq→qq+3g. Compared with the usual CPU-based programs, we obtain 60–100 times better performance on the GPU, except for 5-jet production processes and the gg→4g processes for which the GPU gain over the CPU is about 20.

3.

Fast evaluation of Helmholtz potential on graphics processing units (GPUs)

Shaojing Li Boris Livshitz Vitaliy Lomakin 《Journal of computational physics》2010,229(22):8463-8483

This paper presents a parallel algorithm implemented on graphics processing units (GPUs) for rapidly evaluating spatial convolutions between the Helmholtz potential and a large-scale source distribution. The algorithm implements a non-uniform grid interpolation method (NGIM), which uses amplitude and phase compensation and spatial interpolation from a sparse grid to compute the field outside a source domain. NGIM reduces the computational time cost of the direct field evaluation at N observers due to N co-located sources from O(N²) to O(N) in the static and low-frequency regimes, to O(N log N) in the high-frequency regime, and between these costs in the mixed-frequency regime. Memory requirements scale as O(N) in all frequency regimes. Several important differences between the CPU and GPU implementations of the NGIM are required to obtain optimal performance on the respective platforms. In particular, in the CPU implementations all operations, where possible, are pre-computed and stored in memory in a preprocessing stage. This reduces the computational time but significantly increases the memory consumption. In the GPU implementations, where memory handling is often a critical bottleneck, several special memory handling techniques are used to accelerate the computations. The significant latency of GPU global memory access is hidden by implementing coalesced reading, which requires arranging many array elements in contiguous parts of memory. In contrast to the CPU version, most of the steps in the GPU implementations are executed on the fly and only the necessary arrays are kept in memory. This results in significantly reduced memory consumption, a larger problem size N that can be handled, and reduced computational time on GPUs. The obtained GPU–CPU speed-up ratios are from 150 to 400 depending on the required accuracy and problem size. The presented method and its CPU and GPU implementations can find important applications in various fields of physics and engineering.
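
For orientation, the O(N²) direct evaluation that the abstract above uses as its baseline can be sketched in a few lines of Python. This is only an illustration, not the authors' NGIM or their GPU code. The function name `helmholtz_direct`, the kernel normalization exp(ikr)/(4πr), and the choice to skip the singular self-term are all assumptions made for this sketch.

```python
import cmath
import math


def helmholtz_direct(points, charges, k):
    """Direct O(N^2) sum (illustrative baseline, not NGIM).

    Assumed kernel: phi_i = sum over j != i of q_j * exp(i*k*r_ij) / (4*pi*r_ij),
    where r_ij is the distance between points i and j.
    The points are assumed to be pairwise distinct.
    """
    n = len(points)
    phi = [0j] * n
    for i in range(n):          # loop over observers
        for j in range(n):      # loop over sources
            if i == j:
                continue        # skip the singular self-interaction term
            r = math.dist(points[i], points[j])
            phi[i] += charges[j] * cmath.exp(1j * k * r) / (4.0 * math.pi * r)
    return phi


# Usage: two unit sources one length unit apart, wavenumber k = 2.
pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(helmholtz_direct(pts, [1.0, 1.0], 2.0))
```

The two nested loops make the quadratic cost explicit: every observer visits every source, which is the work that a fast method such as the one described in the abstract is designed to avoid.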

4.

Fast extended focused imaging in digital holography using a graphics processing unit

Wang L Zhao J Di J Jiang H 《Optics letters》2011,36(9):1620-1622

We present a simple and effective method for reconstructing extended focused images in digital holography using a graphics processing unit (GPU). The Fresnel transform method is simplified by an algorithm named fast Fourier transform pruning with frequency shift. Then the pixel size consistency problem is solved by coordinate transformation and combining the subpixel resampling and the fast Fourier transform pruning with frequency shift. With the assistance of the GPU, we implemented an improved parallel version of this method, which obtained about a 300-500-fold speedup compared with central processing unit codes.

5.

Parallelization of heterogeneous reactor calculations on a graphics processing unit

V. M. Malofeev V. A. Pal’shin 《Physics of Atomic Nuclei》2016,79(8):1246-1251

Parallelization is applied to the neutron calculations performed by the heterogeneous method on a graphics processing unit. The parallel algorithm of the modified TREC code is described. The efficiency of the parallel algorithm is evaluated.

6.

Acceleration method for computer generated spherical hologram calculation of real objects using graphics processing unit (Invited Paper)

Gang Li Keehoon Hong Jiwoon Yeom Ni Chen Jae-Hyeung Park Nam Kim Byoungho Lee 《Chinese Optics Letters》2014,12(6):60016-75

A graphics processing unit (GPU) based fast calculation method for the computer generated spherical hologram (CGSH) of a real-existing object is proposed. A three-dimensional (3D) point cloud is constructed by capturing a real-existing object from multiple directions using a depth camera. The GPU based calculation is used in both the hologram generation part and the numerical reconstruction part of the CGSH. The improved calculation efficiency is verified by comparing the computation speed of the central processing unit (CPU) based and GPU based implementations.

7.

Redundancy computation analysis and implementation of phase diversity based on GPU

Quan Zhang Hua Bao Changhui Rao Zhenming Peng 《Optical Review》2015,22(5):741-752


8.

Fast stray field computation on tensor grids

L. Exl W. Auzinger S. Bance M. Gusenbauer F. Reichel T. Schrefl 《Journal of computational physics》2012,231(7):2840-2850

A direct integration algorithm is described to compute the magnetostatic field and energy for given magnetization distributions on not necessarily uniform tensor grids. We use an analytically-based tensor approximation approach for function-related tensors, which reduces calculations to multilinear algebra operations. The algorithm scales with $N^{4/3}$ for N computational cells used and with $N^{2/3}$ (sublinear) when magnetization is given in canonical tensor format. In the final section we confirm our theoretical results concerning computing times and accuracy by means of numerical examples.

9.

GPU-based fast phase unwrapping algorithm for digital holography

《Optics & Optoelectronic Technology》2015,(4)


10.

On a method of computation of scattering amplitudes and on ionization of hydrogen atoms

J. N. Das A. K. Biswas 《Czechoslovak Journal of Physics》1988,38(10):1140-1145

The general expressions for the scattering amplitudes due to Das have been rederived following a new approach. It is possible in this approach to improve the results considerably. Ionization of hydrogen atoms by electrons and positrons has also been considered.

11.

Simulation of fluid-structure interaction in a microchannel using the lattice Boltzmann method and size-dependent beam element on a graphics processing unit


Vahid Esfahanian Esmaeil Dehdashti Amir Mehdi Dehrouye-Semnani 《Chinese Physics B》2014,(8):389-395

Fluid-structure interaction (FSI) problems in microchannels play a prominent role in many engineering applications. The present study is an effort toward the simulation of flow in a microchannel considering FSI. The bottom boundary of the microchannel is simulated by size-dependent beam elements for the finite element method (FEM) based on a modified couple stress theory. The lattice Boltzmann method (LBM) using the D2Q13 LB model is coupled to the FEM in order to solve the fluid part of the FSI problem. Because the LBM generally needs only nearest neighbor information, the algorithm is an ideal candidate for parallel computing. The simulations are carried out on graphics processing units (GPUs) using the compute unified device architecture (CUDA). In the present study, the governing equations are non-dimensionalized and the set of dimensionless groups is exhibited to show their effects on micro-beam displacement. The numerical results show that the displacements of the micro-beam predicted by the size-dependent beam element are smaller than those by the classical beam element.

12.

Off-mass shell dual amplitudes (II)

John H. Schwarz C.C. Wu 《Nuclear Physics B》1974,72(3):397-412

A new method for continuing dual model amplitudes off the mass shell was recently proposed. This paper explores some of the properties of the resulting amplitudes. It is demonstrated that one-current amplitudes contain fixed poles in the J-plane at positions that are correlated with the asymptotic power behavior of form factors. The two-point function is explicitly calculated and shown to fall asymptotically as a power provided that a certain condition involving the dimension of space-time and a parameter (that is believed to be correlated with the leading Regge intercept) is satisfied. Certain formulas required for future investigation of more complicated amplitudes are also derived.

13.

Off-shell closed string amplitudes: Towards a computation of the tachyon potential

《Nuclear Physics B》1995,442(3):494-532

We derive an explicit formula for the evaluation of the classical closed string action for any off-shell string field, and for the calculation of arbitrary off-shell amplitudes. The formulae require a parametrization, in terms of some moduli space coordinates, of the family of local coordinates needed to insert the off-shell states on Riemann surfaces. We discuss in detail the evaluation of the tachyon potential as a power series in the tachyon field. The expansion coefficients in this series are shown to be geometrical invariants of Strebel quadratic differentials whose variational properties imply that closed string polyhedra, among all possible choices of string vertices, yield a tachyon potential which is as small as possible order by order in the string coupling constant. Our discussion emphasizes the geometrical meaning of off-shell amplitudes.

14.

Fast computer simulation of reconstructed image from rainbow hologram based on GPU

Jiao Shuming Hiroshi Yoshikawa 《Optical Review》2015,22(5):841-843


15.

A fast calculation method of optical transfer function using GPU parallel computation

Quan Zhang Hua Bao Changhui Rao Zhenming Peng 《Optical Review》2015,22(6):903-910


16.

Localisability and single-particle exchange amplitudes (II)

R.J. Rivers 《Nuclear Physics B》1973,52(1):155-168

We examine necessary causal constraints on the growth of coupling constants in a class of meromorphic amplitudes that can have good Regge-pole behaviour. Such necessary conditions are stronger than those proposed earlier for a simpler class of amplitudes. As an example we show that the Veneziano amplitude is acausal trajectory by trajectory.

17.

Fast parallel Grad–Shafranov solver for real-time equilibrium reconstruction in EAST tokamak using graphic processing unit


Huang Yao Xiao Bingjia Luo Zhengping 《Chinese Physics B》2017,26(8):85204-085204

To achieve real-time control of tokamak plasmas, the equilibrium reconstruction has to be completed sufficiently quickly. For EAST tokamak experiments, real-time equilibrium reconstruction is generally required to provide results within 1 ms. A graphic processing unit (GPU) parallel Grad–Shafranov (G-S) solver is developed in the P-EFIT code, which is built with the CUDA™ architecture to take advantage of massively parallel GPU cores and significantly accelerate the computation. Optimization and implementation of numerical algorithms for a block tri-diagonal linear system are presented. The solver can complete a calculation within 16 μs with a 65×65 grid size and 27 μs with a 129×129 grid size, which allows P-EFIT to meet the timing requirements for real-time plasma control with both grid sizes.
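
As background for the block tri-diagonal system mentioned in the abstract above, the scalar tridiagonal case can be solved with the classical Thomas algorithm. The Python sketch below is a serial, scalar illustration only. It is not the P-EFIT GPU solver, and the function name `solve_tridiagonal` and its argument layout are assumptions chosen for this example.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a scalar tridiagonal system (illustrative sketch).

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    All four lists have length n; lower[0] and upper[n-1] are ignored.
    No pivoting is done, so the matrix is assumed to be diagonally
    dominant or otherwise safe to eliminate in order.
    """
    n = len(diag)
    c = [0.0] * n  # modified upper-diagonal coefficients
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination sweep
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution sweep
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Usage: a 3x3 system with 2 on the diagonal and -1 on the off-diagonals.
print(solve_tridiagonal([0.0, -1.0, -1.0], [2.0, 2.0, 2.0],
                        [-1.0, -1.0, 0.0], [1.0, 0.0, 1.0]))
```

Both sweeps are inherently sequential along the rows, which is one reason a GPU implementation needs a different arrangement of the work than this serial form.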

18.

Fast computation of Voigt functions via Fourier transforms

Marcus H. Mendenhall 《Journal of Quantitative Spectroscopy & Radiative Transfer》2007,105(3):519-524


19.

Parallel acceleration of layer-based (tomographic) computation of holograms of three-dimensional objects


Xiao Bo Zheng Huadong Liu Kejian Li Fei Gao Zhifang 《Journal of Applied Optics》2019,40(4):620-626

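As the resolution and size of spatial light modulators increase, the computational load of dynamic three-dimensional holographic display grows as well, placing new demands on the speed of hologram computation. GPU parallel computing is used to implement fast layer-based (tomographic) hologram calculation. The method exploits GPU multithreading and the two-dimensional Fourier transforms of images in the layer-based method to accelerate the Fresnel diffraction transform algorithm. In addition, by calling low-level GPU resources and using stream processing in CUDA, it effectively reduces intermediate waiting latency. A comparison of computation speeds shows that the GPU-based parallel method is about 10 times faster than the CPU-based method.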

20.

On the computation of one-loop amplitudes with external fermions in 4D heterotic superstrings

Andrea Pasquinucci Kaj Roland 《Nuclear Physics B》1995,440(3):441-494

We present the technical tools needed to compute any one-loop amplitude involving external spacetime fermions in a four-dimensional heterotic string model à la Kawai-Lewellen-Tye. As an example, we compute the one-loop three-point amplitude with one “photon” and two external massive fermions (“electrons”). As a check of our computation, we verify that the one-loop contribution to the Anomalous Magnetic Moment vanishes if the model has spacetime supersymmetry, as required by the supersymmetric sum rules.