Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
We use the graphics processing unit (GPU) for fast calculations of helicity amplitudes of physics processes. As our first attempt, we compute $u\overline{u}\rightarrow n\gamma$ (n=2 to 8) processes in pp collisions at $\sqrt{s}=14$  TeV by transferring the MadGraph generated HELAS amplitudes (FORTRAN) into newly developed HEGET (HELAS Evaluation with GPU Enhanced Technology) codes written in CUDA, a C-platform developed by NVIDIA for general purpose computing on the GPU. Compared with the usual CPU programs, we obtain a 40–150 times better performance on the GPU.

2.
We use a graphics processing unit (GPU) for fast calculations of helicity amplitudes of quark and gluon scattering processes in massless QCD. New HEGET (HELAS Evaluation with GPU Enhanced Technology) codes for gluon self-interactions are introduced, and a C++ program to convert the MadGraph generated FORTRAN codes into HEGET codes in CUDA (a C-platform for general purpose computing on GPU) is created. Because of the proliferation of the number of Feynman diagrams and the number of independent color amplitudes, the maximum number of final state jets we can evaluate on a GPU is limited to 4 for pure gluon processes (gg→4g), or 5 for processes with one or more quark lines such as $q\overline{q}\rightarrow 5g$ and qqqq+3g. Compared with the usual CPU-based programs, we obtain 60–100 times better performance on the GPU, except for 5-jet production processes and the gg→4g processes for which the GPU gain over the CPU is about 20.

3.
This paper presents a parallel algorithm implemented on graphics processing units (GPUs) for rapidly evaluating spatial convolutions between the Helmholtz potential and a large-scale source distribution. The algorithm implements a non-uniform grid interpolation method (NGIM), which uses amplitude and phase compensation and spatial interpolation from a sparse grid to compute the field outside a source domain. NGIM reduces the computational time cost of the direct field evaluation at N observers due to N co-located sources from O(N^2) to O(N) in the static and low-frequency regimes, to O(N log N) in the high-frequency regime, and between these costs in the mixed-frequency regime. Memory requirements scale as O(N) in all frequency regimes. The CPU and GPU implementations of the NGIM must differ in several important ways to achieve optimal performance on their respective platforms. In particular, in the CPU implementations all operations, where possible, are pre-computed and stored in memory in a preprocessing stage. This reduces the computational time but significantly increases the memory consumption. In the GPU implementations, where memory handling is often a critical bottleneck, several special memory handling techniques are used to accelerate the computations. The significant latency of GPU global memory access is hidden by coalesced reads, which require arranging many array elements in contiguous parts of memory. In contrast to the CPU version, most of the steps in the GPU implementations are executed on the fly and only the necessary arrays are kept in memory. This results in significantly reduced memory consumption, a larger problem size N that can be handled, and reduced computational time on GPUs. The obtained GPU–CPU speed-up ratios range from 150 to 400, depending on the required accuracy and problem size. The presented method and its CPU and GPU implementations can find important applications in various fields of physics and engineering.
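
For orientation, the O(N^2) baseline that the abstract above compares against can be sketched as a direct double loop over observers and sources. This is only an illustrative sketch, not the NGIM and not the authors' code: the function name `helmholtz_direct`, the kernel normalization 1/(4πR), the exp(-jkR) sign convention, and the choice to skip the singular self term are all assumptions made here.

```python
import cmath
import math


def helmholtz_direct(sources, amplitudes, observers, k):
    """Direct evaluation of phi(r_i) = sum_j q_j * exp(-1j*k*R_ij) / (4*pi*R_ij).

    The cost is len(observers) * len(sources) kernel evaluations, i.e. O(N^2)
    when the N observers coincide with the N sources.
    The sign convention and normalization are assumptions of this sketch.
    """
    fields = []
    for obs in observers:
        total = 0j
        for src, q in zip(sources, amplitudes):
            dist = math.dist(obs, src)  # Euclidean distance R_ij
            if dist == 0.0:
                continue  # skip the singular self term for co-located points
            total += q * cmath.exp(-1j * k * dist) / (4.0 * math.pi * dist)
        fields.append(total)
    return fields


# Two co-located source/observer points a distance 2 apart, static case (k = 0).
points = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
potential = helmholtz_direct(points, [1.0, 1.0], points, k=0.0)
```

In the static case each point sees only the other source, so both entries of `potential` equal 1/(8π). The two nested loops are what a fast method such as the NGIM is designed to avoid.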

4.
Wang L  Zhao J  Di J  Jiang H 《Optics letters》2011,36(9):1620-1622
We present a simple and effective method for reconstructing extended focused images in digital holography using a graphics processing unit (GPU). The Fresnel transform method is simplified by an algorithm named fast Fourier transform pruning with frequency shift. Then the pixel size consistency problem is solved by coordinate transformation and combining the subpixel resampling and the fast Fourier transform pruning with frequency shift. With the assistance of the GPU, we implemented an improved parallel version of this method, which obtained about a 300-500-fold speedup compared with central processing unit codes.

5.
Parallelization is applied to the neutron calculations performed by the heterogeneous method on a graphics processing unit. The parallel algorithm of the modified TREC code is described. The efficiency of the parallel algorithm is evaluated.

6.
A graphics processing unit (GPU) based fast calculation method for the computer generated spherical hologram (CGSH) of a real object is proposed. A three-dimensional (3D) point cloud is constructed by capturing the real object from multiple directions using a depth camera. The GPU based calculation is used in both the hologram generation part and the numerical reconstruction part of the CGSH. The improved calculation efficiency is verified by comparing the computation speed of the central processing unit (CPU) based and GPU based implementations.

7.
8.
A direct integration algorithm is described to compute the magnetostatic field and energy for given magnetization distributions on not necessarily uniform tensor grids. We use an analytically-based tensor approximation approach for function-related tensors, which reduces calculations to multilinear algebra operations. The algorithm scales with $N^{4/3}$ for N computational cells used and with $N^{2/3}$ (sublinear) when magnetization is given in canonical tensor format. In the final section we confirm our theoretical results concerning computing times and accuracy by means of numerical examples.

9.
10.
The general expressions for the scattering amplitudes due to Das have been rederived following a new approach. It is possible in this approach to improve the results considerably. Ionization of hydrogen atoms by electrons and positrons has also been considered.

11.
Fluid-structure interaction (FSI) problems in microchannels play a prominent role in many engineering applications. The present study is an effort toward the simulation of flow in a microchannel considering FSI. The bottom boundary of the microchannel is simulated by size-dependent beam elements for the finite element method (FEM) based on a modified couple stress theory. The lattice Boltzmann method (LBM) using the D2Q13 LB model is coupled to the FEM in order to solve the fluid part of the FSI problem. Because the LBM generally needs only nearest neighbor information, the algorithm is an ideal candidate for parallel computing. The simulations are carried out on graphics processing units (GPUs) using the compute unified device architecture (CUDA). In the present study, the governing equations are non-dimensionalized and the set of dimensionless groups is presented to show their effects on micro-beam displacement. The numerical results show that the displacements of the micro-beam predicted by the size-dependent beam element are smaller than those predicted by the classical beam element.

12.
A new method for continuing dual model amplitudes off the mass shell was recently proposed. This paper explores some of the properties of the resulting amplitudes. It is demonstrated that one-current amplitudes contain fixed poles in the J-plane at positions that are correlated with the asymptotic power behavior of form factors. The two-point function is explicitly calculated and shown to fall asymptotically as a power provided that a certain condition involving the dimension of space-time and a parameter (that is believed to be correlated with the leading Regge intercept) is satisfied. Certain formulas required for future investigation of more complicated amplitudes are also derived.

13.
《Nuclear Physics B》1995,442(3):494-532
We derive an explicit formula for the evaluation of the classical closed string action for any off-shell string field, and for the calculation of arbitrary off-shell amplitudes. The formulae require a parametrization, in terms of some moduli space coordinates, of the family of local coordinates needed to insert the off-shell states on Riemann surfaces. We discuss in detail the evaluation of the tachyon potential as a power series in the tachyon field. The expansion coefficients in this series are shown to be geometrical invariants of Strebel quadratic differentials whose variational properties imply that closed string polyhedra, among all possible choices of string vertices, yield a tachyon potential which is as small as possible order by order in the string coupling constant. Our discussion emphasizes the geometrical meaning of off-shell amplitudes.

14.
15.
16.
We examine necessary causal constraints on the growth of coupling constants in a class of meromorphic amplitudes that can have good Regge-pole behaviour. Such necessary conditions are stronger than those proposed earlier for a simpler class of amplitudes. As an example we show that the Veneziano amplitude is acausal trajectory by trajectory.

17.
Huang Yao  Xiao Bingjia  Luo Zhengping 《Chinese Physics B》2017,26(8):85204-085204
To achieve real-time control of tokamak plasmas, the equilibrium reconstruction has to be completed sufficiently quickly. For EAST tokamak experiments, real-time equilibrium reconstruction is generally required to provide results within 1 ms. A graphics processing unit (GPU) parallel Grad–Shafranov (G-S) solver is developed in the P-EFIT code, which is built on the CUDA™ architecture to take advantage of massively parallel GPU cores and significantly accelerate the computation. The optimization and implementation of numerical algorithms for a block tri-diagonal linear system are presented. The solver completes a calculation within 16 μs on a 65×65 grid and within 27 μs on a 129×129 grid, which allows P-EFIT to meet the timing requirement for real-time plasma control with both grid sizes.
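
As background for the tri-diagonal systems mentioned in the abstract above, the serial Thomas algorithm for a scalar tridiagonal system can be sketched as follows. This is a simplified stand-in and not the P-EFIT solver, which the abstract describes as a GPU-parallel solver for a block tri-diagonal system; the function name `solve_tridiagonal` and the array layout are assumptions of this sketch, and no pivoting is performed, so it assumes a well-conditioned (for example diagonally dominant) matrix.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a scalar tridiagonal system with the Thomas algorithm.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    All four lists have length n; lower[0] and upper[n-1] are ignored.
    No pivoting is done (assumes a diagonally dominant matrix).
    """
    n = len(diag)
    c_prime = [0.0] * n  # modified super-diagonal
    d_prime = [0.0] * n  # modified right-hand side

    # Forward sweep: eliminate the sub-diagonal.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


# 3x3 example with the stencil (-1, 2, -1); the exact solution is [1, 2, 3].
solution = solve_tridiagonal(
    lower=[0.0, -1.0, -1.0],
    diag=[2.0, 2.0, 2.0],
    upper=[-1.0, -1.0, 0.0],
    rhs=[0.0, 0.0, 4.0],
)
```

The forward sweep is inherently sequential, with each row depending on the previous one, which is why GPU solvers generally need a reformulated, parallel-friendly algorithm rather than a direct port of this loop.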

18.
19.
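As the resolution of spatial light modulators increases, the computational load of three-dimensional dynamic holographic display grows as well, placing new demands on the speed of hologram computation. GPU parallel computing is used to implement a fast layer-based hologram computation. The method exploits GPU multithreading and the two-dimensional Fourier transforms of images in the layer-based method to accelerate the Fresnel diffraction transform algorithm. In addition, by calling low-level GPU resources and using stream processing in CUDA programs, intermediate latency is effectively reduced. A comparison of computation speeds shows that the speed improves substantially relative to running on the CPU: the GPU-based parallel method is about 10 times faster than the CPU-based method.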

20.
We present the technical tools needed to compute any one-loop amplitude involving external spacetime fermions in a four-dimensional heterotic string model à la Kawai-Lewellen-Tye. As an example, we compute the one-loop three-point amplitude with one “photon” and two external massive fermions (“electrons”). As a check of our computation, we verify that the one-loop contribution to the Anomalous Magnetic Moment vanishes if the model has spacetime supersymmetry, as required by the supersymmetric sum rules.
