Similar articles
 20 similar articles found (search time: 843 ms)
1.
The present work assesses the impact, in terms of time to solution, throughput analysis, and hardware scalability, of transferring computationally intensive tasks found in compressible reacting flow solvers to the GPU. Attention is focused on outlining the workflow and data transfer penalties associated with “plugging in” a recently developed GPU-based chemistry library into (a) a purely CPU-based solver and (b) a GPU-based solver where, except for the chemistry, all other variables are computed on the GPU. This comparison allows quantification of the effect of host-to-device (and device-to-host) data transfer penalties on the overall solver speedup as a function of mesh and reaction mechanism size. To this end, the GPU-based chemistry library UMChemGPU is employed to treat the kinetics in the flow solver KARFS. UMChemGPU replaces conventional CPU-based Cantera routines using a matrix-based formulation. The impact of (i) data transfer times, (ii) chemistry acceleration, and (iii) the hardware architecture is studied in detail in the context of GPU saturation limits. Hydrogen and dimethyl ether (DME) reaction mechanisms are used to assess the impact of the number of species and reactions on the overall and chemistry-only speedup. It was found that offloading the source term computation to UMChemGPU results in up to a 7× reduction in overall time to solution and a source term computation four orders of magnitude faster than conventional CPU-based methods. Furthermore, the metrics for achieving maximum performance gain using GPU chemistry with an MPI + CUDA solver are explained using the Roofline model. Integrating UMChemGPU with an MPI + OpenMP solver does not improve the overall performance because of the data copy time between the device (GPU) and host (CPU) memory spaces. The performance portability was demonstrated using three different GPU architectures, and the findings are expected to translate to a wide variety of high-performance codes in the combustion community.
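The Roofline model mentioned in this abstract bounds attainable performance by the smaller of the peak compute rate and the product of memory bandwidth and arithmetic intensity. A minimal Python sketch of that bound follows; the function names and the device numbers are hypothetical and chosen only for illustration, and nothing here is taken from UMChemGPU or KARFS.

```python
def attainable_gflops(intensity_flop_per_byte, peak_gflops, bandwidth_gb_per_s):
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak_gflops, intensity_flop_per_byte * bandwidth_gb_per_s)


def ridge_point(peak_gflops, bandwidth_gb_per_s):
    """Arithmetic intensity (FLOP/byte) at which the two roofs meet."""
    return peak_gflops / bandwidth_gb_per_s


# Hypothetical device figures, assumed for illustration only.
PEAK_GFLOPS = 7000.0
BANDWIDTH_GB_PER_S = 900.0

for intensity in (0.5, 4.0, 16.0):
    bound = attainable_gflops(intensity, PEAK_GFLOPS, BANDWIDTH_GB_PER_S)
    if intensity < ridge_point(PEAK_GFLOPS, BANDWIDTH_GB_PER_S):
        regime = "memory-bound"
    else:
        regime = "compute-bound"
    print(f"{intensity:5.1f} FLOP/byte -> at most {bound:7.1f} GFLOP/s ({regime})")
```

A kernel whose arithmetic intensity lies below the ridge point is limited by memory traffic, so faster arithmetic alone would not help it.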

2.
An efficient algorithm for time-domain solution of the acoustic wave equation for the purpose of room acoustics is presented. It is based on adaptive rectangular decomposition of the scene and uses analytical solutions within the partitions that rely on a spatially invariant speed of sound. This technique is suitable for auralizations and sound field visualizations, even on coarse meshes approaching the Nyquist limit. It is demonstrated that by carefully mapping all components of the algorithm to match the parallel processing capabilities of graphics processors (GPUs), a significant improvement in performance is gained compared to the corresponding CPU-based solver, while maintaining the numerical accuracy. A substantial performance gain over a high-order finite-difference time-domain method is observed. Using this technique, a 1 s long simulation can be performed on scenes of air volume 7500 m³ up to 1650 Hz within 18 min, compared to the corresponding CPU-based solver, which takes around 5 h, and a high-order finite-difference time-domain solver, which could take up to three weeks on a desktop computer. To the best of the authors’ knowledge, this is the fastest time-domain solver for modeling the room acoustics of large, complex-shaped 3D scenes that generates accurate results for both auralization and visualization.

3.
Accurate radiative transfer models are key tools for understanding radiative transfer processes in the atmosphere and ocean, and for developing remote sensing algorithms. The widely used scalar approximation of radiative transfer can lead to errors in calculated top-of-atmosphere radiances. We show results with errors on the order of ±8% for atmosphere-ocean systems with case one waters. Variations in sea water salinity and temperature can lead to variations in the signal of similar magnitude. Therefore, we enhanced our scalar radiative transfer model MOMO, which is in use at Freie Universität Berlin, to treat these effects as accurately as possible. We describe our one-dimensional vector radiative transfer model for an atmosphere-ocean system with a rough interface. We describe the matrix operator scheme and the bio-optical model for case one waters. We discuss some effects of neglecting polarization in radiative transfer calculations and the effects of salinity changes on top-of-atmosphere radiances. Results are shown for the channels of the satellite instruments MERIS and OLCI from 412.5 nm to 900 nm.

4.
In this paper, we focus on graphics processing units (GPUs) and discuss how their architecture affects the choice of algorithm and the implementation of fully implicit petroleum reservoir simulation. To obtain satisfactory performance on new many-core architectures such as GPUs, simulator developers must know a great deal about the specific hardware and spend a lot of time fine-tuning the code. Porting a large petroleum reservoir simulator to emerging hardware architectures is expensive and risky. We analyze the major components of an in-house reservoir simulator and investigate how to port them to GPUs in a cost-effective way. Preliminary numerical experiments show that our GPU-based simulator is robust and effective. More importantly, these numerical results clearly identify the main bottlenecks to obtaining ideal speedup on GPUs and possibly other many-core architectures.

5.
This paper studies the decoupling error associated with the atmospheric correction procedures in ocean color remote sensing algorithms. The decoupling error is caused by the lack of proper consideration of multiple scattering between the atmospheric and ocean components. In other words, the atmosphere and ocean are not coupled properly. A vector radiative transfer model for the coupled atmosphere and ocean (CAO) system based on the successive order of scattering (SOS) method is used to study the error. The inherent optical properties (IOPs) of the ocean are provided by the most up-to-date bio-optical models. Two wavelengths are used in the study, 412 and 555 nm. For a detector located just above the ocean interface, the decoupling errors range from 0.3% to 7% at 412 nm and from 0.3% to 3% at 555 nm for zenith viewing angles smaller than 70°. The decoupling errors are significantly larger at larger zenith viewing angles for this detector. For a detector at the top of the atmosphere (TOA), it is hard to separate the decoupling error from the error introduced by the diffuse transmittance. If we assume the upwelling radiance is uniform just below the ocean surface when estimating the diffuse transmittance, the decoupling errors range from −4% to 8% for zenith viewing angles smaller than 70°, and negative decoupling errors show up mainly at large zenith viewing angles.

6.
The Joint Airborne IASI Validation Experiment (JAIVEx) was designed to investigate the absolute radiometric accuracy of the Infrared Atmospheric Sounding Interferometer (IASI) and to test the radiative transfer algorithms on which applications using IASI radiances rely. Two comprehensively instrumented research aircraft participated in coordinated measurements co-aligned with overpasses of the IASI instrument, with airborne interferometers obtaining radiance observations alongside intensive measurements of the atmospheric state. The JAIVEx data set has been used to place an upper bound on the absolute radiometric accuracy of IASI radiances. Further, a set of clear-air case studies has been used to test competing formulations of the CO₂ line shape, water vapor spectroscopic line parameters, and continuum. The current state-of-the-art performance of line-by-line models is established, with implications for the optimal use of IASI radiances in numerical weather prediction.

7.
This paper reports the results of a time-resolved study of photoluminescence and energy transfer processes in Ce³⁺-doped SrAlF₅ single crystals. Several Ce³⁺ centers emitting near 4 eV due to 5d–4f transitions of Ce³⁺ ions substituting for Sr²⁺ in non-equivalent lattice sites were identified. The lifetime of these transitions is in the range of 25–35 ns under intra-center excitation in the energy region of 4–7 eV at T = 10 K. An effective energy transfer from lattice defects to dopant ions was revealed in the 7–11 eV energy range. Both direct and indirect excitation channels are efficient at room temperature. Excitons bound to dopants are revealed at T = 10 K under excitation in the fundamental absorption region above 11 eV, as well as radiative decay of self-trapped excitons resulting in luminescence near 3 eV.

8.
We present a GPU calculation with the compute unified device architecture (CUDA) for the Wolff single-cluster algorithm of the Ising model. Proposing an algorithm for quasi-block synchronization, we realize the Wolff single-cluster Monte Carlo simulation with CUDA. We perform parallel computations for the newly added spins in the growing cluster. As a result, the GPU calculation speed for the two-dimensional Ising model at the critical temperature with linear size L = 4096 is 5.60 times as fast as the calculation speed on a current CPU core. For the three-dimensional Ising model with linear size L = 256, the GPU calculation speed is 7.90 times as fast as the CPU calculation speed. The idea of quasi-block synchronization can be used not only in the cluster algorithm but also in many fields where the synchronization of all threads is required.
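For orientation, the Wolff single-cluster update that this abstract parallelizes can be sketched serially as follows. This is a plain CPU reference in Python for the two-dimensional Ising model with coupling J = 1 and periodic boundaries, not the authors' CUDA implementation; the function name `wolff_step` and the flat-list lattice layout are assumptions made for the sketch. The `frontier` queue holds the newly added spins, which is the part the abstract describes processing in parallel.

```python
import math
import random
from collections import deque


def wolff_step(spins, L, beta, rng):
    """Grow and flip one Wolff cluster on an L x L periodic lattice (J = 1).

    spins is a flat list of +1/-1 values, indexed as x + y * L.
    Returns the number of spins in the flipped cluster.
    """
    p_add = 1.0 - math.exp(-2.0 * beta)  # bond-activation probability
    seed = rng.randrange(L * L)
    s0 = spins[seed]
    spins[seed] = -s0  # flipping on entry also marks the site as visited
    frontier = deque([seed])  # newly added spins whose neighbours are unchecked
    size = 1
    while frontier:
        site = frontier.popleft()
        x, y = site % L, site // L
        neighbours = (
            (x + 1) % L + y * L,
            (x - 1) % L + y * L,
            x + ((y + 1) % L) * L,
            x + ((y - 1) % L) * L,
        )
        for n in neighbours:
            # Only spins still aligned with the seed's original value can join.
            if spins[n] == s0 and rng.random() < p_add:
                spins[n] = -s0
                frontier.append(n)
                size += 1
    return size


if __name__ == "__main__":
    L = 32
    rng = random.Random(1)
    lattice = [rng.choice((-1, 1)) for _ in range(L * L)]
    beta_c = 0.5 * math.log(1.0 + math.sqrt(2.0))  # exact 2D critical coupling
    sizes = [wolff_step(lattice, L, beta_c, rng) for _ in range(200)]
    print("mean cluster size:", sum(sizes) / len(sizes))
```

Each pass over the frontier is independent per spin apart from the shared lattice writes, which is why the frontier is a natural unit of parallel work on a GPU.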

9.
10.
The gas absorption process scheme in the broadband radiative transfer code “mstrn8”, which is used to calculate atmospheric radiative transfer efficiently in a general circulation model, is improved. Three major improvements are made. The first is an update of the database of line absorption parameters and the continuum absorption model. The second is a change to the definition of the selection rule for gas absorption used to choose which absorption bands to include. The last is an upgrade of the optimization method used to decrease the number of quadrature points used for numerical integration in the correlated k-distribution approach, thereby achieving higher computational efficiency without losing accuracy. The new radiation package, termed “mstrnX”, computes radiation fluxes and heating rates with errors of less than 0.6 W/m² and 0.3 K/day, respectively, through the troposphere and the lower stratosphere for any of the standard AFGL atmospheres. A serious cold bias problem of an atmospheric general circulation model using the predecessor code “mstrn8” is almost solved by the upgrade to “mstrnX”.

11.
This paper lays down the theoretical bases and the methods used in the Fast Optimal Retrievals on Layers for IASI (FORLI) software, which is developed and maintained at the “Université Libre de Bruxelles” (ULB) with the support of the “Laboratoire Atmosphères, Milieux, Observations Spatiales” (LATMOS) to process radiance spectra from the Infrared Atmospheric Sounding Interferometer (IASI) with a view to local-to-global chemistry applications. The forward radiative transfer model (RTM) and the retrieval approaches are formulated, and numerical approximations are described. The aim of FORLI is near-real-time provision of global-scale concentrations of trace gases from IASI, either integrated over the altitude range of the atmosphere (total columns) or vertically resolved. To this end, FORLI uses precalculated tables of absorbances. At the time of writing, three gas-specific versions of this algorithm have been set up: FORLI-CO, FORLI-O3 and FORLI-HNO3. The performance of each is reviewed, and illustrations of results and early validations are provided, making the link to recent scientific publications. In this paper we stress the challenges raised by near-real-time processing of IASI, briefly describe the processing chain set up at ULB, and outline prospects for future developments and applications.

12.
The influence of chlorinated paraffin/titanium (C₂₄H₂₉Cl₂₁/Ti) additives on the burning and radiance performance of foil-type Magnesium/Teflon/Viton™ (MTV) was investigated via a high-speed camera, a high-temperature differential thermobalance, a far-infrared thermal imager, and a Fourier Transform Infrared (FTIR) remote-sensing spectrometer. We found that the burning temperature, radiance brightness, radiance area, and radiance intensity after addition of C₂₄H₂₉Cl₂₁/Ti are improved by 124–196 °C (8–13%), 300–475 W·m⁻²·sr⁻¹ (12–19%), 943–1422 mm² (67–101%), and 3.17–4.99 W·sr⁻¹ (88–138%), respectively, and are maximized at an addition ratio of 10%. The substances formed by adding C₂₄H₂₉Cl₂₁/Ti could improve the middle and far infrared radiation.

13.
A novel pendulum-type vibration isolation system is proposed consisting of three active cables with embedded piezoelectric actuators and a passive elastomer layer. The dynamic response of the isolation module in the vertical and horizontal directions is modeled using the Lagrangian approach. The validity of the dynamic model is confirmed by comparing the simulation results for the frequency response in the vertical and horizontal directions with the experimental results. An approximate model is proposed to take into account system uncertainties such as payload changes and hysteresis effects. A robust quantitative feedback theory (QFT)-based active controller is then designed to ensure that the active control can achieve a high level of disturbance rejection in the low-frequency range even under variable loading conditions. It is shown that the controller achieves average disturbance rejection of −14 dB in the 2–60 Hz bandwidth range and −35 dB at the resonance frequency. The experimental results confirm that the proposed system achieves a robust vibration isolation performance under the payload in the range of 40–60 kg.

14.
A detailed study of the fluorescence radiative dynamics and energy transfer processes between Er and Tm ions in Er³⁺/Tm³⁺-doped fluoride glass is reported. The fluorescence properties of the 2.7 μm emission and of other infrared and visible emissions are investigated under different selective laser excitations. Three Judd–Ofelt intensity parameters, energy transfer microparameters, and efficiency have been determined and discussed. It is found that the present Er³⁺/Tm³⁺-doped fluoride glass possesses a large calculated emission cross section (8.98 × 10⁻²¹ cm²) around 2.7 μm. The more suitable pumping scheme for laser applications at 2.7 μm is 980 nm excitation for Er³⁺/Tm³⁺-doped fluoride glass.

15.
This work reports the observation of emissions at 2.9 μm, 1.8 μm, and 1.47 μm from Dy³⁺/Tm³⁺ codoped fluorophosphate glass upon excitation by a conventional 800 nm laser diode. Judd–Ofelt intensity parameters and radiative properties of Dy³⁺ ions in the present glasses were calculated using Judd–Ofelt theory. The mechanism and microparameters of the energy transfer processes were investigated based on photoluminescence performance and lifetime measurements. The Dy³⁺/Tm³⁺ codoped fluorophosphate glass, which possesses advantageous spectroscopic characteristics as well as excellent thermal stability, is a promising candidate for an efficient 2.9 μm laser.

16.
We report on the role of cross-relaxation in the decay of the ¹D₂ level of trivalent Pr in YPO₄ in crystals with Pr concentrations of 0.1%, 1%, 2%, and 5%. We have found that the ¹D₂ level decay is purely radiative in the low-doped system. As the Pr concentration is increased, the ¹D₂ luminescence is quenched due to a cross-relaxation energy transfer between two Pr ions. The temporal behavior of the ¹D₂ luminescence following pulsed excitation has been monitored in each sample at temperatures between 30 K and 300 K, and all decay curves were fit to the Yokota–Tanimoto model. The decay times decrease as temperature increases, due to an increase in both the radiative rate and the energy transfer rate with temperature. There is little evidence of diffusion at any temperature, even in the more concentrated samples. We have also fit the decay curves using the LumiTrans computer simulation. A comparison of the fits to the decay curves of the two methods is presented.

17.
The mass transfer coefficient is an important parameter in the process of mass transfer. It can reflect the degree of enhancement of the mass transfer process in liquid–solid reactions and in non-reactive systems such as dissolution and leaching, and further verify the issues by experiments in the reaction process. In the present paper, a new computational model that quantitatively solves for the ultrasonic enhancement of the mass transfer coefficient in a liquid–solid reaction is established, and the mass transfer coefficient on a silicon surface with a transducer at frequencies of 40 kHz, 60 kHz, 80 kHz, and 100 kHz has been numerically simulated. The simulation results indicate that the mass transfer coefficient increases with increasing ultrasound power, and that at an ultrasound power of 50 W its maximum value is 1.467 × 10⁻⁴ m/s at 60 kHz and its minimum is 1.310 × 10⁻⁴ m/s at 80 kHz (the mass transfer coefficient is 2.384 × 10⁻⁵ m/s without ultrasound). Extrinsic factors such as temperature, transducer diameter, and the distance between the reactor and the ultrasound source also influence the mass transfer coefficient on the silicon surface. At the same ultrasonic power and frequency, the mass transfer coefficient increases with increasing temperature, with decreasing distance between the silicon and the central position, with decreasing transducer diameter, and with decreasing distance between the reactor and the ultrasound source. The simulation results indicate that the computational model can quantitatively solve for the ultrasonic enhancement of the mass transfer coefficient.

18.
Laser detection technology in uncertain and dynamic environments is of utmost importance in many fields. A model of transient radiative transfer for a bidirectional-path laser, based on the Monte Carlo method, is developed to investigate the optimum wavelength of an active detector under complex atmospheric conditions. The radiative parameters of the atmosphere are calculated using the HITRAN database and Mie theory for several typical atmospheric conditions, including the standard atmosphere, urban aerosol, and radiation fog. Transmission characteristics for five spectral bands under these atmospheric conditions are calculated with this model. The optimal transmission occurred in the 0.2–0.5, 1.4–1.6, and 0.75–1.25 μm bands for the standard atmosphere, urban aerosol, and radiation fog, respectively. The results provide an effective reference and basic support for choosing the optimal spectral band for active detection.

19.
Radiative shock waves propagating in xenon at low pressure have been produced using 60 joules of iodine laser energy (λ = 1.315 μm) at the PALS center. The shocks have been probed by XUV imaging using a Zn X-ray laser (λ = 21 nm) generated with a 20-ns delay after the shock-creating pulse. Auxiliary high-speed silicon diodes enabled space- and time-resolved measurement of plasma self-emission in the visible and XUV. The results show the generation of a shock wave propagating at 60 km/s preceded by a radiative precursor. This demonstrates the feasibility of radiative shock generation using high-power infrared lasers and the use of XRL backlighting as a suitable diagnostic for shock imaging.

20.
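An efficient accelerated implementation of tight-binding approximated time-dependent density functional theory (TDDFT) on multi-core and GPU systems is presented and applied to excited-state electronic structure calculations for systems of hundreds to thousands of atoms. The program uses sparse matrices and OpenMP parallelization to accelerate construction of the Hamiltonian matrix, while the most time-consuming part, the ground-state diagonalization, is accelerated with double-precision GPU computation. GPU acceleration of the ground state achieves a speedup of 8.73 while preserving accuracy. The excited-state calculation combines a Krylov subspace iterative algorithm, OpenMP parallelization, and GPU acceleration to solve the large TDDFT matrix for its eigenvalues and eigenvectors, greatly reducing the number of iterations and the total solution time. With GPU-accelerated matrix-vector multiplication, the Krylov algorithm converges quickly, giving a 206-fold speedup over a program that uses the conventional algorithm with CPU parallelization. Calculations on a series of small and large molecular systems show that, compared with the first-principles CIS method and TDDFT, the program obtains reasonable and accurate results at very low computational cost.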
