Similar literature
 Found 20 similar articles (search time: 9 ms)
1.
Much of the current focus in high-performance computing is on multi-threading, multi-computing, and graphics processing unit (GPU) computing. However, vectorization and non-parallel optimization techniques, which can often be employed additionally, are less frequently discussed. In this paper, we present an analysis of several optimizations done on both central processing unit (CPU) and GPU implementations of a particular computationally intensive Metropolis Monte Carlo algorithm. Explicit vectorization on the CPU and the equivalent, explicit memory coalescing, on the GPU are found to be critical to achieving good performance of this algorithm in both environments. The fully-optimized CPU version achieves a 9× to 12× speedup over the original CPU version, in addition to speedup from multi-threading. This is 2× faster than the fully-optimized GPU version, indicating the importance of optimizing CPU implementations.

2.
Graphics processing units (GPUs) are increasingly being used for general computational purposes. This development is motivated by their theoretical peak performance, which significantly exceeds that of broadly available CPUs. For practical purposes, however, it is far from clear how much of this theoretical performance can be realized in actual scientific applications. As is discussed here for the case of studying classical spin models of statistical mechanics by Monte Carlo simulations, only an explicit tailoring of the involved algorithms to the specific architecture under consideration makes it possible to harvest the computational power of GPU systems. A number of examples are discussed, ranging from Metropolis simulations of ferromagnetic Ising models, through continuous Heisenberg and disordered spin-glass systems, to parallel-tempering simulations. Significant speed-ups by factors of up to 1000 compared to serial CPU code, as well as to previous GPU implementations, are observed.
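The abstract does not say how the Metropolis update is tailored to the GPU. One common approach, assumed here purely for illustration, is a checkerboard decomposition: sites of one colour share no bonds, so each half-sweep is data-parallel. The sketch below is serial, pure Python, and uses hypothetical names (`checkerboard_sweep`, `magnetisation`); it assumes a square lattice with even side length, coupling J = 1, and periodic boundaries.

```python
import math
import random


def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep of a 2D Ising lattice (J = 1, periodic boundaries).

    Sites are visited one checkerboard colour at a time. Sites of the same
    colour are never neighbours, so each half-sweep could run in parallel.
    Assumes an n x n lattice with n even (illustrative assumption).
    """
    n = len(spins)
    for colour in (0, 1):
        for i in range(n):
            for j in range(n):
                if (i + j) % 2 != colour:
                    continue
                s = spins[i][j]
                neighbours = (
                    spins[(i + 1) % n][j] + spins[(i - 1) % n][j]
                    + spins[i][(j + 1) % n] + spins[i][(j - 1) % n]
                )
                delta_e = 2.0 * s * neighbours  # energy change if s is flipped
                # Metropolis rule: always accept downhill moves, otherwise
                # accept with probability exp(-beta * delta_e).
                if delta_e <= 0.0 or rng.random() < math.exp(-beta * delta_e):
                    spins[i][j] = -s


def magnetisation(spins):
    """Mean spin per site."""
    n = len(spins)
    return sum(sum(row) for row in spins) / (n * n)
```

On a GPU, the two inner loops of each half-sweep would typically be mapped to threads; that mapping is not shown here.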

3.
Yan Liu 《Optik》2011,122(7):647-649
This paper compares the precision and computational efficiency of the shift operator finite-difference time-domain (SO-FDTD) method with those of the piecewise linear current density recursive convolution (PLCDRC) method by simulating the propagation of electromagnetic waves at different frequencies in homogeneous unmagnetized plasma. The results show that the two methods have almost the same precision, but the SO-FDTD method requires less CPU time than the PLCDRC method because it eliminates many of the convolution operations that PLCDRC requires.

4.
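We already know that the sun, moon, and stars in the sky are not stationary. The most direct and intuitive conclusion that can be drawn from their rising in the east and setting in the west is that all celestial bodies are on an Earth-centered ...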

5.
The signal-to-noise ratio (SNR) performance and practicality issues of a four-element phased-array coil and an implantable coil system were compared for rat spinal cord magnetic resonance imaging (MRI) at 7 T. MRI scans of the rat spinal cord at T10 were acquired from eight rats over a 3-week period using both coil systems, with and without laminectomy. The results demonstrate that both the phased array and the implantable coil systems are feasible options for rat spinal cord imaging at 7 T, with both systems providing adequate SNR for 100-μm spatial resolution at reasonable imaging times. The implantable coils provided significantly higher SNR than the phased array (average SNR gain of 5.3× between the laminectomy groups and 2.5× between the nonlaminectomy groups). The implantable coil system should be used if maximal SNR is critical, whereas the phased array is a good choice for its ease of use and lesser invasiveness.

6.
Hole-boring characteristics of laser beams are studied using two different laser wavelengths in preformed plasmas with overdense regions. We have shown that whole-beam self-focusing is created in plasma with a considerable density scale length using a 1 μm wavelength laser. Whole-beam self-focusing of this type could be used for guiding an ultrahigh-intensity laser pulse to a highly compressed core for studying the feasibility of a fast ignitor. There is a clear difference in the hole-boring characteristics between the two laser wavelengths, 1053 and 351 nm, both in the experiment and the simulation. Using the third-harmonic laser, whole-beam self-focusing is never created. The 351-nm laser beam broke up into filaments, resulting in plasma jets observed in our interferogram.

7.
8.
9.
van Wijk MC  Thijssen JM 《Ultrasonics》2002,40(1-8):585-591
Assessment of the performance of medical ultrasound equipment is generally based on the image quality in fundamental mode. Recent development of the so-called tissue harmonic imaging (THI) mode induces the need for assessment of differences in the quality of imaging in THI vs. fundamental imaging mode. Quality features to be tested are sensitivity (penetration depth), spatial resolution, contrast resolution, lesion signal-to-noise ratio, and tissue-to-clutter ratio (TCR). These features are explained and examples are shown. The main conclusion from a comparison of the results for the two imaging modes might be that, when using THI, an improvement of TCR, in particular in the near field, is obtained at the expense of a loss in axial resolution. Furthermore, lesion detection is not significantly improved.

10.
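In the data processing of the single-channel motional Stark effect (MSE) diagnostic system on the EAST device, a heterogeneous CPU (central processing unit) + GPU (graphics processing unit) model was adopted to parallelize and accelerate the digital harmonic analysis (DHA) algorithm. The CPU handles data loading and simple mathematical calculations, while the GPU performs the parallel parts of the DHA algorithm, such as the forward and inverse Fourier transforms and the filtering. Compared with the serial algorithm, a speedup of more than 2000× was obtained, which meets the requirement for timely data processing during MSE diagnostic experiments.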
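The abstract names three DHA steps: forward Fourier transform, filtering, and inverse transform. As a rough illustration only (this is not the EAST implementation), the sketch below isolates one harmonic of a real signal using those steps. It is serial and uses a naive O(N²) DFT as a stand-in for an FFT library; the function names and the `half_width` parameter are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT library)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def isolate_harmonic(signal, bin_index, half_width=1):
    """Band-pass one harmonic: forward transform, zero other bins, invert."""
    n = len(signal)
    spectrum = dft(signal)
    kept = [0j] * n
    for k in range(bin_index - half_width, bin_index + half_width + 1):
        kept[k % n] = spectrum[k % n]        # positive-frequency bin
        kept[(-k) % n] = spectrum[(-k) % n]  # matching negative-frequency bin
    # The kept spectrum is conjugate-symmetric, so the result is real.
    return [v.real for v in dft(kept, inverse=True)]
```

Each output bin of the transform and each filtered bin is independent of the others, which is what makes these steps natural candidates for GPU parallelization.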

11.
A method for the synthesis of 4-arylquinolinolate ligands and their Al(III) complexes, based on the Michael reaction of 2-methoxyaniline with 1-phenylpropenones, was developed. The resulting 4-aryl-8-methoxyquinoline was demethylated and converted to the corresponding Al(III) complexes. Photophysical properties of two 4-aryl-Alq3 derivatives were then compared with those of the parent Alq3 and a 5-phenyl-Alq3 congener. It appears that the 5-aryl derivatives show improved luminescence but also decreased physical stability. Electroluminescence of the prepared materials is presented and compared to that of Alq3 and a 5-phenyl-Alq3.

12.
13.
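Wang Junping, Hao Yue 《Acta Physica Sinica》2009,58(6):4267-4273
At the 90 nm and 65 nm technology nodes, investment in integrated circuit manufacturing has risen sharply while random yield has been falling. To improve random yield, the computation and ranking of weighted critical area (WCA) is key. Based on mathematical morphology, this paper proposes a new WCA model for random defect outlines. The model takes into account not only the different defect densities in the routing regions and the blank regions in 90 nm and 65 nm processes, but also the size distribution of the defects. A WCA extraction and ranking algorithm corresponding to the new model is also designed and implemented. Experimental results on partial layouts show that the new WCA can serve as a cost function for layout optimization, thereby providing an accurate basis for layout optimization against random defects. Keywords: defect spatial distribution; defect size distribution; critical area; layout optimization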

14.
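A microchannel two-phase loop cooling system using R134a as the refrigerant was designed to cool server CPUs with high heat density; the measured overall heat transfer coefficient is 1000 to 1200 W/(m²·℃). The cooling water can be supplied either by a chiller or by an evaporative cooling device. An experimental test platform was built, and the heat dissipation performance of the system was systematically tested and compared under different CPU loads and cooling water supply temperatures. The test results show that, through two-phase phase-change heat transfer of R134a at a saturation temperature of 25 to 30 ℃, the temperature of a CPU with a heat flux on the order of 3 W/cm² and a total heat dissipation on the order of 50 to 150 W can be held stably at 50 to 60 ℃. According to the measured data, the theoretical annual energy efficiency ratio of the system applied to large data centers can exceed 10 under different climatic conditions, far higher than that of conventional machine-room air conditioners. The system offers strong heat exchange capacity, small size, high energy efficiency, a high cold-source temperature, wide applicability, and large energy-saving potential, with considerable economic and social benefits.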

15.
Joseph L. McCauley 《Physica A》2008,387(22):5518-5522
We analyze whether sliding window time averages applied to stationary increment processes converge to a limit in probability. The question centers on averages, correlations, and densities constructed via time averages of the increment x(t,T)=x(t+T)−x(t), e.g. x(t,T)=ln(p(t+T)/p(t)) in finance and economics, where p(t) is a price, and the assumption is that the increment is distributed independently of t. We apply Tchebychev’s Theorem to the construction of statistical ensembles, and then show that the convergence in probability condition is not satisfied when applied to time averages of functions of stationary increments. We further show that Tchebychev’s Theorem provides the basis for constructing approximate ensemble averages and densities from a single, historic time series where, as in FX markets, the series shows a definite ‘statistical periodicity’. The convergence condition is not satisfied strongly enough for densities and certain averages, but is well-satisfied by specific averages of direct interest. Rates of convergence cannot be established independently of specific models, however. Our analysis shows how to decide which empirical averages to avoid, and which ones to construct.
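To make the objects in this abstract concrete, the sketch below computes the log increment x(t,T)=ln(p(t+T)/p(t)) from a price series and then a sliding window time average of it. The function names and the `window` parameter are illustrative assumptions, not taken from the paper, and the sketch says nothing about whether such averages converge.

```python
import math


def log_increments(prices, lag):
    """x(t, T) = ln(p(t + T) / p(t)) for every t where both prices exist."""
    return [math.log(prices[t + lag] / prices[t]) for t in range(len(prices) - lag)]


def sliding_mean(values, window):
    """Time average of `values` over each run of `window` consecutive points."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    means = [total / window]
    for t in range(window, len(values)):
        # Slide the window one step: add the new point, drop the oldest.
        total += values[t] - values[t - window]
        means.append(total / window)
    return means
```

For example, `sliding_mean(log_increments(prices, 1), 50)` would give the 50-point running average of one-step log returns for a hypothetical list `prices`.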

16.
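Yang Guoguang 《Acta Optica Sinica》1993,13(7):77-584
This paper successfully applies a genetic algorithm to the optimization design of high-dimensional diffractive phase optical elements and compares it with the simulated annealing algorithm. The results show that the algorithm is highly effective for the optimization design of not only binary but also multilevel phase optical elements. Moreover, it is particularly well suited to optimization calculations carried out on hybrid optoelectronic processing systems.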

17.
Zairong Xi  Guangsheng Jin 《Physica A》2008,387(4):1056-1062
Brańczyk et al. pointed out, using simulation, that the quantum control scheme is superior to the classical control scheme for a simple quantum system [A.M. Brańczyk, P.E.M.F. Mendonca, A. Gilchrist, A.C. Doherty, S.D. Bartlett, Quantum control theory of a single qubit, Physical Review A 75 (2007) 012329 or arXiv e-print quant-ph/0608037]. Here we rigorously prove the result. Furthermore, we show that no quantum operation universally “corrects” the dephasing noise.

18.
The impact of the phase noise induced by self-phase modulation and intrachannel nonlinear effects on return-to-zero differential phase-shift keying (RZ-DPSK) in long-haul 40 Gb/s transmission systems, where dispersion is compensated by a chirped fiber Bragg grating (CFBG), is analyzed and numerically evaluated, and it is compared with that of the conventional DCF-based phase-modulated system. Our work also provides a clear physical picture of how the transmission performance is affected by the CFBG, which is instructive for further research on CFBG-compensated phase-modulated formats.

19.
We compare two different ways of quantum modification in a simple sequential game called Cat's Dilemma in the context of the debate on intransitive and transitive preferences. This kind of analysis can have essential meaning for research on artificial intelligence (some possibilities are discussed). Nature has both transitive and intransitive properties and perhaps quantum models will be more able to capture this dualism than the classical models. We also present an electoral interpretation of the game.

20.
In the neonatal brain, it is important to use a fast imaging technique to acquire all diffusion weighted images (DWI) for apparent diffusion coefficient (ADC) calculation. Taking into account the occurrence of typical echo planar imaging (EPI) artifacts, we have investigated whether single-shot (SSh) or multishot (MSh) DWI-EPI should be preferred. In 14 neonates, 17 adult patients and 5 adult volunteers, DWIs are obtained both with SSh and MSh EPI. The occurrence of artifacts and their influence on the ADC are explored and further quantified using simulations and phantom studies. Two radiologists scored overall image quality and diagnosability of all images. Single-shot and MSh DWI-EPI scored equally well in neonates with respect to overall image quality and diagnosability. In newborns, more motion artifacts in MSh can be noticed while N/2-ghost artifacts in SSh occur less frequently than in adults. Both N/2-ghost and motion artifacts result in significant ADC abnormalities. There is a serious risk that these artifacts will be mistaken for genuine diffusion abnormalities. N/2-ghost artifacts are hardly noticed in the neonatal brain, which might be due to smaller cerebrospinal fluid (CSF) velocity than in adults. Apparent diffusion coefficient values in MSh are unreliable if motion occurs. We conclude that for ADC calculations in neonates SSh DWI-EPI is more reliable than MSh.
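The abstract does not state how the ADC is computed. A minimal sketch, assuming the standard two-point mono-exponential model S_b = S_0·exp(−b·ADC), is given below; the function name and the example signal values are hypothetical. It also illustrates why spurious signal added to the diffusion-weighted image (as an artifact might do) shifts the computed ADC.

```python
import math


def adc(s0, sb, b_value):
    """ADC from the mono-exponential model S_b = S_0 * exp(-b * ADC).

    s0: signal without diffusion weighting; sb: signal at b-value `b_value`.
    With b in s/mm^2 the result is in mm^2/s.
    """
    if s0 <= 0 or sb <= 0 or b_value <= 0:
        raise ValueError("signals and b-value must be positive")
    return math.log(s0 / sb) / b_value


# Illustrative numbers only: a voxel whose true ADC is 1.0e-3 mm^2/s.
clean = adc(1000.0, 1000.0 * math.exp(-1.0), 1000.0)
# 10% extra signal in the diffusion-weighted image lowers the apparent ADC.
biased = adc(1000.0, 1.1 * 1000.0 * math.exp(-1.0), 1000.0)
```

Because the ADC depends on the logarithm of a signal ratio, even modest artifact-induced signal changes in either image translate directly into ADC errors.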

