831.
The canonical polyadic (CP) decomposition is one of the most important tensor decompositions. While the well-known alternating least squares (ALS) algorithm is often considered the workhorse for computing the CP decomposition, it is known to suffer from slow convergence in many cases, and various algorithms have been proposed to accelerate it. In this article, we propose a new accelerated ALS algorithm that accelerates ALS in a blockwise manner using a simple momentum-based extrapolation technique and a random perturbation technique. Specifically, our algorithm updates one factor matrix (i.e., block) at a time, as in ALS, with each update consisting of a minimization step that directly reduces the reconstruction error, an extrapolation step that moves the factor matrix along the previous update direction, and a random perturbation step that breaks convergence bottlenecks. Our extrapolation strategy takes a simpler form than state-of-the-art extrapolation strategies and is easier to implement. Our algorithm has negligible computational overhead relative to ALS and is simple to apply. Empirically, the proposed algorithm performs strongly compared with state-of-the-art acceleration techniques on both simulated and real tensors.
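As a rough illustration of the blockwise scheme described above (not the authors' implementation), the following NumPy sketch alternates, for each factor matrix of a 3-way tensor, an exact least-squares update, a momentum extrapolation along the previous update direction, and a small random perturbation when the reconstruction error stalls. The momentum coefficient `beta`, the perturbation scale `sigma`, and the stall criterion are illustrative assumptions, not the paper's actual scheme.

```python
import numpy as np

def unfold(X, mode):
    """Mode-n unfolding: move the given axis to the front and flatten the rest."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def khatri_rao(U, V):
    """Columnwise Kronecker product of U (I x R) and V (J x R), giving (I*J) x R."""
    I, R = U.shape
    J, _ = V.shape
    return (U[:, None, :] * V[None, :, :]).reshape(I * J, R)

def cp_als_momentum(X, rank, n_iter=200, beta=0.5, sigma=1e-3, seed=0):
    """Blockwise ALS for a 3-way CP decomposition with momentum extrapolation
    and random perturbation (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((s, rank)) for s in X.shape]
    prev = [f.copy() for f in factors]
    err_prev = np.inf
    for _ in range(n_iter):
        for mode in range(3):
            U, V = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(U, V)                 # ordering matches unfold() above
            gram = (U.T @ U) * (V.T @ V)
            # 1) minimization step: exact least-squares update of this block
            new = unfold(X, mode) @ kr @ np.linalg.pinv(gram)
            # 2) extrapolation step: push along the previous update direction
            factors[mode] = new + beta * (new - prev[mode])
            prev[mode] = new
        A, B, C = factors
        err = np.linalg.norm(X - np.einsum('ir,jr,kr->ijk', A, B, C))
        # 3) random perturbation step when progress stalls (toy criterion)
        if err_prev - err < 1e-10 * max(err_prev, 1.0):
            factors = [f + sigma * rng.standard_normal(f.shape) for f in factors]
        err_prev = err
    return factors
```

Setting `beta=0` and `sigma=0` recovers plain ALS, so the sketch also shows how small the added machinery is relative to the base iteration.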
832.
In this paper, we apply the Anderson acceleration technique to the existing relaxation fixed-point iteration for solving the multilinear PageRank problem. To reduce the computational cost, we further consider a periodic version of the Anderson acceleration. The convergence of the proposed algorithms is discussed. Numerical experiments on synthetic and real-world datasets demonstrate the advantages of the proposed algorithms over the relaxation fixed-point iteration and the extrapolated shifted fixed-point method. In particular, we give a strategy for choosing quasi-optimal parameters of the associated algorithms when they are applied to test problems of different sizes but the same structure.
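For concreteness, the sketch below (a minimal textbook version, not the paper's implementation) applies depth-m Anderson acceleration to a damped fixed-point map for the multilinear PageRank problem x = α R (x ⊗ x) + (1 − α) v, with R stored as an n × n² flattening of the transition tensor. The particular damping form, the depth `m`, and the omission of a simplex projection are assumptions made here for illustration; the periodic variant mentioned in the abstract would apply the Anderson mixing only every few iterations.

```python
import numpy as np

def relaxed_map(R, v, alpha, gamma, x):
    """Damped (relaxed) fixed-point map for multilinear PageRank (assumed form):
    g(x) = (1 - gamma) * [alpha * R (x kron x) + (1 - alpha) * v] + gamma * x."""
    return (1.0 - gamma) * (alpha * R @ np.kron(x, x) + (1.0 - alpha) * v) + gamma * x

def anderson_fixed_point(g, x0, m=5, tol=1e-10, max_iter=1000):
    """Plain depth-m Anderson acceleration of a fixed-point map g.
    Iterates may leave the probability simplex; a projection step is omitted."""
    x_prev = x0.copy()
    g_prev = g(x_prev)
    f_prev = g_prev - x_prev
    x = g_prev.copy()                  # first step is a plain fixed-point step
    dX, dF = [], []                    # histories of iterate/residual differences
    for _ in range(max_iter):
        gx = g(x)
        f = gx - x
        if np.linalg.norm(f, 1) < tol:
            break
        dX.append(x - x_prev)
        dF.append(f - f_prev)
        if len(dX) > m:
            dX.pop(0); dF.pop(0)
        Fmat = np.column_stack(dF)
        # mixing coefficients from a small least-squares problem
        coef, *_ = np.linalg.lstsq(Fmat, f, rcond=None)
        x_prev, f_prev = x, f
        # Anderson update: x_{k+1} = g(x_k) - (dX + dF) @ coef
        x = gx - (np.column_stack(dX) + Fmat) @ coef
    return x

# toy usage: random column-stochastic flattening R and uniform teleportation v
n = 4
rng = np.random.default_rng(1)
R = rng.random((n, n * n)); R /= R.sum(axis=0, keepdims=True)
v = np.full(n, 1.0 / n)
x = anderson_fixed_point(lambda y: relaxed_map(R, v, 0.85, 0.3, y), x0=v.copy())
```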
833.
An efficient computing framework, PFlows, for fully resolved direct numerical simulations of particle-laden flows was accelerated on NVIDIA Graphics Processing Units (GPUs) and GPU-like accelerator (DCU) cards. The framework couples the lattice Boltzmann method for fluid flow, the immersed boundary method for fluid-particle interaction, and the discrete element method for particle collisions, using two fixed Eulerian meshes and one moving Lagrangian point mesh, respectively. All parts are accelerated by a fine-grained parallelism technique using CUDA on GPUs and HIP on DCU cards; that is, the calculation for each fluid grid node, each immersed boundary point, each particle motion, and each particle-pair collision is handled by one compute thread. Coalesced memory accesses to the LBM distribution functions, stored in a Structure-of-Arrays layout, are used to maximize utilization of hardware bandwidth. A parallel reduction in shared memory is adopted for immersed-boundary-point data to reduce global memory accesses when integrating the particle hydrodynamic forces. MPI is further used for computing on heterogeneous architectures with multiple CPUs and GPUs/DCUs, and communication between adjacent processors is hidden by overlapping it with computation. Two benchmark cases, a pure fluid flow and a particle-laden flow, were conducted for code validation. On a single accelerator, a V100 GPU achieves a 7.1–11.1x speedup and a single DCU a 5.6–8.8x speedup over a single 32-core Xeon CPU. On multiple accelerators, the parallel efficiency is 0.5–0.8 for weak scaling and 0.68–0.9 for strong scaling on up to 64 DCU cards, even for a dense flow (φ = 20%). The peak performance reaches 179 giga lattice updates per second (GLUPS) on 256 DCU cards with 1 billion grid nodes and 1 million particles. Finally, a large-scale simulation of a gas-solid flow with 1.6 billion grid nodes and 1.6 million particles was conducted using only 32 DCU cards. This simulation shows that the present framework is promising for simulating large-scale particle-laden flows in the upcoming exascale computing era.
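As a toy illustration of the fine-grained, one-thread-per-node pattern and the Structure-of-Arrays layout described above (this is not the PFlows code, and it uses Numba's CUDA interface as a stand-in for CUDA C/HIP), the kernel below relaxes the nine D2Q9 distribution functions with one thread per lattice node, so consecutive threads read consecutive entries of each distribution array and global-memory accesses coalesce. The relaxation factor, array shapes, and random data are illustrative assumptions, and running it requires a CUDA-capable GPU.

```python
import numpy as np
from numba import cuda

@cuda.jit
def bgk_relax(f, feq, omega, n_nodes):
    """One thread per fluid node (fine-grained parallelism).

    f and feq use a Structure-of-Arrays layout with shape (9, n_nodes): for a
    fixed direction q, nodes are contiguous in memory, so the threads of a warp
    (consecutive node indices i) touch consecutive addresses and accesses coalesce.
    """
    i = cuda.grid(1)                       # global thread index = node index
    if i < n_nodes:
        for q in range(9):                 # D2Q9 distribution directions
            f[q, i] += omega * (feq[q, i] - f[q, i])

# toy launch: one million nodes, 256 threads per block
n = 1_000_000
f_d   = cuda.to_device(np.random.rand(9, n).astype(np.float32))
feq_d = cuda.to_device(np.random.rand(9, n).astype(np.float32))
threads = 256
blocks = (n + threads - 1) // threads
bgk_relax[blocks, threads](f_d, feq_d, np.float32(0.6), n)
result = f_d.copy_to_host()
```

The same layout argument explains the abstract's choice: packing all nine distribution values of a node together (Array of Structures) would make warp accesses strided, whereas the Structure-of-Arrays layout keeps them contiguous per direction.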