期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Convergence and error estimate of cascade algorithms with infinitely supported masks in L p (? s )

JianBin Yang Song Li 《中国科学数学(英文版)》2012,55(3):577-592

The cascade algorithm plays an important role in computer graphics and wavelet analysis.In this paper,we first investigate the convergence of cascade algorithms associated with a polynomially decaying mask and a general dilation matrix in L p (R s) (1 p ∞) spaces,and then we give an error estimate of the cascade algorithms associated with truncated masks.It is proved that under some appropriate conditions if the cascade algorithm associated with a polynomially decaying mask converges in the L p-norm,then the cascade algorithms associated with the truncated masks also converge in the L p-norm.Moreover,the error between the two resulting limit functions is estimated in terms of the masks. 相似文献

2.

A parallel implementation of overlapping Schwarz method for the dual reciprocity method

Ke Chen Kamal Shanazari 《Applied mathematics and computation》2010,216(11):3355-3364

We present a parallel algorithm for the overlapping domain decomposition boundary integral equation method for two dimensional partial differential equations. In addition to the improvement of the ill-conditioning and the computational efficiency achieved by domain partitioning, using a parallel computer with p processors can offer up to p times efficiency. Assuming direct solution is used throughout, partitioning the domain into p subregions and employing a processor for each subproblem, overall, result in p² times efficiency over using a single domain and a single processor, taking into account that a sequential algorithm of the underlying method can improve the computational efficiency at least p times over using a single domain. Some numerical results showing the efficiency of the parallel technique will be presented. 相似文献

3.

Parallel Speed-Up of Monte Carlo Methods for Global Optimization

《Journal of Complexity》1994,10(1):64-95

We introduce the notion of expected hitting time to a goal as a measure of the convergence rate of a Monte Carlo optimization method. The techniques developed apply to simulated annealing, genetic algorithms, and other stochastic search schemes. The expected hitting time can itself be calculated from the more fundamental complementary hitting time distribution (CHTD) which completely characterizes a Monte Carlo method. The CHTD is asymptotically a geometric series, (1/s)/(1 − λ), characterized by two parameters, s, λ, related to the search process in a simple way. The main utility of the CHTD is in comparing Monte Carlo algorithms. In particular we show that independent, identical Monte Carlo algorithms run in parallel, IIP parallelism, and exhibit superlinear speedup. We give conditions under which this occurs and note that equally likely search is linearly sped up. Further we observe that a serial Monte Carlo search can have an infinite expected hitting time, but the same algorithm when parallelized can have a finite expected hitting time. One consequence of the observed superlinear speedup is an improved uniprocessor algorithm by the technique of in-code parallelism. 相似文献

4.

Parallel Galerkin domain decomposition procedures based on the streamline diffusion method for convection-diffusion problems 总被引：1，自引：0，他引：1

Keying Ma Tongjun Sun 《Journal of Computational and Applied Mathematics》2011,235(15):4464-4479

Based upon the streamline diffusion method, parallel Galerkin domain decomposition procedures for convection-diffusion problems are given. These procedures use implicit method in the sub-domains and simple explicit flux calculations on the inter-boundaries of sub-domains by integral mean method or extrapolation method to predict the inner-boundary conditions. Thus, the parallelism can be achieved by these procedures. The explicit nature of the flux calculations induces a time step limitation that is necessary to preserve stability. Artificial diffusion parameters δ are given. By analysis, optimal order error estimate is derived in a norm which is stronger than L²-norm for these procedures. This error estimate not only includes the optimal H¹-norm error estimate, but also includes the error estimate along the streamline direction ‖β⋅∇(u−U)‖, which cannot be achieved by standard finite element method. Experimental results are presented to confirm theoretical results. 相似文献

5.

Parallel iteration across the steps of high-order Runge-Kutta methods for nonstiff initial value problems

P. J. van der Houwen B. P. Sommeijer W. A. van der Veen 《Journal of Computational and Applied Mathematics》1995,60(3):309-329

For the parallel integration of nonstiff initial value problems (IVPs), three main approaches can be distinguished: approaches based on “parallelism across the problem”, on “parallelism across the method” and on “parallelism across the steps”. The first type of parallelism does not require special integration methods and can be exploited within any available IVP solver. The method-parallelism approach received much attention, particularly within the class of explicit Runge-Kutta methods originating from fixed point iteration of implicit Runge-Kutta methods of Gaussian type. The construction and implementation on a parallel machine of such methods is extremely simple. Since the computational work per processor is modest with respect to the number of data to be exchanged between the various processors, this type of parallelism is most suitable for shared memory systems. The required number of processors is roughly half the order of the generating Runge-Kutta method and the speed-up with respect to a good sequential IVP solver is about a factor 2. The third type of parallelism (step-parallelism) can be achieved in any IVP solver based on predictor-corrector iteration and requires the processors to communicate after each full iteration. If the iterations have sufficient computational volume, then the step-parallel approach may be suitable for implementation on distributed memory systems. Most step-parallel methods proposed so far employ a large number of processors, but lack the property of robustness, due to a poor convergence behaviour in the iteration process. Hence, the effective speed-up is rather poor. The dynamic step-parallel iteration process proposed in the present paper is less massively parallel, but turns out to be sufficiently robust to achieve speed-up factors up to 15. 相似文献

6.

Optimal parallel quicksort on EREW PRAM

Weixiong Zhang Nageswara S. V. Rao 《BIT Numerical Mathematics》1991,31(1):69-74

In this paper, we present parallel quicksort algorithms running inO((n/p+logp) logn) expected time andO((n/p+logp+log logn) logn) deterministic time respectively, and both withO(n) space by usingp processors on EREW PRAM. Whenp=O(n/logn), the cost is optimal, in terms of the product of time and number of processors. These algorithms can be used to obtain parallel algorithms for constructing balanced binary search trees without using sorting algorithms. One of our quicksort algorithms leads to a parallel quickhull algorithm on EREW PRAM.The work of this author was partially supported by a fellowship from the College of Science, Old Dominion University, Norfolk, VA 23529, USA. 相似文献

7.

A study of average-case speedup and scalability of parallel computations on static networks

《Mathematical and Computer Modelling》1999,29(9):83-94

A parallel system consists of a parallel algorithm and a parallel machine that supports the implementation of the algorithm. The scalability of a parallel system is a measure of its capability to increase speedup in proportion to the number of processors, or its capability to keep a constant efficiency as the number of processors increases. The present paper is devoted to the investigation of the average-case scalability of parallel algorithms executing on multicomputers with symmetric static networks, including the completely connected network, ring, hypercube, and torus. In particular, we characterize the communication overhead such that the expected efficiency can be kept at certain constant level, and that the number of tasks grows at the rate Θ(P log P). 相似文献

8.

A non-uniform basis order for the discontinuous Galerkin method of the acoustic and elastic wave equations 总被引：1，自引：0，他引：1

T. Lähivaara T. Huttunen 《Applied Numerical Mathematics》2011,61(4):473-486

Here, we solve the time-dependent acoustic and elastic wave equations using the discontinuous Galerkin method for spatial discretization and the low-storage Runge-Kutta and Crank-Nicolson methods for time integration. The aim of the present paper is to study how to choose the order of polynomial basis functions for each element in the computational mesh to obtain a predetermined relative error. In this work, the formula 2p+1≈κhk, which connects the polynomial basis order p, mesh parameter h, wave number k, and free parameter κ, is studied. The aim is to obtain a simple selection method for the order of the basis functions so that a relatively constant error level of the solution can be achieved. The method is examined using numerical experiments. The results of the experiments indicate that this method is a promising approach for approximating the degree of the basis functions for an arbitrarily sized element. However, in certain model problems we show the failure of the proposed selection scheme. In such a case, the method provides an initial basis for a more general p-adaptive discontinuous Galerkin method. 相似文献

9.

Non-uniform data distribution for communication-efficient parallel clustering

《Journal of computational science》2013,4(6):489-495

Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploit the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in parallel data mining algorithms and, in particular, in the k-means algorithm for cluster analysis. In the straightforward parallel formulation of the k-means algorithm, data and computation loads are uniformly distributed over the processing nodes. This approach has excellent load balancing characteristics that may suggest it could scale up to large and extreme-scale parallel computing systems. However, at each iteration step the algorithm requires a global reduction operation which hinders the scalability of the approach. This work studies a different parallel formulation of the algorithm where the requirement of global communication is removed, while maintaining the same deterministic nature of the centralised algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real-world distributed applications or can be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements. 相似文献

10.

Parallel search paths for the simplex algorithm

Péter Tar Bálint Stágel István Maros 《Central European Journal of Operations Research》2017,25(4):967-984

It is well known that the simplex method is inherently a sequential algorithm with little scope for parallelization. Even so, during the last decades several attempts were made to parallelize it since it is one of the most important algorithms for solving linear optimization problems. Such parallelization ideas mostly rely on iteration parallelism and overlapping. Since the simplex method goes through a series of basic solutions until it finds an optimal solution, each of them must be available before performing the next basis change. This phenomenon imposes a limit on the performance of the parallelized version of the simplex method which uses overlapping iterations. Another approach can be considered if we think about alternative paths on the n-dimensional simplex polyhedron. As the simplex method goes through the edges of this polyhedron it is generally true that the speed of convergence of the algorithm is not smooth. It depends on the actual part of the surface. If a parallel version of the simplex algorithm simultaneously goes on different paths on this surface a highly reliable algorithm can be constructed. There is no known dominating strategy for pivot selection. Therefore, one can try different pivot selection methods in parallel in order to guide the algorithm on different pathways. This approach can be used effectively with periodic synchronization on shared memory multi-core computing environments to speed up the solution algorithm and get around numerically and/or algorithmically difficult situations throughout the computations. 相似文献

11.

Binary tree algebraic computation and parallel algorithms for simple graphs

《Journal of Algorithms in Cognition, Informatics and Logic》1988,9(1):92-113

In this paper we define the binary tree algebraic computation (BTAC) problem and develop an efficient parallel algorithm for solving this problem. A variety of graph problems (minimum covering set, minimum r-dominating set, maximum matching set, etc.) for trees and two terminal series parallel (TTSP) graphs can be converted to instances of the BTAC problem. Thus efficient parallel algorithms for these problems are obtained systematically by using the BTAC algorithm. The parallel computation model is an exclusive read exclusive write PRAM. The algorithms for tree problems run in O(log n) time with O(n) processors. The algorithms for TTSP graph problems run in O(log m) time with O(m) processors where n (m) is the number of vertices (edges) in the input graph. These algorithms are within an O(log n) factor of optimal. 相似文献

12.

An O(n²log n) parallel max-flow algorithm

Yossi Shiloach Uzi Vishkin 《Journal of Algorithms in Cognition, Informatics and Logic》1982,3(2):128-146

A synchronized parallel algorithm for finding maximum flow in a directed flow network is presented. Its depth is

O(n^{3} (log n) p)

, where p (p ≤ n) is the number of processors used. This problem seems to be more involved than most of the problems for which efficient parallel algorithms exist. The parallel algorithm induces a new rather simple sequential O(n³) algorithm. This algorithm is very much parallel oriented. It is quite difficult to conceive and analyze it, if one is restricted to the sequential point of view. 相似文献

13.

Inspection and duplication

Göran Palmers Lennart R⇘de 《Milan Journal of Mathematics》1991,61(1):213-217

In a cyclic industrial process a certain component fails during a cycle with probabilityp. In order to increase the reliability of the process, the component is duplicated in parallel with one or two redundant components of the same kind. The components fail independently of each other. To further increase the reliability, the system can be inspected aftern cycles, given that it has not failed during the previousn?1 cycles. Upon inspection failed components are repaired. In order to choose the inspection intervaln, the conditional probability that the system fails during cycle non, given that it has not failed during the previousn?1 cycles, is calculated. 相似文献

14.

ОптИМАльНыЕ кВАДРАт УРНыЕ ФОРМУлы Дль клА ссОВ ФУНкцИИ с ИНтЕгРИРУЕ МОИ ВL p r-ОИ пРОИжВОДНОИ

Н. Е. лУшпАИ 《Analysis Mathematica》1979,5(1):67-88

On the classW ^r L _p (1≦p≦∞;r=1, 2,…) of 1-periodic functions ?(x) having an absolutely continuous (r? l)st derivative such that $$\parallel f^{(r)} \parallel _{L_p } \leqq 1 (\parallel f^{(r)} \parallel _{L_\infty } = vrai \sup |f^{(r)} (x)|)$$ vrai sup ¦?(^r)(x)¦) an optimal quadrature formula of the form (0 ≦? ≦r?1, 0 ≦x ₀ < x₁ <…< x_m ≦ 1) is found in the cases ?=r?2 and ?=r? 3 (r=3, 5, …). An exact error bound is established for this formula. The statements proved forW ^r L _p allowed us also to obtain, under certain restrictions posed on the coefficientsp _kl, and the nodesx ₀ andx _m, optimal quadrature formulae for the classes $$W_0^r L_p = \{ f:f \in W^r L_p , f^{(i)} (0) = 0 (i = 0,1,...,r - 2)\} $$ and $$W_0^r L_p = \{ f:f \in \tilde W^r L_p , f^{(i)} (0) = f^{(i)} (1) = 0 (i = 0,1,...,r - 2)\} $$ for the same values ofp andr as above. 相似文献

15.

Fast algorithms for determining the linear complexities of sequences over GF(p m) with the period 3n

Hao Chen 《中国科学数学(英文版)》2006,49(5):715-720

In this paper, for the the primes p such that 3 is a divisor of p ? 1, we prove a result which reduces the computation of the linear complexity of a sequence over GF(p ^m) (any positive integer m) with the period 3n (n and p ^m ? 1 are coprime) to the computation of the linear complexities of three sequences with the period n. Combined with some known algorithms such as generalized Games-Chan algorithm, Berlekamp-Massey algorithm and Xiao-Wei-Lam-Imamura algorithm, we can determine the linear complexity of any sequence over GF(p ^m) with the period 3n (n and p ^m ? 1 are coprime) more efficiently. 相似文献

16.

Recent advances in gradient algorithms for optimal control problems 总被引：1，自引：0，他引：1

A. Miele 《Journal of Optimization Theory and Applications》1975,17(5-6):361-430

This paper summarizes recent advances in the area of gradient algorithms for optimal control problems, with particular emphasis on the work performed by the staff of the Aero-Astronautics Group of Rice University. The following basic problem is considered: minimize a functionalI which depends on the statex(t), the controlu(t), and the parameter π. Here,I is a scalar,x ann-vector,u anm-vector, and π ap-vector. At the initial point, the state is prescribed. At the final point, the statex and the parameter π are required to satisfyq scalar relations. Along the interval of integration, the state, the control, and the parameter are required to satisfyn scalar differential equations. First, the sequential gradient-restoration algorithm and the combined gradient-restoration algorithm are presented. The descent properties of these algorithms are studied, and schemes to determine the optimum stepsize are discussed. Both of the above algorithms require the solution of a linear, two-point boundary-value problem at each iteration. Hence, a discussion of integration techniques is given. Next, a family of gradient-restoration algorithms is introduced. Not only does this family include the previous two algorithms as particular cases, but it allows one to generate several additional algorithms, namely, those with alternate restoration and optional restoration. Then, two modifications of the sequential gradient-restoration algorithm are presented in an effort to accelerate terminal convergence. In the first modification, the quadratic constraint imposed on the variations of the control is modified by the inclusion of a positive-definite weighting matrix (the matrix of the second derivatives of the Hamiltonian with respect to the control). The second modification is a conjugate-gradient extension of the sequential gradient-restoration algorithm. Next, the addition of a nondifferential constraint, to be satisfied everywhere along the interval of integration, is considered. In theory, this seems to be only a minor modification of the basic problem. In practice, the change is considerable in that it enlarges dramatically the number and variety of problems of optimal control which can be treated by gradient-restoration algorithms. Indeed, by suitable transformations, almost every known problem of optimal control theory can be brought into this scheme. This statement applies, for instance, to the following situations: (i) problems with control equality constraints, (ii) problems with state equality constraints, (iii) problems with equality constraints on the time rate of change of the state, (iv) problems with control inequality constraints, (v) problems with state inequality constraints, and (vi) problems with inequality constraints on the time rate of change of the state. Finally, the simultaneous presence of nondifferential constraints and multiple subarcs is considered. The possibility that the analytical form of the functions under consideration might change from one subarc to another is taken into account. The resulting formulation is particularly relevant to those problems of optimal control involving bounds on the control or the state or the time derivative of the state. For these problems, one might be unwilling to accept the simplistic view of a continuous extremal arc. Indeed, one might want to take the more realistic view of an extremal arc composed of several subarcs, some internal to the boundary being considered and some lying on the boundary. The paper ends with a section dealing with transformation techniques. This section illustrates several analytical devices by means of which a great number of problems of optimal control can be reduced to one of the formulations presented here. In particular, the following topics are treated: (i) time normalization, (ii) free initial state, (iii) bounded control, and (iv) bounded state. 相似文献

17.

A parareal waveform relaxation algorithm for semi-linear parabolic partial differential equations

Jun Liu Yao-Lin Jiang 《Journal of Computational and Applied Mathematics》2012

We report a new parallel iterative algorithm for semi-linear parabolic partial differential equations (PDEs) by combining a kind of waveform relaxation (WR) techniques into the classical parareal algorithm. The parallelism can be simultaneously exploited by WR and parareal in different directions. We provide sharp error estimations for the new algorithm on bounded time domain and on unbounded time domain, respectively. The iterations of the parareal and the WR are balanced to optimize the performance of the algorithm. Furthermore, the speedup and the parallel efficiency of the new approach are analyzed. Numerical experiments are carried out to verify the effectiveness of the theoretic work. 相似文献

18.

Global error estimation and control in linearly-implicit parallel two-step peer W-methods

G.Yu. Kulikov R. Weiner 《Journal of Computational and Applied Mathematics》2011,236(6):1226-1239

The class of linearly-implicit parallel two-step peer W-methods has been designed recently for efficient numerical solutions of stiff ordinary differential equations. Those schemes allow for parallelism across the method, that is an important feature for implementation on modern computational devices. Most importantly, all stage values of those methods possess the same properties in terms of stability and accuracy of numerical integration. This property results in the fact that no order reduction occurs when they are applied to very stiff problems. In this paper, we develop parallel local and global error estimation schemes that allow the numerical solution to be computed for a user-supplied accuracy requirement in automatic mode. An algorithm of such global error control and other technical particulars are also discussed here. Numerical examples confirm efficiency of the presented error estimation and stepsize control algorithm on a number of test problems with known exact solutions, including nonstiff, stiff, very stiff and large-scale differential equations. A comparison with the well-known stiff solver RODAS is also shown. 相似文献

19.

<Emphasis Type="Italic">C</Emphasis><Superscript>0</Superscript>IPG adaptive algorithms for the biharmonic eigenvalue problem

Hao Li Yidu Yang 《Numerical Algorithms》2018,78(2):553-567

This paper focuses on C ⁰IPG adaptive algorithms for the biharmonic eigenvalue problem with the clamped boundary condition. We prove the reliability and efficiency of the a posteriori error indicator of the approximating eigenfunctions and analyze the reliability of the a posteriori error indicator of the approximating eigenvalues. We present two adaptive algorithms, and numerical experiments indicate that both algorithms are efficient. 相似文献

20.

Global error estimation and control in linearly-implicit parallel two-step peer W-methods

《Journal of Computational and Applied Mathematics》2012,236(6):1226-1239

The class of linearly-implicit parallel two-step peer W-methods has been designed recently for efficient numerical solutions of stiff ordinary differential equations. Those schemes allow for parallelism across the method, that is an important feature for implementation on modern computational devices. Most importantly, all stage values of those methods possess the same properties in terms of stability and accuracy of numerical integration. This property results in the fact that no order reduction occurs when they are applied to very stiff problems. In this paper, we develop parallel local and global error estimation schemes that allow the numerical solution to be computed for a user-supplied accuracy requirement in automatic mode. An algorithm of such global error control and other technical particulars are also discussed here. Numerical examples confirm efficiency of the presented error estimation and stepsize control algorithm on a number of test problems with known exact solutions, including nonstiff, stiff, very stiff and large-scale differential equations. A comparison with the well-known stiff solver RODAS is also shown. 相似文献