This paper describes the first algorithm to compute the greatest common divisor (GCD) of two n-bit integers using a modular representation for intermediate values U, V and also for the result. It is based on a reduction step, similar to one used in the accelerated algorithm [T. Jebelean, A generalization of the binary GCD algorithm, in: ISSAC '93: International Symposium on Symbolic and Algebraic Computation, Kiev, Ukraine, 1993, pp. 111–116; K. Weber, The accelerated integer GCD algorithm, ACM Trans. Math. Softw. 21 (1995) 111–122] when U and V are close to the same size, that replaces U by (U−bV)/p, where p is one of the prime moduli and b is the unique integer in the interval (−p/2, p/2) such that b ≡ UV⁻¹ (mod p). When the algorithm is executed on a bit common CRCW PRAM with O(n log n log log log n) processors, it takes O(n) time in the worst case. A heuristic model of the average case yields O(n/log n) time on the same number of processors.
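The reduction step described in this abstract can be illustrated with a minimal sketch. It uses ordinary Python integers rather than the modular representation the paper works in, and the function name `reduction_step` is hypothetical. It assumes p is an odd prime that does not divide V, so that V is invertible modulo p and the division by p is exact.

```python
from math import gcd


def reduction_step(u, v, p):
    """One reduction step: return ((u - b*v) // p, b).

    b is the representative of u * v^{-1} mod p in (-p/2, p/2).
    Assumes p is an odd prime that does not divide v (illustrative
    assumption; plain integers are used here, not residue vectors).
    """
    if v % p == 0:
        raise ValueError("p must not divide v")
    # Modular inverse via three-argument pow (Python 3.8+).
    b = (u * pow(v, -1, p)) % p
    # Shift from [0, p-1] to the centered interval (-p/2, p/2).
    if b > p // 2:
        b -= p
    w = u - b * v
    # u - b*v is divisible by p by construction of b.
    return w // p, b


if __name__ == "__main__":
    u, v, p = 21210, 2730, 11
    w, b = reduction_step(u, v, p)
    # Since p does not divide v, the step leaves the GCD unchanged.
    print(w, b, gcd(abs(w), v) == gcd(u, v))
```

Because p is coprime to V, dividing U−bV by p does not change the common divisors with V, which is why the step preserves the GCD.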
Based on a modification of Moss' and Parikh's topological modal language [8], we study a generalization of a weakly expressive fragment of a certain propositional modal logic of time. We define a bimodal logic comprising operators for knowledge and nexttime. These operators are interpreted in binary computation structures. We present an axiomatization of the set T of theorems valid for this class of semantical domains and prove – as the main result of this paper – its completeness. Moreover, the question of decidability of T is treated.
Integration of the subsurface flow equation by finite elements (FE) in space and finite differences (FD) in time requires the repeated solution of sparse symmetric positive definite systems of linear equations. Iterative techniques based on preconditioned conjugate gradients (PCG) are among the most attractive tools for solving the problem on sequential computers. A present challenge is to make PCG attractive in a parallel computing environment as well. To this aim, a key factor is the development of an efficient parallel preconditioner. FSAI (factorized sparse approximate inverse) and enlarged FSAI, which rely on the approximate inverse of the coefficient matrix, appear to be most promising parallel preconditioners. In the present paper, PCG using FSAI, diagonal and pARMS (parallel algebraic recursive multilevel solvers) preconditioners is implemented on the IBM SP4/512 and CLX/768 supercomputers with up to 32 processors to solve underground flow problems of a large size. The results show that FSAI may allow for a parallel relative efficiency larger than 50% on the largest problems with p=32 processors. Moreover, FSAI turns out to be significantly less expensive and more robust than pARMS. Finally, it is shown that the efficiency for p in the upper range may be much improved if PCG–FSAI is implemented on CLX.
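As a point of reference for the PCG iteration discussed in this abstract, the following is a minimal sequential sketch using the diagonal (Jacobi) preconditioner, the simplest of the three preconditioners named. It is not the paper's parallel implementation and does not implement FSAI or pARMS. The names `pcg_diagonal` and `laplacian_1d` are hypothetical, and the 1D Laplacian is only a stand-in for a sparse symmetric positive definite matrix.

```python
def pcg_diagonal(matvec, diag, b, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradients with a diagonal preconditioner.

    matvec: function returning A @ v for a symmetric positive definite A
    diag:   the diagonal entries of A (the preconditioner is diag^{-1})
    Returns (x, iterations used).
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    n = len(b)
    x = [0.0] * n
    r = [float(bi) for bi in b]              # residual b - A x with x = 0
    z = [ri / di for ri, di in zip(r, diag)]  # preconditioned residual
    p = z[:]
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for k in range(max_iter):
        # Stop on a small relative residual.
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        ap = matvec(p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


def laplacian_1d(v):
    """Apply the tridiagonal matrix with stencil (-1, 2, -1) to v."""
    n = len(v)
    return [
        2.0 * v[i]
        - (v[i - 1] if i > 0 else 0.0)
        - (v[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


if __name__ == "__main__":
    n = 20
    x, iters = pcg_diagonal(laplacian_1d, [2.0] * n, [1.0] * n)
    print(iters, x[:3])
```

The preconditioner enters only through the line computing `z`; replacing that division by the application of an approximate inverse is where a scheme such as FSAI would plug in.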
A model for parallel and distributed programs, the dynamic process graph (DPG), is investigated under graph-theoretic and complexity aspects. Such graphs embed constructors for parallel programs, synchronization mechanisms as well as conditional branches. They are capable of representing all possible executions of a parallel or distributed program in a very compact way. The size of this representation can be as small as logarithmic with respect to the size of any execution of the program.
In a preceding paper [A. Jakoby, et al., Scheduling dynamic graphs, in: Proc. 16th Symposium on Theoretical Aspects in Computer Science STACS'99, LNCS, vol. 1563, Springer, 1999, pp. 383–392] we have analysed the expressive power of the general model and various variants of it. We have considered the scheduling problem for DPGs given enough parallelism taking into account communication delays between processors when exchanging data. Given a DPG the question arises whether it can be executed (that means whether the corresponding parallel program has been specified correctly), and what is its minimum schedule length.
In this paper we study a subclass of dynamic process graphs called -output DPGs, which are appropriate in many situations, and investigate their expressive power. In a previous paper we have shown that the problem to determine the minimum schedule length is still intractable for this subclass, namely this problem is -complete as is the general case. Here we will investigate structural properties of the executions of such graphs. A natural graph-theoretic conjecture that executions must always split into components that are isomorphic to subgraphs turns out to be wrong. We are able to prove a weaker property. This implies a quadratic upper bound on the schedule length that may be necessary in the worst case, in contrast to the general case, where the optimal schedule length may be exponential with respect to the size of the representing DPG. Making this bound constructive, we obtain an approximation to a -complete problem. Computing such a schedule and then executing the program can be done on a parallel machine in polynomial time in a highly distributive fashion.
In the present paper, Daubechies' wavelets and the computation of their scaling coefficients are briefly reviewed. Then a new method of computation is proposed. This method is based on the work [7] concerning a new orthonormality condition and relations among scaling moments, respectively. For filter lengths up to 16, the arising system can be explicitly solved with algebraic methods like Gröbner bases. Its simple structure allows one to find quickly all possible solutions.
The asymptotic correction technique of Paine, de Hoog and Anderssen can dramatically improve the accuracy of finite difference or finite element eigenvalues at negligible extra cost if closed form expressions are available for the errors in a simpler related problem. This paper gives closed form expressions for the errors in the eigenvalues of certain Sturm–Liouville problems obtained by various methods, thereby increasing the range of problems for which asymptotic correction can achieve maximum efficiency. It also investigates implementation of the method for more general problems.
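The idea of asymptotic correction can be sketched for the classical setting, which is an assumption here rather than something stated in the abstract: −y″ + q y = λy on [0, π] with Dirichlet conditions, discretized by centered finite differences on n interior points. For q = 0 the discretization error of the k-th eigenvalue is known in closed form, and that error is added to the computed eigenvalue of the problem with general q. The function names are hypothetical.

```python
import math


def fd_eigenvalue_q0(k, n):
    """k-th centered finite difference eigenvalue of -y'' = lambda*y on
    [0, pi], Dirichlet conditions, n interior mesh points, h = pi/(n+1)."""
    h = math.pi / (n + 1)
    return (4.0 / h ** 2) * math.sin(k * h / 2.0) ** 2


def asymptotic_correction(k, n):
    """Closed-form error of the q = 0 problem: exact k^2 minus FD value."""
    return k ** 2 - fd_eigenvalue_q0(k, n)


def corrected_eigenvalue(fd_value, k, n):
    """Add the q = 0 error to an FD eigenvalue computed for general q."""
    return fd_value + asymptotic_correction(k, n)


if __name__ == "__main__":
    # Constant potential q = c: the FD eigenvalue is the q = 0 value plus c,
    # so the correction recovers the exact eigenvalue k^2 + c.
    k, n, c = 10, 39, 3.0
    raw = fd_eigenvalue_q0(k, n) + c
    print(raw, corrected_eigenvalue(raw, k, n), k ** 2 + c)
```

The constant-potential case is chosen only because its finite difference eigenvalues are known without an eigensolver. For variable q the correction is not exact but removes the leading part of the error, which is largest for higher k.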
A parallel version of preconditioning techniques is developed for matrices arising from the Galerkin boundary element method for two-dimensional domains with Dirichlet boundary conditions. Results were obtained for implementations on a transputer network as well as on an nCUBE-2 parallel computer, showing that iterative solution methods are very well suited for a MIMD computer. A comparison of numerical results for iterative and direct solution methods is presented and underlines the superiority of iterative methods for large systems.
Many algorithms have been proposed to form manufacturing cells from component routings. However, many of these do not have the capability of solving large problems. We propose a procedure using similarity coefficients and a parallel genetic implementation of a TSP algorithm that is capable of solving large problems of up to 1000 parts and 1000 machines. In addition, we also compare our procedure with many existing procedures using nine well-known problems from the literature.
The results show that the proposed procedure compares well with the existing procedures and should be useful to practitioners and researchers.