Similar articles
Found 20 similar articles; search took 31 ms.
1.
The ability of modern graphics processors to operate on large matrices in parallel can be exploited for solving constrained image deblurring problems in a short time. In particular, in this paper we propose the parallel implementation of two iterative regularization methods: the well-known expectation maximization algorithm and a recent scaled gradient projection method. The main differences between the considered approaches and their impact on the parallel implementations are discussed. The effectiveness of the parallel schemes and the speedups over standard CPU implementations are evaluated on test problems arising from astronomical images.

2.
Multistage stochastic linear programs can represent a variety of practical decision problems. Solving a multistage stochastic program can be viewed as solving a large tree of linear programs. A common approach for solving these problems is the nested decomposition algorithm, which moves up and down the tree by solving nodes and passing information among nodes. The natural independence of subtrees suggests that much of the computational effort of the nested decomposition algorithm can run in parallel across small numbers of fast processors. This paper explores the advantages of such parallel implementations over serial implementations and compares alternative sequencing protocols for parallel processors. Computational experience on a large test set of practical problems with up to 1.5 million constraints and almost 5 million variables suggests that parallel implementations may indeed work well, but they require careful attention to processor load balancing. Supported in part by the National Science Foundation under Grants DDM-9215921 and SES-9211937.

3.
The inherent structure of cellular automata is trivially parallelizable and can directly benefit from massively parallel machines in computationally intensive problems. This paper presents both block synchronous and block pipeline (with asynchronous message passing) parallel implementations of cellular automata on distributed memory (message-passing) architectures. A structural design problem is considered to study the performance of the various cellular automata implementations. The synchronous parallel implementation is a mixture of Jacobi and Gauss–Seidel style iteration, and it becomes more Jacobi-like as the number of processors increases. Therefore, as the number of processors increases, it exhibits divergence because of the mathematical characteristics of the Jacobi iteration matrix for the structural problem. The proposed pipeline implementation preserves convergence by simulating a pure Gauss–Seidel style row-wise iteration. Numerical results for analysis and design of a cantilever plate made of composite material show that the pipeline update scheme is convergent and successfully generates optimal designs.
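The contrast between Jacobi and Gauss–Seidel updates described in this abstract can be illustrated on a toy system. The sketch below is not the paper's cellular automaton or its structural problem; the 3×3 matrix is an assumed textbook-style example (symmetric positive definite but not diagonally dominant) chosen because Jacobi diverges on it while Gauss–Seidel converges. All function names are hypothetical.

```python
def jacobi_sweep(A, b, x):
    """One Jacobi sweep: every update reads only the previous iterate."""
    n = len(b)
    return [
        (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
        for i in range(n)
    ]


def gauss_seidel_sweep(A, b, x):
    """One Gauss-Seidel sweep: each update reuses values refreshed earlier in the sweep."""
    n = len(b)
    x = list(x)
    for i in range(n):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i][i]
    return x


def iterate(sweep, A, b, sweeps):
    """Apply `sweep` repeatedly, starting from the zero vector."""
    x = [0.0] * len(b)
    for _ in range(sweeps):
        x = sweep(A, b, x)
    return x


# Assumed toy system: SPD, but the Jacobi iteration matrix has spectral radius 1.6.
A = [[1.0, 0.8, 0.8],
     [0.8, 1.0, 0.8],
     [0.8, 0.8, 1.0]]
b = [1.0, 1.0, 1.0]
exact = 1.0 / 2.6  # every component of the true solution

gs = iterate(gauss_seidel_sweep, A, b, 60)
jac = iterate(jacobi_sweep, A, b, 60)
gs_error = max(abs(v - exact) for v in gs)
jac_error = max(abs(v - exact) for v in jac)
```

In a block-parallel setting, each processor typically sees only stale values from its neighbours' blocks, which is what pushes a synchronous scheme toward the Jacobi behaviour above.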

4.
In this paper we deal with the solution of the separable convex cost network flow problem. In particular, we propose a parallel asynchronous version of the ε-relaxation method and we prove theoretically its correctness. We present two implementations of the parallel method for a shared memory multiprocessor system, and we empirically analyze their numerical performance on different test problems. The preliminary numerical results show a good reduction of the execution time of the parallel algorithm with respect to its sequential counterpart.

5.
The simplex method is frequently the most efficient method of solving linear programming (LP) problems. This paper reviews previous attempts to parallelise the simplex method in relation to efficient serial simplex techniques and the nature of practical LP problems. For the major challenge of solving general large sparse LP problems, there has been no parallelisation of the simplex method that offers significantly improved performance over a good serial implementation. However, there has been some success in developing parallel solvers for LPs that are dense or have particular structural properties. As an outcome of the review, this paper identifies scope for future work towards the goal of developing parallel implementations of the simplex method that are of practical value.

6.
《Optimization》2012,61(1-2):63-73
Serial and parallel implementations of the interior dual proximal point algorithm for the solution of large linear programs are described. A preconditioned conjugate gradient method is used to solve the linear system of equations that arises at each interior point iteration. Numerical results for a set of multicommodity network flow problems are given. For larger problems, the preconditioned conjugate gradient method outperforms direct methods of solution. In fact, it is impossible to handle very large problems by direct methods.
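For readers unfamiliar with the linear solver mentioned here, the following is a minimal sketch of a preconditioned conjugate gradient iteration for a symmetric positive definite system. The abstract does not say which preconditioner the authors use; the diagonal (Jacobi) preconditioner below is an assumption made purely for illustration, and the dense list-of-lists matrix and the function name `pcg` are likewise hypothetical.

```python
def pcg(A, b, tol=1e-10, max_iter=1000):
    """Preconditioned CG for SPD `A`, using an assumed diagonal preconditioner."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    m_inv = [1.0 / A[i][i] for i in range(n)]  # inverse of diag(A)
    x = [0.0] * n
    r = list(b)                                # residual b - A x for x = 0
    z = [m_inv[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)                                # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:            # stop once the residual norm is small
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [m_inv[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x


# Small SPD example; the exact solution is [1/11, 7/11].
solution = pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

Each iteration needs only one matrix-vector product and a few vector operations, so no factorization of the matrix is ever stored.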

7.
PARALLEL IMPLEMENTATIONS OF THE FAST SWEEPING METHOD   Total citations: 2, self-citations: 0, citations by others: 2
The fast sweeping method is an efficient iterative method for hyperbolic problems. It combines Gauss-Seidel iterations with alternating sweeping orderings. In this paper several parallel implementations of the fast sweeping method are presented. These parallel algorithms are simple and efficient due to the causality of the underlying partial differential equations. Numerical examples are used to verify our algorithms.
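To make "Gauss-Seidel iterations with alternating sweeping orderings" concrete, here is a serial sketch for one assumed model problem: the eikonal equation |∇u| = 1 on a uniform 2D grid with a single point source, discretized with the standard first-order upwind update. The abstract does not specify the test equations or the parallel decomposition, so the problem setup, the function name, and the parameters are illustrative assumptions only.

```python
import math


def fast_sweep_distance(n, source, h=1.0, passes=2):
    """Approximate the distance to `source` on an n-by-n grid with spacing h."""
    u = [[math.inf] * n for _ in range(n)]
    si, sj = source
    u[si][sj] = 0.0
    fwd, bwd = range(n), range(n - 1, -1, -1)
    # The four alternating orderings of a 2D Gauss-Seidel sweep.
    orderings = [(fwd, fwd), (fwd, bwd), (bwd, fwd), (bwd, bwd)]
    for _ in range(passes):
        for rows, cols in orderings:
            for i in rows:
                for j in cols:
                    if (i, j) == (si, sj):
                        continue
                    # Smallest neighbour in each grid direction (inf outside the grid).
                    a = min(u[i - 1][j] if i > 0 else math.inf,
                            u[i + 1][j] if i < n - 1 else math.inf)
                    b = min(u[i][j - 1] if j > 0 else math.inf,
                            u[i][j + 1] if j < n - 1 else math.inf)
                    lo, hi = min(a, b), max(a, b)
                    if lo == math.inf:
                        continue  # no information has reached this point yet
                    if hi - lo >= h:
                        cand = lo + h  # one-sided update
                    else:
                        cand = 0.5 * (a + b + math.sqrt(2 * h * h - (a - b) ** 2))
                    u[i][j] = min(u[i][j], cand)  # values only ever decrease
    return u


grid = fast_sweep_distance(9, (4, 4))
```

Because each sweep ordering propagates information along one family of characteristic directions, the four orderings are natural candidates for concurrent execution, which is the kind of causality the abstract refers to.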

8.
A parallel algorithm is proposed for the solution of narrow banded non‐symmetric linear systems. The linear system is partitioned into blocks of rows with a small number of unknowns common to multiple blocks. Our technique yields a reduced system defined only on these common unknowns which can then be solved by a direct or iterative method. A projection based extension to this approach is also proposed for computing the reduced system implicitly, which gives rise to an inner–outer iteration method. In addition, the product of a vector with the reduced system matrix can be computed efficiently on a multiprocessor by concurrent projections onto subspaces of block rows. Scalable implementations of the algorithm can be devised for hierarchical parallel architectures by exploiting the two‐level parallelism inherent in the method. Our experiments indicate that the proposed algorithm is a robust and competitive alternative to existing methods, particularly for difficult problems with a strongly indefinite symmetric part. Copyright © 2001 John Wiley & Sons, Ltd.

9.
10.
We introduce a master–worker framework for parallel global optimization of computationally expensive functions using response surface models. In particular, we parallelize two radial basis function (RBF) methods for global optimization, namely, the RBF method by Gutmann [Gutmann, H.M., 2001a. A radial basis function method for global optimization. Journal of Global Optimization 19(3), 201–227] (Gutmann-RBF) and the RBF method by Regis and Shoemaker [Regis, R.G., Shoemaker, C.A., 2005. Constrained global optimization of expensive black box functions using radial basis functions, Journal of Global Optimization 31, 153–171] (CORS-RBF). We modify these algorithms so that they can generate multiple points for simultaneous evaluation in parallel. We compare the performance of the two parallel RBF methods with a parallel multistart derivative-based algorithm, a parallel multistart derivative-free trust-region algorithm, and a parallel evolutionary algorithm on eleven test problems and on a 6-dimensional groundwater bioremediation application. The results indicate that the two parallel RBF algorithms are generally better than the other three alternatives on most of the test problems. Moreover, the two parallel RBF algorithms have comparable performances on the test problems considered. Finally, we report good speedups for both parallel RBF algorithms when using a small number of processors.

11.
The application of the Lanczos algorithm in Newton-like methods for solving nonlinear systems of equations arising in nonlinear structural finite element analysis is presented. It is shown that with appropriate preconditioners iterative methods can be developed which are robust and efficient even for ill-conditioned problems. Though the real advantage of iterative solvers seems to exist on distributed memory machines, even on serial machines the performance can be improved compared with direct solvers while saving memory capacity. With a specific modification of the Lanczos algorithm in combination with arc-length procedures a further speed-up of the nonlinear analysis can be achieved. For parallel implementations domain decomposition methods are used. A parallel preconditioning strategy based on an incomplete factorisation method is presented. An example is taken and the quality and efficiency of two different domain decomposition methods are discussed for a large shell structure. This work was supported by the BMBF (Bundesministerium für Bildung und Forschung) of Germany.

12.
Traditionally, two variants of the L-shaped method based on Benders’ decomposition principle are used to solve two-stage stochastic programming problems: the aggregate and the disaggregate version. In this study we report our experiments with a special convex programming method applied to the aggregate master problem. The convex programming method is of the type that uses an oracle with on-demand accuracy. We use a special form which, when applied to two-stage stochastic programming problems, is shown to integrate the advantages of the traditional variants while avoiding their disadvantages. On a set of 105 test problems, we compare and analyze parallel implementations of regularized and unregularized versions of the algorithms. The results indicate that solution times are significantly shortened by applying the concept of on-demand accuracy.

13.
This paper describes serial and parallel implementations of two different search techniques applied to the traveling salesman problem. A novel approach has been taken to parallelize simulated annealing and the results are compared with the traditional annealing algorithm. This approach uses an abbreviated cooling schedule and achieves a superlinear speedup. Also a new search technique, called tabu search, has been adapted to execute in a parallel computing environment. Comparisons between simulated annealing and tabu search indicate that tabu search consistently outperforms simulated annealing with respect to computation time while giving comparable solutions. Examples include 25-, 33-, 42-, 50-, 57-, 75- and 100-city problems.

14.
Two modifications of Laguerre's method are given. They define methods for simultaneously approximating all the zeros of a given polynomial. The asymptotic behavior of the methods is studied. The possibilities of both sequential and parallel implementations of the methods are considered. This research was supported by the National Science Foundation under grant number NSF-DCR-74-10042.

15.
In this paper, we propose efficient parallel implementations of the auction/sequential shortest path and the ε-relaxation algorithms for solving the linear minimum cost flow problem. In the parallel auction algorithm, several augmenting paths can be found simultaneously, each of them starting from a different node with positive surplus. Convergence results of an asynchronous version of the algorithm are also given. For the ε-relaxation method, parallel versions implemented on CM-5 and CM-2 already exist; our implementation is the first on a shared memory multiprocessor. We have obtained significant speedup values for the algorithms considered; it turns out that our implementations are effective and efficient.

16.
The purpose of this Note is to propose a time discretization of a partial differential evolution equation that allows for parallel implementations. The method, based on an Euler scheme, combines coarse resolutions and independent fine resolutions in time in the same spirit as standard spatial approximations. The resulting parallel implementation is done in the nonstandard time direction. Its main goal concerns real-time problems, hence the proposed terminology of “parareal” algorithm.
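The combination of coarse and fine resolutions in time can be sketched with the commonly cited parareal correction, in which the new value at the end of a time slice is the coarse propagation of the new iterate plus the difference between the fine and coarse propagations of the previous iterate. The code below is a serial illustration on a scalar ODE, not the Note's PDE setting; the explicit Euler propagators, the number of fine substeps, and all names are assumptions.

```python
def euler(f, y, t0, t1, steps):
    """Explicit Euler from t0 to t1 using `steps` equal substeps."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + h * f(t, y)
        t += h
    return y


def parareal(f, y0, t_end, n_slices, n_iters, fine_steps=100):
    """Return the parareal iterate at the slice boundaries after `n_iters` corrections."""
    ts = [t_end * n / n_slices for n in range(n_slices + 1)]

    def coarse(y, n):  # cheap propagator: one Euler step per slice
        return euler(f, y, ts[n], ts[n + 1], 1)

    def fine(y, n):    # accurate propagator: many Euler substeps per slice
        return euler(f, y, ts[n], ts[n + 1], fine_steps)

    # Initial guess: a serial coarse sweep over all slices.
    u = [y0]
    for n in range(n_slices):
        u.append(coarse(u[n], n))

    for _ in range(n_iters):
        # These evaluations are independent per slice and could run in parallel.
        f_old = [fine(u[n], n) for n in range(n_slices)]
        g_old = [coarse(u[n], n) for n in range(n_slices)]
        # Cheap serial correction sweep.
        new = [y0]
        for n in range(n_slices):
            new.append(coarse(new[n], n) + f_old[n] - g_old[n])
        u = new
    return u


def decay(t, y):
    return -y


# Reference: the fine propagator applied slice after slice in serial.
n_slices, t_end = 8, 2.0
reference = [1.0]
for n in range(n_slices):
    reference.append(euler(decay, reference[n], t_end * n / n_slices,
                           t_end * (n + 1) / n_slices, 100))


def max_error(n_iters):
    u = parareal(decay, 1.0, t_end, n_slices, n_iters)
    return max(abs(a - b) for a, b in zip(u, reference))
```

After as many iterations as there are slices, the iterate reproduces the serial fine solution up to rounding, so any practical gain depends on stopping much earlier than that.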

17.
The extended backward differentiation formulas (EBDFs) and their modified form (MEBDF) were proposed by Cash in the 1980s for solving initial value problems (IVPs) for stiff systems of ordinary differential equations (ODEs). In a recent performance evaluation of various IVP solvers, including a variable-step-variable-order implementation of the MEBDF method by Cash, it turned out that the MEBDF code often performs more efficiently than codes like RADAU5, DASSL and VODE. This motivated us to look at possible parallel implementations of the MEBDF method. Each MEBDF step essentially consists of successively solving three non-linear systems by means of modified Newton iteration using the same Jacobian matrix. In a direct implementation of the MEBDF method on a parallel computer system, the only scope for (coarse grain) parallelism consists of a number of parallel vector updates. However, all forward–backward substitutions and all right-hand-side evaluations have to be done in sequence. In this paper, our starting point is the original (unmodified) EBDF method. As a consequence, two different Jacobian matrices are involved in the modified Newton method, but on a parallel computer system, the effective Jacobian-evaluation and the LU decomposition costs are not increased. Furthermore, we consider the simultaneous solution, rather than the successive solution, of the three non-linear systems, so that in each iteration the forward–backward substitutions and the right-hand-side evaluations can be done concurrently. A mutual comparison of the performance of the parallel EBDF approach and the MEBDF approach shows that we can expect a speed-up factor of about 2 on three processors.

18.
Cooperative Parallel Tabu Search for Capacitated Network Design   Total citations: 1, self-citations: 0, citations by others: 1
We present a cooperative parallel tabu search method for the fixed charge, capacitated, multicommodity network design problem. Several communication strategies are analyzed and compared. The resulting parallel procedure displays excellent performances in terms of solution quality and solution times. The experiments show that parallel implementations find better solutions than sequential ones. They also show that, when properly designed and implemented, cooperative search outperforms independent search strategies, at least on the class of problems of interest here.

19.
A parallel version of preconditioning techniques is developed for matrices arising from the Galerkin boundary element method for two-dimensional domains with Dirichlet boundary conditions. Results were obtained for implementations on a transputer network as well as on an nCUBE-2 parallel computer, showing that iterative solution methods are very well suited for a MIMD computer. A comparison of numerical results for iterative and direct solution methods is presented and underlines the superiority of iterative methods for large systems.

20.
A parallel algorithm for constrained concave quadratic global minimization   Total citations: 2, self-citations: 0, citations by others: 2
The global minimization of large-scale concave quadratic problems over a bounded polyhedral set using a parallel branch and bound approach is considered. The objective function consists of both a concave part (nonlinear variables) and a strictly linear part, which are coupled by the linear constraints. These large-scale problems are characterized by having the number of linear variables much greater than the number of nonlinear variables. A linear underestimating function to the concave part of the objective is easily constructed and minimized over the feasible domain to get both upper and lower bounds on the global minimum function value. At each minor iteration of the algorithm, the feasible domain is divided into subregions and linear underestimating problems over each subregion are solved in parallel. Branch and bound techniques can then be used to eliminate parts of the feasible domain from consideration and improve the upper and lower bounds. It is shown that the algorithm guarantees that a solution is obtained to within any specified tolerance in a finite number of steps. Computational results are presented for problems with 25 and 50 nonlinear variables and up to 400 linear variables. These results were obtained on a four-processor CRAY2 using both sequential and parallel implementations of the algorithm. The average parallel solution time was approximately 15 seconds for problems with 400 linear variables and a relative tolerance of 0.001. For a relative tolerance of 0.1, the average computation time appears to increase only linearly with the number of linear variables.
