首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
A processor is balanced in carrying out a computation if its computing time equals its I/O time. When the computation bandwidth of a processor is increased, like when multiple processors are incorporated to form an array, the critical question is to what degree the processor's memory must be enlarged in order to alleviate the I/O bottleneck to keep the computation balanced. In this paper, for the sorting problem, we present two balanced algorithms on linearly connected and mesh-connected processor arrays, respectively, and show that they reach the derived lower bounds of memory sizes. We also verify that the time complexities of the algorithms are optimal under their respective hardware constraints.  相似文献   

2.
We consider the problem of factoring a dense n×n matrix on a network consisting of P MIMD processors, with no shared memory, when the network is smaller than the number of elements in the matrix (P<n2). The specific example analyzed is a computational network that arises in computing the LU, QR, or Cholesky factorizations. We prove that if the nodes of the network are evenly distributed among processors and if computations are scheduled by a round-robin or a least-recently-executed scheduling algorithm, then optimal order of speedup is achieved. However, such speedup is not necessarily achieved for other scheduling algorithms or if the computation for the nodes is inappropriately split across processors, and we give examples of these phenomena. Lower bounds on execution time for the algorithm are established for two important node-assignment strategies.  相似文献   

3.
Computations with univariate polynomials, like the evaluation of product, quotient, remainder, greatest common divisor, etc, are closely related to linear algebra computations performed with structured matrices having the Toeplitz-like or the Hankel-like structures.

The discrete Fourier transform, and the FFT algorithms for its computation, constitute a powerful tool for the design and analysis of fast algorithms for solving problems involving polynomials and structured matrices.

We recall the main correlations between polynomial and matrix computations and present two recent results in this field: in particular, we show how Fourier methods can speed up the solution of a wide class of problems arising in queueing theory where certain Markov chains, defined by infinite block Toeplitz matrices in generalized Hessenberg form, must be solved. Moreover, we present a new method for the numerical factorization of polynomials based on a matrix generalization of Koenig's theorem. This method, that relies on the evaluation/interpolation technique at the Fourier points, reduces the problem of polynomial factorization to the computation of the LU decomposition of a banded Toeplitz matrix with its rows and columns suitably permuted. Numerical experiments that show the effectiveness of our algorithms are presented  相似文献   

4.
In this paper it is shown that Neville elimination is suited to exploit the rank structure of an order-r quasiseparable matrix ACn×n by providing a condensed decomposition of A as product of unit bidiagonal matrices, all together specified by O(nr) parameters, at the cost of O(nr3) flops. An application of this result for eigenvalue computation of totally positive rank-structured matrices is also presented.  相似文献   

5.
Balancing a matrix by a simple and accurate similarity transformation can improve the speed and accuracy of numerical methods for computing eigenvalues. We describe balancing strategies for a large and sparse Hamiltonian matrix H.It is first shown how to permute H to irreducible form while retaining its structure. This form can be used to decompose the Hamiltonian eigenproblem into smaller-sized problems. Next, we discuss the computation of a symplectic scaling matrix D so that the norm of D−1HD is reduced. The considered scaling algorithm is solely based on matrix-vector products and thus particularly suitable if the elements of H are not explicitly given. The merits of balancing for eigenvalue computations are illustrated by several practically relevant examples.  相似文献   

6.
The aim of this paper is to gain more insight into vector and matrix medians and to investigate algorithms to compute them. We prove relations between vector and matrix means and medians, particularly regarding the classical structure tensor. Moreover, we examine matrix medians corresponding to different unitarily invariant matrix norms for the case of symmetric 2×2 matrices, which frequently arise in image processing. Our findings are explained and illustrated by numerical examples. To solve the corresponding minimization problems, we propose several algorithms. Existing approaches include Weiszfeld’s algorithm for the computation of ?2 vector medians and semi-definite programming, in particular, second order cone programming, which has been used for matrix median computation. In this paper, we adapt Weiszfeld’s algorithm for our setting and show that also two splitting methods, namely the alternating direction method of multipliers and the parallel proximal algorithm, can be applied for generalized vector and matrix median computations. Besides, we compare the performance of these algorithms numerically and apply them within local median filters.  相似文献   

7.
Modern computers have several levels of memory hierarchy. To obtain good performance on these processors it is necessary to design algorithms that minimize I/O traffic to slower memories in the hierarchy. In this paper, we present I/O efficient algorithms to pebble r-pyramids and derive lower bounds on the number of I/O operations to do so. The r-pyramid graph models financial applications which are of practical interest and where minimizing memory traffic can have a significant impact on cost saving.  相似文献   

8.
An algorithm for finding the K best cuts in a network is presented. Using a branch technique introduced by Lawler [4] we reduce the problem to K computations of 2nd best cuts. The latter problem can be solved by an O(n4) algorithm yielding an overall complexity of O(K·n4) for the presented algorithm.  相似文献   

9.
This paper presents a fast and flexible heuristic for the multi item capacitated lotsizing problem. The lotsizing step requires only O(NT) computations which is at least an order of magnitude faster than other well-known heuristics. The method is flexible since it allows the user to specify features such as how to sort items, which criterion to use in combining lots and how to search the demand matrix. Extensive computational results show that the new method outperforms existing heuristics both in terms of average solution quality and required computation times.  相似文献   

10.
An algorithm is presented in this paper by which the rth root of real or complex matrices can be found without the computation of the eigenvalues and eigenvectors of the matrix. All required computations are in the real domain. The method is based on the Newton-Raphson algorithm and is capable of finding roots even when the matrix is defective. Computing the root of a matrix from eigenvalues and eigenvectors would be the preferred method if these data were available.  相似文献   

11.
Models of synchronized parallel computation in which all the processors have access to a common memory are considered. We focus on algorithms in models that allow simultaneous access to the same memory location, for both read and write instructions. Assume that such an algorithm uses p processors, d time units, and s memory space. We present a universal algorithm that implements this algorithm in models that forbid simultaneous access to the same memory location, using p processors, O(dlog2p) time units, and O (s + p) memory space his implementation algorithm is shown to compare favorably with its conventional naive counterpart, as the extra memory space it requires is independent of the implemented algorithm.  相似文献   

12.
An order O(2n) algorithm for computing all the principal minors of an arbitrary n × n complex matrix is motivated and presented, offering an improvement by a factor of n3 over direct computation. The algorithm uses recursive Schur complementation and submatrix extraction, storing the answer in a binary order. An implementation of the algorithm in MATLAB® is also given and practical considerations are discussed and treated accordingly.  相似文献   

13.
A fast solution algorithm is proposed for solving block banded block Toeplitz systems with non-banded Toeplitz blocks. The algorithm constructs the circulant transformation of a given Toeplitz system and then by means of the Sherman-Morrison-Woodbury formula transforms its inverse to an inverse of the original matrix. The block circulant matrix with Toeplitz blocks is converted to a block diagonal matrix with Toeplitz blocks, and the resulting Toeplitz systems are solved by means of a fast Toeplitz solver.The computational complexity in the case one uses fast Toeplitz solvers is equal to ξ(m,n,k)=O(mn3)+O(k3n3) flops, there are m block rows and m block columns in the matrix, n is the order of blocks, 2k+1 is the bandwidth. The validity of the approach is illustrated by numerical experiments.  相似文献   

14.
15.
We propose Fourier transform algorithms using QTT format for data-sparse approximate representation of one- and multi-dimensional vectors (m-tensors). Although the Fourier matrix itself does not have a low-rank QTT representation, it can be efficiently applied to a vector in the QTT format exploiting the multilevel structure of the Cooley-Tukey algorithm. The m-dimensional Fourier transform of an n×?×n vector with n=2 d has $\mathcal{O}(m d^{2} R^{3})$ complexity, where R is the maximum QTT-rank of input, output and all intermediate vectors in the procedure. For the vectors with moderate R and large n and m the proposed algorithm outperforms the $\mathcal{O}(n^{m} \log n)$ fast Fourier transform (FFT) algorithm and has asymptotically the same log-squared complexity as the superfast quantum Fourier transform (QFT) algorithm. By numerical experiments we demonstrate the examples of problems for which the use of QTT format relaxes the grid size constrains and allows the high-resolution computations of Fourier images and convolutions in higher dimensions without the ‘curse of dimensionality’. We compare the proposed method with Sparse Fourier transform algorithms and show that our approach is competitive for signals with small number of randomly distributed frequencies and signals with limited bandwidth.  相似文献   

16.
A review is given of some recent developments in the differential geometry of quantum computation for which the quantum evolution is described by the special unitary unimodular group, SU(2n). Using the Lie algebra su(2n), detailed derivations are given of a useful Riemannian geometry of SU(2n), including the connection and the geodesic equation for minimal complexity quantum computations.  相似文献   

17.
We investigate the complexity ofhalf-space range searching: givenn points ind-space, build a data structure that allows us to determine efficiently how many points lie in a query half-space. We establish a tradeoff between the storagem and the worst-case query timet in the Fredman/Yao arithmetic model of computation. We show thatt must be at least on the order of $$\frac{{(n/\log n)^{1 - (d - 1)/d(d + 1)} }}{{m^{1/d} }}$$ Although the bound is unlikely to be optimal, it falls reasonably close to the recent upper bound ofO(n/m 1/d ) established by Matou?ek. We also show that it is possible to devise a sequence ofn inserts and half-space range queries that require a total time ofn 2-O(1/d) . Our results imply the first nontrivial lower bounds for spherical range searching in any fixed dimension. For example, they show that, with linear storage, circular range queries in the plane require Ω(n 1/3) time (modulo a logarithmic factor).  相似文献   

18.
Bender-Canfield showed that a plethora of graph counting problems in orientable/non-orientable surfaces involve two constants tg and pg for the orientable and the non-orientable case, respectively. T.T.Q. Le and the authors recently discovered a hidden relation between the sequence tg and a formal power series solution u(z) of the Painlevé I equation which, among other things, allows to give exact asymptotic expansion of tg to all orders in 1/g for large g. The paper introduces a formal power series solution v(z) of a Riccati equation, gives a non-linear recursion for its coefficients and an exact asymptotic expansion to all orders in g for large g, using the theory of Borel transforms. In addition, we conjecture a precise relation between the sequence pg and v(z). Our conjecture is motivated by the enumerative aspects of a quartic matrix model for real symmetric matrices, and the analytic properties of its double scaling limit. In particular, the matrix model provides a computation of the number of rooted quadrangulations in the 2-dimensional projective plane. Our conjecture implies analyticity of the O(N)- and Sp(N)-types of free energy of an arbitrary closed 3-manifold in a neighborhood of zero. Finally, we give a matrix model calculation of the Stokes constants, pose several problems that can be answered by the Riemann-Hilbert approach, and provide ample numerical evidence for our results.  相似文献   

19.
Given a set S of n points in R3, we wish to decide whether S has a subset of size at least k with Euclidean diameter at most r. It is unknown whether this decision problem is NP-hard. The two closely related optimization problems, (i) finding a largest subset of diameter at most r, and (ii) finding a subset of the smallest diameter of size at least k, were recently considered by Afshani and Chan. For maximizing the size, they presented several polynomial-time algorithms with constant approximation factors, the best of which has a factor of . For maximizing the diameter, they presented a polynomial-time approximation scheme. In this paper, we present improved approximation algorithms for both optimization problems. For maximizing the size, we present two algorithms: the first one improves the approximation factor to 2.5 and the running time by an O(n) factor; the second one improves the approximation factor to 2 and the running time by an O(n2) factor. For minimizing the diameter, we improve the running time of the PTAS from O(nlogn+2O(1/ε3)n) to O(nlogn+2O(1/(ε1.5logε))n).  相似文献   

20.
This paper considers the problem of selecting a subset of N projects subject to multiple resource constraints. The objective is to maximize the net present value of the total return, where the return from each project depends on its completion time and the previously implemented projects. It is observed that the optimal order (sequence) of projects does not depend on the particular subset of selected projects and hence the overall problem reduces to a multiconstraint nonlinear integer (0–1) problem. This problem can be solved optimally in pseudo-polynomial time using a dynamic programming method but the method is impractical except when the number of constraints, K, is very small. We therefore propose two heuristic methods which require O(N3K) and O(N4K) computations, respectively, and evaluate them computationally on 640 problems with up to 100 projects and up to 8 constraints. The results show that the best heuristic is on the average within 0.08% of the optimum for the single-constraint case and within 2.61% of the optimum for the multiconstraint case.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号