Abstract: | We describe the design and implementation of a parallel QR decomposition algorithm for a large sparse matrix A . The algorithm is based on the multifrontal approach and makes use of Householder transformations. The tasks are distributed among processors according to an assembly tree which is built from the symbolic factorization of the matrix A T A . We first address uniprocessor issues and then discuss the multiprocessor implementation of the method. We consider the parallelization of both the factorization phase and the solve phase. We use relaxation of the sparsity structure of both the original matrix and the frontal matrices to improve the performance. We show that, in this case, the use of Level 3 BLAS can lead to very significant gains in performance. We use the eight processor Alliant˜FX/80 at CERFACS to illustrate our discussion. |