A parameterized ordering for cache-, register- and pipeline-efficient Givens QR decomposition
Authors: James J. Carrig (1), Gerard G. L. Meyer (2)
Institution: (1) Sony Electronics, Inc., Santa Clara, CA 95054, USA; (2) Johns Hopkins University, Baltimore, MD 21218, USA
Abstract: A parameterized ordering of Givens rotations and guidelines for choosing parameter values are presented in the context of QR decomposition. Although a standard selection of parameter values retrieves an ordering that corresponds to a well-known algorithm, we show that non-standard values decrease the execution time. We implement the new ordering on an Intel Pentium Pro system, a single thin POWER2 processor of the IBM SP2, and a single R8000 processor of the SGI POWER Challenge XL. On each machine, we observe performance that is more than twice that of the original ordering. This revised version was published online in June 2006 with corrections to the Cover Date.
Keywords: Givens; QR algorithm; superscalar processors; 65F05; 65F25; 65Y10
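
For readers unfamiliar with the technique named in the abstract, the following is a minimal sketch of QR decomposition by Givens rotations in plain Python. It is not the authors' implementation and does not reproduce their parameterized ordering. The names `givens` and `givens_qr` and the `order` argument are hypothetical, chosen only for illustration. The default order is the textbook column-by-column sweep with rotations on adjacent rows; a caller-supplied order is assumed to leave previously created zeros intact.

```python
import math


def givens(a, b):
    """Return (c, s) so that the rotation [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    if b == 0.0:
        return 1.0, 0.0
    r = math.hypot(a, b)
    return a / r, b / r


def givens_qr(A, order=None):
    """Factor A (a list of rows) as Q * R using Givens rotations.

    `order` is a hypothetical illustration parameter: a sequence of (i, j)
    pairs, each meaning "zero entry (i, j) by rotating rows i-1 and i".
    The default is the standard sweep: column by column, bottom row upward.
    """
    m, n = len(A), len(A[0])
    R = [[float(x) for x in row] for row in A]
    # Qt accumulates the product of all rotations, so that Qt * A = R.
    Qt = [[1.0 if i == k else 0.0 for k in range(m)] for i in range(m)]
    if order is None:
        order = [(i, j) for j in range(min(m - 1, n)) for i in range(m - 1, j, -1)]
    for i, j in order:
        c, s = givens(R[i - 1][j], R[i][j])
        # Apply the same rotation to rows i-1 and i of both R and Qt.
        for M in (R, Qt):
            for k in range(len(M[0])):
                u, v = M[i - 1][k], M[i][k]
                M[i - 1][k] = c * u + s * v
                M[i][k] = -s * u + c * v
    # Q is the transpose of the accumulated rotations.
    Q = [[Qt[k][i] for k in range(m)] for i in range(m)]
    return Q, R


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    Q, R = givens_qr(A)
    for row in R:
        print(["%.4f" % x for x in row])
```

The sequence of (i, j) pairs is the only thing that changes between orderings. The arithmetic of each rotation is the same, but the pattern of memory accesses is not, and that is the degree of freedom the abstract refers to when it discusses cache, register and pipeline efficiency.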
This article is indexed in SpringerLink and other databases.