Lanczos and the Riemannian SVD in information retrieval applications
Authors: Ricardo D Fierro, Eric P Jiang
Abstract: Variations of the latent semantic indexing (LSI) method in information retrieval (IR) require the computation of singular subspaces associated with the k dominant singular values of a large m × n sparse matrix A, where k ≪ min(m, n). The Riemannian SVD was recently generalized to low‐rank matrices arising in IR and shown to be an effective approach for formulating an enhanced semantic model that captures the latent term‐document structure of the data. However, in terms of storage and computation requirements, its implementation can be much improved for large‐scale applications. We discuss an efficient and reliable algorithm, called SPK‐RSVD‐LSI, as an alternative approach for deriving the enhanced semantic model. The algorithm combines the generalized Riemannian SVD and the Lanczos method with full reorthogonalization and explicit restart strategies. We demonstrate that our approach performs as well as the original low‐rank Riemannian SVD method by comparing their retrieval performance on a well‐known benchmark document collection. Copyright 2004 John Wiley & Sons, Ltd.
Keywords: information retrieval; latent semantic indexing; Lanczos method; singular value decomposition; sparse
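
Illustrative sketch (not part of the original record): the abstract refers to computing the k dominant singular triplets of a large matrix A with a Lanczos method that uses full reorthogonalization. The Python code below is a minimal sketch of that one ingredient, using Golub-Kahan-Lanczos bidiagonalization. It is not the SPK‐RSVD‐LSI algorithm: it omits the Riemannian SVD and the explicit restart strategy, and the names lanczos_svd, _orthogonalize, steps, seed and tol are hypothetical choices made for this example.

```python
import numpy as np


def _orthogonalize(w, Q):
    """Remove from w its components along the columns of Q (two passes)."""
    for _ in range(2):
        w = w - Q @ (Q.T @ w)
    return w


def lanczos_svd(A, k, steps=None, seed=0, tol=1e-12):
    """Approximate the k dominant singular triplets of A.

    Runs Golub-Kahan-Lanczos bidiagonalization with full
    reorthogonalization and no restarts (an assumption of this sketch),
    then takes the SVD of the small bidiagonal matrix B.
    """
    m, n = A.shape
    steps = min(m, n) if steps is None else min(steps, m, n)
    U = np.zeros((m, steps))   # left Lanczos vectors
    V = np.zeros((n, steps))   # right Lanczos vectors
    alpha = np.zeros(steps)    # diagonal of B
    beta = np.zeros(steps)     # superdiagonal of B

    v = np.random.default_rng(seed).standard_normal(n)
    v /= np.linalg.norm(v)
    u_prev, b, p = np.zeros(m), 0.0, 0
    for j in range(steps):
        V[:, j] = v
        # A v_j = alpha_j u_j + beta_{j-1} u_{j-1}
        u = _orthogonalize(A @ v - b * u_prev, U[:, :j])
        a = np.linalg.norm(u)
        if a < tol:            # breakdown: no new left direction
            break
        u /= a
        U[:, j], alpha[j], p = u, a, j + 1
        # A^T u_j = alpha_j v_j + beta_j v_{j+1}
        w = _orthogonalize(A.T @ u - a * v, V[:, :j + 1])
        b = np.linalg.norm(w)
        if b < tol:            # breakdown: no new right direction
            break
        beta[j] = b
        v, u_prev = w / b, u

    if p == 0:
        raise ValueError("A is numerically zero along the start vector")
    # Upper bidiagonal projection of A onto the Lanczos bases.
    B = np.diag(alpha[:p]) + np.diag(beta[:p - 1], 1)
    P, s, Qt = np.linalg.svd(B)
    return U[:, :p] @ P[:, :k], s[:k], V[:, :p] @ Qt[:k].T
```

The function touches A only through the products A @ v and A.T @ u, so it should also accept a sparse matrix type that supports those operations. When steps is much smaller than min(m, n), the returned triplets are approximations whose accuracy depends on the spectrum of A. Managing that cost and accuracy trade-off is the role restart strategies play in practice.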