

Sample size selection in optimization methods for machine learning
Authors: Richard H. Byrd, Gillian M. Chin, Jorge Nocedal, Yuchen Wu
Institutions: 1. Department of Computer Science, University of Colorado, Boulder, CO, USA
2. Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, IL, USA
3. Google Inc., Mountain View, CA, USA
Abstract: This paper presents a methodology for using varying sample sizes in batch-type optimization methods for large-scale machine learning problems. The first part of the paper deals with the delicate issue of dynamic sample selection in the evaluation of the function and gradient. We propose a criterion for increasing the sample size based on variance estimates obtained during the computation of a batch gradient, and we establish a complexity bound on the total cost of a gradient method. The second part of the paper describes a practical Newton method that uses a smaller sample to compute Hessian-vector products than to evaluate the function and the gradient, and that also employs a dynamic sampling technique. The third part of the paper shifts focus to L1-regularized problems designed to produce sparse solutions. We propose a Newton-like method that consists of two phases: a (minimalistic) gradient projection phase that identifies zero variables, and a subspace phase that applies a subsampled Hessian Newton iteration in the free variables. Numerical tests on speech recognition problems illustrate the performance of the algorithms.
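The variance-based criterion described in the abstract can be illustrated with a short sketch. The idea, as summarized above, is to accept the current sample if the variance of the sample-averaged gradient is small relative to the squared norm of that gradient, and otherwise to grow the sample. The threshold parameter `theta` and the helper names below are illustrative assumptions, not the paper's exact notation or constants.

```python
import numpy as np

def variance_test_passes(grad_samples, theta=0.9):
    """Check a variance condition for a batch gradient estimate.

    grad_samples: array of shape (|S|, d) holding per-example gradients
    on the current sample S. The sample is accepted when the variance of
    the averaged gradient is dominated by theta^2 * ||g_S||^2.
    """
    sample_size = grad_samples.shape[0]
    g = grad_samples.mean(axis=0)                   # batch gradient on S
    var = grad_samples.var(axis=0, ddof=1)          # componentwise sample variance
    # variance of the mean estimator is (sum of variances) / |S|
    return var.sum() / sample_size <= theta**2 * np.dot(g, g)

def suggested_sample_size(grad_samples, theta=0.9):
    """If the test fails, return a larger |S| under which the current
    variance estimate would satisfy the condition; otherwise keep |S|."""
    sample_size = grad_samples.shape[0]
    g = grad_samples.mean(axis=0)
    var = grad_samples.var(axis=0, ddof=1)
    bound = theta**2 * np.dot(g, g)
    if var.sum() / sample_size <= bound:
        return sample_size
    return int(np.ceil(var.sum() / bound))
```

For example, a sample whose per-example gradients are tightly clustered passes the test and keeps its size, while a high-variance sample triggers an increase proportional to the estimated variance.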
This article is indexed in SpringerLink and other databases.