首页 | 本学科首页   官方微博 | 高级检索  
     检索      


Data compression and histograms
Authors:Bin Yu  T P Speed
Institution:(1) Department of Statistics, University of Wisconsin, 53706 Madison, WI, USA;(2) Department of Statisties, University of California, 94720 Berkeley, CA, USA
Abstract:Summary In this paper, the relationship between code length and the selection of the number of bins for a histogram density is considered for a sequence of iid observations on 0,1]. First, we use a shortest code length criterion to select the number of bins for a histogram. A uniform almost sure asymptotic expansion for the code length is given and it is used to prove the asymptotic optimality of the selection rule. In addition, the selection rule is consistent if the true density is uniform 0,1]. Secondly, we deal with the problem: what is the ldquobestrdquo achievable average code length with underlying density functionf? Minimax lower bounds are derived for the average code length over certain smooth classes of underlying densitiesf. For the smooth class with bounded first derivatives, the rate in the lower bound is shown to be achieved by a code based on a sequence of histograms whose number of bins is changed predictively. Moreover, this best code can be modified to ensure that the almost sure version of the code length has asymptotically the same behavior as its expected value, i.e., the average code length.Research supported in part by NSF grant DMS-8701426Research supported in part by NSF grant DMS-8802378
Keywords:60G05  94A99  94A17
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号