Regression Models for Multivariate Count Data
Authors: Yiwen Zhang, Hua Zhou, Jin Zhou, Wei Sun
Institution: 1. Department of Statistics, North Carolina State University, Raleigh, North Carolina; 2. Department of Biostatistics, University of California, Los Angeles, California; 3. Division of Epidemiology and Biostatistics, University of Arizona, Tucson, Arizona; 4. Program in Biostatistics and Biomathematics, Fred Hutchinson Cancer Research Center, Seattle, Washington
Abstract: Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of overdispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly because they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data. Supplementary materials for this article are available online.
Keywords: Analysis of deviance; Categorical data analysis; Dirichlet-multinomial; Generalized Dirichlet-multinomial; Iteratively reweighted Poisson regression (IRPR); Negative multinomial; Reduced rank GLM; Regularization
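
Note: The abstract's point about the multinomial model's restrictive mean-variance structure can be illustrated with a small calculation. The sketch below is not taken from the article. It compares the textbook per-category variance of a multinomial with that of a Dirichlet-multinomial sharing the same mean proportions. The batch size `n`, the parameter vector `alpha`, and both function names are hypothetical choices made only for illustration.

```python
# Illustrative sketch: variance of multinomial vs. Dirichlet-multinomial counts.
# The values of n and alpha are made-up assumptions, not data from the article.

def multinomial_variance(n, p):
    """Var(X_j) = n * p_j * (1 - p_j); fixed once the mean is fixed."""
    return [n * pj * (1.0 - pj) for pj in p]

def dirichlet_multinomial_variance(n, alpha):
    """Var(X_j) = n * p_j * (1 - p_j) * (n + a0) / (1 + a0), with p_j = alpha_j / a0."""
    a0 = sum(alpha)
    inflation = (n + a0) / (1.0 + a0)  # overdispersion factor, > 1 whenever n > 1
    return [n * (a / a0) * (1.0 - a / a0) * inflation for a in alpha]

n = 100                    # total count per observation (assumed)
alpha = [2.0, 3.0, 5.0]    # Dirichlet parameters (assumed)
p = [a / sum(alpha) for a in alpha]  # shared mean proportions

mult_var = multinomial_variance(n, p)
dm_var = dirichlet_multinomial_variance(n, alpha)

for j, (m, d) in enumerate(zip(mult_var, dm_var)):
    print(f"category {j}: multinomial var = {m:.1f}, Dirichlet-multinomial var = {d:.1f}")
```

With the same means, the Dirichlet-multinomial variance exceeds the multinomial variance by the factor (n + a0) / (1 + a0). A model that forces the multinomial variance cannot represent this kind of overdispersion.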