Found 20 similar documents (search time: 0 ms)
1.
Raymond Chambers, Hukum Chandra 《Journal of computational and graphical statistics》2013,22(2):452-470
Random effects models for hierarchically dependent data, for example, clustered data, are widely used. A popular bootstrap method for such data is the parametric bootstrap based on the same random effects model as that used in inference. However, it is hard to justify this type of bootstrap when this model is known to be an approximation. In this article, we describe a random effect block bootstrap approach for clustered data that is simple to implement, free of both the distribution and the dependence assumptions of the parametric bootstrap, and is consistent when the mixed model assumptions are valid. Results based on Monte Carlo simulation show that the proposed method seems robust to failure of the dependence assumptions of the assumed mixed model. An application to a realistic environmental dataset indicates that the method produces sensible results. Supplementary materials for the article, including the data used for the application, are available online.
2.
Asymptotic Properties of a Class of Mixture Models for Failure Data: The Interior and Boundary Cases
H. T. V. Vu, R. A. Maller, X. Zhou 《Annals of the Institute of Statistical Mathematics》1998,50(4):627-653
We analyse an exponential family of distributions which generalises the exponential distribution for censored failure time data, analogous to the way in which the class of generalised linear models generalises the normal distribution. The parameter of the distribution depends on a linear combination of covariates via a possibly nonlinear link function, and we allow another level of heterogeneity: the data may contain "immune" individuals who are not subject to failure. Thus the data is modelled by a mixture of a distribution from the exponential family and a "mass at infinity" representing individuals who never fail. Our results include large sample distributions for parameter estimators and for hypothesis test statistics obtained by maximising the likelihood of a sample. The asymptotic distribution of the likelihood ratio test statistic for the hypothesis that there are no immunes present in the population is shown to be "non-standard"; it is a 50-50 mixture of a chi-squared distribution on 1 degree of freedom and a point mass at 0. Our analysis clearly shows how "negligibility" of individual covariate values and "sufficient followup" conditions are required for the asymptotic properties.
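The 50-50 mixture limit stated in the abstract above changes how a p-value is read off the likelihood ratio statistic. The following is a minimal sketch, not the authors' code: the function name `mixture_lrt_pvalue` is hypothetical, and it assumes only the limiting distribution named in the abstract (half point mass at 0, half chi-squared on 1 degree of freedom).

```python
import math


def mixture_lrt_pvalue(lr_stat: float) -> float:
    """Upper-tail p-value under the mixture 0.5 * (mass at 0) + 0.5 * chi2(1).

    Hypothetical helper for illustration; lr_stat is the observed
    likelihood ratio statistic.
    """
    if lr_stat < 0:
        raise ValueError("a likelihood ratio statistic cannot be negative")
    if lr_stat == 0:
        # Every outcome of the mixture is >= 0, so the tail probability is 1.
        return 1.0
    # Survival function of chi2(1): P(Z^2 > x) = erfc(sqrt(x / 2)).
    chi2_1_tail = math.erfc(math.sqrt(lr_stat / 2.0))
    # The point mass at 0 contributes nothing to the tail beyond x > 0.
    return 0.5 * chi2_1_tail
```

For a positive statistic, the p-value is half of what the ordinary chi-squared(1) reference distribution would give, so the 5% critical value is the 90% quantile of chi-squared(1) rather than the 95% quantile.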
3.
Jian-Jian Ren 《Annals of the Institute of Statistical Mathematics》2001,53(3):498-516
We propose a procedure to construct the empirical likelihood ratio confidence interval for the mean using a resampling method. This approach leads to the definition of a likelihood function for censored data, called the weighted empirical likelihood function. With the second order expansion of the log likelihood ratio, a weighted empirical likelihood ratio confidence interval for the mean is proposed and shown by simulation studies to have comparable coverage accuracy to alternative methods, including the nonparametric bootstrap-t. The procedures proposed here apply in a unified way to different types of censored data, such as right censored data, doubly censored data and interval censored data, and are computationally more efficient than the bootstrap-t method. An example of a set of doubly censored breast cancer data is presented with the application of our methods.
4.
Jonathan Fintzi, Xiang Cui, Jon Wakefield 《Journal of computational and graphical statistics》2017,26(4):918-929
Stochastic epidemic models describe the dynamics of an epidemic as a disease spreads through a population. Typically, only a fraction of cases are observed at a set of discrete times. The absence of complete information about the time evolution of an epidemic gives rise to a complicated latent variable problem in which the state space size of the epidemic grows large as the population size increases. This makes analytically integrating over the missing data infeasible for populations of even moderate size. We present a data augmentation Markov chain Monte Carlo (MCMC) framework for Bayesian estimation of stochastic epidemic model parameters, in which measurements are augmented with subject-level disease histories. In our MCMC algorithm, we propose each new subject-level path, conditional on the data, using a time-inhomogeneous continuous-time Markov process with rates determined by the infection histories of other individuals. The method is general, and may be applied to a broad class of epidemic models with only minimal modifications to the model dynamics and/or emission distribution. We present our algorithm in the context of multiple stochastic epidemic models in which the data are binomially sampled prevalence counts, and apply our method to data from an outbreak of influenza in a British boarding school. Supplementary material for this article is available online.
5.
《Journal of computational and graphical statistics》2013,22(3):658-674
We introduce a class of spatiotemporal models for Gaussian areal data. These models assume a latent random field process that evolves through time with random field convolutions; the convolving fields follow proper Gaussian Markov random field (PGMRF) processes. At each time, the latent random field process is linearly related to observations through an observational equation with errors that also follow a PGMRF. The use of PGMRF errors brings modeling and computational advantages. With respect to modeling, it allows more flexible model structures such as different but interacting temporal trends for each region, as well as distinct temporal gradients for each region. Computationally, building upon the fact that PGMRF errors have proper density functions, we have developed an efficient Bayesian estimation procedure based on Markov chain Monte Carlo with an embedded forward information filter backward sampler (FIFBS) algorithm. We show that, when compared with the traditional one-at-a-time Gibbs sampler, our novel FIFBS-based algorithm explores the posterior distribution much more efficiently. Finally, we have developed a simulation-based conditional Bayes factor suitable for the comparison of nonnested spatiotemporal models. An analysis of the number of homicides in Rio de Janeiro State illustrates the power of the proposed spatiotemporal framework. Supplemental materials for this article are available online on the journal’s webpage.
6.
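A key feature of insurance loss data is that they are peaked and heavy-tailed: there are many small losses alongside a few very large ones, so the usual loss distribution models fit such data poorly. This has prompted attempts to improve loss distribution models so that they have both high kurtosis and a heavy tail. The improved models in the recent literature are mainly composite models, which combine a model with a nonzero mode (such as the lognormal or Weibull distribution) with a heavy-tailed model (such as the Pareto or generalized Pareto distribution). This paper discusses the properties and characteristics of these composite models and compares them with the skew t-normal and skew t distributions. Model parameters are estimated by MCMC, and a fit to a real loss dataset shows that the skew t distribution fits peaked, heavy-tailed loss data better than the composite models proposed so far.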
7.
Yiwen Zhang, Hua Zhou, Jin Zhou, Wei Sun 《Journal of computational and graphical statistics》2017,26(1):1-13
Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of overdispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly because they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data. Supplementary materials for this article are available online.
8.
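Longitudinal data are common in practice, and building longitudinal data models for statistical analysis is useful for solving applied problems. This paper studies an important class of partially linear regression models for longitudinal data in which the observations are obtained at random. Estimators of the unknown parametric component and the unknown function are constructed according to the characteristics of longitudinal data, and their asymptotic properties are then studied. A real-data example confirms that the method is effective and practical.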
9.
Efficient Bayesian Inference for Multivariate Probit Models With Sparse Inverse Correlation Matrices
Aline Talhouk, Arnaud Doucet, Kevin Murphy 《Journal of computational and graphical statistics》2013,22(3):739-757
We propose a Bayesian approach for inference in the multivariate probit model, taking into account the association structure between binary observations. We model the association through the correlation matrix of the latent Gaussian variables. Conditional independence is imposed by setting some off-diagonal elements of the inverse correlation matrix to zero and this sparsity structure is modeled using a decomposable graphical model. We propose an efficient Markov chain Monte Carlo algorithm relying on a parameter expansion scheme to sample from the resulting posterior distribution. This algorithm updates the correlation matrix within a simple Gibbs sampling framework and allows us to infer the correlation structure from the data, generalizing methods used for inference in decomposable Gaussian graphical models to multivariate binary observations. We demonstrate the performance of this model and of the Markov chain Monte Carlo algorithm on simulated and real datasets. This article has online supplementary materials.
10.
Cyrus R. Mehta, Nitin Patel, Pralay Senchaudhuri 《Journal of computational and graphical statistics》2013,22(1):21-40
We present an efficient algorithm for generating exact permutational distributions for linear rank statistics defined on stratified 2 × c contingency tables. The algorithm can compute exact p values and confidence intervals for a rich class of nonparametric problems. These include exact p values for stratified two-population Wilcoxon, Logrank, and Van der Waerden tests, exact p values for stratified tests of trend across several binomial populations, exact p values for stratified permutation tests with arbitrary scores, and exact confidence intervals for odds ratios embedded in stratified 2 × c tables. The algorithm uses network-based recursions to generate stratum-specific distributions and then combines them into an overall permutation distribution by convolution. Where only the tail area of a permutation distribution is desired, additional efficiency gains are achieved by backward induction and branch-and-bound processing of the network. The algorithm is especially efficient for highly imbalanced categorical data, a situation where the asymptotic theory is unreliable. The backward induction component of the algorithm can also be used to evaluate the conditional maximum likelihood, and its higher order derivatives, for the logistic regression model with grouped data. We illustrate the techniques with an analysis of two data sets: the leukemia data on survivors of the Hiroshima atomic bomb and data from an animal toxicology experiment provided by the U.S. Food and Drug Administration.
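The convolution step described in the abstract above can be illustrated with a small sketch. This is not the authors' network algorithm: each stratum-specific distribution is obtained here by brute-force enumeration, which is only feasible for tiny strata, and all function names and the example scores are hypothetical. The statistic is assumed to be the sum of the scores of the units assigned to one of the two populations.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations


def stratum_distribution(scores, n_selected):
    """Exact permutation distribution of the score sum within one stratum.

    Brute-force enumeration stands in for the paper's network recursion.
    """
    counts = defaultdict(int)
    for subset in combinations(scores, n_selected):
        counts[sum(subset)] += 1
    total = sum(counts.values())
    return {value: Fraction(count, total) for value, count in counts.items()}


def convolve(dist_a, dist_b):
    """Distribution of the sum of two independent discrete statistics."""
    out = defaultdict(Fraction)
    for value_a, prob_a in dist_a.items():
        for value_b, prob_b in dist_b.items():
            out[value_a + value_b] += prob_a * prob_b
    return dict(out)


def overall_distribution(strata):
    """Combine stratum-specific distributions by repeated convolution.

    strata is a list of (scores, n_selected) pairs, one per stratum.
    """
    overall = {0: Fraction(1)}
    for scores, n_selected in strata:
        overall = convolve(overall, stratum_distribution(scores, n_selected))
    return overall


def upper_tail_pvalue(dist, observed):
    """Exact one-sided p-value P(T >= observed)."""
    return sum(prob for value, prob in dist.items() if value >= observed)
```

Exact rational arithmetic via `Fraction` keeps the probabilities free of rounding error, which matters when the tail area is small.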
11.
Bo-Cheng Wei, Jian-Qing Shi, Wing-Kam Fung, Yue-Qing Hu 《Annals of the Institute of Statistical Mathematics》1998,50(2):277-294
A diagnostic model and several new diagnostic statistics are proposed for testing for varying dispersion in exponential family nonlinear models. A score statistic and an adjusted score statistic based on Cox and Reid (1987, J. Roy. Statist. Soc. Ser. B, 55, 467-471) are derived in normal, inverse Gaussian, and gamma nonlinear models. An adjusted likelihood ratio statistic is also given for normal and inverse Gaussian nonlinear models. The results of simulation studies are presented, which show that the adjusted tests keep their sizes better and are more powerful than the ordinary tests.
12.
In this paper, we study the large-time behavior of periodic solutions for parabolic conservation laws. There is no smallness assumption on the initial data. We first obtain local existence of the solution by an iterative scheme, then derive exponential decay estimates for the solution by the energy method and the maximum principle, and at the same time obtain the global solution.
13.
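Based on longitudinal data, we study the nonparametric model y=f(t)+ε, where f(·) is an unknown smooth function and ε is a zero-mean random error. We approximate f(·) by an expansion in the truncated power basis and, combining this with the penalized spline method, construct a penalized modified quadratic inference function for the basis coefficients. A secant-method iteration then yields numerical estimates of the basis coefficients, and hence an estimate of the unknown smooth function. We prove that the resulting basis coefficient estimators are consistent and asymptotically normal. Finally, numerical results show a good fit.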
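The truncated power basis expansion used in the abstract above can be sketched as follows. This only illustrates the basis itself, under the standard definition 1, t, ..., t^p, (t-κ_1)_+^p, ..., (t-κ_K)_+^p; the knot locations, degree, and function names are hypothetical, and the penalized quadratic inference function and secant iteration of the paper are not reproduced.

```python
def truncated_power_basis(t, knots, degree=1):
    """Evaluate the truncated power basis of the given degree at the point t.

    Returns [1, t, ..., t^degree, (t - k_1)_+^degree, ..., (t - k_K)_+^degree].
    """
    if degree < 1:
        raise ValueError("degree must be at least 1 in this sketch")
    polynomial_part = [t ** j for j in range(degree + 1)]
    # (t - k)_+ is zero to the left of the knot k and t - k to the right.
    truncated_part = [max(t - k, 0.0) ** degree for k in knots]
    return polynomial_part + truncated_part


def evaluate_spline(t, coefficients, knots, degree=1):
    """Approximate f(t) as the inner product of coefficients and basis values."""
    basis = truncated_power_basis(t, knots, degree)
    if len(coefficients) != len(basis):
        raise ValueError("expected one coefficient per basis function")
    return sum(c * b for c, b in zip(coefficients, basis))
```

With K knots and degree p there are p + 1 + K coefficients; a penalized spline typically penalizes only the K coefficients of the truncated terms.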
14.
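Local influence diagnostics for parameters in random effects models for longitudinal data (cited by: 3; self-citations: 0; citations by others: 3)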
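Because longitudinal data involve both individuals and the different states of each individual, this paper proposes two perturbation schemes for random effects models for longitudinal data that facilitate sensible analysis of the data, and gives formulas for the local influence of the perturbations on the parameter estimates together with methods for finding influential points. An analysis of interlaboratory data on the nicotine content extracted from Cambridge filters shows that our results not only include all the findings on influential points that many earlier researchers obtained for this dataset by different methods, but also yield some new ones.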
15.
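The generalized partially linear model extends both the generalized linear model and the partially linear model, and is a widely used semiparametric model. This paper considers Bayesian estimation of the parameters and Bayes-factor-based model selection for this model when both the linear covariates and the response variable have data that are missing not at random. Penalized splines are used to estimate the nonparametric component, and a Bayesian hierarchical model is built. To address the poor mixing caused by highly correlated parameters in Gibbs sampling and the instability caused by increasing dimension, latent variables are introduced as augmented data and a collapsed Gibbs sampler is applied, which improves convergence. In addition, to avoid computing multiple integrals, the M-H algorithm is used to estimate the marginal density function, from which the Bayes factor is computed, providing a criterion for model comparison and selection. Finally, simulations and a real example verify the effectiveness of the proposed method.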
16.
Zdravko I. Botev, Dirk P. Kroese 《Methodology and Computing in Applied Probability》2008,10(3):435-451
We propose a new method for density estimation of categorical data. The method implements a non-asymptotic data-driven bandwidth selection rule and provides model sparsity not present in the standard kernel density estimation method. Numerical experiments with a well-known ten-dimensional binary medical data set illustrate the effectiveness of the proposed approach for density estimation, discriminant analysis and classification. Supported by the Australian Research Council, under grant number DP0558957.
17.
18.
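With both the covariates and the response variable subject to missingness, empirical likelihood confidence intervals are constructed for the mean of the response variable in a linear model. Simulations show that the adjusted empirical likelihood confidence interval has good coverage and accuracy, which further advances the study of linear models with missing data.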
19.
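Three possible factors characterize the covariance structure of longitudinal data: serial correlation (in particular, first-order autocorrelation), random effects, and ordinary random error (Diggle et al., 2002). This paper studies individual and joint tests for autocorrelation and for the existence of random effects in nonlinear longitudinal data models, derives score statistics for the tests, and illustrates the application of the tests with plasma drug permeation data (Davidian & Giltinan, 1995).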