Similar literature
Found 20 similar documents (search time: 140 ms)
1.
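This paper studies parameter estimation and statistical diagnostics for fixed-effects quantile regression models for longitudinal data. An MM iterative algorithm for parameter estimation is given first; the equivalence of the case-deletion model (CDM) and the mean-shift outlier model (MSOM) in statistical diagnostics is then discussed; finally, data on anti-inflammatory analgesic drugs are used to illustrate the method.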
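As a rough illustration of the MM idea (not the paper's algorithm for the longitudinal fixed-effects model), the sketch below runs a majorize-minimize iteration for a single sample quantile, that is, an intercept-only quantile regression. Each step replaces the check loss by a quadratic surrogate whose weights $1/(\varepsilon + |r_i|)$ are built from the current residuals. The function name `mm_quantile` and the smoothing constant `eps` are assumptions made for this example.

```python
def mm_quantile(y, tau=0.5, eps=1e-8, n_iter=200):
    """MM iteration for the tau-th sample quantile (intercept-only model).

    The check loss rho_tau(r) = |r|/2 + (tau - 1/2) * r is majorized at the
    current residuals by a quadratic whose minimizer has a closed form.
    """
    n = len(y)
    theta = sum(y) / n  # start from the sample mean
    for _ in range(n_iter):
        # Weights of the quadratic surrogate, from the current residuals.
        a = [1.0 / (eps + abs(yi - theta)) for yi in y]
        # Closed-form minimizer of the surrogate.
        theta = (sum(ai * yi for ai, yi in zip(a, y)) + n * (2 * tau - 1)) / sum(a)
    return theta


# Hypothetical data with one gross outlier; the median is 3.
median_estimate = mm_quantile([1.0, 2.0, 3.0, 4.0, 100.0], tau=0.5)
```

For `tau=0.5` the update reduces to iteratively reweighted least squares for the median, which is why the outlier at 100 has almost no effect on the result.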

2.
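Statistical diagnostics examines the entire process by which a statistical inference method solves a problem, and influence analysis is one of its most important branches. For semiparametric generalized linear models, this paper proves an equivalence theorem for the case-deletion model and the mean-shift model, gives diagnostic statistics such as the generalized Cook distance, and studies the score test statistic for outliers. A worked example verifies the effectiveness of the proposed diagnostic methods.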
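The equivalence of case deletion and mean shift is easiest to see in ordinary least squares, which is used here only as a simplified stand-in for the semiparametric models of the paper. The sketch fits a straight line twice, once with case `i` deleted and once with all cases plus an indicator column for case `i`, and the two fits should give the same regression coefficients. The helper names `solve` and `ols` and the data are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(X[0])
    XtX = [[sum(row[a] * row[c] for row in X) for c in range(p)] for a in range(p)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    return solve(XtX, Xty)


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 9.0, 6.1]  # the case at index 4 is an outlier
X = [[1.0, xi] for xi in x]
i = 4

# Case-deletion model: refit without case i.
beta_cdm = ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])

# Mean-shift model: keep every case and add an indicator column for case i.
X_msom = [row + [1.0 if k == i else 0.0] for k, row in enumerate(X)]
coef = ols(X_msom, y)
beta_msom, gamma = coef[:2], coef[2]
```

The shift parameter `gamma` absorbs the whole residual of case `i`, so the remaining coefficients are determined by the other cases alone.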

3.
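Based on a modified Cholesky decomposition, this paper studies Bayesian estimation and Bayesian statistical diagnostics for semiparametric joint mean-covariance models with longitudinal data, where the nonparametric part is approximated by B-splines. A hybrid algorithm combining Gibbs sampling with the Metropolis-Hastings algorithm is used to obtain Bayesian estimates of the unknown parameters and Bayesian case-deletion influence diagnostics, and the magnitude of the diagnostic statistics is used to identify outliers. Simulation studies and a real-data analysis both show that the proposed Bayesian estimation and diagnostic methods are feasible and effective.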

4.
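Statistical diagnostics seeks out the data points that strongly influence statistical inference (such as estimation or prediction), and thereby diagnoses the data across the whole inferential process. Using the case-deletion model, this paper derives diagnostic formulas for the parameter estimates of the two-dimensional AR(1) model, gives the formula for the Cook statistic, and then extends the results to the $m$-dimensional AR($p$) model.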
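For orientation, in the classical linear regression setting (not the AR models treated in this paper) the case-deletion Cook statistic for observation $i$ takes the well-known form below. The symbols are introduced here for illustration only: $\hat\beta_{(i)}$ is the estimate with case $i$ deleted, $p$ the number of regression coefficients, $s^2$ the residual variance estimate, $e_i$ the ordinary residual, and $h_{ii}$ the leverage of case $i$.

$$
D_i \;=\; \frac{(\hat\beta - \hat\beta_{(i)})^{\top} X^{\top} X \,(\hat\beta - \hat\beta_{(i)})}{p\,s^2}
\;=\; \frac{e_i^2}{p\,s^2}\cdot\frac{h_{ii}}{(1-h_{ii})^2}.
$$

The second expression shows that a case is influential when it combines a large residual with high leverage, and that $D_i$ can be computed without refitting the model.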

5.
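To fit skewed data better and extract their information more fully, a mode regression model is built for skew-normal data, and statistical diagnostics for this model are studied using the Pena distance statistic. This yields the Pena distance expression for the mode regression model and a method for diagnosing high-leverage outliers. Maximum likelihood estimates of the model parameters are obtained with the EM algorithm and gradient descent; the likelihood distance, Cook distance, and Pena distance statistics are computed from the case-deletion model, and diagnostic plots are drawn. Monte Carlo simulation and a real-data comparison show that the proposed method is effective and that, under certain conditions, the Pena distance diagnoses outliers or influential points better than the likelihood distance and the Cook distance.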

6.
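Influence analysis of semiparametric generalized linear random-effects models   Total citations: 1 (self-citations: 0, citations by others: 1)
This paper systematically studies statistical diagnostics and influence analysis for semiparametric generalized linear random-effects models. It proves an equivalence theorem for the case-deletion model and the mean-shift model, gives diagnostic statistics such as the generalized Cook distance together with the score test statistic for outliers, and studies local influence analysis for the model. Formulas for the influence matrix are obtained for the case-weight perturbation model and the response perturbation model. Finally, a worked example verifies the effectiveness of the proposed diagnostic methods.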

7.
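The empirical likelihood method has been widely used for statistical inference in many models. This paper carries out statistical diagnostics for the logistic regression model based on empirical likelihood. First, the estimating equations of the model are given and the maximum empirical likelihood estimates of the model parameters are obtained; second, three different influence curvatures are studied on the basis of empirical likelihood; finally, a worked example illustrates the effectiveness of the diagnostic methods.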

8.
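This paper studies the joint location and scale model for skew-normal data. It considers parameter estimation and statistical diagnostics based on the case-deletion model and compares the corresponding statistics of the deleted and undeleted models. Diagnostic statistics and local influence analysis for the joint location and scale model are proposed for the first time. Through simulation studies and a real-data analysis, different diagnostic statistics are used to identify outliers or influential points, and the results show that the proposed theory and methods are useful and effective.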

9.
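This paper studies outlier diagnostics after leverage-based sampling from large data sets. First, the statistical properties of the parameter estimates in the case-deletion model are discussed and four outlier diagnostic statistics are constructed; second, three test statistics are constructed from the hypothesis test on the shift parameter of the mean-shift model; finally, simulations and an empirical data analysis lead to the paper's conclusion: outlier diagnostics have an important effect on leverage-based sampling estimation for large data sets.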

10.
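Using some ideas from statistical diagnostics, this paper analyzes structural change in linear models from the viewpoint of Bayesian predictive theory. Consider the two-phase regression model (1.1), in which the $y_i$ are observations, the $x_i$ are known $p \times 1$ vectors of regressors, the $\theta_j$ ($j = 1, 2$) are unknown $p \times 1$ parameter vectors, a further quantity is an unknown parameter, and the $\varepsilon_i$ are mutually independent. The unknown parameter $m$ is called the change point and is our main interest. In practice, before carrying out statistical inference in model (1.1), we do not know which parameters change. Combining ideas from statistical diagnostics with the Bayesian viewpoint, this paper uses two methods, the conditional predictive discordancy diagnostic (hereafter CPDD) and the Kullback-Leibler divergence, to study structural change in linear models. These methods are not restricted by any conditions and can identify which parameters change; which…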

11.
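Based on the EM algorithm and the Laplace approximation, this paper presents an influence analysis method for ZI (zero-inflated) longitudinal count data models. To identify influential points in grouped count data with many zeros, the zero-valued observations in the ZI longitudinal model are assigned weights and the random effects are treated as missing data. The EM algorithm is then introduced, so that influence analysis can be carried out with the conditional expectation of the complete-data log-likelihood and the corresponding $Q$-distance function; the Laplace approximation is further used to simplify the integrals in the EM algorithm. On this basis, diagnostic statistics for ZI longitudinal count data models are derived from the case-deletion model and the local influence approach. A real count-data example verifies the effectiveness of the diagnostic statistics.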

12.
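This paper proposes an unsupervised classification algorithm based on the Gaussian mixture model. Because using the EM algorithm to estimate the parameters of a Gaussian mixture model easily gets trapped in a local optimum, we introduce the inverse Wishart distribution in place of the traditional Jeffreys prior. Results on several experimental data sets show that, when this method is used to estimate the number of components in unsupervised classification, both the accuracy of the estimate and the computing speed improve considerably.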
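For readers unfamiliar with the baseline, the following is a minimal sketch of plain EM for a one-dimensional Gaussian mixture. It deliberately omits the inverse Wishart prior and the component-number estimation that the paper proposes; the function names, starting values, and data are assumptions made for the example.

```python
import math


def normal_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, mu, var, pi, n_iter=100):
    """Plain EM for a K-component 1-D Gaussian mixture (no prior)."""
    K = len(mu)
    mu, var, pi = list(mu), list(var), list(pi)
    for _ in range(n_iter):
        # E step: posterior probability of each component for each point.
        resp = []
        for x in data:
            w = [pi[k] * normal_pdf(x, mu[k], var[k]) for k in range(K)]
            s = sum(w)
            resp.append([wk / s for wk in w])
        # M step: weighted means, variances, and mixing proportions.
        for k in range(K):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            sq = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data))
            var[k] = max(sq / nk, 1e-6)  # floor guards against collapse
            pi[k] = nk / len(data)
    return mu, var, pi


# Hypothetical data: two well-separated clusters.
data = [-2.3, -1.9, -2.1, -1.7, -2.0, 2.8, 3.1, 3.3, 2.9, 3.0, 2.7]
mu_hat, var_hat, pi_hat = em_gmm_1d(data, mu=[-1.0, 1.0], var=[1.0, 1.0], pi=[0.5, 0.5])
```

With clusters this well separated the starting values barely matter; the local-optimum problem the abstract refers to arises when components overlap or the initial means are poorly placed.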

13.
The family of expectation-maximization (EM) algorithms provides a general approach to fitting flexible models for large and complex data. The expectation (E) step of EM-type algorithms is time-consuming in massive data applications because it requires multiple passes through the full data. We address this problem by proposing an asynchronous and distributed generalization of the EM called the distributed EM (DEM). Using DEM, existing EM-type algorithms are easily extended to massive data settings by exploiting the divide-and-conquer technique and widely available computing power, such as grid computing. The DEM algorithm reserves two groups of computing processes called workers and managers for performing the E step and the maximization step (M step), respectively. The samples are randomly partitioned into a large number of disjoint subsets and are stored on the worker processes. The E step of the DEM algorithm is performed in parallel on all the workers, and every worker communicates its results to the managers at the end of its local E step. The managers perform the M step after they have received results from a γ-fraction of the workers, where γ is a fixed constant in (0, 1]. The sequence of parameter estimates generated by the DEM algorithm retains the attractive properties of EM: convergence of the sequence of parameter estimates to a local mode and linear global rate of convergence. Across diverse simulations focused on linear mixed-effects models, the DEM algorithm is significantly faster than competing EM-type algorithms while having a similar accuracy. The DEM algorithm maintains its superior empirical performance on a movie ratings database consisting of 10 million ratings. Supplementary material for this article is available online.
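The worker and manager roles can be mimicked in a single process, as in the hedged sketch below. It is not the authors' implementation: it uses a one-dimensional Gaussian mixture instead of linear mixed-effects models, runs sequentially rather than in parallel, and assumes that the manager reuses the most recent (possibly stale) statistics of workers that have not yet reported. All function names and the data are hypothetical.

```python
import math
import random


def local_e_step(chunk, mu, var, pi):
    """Worker: per-component sufficient statistics (sum r, sum r*x, sum r*x^2)."""
    K = len(mu)
    stats = [[0.0, 0.0, 0.0] for _ in range(K)]
    for x in chunk:
        # The constant 1/sqrt(2*pi) cancels in the responsibilities.
        w = [pi[k] * math.exp(-(x - mu[k]) ** 2 / (2 * var[k])) / math.sqrt(var[k])
             for k in range(K)]
        s = sum(w)
        for k in range(K):
            r = w[k] / s
            stats[k][0] += r
            stats[k][1] += r * x
            stats[k][2] += r * x * x
    return stats


def m_step(all_stats):
    """Manager: pool the workers' statistics and update the parameters."""
    K = len(all_stats[0])
    pooled = [[sum(s[k][j] for s in all_stats) for j in range(3)] for k in range(K)]
    n = sum(p[0] for p in pooled)
    mu = [p[1] / p[0] for p in pooled]
    var = [max(p[2] / p[0] - m * m, 1e-6) for p, m in zip(pooled, mu)]
    pi = [p[0] / n for p in pooled]
    return mu, var, pi


def dem(chunks, mu, var, pi, gamma=1.0, n_iter=50, seed=0):
    """Simulated distributed EM: only a gamma-fraction of workers refresh per round."""
    rng = random.Random(seed)
    cache = [local_e_step(c, mu, var, pi) for c in chunks]  # initial full pass
    n_fresh = max(1, math.ceil(gamma * len(chunks)))
    for _ in range(n_iter):
        mu, var, pi = m_step(cache)  # uses fresh and stale statistics alike
        for w in rng.sample(range(len(chunks)), n_fresh):
            cache[w] = local_e_step(chunks[w], mu, var, pi)
    return mu, var, pi


data = [-2.3, -1.9, -2.1, -1.7, -2.0, 2.8, 3.1, 3.3, 2.9, 3.0, 2.7]
chunks = [data[0:4], data[4:8], data[8:]]  # three simulated workers
mu_hat, var_hat, pi_hat = dem(chunks, mu=[-1.0, 1.0], var=[1.0, 1.0],
                              pi=[0.5, 0.5], gamma=0.5)
```

With `gamma=1.0` every worker refreshes in every round and the loop reduces to ordinary EM written in terms of sufficient statistics; smaller values trade fresher statistics for less waiting.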

14.
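Local influence analysis for multivariate $t$-distributed data   Total citations: 4 (self-citations: 0, citations by others: 4)
For multivariate $t$-distributed data, carrying out influence analysis directly with the probability density is difficult. By introducing Gamma-distributed weights, this paper represents the distribution as a mixture of particular multivariate normal distributions. The weights are then treated as missing data and the EM algorithm is introduced, so that local influence analysis can be based on the conditional expectation of the complete-data likelihood function. The paper further studies local influence analysis under the case-weight perturbation model systematically and obtains the corresponding diagnostic statistics; two worked examples illustrate the effectiveness of the method.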
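The mixture representation referred to above is the standard scale-mixture form of the multivariate $t$ distribution. It is written here in generic notation that is assumed for illustration ($p$ the dimension, $\nu$ the degrees of freedom, $w$ the latent weight, and the Gamma distribution parameterized by shape and rate):

$$
w \sim \mathrm{Gamma}\!\left(\tfrac{\nu}{2}, \tfrac{\nu}{2}\right), \qquad
y \mid w \sim N_p\!\left(\mu, \tfrac{1}{w}\Sigma\right)
\quad\Longrightarrow\quad
y \sim t_p(\mu, \Sigma, \nu).
$$

Treating $w$ as missing data gives a simple E step, because the conditional mean of the weight depends on $y$ only through its Mahalanobis distance:

$$
\mathbb{E}[\,w \mid y\,] \;=\; \frac{\nu + p}{\nu + \delta^2}, \qquad
\delta^2 = (y - \mu)^{\top} \Sigma^{-1} (y - \mu).
$$

Observations far from $\mu$ receive small weights, which is the source of the robustness of $t$-based models.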

15.
The multivariate probit model is very useful for analyzing correlated multivariate dichotomous data. Recently, this model has been generalized with a confirmatory factor analysis structure for accommodating more general covariance structure, and it is called the MPCFA model. The main purpose of this paper is to consider local influence analysis, which is a well-recognized important step of data analysis beyond the maximum likelihood estimation, of the MPCFA model. As the observed-data likelihood associated with the MPCFA model is intractable, Cook's well-known approach cannot be applied to obtain local influence measures. Hence, the local influence measures are developed via Zhu and Lee's [Local influence for incomplete data model, J. Roy. Statist. Soc. Ser. B 63 (2001) 111-126.] approach that is closely related to the EM algorithm. The diagnostic measures are derived from the conformal normal curvature of an appropriate function. The building blocks are computed via a sufficiently large random sample of the latent response strengths and latent variables that are generated by the Gibbs sampler. Some useful perturbation schemes are discussed. Results that are obtained from analyses of an artificial example and a real example are presented to illustrate the newly developed methodology.

16.
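Wang Jixia (王继霞), Wang Chunfeng (汪春峰), Miao Yu (苗雨). 《数学杂志》, 2016, 36(4): 667-675
This paper studies local maximum likelihood estimation for a class of finite mixture Laplace regression models. Using the kernel regression method and an EM algorithm that maximizes the locally weighted likelihood function, local maximum likelihood estimators of the parameter functions are obtained, and their asymptotic bias, asymptotic variance, and asymptotic normality are discussed. The results extend those on local nonparametric estimation for finite mixture regression models.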

17.
Changepoint models are widely used to model the heterogeneity of sequential data. We present a novel sequential Monte Carlo (SMC) online expectation–maximization (EM) algorithm for estimating the static parameters of such models. The SMC online EM algorithm has a cost per time which is linear in the number of particles and could be particularly important when the data is representable as a long sequence of observations, since it drastically reduces the computational requirements for implementation. We present an asymptotic analysis for the stability of the SMC estimates used in the online EM algorithm and demonstrate the performance of this scheme by using both simulated and real data originating from DNA analysis. The supplementary materials for the article are available online.

18.
In this paper, we carry out robust modeling and influence diagnostics in Birnbaum‐Saunders (BS) regression models. Specifically, we present some aspects related to BS and log‐BS distributions and their generalizations from the Student‐t distribution, and develop BS‐t regression models, including maximum likelihood estimation based on the EM algorithm and diagnostic tools. In addition, we apply the obtained results to real data from insurance, which shows the uses of the proposed model. Copyright © 2011 John Wiley & Sons, Ltd.

19.
Latent trait models such as item response theory (IRT) hypothesize a functional relationship between an unobservable, or latent, variable and an observable outcome variable. In educational measurement, a discrete item response is usually the observable outcome variable, and the latent variable is associated with an examinee’s trait level (e.g., skill, proficiency). The link between the two variables is called an item response function. This function, defined by a set of item parameters, models the probability of observing a given item response, conditional on a specific trait level. Typically in a measurement setting, neither the item parameters nor the trait levels are known, and so must be estimated from the pattern of observed item responses. Although a maximum likelihood approach can be taken in estimating these parameters, it usually cannot be employed directly. Instead, a method of marginal maximum likelihood (MML) is utilized, via the expectation-maximization (EM) algorithm. Alternating between an expectation (E) step and a maximization (M) step, the EM algorithm assures that the marginal log likelihood function will not decrease after each EM cycle, and will converge to a local maximum. Interestingly, the negative of this marginal log likelihood function is equal to the relative entropy, or Kullback-Leibler divergence, between the conditional distribution of the latent variables given the observable variables and the joint likelihood of the latent and observable variables. With an unconstrained optimization for the M-step proposed here, the EM algorithm as minimization of Kullback-Leibler divergence admits the convergence results due to Csiszár and Tusnády (Statistics & Decisions, 1:205–237, 1984), a consequence of the binomial likelihood common to latent trait models with dichotomous response variables. For this unconstrained optimization, the EM algorithm converges to a global maximum of the marginal log likelihood function, yielding an information bound that permits a fixed point of reference against which models may be tested. A likelihood ratio test between marginal log likelihood functions obtained through constrained and unconstrained M-steps is provided as a means for testing models against this bound. Empirical examples demonstrate the approach.
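The link between EM and Kullback-Leibler divergence rests on a standard decomposition of the marginal log likelihood. It is sketched here in generic notation that is assumed for illustration and is not taken from the article: $x$ denotes the observed responses, $z$ the latent variables, $\theta$ the item parameters, and $q$ any distribution over $z$.

$$
\log p(x;\theta) \;=\;
\underbrace{\mathbb{E}_{q}\!\left[\log \frac{p(x,z;\theta)}{q(z)}\right]}_{F(q,\theta)}
\;+\;
\mathrm{KL}\!\left(q(z)\,\big\|\,p(z \mid x;\theta)\right).
$$

The E step sets $q(z) = p(z \mid x;\theta^{(t)})$, which makes the divergence term zero, and the M step maximizes $F(q,\theta)$ over $\theta$. Because the divergence is nonnegative, this gives the monotonicity property mentioned in the abstract:

$$
\log p(x;\theta^{(t+1)}) \;\ge\; F(q,\theta^{(t+1)}) \;\ge\; F(q,\theta^{(t)}) \;=\; \log p(x;\theta^{(t)}).
$$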

20.
We describe an extension of the hidden Markov model in which the manifest process conditionally follows a partition model. The assumption of local independence for the manifest random variable is thus relaxed to arbitrary dependence. The proposed class generalizes different existing models for discrete and continuous time series, and allows for the finest trading off between bias and variance. The models are fit through an EM algorithm, with the usual recursions for hidden Markov models extended at no additional computational cost.
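As background for the "usual recursions", the sketch below shows the forward recursion for an ordinary discrete hidden Markov model with locally independent emissions, which is the baseline that the article extends rather than its partition-model variant. The parameter values and function names are hypothetical; a brute-force sum over all state paths is included only to check the recursion.

```python
from itertools import product


def forward_likelihood(obs, init, trans, emit):
    """P(obs) for a discrete HMM via the forward recursion."""
    K = len(init)
    # alpha[k] = P(observations so far, current state = k)
    alpha = [init[k] * emit[k][obs[0]] for k in range(K)]
    for o in obs[1:]:
        alpha = [sum(alpha[j] * trans[j][k] for j in range(K)) * emit[k][o]
                 for k in range(K)]
    return sum(alpha)


def brute_force_likelihood(obs, init, trans, emit):
    """Same quantity, summing the joint probability over every state path."""
    K = len(init)
    total = 0.0
    for path in product(range(K), repeat=len(obs)):
        p = init[path[0]] * emit[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= trans[path[t - 1]][path[t]] * emit[path[t]][obs[t]]
        total += p
    return total


init = [0.6, 0.4]                            # initial state probabilities
trans = [[0.7, 0.3], [0.4, 0.6]]             # trans[j][k] = P(next = k | current = j)
emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]    # emit[k][o] = P(symbol o | state k)
obs = [0, 1, 2, 1]

lik_forward = forward_likelihood(obs, init, trans, emit)
lik_brute = brute_force_likelihood(obs, init, trans, emit)
```

The recursion costs a number of operations proportional to the sequence length times the square of the number of states, whereas the brute-force sum grows exponentially with the sequence length.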
