Similar literature
1.
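Missing data is the most common of the many factors that affect data quality. If missing data are handled improperly, the reliability of the analysis is directly affected and the analysis fails to achieve its purpose. For skew-normal data missing at random, this paper studies parameter estimation for the skew-normal mode mixture-of-experts model. Combining modal regression imputation with clustering, it proposes a stratified modal regression imputation method. It then compares the missing-data handling performance of three machine learning imputation methods (support vector machine imputation, random forest imputation and neural network imputation) and three statistical imputation methods (stratified mean imputation, modal regression imputation and stratified modal regression imputation). Monte Carlo simulation and a real-data analysis demonstrate the superiority of stratified modal regression imputation.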

2.
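Missing data are ubiquitous in practice; they reduce research efficiency and lead to biased parameter estimates. Under the assumption that covariates are missing at random (MAR), this paper estimates the parameters of a linear model using modal regression and inverse probability weighting. The method combines two propensity score estimators, parametric logistic regression and nonparametric Nadaraya-Watson estimation, to construct the IPWM-L and IPWM-NW estimators respectively. Simulation studies and a real-data analysis show that the modal regression model is more robust than the mean regression model, that the inverse probability weighted modal (IPWM) estimation method fits better under missing data, and that the IPWM-NW estimator is more robust than the IPWM-L estimator.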
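
The nonparametric propensity step mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's IPWM-NW estimator: the function names, the Gaussian kernel, the bandwidth `h` and the always-observed variable `z` are all assumptions made for illustration. It only shows how a Nadaraya-Watson estimate of the observation probability turns into inverse probability weights for the complete cases.

```python
import math


def nw_propensity(z0, zs, deltas, h):
    """Nadaraya-Watson estimate of P(delta = 1 | z = z0), Gaussian kernel (assumed)."""
    ws = [math.exp(-0.5 * ((z0 - z) / h) ** 2) for z in zs]
    return sum(w * d for w, d in zip(ws, deltas)) / sum(ws)


def ipw_weights(zs, deltas, h):
    """Weights delta_i / pi_hat(z_i); incomplete cases (delta_i = 0) get weight 0."""
    return [
        d / nw_propensity(z, zs, deltas, h) if d else 0.0
        for z, d in zip(zs, deltas)
    ]


# zs: a fully observed variable; deltas: 1 if the case is complete, else 0.
weights = ipw_weights([0.0, 0.0, 0.0, 0.0], [1, 1, 0, 0], h=1.0)
```

In this toy input half of the cases at `z = 0` are complete, so each complete case stands in for two cases. A weighted regression fit would then use these weights on the complete cases.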

3.
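This paper studies parameter estimation for the linear regression model with missing skew-t-normal data. To bring the sample distribution closer to the true distribution, improve the estimates of the regression coefficients, scale parameter, skewness parameter and degrees-of-freedom parameter, and make parameter estimation more stable, it proposes a modified stochastic regression imputation method suited to the linear regression model with missing skew-t-normal data. Simulation and a real-data study comparing it with stochastic regression imputation and multiple stochastic regression imputation show that the proposed method is effective and feasible.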

4.
《数理统计与管理》2015,(4):621-627
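Based on the normal distribution, a joint mean and variance model with missing data is proposed. With the response variable missing at random, parameter estimation for this model is studied under three imputation methods: mean imputation, regression imputation and stochastic regression imputation. A comparison of simulation and real-data results shows that stochastic regression imputation is the most useful and effective of the three.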
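
The three imputation methods named in this abstract can be sketched as follows. This is a simplified illustration with a single covariate and an ordinary least squares line, not the joint mean and variance model of the paper; the function names `fit_line` and `impute` and the use of `None` to mark a missing response are assumptions.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least squares fit of y = a + b * x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, (rss / (len(xs) - 2)) ** 0.5


def impute(xs, ys, method, rng=None):
    """Fill None entries of ys by 'mean', 'regression' or 'stochastic' imputation."""
    rng = rng or random.Random(0)
    obs_x = [x for x, y in zip(xs, ys) if y is not None]
    obs_y = [y for y in ys if y is not None]
    if method == "mean":
        fill = statistics.fmean(obs_y)
        return [fill if y is None else y for y in ys]
    if method not in ("regression", "stochastic"):
        raise ValueError(f"unknown method: {method}")
    a, b, sd = fit_line(obs_x, obs_y)
    out = []
    for x, y in zip(xs, ys):
        if y is not None:
            out.append(y)
        elif method == "regression":
            out.append(a + b * x)  # fitted value only
        else:
            out.append(a + b * x + rng.gauss(0.0, sd))  # fitted value plus noise
    return out


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, None]
filled_by_mean = impute(xs, ys, "mean")
filled_by_regression = impute(xs, ys, "regression")
```

Mean imputation ignores the covariate, regression imputation places every imputed value exactly on the fitted line, and the stochastic variant adds a residual draw so that the imputed values keep some of the spread of the observed ones.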

5.
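For the varying-coefficient partially nonlinear model with the response variable missing at random, a robust estimation method based on modal regression is proposed. Using inverse probability weighting and the QR orthogonal decomposition technique, modal regression estimators of the unknown parameters and of the varying-coefficient functions are obtained. Under certain conditions, the asymptotic properties of the estimators are proved. Numerical simulation and real-data analysis illustrate the effectiveness of the proposed method.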

6.
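Most existing research on regression models is limited to directly observed explanatory variables; ignoring measurement error in the data increases the estimation bias of the model parameters. Current research on measurement error models mainly assumes normally distributed regression errors, an assumption that is unsuitable for asymmetric data. For skewed data, the mode is more representative than the mean and the median. Based on data with measurement error, this paper introduces the skew-normal modal regression model and estimates its parameters with the EM algorithm. Simulation results show that, when covariates are subject to measurement error, modal regression performs better than mean regression. A real-data analysis further demonstrates the effectiveness of the proposed model and method.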

7.
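To better fit skewed data and fully extract the information it contains, a modal regression model is established for skew-normal data, and statistical diagnostics for the model are studied based on the Pena distance statistic, yielding the Pena distance expression for the modal regression model and a method for diagnosing high-leverage outliers. Maximum likelihood estimates of the model parameters are obtained using the EM algorithm and gradient descent; the likelihood distance, Cook's distance and Pena distance statistics are computed from the case-deletion model, and diagnostic plots are drawn. Monte Carlo simulation and a real-data comparison show that the proposed method is effective and that, under certain conditions, the Pena distance diagnoses outliers or influential points better than the likelihood distance and Cook's distance.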

8.
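For the asymmetric data that arise in finance, economics, social science, environmental science, engineering, biomedicine and other fields, this paper proposes a modal regression model for skew-normal data and estimates the unknown parameters with the EM algorithm based on Newton-Raphson iteration. Monte Carlo simulation and an analysis of BMI data confirm the effectiveness of the proposed method: for skew-normal data, the modal regression model estimates better than the mean regression model.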

9.
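Missing-value imputation based on a spatial autoregressive model   Total citations: 2, self-citations: 0, citations by others: 2
This paper studies the imputation of missing values in cross-sectional regional data. It discusses how to build a spatial autoregressive model when the data are spatially correlated and how to use that model to impute missing values, and it gives imputation results from a model fitted to real data.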

10.
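Imputation adjustment for missing data   Total citations: 16, self-citations: 2, citations by others: 14
Imputation is another class of methods that adjust for missing data in order to reduce estimation bias. The imputation methods introduced in this paper are deductive estimation, mean imputation, random imputation, regression imputation and multiple imputation.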

11.
Joint location and scale models of the skew-normal distribution provide a useful extension of joint mean and variance models of the normal distribution when the data set under consideration involves asymmetric outcomes. This paper focuses on the maximum likelihood estimation of joint location and scale models of the skew-normal distribution. The proposed procedure can simultaneously estimate parameters in the location model and the scale model. Simulation studies and a real example are used to illustrate the proposed methodologies.

12.
In this paper, we carry out an in-depth theoretical investigation for inference with missing response and covariate data for general regression models. We assume that the missing data are missing at random (MAR) or missing completely at random (MCAR) throughout. Previous theoretical investigations in the literature have focused only on missing covariates or missing responses, but not both. Here, we consider theoretical properties of the estimates under three different estimation settings: complete case (CC) analysis, a complete response (CR) analysis that involves an analysis of those subjects with only completely observed responses, and the all case (AC) analysis, which is an analysis based on all of the cases. Under each scenario, we derive general expressions for the likelihood and devise estimation schemes based on the EM algorithm. We carry out a theoretical investigation of the three estimation methods in the normal linear model and analytically characterize the loss of information for each method, as well as derive and compare the asymptotic variances for each method assuming the missing data are MAR or MCAR. In addition, a theoretical investigation of bias for the CC method is also carried out. A simulation study and real dataset are given to illustrate the methodology.

13.
Modal regression based on a nonparametric quantile estimator is presented. Unlike traditional mean and median regression, modal regression uses the mode rather than the mean or median to represent the center of a conditional distribution, which makes the model more robust to outliers and to asymmetric or heavy-tailed distributions. Most solutions for modal regression are based on kernel estimation of the density. This paper studies a new solution for modal regression by means of a nonparametric quantile estimator. The method builds on the fact that the distribution function is the inverse of the quantile function, so the flexibility of the nonparametric quantile estimator can be used to improve the estimation of the modal function. The simulations and application show that the new model outperforms the modal regression model via linear quantile function estimation.
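
The kernel-density approach that this abstract contrasts itself with can be illustrated in its simplest, unconditional form. The sketch below is not the quantile-based method of the paper: it finds a mode of a one-dimensional sample by a mean-shift iteration with a Gaussian kernel, and the function name `kernel_mode`, the bandwidth `h` and the median starting point are assumptions chosen for illustration.

```python
import math
import statistics


def kernel_mode(ys, h, tol=1e-10, max_iter=500):
    """Local mode of a Gaussian kernel density estimate, found by mean shift.

    Starts from the sample median (an assumed choice) and repeatedly moves to
    the kernel-weighted average of the data until the update is below tol.
    A very small h relative to the data spacing may underflow the weights.
    """
    m = statistics.median(ys)
    for _ in range(max_iter):
        ws = [math.exp(-0.5 * ((y - m) / h) ** 2) for y in ys]
        new_m = sum(w * y for w, y in zip(ws, ys)) / sum(ws)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m


sample = [1.0, 1.0, 1.0, 1.0, 10.0]
mode_estimate = kernel_mode(sample, h=1.0)
mean_estimate = statistics.fmean(sample)
```

On this toy sample the single outlier pulls the mean well above the bulk of the data, while the mode estimate stays at the cluster, which is the robustness property the abstract refers to.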

14.
Interpolation is an important issue in a variety of fields of statistics (e.g., missing data analysis). In time series analysis, the best interpolator for the missing points problem has been investigated in several ways. In this paper, the asymptotics of a contrast function estimator defined by the pseudo interpolation error for a stationary process are investigated. We estimate the parameters of the process by minimizing the pseudo interpolation error written in terms of a fitted parametric spectral density and the periodogram based on the observed stretch. The estimator is consistent and asymptotically normal. Although the criterion for the interpolation problem is known as the best in the sense of smallest mean square error for past and future extrapolation, it is shown that the estimator is asymptotically inefficient in general parameter estimation, which leads to an unexpected result.

15.
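Based on the Bootstrap method, this paper studies interval estimation and hypothesis testing for the common location parameter of several skew-normal populations. First, moment estimates and maximum likelihood estimates of the unknown parameters are given. Second, the study by Xu Liwen (徐礼文) [1] of the common mean of several normal populations is extended to several skew-normal populations, and Bootstrap confidence intervals and Bootstrap test statistics for the common location parameter are constructed. Monte Carlo simulation results show that, whether for two, three or five populations, the Bootstrap confidence intervals based on the moment estimate and on the penalized maximum likelihood estimate outperform the other four Bootstrap confidence intervals in terms of coverage probability. Finally, the methods are applied to case studies of regional GDP and bioavailability data to verify that they are sound and effective.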

16.
In this paper, we illustrate the use of the Conditional Tail Expectation (CTE) risk measure on a set of bivariate real data consisting of two types of auto insurance claim costs. Several continuous bivariate distributions (normal, lognormal, skew-normal with the alternative log-skew-normal) are fitted to the data. Besides, a bivariate nonparametric transformed kernel estimation is presented. CTE formulas are given for all these, and numerical results on the real data are discussed and compared.

17.
Non-random missing data poses serious problems in longitudinal studies. The binomial distribution parameter becomes unidentifiable without any other auxiliary information or assumption when it suffers from ignorable missing data. Existing methods are mostly based on the log-linear regression model. In this article, a model is proposed for longitudinal data with non-ignorable non-response. The pre-test baseline data are used to improve the identifiability of the post-test parameter. Furthermore, we derive the identified estimation (IE), the maximum likelihood estimation (MLE) and its associated variance for the post-test parameter. The simulation study based on the model of this paper shows that the proposed approach gives promising results.

18.
In this paper, we illustrate the use of the Conditional Tail Expectation (CTE) risk measure on a set of bivariate real data consisting of two types of auto insurance claim costs. Several continuous bivariate distributions (normal, lognormal, skew-normal with the alternative log-skew-normal) are fitted to the data. Besides, a bivariate nonparametric transformed kernel estimation is presented. CTE formulas are given for all these, and numerical results on the real data are discussed and compared.

19.
In this article, we develop an efficient robust method for simultaneous estimation of the mean and covariance for longitudinal data in a regression model. Based on the Cholesky decomposition of the covariance matrix and a rewriting of the regression model, we propose a weighted least squares estimator in which the weights are estimated under the generalized empirical likelihood framework. The proposed estimator obtains high efficiency from its close connection to the empirical likelihood method, and achieves robustness by bounding the weighted sum of squared residuals. A simulation study shows that, compared to existing robust estimation methods for longitudinal data, the proposed estimator has relatively high efficiency and comparable robustness. Finally, the proposed method is used to analyse a real data set.
