Similar literature
1.
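Imputation methods for missing data in sample surveys   Total citations: 5, self-citations: 0, citations by others: 5
Missing data arise frequently in sample surveys and other practical problems, and imputation is one of the usual ways of handling them. This paper reviews the imputation methods commonly used for missing data in sample surveys. It focuses on variance estimation for single imputation, simplified computation for multiple imputation, and single imputation using response probabilities. It closes with a discussion of the problems imputation currently faces and the directions in which it is developing.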

2.
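Because much real-world data is skewed, a mode regression model for skew-normal data is constructed. Since missing data also occur frequently, imputation is used to handle the incomplete data set and, in order to compare imputation performance, statistical inference is studied for the case in which the response variable is missing at random. Maximum likelihood estimates of the parameters of the mode regression model are obtained by Gauss-Newton iteration, and the model is compared under three imputation schemes: mean imputation, regression imputation, and mode imputation. Simulation and a real-data example ...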

3.
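Predictive mean matching characterizes closeness in only one way. To address this, several ways of characterizing closeness are considered and, exploiting the fact that the propensity score reduces multiple covariates to a single dimension, a new method is proposed that imputes missing data by propensity score matching. First, the propensity score is estimated; next, matching is carried out with one of several methods, such as nearest-neighbor, caliper and radius, or stratification (interval) matching; finally, the units with missing data are imputed from the target variable of their matched units. Monte Carlo simulation and real data confirm that the method is effective. Among mean imputation, regression imputation, random imputation, and propensity score matching imputation by nearest neighbor, by caliper and radius, and by stratification or interval, the stratification or interval variant performs best.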
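
The three steps described above (estimate the propensity score, match, impute from the matched unit) can be sketched roughly as follows. This is only an illustrative sketch and not the authors' implementation: it assumes the propensity score is the probability of the target being observed, fits it with a plain gradient-ascent logistic regression, and uses nearest-neighbor matching only. The function names `fit_logistic`, `propensity`, and `psm_impute` are hypothetical.

```python
import math

def fit_logistic(X, r, lr=0.1, steps=2000):
    """Fit P(r = 1 | x) by gradient ascent; returns weights, intercept first."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, ri in zip(X, r):
            err = ri - propensity(w, x)
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

def propensity(w, x):
    """Estimated probability that the target is observed for covariates x."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def psm_impute(X, y):
    """Nearest-neighbor propensity score matching; None in y marks a missing value."""
    r = [0 if yi is None else 1 for yi in y]       # response indicator
    w = fit_logistic(X, r)                         # step 1: estimate the score
    scores = [propensity(w, x) for x in X]
    donors = [i for i, ri in enumerate(r) if ri == 1]
    filled = list(y)
    for i, ri in enumerate(r):
        if ri == 0:
            # step 2: match on the score; step 3: copy the donor's target value
            j = min(donors, key=lambda d: abs(scores[d] - scores[i]))
            filled[i] = y[j]
    return filled

X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [1.0, 2.1, None, 3.9, None, 6.2]
print(psm_impute(X, y))
```

Caliper, radius, or stratified matching would replace only the `min(...)` step; the rest of the sketch would stay the same.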

4.
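This paper discusses quantile estimation based on unbiased estimating equations when response data are missing. Two imputation methods based on nonparametric smoothing are proposed: a global nonparametric kernel imputation method and a local multiple imputation method. Both can be used to construct asymptotically unbiased estimating equations. With these estimating equations for missing data, standard estimation methods can be used for statistical inference on the unknown quantile. The resulting quantile estimators are shown to be consistent and asymptotically normal.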

5.
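This paper studies parameter estimation for linear regression models with missing skew-t-normal data. To bring the sample distribution closer to the true distribution, improve the estimates of the regression coefficients, scale parameter, skewness parameter, and degrees-of-freedom parameter, and make the parameter estimates more stable, a modified stochastic regression imputation method suited to linear regression models with missing skew-t-normal data is proposed. Simulation and a real-data study, with comparisons against stochastic regression imputation and multiple stochastic regression imputation, show that the proposed method is effective and feasible.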

6.
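In economics and in experiments for improving industrial product quality, it is often necessary to model the mean and the dispersion simultaneously, and missing data are frequently encountered during data collection. Motivated by these two points, this article studies parameter estimation for double generalized linear models with missing data. Nearest-distance imputation and inverse-distance-weighted imputation are used to handle the missing data, and the unknown parameters are estimated by maximum extended quasi-likelihood and maximum pseudo-likelihood. Simulation and real-data results show that the model and the methods are useful and effective.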

7.
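Missing data are the most common of the many factors that affect data quality. If they are handled improperly, the reliability of the analysis suffers directly and the analysis fails to achieve its purpose. For skew-normal data missing at random, this paper studies parameter estimation for a skew-normal mode mixture-of-experts model. Combining mode regression imputation with clustering, it proposes a stratified mode regression imputation method. It then compares the performance of three machine-learning imputation methods (support vector machine, random forest, and neural network imputation) with three statistical imputation methods (stratified mean imputation, mode regression imputation, and stratified mode regression imputation). Monte Carlo simulation and a real-data analysis demonstrate the advantages of stratified mode regression imputation.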

8.
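This paper considers estimation in linear models in which the covariates are measured with error and the response is subject to missingness. Two classes of least-squares-based estimators of the regression coefficients are proposed: one uses only the completely observed data, and the other uses a complete data set constructed by simple imputation. Both classes of estimators are shown to be asymptotically normal.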

9.
《数理统计与管理》2015,(4):621-627
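A joint mean and variance model for missing data is proposed on the basis of the normal distribution. With the response variable missing at random, parameter estimation for the model is studied under three imputation methods: mean imputation, regression imputation, and stochastic regression imputation. Comparisons based on simulated data and a real example show that stochastic regression imputation is the most useful and effective of the three.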
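
To make the three imputation methods compared above concrete, here is a minimal sketch. It assumes a single covariate and a simple linear mean model with constant variance, which is much simpler than the joint mean and variance model of the paper; the function names `fit_line` and `impute` are hypothetical.

```python
import math
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def impute(x, y, method="mean", seed=0):
    """Fill None entries of y; method is 'mean', 'regression' or 'stochastic'."""
    obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    xs = [xi for xi, _ in obs]
    ys = [yi for _, yi in obs]
    if method == "mean":
        m = statistics.fmean(ys)                   # mean of the observed responses
        return [m if yi is None else yi for yi in y]
    a, b = fit_line(xs, ys)                        # fit on complete cases only
    if method == "regression":
        return [a + b * xi if yi is None else yi for xi, yi in zip(x, y)]
    if method != "stochastic":
        raise ValueError("unknown method: " + method)
    # Stochastic regression: prediction plus a normal draw with residual scale.
    resid = [yi - (a + b * xi) for xi, yi in obs]
    sigma = math.sqrt(sum(e * e for e in resid) / (len(obs) - 2))
    rng = random.Random(seed)
    return [a + b * xi + rng.gauss(0.0, sigma) if yi is None else yi
            for xi, yi in zip(x, y)]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, None, 8.2, 9.8]
for m in ("mean", "regression", "stochastic"):
    print(m, impute(x, y, m))
```

Mean and regression imputation always place the imputed value on the fitted center, which tends to shrink the variance of the completed data; the random residual in the stochastic variant is what restores that spread.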

10.
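This paper studies how to carry out parameter estimation and variable selection simultaneously in varying-coefficient partially linear models when the response is subject to missingness and some of the covariates contain measurement error. Imputation is used to handle the missing data, and a modified profile least squares estimator is combined with the SCAD penalty for estimation and variable selection. The resulting estimators are shown to be asymptotically normal and to have the oracle property. Their finite-sample properties are examined further through numerical simulation.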

11.
New imputation methods for missing data using quantiles   Total citations: 1, self-citations: 0, citations by others: 1
The problem of missing values commonly arises in data sets, and imputation is usually employed to compensate for non-response. We propose a novel imputation method based on quantiles, which can be implemented with or without the presence of auxiliary information. The proposed method is extended to unequal sampling designs and non-uniform response mechanisms. Iterative algorithms to compute the proposed imputation methods are presented. Monte Carlo simulations are conducted to assess the performance of the proposed imputation methods with respect to alternative imputation methods. Simulation results indicate that the proposed methods perform competitively in terms of relative bias and relative root mean square error.

12.
Summary. The main purpose of this paper is a comparison of several imputation methods within the simple additive model y = f(x) + ε, where the independent variable X is affected by values missing completely at random. Besides the well-known complete case analysis, mean imputation plus random noise, single imputation and two kinds of nearest neighbor imputation are used. A short introduction to the model, the missing mechanism, the inference, the imputation methods and their implementation is followed by the main focus: the simulation experiment. The methods are compared within the experiment based on the sample mean squared error, estimated variances and estimated biases of f(x) at the knots.

13.
The kernel function method has been used successfully to estimate a variety of functions. Using kernel function theory, an imputation method based on the Epanechnikov kernel, together with a modification of it, is proposed to address two problems: missing values in compositional data cause existing statistical methods to fail, and k-nearest imputation does not account for the different contributions of the k nearest samples when it uses them to estimate the missing data. The experimental results illustrate that the modified imputation method based on the Epanechnikov kernel gives a more accurate estimate than k-nearest imputation for compositional data.
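
The idea of letting the k nearest samples contribute unequally can be sketched as below. This is a simplified illustration and not the method of the paper: it assumes Euclidean distance on fully observed features rather than a distance suited to compositional data, and it takes the distance to the (k+1)-th nearest donor as the bandwidth, which is one common convention. The names `epanechnikov` and `kernel_knn_impute` are hypothetical.

```python
import math

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def kernel_knn_impute(target, donors, k=3):
    """Estimate a missing value as a kernel-weighted mean of the k nearest donors.

    target: observed feature vector of the incomplete sample
    donors: list of (feature_vector, value) pairs from complete samples
    """
    if len(donors) <= k:
        raise ValueError("need more than k donors to set the bandwidth")
    ranked = sorted(donors, key=lambda d: math.dist(target, d[0]))
    h = math.dist(target, ranked[k][0])            # bandwidth: (k+1)-th neighbor
    nearest = ranked[:k]
    if h == 0.0:
        # All candidates coincide with the target; fall back to a plain mean.
        return sum(v for _, v in nearest) / k
    weights = [epanechnikov(math.dist(target, f) / h) for f, _ in nearest]
    total = sum(weights)
    if total == 0.0:
        # The k nearest donors all sit exactly at distance h; plain mean again.
        return sum(v for _, v in nearest) / k
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / total

donors = [([1.0], 10.0), ([2.0], 20.0), ([4.0], 40.0)]
print(kernel_knn_impute([0.0], donors, k=2))
```

With equal weights the two nearest donors in the usage example would give 15.0; the kernel weights pull the estimate toward the closer donor.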

14.
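A comparison of different imputation methods   Total citations: 6, self-citations: 1, citations by others: 5
This paper proposes several imputation methods for missing data and uses simulation experiments to examine their applicability, strengths, and weaknesses. The results show that introducing control variables appropriately helps improve the estimates. In terms of fit to the true values, mean imputation has the advantage; in terms of preserving the sample distribution, imputation methods that include a random component perform markedly better. It is very important to remain objective and cautious whenever the "complete" data set obtained after imputation is used.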

15.
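Data replication is a commonly used method for handling item nonresponse in sample surveys. Under two common models and a complex sampling design, this paper compares class-mean replication and class weighted-mean replication for handling item nonresponse.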

16.
Schenker N 《Survey methodology》1988,14(1):87-97, 93-104
"This paper discusses methods used to handle missing data in post-enumeration surveys for estimating census coverage error, as illustrated for the 1986 Test of Adjustment Related Operations (Diffendal 1988). The methods include imputation schemes based on hot-deck and logistic regression models as well as weighting adjustments. The sensitivity of undercount estimates from the 1986 test to variations in the imputation models is also explored." The test was carried out in Central Los Angeles County, California.

17.
In many applications, some covariates could be missing for various reasons. Regression quantiles could be either biased or under-powered when ignoring the missing data. Multiple imputation and an EM-based augmentation approach have been proposed to fully utilize the data with missing covariates for quantile regression. Both methods, however, are computationally expensive. We propose a fast imputation algorithm (FI) to handle the missing covariates in quantile regression, which is an extension of the fractional imputation in likelihood-based regressions. FI and modified imputation algorithms (FIIPW and MIIPW) are compared to existing MI and IPW approaches in the simulation studies, and applied to part of the National Collaborative Perinatal Project study.

18.
Missing data recurrently affect datasets in almost every field of quantitative research. The subject is vast and complex and has originated a literature rich in very different approaches to the problem. Within an exploratory framework, distance-based methods such as nearest-neighbour imputation (NNI), or procedures involving multivariate data analysis (MVDA) techniques seem to treat the problem properly. In NNI, the metric and the number of donors can be chosen at will. MVDA-based procedures expressly account for variable associations. The new approach proposed here, called Forward Imputation, ideally meets these features. It is designed as a sequential procedure that imputes missing data in a step-by-step process involving subsets of units according to their “completeness rate”. Two methods within this context are developed for the imputation of quantitative data. One applies NNI with the Mahalanobis distance, the other combines NNI and principal component analysis. Statistical properties of the two methods are discussed, and their performance is assessed, also in comparison with alternative imputation methods. To this purpose, a simulation study in the presence of different data patterns along with an application to real data are carried out, and practical hints for users are also provided.

19.
Mutual information can be used as a measure for the association of a genetic marker or a combination of markers with the phenotype. In this paper, we study the imputation of missing genotype data. We first utilize joint mutual information to compute the dependence between SNP sites, then construct a mathematical model in order to find the two SNP sites having maximal dependence with the missing SNP site, and further study the properties of this model. Finally, an extension of haplotype-based imputation is proposed to impute the missing values in genotype data. To verify our method, extensive experiments have been performed, and numerical results show that our method is superior to haplotype-based imputation methods. The numerical results also indicate that joint mutual information better measures the dependence between SNP sites. From the experimental results, we also conclude that the dependence between adjacent SNP sites is not necessarily the strongest.
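
As a rough illustration of selecting the two SNP sites with maximal dependence on a target site, the sketch below computes empirical mutual information from genotype counts and searches all pairs exhaustively. It is not the model constructed in the paper: it assumes complete genotype vectors for the samples used, a plug-in estimator in nats, and a brute-force search. The names `mutual_information` and `best_pair` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from two aligned discrete sequences."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # sum over observed cells of p(x, y) * log(p(x, y) / (p(x) * p(y)))
    return sum(c / n * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def best_pair(sites, target):
    """Return the pair of site names maximizing the joint MI I((Xa, Xb); target).

    sites: dict mapping a site name to its genotype list (same length as target)
    """
    def joint_mi(pair):
        a, b = pair
        joint = list(zip(sites[a], sites[b]))      # treat the pair as one variable
        return mutual_information(joint, target)
    return max(combinations(sites, 2), key=joint_mi)

sites = {
    "s1": [0, 0, 1, 1, 0, 0, 1, 1],
    "s2": [0, 1, 0, 1, 0, 1, 0, 1],
    "s3": [0, 0, 0, 0, 1, 1, 1, 1],
}
target = [a ^ b for a, b in zip(sites["s1"], sites["s2"])]
print(best_pair(sites, target))
```

In the usage example neither `s1` nor `s2` alone carries any information about the target, yet the pair determines it completely, which is the kind of dependence a joint measure can detect and a pairwise one cannot.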
