Found 19 similar articles (search time: 93 ms)
1.
2.
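Because much real-world data is skewed, a mode regression model for skew-normal data is constructed. Since missing data also occur frequently, imputation methods are used to handle the incomplete data set; to compare imputation performance, statistical inference is studied for the case where the response variable is missing at random. Maximum likelihood estimates of the mode regression model parameters are obtained by Gauss-Newton iteration, and the model is compared under three imputation schemes: mean imputation, regression imputation, and mode imputation. Simulation studies and real-data analy...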
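As a rough illustration of two of the imputation schemes named in this abstract, the sketch below fills in a missing response by mean imputation and by simple-linear-regression imputation. It is not the paper's method: the mode regression model, the skew-normal likelihood, and mode imputation are all omitted, and the function names `mean_impute` and `regression_impute` and the toy data are assumptions made for the example.

```python
from statistics import mean
from typing import List, Optional, Sequence


def mean_impute(y: Sequence[Optional[float]]) -> List[float]:
    """Replace each missing response with the mean of the observed responses."""
    observed = [v for v in y if v is not None]
    m = mean(observed)
    return [m if v is None else v for v in y]


def regression_impute(x: Sequence[float], y: Sequence[Optional[float]]) -> List[float]:
    """Replace each missing response with its ordinary-least-squares prediction from x."""
    pairs = [(a, b) for a, b in zip(x, y) if b is not None]
    xbar = mean(a for a, _ in pairs)
    ybar = mean(b for _, b in pairs)
    sxx = sum((a - xbar) ** 2 for a, _ in pairs)
    if sxx == 0:
        raise ValueError("observed covariate values must not all be equal")
    slope = sum((a - xbar) * (b - ybar) for a, b in pairs) / sxx
    intercept = ybar - slope * xbar
    return [intercept + slope * a if b is None else b for a, b in zip(x, y)]


# Toy data (hypothetical): the response at x = 3 is missing.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, None, 8.0]
by_mean = mean_impute(y)
by_regression = regression_impute(x, y)
```

On this toy data the observed points lie on the line y = 2x, so regression imputation recovers a value close to 6 while mean imputation uses the observed mean of about 4.67, which shows why the two schemes can behave differently when the response depends on a covariate.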
3.
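To address the rather limited way closeness is characterized in predictive mean matching, several ways of characterizing closeness are considered. Exploiting the fact that the propensity score reduces multiple covariates to a single dimension, a new method is proposed that imputes missing data by propensity score matching: the propensity score is estimated first; matching is then performed with one of several methods, such as nearest neighbor, caliper and radius, or stratification or interval matching; finally, the target variable of the matched unit is used to impute the unit with missing data. Monte Carlo simulation and real data confirm that the method is effective. Among mean imputation, regression imputation, random imputation, nearest-neighbor propensity score matching imputation, caliper and radius propensity score matching imputation, and stratification or interval propensity score matching imputation, the stratification or interval variant performs best.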
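The matching and imputation steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the propensity scores have already been estimated (for example by a logistic regression of the response indicator on the covariates), covers only nearest-neighbor matching with an optional caliper, and uses the hypothetical function name `nearest_neighbour_ps_impute`.

```python
from typing import List, Optional, Sequence


def nearest_neighbour_ps_impute(
    scores: Sequence[float],
    y: Sequence[Optional[float]],
    caliper: Optional[float] = None,
) -> List[Optional[float]]:
    """Impute each missing y with the y of the observed unit whose propensity
    score is closest; if a caliper is given and no donor lies within it,
    the value is left missing (None)."""
    donors = [(s, v) for s, v in zip(scores, y) if v is not None]
    if not donors:
        raise ValueError("no observed units to use as donors")
    imputed: List[Optional[float]] = []
    for s, v in zip(scores, y):
        if v is not None:
            imputed.append(v)
            continue
        donor_score, donor_value = min(donors, key=lambda d: abs(d[0] - s))
        if caliper is not None and abs(donor_score - s) > caliper:
            imputed.append(None)  # no donor close enough
        else:
            imputed.append(donor_value)
    return imputed


# Hypothetical scores and target values; None marks a missing target.
scores = [0.20, 0.25, 0.60, 0.62, 0.90]
y = [1.0, None, 3.0, None, 5.0]
filled = nearest_neighbour_ps_impute(scores, y, caliper=0.1)
```

Matching on a single score rather than on all covariates is what makes the nearest-donor search one-dimensional; stratified or interval matching, which the abstract reports as performing best, would instead group units into score bands and draw donors within each band.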
4.
5.
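This paper studies parameter estimation for linear regression models with missing skew-t-normal data. To bring the sample distribution closer to the true distribution, improve the estimates of the regression coefficients, scale parameter, skewness parameter, and degrees-of-freedom parameter, and make the parameter estimates more stable, a modified stochastic regression imputation method suited to linear regression models with missing skew-t-normal data is proposed. Simulation and a real-data study, with comparisons against stochastic regression imputation and multiple stochastic regression imputation, show that the proposed method is effective and feasible.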
6.
7.
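Missing data is the most common of the many factors that affect data quality. If missing data are handled improperly, the reliability of the analysis results suffers directly and the purpose of the analysis is not achieved. For skew-normal data missing at random, this paper studies parameter estimation for the skew-normal mode mixture-of-experts model. Combining mode regression imputation with clustering, a stratified mode regression imputation method is proposed. Machine learning and statistical imputation approaches are then compared: three machine learning methods (support vector machine imputation, random forest imputation, and neural network imputation) and three statistical methods (stratified mean imputation, mode regression imputation, and stratified mode regression imputation). Monte Carlo simulation and real-data analysis demonstrate the superiority of stratified mode regression imputation.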
8.
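李静 (Li Jing) 《数学的实践与认识》 (Mathematics in Practice and Theory) 2009,39(22)
This paper considers estimation in a linear model when the covariates are measured with error and the response is subject to missingness. For the regression coefficients, two classes of estimators based on least squares are proposed: one uses only the completely observed data, and the other uses a completed data set constructed by simple imputation. Both classes of estimators are proved to be asymptotically normal.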
9.
10.
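This paper studies how to perform parameter estimation and variable selection simultaneously in a varying-coefficient partially linear model when the response is subject to missingness and some covariates are measured with error. Imputation is used to handle the missing data, combined with a modified profile least squares estimator and the SCAD penalty for parameter estimation and variable selection. The resulting estimator is proved to be asymptotically normal and to have the oracle property. Numerical simulations further examine its finite-sample properties.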
11.
New imputation methods for missing data using quantiles  Cited by: 1 (self-citations: 0, citations by others: 1)
J.F. Muñoz 《Journal of Computational and Applied Mathematics》2009,232(2):305-317
The problem of missing values commonly arises in data sets, and imputation is usually employed to compensate for non-response. We propose a novel imputation method based on quantiles, which can be implemented with or without the presence of auxiliary information. The proposed method is extended to unequal sampling designs and non-uniform response mechanisms. Iterative algorithms to compute the proposed imputation methods are presented. Monte Carlo simulations are conducted to assess the performance of the proposed imputation methods with respect to alternative imputation methods. Simulation results indicate that the proposed methods perform competitively in terms of relative bias and relative root mean square error.
12.
Thomas Nittner 《Computational Statistics》2004,19(2):261-282
Summary: The main purpose of this paper is a comparison of several imputation methods within the simple additive model y = f(x) + ε, where the independent variable X is affected by values missing completely at random. Besides the well-known complete case analysis, mean imputation plus random noise, single imputation and two kinds of nearest neighbor imputations are used. A short introduction to the model, the missing mechanism, the inference, the imputation methods and their implementation is followed by the main focus, the simulation experiment. The methods are compared within the experiment based on the sample mean squared error, estimated variances and estimated biases of f(x) at the knots.
13.
Kernel function methods have been used successfully to estimate a variety of functions. Using kernel function theory, an imputation method based on the Epanechnikov kernel, together with a modification of it, is proposed to address two problems: missing values in compositional data cause existing statistical methods to fail, and k-nearest imputation does not account for the different contributions of the k nearest samples when it uses them to estimate the missing data. The experimental results show that the modified imputation method based on the Epanechnikov kernel gives more accurate estimates than k-nearest imputation for compositional data.
14.
15.
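Data replication is a common method for handling item nonresponse in sample surveys. Under two common models and a complex sampling design, this paper compares the class-mean replication and class-weighted-mean replication methods for handling item nonresponse.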
16.
Schenker N 《Survey methodology》1988,14(1):87-97, 93-104
"This paper discusses methods used to handle missing data in post-enumeration surveys for estimating census coverage error, as illustrated for the 1986 Test of Adjustment Related Operations (Diffendal 1988). The methods include imputation schemes based on hot-deck and logistic regression models as well as weighting adjustments. The sensitivity of undercount estimates from the 1986 test to variations in the imputation models is also explored." The test was carried out in Central Los Angeles County, California.
17.
In many applications, some covariates could be missing for various reasons. Regression quantiles could be either biased or under-powered when the missing data are ignored. Multiple imputation and an EM-based augmentation approach have been proposed to fully utilize the data with missing covariates for quantile regression. Both methods, however, are computationally expensive. We propose a fast imputation algorithm (FI) to handle the missing covariates in quantile regression, which is an extension of the fractional imputation in likelihood-based regressions. FI and modified imputation algorithms (FIIPW and MIIPW) are compared to existing MI and IPW approaches in simulation studies, and applied to part of the National Collaborative Perinatal Project study.
18.
Nadia Solaro Alessandro Barbiero Giancarlo Manzi Pier Alda Ferrari 《Advances in Data Analysis and Classification》2017,11(2):395-414
Missing data recurrently affect datasets in almost every field of quantitative research. The subject is vast and complex and has originated a literature rich in very different approaches to the problem. Within an exploratory framework, distance-based methods such as nearest-neighbour imputation (NNI), or procedures involving multivariate data analysis (MVDA) techniques seem to treat the problem properly. In NNI, the metric and the number of donors can be chosen at will. MVDA-based procedures expressly account for variable associations. The new approach proposed here, called Forward Imputation, ideally meets these features. It is designed as a sequential procedure that imputes missing data in a step-by-step process involving subsets of units according to their “completeness rate”. Two methods within this context are developed for the imputation of quantitative data. One applies NNI with the Mahalanobis distance, the other combines NNI and principal component analysis. Statistical properties of the two methods are discussed, and their performance is assessed, also in comparison with alternative imputation methods. To this purpose, a simulation study in the presence of different data patterns along with an application to real data are carried out, and practical hints for users are also provided.
19.
Ying Wang Weiming Wan Rui-Sheng Wang Enmin Feng 《Journal of Computational and Applied Mathematics》2009
Mutual information can be used as a measure for the association of a genetic marker or a combination of markers with the phenotype. In this paper, we study the imputation of missing genotype data. We first utilize joint mutual information to compute the dependence between SNP sites, then construct a mathematical model in order to find the two SNP sites having maximal dependence with missing SNP sites, and further study the properties of this model. Finally, an extension method to haplotype-based imputation is proposed to impute the missing values in genotype data. To verify our method, extensive experiments have been performed, and numerical results show that our method is superior to haplotype-based imputation methods. At the same time, numerical results also prove joint mutual information can better measure the dependence between SNP sites. According to experimental results, we also conclude that the dependence between the adjacent SNP sites is not necessarily strongest.
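To make the dependence measure in this abstract concrete, the sketch below computes the empirical mutual information between two genotype vectors, and the joint mutual information between a pair of sites and a third site by treating the pair as a single variable. It illustrates the measure only, not the authors' optimization model or imputation procedure; the function name `mutual_information`, the integer genotype coding, and the toy data are assumptions.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mutual_information(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Empirical mutual information I(X; Y) in bits, from paired observations."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # sum over observed pairs of p(a, b) * log2( p(a, b) / (p(a) * p(b)) )
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b])) for (a, b), c in pxy.items()
    )


# Toy genotypes (hypothetical): the target site equals site_a XOR site_b.
site_a = [0, 1, 0, 1]
site_b = [0, 0, 1, 1]
target = [0, 1, 1, 0]

mi_a = mutual_information(site_a, target)        # single-site dependence
mi_b = mutual_information(site_b, target)
# Joint mutual information I((A, B); target): zip the two sites into one variable.
mi_joint = mutual_information(list(zip(site_a, site_b)), target)
```

In this toy example each site alone carries no information about the target, yet the pair determines it completely, which is the kind of situation where a joint measure over two sites can reveal dependence that single-site measures miss.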