Similar Literature
20 similar records found (search time: 156 ms)
1.
曹添建  凌能祥 《应用数学》2012,25(2):318-326
Using the empirical likelihood approach, this paper constructs confidence intervals for conditional quantiles, both with and without auxiliary information, when the response variable is missing at random (MAR). It also shows that the asymptotic power of the test is non-decreasing in the amount of auxiliary information, extending corresponding results in the existing literature.

2.
When the functional form of the missing-data mechanism is unknown, a consistent estimator of the distribution function is obtained via two-stage sampling, and its asymptotic normality is proved. It is assumed that the missing-data mechanism at the second sampling stage has a functional form similar to that at the first stage, the two being allowed to differ by a one-dimensional unknown parameter.

3.
Under the missing-at-random (MAR) mechanism, an estimator of the error variance in the linear regression model is constructed by the empirical likelihood method. Under certain conditions, the asymptotic normality of this estimator is proved, from which it follows that when the error distribution is asymmetric, its asymptotic variance is smaller than that of the commonly used estimator.

4.
魏传华  郭双 《应用数学》2016,29(4):797-808
This paper studies statistical inference for the partially linear additive model when the response variable is subject to missingness. First, a complete-case profile least squares estimator of the parametric component is proposed and shown to be asymptotically normal. To obtain interval estimates of the parametric component, an empirical likelihood statistic with an asymptotic chi-squared distribution is constructed. To test linear constraints on the parametric component, an adjusted generalized likelihood ratio test statistic is constructed, whose asymptotic distribution under the null hypothesis is chi-squared, thereby extending the generalized likelihood ratio test to the missing-data setting. Finally, numerical simulations verify the effectiveness of the proposed methods.

5.
Identifying model parameters for non-ignorably missing data using odds-ratio information
Missing data arising from a non-ignorable missingness mechanism often render the model unidentifiable. Such unidentifiable models can be made identifiable by adding covariates or by drawing on external data from other sources. This paper investigates how, under a non-ignorable missingness mechanism, an externally obtained odds-ratio estimate can be used to identify the joint probability.

6.
Missing data are ubiquitous in applications; they reduce research efficiency and can bias parameter estimates. Under the assumption that covariates are missing at random (MAR), this paper estimates the parameters of a linear model by combining modal regression with inverse probability weighting. Using two propensity-score estimators, parametric Logistic regression and the nonparametric Nadaraya-Watson estimator, the method constructs the IPWM-L and IPWM-NW estimators, respectively. Simulation studies and a real-data analysis show that the modal regression model is more robust than the mean regression model, that the inverse-probability-weighted modal (IPWM) estimation method fits better under missing data, and that the IPWM-NW estimator is more robust than the IPWM-L estimator.
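The weighting idea in the abstract above can be illustrated with a minimal simulation. This is a sketch only: it uses mean regression rather than the paper's modal regression, and it plugs in the true propensity score instead of the Logistic or Nadaraya-Watson estimates the paper studies.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulate y = 1 + 2*x + noise; x is MAR, with the probability of
# observing x depending only on the always-observed y.
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)

p_obs = 1.0 / (1.0 + np.exp(-(0.5 + 0.8 * y)))   # true propensity P(observed | y)
observed = rng.uniform(size=n) < p_obs

# Inverse-probability-weighted least squares: weight each complete
# case by 1 / P(observed), which restores an unbiased estimating
# equation even though selection depends on y.
w = 1.0 / p_obs[observed]
X = np.column_stack([np.ones(observed.sum()), x[observed]])
Xw = X * w[:, None]
beta_ipw = np.linalg.solve(Xw.T @ X, Xw.T @ y[observed])

print(beta_ipw)  # close to the true coefficients (1, 2)
```

In practice the propensity would itself be estimated from the observation indicators, which is exactly where the paper's IPWM-L (Logistic) and IPWM-NW (Nadaraya-Watson) variants differ.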

7.
Posterior distributions of normal population parameters with missing data and their sampling algorithms
Under the assumptions that the missing-data mechanism is ignorable and the prior is an inverse matrix Gamma distribution, this paper derives, using the Cholesky decomposition of a matrix together with a change of variables, the form of the posterior distribution of the parameters of a normal distribution with a monotone missing-data pattern. Exploiting the structure of this posterior, sampling algorithms are then constructed for the posterior distributions of the covariance matrix and the mean of the normal distribution under the monotone missing-data pattern.
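The abstract's full derivation concerns the monotone-missing-data posterior, but the Cholesky device at its core is the standard one for sampling a multivariate normal: if Sigma = L Lᵀ and z ~ N(0, I), then mu + L z ~ N(mu, Sigma). A minimal sketch of that step, with an arbitrary illustrative mu and Sigma:

```python
import numpy as np

rng = np.random.default_rng(1)

mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

# Cholesky factor: Sigma = L @ L.T
L = np.linalg.cholesky(Sigma)

# Transform standard normal draws: mu + L z ~ N(mu, Sigma)
draws = mu + rng.standard_normal((100_000, 2)) @ L.T

print(draws.mean(axis=0))           # ~ mu
print(np.cov(draws, rowvar=False))  # ~ Sigma
```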

8.
This paper studies estimation of the distribution function when data are not missing at random. Given knowledge of whether the missing data fall into certain specified intervals, the distribution function F(y) of a one-dimensional random variable y is estimated. The functional form of the missing-data mechanism is assumed known, but it contains an unknown multidimensional parameter θ. The paper proves the consistency and asymptotic normality of the estimator θ̂ of the unknown parameter θ, as well as of the estimator F̂(y) of the distribution function F(y).

9.
This paper discusses parameter estimation for two Poisson populations with partially missing data and the likelihood ratio test of equality of the two populations. It proves the strong consistency and asymptotic normality of the estimators, gives the limiting distribution of the likelihood ratio test, and discusses testing based on the exact distribution.

10.
During reliability growth testing of long-life products, some test data may be lost or go unobserved for reasons related to personnel, observation equipment, or other factors. For such small-sample, varying-population missing-data situations, a Bayesian reliability growth analysis method is proposed. First, the prior distribution is constructed using the Box-Tiao technique; then, using non-homogeneous Poisson process theory and the missing-data generating mechanism, the likelihood function for the reliability-growth missing data is obtained, and Bayesian inference yields the product's reliability level at the end of each development stage. A goodness-of-fit test for the growth model under missing data is also given. Finally, an example illustrates the engineering application of the method.

11.
In this paper, we carry out an in-depth theoretical investigation for inference with missing response and covariate data for general regression models. We assume that the missing data are missing at random (MAR) or missing completely at random (MCAR) throughout. Previous theoretical investigations in the literature have focused only on missing covariates or missing responses, but not both. Here, we consider theoretical properties of the estimates under three different estimation settings: complete case (CC) analysis, a complete response (CR) analysis that involves an analysis of those subjects with only completely observed responses, and the all case (AC) analysis, which is an analysis based on all of the cases. Under each scenario, we derive general expressions for the likelihood and devise estimation schemes based on the EM algorithm. We carry out a theoretical investigation of the three estimation methods in the normal linear model and analytically characterize the loss of information for each method, as well as derive and compare the asymptotic variances for each method assuming the missing data are MAR or MCAR. In addition, a theoretical investigation of bias for the CC method is also carried out. A simulation study and real dataset are given to illustrate the methodology.

12.
The 2004 Basel II Accord has pointed out the benefits of credit risk management through internal models using internal data to estimate risk components: probability of default (PD), loss given default, exposure at default and maturity. Internal data are the primary data source for PD estimates; banks are permitted to use statistical default prediction models to estimate the borrowers’ PD, subject to some requirements concerning accuracy, completeness and appropriateness of data. However, in practice, internal records are usually incomplete or do not contain adequate history to estimate the PD. Missing data are especially critical with regard to low default portfolios, characterised by inadequate default records, making it difficult to design statistically significant prediction models. Several methods might be used to deal with missing data, such as list-wise deletion, application-specific list-wise deletion, substitution techniques or imputation models (simple and multiple variants). List-wise deletion is an easy-to-use method widely applied by social scientists, but it loses substantial data and reduces the diversity of information, resulting in bias in the model's parameters, results and inferences. The choice of the best method to solve the missing data problem largely depends on the nature of the missing values (MCAR, MAR and MNAR processes), but there is a lack of empirical analysis about their effect on credit risk that limits the validity of resulting models. In this paper, we analyse the nature and effects of missing data in credit risk modelling (MCAR, MAR and MNAR processes) and take into account a scarce data set on consumer borrowers, which includes different percentages and distributions of missing data.
The findings are used to analyse the performance of several methods for dealing with missing data, such as list-wise deletion, simple imputation methods, MLE models and advanced multiple imputation (MI) alternatives based on Markov Chain Monte Carlo and re-sampling methods. Results are evaluated and discussed between models in terms of robustness, accuracy and complexity. In particular, MI models are found to provide very valuable solutions with regard to credit risk missing data.
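The bias from list-wise deletion described above is easy to reproduce in a toy simulation. The sketch below is illustrative only (not the paper's credit-risk data): it compares the complete-case mean of a MAR variable against a simple regression (conditional-mean) imputation, one of the "simple imputation" alternatives the abstract mentions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# x is always observed; y = x + noise (true mean of y is 0).
# y is MAR: it goes missing more often when x is large.
x = rng.normal(size=n)
y = x + rng.normal(scale=0.5, size=n)
miss = rng.uniform(size=n) < 1 / (1 + np.exp(-2 * x))

# List-wise deletion keeps only complete cases, which over-represent
# small x, so the complete-case mean is biased downward.
cc_mean = y[~miss].mean()

# Regression imputation: fit y ~ x on complete cases (valid because
# missingness depends only on the observed x), predict the missing y,
# then average over the completed data.
b1, b0 = np.polyfit(x[~miss], y[~miss], 1)
y_imp = np.where(miss, b0 + b1 * x, y)
imp_mean = y_imp.mean()

print(cc_mean, imp_mean)  # cc_mean clearly negative; imp_mean near 0
```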

13.
When missing data are either missing completely at random (MCAR) or missing at random (MAR), the maximum likelihood (ML) estimation procedure preserves many of its properties. However, in any statistical modeling, the distribution specification for the likelihood function is at best only an approximation to the real world. In particular, since the normal-distribution-based ML is typically applied to data with heterogeneous marginal skewness and kurtosis, it is necessary to know whether such a practice still generates consistent parameter estimates. When the manifest variables are linear combinations of independent random components and missing data are MAR, this paper shows that the normal-distribution-based MLE is consistent regardless of the distribution of the sample. Examples also show that the consistency of the MLE is not guaranteed for all nonnormally distributed samples. When the population follows a confirmatory factor model, and data are missing due to the magnitude of the factors, the MLE may not be consistent even when data are normally distributed. When data are missing due to the magnitude of measurement errors/uniqueness, MLEs for many of the covariance parameters related to the missing variables are still consistent. This paper also identifies and discusses the factors that affect the asymptotic biases of the MLE when data are not missing at random. In addition, the paper also shows that, under certain data models and MAR mechanism, the MLE is asymptotically normally distributed and the asymptotic covariance matrix is consistently estimated by the commonly used sandwich-type covariance matrix. The results indicate that certain formulas and/or conclusions in the existing literature may not be entirely correct.

14.
We establish computationally flexible tools for the analysis of multivariate skew normal mixtures when missing values occur in data. To facilitate the computation and simplify the theoretical derivation, two auxiliary permutation matrices are incorporated into the model for the determination of observed and missing components of each observation and are manifestly effective in reducing the computational complexity. We present an analytically feasible EM algorithm for the supervised learning of parameters as well as missing observations. The proposed mixture analyzer, including the most commonly used Gaussian mixtures as a special case, allows practitioners to handle incomplete multivariate data sets in a wide range of considerations. The methodology is illustrated through a real data set with varying proportions of synthetic missing values generated by MCAR and MAR mechanisms and shown to perform well on classification tasks.

15.
Under the missing-completely-at-random (MCAR) mechanism, missing values are filled in by fractional imputation, and the empirical likelihood method is then used to construct a semi-empirical likelihood ratio statistic for the difference between the quantiles of two populations. The statistic is shown to converge asymptotically to a weighted χ² distribution, and the corresponding semi-empirical likelihood confidence interval is constructed.

16.
This research attempts to solve the problem of dealing with missing data via the interface of Data Envelopment Analysis (DEA) and human behavior. Missing data is under continuing discussion in various research fields, especially those highly dependent on data. In practice and research, some necessary data may not be obtained in many cases, for example, procedural factors, lack of needed responses, etc. Thus the question of how to deal with missing data is raised. In this paper, modified DEA models are developed to estimate the appropriate value of missing data in its interval, based on DEA and Inter-dimensional Similarity Halo Effect. The estimated value of missing data is determined by the General Impression of original DEA efficiency. To evaluate the effectiveness of this method, the impact factor is proposed. In addition, the advantages of the proposed approach are illustrated in comparison with previous methods.

17.
Most current implementations of multiple imputation (MI) assume that data are missing at random (MAR), but this assumption is generally untestable. We performed analyses to test the effects of auxiliary variables on MI when the data are missing not at random (MNAR) using simulated data and real data. In the analyses we varied (a) the correlation, (b) the level of missing data, (c) the pattern of missing data, and (d) sample size. Results showed that MI performed adequately without auxiliary variables, but auxiliary variables also had a modest impact on bias in the real data and improved efficiency in both data sets. The results of this study suggest that, counter to the concern about the violation of the MAR assumption, MI appears to be quite robust to missing data that are MNAR in analytic situations such as the ones presented here. Further, results can be made even better via the use of auxiliary variables, particularly when efficiency is a primary concern.
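Standard MI analyses like the one above pool the per-imputation results with Rubin's rules: the pooled estimate is the mean of the m per-imputation estimates, and the total variance is T = W + (1 + 1/m)·B, with W the average within-imputation variance and B the between-imputation variance. A minimal sketch, with made-up per-imputation numbers for illustration:

```python
import numpy as np

def pool(estimates, variances):
    """Combine m multiply-imputed estimates of a scalar parameter
    using Rubin's rules; returns (pooled estimate, total variance)."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    qbar = estimates.mean()            # pooled point estimate
    w = variances.mean()               # within-imputation variance
    b = estimates.var(ddof=1)          # between-imputation variance
    t = w + (1 + 1 / m) * b            # total variance
    return qbar, t

# Hypothetical results from m = 5 imputed data sets.
est, var = pool([1.1, 0.9, 1.0, 1.05, 0.95], [0.04, 0.05, 0.04, 0.05, 0.04])
print(est, var)
```

The (1 + 1/m) factor inflates the between-imputation component to account for using only finitely many imputations.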

18.
We consider the statistical inference for right-censored data when censoring indicators are missing but nonignorable, and propose an adjusted imputation product-limit estimator. The proposed estimator is shown to be consistent and converges to a Gaussian process. Furthermore, we develop an empirical-process-based testing method to check the MAR (missing at random) mechanism, and establish asymptotic properties for the proposed test statistic. To determine the critical value of the test, a consistent model-based bootstrap method is suggested. We conduct simulation studies to evaluate the numerical performance of the proposed method and compare it with existing methods. We also analyze a real data set from a breast cancer study for an illustration.
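For readers less familiar with the product-limit estimator that the adjusted estimator above builds on, here is the classical Kaplan-Meier version with fully observed censoring indicators (a baseline sketch, not the paper's adjusted imputation estimator):

```python
import numpy as np

def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) survival estimate.
    times: observed times; events: 1 = failure, 0 = right-censored.
    Returns a list of (event time, estimated survival probability)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    surv, s = [], 1.0
    for t in np.unique(times[events == 1]):
        at_risk = np.sum(times >= t)                    # still in the risk set
        d = np.sum((times == t) & (events == 1))        # failures at time t
        s *= 1 - d / at_risk                            # multiply the factors
        surv.append((t, s))
    return surv

# Censored subjects (event = 0) leave the risk set without a failure.
print(kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1]))
```

When a censoring indicator is itself missing, each factor above becomes uncomputable, which is exactly the gap the paper's adjusted imputation estimator addresses.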

19.
To address the widespread skewness of real-world data, a modal regression model for skew-normal data is constructed. Since missing data also arise frequently, imputation methods are used to handle the incomplete data set, and to compare imputation performance, statistical inference is studied for the case where the response variable is missing at random. Maximum likelihood estimates of the modal regression model parameters are obtained by the Gauss-Newton iterative method, and the model's imputation performance is compared under three schemes: mean imputation, regression imputation, and mode imputation. Simulation studies and a real-data analysis...

20.
Non-random missing data poses serious problems in longitudinal studies. The binomial distribution parameter becomes unidentifiable without any other auxiliary information or assumption when it suffers from non-ignorable missing data. Existing methods are mostly based on the log-linear regression model. In this article, a model is proposed for longitudinal data with non-ignorable non-response. We consider using the pre-test baseline data to improve the identifiability of the post-test parameter. Furthermore, we derive the identified estimation (IE), the maximum likelihood estimation (MLE) and its associated variance for the post-test parameter. The simulation study based on the model of this paper shows that the proposed approach gives promising results.
