1.

Theory and inference for regression models with missing responses and covariates

Qingxia Chen Ming-Hui Chen 《Journal of multivariate analysis》2008,99(6):1302-1331

In this paper, we carry out an in-depth theoretical investigation for inference with missing response and covariate data for general regression models. We assume that the missing data are missing at random (MAR) or missing completely at random (MCAR) throughout. Previous theoretical investigations in the literature have focused only on missing covariates or missing responses, but not both. Here, we consider theoretical properties of the estimates under three different estimation settings: complete case (CC) analysis, a complete response (CR) analysis that involves an analysis of those subjects with only completely observed responses, and the all case (AC) analysis, which is an analysis based on all of the cases. Under each scenario, we derive general expressions for the likelihood and devise estimation schemes based on the EM algorithm. We carry out a theoretical investigation of the three estimation methods in the normal linear model and analytically characterize the loss of information for each method, as well as derive and compare the asymptotic variances for each method assuming the missing data are MAR or MCAR. In addition, a theoretical investigation of bias for the CC method is also carried out. A simulation study and real dataset are given to illustrate the methodology.
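To make the complete case (CC) setting concrete, the following is a minimal sketch for a simple normal linear model with one covariate. It is not the authors' estimation scheme (which is EM-based); the function name `complete_case_ols` and the convention of marking missing values with `None` are assumptions made for illustration only.

```python
def complete_case_ols(x, y):
    """Least-squares fit of y = intercept + slope * x on complete cases only.

    A row is a complete case when both x and y are observed.
    Missing values are marked with None (a convention of this sketch).
    Returns (intercept, slope, number_of_complete_cases).
    """
    pairs = [(xi, yi) for xi, yi in zip(x, y)
             if xi is not None and yi is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete cases")

    mean_x = sum(xi for xi, _ in pairs) / n
    mean_y = sum(yi for _, yi in pairs) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in pairs)
    if sxx == 0:
        raise ValueError("x is constant among the complete cases")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in pairs)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope, n


# Rows 3 and 4 each have one missing value and are dropped entirely.
x = [0.0, 1.0, 2.0, None, 4.0, 5.0]
y = [1.0, 3.0, None, 7.0, 9.0, 11.0]
intercept, slope, n_used = complete_case_ols(x, y)
```

The sketch discards every row with any missing entry, so only four of the six rows contribute to the fit. That discarded information is what the CR and AC analyses described in the abstract try to recover.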

2.

Semiparametric Maximum Likelihood for Missing Covariates in Parametric Regression

Zhiwei Zhang Howard E. Rockette 《Annals of the Institute of Statistical Mathematics》2006,58(4):687-706

We consider parameter estimation in parametric regression models with covariates missing at random. This problem admits a semiparametric maximum likelihood approach which requires no parametric specification of the selection mechanism or the covariate distribution. The semiparametric maximum likelihood estimator (MLE) has been found to be consistent. We show here, for some specific models, that the semiparametric MLE converges weakly to a zero-mean Gaussian process in a suitable space. The regression parameter estimate, in particular, achieves the semiparametric information bound, which can be consistently estimated by perturbing the profile log-likelihood. Furthermore, the profile likelihood ratio statistic is asymptotically chi-squared. The techniques used here extend to other models.

3.

Estimators of the regression parameters of the zeta distribution

Louis G. Doray Michel Arsenault 《Insurance: Mathematics and Economics》2002,30(3):255

The zeta distribution with regression parameters has been rarely used in statistics because of the difficulty of estimating the parameters by traditional maximum likelihood. We propose an alternative method for estimating the parameters based on an iteratively reweighted least-squares algorithm. The quadratic distance estimator (QDE) obtained is consistent, asymptotically unbiased and normally distributed; the estimate can also serve as the initial value required by an algorithm to maximize the likelihood function. We illustrate the method with a numerical example from the insurance literature; we compare the values of the estimates obtained by the quadratic distance and maximum likelihood methods and their approximate variance–covariance matrix. Finally, we calculate the bias, variance and the asymptotic efficiency of the QDE compared to the maximum likelihood estimator (MLE) for some values of the parameters.

4.

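Existence of the MLE in the multiperiod Probit model

刘金燕 徐兴忠 《应用数学》2004,(Z2)

This paper discusses the existence of the MLE in the multiperiod Probit model. When the covariance matrix is known, we give a necessary and sufficient condition for the MLE of the parameters to exist; when the covariance matrix is unknown but has a serial structure, we give one necessary condition and one sufficient condition for the MLE of the parameters to exist.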

5.

EMPIRICAL LIKELIHOOD APPROACH FOR LONGITUDINAL DATA WITH MISSING VALUES AND TIME-DEPENDENT COVARIATES

Yan Zhang Weiping Zhang Xiao Guo 《应用数学年刊》2016,32(2):200-220

Missing data and time-dependent covariates often arise simultaneously in longitudinal studies, and directly applying classical approaches may result in a loss of efficiency and biased estimates. To deal with this problem, we propose weighted corrected estimating equations under the missing at random mechanism, and then develop a shrinkage empirical likelihood estimation approach for the parameters of interest when time-dependent covariates are present. This procedure improves efficiency over the generalized estimating equations approach with a working independence assumption by combining the independent estimating equations with additional information extracted from the estimating equations that are excluded by the independence assumption. The contribution from the remaining estimating equations is weighted according to the likelihood of each equation being a consistent estimating equation and the information it carries. We show that the estimators are asymptotically normally distributed and that the empirical likelihood ratio statistic and its profile counterpart follow central chi-square distributions asymptotically when evaluated at the true parameter. The practical performance of our approach is demonstrated through numerical simulations and data analysis.

6.

Analysis of rounded data from dependent sequences

Baoxue Zhang Tianqing Liu Z. D. Bai 《Annals of the Institute of Statistical Mathematics》2010,62(6):1143-1173

Observations on continuous populations are often rounded when recorded due to the precision of the recording mechanism. However, classical statistical approaches have ignored the effect caused by the rounding errors. When the observations are independent and identically distributed, the exact maximum likelihood estimation (MLE) can be employed. However, if rounded data are from a dependent structure, the MLE of the parameters is difficult to calculate since the integral involved in the likelihood equation is intractable. This paper presents and examines a new approach to parameter estimation, named “short, overlapping series” (SOS), to deal with α-mixing models in the presence of rounding errors. We establish the asymptotic properties of the SOS estimators when the innovations are normally distributed. Comparisons of this new approach with other existing techniques in the literature are also made by simulation with samples of moderate sizes.
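For the independent and identically distributed case mentioned above, the exact likelihood of a rounded observation is the probability that the underlying value falls in the rounding interval. The following is a minimal sketch assuming a normal population and rounding to the nearest multiple of `width`; it does not implement the SOS method, and the function names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def rounded_normal_loglik(rounded, mu, sigma, width=1.0):
    """Exact log-likelihood of i.i.d. N(mu, sigma^2) data after rounding.

    Each recorded value r stands for the interval
    [r - width/2, r + width/2], so its likelihood contribution is the
    normal probability of that interval rather than a density value.
    Values far in the tails can give an interval probability of 0.0 in
    floating point, in which case math.log raises ValueError.
    """
    half = width / 2.0
    total = 0.0
    for r in rounded:
        upper = normal_cdf((r + half - mu) / sigma)
        lower = normal_cdf((r - half - mu) / sigma)
        total += math.log(upper - lower)
    return total


data = [9, 10, 10, 11, 12]
near = rounded_normal_loglik(data, mu=10.4, sigma=1.0)
far = rounded_normal_loglik(data, mu=15.0, sigma=1.0)
```

Maximizing this function over `mu` and `sigma` with any numerical optimizer gives the exact MLE for independent rounded data. Under dependence the corresponding likelihood involves a joint integral over all rounding intervals, which is the intractable case the abstract describes.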

7.

Auxiliary Variables in Multiple Imputation When Data Are Missing Not at Random

Sarah Mustillo Soyoung Kwon 《The Journal of mathematical sociology》2013,37(2):73-91

Most current implementations of multiple imputation (MI) assume that data are missing at random (MAR), but this assumption is generally untestable. We performed analyses to test the effects of auxiliary variables on MI when the data are missing not at random (MNAR) using simulated data and real data. In the analyses we varied (a) the correlation, (b) the level of missing data, (c) the pattern of missing data, and (d) sample size. Results showed that MI performed adequately without auxiliary variables but they also had a modest impact on bias in the real data and improved efficiency in both data sets. The results of this study suggest that, counter to the concern about the violation of the MAR assumption, MI appears to be quite robust to missing data that are MNAR in analytic situations such as the ones presented here. Further, results can be made even better via the use of auxiliary variables, particularly when efficiency is a primary concern.

8.

Outlier detection and robust covariance estimation using mathematical programming

Tri-Dzung Nguyen Roy E. Welsch 《Advances in Data Analysis and Classification》2010,4(4):301-334

The outlier detection problem and the robust covariance estimation problem are often interchangeable. Without outliers, the classical method of maximum likelihood estimation (MLE) can be used to estimate parameters of a known distribution from observational data. When outliers are present, they dominate the log likelihood function causing the MLE estimators to be pulled toward them. Many robust statistical methods have been developed to detect outliers and to produce estimators that are robust against deviation from model assumptions. However, the existing methods suffer either from computational complexity when problem size increases or from giving up desirable properties, such as affine equivariance. An alternative approach is to design a special mathematical programming model to find the optimal weights for all the observations, such that at the optimal solution, outliers are given smaller weights and can be detected. This method produces a covariance estimator that has the following properties: First, it is affine equivariant. Second, it is computationally efficient even for large problem sizes. Third, it is easy to incorporate prior beliefs into the estimator by using semi-definite programming. The accuracy of this method is tested for different contamination models, including recently proposed ones. The method is not only faster than the Fast-MCD method for high dimensional data but also has reasonable accuracy for the tested cases.

9.

[Title illegible in source: encoding damage]

王历容 秦永松 罗志军 《应用概率统计》2014,30(1):40-56

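This paper considers two linear models with incomplete sample data, in which the covariates are fully observed and the responses are missing at random (MAR). We use inverse probability weighted imputation to fill in the missing responses, obtaining "complete" sample data for the two linear regression models, and on this basis construct a log empirical likelihood ratio statistic for the difference of quantiles of the response variables. Unlike previous results, we prove that under certain conditions the limiting distribution of this statistic is a standard chi-square distribution, which reduces the error caused by estimating the weight coefficients and yields more accurate empirical likelihood confidence intervals for the quantile difference.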
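The abstract does not give the imputation formula. The sketch below uses one common form of inverse probability weighted imputation from the missing-data literature, which is an assumption here and may differ from the authors' definition: an observed response y with observation probability p and regression fit m is replaced by y/p + (1 - 1/p) m, and a missing response is replaced by m. The function name and arguments are illustrative.

```python
def ipw_impute(y, observed, prob, fitted):
    """Build a 'complete' response vector by inverse probability weighting.

    y        : responses, with None where the response is missing
    observed : 1 if the response was observed, 0 otherwise
    prob     : estimated probability that each response is observed
    fitted   : regression predictions of the response from the covariates

    Assumed form (not taken from the paper):
        observed row -> y / prob + (1 - 1 / prob) * fitted
        missing row  -> fitted
    """
    completed = []
    for yi, di, pi, mi in zip(y, observed, prob, fitted):
        if not 0.0 < pi <= 1.0:
            raise ValueError("observation probabilities must lie in (0, 1]")
        if di:
            completed.append(yi / pi + (1.0 - 1.0 / pi) * mi)
        else:
            completed.append(mi)
    return completed


y = [2.0, None, 5.0]
observed = [1, 0, 1]
prob = [1.0, 0.5, 0.5]
fitted = [1.5, 3.0, 4.0]
completed = ipw_impute(y, observed, prob, fitted)
```

When the observation probability equals 1 the observed response is returned unchanged. Smaller probabilities inflate the observed residual y - m, which compensates on average for the rows whose responses are missing.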

10.

Maximum likelihood estimation for general hidden semi-Markov processes with backward recurrence time dependence

S. Trevezas N. Limnios 《Journal of Mathematical Sciences》2009,163(3):262-274

This paper concerns the study of asymptotic properties of the maximum likelihood estimator (MLE) for the general hidden semi-Markov model (HSMM) with backward recurrence time dependence. By transforming the general HSMM into a general hidden Markov model, we prove that under some regularity conditions, the MLE is strongly consistent and asymptotically normal. We also provide useful expressions for asymptotic covariance matrices, involving the MLE of the conditional sojourn times and the embedded Markov chain of the hidden semi-Markov chain. Bibliography: 17 titles.

11.

Statistical inference for right-censored data with nonignorable missing censoring indicators

SUN ZhiHua XIE TianFa LIANG Hua 《中国科学数学(英文版)》2013,56(6):1263-1278

We consider the statistical inference for right-censored data when censoring indicators are missing but nonignorable, and propose an adjusted imputation product-limit estimator. The proposed estimator is shown to be consistent and to converge to a Gaussian process. Furthermore, we develop an empirical process-based testing method to check the MAR (missing at random) mechanism, and establish asymptotic properties for the proposed test statistic. To determine the critical value of the test, a consistent model-based bootstrap method is suggested. We conduct simulation studies to evaluate the numerical performance of the proposed method and compare it with existing methods. We also analyze a real data set from a breast cancer study for an illustration.

12.

Asymptotic distributions in the testing and estimation of the missing-data multivariate normal linear patterned mean and correlation matrix

Ted H. Szatrowski 《Linear algebra and its applications》1985

Techniques used by Szatrowski (1979, 1983) to solve the testing and estimation problem for linear patterned covariance are used to obtain results for the linear patterned correlation problem in the presence of missing data. Iterative algorithms are given for finding the maximum-likelihood estimates (MLE). Asymptotic distributions of the MLE and likelihood-ratio statistics (LRS) are obtained using the delta method.

13.

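Parameter estimation for mixture von Mises models

陈家骅 李鹏飞 谭鲜明 《系统科学与数学》2007,27(1):59-67

Finite mixture von Mises models have important applications in many fields, including astronomy, biology, geography and medicine. However, the likelihood function of this model is unbounded no matter how large the sample size is, so the maximum likelihood estimator (MLE) of the parameters is inconsistent. We find that, as with normal mixture models, this difficulty can be overcome by introducing a penalty function on the concentration parameters of the distributions or by imposing suitable constraints on the parameter space. In this paper we prove theoretically that both approaches are feasible and that the resulting estimators are strongly consistent and asymptotically efficient. We also use computer simulations to study the finite-sample properties of the new methods and compare them with the existing moment estimator. The penalized maximum likelihood estimator performs best in terms of mean squared error. Finally, we analyze a real data set to further illustrate the new estimation methods.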
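The unboundedness of the mixture likelihood can be seen numerically. The sketch below is an illustration under assumed values, not the authors' procedure: it places the mean of one component exactly on a data point and lets that component's concentration grow, while a second, diffuse component keeps every other observation at positive density. The data, the equal weights and all function names are hypothetical, and the penalty function studied in the paper is not reproduced because the abstract does not state it.

```python
import math


def bessel_i0(x):
    """Modified Bessel function of the first kind, order 0, by power series."""
    q = (x / 2.0) ** 2
    term = 1.0
    total = 1.0
    k = 0
    while term > 1e-17 * total:
        k += 1
        term *= q / (k * k)   # term_k = (x/2)^(2k) / (k!)^2
        total += term
    return total


def von_mises_pdf(x, mu, kappa):
    """Von Mises density with mean direction mu and concentration kappa."""
    return math.exp(kappa * math.cos(x - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def mixture_loglik(data, mu1, kappa1, mu2, kappa2, weight=0.5):
    """Log-likelihood of a two-component von Mises mixture."""
    return sum(
        math.log(weight * von_mises_pdf(x, mu1, kappa1)
                 + (1.0 - weight) * von_mises_pdf(x, mu2, kappa2))
        for x in data
    )


# Hypothetical angles in radians; component 1 is centred on the first one.
data = [0.1, 1.0, 2.5, 4.0, 5.5]
logliks = [mixture_loglik(data, data[0], kappa1, 3.0, 1.0)
           for kappa1 in (50.0, 200.0, 600.0)]
```

The log-likelihood keeps increasing as `kappa1` grows, because the density of the first component at its own centre grows roughly like the square root of the concentration. This series-based `bessel_i0` overflows for concentrations much above 700, so a log-scale implementation would be needed to push the demonstration further.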

14.

Estimation of a Multivariate Normal Covariance Matrix with Staircase Pattern Data

Xiaoqian Sun Dongchu Sun 《Annals of the Institute of Statistical Mathematics》2007,59(2):211-233

In this paper, we study the problem of estimating a multivariate normal covariance matrix with staircase pattern data. Two kinds of parameterizations in terms of the covariance matrix are used. One is the Cholesky decomposition and the other is the Bartlett decomposition. Based on the Cholesky decomposition of the covariance matrix, the closed form of the maximum likelihood estimator (MLE) of the covariance matrix is given. Using a Bayesian method, we prove that the best equivariant estimator of the covariance matrix with respect to the special group related to the Cholesky decomposition uniquely exists under the Stein loss. Consequently, the MLE of the covariance matrix is inadmissible under the Stein loss. Our method can also be applied to other invariant loss functions like the entropy loss and the symmetric loss. In addition, based on the Bartlett decomposition of the covariance matrix, the Jeffreys prior and the reference prior of the covariance matrix with staircase pattern data are also obtained. Our reference prior is different from Berger and Yang’s reference prior. Interestingly, the Jeffreys prior with staircase pattern data is the same as that with complete data. The posterior properties are also investigated. Some simulation results are given for illustration.

15.

Nonparametric inference on the difference of location parameters of correlated variables from fragmentary samples

K. F. Cheng 《Annals of the Institute of Statistical Mathematics》1987,39(1):331-347

In this paper, two types of robust estimators and approximate confidence intervals for the difference of location parameters of correlated random variables are proposed and investigated when some observations are missing. It is shown that the suggested estimators are consistent and asymptotically normally distributed. In addition, the proposed approximate confidence intervals are also shown to enjoy some nice asymptotic properties.

16.

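Sufficiency and estimation in nested repeated measures models

艾摩尔《应用概率统计》2004,20(2):133-146

The nested repeated measures model is used when every individual has the same sub-individuals and the treatment levels of each sub-individual are paired. Let $Y_i = (Y_{i111}, \ldots, Y_{imrc})'$ be the observation vector of the $i$-th individual. Assume that the $Y_i$ are independent and normally distributed with mean $\mu_i$ and covariance matrix $\Sigma > 0$. The assumptions can be simplified so that all measurements have variance $\sigma^2$ and pairs of measurements on different sub-individuals of the same individual are related as follows: (1) observations in different columns and different rows; (2) measurements in the same column and different rows; (3) measurements in the same row and different columns, with covariances $\rho_2\sigma^2$, $\rho_3\sigma^2$ and $\rho_4\sigma^2$, respectively. Taking the experiment as given, we use a coordinate-free approach to study the complete sufficient statistics, the minimum variance unbiased estimators (MVUE) and the maximum likelihood estimators (MLE) for the nested repeated measures model.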

17.

MM Algorithms for Variance Components Models

Hua Zhou Liuyi Hu Jin Zhou Kenneth Lange 《Journal of computational and graphical statistics》2019,28(2):350-361

Variance components estimation and mixed model analysis are central themes in statistics with applications in numerous scientific disciplines. Despite the best efforts of generations of statisticians and numerical analysts, maximum likelihood estimation (MLE) and restricted MLE of variance component models remain numerically challenging. Building on the minorization–maximization (MM) principle, this article presents a novel iterative algorithm for variance components estimation. Our MM algorithm is trivial to implement and competitive on large data problems. The algorithm readily extends to more complicated problems such as linear mixed models, multivariate response models possibly with missing data, maximum a posteriori estimation, and penalized estimation. We establish the global convergence of the MM algorithm to a Karush–Kuhn–Tucker point and demonstrate, both numerically and theoretically, that it converges faster than the classical EM algorithm when the number of variance components is greater than two and all covariance matrices are positive definite. Supplementary materials for this article are available online.

18.

Effects of missing data in credit risk scoring. A comparative analysis of methods to achieve robustness in the absence of sufficient data

R Florez-Lopez 《The Journal of the Operational Research Society》2010,61(3):486-501

The 2004 Basel II Accord has pointed out the benefits of credit risk management through internal models using internal data to estimate risk components: probability of default (PD), loss given default, exposure at default and maturity. Internal data are the primary data source for PD estimates; banks are permitted to use statistical default prediction models to estimate the borrowers’ PD, subject to some requirements concerning accuracy, completeness and appropriateness of data. However, in practice, internal records are usually incomplete or do not contain adequate history to estimate the PD. Current missing data are critical with regard to low default portfolios, characterised by inadequate default records, making it difficult to design statistically significant prediction models. Several methods might be used to deal with missing data such as list-wise deletion, application-specific list-wise deletion, substitution techniques or imputation models (simple and multiple variants). List-wise deletion is an easy-to-use method widely applied by social scientists, but it loses substantial data and reduces the diversity of information, resulting in a bias in the model's parameters, results and inferences. The choice of the best method to solve the missing data problem largely depends on the nature of missing values (MCAR, MAR and MNAR processes), but there is a lack of empirical analysis about their effect on credit risk that limits the validity of resulting models. In this paper, we analyse the nature and effects of missing data in credit risk modelling (MCAR, MAR and MNAR processes) and take into account a current scarce data set on consumer borrowers, which includes different percentages and distributions of missing data. The findings are used to analyse the performance of several methods for dealing with missing data such as list-wise deletion, simple imputation methods, MLE models and advanced multiple imputation (MI) alternatives based on Markov chain Monte Carlo and re-sampling methods. Results are evaluated and discussed between models in terms of robustness, accuracy and complexity. In particular, MI models are found to provide very valuable solutions with regard to credit risk missing data.

19.

Maximum-Likelihood Asymptotic Inference for Autoregressive Hilbertian Processes

M. D. Ruiz-Medina R. M. Espejo 《Methodology and Computing in Applied Probability》2015,17(1):207-222

The autoregressive Hilbertian process framework was introduced in Bosq (2000). That book provides the nonparametric estimation of the autocorrelation and covariance operators of the autoregressive Hilbertian processes. The asymptotic properties of these estimators are also provided. The maximum likelihood approach still remains unexplored. This paper obtains the asymptotic distribution of the maximum likelihood (ML) estimators of the auto-covariance operator of the Hilbert-valued innovation process, and of the autocorrelation operator of a Gaussian autoregressive Hilbertian process of order one. A real data example is analyzed in the financial context for illustration of the performance of the projection maximum likelihood estimation methodology in the context of missing data.

20.

Kullback-Leibler Information Consistent Estimation for Censored Data

Akio Suzukawa Hideyuki Imai Yoshiharu Sato 《Annals of the Institute of Statistical Mathematics》2001,53(2):262-276

This paper is intended as an investigation of parametric estimation for randomly right censored data. In parametric estimation, the Kullback-Leibler information is used as a measure of the divergence of the true distribution generating the data relative to a distribution in an assumed parametric model M. When the data are uncensored, the maximum likelihood estimator (MLE) is a consistent estimator of the minimizer of the Kullback-Leibler information, even if the assumed model M does not contain the true distribution. We call this property minimum Kullback-Leibler information consistency (MKLI-consistency). However, the MLE obtained by maximizing the likelihood function based on the censored data is not MKLI-consistent. As an alternative to the MLE, Oakes (1986, Biometrics, 42, 177–182) proposed an estimator termed the approximate maximum likelihood estimator (AMLE) due to its computational advantage and potential for robustness. We show MKLI-consistency and asymptotic normality of the AMLE under misspecification of the parametric model. In a simulation study, we investigate the mean square errors of these two estimators and of an estimator obtained by treating a jackknife corrected Kaplan-Meier integral as the log-likelihood. On the basis of the simulation results and the asymptotic results, we compare these estimators. We also derive information criteria for the MLE and the AMLE under censorship, which can be used not only for selecting models but also for selecting estimation procedures.