1.

Normal/Independent Distributions and Their Applications in Robust Regression

Kenneth Lange Janet S. Sinsheimer 《Journal of computational and graphical statistics》2013,22(2):175-198

Abstract

Maximum likelihood estimation with nonnormal error distributions provides one method of robust regression. Certain families of normal/independent distributions are particularly attractive for adaptive, robust regression. This article reviews the properties of normal/independent distributions and presents several new results. A major virtue of these distributions is that they lend themselves to EM algorithms for maximum likelihood estimation. EM algorithms are discussed for least L_p regression and for adaptive, robust regression based on the t, slash, and contaminated normal families. Four concrete examples illustrate the performance of the different methods on real data.
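As a rough illustration of the EM idea this abstract mentions for the t family, the sketch below fits a linear regression with Student t errors by alternating a weight update (E-step) with a weighted least-squares update (M-step). It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `t_regression_em`, the fixed degrees of freedom `nu`, and the stopping rule are all chosen for this example, and an adaptive version would also estimate the degrees of freedom, which is omitted here.

```python
import numpy as np

def t_regression_em(X, y, nu=4.0, n_iter=200, tol=1e-10):
    """EM for y = X @ beta + e with Student t errors (nu held fixed).

    Hypothetical helper for illustration; returns (beta, sigma2, weights).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]

    # Start from ordinary least squares.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = max(float(resid @ resid) / n, 1e-12)
    w = np.ones(n)

    for _ in range(n_iter):
        # E-step: expected precision weights; large residuals get small weights.
        w = (nu + 1.0) / (nu + resid**2 / sigma2)

        # M-step: weighted least squares for beta, then the scale update.
        Xw = X * w[:, None]
        beta_new = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        resid = y - X @ beta_new
        sigma2 = max(float(w @ resid**2) / n, 1e-12)

        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break

    return beta, sigma2, w
```

The returned weights double as a rough outlier diagnostic: observations with weights far below the rest are the ones the fit has effectively discounted.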

2.

Robust statistical modeling using the Birnbaum‐Saunders‐t distribution applied to insurance

Gilberto A. Paula Víctor Leiva Michelli Barros Shuangzhe Liu 《Applied Stochastic Models in Business and Industry》2012,28(1):16-34

In this paper, we carry out robust modeling and influence diagnostics in Birnbaum‐Saunders (BS) regression models. Specifically, we present some aspects related to BS and log‐BS distributions and their generalizations from the Student‐t distribution, and develop BS‐t regression models, including maximum likelihood estimation based on the EM algorithm and diagnostic tools. In addition, we apply the obtained results to real data from insurance, which shows the uses of the proposed model. Copyright © 2011 John Wiley & Sons, Ltd.

3.

Robust and Accurate Inference via a Mixture of Gaussian and Student’s t Errors

Hyungsuk Tak Justin A. Ellis Sujit K. Ghosh 《Journal of computational and graphical statistics》2019,28(2):415-426

A Gaussian measurement error assumption, that is, an assumption that the data are observed up to Gaussian noise, can bias any parameter estimation in the presence of outliers. A heavy tailed error assumption based on Student’s t distribution helps reduce the bias. However, it may be less efficient in estimating parameters if the heavy tailed assumption is uniformly applied to all of the data when most of them are normally observed. We propose a mixture error assumption that selectively converts Gaussian errors into Student’s t errors according to latent outlier indicators, leveraging the best of the Gaussian and Student’s t errors; a parameter estimation can be not only robust but also accurate. Using simulated hospital profiling data and astronomical time series of brightness data, we demonstrate the potential for the proposed mixture error assumption to estimate parameters accurately in the presence of outliers. Supplemental materials for this article are available online.

4.

Accurate confidence intervals in regression analyses of non-normal data

Robert J. Boik 《Annals of the Institute of Statistical Mathematics》2008,60(1):61-83

A linear model in which random errors are distributed independently and identically according to an arbitrary continuous distribution is assumed. Second- and third-order accurate confidence intervals for regression parameters are constructed from Charlier differential series expansions of approximately pivotal quantities around Student’s t distribution. Simulation verifies that small sample performance of the intervals surpasses that of conventional asymptotic intervals and equals or surpasses that of bootstrap percentile-t and bootstrap percentile-|t| intervals under mild to marked departure from normality.

5.

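Statistical diagnostics for mode regression models with skew-normal data

曹幸运 曾鑫 吴刘仓 《Applied Mathematics: A Journal of Chinese Universities (Series A)》2021,36(1):9-20

To better fit skewed data and fully extract the information they contain, a mode regression model is built for skew-normal data, and statistical diagnostics for this model are studied using the Pena distance statistic. The Pena distance expression for the mode regression model and a method for diagnosing high-leverage outliers are obtained. Maximum likelihood estimates of the model parameters are computed with the EM algorithm combined with gradient descent. The likelihood distance, Cook's distance, and Pena distance statistics are calculated under the case-deletion model and displayed in diagnostic plots. Monte Carlo simulations and a real-data comparison show that the proposed method is effective and that, under certain conditions, the Pena distance diagnoses outliers or influential points better than the likelihood distance and Cook's distance.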

6.

Bayesian Multivariate Distributional Regression With Skewed Responses and Skewed Random Effects

Patrick Michaelis Nadja Klein Thomas Kneib 《Journal of computational and graphical statistics》2018,27(3):602-611

The normal and the t distribution are classical tools for building random effects regression models where both can be used for the specification of either the conditional response distribution or the random effects distribution. However, the underlying assumption of symmetry can be questionable in many applications. We, therefore, propose regression models where the skew-normal and skew-t distribution are considered for both the response and the random effects specification and embed these models in the framework of distributional regression such that regression predictors can be specified for all distributional parameters. The distributional regression framework also allows us to consider multivariate versions of the skew-normal and the skew-t distribution. For Bayesian inference, we adapt iteratively weighted least-square proposals within Markov chain Monte Carlo simulations such that they can also facilitate the inclusion of nonnormal random effects specifications. Model choice is based on the Watanabe–Akaike information criterion, in particular, to differentiate between skew and nonskew distributional specifications in a number of simulation studies. Finally, to illustrate their practical applicability, the developed models are applied to a study on cholesterol levels originating from the Framingham Heart Study and a dataset from the Demographic and Health Surveys on undernutrition among children in Nigeria. Supplementary material for this article is available online.

7.

A Flexible Model for Generalized Linear Regression with Measurement Error

Surupa Roy Tathagata Banerjee 《Annals of the Institute of Statistical Mathematics》2006,58(1):153-169

This paper focuses on the question of specification of measurement error distribution and the distribution of true predictors in generalized linear models when the predictors are subject to measurement errors. The standard measurement error model typically assumes that the measurement error distribution and the distribution of covariates unobservable in the main study are normal. To make the model flexible enough, we instead assume that the measurement error distribution is multivariate t and the distribution of true covariates is a finite mixture of normal densities. A likelihood-based method is developed to estimate the regression parameters. However, direct maximization of the marginal likelihood is numerically difficult, so we apply the EM algorithm as an alternative. This makes the computation of likelihood estimates feasible. The performance of the proposed model is investigated by a simulation study.

8.

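Statistical diagnostics for functional linear models with Gaussian process errors

陈刚 黄超 林金官 《Communication on Applied Mathematics and Computation》2014,(1):117-126

Several diagnostic methods are proposed for functional regression models with Gaussian process errors. First, an estimator of the regression coefficient function is derived using a spline basis. Next, the equivalence of the case-deletion model and the mean-shift model is proved. Then three diagnostic tools, namely residual analysis, Cook's distance, and likelihood distance, are studied for detecting outliers and influential observations. Finally, a simulated example and a real-data example illustrate the effectiveness of the methods.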

9.

Functional relationships having many independent variables and errors with multivariate normal distribution

G. R. Dolby T. G. Freeman 《Journal of multivariate analysis》1975,5(4):466-479

This paper deals with maximum likelihood estimation of linear or nonlinear functional relationships assuming that replicated observations have been made on p variables at n points. The joint distribution of the pn errors is assumed to be multivariate normal. Existing results are extended in two ways: first, from known to unknown error covariance matrix; second, from the two variate to the multivariate case. For the linear relationship it is shown that the maximum likelihood point estimates are those obtained by the method of generalized least squares. The present method, however, has the advantage of supplying estimates of the asymptotic covariances of the structural parameter estimates.

10.

OLS parameter estimation for a linear paired confluent model

A. G. Belov 《Computational Mathematics and Modeling》2009,20(4):383-396

We investigate OLS parameter estimation for a linear paired model in the case of a passive experiment with errors in both variables. The explicit form of the OLS estimates is obtained, their equivalence to maximum likelihood estimates is demonstrated in the presence of normal errors, and estimate consistency is proved. The OLS estimates are compared analytically and numerically with known parameter estimates of “direct,” “orthogonal,” and “diagonal” regression models.

11.

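Robust estimation of variance components in partially linear mixed-effects models

秦国友 朱仲义 《Chinese Journal of Applied Probability and Statistics》2007,23(2):207-214

Variance components are parameters of interest in partially linear mixed-effects models, and many estimation methods have been proposed in the literature. Many of these methods, such as maximum likelihood estimation (MLE) and restricted maximum likelihood estimation (REMLE), can be cast as generalized estimating equation (GEE) methods, and GEE methods are very sensitive to outliers. This paper proposes a set of robust estimating equations for the mean and the variance components in the partially linear mixed-effects model (PLMM), so that both are estimated robustly at the same time. A simulation study examines the effectiveness of the proposed robust estimators, and two real-data examples illustrate the feasibility of the method.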

12.

Clusters, outliers, and regression: fixed point clusters

Christian Hennig 《Journal of multivariate analysis》2003,86(1):183-212

Fixed point clustering is a new stochastic approach to cluster analysis. The definition of a single fixed point cluster (FPC) is based on a simple parametric model, but there is no parametric assumption for the whole dataset as opposed to mixture modeling and other approaches. An FPC is defined as a data subset that is exactly the set of non-outliers with respect to its own parameter estimators. This paper concentrates upon the theoretical foundation of FPC analysis as a method for clusterwise linear regression, i.e., the single clusters are modeled as linear regressions with normal errors. In this setup, fixed point clustering is based on an iteratively reweighted estimation with zero weight for all outliers. FPCs are non-hierarchical, but they may overlap and include each other. A specification of the number of clusters is not needed. Consistency results are given for certain mixture models of interest in cluster analysis. Convergence of a fixed point algorithm is shown. Application to a real dataset shows that fixed point clustering can highlight some other interesting features of datasets compared to maximum likelihood methods in the presence of deviations from the usual assumptions of model based cluster analysis.

13.

On the mean square error of maximum likelihood estimates of the distribution density of sufficient statistics of the multivariate normal distribution

R. A. Abusev 《Journal of Mathematical Sciences》1995,75(1):1378-1382

The extensive use of maximum likelihood estimates underscores the importance of the problem of statistical estimation of their errors. These estimates are of utmost importance in cases where the family of normal distributions and the families related to the normal distributions are considered [1, 2, 4]. The mean square errors of the maximum likelihood estimates of the normal density were investigated in the author's paper [3]. The mean square errors of statistical estimates of some families of densities related to the normal distributions were considered in the papers [4–6]. In the present paper, we obtain an asymptotic expansion of the mean square error of the maximum likelihood estimates of the densities of the joint distribution of sufficient statistics of the family of multivariate normal distributions. The results obtained allow us to construct the mean square errors of the maximum likelihood estimates for the chi-square density and Wishart's density. Translated from Statisticheskie Metody Otsenivaniya i Proverki Gipotez, pp. 4–11, Perm. 1990.

14.

A note on the simple structural regression model

R. B. Arellano-Valle H. Bolfarine 《Annals of the Institute of Statistical Mathematics》1996,48(1):111-125

In this paper we investigate aspects such as estimation and hypothesis testing in the simple structural regression model with measurement errors. Use is made of orthogonal parametrizations obtained in the literature. Emphasis is placed on some properties of the maximum likelihood estimators and also on the distribution of the likelihood ratio statistics.

15.

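Quantile regression for growth curve models

张雨 刘倩 曾林蕊 《Chinese Journal of Applied Probability and Statistics》2014,30(3):296-302

Growth curve models are widely applied and play an important role in research in economics, biology, medicine, and other fields. In the existing literature, the parameter matrix of the growth curve model is estimated mostly by least squares or maximum likelihood. Least squares estimates are not efficient when the errors follow a skewed or heavy-tailed distribution or when outliers are present; maximum likelihood requires the error distribution to be known, which is rarely the case in practice. Quantile regression remedies these shortcomings and yields estimates with good robustness. This paper uses quantile regression to estimate the parameter matrix of the growth curve model and establishes the asymptotic normality of the estimator.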
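The robustness claim in this abstract can be seen in the simplest possible setting. The sketch below is an illustration only and does not reproduce the paper's growth curve estimator: it evaluates the check (pinball) loss that quantile regression minimizes and fits a single constant to a small sample containing one outlier. The helper names `check_loss` and `best_constant` and the toy data are assumptions made for this example.

```python
import numpy as np

def check_loss(residuals, tau):
    """Check (pinball) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(r * (tau - (r < 0))))

def best_constant(y, tau):
    """Return the data value c that minimizes the check loss of y - c.

    The loss is piecewise linear with kinks at the data points, so a
    minimizer can always be found among the observations themselves.
    """
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - c, tau) for c in y]
    return float(y[int(np.argmin(losses))])

# Toy sample with one gross outlier (hypothetical data).
y = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(y, 0.5)   # tau = 0.5 gives median regression
mean_fit = float(np.mean(y))         # least squares fit of a constant
```

With `tau = 0.5` the fitted constant stays at the sample median, while the least squares fit (the mean) is pulled toward the outlier.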

16.

Selection of smoothing parameters in B-spline nonparametric regression models using information criteria

Seiya Imoto Sadanori Konishi 《Annals of the Institute of Statistical Mathematics》2003,55(4):671-687

We consider the use of B-spline nonparametric regression models estimated by the maximum penalized likelihood method for extracting information from data with complex nonlinear structure. Crucial points in B-spline smoothing are the choices of a smoothing parameter and the number of basis functions, for which several selectors have been proposed based on cross-validation and the Akaike information criterion (AIC). It should be noted, however, that AIC is a criterion for evaluating models estimated by the maximum likelihood method, and it was derived under the assumption that the true distribution belongs to the specified parametric model. In this paper we derive information criteria for evaluating B-spline nonparametric regression models estimated by the maximum penalized likelihood method in the context of generalized linear models under model misspecification. We use Monte Carlo experiments and real data examples to examine the properties of our criteria including various selectors proposed previously.

17.

Fast Implementation for Normal Mixed Effects Models With Censored Response

《Journal of computational and graphical statistics》2013,22(4):797-817

We propose an EM algorithm for computing the maximum likelihood and restricted maximum likelihood for linear and nonlinear mixed effects models with censored response. In contrast with previous developments, this algorithm uses closed-form expressions at the E-step, as opposed to Monte Carlo simulation. These expressions rely on formulas for the mean and variance of a truncated multinormal distribution, and can be computed using available software. This leads to an improvement in the speed of computation of up to an order of magnitude. A wide class of mixed effects models is considered, including the Laird–Ware model, and extensions to different structures for the variance components, heteroscedastic and autocorrelated errors, and multilevel models. We apply the methodology to two case studies from our own biostatistical practice, involving the analysis of longitudinal HIV viral load in two recent AIDS studies.

The proposed algorithm is implemented in the R package lmec. An appendix with further mathematical details, the R code, and datasets for the examples and simulations are available as online supplements.

18.

ML Estimation of the Multivariate t Distribution and the EM Algorithm

Chuanhai Liu 《Journal of multivariate analysis》1997,63(2):296-312

Maximum likelihood estimation of the multivariate t distribution, especially with unknown degrees of freedom, has been an interesting topic in the development of the EM algorithm. After a brief review of the EM algorithm and its application to finding the maximum likelihood estimates of the parameters of the t distribution, this paper provides new versions of the ECME algorithm for maximum likelihood estimation of the multivariate t distribution from data with possibly missing values. The results show that the new versions of the ECME algorithm converge faster than the previous procedures. Most important, the idea of this new implementation is quite general and useful for the development of the EM algorithm. Comparisons of different methods based on two datasets are presented.

19.

Data analysis for accelerated life tests via Weibull-gamma frailty regression models

Xiao-Dong Zhou Yun-Juan Wang Lin Wu Rong-Xian Yue 《Applied Stochastic Models in Business and Industry》2023,39(4):567-583

In this article, we study data analysis methods for accelerated life test (ALT) with blocking. Unlike the previous assumption of normal distribution for random block effects, we advocate the use of Weibull regression model with gamma random effects for making statistical inference of ALT data. To estimate the unknown parameters in the proposed model, maximum likelihood estimation and Bayesian estimation methods are provided. We illustrate the proposed methods using real data examples and simulation examples. Numerical results suggest that distribution of random effects has minimal impact on the estimation of fixed effects in the Weibull regression models. Furthermore, to demonstrate the advantage of our proposed model, we also provide methods to compare ALT plans and thus identify the optimal ALT plans.

20.

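Fuzzy linear regression analysis based on multi-objective programming

黄华 宋艳萍 苗新艳 肉孜阿吉 《Fuzzy Systems and Mathematics》2012,26(3):114-119

This paper discusses fuzzy linear regression analysis in which both the inputs and the outputs are fuzzy numbers and the regression coefficients are real numbers. Fuzzy least squares linear regression is easily affected by outliers, whereas the least absolute deviations method can effectively reduce the error of the regression model. Based on least absolute deviations, a multi-objective programming model is therefore built and converted into a nonlinear programming problem for solution, which yields the parameter estimates of the fuzzy linear regression model. Finally, a numerical example is used to verify the method and to compare its reasonableness and advantages.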