Similar literature
Found 18 similar articles; search took 119 ms.
1.
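When the error term of a linear regression model is not normally distributed, or when several outliers are present, certain functions of the residual ranks can be introduced into the estimation as weights to reduce the adverse influence of outliers. This paper introduces a class of robust regression methods based on residual ranks, covering parameter estimation, robustness properties, and regression diagnostics. Simulation studies and a real-data analysis show that the R and GR estimators are robust regression methods with relatively high estimation efficiency; the GR estimator can guard against outliers in both the X and Y spaces, while the high-breakdown HBR estimator can trade off robustness against estimation efficiency by tuning a parameter.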
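As a rough illustration of rank-based (R) estimation, the sketch below fits the slope of a simple regression by minimizing Jaeckel's dispersion with Wilcoxon scores. The grid search, the function names, and the toy data are assumptions made for this example; they are not taken from the paper, and the GR and HBR variants are not shown.

```python
import math


def wilcoxon_scores(n):
    """Wilcoxon scores a(i) = sqrt(12) * (i / (n + 1) - 1/2); they sum to zero."""
    return [math.sqrt(12.0) * (i / (n + 1) - 0.5) for i in range(1, n + 1)]


def jaeckel_dispersion(slope, x, y):
    """Sum of score(rank of residual) * residual for a candidate slope.

    The intercept is omitted because the scores sum to zero, so the
    dispersion does not depend on it.
    """
    residuals = sorted(yi - slope * xi for xi, yi in zip(x, y))
    return sum(a * e for a, e in zip(wilcoxon_scores(len(residuals)), residuals))


def rank_slope(x, y, lo=-5.0, hi=10.0, step=0.01):
    """Crude grid search for the slope that minimizes the dispersion."""
    n_steps = int(round((hi - lo) / step))
    grid = [lo + k * step for k in range(n_steps + 1)]
    return min(grid, key=lambda b: jaeckel_dispersion(b, x, y))


def ols_slope(x, y):
    """Ordinary least squares slope, for comparison."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx


# Hypothetical data: y = 2x + 1 with two gross outliers in the response.
x = [float(v) for v in range(10)]
y = [2.0 * xi + 1.0 for xi in x]
y[4], y[5] = 40.0, -20.0

r_slope = rank_slope(x, y)
ls_slope = ols_slope(x, y)
```

On this toy data the rank-based slope stays near 2 while the least squares slope is pulled away from it. A grid search is used only for transparency; a practical implementation would use a proper optimizer.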

2.
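An enterprise uses its assets in production and business activities and thereby earns more assets, that is, generates the company's revenue. Corporate assets and revenue must therefore be correlated to some degree. Building on an ordinary linear fit of total assets against operating revenue for listed companies, the paper uses a quantile regression model to analyze the relationship in depth. The results show that the traditional linear model reveals only a positive correlation between total assets and operating revenue, whereas quantile regression shows more clearly that, for companies at high quantiles of operating revenue, a given increase in total assets promotes revenue growth more strongly. Because the collected data contain outliers, Section 5 discusses statistical diagnostics for the linear quantile regression model and, by analogy with the R square of the ordinary linear model, obtains an R square at each quantile. After the outliers are deleted, the quantile regression model proves more robust than the ordinary linear model, and the data are fitted somewhat better at the high quantiles.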
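Quantile regression replaces the squared error with the check (pinball) loss. The minimal sketch below defines that loss and shows that minimizing it over a constant recovers a sample quantile, here the median, which is barely moved by one extreme value. The function names and the toy numbers are illustrative assumptions, not data from the paper.

```python
def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0]) for a residual u."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def constant_quantile_fit(values, tau):
    """Constant c minimizing the total check loss.

    A minimizer always exists among the observed values, so only those
    are searched.
    """
    return min(values, key=lambda c: sum(check_loss(v - c, tau) for v in values))


# Hypothetical revenue figures with one extreme observation.
revenues = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = constant_quantile_fit(revenues, 0.5)
mean_fit = sum(revenues) / len(revenues)
```

Here `median_fit` is 3.0 while `mean_fit` is 22.0, which illustrates why a quantile fit is less sensitive to outliers than a least squares fit.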

3.
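For 688 customer records from China's six major banks, models were built using two methods (a hierarchical regression model and a classical regression model) in two software packages (MLwiN 2.10 Beta and SPSS 15.0). The results show that parameter estimates from the classical regression model are not seriously biased, and there is insufficient evidence that the classical regression model makes non-significant variables appear significant by underestimating standard errors. The conclusion is that, whether the data are collected by stratified sampling or by random sampling, modelers can start with a simple model to gain an initial understanding of the data.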

4.
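To fit skewed data better and extract their information fully, a mode regression model is built for skew-normal data, and statistical diagnostics for it are studied based on the Pena distance statistic, yielding the Pena distance expression for the mode regression model and a method for diagnosing high-leverage outliers. Maximum likelihood estimates of the model parameters are obtained using the EM algorithm and gradient descent; the likelihood distance, Cook's distance, and Pena distance statistics are computed from the case-deletion model, and diagnostic plots are drawn. Monte Carlo simulation and a comparison on real data show that the proposed method is effective and that, under certain conditions, the Pena distance diagnoses outliers or influential points better than the likelihood distance and Cook's distance.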

5.
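For the fuzzy linear regression model with crisp inputs and fuzzy outputs, the paper uses least squares to discuss parameter estimation in the case-deletion model, extends Cook's distance, a diagnostic statistic for linear regression models built on crisp data, to the fuzzy linear regression model, and constructs a diagnostic statistic called the fuzzy Cook's distance. Numerical simulation and a study of a real example identify the influential points and reach the same conclusions as other methods, showing that the constructed statistic is effective and more convenient to apply than other methods.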
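For orientation, the sketch below computes the classical (crisp-data) Cook's distance for a simple linear regression, both from the closed form and from the case-deletion definition. It does not implement the fuzzy extension described in the abstract; the function names and the data are assumptions for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - slope * xbar, slope


def cooks_distance(x, y):
    """Closed form: D_i = e_i^2 / (p * s^2) * h_i / (1 - h_i)^2, with p = 2."""
    n, p = len(x), 2
    intercept, slope = fit_line(x, y)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - p)
    distances = []
    for xi, e in zip(x, residuals):
        leverage = 1.0 / n + (xi - xbar) ** 2 / sxx
        distances.append(e * e / (p * s2) * leverage / (1.0 - leverage) ** 2)
    return distances


def cooks_distance_by_deletion(x, y):
    """Definition: squared shift in all fitted values when case i is deleted."""
    n, p = len(x), 2
    intercept, slope = fit_line(x, y)
    s2 = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)) / (n - p)
    distances = []
    for i in range(n):
        a_i, b_i = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        shift = sum((intercept + slope * xj - a_i - b_i * xj) ** 2 for xj in x)
        distances.append(shift / (p * s2))
    return distances


# Hypothetical data: the last point has high leverage and lies off the trend.
x_obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 12.0]
y_obs = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.8, 4.0]
closed_form = cooks_distance(x_obs, y_obs)
by_deletion = cooks_distance_by_deletion(x_obs, y_obs)
most_influential = closed_form.index(max(closed_form))
```

The two computations agree up to rounding, which is why the closed form is usually preferred: it needs only one fit instead of one refit per observation.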

6.
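Influence analysis of the logistic regression model   Total citations: 2, self-citations: 0, citations by others: 2
Influence analysis is an important part of logistic regression diagnostics. The commonly used methods all make step-by-step judgments after deleting data points in turn, and this judgment process is reflected mainly in the model's diagnostic plots. Constructing diagnostic statistics so as to develop effective diagnostic plots has therefore become the core of influence analysis, allowing the model's influential points to be located fairly accurately. By decomposing the hat matrix of the logistic regression model and weighting the relative changes in the coefficient estimates after deleting data points in turn, this paper obtains diagnostic plots that identify influential points more accurately than traditional diagnostic plots.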

7.
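Monitoring data analysis method based on least trimmed squares estimation   Total citations: 1, self-citations: 0, citations by others: 1
Outliers are unavoidable in hydraulic engineering safety monitoring data, and the most widely used method, least squares (LS), cannot reject outliers; instead it readily absorbs them, so that the regression curve deviates severely from reality. To address this defect of LS, the paper builds on the theory of minimizing the residual sum of squares and proposes using least trimmed squares (LTS) estimation to construct hydraulic engineering safety monitoring models. Monitoring data from an actual project are analyzed and processed, and outliers are removed to obtain the optimal data subset. The regression coefficients are then solved for on this subset, giving the fitted curve closest to the actual data. Compared with LS estimation, LTS results are more reasonable and robust, and significantly improve prediction accuracy. LTS estimation therefore has good application prospects in analyzing hydraulic engineering safety monitoring data and similar data.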
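The idea of LTS, minimizing the sum of the h smallest squared residuals, can be sketched for a straight-line fit as follows. This is a small illustration that assumes elemental two-point starts followed by concentration steps; it is not the procedure used in the paper, and the data, the choice of h, and the function names are hypothetical.

```python
from itertools import combinations


def ols_line(points):
    """Ordinary least squares line through (x, y) points with distinct x values."""
    n = len(points)
    xbar = sum(p[0] for p in points) / n
    ybar = sum(p[1] for p in points) / n
    sxx = sum((p[0] - xbar) ** 2 for p in points)
    slope = sum((p[0] - xbar) * (p[1] - ybar) for p in points) / sxx
    return ybar - slope * xbar, slope


def trimmed_objective(points, intercept, slope, h):
    """Sum of the h smallest squared residuals."""
    squared = sorted((y - intercept - slope * x) ** 2 for x, y in points)
    return sum(squared[:h])


def lts_line(points, h, c_steps=10):
    """Approximate LTS fit: start from every pair of points, then refit
    repeatedly on the h points with the smallest squared residuals."""
    best = None
    for p, q in combinations(points, 2):
        if p[0] == q[0]:
            continue  # a vertical pair does not define a line
        slope = (q[1] - p[1]) / (q[0] - p[0])
        intercept = p[1] - slope * p[0]
        for _ in range(c_steps):
            ranked = sorted(points, key=lambda pt: (pt[1] - intercept - slope * pt[0]) ** 2)
            intercept, slope = ols_line(ranked[:h])
        objective = trimmed_objective(points, intercept, slope, h)
        if best is None or objective < best[0]:
            best = (objective, intercept, slope)
    return best[1], best[2]


# Hypothetical monitoring series: y = 2x + 1 plus small noise, last three readings corrupted.
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.0]
data = [(float(i), 2.0 * i + 1.0 + noise[i]) for i in range(7)]
data += [(7.0, 0.0), (8.0, -1.0), (9.0, 30.0)]

h = (len(data) + 2 + 1) // 2  # roughly half the data plus a margin
lts_intercept, lts_slope = lts_line(data, h)
ls_intercept, ls_slope = ols_line(data)
```

On this toy series the LTS slope stays close to 2 while the LS slope is dragged toward the corrupted readings. Trying every pair of points is feasible only for small samples; larger problems need random starts or a dedicated solver.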

8.
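The main task of statistical diagnostics is to use diagnostic statistics to check how reasonable the observed data are when fitted by a given model, chiefly by finding outliers or influential points in the data. This paper studies diagnostic statistics and diagnostic plots for the logistic regression model. Maximum likelihood estimates are obtained by Newton iteration; traditional diagnostic statistics are derived from a perturbation model; a new diagnostic statistic is constructed by combining residuals, leverage values, and coefficient changes; and new diagnostic plots are drawn. A simulation study demonstrates the effectiveness of the new statistic, and a real case illustrates the application of the new diagnostic method and further confirms its advantages.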
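The Newton iteration for the logistic maximum likelihood estimate, and the leverage values that diagnostics of this kind build on, can be sketched for one predictor plus an intercept. The new diagnostic statistic proposed in the paper is not reproduced here; the data and the function names are assumptions for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def weighted_moments(x, probs):
    """Entries of X'WX for the design [1, x] with W = diag(p * (1 - p))."""
    w = [p * (1.0 - p) for p in probs]
    h00 = sum(w)
    h01 = sum(wi * xi for wi, xi in zip(w, x))
    h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
    return w, h00, h01, h11


def fit_logistic_newton(x, y, iterations=25):
    """Newton steps beta <- beta + (X'WX)^{-1} X'(y - p), starting from zero."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        probs = [sigmoid(b0 + b1 * xi) for xi in x]
        g0 = sum(yi - pi for yi, pi in zip(y, probs))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, probs))
        _, h00, h01, h11 = weighted_moments(x, probs)
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def leverages(x, b0, b1):
    """Diagonal of the logistic hat matrix W^(1/2) X (X'WX)^(-1) X' W^(1/2)."""
    probs = [sigmoid(b0 + b1 * xi) for xi in x]
    w, h00, h01, h11 = weighted_moments(x, probs)
    det = h00 * h11 - h01 * h01
    return [wi * (h11 - 2.0 * h01 * xi + h00 * xi * xi) / det for wi, xi in zip(w, x)]


# Hypothetical binary data that are not perfectly separable.
x_obs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y_obs = [0, 0, 1, 0, 1, 1, 0, 1]
beta0, beta1 = fit_logistic_newton(x_obs, y_obs)
fitted = [sigmoid(beta0 + beta1 * xi) for xi in x_obs]
hat_values = leverages(x_obs, beta0, beta1)
```

Two properties are useful as sanity checks: at the maximum likelihood estimate the fitted probabilities sum to the number of observed successes, and the leverage values sum to the number of parameters (two here).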

9.
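In modeling, the autoregressive (AR) time series model is easily affected by outliers, so that computed results do not match reality. To address this, the Hampel weight function is applied to the autocorrelation function to construct a robust estimation algorithm for the AR model that overcomes the influence of outliers. Simulation and empirical analyses both show that, when the time series contains no outliers, the traditional and robust estimation methods give essentially the same results; when outliers are present, results from the traditional method change considerably, while those from the robust method remain essentially unchanged. Relative to the traditional method, the robust method therefore resists outliers effectively and has good interference resistance and high robustness.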
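The Hampel three-part redescending weight function itself is simple to write down. The sketch below evaluates it on a series standardized by the median and the scaled MAD. The tuning constants a = 2, b = 4, c = 8, the standardization, and the series are assumptions for this example; how the paper plugs the weights into the autocorrelation function is not shown.

```python
from statistics import median


def hampel_weight(r, a=2.0, b=4.0, c=8.0):
    """Hampel weight: 1 up to a, a/|r| up to b, linear descent to 0 at c."""
    u = abs(r)
    if u <= a:
        return 1.0
    if u <= b:
        return a / u
    if u <= c:
        return a * (c - u) / ((c - b) * u)
    return 0.0


def standardized(series):
    """Center by the median and scale by 1.4826 * MAD (assumed nonzero)."""
    center = median(series)
    scale = 1.4826 * median(abs(v - center) for v in series)
    return [(v - center) / scale for v in series]


# Hypothetical series with one gross outlier at position 5.
series = [1.0, 1.2, 0.9, 1.1, 1.0, 15.0, 0.8, 1.05]
weights = [hampel_weight(z) for z in standardized(series)]
```

Observations near the bulk of the data keep full weight, moderately unusual ones are downweighted, and the gross outlier receives weight zero, so it cannot affect any weighted moment computed afterwards.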

10.
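A comparison of several influence diagnostics in regression diagnostics   Total citations: 2, self-citations: 0, citations by others: 2
In practical applications of regression diagnostics, the effect of influential points on the regression model is increasingly recognized as important. However, because methods of influence assessment keep multiplying, practitioners do not know which one to adopt. This paper compares some commonly used diagnostic statistics and examines their application through examples.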

11.
Optimization, 2012, 61(12): 1467-1490
Large outliers break down linear and nonlinear regression models. Robust regression methods allow one to filter out the outliers when building a model. By replacing the traditional least squares criterion with the least trimmed squares (LTS) criterion, in which half of the data is treated as potential outliers, one can fit accurate regression models to strongly contaminated data. High-breakdown methods have become very well established in linear regression, but have only recently started to be applied to nonlinear regression. In this work, we examine the problem of fitting artificial neural networks (ANNs) to contaminated data using the LTS criterion. We introduce a penalized LTS criterion which prevents unnecessary removal of valid data. Training of ANNs leads to a challenging non-smooth global optimization problem. We compare the efficiency of several derivative-free optimization methods in solving it, and show that our approach identifies the outliers correctly when ANNs are used for nonlinear regression.

12.
In the multiple linear regression model, assumptions (independence, normality, homogeneity of variance, and so on) are imposed on the error term. When case weights are given because of variance heterogeneity, the regression parameters can be estimated efficiently using the weighted least squares estimator. Unfortunately, this estimator is sensitive to outliers, like the ordinary least squares estimator. In this paper, we therefore propose some statistics for detecting outliers in weighted least squares regression.

13.

The multiple linear regression model based on normally distributed and uncorrelated errors is a popular statistical tool with applications in various fields. However, the assumptions of normality and no serial correlation are rarely met in real life. Hence, this study considers the linear regression time series model for series with outliers and autocorrelated errors. These autocorrelated errors are represented by a covariance-stationary autoregressive process where the independent innovations are driven by a shape mixture of skew-t normal distribution. The shape mixture of skew-t normal distribution is a flexible extension of the skew-t normal with an additional shape parameter that controls skewness and kurtosis. With this error model, stochastic modeling of multiple outliers is possible with an adaptive robust maximum likelihood estimation of all the parameters. An Expectation Conditional Maximization Either algorithm is developed to carry out the maximum likelihood estimation. We derive asymptotic standard errors of the estimators through an information-based approximation. The performance of the estimation procedure is evaluated through Monte Carlo simulations and real-life data analysis.

14.
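Some problems in linear regression diagnostics   Total citations: 3, self-citations: 0, citations by others: 3
This paper proposes several new models and methods for linear regression diagnostics. We study, for the first time, a mixed model of variance weighting and mean shift, and obtain the corresponding diagnostic statistics. The paper also introduces a penalty function method and uses it as a tool to discuss influence measures for several biased estimators. Finally, it proposes diagnostic statistics based on the centroid, which work well for identifying outliers.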

15.
In this paper, we consider the robust regression problem associated with the Huber loss in the framework of the functional linear model and reproducing kernel Hilbert spaces. We propose an Ivanov regularized empirical risk minimization estimation procedure to approximate the slope function of the linear model in the presence of outliers or heavy-tailed noise. By appropriately tuning the scale parameter of the Huber loss, we establish explicit rates of convergence for our estimates in terms of excess prediction risk under mild assumptions. Our study justifies the efficiency of Huber regression for functional data from a theoretical viewpoint.
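For reference, the Huber loss with scale parameter sigma is quadratic for small residuals and linear for large ones. The sketch below gives the loss and its derivative (the clipped residual). It does not attempt the functional-data estimator or the Ivanov regularization of the abstract, and the function names are illustrative.

```python
def huber_loss(r, sigma):
    """0.5 * r^2 if |r| <= sigma, else sigma * (|r| - 0.5 * sigma)."""
    u = abs(r)
    if u <= sigma:
        return 0.5 * r * r
    return sigma * (u - 0.5 * sigma)


def huber_psi(r, sigma):
    """Derivative of the Huber loss: the residual clipped to [-sigma, sigma]."""
    return max(-sigma, min(sigma, r))


# The loss grows linearly beyond sigma, so one huge residual cannot dominate.
small = huber_loss(0.5, 1.0)
large = huber_loss(3.0, 1.0)
squared_large = 0.5 * 3.0 ** 2
```

With sigma = 1, a residual of 3 costs 2.5 under the Huber loss against 4.5 under the squared loss, and its influence on the fit is capped at sigma. The choice of sigma governs the trade-off between robustness and efficiency that the abstract refers to as tuning the scale parameter.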

16.
Detection of multiple outliers or subsets of influential points has rarely been considered in linear measurement error models. In this paper, a new influence statistic for one or a set of observations is generalized and characterized based on the corrected likelihood in linear measurement error models. This influence statistic can be expressed in terms of the residuals and the leverages of linear measurement error regression. Unlike Cook’s statistic, this new measure of influence has an asymptotically normal distribution and is able to detect a subset of high-leverage outliers that is not identified by Cook’s statistic. As illustrations, simulation studies and a real data set are analysed.

17.
We consider the problem of deleting bad influential observations (outliers) in linear regression models. The problem is formulated as a Quadratic Mixed Integer Programming (QMIP) problem, where penalty costs for discarding outliers are included in the objective function. The optimum solution defines a robust regression estimator called penalized trimmed squares (PTS). Due to the high computational complexity of the resulting QMIP problem, the proposed robust procedure is computationally suitable only for small samples. The computational performance and the effectiveness of the new procedure are improved significantly by using the idea of the ε-insensitive loss function from support vector machine regression. Small errors are ignored, and the mathematical formulation gains the sparseness property. The good performance of the ε-Insensitive PTS (IPTS) estimator allows identification of multiple outliers while avoiding masking or swamping effects. The computational effectiveness and successful outlier detection of the proposed method are demonstrated via simulated experiments. This research has been partially funded by the Greek Ministry of Education under the program Pythagoras II.

18.
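The paper discusses fuzzy linear regression analysis in which both the inputs and the outputs are fuzzy numbers and the regression coefficients are real numbers. Fuzzy least squares linear regression is easily affected by outliers, whereas the least absolute deviations method can effectively reduce the error of the regression model. Accordingly, a multi-objective programming model based on least absolute deviations is built and converted into a nonlinear programming problem for solution, yielding parameter estimates for the fuzzy linear regression model. Finally, a numerical example is used to verify the soundness of the method and to compare its advantages.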

Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号