Similar literature
17 similar documents found
1.
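The role of PLS regression in eliminating multicollinearity   Total citations: 12 (self-citations: 1, citations by others: 11)
This paper describes in detail how multicollinearity among explanatory variables harms regression modeling and analysis, and reviews several commonly used methods for removing its effects, along with their shortcomings. Drawing on an empirical study, it argues that PLS regression, a newer modeling approach, better removes the effect of multicollinearity on the accuracy and reliability of the model.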
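To make the idea concrete, here is a minimal sketch of a single-component PLS1 fit in plain Python. It is not the paper's procedure: the function name `pls1_one_component` and the toy data are assumptions chosen for illustration. The data are perfectly collinear (`x2 = 2 * x1`), a case where ordinary least squares has no unique solution, yet the PLS direction is still well defined.

```python
import math

def pls1_one_component(X, y):
    """Fit one PLS1 component; returns (coefficients, intercept)."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Center predictors and response.
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: normalized X'y, the direction of maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores t = Xw, then regress y on the single score vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(a * b for a, b in zip(t, yc)) / sum(a * a for a in t)
    coef = [q * v for v in w]
    intercept = y_mean - sum(c * m for c, m in zip(coef, col_means))
    return coef, intercept

# Hypothetical, perfectly collinear toy data: x2 = 2 * x1, y = x1 + x2.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0], [5.0, 10.0]]
y = [3.0, 6.0, 9.0, 12.0, 15.0]
coef, intercept = pls1_one_component(X, y)
```

Because the regression is carried out on the score vector rather than on the original columns, no singular matrix has to be inverted. Real applications would extract several components and choose their number by cross-validation.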

2.
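Using examples, this paper introduces the diagnosis of collinearity among independent variables in multiple linear regression, and some methods for handling collinear regressors with the enhanced features of REG and other procedures in SAS/STAT (6.12). The methods covered are variable selection, ridge regression, principal component regression, and partial least squares regression.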

3.
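Using examples, this paper introduces the diagnosis of collinearity among independent variables in multiple linear regression, and some methods for handling collinear regressors with the enhanced features of REG and other procedures in SAS/STAT (6.12), including variable screening, ridge regression, principal component regression, and partial least squares regression.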

4.
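Analysis of the regression performance of the PLSR model   Total citations: 6 (self-citations: 1, citations by others: 5)
This paper briefly introduces the multiple linear regression, principal component regression, and partial least squares regression models, compares the regression performance of the three methods on a worked example, and shows that partial least squares regression outperforms the other two in eliminating multicollinearity, in the precision of the regression coefficient estimates, and in prediction accuracy.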

5.
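Ridge regression program design in SAS 6.11 and a worked example   Total citations: 9 (self-citations: 0, citations by others: 9)
Ridge regression can solve regression problems in which the independent variables are multicollinear. This paper gives a program that implements ridge regression in SAS 6.11 and later versions, and illustrates the procedure with a concrete example.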
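The paper's program is written in SAS and is not reproduced here. As a language-neutral illustration of what ridge regression computes, the following Python sketch solves the ridge normal equations (X'X + kI)b = X'y for two centered predictors. The function name `ridge_two_predictors`, the ridge parameter values, and the data are assumptions for illustration only.

```python
import math

def ridge_two_predictors(x1, x2, y, k):
    """Ridge coefficients for two centered predictors: solve (X'X + kI) b = X'y."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    # Entries of X'X + kI and X'y for the centered data.
    a11 = sum(v * v for v in c1) + k
    a22 = sum(v * v for v in c2) + k
    a12 = sum(u * v for u, v in zip(c1, c2))
    g1 = sum(u * v for u, v in zip(c1, cy))
    g2 = sum(u * v for u, v in zip(c2, cy))
    # Closed-form inverse of the 2x2 system.
    det = a11 * a22 - a12 * a12
    return (a22 * g1 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det

# Hypothetical, nearly collinear predictors; y is generated as 2*x1 + 3*x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
y = [2 * a + 3 * b for a, b in zip(x1, x2)]

b_ols = ridge_two_predictors(x1, x2, y, k=0.0)    # k = 0 reduces to least squares
b_ridge = ridge_two_predictors(x1, x2, y, k=1.0)  # k > 0 shrinks the coefficients
norm_ols = math.hypot(*b_ols)
norm_ridge = math.hypot(*b_ridge)
```

Adding k to the diagonal keeps the determinant away from zero when the predictors are nearly collinear, at the cost of some bias in the coefficients.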

6.
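This paper proposes three new models: PLSADL, RADL, and RPLSADL. They combine the time-series ADL model, which accounts for the temporal ordering of the data, with the multiple linear regression models PLSR, RR, and RPLS, which account for collinearity among variables. Analyzing the relationship between China's quarterly GDP growth rate and eight economic indicators from 2000 to 2012, we find that PLSADL, RADL, and RPLSADL outperform the other four models in both fit and predictive ability. This indicates that accounting for collinearity among variables when building an ADL model can effectively improve its predictive performance.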
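For orientation, a generic autoregressive distributed lag model ADL($p$, $q$) is commonly written as below. The notation is an assumption for illustration and is not taken from the paper: $y_t$ is the response, $\mathbf{x}_t$ the vector of explanatory indicators, and $\varepsilon_t$ the error term.

$$
y_t = \alpha + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \sum_{j=0}^{q} \boldsymbol{\beta}_j^{\top} \mathbf{x}_{t-j} + \varepsilon_t
$$

When the components of $\mathbf{x}_t$ and their lags are strongly correlated, least-squares estimates of the $\boldsymbol{\beta}_j$ become unstable, which is the point at which a PLS or ridge estimator can be substituted.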

7.
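An unbiased iterative ridge regression algorithm   Total citations: 1 (self-citations: 0, citations by others: 1)
This paper studies an unbiased iterative ridge regression algorithm for linear models. The algorithm retains the properties of least squares. Under fairly severe collinearity it gives fairly accurate estimates of the parameters and their covariance matrix; under exact collinearity it returns one of the infinitely many solutions for the parameters and their covariance matrix, and that solution is determined by the initial value. The paper also proves the convergence of the algorithm and some of its other properties.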

8.
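Detecting and resolving multicollinearity in multiple regression analysis is of considerable importance. This paper applies ridge regression (RR) and kernel principal component regression (KPCR) to the same data, uses the variance inflation factor (VIF) and the condition index (CI) as collinearity diagnostics, and compares the resulting regression models. The empirical analysis finds that both methods eliminate multicollinearity well and that, overall, kernel principal component regression gives a better in-sample fit than ridge regression. For both methods, however, the choice of parameters strongly affects model quality and requires further analysis.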
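As a small illustration of the VIF diagnostic mentioned above, the sketch below covers only the two-predictor case, where the VIF of either predictor equals 1 / (1 - r^2) with r their Pearson correlation. The function names and data are assumptions for illustration; the condition index is not computed here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical data: x2 nearly duplicates x1, x3 is uncorrelated with x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
x3 = [1.0, -2.0, 2.0, -2.0, 1.0]

vif_collinear = vif_two_predictors(x1, x2)
vif_independent = vif_two_predictors(x1, x3)
```

A common rule of thumb treats a VIF above 10 as a sign of serious collinearity. With more than two predictors, r^2 is replaced by the R^2 from regressing each predictor on all the others.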

9.
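Based on data from the 2008 economic census, this paper empirically analyzes how micro data and macro aggregate data differ in statistical analysis, from the perspectives of descriptive statistics and regression. The descriptive analysis finds that macro aggregate data are closer to a normal distribution than micro data, but that this no longer holds after a log transformation. The regression analysis finds that, before heteroscedasticity and multicollinearity are removed, production functions estimated from micro data and from macro aggregate data differ in returns to scale, in the contribution rates of the production factors, and in the factors' explanatory power for output. After heteroscedasticity and multicollinearity are removed, a large difference remains in the factors' explanatory power for output.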

10.
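You Hua (游华), 《数理统计与管理》, 2003, 22(Z1): 322-325
In economic research, fairly severe multicollinearity often exists among the independent variables as well as among the dependent variables. This paper builds the multiple linear regression model with partial least squares regression to remove the effect of multicollinearity, and obtains fairly satisfactory results.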

11.
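Non-convex penalty functions, including the SCAD and MCP penalties, have the properties of unbiasedness, continuity, and sparsity, while ridge regression handles collinearity well. This paper combines the strengths of non-convex penalties and ridge regression (abbreviated NPR) and studies the oracle property of the NPR estimator when the independent variables are highly correlated. The main focus is the case in which the number of parameters $p_n$ grows exponentially with the sample size $n$. Simulation studies and a real-data analysis further verify the performance of the NPR method.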
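The two penalties named above have standard closed forms, sketched here in Python. The defaults `a=3.7` and `gamma=3.0` are conventional choices, not values from the paper. The combined function `scad_plus_ridge` is only an assumed form of a non-convex-plus-ridge penalty; the paper's exact NPR definition is not given in the abstract.

```python
def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty: linear near zero, quadratic transition, then constant."""
    t = abs(beta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2

def mcp_penalty(beta, lam, gamma=3.0):
    """MCP penalty: tapers off quadratically and is constant beyond gamma * lam."""
    t = abs(beta)
    if t <= gamma * lam:
        return lam * t - t * t / (2 * gamma)
    return gamma * lam * lam / 2

def scad_plus_ridge(beta, lam, ridge_k, a=3.7):
    """Assumed illustrative combination: SCAD term plus a ridge (L2) term."""
    return scad_penalty(beta, lam, a) + 0.5 * ridge_k * beta * beta

small = scad_penalty(0.5, 1.0)      # inside the linear zone
large = scad_penalty(10.0, 1.0)     # flat zone: large coefficients are not penalized further
mcp_large = mcp_penalty(10.0, 1.0)  # MCP is flat beyond gamma * lam as well
```

Both penalties level off for large coefficients, which is what gives near-unbiased estimates of large effects, whereas the ridge term keeps growing and stabilizes the fit when predictors are highly correlated.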

12.
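On biased estimation in linear regression models   Total citations: 3 (self-citations: 0, citations by others: 3)
Biased estimation is a commonly used approach in modern regression analysis. This paper studies several common biased estimation methods and clarifies their differences and connections. It examines some key points of biased estimation, gives a new method for determining the ridge parameter and a new notion of principal components, and discusses the optimality properties of these methods. To improve the efficiency of biased estimation, it proposes normalizing the model with a scale factor. Finally, numerical examples illustrate the methods.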

13.
Akaike's information criterion (AIC), derived from asymptotics of the maximum likelihood estimator, is widely used in model selection. However, it has a finite-sample bias that produces overfitting in linear regression. To deal with this problem, Ishiguro, Sakamoto, and Kitagawa proposed a bootstrap-based extension to AIC which they called EIC. This article compares model-selection performance of AIC, EIC, a bootstrap-smoothed likelihood cross-validation (BCV) and its modification (632CV) in small-sample linear regression, logistic regression, and Cox regression. Simulation results show that EIC largely overcomes AIC's overfitting problem and that BCV may be better than EIC. Hence, the three methods based on bootstrapping the likelihood establish themselves as important alternatives to AIC in model selection with small samples.
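For readers unfamiliar with the baseline criterion, the sketch below computes plain AIC (not EIC, BCV, or 632CV) for Gaussian linear models and compares a mean-only model with a straight-line model. The function names and the toy data are assumptions for illustration.

```python
import math

def gaussian_aic(rss, n, n_coef):
    """AIC = -2 log L + 2k for a Gaussian linear model at its ML estimates."""
    k = n_coef + 1  # regression coefficients plus the error variance
    neg2_loglik = n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    return neg2_loglik + 2 * k

def rss_mean_only(y):
    """Residual sum of squares of the intercept-only model."""
    m = sum(y) / len(y)
    return sum((v - m) ** 2 for v in y)

def rss_simple_linear(x, y):
    """Residual sum of squares of the least-squares line y = b0 + b1 * x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

# Hypothetical data: a clear linear trend plus small fixed perturbations.
x = [float(i) for i in range(10)]
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.0, 0.2, -0.15, -0.05, 0.1]
y = [1 + 2 * a + e for a, e in zip(x, noise)]

aic_mean = gaussian_aic(rss_mean_only(y), len(y), n_coef=1)
aic_line = gaussian_aic(rss_simple_linear(x, y), len(y), n_coef=2)
```

The model with the smaller AIC is preferred. The 2k term is the asymptotic bias correction, and its inadequacy in small samples is what the bootstrap-based alternatives discussed in the abstract aim to repair.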

14.
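In many practical problems, the relationship between variables follows a pseudo-parabolic pattern, especially in studies where the law of diminishing returns to input is the underlying mechanism. Through an in-depth analysis of the properties of the pseudo-parabolic function, this paper builds a pseudo-parabolic regression model. Based on the model's characteristics, it develops a parameter estimation method that combines numerical and analytical computation, and also gives a significance test.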

15.
Fuzzy data given by expert knowledge can be regarded as a possibility distribution by which possibilistic linear systems are defined. Recently, it has become important to deal with fuzzy data in connection with expert knowledge. Three formulations of possibilistic linear regression analysis are proposed here to deal with fuzzy data. Because these formulations reduce to linear programming problems, the fuzzy parameters of possibilistic linear models are easy to obtain, and further constraints derived from expert knowledge of the fuzzy parameters can be added. This approach can be regarded as a fuzzy interval analysis in a fuzzy environment.

16.
We apply nonparametric regression to current status data, which often arise in survival analysis and reliability analysis. While no parametric assumption on the distributions has been imposed, most authors have employed parametric models like linear models to measure the covariate effects on failure times in regression analysis with current status data. We construct a nonparametric estimator of the regression function by modifying the maximum rank correlation (MRC) estimator. Our estimator can deal with the cases where the other estimators do not work. We present the asymptotic bias and the asymptotic distribution of the estimator by adapting a result on equicontinuity of degenerate U-processes to the setup of this paper.

17.
Much work has focused on developing exact tests for the analysis of discrete data using log linear or logistic regression models. A parametric model is tested for a dataset by conditioning on the value of a sufficient statistic and determining the probability of obtaining another dataset as extreme or more extreme relative to the general model, where extremeness is determined by the value of a test statistic such as the chi-square or the log-likelihood ratio. Exact determination of these probabilities can be infeasible for high dimensional problems, and asymptotic approximations to them are often inaccurate when there are small data entries and/or there are many nuisance parameters. In these cases Monte Carlo methods can be used to estimate exact probabilities by randomly generating datasets (tables) that match the sufficient statistic of the original table. However, naive Monte Carlo methods produce tables that are usually far from matching the sufficient statistic. The Markov chain Monte Carlo method used in this work (the regression/attraction approach) uses attraction to concentrate the distribution around the set of tables that match the sufficient statistic, and uses regression to take advantage of information in tables that “almost” match. It is also more general than others in that it does not require the sufficient statistic to be linear, and it can be adapted to problems involving continuous variables. The method is applied to several high dimensional settings including four-way tables with a model of no four-way interaction, and a table of continuous data based on beta distributions. It is powerful enough to deal with the difficult problem of four-way tables and flexible enough to handle continuous data with a nonlinear sufficient statistic.
