Similar documents
20 similar documents found (search time: 140 ms)
1.
Variance decomposition is a method, within vector autoregression (VAR) models, for measuring how shocks to each variable contribute to the forecast error of every endogenous variable. This paper introduces generalized forecast error variance decomposition; compared with the traditional orthogonalized decomposition, its distinguishing feature is that it is unaffected by the ordering of the variables in the VAR model. Using generalized variance decomposition, the paper studies the relationships among the sectoral indices of the Shanghai Stock Exchange and shows how system shocks propagate across the industry indices.
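The generalized decomposition described above can be sketched in a few lines for a bivariate VAR(1). The coefficient matrix `A` and error covariance `Sigma` below are illustrative toy values, not estimates from the Shanghai index data:

```python
import numpy as np

def gfevd(A, Sigma, H):
    """Generalized forecast error variance decomposition for a VAR(1)
    y_t = A y_{t-1} + u_t with error covariance Sigma.
    theta[i, j] is the share of shocks to variable j in the H-step
    forecast error variance of variable i (rows need not sum to 1)."""
    k = A.shape[0]
    Ah = np.eye(k)                         # MA coefficient A_h = A**h
    num = np.zeros((k, k))
    den = np.zeros(k)
    for _ in range(H):
        num += (Ah @ Sigma) ** 2           # (e_i' A_h Sigma e_j)^2
        den += np.diag(Ah @ Sigma @ Ah.T)  # e_i' A_h Sigma A_h' e_i
        Ah = A @ Ah
    theta = num / np.diag(Sigma)           # scale column j by 1/sigma_jj
    return theta / den[:, None]

A = np.array([[0.5, 0.2], [0.1, 0.4]])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
theta = gfevd(A, Sigma, 10)
```

The ordering-invariance property mentioned in the abstract shows up directly: permuting the variables simply permutes the rows and columns of `theta`, whereas an orthogonalized (Cholesky-based) decomposition would change with the ordering.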

2.
For repeated-measurement data subject to measurement error, the true values of the covariates and the response may not match exactly, i.e., an equation error may exist; moreover, the measurement-error variance of the true values may depend on some characteristic of the sample, i.e., heteroscedasticity may exist. Motivated by such data, this paper discusses the modeling and estimation of a heteroscedastic repeated-measurement error model with equation error, and derives explicit maximum-likelihood iterative estimates of the model parameters based on the EM algorithm. Finally, simulation studies and a real-data example illustrate the effectiveness of the model and the estimation method.

3.
After a model has been chosen by a variable selection method, how to assess the significance of the coefficients of the selected variables is one of the frontier questions in statistics. Starting from the selection result of the adaptive Lasso, and allowing for the variety of error distributions encountered in practice, this paper constructs a conditional test statistic for the coefficients of the retained variables based on the selection event, and proves its uniform convergence. Simulation studies show that, under a range of error distributions, the proposed method further refines the selection result and has strong practical value. Applying the method to the CEPS student data, ten variables, including students' cognitive ability, are selected as the main factors affecting middle-school students' achievement, providing a useful reference for related research.
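The adaptive Lasso that this paper's test conditions on has a closed form when the design matrix has orthonormal columns: soft-thresholding the OLS coefficients with weights inversely proportional to their magnitudes. A minimal sketch on synthetic data (the design, coefficients, and penalty level are illustrative; this shows only the selection step, not the paper's conditional test):

```python
import numpy as np

rng = np.random.default_rng(0)

# Orthonormal design: 50 samples, 6 columns, X'X = I
X, _ = np.linalg.qr(rng.standard_normal((50, 6)))
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0, 1.0])
y = X @ beta_true + 0.05 * rng.standard_normal(50)

b_ols = X.T @ y                   # OLS coefficients, since X'X = I
w = 1.0 / np.abs(b_ols)           # adaptive weights (gamma = 1)
lam = 0.1
# Closed-form adaptive-lasso solution under an orthonormal design:
beta_hat = np.sign(b_ols) * np.maximum(np.abs(b_ols) - lam * w, 0.0)
selected = np.flatnonzero(beta_hat != 0.0)
```

Because the weights blow up for near-zero OLS coefficients, irrelevant variables are thresholded to exactly zero while large coefficients are only lightly shrunk.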

4.
Design of fuzzy classification systems based on the ant colony algorithm   (cited by 1: 0 self-citations, 1 by others)
A design method for fuzzy classification systems based on the max-min ant system is proposed. The method proceeds in two stages: feature selection and model parameter optimization. First, the ant colony algorithm selects a set of feature variables with high discriminative power, improving the interpretability of the model. Once the model structure is fixed, the ant colony algorithm extracts information from the training samples to optimize the model parameters, constructing a fuzzy model with few variables and few rules while preserving accuracy, thereby achieving a trade-off between accuracy and interpretability. Finally, the method is applied to the Iris and Wine classification problems and compared with other methods; the simulation results demonstrate its effectiveness.
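The feature-selection stage can be sketched as a minimal max-min-style ant system: a pheromone value per feature guides which subsets the ants sample, only the best subset deposits pheromone, and the pheromone is clipped to max-min bounds. The data, classifier (nearest centroid standing in for a fuzzy model), and all parameters below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(10)
n = 200
y = rng.integers(0, 2, n)
X = rng.standard_normal((n, 5))
X[:, 0] += 2.0 * y                        # only feature 0 is discriminative

def accuracy(cols):
    """Resubstitution accuracy of a nearest-centroid classifier."""
    c0 = X[y == 0][:, cols].mean(0)
    c1 = X[y == 1][:, cols].mean(0)
    d0 = np.linalg.norm(X[:, cols] - c0, axis=1)
    d1 = np.linalg.norm(X[:, cols] - c1, axis=1)
    return np.mean((d1 < d0) == y)

tau = np.full(5, 1.0)                     # pheromone per feature
tau_min, tau_max, rho = 0.1, 5.0, 0.3     # max-min bounds, evaporation rate
best_cols, best_acc = None, 0.0
for _ in range(30):                       # iterations
    for _ant in range(10):                # ants per iteration
        prob = tau / tau.sum()
        cols = rng.choice(5, size=2, replace=False, p=prob)
        acc = accuracy(cols)
        if acc > best_acc:
            best_cols, best_acc = sorted(cols), acc
    tau *= 1 - rho                        # evaporation
    tau[best_cols] += best_acc            # only the best-so-far ant deposits
    tau = np.clip(tau, tau_min, tau_max)  # max-min pheromone bounds
```

The max-min clipping keeps exploration alive (no feature's probability collapses to zero) while still concentrating pheromone on the discriminative feature.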

5.
The spatial varying-coefficient regression model is an important extension of the spatial linear regression model and is widely used in practice; however, its variable selection problem has remained unsolved. This paper brings mean regression, median regression, quantile regression, and robust mean regression into a single framework via a general M-type loss function, and then, based on B-spline approximation, proposes an adaptive group L_r (r ≥ 1) penalized M-estimator that performs variable selection and coefficient-function estimation simultaneously. The new method has several notable features: (1) it is robust to outliers and heavy-tailed distributions; (2) it accommodates heteroscedasticity, allowing the set of significant variables to vary with the quantile considered; (3) it balances the efficiency and robustness of the estimator. Under weak assumptions, the oracle property of the variable selection is established. Simulations and a real-data analysis confirm the finite-sample performance of the proposed method.

6.
赵培信, 杨宜平. 《应用数学》 (Applied Mathematics), 2015, 28(1): 165-171
Using auxiliary information as instrumental variables, combined with the smooth-threshold estimating equations (SEE) method, an instrumental-variable-type variable selection procedure is proposed for generalized linear models whose covariates are subject to measurement error. The method estimates the nonzero regression coefficients while removing the insignificant covariates from the model, thereby achieving variable selection. Moreover, the selection procedure does not require solving any convex optimization problem, so it is highly adaptable and easy to compute in practice. It is proved that the procedure is consistent and that the estimates of the nonzero coefficients attain the optimal parametric convergence rate. Simulation results show that the proposed method effectively removes the impact of measurement error on estimation accuracy and has good finite-sample properties.

7.
Heteroscedasticity is unavoidable when modeling panel data in economics. Two-stage estimation is a useful way to handle it, but if the samples are grouped on the basis of a single chosen covariate, information is lost. This paper proposes using variable selection to screen several grouping variables, then clustering with the k-means algorithm to partition the sample into classes, from which a heteroscedasticity-adjusted estimate is obtained. An empirical study shows that the proposed method has clear advantages in effectiveness and feasibility, both in the accuracy of the heteroscedasticity estimate and in the fitted values.
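The two-stage idea can be sketched on toy data: cluster on a grouping variable with k-means, estimate a residual variance per cluster from a first-stage OLS fit, then reweight. The single grouping variable, two-regime variance, and all coefficients below are illustrative simplifications of the paper's multi-variable setup:

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans_1d(x, k, iters=50):
    """Minimal Lloyd's algorithm on one grouping variable."""
    centers = np.quantile(x, np.linspace(0.1, 0.9, k))
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        centers = np.array([x[labels == j].mean() for j in range(k)])
    return labels

n = 400
z = rng.uniform(0, 1, n)               # grouping variable drives the variance
x = rng.standard_normal(n)
sigma = np.where(z < 0.5, 0.5, 2.0)    # two variance regimes
y = 1.0 + 2.0 * x + sigma * rng.standard_normal(n)

# Stage 1: OLS residuals, cluster on z, estimate per-group variances
X = np.column_stack([np.ones(n), x])
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_ols
labels = kmeans_1d(z, 2)
s2 = np.array([np.mean(resid[labels == j] ** 2) for j in range(2)])

# Stage 2: weighted least squares, weight = 1 / estimated group variance
w = 1.0 / s2[labels]
Xw = X * np.sqrt(w)[:, None]
beta_wls, *_ = np.linalg.lstsq(Xw, y * np.sqrt(w), rcond=None)
```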

8.
Penalized model-based clustering achieves variable reduction during clustering, but identifying the variables that are informative for clustering then becomes a new problem. Existing work on this problem includes the pairwise-penalty model, which handles the case where all clusters share a common variance. This paper examines variable selection under heteroscedasticity, proposes two new models for heteroscedastic data, and gives their solutions and algorithms. Simulation results show that both new models perform better on heteroscedastic data.

9.
This paper extends the instrumental variable method from mean regression with errors-in-variables to linear quantile regression with errors-in-variables. The resulting estimator is consistent and asymptotically normal under general conditions, and the method is practical and easy to implement. Simulation studies show that the estimator performs very well in finite samples. Finally, the method is applied to a real problem: the relationship between wages and education.
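The mean-regression starting point that the paper generalizes is easy to demonstrate: measurement error in the covariate attenuates OLS toward zero, while an instrument correlated with the true covariate but independent of the measurement error restores consistency. The data-generating values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
x_true = rng.standard_normal(n)            # unobserved true covariate
z = x_true + 0.5 * rng.standard_normal(n)  # instrument: tracks x_true,
                                           # independent of measurement error
x_obs = x_true + rng.standard_normal(n)    # covariate observed with error
y = 1.0 + 2.0 * x_true + 0.3 * rng.standard_normal(n)

X = np.column_stack([np.ones(n), x_obs])
Z = np.column_stack([np.ones(n), z])

# Naive OLS on the noisy covariate is attenuated toward zero
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
# IV estimator: (Z'X)^{-1} Z'y
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)
```

Here the true slope is 2; OLS converges to about 1 (signal variance over signal-plus-error variance), while the IV estimate stays near 2.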

10.
Proper orthogonal decomposition (POD) with Galerkin projection is a common approach to model order reduction for complex nonlinear systems. However, because the approach retains only some of the basis modes when constructing the reduced system, the reduced system is often inaccurate. To address this, a fast error-correction method for the reduced system is proposed. First, the Mori-Zwanzig formalism is applied to analyze the error of the reduced system, yielding the form of the error model and its effective predictor variables. A multivariate regression model between the predictors and the system error is then built by partial least squares, giving an error prediction model. Embedding this error model directly into the original reduced system yields a new reduced system that is formally equivalent to applying a Petrov-Galerkin projection to the right-hand side of the original model. Finally, an error estimate for the new reduced system is given. Numerical results show that the proposed method effectively improves the stability and accuracy of the reduced system at a low computational cost.
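The POD step itself, and the truncation error the paper's correction targets, can be sketched with an SVD of a snapshot matrix. The two-mode toy snapshot data below is an assumption for illustration; the Mori-Zwanzig/PLS correction is not reproduced:

```python
import numpy as np

rng = np.random.default_rng(3)

# Snapshot matrix: 200 spatial points, 50 time snapshots of a signal
# dominated by two spatial modes (illustrative toy data)
xg = np.linspace(0, np.pi, 200)
t = np.linspace(0, 1, 50)
S = (np.outer(np.sin(xg), np.cos(2 * np.pi * t))
     + 0.3 * np.outer(np.sin(2 * xg), np.sin(4 * np.pi * t))
     + 0.01 * rng.standard_normal((200, 50)))

# POD basis = left singular vectors of the snapshot matrix
U, s, Vt = np.linalg.svd(S, full_matrices=False)

def pod_reconstruct(r):
    """Project the snapshots onto the leading r POD modes."""
    Ur = U[:, :r]
    return Ur @ (Ur.T @ S)

err1 = np.linalg.norm(S - pod_reconstruct(1))
err2 = np.linalg.norm(S - pod_reconstruct(2))
```

Keeping one mode leaves the second mode's energy as truncation error; keeping both leaves only noise. In realistic nonlinear systems the discarded modes also feed back into the retained dynamics, which is exactly the error the paper models and corrects.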

11.
In the Knowledge Discovery Process, classification algorithms are often used to help create models with training data that can be used to predict the classes of untested data instances. While there are several factors involved with classification algorithms that can influence classification results, such as the node splitting measures used in making decision trees, feature selection is often used as a pre-classification step when using large data sets to help eliminate irrelevant or redundant attributes in order to increase computational efficiency and possibly to increase classification accuracy. One important factor common to both feature selection as well as to classification using decision trees is attribute discretization, which is the process of dividing attribute values into a smaller number of discrete values. In this paper, we will present and explore a new hybrid approach, ChiBlur, which involves the use of concepts from both the blurring and χ²-based approaches to feature selection, as well as concepts from multi-objective optimization. We will compare this new algorithm with algorithms based on the blurring and χ²-based approaches.
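The χ²-based half of the hybrid scores each (discretized) attribute by the chi-square statistic of its contingency table against the class label. A minimal sketch on synthetic binary features (ChiBlur's blurring and multi-objective components are not reproduced):

```python
import numpy as np

def chi2_stat(feature, labels):
    """Chi-square statistic between a discrete feature and class labels,
    from the observed-vs-expected contingency table."""
    fvals, f_idx = np.unique(feature, return_inverse=True)
    cvals, c_idx = np.unique(labels, return_inverse=True)
    obs = np.zeros((len(fvals), len(cvals)))
    np.add.at(obs, (f_idx, c_idx), 1.0)
    expected = obs.sum(1, keepdims=True) * obs.sum(0, keepdims=True) / obs.sum()
    return ((obs - expected) ** 2 / expected).sum()

rng = np.random.default_rng(4)
labels = rng.integers(0, 2, 300)
informative = labels ^ (rng.random(300) < 0.1)   # mostly tracks the class
irrelevant = rng.integers(0, 2, 300)             # independent of the class
```

An attribute that tracks the class produces a large statistic; an independent attribute stays near its degrees of freedom, so ranking by this score discards irrelevant attributes.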

12.
We analyze the approximation quality of the discrete-time decomposition approach, compared to simulation, with respect to the expected value and the 95th percentile of waiting time. For both performance measures, we use OLS regression models to compute point estimates, and quantile regression models to compute interval estimates, of the decomposition error. The ANOVA reveals the major factors influencing the decomposition error, while the regression models are shown to provide accurate forecasts and precise confidence intervals for it.

13.
We propose a new binary classification and variable selection technique especially designed for high-dimensional predictors. Among many predictors, typically, only a small fraction have significant impact on prediction. In such a situation, more interpretable models with better prediction accuracy can be obtained by variable selection along with classification. By adding an ℓ1-type penalty to the loss function, common classification methods such as logistic regression or support vector machines (SVM) can perform variable selection. Existing penalized SVM methods all attempt to solve for all the parameters involved in the penalization problem jointly. When the data dimension is very high, the joint optimization problem is very complex and involves a great deal of memory allocation. In this article, we propose a new penalized forward search technique that reduces the high-dimensional optimization problem to a sequence of one-dimensional optimizations by iterating the selection steps. The new algorithm can be regarded as a forward selection version of the penalized SVM and its variants. The advantage of optimizing in one dimension is that the location of the optimum solution can be found by intelligent search, exploiting the convexity and piecewise linear or quadratic structure of the criterion function. In each step, the predictor most able to predict the outcome is added to the model. The search is then repeated iteratively until convergence. Comparison of our new classification rule with the ℓ1-SVM and other common methods shows very promising performance, in that the proposed method leads to much leaner models without compromising misclassification rates, particularly for high-dimensional predictors.
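The "one-dimensional step" idea can be sketched with squared loss standing in for the SVM criterion (an assumption for brevity): for unit-norm columns, the penalized 1D optimum along each predictor is a soft-thresholded correlation, and the search repeatedly takes the best such step until no predictor survives the penalty. Data and the penalty level are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 200, 50
X = rng.standard_normal((n, p))
X /= np.linalg.norm(X, axis=0)        # unit-norm columns
beta = np.zeros(p); beta[[3, 17]] = [4.0, -3.0]
y = X @ beta + 0.05 * rng.standard_normal(n)

lam = 0.5
coef = np.zeros(p)
resid = y.copy()
selected = []
for _ in range(p):
    scores = X.T @ resid              # 1D least-squares optimum per column
    j = np.argmax(np.abs(scores))
    bj = np.sign(scores[j]) * max(abs(scores[j]) - lam, 0.0)  # soft threshold
    if bj == 0.0:
        break                         # no predictor survives the penalty
    coef[j] += bj
    resid -= bj * X[:, j]
    if j not in selected:
        selected.append(j)
```

Each iteration touches only one coordinate, so memory and per-step cost stay low regardless of p, which is the practical point the abstract makes about high dimensions.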

14.
Mathematical programming (MP) discriminant analysis models can be used to develop classification models for assigning observations of unknown class membership to one of a number of specified classes using values of a set of features associated with each observation. Since most MP discriminant analysis models generate linear discriminant functions, these MP models are generally used to develop linear classification models. Nonlinear classifiers may, however, have better classification performance than linear classifiers. In this paper, a mixed integer programming model is developed to generate nonlinear discriminant functions composed of monotone piecewise-linear marginal utility functions for each feature and the cut-off value for class membership. It is also shown that this model can be extended for feature selection. The performance of this new MP model for two-group discriminant analysis is compared with statistical discriminant analysis and other MP discriminant analysis models using a real problem and a number of simulated problem sets.

15.
The problem of deleting a row from a Q–R factorization (called downdating) using Gram–Schmidt orthogonalization is intimately connected to using classical iterative methods to solve a least squares problem with the orthogonal factor as the coefficient matrix. Past approaches to downdating have focused upon accurate computation of the residual of that least squares problem, then finding a unit vector in the direction of the residual that becomes a new column for the orthogonal factor. It is also important to compute the solution vector of the related least squares problem accurately, as that vector must be used in the downdating process to maintain good backward error in the new factorization. Using this observation, new algorithms are proposed. One of the new algorithms proposed is a modification of one due to Yoo and Park [BIT, 36:161–181, 1996]. That algorithm is shown to be a Gram–Schmidt procedure. Also presented are new results that bound the loss of orthogonality after downdating. An error analysis shows that the proposed algorithms’ behavior in floating point arithmetic is close to their behavior in exact arithmetic. Experiments show that the changes proposed in this paper can have a dramatic impact upon the accuracy of the downdated Q–R decomposition. AMS subject classification (2000) 65F20, 65F25
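For context, a modified Gram–Schmidt Q–R factorization and the naive "downdate" it is measured against (delete the row and refactor from scratch) look like this; the point of the downdating algorithms above is to update Q and R directly and avoid the full refactorization:

```python
import numpy as np

def mgs_qr(A):
    """Modified Gram-Schmidt QR factorization, A = Q R with Q'Q = I."""
    A = A.astype(float).copy()
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        R[k, k] = np.linalg.norm(A[:, k])
        Q[:, k] = A[:, k] / R[k, k]
        for j in range(k + 1, n):
            R[k, j] = Q[:, k] @ A[:, j]
            A[:, j] -= R[k, j] * Q[:, k]   # orthogonalize remaining columns
    return Q, R

rng = np.random.default_rng(6)
A = rng.standard_normal((8, 4))
Q, R = mgs_qr(A)

# Naive downdate: drop row 0 and refactor at full O(mn^2) cost.
# The algorithms discussed above update the old Q, R instead.
Q1, R1 = mgs_qr(A[1:])
```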

16.
In this paper, six univariate forecasting models for the container throughput volumes in Taiwan’s three major ports are presented. The six univariate models include the classical decomposition model, the trigonometric regression model, the regression model with seasonal dummy variables, the grey model, the hybrid grey model, and the SARIMA model. The purpose of this paper is to search for a model that can provide the most accurate prediction of container throughput. By applying monthly data to these models and comparing the prediction results based on mean absolute error, mean absolute percent error and root mean squared error, we find that in general the classical decomposition model appears to be the best model for forecasting container throughput with seasonal variations. The result of this study may be helpful for predicting the short-term variation in demand for the container throughput of other international ports.
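A minimal sketch of the classical decomposition forecast and the three error measures on synthetic monthly data (a linear fit stands in for the moving-average trend step, and the trend/seasonal parameters are illustrative, not port data):

```python
import numpy as np

rng = np.random.default_rng(7)
s = 12                                     # monthly seasonality
t = np.arange(60)
y = 100 + 0.8 * t + 10 * np.sin(2 * np.pi * t / s) + rng.standard_normal(60)

train, test = y[:48], y[48:]
tt = np.arange(48)

# Classical decomposition: fit the trend, then average the detrended
# values by calendar month to get the seasonal indices
trend_coef = np.polyfit(tt, train, 1)
detrended = train - np.polyval(trend_coef, tt)
seas_idx = np.array([detrended[m::s].mean() for m in range(s)])

t_test = np.arange(48, 60)
forecast = np.polyval(trend_coef, t_test) + seas_idx[t_test % s]

mae = np.mean(np.abs(test - forecast))
mape = np.mean(np.abs((test - forecast) / test)) * 100
rmse = np.sqrt(np.mean((test - forecast) ** 2))
```

MAE and RMSE are in the units of the series (RMSE penalizing large misses more), while MAPE is scale-free, which is why the paper reports all three when comparing the six models.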

17.
Classification models can be developed by statistical or mathematical programming discriminant analysis techniques. Variable selection extensions of these techniques allow the development of classification models with a limited number of variables. Although stepwise statistical variable selection methods are widely used, the performance of the resultant classification models may not be optimal because of the stepwise selection protocol and the nature of the group separation criterion. A mixed integer programming approach for selecting variables for maximum classification accuracy is developed in this paper and the performance of this approach, measured by the leave-one-out hit rate, is compared with the published results from a statistical approach in which all possible variable subsets were considered. Although this mixed integer programming approach can only be applied to problems with a relatively small number of observations, it may be of great value where classification decisions must be based on a limited number of observations.
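The all-subsets benchmark and the leave-one-out hit rate are easy to sketch when p is small. A nearest-centroid classifier stands in for the discriminant function here (an assumption; the paper's models are MIP-based), and the data are synthetic:

```python
import itertools
import numpy as np

rng = np.random.default_rng(8)
n, p = 40, 4
y = np.repeat([0, 1], n // 2)
X = rng.standard_normal((n, p))
X[:, 0] += 1.5 * y                        # only feature 0 separates the groups

def loo_hit_rate(cols):
    """Leave-one-out hit rate of a nearest-centroid classifier
    restricted to the feature subset `cols`."""
    hits = 0
    for i in range(n):
        mask = np.arange(n) != i
        Xtr, ytr = X[mask][:, cols], y[mask]
        c0 = Xtr[ytr == 0].mean(0)
        c1 = Xtr[ytr == 1].mean(0)
        xi = X[i, cols]
        pred = int(np.linalg.norm(xi - c1) < np.linalg.norm(xi - c0))
        hits += pred == y[i]
    return hits / n

# Exhaustive search over all non-empty feature subsets
best = max((frozenset(sub) for r in range(1, p + 1)
            for sub in itertools.combinations(range(p), r)),
           key=lambda cols: loo_hit_rate(sorted(cols)))
```

The exhaustive loop is feasible only for small p, which mirrors the paper's remark that the MIP approach is limited to modest problem sizes.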

18.
Taking water-quality monitoring data from an ecological monitoring area in Liaodong Bay as an example, a classification method for such data is given using the singular value decomposition of a matrix and the k-means algorithm as classification tools. The method has the following features: the singular value decomposition simplifies and accelerates the comparison process; setting the number of clusters K dynamically avoids the k-means requirement of fixing the number of clusters in advance; and the assignment of small amounts of newly added monitoring data is also discussed. The method is broadly applicable to the classification of coastal seawater quality monitoring data.

19.
Generalized spectral decomposition estimation of variance components   (cited by 9: 1 self-citation, 8 by others)
For linear mixed models whose random-effects part is a general balanced multi-way classification, the parameter estimation method proposed by Wang Songgui (2002), known as spectral decomposition estimation, is extended to linear mixed models with two variance components and an arbitrary random-effects design matrix. A generalized spectral decomposition estimation method for the variance components is given, and some statistical properties of the resulting estimates are proved. In addition, certain special estimators in the generalized spectral decomposition class are compared with the corresponding ANOVA estimators, and necessary and sufficient conditions for their equality are obtained.

20.
In many problems involving generalized linear models, the covariates are subject to measurement error. When the number of covariates p exceeds the sample size n, regularized methods like the lasso or Dantzig selector are required. Several recent papers have studied methods which correct for measurement error in the lasso or Dantzig selector for linear models in the p > n setting. We study a correction for generalized linear models, based on Rosenbaum and Tsybakov’s matrix uncertainty selector. By not requiring an estimate of the measurement error covariance matrix, this generalized matrix uncertainty selector has a great practical advantage in problems involving high-dimensional data. We further derive an alternative method based on the lasso, and develop efficient algorithms for both methods. In our simulation studies of logistic and Poisson regression with measurement error, the proposed methods outperform the standard lasso and Dantzig selector with respect to covariate selection, by reducing the number of false positives considerably. We also consider classification of patients on the basis of gene expression data with noisy measurements. Supplementary materials for this article are available online.  
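To see why not needing the measurement-error covariance is such an advantage, it helps to look at the classical low-dimensional linear-model correction, which does require that covariance: subtract it from the Gram matrix before solving. A sketch with illustrative values (this is the corrected least squares baseline, not the matrix uncertainty selector itself):

```python
import numpy as np

rng = np.random.default_rng(9)
n, p = 5000, 3
Sigma_u = 0.5 * np.eye(p)              # measurement-error covariance (known here)
X_true = rng.standard_normal((n, p))
W = X_true + rng.multivariate_normal(np.zeros(p), Sigma_u, n)
beta = np.array([1.0, -2.0, 0.5])
y = X_true @ beta + 0.2 * rng.standard_normal(n)

# Naive least squares on the noisy covariates is attenuated
beta_naive = np.linalg.solve(W.T @ W, W.T @ y)
# Corrected estimator: subtract the error covariance from W'W / n
beta_corr = np.linalg.solve(W.T @ W / n - Sigma_u, W.T @ y / n)
```

The correction needs `Sigma_u`, which is rarely known in genomic applications; dispensing with it is precisely the practical gain the abstract attributes to the generalized matrix uncertainty selector.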


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号