Similar Articles
1.
Single-index panel models are widely used across many disciplines, and a rich variety of estimation methods exists, yet few of these methods take within-individual correlation into account. Motivated by this, this paper studies a class of fixed-effects partially linear single-index panel models with within-individual correlation and estimates them by combining the penalized quadratic inference function method with the LSDV method, proving the consistency and asymptotic normality of the resulting estimators. Monte Carlo simulations show excellent finite-sample performance, and the estimation technique is applied to real data.
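The penalized quadratic inference function step has no off-the-shelf implementation; as a minimal illustration of the LSDV half only, the sketch below absorbs individual fixed effects with entity dummies in ordinary least squares on synthetic stand-in data (all names and values are illustrative).

```python
# LSDV sketch: dummy variables absorb the fixed effects; the penalized-QIF
# part of the paper's estimator is omitted here.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n, t = 50, 8
df = pd.DataFrame({
    "id": np.repeat(np.arange(n), t),     # individual identifier
    "x":  rng.normal(size=n * t),         # regressor
})
alpha = np.repeat(rng.normal(size=n), t)  # individual fixed effects
df["y"] = 1.5 * df["x"] + alpha + rng.normal(size=n * t)

lsdv = smf.ols("y ~ x + C(id)", data=df).fit()  # LSDV regression
print(lsdv.params["x"])                          # should be close to 1.5
```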

2.
A random forest feature selection algorithm is applied to the set of candidate credit scoring indicators, and on this basis a credit scoring model combining random forests with naive Bayes is built. An empirical study on the German data set from the UCI repository shows that the random forest plus naive Bayes model with random-forest-based feature selection achieves higher predictive accuracy.
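A minimal sketch of the two-stage pipeline the abstract describes, shown on synthetic stand-in data; the study itself uses the UCI German credit set, which would first need categorical encoding, and the cutoff of ten features here is an arbitrary choice.

```python
# Stage 1: rank features by random-forest importance; Stage 2: naive Bayes
# on the reduced feature set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=24, n_informative=8,
                           random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)
top_k = np.argsort(rf.feature_importances_)[::-1][:10]   # keep top 10 features

print(cross_val_score(GaussianNB(), X[:, top_k], y, cv=5).mean())
```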

3.
This paper reviews the state of research on neural-network credit scoring for companies and proposes a set of selection criteria for building a credit scoring indicator system suited to Chinese enterprises. A credit scoring model based on a BP regression neural network is then built on this indicator system, and its scoring performance is evaluated empirically with V-fold cross-validation on actual indicator data from sample companies.
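As a hedged sketch of the scoring step, scikit-learn's multi-layer perceptron stands in for the paper's BP regression network, with V-fold (here 10-fold) cross-validation on synthetic data.

```python
# BP-style network with V-fold cross-validation; data are synthetic.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=15, random_state=0)
model = make_pipeline(StandardScaler(),
                      MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                                    random_state=0))
print(cross_val_score(model, X, y, cv=10).mean())  # V-fold CV estimate
```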

4.
An eigenvector ranking method for fuzzy judgment matrices (Cited by: 1; self-citations: 0; others: 1)
An eigenvector method for ranking reciprocal judgment matrices is proposed from the perspective of correlation. Using the conversion formulas between the two classes of consistent fuzzy judgment matrices and fully consistent reciprocal judgment matrices, eigenvector ranking methods for fuzzy judgment matrices based on the additive and multiplicative consistency indices are derived. Numerical examples show that the new ranking methods are effective and feasible.
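The conversion formulas between fuzzy and reciprocal judgment matrices are specific to the paper, so this sketch shows only the final eigenvector ranking step: the priority vector of a reciprocal judgment matrix taken as its normalized principal eigenvector (the matrix below is illustrative).

```python
# Principal-eigenvector ranking of a reciprocal pairwise judgment matrix.
import numpy as np

A = np.array([[1,   3,   5  ],
              [1/3, 1,   2  ],
              [1/5, 1/2, 1  ]])            # illustrative reciprocal matrix

vals, vecs = np.linalg.eig(A)
w = np.real(vecs[:, vals.real.argmax()])   # Perron eigenvector
w = w / w.sum()                            # normalized priority weights
print(w)
```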

5.
An Aalen additive hazards model for default with credit contagion (Cited by: 1; self-citations: 0; others: 1)
Tian Jun, Zhou Yong. Acta Mathematicae Applicatae Sinica, 2012, 35(3): 408-420
This paper develops a default prediction model for credit risk based on the additive hazards model. Beyond macroeconomic and firm-specific factors, an industry factor is introduced to capture credit contagion between firms that is distinct from macro effects, thereby overcoming the underestimation of default correlation in earlier models. Under the parametric additive hazards model, maximum likelihood estimators and their asymptotic properties are derived; two estimation methods are proposed and compared, with the optimally weighted estimator proving more efficient. A semiparametric hazards model is also considered, with estimators and asymptotics obtained from martingale-based estimating equations; both approaches perform well.
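A hedged sketch of fitting an Aalen additive hazards model with the third-party lifelines package; the column names and the synthetic data frame are illustrative, not the paper's data or its estimators.

```python
# Aalen additive hazards fit on a synthetic default data set.
import numpy as np
import pandas as pd
from lifelines import AalenAdditiveFitter

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "macro":    rng.normal(size=200),      # macro factor
    "firm":     rng.normal(size=200),      # firm-specific factor
    "industry": rng.normal(size=200),      # industry (contagion) factor
    "T":        rng.exponential(5, 200),   # observed time
    "E":        rng.integers(0, 2, 200),   # default indicator
})
aaf = AalenAdditiveFitter(coef_penalizer=0.5)
aaf.fit(df, duration_col="T", event_col="E")
print(aaf.cumulative_hazards_.head())      # time-varying cumulative effects
```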

6.
The reliability allocation model of a CNC lathe is a hierarchical structure; the key technique in reliability allocation is ranking the importance of bottom-level indicators with respect to the top-level objective, which presupposes that the single-criterion rankings are known. AHP provides reasonable data for single-criterion ranking by constructing pairwise-comparison judgment matrices A_n on the 1-9 ratio scale, but the eigenvalue ranking method based on the consistency test of A_n has been questioned because the determination of its critical values lacks a solid theoretical basis. FAHP, an attempted improvement of AHP, fails because it does not escape the interference of the consistency test. To establish a single-criterion ranking method independent of consistency, the concept of an additive scoring scale is defined; a scale transformation converts the ratio-scale judgment matrix A_n into a scoring-scale judgment matrix B_n, and the additivity of the scoring scale is used to rank the n objects under criterion C. Since B_n is not a positive matrix, no notion of consistency applies to it, so ranking based on the scoring-scale judgment matrix is independent of consistency.

7.
The Analytic Hierarchy Process (AHP) uses pairwise comparison to provide a reasonable data basis for ranking by constructing ratio-scale judgment matrices; the maximum-eigenvalue consistency test on such a matrix is performed so that the normalized eigenvector of the largest eigenvalue can serve as the ranking vector. But because theory cannot supply a sufficient condition for a ratio-scale judgment matrix to be order-preserving, the eigenvalue ranking method has been questioned and AHP needs improvement. This paper argues that the right way to improve AHP is to return to research on ranking methods based on judgment-matrix data. Ranking is feasible when the scale is additive; accordingly, an additive scoring scale is defined and a scale-conversion formula is established to convert the ratio scale into the scoring scale. The paper explains why the consistency test does not help correct ranking and why ranking based on the scoring scale is independent of the consistency test, and thereby establishes a scoring-based ranking method built on the data of scoring-scale judgment matrices.

8.
To assess aero-engine health effectively, an evaluation method is proposed that combines ideal-point-approximating combination weighting with an unascertained measure model. First, subjective and objective single weights for each indicator are obtained with the order-relation analysis method, the indicator-correlation weighting method, and the indicator-difficulty weighting method; the ideal-point-approximating combination weighting method, taking into account the consistency between indicator weights and importance, then yields combination weights that make the weighting more scientific and better separate evaluation objects of different classes. Second, within the unascertained measure model, the K-means algorithm divides each indicator into two opposing grades, each further split into two complementary subgrades, reducing information overlap and further distinguishing different classes of evaluation objects; health scores are then obtained from the unascertained measures of indicators across grades, the combination weights, and the scoring criteria. Finally, a case study of aero-engines and comparisons with other methods verify the effectiveness of the approach.
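A heavily hedged sketch of the K-means grading step alone: one indicator's observed values are split into two opposing grades, whose boundary would feed the (omitted) unascertained-measure scoring; the data and names are synthetic.

```python
# Two-grade split of an indicator via K-means cluster centers.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
indicator = rng.normal(size=(100, 1))      # one engine-health indicator

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(indicator)
centers = np.sort(km.cluster_centers_.ravel())
boundary = centers.mean()                   # boundary between the two grades
print(f"grade I < {boundary:.3f} <= grade II")
```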

9.
选取中美两国2011年1月至2017年4月的公司债和国债月度交易数据,基于SV模型得到两国公司债的信用利差序列,进而对中美两国公司债的信用利差进行时间序列比较分析.实证发现,中国公司债信用利差序列表现出自回归和移动平均特征,而美国公司债信用利差序列则仅呈现自回归特征;在方差结构方面,中国公司债信用利差序列的残差不具有ARCH效应,而美国公司债信用利差序列的残差具有明显的ARCH效应.同时,对中美两国公司债信用利差建立VAR模型并进行脉冲响应分析,发现中美两国信用利差序列的相关性不强,对彼此的冲击的反应均较弱,为债券市场投资者构建跨国市场债券组合来分散信用风险提供决策支持.  相似文献   
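A hedged statsmodels sketch of the VAR and impulse-response step, with placeholder series standing in for the two monthly credit-spread sequences.

```python
# VAR fit and impulse responses on two placeholder spread series.
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(0)
spreads = pd.DataFrame({"cn": rng.normal(size=76).cumsum(),
                        "us": rng.normal(size=76).cumsum()})  # placeholder data

res = VAR(spreads.diff().dropna()).fit(maxlags=4, ic="aic")
irf = res.irf(10)               # impulse responses over 10 months
print(irf.irfs[:, 1, 0])        # response of `us` to a shock in `cn`
```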

10.
Credit evaluation indicators are screened in three successive rounds, using the weight of evidence (WOE) method, stepwise regression, and significance tests on Probit model coefficients, yielding a triple combination screening model that produces a compact indicator set with strong power to discriminate default status; it is applied to 782 micro-enterprise samples from a commercial bank's credit database. An ROC test of the resulting indicator system gives an AUC of 0.9472, supporting the soundness of the WOE-Probit stepwise screening model. Compared with WOE-Logistic stepwise regression, the WOE-Probit model attains a higher AUC, indicating the superiority of the method.
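A minimal sketch of the first-round WOE screening under one common convention (WOE per bin as the log of the good share over the bad share); the toy bins and labels are illustrative.

```python
# Weight of evidence per bin of one indicator; y = 1 marks default.
import numpy as np
import pandas as pd

def woe_table(x_binned: pd.Series, y: pd.Series) -> pd.Series:
    tab = pd.crosstab(x_binned, y)
    good = tab[0] / tab[0].sum()   # distribution of non-defaults across bins
    bad = tab[1] / tab[1].sum()    # distribution of defaults across bins
    return np.log(good / bad)      # WOE per bin

x = pd.Series([1, 1, 2, 2, 3, 3, 3, 1])   # binned indicator values
y = pd.Series([0, 0, 0, 1, 1, 1, 0, 1])   # default labels
print(woe_table(x, y))
```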

11.
The features used may have an important effect on the performance of credit scoring models. The process of choosing the best set of features for credit scoring models is usually unsystematic and dominated by somewhat arbitrary trial. This paper presents an empirical study of four machine learning feature selection methods. These methods provide an automatic data mining technique for reducing the feature space. The study illustrates how the four feature selection methods ('ReliefF', 'Correlation-based', 'Consistency-based' and 'Wrapper' algorithms) help to improve three aspects of the performance of scoring models: model simplicity, model speed and model accuracy. The experiments are conducted on real data sets using four classification algorithms: 'model tree (M5)', 'neural network (multi-layer perceptron with back-propagation)', 'logistic regression', and 'k-nearest-neighbours'.
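Of the four methods, the wrapper is sketched below with scikit-learn's sequential selector; Relief-style and consistency-based filters live in third-party packages (for example skrebate) rather than scikit-learn itself, and the data here are synthetic.

```python
# Wrapper feature selection: forward sequential search scored by CV.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)
wrapper = SequentialFeatureSelector(LogisticRegression(max_iter=1000),
                                    n_features_to_select=5, cv=5)
wrapper.fit(X, y)
print(wrapper.get_support())   # boolean mask of the selected features
```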

12.
In developing a classification model for assigning observations of unknown class to one of a number of specified classes using the values of a set of features associated with each observation, it is often desirable to base the classifier on a limited number of features. Mathematical programming discriminant analysis methods for developing classification models can be extended for feature selection. Classification accuracy can be used as the feature selection criterion by using a mixed integer programming (MIP) model in which a binary variable is associated with each training sample observation, but the binary variable requirements limit the size of problems to which this approach can be applied. Heuristic feature selection methods for problems with large numbers of observations are developed in this paper. These heuristic procedures, which are based on the MIP model for maximizing classification accuracy, are then applied to three credit scoring data sets.
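A hedged PuLP sketch of the underlying accuracy-maximizing MIP idea: one binary misclassification variable per training observation and big-M constraints on a linear score. The small data set, bounds, and constants are illustrative, not the paper's exact model.

```python
# MIP that minimizes the number of misclassified training points.
import numpy as np
from pulp import LpProblem, LpVariable, LpMinimize, lpSum

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)        # labels in {-1, +1}
M, eps = 100.0, 1e-3                              # big-M and margin constants

prob = LpProblem("mip_accuracy", LpMinimize)
w = [LpVariable(f"w{j}", -10, 10) for j in range(3)]
b = LpVariable("b", -10, 10)
z = [LpVariable(f"z{i}", cat="Binary") for i in range(30)]  # 1 = misclassified
prob += lpSum(z)                                  # objective: misclassifications
for i in range(30):
    score = lpSum(float(X[i, j]) * w[j] for j in range(3)) + b
    prob += float(y[i]) * score >= eps - M * z[i]
prob.solve()
print(sum(v.value() for v in z), "training points misclassified")
```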

13.
In software defect prediction with a regression model, too many metrics extracted from static code and aggregated (sum, avg, max, min) from methods into classes can be candidate features, and classical feature selection methods, such as AIC and BIC, must be carried out for a given model. As a result, the selected feature sets differ significantly across models without a reasonable interpretation. The maximal information coefficient (MIC) presented by Reshef et al. [4] is a novel measure of the degree of interdependence between two continuous variables, and a practical computing method based on the observations is also given. This paper first uses the MIC between defect counts and each feature to select features, then applies a power transformation to the selected features, and finally builds principal component Poisson and negative binomial regression models. All experiments are conducted on the KC1 data set in the NASA repository at the class level. The block-regularized $m\times 2$ cross-validated sequential $t$-test is employed to test the difference in performance between two models. The performance measures used are FPA, AAE, and ARE. The experimental results show that 1) MIC selects the aggregated features sum, avg, and max but not min, a selection significantly different from those of AIC and BIC; 2) the power transformation of the features improves performance for the majority of models; 3) after PCA and factor analysis, two clear factors are obtained in the model, one corresponding to the features aggregated via avg and max, and the other to those aggregated with sum, so the model has a reasonable interpretation. In conclusion, the aggregated features with sum, avg, and max are significantly effective for software defect prediction, and the regression model based on the features selected by MIC has some advantages.
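A hedged sketch of the MIC screening step using the third-party minepy package (an implementation of Reshef et al.'s estimator); the defect counts and feature names here are placeholders.

```python
# MIC between the response (defect counts) and each candidate feature.
import numpy as np
from minepy import MINE

rng = np.random.default_rng(0)
defects = rng.poisson(2, size=200)                 # placeholder defect counts
features = {"sum_loc": rng.normal(size=200),
            "avg_cc":  rng.normal(size=200)}       # placeholder aggregated metrics

mine = MINE(alpha=0.6, c=15)                       # minepy's default parameters
for name, x in features.items():
    mine.compute_score(x, defects)
    print(name, round(mine.mic(), 3))              # keep features with high MIC
```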

14.
The number of Non-Performing Loans has increased in recent years, paralleling the current financial crisis, thus increasing the importance of credit scoring models. This study proposes a three stage hybrid Adaptive Neuro Fuzzy Inference System credit scoring model, which is based on statistical techniques and Neuro Fuzzy. The proposed model’s performance was compared with conventional and commonly utilized models. The credit scoring models are tested using a 10-fold cross-validation process with the credit card data of an international bank operating in Turkey. Results demonstrate that the proposed model consistently performs better than the Linear Discriminant Analysis, Logistic Regression Analysis, and Artificial Neural Network (ANN) approaches, in terms of average correct classification rate and estimated misclassification cost. As with ANN, the proposed model has learning ability; unlike ANN, the model does not stay in a black box. In the proposed model, the interpretation of independent variables may provide valuable information for bankers and consumers, especially in the explanation of why credit applications are rejected.

15.
Credit scoring is a method of modelling the potential risk of credit applications. Traditionally, logistic regression and discriminant analysis are the most widely used approaches for creating scoring models in the industry. However, these methods are associated with quite a few limitations, such as instability with high-dimensional data and small sample sizes, intensive variable selection effort and incapability of efficiently handling non-linear features. Most importantly, based on these algorithms, it is difficult to automate the modelling process, and when population changes occur, the static models usually fail to adapt and may need to be rebuilt from scratch. In the last few years, the kernel learning approach has been investigated to solve these problems. However, the existing applications of this type of method (in particular the SVM) in credit scoring have all focused on the batch model and did not address the important problem of how to update the scoring model on-line. This paper presents a novel and practical adaptive scoring system based on an incremental kernel method. With this approach, the scoring model is adjusted according to an on-line update procedure that can always converge to the optimal solution without information loss or running into numerical difficulties. Non-linear features in the data are automatically included in the model through a kernel transformation. This approach does not require any variable reduction effort and is also robust for scoring data with a large number of attributes and highly unbalanced class distributions. Moreover, a new potential kernel function is introduced to further improve the predictive performance of the scoring model, and a kernel attribute ranking technique is used that adds transparency to the final model. Experimental studies using real world data sets have demonstrated the effectiveness of the proposed method.
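The paper's incremental kernel algorithm is not available in standard libraries; as a hedged stand-in, this sketch combines on-line updating with a kernel transform using scikit-learn's Nystroem approximation and SGD with partial_fit on synthetic, unbalanced data.

```python
# On-line scoring: kernel feature map (Nystroem) + incremental SGD updates.
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=1000, n_features=30, weights=[0.9],
                           random_state=0)          # unbalanced classes
phi = Nystroem(kernel="rbf", n_components=100, random_state=0).fit(X[:200])
clf = SGDClassifier(loss="log_loss", random_state=0)
for start in range(0, 1000, 100):                   # batches arriving over time
    batch = slice(start, start + 100)
    clf.partial_fit(phi.transform(X[batch]), y[batch], classes=[0, 1])
print(clf.score(phi.transform(X), y))
```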

16.
One of the aims of credit scoring models is to predict the probability of repayment of any applicant, and yet such models are usually parameterised using a sample of accepted applicants only. This may lead to biased estimates of the parameters. In this paper we examine two issues. First, we compare the classification accuracy of a model based only on accepted applicants, relative to one based on a sample of all applicants. We find only a minimal difference, given the cutoff scores for the old model used by the data supplier. Using a simulated model we examine the predictive performance of models estimated from bands of applicants, ranked by predicted creditworthiness. We find that the lower the risk band of the training sample, the less accurate the predictions for all applicants. We also find that the lower the risk band of the training sample, the greater the overestimate of the true performance of the model when tested on a sample of applicants within the same risk band, as a financial institution would do. The overestimation may be very large. Second, we examine the predictive accuracy of a bivariate probit model with selection (BVP). This parameterises the accept–reject model allowing for (unknown) omitted variables to be correlated with those of the original good–bad model. The BVP model may improve accuracy if the loan officer has overridden a scoring rule. We find that a small improvement is sometimes possible when using the BVP model.

17.
18.
The stock exchanges in China place a stock under special treatment to signal risk when the corresponding listed company fails to meet certain financial performance requirements. Correctly predicting the special treatment of stocks is therefore very important for investors. The performance of prediction models is mainly affected by the selection of explanatory variables and modelling methods. This paper makes a comparison between multi-period hazard models and five widely used single-period static models by investigating a comprehensive set of variables including accounting variables, market variables, characteristic variables and macroeconomic variables. The empirical result shows that the performance of the models is sensitive to the choice of explanatory variables, but there is no significant performance difference between the multi-period hazard models and the single-period static models.
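A hedged sketch of the contrast being tested: a multi-period (discrete-time) hazard model amounts to a logit fit on firm-period rows, while a single-period static model fits one cross-section; all data below are synthetic stand-ins.

```python
# Multi-period hazard (logit on firm-year panel) vs. static one-period logit.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
panel = pd.DataFrame({
    "firm": np.repeat(np.arange(300), 5),          # 300 firms x 5 years
    "year": np.tile(np.arange(5), 300),
    "roa":  rng.normal(size=1500),                 # accounting variable
    "ret":  rng.normal(size=1500),                 # market variable
})
panel["st"] = (rng.random(1500) < 0.05).astype(int)    # ST event indicator

X = sm.add_constant(panel[["roa", "ret"]])
hazard = sm.Logit(panel["st"], X).fit(disp=0)          # multi-period hazard
last = panel[panel["year"] == 4]                       # one cross-section
static = sm.Logit(last["st"], sm.add_constant(last[["roa", "ret"]])).fit(disp=0)
print(hazard.params, static.params, sep="\n")
```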

19.
High-dimensional data have frequently been collected in many scientific areas including genome-wide association studies, biomedical imaging, tomography, tumor classifications, and finance. Analysis of high-dimensional data poses many challenges for statisticians. Feature selection and variable selection are fundamental for high-dimensional data analysis. The sparsity principle, which assumes that only a small number of predictors contribute to the response, is frequently adopted and deemed useful in the analysis of high-dimensional data. Following this general principle, a large number of variable selection approaches via penalized least squares or likelihood have been developed in the recent literature to estimate a sparse model and select significant variables simultaneously. While the penalized variable selection methods have been successfully applied in many high-dimensional analyses, modern applications in areas such as genomics and proteomics push the dimensionality of data to an even larger scale, where the dimension of data may grow exponentially with the sample size. This has been called ultrahigh-dimensional data in the literature. This work aims to present a selective overview of feature screening procedures for ultrahigh-dimensional data. We focus on insights into how to construct marginal utilities for feature screening on specific models and the motivation for model-free feature screening procedures.
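A minimal sketch of marginal (sure independence) screening in the spirit of this overview: rank predictors by a marginal utility, here absolute Pearson correlation, and keep the top d = n/log(n) of them; the data are simulated.

```python
# Sure independence screening by marginal correlation on simulated data.
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5000
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]                         # sparse true coefficients
y = X @ beta + rng.normal(size=n)

util = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])
d = int(n / np.log(n))                              # screening threshold
keep = np.argsort(util)[::-1][:d]                   # screened feature set
print(sorted(keep[:10]))                            # should contain 0, 1, 2
```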

20.
The feature selection problem aims to choose a subset of a given set of features that best represents the whole in a particular aspect, preserving the original semantics of the variables on the given samples and classes. In 2004, a new approach to feature selection was proposed, based on an NP-complete combinatorial optimisation problem called the (\(\alpha ,\beta \))-k-feature set problem. Although effective for many practical cases, which made the approach an important feature selection tool, the only existing solution method, proposed in the original paper, was found not to work well for several instances. Our work aims to fill this gap in the literature by quickly obtaining high-quality solutions for the instances that the existing approach cannot solve. This work proposes a heuristic based on the greedy randomised adaptive search procedure and tabu search to address this problem, along with benchmark instances to evaluate its performance. The computational results show that our method can obtain high-quality solutions for both the real and the proposed artificial instances and requires only a fraction of the computational resources required by the state-of-the-art exact and heuristic approaches, which use mixed integer programming models.
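A generic GRASP skeleton of the kind the abstract combines with tabu search; the (α,β)-k-feature-set specifics (coverage counting, tabu moves) are abstracted behind a placeholder cost function and are assumptions, not the paper's code.

```python
# GRASP: repeated greedy randomised construction, keeping the best solution.
import random

def grasp(candidates, cost, k, iters=50, rcl_size=3, seed=0):
    """Greedy randomised construction; the paper's tabu-search improvement
    phase would slot in where noted."""
    rng = random.Random(seed)
    best, best_val = None, float("inf")
    for _ in range(iters):
        sol = set()
        while len(sol) < k:
            # Rank remaining candidates by marginal cost and pick at random
            # from the restricted candidate list (RCL).
            ranked = sorted((c for c in candidates if c not in sol),
                            key=lambda c: cost(sol | {c}))
            sol.add(rng.choice(ranked[:rcl_size]))
        val = cost(sol)          # ... tabu-search local improvement here ...
        if val < best_val:
            best, best_val = sol, val
    return best, best_val

# Toy usage: pick k numbers whose sum is closest to a target of 30.
print(grasp(range(20), lambda s: abs(sum(s) - 30), k=4))
```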

