Similar literature
 20 similar documents found (search time: 15 ms)
1.
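Application of data envelopment analysis in credit evaluation   Total citations: 1, self-citations: 0, citations by others: 1
Proposes a data envelopment analysis model based on a set of rejected cases, together with a classification method whose boundary is a piecewise-linear separating hyperplane. It examines how to evaluate the credit of new units when information is available on only one side (units with poor credit), and gives a concrete method for assessing the credit status of decision-making units. An application example shows that the proposed model and method are feasible.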

2.
The 2004 Basel II Accord has pointed out the benefits of credit risk management through internal models using internal data to estimate risk components: probability of default (PD), loss given default, exposure at default and maturity. Internal data are the primary data source for PD estimates; banks are permitted to use statistical default prediction models to estimate the borrowers’ PD, subject to some requirements concerning accuracy, completeness and appropriateness of data. However, in practice, internal records are usually incomplete or do not contain adequate history to estimate the PD. Missing data are especially critical for low default portfolios, which are characterised by inadequate default records, making it difficult to design statistically significant prediction models. Several methods might be used to deal with missing data, such as list-wise deletion, application-specific list-wise deletion, substitution techniques or imputation models (simple and multiple variants). List-wise deletion is an easy-to-use method widely applied by social scientists, but it loses substantial data and reduces the diversity of information, resulting in a bias in the model's parameters, results and inferences. The choice of the best method to solve the missing data problem largely depends on the nature of the missing values (MCAR, MAR and MNAR processes), but there is a lack of empirical analysis of their effect on credit risk, which limits the validity of the resulting models. In this paper, we analyse the nature and effects of missing data in credit risk modelling (MCAR, MAR and MNAR processes) using a scarce data set on consumer borrowers, which includes different percentages and distributions of missing data.
The findings are used to analyse the performance of several methods for dealing with missing data, such as list-wise deletion, simple imputation methods, MLE models and advanced multiple imputation (MI) alternatives based on Markov chain Monte Carlo and re-sampling methods. Results are evaluated and compared across models in terms of robustness, accuracy and complexity. In particular, MI models are found to provide very valuable solutions with regard to credit risk missing data.
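The contrast between list-wise deletion and simple imputation described in this abstract can be illustrated with a minimal sketch. The records, field names and helper functions below are hypothetical and are not taken from the paper; the imputation shown is single mean imputation, not the multiple imputation the authors favour.

```python
from statistics import mean

# Toy applicant records; None marks a missing value (hypothetical data).
records = [
    {"income": 30.0, "age": 25},
    {"income": None, "age": 40},
    {"income": 50.0, "age": None},
    {"income": 70.0, "age": 35},
]

def listwise_delete(rows):
    """Keep only the rows in which every field is observed."""
    return [r for r in rows if all(v is not None for v in r.values())]

def mean_impute(rows):
    """Replace each missing value with the mean of the observed values in its column."""
    cols = list(rows[0].keys())
    col_means = {c: mean(r[c] for r in rows if r[c] is not None) for c in cols}
    return [{c: (r[c] if r[c] is not None else col_means[c]) for c in cols}
            for r in rows]

complete = listwise_delete(records)   # rows with any gap are dropped
imputed = mean_impute(records)        # every row is kept, gaps are filled
print(len(complete), len(imputed))
```

List-wise deletion discards half of this toy sample, which is the information loss the abstract warns about, while mean imputation keeps every row at the cost of understating variability.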

3.
Most current implementations of multiple imputation (MI) assume that data are missing at random (MAR), but this assumption is generally untestable. We performed analyses to test the effects of auxiliary variables on MI when the data are missing not at random (MNAR) using simulated data and real data. In the analyses we varied (a) the correlation, (b) the level of missing data, (c) the pattern of missing data, and (d) sample size. Results showed that MI performed adequately without auxiliary variables, but that auxiliary variables had a modest impact on bias in the real data and improved efficiency in both data sets. The results of this study suggest that, counter to the concern about the violation of the MAR assumption, MI appears to be quite robust to missing data that are MNAR in analytic situations such as the ones presented here. Further, results can be made even better via the use of auxiliary variables, particularly when efficiency is a primary concern.

4.
The number of Non-Performing Loans has increased in recent years, paralleling the current financial crisis, thus increasing the importance of credit scoring models. This study proposes a three-stage hybrid Adaptive Neuro Fuzzy Inference System credit scoring model, which is based on statistical techniques and Neuro Fuzzy. The proposed model’s performance was compared with conventional and commonly utilized models. The credit scoring models are tested using a 10-fold cross-validation process with the credit card data of an international bank operating in Turkey. Results demonstrate that the proposed model consistently performs better than the Linear Discriminant Analysis, Logistic Regression Analysis, and Artificial Neural Network (ANN) approaches, in terms of average correct classification rate and estimated misclassification cost. As with ANN, the proposed model has learning ability; unlike ANN, the model does not stay in a black box. In the proposed model, the interpretation of independent variables may provide valuable information for bankers and consumers, especially in the explanation of why credit applications are rejected.

5.
If a credit scoring model is built using only applicants who have been previously accepted for credit, such a non-random sample selection may produce bias in the estimated model parameters, and accordingly the model's predictions of repayment performance may not be optimal. Previous empirical research suggests that omission of rejected applicants has a detrimental impact on model estimation and prediction. This paper explores the extent to which, given the previous cutoff score applied to decide on accepted applicants, the number of included variables influences the efficacy of a commonly used reject inference technique, reweighting. The analysis benefits from the availability of a rare sample, where virtually no applicant was denied credit. The general indication is that the efficacy of reject inference is little influenced by either model leanness or interaction between model leanness and the rejection rate that determined the sample. However, there remains some hint that very lean models may benefit from reject inference where modelling is conducted on data characterized by a very high rate of applicant rejection.

6.
Credit applicants are assigned to good or bad risk classes according to their record of defaulting. Each applicant is described by a high-dimensional input vector of situational characteristics and by an associated class label. A statistical model, which maps the inputs to the labels, can decide whether a new credit applicant should be accepted or rejected, by predicting the class label given the new inputs. Support vector machines (SVM) from statistical learning theory can build such models from the data, requiring extremely weak prior assumptions about the model structure. Furthermore, SVM divide a set of labelled credit applicants into subsets of ‘typical’ and ‘critical’ patterns. The correct class label of a typical pattern is usually very easy to predict, even with linear classification methods. Such patterns do not contain much information about the classification boundary. The critical patterns (the support vectors) contain the less trivial training examples. For instance, linear discriminant analysis with prior training subset selection via SVM also leads to improved generalization. Using non-linear SVM, more ‘surprising’ critical regions may be detected, but owing to the relative sparseness of the data, this potential seems to be limited in credit scoring practice.

7.
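Discrimination of membership-degree vectors of dynamic indicators in the credit evaluation of agriculture-related enterprises   Total citations: 2, self-citations: 0, citations by others: 2
This study discriminates the membership-degree vectors of dynamic indicators in the credit evaluation of agriculture-related enterprises. First, drawing on the idea of the X-12-ARIMA seasonal adjustment method, the credit data are decomposed to construct a process-continuous dynamic credit indicator. Second, a triple exponential smoothing time-series model is used to forecast changes in the dynamic credit data, yielding the membership-degree vector of the dynamic credit indicator. Third, combined with weights determined by the entropy weight-AHP method, the comprehensive membership-degree vector of the dynamic credit indicator is determined. Finally, an empirical test verifies the effectiveness of the method in enterprise credit evaluation.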

8.
This article seeks to gain insight into the influence of sample bias in a consumer credit scoring model. In earlier research, sample bias has been suggested to pose a sizeable threat to predictive performance and profitability due to its implications on either population drainage or biased estimates. Contrary to previous (mainly theoretical) research on sample bias, the unique features of the data set used in this study provide the opportunity to investigate the issue in an empirical setting. Based on the data of a mail-order company offering short-term consumer credit to its customers, we show that (i) given a certain sample size, sample bias has a significant effect on consumer credit-scoring performance and profitability, (ii) its effect is composed of the inclusion of rejected orders in the scoring model and, to a lesser extent, the inclusion of these orders into the variable-selection process, and (iii) the impact of the effect of sample bias on consumer credit-scoring performance and profitability is modest.

9.
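Against the background of massive credit-reporting data, a shrinkage nearest-neighbor imputation method is proposed to reduce the computational cost of imputing missing data. The method completes imputation in three stages. The first stage computes inclusion probabilities from the missing proportions of samples and variables and shrinks the data through unequal-probability sampling. The second stage uses between-sample distances to select the samples nearest to each sample with missing values as a training set. The third stage builds a random forest model for iterative imputation. Simulation studies on the Australian data set and data sets from Chinese banks show that, while maintaining a given level of imputation accuracy, the method substantially reduces the amount of computation.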

10.
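Most current studies assess the credit risk of individual enterprises by computing the enterprise's annual distance to default with the model established in 1997 by KMV Corporation of San Francisco, USA (the KMV model). However, there is no method for assessing the credit risk of credit industries, and this approach cannot give credit risk that varies over time. This paper first proposes a data-based credit risk assessment model for the dynamic evolution of credit industries over time, and then uses 2016 data for 18 industries to obtain the dynamically evolving credit risk of China's credit industries. The time evolution of this credit risk falls into four types: fluctuating rise, decline followed by fluctuation, decline followed by stability, and stable. Further study finds that three industries (finance; scientific research and technical services; and information transmission, software and technical services) have dynamically evolving credit risk with a high and unstable average; accommodation and catering has very high but relatively stable credit risk; and the other industries have lower and relatively stable credit risk.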
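For reference, the distance to default used in the KMV model is commonly written as below. This is the usual textbook form rather than anything stated in the abstract, and the symbols are notation assumed here: $V_A$ is the market value of the firm's assets, $\sigma_A$ the asset volatility, $DPT$ the default point, and $STD$ and $LTD$ the short-term and long-term debt.

```latex
\[
DD = \frac{V_A - DPT}{V_A\,\sigma_A},
\qquad
DPT = STD + 0.5\,LTD
\]
```

A larger $DD$ means the asset value sits more standard deviations above the default point, which corresponds to lower credit risk.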

11.
This study proposes and analyses a novel alternative to the credit transition matrices (CTMs) developed by credit rating agencies: bank-sourced CTMs. It provides a unique insight into the estimation of bank-sourced CTMs by assessing the extent to which the CTMs depend on the characteristics of the underlying credit risk datasets and on the aggregation method, and it shows that the choice of aggregation approach has a substantial effect on credit risk model results. Further, we show that bank-sourced CTMs are more dynamic than those of credit rating agencies, with higher off-diagonal transition rates and a higher propensity to upgrade. Finally, we create a set of industry-specific CTMs, otherwise unobtainable due to the data sparsity faced by credit rating agencies, and highlight the implications of their differences, signalling the existence of industry-specific business cycles. The study uses a unique and large dataset of internal credit risk estimates from 24 global banks covering monthly observations on more than 26,000 large corporates and employs large-scale Monte Carlo simulations. This approach can be replicated by regulators (e.g., using data collected by the European Central Bank in the AnaCredit project) and used by organisations aiming to improve their credit risk models.
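As a rough illustration of what a credit transition matrix is, the sketch below estimates one-period transition probabilities by counting observed rating moves (the so-called cohort approach). The abstract does not say which estimator or aggregation method the study uses, so the function name, the rating labels and the toy histories are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def cohort_transition_matrix(histories, grades):
    """Estimate one-period transition probabilities by counting observed moves.

    histories: list of rating sequences, one per obligor, in time order.
    grades: ordered list of rating labels.
    """
    counts = defaultdict(Counter)
    for seq in histories:
        # Pair each rating with the rating observed one period later.
        for start, end in zip(seq, seq[1:]):
            counts[start][end] += 1
    matrix = {}
    for g in grades:
        total = sum(counts[g].values())
        # Grades never observed as a starting state get an all-zero row.
        matrix[g] = {h: (counts[g][h] / total if total else 0.0) for h in grades}
    return matrix

# Hypothetical rating histories for four obligors.
histories = [
    ["A", "A", "B"],
    ["A", "B", "B"],
    ["B", "A", "A"],
    ["B", "B", "C"],
]
ctm = cohort_transition_matrix(histories, ["A", "B", "C"])
for grade, row in ctm.items():
    print(grade, row)
```

The off-diagonal entries of each row are the upgrade and downgrade rates that the abstract compares between bank-sourced and agency matrices.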

12.
Quality credit is a new concept invented in China. To the best of our knowledge, there is no widely accepted quality credit indicator system, and no quantitative method has yet been employed in quality credit evaluation. To take research on quality credit a step further, this paper aims to establish a quality credit evaluation indicator system for air-conditioning enterprises in the Chinese market and to use the TOPSIS (technique for order preference by similarity to ideal solution) method to evaluate the quality credit of these enterprises. Based on the data of 8 air-conditioning enterprises, including 6 Chinese enterprises and 2 Japanese enterprises, three experiments with three different indicator systems are used to determine the final indicator system and verify the feasibility and effectiveness of TOPSIS. In Experiment one, an original indicator system is established to evaluate the quality credit of the 8 enterprises. In Experiments two and three, two reasonably adjusted indicator systems are used, and the indicator system in Experiment three is the final one that we recommend. The analysis of the experiments verifies that the proposed quality credit indicator system is reliable and that TOPSIS is suitable for quality credit evaluation.
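The TOPSIS procedure named in this abstract can be sketched in a few lines. This is a generic version under stated assumptions: vector normalisation, user-supplied weights and Euclidean distances are common choices but are not confirmed by the abstract, and the indicators and values below are invented for illustration.

```python
from math import sqrt

def topsis(matrix, weights, benefit):
    """Return closeness scores in [0, 1]; higher means closer to the ideal solution.

    matrix: rows are alternatives, columns are indicators.
    weights: one weight per indicator.
    benefit: True for larger-is-better indicators, False for cost indicators.
    Assumes no column is all zeros and that the alternatives are not all identical.
    """
    n_cols = len(weights)
    # Vector-normalise each column, then apply the weights.
    norms = [sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_cols)] for row in matrix]
    # Ideal and anti-ideal points, column by column.
    cols = list(zip(*v))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_best = sqrt(sum((x - b) ** 2 for x, b in zip(row, best)))
        d_worst = sqrt(sum((x - w) ** 2 for x, w in zip(row, worst)))
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Three hypothetical enterprises scored on a quality index (benefit)
# and a complaint rate (cost).
matrix = [[9.0, 1.0], [5.0, 3.0], [1.0, 5.0]]
scores = topsis(matrix, weights=[0.5, 0.5], benefit=[True, False])
print(scores)
```

Ranking the alternatives by descending score gives the TOPSIS ordering; changing the indicator set, as in the paper's three experiments, amounts to changing the columns of `matrix`.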

13.
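Liu Chao, Li Yuanrui, Xie Jing. 《运筹与管理》 2022, 31(6): 147-153
In credit risk identification, clustering algorithms are often used to separate samples of different risk levels and to identify risk features. However, this field usually involves high-dimensional data, to which traditional clustering algorithms are poorly suited: they easily fall into local optima, are disturbed by redundant features, and lack robustness. Using high-dimensional credit risk data, this paper studies the credit risk of listed companies, builds a three-objective optimization model for identifying credit risk features, and designs a decomposition-based multi-objective subspace clustering algorithm to solve it. Comparative experiments against other algorithms demonstrate the advantages of the proposed algorithm in clustering accuracy and robustness. Based on the weights assigned by the clustering algorithm, the paper summarizes the indicators that deserve particular attention when assessing the credit risk of listed companies.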

14.
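Building on distance discrimination based on sets of feature vectors, a new discriminant analysis method is proposed that attempts to overcome the heavy computation and the poor performance on complex data of existing discriminant analysis methods. The method is applied to enterprise credit evaluation and compared with traditional discriminant methods and some improved ones. Experimental results show that the method improves the accuracy of enterprise credit evaluation.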

15.
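Factor analysis is applied to evaluate partner credit in the context of dynamic alliances for construction projects. From data obtained through websites and questionnaire surveys, four common factors are extracted: profitability and growth capability, operating capability, innovation capability, and contract-fulfilment capability. The degree to which the four common factors and 16 indicators influence partner credit is determined. Correlation analysis finds that partner credit is closely related to partner performance. Finally, the credit levels of 17 of the surveyed enterprises are comprehensively evaluated, providing a basis for them to adopt effective strategies to raise their credit levels and a reference for the alliance-leading enterprise when selecting partners.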

16.
The logistic regression framework has long been the most used statistical method for assessing customer credit risk. Recently, a more pragmatic approach has been adopted, where the first issue is credit risk prediction instead of explanation. In this context, several classification techniques, support vector machines among others, have been shown to perform well on credit scoring. While the investigation of better classifiers is an important research topic, the specific methodology chosen in real-world applications has to deal with the challenges arising from the real-world data collected in the industry. Such data are often highly unbalanced, part of the information can be missing, and some common hypotheses, such as the i.i.d. one, can be violated. In this paper we present a case study based on a sample of IBM Italian customers, which presents all the challenges mentioned above. The main objective is to build and validate robust models able to handle missing information, class imbalance and non-i.i.d. data points. We define a missing data imputation method and propose the use of an ensemble classification technique, subagging, which is particularly suitable for highly unbalanced data such as credit scoring data. Both the imputation and subagging steps are embedded in a customized cross-validation loop, which handles dependencies between different credit requests. The methodology has been applied using several classifiers (kernel support vector machines, nearest neighbors, decision trees, Adaboost) and their subagged versions. The use of subagging improves the performance of the base classifier, and we show that subagged decision trees achieve better performance while keeping the model simple and reasonably interpretable.
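Subagging (subsample aggregating) fits a base learner on several subsamples drawn without replacement and combines the predictions by vote. The sketch below shows only that general idea: the one-feature threshold learner, the function names and the toy data are assumptions for illustration, and the paper's own imputation step, cross-validation loop and base classifiers are not reproduced.

```python
import random
from statistics import mean

def fit_mean_threshold(sample):
    """Toy base learner: threshold halfway between the two class means."""
    m0 = mean(x for x, y in sample if y == 0)
    m1 = mean(x for x, y in sample if y == 1)
    threshold = (m0 + m1) / 2
    # Predict class 1 on the side of the threshold where the class-1 mean lies.
    return lambda x: int((x > threshold) == (m1 > m0))

def subagging(data, n_models, subsample_size, rng):
    """Fit base learners on subsamples drawn without replacement; predict by majority vote.

    Assumes data contains both classes; otherwise the redraw loop never ends.
    """
    models = []
    while len(models) < n_models:
        sample = rng.sample(data, subsample_size)
        if len({y for _, y in sample}) < 2:
            continue  # redraw if one class is absent from the subsample
        models.append(fit_mean_threshold(sample))
    return lambda x: int(sum(m(x) for m in models) * 2 > len(models))

# Unbalanced toy data: 18 non-defaulters (class 0) and 2 defaulters (class 1).
data = [(float(i % 5), 0) for i in range(18)] + [(8.0, 1), (9.0, 1)]
predict = subagging(data, n_models=11, subsample_size=10, rng=random.Random(0))
print(predict(9.0), predict(1.0))
```

Using an odd number of models avoids ties in the vote; in practice the subsample size and the base learner would be tuned to the degree of class imbalance.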

17.
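Application of the fuzzy influence diagram evaluation algorithm in credit risk assessment for supply chain finance   Total citations: 1, self-citations: 0, citations by others: 1
Risk evaluation under the traditional bank credit model focuses on the financial data of individual enterprises. Credit risk evaluation under the new supply chain finance model differs from that under traditional financing: its scope is wider and its uncertainty factors are more complex. Based on an analysis of the credit risk evaluation system for supply chain finance, a fuzzy influence diagram evaluation model is built by combining fuzzy set theory and influence diagram theory. Problems that are hard to quantify in the assessment are handled by fuzzification, the fuzzy influence relationships between variables are analysed, and finally the probability distribution of credit risk is computed. The method combines qualitative and quantitative analysis and offers a new approach to risk assessment under the new supply chain finance model.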

18.
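This paper constructs a credit risk evaluation system for commercial banks covering asset quality, capital adequacy, management and control effectiveness, profitability, liquidity, and social sensitivity. Large-sample data are simulated according to the smoothing-expansion principle to expand the rating scores, and banks' credit risk grades are then divided on the basis of the expanded large sample. This resolves the difficulty that credit grades cannot be reasonably divided when the sample is small. The empirical analysis shows that the bank ratings obtained in this paper have the same ordering as the evaluations provided by Standard & Poor's. The model can therefore be used to rate the risk of the majority of banks that have not been rated by authoritative international agencies.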

19.
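The high dimensionality and high correlation of current credit risk data for listed companies seriously affect the accuracy of credit risk models. This paper therefore combines existing algorithms with the characteristics of credit risk models to design a new nonparametric variable selection method. Analysing and screening the variables related to the credit risk of listed companies with this method can eliminate the noise variables and linearly correlated variables contained in the data set. The paper also designs an algorithm for finding the optimal solution of the method in high variable dimensions. Taking the Logistic model as an example, an empirical analysis of the credit risk of listed companies shows that, compared with previous variable selection methods, the method effectively reduces data dimensionality, eliminates correlation between variables, and at the same time improves the reliability and predictive accuracy of the model.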

20.
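Customer credit evaluation is an important part of the daily operations of banks and other financial enterprises. In general, default samples are a small minority of all customers while samples of customers who repay on time form the majority; this is the class imbalance problem common in customer credit evaluation. Current methods for customer credit evaluation cannot yet effectively handle the class imbalance caused by the scarcity of minority-class samples. This study introduces transfer learning to integrate information from inside and outside the system in order to address this problem. To use minority-class sample information from outside the system more efficiently, a new transfer learning model is constructed: starting from an ensemble-based transfer bagging model, two-stage sampling and the group method of data handling are used to improve its base-model generation and its ensemble strategy, respectively. An empirical study on credit card customer data from a commercial bank in Chongqing shows that, compared with methods commonly used for customer credit evaluation, the new model better handles the effect of class imbalance under absolute scarcity, and in particular predicts the minority of defaulting customers more accurately.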
