Similar documents
 Found 20 similar documents; search time: 15 ms
1.
The Korean government has been funding small and medium enterprises (SMEs) with superior technology on the basis of a scorecard. However, a high default rate among funded SMEs has been reported. To manage such government funds effectively, it is important to develop an accurate scoring model for SMEs. In this paper, we provide a random effects logistic regression model to predict the default of funded SMEs based on both financial and non-financial factors. The advantage of such a random effects model lies in its ability to accommodate not only the individual characteristics of each SME but also the uncertainty that cannot be explained by such individual factors. We expect our study to contribute to the effective management of government funds by proposing prediction models for defaults of funded SMEs.
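
As a rough illustration of what a random effects logistic regression looks like, one common form is the random intercept model below. The abstract does not give the paper's exact specification, so the clustering index j, the covariate vector x, and the normal distribution of the random effect are assumptions made only for this sketch.

```latex
% Random intercept logistic model (illustrative notation, not taken from the paper)
% y_{ij}: default indicator for observation i in cluster j
% x_{ij}: financial and non-financial covariates
% u_j:    random effect capturing variation not explained by the covariates
\[
  \operatorname{logit} \Pr\left(y_{ij} = 1 \mid u_j\right)
    = \mathbf{x}_{ij}^{\top} \boldsymbol{\beta} + u_j,
  \qquad u_j \sim \mathcal{N}\left(0, \sigma_u^{2}\right)
\]
```

Setting the variance of u_j to zero recovers ordinary logistic regression, which is why the random effect can be read as the part of default risk that the observed factors leave unexplained.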

2.
Uniform boundedness of output variables is a standard assumption in most theoretical analyses of regression algorithms. This standard assumption has recently been weakened to a moment hypothesis in the least squares regression (LSR) setting. Although there is a large literature on error analysis for LSR under the moment hypothesis, very little is known about the statistical properties of support vector machine regression with unbounded sampling. In this paper, we fill this gap in the literature. Without any restriction on the boundedness of the output sampling, we establish an ad hoc convergence analysis for support vector machine regression under very mild conditions.

3.
We reformulate the problem of determining support vectors directly as an application of Bayes' classifiers rather than as the dual program to a binary geometric separation problem. The primary purpose of the reformulation is to create a simpler exposition of the support vector machines technique. A secondary advantage is that it immediately and naturally applies to multi-class classification problems where the kernel function can be normalized as a density.

4.
Loss given default modelling has become crucially important for banks due to the requirement that they comply with the Basel Accords and to their internal computations of economic capital. In this paper, support vector regression (SVR) techniques are applied to predict loss given default of corporate bonds, where improvements are proposed to increase prediction accuracy by modifying the SVR algorithm to account for heterogeneity of bond seniorities. We compare the predictions from SVR techniques with thirteen other algorithms. Our paper has three important results. First, at an aggregated level, the proposed improved versions of support vector regression techniques outperform other methods significantly. Second, at a segmented level, by bond seniority, least squares support vector regression demonstrates significantly better predictive abilities compared with the other statistical models. Third, standard transformations of loss given default do not improve prediction accuracy. Overall, our empirical results show that support vector regression techniques are a promising tool for banks to use to predict loss given default.

5.
Support Vector Machines (SVMs) are now very popular as a powerful method in pattern classification problems. One of the main features of SVMs is that they produce a separating hyperplane which maximizes the margin in the feature space induced by a nonlinear mapping through a kernel function. As a result, SVMs can treat not only linear separation but also nonlinear separation. While the soft margin method of SVMs considers only the distance between the separating hyperplane and misclassified data, we propose in this paper a multi-objective programming formulation that considers surplus variables. A similar formulation was extensively researched in linear discriminant analysis, mostly in the 1980s, by using Goal Programming (GP). This paper compares these conventional methods, such as SVMs and GP, with our proposed formulation through several examples. Received: September 2003; revised: December 2003.

6.
Supervised classification is an important part of corporate data mining to support decision making in customer-centric planning tasks. The paper proposes a hierarchical reference model for support vector machine based classification within this discipline. The approach balances the conflicting goals of transparent yet accurate models and compares favourably to alternative classifiers in a large-scale empirical evaluation in real-world customer relationship management applications. Recent advances in support vector machine oriented research are incorporated to approach feature, instance and model selection in a unified framework.

7.
In recent years, support vector machines (SVMs) have been successfully applied to a wide range of applications. However, since the classifier is described as a complex mathematical function, it is rather incomprehensible for humans. This opacity prevents SVMs from being used in many real-life applications where both accuracy and comprehensibility are required, such as medical diagnosis and credit risk evaluation. To overcome this limitation, rules can be extracted from the trained SVM that are interpretable by humans and keep as much of the accuracy of the SVM as possible. In this paper, we provide an overview of the recently proposed rule extraction techniques for SVMs and introduce two others taken from the artificial neural networks domain, namely Trepan and G-REX. The described techniques are compared using publicly available datasets, such as Ripley’s synthetic dataset and the multi-class iris dataset. We also look at medical diagnosis and credit scoring, where comprehensibility is a key requirement and even a regulatory recommendation. Our experiments show that the SVM rule extraction techniques lose only a small percentage in performance compared to SVMs and therefore rank at the top of comprehensible classification techniques.

8.
Discrete support vector machines (DSVM), originally proposed for binary classification problems, have been shown to outperform other competing approaches on well-known benchmark datasets. Here we address their extension to multicategory classification, by developing three different methods. Two of them are based respectively on one-against-all and round-robin classification schemes, in which a number of binary discrimination problems are solved by means of a variant of DSVM. The third method directly addresses the multicategory classification task, by building a decision tree in which an optimal split to separate classes is derived at each node by a new extended formulation of DSVM. Computational tests on publicly available datasets are then conducted to compare the three multicategory classifiers based on DSVM with other methods, indicating that the proposed techniques achieve significantly higher accuracies. This research was partially supported by PRIN grant 2004132117.
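
The one-against-all and round-robin schemes mentioned above differ in how a multicategory task is broken into binary problems. The sketch below only shows that relabelling step; it does not implement DSVM, and the function names are hypothetical, chosen for illustration.

```python
from itertools import combinations


def one_against_all_problems(labels):
    """One binary problem per class: that class (+1) versus all others (-1)."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else -1 for y in labels] for c in classes}


def round_robin_problems(labels):
    """One binary problem per unordered pair of classes.

    Each problem keeps only the samples of its two classes,
    stored as (sample index, +1 or -1) pairs.
    """
    classes = sorted(set(labels))
    problems = {}
    for a, b in combinations(classes, 2):
        problems[(a, b)] = [
            (i, 1 if y == a else -1)
            for i, y in enumerate(labels)
            if y in (a, b)
        ]
    return problems


# Toy label vector with three classes (made-up data).
labels = ["a", "b", "c", "a", "c"]
ova = one_against_all_problems(labels)
rr = round_robin_problems(labels)
```

With K classes, one-against-all yields K binary problems on the full sample, while round-robin yields K(K-1)/2 smaller problems, each restricted to two classes.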

9.
Knowledge based proximal support vector machines   Total citations: 1, self-citations: 0, citations by others: 1
We propose a proximal version of the knowledge based support vector machine formulation, termed knowledge based proximal support vector machines (KBPSVMs) in the sequel, for binary data classification. The KBPSVM classifier incorporates prior knowledge in the form of multiple polyhedral sets, and determines two parallel planes that are kept as distant from each other as possible. The proposed algorithm is simple and fast as no quadratic programming solver needs to be employed. Effectively, only the solution of a structured system of linear equations is needed.
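
To make the "linear equations instead of quadratic programming" point concrete, here is a minimal sketch of a plain linear proximal SVM, without the knowledge-based polyhedral sets that the paper adds. It assumes the usual proximal formulation, in which the weights w and offset gamma solve (I/nu + E^T E) z = E^T d with E = [A, -e], A the data matrix, and d the vector of +1/-1 labels. The function names and the toy data are illustrative only.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z


def train_proximal_svm(points, labels, nu=1.0):
    """Return (w, gamma) of a linear proximal SVM; labels must be +1 or -1."""
    E = [list(p) + [-1.0] for p in points]  # E = [A, -e]
    k = len(E[0])
    # Left-hand side: I/nu + E^T E (symmetric positive definite).
    lhs = [
        [sum(row[i] * row[j] for row in E) + (1.0 / nu if i == j else 0.0)
         for j in range(k)]
        for i in range(k)
    ]
    # Right-hand side: E^T d.
    rhs = [sum(row[i] * d for row, d in zip(E, labels)) for i in range(k)]
    z = solve_linear_system(lhs, rhs)
    return z[:-1], z[-1]


def predict(w, gamma, x):
    """Classify x by the sign of w.x - gamma."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) - gamma >= 0 else -1


# Toy, linearly separable data (made up for illustration).
points = [(0, 0), (1, 0), (0, 1), (3, 3), (4, 3), (3, 4)]
labels = [-1, -1, -1, 1, 1, 1]
w, gamma = train_proximal_svm(points, labels, nu=1.0)
```

Training reduces to one (n+1) by (n+1) linear solve, where n is the number of features, which is the property the abstract highlights.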

10.
Method: In this paper, we introduce a bi-level optimization formulation for the model and feature selection problems of support vector machines (SVMs). A bi-level optimization model is proposed to select the best model, where the standard convex quadratic optimization problem of SVM training is cast as a subproblem. Feasibility: The optimal objective value of the quadratic problem of SVMs is minimized over a feasible range of the kernel parameters at the master level of the bi-level model. Since the optimal objective value of the subproblem is a continuous function of the kernel parameters, though implicitly defined over a certain region, a solution of this bi-level problem always exists. The problem of feature selection can be handled in a similar manner. Experiments and results: Two approaches for solving the bi-level problem of model and feature selection are considered as well. Experimental results show that the bi-level formulation provides a plausible tool for model selection.

11.
Transductive learning involves the construction and application of prediction models to classify a fixed set of decision objects into discrete groups. It is a special case of classification analysis with important applications in web-mining, corporate planning and other areas. This paper proposes a novel transductive classifier that is based on the philosophy of discrete support vector machines. We formalize the task of estimating the class labels of decision objects as a mixed integer program. A memetic algorithm is developed to solve the mathematical program and to construct a transductive support vector machine classifier. Empirical experiments on synthetic and real-world data provide evidence of the effectiveness of the new approach and demonstrate that it identifies high quality solutions in a short time. Furthermore, the results suggest that the class predictions following from the memetic algorithm are significantly more accurate than the predictions of a CPLEX-based reference classifier. Comparisons to other transductive and inductive classifiers provide further support for our approach and suggest that it performs competitively with respect to several benchmarks.

12.
We propose using support vector machines (SVMs) to learn the efficient set in multiple objective discrete optimization (MODO). We conjecture that a surface generated by an SVM could provide a good approximation of the efficient set. As one way of testing this idea, we embed the SVM-approximated efficient set information into a Genetic Algorithm (GA). This is accomplished by using an SVM-based fitness function that guides the GA search. We implement our SVM-guided GA on the multiple objective knapsack and assignment problems. We observe that using the SVM improves the performance of the GA compared to a benchmark distance-based fitness function and may provide competitive results.

13.
A convergent decomposition algorithm for support vector machines   Total citations: 1, self-citations: 0, citations by others: 1
In this work we consider nonlinear minimization problems with a single linear equality constraint and box constraints. In particular, we are interested in solving problems where the number of variables is so large that traditional optimization methods cannot be directly applied. Many interesting real-world problems lead to the solution of large-scale constrained problems with this structure. For example, the special subclass of problems with a convex quadratic objective function plays a fundamental role in the training of Support Vector Machines, a technique for machine learning problems. For this particular subclass of convex quadratic problems, some convergent decomposition methods, based on the solution of a sequence of smaller subproblems, have been proposed. In this paper we define a new globally convergent decomposition algorithm that differs from the previous methods in the rule for the choice of the subproblem variables and in the presence of a proximal point modification in the objective function of the subproblems. In particular, the new rule for sequentially selecting the subproblems appears to be suited to tackling large-scale problems, while the introduction of the proximal point term allows us to ensure the global convergence of the algorithm for the general case of a nonconvex objective function. Furthermore, we report some preliminary numerical results on support vector classification problems with up to 100 thousand variables.
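
The problem class described above can be written out as follows. The notation (a, b, l, u, the working set W, and the proximal parameter tau) is assumed here for illustration and is not taken from the paper; the SVM instance is the standard dual of support vector classification.

```latex
% General problem: one linear equality constraint plus box constraints
\[
  \min_{x \in \mathbb{R}^{n}} f(x)
  \quad \text{s.t.} \quad a^{\top} x = b, \qquad l \le x \le u
\]

% SVM training as a special case (convex quadratic objective)
\[
  \min_{\alpha} \; \tfrac{1}{2}\, \alpha^{\top} Q \alpha - e^{\top} \alpha
  \quad \text{s.t.} \quad y^{\top} \alpha = 0, \qquad 0 \le \alpha \le C e
\]

% A decomposition step with a proximal point term: at iteration k only the
% variables in the working set W change; the rest stay fixed at x^k
\[
  \min_{x_W} \; f\left(x_W, x^{k}_{\bar{W}}\right)
    + \frac{\tau}{2} \left\lVert x_W - x^{k}_{W} \right\rVert^{2}
  \quad \text{s.t.} \quad
  a_W^{\top} x_W = b - a_{\bar{W}}^{\top} x^{k}_{\bar{W}}, \qquad
  l_W \le x_W \le u_W
\]
```

The quadratic proximal term penalizes large moves away from the current iterate; the abstract credits this modification with ensuring global convergence when f is nonconvex.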

14.
Forecasting the number of warranty claims is vitally important for manufacturers and warranty providers in preparing fiscal plans. In the existing literature, a number of techniques such as log-linear Poisson models, Kalman filters, time series models, and artificial neural network models have been developed. Nevertheless, these approaches have two weaknesses: (1) they do not consider the fact that warranty claims reported in recent months might be more important in forecasting future warranty claims than those reported in earlier months, and (2) they are developed based on repair rates (i.e., the total number of claims divided by the total number of products in service), which can cause information loss through such an arithmetic-mean operation. To overcome these two weaknesses, this paper introduces two different approaches to forecasting warranty claims: the first is a weighted support vector regression (SVR) model and the second is a weighted SVR-based time series model. These two approaches can be applied to two scenarios: when only claim rate data are available and when original claim data are available. Two case studies are conducted to validate the two modelling approaches. On the basis of model evaluation over six-month-ahead forecasting, the results show that the proposed models exhibit superior performance compared to that of multilayer perceptrons, radial basis function networks and ordinary support vector regression models.
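
The two ingredients named above, the repair rate and the idea that recent months should count more, can be sketched in a few lines. The exponential decay used for the weights is an assumption made for illustration; the abstract does not say how the paper defines its weights, and the function names are hypothetical.

```python
def repair_rate(total_claims, products_in_service):
    """Repair rate as defined in the abstract: claims / products in service."""
    return total_claims / products_in_service


def recency_weights(n_months, decay=0.9):
    """Illustrative sample weights: the latest month gets weight 1.0 and each
    earlier month is discounted by `decay` (an assumed scheme, not the paper's).
    """
    return [decay ** (n_months - 1 - t) for t in range(n_months)]


rate = repair_rate(5, 100)
weights = recency_weights(3, decay=0.5)
```

Weights of this kind could then be passed as per-sample weights to an SVR implementation that supports them, so that errors on recent months are penalized more heavily than errors on older ones.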

15.
The aim of much horserace modelling is to appraise the informational efficiency of betting markets. The prevailing approach involves forecasting the runners’ finish positions by means of discrete or continuous response regression models. However, theoretical considerations and empirical evidence suggest that the information contained within finish positions might be unreliable, especially among minor placings. To alleviate this problem, a classification-based modelling paradigm is proposed which relies only on data distinguishing winners and losers. To assess its effectiveness, an empirical experiment is conducted using data from a UK racetrack. The results demonstrate that the classification-based model compares favourably with state-of-the-art alternatives and confirm the reservations about relying on rank-ordered finishing data. Simulations are conducted to further explore the origin of the model’s success by evaluating the marginal contribution of its constituent parts.

16.
The availability of abundant data poses a challenge: how to integrate static customer data and longitudinal behavioral data to improve performance in customer churn prediction. Usually, longitudinal behavioral data are transformed into static data before being included in a prediction model. In this study, a framework with ensemble techniques is presented for customer churn prediction directly using longitudinal behavioral data. A novel approach called the hierarchical multiple kernel support vector machine (H-MK-SVM) is formulated. A three-phase training algorithm for the H-MK-SVM is developed, implemented and tested. The H-MK-SVM constructs a classification function by estimating the coefficients of both static and longitudinal behavioral variables in the training process without transformation of the longitudinal behavioral data. The training process of the H-MK-SVM is also a feature selection and time subsequence selection process because the sparse non-zero coefficients correspond to the variables selected. Computational experiments using three real-world databases were conducted. Computational results using multiple performance criteria show that the H-MK-SVM directly using longitudinal behavioral data performs better than currently available classifiers.

17.
In this paper, a new optimization method is proposed for non-linear accident prediction models. This is achieved by eliminating the Hessian matrix from the equation for the optimal step length in the gradient vector method. One advantage is that the method is independent of the starting point of the optimization process, and it also converges to the highest peak. The method has been tested on an accident prediction model and its advantage over the gradient vector method has been demonstrated.

18.
In this paper we propose some improvements to a recent decomposition technique for the large quadratic program arising in training support vector machines. Like standard decomposition approaches, the technique we consider is based on the idea of optimizing, at each iteration, a subset of the variables through the solution of a quadratic programming subproblem. The innovative features of this approach consist in using a very effective gradient projection method for the inner subproblems and a special rule for selecting the variables to be optimized at each step. These features make it possible to obtain promising performance by decomposing the problem into a few large subproblems instead of many small subproblems, as is usually done by other decomposition schemes. We improve this technique by introducing a new inner solver and a simple strategy for reducing the computational cost of each iteration. We evaluate the effectiveness of these improvements by solving large-scale benchmark problems and by comparison with a widely used decomposition package.

19.
Support vector machines (SVMs), a kind of statistical learning method, were applied successfully in this research work to predict occupational accidents. First, semi-parametric principal component analysis (SPPCA) was used to perform a dimensional reduction, but no satisfactory results were obtained. Next, a dimensional reduction was carried out using an innovative and intelligent computing regression algorithm known as the multivariate adaptive regression splines (MARS) model, with good results. The variables selected as important by the MARS model were taken as input variables for an SVM model. This SVM technique was able to classify, according to their working conditions, those workers who have suffered a work-related accident in the last 12 months and those who have not. The SVM technique does not over-fit the experimental data and yields better performance than back-propagation neural network models. Finally, the results and conclusions of this study are presented.

20.
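Application of EMD-SVM to forecasting the monthly mean temperature in Nanjing   Total citations: 1, self-citations: 0, citations by others: 1
The monthly mean temperature series of Nanjing is non-stationary, noisy, and broadband. To improve the accuracy of temperature forecasts, this paper proposes a forecasting model (EMD-SVM) that combines empirical mode decomposition (EMD) with support vector machine (SVM) regression. First, the EMD algorithm decomposes the monthly mean temperature of Nanjing into intrinsic mode functions (IMFs) at different scales; next, an SVM regression model forecasts each IMF; finally, the component forecasts are recombined to obtain the forecast of the monthly mean temperature in Nanjing. The results show that, compared with a single SVM regression model, the EMD-SVM model improves the average forecast accuracy by 0.59 degrees, making it an effective model for temperature forecasting.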
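
The abstract above describes a three-step pipeline: decompose the series, forecast each component, and add the component forecasts back together. The sketch below shows only that structure. The decomposition and the forecaster are deliberately simple stand-ins (a moving-average split and a last-value forecast), not EMD and not SVM regression, and all function names and the toy series are hypothetical.

```python
def forecast_by_components(series, decompose, forecast_component):
    """Decompose the series, forecast each component, and sum the forecasts."""
    components = decompose(series)
    return sum(forecast_component(c) for c in components)


def moving_average_split(series, window=3):
    """Stand-in for EMD: a smooth part plus a remainder that add up to the
    original series."""
    smooth = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1):i + 1]
        smooth.append(sum(chunk) / len(chunk))
    remainder = [x - s for x, s in zip(series, smooth)]
    return [smooth, remainder]


def last_value_forecast(component):
    """Stand-in for SVM regression: repeat the component's last value."""
    return component[-1]


# Toy series (made up, not the Nanjing data).
series = [1.0, 2.0, 3.0, 4.0]
components = moving_average_split(series)
forecast = forecast_by_components(series, moving_average_split, last_value_forecast)
```

In an actual EMD-SVM setup, `decompose` would return the IMFs and `forecast_component` would wrap a regression model fitted to each IMF; the recombination step stays the same.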
