Similar Articles
20 similar articles found (search time: 375 ms)
1.
In this paper, we study the performance of various state-of-the-art classification algorithms applied to eight real-life credit scoring data sets. Some of the data sets originate from major Benelux and UK financial institutions. Different types of classifiers are evaluated and compared. Besides the well-known classification algorithms (e.g. logistic regression, discriminant analysis, k-nearest neighbour, neural networks and decision trees), this study also investigates the suitability and performance of some recently proposed, advanced kernel-based classification algorithms such as support vector machines and least-squares support vector machines (LS-SVMs). Performance is assessed using the classification accuracy and the area under the receiver operating characteristic curve. Statistically significant performance differences are identified using the appropriate test statistics. It is found that both the LS-SVM and neural network classifiers yield very good performance, but simple classifiers such as logistic regression and linear discriminant analysis also perform very well for credit scoring.
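As a rough illustration of the kind of comparison the study performs, the sketch below fits two of the named classifiers and scores them by accuracy and AUC. It uses scikit-learn on synthetic data (the paper's Benelux/UK data sets are confidential), and a standard RBF-kernel SVM stands in for the LS-SVM, which scikit-learn does not provide.

```python
# Hedged sketch: comparing two credit-scoring classifiers by accuracy and AUC.
# Synthetic data stands in for the confidential credit data sets in the paper.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, roc_auc_score

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.7, 0.3],
                           random_state=0)  # ~30% "bad" credit risks
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for name, clf in [("logistic regression", LogisticRegression(max_iter=1000)),
                  ("RBF-kernel SVM", SVC(kernel="rbf", probability=True))]:
    clf.fit(X_tr, y_tr)
    scores = clf.predict_proba(X_te)[:, 1]  # probability of the "bad" class
    print(f"{name}: acc={accuracy_score(y_te, clf.predict(X_te)):.3f}, "
          f"AUC={roc_auc_score(y_te, scores):.3f}")
```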

2.
The model we present in this paper was developed to solve a meteorological equipment operations problem involving the purchase and operation of specialized aircraft for the gathering of meteorological data. This aircraft system (termed ‘System B’) will be used to supplement an existing system using weather balloons (termed ‘System A’). The problem involves decision making based upon bi-criterion objectives. Such objectives usually occur in the presence of one or both of the following factors: (a) the outputs of the systems have more than one attribute, and (b) the outputs may be measured under varying environmental situations. The specific type of problem treated here involves a two-objective situation arising from the latter factor. It was found that, due to the nature of this problem, it was difficult to implement a utility approach to generate a solution. Therefore, a cost-effectiveness approach is presented to demonstrate a technique for finding the trade-off function between objectives for varying levels of systems and operating characteristics. This approach is similar to that of Subramanian and Ravichandran [8], where a complex two-unit electronic system was studied.
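The trade-off function the abstract describes can be pictured as the Pareto frontier over candidate system mixes. The sketch below enumerates hypothetical combinations of balloons (System A) and aircraft (System B) with invented cost and effectiveness functions, then filters the non-dominated points; every number here is a placeholder, not the paper's meteorological model.

```python
# Minimal sketch of a cost-effectiveness trade-off curve for two systems.
# The unit costs and effectiveness function are purely illustrative.
def pareto_front(points):
    """Keep (cost, effectiveness) pairs not dominated by any other pair."""
    front = []
    for c, e, mix in points:
        dominated = any(c2 <= c and e2 >= e and (c2, e2) != (c, e)
                        for c2, e2, _ in points)
        if not dominated:
            front.append((c, e, mix))
    return sorted(front)

points = []
for a in range(0, 11):            # number of balloon launches (System A)
    for b in range(0, 4):         # number of aircraft sorties (System B)
        cost = 5 * a + 40 * b                             # hypothetical costs
        raw = 10 * a + 55 * b
        effectiveness = raw - 0.3 * raw ** 2 / 600        # diminishing returns
        points.append((cost, effectiveness, (a, b)))

for cost, eff, (a, b) in pareto_front(points):
    print(f"A={a:2d}, B={b}: cost={cost:3d}, effectiveness={eff:6.1f}")
```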

3.
This study examined the differential effects of a meta-cognitive instruction, called IMPROVE, on third and sixth graders’ solution of word problems. In particular, the study focused on the solution of two kinds of word problems: with consistent and with inconsistent language. Participants were 194 Israeli students who studied in third (N = 110) and sixth (N = 84) grades. All students were administered pre- and post-tests constructed of 16 word problems with consistent and inconsistent language. About half of the students within each grade level were exposed to IMPROVE and the others studied under a ‘traditional’ teaching method. The findings indicate that at both grade levels the IMPROVE students significantly outperformed their counterparts in the control group, but third graders benefited from IMPROVE more than sixth graders. In addition, the study indicates that the gap in achievement between IMPROVE and control groups was larger on word problems with inconsistent language compared to word problems with consistent language. The theoretical and practical implications of the study are discussed.

4.
The European Energy Performance of Buildings Directives promote energy efficiency in buildings. Under these Directives, European Union Member States must apply minimum requirements regarding the energy performance of buildings and ensure the certification of their energy performance. The Directives set only the basic principles and requirements, leaving significant room for the Member States to establish their specific mechanisms, numeric requirements and ways to implement them, taking local conditions into account. In the Spanish case, the search for more energy-efficient buildings results in a conflict between users’ economic objectives and society's environmental objectives. In this paper, Compromise Programming is applied to help in the decision-making process. An appropriate distribution of types of dwellings, according to their energy performance and to the climatic zone considered in Spain, is suggested. Results provide a compromise solution between both objectives.
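Compromise Programming itself is compact enough to sketch: normalize each objective, then rank alternatives by their L_p distance to the ideal point. The dwelling alternatives, weights and objective values below are hypothetical, not the Spanish case-study data.

```python
# Minimal Compromise Programming sketch: rank alternatives by L_p distance
# to the ideal point over normalized objectives (both to be minimized).
import numpy as np

# Rows = dwelling types; columns = (user cost, CO2 emissions) - hypothetical.
F = np.array([[100.0, 9.0],
              [130.0, 6.0],
              [170.0, 4.0]])
ideal, anti = F.min(axis=0), F.max(axis=0)
D = (F - ideal) / (anti - ideal)     # normalized deviations in [0, 1]
w = np.array([0.5, 0.5])             # weights on economic vs. environmental

for p in (1, 2, np.inf):
    if np.isinf(p):
        dist = np.max(w * D, axis=1)                 # Chebyshev metric
    else:
        dist = np.sum((w * D) ** p, axis=1) ** (1.0 / p)
    print(f"p={p}: best alternative = {int(np.argmin(dist))}, "
          f"distances = {np.round(dist, 3)}")
```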

5.
Soccer video summarization and classification is becoming a very important topic due to the worldwide popularity of soccer, which drives the need to automatically classify video scenes, enabling better sport analysis, refereeing, training, advertisement, etc. Machine learning has been applied to the task of sports video classification. However, for some specific image and video problems (like sports video scene classification), the learning task becomes convoluted and difficult due to the dynamic nature of the video sequence and the associated uncertainties relating to changes in light conditions, background, camera angle, occlusions, indistinguishable scene features, etc. The majority of previous techniques (such as SVMs, neural networks, decision trees, etc.) applied to sports video classification did not provide a complete solution, and such models could not be easily understood by human users; meanwhile, they increased the complexity and time of computation and the associated costs of the standalone machines involved. Hence, there is a need for a system that addresses these drawbacks, handles the high levels of uncertainty in video scene classification, and undertakes the heavy video processing securely and efficiently on a cloud computing instance. In this paper we present a cloud-computing-based multi-classifier system that aggregates three classifiers: one based on neural networks and two fuzzy logic classifiers (a type-1 and a type-2 fuzzy logic classification system), optimized by Big Bang-Big Crunch optimization to maximize system performance. We present several real-world experiments showing the proposed classification system operating in real time to produce high classification accuracies for soccer videos, outperforming standalone classification systems based on neural networks and on type-1 and type-2 fuzzy logic.
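A full type-2 fuzzy classifier is too involved to reproduce here, but the aggregation step of a multi-classifier system can be sketched directly: combine per-classifier class scores by weighted averaging and take the argmax. The score vectors, class names and weights below are placeholders; the paper's neural-network and fuzzy classifiers, and the Big Bang-Big Crunch weight optimization, are not shown.

```python
# Minimal sketch of the aggregation step of a multi-classifier system.
import numpy as np

classes = ["center", "goal", "corner", "free-kick"]   # hypothetical scene types
# Hypothetical per-classifier score vectors for one video scene:
scores = np.array([
    [0.10, 0.60, 0.20, 0.10],   # neural network classifier
    [0.15, 0.55, 0.20, 0.10],   # type-1 fuzzy classifier
    [0.05, 0.70, 0.15, 0.10],   # type-2 fuzzy classifier
])
weights = np.array([0.3, 0.3, 0.4])  # would come from BB-BC optimization

fused = weights @ scores             # weighted average of class scores
print("fused scores:", np.round(fused, 3))
print("predicted scene:", classes[int(np.argmax(fused))])
```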

6.
Combining multiple classifiers, known as ensemble methods, can give substantial improvements in the prediction performance of learning algorithms, especially in the presence of non-informative features in the data sets. We propose an ensemble of subsets of kNN classifiers, ESkNN, for the classification task, built in two steps. First, we choose classifiers based upon their individual performance using out-of-sample accuracy. The selected classifiers are then combined sequentially, starting from the best model, and assessed for collective performance on a validation data set. We use benchmark data sets, with their original features and with some added non-informative features, for the evaluation of our method. The results are compared with the usual kNN, bagged kNN, random kNN, the multiple feature subset method, random forests and support vector machines. Our experimental comparisons on benchmark classification problems and simulated data sets reveal that the proposed ensemble gives better classification performance than the usual kNN and its ensembles, and performs comparably to random forests and support vector machines.
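A hedged sketch of the two-step idea: (1) rank kNN models built on random feature subsets by their out-of-sample accuracy; (2) add them to the ensemble in that order, keeping a model only if majority-vote accuracy on a validation set does not drop. The subset size, number of candidates and vote rule are illustrative choices, not necessarily the paper's.

```python
# Sketch of an ensemble of subset-kNN classifiers in the spirit of ESkNN.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1500, n_features=30, n_informative=10,
                           random_state=0)  # 20 non-informative features
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_sel, X_val, y_sel, y_val = train_test_split(X_rest, y_rest, test_size=0.5,
                                              random_state=0)

# Step 1: train kNN on random feature subsets, rank by accuracy on X_sel.
candidates = []
for _ in range(50):
    feats = rng.choice(X.shape[1], size=8, replace=False)
    knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr[:, feats], y_tr)
    acc = accuracy_score(y_sel, knn.predict(X_sel[:, feats]))
    candidates.append((acc, feats, knn))
candidates.sort(key=lambda t: t[0], reverse=True)

# Step 2: grow the ensemble greedily, validating majority-vote accuracy.
def vote(members, X_):
    preds = np.array([m.predict(X_[:, f]) for _, f, m in members])
    return (preds.mean(axis=0) >= 0.5).astype(int)   # binary majority vote

ensemble, best = [], 0.0
for cand in candidates:
    acc = accuracy_score(y_val, vote(ensemble + [cand], X_val))
    if acc >= best:
        ensemble.append(cand)
        best = acc
print(f"ensemble size={len(ensemble)}, validation accuracy={best:.3f}")
```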

7.
Growth in operational complexity is a worldwide reality in the retail industry. One of the most tangible expressions of this phenomenon is the vast increase in the number of products offered. To cope with this problem, the industry has developed the ‘category management’ approach, in which products with certain common characteristics are grouped into ‘categories’, managed as if they were independent business units. In this paper, we propose a model to evaluate relative category performance in a retail store, considering that categories might have different business objectives. Our approach is based on Data Envelopment Analysis techniques and requires a careful definition of the resources that categories use to contribute to achieving their business objectives. We illustrate how to use our approach by applying it to the evaluation of several categories in a South American supermarket. The empirical results show that, even under very conservative assumptions, the model has significant discriminatory power, identifying 25% of the sample as not operating efficiently. Although efficiency scores might exhibit relatively large dispersion, the set of efficient units is robust to data variations.
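A minimal version of the underlying DEA machinery can be sketched as follows: an input-oriented CCR model solved as a linear program, computing each category's efficiency score as the smallest input-contraction factor theta for which a nonnegative combination of all categories matches its outputs. The input/output data below are hypothetical, not the supermarket study's.

```python
# Input-oriented CCR DEA sketch: one LP per category (DMU) via scipy.
import numpy as np
from scipy.optimize import linprog

# Rows = categories; inputs = (shelf space, labor), outputs = (margin,).
X_in = np.array([[10.0, 5.0], [12.0, 4.0], [8.0, 6.0], [15.0, 8.0]])
Y_out = np.array([[100.0], [90.0], [95.0], [110.0]])
n = len(X_in)

for k in range(n):
    # Variables: [theta, lambda_1..lambda_n]; minimize theta.
    c = np.r_[1.0, np.zeros(n)]
    # Inputs:  sum_j lambda_j * x_ij - theta * x_ik <= 0  (each input i)
    A_in = np.c_[-X_in[k][:, None], X_in.T]
    # Outputs: -sum_j lambda_j * y_rj <= -y_rk            (each output r)
    A_out = np.c_[np.zeros((Y_out.shape[1], 1)), -Y_out.T]
    res = linprog(c, A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.r_[np.zeros(X_in.shape[1]), -Y_out[k]],
                  bounds=[(0, None)] * (n + 1))
    print(f"category {k}: efficiency = {res.x[0]:.3f}")
```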

8.
This paper investigates the performance of evolutionary algorithms in the optimization aspects of oblique decision tree construction and describes their performance with respect to classification accuracy, tree size, and the Pareto-optimality of their solution sets. The performance of the evolutionary algorithms is analyzed and compared to that of exhaustive (traditional) decision tree classifiers on several benchmark datasets. The results show that the classification accuracies and tree sizes generated by the evolutionary algorithms are comparable with those generated by traditional methods on all the sample datasets, and that on the large datasets the multiobjective evolutionary algorithms generate better Pareto-optimal sets than the exhaustive methods. The results also show that a classifier, whether exhaustive or evolutionary, that generates the most accurate trees does not necessarily generate the shortest trees or the best Pareto-optimal sets.
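As a toy illustration of the evolutionary idea behind oblique trees, the sketch below evolves a single oblique split w.x + b >= 0 with a simple Gaussian-mutation loop, scoring candidates by classification accuracy. A full multiobjective tree-induction EA (accuracy versus tree size) is considerably more involved; this shows only the split-search step, with illustrative population and mutation settings.

```python
# Toy evolutionary search for one oblique split (hyperplane) on toy data.
import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, n_features=4, random_state=1)

def accuracy(w):
    pred = (X @ w[:-1] + w[-1] >= 0).astype(int)
    # Accept either labeling of the two half-spaces.
    return max((pred == y).mean(), ((1 - pred) == y).mean())

rng = np.random.default_rng(1)
pop = rng.normal(size=(20, X.shape[1] + 1))          # weights + bias
for gen in range(100):
    fitness = np.array([accuracy(w) for w in pop])
    parents = pop[np.argsort(fitness)[-5:]]          # keep the 5 best
    children = (np.repeat(parents, 4, axis=0)
                + rng.normal(scale=0.1, size=(20, X.shape[1] + 1)))
    pop = np.vstack([parents, children])[:20]        # elitist truncation
print("best oblique split accuracy:", max(accuracy(w) for w in pop))
```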

9.
10.
Learning from examples is a frequently arising challenge, with a large number of algorithms proposed in the classification, data mining and machine learning literature. The evaluation of the quality of such algorithms is frequently carried out ex post, on an experimental basis: their performance is measured either by cross validation on benchmark data sets or by clinical trials. Few of these approaches evaluate the learning process ex ante, on its own merits. In this paper, we discuss a property of rule-based classifiers which we call “justifiability”, which focuses on the type of information extracted from the given training set in order to classify new observations. We investigate some interesting mathematical properties of justifiable classifiers. In particular, we establish the existence of justifiable classifiers, and we show that several well-known learning approaches, such as decision trees or nearest-neighbor-based methods, automatically provide justifiable classifiers. We also identify maximal subsets of observations which must be classified in the same way by every justifiable classifier. Finally, we illustrate by a numerical example that using classifiers based on “most justifiable” rules does not seem to lead to overfitting, even though it involves an element of optimization.

11.
We are concerned here with a nonlinear multi-term fractional differential equation (FDE). The existence of a unique solution is proved. A convergence analysis of the Adomian decomposition method (ADM) applied to this type of equation is discussed; the analysis is reliable enough to estimate the maximum absolute truncated error of the Adomian series solution. Some numerical examples are given, and their ADM solutions are compared with the solutions of a numerical method introduced in Podlubny (Fractional Differential Equations, Chap. 8, Academic Press, San Diego, 1999).
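The abstract does not reproduce its equations, but the standard setup can be sketched: a multi-term FDE in the usual Caputo notation, the ADM series ansatz, and the Adomian polynomials that expand the nonlinearity (the coefficients and orders here are generic placeholders).

```latex
% Hedged sketch of the standard multi-term FDE setup and the ADM series;
% notation (Caputo derivative D^alpha, coefficients a_i) is the usual one.
\[
  D^{\alpha} y(t) + \sum_{i=1}^{m} a_i\, D^{\alpha_i} y(t) = f\bigl(t, y(t)\bigr),
  \qquad 0 < \alpha_1 < \dots < \alpha_m < \alpha ,
\]
\[
  y(t) = \sum_{n=0}^{\infty} y_n(t), \qquad
  N\bigl(y(t)\bigr) = \sum_{n=0}^{\infty} A_n, \qquad
  A_n = \frac{1}{n!}\,\frac{d^n}{d\lambda^n}
        \Bigl[\, N\Bigl(\sum_{k=0}^{\infty} \lambda^k y_k\Bigr) \Bigr]_{\lambda=0},
\]
% where N is the nonlinear part and the A_n are the Adomian polynomials;
% truncating the series gives the approximation whose maximum absolute
% error the convergence analysis bounds.
```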

12.
We introduce a new method to prove averaging lemmas, i.e., to prove a regularizing effect on the velocity average of a solution to a kinetic equation. The method does not require the use of the Fourier transform, and the whole procedure is performed in ‘real space’; it leads to estimating an operator very similar to the so-called X-ray transform. We are then able to improve the known results when the integrability in space and in velocity are different. To cite this article: P.-E. Jabin, L. Vega, C. R. Acad. Sci. Paris, Ser. I 337 (2003).
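For orientation, the generic averaging-lemma setting looks as follows (a sketch of the classical statement, not the paper's sharpened mixed-integrability result).

```latex
% Generic setting: for a solution f of the kinetic transport equation
\[
  \partial_t f + v \cdot \nabla_x f = g, \qquad f = f(t, x, v),
\]
% the velocity average against a smooth cutoff \psi,
\[
  \rho_{\psi}(t, x) = \int_{\mathbb{R}^d} f(t, x, v)\, \psi(v)\, dv ,
\]
% is smoother than f itself; classically, f, g \in L^2 yields
% \rho_{\psi} \in H^{1/2} in (t, x).
```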

13.
In this paper, the classification power of the eigenvalues of six graph-associated matrices is investigated. Each matrix contains a certain type of geometric/spatial information, which may be important for the classification process. The performance of the different feature types is evaluated on two data sets. The first is a benchmark data set for optical character recognition, where the extracted eigenvalues were utilized as feature vectors for multi-class classification using support vector machines. Classification results are presented for all six feature types, as well as for classifier combinations at the decision level; for the decision-level combination, probabilistic-output support vector machines were applied, with a performance of up to 92.4%. To investigate the power of the spectra for time-dependent tasks as well, a second data set was investigated, consisting of human activities in video streams. To model the time dependency, hidden Markov models were utilized, and the classification rate reached 98.3%.
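A hedged sketch of the feature-extraction step: turn the spectrum of a graph-associated matrix into a fixed-length vector that a classifier can consume. The adjacency and Laplacian matrices are two of the six matrix types the paper considers; the toy graphs below stand in for the OCR/video graphs.

```python
# Spectral feature vectors from graph-associated matrices (networkx assumed).
import numpy as np
import networkx as nx

def spectral_features(G, matrix="laplacian", dim=10):
    """Sorted eigenvalues, zero-padded or truncated to a fixed length."""
    if matrix == "laplacian":
        M = nx.laplacian_matrix(G).toarray().astype(float)
    else:
        M = nx.adjacency_matrix(G).toarray().astype(float)
    eig = np.sort(np.linalg.eigvalsh(M))[::-1]   # symmetric => real spectrum
    out = np.zeros(dim)
    out[:min(dim, len(eig))] = eig[:dim]
    return out

G1, G2 = nx.cycle_graph(6), nx.star_graph(5)
print(spectral_features(G1))     # feature vector for an SVM, HMM, etc.
print(spectral_features(G2))
```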

14.
The question of what kind of statistical education is provided by a given statistics course or set of courses is discussed. The needs for continuous study of course objectives and syllabuses, and for experimentation with new teaching methods, are pointed out. A classification of statistics courses, whose main categories are ‘course type’, ‘methods of presentation’, ‘objectives’ and ‘syllabus’, is presented. Examples and suggestions for uses of the classification are given.

15.
Learning from imbalanced data, where the number of observations in one class is significantly larger than in the other class, has gained considerable attention in the machine learning community. Assuming the difficulty in predicting each class is similar, most standard classifiers will tend to predict the majority class well. This study uses tornado data that are highly imbalanced, as tornadoes are rare events: in the severe weather data used herein, thunderstorm circulations (mesocyclones) produce tornadoes in approximately 6.7% of the total number of observations. However, since tornadoes are high-impact weather events, it is important to predict the minority class with high accuracy. In this study, we apply support vector machines (SVMs) and logistic regression, with and without a midpoint threshold adjustment on the probabilistic outputs, as well as random forests and rotation forests, for tornado prediction. Feature selection with SVM recursive feature elimination was also performed to identify the most important features or variables for predicting tornadoes. The results showed that the threshold adjustment on SVMs provided better performance than the other classifiers.
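The threshold-adjustment idea is easy to demonstrate: train a probabilistic classifier, then move the decision threshold off the 0.5 midpoint to trade precision for recall on the rare class. The sketch uses logistic regression on synthetic data with roughly 6.7% positives, mimicking the paper's imbalance; the thresholds shown are arbitrary illustrative values.

```python
# Decision-threshold adjustment on probabilistic outputs for rare events.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, precision_score

X, y = make_classification(n_samples=5000, n_features=15, weights=[0.933],
                           random_state=0)   # ~6.7% positive (tornado) class
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0,
                                          stratify=y)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]

for thr in (0.5, 0.2, 0.1):                  # 0.5 = the default midpoint
    pred = (proba >= thr).astype(int)
    print(f"threshold={thr}: recall={recall_score(y_te, pred):.3f}, "
          f"precision={precision_score(y_te, pred, zero_division=0):.3f}")
```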

16.
Relationships between the concept of an exercise, the formulation of its text, the exercise's solution, and the final result are analysed.

For mathematical exercises, deductive directed reasoning is of special importance. A classification of reasonings gives a basis for the division of the exercises. Among problem exercises, ‘open problems’ and solutions of exercises extended by an approach called, in German, ‘Methode der erzeugenden Probleme’ are important from the didactic point of view.

Empirical studies concerning the influence of an exercise's formulation on the efficiency of its solution have been carried out for a set of about 16,000 exercises.

17.
Supervised learning methods are powerful techniques to learn a function from a given set of labeled data, the so-called training data. In this paper the support vector machines approach is applied to an image classification task. Starting with the corresponding Tikhonov regularization problem, reformulated as a convex optimization problem, we introduce a conjugate dual problem to it and prove that, whenever strong duality holds, the function to be learned can be expressed via the dual optimal solutions. Corresponding dual problems are then derived for different loss functions. The theoretical results are applied by numerically solving a classification task using high dimensional real-world data in order to obtain optimal classifiers. The results demonstrate the excellent performance of support vector classification for this particular problem.
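For concreteness, the standard objects the abstract refers to look as follows in the hinge-loss case. This is a sketch of the familiar formulation; the exact constants depend on normalization conventions, and the paper derives duals for several losses, not only this one.

```latex
% Tikhonov-regularized learning problem in an RKHS H_K (hinge loss):
\[
  \min_{f \in H_K} \; \frac{1}{n} \sum_{i=1}^{n} \max\{0,\, 1 - y_i f(x_i)\}
  \;+\; \lambda \, \|f\|_{H_K}^2 ,
\]
% and its familiar dual, through which f is recovered from the dual
% optimal alpha when strong duality holds:
\[
  \max_{0 \le \alpha_i \le \frac{1}{2\lambda n}} \;
  \sum_{i=1}^{n} \alpha_i
  \;-\; \frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j),
  \qquad
  f(\cdot) = \sum_{i=1}^{n} \alpha_i y_i K(x_i, \cdot).
\]
```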

18.
Removing important nodes from complex networks is a great challenge in fighting criminal organizations and preventing disease outbreaks. Six network performance metrics, including four new metrics, are applied to quantify networks’ diffusion speed, diffusion scale, homogeneity, and diameter. In order to efficiently identify nodes whose removal maximally destroys a network, i.e., minimizes network performance, ten structured heuristic node removal strategies are designed using different node centrality metrics, including degree, betweenness, reciprocal closeness, complement-derived closeness, and eigenvector centrality. These strategies are applied to remove nodes from the September 11, 2001 hijackers’ network, and their performance is compared to that of a random strategy, which removes randomly selected nodes, and of the locally optimal solution (LOS), which removes nodes to minimize network performance at each step. The computational complexity of the 11 strategies and of LOS is also analyzed. Results show that the node removal strategies using degree and betweenness centralities are more efficient than the other strategies.
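Two of the structured strategies can be sketched directly with networkx: repeatedly delete the node with the highest degree (or betweenness) centrality and track a simple performance proxy. A built-in toy graph stands in for the hijacker network, and the size of the largest connected component stands in for the paper's six performance metrics.

```python
# Centrality-guided node removal with a simple damage proxy (networkx).
import networkx as nx

def remove_by_centrality(G, centrality, k=3):
    G = G.copy()
    for _ in range(k):
        scores = centrality(G)                      # e.g. degree centrality
        target = max(scores, key=scores.get)        # most central node
        G.remove_node(target)
        lcc = max(len(c) for c in nx.connected_components(G))
        print(f"removed {target}: largest component = {lcc}")
    return G

G = nx.krackhardt_kite_graph()                      # toy stand-in network
print("degree strategy:")
remove_by_centrality(G, nx.degree_centrality)
print("betweenness strategy:")
remove_by_centrality(G, nx.betweenness_centrality)
```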

19.
Multiple classifier systems combine several individual classifiers to deliver a final classification decision. In this paper the performance of several multiple classifier systems is evaluated in terms of their ability to correctly classify consumers as good or bad credit risks. Empirical results suggest that some multiple classifier systems deliver significantly better performance than the single best classifier, but many do not. Overall, bagging and boosting outperform other multi-classifier systems, and a new boosting algorithm, Error Trimmed Boosting, outperforms bagging and AdaBoost by a significant margin.
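The standard baselines the new algorithm is compared against are easy to reproduce; the sketch below scores bagging and AdaBoost against a single tree on a synthetic credit task. Error Trimmed Boosting itself is the paper's contribution and is not available in scikit-learn, so it is not shown.

```python
# Bagging vs. AdaBoost vs. a single tree on a synthetic credit-risk task.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.7, 0.3],
                           random_state=0)   # ~30% "bad" credit risks
for name, clf in [
    ("single tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ("bagging", BaggingClassifier(DecisionTreeClassifier(max_depth=3),
                                  n_estimators=100, random_state=0)),
    ("AdaBoost", AdaBoostClassifier(n_estimators=100, random_state=0)),
]:
    scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean AUC = {scores.mean():.3f}")
```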

20.
Bin packing with fragmentable items is a variant of the classic bin packing problem where items may be cut into smaller fragments. The objective is to minimize the number of item fragments, or equivalently, to minimize the number of cuts, for a given number of bins. Models based on packing fragmentable items are useful for representing finite shared resources. In this article, we present improvements to approximation and metaheuristic algorithms that yield an optimality-preserving optimization algorithm with polynomial complexity, worst-case performance guarantees and parametrizable running time. We also present a new family of fast lower bounds and prove their worst-case performance ratios. We evaluate the performance and quality of the algorithm and of the best lower bound through a series of computational experiments on representative problem instances. For the studied problem sets, one consisting of 180 problems with up to 20 items and another consisting of 450 problems with up to 1024 items, the lower bound performs no worse than 5/6. For the first problem set, the algorithm found an optimal solution in 92% of all 1800 runs. For the second problem set, the algorithm found an optimal solution in 99% of all 4500 runs. No run lasted longer than 220 ms.
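The problem structure is easy to illustrate with the simplest heuristic: next-fit with fragmentation fills bins in sequence and cuts an item whenever it hits a bin boundary, so the number of cuts equals the number of boundaries crossed. This is only a baseline sketch; the paper's improved algorithm and lower bounds are more refined.

```python
# Next-fit with fragmentation: count cuts for a fixed number of bins.
def next_fit_fragmentation(sizes, capacity, bins):
    assert sum(sizes) <= capacity * bins, "items do not fit at all"
    cuts, free = 0, capacity
    for s in sizes:
        while s > 0:
            if free == 0:
                free = capacity          # open the next bin
            placed = min(s, free)
            if placed < s:               # item continues into the next bin
                cuts += 1
            s -= placed
            free -= placed
    return cuts

print(next_fit_fragmentation([5, 7, 4, 6, 2], capacity=8, bins=3))  # -> 1 cut
```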
