Similar Documents
20 similar documents found.
1.
Support vector machines (SVMs) have attracted much attention in theoretical and applied statistics. The main topics of recent interest are consistency, learning rates and robustness. We address the open problem of whether SVMs are qualitatively robust. Our results show that SVMs are qualitatively robust for any fixed regularization parameter λ. However, under extremely mild conditions on the SVM, it turns out that SVMs are no longer qualitatively robust for any null sequence (λn), which is the classical choice needed to obtain universal consistency. This lack of qualitative robustness is of a rather theoretical nature, because we show that, in any case, SVMs fulfill a finite-sample qualitative robustness property. For a fixed regularization parameter, SVMs can be represented by a functional on the set of all probability measures. Qualitative robustness is proven by showing that this functional is continuous with respect to the topology generated by weak convergence of probability measures. Combined with the existence and uniqueness of SVMs, our results show that SVMs are the solutions of a well-posed mathematical problem in Hadamard's sense.
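As an editorial illustration of the fixed-λ stability phenomenon described above, here is a minimal sketch (not from the paper): it trains an SVM on a sample and on a mildly contaminated copy and compares the two decision functions. The correspondence C ≈ 1/(2nλ) between scikit-learn's C and the regularization parameter λ is an assumption here, as are all constants.

```python
# Minimal sketch: empirical illustration of qualitative robustness for a
# fixed regularization parameter. Train an SVM on a sample and on a
# slightly contaminated copy, then compare the two decision functions.
# scikit-learn's SVC uses C rather than lambda; C ~ 1/(2*n*lambda) is one
# common correspondence (an assumption here, not taken from the paper).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Contaminate a small fraction of the sample (a weak perturbation of P_n).
X_eps = X.copy()
idx = rng.choice(n, size=5, replace=False)
X_eps[idx] += rng.normal(scale=0.5, size=(5, 2))

lam = 0.1
C = 1.0 / (2 * n * lam)
f = SVC(kernel="rbf", C=C).fit(X, y)
f_eps = SVC(kernel="rbf", C=C).fit(X_eps, y)

# A small change in the decision function indicates stability of the SVM
# functional at this fixed lambda.
grid = rng.normal(size=(1000, 2))
diff = np.max(np.abs(f.decision_function(grid) - f_eps.decision_function(grid)))
print(f"max |f - f_eps| on test grid: {diff:.4f}")
```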

2.
The goal of classification (or pattern recognition) is to construct a classifier with small misclassification error. The notions of consistency and universal consistency are important to the construction of classification rules. A consistent rule guarantees that taking more samples essentially suffices to roughly reconstruct the unknown distribution. The support vector machine (SVM) algorithm is one of the most important rules for two-category classification. How to effectively extend the SVM to multicategory classification is still an ongoing research issue. Different versions of multicategory support vector machines (MSVMs) have been proposed and used in practice. We study the one designed by Lee, Lin and Wahba with the hinge loss functional. The consistency of MSVMs is established under a mild condition. As a corollary, universal consistency holds if the reproducing kernel Hilbert space is dense in the C norm. In addition, an example is given to demonstrate the main results. Dedicated to Charlie Micchelli on the occasion of his 60th birthday. Supported in part by NSF of China under Grants 10571010 and 10171007.
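For concreteness, a minimal sketch of the Lee–Lin–Wahba multicategory hinge loss that the abstract refers to, evaluated on decision vectors obeying the sum-to-zero constraint; the function name and test values are illustrative, not from the paper.

```python
# Minimal sketch of the Lee-Lin-Wahba multicategory hinge loss:
#   L(y, f) = sum_{j != y} (f_j + 1/(k-1))_+
# evaluated on decision vectors f in R^k with sum_j f_j = 0.
import numpy as np

def llw_hinge(F, y, k):
    """F: (n, k) decision values, y: (n,) labels in {0, ..., k-1}."""
    margin = np.maximum(0.0, F + 1.0 / (k - 1))   # (f_j + 1/(k-1))_+
    margin[np.arange(len(y)), y] = 0.0            # exclude the true class
    return margin.sum(axis=1)

k = 3
F = np.array([[ 1.0, -0.5, -0.5],    # confident, correct for class 0
              [-0.5,  1.0, -0.5]])   # confident, correct for class 1
y = np.array([0, 1])
print(llw_hinge(F, y, k))  # both rows: 2 * max(0, -0.5 + 0.5) = 0
```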

3.
A convergent decomposition algorithm for support vector machines
In this work we consider nonlinear minimization problems with a single linear equality constraint and box constraints. In particular, we are interested in solving problems where the number of variables is so huge that traditional optimization methods cannot be directly applied. Many interesting real-world problems lead to large-scale constrained problems with this structure. For example, the special subclass of problems with a convex quadratic objective function plays a fundamental role in the training of support vector machines (SVMs), a machine learning technique. For this particular subclass of convex quadratic problems, some convergent decomposition methods, based on the solution of a sequence of smaller subproblems, have been proposed. In this paper we define a new globally convergent decomposition algorithm that differs from the previous methods in the rule for the choice of the subproblem variables and in the presence of a proximal point modification in the objective function of the subproblems. In particular, the new rule for sequentially selecting the subproblems appears suited to tackling large-scale problems, while the introduction of the proximal point term allows us to ensure global convergence of the algorithm in the general case of a nonconvex objective function. Furthermore, we report some preliminary numerical results on support vector classification problems with up to 100,000 variables.
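A hedged sketch of the kind of subproblem the abstract describes: one decomposition step on the SVM dual with a proximal term added, using a naive working-set choice. The paper's actual selection rule and proximal weight differ; everything below is an illustrative stand-in.

```python
# Minimal sketch of one decomposition step with a proximal-point term.
# SVM dual: min 0.5*a'Qa - e'a  s.t.  y'a = 0, 0 <= a <= C, Q = (yy') * K.
# Freeze variables outside a working set B and solve the subproblem
#   min_{a_B}  0.5*a'Qa - e'a + (tau/2)*||a_B - a_B_old||^2
# with SciPy's SLSQP. The working-set choice and tau are simplified
# stand-ins, not the selection rule of the paper.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n, C, tau = 20, 1.0, 1e-3
X = rng.normal(size=(n, 2))
y = np.where(X[:, 0] > 0, 1.0, -1.0)
K = np.exp(-0.5 * ((X[:, None] - X[None]) ** 2).sum(-1))  # RBF kernel
Q = np.outer(y, y) * K

a = np.zeros(n)
B = np.arange(4)                      # naive working set (first 4 variables)
N = np.setdiff1d(np.arange(n), B)

def sub_obj(aB):
    full = a.copy(); full[B] = aB
    return 0.5 * full @ Q @ full - full.sum() + 0.5 * tau * np.sum((aB - a[B]) ** 2)

cons = {"type": "eq", "fun": lambda aB: y[B] @ aB + y[N] @ a[N]}
res = minimize(sub_obj, a[B], bounds=[(0, C)] * len(B),
               constraints=cons, method="SLSQP")
a[B] = res.x
print("updated working-set variables:", np.round(a[B], 4))
```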

4.
5.
We investigate some closure properties of heavy-tailed distributions under random sums. We first consider random sums with independent or certain dependent long-tailed increments. We then consider real-valued increments with dominatedly varying-tailed distributions under extended negative dependence, and obtain two results giving necessary and sufficient, or sufficient, conditions for closure under random sums.
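The following Monte Carlo sketch illustrates the classical random-sum tail equivalence P(S_N > x) ≈ E[N]·P(X > x) that underlies such closure results, for i.i.d. Pareto increments and an independent Poisson count. Parameters are illustrative, not from the paper.

```python
# Minimal sketch: Monte Carlo check of the classical random-sum tail
# equivalence P(S_N > x) ~ E[N] * P(X > x) for i.i.d. subexponential
# (here Pareto) increments and an independent Poisson number of terms.
import numpy as np

rng = np.random.default_rng(2)
m, lam, alpha, x = 200_000, 3.0, 1.5, 50.0

N = rng.poisson(lam, size=m)

def pareto(size):
    # P(X > t) = t^(-alpha) for t >= 1 (standard Pareto via inverse CDF)
    return rng.uniform(size=size) ** (-1.0 / alpha)

S = np.array([pareto(k).sum() if k > 0 else 0.0 for k in N])
print(f"P(S_N > x)     ~= {np.mean(S > x):.2e}")
print(f"E[N]*P(X > x)   = {lam * x ** (-alpha):.2e}")
```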

6.
Support vector machines (SVMs) are known to be a powerful nonparametric classification technique, even for high-dimensional data. Although predictive ability is important, obtaining an easy-to-interpret classifier is also crucial in many applications. The linear SVM provides a classifier based on a linear score. In the case of functional data, the coefficient function that defines such a linear score usually has many irregular oscillations, making it difficult to interpret.

7.
8.
The Support Vector Machine (SVM) is one of the most important classes of machine learning models and algorithms, and has been successfully applied in various fields. Nonlinear optimization plays a crucial role in SVM methodology, both in defining the machine learning models and in designing convergent and efficient algorithms for large-scale training problems. In this paper we present the convex programming problems underlying SVMs, focusing on supervised binary classification. We analyze the most important and most widely used optimization methods for SVM training problems, and we discuss how the properties of these problems can be incorporated in designing useful algorithms.
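For reference, the convex programs in question are the standard soft-margin primal,

$$\min_{w,b,\xi}\ \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i \quad \text{s.t.}\quad y_i\big(\langle w,\phi(x_i)\rangle + b\big) \ge 1-\xi_i,\ \ \xi_i \ge 0,\ i=1,\dots,n,$$

and its Wolfe dual, a box-constrained quadratic program with a single linear equality constraint:

$$\max_{\alpha}\ \sum_{i=1}^{n}\alpha_i - \tfrac{1}{2}\sum_{i,j=1}^{n}\alpha_i\alpha_j\,y_i y_j\,K(x_i,x_j) \quad \text{s.t.}\quad \sum_{i=1}^{n}\alpha_i y_i = 0,\ \ 0\le\alpha_i\le C.$$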

9.
Appendix: A primer on heavy-tailed distributions
Sigman, Karl. Queueing Systems 33(1–3): 261–275, 1999.

10.
The classical estimation method for extreme quantiles of heavy-tailed distributions was presented by Weissman (J. Amer. Statist. Assoc. 73 (1978) 812–815) and makes use of the Hill estimator (Ann. Statist. 3 (1975) 1163–1174) for the positive extreme value index. This index estimator can be interpreted as an estimator of the slope in the Pareto quantile plot when one considers regression lines passing through a fixed anchor point. In this Note we propose a new extreme quantile estimator based on an unconstrained least-squares estimator of the index, introduced by Kratz and Resnick (Comm. Statist. Stochastic Models 12 (1996) 699–724) and Schultze and Steinebach (Statist. Decisions 14 (1996) 353–372), and we study its asymptotic behavior. To cite this article: A. Fils, A. Guillou, C. R. Acad. Sci. Paris, Ser. I 338 (2004).
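A minimal sketch of the Hill estimator and the Weissman extreme quantile estimator built on it, as described above; the choices of k (number of upper order statistics) and p are illustrative.

```python
# Minimal sketch of the Hill estimator and the Weissman quantile
# estimator that plugs it in; k and p are illustrative choices.
import numpy as np

def hill(x, k):
    """Hill estimator of the positive extreme value index from the k
    largest order statistics."""
    xs = np.sort(x)
    return np.mean(np.log(xs[-k:] / xs[-k - 1]))

def weissman_quantile(x, k, p):
    """Weissman estimator of the (1 - p)-quantile, valid for p < k/n."""
    n = len(x)
    xs = np.sort(x)
    return xs[-k - 1] * (k / (n * p)) ** hill(x, k)

rng = np.random.default_rng(3)
alpha = 2.0                                     # true index gamma = 1/alpha
x = rng.uniform(size=10_000) ** (-1.0 / alpha)  # standard Pareto sample
print("Hill estimate of gamma:", round(hill(x, k=200), 3))
print("estimated 0.9999-quantile:", round(weissman_quantile(x, 200, 1e-4), 2))
print("true 0.9999-quantile:", round((1e-4) ** (-1.0 / alpha), 2))
```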

11.
In this paper, we deal with ranking problems arising from various data mining applications where the major task is to train a rank-prediction model that assigns every instance a rank. We first discuss the merits and potential disadvantages of two existing popular approaches to ranking problems: the ‘Max-Wins’ voting process based on multi-class support vector machines (SVMs) and the model based on multi-criteria decision making. We then propose a confidence voting process for ranking problems based on SVMs, which can be viewed as a combination of the SVM approach and the multi-criteria decision making model. Promising numerical experiments based on the new model are reported. The research of the last author was supported by grant #R.PG 0048923 of NSERC, the MITACS project “New Interior Point Methods and Software for Convex Conic-Linear Optimization and Their Application to Solve VLSI Circuit Layout Problems” and the Canada Research Chair Program.
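A minimal sketch of the ‘Max-Wins’ voting process discussed above, spelled out with pairwise scikit-learn SVMs on a standard dataset. The confidence voting variant proposed in the paper is not reproduced here; scikit-learn's SVC implements one-vs-one voting internally, but the loop makes the mechanism explicit.

```python
# Minimal sketch of 'Max-Wins' voting: train one binary SVM per pair of
# classes/ranks and let each pairwise winner cast a vote.
import numpy as np
from itertools import combinations
from sklearn.svm import SVC
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
classes = np.unique(y)

pairwise = {}
for a, b in combinations(classes, 2):
    mask = np.isin(y, [a, b])
    pairwise[(a, b)] = SVC(kernel="rbf").fit(X[mask], y[mask])

def max_wins_predict(Xq):
    votes = np.zeros((len(Xq), len(classes)), dtype=int)
    for (a, b), clf in pairwise.items():
        pred = clf.predict(Xq)
        for c in (a, b):
            votes[:, c] += (pred == c)
    return classes[np.argmax(votes, axis=1)]    # class with most wins

print("training accuracy:", np.mean(max_wins_predict(X) == y))
```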

12.
Kernelized support vector machines (SVMs) are among the most widely used classification methods. However, in contrast to linear SVMs, the computation time required to train such a machine becomes a bottleneck when facing large data sets. In order to mitigate this shortcoming of kernel SVMs, many approximate training algorithms have been developed. While most of these methods claim to be much faster than the state-of-the-art solver LIBSVM, a thorough comparative study is missing. We aim to fill this gap. We choose several well-known approximate SVM solvers and compare their performance on a number of large benchmark data sets. Our focus is to analyze the trade-off between prediction error and runtime for different learning and accuracy parameter settings. This includes simple subsampling of the data, the poor man's approach to handling large-scale problems. We employ model-based multi-objective optimization, which allows us to tune the parameters of the learning machine and the solver over the full range of accuracy/runtime trade-offs. We analyze differences between solvers by studying and comparing the Pareto fronts formed by the two objectives, classification error and training time. Unsurprisingly, given more runtime most solvers are able to find more accurate solutions, i.e., achieve a higher prediction accuracy. It turns out that LIBSVM with subsampling of the data is a strong baseline. Some solvers systematically outperform others, which allows us to give concrete recommendations on when to use which solver.
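A minimal sketch of the comparison device used in the study: extracting the Pareto front of (training time, classification error) pairs. The run data below are placeholders, not measurements from the paper.

```python
# Minimal sketch: extract the Pareto front of (training time, test error)
# pairs, where lower is better in both coordinates.
import numpy as np

def pareto_front(points):
    """Keep the points not dominated in both coordinates."""
    pts = points[np.argsort(points[:, 0])]     # sort by time
    front, best_err = [], np.inf
    for t, e in pts:
        if e < best_err:                       # strictly improves on error
            front.append((t, e))
            best_err = e
    return np.array(front)

runs = np.array([[1.0, 0.20], [2.0, 0.15], [2.5, 0.18],
                 [4.0, 0.15], [8.0, 0.11], [9.0, 0.12]])
print(pareto_front(runs))
# -> [[1.   0.2 ], [2.   0.15], [8.   0.11]]
```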

13.
Let L and S denote the classes of distributions with long tails and subexponential tails, respectively. Let OS denote the class of distributions with O-subexponential tails, that is, distributions whose tails are of the same order as the tails of their 2-fold convolutions. In this paper, we first construct a family of distributions without finite means in (L ∩ OS) ∖ S. Next, some distributions in (L ∩ OS) ∖ S which possess finite means, or even finite higher moments, are also constructed. In connection with this, we prove that the class OS is closed under minimization of random variables; however, it is not closed under maximization of random variables.
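A Monte Carlo sketch of the defining ratio for the class OS, the tail of the 2-fold convolution over the tail of the distribution. For the Pareto distribution used below (which is subexponential) the ratio tends to 2; membership in OS only requires it to stay bounded. The distributions constructed in the paper are more delicate than this illustration.

```python
# Minimal sketch: empirical view of tail(F*F)(x) / tail(F)(x), the ratio
# that defines O-subexponentiality (bounded) and subexponentiality (-> 2).
import numpy as np

rng = np.random.default_rng(4)
m, alpha = 1_000_000, 1.5
X1 = rng.uniform(size=m) ** (-1.0 / alpha)   # Pareto, P(X > t) = t^(-alpha)
X2 = rng.uniform(size=m) ** (-1.0 / alpha)

for x in (10.0, 30.0, 100.0):
    ratio = np.mean(X1 + X2 > x) / np.mean(X1 > x)
    print(f"x = {x:5.0f}:  tail(F*2)/tail(F) ~= {ratio:.2f}")
```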

14.
This paper presents the multiple instance classification problem, which can be used for drug and molecular activity prediction, text categorization, image annotation, and object recognition. In order to obtain a representation that is more robust to outliers, hard margin loss formulations that minimize the number of misclassified instances are proposed. Although the problem is $\mathcal{NP}$-hard, computational studies show that medium-sized problems can be solved to optimality in reasonable time using integer programming and constraint programming formulations. A three-phase heuristic algorithm is proposed for larger problems. Furthermore, different loss functions such as hinge loss, ramp loss, and hard margin loss are empirically compared in the context of multiple instance classification. The proposed heuristic and robust support vector machines with hard margin loss demonstrate superior generalization performance compared to other approaches for multiple instance learning.
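A minimal sketch of the three loss functions compared in the paper, written as functions of the functional margin z = y·f(x). The exact ramp clipping point varies in the literature; clipping the hinge at 1 is one common choice and an assumption here.

```python
# Minimal sketch of the hinge, ramp, and hard margin (0-1) losses as
# functions of the margin z = y * f(x).
import numpy as np

def hinge(z):
    return np.maximum(0.0, 1.0 - z)

def ramp(z, s=1.0):
    return np.minimum(s, hinge(z))     # bounded, hence robust to outliers

def hard_margin(z):
    return (z <= 0).astype(float)      # counts misclassified instances

z = np.array([-3.0, -0.5, 0.5, 2.0])
print("hinge      :", hinge(z))        # unbounded for large negative z
print("ramp       :", ramp(z))
print("hard margin:", hard_margin(z))
```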

15.
We consider linear programming approaches for support vector machines (SVMs). The linear programming problems are introduced as an approximation of the quadratic programming problems commonly used in SVMs. When we consider kernel-based nonlinear discriminators, the approximation can be viewed as kernel principal component analysis, which generates an important subspace of the feature space characterized by the kernel function. We show that any data points nonlinearly, and implicitly, projected into the feature space by kernel functions can be approximately expressed as points lying explicitly in a low-dimensional Euclidean space, which enables us to develop linear programming formulations for nonlinear discriminators. We also introduce linear programming formulations for multicategory classification problems. We show that the same maximal margin principle exploited in SVMs can be incorporated into the linear programming formulations. Moreover, by extracting the low-dimensional feature subspace, we can generate nonlinear multicategory discriminators by solving linear programming problems. Numerical experiments on real-world datasets are presented. We show that a fairly low-dimensional feature subspace can achieve reasonable accuracy, and that the linear programming formulations compute discriminators efficiently. We also discuss a sampling strategy which may be crucial for huge datasets.
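A hedged sketch of the pipeline idea: kernel PCA produces explicit low-dimensional coordinates for the implicit feature map, and an L1-type linear discriminator is then fitted there. scikit-learn's LinearSVC (squared hinge, L1 penalty) stands in for the paper's exact linear programming formulation; all hyperparameters are illustrative.

```python
# Minimal sketch: explicit low-dimensional feature-subspace coordinates
# via kernel PCA, then an L1-regularized linear discriminator on them.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import KernelPCA
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)

# Explicit low-dimensional coordinates for the implicit feature map.
Z = KernelPCA(n_components=15, kernel="rbf", gamma=0.02).fit_transform(X)

clf = LinearSVC(penalty="l1", loss="squared_hinge", dual=False, C=1.0)
clf.fit(Z, y)
print("training accuracy in 15-dim feature subspace:",
      round(clf.score(Z, y), 3))
```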

16.
Supervised classification is an important part of corporate data mining, supporting decision making in customer-centric planning tasks. The paper proposes a hierarchical reference model for support vector machine-based classification within this discipline. The approach balances the conflicting goals of transparency and accuracy, and compares favourably to alternative classifiers in a large-scale empirical evaluation on real-world customer relationship management applications. Recent advances in support vector machine research are incorporated to approach feature, instance and model selection in a unified framework.

17.
This paper investigates a class of semi-supervised support vector machines ($\mathrm{S}^3\mathrm{VMs}$) with arbitrary norm. A general framework for the $\mathrm{S}^3\mathrm{VMs}$ is first constructed based on a robust DC (difference of convex functions) program. With different DC decompositions, DC optimization formulations for the linear and nonlinear $\mathrm{S}^3\mathrm{VMs}$ are investigated. The resulting DC optimization algorithms (DCA) only require solving a simple linear program or convex quadratic program at each iteration, and converge to a critical point after a finite number of iterations. The effectiveness of the proposed algorithms is demonstrated on some UCI databases and on licorice seed near-infrared spectroscopy data. Moreover, numerical results show that the proposed algorithms offer performance competitive with existing $\mathrm{S}^3\mathrm{VM}$ methods.
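A minimal sketch of the generic DCA iteration behind such algorithms: for f = g − h with g, h convex, linearize h at the current point and minimize the resulting convex majorant. The toy instance below is not the $\mathrm{S}^3\mathrm{VM}$ objective, just an illustration of the scheme converging to a critical point.

```python
# Minimal sketch of the DCA scheme on f(x) = x^4 - x^2, decomposed as
# g(x) = x^4 (convex) minus h(x) = x^2 (convex). Each step solves the
# convex problem  min_x  g(x) - y*x  with y = h'(x_k) = 2*x_k.
import numpy as np

x = 2.0                                  # starting point
for it in range(30):
    y = 2.0 * x                          # subgradient of h at x_k
    x = np.cbrt(y / 4.0)                 # argmin_x  x^4 - y*x  (4x^3 = y)
print(f"DCA limit: {x:.6f}  (critical point 1/sqrt(2) = {1/np.sqrt(2):.6f})")
```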

18.
Feature selection consists of choosing a subset of available features that captures the relevant properties of the data. In supervised pattern classification, a good choice of features is fundamental for building compact and accurate classifiers. In this paper, we develop an efficient feature selection method using the zero-norm $l_0$ in the context of support vector machines (SVMs). The discontinuity of $l_0$ at the origin makes the corresponding optimization problem difficult to solve. To overcome this drawback, we use a robust DC (difference of convex functions) programming approach, a general framework for non-convex continuous optimization. We consider an appropriate continuous approximation to $l_0$ such that the resulting problem can be formulated as a DC program. Our DC algorithm (DCA) has finite convergence and requires solving one linear program at each iteration. Computational experiments on standard datasets, including challenging problems from the NIPS 2003 feature selection challenge and gene selection for cancer classification, show that the proposed method is promising: while suppressing more than 99% of the features, it can still provide good classification. Moreover, the comparative results illustrate the superiority of the proposed approach over standard methods such as classical SVMs and the feature selection concave method.
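A hedged sketch of the concave-approximation idea: replace $\|w\|_0$ by $\sum_i (1 - e^{-\alpha|w_i|})$ and run a DCA-style reweighted-L1 loop. The weighted L1 problem is reduced to a plain L1 one by rescaling the features; scikit-learn's LinearSVC stands in for the linear program solved per iteration in the paper, and all constants are illustrative.

```python
# Minimal sketch: zero-norm feature selection via the exponential concave
# approximation and DCA-style reweighted L1. Column rescaling turns the
# weighted-L1 subproblem  min sum_j v_j*|w_j| + loss  into a plain L1 fit.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
a, d = 5.0, X.shape[1]
w = np.zeros(d)

for it in range(4):
    v = a * np.exp(-a * np.abs(w)) + 1e-8         # DCA weights from exp approx
    clf = LinearSVC(penalty="l1", loss="squared_hinge", dual=False,
                    C=0.5).fit(X / v, y)          # weighted L1 via rescaling
    w = clf.coef_.ravel() / v                     # map back to original scale
    print(f"iter {it}: {np.sum(np.abs(w) > 1e-6)} of {d} features kept")
```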

19.
Method: In this paper, we introduce a bi-level optimization formulation for the model and feature selection problems of support vector machines (SVMs). A bi-level optimization model is proposed to select the best model, in which the standard convex quadratic optimization problem of SVM training is cast as a subproblem. Feasibility: The optimal objective value of the SVM quadratic problem is minimized over a feasible range of the kernel parameters at the master level of the bi-level model. Since the optimal objective value of the subproblem is a continuous function of the kernel parameters, though implicitly defined over a certain region, a solution of this bi-level problem always exists. The problem of feature selection can be handled in a similar manner. Experiments and results: Two approaches for solving the bi-level problem of model and feature selection are considered as well. Experimental results show that the bi-level formulation provides a plausible tool for model selection.
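A minimal sketch of the bi-level idea: an outer (master-level) search over the kernel parameter, where each evaluation solves the convex SVM training subproblem. A grid search with a validation-error objective stands in for the paper's implicit-function treatment of the subproblem's optimal value.

```python
# Minimal sketch of bi-level model selection: outer loop over the kernel
# parameter, inner solve of the convex SVM training problem.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

best = (np.inf, None)
for gamma in np.logspace(-6, -2, 9):          # master-level variable
    clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X_tr, y_tr)  # subproblem
    err = 1.0 - clf.score(X_val, y_val)       # outer objective
    best = min(best, (err, gamma))
print(f"selected gamma = {best[1]:.1e}, validation error = {best[0]:.3f}")
```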

20.