Similar Literature
20 similar documents found (search time: 31 ms)
1.
The grey wolf optimizer was recently introduced as a heuristic search algorithm in the swarm intelligence family, with satisfactory results on real-valued and binary-encoded optimization problems. The algorithm is more effective than some conventional population-based algorithms, such as particle swarm optimization, differential evolution, and the gravitational search algorithm. Researchers have developed several grey wolf optimizer variants to improve the performance of the basic algorithm. Inspired by the particle swarm optimization algorithm, this study investigates the performance of a new algorithm, the Inspired grey wolf optimizer, which extends the original grey wolf optimizer with two features: a nonlinear adjustment strategy for the control parameter, and a modified position-updating equation based on the personal historical best position and the global best position. Experiments are performed on four classical high-dimensional benchmark functions, four test functions proposed in the IEEE Congress on Evolutionary Computation 2005 special session, three well-known engineering design problems, and one real-world problem. The results show that the proposed algorithm finds more accurate solutions, converges faster, and requires fewer fitness function evaluations than the compared techniques.
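The abstract does not give the update equations, so the sketch below fills them in with assumed forms: a quadratically decaying control parameter a, and equal-weight pulls toward the personal and global best positions grafted onto the standard GWO leader attraction. A minimal illustration of the idea, not the authors' exact Inspired grey wolf optimizer.

```python
import numpy as np

def igwo(f, dim, n_wolves=30, max_iter=200, lb=-100.0, ub=100.0, seed=0):
    """GWO variant sketch: nonlinear schedule for a, plus PSO-style
    pbest/gbest terms. Exact formulas are assumptions for illustration."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n_wolves, dim))
    pbest, pbest_f = X.copy(), np.array([f(x) for x in X])
    for t in range(max_iter):
        order = np.argsort([f(x) for x in X])
        alpha, beta, delta = X[order[0]], X[order[1]], X[order[2]]
        a = 2.0 * (1.0 - (t / max_iter) ** 2)      # nonlinear decay (assumed form)
        gbest = pbest[np.argmin(pbest_f)]
        for i in range(n_wolves):
            # standard GWO attraction toward the three leaders
            leaders = np.zeros(dim)
            for L in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2 * a * r1 - a, 2 * r2
                leaders += (L - A * np.abs(C * L - X[i])) / 3.0
            # PSO-inspired pull toward personal and global bests (assumed weights)
            r3, r4 = rng.random(dim), rng.random(dim)
            X[i] = np.clip(leaders + 0.5 * r3 * (pbest[i] - X[i])
                           + 0.5 * r4 * (gbest - X[i]), lb, ub)
            fi = f(X[i])
            if fi < pbest_f[i]:
                pbest[i], pbest_f[i] = X[i].copy(), fi
    return pbest[np.argmin(pbest_f)], pbest_f.min()

# usage: minimize the sphere function
best_x, best_f = igwo(lambda x: float(np.sum(x ** 2)), dim=10)
```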

2.
Distance weighted discrimination (DWD) was originally proposed to handle the data-piling issue in the support vector machine. In this article, we consider sparse penalized DWD for high-dimensional classification. The state-of-the-art algorithm for solving standard DWD is based on second-order cone programming; however, such an algorithm does not work well for sparse penalized DWD with high-dimensional data. To overcome this computational difficulty, we develop a very efficient algorithm that computes the solution path of the sparse DWD on a fine grid of regularization parameters. We implement the algorithm in a publicly available R package, sdwd. We conduct extensive numerical experiments to demonstrate the computational efficiency and classification performance of our method.
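The package itself is in R; as a language-agnostic illustration, the sketch below minimizes the same lasso-penalized DWD objective with a plain proximal-gradient (ISTA) loop, not the paper's coordinate-descent path algorithm. The margin loss V(u) = 1 - u for u <= 1/2 and 1/(4u) otherwise is the standard DWD loss; the step size uses the bound V'' <= 4.

```python
import numpy as np

def dwd_loss_grad(u):
    """Standard DWD margin loss and its derivative, elementwise.
    V is differentiable: both branches have slope -1 at u = 1/2."""
    small = u <= 0.5
    safe = np.maximum(u, 1e-12)                 # guard the discarded branch
    loss = np.where(small, 1.0 - u, 1.0 / (4.0 * safe))
    grad = np.where(small, -1.0, -1.0 / (4.0 * safe ** 2))
    return loss, grad

def sparse_dwd_ista(X, y, lam=0.1, step=None, n_iter=500):
    """Proximal-gradient sketch for lasso-penalized DWD; y must be +/-1.
    This is NOT the sdwd algorithm, just a simple solver for the same
    objective (1/n) sum V(y_i (x_i.w + b)) + lam * ||w||_1."""
    n, p = X.shape
    if step is None:
        # V'' <= 4, so 4 ||X||_2^2 / n bounds the gradient Lipschitz constant
        step = n / (4.0 * np.linalg.norm(X, 2) ** 2)
    w, b = np.zeros(p), 0.0
    for _ in range(n_iter):
        u = y * (X @ w + b)
        _, g = dwd_loss_grad(u)
        gw = X.T @ (g * y) / n
        gb = np.mean(g * y)
        z = w - step * gw                       # gradient step, then soft-threshold
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
        b -= step * gb
    return w, b
```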

3.
We propose a new binary classification and variable selection technique especially designed for high-dimensional predictors. Among many predictors, typically only a small fraction have a significant impact on prediction. In such a situation, more interpretable models with better prediction accuracy can be obtained by performing variable selection along with classification. By adding an ℓ1-type penalty to the loss function, common classification methods such as logistic regression or support vector machines (SVM) can perform variable selection. Existing penalized SVM methods all attempt to solve for all the parameters of the penalization problem jointly. When the data dimension is very high, this joint optimization problem is very complex and requires a lot of memory. In this article, we propose a new penalized forward search technique that reduces the high-dimensional optimization problem to a sequence of one-dimensional optimizations by iterating the selection steps. The new algorithm can be regarded as a forward selection version of the penalized SVM and its variants. The advantage of optimizing in one dimension is that the location of the optimum can be found by intelligent search, exploiting the convexity and piecewise linear or quadratic structure of the criterion function. In each step, the predictor most able to predict the outcome is added to the model, and the search is repeated iteratively until convergence. Comparisons of our new classification rule with the ℓ1-SVM and other common methods show very promising performance: the proposed method leads to much leaner models without compromising misclassification rates, particularly for high-dimensional predictors.
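A minimal sketch of the forward-search idea under stated assumptions: a hinge loss with an ℓ1 penalty, each remaining predictor fitted one coefficient at a time with everything else frozen, and the best performer entering the model. The stopping rule and coefficient bounds are placeholders, not the authors' criteria.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def forward_penalized_svm(X, y, lam=0.1, max_features=10):
    """Penalized forward search sketch; y must be +/-1.  Each candidate
    feature gets a one-dimensional penalized hinge-loss fit of its own
    coefficient, and the biggest improver joins the active set."""
    n, p = X.shape
    w = np.zeros(p)
    active = []

    def obj_1d(c, j, margin_rest):
        u = margin_rest + y * X[:, j] * c
        return (np.mean(np.maximum(0.0, 1.0 - u))
                + lam * (np.sum(np.abs(w)) - abs(w[j]) + abs(c)))

    for _ in range(max_features):
        base = np.mean(np.maximum(0.0, 1.0 - y * (X @ w))) + lam * np.sum(np.abs(w))
        best = None
        for j in range(p):
            if j in active:
                continue
            rest = y * (X @ w) - y * X[:, j] * w[j]
            r = minimize_scalar(obj_1d, args=(j, rest),
                                bounds=(-10, 10), method="bounded")
            if best is None or r.fun < best[0]:
                best = (r.fun, j, r.x)
        if best is None or best[0] >= base - 1e-8:   # no improvement: stop
            break
        w[best[1]] = best[2]
        active.append(best[1])
    return w, active
```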

4.
In this paper, a stochastic gradient descent algorithm is proposed for binary classification problems based on general convex loss functions. It has a computational advantage over existing algorithms when the sample size is large. Under some reasonable assumptions on the hypothesis space and the underlying distribution, we establish the learning rate of the algorithm, which is faster than that of closely related algorithms.
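A generic sketch of the setting: SGD over a margin-based convex loss with a 1/sqrt(t) step size. The schedule is an assumption chosen for illustration; the paper's rate analysis dictates its own choice.

```python
import numpy as np

def sgd_binary(X, y, loss_grad, lr0=1.0, n_epochs=5, seed=0):
    """Minimal SGD for binary classification with a generic convex margin
    loss; y in {-1, +1}.  loss_grad(u) is the derivative of the loss in
    the margin u = y * <w, x>."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w, t = np.zeros(p), 0
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            t += 1
            u = y[i] * (X[i] @ w)
            w -= (lr0 / np.sqrt(t)) * loss_grad(u) * y[i] * X[i]
    return w

# usage with the logistic loss log(1 + exp(-u)); its derivative in u is
# -1 / (1 + exp(u))
logistic_grad = lambda u: -1.0 / (1.0 + np.exp(u))
```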

5.
A Class of Simulated Annealing Algorithms for Continuous Functions and Their Convergence Analysis
The global optimization of high-dimensional continuous functions arises widely in computational biology, computational chemistry, and related fields. To address such problems, and some shortcomings of existing simulated annealing algorithms for continuous functions, this paper presents an improved simulated annealing algorithm. A simple argument establishes the global convergence of the algorithm. Numerical results show that, for high-dimensional continuous functions, the algorithm converges quickly and reliably to the global optimum; experimental results for two new-solution generation methods are also compared.
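A minimal simulated annealing sketch for continuous minimization. The geometric cooling schedule and the two perturbation generators (Gaussian vs. heavier-tailed Cauchy) are assumed stand-ins for the paper's two new-solution generation methods.

```python
import numpy as np

def anneal(f, x0, T0=1.0, alpha=0.95, iters_per_T=100, n_temps=100,
           proposal="gauss", seed=0):
    """Simulated annealing with a Metropolis acceptance rule and two
    switchable new-solution generators (assumed forms)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx, T = f(x), T0
    best_x, best_f = x.copy(), fx
    for _ in range(n_temps):
        for _ in range(iters_per_T):
            if proposal == "gauss":
                y = x + T * rng.standard_normal(x.size)
            else:                      # Cauchy steps escape local minima more easily
                y = x + T * rng.standard_cauchy(x.size)
            fy = f(y)
            if fy < fx or rng.random() < np.exp(-(fy - fx) / T):
                x, fx = y, fy
                if fx < best_f:
                    best_x, best_f = x.copy(), fx
        T *= alpha                     # geometric cooling schedule
    return best_x, best_f
```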

6.
Near infrared (NIR) spectroscopy has been used extensively in classification problems because it is fast, reliable, cost-effective, and non-destructive. However, NIR data often contain several hundred or thousand variables (wavelengths) that are highly correlated with each other. It is therefore critical to select a few important features or wavelengths that better explain the NIR data. Wavelets are popular preprocessing tools for spectral data. Many applications perform feature selection directly on the high-dimensional wavelet coefficients, which can be computationally expensive. This paper proposes a two-stage scheme for the classification of NIR spectral data. In the first stage, the proposed multi-scale vertical energy thresholding procedure is used to reduce the dimension of the high-dimensional spectral data. In the second stage, a few important wavelet coefficients are selected using the proposed support vector machines gradient-recursive feature elimination. The proposed two-stage method produced better classification performance, with higher computational efficiency, when tested on four NIR data sets.
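A rough sketch of the two stages under stated assumptions: the "vertical energy" screen is approximated by keeping the wavelet-coefficient positions with the largest energy summed over all spectra, and plain SVM-RFE stands in for the proposed gradient-recursive elimination. Uses the PyWavelets and scikit-learn packages; wavelet, level, and thresholds are illustrative.

```python
import numpy as np
import pywt                                  # PyWavelets
from sklearn.svm import LinearSVC
from sklearn.feature_selection import RFE

def two_stage_nir(X, y, wavelet="db4", level=4, keep_frac=0.1, n_final=20):
    """Two-stage sketch: (1) energy screen across spectra in the wavelet
    domain, (2) SVM-based recursive feature elimination on the survivors."""
    # stage 1: wavelet-transform every spectrum, screen by column energy
    coeffs = np.array([np.concatenate(pywt.wavedec(x, wavelet, level=level))
                       for x in X])
    energy = (coeffs ** 2).sum(axis=0)
    keep = np.argsort(energy)[::-1][: int(keep_frac * coeffs.shape[1])]
    Z = coeffs[:, keep]
    # stage 2: rank remaining coefficients with a linear SVM and eliminate
    rfe = RFE(LinearSVC(dual=False), n_features_to_select=n_final).fit(Z, y)
    return keep[rfe.support_], rfe
```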

7.
Classification of high-dimensional data with thousands to tens of thousands of dimensions is a challenging task due to the high dimensionality and the quality of the feature set. The problem can be addressed by using feature selection to choose only informative features, or feature construction to create new high-level features. Genetic programming (GP) with a tree-based representation can be used for both feature construction and implicit feature selection. This work presents a comprehensive study of GP for feature construction and selection on high-dimensional classification problems. Different combinations of the constructed and/or selected features are tested and compared on seven high-dimensional gene expression problems, and different classification algorithms are used to evaluate their performance. The results show that the constructed and/or selected feature sets can significantly reduce the dimensionality and maintain or even increase the classification accuracy in most cases. Cases where overfitting occurred are analysed via the distribution of features. Further analysis also shows why the constructed features can achieve promising classification performance.

8.
Classical robust statistical methods for noisy data are often based on modifications of convex loss functions. In recent years, robust methods based on nonconvex losses have become increasingly popular, since a nonconvex loss can provide robust estimation for data contaminated with outliers. The significant challenge is that a nonconvex loss can be numerically difficult to optimize. This article proposes a quadratic majorization algorithm for nonconvex losses (QManc), which decomposes a nonconvex loss into a sequence of simpler optimization problems. We then apply QManc to a powerful machine learning algorithm, the quadratic majorization boosting algorithm (QMBA), and develop QMBA for robust classification (binary and multi-category) and regression. On high-dimensional cancer genetics data and in simulations, QMBA is comparable to convex loss-based boosting algorithms for clean data, and outperforms them for data contaminated with outliers. QMBA is also superior to boosting applied directly to optimize the nonconvex loss functions. Supplementary material for this article is available online.
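To make the majorization step concrete, here is a minimal MM loop for robust regression with the nonconvex Welsch loss rho(r) = 1 - exp(-r^2/2), chosen purely for illustration (the article's losses and its boosting wrapper differ). Since rho'' <= 1, the quadratic rho(r0) + rho'(r0)(r - r0) + (r - r0)^2/2 majorizes rho at any r0, so each MM step reduces to one least-squares solve.

```python
import numpy as np

def qm_robust_regression(X, y, n_iter=100):
    """Quadratic-majorization (MM) sketch with the Welsch loss.
    Minimizing the majorizer gives a least-squares step with working
    response y - (r - rho'(r)) = X beta + rho'(r), rho'(r) = r exp(-r^2/2)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]      # convex warm start
    G = np.linalg.pinv(X.T @ X) @ X.T                # fixed least-squares operator
    for _ in range(n_iter):
        r = y - X @ beta                             # current residuals
        pseudo = X @ beta + r * np.exp(-r ** 2 / 2)  # working response
        beta = G @ pseudo                            # one MM step
    return beta
```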

9.
A recently proposed clustering methodology, based on biological visual models, imitates how humans visually cluster data by spatially associating patterns. The method is based on cellular neural networks and resolution adjustments: the cellular neural network rebuilds low-density areas, while different resolutions are tried to find the best clustering option. The algorithm has demonstrated good performance compared with other clustering techniques. Its main drawbacks, however, are its inability to operate on data sets with more than two dimensions and the computational time required by the resolution adjustment mechanism. This paper proposes a new version of this clustering methodology that removes these flaws. In the new approach, a pre-processing stage featuring a self-organizing map maps complex high-dimensional relations onto a reduced lattice while preserving the topological organization of the initial data set; this reduced representation is then used as the two-dimensional data set for further processing. The resolution adjustment process is also accelerated by an optimization method that combines hill climbing and random search. By incorporating these mechanisms rather than evaluating all possible resolutions, the optimization strategy finds the best resolution for a clustering problem within a limited number of iterations. The proposed approach has been evaluated on several two-dimensional and high-dimensional datasets. Experimental evidence shows that the proposed algorithm performs the clustering task on complex problems on average 46% faster than the original method. The approach is also compared with other popular clustering techniques reported in the literature, and computational experiments demonstrate competitive accuracy and robustness.

10.
We present two graph-based algorithms for multiclass segmentation of high-dimensional data, motivated by the binary diffuse interface model. One algorithm generalizes Ginzburg–Landau (GL) functional minimization on graphs to the Gibbs simplex. The other uses a reduction of GL minimization based on the Merriman–Bence–Osher scheme for motion by mean curvature. These yield accurate and efficient algorithms for semi-supervised learning. Our algorithms outperform existing methods, including supervised learning approaches, on the benchmark datasets we used. We refer to Garcia-Cardona (2014) for a more detailed illustration of the methods, as well as further experimental examples.

11.
We discuss the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) for binary classification problems in clinical fields. We propose a statistical method for combining multiple feature variables, based on a boosting algorithm that maximizes the AUC. In this iterative procedure, various simple classifiers built from the feature variables are combined flexibly into a single strong classifier. To prevent overfitting, we regularize the algorithm with a penalty term for nonsmoothness. This regularization not only improves the classification performance but also gives a clearer understanding of how each feature variable is related to the binary outcome variable. We demonstrate the usefulness of score plots constructed componentwise by the boosting method, and describe two simulation studies and a real data analysis to illustrate the utility of our method.
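A toy version of the componentwise idea: greedily add, at each round, the single-feature stump whose inclusion most increases the empirical AUC of the running score. The nonsmoothness penalty from the paper is omitted, and the candidate thresholds and step size are arbitrary choices.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def auc_boost(X, y, n_rounds=20, step=0.2):
    """Componentwise AUC-maximizing boosting sketch; y in {0, 1}.
    Each round scans feature/threshold/sign stumps and keeps the one
    that most improves the AUC of the accumulated score."""
    n, p = X.shape
    score = np.zeros(n)
    ensemble = []                               # (feature, threshold, sign)
    for _ in range(n_rounds):
        best = (roc_auc_score(y, score) if score.any() else 0.5, None)
        for j in range(p):
            for thr in np.quantile(X[:, j], [0.25, 0.5, 0.75]):
                for sign in (1.0, -1.0):
                    cand = score + step * sign * (X[:, j] > thr)
                    auc = roc_auc_score(y, cand)
                    if auc > best[0]:
                        best = (auc, (j, thr, sign))
        if best[1] is None:                     # no stump improves the AUC
            break
        j, thr, sign = best[1]
        score += step * sign * (X[:, j] > thr)
        ensemble.append(best[1])
    return ensemble, score
```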

12.
This paper investigates the feature subset selection problem for binary classification using the logistic regression model. We developed a modified discrete particle swarm optimization (PSO) algorithm for the feature subset selection problem. This approach embodies an adaptive feature selection procedure that dynamically accounts for the relevance and dependence of the features included in the feature subset. We compare the proposed methodology with tabu search and scatter search algorithms on publicly available datasets. The results show that the proposed discrete PSO algorithm is competitive in terms of both classification accuracy and computational performance.
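A sketch of a standard binary PSO wrapper around logistic regression, with the usual sigmoid transfer function resampling the bits from the velocity; the paper's adaptive relevance/dependence mechanism is not reproduced, and the inertia and acceleration constants are conventional defaults.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def binary_pso_fs(X, y, n_particles=20, n_iter=30, seed=0):
    """Binary PSO feature selection: positions are 0/1 masks, fitness is
    3-fold cross-validated accuracy of logistic regression on the mask."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    vel = rng.normal(0.0, 1.0, (n_particles, p))
    pos = (rng.random((n_particles, p)) < 0.5).astype(float)

    def fitness(bits):
        mask = bits.astype(bool)
        if not mask.any():
            return 0.0
        clf = LogisticRegression(max_iter=500)
        return cross_val_score(clf, X[:, mask], y, cv=3).mean()

    pbest = pos.copy()
    pbest_f = np.array([fitness(b) for b in pos])
    gbest = pbest[np.argmax(pbest_f)].copy()
    for _ in range(n_iter):
        r1 = rng.random((n_particles, p))
        r2 = rng.random((n_particles, p))
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = (rng.random((n_particles, p)) < 1.0 / (1.0 + np.exp(-vel))).astype(float)
        f = np.array([fitness(b) for b in pos])
        better = f > pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
        gbest = pbest[np.argmax(pbest_f)].copy()
    return gbest.astype(bool), pbest_f.max()
```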

13.
We develop exact algorithms for multi-objective integer programming (MIP) problems. The algorithms iteratively generate nondominated points and exclude the regions dominated by previously generated nondominated points. One algorithm generates new points by solving models with additional binary variables and constraints. The other employs a search procedure and solves a number of models to find the next point while avoiding any additional binary variables. Both algorithms are guaranteed to find all nondominated points for any MIP problem. We test the performance of the algorithms on randomly generated instances of the multi-objective knapsack, multi-objective shortest path, and multi-objective spanning tree problems. The computational results show that the algorithms work well.
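For the common bi-objective special case, the search-based idea can be illustrated with an epsilon-constraint loop: maximize one objective, force the other to improve strictly, and repeat until infeasible. This sketch uses the PuLP modeling package, assumes integer objective coefficients, and approximates the lexicographic step with a small tie-break weight; the paper's algorithms are more general and handle any number of objectives.

```python
import pulp

def nondominated_points(c1, c2, A, b):
    """Bi-objective epsilon-constraint sketch over {x binary : Ax <= b},
    maximizing both objectives; returns all nondominated points."""
    n = len(c1)
    points, eps = [], None
    while True:
        prob = pulp.LpProblem("eps", pulp.LpMaximize)
        x = [pulp.LpVariable(f"x{i}", cat="Binary") for i in range(n)]
        # favour c1, with a small weight on c2 approximating lexicographic order
        prob += (pulp.lpSum(c1[i] * x[i] for i in range(n))
                 + 1e-4 * pulp.lpSum(c2[i] * x[i] for i in range(n)))
        for row, rhs in zip(A, b):
            prob += pulp.lpSum(row[i] * x[i] for i in range(n)) <= rhs
        if eps is not None:               # force strict improvement in c2
            prob += pulp.lpSum(c2[i] * x[i] for i in range(n)) >= eps
        if prob.solve(pulp.PULP_CBC_CMD(msg=0)) != pulp.LpStatusOptimal:
            break
        xv = [int(round(v.value())) for v in x]
        z1 = sum(c1[i] * xv[i] for i in range(n))
        z2 = sum(c2[i] * xv[i] for i in range(n))
        points.append((z1, z2))
        eps = z2 + 1                      # integer objectives: step by one
    return points
```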

14.
Many margin-based binary classification techniques, such as the support vector machine (SVM) and ψ-learning, deliver high performance. An earlier article proposed a new multicategory ψ-learning methodology that shows great promise in generalization ability. However, ψ-learning is computationally difficult because it requires handling a nonconvex minimization problem. In this article, we propose two computational tools for multicategory ψ-learning. The first is based on d.c. (difference of convex functions) algorithms and is solved by sequential quadratic programming, while the second uses the outer approximation method, which yields the global minimizer via sequential concave minimization. Numerical examples show that the proposed algorithms perform well.

15.
MM Algorithms for Some Discrete Multivariate Distributions
The MM (minorization–maximization) principle is a versatile tool for constructing optimization algorithms: every EM algorithm is an MM algorithm, but not vice versa. This article derives MM algorithms for maximum likelihood estimation with discrete multivariate distributions such as the Dirichlet-multinomial and Connor–Mosimann distributions, the Neerchal–Morel distribution, the negative-multinomial distribution, certain distributions on partitions, and zero-truncated and zero-inflated distributions. These MM algorithms increase the likelihood at each iteration and reliably converge to the maximum from well-chosen initial values. Because they involve no matrix inversion, the algorithms are especially pertinent to high-dimensional problems. To illustrate the performance of the MM algorithms, we compare them with Newton's method on data used to classify handwritten digits.
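As a concrete instance of the principle applied to one of the listed families, the classical fixed-point update for the zero-truncated Poisson MLE can be derived by imputing the unobserved zeros; whether this matches the article's exact derivation is not claimed.

```python
import numpy as np

def ztp_mle(x, n_iter=100):
    """Fixed-point MLE for the zero-truncated Poisson.  Imputing the
    expected number of unobserved zeros gives the monotone update
    lambda <- xbar * (1 - exp(-lambda)), whose fixed point solves the
    likelihood equation xbar = lambda / (1 - exp(-lambda)).  No matrix
    work is needed, in the spirit of the article's MM algorithms."""
    xbar = np.mean(x)
    lam = xbar                        # reasonable starting value
    for _ in range(n_iter):
        lam = xbar * (1.0 - np.exp(-lam))
    return lam

# usage: observed counts from a Poisson conditioned on being positive
print(ztp_mle(np.array([1, 1, 2, 3, 1, 4, 2])))
```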

16.
To address the slow convergence, low accuracy, and poor modeling efficiency caused by the large number of hyperparameters when the traditional Kriging model is used for global optimization with many (high-dimensional) input variables, this paper proposes an efficient global optimization method based on partial least squares (PLS) transformation and the Kriging model. First, a PLS-based Gaussian kernel function is constructed. Second, a differential evolution algorithm is used to find the new sample point that maximizes the expected improvement criterion. Then, four efficient global optimization algorithms are built by combining different kernel functions and expected improvement criteria, and are compared. Finally, numerical examples show that the PLS-based Kriging global optimization method outperforms standard global optimization algorithms in both convergence accuracy and convergence speed on high-dimensional global optimization problems.
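A sketch of the expected-improvement machinery such EGO-style methods build on, with a standard scikit-learn Gaussian process standing in for the PLS-kernel Kriging surrogate, and a random candidate set standing in for the differential evolution search over the acquisition function.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(gp, Xcand, f_min):
    """EI(x) = (f_min - mu) Phi(z) + sigma phi(z), z = (f_min - mu)/sigma,
    for minimization with surrogate mean mu and standard deviation sigma."""
    mu, sigma = gp.predict(Xcand, return_std=True)
    sigma = np.maximum(sigma, 1e-12)
    z = (f_min - mu) / sigma
    return (f_min - mu) * norm.cdf(z) + sigma * norm.pdf(z)

# one iteration: fit the surrogate, then pick the candidate maximizing EI
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, (15, 4))
y = np.sum(X ** 2, axis=1)                        # toy expensive objective
gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
cand = rng.uniform(-2, 2, (1000, 4))
x_next = cand[np.argmax(expected_improvement(gp, cand, y.min()))]
```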

17.
Existing one-class classification algorithms typically use the classical Euclidean metric to describe similarity between samples; however, the Euclidean metric cannot properly reflect the intrinsic distribution structure of some data sets, which limits the descriptive ability of these methods. This paper proposes a distance metric learning algorithm for one-class data in high-dimensional spaces that improves the descriptive performance of one-class classifiers. Unlike existing distance metric learning algorithms, it requires only target-class data. By introducing a regularization term based on the prior distribution of the samples and an L1-norm sparsity penalty on the metric, the algorithm effectively solves the metric learning problem for one-class data when samples are scarce in a high-dimensional space, and the resulting optimization problem is solved efficiently with a block coordinate descent algorithm. The learned metric can easily be embedded into one-class classifiers. Simulation results show that the learned metric effectively improves the descriptive performance of one-class classifiers, and in particular the descriptive ability of SVDD, giving one-class classifiers stronger generalization ability.

18.
This paper proposes a new clustering algorithm, a fuzzy projection pursuit algorithm, which can effectively handle the fuzzy clustering of the high-dimensional mixed data frequently encountered in medicine. The method is applied to the syndrome differentiation analysis of chronic renal failure, providing scientific support for the existing classification standards of chronic renal failure syndromes. This research opens a new avenue for the modernization of syndrome differentiation in traditional Chinese medicine and merits further investigation.

19.
Journal of Complexity, 1994, 10(2): 199–215
We consider two hybrid algorithms for finding an ϵ-approximation of a root of a convex real function that is twice differentiable and satisfies a certain growth condition on the interval [0, R]. The first algorithm combines a binary search procedure with Newton's method. The binary search produces an interval contained in the region of quadratic convergence of Newton's method. The computational cost of the binary search, as well as that of Newton's method, is of order O(log log(R/ϵ)). The second algorithm combines a binary search with the secant method in a similar fashion. This results in a lower overall computational cost when the cost of evaluating the derivative is more than 0.44042 times the cost of evaluating the function. Our results generalize some recent results of Ye.
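A minimal sketch of the hybrid scheme for an increasing convex f with f(0) < 0 < f(R): bisection shrinks the bracket, then Newton finishes. The switchover width at which Newton takes over is an assumed constant here, whereas the paper derives it from the convexity and growth conditions.

```python
def hybrid_root(f, fprime, R, eps, tol_interval=1e-3):
    """Binary search on [0, R] until the bracket is narrow, then Newton."""
    lo, hi = 0.0, R
    while hi - lo > tol_interval:               # binary-search phase
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    x = 0.5 * (lo + hi)                         # Newton phase
    while abs(f(x)) > eps:
        x -= f(x) / fprime(x)
    return x

# usage: root of the convex, increasing f(x) = x^2 + x - 1 on [0, 1]
print(hybrid_root(lambda x: x * x + x - 1, lambda x: 2 * x + 1, R=1.0, eps=1e-12))
```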

20.
Markov chain Monte Carlo (MCMC) algorithms play an important role in statistical inference problems dealing with intractable probability distributions. Recently, many MCMC algorithms, such as Hamiltonian Monte Carlo (HMC) and Riemannian manifold HMC, have been proposed to provide distant proposals with high acceptance rates. These algorithms, however, tend to be computationally intensive, which can limit their usefulness, especially for big-data problems, because of repeated evaluations of functions and statistical quantities that depend on the data. This issue arises in many statistical computing problems. In this paper, we propose a novel strategy that exploits smoothness (regularity) in parameter space to improve the computational efficiency of MCMC algorithms. When a function or statistical quantity must be evaluated at a point in parameter space, it is interpolated from precomputed or previously computed values. More specifically, we focus on HMC algorithms that use geometric information for faster exploration of probability distributions. Our method precomputes the required geometric information on a set of grids before running the sampling algorithm, and at each HMC iteration approximates the geometric information at the sampler's current location from the precomputed values at nearby grid points. A sparse grid interpolation method is used for high-dimensional problems. Tests on computational examples illustrate the advantages of our method.
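A small illustration of the precompute-then-interpolate strategy using a dense grid and SciPy's RegularGridInterpolator; the paper uses sparse grids in high dimensions and applies the idea to HMC's geometric quantities, while the 2-D target and grid resolution here are toy choices.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Precompute an expensive, data-dependent quantity on a grid once, then let
# the sampler query a cheap interpolant at every iteration instead.
def expensive_quantity(theta):                  # stand-in for a costly term
    return np.exp(-0.5 * np.sum(theta ** 2))

g1 = np.linspace(-4, 4, 81)
g2 = np.linspace(-4, 4, 81)
vals = np.array([[expensive_quantity(np.array([a, b])) for b in g2] for a in g1])
interp = RegularGridInterpolator((g1, g2), vals, bounds_error=False, fill_value=0.0)

def cheap_quantity(theta):                      # used inside the MCMC loop
    return float(interp(theta.reshape(1, -1)))

theta = np.array([0.3, -1.2])
print(cheap_quantity(theta), expensive_quantity(theta))   # close agreement
```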
