Similar Documents
20 similar documents found.
1.
2.
This paper is devoted to selected topics relating Combinatorial Optimization and Hierarchical Classification. It is oriented toward extensions of the standard classification schemes (the hierarchies): pyramids, quasi-hierarchies, circular clustering, rigid clustering and others. Bijection theorems between these models and dissimilarity models make it possible to state some clustering problems as optimization problems. Within this optimization landscape we especially discuss the following: NP-completeness results and the search for polynomial instances; problems solvable in polynomial time (e.g. subdominant theory); and the design, analysis and applications of algorithms. In contrast with this orientation toward “new” clustering problems, the last part discusses some standard algorithmic approaches. This article appeared in 4OR 2, 179–219, 2004.

3.
Functional data analysis, as proposed by Ramsay (Psychometrika 47:379–396, 1982), has recently attracted many researchers. The most popular approach in recent studies has been to extend statistical methods for the analysis of ordinary data to functional data (e.g., Ramsay and Silverman in Functional Data Analysis. Springer, Berlin Heidelberg New York, 1997; Applied Functional Data Analysis: Methods and Case Studies. Springer, Berlin Heidelberg New York, 2002; Mizuta in Proceedings of the Tenth Japan and Korea Joint Conference of Statistics, pp 77–82, 2000; Shimokawa et al. in Japan J Appl Stat 29:27–39, 2000). In addition, several methods for clustering functional data have been proposed (Abraham et al. in Scand J Stat 30:581–595, 2003; James and Sugar in J Am Stat Assoc 98:397–408, 2003; Tarpey and Kinateder in J Classif 20:93–114, 2003; Rossi et al. in Proceedings of the European Symposium on Artificial Neural Networks, pp 305–312, 2004). Furthermore, Tokushige et al. (J Jpn Soc Comput Stat 15:319–326, 2002) defined several dissimilarities between functions for the case of functional data. In this paper, we extend existing crisp and fuzzy k-means clustering algorithms to the analysis of multivariate functional data. In particular, we consider the dissimilarity between functions as a function itself. Furthermore, cluster centers and memberships, which are defined as functions, are determined at the minimum of a certain target function by using a calculus-of-variations approach.
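To make the flavor of such an extension concrete, here is a minimal fuzzy c-means sketch for functional data discretized on a common grid. The function name, the Riemann-sum approximation of the squared L2 dissimilarity, and all parameter choices are illustrative assumptions, not the paper's calculus-of-variations formulation.

```python
import numpy as np

def fuzzy_cmeans_functional(X, t, c=2, m=2.0, n_iter=50, seed=0):
    """Fuzzy c-means for functional data sampled on a common grid t.

    X : (n, p) array, each row one function observed at the p grid points.
    The squared L2 dissimilarity between curves is approximated by a
    simple Riemann sum over the grid (an illustrative stand-in for the
    paper's calculus-of-variations treatment).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    dt = t[1] - t[0]
    U = rng.dirichlet(np.ones(c), size=n)            # memberships, rows sum to 1
    centers = np.zeros((c, p))
    for _ in range(n_iter):
        W = U ** m                                    # fuzzified memberships
        centers = (W.T @ X) / W.sum(axis=0)[:, None]  # weighted mean curves
        diff = X[:, None, :] - centers[None, :, :]    # (n, c, p)
        d2 = (diff ** 2).sum(axis=2) * dt + 1e-12     # squared L2 distances
        U = d2 ** (-1.0 / (m - 1))                    # standard FCM update
        U /= U.sum(axis=1, keepdims=True)
    return centers, U
```

Hardening the memberships (`U.argmax(axis=1)`) recovers a crisp k-means-style partition of the curves.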

4.
5.
Dynamic clustering for interval data based on L2 distance
Summary  This paper introduces a partitioning clustering method for objects described by interval data. It follows the dynamic clustering approach and uses an L2 distance. Particular emphasis is put on the standardization problem, for which we propose and investigate three standardization techniques for interval-type variables. Moreover, various tools for cluster interpretation are presented and illustrated on simulated and real data.
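A dynamic clustering iteration for interval data can be sketched as below. The squared L2 distance is taken here as the sum of squared differences of the interval bounds, which is one common choice; the paper's exact distance, its three standardizations, and the helper's name are assumptions for illustration.

```python
import numpy as np

def interval_kmeans(L, U, k=2, n_iter=20):
    """Dynamic-clustering sketch for interval data.

    L, U : (n, p) arrays of lower and upper interval bounds.  Alternates
    assignment (nearest prototype under a bound-wise squared L2 distance)
    and representation (prototype = component-wise mean interval).
    """
    n = L.shape[0]
    idx = np.linspace(0, n - 1, k).astype(int)            # simple spread-out init
    cL, cU = L[idx].astype(float), U[idx].astype(float)   # prototype intervals
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        d2 = ((L[:, None, :] - cL[None]) ** 2 +
              (U[:, None, :] - cU[None]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):                                # mean interval prototype
            if np.any(labels == j):
                cL[j] = L[labels == j].mean(axis=0)
                cU[j] = U[labels == j].mean(axis=0)
    return labels, cL, cU
```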

6.
Peer two-step W-methods are designed for the integration of stiff initial value problems with parallelism across the method. The essential feature is that in each time step, s ‘peer’ approximations having similar properties are employed; in fact, no primary solution variable is distinguished. Parallel implementation of these stages is easy since only information from one previous time step is used and the different linear systems may be solved simultaneously. This paper introduces a subclass having order s−1 where optimal damping for stiff problems is obtained by using different system parameters in different stages. Favourable properties of this subclass are uniform stability for realistic stepsize sequences and a superconvergence property which is proved using a polynomial collocation formulation. Numerical tests of a matrix-free implementation with Krylov methods on a shared memory computer are included. AMS subject classification (2000): 65L06, 65Y05. Received June 2004. Revised January 2005. Communicated by Timo Eirola. Helmut Podhaisky: The work of this author was supported by the German Academic Exchange Service, DAAD.

7.
Each clustering algorithm usually optimizes a quality metric during its progress. The metric in conventional clustering algorithms treats all features as equally important; in other words, each feature participates equivalently in the clustering process. Obviously, some features carry more information than others in a dataset, so some features should arguably receive lower importance during a clustering or classification algorithm, due to their lower information content or higher variance. It is therefore a long-standing goal in the artificial intelligence community to introduce a weighting mechanism into any task that uses a number of features identically to make a decision. The difficulty lies in how the features can participate in the clustering process (in any algorithm, but especially in a clustering algorithm) in a weighted manner. Recently, this problem has been addressed by locally adaptive clustering (LAC). However, like its traditional competitors, LAC suffers from inefficiency on data with imbalanced clusters. This paper solves the problem by proposing a weighted locally adaptive clustering (WLAC) algorithm based on the LAC algorithm. WLAC, however, is sensitive to its two parameters, which must be tuned manually, and its performance depends on tuning them well. The paper proposes two solutions: the first, based on a simple clustering ensemble framework, examines the sensitivity of the WLAC algorithm to manual tuning; the second is based on a cluster selection method.
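The core idea of locally adaptive feature weighting can be sketched as follows: each cluster keeps its own feature weights, larger for features with small within-cluster spread. This is illustrative only; the published LAC algorithm differs in details such as the exact weight normalization, and the paper's WLAC adds a further weighting on top of it.

```python
import numpy as np

def lac_sketch(X, k=2, h=1.0, n_iter=20):
    """Locally-adaptive-clustering sketch with per-cluster feature weights."""
    n, p = X.shape
    C = X[np.linspace(0, n - 1, k).astype(int)].astype(float)  # centroids
    W = np.full((k, p), 1.0 / p)                               # per-cluster weights
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - C[None]) ** 2 * W[None]).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            pts = X[labels == j]
            if len(pts) == 0:
                continue
            C[j] = pts.mean(axis=0)
            spread = ((pts - C[j]) ** 2).mean(axis=0)  # per-feature dispersion
            w = np.exp(-spread / h)                    # small spread -> big weight
            W[j] = w / w.sum()
    return labels, C, W
```

The returned weight matrix `W` makes the importance of each feature to each cluster explicit, which is exactly the kind of information a uniformly weighted metric discards.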

8.
We design a new label shortest path algorithm by applying the concept of a pseudo permanent label. This approach allows an algorithm to partition the set of nodes into two new sets: pseudo permanently labeled nodes and its complementary set. From this point of view, this new label method can be considered a label setting method. Moreover, at least one node becomes permanently labeled when some nodes belonging to the set of pseudo permanently labeled nodes are scanned in each iteration of the algorithm. In the case of networks with non-negative arc lengths, it is easy to prove that this node has the minimum distance label among the non-pseudo-permanently labeled nodes. On the other hand, it is not known during the computation which pseudo permanently labeled nodes are permanently labeled. Therefore, all distance labels are temporary and the algorithm becomes a label correcting method. Nevertheless, the proposed algorithm exhibits some nice features: (1) the running time of the algorithm on a network with n nodes and m arcs is O(nm); (2) the number of node scan operations is smaller than in previous label correcting algorithms, as observed in the computational experiments; (3) the algorithm incorporates two new rules which allow easy detection of a negative cycle in the network; (4) the algorithm is quite simple and very easy to implement, and does not require sophisticated data structures; (5) the algorithm exhibits flexibility in the order in which the new pseudo permanently labeled nodes are scanned. These features are made possible by the application of the pseudo permanent label concept.
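For context, the generic label-correcting scheme that such algorithms refine can be sketched as below. This is a plain FIFO (Bellman–Ford-style) method, not the pseudo-permanent-label algorithm itself; it shares the O(nm) bound and illustrates a simple scan-count rule for negative-cycle detection.

```python
from collections import deque

def label_correcting(n, arcs, source):
    """FIFO label-correcting shortest paths from `source`.

    arcs : list of (u, v, length) with nodes numbered 0..n-1.
    Returns (dist, None) on success, or (None, 'negative cycle') if a
    node is scanned more than n times, which can only happen when a
    negative cycle is reachable from the source.
    """
    INF = float('inf')
    dist = [INF] * n
    dist[source] = 0.0
    in_queue = [False] * n
    scans = [0] * n
    adj = [[] for _ in range(n)]
    for u, v, w in arcs:
        adj[u].append((v, w))
    q = deque([source])
    in_queue[source] = True
    while q:
        u = q.popleft()
        in_queue[u] = False
        scans[u] += 1
        if scans[u] > n:                       # scan-count negative-cycle test
            return None, 'negative cycle'
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:          # correct the label of v
                dist[v] = dist[u] + w
                if not in_queue[v]:
                    q.append(v)
                    in_queue[v] = True
    return dist, None
```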

9.
In developing a classification model for assigning observations of unknown class to one of a number of specified classes using the values of a set of features associated with each observation, it is often desirable to base the classifier on a limited number of features. Mathematical programming discriminant analysis methods for developing classification models can be extended for feature selection. Classification accuracy can be used as the feature selection criterion by using a mixed integer programming (MIP) model in which a binary variable is associated with each training sample observation, but the binary variable requirements limit the size of problems to which this approach can be applied. Heuristic feature selection methods for problems with large numbers of observations are developed in this paper. These heuristic procedures, which are based on the MIP model for maximizing classification accuracy, are then applied to three credit scoring data sets.
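The general idea of accuracy-driven feature selection can be sketched with a simple greedy loop. This is a generic stand-in, not the paper's MIP-based heuristics: a nearest-centroid classifier and forward selection are assumptions made purely for illustration.

```python
import numpy as np

def greedy_feature_selection(X, y, max_feats=2):
    """Greedily add the feature that most improves training accuracy of a
    nearest-centroid two-class classifier (labels y in {0, 1})."""
    def accuracy(feats):
        Z = X[:, feats]
        c0, c1 = Z[y == 0].mean(axis=0), Z[y == 1].mean(axis=0)
        pred = (np.linalg.norm(Z - c1, axis=1) <
                np.linalg.norm(Z - c0, axis=1)).astype(int)
        return (pred == y).mean()
    selected = []
    while len(selected) < max_feats:
        best, best_acc = None, -1.0
        for j in range(X.shape[1]):
            if j in selected:
                continue
            acc = accuracy(selected + [j])
            if acc > best_acc:
                best, best_acc = j, acc
        selected.append(best)
    return selected
```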

10.
Dong Yongsheng, Scientia Sinica Mathematica, 2013, 43(11): 1059–1070
Texture is a key feature widely used in image analysis and recognition, and the wavelet transform is a common tool for texture representation and classification. However, wavelet-based texture classification methods often ignore the information in the low-frequency wavelet subband and cannot capture the block-like singularities of image textures. This paper proposes a local energy histogram model for wavelet subband coefficients, a Poisson mixture model for contourlet features, and a feature extraction method based on clustering of contourlet subband coefficients, and applies them to image texture classification. The local-energy-histogram method solves the problem of modeling the low-frequency wavelet subband; the Poisson-mixture method is the first to use Poisson mixture models for contourlet subband features; and the contourlet-domain clustering method is a fast classification method. Experimental results show that all three proposed methods outperform current representative texture classification methods.
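A local energy histogram over a wavelet subband can be sketched as below. A one-level Haar transform stands in for whatever wavelet filters the paper uses, and the window and bin sizes are arbitrary illustrative choices; the point is that the same descriptor applies to the low-frequency LL band as well as the detail bands.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2D Haar transform: returns LL, LH, HL, HH subbands."""
    a = (img[0::2] + img[1::2]) / 2.0    # row averages
    d = (img[0::2] - img[1::2]) / 2.0    # row differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def local_energy_histogram(subband, win=4, bins=8):
    """Normalized histogram of mean energies over non-overlapping windows."""
    h, w = subband.shape
    energies = [(subband[i:i + win, j:j + win] ** 2).mean()
                for i in range(0, h - win + 1, win)
                for j in range(0, w - win + 1, win)]
    hist, _ = np.histogram(energies, bins=bins)
    return hist / hist.sum()
```

Concatenating such histograms over all subbands gives a texture feature vector that no longer discards the LL-band information.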

11.
In this paper, we introduce a developed approach of model-free neuro-fuzzy estimators to treat decision problems where the presence of highly multimodal distributions and spurious clusters in the decision space causes estimation problems. After a critical analysis of Bart Kosko's DCL–AVQ + FAM neuro-fuzzy system, we developed our Straitjacket + F-wing system. The Straitjacket neural system is an improved version of Kohonen's feature mapping with respect to convergence and termination conditions. The F-wing fuzzy system is based on the feature map produced by Straitjacket and provides a faster and more exact estimator interface than Kosko's FAM, using fuzzy wing functions instead of fuzzy hyperpyramids. At the end of the paper we perform a test on an artificial database, comparing the efficiency of our approach with Kosko's system, discriminant analysis, and hierarchical and k-means clustering.

12.
This paper presents upper bounds for the Satellite Revenue Selection and Scheduling problem (SRSS). A compact model of this generalized Prize Collecting Traveling Salesman Problem with Time Windows is defined and enriched with valid inequalities based on task-interval reasoning. The non-concavity of the objective function to be maximized is also studied. Finally, a Russian Dolls approach combines bounds on nested sub-problems. These first upper bounds for the SRSS problem are compared to the best known solutions on the benchmark of the optimization challenge organized by the French OR society. Received: June 2003, Revised: January 2004, MSC classification: 90C90

13.
In De Clerck and Delanote (Des Codes Cryptogr, 32: 103–110, 2004) it is shown that if a (0,α)-geometry with α ≥ 3 is fully embedded in AG(n,q) then it is a linear representation. In De Feyter (J Combin Theory Ser A, 109(1): 1–23, 2005; Discrete Math, 292: 45–54, 2005) the (0,2)-geometries fully embedded in AG(3,q) are classified apart from two open cases. In this paper, we solve these two open cases. This classification for AG(3,q) is used in De Feyter (Adv Geom, 5: 279–292, 2005) to classify the (0,2)-geometries fully embedded in AG(n,q).

14.
Convex clustering, a convex relaxation of k-means clustering and hierarchical clustering, has drawn recent attention since it nicely addresses the instability issue of traditional nonconvex clustering methods. Although its computational and statistical properties have been recently studied, the performance of convex clustering has not yet been investigated in the high-dimensional clustering scenario, where the data contain a large number of features, many of which carry no information about the clustering structure. In this article, we demonstrate that the performance of convex clustering can be distorted when uninformative features are included in the clustering. To overcome this, we introduce a new clustering method, referred to as Sparse Convex Clustering, that simultaneously clusters observations and conducts feature selection. The key idea is to formulate convex clustering in a form of regularization, with an adaptive group-lasso penalty term on the cluster centers. To optimally balance the trade-off between cluster fitting and sparsity, a tuning criterion based on clustering stability is developed. Theoretically, we obtain a finite-sample error bound for our estimator and further establish its variable selection consistency. The effectiveness of the proposed method is examined through a variety of numerical experiments and a real data application. Supplementary material for this article is available online.
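The shape of such an objective can be sketched as a data-fit term, a pairwise fusion penalty on the centers, and a group penalty on the center columns. The exact form below (uniform default weights, unit column weights) is an illustrative assumption, not the paper's precise adaptive formulation.

```python
import numpy as np

def sparse_convex_clustering_objective(X, A, gamma1, gamma2, w=None, u=None):
    """Evaluate a sparse-convex-clustering-style objective.

    X, A : (n, p) data matrix and center matrix (row i of A is the center
    assigned to observation i).  w : pairwise fusion weights,
    u : per-feature group-penalty weights.
    """
    n, p = X.shape
    if w is None:
        w = np.ones((n, n))
    if u is None:
        u = np.ones(p)
    fit = 0.5 * ((X - A) ** 2).sum()                      # data fidelity
    fusion = sum(w[i, j] * np.linalg.norm(A[i] - A[j])    # fuses centers
                 for i in range(n) for j in range(i + 1, n))
    sparsity = (u * np.linalg.norm(A, axis=0)).sum()      # group lasso on columns
    return fit + gamma1 * fusion + gamma2 * sparsity
```

Driving a column of `A` to zero removes the corresponding feature from every cluster center at once, which is how the group penalty performs feature selection.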

15.
Feature selection consists of choosing a subset of available features that capture the relevant properties of the data. In supervised pattern classification, a good choice of features is fundamental for building compact and accurate classifiers. In this paper, we develop an efficient feature selection method using the zero-norm l0 in the context of support vector machines (SVMs). The discontinuity of l0 at the origin makes the corresponding optimization problem difficult to solve. To overcome this drawback, we use a robust DC (difference of convex functions) programming approach, a general framework for non-convex continuous optimization. We consider an appropriate continuous approximation to l0 such that the resulting problem can be formulated as a DC program. Our DC algorithm (DCA) has finite convergence and requires solving one linear program at each iteration. Computational experiments on standard datasets, including challenging feature-selection problems from the NIPS 2003 feature selection challenge and gene selection for cancer classification, show that the proposed method is promising: while it can suppress more than 99% of the features, it still provides good classification. Moreover, the comparative results illustrate the superiority of the proposed approach over standard methods such as classical SVMs and feature selection via concave minimization.
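One common continuous approximation of this kind, with an explicit DC split, can be sketched as below. The exponential form is one choice used in the DC programming literature; whether it matches the approximation chosen in this particular paper is an assumption.

```python
import numpy as np

def l0_approx(wvec, alpha=5.0):
    """Smooth concave approximation of the zero-norm:
    sum_i (1 - exp(-alpha * |w_i|)).  As alpha grows, this tends to the
    exact count of nonzero components of wvec."""
    return float(np.sum(1.0 - np.exp(-alpha * np.abs(wvec))))

def dc_decomposition(wvec, alpha=5.0):
    """DC split  approx = g - h  with both parts convex:
    g(w) = alpha * ||w||_1,
    h(w) = alpha * ||w||_1 - sum_i (1 - exp(-alpha * |w_i|))."""
    g = alpha * float(np.abs(wvec).sum())
    h = g - l0_approx(wvec, alpha)
    return g, h
```

DCA then linearizes the concave part -h at the current iterate and minimizes the resulting convex surrogate, which for this split reduces to a linear program when combined with a linear SVM loss.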

16.
In the Frequency Assignment Problem with Polarization (FAPP), a given set of links must each be assigned a frequency and a polarization, while respecting given radio-electric compatibility constraints defined on pairs of links. In this paper, we propose a tabu search algorithm for the FAPP. A specialized neighborhood is proposed for the problem. Other key features of the algorithm are an adaptive technique to adjust the tabu tenure, an original diversification technique, and a pre-processing procedure based on arc-consistency techniques. The algorithm is tested on the 40 instances of the ROADEF Challenge 2001. It reaches the best known feasibility level for all instances and finds or improves on the best known solutions of the Challenge for a majority of the instances. Received: July 2003 / Revised version: September 2004. MSC classification: 90B18, 68T20
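The tabu search loop that such an algorithm builds on can be sketched generically. This skeleton (integer-vector states, ±1 moves, reverse-move tabu, aspiration) is an illustrative assumption; the FAPP algorithm layers its specialized neighborhood, adaptive tenure, and diversification on top of a loop of this shape.

```python
def tabu_search(init, cost, tenure=4, n_iter=300):
    """Generic tabu-search skeleton over integer vectors.

    Moves change one coordinate by +/-1; the reverse of an applied move is
    tabu for `tenure` iterations, unless it would beat the best cost found
    so far (aspiration criterion)."""
    cur = list(init)
    best, best_cost = list(cur), cost(cur)
    tabu = {}                                        # move -> expiry iteration
    for it in range(n_iter):
        best_move, best_nb, best_nb_cost = None, None, float('inf')
        for i in range(len(cur)):
            for d in (-1, 1):
                nb = list(cur)
                nb[i] += d
                c = cost(nb)
                is_tabu = tabu.get((i, d), -1) >= it
                if is_tabu and c >= best_cost:       # aspiration override only
                    continue
                if c < best_nb_cost:
                    best_move, best_nb, best_nb_cost = (i, d), nb, c
        if best_nb is None:
            break
        cur = best_nb
        tabu[(best_move[0], -best_move[1])] = it + tenure  # forbid the reverse move
        if best_nb_cost < best_cost:
            best, best_cost = list(cur), best_nb_cost
    return best, best_cost
```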

17.
This paper is devoted to selected topics relating Combinatorial Optimization and Hierarchical Classification. It is oriented toward extensions of the standard classification schemes (the hierarchies): pyramids, quasi-hierarchies, circular clustering, rigid clustering and others. Bijection theorems between these models and dissimilarity models make it possible to state some clustering problems as optimization problems. Within this optimization landscape we especially discuss the following: NP-completeness results and the search for polynomial instances; problems solvable in polynomial time (e.g. subdominant theory); and the design, analysis and applications of algorithms. In contrast with this orientation toward new clustering problems, the last part discusses some standard algorithmic approaches. Received: July 2004, Revised: September 2004, MSC classification: 62H30, 91C20, 05C65, 90C27, 68R01

18.
This paper is concerned with automated classification of Combinatorial Optimization Problem instances for instance-specific parameter tuning purposes. We propose the CluPaTra framework, a generic approach to CLUster instances based on similar PAtterns according to search TRAjectories, and apply it to parameter tuning. The key idea is to use the search trajectory as a generic feature for clustering problem instances. The advantage of using the search trajectory is that it can be obtained from any local-search based algorithm with little additional computation time. We explore and compare two different search trajectory representations, two sequence alignment techniques (to calculate similarities), and two well-known clustering methods. We report experimental results on two classical problems, the Travelling Salesman Problem and the Quadratic Assignment Problem, as well as an industrial case study.
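The pipeline of aligning trajectories and grouping similar ones can be sketched as below. `difflib`'s alignment ratio and the single-pass threshold grouping are stand-ins chosen for brevity; the paper compares dedicated sequence alignment techniques and established clustering methods instead.

```python
from difflib import SequenceMatcher

def trajectory_clusters(trajs, threshold=0.6):
    """Group search trajectories (symbol sequences) whose pairwise
    alignment similarity exceeds a threshold.

    trajs : list of strings, each symbol encoding one visited-solution
    attribute along a local-search run.
    """
    n = len(trajs)
    sim = [[SequenceMatcher(None, trajs[i], trajs[j]).ratio()
            for j in range(n)] for i in range(n)]
    clusters, assigned = [], [False] * n
    for i in range(n):
        if assigned[i]:
            continue
        group = [i]
        assigned[i] = True
        for j in range(i + 1, n):
            if not assigned[j] and sim[i][j] >= threshold:
                group.append(j)
                assigned[j] = True
        clusters.append(group)
    return clusters
```

Each resulting cluster of instances can then be tuned once, with the tuned configuration shared by all its members.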

19.
Minimal codewords were introduced by Massey (Proceedings of the 6th Joint Swedish-Russian International Workshop on Information Theory, pp 276–279, 1993) for cryptographic purposes. They are used in particular secret sharing schemes to model the access structures. We study minimal codewords of weight smaller than 3 · 2^{m−r} in binary Reed–Muller codes RM(r, m) and translate our problem into a geometrical one, using a classification result of Kasami and Tokura (IEEE Trans Inf Theory 16:752–759, 1970) and Kasami et al. (Inf Control 30(4):380–395, 1976) on Boolean functions. In this geometrical setting, we calculate the numbers of non-minimal codewords, and so obtain the number of minimal codewords in the cases where we have information about the weight distribution of the code RM(r, m). The presented results improve previous results obtained theoretically by Borissov et al. (Discrete Appl Math 128(1):65–74, 2003) and computer-aided results of Borissov and Manev (Serdica Math J 30(2–3):303–324, 2004). This paper is in fact an extended abstract; full proofs can be found on the arXiv.
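The definition at work here is easy to check by brute force on a tiny case: a nonzero codeword is minimal if no other nonzero codeword's support is strictly contained in its support. The sketch below enumerates RM(1, m) (evaluations of affine Boolean functions) and filters the minimal codewords; it is a toy verification, far from the paper's parameter range.

```python
import itertools

def rm1_codewords(m):
    """All codewords of the first-order Reed-Muller code RM(1, m):
    evaluations of the affine functions a0 + a.x over F_2^m."""
    pts = list(itertools.product([0, 1], repeat=m))
    words = []
    for coeffs in itertools.product([0, 1], repeat=m + 1):
        a0, a = coeffs[0], coeffs[1:]
        words.append(tuple((a0 + sum(ai * xi for ai, xi in zip(a, x))) % 2
                           for x in pts))
    return words

def minimal_codewords(words):
    """Nonzero codewords whose support strictly contains the support of no
    other nonzero codeword."""
    nz = [w for w in words if any(w)]
    supports = [frozenset(i for i, b in enumerate(w) if b) for w in nz]
    return [nz[i] for i, s in enumerate(supports)
            if not any(j != i and supports[j] < s for j in range(len(nz)))]
```

In RM(1, 3), for example, the all-ones word of weight 8 is the only non-minimal nonzero codeword, since its support contains every weight-4 support.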

20.
Classification on high-dimensional data with thousands to tens of thousands of dimensions is a challenging task due to the high dimensionality and the quality of the feature set. The problem can be addressed by using feature selection to choose only informative features, or feature construction to create new high-level features. Genetic programming (GP) using a tree-based representation can be used for both feature construction and implicit feature selection. This work presents a comprehensive study investigating the use of GP for feature construction and selection on high-dimensional classification problems. Different combinations of the constructed and/or selected features are tested and compared on seven high-dimensional gene expression problems, and different classification algorithms are used to evaluate their performance. The results show that the constructed and/or selected feature sets can significantly reduce the dimensionality and maintain or even increase the classification accuracy in most cases. The cases where overfitting occurred are analysed via the distribution of features. Further analysis is also performed to show why the constructed features can achieve promising classification performance.
