Similar Literature
Found 20 similar documents; search took 562 ms.
1.
Multi-label classification problems require each instance to be assigned a subset of a defined set of labels. This problem is equivalent to finding a multi-valued decision function that predicts a vector of binary classes. In this paper we study the decision boundaries of two widely used approaches for building multi-label classifiers when Bayesian network-augmented naive Bayes classifiers are used as base models: the binary relevance method and chain classifiers. In particular, extending previous single-label results to multi-label chain classifiers, we find polynomial expressions for the multi-valued decision functions associated with these methods. We prove upper bounds on the expressive power of both methods, and we prove that chain classifiers provide a more expressive model than the binary relevance method.

2.
Discrete support vector machines (DSVM), originally proposed for binary classification problems, have been shown to outperform other competing approaches on well-known benchmark datasets. Here we address their extension to multicategory classification, by developing three different methods. Two of them are based respectively on one-against-all and round-robin classification schemes, in which a number of binary discrimination problems are solved by means of a variant of DSVM. The third method directly addresses the multicategory classification task, by building a decision tree in which an optimal split to separate classes is derived at each node by a new extended formulation of DSVM. Computational tests on publicly available datasets are then conducted to compare the three multicategory classifiers based on DSVM with other methods, indicating that the proposed techniques achieve significantly higher accuracies. This research was partially supported by PRIN grant 2004132117.

3.
The polyhedral homotopy method, which is known as a powerful numerical method for computing all isolated zeros of a polynomial system, requires all mixed cells of the support of the system to construct a family of homotopy functions. The mixed cells are reformulated in terms of a linear inequality system with an additional combinatorial condition. An enumeration tree is constructed among a family of linear inequality systems induced from it such that every mixed cell corresponds to a unique feasible leaf node, and depth-first search is applied to the enumeration tree to find all the feasible leaf nodes. How such an enumeration tree is constructed is crucial for computational efficiency. This paper proposes a dynamic construction of an enumeration tree, which branches each parent node into its child nodes so that the number of feasible child nodes is expected to be small; hence we can prune many subtrees which do not contain any mixed cell. Numerical results show that the proposed dynamic construction of an enumeration tree works very efficiently for large scale polynomial systems; for example, it generated all mixed cells of the cyclic-15 problem for the first time in less than 16 hours.

4.
This paper proposes a novel ant colony optimisation (ACO) algorithm tailored for the hierarchical multi-label classification problem of protein function prediction. This problem is a very active research field, given the large increase in the number of uncharacterised proteins available for analysis and the importance of determining their functions in order to improve the current biological knowledge. Since it is known that a protein can perform more than one function and many protein functional-definition schemes are organised in a hierarchical structure, the classification problem in this case is an instance of a hierarchical multi-label problem. In this type of problem, each example may belong to multiple class labels and class labels are organised in a hierarchical structure, either a tree or a directed acyclic graph. This is a more complex problem than conventional flat classification, given that the classification algorithm has to take into account hierarchical relationships between class labels and be able to predict multiple class labels for the same example. The proposed ACO algorithm discovers an ordered list of hierarchical multi-label classification rules. It is evaluated on sixteen challenging bioinformatics data sets involving hundreds or thousands of class labels to be predicted and compared against state-of-the-art decision tree induction algorithms for hierarchical multi-label classification.

5.
Artificial neural networks (ANN) have been widely used for both classification and prediction. This paper is focused on the prediction problem in which an unknown function is approximated. ANNs can be viewed as models of real systems, built by tuning parameters known as weights. In training the net, the problem is to find the weights that optimize its performance (i.e., to minimize the error over the training set). Although the most popular method for training these networks is back propagation, other optimization methods such as tabu search or scatter search have been successfully applied to solve this problem. In this paper we propose a path relinking implementation to solve the neural network training problem. Our method uses GRG, a gradient-based local NLP solver, as an improvement phase, while previous approaches used simpler local optimizers. The experimentation shows that the proposed procedure can compete with the best-known algorithms in terms of solution quality, consuming a reasonable computational effort.

6.
Following the model introduced by Aguech et al. (Probab Eng Inf Sci 21:133–141, 2007), the weighted depth of a node in a labelled rooted tree is the sum of all labels on the path connecting the node to the root. We analyse weighted depths of nodes with given labels, the last inserted node, nodes ordered as visited by the depth-first search process, the weighted path length and the weighted Wiener index in a random binary search tree. We establish three regimes of nodes depending on whether the second-order behaviour of their weighted depths follows from fluctuations of the keys on the path, the depth of the nodes or both. Finally, we investigate a random distribution function on the unit interval arising as a scaling limit for weighted depths of nodes with at most one child.
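
The definition of weighted depth above can be illustrated with a minimal Python sketch. It assumes that node labels are the integer keys themselves and that the sum includes both the root and the node's own label; the abstract does not settle either point, and the helper names (`build`, `depths`, `random_bst`) are hypothetical.

```python
import random


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into an unbalanced binary search tree and return the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right


def build(keys):
    """Build a binary search tree by inserting keys in the given order."""
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def depths(root, key):
    """Return (depth, weighted depth) of the node holding key.

    Assumption: the weighted depth sums every key on the root-to-node
    path, endpoints included. The key must be present in the tree.
    """
    depth, weighted = 0, 0
    node = root
    while node.key != key:
        weighted += node.key
        depth += 1
        node = node.left if key < node.key else node.right
    return depth, weighted + node.key


def random_bst(n, seed=0):
    """Random binary search tree: insert a uniformly random permutation of 1..n."""
    keys = list(range(1, n + 1))
    random.Random(seed).shuffle(keys)
    return build(keys)


tree = build([4, 2, 6, 1, 3])
print(depths(tree, 3))  # path 4 -> 2 -> 3, so depth 2 and weighted depth 9
```

With `random_bst(n)` the same `depths` call can be used to sample weighted depths empirically for a chosen key.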

7.
This article introduces a classification tree algorithm that can simultaneously reduce tree size, improve class prediction, and enhance data visualization. We accomplish this by fitting a bivariate linear discriminant model to the data in each node. Standard algorithms can produce fairly large tree structures because they employ a very simple node model, wherein the entire partition associated with a node is assigned to one class. We reduce the size of our trees by letting the discriminant models share part of the data complexity. Being themselves classifiers, the discriminant models can also help to improve prediction accuracy. Finally, because the discriminant models use only two predictor variables at a time, their effects are easily visualized by means of two-dimensional plots. Our algorithm does not simply fit discriminant models to the terminal nodes of a pruned tree, as this does not reduce the size of the tree. Instead, discriminant modeling is carried out in all phases of tree growth and the misclassification costs of the node models are explicitly used to prune the tree. Our algorithm is also distinct from the “linear combination split” algorithms that partition the data space with arbitrarily oriented hyperplanes. We use axis-orthogonal splits to preserve the interpretability of the tree structures. An extensive empirical study with real datasets shows that, in general, our algorithm has better prediction power than many other tree or nontree algorithms.

8.
We present a global optimization algorithm, Branch-and-Sandwich, for optimistic bilevel programming problems that satisfy a regularity condition in the inner problem. The functions involved are assumed to be nonconvex and twice continuously differentiable. The proposed approach can be interpreted as the exploration of two solution spaces (corresponding to the inner and the outer problems) using a single branch-and-bound tree. A novel branching scheme is developed such that classical branch-and-bound is applied to both spaces without violating the hierarchy in the decisions and the requirement for (global) optimality in the inner problem. To achieve this, the well-known features of branch-and-bound algorithms are customized appropriately. For instance, two pairs of lower and upper bounds are computed: one for the outer optimal objective value and the other for the inner value function. The proposed bounding problems do not grow in size during the algorithm and are obtained from the corresponding problems at the parent node.

9.
Directed acyclic graphs (DAGs) constitute a qualitative representation for conditional independence (CI) properties of a probability distribution. It is known that every CI statement implied by the topology of a DAG is witnessed over it under a graph-theoretic criterion of d-separation. Alternatively, all such implied CI statements are derivable from the local independencies encoded by a DAG using the so-called semi-graphoid axioms. We consider Labeled Directed Acyclic Graphs (LDAGs), which graphically model scenarios exhibiting context-specific independence (CSI). Such CSI statements are modeled by labeled edges, where labels encode contexts in which the edge vanishes. We study the problem of identifying all independence statements implied by the structure and the labels of an LDAG. We show that this problem is coNP-hard for LDAGs and formulate a sound extension of the semi-graphoid axioms for the derivation of such implied independencies. Finally, we connect our study to certain qualitative versions of independence ubiquitous in database theory and team semantics.

10.
Automatic image annotation is concerned with the task of assigning one or more semantic concepts to a given image. It is a typical multi-label classification problem. This paper presents a novel multi-label classification framework, MLNRS, based on neighborhood rough sets for automatic image annotation, which considers the uncertainty of the mapping from the visual feature space to the semantic concept space. Given a new instance, its neighbors in the training set are first identified. After that, based on the concept of upper and lower approximations of neighborhood rough sets, all possible labels of the given instance are found. Then, based on the statistical information gained from the label sets of the neighbors, the maximum a posteriori (MAP) principle is used to determine the label set for the given instance. Experiments on three different image datasets show that MLNRS achieves more promising performance than some well-known multi-label learning algorithms.

11.
A fundamental problem in computational biology is the phylogeny reconstruction for a set of specific organisms. One of the graph theoretical approaches is to construct a similarity graph on the set of organisms where adjacency indicates evolutionary closeness, and then to reconstruct a phylogeny by computing a tree interconnecting the organisms such that leaves in the tree are labeled by the organisms and every organism appears as a leaf in the tree. The similarity graph is simple and undirected. For any pair of adjacent organisms in the similarity graph, their distance in the output tree, which is measured by the number of edges on the path connecting them, must be less than some pre-specified bound. This is known as the problem of recognizing leaf powers and computing leaf roots. Graphs that are leaf powers are known to be chordal. It is shown in this paper that all strictly chordal graphs are leaf powers and a linear time algorithm is presented to compute a leaf root for any given strictly chordal graph. An intermediate root-and-power problem, the Steiner root problem, is also examined.

12.
Positive and negative hierarchies of nonlinear integrable lattice models are derived from a discrete spectral problem. The two lattice hierarchies are proved to have discrete zero curvature representations associated with the discrete spectral problem, which also shows that the positive and negative hierarchies correspond to positive and negative power expansions of Lax operators with respect to the spectral parameter, respectively. Moreover, the integrable lattice models in the positive hierarchy are of polynomial type, and the integrable lattice models in the negative hierarchy are of rational type. Further, we construct three integrable coupling systems of the positive hierarchy using the method of enlarging Lax pairs.

13.
In this paper, we study the shortest path tour problem, in which a shortest path from a given origin node to a given destination node must be found in a directed graph with non-negative arc lengths. Such a path needs to cross a sequence of node subsets that are given in a fixed order. The subsets are disjoint and may differ in size. A polynomial-time reduction of the problem to a classical shortest path problem over a modified digraph is described, and two solution methods, based on this reduction and on dynamic programming respectively, are proposed and compared with the state-of-the-art solution procedure. The proposed methods are tested on existing datasets for this problem and on a large class of new benchmark instances. The computational experience shows that both proposed methods consistently improve computational time with respect to the existing solution method.
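
The idea of reducing the tour constraint to an ordinary shortest path computation can be sketched as follows. This is not the paper's modified digraph; it is an illustrative layered state space in which a state is a pair (node, number of subsets already crossed in order), searched with Dijkstra's algorithm. The function name, the adjacency-list format and the toy graph are assumptions made for the example.

```python
import heapq


def shortest_path_tour(graph, origin, destination, subsets):
    """Length of a shortest origin-destination walk crossing subsets in order.

    graph: dict mapping node -> list of (neighbor, non-negative length).
    subsets: list of node sets, in the order they must be crossed.
    Returns None when no feasible walk exists.
    """
    def advance(node, k):
        # Arriving at a node of the next required subset moves to the next layer.
        while k < len(subsets) and node in subsets[k]:
            k += 1
        return k

    start = (origin, advance(origin, 0))
    goal = (destination, len(subsets))
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, state = heapq.heappop(heap)
        if state == goal:
            return d
        if d > dist.get(state, float("inf")):
            continue  # stale heap entry
        node, k = state
        for nxt, length in graph.get(node, []):
            new_state = (nxt, advance(nxt, k))
            nd = d + length
            if nd < dist.get(new_state, float("inf")):
                dist[new_state] = nd
                heapq.heappush(heap, (nd, new_state))
    return None


toy = {
    "s": [("a", 1), ("b", 4)],
    "a": [("d", 1), ("b", 1)],
    "b": [("d", 1)],
}
print(shortest_path_tour(toy, "s", "d", []))       # 2, via s -> a -> d
print(shortest_path_tour(toy, "s", "d", [{"b"}]))  # 3, via s -> a -> b -> d
```

The layered search visits at most (number of nodes) × (number of subsets + 1) states, which is why the constrained problem stays polynomial.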

14.
As wireless sensor networks (WSNs) have become very widely used, mobile sensor networks (MSNs) have recently attracted increasing attention from researchers. Existing routing tree maintenance methods for query processing are designed for static WSNs, and most are not directly applicable to MSNs because of node mobility. In particular, sensor nodes in the real world are constantly moving, which seriously affects the stability of the routing tree. In this paper, we therefore propose a novel method, routing tree maintenance based on trajectory prediction in mobile sensor networks (RTTP), to guarantee long-term stability of the routing tree. First, we establish a trajectory prediction model based on the extreme learning machine (ELM), with which we predict each sensor node's trajectory in order to choose an appropriate parent node for each non-effective node. Then, to improve the performance of RTTP, we propose an improved version (I-RTTP) that uses a probabilistic method to minimize the error and improve accuracy. As a result, the routing tree in MSNs can be made more stable. Finally, extensive experimental results show that RTTP and I-RTTP effectively improve the stability of the routing tree and greatly reduce the energy consumption of mobile sensor nodes.

15.
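Some enumerative characteristics of recursive trees   Total citations: 1, self-citations: 0, citations by others: 1
Recursive trees were defined by Meir and Moon as a kind of non-plane increasing tree in which every out-degree is allowed. This paper first establishes a new one-to-one correspondence between the set of recursive trees on n nodes and the permutations of n-1 elements. The correspondence simultaneously relates the leaves of a tree to the runs of a permutation and gives a close relationship between the number of leaves and the number of runs. We also study various enumerative characteristics of recursive trees, such as enumeration by node type (internal nodes and leaves, even and odd nodes, nodes with different out-degrees) and enumeration of path lengths (by node type).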
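
The counting side of this correspondence can be checked by brute force for small n. The sketch below is only a numerical check, not the bijection constructed in the paper. It assumes that "runs" means maximal ascending runs of a permutation, encodes a recursive tree as a parent map in which node j > 1 attaches to any earlier node, and uses hypothetical function names throughout.

```python
from collections import Counter
from itertools import permutations, product
from math import factorial


def recursive_trees(n):
    """Yield each recursive tree on nodes 1..n as a dict child -> parent.

    Node 1 is the root and node j > 1 may attach to any node smaller than j.
    """
    choices = [range(1, j) for j in range(2, n + 1)]
    for parents in product(*choices):
        yield dict(zip(range(2, n + 1), parents))


def leaf_count(tree, n):
    """Number of nodes in 1..n that are nobody's parent."""
    internal = set(tree.values())
    return sum(1 for v in range(1, n + 1) if v not in internal)


def ascending_runs(perm):
    """Number of maximal ascending runs (descents plus one)."""
    return 1 + sum(1 for a, b in zip(perm, perm[1:]) if a > b)


def leaf_distribution(n):
    """Counter mapping leaf count -> number of recursive trees on n nodes."""
    return Counter(leaf_count(t, n) for t in recursive_trees(n))


def run_distribution(m):
    """Counter mapping run count -> number of permutations of m elements."""
    return Counter(ascending_runs(p) for p in permutations(range(1, m + 1)))


for n in range(2, 7):
    trees = leaf_distribution(n)
    # There are (n-1)! recursive trees, as many as permutations of n-1 elements.
    assert sum(trees.values()) == factorial(n - 1)
    print(n, sorted(trees.items()), trees == run_distribution(n - 1))
```

For n = 4, for example, the six trees split into 1, 4 and 1 trees with one, two and three leaves, matching the run counts of the six permutations of three elements.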

16.
In this paper we consider efficient ways to generate multi-stage scenario trees. A general modified K-means clustering method is first presented to generate a scenario tree with a general structure. This method takes the time dependency of the simulated paths into account. Based on the traditional and modified K-means analyses, the moment matching of multi-stage scenario trees is described as a linear programming (LP) problem. By simultaneously using simulation, clustering, non-linear time series and moment matching techniques, we propose a sequential generation method and a new hybrid approach that can generate the whole multi-stage tree at once. The advantages of these new methods are as follows: the vector autoregressive and multivariate generalized autoregressive conditional heteroscedasticity (VAR-MGARCH) model is adopted to properly reflect the inter-stage dependency and the time-varying volatilities of the data process; the LP-based moment matching technique ensures that the scenario tree generation problem can be solved more efficiently and that the tree scale can be further controlled; and, meanwhile, the statistical properties of the random data process are properly maintained. More importantly, our new LP methods guarantee that at least two branches are derived from each non-leaf node and thus overcome a drawback of related papers. We carry out a series of numerical experiments and apply the scenario tree generation methods to a portfolio management problem, which demonstrates the practicality, efficiency and advantages of our new approaches over other models and methods.

17.
Global Priority Estimation in Multiperson Decision Making   Total citations: 1, self-citations: 0, citations by others: 1
The analytic hierarchy process, generalized to synthesize local preferences across a complex hierarchy structure into global priorities in multiperson decision making, is considered under various approaches. They include the classic eigenvector synthesis and the multiplicative method, as well as several other suggested techniques, such as three-dimensional eigenvectors and simultaneous linear and nonlinear estimation over the whole hierarchy structure in synthetic optimizing procedures. A numerical example from marketing research with many criteria, subcriteria, alternatives, and respondents is presented. A comparison of the quality of approximation across the approaches shows that the best results are produced by the nonlinear synthetic priority techniques. The techniques described have been successfully used in dozens of real-world projects and are very helpful for practical managerial decision making in complex hierarchies. Dr. Stan Lipovetsky is Senior Research Director, GfK Custom Research North America, Marketing Science, Research Center for Excellence. The author thanks Professor T. Rapcsak and two reviewers for insightful suggestions that improved the paper.

18.
This paper proposes input selection methods for fuzzy modeling that are based on decision tree search approaches. The branching decision at each node of the tree is made based on the accuracy of the model available at the node. We propose two different decision tree search algorithms, bottom-up and top-down, and four different measures for selecting the most appropriate set of inputs at every branching node (or decision node). Both decision tree approaches are tested using real-world application examples. These methods are applied to fuzzy modeling of two different classification problems and to fuzzy modeling of two dynamic processes. The accuracy of the models on the four examples is compared in terms of several performance measures. Moreover, the advantages and drawbacks of the bottom-up and top-down approaches are discussed.

19.
In recent years, the Hamiltonian Monte Carlo (HMC) algorithm has been found to work more efficiently than other popular Markov chain Monte Carlo (MCMC) methods (such as random walk Metropolis–Hastings) in generating samples from a high-dimensional probability distribution. HMC has proven more efficient in terms of mixing rates and effective sample size than previous MCMC techniques, but still may not be sufficiently fast for particularly large problems. The use of GPUs promises to push HMC even further, greatly increasing the utility of the algorithm. By expressing the computationally intensive portions of HMC (the evaluations of the probability kernel and its gradient) in terms of linear or element-wise operations, HMC can be made highly amenable to the use of graphics processing units (GPUs). A multinomial regression example demonstrates the promise of GPU-based HMC sampling. Using GPU-based memory objects to perform the entire HMC simulation, most of the latency penalties associated with transferring data from main to GPU memory can be avoided. Thus, the proposed computational framework may appear conceptually very simple, but has the potential to be applied to a wide class of hierarchical models relying on HMC sampling. Models whose posterior density and corresponding gradients can be reduced to linear or element-wise operations are amenable to significant speed-ups through the use of GPUs. Analyses of datasets that were previously intractable for fully Bayesian approaches due to the prohibitively high computational cost are now feasible using the proposed framework.
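
To show where the two expensive pieces named above (the negative log density and its gradient) enter the algorithm, here is a minimal CPU-only sketch of HMC in plain Python for a one-dimensional standard normal target. It is not the GPU framework from the paper; the target, step size, trajectory length and function names are all assumptions chosen for illustration.

```python
import math
import random


def neg_log_normal(q):
    """Negative log density of a standard normal, up to a constant."""
    return 0.5 * q * q


def grad_neg_log_normal(q):
    """Gradient of neg_log_normal."""
    return q


def leapfrog(q, p, grad, step, n_steps):
    """Leapfrog integration of Hamiltonian dynamics (requires n_steps >= 1)."""
    p = p - 0.5 * step * grad(q)          # initial half step for momentum
    for i in range(n_steps):
        q = q + step * p                  # full step for position
        if i < n_steps - 1:
            p = p - step * grad(q)        # full step for momentum
    p = p - 0.5 * step * grad(q)          # final half step for momentum
    return q, p


def hmc_sample(neg_log_prob, grad, q0, n_samples, step=0.2, n_steps=10, seed=0):
    """Draw n_samples with HMC, using a unit-mass Gaussian momentum."""
    rng = random.Random(seed)
    q, samples = q0, []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)           # resample momentum
        q_new, p_new = leapfrog(q, p, grad, step, n_steps)
        h_old = neg_log_prob(q) + 0.5 * p * p
        h_new = neg_log_prob(q_new) + 0.5 * p_new * p_new
        # Metropolis correction for the integration error.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples


draws = hmc_sample(neg_log_normal, grad_neg_log_normal, 0.0, 2000)
print(sum(draws) / len(draws))  # sample mean, expected to be near 0
```

In a vectorised setting, `neg_log_prob` and `grad` would be the calls expressed as linear or element-wise array operations; the surrounding loop is unchanged.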

20.
The problem of minimizing a quadratic form over the standard simplex is known as the standard quadratic optimization problem (SQO). It is NP-hard, and contains the maximum stable set problem in graphs as a special case. In this note, we show that the SQO problem may be reformulated as an (exponentially sized) linear program (LP). This reformulation also suggests a hierarchy of polynomial-time solvable LPs whose optimal values converge finitely to the optimal value of the SQO problem. The hierarchies of LP relaxations from the literature do not share this finite convergence property for SQO, and we review the relevant counterexamples.
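
For reference, the problem and its link to stable sets can be written out as follows. The notation (Q, e, A_G, α(G)) is assumed here rather than taken from the note; the second identity is the classical Motzkin–Straus formulation, stated for a graph G with adjacency matrix A_G and stability number α(G).

```latex
% Standard quadratic optimization over the simplex
\min_{x \in \Delta} \; x^{\top} Q x,
\qquad
\Delta = \{\, x \in \mathbb{R}^{n} : e^{\top} x = 1,\; x \ge 0 \,\}

% Maximum stable set as a special case (Motzkin--Straus)
\frac{1}{\alpha(G)} \;=\; \min_{x \in \Delta} \; x^{\top} (A_G + I)\, x
```

Taking x uniform on a maximum stable set S gives the value 1/|S| in the second problem, which shows that the minimum is at most 1/α(G).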
