Similar documents
20 similar documents found (search time: 187 ms)
1.
Classification on high-dimensional data with thousands to tens of thousands of dimensions is a challenging task due to the high dimensionality and the quality of the feature set. The problem can be addressed by using feature selection to choose only informative features or feature construction to create new high-level features. Genetic programming (GP) using a tree-based representation can be used for both feature construction and implicit feature selection. This work presents a comprehensive study to investigate the use of GP for feature construction and selection on high-dimensional classification problems. Different combinations of the constructed and/or selected features are tested and compared on seven high-dimensional gene expression problems, and different classification algorithms are used to evaluate their performance. The results show that the constructed and/or selected feature sets can significantly reduce the dimensionality and maintain or even increase the classification accuracy in most cases. The cases in which overfitting occurred are analysed via the distribution of features. Further analysis is also performed to show why the constructed feature can achieve promising classification performance.

2.
The features used may have an important effect on the performance of credit scoring models. The process of choosing the best set of features for credit scoring models is usually unsystematic and dominated by somewhat arbitrary trial and error. This paper presents an empirical study of four machine learning feature selection methods. These methods provide an automatic data mining technique for reducing the feature space. The study illustrates how four feature selection methods (the ‘ReliefF’, ‘Correlation-based’, ‘Consistency-based’ and ‘Wrapper’ algorithms) help to improve three aspects of the performance of scoring models: model simplicity, model speed and model accuracy. The experiments are conducted on real data sets using four classification algorithms: ‘model tree (M5)’, ‘neural network (multi-layer perceptron with back-propagation)’, ‘logistic regression’, and ‘k-nearest-neighbours’.

3.
In this paper, we study the performance of various state-of-the-art classification algorithms applied to eight real-life credit scoring data sets. Some of the data sets originate from major Benelux and UK financial institutions. Different types of classifiers are evaluated and compared. Besides the well-known classification algorithms (e.g., logistic regression, discriminant analysis, k-nearest neighbour, neural networks and decision trees), this study also investigates the suitability and performance of some recently proposed, advanced kernel-based classification algorithms such as support vector machines and least-squares support vector machines (LS-SVMs). The performance is assessed using the classification accuracy and the area under the receiver operating characteristic curve. Statistically significant performance differences are identified using the appropriate test statistics. It is found that both the LS-SVM and neural network classifiers yield a very good performance, but simple classifiers such as logistic regression and linear discriminant analysis also perform very well for credit scoring.
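
The two performance measures named in this abstract can be sketched in a few lines. The block below is a minimal illustration, not the study's evaluation code: it computes classification accuracy at a fixed threshold and the area under the ROC curve through the standard rank (Mann-Whitney) formulation. The function names, the 0/1 label encoding and the 0.5 threshold are assumptions made for the example.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    labels: list of 0/1 class labels (1 = positive class).
    scores: list of classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs that are ranked correctly; ties count half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples classified correctly at a fixed score threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)
```

Unlike accuracy, the AUC does not depend on a single threshold, which is one reason the two measures are commonly reported side by side.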

4.
The Bonferroni mean (BM) has been generalized for its capacity to capture the interrelationship between input arguments. In order to obtain much more information in the process of group decision making, especially in cases where the relationships between the fused data are considered, this paper combines the power average operator with the intuitionistic fuzzy Bonferroni mean (IFBM) and develops the intuitionistic fuzzy power Bonferroni mean (IFPBM) and the weighted intuitionistic fuzzy power Bonferroni mean (WIFPBM). We investigate the desirable properties of these new extensions of BM and discuss their special cases. We give a comparison of the new extensions of BM with the corresponding existing IFBMs. Furthermore, the detailed steps of multiple attribute group decision making with the presented IFPBM or WIFPBM are given and numerical examples are illustrated to show the validity and feasibility of the new approaches.
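
For orientation, the classical (crisp) Bonferroni mean that these extensions build on can be written down directly. The sketch below covers only that classical operator on nonnegative real inputs; it is not the IFPBM or WIFPBM of the paper, and the function name and default parameters are assumptions for the example.

```python
def bonferroni_mean(values, p=1.0, q=1.0):
    """Classical Bonferroni mean of nonnegative numbers.

    BM^{p,q}(a_1..a_n) = ( 1/(n(n-1)) * sum_{i != j} a_i^p * a_j^q )^(1/(p+q))
    The cross terms a_i^p * a_j^q are what capture the interrelationship
    between pairs of input arguments.
    """
    n = len(values)
    if n < 2:
        raise ValueError("the Bonferroni mean needs at least two arguments")
    if p < 0 or q < 0 or p + q == 0:
        raise ValueError("p and q must be nonnegative and not both zero")
    if any(v < 0 for v in values):
        raise ValueError("arguments must be nonnegative")
    total = sum(
        values[i] ** p * values[j] ** q
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return (total / (n * (n - 1))) ** (1.0 / (p + q))
```

When all arguments are equal the operator returns that common value (idempotency), which is a quick sanity check on any implementation.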

5.
Combining multiple classifiers, known as ensemble methods, can give substantial improvement in prediction performance of learning algorithms, especially in the presence of non-informative features in the data sets. We propose an ensemble of a subset of kNN classifiers, ESkNN, for the classification task in two steps. Firstly, we choose classifiers based upon their individual performance using the out-of-sample accuracy. The selected classifiers are then combined sequentially starting from the best model and assessed for collective performance on a validation data set. We use benchmark data sets with their original and some added non-informative features for the evaluation of our method. The results are compared with usual kNN, bagged kNN, random kNN, multiple feature subset method, random forest and support vector machines. Our experimental comparisons on benchmark classification problems and simulated data sets reveal that the proposed ensemble gives better classification performance than the usual kNN and its ensembles, and performs comparably to random forest and support vector machines.
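
The two-step idea in this abstract (rank models by out-of-sample accuracy, then add them one by one starting from the best) can be sketched as below. This is a rough illustration under stated assumptions, not the authors' ESkNN: building each candidate kNN model on a random feature subset, combining by majority vote, and keeping a model only if validation accuracy strictly improves are all choices made for this example, and every name in the block is hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, features, k=3):
    """Predict the label of x with kNN, using only the given feature indices."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def accuracy(predictions, y):
    return sum(p == t for p, t in zip(predictions, y)) / len(y)


def majority_vote(per_model_predictions):
    """Combine one prediction list per model into a single prediction list."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*per_model_predictions)]


def esknn_sketch(train, select, valid, n_models=20, subset_size=2, k=3, seed=0):
    """Return (chosen feature subsets, validation accuracy of the ensemble)."""
    (Xtr, ytr), (Xs, ys), (Xv, yv) = train, select, valid
    rng = random.Random(seed)
    n_features = len(Xtr[0])
    # Candidate models: one kNN per random feature subset (an assumption).
    subsets = [
        tuple(sorted(rng.sample(range(n_features), subset_size)))
        for _ in range(n_models)
    ]

    def preds(features, X):
        return [knn_predict(Xtr, ytr, x, features, k) for x in X]

    # Step 1: rank candidates by accuracy on data not used for fitting.
    ranked = sorted(subsets, key=lambda f: accuracy(preds(f, Xs), ys), reverse=True)
    # Step 2: start from the best model and add the rest sequentially,
    # keeping a model only if the ensemble's validation accuracy improves.
    chosen = [ranked[0]]
    chosen_preds = [preds(ranked[0], Xv)]
    best = accuracy(chosen_preds[0], yv)
    for features in ranked[1:]:
        trial = chosen_preds + [preds(features, Xv)]
        score = accuracy(majority_vote(trial), yv)
        if score > best:
            chosen.append(features)
            chosen_preds = trial
            best = score
    return chosen, best
```

Each of `train`, `select` and `valid` is a pair `(X, y)` of feature rows and labels; keeping the selection and validation sets separate avoids judging the ensemble on the same data used to rank its members.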

6.
In this paper, we present two classification approaches based on Rough Sets (RS) that are able to learn decision rules from uncertain data. We assume that the uncertainty exists only in the decision attribute values of the Decision Table (DT) and is represented by belief functions. The first technique, named the Belief Rough Set Classifier (BRSC), is based only on the basic concepts of Rough Sets. The second, called the Belief Rough Set Classifier based on Generalization Distribution Table (BRSC-GDT), is more sophisticated: it is built on a hybridization of the Generalization Distribution Table and Rough Sets (GDT-RS). The two classifiers aim at simplifying the Uncertain Decision Table (UDT) in order to generate significant decision rules for the classification process. Furthermore, to improve the time complexity of the construction procedure of the two classifiers, we apply a heuristic method of attribute selection based on rough sets. To evaluate the performance of each classification approach, we carry out experiments on a number of standard real-world databases by artificially introducing uncertainty in the decision attribute values. In addition, we test our classifiers on a naturally uncertain web usage database. We compare our belief rough set classifiers with traditional classification methods only for the certain case. Besides, we compare the results relative to the uncertain case with those given by another similar classifier, called the Belief Decision Tree (BDT), which also deals with uncertain decision attribute values.

7.
Data collected from a survey typically consist of attributes that are mostly if not completely binary-valued or binary-encoded. We present a method for handling such data where the underlying data analysis can be cast as a classification problem. We propose a hybrid method that combines neural network and decision tree methods. The network is trained to remove irrelevant data attributes and the decision tree is applied to extract comprehensible classification rules from the trained network. The conditions of the rules are in the form of a conjunction of M-of-N constructs. An M-of-N construct is a rule condition that is satisfied if (at least, exactly, at most) M of the N binary attributes in the construct are present. The effectiveness of the method is illustrated on data collected for a study of global car market segmentation. The results show that besides achieving high predictive accuracy, the method also allows meaningful interpretation of the relationships among the data variables.
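
The M-of-N construct defined in this abstract is simple to evaluate on a binary record. The sketch below shows one way to do so, together with a rule condition formed as a conjunction of such constructs; the function names, the dictionary-based record format and the attribute names in the usage note are hypothetical.

```python
import operator

# The three variants mentioned in the abstract.
_COMPARATORS = {
    "at least": operator.ge,
    "exactly": operator.eq,
    "at most": operator.le,
}


def m_of_n(record, attributes, m, mode="at least"):
    """True if (at least / exactly / at most) m of the listed binary
    attributes are present (equal to 1) in the record."""
    count = sum(1 for a in attributes if record.get(a, 0) == 1)
    return _COMPARATORS[mode](count, m)


def rule_fires(record, constructs):
    """A rule condition is a conjunction of M-of-N constructs, each given
    as a tuple (attributes, m, mode)."""
    return all(m_of_n(record, attrs, m, mode) for attrs, m, mode in constructs)
```

For example, with a record `{"sunroof": 1, "leather": 0, "navigation": 1}`, the construct "at least 2 of (sunroof, leather, navigation)" is satisfied, while "exactly 1 of" the same attributes is not.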

8.
For a nonprincipal character \(\chi\) modulo \(D\), we prove a nontrivial estimate of the form \(\sum_{n \le x} \Lambda(n)\chi(n - l) \ll x\exp\{-0.6\sqrt{\ln D}\}\) for the sum of values of \(\chi\) over a sequence of shifted primes in the case when \(x \ge D^{1/2+\varepsilon}\), \((l, D) = 1\), and the modulus of the primitive character generated by \(\chi\) is a cube-free number.

9.
Let \(\chi = \{\chi_n\}_{n=0}^{\infty}\) be the Haar system normalized in \(L^2(0, 1)\) and let \(M = \{M_s\}_{s=1}^{\infty}\) be an arbitrary increasing sequence of nonnegative integers. For any subsystem of \(\chi\) of the form \(\{\varphi_k\} = \chi_S = \{\chi_n\}_{n \in S}\), where \(S = S(M) = \{n_k\}_{k=1}^{\infty} = \{n \in V[p] : p \in M\}\), \(V[0] = \{1, 2\}\) and \(V[p] = \{2^p + 1, 2^p + 2, \ldots, 2^{p+1}\}\) for \(p = 1, 2, \ldots\), a series of the form \(\sum_{i=1}^{\infty} a_i \varphi_i\) with \(a_i \searrow 0\) is constructed that is universal with respect to partial series in all classes \(L^r(0, 1)\), \(r \in (0, 1)\), in the sense of a.e. convergence and in the metric of \(L^r(0, 1)\). The constructed series is universal in the class of all measurable, finite functions on \([0, 1]\) in the sense of a.e. convergence. It is proved that there exists a series by the Haar system with decreasing coefficients which has the following property: for any \(\varepsilon > 0\) there exists a measurable function \(\mu(x)\), \(x \in [0, 1]\), such that \(0 \le \mu(x) \le 1\) and \(|\{x \in [0, 1] : \mu(x) \ne 1\}| < \varepsilon\), and the series is universal in the weighted space \(L_\mu[0, 1]\) with respect to subseries, in the sense of convergence in the norm of \(L_\mu[0, 1]\).

10.
Let \(V\) be the complex vector space of homogeneous linear polynomials in the variables \(x_1, \ldots, x_m\). Suppose \(G\) is a subgroup of \(S_m\), and \(\chi\) is an irreducible character of \(G\). Let \(H_d(G, \chi)\) be the symmetry class of polynomials of degree \(d\) with respect to \(G\) and \(\chi\).
For any linear operator \(T\) acting on \(V\), there is a (unique) induced operator \(K_\chi(T) \in \operatorname{End}(H_d(G, \chi))\) acting on symmetrized decomposable polynomials by
$${K_\chi }\left( T \right)\left( {{f_1} * {f_2} * \cdots * {f_d}} \right) = T{f_1} * T{f_2} * \cdots * T{f_d}.$$
In this paper, we show that the representation \(T \mapsto K_\chi(T)\) of the general linear group \(GL(V)\) is equivalent to the direct sum of \(\chi(1)\) copies of a representation (not necessarily irreducible) \(T \mapsto B_\chi^G(T)\).

11.
Inspired by Arnold’s classification of local Poisson structures [1] in the plane using the hierarchy of singularities of smooth functions, we consider the problem of global classification of Poisson structures on surfaces. Among the wide class of Poisson structures, we consider the class of \(b^m\)-Poisson structures, which can also be visualized using differential forms with singularities as \(b^m\)-symplectic structures. In this paper we extend the classification scheme in [24] for \(b^m\)-symplectic surfaces to the equivariant setting. When the compact group is the group of deck-transformations of an orientable covering, this yields the classification of these objects for nonorientable surfaces. The paper also includes recipes to construct \(b^m\)-symplectic structures on surfaces. The feasibility of such constructions depends on orientability and on the colorability of an associated graph. The desingularization technique in [10] is revisited for surfaces and the compatibility with this classification scheme is analyzed in detail.

12.
Filter back-projection (FBP) algorithms are readily available and extensively used methods for tomography. In this paper, we prove the convergence of FBP algorithms at any point of continuity of the image function, and in the \(L^2\)-norm and \(L^1\)-norm, under certain assumptions on the image and window functions of the FBP algorithms.

13.
We obtain an upper estimate \(N - \chi(M)\) for the sum \(Q_N\) of singular zero multiplicities of the \(N\)th eigenfunction of the Laplace-Beltrami operator on the two-dimensional, compact, connected Riemann manifold \(M\), where \(\chi(M)\) is the Euler characteristic of \(M\). Stronger estimates, but equivalent asymptotically (\(N \to \infty\)), are given for the cases of the sphere \(S^2\) and the projective plane \(\mathbb{P}^2\). Asymptotically sharper estimates are shown for the case of a domain on the plane.

14.
The tabu search algorithms for the Crew Scheduling Problem (CSP) reported in this paper are part of a decision support system for crew scheduling management of the Lisbon Underground. The CSP is formulated as finding the minimum number of duties necessary to cover a pre-defined timetable under a set of contractual rules. An initial solution is constructed following a traditional run-cutting approach. Two alternative improvement algorithms are subsequently used to reduce the number of duties in the initial solution. Both algorithms are embedded in a tabu search framework: Tabu-crew takes advantage of a form of strategic oscillation for the neighbourhood search, while the run-ejection algorithm considers compound moves based on a subgraph ejection chain method. Computational results are reported for a set of real problems.

15.
We study the inverse problem of the reconstruction of the coefficient \(\varrho(x, t) = \varrho_0(x, t) + r(x)\) multiplying \(u_t\) in a nonstationary parabolic equation. Here \(\varrho_0(x, t) \ge \varrho_0 > 0\) is a given function, and \(r(x) \ge 0\) is an unknown function of the class \(L_\infty(\Omega)\). In addition to the initial and boundary conditions (the data of the direct problem), we pose the problem of nonlocal observation in the form \(\int_0^T u(x, t)\,d\mu(t) = \chi(x)\) with a known measure \(d\mu(t)\) and a function \(\chi(x)\). We separately consider the case \(d\mu(t) = \omega(t)\,dt\) of integral observation with a smooth function \(\omega(t)\). We obtain sufficient conditions for the existence and uniqueness of the solution of the inverse problem, which have the form of readily verifiable inequalities. We suggest an iterative procedure for finding the solution and prove its convergence. Examples of particular inverse problems for which the assumptions of our theorems hold are presented.

16.
An axiomatization of the Choquet integral is proposed in the context of multiple criteria decision making without any commensurability assumption. The most essential axiom, named Commensurability Through Interaction, states that the importance of an attribute i takes only one or two values when a second attribute k varies. When the importance takes two values, the point of discontinuity is exactly the value on the attribute k that is commensurate to the fixed value on attribute i. If the weight of criterion i does not depend on criterion k, for any value of the criteria other than i and k, then criteria i and k are independent. Applying this construction to any pair (i, k) of criteria, one obtains a partition of the set of criteria. In each block, the criteria interact with one another, and it is thus possible to construct vectors of values on the attributes that are commensurate. There is complete independence between the criteria of any two blocks in this partition. Hence one cannot ensure commensurability between two blocks in the partition. But this is not a problem since the Choquet integral is additive between subsets of criteria that are independent.
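
As background for the closing remark on additivity, the discrete Choquet integral itself can be computed in a few lines. The sketch below is a generic textbook formulation, not the paper's axiomatic construction: it assumes nonnegative criterion values and a capacity supplied as a dictionary from `frozenset` coalitions to weights, and the helper `additive_capacity` and all names are introduced here for illustration only.

```python
from itertools import combinations


def choquet_integral(x, capacity):
    """Discrete Choquet integral of nonnegative values x (criterion -> value)
    with respect to a capacity (frozenset of criteria -> weight).

    Criteria are visited in increasing order of value; each increment is
    weighted by the capacity of the coalition of criteria still at or
    above that level.
    """
    order = sorted(x, key=x.get)
    remaining = set(order)
    total = 0.0
    previous = 0.0
    for criterion in order:
        total += (x[criterion] - previous) * capacity[frozenset(remaining)]
        previous = x[criterion]
        remaining.remove(criterion)
    return total


def additive_capacity(weights):
    """Capacity in which every coalition weighs the sum of its members."""
    criteria = list(weights)
    capacity = {}
    for size in range(len(criteria) + 1):
        for subset in combinations(criteria, size):
            capacity[frozenset(subset)] = sum(weights[c] for c in subset)
    return capacity
```

With an additive capacity the integral reduces to an ordinary weighted sum, which is the simplest instance of the additivity between independent criteria mentioned in the abstract; a non-additive capacity makes the weight of a criterion depend on how its value compares with the others.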

17.
Let \(L\) be a Lie group and let \(M\) be a compact manifold with dimension \(\dim(L) + 1\). Let \(\Phi\) be a locally free action of \(L\) on \(M\) having class \(C^r\) with \(r \ge 2\). Let \(R\) be the radical of \(L\) and let \(\chi_1, \ldots, \chi_n\) be the characters of the adjoint action of \(R\). Finally, let \(\Delta\) be the modular function of \(R\). Under the assumption that none of the identities \(\Delta \cdot |\chi_i| = |\chi_j|^{\alpha}\) hold for any \(\alpha \in [0, 1]\), one shows that \(\Phi\) is the restriction to \(L\) of a locally free and transitive \(C^r\) action of a larger Lie group. A second result is the existence of a unique \(\Phi\)-invariant probability measure on \(M\); that measure is induced by a \(C^{r-1}\) nonsingular volume form. What makes that theorem all the more interesting is that certain of the Lie groups under consideration are not amenable.

18.
Based on Feng's theory of formal vector fields and formal flows, we study the convergence problem of the formal energies of symplectic methods for Hamiltonian systems and give the explicit growth of the coefficients in the formal energies. With the help of B-series and Bernoulli functions, we prove that in the formal energy of the mid-point rule, the coefficient sequence of the merging products of an arbitrarily given rooted tree and the bushy trees of height 1 (whose subtrees are vertices) approaches 0 as the number of branches goes to ∞; in the opposite direction, the coefficient sequence of the bushy trees of height m (m ≥ 2), whose subtrees are all tall trees, approaches ∞ rapidly as the number of branches goes to +∞. The conclusion extends successfully to the modified differential equations of other Runge-Kutta methods. This disproves a conjecture given by Tang et al. (2002), and implies: (1) in the inequality of estimate given by Benettin and Giorgilli (1994) for the terms of the modified formal vector fields, the high order of the upper bound is reached in numerous cases; (2) the formal energies/formal vector fields are nonconvergent in the general case.
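
To make the mid-point rule mentioned above concrete, the sketch below applies it to the harmonic oscillator \(H(q, p) = (q^2 + p^2)/2\), a toy problem chosen here and not taken from the paper. For this linear system the implicit update can be solved in closed form, and because the mid-point rule conserves quadratic invariants the energy stays constant up to rounding; that is a special case, whereas the abstract concerns general Hamiltonians, for which only a formal energy series is available. Explicit Euler is included purely as a non-symplectic contrast.

```python
def midpoint_step(q, p, h):
    """One implicit mid-point step for q' = p, p' = -q.

    The implicit equations
        q1 = q + h * (p + p1) / 2
        p1 = p - h * (q + q1) / 2
    are linear here and are solved in closed form below.
    """
    a = h / 2.0
    d = 1.0 + a * a
    q_new = ((1.0 - a * a) * q + 2.0 * a * p) / d
    p_new = ((1.0 - a * a) * p - 2.0 * a * q) / d
    return q_new, p_new


def explicit_euler_step(q, p, h):
    """One explicit Euler step for the same system (not symplectic)."""
    return q + h * p, p - h * q


def energy(q, p):
    """Hamiltonian of the harmonic oscillator."""
    return 0.5 * (q * q + p * p)
```

Iterating `midpoint_step` from `(1.0, 0.0)` keeps `energy` at 0.5 to within rounding error, while the same number of explicit Euler steps multiplies the energy by \(1 + h^2\) at every step.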

19.
That symbols in the modulation space \(M^{1,1}\) generate pseudo-differential operators of the trace class was first stated by Feichtinger and proved by Gröchenig in [13]. In this paper, we show that the same is true if we replace \(M^{1,1}\) by the more general α-modulation spaces, which include modulation spaces (α = 0) and Besov spaces (α = 1) as special cases. The result with α = 0 corresponds to that of Gröchenig, and the one with α = 1 is a new result which states the trace property of the operators with symbols in the Besov space. As an application, we discuss the trace property of the commutator \([\sigma(X, D), a]\), where \(a(x)\) is a Lipschitz function and \(\sigma(x, \xi)\) belongs to an α-modulation space.

20.
In this note we study the general facility location problem with connectivity. We present an \(O(np^2)\)-time algorithm for the general facility location problem with connectivity on trees. Furthermore, we present an \(O(np)\)-time algorithm for the general facility location problem with connectivity on equivalent binary trees.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号