Found 20 similar articles (search time: 15 ms)
1.
This paper proposes a novel Informed Evolutionary algorithm (InEA) which implements the idea of learning with a generation. An association rule miner is used to identify the norm of a population. Subsequently, a knowledge-based mutation operator is used to help guide the search of the evolutionary optimizer. The approach breaks away from the current practice of treating the optimization and analysis processes as two independent processes. It shows how a rule mining module can be used to mine knowledge and be hybridized into an EA to improve the performance of the optimizer. The proposed memetic algorithm is examined on various benchmark problems, and the simulation results show that InEA is competitive with existing approaches in the literature.
2.
The Fuzzy Cognitive Map (FCM) is a new kind of intelligent tool with many advantages, such as intuitive knowledge
representation and strong inference mechanisms based on numeric matrices. In practical applications, the majority of data
is stored in relational databases in the form of an Entity-Relationship schema. How to mine FCMs directly from multi-relational
data sources has become a key problem in FCM research and an important direction in data mining. However, traditional
approaches for obtaining FCMs either rely on the experience of domain experts or do not take into account the characteristics of
multi-relational data. To address this, the paper proposes a new model, the Two-layer Tree-type FCM (TTFCM), and a new mining methodology
based on the gradient descent method.
3.
A general approach to designing multiple classifiers represents them as a combination of several binary classifiers in order
to enable correction of classification errors and increase reliability. This method is explained, for example, in Witten and
Frank (Data Mining: Practical Machine Learning Tools and Techniques, 2005, Sect. 7.5). The aim of this paper is to investigate
representations of this sort based on Brandt semigroups. We give a formula for the maximum number of errors of binary classifiers,
which can be corrected by a multiple classifier of this type. Examples show that our formula does not carry over to larger
classes of semigroups.
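As a rough illustration of the general idea in this abstract (a multiple classifier built from several binary classifiers that can correct some of their errors), the sketch below uses a plain error-correcting output code with nearest-codeword decoding. It does not reproduce the Brandt semigroup construction or the paper's formula. The codebook, class labels, and function names are all hypothetical, and the bound used is the standard coding-theory fact that a code with minimum Hamming distance d corrects up to floor((d - 1) / 2) bit errors.

```python
from itertools import combinations

# Hypothetical codebook: one binary codeword per class. Each bit position
# is the output expected from one binary classifier.
CODEBOOK = {
    "A": (0, 0, 0, 0, 0),
    "B": (0, 1, 0, 1, 1),
    "C": (1, 0, 1, 0, 1),
    "D": (1, 1, 1, 1, 0),
}


def hamming(u, v):
    """Number of positions in which two equal-length bit tuples differ."""
    return sum(a != b for a, b in zip(u, v))


def min_distance(codebook):
    """Smallest Hamming distance between any two distinct codewords."""
    return min(hamming(u, v) for u, v in combinations(codebook.values(), 2))


def correctable_errors(codebook):
    """Binary-classifier errors guaranteed correctable: floor((d - 1) / 2)."""
    return (min_distance(codebook) - 1) // 2


def decode(outputs, codebook):
    """Return the class whose codeword is nearest to the observed outputs."""
    return min(codebook, key=lambda label: hamming(outputs, codebook[label]))
```

With this codebook the minimum distance is 3, so any single wrong binary classifier output is still decoded to the correct class.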
4.
Fionn Murtagh 《Proceedings of the Steklov Institute of Mathematics》2009,265(1):177-198
Data analysis and data mining are concerned with unsupervised pattern finding and structure determination in data sets. The data sets themselves are explicitly linked as a form of representation to an observational, or otherwise empirical, domain of interest. “Structure” has long been understood as symmetry which can take many forms with respect to any transformation, including point, translational, rotational, and many others. Symmetries directly point to invariants that pinpoint intrinsic properties of the data and of the background empirical domain of interest. As our data models change, so too do our perspectives on analyzing data. The structures in data surveyed here are based on hierarchy, represented as p-adic numbers or an ultrametric topology.
5.
We present the design of more effective and efficient genetic-algorithm-based data mining techniques that use the concepts of feature selection. Explicit feature selection is traditionally done as a wrapper approach, where every candidate feature subset is evaluated by executing the data mining algorithm on that subset. In this article we present a GA that performs mining and feature selection simultaneously by evolving a binary code alongside the chromosome structure used for evolving the rules. We then present a wrapper approach to feature selection based on the Hausdorff distance measure. Results from applying these techniques to a real-world data mining problem show that combining both feature selection methods provides the best performance in terms of prediction accuracy and computational efficiency.
6.
Ronaldo Dias Nancy L. Garcia Adriano Z. Zambom 《Computational Optimization and Applications》2010,45(3):521-541
The objective of this study is to find a smooth function joining two points A and B with minimum length, constrained to avoid fixed subsets. A penalized nonparametric method of finding the best path is proposed. The method is generalized to the situation where stochastic measurement errors are present. In this case, the proposed estimator is consistent, in the sense that as the number of observations increases the stochastic trajectory converges to the deterministic one. Two applications are immediate: searching for the optimal path for an autonomous vehicle while avoiding all fixed obstacles between two points, and flight planning to avoid threat or turbulence zones.
7.
In this survey paper, we present advances achieved in recent years in the development and use of OR, in particular optimization methods, in the new gene-environment and eco-finance networks, based on usually finite data series, with an emphasis on uncertainty in them and in the interactions of the model items. Indeed, our networks represent models in the form of time-continuous and time-discrete dynamics, whose unknown parameters we estimate under constraints on complexity and regularization by various kinds of optimization techniques, ranging from linear, mixed-integer, spline, semi-infinite and robust optimization to conic, e.g., semi-definite programming. We present different kinds of uncertainties and a new time-discretization technique, and address aspects of data preprocessing and of stability, as well as related aspects from game theory and financial mathematics. We work out structural frontiers and discuss chances for future research and OR application in our real world.
8.
Non-probabilistic reliability-based multidisciplinary design optimization has been widely acknowledged as an advanced methodology for complex system design when data are insufficient. In this work, an uncertainty propagation analysis method for multidisciplinary systems based on subinterval theory is first studied to obtain the uncertain responses. Then, based on non-probabilistic set theory, the interval reliability-based multidisciplinary design optimization model is established. Because the gradient information of interval reliability cannot be acquired over the whole design domain, which causes convergence difficulties and prohibitive computation, an interval reliability displacement-based multidisciplinary design optimization method is proposed to address the issue. In the proposed method, the interval reliability displacement is introduced to measure the degree of interval reliability. By doing so, not only is the connotation of the interval reliability guaranteed but, more importantly, the partial gradient region for interval reliability is equivalently converted into a full gradient region for reliability displacement. Consequently, the gradient information can be acquired under any circumstances, and the convergence process is thus highly accelerated by utilizing gradient optimization algorithms.
9.
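Early-warning analysis of financial distress in listed companies: a study based on data mining   Cited 3 times in total (self-citations: 0; citations by others: 3)
This paper studies Chinese listed companies. The training sample consists of 73 companies that received ST (special treatment) designation during 1999-2001 and 73 normal companies; the test sample consists of 43 companies that received ST designation in 2002 and 43 normal companies. For both groups, 15 financial indicators are analyzed for each of the two years preceding financial distress. Three independent data mining methods are applied: discriminant analysis, logistic regression, and neural networks. The neural network is found to predict better than the other two methods. Finally, a hybrid model combining the strengths of these methods is built; its prediction accuracy is higher than that of each individual method, which improves the early-warning performance of the model.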
10.
《European Journal of Operational Research》2001,131(2):302-308
Mobility is one of the vital goods of modern societies. One way to alleviate congestion and to utilise the existing infrastructure more efficiently is the use of Advanced Traveller Information Systems (ATIS). To provide the road user with optimal travel routes, we propose a two-step procedure. First, on-line simulations supplemented by real traffic data are performed to calculate actual travel times and traffic loads. Afterwards, these data are processed in a route guidance system which allows the road user to optimise with regard to individual preferences. To solve this multiple criteria optimisation problem, fuzzy set theory is applied to the dynamic routing problem.
11.
Jukka Corander Mats Gyllenberg Timo Koski 《Advances in Data Analysis and Classification》2009,3(1):3-24
Advantages of statistical model-based unsupervised classification over heuristic alternatives have been widely demonstrated in the scientific literature. However, the existing model-based approaches are often both conceptually and numerically unstable for large and complex data sets. Here we consider a Bayesian model-based method for unsupervised classification of discrete-valued vectors that has certain advantages over standard solutions based on latent class models. Our theoretical formulation defines a posterior probability measure on the space of classification solutions corresponding to stochastic partitions of observed data. To efficiently explore the classification space we use a parallel search strategy based on non-reversible stochastic processes. A decision-theoretic approach is utilized to formalize the inferential process in the context of unsupervised classification. Both real and simulated data sets are used to illustrate the discussed methods.
12.
Tim Schröder Lars-Peter Lauven Jutta Geldermann 《European Journal of Operational Research》2018,264(3):1005-1019
Biorefineries can provide a product portfolio from renewable biomass similar to that of crude oil refineries. To operate biorefineries of any kind, however, the availability of biomass inputs is crucial and must be considered during planning. Here, we develop a planning approach that uses Geographic Information Systems (GIS) to account for spatially scattered biomass when optimizing a biorefinery’s location, capacity, and configuration. To deal with the challenges of a non-smooth objective function arising from the geographic data, higher dimensionality, and strict constraints, the planning problem is repeatedly decomposed by nesting an exact nonlinear program (NLP) inside an evolutionary strategy (ES) heuristic, which handles the spatial data from the GIS. We demonstrate the functionality of the algorithm and show how including spatial data improves the planning process by optimizing a synthesis gas biorefinery using this new planning approach.
13.
This paper developed a multiobjective Big Data optimization approach based on a hybrid of the salp swarm algorithm and the differential evolution algorithm. The role of the differential evolution algorithm is to enhance the exploitation capability of the salp swarm algorithm, because the operators of the differential evolution algorithm are used as local search operators. In general, the proposed method contains three stages. In the first stage, the population is generated, and the archive is initialized. The second stage updates the solutions using the hybrid salp swarm algorithm and the differential evolution algorithm, and the final stage determines the nondominated solutions and updates the archive. To assess the performance of the proposed approach, a series of experiments were performed. A set of single-objective and multiobjective problems from the 2015 Big Data optimization competition were tested; the dataset contained data with and without noise. The results of our experiments illustrated that the proposed approach outperformed other approaches, including the baseline nondominated sorting genetic algorithm, on all test problems. Moreover, for single-objective problems, the score value of the proposed method was better than that of the traditional multiobjective salp swarm algorithm. When compared with two other algorithms, namely the adaptive DE algorithm with external archive and the hybrid multiobjective firefly algorithm, its score was the largest. In contrast, for the multiobjective functions, the scores of the proposed algorithm were higher than that of the fireworks algorithm framework.
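To make the phrase "differential evolution operators used as local search operators" more concrete, here is a minimal sketch of one classic DE/rand/1/bin step with greedy selection. It is an assumption-laden illustration, not the paper's method: the abstract does not state which DE variant is used, the sketch is single-objective (minimization) rather than multiobjective, and the function name `de_step` and the parameters `f_weight` and `cr` are hypothetical.

```python
import random


def de_step(population, fitness, f_weight=0.5, cr=0.9, rng=None):
    """Apply one DE/rand/1/bin refinement step to every individual.

    population: list of equal-length lists of floats (at least 4 individuals).
    fitness:    function mapping an individual to a number to be minimized.
    Returns a new population in which no individual is worse than before.
    """
    if rng is None:
        rng = random.Random(0)
    dim = len(population[0])
    refined = []
    for i, target in enumerate(population):
        # Mutation: combine three distinct individuals other than the target.
        others = [p for j, p in enumerate(population) if j != i]
        a, b, c = rng.sample(others, 3)
        # Binomial crossover: j_rand guarantees at least one mutated gene.
        j_rand = rng.randrange(dim)
        trial = [
            a[k] + f_weight * (b[k] - c[k])
            if (rng.random() < cr or k == j_rand)
            else target[k]
            for k in range(dim)
        ]
        # Greedy selection: keep the trial only if it is not worse.
        refined.append(trial if fitness(trial) <= fitness(target) else target)
    return refined
```

In a hybrid scheme of the kind described, a step like this would be applied to the solutions produced by the swarm update; how the two are interleaved in the paper is not specified in the abstract.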
14.
In the design of the cost function for nonlinear finite control set model predictive control (FCS-MPC) systems, the traditional method based on weighting factors has some limitations, such as the need to tune the weighting factors and the heavy predictive calculation caused by the increased number of voltage vectors applied in controlling multilevel converters. This paper proposes a simplified FCS-MPC method based on common-mode voltage satisfactory optimization, which could considerably reduce the predictive calculation through an optimized switch combination and simplify the cost function design. Moreover, satisfactory optimization is adopted to achieve accurate control of the common-mode voltage amplitude without a weighting-factor tuning process. The simulation and experimental results verify the feasibility of this control strategy.
15.
P. V. Gracheva 《Vestnik St. Petersburg University: Mathematics》2011,44(4):260-267
A solution of the large computational time problem arising in multidimensional data structuring is addressed by employing
algebraic properties of finite geometries. A vector parameterization of the Grassmannian $\mathrm{Gr}_2(k, n)$ is proposed which makes it possible to minimize the amount of memory and reduce the number of operations required to solve
the problem. An algorithm based on this parameterization and the Gray codes is constructed; the algorithm is suitable for
parallel computation, which further reduces computation time.
16.
P. V. Gracheva 《Vestnik St. Petersburg University: Mathematics》2011,44(2):140-146
An approach to reducing large computational time in the problem of multidimensional dichotomous data structuring based on algebraic properties of finite geometries is proposed. A vector parameterization of the Grassmannian $\mathrm{Gr}_2(k, n)$ reducing memory expenditures and the number of operations required to solve this problem is introduced. A parallelization algorithm based on this parameterization and Gray coding which further reduces computational time is constructed.
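The Gray coding mentioned in this abstract can be illustrated with a small sketch. It shows only the standard reflected binary Gray code, in which consecutive codes differ in exactly one bit, so a computation over all binary vectors can be updated incrementally rather than recomputed. The Grassmannian parameterization itself is not reproduced here, and the function names are hypothetical.

```python
def gray_codes(n_bits):
    """Yield all n_bits-bit integers in reflected binary Gray code order."""
    for i in range(1 << n_bits):
        yield i ^ (i >> 1)


def flipped_bit(prev, curr):
    """Index of the single bit that differs between two neighbouring codes."""
    return (prev ^ curr).bit_length() - 1


codes = list(gray_codes(3))
# codes == [0, 1, 3, 2, 6, 7, 5, 4]
```

Because each step changes a single coordinate, disjoint ranges of the sequence can also be handed to separate workers, which is one way such an enumeration lends itself to parallel computation.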
17.
Ali Sever 《Applied mathematics and computation》2011,217(24):9966-9970
Data mining is generally defined as the science of nontrivial extraction of implicit, previously unknown, and potentially useful information from datasets. There are many websites on the Internet that provide extensive information about products and allow users to post comments on various products and rate them on a scale of 1 to 5. During the past decade, the need for intelligent algorithms for calculating and organizing extremely large sets of data has grown exponentially. In this article we investigate the extent to which a product’s average user rating can be predicted using a manageable subset of a data set. For this we use a prediction model based on a linearization algorithm and sketch how an inverse problem can be formulated to yield a smooth local volatility function of user ratings. The MAPLE programs that implement the proposed algorithm show that the method is reasonably accurate for reconstructing the volatility of user ratings, which is useful both for accurate user predictions and for computing sensitivity.
18.
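Graph theory and optimization theory are clearly of great use in the study of protein structure. We first survey all graph-theoretic models for studying protein structure. We then build a graph-theoretic model in which the side chains of the protein serve as the vertices of the graph, and graph-theoretic concepts such as cliques, $k$-cliques, communities, hubs, and clusters are used to define the edges. Modern data mining algorithms from mathematical optimization are then applied to analyze the large body of data on the structure of the buffalo prion protein. Successful and novel numerical results are presented.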
19.
20.
Irregularities are widespread in large databases and often lead to erroneous conclusions with respect to data mining and statistical
analysis. For example, many parameter estimation procedures produce considerable bias when significant irregularities are not
properly handled. Most data cleaning tools assume one known type of irregularity. This paper proposes a generic
Irregularity Enlightenment (IE) framework for dealing with the situation when multiple irregularities are hidden in large
volumes of data in general and cross sectional time series in particular. It develops an automatic data mining platform to
capture key irregularities and classify them based on their importance in a database. By decomposing time series data into
basic components, we propose to optimize a penalized least squares loss function to aid the selection of key irregularities
in consecutive steps and cluster time series into different groups until an acceptable level of variation reduction is achieved.
Finally, visualization tools are developed to help analysts interpret and understand the nature of data better and faster before
further data modeling and analysis.