Similar literature
 20 similar documents found (search time: 62 ms)
1.
Rough set theory is a new data mining approach for managing vagueness, and it can discover important facts hidden in data. The literature indicates that current rough-set-based approaches cannot guarantee that the classification of a decision table is credible, and cannot generate robust decision rules when new attributes are added incrementally. In this study, an incremental attribute-oriented rule-extraction algorithm is proposed to address this deficiency, which is commonly observed in the literature on decision rule induction. The proposed approach handles incremental attributes on the basis of the alternative rule extraction algorithm (AREA), which was presented for discovering preference-based rules according to the reducts with the maximum strength index (SI), specifically for the case in which the desired reducts are not necessarily unique because several reducts may share the same SI value. Using the AREA, an alternative rule can be defined as a rule that holds the same preference as the original decision rule and may be more attractive to a decision-maker than the original one. The proposed approach operates effectively when new attributes are added to the database or information system, and it does not need to re-compute the updated data set from the first step of the initial stage. Because most rule induction approaches generate repetitive rules, the proposed algorithm also excludes such rules during the solution search stage. The proposed approach can efficiently and effectively generate complete, robust and non-repetitive decision rules. The rules derived from the data set provide an indication of how to study this problem effectively in further investigations.

2.
Computing with words (CWW) relies on linguistic representation of knowledge that is processed by operating at the semantical level defined through fuzzy sets. Linguistic representation of knowledge is a major issue when fuzzy rule based models are acquired from data by some form of empirical learning. Indeed, these models are often requested to exhibit interpretability, which is normally evaluated in terms of structural features, such as rule complexity, properties on fuzzy sets and partitions. In this paper we propose a different approach for evaluating interpretability that is based on the notion of cointension. The interpretability of a fuzzy rule-based model is measured in terms of cointension degree between the explicit semantics, defined by the formal parameter settings of the model, and the implicit semantics conveyed to the reader by the linguistic representation of knowledge. Implicit semantics calls for a representation of user’s knowledge which is difficult to externalise. Nevertheless, we identify a set of properties - which we call “logical view” - that is expected to hold in the implicit semantics and is used in our approach to evaluate the cointension between explicit and implicit semantics. In practice, a new fuzzy rule base is obtained by minimising the fuzzy rule base through logical properties. Semantic comparison is made by evaluating the performances of the two rule bases, which are supposed to be similar when the two semantics are almost equivalent. If this is the case, we deduce that the logical view is applicable to the model, which can be tagged as interpretable from the cointension viewpoint. These ideas are then used to define a strategy for assessing interpretability of fuzzy rule-based classifiers (FRBCs). The strategy has been evaluated on a set of pre-existent FRBCs, acquired by different learning processes from a well-known benchmark dataset. 
Our analysis highlighted that some of them are not cointensive with user’s knowledge, hence their linguistic representation is not appropriate, even though they can be tagged as interpretable from a structural point of view.

3.
This paper discusses an extension of Answer Set Programming (ASP) called Hybrid Answer Set Programming (H-ASP) which allows the user to reason about dynamical systems that exhibit both discrete and continuous aspects. The unique feature of Hybrid ASP is that it allows the use of ASP type rules as controls for when to apply algorithms to advance the system to the next position. That is, if the prerequisites of a rule are satisfied and the constraints of the rule are not violated, then the algorithm associated with the rule is invoked.

4.
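A special class of mathematical programming models is proposed, together with a new branch-and-bound algorithm. Although such a model can be converted into a 0-1 programming model, compared with the converted 0-1 model: (1) its decision meaning is clear and its form of expression is relatively simple; (2) it does not require introducing a parameter M and determining its upper bound before solving; (3) relative to the branch-and-bound method for solving the converted 0-1 programming model, the new branch-and-bound algorithm requires, in the best case, at most one eighth of the computation of the original algorithm. As an application, the model can be used to solve decision problems in which an action is either not carried out at all or can be carried out only subject to a lower limit on quantity.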
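The abstract does not give the model itself, so the following is only a sketch of the kind of "either zero or at least a minimum quantity" decision it describes, under assumed notation: the symbols $x_j$, $l_j$, $y_j$ and $M$ below are illustrative and are not taken from the paper.

```latex
% Assumed direct form: each decision is either not carried out,
% or carried out at a level no smaller than its lower limit l_j > 0.
x_j = 0 \quad \text{or} \quad x_j \ge l_j, \qquad j = 1, \dots, n.

% A standard 0-1 reformulation of the same condition, which needs a
% constant M that bounds x_j from above and must be fixed in advance:
l_j \, y_j \;\le\; x_j \;\le\; M \, y_j, \qquad y_j \in \{0, 1\}, \qquad j = 1, \dots, n.
```

In the reformulation, $y_j = 0$ forces $x_j = 0$ and $y_j = 1$ forces $x_j \ge l_j$. This is the usual place where a parameter M enters such a model, which is the step the abstract says the proposed model avoids.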

5.
Advanced Genetic Programming Based Machine Learning   Cited by: 1 (self-citations: 0, citations by others: 1)
A Genetic Programming based approach for solving classification problems is presented in this paper. Classification is understood as the act of placing an object into a set of categories, based on the object’s properties; classification algorithms are designed to learn a function which maps a vector of object features into one of several classes. This is done by analyzing a set of input-output examples (“training samples”) of the function. Here we present a method based on the theory of Genetic Algorithms and Genetic Programming that interprets classification problems as optimization problems: each presented instance of the classification problem is interpreted as an instance of an optimization problem, and a solution is found by a heuristic optimization algorithm. The major new aspects presented in this paper are advanced algorithmic concepts as well as suitable genetic operators for this problem class (mainly the creation of new hypotheses by merging already existing ones and their detailed evaluation). The experimental part of the paper documents the results produced using new hybrid variants of Genetic Algorithms as well as investigated parameter settings. Graphical analysis is done using a novel multiclass classifier analysis concept based on the theory of Receiver Operating Characteristic curves. The work described in this paper was done within the Translational Research Project L282 “GP-Based Techniques for the Design of Virtual Sensors” sponsored by the Austrian Science Fund (FWF).

6.
We propose a new fuzzy rough set approach which, unlike most known fuzzy set extensions of rough set theory, does not use any fuzzy logical connectives (t-norm, t-conorm, fuzzy implication). As there is no rationale for a particular choice of these connectives, avoiding this choice makes it possible to reduce the arbitrariness in the fuzzy rough approximation. Another advantage of the new approach is that it is based only on the ordinal properties of fuzzy membership degrees. The concepts of fuzzy lower and upper approximations are thus proposed, creating a basis for the induction of fuzzy decision rules having the syntax and semantics of gradual rules. The proposed approach to rule induction is also interesting from the viewpoint of the philosophy supporting data mining and knowledge discovery, because it is concordant with the method of concomitant variations by John Stuart Mill. The decision rules are induced from lower and upper approximations defined for positive and negative relationships between the credibility degrees of multiple premises, on the one hand, and of the conclusion, on the other hand.

7.
Classification systems are very important for decision making and have attracted the attention of many researchers. Traditional classifiers are usually either domain specific or produce unsatisfactory results on large classification problems and imbalanced data. Hence, genetic algorithms (GAs) have recently been combined with traditional classifiers to find useful knowledge for decision making. However, the main concerns with such GA-based systems are their limited coverage of the search space and the increase in computational cost as the population grows. In this paper, a rule-based knowledge discovery model is introduced that combines C4.5 (a decision-tree-based rule induction algorithm) with a new parallel genetic algorithm based on the idea of massive parallelism. The prime goal of the model is to produce a compact set of informative rules for any kind of classification problem. More specifically, the proposed model uses C4.5 as a base method to generate rules, which are then refined by the proposed parallel GA. The developed system has been compared with pure C4.5 as well as with a hybrid system (C4.5 + sequential genetic algorithm) on six real-world benchmark data sets from the UCI (University of California at Irvine) machine learning repository. Experiments on these data sets validate the effectiveness of the new model. The results indicate in particular that the model is powerful for volumetric data sets.

8.
We consider the problem of multi-criteria classification. In this problem, a set of “if … then …” decision rules is used as a preference model to classify objects evaluated by a set of criteria and regular attributes. Given a sample of classification examples, called the learning data set, the rules are induced from dominance-based rough approximations of preference-ordered decision classes, according to the Variable Consistency Dominance-based Rough Set Approach (VC-DRSA). The main question to be answered in this paper is how to classify an object using decision rules in situations where it is covered by (i) no rule, (ii) exactly one rule, or (iii) several rules. The proposed classification scheme can be applied both to the learning data set (to restore the classification known from examples) and to the testing data set (to predict the classification of new objects). A hypothetical example from the area of telecommunications is used to illustrate the proposed classification method and to compare it with some previous proposals.

9.
A Dual-Objective Evolutionary Algorithm for Rules Extraction in Data Mining   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper presents a dual-objective evolutionary algorithm (DOEA) for extracting multiple decision rule lists in data mining, which aims at satisfying the classification criteria of high accuracy and ease of user comprehension. Unlike existing approaches, the algorithm incorporates the concept of Pareto dominance to evolve a set of non-dominated decision rule lists, each having a different classification accuracy and number of rules over a specified range. The classification results of DOEA are analyzed and compared with existing rule-based and non-rule-based classifiers on 8 test problems obtained from the UCI Machine Learning Repository. It is shown that the DOEA produces comprehensible rules with competitive classification accuracy compared to many methods in the literature. Results obtained from box plots and t-tests further examine its invariance to random partition of datasets. An erratum to this article is available at .
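Pareto dominance over the two objectives named above (higher accuracy, fewer rules) can be sketched as follows. This is a minimal illustration of the dominance test only, not the DOEA itself; the candidate values and the function names `dominates` and `non_dominated` are made up for the example.

```python
from typing import List, Tuple

# Each candidate rule list is summarised as (accuracy, number_of_rules).
# The numbers below are invented purely for illustration.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives
    (accuracy is maximised, rule count is minimised) and strictly
    better on at least one of them."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def non_dominated(cands: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


candidates = [(0.90, 12), (0.88, 5), (0.85, 5), (0.80, 3), (0.90, 15)]
front = non_dominated(candidates)
print(front)
```

Here `(0.85, 5)` is dropped because `(0.88, 5)` is more accurate with the same number of rules, and `(0.90, 15)` is dropped because `(0.90, 12)` is equally accurate with fewer rules. The remaining candidates each trade accuracy against size, which is the kind of set a user could choose from.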

10.
Although Answer Set Programming (ASP) is a powerful framework for declarative problem solving, it cannot in an intuitive way handle situations in which some rules are uncertain, or in which it is more important to satisfy some constraints than others. Possibilistic ASP (PASP) is a natural extension of ASP in which certainty weights are associated with each rule. In this paper we contrast two different views on interpreting the weights attached to rules. Under the first view, weights reflect the certainty with which we can conclude the head of a rule when its body is satisfied. Under the second view, weights reflect the certainty that a given rule restricts the considered epistemic states of an agent in a valid way, i.e. it is the certainty that the rule itself is correct. The first view gives rise to a set of weighted answer sets, whereas the second view gives rise to a weighted set of classical answer sets.

11.
Improving the scalability of rule-based evolutionary learning   Cited by: 2 (self-citations: 0, citations by others: 2)
Evolutionary learning techniques are comparable in accuracy with other learning methods such as Bayesian Learning, SVM, etc. These techniques often produce more interpretable knowledge than, e.g., SVM; however, efficiency is a significant drawback. This paper presents a new representation motivated by our observation that Bioinformatics and Systems Biology often give rise to very large-scale datasets that are noisy, ambiguous and usually described by a large number of attributes. The crucial observation is that, in the most successful rules obtained for such datasets, only a few key attributes (from the large number of available ones) are expressed in a rule. Hence, automatically discovering these few key attributes and keeping track only of them contributes to a substantial speed-up by avoiding useless match operations with irrelevant attributes. In effect, this procedure performs fine-grained feature selection at a rule-wise level, as the key attributes may be different for each learned rule. The representation we propose has been tested within the BioHEL machine learning system, and the experiments performed show that the representation not only has competent learning performance but also considerably reduces the system run-time. That is, the proposed representation is up to 2–3 times faster than state-of-the-art evolutionary learning representations designed specifically for efficiency purposes.
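The idea of a rule that stores conditions only for its few key attributes can be sketched as below. This is an assumed, simplified representation for illustration and not BioHEL's actual data structure; the `Rule` type, the interval form of the conditions, and the `matches` function are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical rule representation: a mapping from attribute index to a
# closed interval (low, high). Attributes that the rule does not mention
# are treated as irrelevant and are never tested.
Rule = Dict[int, Tuple[float, float]]


def matches(rule: Rule, instance: List[float]) -> bool:
    """Check only the attributes expressed in the rule, so the cost of a
    match depends on the rule size, not on the number of attributes."""
    return all(low <= instance[i] <= high for i, (low, high) in rule.items())


# An instance with 1000 attributes, of which the rule inspects only two.
instance = [0.0] * 1000
instance[2] = 0.3
instance[7] = 4.0

rule: Rule = {2: (0.1, 0.5), 7: (3.0, 9.0)}
print(matches(rule, instance))
```

With this layout a match touches two values instead of a thousand, which is the source of the speed-up the abstract describes.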

12.
Reasoning expressions are those which express the reasoning procedure by means of only deduction rules and the initial formulas (axioms or assumptions), without the help of any intermediate results. They express the procedure systematically, completely and concisely. The deduction rules are mappings from formulas (premises) to a formula (conclusion). The elementary rules are certain propositional connectives (but not necessarily truth functions), while the higher rules are certain quantifiers. Besides, the detachment rule is an inverse of the connective implication, and is itself the kernel of the deduction method; while another inverse of implication (i.e. the suggestion rule) is the kernel of the induction method.

13.
14.
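For LP problems containing free variables, in order to obtain an algorithm more efficient than the simplex method [1], an improved simplex method that saves storage space and increases computation speed is proposed, based on a study of the pattern that arises during simplex iterations when free variables are converted into nonnegative variables before the operations are carried out. Numerical experiments show that the new algorithm is effective.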
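For context, the textbook way of converting a free variable into nonnegative variables is shown below. The abstract does not spell out which conversion it studies, so this is an assumption about the standard technique, with illustrative symbols $x_j$, $x_j^{+}$, $x_j^{-}$, $c_j$ and $a_j$ (the cost coefficient and constraint column of $x_j$).

```latex
% A free (sign-unrestricted) variable is written as the difference of
% two nonnegative variables:
x_j = x_j^{+} - x_j^{-}, \qquad x_j^{+} \ge 0, \quad x_j^{-} \ge 0.

% Its cost term and constraint column are duplicated with opposite signs:
c_j x_j = c_j x_j^{+} - c_j x_j^{-}, \qquad
a_j x_j = a_j x_j^{+} + (-a_j)\, x_j^{-}.
```

The substitution adds one extra column per free variable whose entries are just the negation of an existing column, which is the kind of redundancy an improved method could exploit to save storage.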

15.
The objective of this study was to distinguish between patients with and without breast cancer within a population. The study was based on the University of Wisconsin's dataset of 569 patients, of whom 212 were subsequently found to have breast cancer. A subset-conjunctive model, which is related to Logical Analysis of Data, is described to distinguish between the two groups of patients based on the results of a non-invasive procedure called Fine Needle Aspiration, which is often used by physicians before deciding on the need for a biopsy. We formulate the problem of inferring subset-conjunctive rules as a 0-1 integer program, show that it is NP-Hard, and prove that it admits no polynomial-time constant-ratio approximation algorithm. We examine the performance of a randomized algorithm, and of randomization using LP rounding. In both cases, the expected performance ratio is arbitrarily bad. We use a deterministic greedy algorithm to identify a Pareto-efficient set of subset-conjunctive rules, and describe how the rules change with a re-weighting of the type-I and type-II errors, how the best rule changes with the subset size, and how much of a tradeoff is required between the two types of error as one selects a more stringent or more lax classification rule. An important aspect of the analysis is that we find a sequence of closely related efficient rules, which can be readily used in a clinical setting because they are simple and have the same structure as the rules currently used in clinical diagnosis.

16.
In cumulative and disjunctive constraint-based scheduling, the resource constraint is enforced by several filtering rules. Among these rules are the (extended) edge-finding and not-first/not-last rules. The not-first/not-last rule detects tasks that cannot run first/last relative to a set of tasks and prunes their time bounds. In this paper, we present a sound O(n² log n) algorithm for the cumulative not-first/not-last rule, where n is the number of tasks. This algorithm reaches the same fixed point as previous not-first/not-last algorithms, although it may take additional iterations to do so. The worst-case complexity of this new algorithm for the maximal adjustment is the same as that of our previous complete O(n²|H| log n) not-first/not-last algorithm [7], where |H| is the maximum of the number of distinct earliest completion times and the number of distinct latest start times of tasks. However, experimental results on benchmarks from the Project Scheduling Problem Library (PSPLib) and the Baptiste and Le Pape data set (BL) suggest that the new not-first/not-last algorithm has a substantially reduced runtime. Furthermore, the results demonstrate that in practice the new algorithm rarely requires more propagations than previous not-first/not-last algorithms.

17.
We introduce a mathematical programming approach to building rule lists, which are a type of interpretable, nonlinear, and logical machine learning classifier involving IF-THEN rules. Unlike traditional decision tree algorithms like CART and C5.0, this method does not use greedy splitting and pruning. Instead, it aims to fully optimize a combination of accuracy and sparsity, obeying user-defined constraints. This method is useful for producing non-black-box predictive models, and has the benefit of a clear user-defined tradeoff between training accuracy and sparsity. The flexible framework of mathematical programming allows users to create customized models with a provable guarantee of optimality. The software reviewed as part of this submission was given the DOI (Digital Object Identifier) https://doi.org/10.5281/zenodo.1344142.
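To make the notion of a rule list concrete, here is a minimal sketch of how such a classifier is evaluated: the rules are tried in order and the first one whose condition holds decides the label. It illustrates only the model class, not the optimization method or the software cited above; the feature names, thresholds and labels are invented.

```python
from typing import Callable, Dict, List, Tuple

Instance = Dict[str, float]
# A rule is a (condition, label) pair; the condition is any predicate
# over an instance. All rules below are hypothetical examples.
Rule = Tuple[Callable[[Instance], bool], str]


def predict(rule_list: List[Rule], default: str, x: Instance) -> str:
    """IF-THEN / ELSE-IF semantics: return the label of the first rule
    whose condition is satisfied, or the default label if none fires."""
    for condition, label in rule_list:
        if condition(x):
            return label
    return default


rules: List[Rule] = [
    (lambda x: x["age"] < 25 and x["income"] < 20000, "high_risk"),
    (lambda x: x["income"] >= 80000, "low_risk"),
]

print(predict(rules, "medium_risk", {"age": 22, "income": 15000}))
print(predict(rules, "medium_risk", {"age": 40, "income": 50000}))
```

Sparsity in this setting corresponds to a short `rules` list with few conditions per rule, which is what makes the model readable.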

18.
Interpretability is one of the key concepts in many applications of the fuzzy rule-based approach. It is well known that there are many different criteria for this concept, complexity being one of them. In this paper, we focus our efforts on reducing the complexity of fuzzy rule sets. One of the most interesting approaches for learning fuzzy rules is the iterative rule learning approach. In its final stages it typically produces rules that cover few examples, which in most cases are useless for representing the knowledge. This behavior is due to the specificity of the extracted rules, which eventually creates a more complex set of rules. Thus, we propose a modified version of the iterative rule learning algorithm that extracts simple rules by relaxing this natural trend. The main idea is to change the rule extraction process so that it obtains more general rules, using pruned search spaces together with a knowledge simplification scheme able to replace learned rules. The experimental results show that this purpose is achieved. The new proposal reduces the complexity at both the rule and rule-base levels, while maintaining accuracy relative to previous versions of the algorithm.

19.
In this paper a fuzzy neural network based on a fuzzy relational “IF-THEN” reasoning scheme is designed. Different t-norms and t-conorms are proposed to define the structure of the model. Fuzzification and defuzzification phases are then added so that the model can be regarded as a controller. A learning algorithm for tuning the parameters is introduced, based on a back-propagation algorithm and a recursive pseudoinverse matrix technique. Experiments on synthetic and benchmark data are carried out, and several results on the UCI repository of machine learning databases are shown for classification and approximation tasks. The model is also compared with some other methods known in the literature.
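The abstract does not say which t-norms and t-conorms are used, so the sketch below lists only three standard textbook pairs as an illustration of what these operators are. The function names are chosen for this example.

```python
# Standard t-norms (fuzzy AND) on membership degrees in [0, 1].
def t_min(a: float, b: float) -> float:
    """Minimum (Goedel) t-norm."""
    return min(a, b)


def t_product(a: float, b: float) -> float:
    """Product t-norm."""
    return a * b


def t_lukasiewicz(a: float, b: float) -> float:
    """Lukasiewicz t-norm."""
    return max(0.0, a + b - 1.0)


# Their dual t-conorms (fuzzy OR).
def s_max(a: float, b: float) -> float:
    """Maximum t-conorm, dual of the minimum t-norm."""
    return max(a, b)


def s_prob_sum(a: float, b: float) -> float:
    """Probabilistic sum, dual of the product t-norm."""
    return a + b - a * b


def s_lukasiewicz(a: float, b: float) -> float:
    """Bounded sum, dual of the Lukasiewicz t-norm."""
    return min(1.0, a + b)


print(t_min(0.3, 0.8), s_max(0.3, 0.8))
```

Every t-norm returns `a` when `b` is 1 and every t-conorm returns `a` when `b` is 0, so swapping one pair for another changes how partial truth values combine without changing the crisp (0/1) behaviour.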

20.
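This paper studies a knowledge representation model for error logic. Combining error logic theory with the indicator system for ecological civilization construction built on the "five-in-one" framework of ecological civilization, it carries out knowledge representation based on object recognition. The elements of the indicator system are defined, respectively, as things, features, functions and rules in the error logic model. In modeling, things are decomposed first; second, the features and rules corresponding to a specific thing are decomposed; finally, the types of error function applicable to each indicator are classified according to the effect of the judgment rule G on the form of the error function f. The generated objects can serve as preparation for systematically organizing logical knowledge with data structures such as matrices.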


Copyright©北京勤云科技发展有限公司  京ICP备09084417号