Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Monotonic Variable Consistency Rough Set Approaches   Cited by: 2 (self-citations: 0, citations by others: 2)
We consider probabilistic rough set approaches based on different versions of the definition of the rough approximation of a set. In these versions, consistency measures are used to control the assignment of objects to lower and upper approximations. Inspired by some basic properties of rough sets, we find it reasonable to require several monotonicity properties from these measures. We consider three types of monotonicity: with respect to the set of attributes, with respect to the set of objects, and with respect to the dominance relation. We show that the consistency measures used so far in the definition of rough approximation lack some of these monotonicity properties. This observation led us to propose new measures within two kinds of rough set approaches: Variable Consistency Indiscernibility-based Rough Set Approaches (VC-IRSA) and Variable Consistency Dominance-based Rough Set Approaches (VC-DRSA). We investigate the properties of these approaches and compare them to the previously proposed Variable Precision Rough Set (VPRS) model, the Rough Bayesian (RB) model, and previous versions of VC-DRSA.
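The role of a consistency measure can be illustrated with a small sketch. It is not the measure proposed in the paper: it assumes the plain rough membership |class ∩ target| / |class| as the consistency measure, in the spirit of VPRS, and keeps the classical upper approximation. The function names, the `threshold` parameter, and the toy data are all hypothetical.

```python
from collections import defaultdict


def equivalence_classes(objects, attributes):
    """Group object ids that share the same values on the chosen attributes."""
    classes = defaultdict(set)
    for obj_id, values in objects.items():
        key = tuple(values[a] for a in attributes)
        classes[key].add(obj_id)
    return list(classes.values())


def approximations(objects, attributes, target, threshold=1.0):
    """Return (lower, upper) approximations of `target`.

    Assumption: a class enters the lower approximation when its rough
    membership is at least `threshold`; it enters the upper approximation
    when it overlaps `target` at all. threshold=1.0 gives the classical case.
    """
    lower, upper = set(), set()
    for cls in equivalence_classes(objects, attributes):
        membership = len(cls & target) / len(cls)
        if membership >= threshold:
            lower |= cls
        if membership > 0:
            upper |= cls
    return lower, upper


# Hypothetical toy data: five objects described by two attributes.
objects = {
    1: {"a": 0, "b": 0},
    2: {"a": 0, "b": 0},
    3: {"a": 0, "b": 0},
    4: {"a": 1, "b": 0},
    5: {"a": 1, "b": 1},
}
target = {1, 2, 4}

coarse_lower, _ = approximations(objects, ["b"], target, threshold=0.7)
fine_lower, _ = approximations(objects, ["a", "b"], target, threshold=0.7)
print(sorted(coarse_lower))  # [1, 2, 3, 4]
print(sorted(fine_lower))    # [4]
```

On this toy data, adding attribute `a` shrinks the lower approximation at threshold 0.7, whereas at threshold 1.0 adding attributes can only enlarge it. This appears to be the kind of non-monotonic behaviour with respect to the attribute set that the abstract refers to.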

2.
Clustering is one of the most widely used approaches in data mining, with real-life applications in virtually any domain. The huge interest in clustering has led to a possibly three-digit number of algorithms, with the k-means family probably the most widely used group of methods. Besides classic bivalent approaches, clustering algorithms belonging to the domain of soft computing have been proposed and successfully applied in the past four decades. Bezdek’s fuzzy c-means is a prominent example of such soft computing clustering algorithms, with many effective real-life applications. More recently, Lingras and West enriched this area by introducing rough k-means. In this article we compare k-means to fuzzy c-means and rough k-means as important representatives of soft clustering. On the basis of this comparison, we then survey important extensions and derivatives of these algorithms; our particular interest here is in hybrid clustering, which merges fuzzy and rough concepts. We also give some examples where k-means, rough k-means, and fuzzy c-means have been used in studies.

3.
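Based on shadowed set theory, this paper studies the discretization of continuous attributes in rough sets and proposes a new algorithm for extracting the candidate cut-point set. According to the distribution of instances on a single attribute, the algorithm classifies the data samples, uses shadowed sets to compute the upper and lower approximations of each class, and finally extracts the candidate cut-point set. The performance of the algorithm is tested on several UCI data sets and compared experimentally with other candidate cut-point extraction algorithms. The experimental results show that the algorithm effectively reduces the number of candidate cut points in a data set and improves the running speed and recognition rate of the discretization algorithm.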

4.
Clustering algorithms divide a dataset into a set of classes/clusters, where similar data objects are assigned to the same cluster. When the boundary between clusters is ill defined, so that the same data object may belong to more than one class, the notion of fuzzy clustering becomes relevant. In this case, each datum belongs to a given class with some membership grade between 0 and 1. The most prominent fuzzy clustering algorithm is the fuzzy c-means introduced by Bezdek (Pattern recognition with fuzzy objective function algorithms, 1981), a fuzzification of the k-means or ISODATA algorithm. On the other hand, several research issues have been raised regarding both the objective function to be minimized and the optimization constraints, which help to identify the proper cluster shape (Jain et al., ACM Computing Surveys 31(3):264–323, 1999). This paper addresses the issue of clustering by evaluating the distance of fuzzy sets in a feature space. In particular, the fuzzy clustering optimization problem is reformulated for the case where the distance is given in terms of a divergence distance, which builds a bridge to the notion of probabilistic distance. This leads to a modified fuzzy clustering, which implicitly involves the variance–covariance of the input terms. The optimal solution of the underlying optimization problem is determined, and its existence and uniqueness are demonstrated. The performance of the algorithm is assessed through two numerical applications. The former involves clustering of Gaussian membership functions and the latter tackles the well-known Iris dataset. Comparisons with standard fuzzy c-means (FCM) are evaluated and discussed.
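As a point of reference for the membership grades mentioned above, the two alternating updates of standard FCM can be sketched as follows. This is the textbook Euclidean version, not the divergence-based variant the paper proposes; the function names, the fuzzifier default `m=2.0`, and the sample points are assumptions made for illustration.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update; u[i][j] is the grade of point i in cluster j."""
    exponent = 2.0 / (m - 1.0)
    u = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point coincides with a center: split full membership
            # among the coinciding centers to avoid division by zero.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            u.append([h / total for h in hits])
            continue
        u.append([1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists])
    return u


def fcm_centers(points, u, m=2.0):
    """Standard FCM center update: mean of the points weighted by u[i][j] ** m."""
    dim = len(points[0])
    centers = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


# Hypothetical data: two well-separated pairs of points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (9.0, 0.0), (10.0, 0.0)]
centers = [(0.0, 0.0), (10.0, 0.0)]
for _ in range(10):
    u = fcm_memberships(points, centers)
    centers = fcm_centers(points, u)
```

Each row of `u` sums to 1, which is the usual FCM constraint on membership grades. A fixed number of iterations is used here for brevity; a practical loop would stop when the centers change by less than a tolerance.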

5.
Rough set theory, a mathematical tool for dealing with inexact or uncertain knowledge in information systems, originally described the indiscernibility of elements by equivalence relations. Covering rough sets are a natural extension of classical rough sets, obtained by relaxing the partitions arising from equivalence relations to coverings. Recently, some topological concepts such as neighborhoods have been applied to covering rough sets. In this paper, we further investigate covering rough sets based on neighborhoods by means of approximation operations. We show that the upper approximation based on neighborhoods can be defined equivalently without using neighborhoods. To analyze the coverings themselves, we introduce unary and composition operations on coverings. A notion of homomorphism is provided to relate two covering approximation spaces. We also examine the properties of approximations preserved by the operations and homomorphisms, respectively.

6.
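Rough sets and fuzzy sets are both data analysis tools for handling uncertain information and are important methods in data mining. The fuzzy extension principle, first proposed by Zadeh, is one of the most fundamental principles of fuzzy set theory, while rough sets operate through upper and lower approximation operators. This paper discusses the relationship between the extension principle and the rough upper approximation and proves that the extension principle can be expressed in the form of a rough upper approximation; the extension principle thus becomes a bridge between rough sets and fuzzy sets. In addition, the inverse problem of the extension principle is solved by means of the axiomatic systems of the rough upper and lower approximation operators.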

7.
Rough sets are efficient for data pre-processing during data mining. However, some important problems in rough sets, such as attribute reduction, are NP-hard, and the algorithms used to solve them are mostly greedy ones. The transversal matroid is an important part of matroid theory, which provides a well-established platform for greedy algorithms. In this study, we investigate transversal matroids using the rough set approach. First, we construct a covering induced by a family of subsets and propose approximation operators and an upper approximation number based on this covering. We present a sufficient condition under which a subset is a partial transversal, and also a necessary condition. Furthermore, we characterize the transversal matroid with the covering-based approximation operator and construct some types of circuits. Second, we explore the relationships between closure operators in transversal matroids and upper approximation operators based on the covering induced by a family of subsets. Finally, we study two types of axiomatic characterizations of the covering approximation operators, based on set theory and matroid theory, respectively. These results provide more methods for investigating the combination of transversal matroids with rough sets.

8.
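Fuzziness of covering generalized rough sets   Cited by: 5 (self-citations: 0, citations by others: 5)
Building on the study of covering generalized rough sets, the Hamming and Euclidean distance functions are combined with the nearest ordinary set of a fuzzy set to introduce the concept of the fuzziness degree of a covering generalized rough set. A method for computing this fuzziness degree is given, and some of its important properties are proved. These results play a role in both the theoretical study and the application of covering generalized rough sets.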

9.
This study proposes a novel Forward Search and Backward Trace (FSBT) technique based on Rough Set Theory (RST) to improve data analysis and extend the scope of observations made from sample data, with the aim of solving personal investment portfolio problems. Rough Set Theory mathematically classifies data into class sets. The class set with the most objects may generate one decision rule. The rules generated by RST are rough and fragmented, which makes the information very difficult to interpret. In an empirical case, the RST method generated more than 85 rules, whereas the FSBT method generated only 14. This result suggests that the proposed method is better than the traditional RST method based on class sets that contain the most objects. Much of human knowledge is described in natural language, so it is very important to convert information from computer databases into ordinary human language. Sample data taken from features with the same backgrounds are used to compile different portfolios that investment companies and investment advisors can employ to satisfy investors’ needs. The method not only provides decision-making rules but also offers alternative strategies for better data analysis. We believe that the FSBT technique can be fully applied in research on investment marketing.

10.
Partitioning clustering is a technique for classifying n objects into k disjoint clusters; it has been developed for years and is widely used in many applications. In this paper, a new overlapping clustering algorithm is defined. It differs from traditional clustering algorithms in three respects. First, the new clustering is overlapping, because clusters are allowed to overlap with one another. Second, the clustering is non-exhaustive, because an object is permitted to belong to no cluster. Third, the goals considered in this research are the maximization of the average number of objects contained in a cluster and the maximization of the distances among cluster centers, while the goals in previous research are the maximization of the similarities of objects in the same cluster and the minimization of the similarities of objects in different clusters. Furthermore, the new clustering also differs from traditional fuzzy clustering, because the object–cluster relationship in the new clustering is represented by a crisp value rather than by a fuzzy membership degree. Accordingly, a new overlapping partitioning cluster (OPC) algorithm is proposed to provide overlapping and non-exhaustive clustering of objects. Finally, several simulated and real-world data sets are used to evaluate the effectiveness and efficiency of the OPC algorithm, and the outcomes indicate that the algorithm can generate satisfactory clustering results.

11.
Rough set theory is an important tool for approximate reasoning about data. Axiomatic systems of rough sets are significant for using rough set theory in logical reasoning systems. In this paper, the outer product method is used in the study of rough sets for the first time. With this approach, we propose a unified lower approximation axiomatic system for Pawlak’s rough sets and fuzzy rough sets. As the dual of the axiomatic systems for lower approximation, a unified upper approximation axiomatic characterization of rough sets and fuzzy rough sets, without any restriction on the cardinality of the universe, is also given. These rough set axiomatic systems will help in understanding the structural features of various approximation operators.

12.
Diverse reduct subspaces based co-training for partially labeled data   Cited by: 1 (self-citations: 0, citations by others: 1)
Rough set theory is an effective supervised learning model for labeled data. However, practical problems often involve both labeled and unlabeled data, which is outside the realm of traditional rough set theory. In this paper, the problem of attribute reduction for partially labeled data is studied first. With a new definition of the discernibility matrix, a Markov blanket based heuristic algorithm is put forward to compute the optimal reduct of partially labeled data. A novel rough co-training model is then proposed, which can capitalize on the unlabeled data to improve the performance of a rough classifier learned from only a few labeled data. The model employs two diverse reducts of the partially labeled data to train its base classifiers on the labeled data, and then makes the base classifiers learn from each other on the unlabeled data iteratively. The classifiers constructed in different reduct subspaces can benefit from their diversity on the unlabeled data and significantly improve the performance of the rough co-training model. Finally, the rough co-training model is theoretically analyzed, and an upper bound on its performance improvement is given. The experimental results show that the proposed model outperforms other representative models in terms of accuracy and even compares favorably with a rough classifier trained with all the training data labeled.

13.
The concepts of the lower approximation integral, the upper approximation integral, and rough integrals are given on the basis of function rough sets. Based on these concepts, the relation of the lower approximation integrals, the relation of the upper approximation integrals, the relation of rough integrals, and the double median theorem of rough integrals are discussed. Rough integrals have a finite contraction characteristic and a finite extension characteristic.

14.
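Rough linear spaces   Cited by: 1 (self-citations: 0, citations by others: 1)
The idea of rough sets is introduced into linear spaces. The concepts of the upper and lower approximations of a subset of a linear space are proposed, the concepts of upper and lower rough linear subspaces and of upper and lower rough invariant subspaces are given, and related properties of rough linear spaces are discussed.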

15.
The fuzzy c-means clustering algorithm (FCM) provides a non-parametric and unsupervised approach to the cluster analysis of data. Several efforts in fuzzy clustering have been undertaken by Bezdek and other researchers. Earlier studies in this field have reported problems due to the setting of the optimum initial condition, the cluster validity measure, and the high computational load. More recently, fuzzy clustering has benefited from a synergistic approach with Genetic Algorithms (GA), which serve as a useful optimization technique that helps to better tolerate some classical drawbacks, such as sensitivity to initialization, noise and outliers, and susceptibility to local minima. We propose a genetic-level clustering methodology able to cluster objects represented by R^p spaces. The unsupervised clustering algorithm, called SFCM (Spatial Fuzzy c-Means), is based on a fuzzy c-means clustering method that searches for the best fuzzy partition of the universe, assuming that the evaluation of each object with respect to some features is unknown but knowing that it belongs to circular regions of R^2 space. Next we present a Java implementation of the algorithm, which provides a complete and efficient visual interaction for setting the parameters involved in the system. To demonstrate the applications of SFCM, we discuss a case study in which the generality of our model is shown by treating a simple 3-way data fuzzy clustering as an example of a multicriteria optimization problem.

16.
A new initialization method for fuzzy c-means algorithm   Cited by: 1 (self-citations: 0, citations by others: 1)
In this paper, an initialization method for the fuzzy c-means (FCM) algorithm is proposed in order to solve two problems of FCM: clustering performance that depends on the initial cluster centers, and low computation speed. Grid and density information is used to extract approximate cluster centers from the sample space. The FCM algorithm is then initialized by using the number of approximate cluster centers as the number of classes and the approximate cluster centers themselves as the initial cluster centers. Experiments show that this method can effectively improve the clustering result and shorten the clustering time.

17.
Cluster analysis is an unsupervised learning technique for partitioning objects into several clusters. Assuming that noisy objects are included, we propose a soft clustering method which assigns objects that are significantly different from noise to one of the specified number of clusters by controlling decision errors through multiple testing. The parameters of the Gaussian mixture model are estimated by the EM algorithm. Using the estimated probability density function, we formulate a multiple hypothesis testing problem for clustering, and the positive false discovery rate (pFDR) is calculated as our decision error. The proposed procedure classifies objects into significant data or noise simultaneously according to the specified target pFDR level. When applied to real and artificial data sets, it was able to control the target pFDR reasonably well, offering satisfactory clustering performance.

18.
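Rough set theory, proposed by Pawlak, is a formal tool for representing and processing the information in data tables. As a generalization of the rough set concept, L-fuzzy rough sets based on complete residuated lattices were proposed by Radzikowska and Kerre. In this paper, we give, for the first time, an axiomatic characterization of L-fuzzy rough sets by means of L-fuzzy Galois connections. Since L-fuzzy rough sets and L-fuzzy Galois connections both generalize the corresponding classical cases, the conclusions of this paper also hold for classical rough sets. This means that Galois connections can unify the axiomatization of classical rough sets and even of L-fuzzy rough sets.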

19.
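A fuzzy decision algorithm based on rough sets   Cited by: 8 (self-citations: 0, citations by others: 8)
A rule extraction algorithm is given for extracting fuzzy decision rules from continuous decision tables. First, continuous attribute values are converted into fuzzy values. Then, the similarity between the fuzzy attribute values of two different objects with respect to the corresponding continuous attribute is given. Next, the λ-similarity relation and λ-similarity classes are defined. Based on the λ-similarity relation, the concepts of lower and upper approximation in the rough-fuzzy space are given. Finally, combining ideas from fuzzy set theory and rough set theory, an algorithm for obtaining decision rules from decision tables with continuous value domains is given, and an example illustrates its effectiveness.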

20.
In this paper, we investigate the problem of determining the number of clusters in the k-modes-based clustering of categorical data. We propose a new categorical data clustering algorithm with automatic selection of k. The new algorithm extends the k-modes clustering algorithm by introducing a penalty term into the objective function to make more clusters compete for objects. In the new objective function, we employ a regularization parameter to control the number of clusters in the clustering process. Instead of finding k directly, we choose a suitable value of the regularization parameter such that the corresponding clustering result is the most stable one among all the generated clustering results. Experimental results on synthetic and real data sets are used to demonstrate the effectiveness of the proposed algorithm.
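For readers unfamiliar with the k-modes baseline that this abstract extends, its two building blocks can be sketched as below. The sketch covers only plain k-modes (simple matching dissimilarity and attribute-wise modes) and does not include the penalty term or the regularization parameter described in the abstract; all function names and the sample records are hypothetical.

```python
from collections import Counter


def matching_dissimilarity(x, y):
    """Number of attributes on which two categorical records differ."""
    return sum(a != b for a, b in zip(x, y))


def cluster_mode(records):
    """Attribute-wise most frequent value of a non-empty group of records."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


def assign(records, modes):
    """Index of the nearest mode for each record (ties go to the lowest index)."""
    return [
        min(range(len(modes)), key=lambda j: matching_dissimilarity(r, modes[j]))
        for r in records
    ]


# Hypothetical categorical records: (colour, size).
records = [
    ("red", "s"), ("red", "m"), ("red", "s"),
    ("blue", "l"), ("blue", "l"), ("green", "l"),
]
modes = [cluster_mode(records[:3]), cluster_mode(records[3:])]
labels = assign(records, modes)
print(modes)   # [('red', 's'), ('blue', 'l')]
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Plain k-modes alternates `assign` and `cluster_mode` until the labels stop changing, with k fixed in advance; the abstract's contribution is to avoid fixing k by adding a penalty to the objective.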
