Similar literature
1.
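Microarray technology allows millions of gene expression levels to be recorded simultaneously. However, because of funding and technical constraints, the expression data sets that researchers currently obtain often contain only a small number of samples, while the measured gene expression values number in the tens of thousands. Many traditional statistical methods cannot analyze such data. Drawing on statistical learning theory from data mining, this paper describes in detail the application of a supervised analysis method, the support vector machine (SVM), to microarray expression data analysis.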

2.
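A genetic fuzzy neural network data mining algorithm   Total citations: 2, self-citations: 0, citations by others: 2
Data mining is a new research direction that has emerged in information processing in recent years. This paper examines the application of extended TS fuzzy neural networks and genetic algorithms to data mining, and proposes a data mining method that combines the fuzzy neural network with a genetic algorithm. In this method the genetic algorithm adaptively constructs and optimizes the TS model, and the TS model performs the prediction, which builds on the clustering results of the genetic algorithm. Combining the two improves the effectiveness of data mining in application. The paper closes with an application example of the method.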

3.
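Microbiome big data plays an important role in research on the ecological environment, human health, and disease. Extracting useful information from high-dimensional, complex data with mathematical, statistical, and other data mining methods is the key problem in modeling and analyzing microbiome big data. This paper analyzes the characteristics of microbiome big data, discusses the current hot topics and difficulties in data analysis and computational research, and reviews current research on pattern mining and on network reconstruction and analysis for microbiome big data.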

4.
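Microbiome big data plays an important role in research on the ecological environment, human health, and disease. Extracting useful information from high-dimensional, complex data with mathematical, statistical, and other data mining methods is the key problem in modeling and analyzing microbiome big data. This paper analyzes the characteristics of microbiome big data, discusses the current hot topics and difficulties in data analysis and computational research, and reviews current research on pattern mining and on network reconstruction and analysis for microbiome big data.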

5.
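Statistical data mining techniques are applied to analyze the finances of listed real estate companies. Key financial indicators of fifteen listed real estate companies are collected and subjected to factor analysis; principal component extraction is used to reduce the number of variables, yielding four main factors that affect a company's financial condition. Cluster analysis is then applied to the fifteen companies to divide them into operating grades. Finally, the results of the factor analysis and the cluster analysis are combined to give a comprehensive evaluation of each company's operating condition, which can guide investors and managers in making sound decisions.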

6.
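In recent years data mining has been one of the leading international research directions in intelligent information processing and decision support analysis. This paper gives an overview of the main concepts and new techniques of data mining and presents its wide range of application areas.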

7.
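To address shortcomings in the existing method of counting faults per thousand vehicles, this paper starts from improving the statistical method and proposes a new one, namely a redefinition of the faults-per-thousand-vehicles figure. It then uses cluster analysis from data mining to consider batches with the same characteristics together and builds a general operations research model. For the two problems of missing data and near-term forecasting, the general model is adjusted to "learn" different weights among data of the same class; the weighted data are then used, and the forecast is obtained by curve fitting. Because data are severely lacking for long-term forecasting, the forecast there is computed from a purely statistical standpoint. The forecasting model is general and widely applicable. The models are solved with two software packages, SAS and MATLAB; the forecasts are fairly accurate and consistent with the actual situation.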

8.
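Research progress in extension data mining   Total citations: 3, self-citations: 1, citations by others: 2
Extenics studies theories and methods for solving contradictory problems with formalized models. Extension data mining is the product of combining Extenics with data mining: it explores using Extenics methods and data mining techniques to mine knowledge related to extension transformations from databases, including extension classification knowledge, conductive knowledge, and other extension knowledge. With the advance of economic globalization, a changing environment has shortened the update cycle of information and knowledge, and innovation and the resolution of contradictory problems have increasingly become important work in every industry. How to mine extension knowledge has therefore become an important task for data mining research. Research indicates that extension data mining has broad application prospects. This paper introduces the set-theoretic foundation of extension data mining, its basic knowledge, and the main topics of current research, and raises problems that need further investigation along with its development prospects.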

9.
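Securities analysts provide information about listed companies to the stock market and play an important role in it. As China's stock market has developed, investment research reports issued by securities investment consulting institutions of all kinds have multiplied, and they exert a growing influence on investors, institutional investors in particular. By building mathematical and statistical models of the problem, the paper evaluates the actual effect of analysts' investment recommendations and uses data mining methods to further identify the star analysts in each industry. In-depth analysis and mining of financial securities analysts' investment rating data helps investors use this information more reasonably and effectively.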

10.
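The problem of identifying moving cloud blocks from satellite cloud images and computing wind vectors is examined. A system of spatial surface equations centered on the point-normal form of the plane equation is used to carry out the field-of-view coordinate transformation. Satellite cloud images are preprocessed by combining three methods (adjacent-frame differencing, median filtering, and threshold clipping) to obtain the cloud distribution. A sliding-search image matching model is built and, from the perspectives of data mining and image features, two adaptive algorithms, one data-based and one based on statistical parameters, are designed to optimize the matching window size and the search range. Finally, suggestions are made for further improving the model.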
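The three preprocessing steps named in this abstract can be sketched on tiny grayscale frames stored as nested lists. This is only an illustration: the order in which the steps are applied, the 3x3 window, the border handling, and the function names are all assumptions, since the abstract does not specify them.

```python
from statistics import median

def frame_difference(prev, curr):
    """Absolute pixel-wise difference between two adjacent frames."""
    return [[abs(c - p) for p, c in zip(row_p, row_c)]
            for row_p, row_c in zip(prev, curr)]

def median_filter(img):
    """3x3 median filter; border pixels use only the neighbours that exist
    (an assumed border rule)."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [img[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))]
            row.append(median(window))
        out.append(row)
    return out

def threshold(img, level):
    """Binary mask: 1 where the pixel value is at least `level`, else 0."""
    return [[1 if v >= level else 0 for v in row] for row in img]

def preprocess(prev, curr, level):
    """Assumed pipeline order: difference, then denoise, then threshold."""
    return threshold(median_filter(frame_difference(prev, curr)), level)
```

Calling `preprocess(frame_t0, frame_t1, level)` on two equally sized frames returns a 0/1 mask marking pixels that changed by at least `level` after denoising.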

11.
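Latent information mining based on the extensibility of matter-elements   Total citations: 3, self-citations: 1, citations by others: 2
Latent information mining is the core of data mining. Applying extension theory, this paper proposes a latent information mining method based on the extensibility of matter-elements and discusses divergence, correlation, and implication methods for latent information mining. These methods are compatible with existing data mining methods and complement them.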

12.
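Extension data mining of the effect of CPI transformations on product sales   Total citations: 2, self-citations: 0, citations by others: 2
Current research on data mining concentrates on mining static data, whereas the contradictory problems that frequently arise in practice have to be solved through extension transformations and operations on them. This calls for knowledge about transformations and for dynamic data mining, or extension data mining, to solve the problem. Using the theory of extension logic and extension data mining, the paper studies the mining of conductive knowledge in extension data mining from the effect that transformations of the national consumer price index have on product sales data, providing a basis for enterprise decision makers to set more reasonable sales strategies in the current market environment.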

13.
With the rapid growth of databases in many modern enterprises, data mining has become an increasingly important approach for data analysis. The operations research community has contributed significantly to this field, especially through the formulation and solution of numerous data mining problems as optimization problems, and several operations research applications can also be addressed using data mining methods. This paper provides a survey of the intersection of operations research and data mining. The primary goals of the paper are to illustrate the range of interactions between the two fields, present some detailed examples of important research work, and provide comprehensive references to other important work in the area. The paper thus looks at both the different optimization methods that can be used for data mining and the data mining process itself, and at how operations research methods can be used in almost every step of this process. Promising directions for future research are also identified throughout the paper. Finally, the paper looks at some applications related to the management of electronic services, namely customer relationship management and personalization.

14.
Data mining involves extracting interesting patterns from data and can be found at the heart of operational research (OR), as its aim is to create and enhance decision support systems. Even in the early days, some data mining approaches relied on traditional OR methods such as linear programming and forecasting, and modern data mining methods are based on a wide variety of OR methods including linear and quadratic optimization, genetic algorithms and concepts based on artificial ant colonies. The use of data mining has rapidly become widespread, with applications in domains ranging from credit risk, marketing, and fraud detection to counter-terrorism. In all of these, data mining is increasingly playing a key role in decision making. Nonetheless, many challenges still need to be tackled, ranging from data quality issues to the problem of how to include domain experts' knowledge, or how to monitor model performance. In this paper, we outline a series of upcoming trends and challenges for data mining and its role within OR.

15.
We explore the use of data mining for lead time estimation in make-to-order manufacturing. The regression tree approach is chosen as the specific data mining method. Training and test data are generated from variations of a job shop simulation model. Starting with a large set of job and shop attributes, a reasonably small subset is selected based on their contribution to estimation performance. Data mining with the selected attributes is compared with linear regression and three other lead time estimation methods from the literature. Empirical results indicate that our data mining approach coupled with the attribute selection scheme outperforms these methods.
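As a rough illustration of the regression tree idea mentioned in this abstract, the sketch below fits a depth-one tree (a single split, often called a stump) that predicts lead time from one attribute. The attribute "jobs in queue", the toy numbers, and the function names are hypothetical; the paper's actual tree, attributes, and simulation data are not given in the abstract.

```python
def fit_stump(xs, ys):
    """Find the single threshold on x that minimises the summed squared error
    of predicting the mean of y on each side of the split."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate equal x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = (sum((y - mean_l) ** 2 for y in left)
               + sum((y - mean_r) ** 2 for y in right))
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, thr, mean_l, mean_r)
    return best[1:]  # (threshold, left-leaf mean, right-leaf mean)

def predict(stump, x):
    """Return the leaf mean on the side of the threshold where x falls."""
    thr, mean_l, mean_r = stump
    return mean_l if x <= thr else mean_r

# Hypothetical training data: jobs in queue on arrival -> observed lead time.
jobs_in_queue = [1, 2, 3, 10, 11, 12]
lead_time = [5, 6, 7, 20, 21, 22]
stump = fit_stump(jobs_in_queue, lead_time)
```

A full regression tree applies the same split search recursively to each leaf and over every candidate attribute.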

16.
Data mining aims to find patterns in organizational databases. However, most mining techniques do not consider knowledge of the quality of the database. In this work, we show how to incorporate into classification mining recent advances in the data quality field that view a database as the product of an imprecise manufacturing process where the flaws/defects are captured in quality matrices. We develop a general-purpose method of incorporating data quality matrices into the data mining classification task. Our work differs from existing data preparation techniques: while other approaches detect and fix errors to ensure consistency with the entire data set, our work makes use of a priori knowledge of how the data is produced/manufactured.

17.
Data preprocessing is a critical step in the data mining process and has a large impact on the success of a data mining project. In this paper, we present an algorithm, DB-HReduction, which efficiently discretizes or eliminates numeric attributes and generalizes or eliminates symbolic attributes. The algorithm greatly decreases the number of attributes and tuples in the data set, improving the accuracy and decreasing the running time of the data mining algorithms applied at the later stage.

18.
Outlier mining is an important aspect of data mining, and outlier mining based on Cook's distance is the most commonly used approach. However, when the data exhibit multicollinearity, the traditional Cook method is no longer effective. Given the advantages of principal component estimation, we use it in place of least squares estimation and derive a Cook's distance measure based on principal component estimation, which can be used in outlier mining. We also examine related theory and application problems.
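For reference, the classical least squares form of Cook's distance and the principal component estimator can be written as follows. The notation (a linear model $y = X\beta + \varepsilon$ with $p$ parameters, and $r$ retained components) is assumed here, and the final substitution is only one natural reading of the abstract, not necessarily the measure the authors define.

```latex
% Classical Cook's distance for observation i (least squares):
%   \hat\beta_{(i)} is the estimate with observation i deleted,
%   e_i the i-th residual, s^2 the residual mean square,
%   h_{ii} the i-th diagonal entry of H = X (X^\top X)^{-1} X^\top.
D_i = \frac{(\hat\beta - \hat\beta_{(i)})^{\top} X^{\top} X \,(\hat\beta - \hat\beta_{(i)})}{p\, s^{2}}
    = \frac{e_i^{2}}{p\, s^{2}} \cdot \frac{h_{ii}}{(1 - h_{ii})^{2}}

% Principal component estimator: with X^\top X = V \Lambda V^\top and
% V_r, \Lambda_r holding the r leading eigenvectors and eigenvalues,
\hat\beta_{PC} = V_r \Lambda_r^{-1} V_r^{\top} X^{\top} y

% Assumed analogue: replace the least squares estimates by their
% principal component counterparts in the deletion form of D_i.
D_i^{PC} \propto (\hat\beta_{PC} - \hat\beta_{PC(i)})^{\top} X^{\top} X \,(\hat\beta_{PC} - \hat\beta_{PC(i)})
```

Under multicollinearity $X^{\top}X$ is nearly singular, so the least squares estimate and the leverages $h_{ii}$ become unstable; discarding the small-eigenvalue directions is what motivates the substitution.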

19.
Operations research and data mining already have a long-established common history. Indeed, with the growing size of databases and the amount of data available, data mining has become crucial in modern science and industry. Data mining problems raise interesting challenges for several research domains, and in particular for operations research, as very large search spaces of solutions need to be explored. Hence, many operations research methods have been proposed to deal with such challenging problems. But the relationships between these two domains are not limited to these natural applications of operations research approaches. The counterpart is also important to consider, since data mining approaches have also been applied to improve operations research techniques. The aim of this article is to highlight the interplay between these two research disciplines. A particular emphasis will be placed on the emerging theme of applying multi-objective approaches in this context.

20.
We present the design of more effective and efficient genetic algorithm (GA) based data mining techniques that use the concepts of feature selection. Explicit feature selection is traditionally done as a wrapper approach, where every candidate feature subset is evaluated by executing the data mining algorithm on that subset. In this article we present a GA that performs mining and feature selection simultaneously by evolving a binary code alongside the chromosome structure used for evolving the rules. We then present a wrapper approach to feature selection based on the Hausdorff distance measure. Results from applying these techniques to a real-world data mining problem show that combining both feature selection methods provides the best performance in terms of prediction accuracy and computational efficiency.
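The wrapper idea described in this abstract, scoring each candidate feature subset by running the mining algorithm on it, can be sketched with a small GA over binary feature masks. This is a simplified stand-in rather than the authors' method: it evolves only the mask (no rule chromosome), uses leave-one-out 1-nearest-neighbour accuracy as the assumed mining algorithm, and all names, parameters, and the toy data are hypothetical.

```python
import random

def loo_accuracy(rows, labels, mask):
    """Leave-one-out 1-NN accuracy using only features where mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    hits = 0
    for i, row in enumerate(rows):
        best_d, best_label = None, None
        for k, other in enumerate(rows):
            if k == i:
                continue
            d = sum((row[j] - other[j]) ** 2 for j in cols)
            if best_d is None or d < best_d:
                best_d, best_label = d, labels[k]
        hits += best_label == labels[i]
    return hits / len(rows)

def fitness(rows, labels, mask, penalty=0.01):
    """Wrapper fitness: accuracy minus a small cost per selected feature."""
    return loo_accuracy(rows, labels, mask) - penalty * sum(mask)

def evolve_mask(rows, labels, pop_size=12, generations=20, p_mut=0.1, seed=0):
    """Elitist GA: keep the better half, refill by one-point crossover of two
    elite parents followed by bit-flip mutation. Needs at least 2 features."""
    rng = random.Random(seed)
    n = len(rows[0])
    score = lambda m: fitness(rows, labels, m)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randint(1, n - 1)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=score)

# Toy data: feature 0 separates the two classes; the others are noise.
rows = [[0.1, 0.9, 0.3, 0.5], [0.2, 0.1, 0.8, 0.4], [0.0, 0.5, 0.5, 0.9],
        [0.9, 0.8, 0.2, 0.5], [1.0, 0.2, 0.7, 0.3], [0.8, 0.4, 0.6, 0.8]]
labels = [0, 0, 0, 1, 1, 1]
best_mask = evolve_mask(rows, labels)
```

Each fitness call reruns the classifier on the candidate subset, which is what makes wrapper methods expensive and motivates approaches that fold selection into the mining run itself.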
