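A Review of Biclustering Methods
Cite this article: FANG Kuang-nan, CHEN Yuan-xing, ZHANG Qing-zhao, MA Shuang-ge. A Review of Biclustering Methods[J]. 数理统计与管理 (Application of Statistics and Management), 2020, 39(1): 22-34
Authors: FANG Kuang-nan  CHEN Yuan-xing  ZHANG Qing-zhao  MA Shuang-ge
Affiliations: Department of Statistics, School of Economics, Xiamen University, Xiamen, Fujian 361005; State Key Laboratory of Operation and Control of Renewable Energy and Storage Systems (China Electric Power Research Institute Co., Ltd.), Beijing 100192; Department of Statistics, School of Economics, Xiamen University, Xiamen, Fujian 361005; Department of Statistics, School of Economics, Xiamen University, Xiamen, Fujian 361005; Department of Biostatistics, Yale University, New Haven 06510, USA
Funding: Fundamental Research Funds for the Central Universities (20720181003, 20720171095); Open Fund of the State Key Laboratory of Operation and Control of Renewable Energy and Storage Systems (NYB51201801579)
Abstract: Traditional clustering methods cannot extract local correspondences between samples and variables, and they perform poorly on high-dimensional, sparse data. Researchers therefore proposed biclustering, which uses the local relationships between samples and variables to cluster both simultaneously, yielding a set of submatrices as the clustering result. In recent years biclustering has developed rapidly and has been widely applied in gene analysis, text clustering, recommender systems, and other fields. First, this paper organizes and summarizes biclustering methods, focusing on three classes: sparse biclustering, spectral biclustering, and information-theoretic biclustering. It analyzes the differences and connections among them and describes the current state and trends of these three classes in integrative analysis of multi-source data, multi-level clustering, semi-supervised learning, and ensemble learning. Second, it reviews applications of biclustering in gene analysis, text clustering, recommender systems, and other fields. Finally, considering the data characteristics of the big data era and the open problems of biclustering, it outlines future research directions for biclustering.

Keywords: sparse biclustering  submatrix  spectral biclustering  information-theoretic biclustering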

Review of Biclustering
FANG Kuang-nan, CHEN Yuan-xing, ZHANG Qing-zhao, MA Shuang-ge. Review of Biclustering[J]. Application of Statistics and Management, 2020, 39(1): 22-34
Authors: FANG Kuang-nan  CHEN Yuan-xing  ZHANG Qing-zhao  MA Shuang-ge
Affiliation: (Department of Statistics, School of Economics, Xiamen University, Xiamen 361005, China; Department of Biostatistics, Yale University, New Haven 06510, USA; National Key Laboratory on Power Grid Environment Protection (China Electric Power Research Institute), Beijing 100192, China)
Abstract: Traditional clustering methods cannot exploit the duality between samples and variables, and they often perform badly on sparse, high-dimensional data. Researchers therefore proposed biclustering, which uses this duality to cluster samples and variables simultaneously and so obtain submatrices. In recent years, biclustering methods have developed rapidly and have been widely used in areas such as microarray analysis, text clustering, and recommender systems. This paper first reviews biclustering methods, focusing on three classical ones: Sparse Biclustering, Spectral Biclustering, and Information Theoretic Co-clustering. In detail, we summarize the differences and relationships among these three methods, and we introduce their development status and trends in integrative analysis of multi-source datasets, multi-level clustering, semi-supervised learning, supervised learning, and ensemble learning. Second, we review applications of biclustering in microarray analysis, text clustering, and recommender systems. Finally, combining the data characteristics of the big data era with the existing problems in biclustering, we discuss future research directions for biclustering.
Keywords: sparse biclustering  submatrix  spectral biclustering  information theoretic co-clustering
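
To make the idea of clustering samples and variables simultaneously concrete, the following is a minimal sketch of a two-way spectral split in the spirit of bipartite spectral graph partitioning. It is an illustration only and not necessarily the formulation surveyed in the paper. The function name `spectral_bipartition`, the toy matrix `A`, the sign-based split (in place of k-means), and the deterministic start vector are all assumptions made for this example; the sketch also assumes a nonnegative matrix with positive row and column sums.

```python
import math


def spectral_bipartition(a, iters=100):
    """Split the rows and columns of a nonnegative matrix into two co-clusters."""
    n_rows, n_cols = len(a), len(a[0])
    row_sum = [sum(row) for row in a]
    col_sum = [sum(a[i][j] for i in range(n_rows)) for j in range(n_cols)]
    total = sum(row_sum)

    # Degree-normalized matrix: an[i][j] = a[i][j] / sqrt(row_sum[i] * col_sum[j]).
    an = [[a[i][j] / math.sqrt(row_sum[i] * col_sum[j]) for j in range(n_cols)]
          for i in range(n_rows)]

    # The leading right singular vector of `an` is known in closed form
    # (proportional to sqrt(col_sum)) and carries no cluster information.
    v1 = [math.sqrt(c / total) for c in col_sum]

    # Power iteration for the second right singular vector, projecting out v1
    # at every step. The start vector is an arbitrary deterministic choice.
    v = [float(j + 1) for j in range(n_cols)]
    for _ in range(iters):
        dot = sum(x * y for x, y in zip(v, v1))
        v = [x - dot * y for x, y in zip(v, v1)]
        u = [sum(an[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        v = [sum(an[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            raise ValueError("no second singular direction found")
        v = [x / norm for x in v]

    # Matching left singular direction, then undo the degree scaling
    # and split rows and columns by sign.
    u = [sum(an[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    row_labels = [int(u[i] / math.sqrt(row_sum[i]) > 0) for i in range(n_rows)]
    col_labels = [int(v[j] / math.sqrt(col_sum[j]) > 0) for j in range(n_cols)]
    return row_labels, col_labels


# Toy data (hypothetical): two planted blocks hidden by the row/column order.
block_rows, block_cols = {0, 2, 5}, {1, 3, 4}
A = [[5 if (i in block_rows) == (j in block_cols) else 1 for j in range(6)]
     for i in range(6)]

row_labels, col_labels = spectral_bipartition(A)
print("row labels:", row_labels)
print("column labels:", col_labels)
```

Rows and columns that receive the same label form one submatrix (bicluster). Which block is labeled 0 and which is labeled 1 depends on the sign of the singular vector, so only the grouping is meaningful.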
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).