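Oversampling algorithm based on gradient penalty generative adversarial network
Cite this article:TAO Jialiang, WEI Guoliang, SONG Yan, DOU Jun, MU Weimeng. Oversampling algorithm based on gradient penalty generative adversarial network[J]. Journal of University of Shanghai for Science and Technology, 2023, 45(3): 235-243.
Authors:TAO Jialiang  WEI Guoliang  SONG Yan  DOU Jun  MU Weimeng
Institution:College of Science, University of Shanghai for Science and Technology, Shanghai 200093, China; Business School, University of Shanghai for Science and Technology, Shanghai 200093, China; School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
Funding:National Natural Science Foundation of China (61873169); Shanghai "Science and Technology Innovation Action Plan" Domestic Science and Technology Cooperation Project (20015801100)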
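Abstract:To better learn the probability density distribution of the original samples in imbalanced data classification, an oversampling algorithm based on a gradient penalty generative adversarial network (OGPG) is proposed. The algorithm first introduces a generative adversarial network (GAN) to learn the probability distribution of the original data effectively. It then applies a gradient penalty that constrains the 2-norm of the gradient of the discriminator with respect to its input, which mitigates the overfitting and vanishing gradients to which GANs are prone, so that new samples are generated reasonably. In experiments on 14 public datasets, the algorithm is compared with other oversampling algorithms using k-nearest neighbor and decision tree classifiers and shows significant improvements on all evaluation metrics; the Wilcoxon signed-rank test is used to verify the statistical difference between the algorithm and the comparison algorithms. The results indicate that the algorithm is effective and general.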

Keywords:imbalanced data  oversampling algorithm  probability density distribution  generative adversarial network  gradient penalty
Received:2022/3/7

Oversampling algorithm based on gradient penalty generative adversarial network
TAO Jialiang, WEI Guoliang, SONG Yan, DOU Jun, MU Weimeng. Oversampling algorithm based on gradient penalty generative adversarial network[J]. Journal of University of Shanghai for Science and Technology, 2023, 45(3): 235-243.
Authors:TAO Jialiang  WEI Guoliang  SONG Yan  DOU Jun  MU Weimeng
Institution:College of Science, University of Shanghai for Science and Technology, Shanghai 200093, China;Business School, University of Shanghai for Science and Technology, Shanghai 200093, China;School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
Abstract:To place more emphasis on learning the probability density distribution of the original samples in imbalanced data classification, an oversampling algorithm based on a gradient penalty generative adversarial network (OGPG) is proposed. First, a generative adversarial network (GAN) is adopted to learn the probability density distribution of the original data effectively. Second, a gradient penalty is used to constrain the 2-norm of the discriminator's gradient with respect to its input, which reduces the overfitting and vanishing gradients that GANs are prone to, so that new samples are generated reasonably. In experiments on 14 public datasets, k-nearest neighbor and decision tree classifiers are used to compare OGPG with other oversampling algorithms, and the evaluation metrics improve significantly. The Wilcoxon signed-rank test is used to verify the statistical difference between OGPG and the comparison algorithms. The results show that the algorithm is effective and general.
Keywords:imbalanced data  oversampling algorithm  probability density distribution  GAN  gradient penalty
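
The abstract states only that a gradient penalty constrains the 2-norm of the discriminator's gradient with respect to its input; it does not give the formula. As an illustration, the sketch below assumes the commonly used WGAN-GP form, in which the penalty is the mean of λ(‖∇D(x̂)‖₂ − 1)² over random interpolations x̂ between real and generated samples. The tiny tanh critic, the names `critic_input_gradient` and `gradient_penalty`, and the default `lam=10.0` are all hypothetical choices made for this example and are not taken from the paper.

```python
import math
import random


def critic_input_gradient(x, W, b, v):
    """Gradient with respect to x of the toy critic
    D(x) = sum_j v[j] * tanh(W[j] . x + b[j])  (an assumed architecture)."""
    grad = [0.0] * len(x)
    for W_j, b_j, v_j in zip(W, b, v):
        pre = sum(w * xi for w, xi in zip(W_j, x)) + b_j
        scale = v_j * (1.0 - math.tanh(pre) ** 2)  # chain rule through tanh
        for i, w in enumerate(W_j):
            grad[i] += scale * w
    return grad


def gradient_penalty(real, fake, W, b, v, lam=10.0, rng=random):
    """Mean of lam * (||grad_x D(x_hat)||_2 - 1)^2 over interpolated points.

    x_hat = eps * x_real + (1 - eps) * x_fake with eps drawn from U[0, 1).
    This is the WGAN-GP form, assumed here; the paper may differ.
    """
    total = 0.0
    for x_r, x_f in zip(real, fake):
        eps = rng.random()  # one mixing weight per real/fake pair
        x_hat = [eps * r + (1.0 - eps) * f for r, f in zip(x_r, x_f)]
        grad = critic_input_gradient(x_hat, W, b, v)
        norm = math.sqrt(sum(g * g for g in grad))
        total += lam * (norm - 1.0) ** 2
    return total / len(real)


# Usage with made-up minority-class samples and critic weights.
rng = random.Random(0)
W = [[0.5, -0.3], [0.1, 0.8]]
b = [0.0, 0.1]
v = [1.0, -1.0]
real = [[1.0, 2.0], [0.5, 1.5]]
fake = [[0.9, 2.2], [0.1, 1.0]]
penalty = gradient_penalty(real, fake, W, b, v, rng=rng)
```

In a training loop this term would be added to the discriminator loss, so that the input-gradient norm is pulled toward 1 instead of collapsing toward zero or growing without bound. A real implementation would obtain the gradient by automatic differentiation rather than the hand-derived formula used for this toy critic.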