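Speech depression recognition based on deep learning
Cite this article: Wu Qing, Hu Weiping, Chen Dandan, Xiao Ting. Speech depression recognition based on deep learning[J]. 应用声学 (Journal of Applied Acoustics), 2022, 41(5): 837-842
Authors: Wu Qing (吴情), Hu Weiping (胡维平), Chen Dandan (陈丹丹), Xiao Ting (肖婷)
Affiliations: Guangxi Normal University; College of Electronic Engineering, Guangxi Normal University; College of Electronic Engineering, Guangxi Normal University; College of Electronic Engineering, Guangxi Normal University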
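Abstract: The number of people with depression is rising worldwide, and diagnosis and treatment are constrained by a shortage of doctors. To address this problem, a feature-fusion model that combines a CNN with an attention-based BLSTM is proposed. The study covers both feature selection and network architecture: several classical speech features are compared, and Mel-frequency cepstral coefficients (MFCCs) are found to give the best depression classification results. The MFCCs are then fed separately into the CNN and into the attention-based BLSTM network to perform depression classification. In experiments on the DAIC-WOZ dataset, the proposed method achieves a classification accuracy of 78.06% and an F1 score of 74.68%.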

Keywords: depression recognition; speech analysis; classification
Received: 2021-07-25
Revised: 2022-08-22
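
The abstract reports two metrics, accuracy and F1 score. As a reminder of how they are defined for a binary depressed/non-depressed task, here is a minimal sketch. The function name `accuracy_and_f1`, the choice of label 1 for the depressed class, and the toy labels are assumptions made for illustration; the abstract does not say whether its F1 score is computed for the depressed class only or averaged over classes.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels.

    `positive` marks the class treated as "depressed" (an assumption here).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy labels, made up for illustration only (not DAIC-WOZ data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2%}, F1={f1:.2%}")
```

On imbalanced data the two numbers can differ noticeably, which is one reason papers on depression detection tend to report both.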

Speech depression recognition based on deep learning
Affiliation: Guangxi Normal University; College of Electronic Engineering, Guangxi Normal University; College of Electronic Engineering, Guangxi Normal University; College of Electronic Engineering, Guangxi Normal University
Abstract: The number of patients with depression is increasing around the world, and there is a shortage of doctors to diagnose and treat it. In response to this problem, a feature-fusion model that combines a CNN with an attention-based BLSTM is proposed. The study addresses both feature selection and network architecture. A comparison of several classical speech features shows that Mel-frequency cepstral coefficients (MFCCs) give the best depression classification results; the MFCCs are then fed separately into the CNN and into the BLSTM network with an attention mechanism to perform depression classification. Experiments on the DAIC-WOZ dataset show that the proposed method achieves a classification accuracy of 78.06% and an F1 score of 74.68%.
Keywords: depression recognition; speech analysis; classification
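
To make the idea of fusing CNN features with an attention-based BLSTM concrete, the following NumPy sketch shows one common form of attention pooling followed by concatenation. It is not the authors' implementation: the tanh scoring function, fusion by concatenation, all dimensions, and the names `attention_pool` and `fuse` are assumptions, and random arrays stand in for the BLSTM outputs and CNN features that would normally be computed from MFCCs.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def attention_pool(H, w):
    """Collapse frame-level outputs into one vector with attention weights.

    H: (T, d) array, one d-dimensional BLSTM output per time frame.
    w: (d,) scoring vector (learned in a real model; random here).
    Returns the (d,) context vector and the (T,) attention weights.
    """
    scores = np.tanh(H) @ w   # one scalar score per frame
    alpha = softmax(scores)   # weights are positive and sum to 1
    context = alpha @ H       # weighted sum of the frames
    return context, alpha


def fuse(cnn_feat, blstm_context):
    """Fuse the two branches by concatenation (an assumed fusion rule)."""
    return np.concatenate([cnn_feat, blstm_context])


# Placeholder inputs: 50 frames, 8-dim BLSTM outputs, 6-dim CNN feature.
rng = np.random.default_rng(0)
T, d_blstm, d_cnn = 50, 8, 6
H = rng.standard_normal((T, d_blstm))
w = rng.standard_normal(d_blstm)
cnn_feat = rng.standard_normal(d_cnn)

context, alpha = attention_pool(H, w)
fused = fuse(cnn_feat, context)
print(fused.shape)  # (14,)
```

In a full model the fused vector would typically pass through a classifier layer that outputs the depressed/non-depressed decision, and the attention weights `alpha` indicate which frames contributed most to it.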