Similar Literature
20 similar documents found (search time: 15 ms)
1.
With the development of computer technology, the demand for harmonious human-computer interaction keeps rising, which requires computers to understand the speaker's emotional information, i.e., to perform speech emotion recognition. This paper proposes a speech emotion recognition method based on support vector machines (SVM), studying six basic human emotions: happiness, surprise, anger, sadness, fear, and calm. Features are first extracted from the emotional utterances of a self-built emotional speech database, and sequential forward selection (SFS) is then applied...
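The SVM-plus-SFS pipeline the abstract describes can be sketched as below. This is a minimal illustration, not the paper's implementation: the synthetic feature vectors stand in for real acoustic features, and the 6-class labels follow the abstract's emotion set.

```python
# Hedged sketch: SVM emotion classification with sequential forward
# selection (SFS). Synthetic data replaces the paper's self-built database.
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_per_class, n_feats, n_classes = 40, 12, 6   # 6 emotions, 12 toy features
X = np.vstack([rng.normal(c, 1.0, (n_per_class, n_feats))
               for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), n_per_class)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

# Sequential forward selection keeps the most discriminative features.
sfs = SequentialFeatureSelector(SVC(kernel="rbf"), n_features_to_select=4,
                                direction="forward", cv=3)
sfs.fit(X_tr, y_tr)

clf = SVC(kernel="rbf").fit(sfs.transform(X_tr), y_tr)
acc = clf.score(sfs.transform(X_te), y_te)
print(f"selected features: {np.flatnonzero(sfs.get_support())}, accuracy: {acc:.2f}")
```

Real systems would replace the synthetic matrix with prosodic/spectral features extracted per utterance.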

2.
This paper introduces the origin and main research content of speech emotion recognition and summarizes the state of research at home and abroad. It focuses on the extraction of emotional speech features and the modeling algorithms for emotion classifiers, and concludes with an outlook on future directions of emotion recognition.

3.
Research Progress in Speech Emotion Recognition   (Cited by: 11; self-citations: 0; citations by others: 11)
Emotion plays an important role in human perception, decision-making, and other processes. For a long time, research on emotional intelligence existed only in psychology and cognitive science. In recent years, with the development of artificial intelligence, the combination of emotional intelligence and computer technology has produced the research topic of affective computing, which will greatly promote the development of computer technology. Automatic emotion recognition is the first step toward affective computing. Speech, as humanity's most important communication medium, carries rich emotional information. How to automatically recognize a speaker's emotional state from speech has attracted broad attention from researchers in many fields. Starting from several important problems in speech emotion recognition, including emotion theory and emotion classification, emotional speech databases, emotional features in speech, and speech emotion recognition algorithms, this paper reviews current research progress and discusses several key problems for future research.

4.
A Survey of Emotional Feature Analysis and Recognition in Speech Signals   (Cited by: 13; self-citations: 0; citations by others: 13)
The analysis and recognition of speech emotion is one of the emerging research topics of recent years. This paper reviews the state of speech emotion recognition at home and abroad, describes various methods of classifying human emotions, and summarizes methods for extracting speech feature parameters and the significance of each parameter for emotion recognition. On this basis, it surveys research progress and the main recognition modeling methods in the field, and weighs the advantages and disadvantages of each. Finally, it summarizes development trends in speech emotion recognition and offers an outlook.

5.
Advances in Speech Emotion Recognition for Human-Computer Interaction   (Cited by: 7; self-citations: 0; citations by others: 7)
Speech emotion recognition is a hot research topic in signal processing, pattern recognition, artificial intelligence, and human-computer interaction. Its ultimate goal is to endow computers with emotional ability, making human-computer interaction truly harmonious and natural. This paper reviews the latest progress on several key problems in speech emotion recognition, covering four aspects: emotion representation theory, emotional speech databases, emotional acoustic feature analysis, and emotion recognition methods. It also points out open problems and directions for future development.

6.
Analysis and Recognition of Emotional Features in Speech Signals   (Cited by: 11; self-citations: 0; citations by others: 11)
This paper analyzes the temporal structure, amplitude structure, fundamental-frequency structure, and formant structure of speech signals carrying four emotions: joy, anger, surprise, and sadness. By comparison with emotionless, calm speech, it summarizes the distribution patterns of the emotional features of different emotional speech signals. Based on this analysis, nine emotional features were extracted for emotion recognition experiments, and the recognition results obtained were close to normal human performance.

7.
周慧  魏霖静 《电子设计工程》2012,20(16):188-190
An emotional speech recognition method based on LS-SVM is proposed. Parameters such as fundamental frequency, energy, and speaking rate are first extracted from the experimental speech signals as emotional features; an LS-SVM model is then built for the corresponding emotional speech signals and used for recognition. Experimental results show that LS-SVM achieves a high recognition rate for basic emotions.
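Unlike a standard SVM, LS-SVM training reduces to solving one linear system. A minimal binary sketch follows; the feature-extraction step (pitch, energy, rate) is omitted and synthetic clusters stand in for emotion classes.

```python
# Hedged sketch of a least-squares SVM (LS-SVM) binary classifier in the
# Suykens regression-on-labels form; not the paper's implementation.
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def lssvm_fit(X, y, C=10.0, gamma=0.5):
    # Training solves: [[0, 1^T], [1, K + I/C]] @ [b; alpha] = [0; y]
    n = len(y)
    K = rbf_kernel(X, X, gamma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / C
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]          # bias b, dual weights alpha

def lssvm_predict(X_train, b, alpha, X_new, gamma=0.5):
    return np.sign(rbf_kernel(X_new, X_train, gamma) @ alpha + b)

rng = np.random.default_rng(1)
# Two synthetic clusters standing in for e.g. "angry" vs. "neutral".
X = np.vstack([rng.normal(-2, 1, (30, 3)), rng.normal(2, 1, (30, 3))])
y = np.concatenate([-np.ones(30), np.ones(30)])
b, alpha = lssvm_fit(X, y)
pred = lssvm_predict(X, b, alpha, X)
print("training accuracy:", (pred == y).mean())
```

Multi-class emotion recognition would combine several such binary machines (one-vs-one or one-vs-rest).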

8.
Research shows that there are many speech emotion recognition methods. This paper introduces a GMM-based speech emotion recognition method, covering its advantages and its existing problems or shortcomings, reflects on these, and offers some remedies.

9.
Speech perception is an important component of unmanned systems. Most existing work focuses on speech perception by a single agent, whose performance is inherently limited by noise, reverberation, and other factors. It is therefore necessary to study multi-agent speech perception, improving performance through the agents' self-organization and mutual cooperation. Assuming that each agent outputs one channel of speech, this paper proposes a multi-agent self-organizing speech system that exploits all channels to improve perception. Taking speech recognition as an example, it further proposes a channel selection method that can handle large-scale multi-agent speech recognition. A stream-attention mechanism for end-to-end speech recognition based on the Sparsemax operator sets the weights of noisy channels to zero, giving the stream attention channel-selection capability; however, Sparsemax zeroes too many channel weights. This paper proposes a Scaling Sparsemax operator that zeroes only the weights of strongly noisy channels, together with a multi-layer stream-attention structure that effectively reduces computational complexity. In an unmanned-system environment with 30 agents, experiments with a conformer-based recognition system show that, under test conditions with mismatched channel counts, the proposed Scaling Sparsemax reduces the word error rate (WER) by more than 30% on a simulated dataset and by more than 20% on a semi-real dataset, compared with Softmax.
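The Sparsemax operator the abstract builds on is the Euclidean projection onto the probability simplex (Martins & Astudillo, 2016); unlike softmax it can assign exactly zero weight to weak channels. The sketch below implements only this standard operator; the paper's Scaling Sparsemax variant, which zeroes fewer channels, is not reproduced here.

```python
# Hedged sketch: standard Sparsemax, the baseline the paper's Scaling
# Sparsemax modifies. Channel scores are illustrative.
import numpy as np

def sparsemax(z):
    # Projection of z onto the simplex; weak entries get exactly 0.
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]
    k = np.arange(1, len(z) + 1)
    cssv = np.cumsum(z_sorted)
    support = 1 + k * z_sorted > cssv       # support set condition
    k_z = k[support][-1]                    # size of the support
    tau = (cssv[k_z - 1] - 1.0) / k_z       # threshold
    return np.maximum(z - tau, 0.0)

# Two strong channels share the weight; the noisy channel is zeroed.
w = sparsemax([2.0, 1.5, -1.0])
print(w)   # [0.75, 0.25, 0.0]
```

In the paper's stream-attention setting, these weights would gate the per-channel encoder outputs before decoding.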

10.
Emotion recognition is a hot research topic in modern intelligent systems. The technique is pervasively used in autonomous vehicles, remote medical service, and human–computer interaction (HCI). Traditional speech emotion recognition algorithms generalize poorly because they assume that training and testing data come from the same domain and share the same distribution. In practice, however, speech data is acquired from different devices and recording environments, so the data may differ significantly in language, emotional types, and tags. To solve this problem, we propose a bimodal fusion algorithm for speech emotion recognition in which facial expression and speech information are optimally fused. We first combine a CNN and an RNN to achieve facial emotion recognition. Subsequently, we leverage MFCCs to convert the speech signal into images, so that an LSTM and a CNN can recognize speech emotion. Finally, we use a weighted decision fusion method to fuse the facial expression and speech signals. Comprehensive experimental results demonstrate that, compared with uni-modal emotion recognition, bimodal feature-based emotion recognition achieves better performance.
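The final weighted decision-fusion step can be sketched as a convex combination of the two modality posteriors. The probability vectors and the weight below are illustrative stand-ins, not outputs of the paper's models:

```python
# Hedged sketch of weighted decision fusion: combine class posteriors from
# the face branch (CNN+RNN) and the speech branch (MFCC -> LSTM/CNN).
import numpy as np

def fuse(p_face, p_speech, w_face=0.6):
    # Convex combination of the two modality posteriors, renormalized.
    p = w_face * np.asarray(p_face) + (1 - w_face) * np.asarray(p_speech)
    return p / p.sum()

p_face = [0.1, 0.7, 0.2]    # face branch: class 1 most likely
p_speech = [0.2, 0.5, 0.3]  # speech branch agrees, less confidently
p = fuse(p_face, p_speech)
print("fused:", p, "-> class", int(p.argmax()))
```

The weight would normally be tuned on a validation set to balance modality reliability.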

11.
A new semi-serial fusion method for multiple features based on the learning using privileged information (LUPI) model is put forward. The LUPI paradigm improves learning accuracy and its stability through additional information and optimization-based computation. Execution time is also reduced, owing to the sparsity and low dimension of the testing features. The improvement obtained by using multiple feature types for emotion recognition (speech expression recognition) is particularly applicable when only one modality is available but recognition still needs to be improved. The results show that LUPI is effective in the unimodal case when the feature size is considerable. Compared with methods that use one type of feature or concatenate feature types, the new method outperforms the others in recognition accuracy, execution-time reduction, and stability.

12.
To improve emotion recognition accuracy and overcome the limitations of using speech-signal features or surface electromyography (sEMG) features alone, an automatic emotion recognition model integrating both is proposed. The speech and sEMG signals are first preprocessed and their respective features extracted; support vector machines then learn each feature set, building separate emotion classifiers that produce individual recognition results; finally, these results are fed into another support vector machine that determines the weight coefficients of the two feature types, yielding the final emotion recognition result. Simulations on two standard emotional speech databases show that, compared with other emotion recognition models, the proposed model greatly improves recognition accuracy and provides a new research tool for human-computer emotional interaction systems.

13.
To utilize the supra-segmental nature of Mandarin tones, this article proposes a feature extraction method for hidden Markov model (HMM) based tone modeling. The method uses linear transforms to project the F0 (fundamental frequency) features of neighboring syllables as compensations, and adds them to the original F0 features of the current syllable. The transforms are discriminatively trained using an objective function termed "minimum tone error", a smooth approximation of tone recognition accuracy. Experiments show that the new tonal features achieve a 3.82% improvement in tone recognition rate over the baseline, a maximum-likelihood-trained HMM on the normal F0 features. Further experiments show that discriminative HMM training on the new features is 8.78% better than the baseline.
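The compensation step can be written as one line of linear algebra: the current syllable's F0 vector plus linearly transformed neighbor F0 vectors. In the sketch below the transforms are random placeholders; in the paper they are trained with the minimum-tone-error criterion.

```python
# Hedged sketch of neighbor-F0 compensation for tonal features.
# A_prev / A_next stand in for the discriminatively trained transforms.
import numpy as np

rng = np.random.default_rng(3)
d = 5                                  # F0 feature dimension per syllable
A_prev = rng.normal(0, 0.1, (d, d))    # placeholder learned transform
A_next = rng.normal(0, 0.1, (d, d))    # placeholder learned transform

def compensate(f0_prev, f0_cur, f0_next):
    # New tonal feature = original F0 + projected neighbor F0 contours.
    return f0_cur + A_prev @ f0_prev + A_next @ f0_next

f0 = [rng.normal(200, 20, d) for _ in range(3)]   # three syllables' F0
feat = compensate(*f0)
print(feat.shape)
```

The compensated features then replace the plain F0 features as HMM observations.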

14.
Speech Emotion Recognition Combining Voice Quality and Prosodic Features   (Cited by: 3; self-citations: 0; citations by others: 3)
To improve the recognition rate of speech emotion, the extraction of voice quality features is proposed in addition to the prosodic features of emotional speech. Combining voice quality parameters with prosodic parameters, a support vector machine classifier recognizes four main emotions in Mandarin speech: anger, happiness, sadness, and surprise. Experimental results show that combining the two feature sets achieves an average recognition rate of 88.1%, 6% higher than using prosodic parameters alone. Voice quality features are thus an effective class of emotional feature parameters.

15.
Speech recognition is one of the key technologies, and one of the difficult problems, of human-machine speech communication. This paper introduces a speech recognition system, describing its speech processing pipeline, its use of Mel-frequency cepstral parameters for feature extraction, and its likelihood estimation technique based on hidden Markov model algorithms. Rigorous testing shows that the system meets practical requirements. It achieves convenient and fast Chinese-character speech input on resource-limited mobile electronic devices, which is of significant practical value.
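The MFCC front end mentioned above can be reduced to a few core steps: framing, FFT power spectrum, mel filterbank, log, and DCT. The sketch below uses typical parameter choices, not values from the paper:

```python
# Hedged sketch of a minimal MFCC front end (no deltas, no overlap).
import numpy as np

def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=8000, n_fft=256, n_mels=20, n_ceps=12):
    # Frame the signal (no overlap, for brevity) and window it.
    n_frames = len(signal) // n_fft
    frames = signal[: n_frames * n_fft].reshape(n_frames, n_fft)
    frames = frames * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2

    # Triangular mel filterbank between 0 Hz and Nyquist.
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_mel = np.log(power @ fbank.T + 1e-10)

    # DCT-II decorrelates log filterbank energies into cepstra.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_mels)))
    return log_mel @ dct.T

sig = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)  # 1 s of 440 Hz tone
feats = mfcc(sig)
print(feats.shape)   # (frames, 12) cepstral vectors
```

Each row would then be fed as one observation vector to the HMM recognizer.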

16.
In speech emotion recognition, the extraction and selection of emotional feature information and the choice of emotion recognition model are two important components. Acoustic feature parameters and auditory feature parameters of the speech signal are combined for emotion recognition; the optimal feature set is selected according to the differences between each pair of emotions, and a neural-network-based emotion cross-recognition scheme is designed. Combined with the auditory feature parameters, the classifier achieves an average recognition rate of 92%.

17.
As a key human-machine interface technology within information technology, speech recognition has important research significance and broad application value. This paper reviews the history of speech recognition technology, explains basic knowledge such as the concept of speech recognition, its basic principles, and acoustic modeling methods, and briefly introduces applications of speech recognition technology in various fields.

18.
Speech emotion recognition using modified quadratic discrimination function   (Cited by: 1; self-citations: 0; citations by others: 1)
The Quadratic Discrimination Function (QDF) is commonly used in speech emotion recognition, on the premise that the input data is normally distributed. In this paper, we propose a transformation to normalize the emotional features and then derive a Modified QDF (MQDF) for speech emotion recognition. Features based on prosody and voice quality are extracted, and a Principal Component Analysis Neural Network (PCANN) is used to reduce the dimension of the feature vectors. The results show that voice quality features are an effective supplement for recognition, and that the method in this paper can improve the recognition ratio effectively.
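The baseline QDF that the paper modifies is the standard per-class Gaussian discriminant: each class gets its own mean and covariance, and a sample goes to the class with the largest discriminant score. A hedged sketch on synthetic data:

```python
# Hedged sketch of the plain QDF baseline (not the paper's MQDF):
# g_c(x) = -0.5 ln|Sigma_c| - 0.5 (x-mu_c)^T Sigma_c^{-1} (x-mu_c) + ln P(c)
import numpy as np

def qdf_fit(X, y):
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(0), np.cov(Xc.T), len(Xc) / len(X))
    return params

def qdf_score(x, mu, cov, prior):
    d = x - mu
    return (-0.5 * np.log(np.linalg.det(cov))
            - 0.5 * d @ np.linalg.solve(cov, d) + np.log(prior))

def qdf_predict(params, x):
    # Class with the largest discriminant score wins.
    return max(params, key=lambda c: qdf_score(x, *params[c]))

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 2, (50, 2))])
y = np.repeat([0, 1], 50)
params = qdf_fit(X, y)
print(qdf_predict(params, np.array([0.2, -0.1])))   # -> 0
print(qdf_predict(params, np.array([5.5, 4.8])))    # -> 1
```

The paper's contribution is a normalization transform applied before this step so the Gaussian assumption holds better.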

19.
路翀  刘晓东  刘万泉 《电子设计工程》2011,19(21):186-188,192
Compressed sensing (CS) methods must convert an image matrix into a vector before feature extraction, leading to very high data dimensionality and heavy computation. To address this, a face recognition method combining the two-dimensional discrete cosine transform (2DDCT) with compressed sensing is proposed. The new method first uses the 2DDCT to transform images into the frequency domain, compressing face images by discarding the mid- and high-frequency components to which the human eye is insensitive, which effectively reduces feature dimensionality and computation; features for face recognition are then extracted by a sensing algorithm, and a nearest-neighbor classifier completes recognition. Experimental results on the ORL, Yale, and FERET face databases demonstrate the algorithm's effectiveness and robustness, with especially good results obtained on the YaleB face database.
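The 2DDCT front end can be sketched as keeping a low-frequency block of DCT coefficients as the face feature and classifying with nearest neighbor. The compressed-sensing measurement step is omitted here, and random images stand in for ORL/Yale faces:

```python
# Hedged sketch: 2D-DCT low-frequency features + nearest-neighbor matching.
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix.
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    C = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    C[0] /= np.sqrt(2.0)
    return C

def dct2_features(img, keep=8):
    C = dct_matrix(img.shape[0])           # assumes a square image
    coeffs = C @ img @ C.T                 # 2-D DCT
    return coeffs[:keep, :keep].ravel()    # low-frequency block only

rng = np.random.default_rng(5)
gallery = {i: rng.random((32, 32)) for i in range(5)}   # 5 "identities"
feats = {i: dct2_features(im) for i, im in gallery.items()}

probe = gallery[3] + rng.normal(0, 0.05, (32, 32))      # noisy copy of id 3
pf = dct2_features(probe)
pred = min(feats, key=lambda i: np.linalg.norm(feats[i] - pf))
print("nearest identity:", pred)
```

Discarding all but the top-left 8x8 block reduces a 1024-dimensional image to 64 features, which is the dimensionality saving the abstract points to.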

20.
Most emotion recognition research focuses on the information of a single person. However, people's emotions affect one another; for example, when a teacher is angry, students' nervousness increases. Yet the facial expression information of even a single person is already quite large, which means group emotion recognition will encounter a huge traffic bottleneck. Therefore, a vast amount of data collected by end-devices must be uploaded to an emotion cloud for big-data analysis. Because different emotions may require different analytical methods, when facing diverse big data, connecting different emotion clouds is a viable alternative to extending the emotion cloud hardware. In this paper, we built a software-defined networking (SDN) multi-emotion-cloud platform to connect different emotion clouds. By exploiting the separation of the control plane and the data plane, the routing path can be changed in software. This means the individual conditions of different students can be handled by a dedicated system via service functions (SF). The load-balancing effect between different emotion clouds is achieved by optimizing the service function chain (SFC). In addition, we propose an SFC-based dynamic load-balancing mechanism that eliminates a large number of SFC creation processes. The simulation results show that the proposed mechanism can effectively allocate resources to different emotion clouds to achieve real-time emotion recognition. This is the first strategy to use SFCs to balance emotion data, so that teachers can change teaching policy in a timely manner in response to students' emotions.
