Found 20 similar documents (search time: 15 ms)
1.
2.
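This paper introduces the origins and main research topics of speech emotion recognition and summarizes the state of research in China and abroad. It analyzes in detail the extraction of emotional speech features and the modeling algorithms used for emotion classifiers, and closes with an outlook on future directions for emotion recognition.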
3.
Research Progress in Speech Emotion Recognition (total citations: 11; self-citations: 0; citations by others: 11)
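Emotion plays an important role in human perception, decision-making, and related processes. Research on emotional intelligence was long confined to psychology and cognitive science. In recent years, with the development of artificial intelligence, the combination of emotional intelligence and computer technology has given rise to the research field of affective computing, which is expected to advance computer technology considerably. Automatic emotion recognition is the first step toward affective computing. Speech, the most important medium of human communication, carries rich emotional information, and the problem of automatically recognizing a speaker's emotional state from speech has recently attracted wide attention from researchers in many fields. This paper surveys current progress on several key issues in speech emotion recognition, including emotion theory and emotion categorization, emotional speech databases, emotional features in speech, and recognition algorithms, and discusses several key problems for future research.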
4.
5.
6.
7.
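This paper proposes an emotional speech recognition method based on the least squares support vector machine (LS-SVM). Parameters such as fundamental frequency, energy, and speaking rate are first extracted from the speech signals as emotional features; LS-SVM is then used to build models of the corresponding emotional speech signals and to perform recognition. Experimental results show that LS-SVM achieves a relatively high recognition rate for basic emotions.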
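As a rough illustration of the classifier named in this abstract, the sketch below trains a binary LS-SVM in its standard textbook form, where training reduces to solving one linear system instead of a quadratic program. It is not the paper's implementation: the two-dimensional toy features (standing in for normalized pitch and energy), the RBF kernel, and the values of `gamma` and `sigma` are all assumptions made for the example.

```python
import math

def rbf_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    d2 = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1],
    with Omega_ij = y_i * y_j * K(x_i, x_j). Labels must be +1 or -1."""
    n = len(X)
    A = [[0.0] + [float(v) for v in y]]
    for i in range(n):
        row = [float(y[i])]
        for j in range(n):
            omega = y[i] * y[j] * rbf_kernel(X[i], X[j], sigma)
            row.append(omega + (1.0 / gamma if i == j else 0.0))
        A.append(row)
    sol = solve_linear(A, [0.0] + [1.0] * n)
    return sol[0], sol[1:]  # bias b, multipliers alpha

def lssvm_predict(x, X, y, b, alpha, sigma=1.0):
    """Sign of b + sum_i alpha_i * y_i * K(x, x_i)."""
    score = b + sum(a * yi * rbf_kernel(x, xi, sigma)
                    for a, yi, xi in zip(alpha, y, X))
    return 1 if score >= 0 else -1

# Hypothetical toy data: [normalized pitch, normalized energy];
# +1 = high-arousal emotion, -1 = low-arousal emotion.
X_train = [[0.9, 0.8], [1.0, 0.9], [0.8, 1.0],
           [0.1, 0.2], [0.2, 0.1], [0.0, 0.3]]
y_train = [1, 1, 1, -1, -1, -1]

b, alpha = lssvm_train(X_train, y_train)
pred_high = lssvm_predict([0.95, 0.85], X_train, y_train, b, alpha)
pred_low = lssvm_predict([0.10, 0.10], X_train, y_train, b, alpha)
```

Recognizing several emotions would need an extension such as one-vs-rest voting over binary models; the abstract does not say which scheme the authors used.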
8.
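Practical research shows that there are many methods for speech emotion recognition. This paper introduces a method based on the Gaussian mixture model (GMM), covering its advantages and its existing problems or shortcomings, reflects on them, and offers some ways of handling them.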
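For orientation, GMM-based classifiers are commonly written as follows; this is the generic textbook form, not necessarily the exact variant the paper discusses. Each emotion class $k$ is assumed to have its own mixture $\lambda_k$ with $M$ components, and an utterance with feature frames $\mathbf{x}_1,\dots,\mathbf{x}_T$ is assigned to the class with the highest total log-likelihood. The symbols $M$, $T$, and $\lambda_k$ are notation introduced here.

```latex
\begin{align}
p(\mathbf{x}\mid\lambda_k) &= \sum_{m=1}^{M} w_{k,m}\,
  \mathcal{N}\!\left(\mathbf{x};\,\boldsymbol{\mu}_{k,m},\,\boldsymbol{\Sigma}_{k,m}\right),
  \qquad \sum_{m=1}^{M} w_{k,m} = 1, \\
\hat{k} &= \arg\max_{k}\; \sum_{t=1}^{T} \log p(\mathbf{x}_t \mid \lambda_k).
\end{align}
```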
9.
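Speech perception is an important component of unmanned systems. Most existing work focuses on speech perception by a single agent, whose performance is capped by factors such as noise and reverberation. It is therefore necessary to study multi-agent speech perception, in which agents self-organize and cooperate to improve perception performance. Assuming that each agent outputs one channel of speech stream, this paper proposes a multi-agent self-organizing speech system that aims to exploit all channels jointly; taking speech recognition as an example, it further proposes a channel selection method that can handle large-scale multi-agent speech recognition. A stream attention mechanism for end-to-end speech recognition based on the Sparsemax operator sets the weights of noisy channels to zero, which gives stream attention a channel selection capability, but Sparsemax zeroes the weights of too many channels. This paper proposes the Scaling Sparsemax operator, which zeroes only the weights of strongly noisy channels, together with a multi-layer stream attention structure that effectively reduces computational complexity. In an unmanned-system environment with 30 agents, experiments with a Conformer-based recognition system show that, under test conditions with a mismatched number of channels, the proposed Scaling Sparsemax reduces the word error rate (WER) by more than 30% relative to Softmax on a simulated dataset and by more than 20% on a semi-real dataset.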
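To make the channel-selection idea concrete, the sketch below compares Softmax with the standard Sparsemax operator (the Euclidean projection of the scores onto the probability simplex). It shows only why Sparsemax can set channel weights exactly to zero; the paper's Scaling Sparsemax is not reproduced because the abstract does not define it. The channel scores are hypothetical.

```python
import math

def softmax(z):
    """Dense attention weights: every channel gets a strictly positive weight."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def sparsemax(z):
    """Sparse attention weights: project z onto the probability simplex.

    Finds the threshold tau such that sum(max(z_i - tau, 0)) == 1;
    channels whose score is at or below tau get weight exactly 0.
    """
    z_sorted = sorted(z, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, v in enumerate(z_sorted, start=1):
        cumsum += v
        # The support condition holds for a prefix of the sorted scores,
        # so the last k that satisfies it determines tau.
        if 1.0 + k * v > cumsum:
            tau = (cumsum - 1.0) / k
    return [max(v - tau, 0.0) for v in z]

# Hypothetical attention scores for four channels (higher = cleaner channel).
scores = [2.0, 1.5, 0.1, -1.0]
dense = softmax(scores)
sparse = sparsemax(scores)
```

With these scores, Sparsemax keeps only the two highest-scoring channels and zeroes the rest, whereas Softmax spreads some weight over all four.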
10.
Emotion recognition is an active research topic in modern intelligent systems. The technique is widely used in autonomous vehicles, remote medical services, and human-computer interaction (HCI). Traditional speech emotion recognition algorithms generalize poorly because they assume that training and testing data come from the same domain and share the same data distribution. In practice, however, speech data is acquired from different devices and recording environments, so the data may differ significantly in language, emotion types, and labels. To address this problem, we propose a bimodal fusion algorithm for speech emotion recognition in which facial expression and speech information are optimally fused. We first combine a CNN and an RNN to perform facial emotion recognition. We then use MFCCs to convert the speech signal into images, so that an LSTM and a CNN can be used to recognize speech emotion. Finally, we apply a weighted decision fusion method to fuse the facial expression and speech results. Comprehensive experimental results demonstrate that bimodal feature-based emotion recognition outperforms unimodal emotion recognition.
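A weighted decision fusion step of the kind this abstract mentions can be sketched as a convex combination of the per-class probabilities produced by the two classifiers. The abstract does not give the weights or the label set, so the weight `w_face`, the labels, and the probability vectors below are hypothetical.

```python
def fuse_decisions(p_face, p_speech, w_face=0.5):
    """Combine two per-class probability vectors with a convex weight.

    w_face is the weight given to the facial-expression classifier;
    the speech classifier receives 1 - w_face.
    """
    if not 0.0 <= w_face <= 1.0:
        raise ValueError("w_face must lie in [0, 1]")
    if len(p_face) != len(p_speech):
        raise ValueError("both classifiers must score the same classes")
    w_speech = 1.0 - w_face
    return [w_face * f + w_speech * s for f, s in zip(p_face, p_speech)]

# Hypothetical label set and classifier outputs for one sample.
LABELS = ["angry", "happy", "neutral", "sad"]
p_face = [0.10, 0.60, 0.20, 0.10]    # e.g. output of the face model
p_speech = [0.40, 0.30, 0.20, 0.10]  # e.g. output of the speech model

fused = fuse_decisions(p_face, p_speech, w_face=0.6)
predicted = LABELS[max(range(len(fused)), key=fused.__getitem__)]
```

In practice the weight would typically be tuned on held-out data rather than fixed by hand.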
11.
A new semi-serial method for fusing multiple features, based on the learning using privileged information (LUPI) model, is put forward. Exploiting the LUPI paradigm improves learning accuracy and stability through additional information and computation using optimization methods. Execution time is also reduced, owing to the sparsity and dimension of the testing features. The improvements obtained by using multiple feature types for emotion recognition (speech expression recognition) are particularly relevant when only one modality is available but recognition still needs to be improved. The results show that LUPI is effective in the unimodal case when the feature size is considerable. Compared with methods that use a single feature type or combine features by concatenation, the new method performs better in recognition accuracy, execution time reduction, and stability.
12.
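To improve emotion recognition accuracy and to address the limitations of using speech signal features or surface electromyography (sEMG) signal features alone, this paper proposes an automatic emotion recognition model that integrates both kinds of features. The speech and sEMG signals are first preprocessed, and the relevant features are extracted from each. Support vector machines are then trained on the speech features and on the sEMG features to build the corresponding emotion classifiers and obtain their recognition results. Finally, these results are fed into a further support vector machine that determines the weight coefficients of the two feature types, yielding the final recognition result. Simulation results on two standard emotional speech databases show that, compared with other emotion recognition models, the proposed model substantially improves recognition accuracy and provides a new research tool for emotion recognition systems in human-computer interaction.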
13.
To exploit the supra-segmental nature of Mandarin tones, this article proposes a feature extraction method for hidden Markov model (HMM) based tone modeling. The method uses linear transforms to project the F0 (fundamental frequency) features of neighboring syllables as compensations and adds them to the original F0 features of the current syllable. The transforms are discriminatively trained using an objective function termed "minimum tone error", which is a smooth approximation of tone recognition accuracy. Experiments show that the new tonal features improve the tone recognition rate by 3.82% over the baseline, which uses maximum likelihood trained HMMs on the normal F0 features. Further experiments show that discriminative HMM training on the new features is 8.78% better than the baseline.
14.
15.
16.
17.
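Speech recognition, a key human-machine interface technology within information technology, is of considerable research significance and has broad application value. This paper reviews the history of speech recognition technology, explains its basic concepts, underlying principles, and acoustic modeling methods, and briefly describes its applications in various fields.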
18.
The Quadratic Discrimination Function (QDF) is commonly used in speech emotion recognition and rests on the premise that the input data is normally distributed. In this paper, we propose a transformation to normalize the emotional features and then derive a Modified QDF (MQDF) for speech emotion recognition. Features based on prosody and voice quality are extracted, and a Principal Component Analysis Neural Network (PCANN) is used to reduce the dimension of the feature vectors. The results show that voice quality features are an effective supplement for recognition and that the proposed method improves the recognition rate effectively.
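For reference, the quadratic discriminant function under the Gaussian assumption mentioned above is usually written as below. This is the standard textbook form for class $\omega_i$ with mean $\boldsymbol{\mu}_i$, covariance $\boldsymbol{\Sigma}_i$, and prior $P(\omega_i)$; the paper's modified version (MQDF) is not specified in the abstract and is not reproduced here.

```latex
g_i(\mathbf{x}) = -\tfrac{1}{2}\,(\mathbf{x}-\boldsymbol{\mu}_i)^{\top}
  \boldsymbol{\Sigma}_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i)
  - \tfrac{1}{2}\ln\lvert\boldsymbol{\Sigma}_i\rvert + \ln P(\omega_i),
\qquad
\hat{\imath} = \arg\max_i\, g_i(\mathbf{x}).
```

Because each $g_i$ is the log-posterior of a Gaussian class model up to a constant, the classifier is optimal only when the features are in fact normally distributed, which is why a normalizing transformation can help.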
19.
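Compressed sensing (CS) methods must convert the image matrix into a vector before feature extraction, which leads to very high data dimensionality and heavy computation. To address these drawbacks, this paper proposes a face recognition method that combines the two-dimensional discrete cosine transform (2DDCT) with compressed sensing. The method first uses the 2DDCT to transform the image into the frequency domain and compresses the face image by discarding the mid- and high-frequency components to which the human eye is insensitive, which effectively lowers the dimension of the required features and reduces computation. Features for face recognition are then extracted with the sensing algorithm, and a nearest neighbor classifier finally performs the recognition. Experimental results on the ORL, Yale, and Feret face databases demonstrate the effectiveness and robustness of the algorithm; particularly good results were obtained on the YaleB face database.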
20.
Most research on emotion recognition focuses on the information of a single person. However, people's emotions affect one another. For example, when a teacher is angry, the students' nervousness increases. The facial expression information of even a single person is already quite large, which means that group emotion recognition will encounter a severe traffic bottleneck. A vast amount of data collected by end devices will therefore be uploaded to the emotion cloud for big data analysis. Because different emotions may require different analytical methods, connecting different emotion clouds is a viable alternative to extending the emotion cloud hardware when dealing with diverse big data. In this paper, we build a software defined networking (SDN) multi-emotion cloud platform to connect different emotion clouds. Because SDN splits the control plane from the data plane, the routing path can be changed in software. This means that the individual conditions of different students can be handled by a dedicated system via a service function (SF). Load balancing between different emotion clouds is achieved by optimizing service function chaining (SFC). In addition, we propose an SFC-based dynamic load balancing mechanism that eliminates a large number of SFC creation processes. The simulation results show that the proposed mechanism can effectively allocate resources to different emotion clouds to achieve real-time emotion recognition. This is the first strategy to use SFC to balance emotion data so that teachers can adjust their teaching policy in a timely manner in response to students' emotions.