1.

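孟建庭 吴及 王作英 《电声技术》2004,(8):51-53

This paper studies the architecture of distributed speech recognition (DSR) systems, and implements and tests a client/server DSR system. The system uses a recognition algorithm based on the duration-distribution-based hidden Markov model (DDBHMM), adopts a multi-server, multi-client architecture, and allocates recognition resources through load balancing, achieving stable and efficient performance.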
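
The abstract does not say which load-balancing policy the system uses. As a minimal sketch of one common policy, assuming each request goes to the recognition server with the fewest active jobs, the dispatcher below is illustrative only; the class name `LeastLoadedDispatcher` and the server names are hypothetical.

```python
class LeastLoadedDispatcher:
    """Route each recognition request to the server with the fewest active jobs.

    Hypothetical illustration: the policy used in the cited system is not
    described in the abstract.
    """

    def __init__(self, server_names):
        # Every server starts with zero active recognition jobs.
        self.active = {name: 0 for name in server_names}

    def assign(self):
        """Pick the least-loaded server (ties go to the earliest listed)."""
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        """Record that one job on `name` has finished."""
        if self.active[name] == 0:
            raise ValueError(f"server {name!r} has no active jobs")
        self.active[name] -= 1
```

A client-facing front end would call `assign()` when an utterance arrives and `release()` when the recognition result is returned.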

2.

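Research on Speaker Recognition Based on MVQM

谢建平 成新民 赵力 《电声技术》2006,(2):41-43

A new speaker recognition method is proposed. It combines the advantages of VQ and GMM: by replacing the output probability function of the conventional GMM with a VQ distortion measure, it reduces the amount of training data required for modeling and increases recognition speed. Experimental results demonstrate the effectiveness of the method.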
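
To make the idea of scoring with a VQ distortion measure concrete, here is a minimal sketch of plain VQ-based speaker identification. It is not the MVQM method itself, whose details the abstract does not give. The function names, and the assumption that each speaker is represented by a small codebook of feature vectors, are illustrative.

```python
def avg_vq_distortion(frames, codebook):
    """Mean squared Euclidean distance from each frame to its nearest codeword."""
    total = 0.0
    for frame in frames:
        total += min(
            sum((x - c) ** 2 for x, c in zip(frame, codeword))
            for codeword in codebook
        )
    return total / len(frames)


def identify_speaker(frames, codebooks):
    """Return the speaker whose codebook gives the lowest average distortion.

    `codebooks` maps a speaker label to a list of codewords (assumed to have
    been trained beforehand, e.g. by k-means; training is not shown here).
    """
    return min(codebooks, key=lambda spk: avg_vq_distortion(frames, codebooks[spk]))
```

Lower distortion plays the role that a higher likelihood plays in a GMM system, which avoids evaluating Gaussian densities at test time.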

3.

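Standards and Key Technologies for Distributed Speech Recognition

梁钊 《电声技术》2004,(12):47-50,53

Distributed speech recognition (DSR) is a new technology that has emerged in recent years and has broad application prospects. Drawing on the latest ETSI standards for DSR, this paper describes the components of a DSR system and analyzes its main technologies, such as front-end feature extraction algorithms, feature compression with error detection and correction, and server-side speech reconstruction algorithms. It concludes with a brief outlook on applications of DSR technology.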

4.

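Front-End Processing and Related Standards for Distributed Speech Recognition (Total citations: 1; self-citations: 1; citations by others: 0)

王艳琴 梁钊 蒙山 《电声技术》2002,(5):4-7

In practical applications, speech recognition is constrained by factors such as channel noise and the limited computing and storage capacity of portable terminals. Distributed speech recognition (DSR) not only addresses these problems but also offers advantages such as low bandwidth usage and low overall cost. Its application, however, presupposes that the extracted parameters are standardized. This paper introduces the basic structure of DSR front-end processing and the related standards.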

5.

EMOTIONAL SPEECH RECOGNITION BASED ON SVM WITH GMM SUPERVECTOR

Chen Yanxiang Xie Jian 《电子科学学刊(英文版)》2012,29(3):339-344

6.

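Research on Speaker Recognition Based on a Modified EM Algorithm

成新民 沈律 赵力 邹采荣 《电声技术》2004,(12):51-53

A new method for training GMMs for speaker recognition is proposed. Theoretical derivation and experimental results show that, compared with the conventional EM algorithm commonly used for GMMs, the proposed algorithm resolves the problem of singular matrices arising during training and improves the recognition rate of the system.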

7.

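A Feature Extraction Method Based on Gaussian Filters and the Fisher Criterion (Total citations: 1; self-citations: 0; citations by others: 1)

李晋徽 杨俊安 项要杰 《电路与系统学报》2013,18(2):400-404

To address the shortcomings of Mel cepstral coefficients and inverted Mel cepstral coefficients in language identification, a new Mel cepstral coefficient extraction method is proposed in which Gaussian filters replace the triangular filters, solving the problem of weak correlation between adjacent filters in conventional Mel cepstral coefficient extraction. The Fisher criterion is then used to construct optimal hybrid feature parameters, and Gaussian mixture models are used to perform language identification with the different hybrid features. Experimental results show that the hybrid feature parameters built from the improved Mel cepstral coefficients, based on Gaussian filters and the Fisher criterion, achieve high identification accuracy as language identification features.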

8.

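Bird Sound Recognition Using Anti-Noise Power-Normalized Cepstral Coefficients (Total citations: 3; self-citations: 0; citations by others: 3)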

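颜鑫 李应 《电子学报》2013,41(2):295-300

To address bird sound recognition under various background noises in real environments, a bird sound recognition technique based on a new noise-robust feature extraction is proposed. First, the noise power spectrum is obtained with a noise estimation algorithm suited to highly non-stationary environments. Second, multi-band spectral subtraction is applied to denoise the sound power spectrum. Next, anti-noise power-normalized cepstral coefficients (APNCC) are extracted from the denoised power spectrum. Finally, a support vector machine (SVM) is used in comparative experiments on 34 bird species under different environments and signal-to-noise ratios, using the extracted APNCC, power-normalized cepstral coefficients (PNCC), and Mel-frequency cepstral coefficients (MFCC). The experiments show that the extracted APNCC gives better average recognition performance and stronger noise robustness, and is better suited to bird sound recognition in environments with a signal-to-noise ratio below 30 dB.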
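
The denoising step can be illustrated with a simplified form of multi-band spectral subtraction on a single frame's power spectrum. This is a sketch under stated assumptions, not the paper's exact procedure: the fixed per-band over-subtraction factors `alphas`, the `band_edges` layout, and the spectral `floor` are all illustrative parameters (practical systems usually derive the factors from each band's estimated SNR).

```python
def multiband_spectral_subtraction(noisy_power, noise_power, band_edges, alphas, floor=0.01):
    """Subtract a scaled noise estimate from each frequency band of one frame.

    noisy_power, noise_power: per-bin power values (same length).
    band_edges: bin indices delimiting the bands, e.g. [0, 4, 8] for two bands.
    alphas: one over-subtraction factor per band (assumed given).
    floor: fraction of the noisy power kept as a lower bound, so that the
           result never becomes negative.
    """
    cleaned = list(noisy_power)
    for band, alpha in enumerate(alphas):
        for k in range(band_edges[band], band_edges[band + 1]):
            diff = noisy_power[k] - alpha * noise_power[k]
            cleaned[k] = max(diff, floor * noisy_power[k])
    return cleaned
```

The floor keeps bins where the noise estimate exceeds the observed power from going negative, which would otherwise break the later logarithm or power-law step of cepstral feature extraction.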

9.

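A Speech Recognition Method Based on Wavelet Networks and HMM

刘维亭 朱志宇 《电声技术》2004,(11):56-59

A hybrid speech recognition model is built by combining the dynamic time-series modeling capability of the hidden Markov model (HMM) with the pattern classification capability of neural networks. Because speech signals are non-stationary, wavelet analysis is used to extract speech feature vectors. A time normalization method converts all variable-length speech feature vectors into feature vectors of the same dimension, which simplifies the structure of the neural network. Simulation results show that the hybrid model together with time normalization not only improves the recognition rate but also greatly shortens training time, yielding good recognition performance.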
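
The abstract does not specify how time normalization is done. One simple way to map a variable-length sequence of feature frames onto a fixed number of frames is linear interpolation along the time axis, sketched below as an assumption rather than the authors' method; the function name `time_normalize` is hypothetical.

```python
def time_normalize(frames, target_len):
    """Resample a sequence of feature frames to exactly `target_len` frames.

    Uses linear interpolation between neighbouring frames, so an utterance of
    any length yields a fixed-size input for a neural network.
    """
    n = len(frames)
    if n == 0 or target_len <= 0:
        raise ValueError("need at least one frame and a positive target length")
    out = []
    for t in range(target_len):
        # Fractional position of output frame t on the input time axis.
        pos = t * (n - 1) / (target_len - 1) if target_len > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out
```

Concatenating the `target_len` output frames gives a vector whose dimension no longer depends on the utterance duration.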

10.

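A Parallel Hidden Markov Model Suited to Speaker-Independent Speech Recognition (Total citations: 2; self-citations: 0; citations by others: 2)

陈雁翔 戴蓓蒨 周曦 刘鸣 《电子与信息学报》2004,26(10):1601-1606

To suit speaker-independent speech recognition, a parallel HMM (Parallel Hidden Markov Model, PHMM) composed of multiple parallel Markov chains is proposed. It fuses the templates built for the individual classes in classification-based speech recognition and thereby improves recognition performance. Crossings are allowed between chains, so the fused templates share states. In addition, the PHMM performs clustering automatically during training, and the output for a test utterance is drawn from all classes, so no cluster analysis or class decision is needed. All of this reduces storage and computation. Experiments on speaker-independent recognition of isolated Mandarin digits show that the PHMM improves both recognition performance and noise robustness compared with the conventional CHMM.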

11.

A Study of a Polynomial-Fitting Speech Trajectory Model for Mandarin Continuous Speech Recognition

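欧智坚 王作英 《电子学报》2003,31(4):608-611

Although the HMM is currently the most popular speech recognition model, its assumption that state outputs are independent and identically distributed means that it neglects the dynamic characteristics of speech trajectories. Building on a more flexible statistical framework for describing speech, the generalized DDBHMM, this paper proposes a concrete polynomial-fitting speech trajectory model together with new training and recognition algorithms, which better characterizes real speech. An effective pruning algorithm is also given, yielding a practical model. Experiments on Mandarin large-vocabulary speaker-independent continuous speech recognition show that the pruned polynomial-fitting speech trajectory model markedly improves recognition performance at a small computational cost.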

12.

Statistical Model‐Based Noise Reduction Approach for Car Interior Applications to Speech Recognition

Sung Joo Lee Byung Ok Kang Ho‐Young Jung Yunkeun Lee Hyung Soon Kim 《ETRI Journal》2010,32(5):801-809

This paper presents a statistical model‐based noise suppression approach for voice recognition in a car environment. In order to alleviate the spectral whitening and signal distortion problems in the traditional decision‐directed Wiener filter, we combine a decision‐directed method with an original spectrum reconstruction method and develop a new two‐stage noise reduction filter estimation scheme. When a tradeoff between performance and computational efficiency on resource‐constrained automotive devices is considered, the ETSI standard advanced distributed speech recognition front‐end (ETSI‐AFE) can be an effective solution, and ETSI‐AFE is also based on the decision‐directed Wiener filter. Thus, a series of voice recognition and computational complexity tests are conducted by comparing the proposed approach with ETSI‐AFE. The experimental results show that the proposed approach is superior to the conventional method in terms of speech recognition accuracy, while the computational cost and frame latency are significantly reduced.
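
For context, the traditional decision‐directed Wiener filter that the abstract takes as its baseline can be sketched for a single frequency bin as follows. This is the textbook recursion, not the proposed two‐stage scheme and not the ETSI‐AFE implementation; the smoothing constant `alpha` and the assumption of a known, constant noise power are illustrative.

```python
def decision_directed_wiener(noisy_power_frames, noise_power, alpha=0.98):
    """Per-frame Wiener gains for one frequency bin.

    noisy_power_frames: |Y|^2 for successive frames of the same bin.
    noise_power: noise power estimate for that bin (assumed known and fixed).
    alpha: weight on the previous frame's clean-speech estimate.
    """
    gains = []
    prev_clean_power = 0.0
    for y_pow in noisy_power_frames:
        gamma = y_pow / noise_power  # a posteriori SNR
        # Decision-directed a priori SNR: blend the previous clean-speech
        # estimate with the current instantaneous estimate max(gamma - 1, 0).
        xi = alpha * prev_clean_power / noise_power + (1 - alpha) * max(gamma - 1.0, 0.0)
        gain = xi / (1.0 + xi)  # Wiener gain
        gains.append(gain)
        prev_clean_power = (gain ** 2) * y_pow  # estimated clean power of this frame
    return gains
```

Because each frame's a priori SNR leans heavily on the previous frame, the gain lags behind speech onsets, which is one source of the distortion that two‐stage schemes aim to reduce.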

13.

Subspace Distribution Clustering HMM for Chinese Digit Speech Recognition

QIN Wei WEI Gang 《中国电子科技》2006,4(1):43-46

As a statistical method, the Hidden Markov Model (HMM) is widely used for speech recognition. To train HMMs more effectively with much less data, the Subspace Distribution Clustering Hidden Markov Model (SDCHMM), derived from the Continuous Density Hidden Markov Model (CDHMM), is introduced. A new method for training SDCHMMs with parameter tying is described. Compared with the conventional training method, an SDCHMM recognizer trained with the new method achieves higher accuracy and speed. Experimental results show that the SDCHMM recognizer outperforms the CDHMM recognizer on Chinese digit speech recognition.

14.

Speaker Recognition Based on an Adaptive Niche Hybrid Genetic Algorithm (Total citations: 4; self-citations: 0; citations by others: 4)

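林琳 王树勋 《电子学报》2007,35(1):8-12

To address the sensitivity of the conventional Gaussian mixture model (GMM) to initial values, which makes it very likely to converge to locally optimal parameters in actual training, this paper proposes a new method for optimizing GMM parameters. The niche technique and maximum likelihood estimation are incorporated into the genetic training process to form a new hybrid algorithm that alleviates the premature convergence of the genetic algorithm and improves the algorithm's local search ability. An adaptive strategy controls the crossover and mutation operators, and discriminative information about other users is incorporated into the fitness evaluation, which improves the classification accuracy of the model and strengthens the generalization ability of the GMM. Experiments show that, compared with both the conventional method and an improved method, the proposed method obtains better model parameters and further raises the recognition rate of the system.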

15.

STATISTICAL FEATURE OF PITCH FREQUENCY DISTRIBUTIONS FOR ROBUST SPEAKER IDENTIFICATION

Zhang Linghua Zheng Baoyu Yang Zhen 《电子科学学刊(英文版)》2005,(4)

This letter proposes an effective and robust speech feature extraction method based on statistical analysis of Pitch Frequency Distributions (PFD) for speaker identification. Compared with the conventional cepstrum, PFD is relatively insensitive to Additive White Gaussian Noise (AWGN), but it does not perform well for speaker identification, even in clean environments. To compensate for this shortcoming, PFD and the conventional cepstrum are combined to make the final decision, instead of relying on only one kind of feature. Experimental results indicate that the hybrid approach can give an outstanding improvement for text-independent speaker identification in noisy environments corrupted by AWGN.

16.

Statistical feature of pitch frequency distributions for robust speaker identification

Zhang Linghua Zheng Baoyu Yang Zhen 《电子科学学刊(英文版)》2005,22(4):437-442

This letter proposes an effective and robust speech feature extraction method based on statistical analysis of Pitch Frequency Distributions (PFD) for speaker identification. Compared with the conventional cepstrum, PFD is relatively insensitive to Additive White Gaussian Noise (AWGN), but it does not perform well for speaker identification, even in clean environments. To compensate for this shortcoming, PFD and the conventional cepstrum are combined to make the final decision, instead of relying on only one kind of feature. Experimental results indicate that the hybrid approach can give an outstanding improvement for text-independent speaker identification in noisy environments corrupted by AWGN.

17.

Comparison of Khasi Speech Representations with Different Spectral Features and Hidden Markov States

Bronson Syiem Sushanta Kabir Dutta Juwesh Binong Lairenlakpam Joyprakash Singh 《电子科技学刊:英文版》2021,19(2):155-162

In this paper, we present a comparison of Khasi speech representations with four different spectral features and a novel extension towards the development of Khasi speech corpora. The four features are linear predictive coding (LPC), linear prediction cepstrum coefficient (LPCC), perceptual linear prediction (PLP), and Mel frequency cepstral coefficient (MFCC). Ten hours of speech data were used for training and three hours for testing. For each spectral feature, different hidden Markov model (HMM) based recognizers with variations in HMM states and different Gaussian mixture models (GMMs) were built. The performance was evaluated by using the word error rate (WER). The experimental results show that MFCC provides a better representation for Khasi speech compared with the other three spectral features.
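
Since the evaluation metric here is the word error rate, a short sketch of the standard definition may help: WER is the word-level edit distance (substitutions, deletions, and insertions) between the recognizer output and the reference transcript, divided by the number of reference words. The function name and the whitespace tokenization are assumptions of this sketch, not details from the paper.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words.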

18.

An Emitter Recognition Method Based on GMM and Neural Networks

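公绪华 袁振涛 谭怀英 《雷达科学与技术》2014,12(5):482-486

To address emitter recognition based on the feature parameters of intercepted radar pulses, a Gaussian mixture model (GMM) is established and its parameters are trained with the expectation-maximization (EM) method, yielding a classifier whose input is the feature parameters of intercepted radar pulses and whose output is the radar emitter type. To allow a comparison of classification performance, a radar emitter type classifier based on a neural network is also proposed. Simulation results show that both classifiers, built on the GMM and on the neural network, achieve online recognition of radar emitters, and that both reach a classification accuracy above 90% when the proportion of samples used for training is no less than 10%.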

19.

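A Moving Object Detection Method Using a Sliding-Window Gaussian Mixture Model (Total citations: 1; self-citations: 0; citations by others: 1)

周建英 吴小培 张超 吕钊 《电子与信息学报》2013,(7)

In complex scenes, the conventional Gaussian mixture model detects moving objects fairly well, but over time its parameters converge slowly and have difficulty adapting to real-time changes in the true background of the scene, which raises the false detection rate for moving objects. Exploiting the short-term historical memory of the sliding-window technique, this paper proposes a novel sliding-window-based Gaussian mixture model method for moving object detection. The method remedies the inability of the conventional Gaussian mixture background model to form a new background promptly, improves the completeness of motion detection, and further reduces the algorithm's sensitivity to illumination changes in the scene. Comparative experiments in multiple scenes show that the method detects moving objects more accurately and completely and adapts better to the environment.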

20.

Noise‐Robust Speaker Recognition Using Subband Likelihoods and Reliable‐Feature Selection

Sungtak Kim Mikyong Ji Hoirin Kim 《ETRI Journal》2008,30(1):89-100

We consider the feature recombination technique in a multiband approach to speaker identification and verification. To overcome the ineffectiveness of conventional feature recombination in broadband noisy environments, we propose a new subband feature recombination which uses subband likelihoods and a subband reliable‐feature selection technique with an adaptive noise model. In the decision step of speaker recognition, a few very low likelihood scores from unreliable features can cause a speaker recognition system to make an incorrect decision. To overcome this problem, reliable‐feature selection adjusts the likelihood scores of an unreliable feature by comparison with those of an adaptive noise model, which is estimated by the maximum a posteriori adaptation technique using noise features directly obtained from noisy test speech. To evaluate the effectiveness of the proposed methods in noisy environments, we use the TIMIT database and the NTIMIT database, which is the corresponding telephone version of the TIMIT database. The proposed subband feature recombination with subband reliable‐feature selection achieves better performance than the conventional feature recombination system with reliable‐feature selection.