20 similar documents found; search took 43 ms
1.
2.
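When the minimum variance distortionless response (MVDR) beamforming algorithm is applied to wideband signals such as speech, it relies on the narrowband assumption so that filtering can be carried out separately in each frequency-domain subband. Under the narrowband assumption the speech covariance matrix is a rank-1 matrix; in practice, however, the narrowband signal model is only an approximation of the actual signal model, and because the statistics are estimated with error, the rank of the estimated speech covariance matrix is generally greater than 1. This paper proposes estimating the relative transfer function from the generalized principal eigenvector of the speech and noise covariance matrices and using it to reconstruct the speech covariance matrix as a rank-1 matrix. Experiments on the REVERB and CHiME-4 datasets show that, after low-rank approximation of the speech covariance matrix, the MVDR beamformer is more robust to estimation errors: the output signal-to-noise ratio improves by an average of 0.8 dB and 1.4 dB, respectively, and speech recognition accuracy also improves.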
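The rank-1 reconstruction described in the abstract above can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the function name `gevd_rank1_mvdr`, the choice of reference microphone `ref`, and the use of the reference-channel speech power to scale the rank-1 matrix are all assumptions made here. The generalized eigenproblem is solved by whitening with the Cholesky factor of the noise covariance, so only NumPy is needed.

```python
import numpy as np

def gevd_rank1_mvdr(phi_s, phi_n, ref=0):
    """Hypothetical helper: RTF from the principal generalized eigenvector
    of (phi_s, phi_n), rank-1 speech covariance, and MVDR weights."""
    # Whiten with the Cholesky factor of the noise covariance: phi_n = L L^H.
    L = np.linalg.cholesky(phi_n)
    L_inv = np.linalg.inv(L)
    whitened = L_inv @ phi_s @ L_inv.conj().T
    # eigh sorts eigenvalues in ascending order, so the last column is principal.
    _, vecs = np.linalg.eigh(whitened)
    u = vecs[:, -1]
    # With v = L^{-H} u the generalized eigenvector, phi_n v = L u, which is
    # proportional to the steering vector when phi_s is close to rank 1.
    h = L @ u
    rtf = h / h[ref]  # relative transfer function w.r.t. the reference mic
    # Assumption: scale the rank-1 matrix by the reference-channel speech power.
    power = np.real(phi_s[ref, ref])
    phi_s_rank1 = power * np.outer(rtf, rtf.conj())
    # MVDR weights: w = phi_n^{-1} d / (d^H phi_n^{-1} d), with d = rtf.
    num = np.linalg.solve(phi_n, rtf)
    w = num / np.vdot(rtf, num)
    return rtf, phi_s_rank1, w
```

By construction the weights satisfy the distortionless constraint with respect to the estimated relative transfer function, that is, `np.vdot(w, rtf)` is 1 up to rounding.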
3.
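To address the difficulty of accurately localizing small gas leaks in noisy environments, a gas leak localization method based on an improved minimum variance distortionless response (MVDR) angular spectrum algorithm is proposed. The algorithm introduces signal-to-noise-ratio tracking weighting to extract the time-frequency support regions that are less affected by noise and in which a single sound source dominates in energy, and uses the Softplus activation function to adaptively adjust the contribution of each frequency component to the angular spectrum function, increasing the weight of time-frequency regions dominated by the leak source. In addition, sub-band processing based on time-frequency sparsity is introduced so that one dominant source prevails in energy within each sub-band, which suppresses the accumulation of low-frequency noise energy while avoiding high-frequency aliasing. The algorithm's performance is verified through software simulation and experiments. The results show that the improved MVDR angular spectrum algorithm localizes the leak source accurately, with a maximum error within 3.5°. Compared with conventional algorithms, the method offers higher stability, noise immunity, and accuracy at low signal-to-noise ratios and with few sampling points, and may serve as a reference for practical gas leak localization.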
4.
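A robust minimum variance distortionless response beamforming algorithm and its application  Total citations: 1 (self-citations: 0, citations by others: 1)
In theory, adaptive beamforming offers better target parameter estimation and interference suppression than conventional beamforming, which does not depend on the input data. In real underwater acoustic environments, however, the assumed sound propagation model, receiving-array manifold, and signal statistics often differ from the actual conditions, which degrades the performance of traditional adaptive beamforming. Improving the robustness of adaptive beamforming to these factors has therefore become increasingly important. Based on the idea of worst-case optimization, this paper modifies the constraint of the MVDR (minimum variance distortionless response) method to obtain a robust MVDR adaptive beamforming algorithm (R-MVDR). Its performance is analyzed for the case where the input data covariance matrix and the steering vector are uncertain, expressions for the beamforming weight vector and the spatial spectrum estimate are derived, and the algorithm is verified with sea-trial data. The results show that the proposed algorithm achieves better bearing resolution and interference suppression in real environments.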
5.
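To address the severe degradation in multi-target resolution of the minimum variance distortionless response (MVDR) method in fluctuating uncorrelated noise, an inhomogeneous diagonal unloading MVDR (IDU-MVDR) method is proposed. The method first applies inhomogeneous diagonal unloading to the covariance matrix and then applies the MVDR method. The unloading amount for each array element is obtained by solving a semidefinite optimization problem that maximizes the sum of the unloading amounts while constraining the smallest eigenvalue of the unloaded covariance matrix to be a small positive value. Numerical simulations show that IDU-MVDR removes most of the uncorrelated noise through inhomogeneous diagonal unloading while retaining a small noise component. As a result, compared with MVDR, IDU-MVDR has higher resolution, a lower background level in the spatial spectrum, and more distinct spectral peaks for weak targets, and it retains a degree of robustness. Sea-trial results agree with the numerical simulations and confirm the effectiveness of the IDU-MVDR method.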
6.
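Research on and application of a fast-converging minimum variance distortionless response algorithm  Total citations: 4 (self-citations: 0, citations by others: 4)
The conventional minimum variance distortionless response (MVDR) adaptive beamformer is a high-resolution narrowband beamformer that estimates the adaptive beamforming weight vector from the narrowband cross-spectral density matrix (CSDM) of the actual sound field. In practice, the MVDR algorithm needs a long observation time to estimate the covariance matrix, which hinders localization of fast-moving targets; for wideband target signals, it must invert every CSDM, which is computationally expensive; and under coherent-source conditions, the target signals …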
7.
8.
9.
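To improve the resolution, contrast, and noise robustness of the minimum variance ultrasound imaging algorithm, an improved minimum variance imaging algorithm is proposed. The method first splits the echo signal into a desired signal and a noise signal, exploiting the separability of the two, and then, following the minimum variance principle, solves for the weight vector that minimizes the desired signal power. To increase robustness to noise, a pair of constraints is also added to the signal steering vector, further improving image quality. Point targets and anechoic cysts are simulated in full-transmit full-receive mode and in synthetic aperture mode. The results show that, in full-transmit full-receive mode, the proposed algorithm improves the -6 dB resolution by about a factor of two over the minimum variance algorithm; in synthetic aperture mode, contrast improves by 8 dB over the eigenspace-based minimum variance algorithm and is far better than that of the conventional delay-and-sum algorithm. Finally, experiments further show that images from the improved minimum variance algorithm are better in resolution, contrast, and noise robustness, and that the algorithm effectively improves ultrasound image quality.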
10.
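This paper presents a simple and practical optimal post-processing method for passive ranging. The method applies linear minimum variance (LMV) estimation to the state estimation of a linear motion system; the result is a statistical average over the individual observations, with the optimal weighting coefficients satisfying a linear equation. The basic assumption on target motion is constant velocity over an observation interval, with random velocity perturbations. Results from two computer simulation experiments show that the method converges quickly and performs well, does not diverge, and adapts well to target maneuvers. It is simple to implement on a computer and has a very low computational cost.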
11.
A feature extraction technique named perceptual MVDR-based cepstral coefficients (PMCCs) was introduced into speaker recognition. PMCCs are extracted and modeled using Gaussian mixture models (GMMs) for speaker recognition. To compensate for speaker and channel variability effects, joint factor analysis (JFA) is used. The experiments are carried out on the core conditions of the NIST 2008 speaker recognition evaluation data. The experimental results show that systems based on PMCCs achieve performance comparable to that of systems based on conventional MFCCs. Moreover, fusing the two kinds of systems yields a significant improvement over the MFCC system alone, reducing the equal error rate (EER) by between 7.6% and 30.5% and the minimum detection cost function (minDCF) by between 3.2% and 21.2% on different test sets. The results indicate that PMCCs can be applied effectively in speaker recognition and that they are complementary to MFCCs to some extent.
12.
Automatic speaker verification using cepstral measurements  Total citations: 1 (self-citations: 0, citations by others: 1)
J E Luck 《The Journal of the Acoustical Society of America》1969,46(4):1026-1032
13.
Brown JC 《The Journal of the Acoustical Society of America》1999,105(3):1933-1941
Cepstral coefficients based on a constant Q transform have been calculated for 28 short (1-2 s) oboe sounds and 52 short saxophone sounds. These were used as features in a pattern analysis to determine for each of these sounds comprising the test set whether it belongs to the oboe or to the sax class. The training set consisted of longer sounds of 1 min or more for each of the instruments. A k-means algorithm was used to calculate clusters for the training data, and Gaussian probability density functions were formed from the mean and variance of each of the clusters. Each member of the test set was then analyzed to determine the probability that it belonged to each of the two classes; and a Bayes decision rule was invoked to assign it to one of the classes. Results have been extremely good and are compared to a human perception experiment identifying a subset of these same sounds.
14.
Mel frequency cepstral coefficients (MFCC) are the most widely used speech features in automatic speech recognition systems, primarily because the coefficients fit well with the assumptions used in hidden Markov models and because of the superior noise robustness of MFCC over alternative feature sets such as linear prediction-based coefficients. The authors have recently introduced human factor cepstral coefficients (HFCC), a modification of MFCC that uses the known relationship between center frequency and critical bandwidth from human psychoacoustics to decouple filter bandwidth from filter spacing. In this work, the authors introduce a variation of HFCC called HFCC-E in which filter bandwidth is linearly scaled in order to investigate the effects of wider filter bandwidth on noise robustness. Experimental results show an increase in signal-to-noise ratio of 7 dB over traditional MFCC algorithms when filter bandwidth increases in HFCC-E. An important attribute of both HFCC and HFCC-E is that the algorithms only differ from MFCC in the filter bank coefficients: increased noise robustness using wider filters is achieved with no additional computational cost.
15.
Rodríguez-Liñares L, García-Mateo C 《The Journal of the Acoustical Society of America》2001,109(1):385-389
In this paper, a speaker recognition system that introduces acoustic information into a Gaussian mixture model (GMM)-based recognizer is presented. This is achieved by using a phonetic classifier during the training phase. The experimental results show that, while maintaining the recognition rate, the decrease in the computational load is between 65% and 80% depending on the number of mixtures of the models.
16.
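A nonlinear frequency scale transformation suited to speaker recognition  Total citations: 3 (self-citations: 0, citations by others: 3)
Although traditional nonlinear frequency scale transformations reflect the perceptual characteristics of the human auditory system (HAS), they do not treat the semantic content and the speaker-specific characteristics of speech differently, and so they represent speaker individuality inadequately. By analyzing how the short-time spectrum in different frequency bands affects speaker recognition performance, and using least-squares polynomial curve fitting, a nonlinear frequency scale transformation is proposed. Experiments show that, under the same training and test conditions, the average misrecognition rate is reduced by 70.5%, 60.8%, and 70.5% relative to the traditional Mel, Bark, and ERB frequency scale transformations, respectively. This result indicates that the proposed transformation effectively enhances the speaker-specific characteristics of the short-time spectrum and can improve the performance of speaker recognition systems.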
17.
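To address the reduced robustness of missing data feature methods in speaker recognition at low signal-to-noise ratios, a missing data feature extraction method based on perceptual auditory scene analysis is proposed. First, the missing data feature spectrum of the speech is computed, and the perceptual speech content is derived from the perceptual characteristics of speech. After perceptual speech enhancement of the noisy speech and two-dimensional enhancement of its spectrogram, the distribution of the speech is obtained, and the perceptual auditory factor is extracted by combining the perceptual speech content with the missing-strength parameter. The missing data feature spectrum is then used to decompose feature extraction into different auditory scenes that are analyzed and processed separately, improving the robustness of the speaker recognition system. Experimental results show that, at low signal-to-noise ratios from -10 dB to 10 dB and for four different noise types, the proposed method is more robust than all five comparison methods, with average recognition rate improvements of 26.0%, 19.6%, 12.7%, 4.6%, and 6.5%, respectively. The proposed method searches for robust speech features in the time-frequency domain and is better suited to speaker recognition in low signal-to-noise-ratio environments.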
18.
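Application of long-term speech features to speaker recognition  Total citations: 1 (self-citations: 0, citations by others: 1)
In addition to introducing commonly used speaker recognition techniques, this paper mainly discusses a speaker recognition method based on long-term time-frequency features. The input speech first undergoes VAD processing to obtain clean speech, from which basic time-frequency features are extracted. Within each speech unit, the trajectories of time-frequency features such as pitch, formants, and harmonics are fitted with Legendre polynomials to extract the main fitting parameters; HLDA is then used for feature dimensionality reduction, and the mean supervector of a Gaussian mixture model represents the statistics of each utterance's time-frequency features. On the NIST06 1side-1side speaker test set, the method achieves an equal error rate of 18.7%; when fused with a conventional MFCC-based speaker system, the equal error rate falls from 4.9% to 4.6%, a relative reduction of 6%.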
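The Legendre polynomial fitting step mentioned in the abstract above might look like the following minimal sketch. The function name `legendre_contour_features`, the default polynomial order, and the mapping of frame indices onto [-1, 1] are assumptions for illustration; the abstract does not state the order or the normalization used.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_contour_features(contour, order=3):
    """Hypothetical helper: least-squares Legendre coefficients of one
    feature trajectory (for example a pitch contour within a speech unit)."""
    contour = np.asarray(contour, dtype=float)
    # Map the frame indices onto [-1, 1], the interval on which
    # Legendre polynomials are orthogonal.
    x = np.linspace(-1.0, 1.0, len(contour))
    # legfit returns order + 1 coefficients, lowest degree first.
    return legendre.legfit(x, contour, order)
```

The low-order coefficients summarize the mean level, slope, and curvature of the trajectory, which is why a handful of them can stand in for a contour of arbitrary length.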
19.
Speaker verification (SVR) performance is degraded under reverberation conditions. Cepstral mean subtraction (CMS) is often applied to the feature vectors in order to compensate for convolutive effects of transmission channels, which are considered to have a short-duration impulse response. The effect of reverberation on the performance of CMS applied to the feature vectors in SVR is investigated. Although CMS was found effective in reducing the effect of reverberation for short reverberation time (RT), in cases of long RT, it is shown that CMS may degrade SVR performance rather than improve it. Hence, CMS should not be used in these cases. In addition, the effect of the room volume was tested and found less critical than the effect of long RT.
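For readers unfamiliar with CMS, the operation itself is small enough to sketch. The function name `cepstral_mean_subtraction` and the `(n_frames, n_coeffs)` array layout are assumptions made here. The sketch relies on the standard observation that a channel with a short impulse response adds an approximately constant vector to every cepstral frame, so removing the per-utterance mean removes that offset.

```python
import numpy as np

def cepstral_mean_subtraction(features):
    """Hypothetical helper: subtract the per-coefficient mean taken over
    all frames of one utterance. features has shape (n_frames, n_coeffs)."""
    features = np.asarray(features, dtype=float)
    # The mean over the time axis estimates the stationary channel offset.
    return features - features.mean(axis=0, keepdims=True)
```

When the impulse response is longer than the analysis window, as with long reverberation times, the distortion is no longer a constant offset per frame, which is consistent with the abstract's finding that CMS can then hurt rather than help.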
20.
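This paper studies the selection of test text in automatic speaker recognition. It proposes and confirms the importance and application potential of selecting test text with the characteristics of Chinese in mind, and summarizes several simple rules; it also applies and improves a time-domain normalization method for handling dynamic variation in pronunciation.
The system uses 12th-order LPCC cepstral coefficients and the pitch period to form a hybrid feature vector and adopts a three-character, three-template matching recognition method. In an ordinary laboratory environment, with a tape recorder as the transmission medium, it achieves a verification error rate of 0.6%.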