1.

《声学学报：英文版》2015,(5)

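A recursive estimation algorithm for cumulants over a sliding window is proposed and applied to speech endpoint detection, to address the degraded performance of traditional endpoint detection methods in noisy environments. After the noisy speech signal is windowed, the recursive sliding-window algorithm estimates its higher-order cumulants, and endpoint detection is then carried out by combining these estimates with an energy feature. Experimental results show that the proposed recursive algorithm is markedly more computationally efficient than the traditional method of computing higher-order cumulants, and that, across different noise types and signal-to-noise ratios, the proposed endpoint detection algorithm raises the point-level correct rate Pc-point by 6.07% on average relative to the G.729b algorithm. The endpoint detection algorithm based on sliding-window higher-order cumulants is therefore both computationally efficient and robust.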
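
The abstract does not give the recursion itself, so the following is only a minimal sketch of the general idea, under stated assumptions: the signal is treated as zero-mean, the fourth-order cumulant is taken as c4 = E[x^4] - 3(E[x^2])^2, and the window slides one sample at a time. The function names `sliding_c4_recursive` and `sliding_c4_direct` are hypothetical. The recursive version keeps running sums of x^2 and x^4 and updates them in constant time per shift, while the direct version recomputes every window from scratch.

```python
def sliding_c4_recursive(x, win):
    """Fourth-order cumulant of a (assumed zero-mean) signal per window position.

    Running sums of x^2 and x^4 are updated in O(1) per one-sample shift.
    """
    n = len(x) - win + 1
    if win <= 0 or n <= 0:
        raise ValueError("need 0 < win <= len(x)")
    x2 = [v * v for v in x]
    x4 = [v * v for v in x2]
    s2 = sum(x2[:win])  # running sum of squares for the first window
    s4 = sum(x4[:win])  # running sum of fourth powers for the first window
    out = []
    for k in range(n):
        m2 = s2 / win
        m4 = s4 / win
        out.append(m4 - 3.0 * m2 * m2)
        if k + 1 < n:
            # Slide by one sample: add the entering sample, drop the leaving one.
            s2 += x2[k + win] - x2[k]
            s4 += x4[k + win] - x4[k]
    return out


def sliding_c4_direct(x, win):
    """Reference version that recomputes both moments for every window."""
    out = []
    for k in range(len(x) - win + 1):
        seg = x[k:k + win]
        m2 = sum(v ** 2 for v in seg) / win
        m4 = sum(v ** 4 for v in seg) / win
        out.append(m4 - 3.0 * m2 * m2)
    return out
```

The direct version costs O(win) per window position and the recursive one O(1), which is the kind of saving a recursive estimator aims for; whether the paper uses this exact update is not stated in the abstract.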

2.

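Speech endpoint detection algorithm using perceptual spectrogram structure boundary parameters at low signal-to-noise ratios

吴迪 赵鹤鸣 陶智 张晓俊 肖仲喆 许宜申 《声学学报》2014,39(3):392-399

A speech endpoint detection algorithm using the perceptual spectrogram structure boundary (PSSB) parameter is proposed for preprocessing speech signals in low signal-to-noise-ratio environments. After the noisy speech is enhanced on the basis of auditory perception characteristics, the time-frequency spectrogram of the enhanced speech is enhanced in two dimensions, exploiting the difference between the continuous distribution of speech and the random distribution of residual noise. This further highlights the spectrogram structure of the continuously distributed clean speech. The PSSB parameter is derived from two-dimensional boundary detection on the spectrogram structure of the enhanced speech and is used for endpoint detection. Experimental results show that, in white noise at signal-to-noise ratios from -10 dB to 10 dB, the PSSB-based algorithm detects speech endpoints more effectively than other endpoint detection algorithms. Even at the very low signal-to-noise ratio of -10 dB, the proposed method still achieves a correct rate of 75.2%. The PSSB-based algorithm is therefore better suited to endpoint detection in low-SNR white-noise environments.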

3.

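Robust feature extraction for speech signals in convolutive noise environments

吕钊 吴小培 张超 李密 《声学学报》2010,35(4):465-470

A robust speech feature extraction algorithm based on independent component analysis (ICA) is proposed to address the mismatch between training and recognition features in convolutive noise environments. The algorithm transforms the noisy speech signal from the time domain to the frequency domain with the short-time Fourier transform, uses complex-valued ICA to separate the short-time spectrum of the speech from that of the noisy speech, and then computes Mel-frequency cepstral coefficients (MFCC) and their first-order differences from the separated spectrum as feature parameters. In Mandarin digit recognition experiments under simulated and real conditions, the proposed algorithm improves recognition accuracy over conventional MFCC by 34.8% and 32.6%, respectively. The results show that ICA-based speech features are robust in convolutive noise environments.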

4.

An utterance segmentation method using bone-conducted speech adaptation

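苗晓孔 张雄伟 《应用声学》2019,38(1):68-75

To segment noisy utterances, and to overcome the poor segmentation quality, low accuracy and weak adaptivity of traditional air-conducted segmentation algorithms in some low signal-to-noise-ratio environments, a piecewise double-threshold speech segmentation method based on bone-conducted speech adaptation is proposed. Bone-conducted and air-conducted speech are recorded synchronously to obtain the more noise-robust bone-conducted speech. Endpoint detection is then performed with an adaptive method that introduces a random dynamic threshold into the fusion of zero-crossing rate and short-time energy. Finally, piecewise double thresholds and speech clustering are used to segment the speech, improving the robustness of the segmentation algorithm. Experiments verify the effectiveness and feasibility of the proposed algorithm, and comparisons with other speech segmentation algorithms show that it achieves higher accuracy and better results.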
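
For orientation, the conventional fixed double-threshold scheme that such methods build on can be sketched as follows. This is an illustration under assumptions, not the paper's adaptive method: the thresholds are fixed rather than random and dynamic, only a single speech segment is located, and the function names `frame_energies`, `zero_crossing_rates` and `double_threshold_endpoints` are hypothetical.

```python
def frame_energies(x, frame_len, hop):
    """Short-time energy of each full frame."""
    return [
        sum(v * v for v in x[s:s + frame_len])
        for s in range(0, len(x) - frame_len + 1, hop)
    ]


def zero_crossing_rates(x, frame_len, hop):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    rates = []
    for s in range(0, len(x) - frame_len + 1, hop):
        frame = x[s:s + frame_len]
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        rates.append(crossings / (frame_len - 1))
    return rates


def double_threshold_endpoints(energy, high, low):
    """Return (start, end) frame indices (inclusive) of one segment, or None.

    Frames above `high` mark the confident speech core; the boundaries are
    then extended outwards while the energy stays above `low`.
    """
    above = [i for i, e in enumerate(energy) if e > high]
    if not above:
        return None
    start, end = above[0], above[-1]
    while start > 0 and energy[start - 1] > low:
        start -= 1
    while end < len(energy) - 1 and energy[end + 1] > low:
        end += 1
    return start, end
```

The zero-crossing rate is returned as a companion feature; in a fused detector it could be thresholded in the same way to extend the boundaries into low-energy unvoiced sounds.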

5.

Noise-robust voice conversion algorithm using joint dictionary optimization

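张石磊 简志华 孙闽红 钟华 刘二小 《声学学报》2019,44(6):1074-1082

Because noisy speech is difficult to convert effectively, this paper proposes a noise-robust voice conversion algorithm using joint dictionary optimization. In constructing the joint dictionary, the speech dictionary is optimized with the backward elimination (BE) algorithm, and a noise dictionary is introduced so that the noisy speech matches the joint dictionary. Experimental results show that backward elimination reduces the number of dictionary frames and the computational load while preserving conversion quality. At low signal-to-noise ratios and under several noise types, the proposed algorithm converts better than both the traditional NMF algorithm and NMF conversion with spectral-subtraction denoising. Introducing the noise dictionary improves the noise robustness of the voice conversion system.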

6.

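Research on an optimized speech endpoint detection algorithm based on sub-band energy features (cited 9 times in total: 2 self-citations, 7 by others)

陈振标 徐波 《声学学报》2005,30(2):171-176

To improve the robustness of speech endpoint detection in noisy environments, an algorithm combining multi-sub-band energy features with an optimal edge-detection decision criterion is proposed. Its main advantage is that the output of its endpoint detection filter remains essentially unchanged across signal-to-noise ratios, which avoids the difficulty of adjusting thresholds. Experimental results show that the algorithm detects speech well in a variety of noise environments. It overcomes the drawbacks of traditional endpoint detection based on features such as short-time energy, fundamental frequency and zero-crossing rate, which requires dynamic threshold adjustment and is not robust at low signal-to-noise ratios.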

7.

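Speech endpoint detection in complex noise environments (cited 3 times in total: 0 self-citations, 3 by others)

国雁萌 付强 颜永红 《声学学报》2006,31(6):549-554

A speech endpoint detection method suited to complex additive noise environments is proposed. Noise types are categorized to build an adaptive stationary-noise model, which is used to search for regions where the signal energy is nonstationary. Voiced speech is then detected within these regions from its harmonic structure in the frequency domain, which excludes interference from nonstationary noise. Finally, the start and end points of speech are located precisely from the signal energy. Comparative experiments against typical current endpoint detection algorithms show that the algorithm achieves good accuracy in most complex noise environments.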

8.

An adaptive denoising algorithm for chaotic signals with optimized parameters

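王梦蛟 吴中堂 冯久超 《物理学报》2015,64(4):40503-040503

To optimize the parameters of a nonlinear adaptive denoising algorithm for chaotic signals, and considering that the optimal filter window length depends on several factors, a decision criterion that automatically optimizes the filter window length is proposed to improve the algorithm's adaptivity. Exploiting the difference between the autocorrelation functions of chaotic signals and of noise, the noisy chaotic signal is first denoised with different window lengths. The residual autocorrelation degree (RAD) is then computed for each window length, and the window length is finally optimized by shrinking the one with the minimum RAD by a certain ratio. Simulation results show that the criterion effectively and automatically optimizes the filter window length under different conditions, improving the adaptivity of the chaotic signal denoising algorithm.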

9.

An improved dual-microphone noise reduction system for mobile phones

《应用声学》2017,(1)

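Existing dual-microphone noise reduction systems for mobile phones cannot cope with diverse complex noise environments and introduce speech distortion while removing noise. To address these problems, this paper proposes a new dual-microphone noise reduction system for mobile phones. The system combines time-domain and frequency-domain processing and improves both noise estimation and noise removal; combining single-microphone and dual-microphone noise estimation algorithms makes the noise estimate more accurate. In addition, pitch detection is combined with noise reduction: the pitch frequency is estimated in speech frames to identify speech and noise frequency bins, and the Wiener filter parameters are adjusted separately for each, so that speech bins are preserved as far as possible while noise is removed, which reduces speech distortion. Experimental results show that, compared with existing dual-microphone noise reduction systems, the proposed system suppresses noise while effectively reducing the damage done to speech by the noise reduction algorithm, improves call quality, and also suppresses directional speech interference well.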

10.

Noise reduction algorithms for Bluetooth headsets based on dual microphones

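严馨叶 邱小军 卢晶 《应用声学》2014,33(4):313-323

Speech enhancement algorithms for hands-free communication devices have long been an active research topic, and the sound quality of their output has drawn increasing attention in recent years. For Bluetooth headset systems with dual-microphone noise reduction, common multichannel microphone noise reduction algorithms are grouped into two classes, coherence-function-based and spatial-pre-separation-based, which are analyzed and compared. Coherence-function-based methods filter the noisy signal using the coherence function between the two channels, whereas spatial-pre-separation-based methods use spatial characteristics to separate a noise reference from the noisy signal and use it to cancel the noise. The analysis uses three metrics: the amount of noise reduction, speech quality and overall performance. It examines the form of the optimal solution under a constraint on speech distortion and compares the practical performance of the two classes. The results show that choosing a suitable algorithm balances noise reduction against speech distortion and achieves good overall performance.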

11.

A recursive calculating algorithm for higher-order cumulants over sliding window and its application in speech endpoint detection

《声学学报：英文版》2015,(4)

Because the performance of traditional endpoint detection algorithms degrades as the environmental noise level increases, a recursive algorithm for calculating higher-order cumulants over a sliding window is proposed and applied to speech endpoint detection. Endpoint detection is then carried out in combination with the energy feature. Experimental results show that both the computational efficiency and the noise robustness of the proposed algorithm are improved remarkably compared with the traditional algorithm. The average probability of correct point detection (Pc-point) of the proposed voice activity detection (VAD) is 6.07% higher than that of the G.729b VAD across different noise types and signal-to-noise ratios (SNRs).

12.

An effective cluster-based model for robust speech detection and speech recognition in noisy environments

Górriz JM Ramírez J Segura JC Puntonet CG 《The Journal of the Acoustical Society of America》2006,120(1):470-481

This paper shows an accurate speech detection algorithm for improving the performance of speech recognition systems working in noisy environments. The proposed method is based on a hard decision clustering approach where a set of prototypes is used to characterize the noisy channel. Detecting the presence of speech is enabled by a decision rule formulated in terms of an averaged distance between the observation vector and a cluster-based noise model. The algorithm benefits from using contextual information, a strategy that considers not only a single speech frame but also a neighborhood of data in order to smooth the decision function and improve speech detection robustness. The proposed scheme exhibits reduced computational cost making it adequate for real time applications, i.e., automated speech recognition systems. An exhaustive analysis is conducted on the AURORA 2 and AURORA 3 databases in order to assess the performance of the algorithm and to compare it to existing standard voice activity detection (VAD) methods. The results show significant improvements in detection accuracy and speech recognition rate over standard VADs such as ITU-T G.729, ETSI GSM AMR, and ETSI AFE for distributed speech recognition and a representative set of recently reported VAD algorithms.

13.

Noise estimation based on time–frequency correlation for speech enhancement

Wenhao Yuan Jiajun Lin Wei An Yu Wang Ning Chen 《Applied Acoustics》2013,74(5):770-781

As a fundamental part of speech enhancement, noise estimation is particularly challenging in highly non-stationary noise environments. In this work, we propose an effective algorithm on the basis of the “Improved Minima Controlled Recursive Averaging (IMCRA)” with the objective to improve the performance of noise estimation. The main contributions of this work are: (i) in the algorithm, a rough decision about speech presence is proposed by calculating the autocorrelation and cross-channel correlation of the T–F (Time–Frequency) units; (ii) with this decision, we refine the smoothing parameters for the smoothing of noisy power spectrum and the recursive averaging in noise spectrum estimation as well as the weighting factor for the a priori SNR (Signal to Noise Ratio) estimation in the IMCRA; (iii) we improve the search of local minima during spectral bursts by adding a minimum search with a shorter window. Extensive experiments are carried out to evaluate the performance of our proposed algorithm. The experimental results illustrate that, compared with the IMCRA, the proposed approach significantly improves the accuracy of noise spectrum estimation and the quality of enhanced speech in the typical noise situations.
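
The recursive averaging that estimators of this family rely on can be sketched as a single per-frame update. This is a minimal illustration of probability-controlled recursive averaging in general, not the algorithm proposed in the paper: the function name `update_noise_psd`, the default value of `alpha`, and the assumption that a speech presence probability is already available for each frequency bin are all assumptions made for illustration.

```python
def update_noise_psd(noise_psd, noisy_power, speech_prob, alpha=0.85):
    """One frame of speech-presence-controlled recursive averaging.

    noise_psd   : current noise power estimate per frequency bin
    noisy_power : |Y|^2 of the current noisy frame per frequency bin
    speech_prob : speech presence probability per bin, in [0, 1]
    alpha       : base smoothing factor (illustrative value, an assumption)
    """
    updated = []
    for n, y, p in zip(noise_psd, noisy_power, speech_prob):
        # The effective smoothing factor approaches 1 when speech is likely,
        # which freezes the noise estimate in that bin.
        alpha_eff = alpha + (1.0 - alpha) * p
        updated.append(alpha_eff * n + (1.0 - alpha_eff) * y)
    return updated
```

When the speech presence probability is 1 the estimate is left unchanged, and when it is 0 the update reduces to plain first-order smoothing of the noisy power spectrum with factor `alpha`.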

14.

Optimally weighted maximum a posteriori probabilities based on minimum classification error for dual-microphone voice activity detection

Seng Hyun Huang Joon-Hyuk Chang 《Applied Acoustics》2016

A dual-microphone voice activity detection (VAD) technique is proposed that applies discriminative weight training to achieve optimal weighting of the spatial features available within the dual-microphone VAD. Since the motivation behind our method is to use the relevant spatial information available from the two microphones, we employ the phase difference, coherence, and power level difference ratio (PLDR) as a feature vector, and then use this feature vector to derive the maximum a posteriori (MAP) probabilities. Then, we combine the MAP probabilities using discriminative weight training, i.e., the minimum classification error (MCE) method, to offer an optimal VAD decision in the spectral domain, which successfully represents the dynamic evolution of speech over time even in non-stationary noise environments. The proposed dual-microphone VAD algorithm outperforms conventional dual-microphone VAD methods based on only a single feature among the PLDR, phase difference, and spectral coherence.

15.

Noise-robust voice conversion based on joint dictionary optimization

《声学学报：英文版》2020,(2)

A noise-robust voice conversion algorithm based on joint dictionary optimization is proposed to effectively convert noisy source speech into the target speech. In composing the joint dictionary, the speech dictionary is optimized using the backward elimination algorithm. At the same time, a noise dictionary is introduced to match the noisy speech. The experimental results show that the backward elimination algorithm can reduce the number of dictionary frames and the amount of calculation while preserving the conversion quality. In low-SNR and multiple-noise environments, the algorithm converts better than both the traditional NMF algorithm and the NMF conversion algorithm with spectral-subtraction denoising. The proposed algorithm improves the robustness of the voice conversion system.

16.

Second generation wavelet transform-based pitch period estimation and voiced/unvoiced decision for speech signals

Ergun Erçelebi 《Applied Acoustics》2003,64(1):25-41

Pitch detection is an important part of speech recognition and speech processing. In this paper, a pitch detection algorithm based on the second generation wavelet transform was developed. The proposed algorithm reduces the computational load of algorithms based on the classical wavelet transform. The proposed pitch detection algorithm was tested on both real and synthetic speech signals. Experiments were carried out under noisy conditions to evaluate the accuracy and robustness of the proposed algorithm. Results showed that the proposed algorithm was robust to noise and provided accurate estimates of the pitch period for both low-pitched and high-pitched speakers. Moreover, different wavelet filters obtained using the second generation wavelet transform were considered to see their effects on the proposed algorithm. The Haar filter showed good performance compared with the other wavelet filters.
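
The second generation wavelet transform is built with the lifting scheme, and its simplest instance is the Haar filter in lifting form. The sketch below shows one decomposition level with the usual split, predict and update steps; it is an unnormalized illustration only, it does not implement the paper's pitch detector, and the function names `haar_lifting_forward` and `haar_lifting_inverse` are hypothetical.

```python
def haar_lifting_forward(x):
    """One level of the unnormalized Haar transform via lifting."""
    if len(x) % 2:
        raise ValueError("signal length must be even")
    even, odd = x[0::2], x[1::2]                           # split
    detail = [o - e for o, e in zip(odd, even)]            # predict
    approx = [e + d / 2.0 for e, d in zip(even, detail)]   # update (pair mean)
    return approx, detail


def haar_lifting_inverse(approx, detail):
    """Undo the lifting steps in reverse order to recover the signal."""
    even = [a - d / 2.0 for a, d in zip(approx, detail)]   # undo update
    odd = [d + e for d, e in zip(detail, even)]            # undo predict
    x = [0.0] * (2 * len(even))
    x[0::2] = even                                         # merge
    x[1::2] = odd
    return x
```

Each lifting step is inverted by flipping its sign, so perfect reconstruction holds by construction, and the arithmetic is done in place on half-length sequences, which is where the reduced computational load relative to filter-bank implementations comes from.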

17.

An improved algorithm for noise-robust sparse linear prediction of speech

ZHOU Bin ZOU Xia ZHANG Xiongwei 《声学学报：英文版》2015,(1):84-95

The performance of linear prediction analysis of speech deteriorates rapidly in noisy environments. To tackle this issue, an improved noise-robust sparse linear prediction algorithm is proposed. First, the linear prediction residual of speech is modeled with a Student-t distribution, and the additive noise is incorporated explicitly to increase robustness, which yields a probabilistic model for sparse linear prediction of speech. Variational Bayesian inference is then used to approximate the intractable posterior distributions of the model parameters, and the optimal linear prediction parameters are estimated robustly. The experimental results demonstrate the advantage of the developed algorithm on several different metrics compared with the traditional algorithm and the l₁ norm minimization based sparse linear prediction algorithm proposed in recent years. It is concluded that the proposed algorithm is more robust to noise and is able to increase speech quality in applications.

18.

Speech endpoint detection in real noise environments (cited 1 time in total: 0 self-citations, 1 by others)

GUO Yanmeng FU Qiang YAN Yonghong 《声学学报：英文版》2007,26(1):39-48

A method of speech endpoint detection in environments of complicated additive noise is presented. Based on the analysis of noise, an adaptive model of stationary noise is proposed to detect the section where the signal is nonstationary. Then the voice is detected in this section by its harmonic structure, and the accurate endpoint is searched using energy. Compared with the typical algorithms, this algorithm operates reliably in most real noise environments.