Similar documents
 19 similar documents found (search time: 171 ms)
1.
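Although signal subspace methods for speech enhancement have been studied extensively, subspace dimension estimation, which limits the performance of these methods, has never been solved satisfactorily. To address this problem, this paper uses a statistical model of the F-norm of the cross-power spectral matrix of the multichannel speech signal to describe the prior knowledge and variation of the speech signal, and proposes a subspace dimension estimation method based on a maximization principle: the subspace dimension is maximized subject to the null hypothesis being accepted. Experiments show that the proposed algorithm achieves better results in both objective speech quality evaluation and subjective tests. Compared with traditional methods, the multichannel speech enhancement algorithm that uses this method achieves greater noise reduction and lower speech distortion in adverse environments such as room reverberation and low signal-to-noise ratio.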

2.
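Estimating the speech presence probability is one of the core techniques of speech enhancement. Traditional presence-probability estimators are heuristic, are not unified within a single theoretical framework, and cannot guarantee an optimal estimate. To address this, a presence-probability estimation method based on a sequential hidden Markov model (SHMM) is proposed. An SHMM is built in each subband to describe the time series of the log power spectral envelope. The envelope sequence is treated as a dynamic first-order Markov chain that moves between a speech state and a noise state, a single Gaussian is used as the probability model of each state, and the posterior probability of the speech state is taken as the speech presence probability. To meet real-time requirements, SHMM parameter estimation is simplified to a first-order regression process, and the model parameters are updated frame by frame under the maximum-likelihood criterion. Experiments show that the temporal correlation described by the SHMM plays a key role in estimating the presence probability and that the method outperforms general heuristic estimators; the segmental SNR (SegSNR) and log-spectral distortion (LSD) of SHMM-based speech enhancement are better than those of the classic improved minima controlled recursive averaging (IMCRA) algorithm.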
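
The filtering step described in this abstract (a two-state Markov chain with one Gaussian per state, whose speech-state posterior is taken as the presence probability) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the `stay` transition probability, the prior and the toy values are all assumptions, and the Gaussian parameters are held fixed here instead of being updated frame by frame.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def speech_presence_posteriors(log_power, noise, speech, stay=0.9, prior=0.5):
    """Forward-filter a two-state (noise/speech) Markov chain on one subband.

    log_power: log power spectral envelope values, one per frame.
    noise, speech: (mean, variance) of each state's single-Gaussian model.
    stay: assumed probability of remaining in the same state between frames.
    Returns P(speech state | observations up to frame t) for every frame.
    """
    p_speech = prior
    posteriors = []
    for x in log_power:
        # Predict: push the previous posterior through the transition matrix.
        pred_speech = stay * p_speech + (1.0 - stay) * (1.0 - p_speech)
        # Update: weight each state by its Gaussian likelihood (Bayes' rule).
        like_speech = gaussian_pdf(x, *speech) * pred_speech
        like_noise = gaussian_pdf(x, *noise) * (1.0 - pred_speech)
        p_speech = like_speech / (like_speech + like_noise)
        posteriors.append(p_speech)
    return posteriors

# Hypothetical envelope: two noise-like frames, three speech-like frames, one noise-like frame.
frames = [-1.0, -0.8, 4.9, 5.2, 5.1, -0.9]
probs = speech_presence_posteriors(frames, noise=(-1.0, 1.0), speech=(5.0, 1.0))
```

With these toy values the posterior stays near 0 on the noise-like frames and near 1 on the speech-like frames. In a full system one such filter would run in every subband.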

3.
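Enhanced relative spectral filtering of speech signals   Total citations: 3; self-citations: 0; citations by others: 3
A new enhanced RASTA filtering method (ERASTA) is proposed that filters sequentially in the log power spectral domain and the power spectral domain of the speech signal. Speech recognition and speaker recognition experiments show that ERASTA filtering effectively removes interference from additive noise and convolutional noise, and that the ERASTA algorithm does not depend on the distortion process of the speech signal or on the noise power spectrum. ERASTA performs as well as or better than the JRASTA algorithm and does not need the real-time speech SNR estimate that JRASTA requires. The design of the ERASTA filter shows that low-frequency spectral modulation components can degrade both speech recognition and speaker recognition performance, and that speaker recognition needs a smaller spectro-temporal modulation bandwidth than speech recognition.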

4.
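A speech enhancement method combining magnitude-spectrum and power-spectrum dictionaries   Total citations: 1; self-citations: 0; citations by others: 1
An improved speech enhancement method based on sparse representation of spectral features is proposed from three angles: dual-path dictionary learning, noise power spectrum estimation, and speech magnitude spectrum reconstruction. In the dictionary learning stage, power-spectrum and magnitude-spectrum features are fused, and discriminative dictionaries are used to reduce the coherence between the speech dictionary and the noise dictionary. In the enhancement stage, a noise power spectrum estimation method is proposed to track non-stationary noise. Because magnitude-spectrum and power-spectrum features adapt differently to different noises, a table of speech reconstruction weights is designed. The two signals recovered from the magnitude spectrum and the power spectrum are adaptively weighted and reconstructed, and then combined with a phase compensation function to obtain the enhanced speech signal. Experimental results show that, in stationary and non-stationary noise, the method improves on speech enhancement methods that use a single spectral feature by 31.6% on average.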

5.
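Existing enhancement algorithms based on dictionary learning need prior information and are hard to run in real time. To address this, an unsupervised single-channel speech enhancement algorithm suited to real-time processing is proposed. First, the algorithm converts the problem of modeling background noise under unsupervised conditions into a sparse and low-rank noise decomposition of the noisy speech magnitude spectrum. Then an incremental nonnegative subspace method is used for online dictionary learning of the background noise, yielding an adaptive noise dictionary that reflects the time-varying character of the noise. Finally, the noisy speech is processed with the learned noise dictionary in a frame-by-frame iterative manner that lends itself to real-time operation. Experimental results show that, compared with multiband spectral subtraction and an enhancement algorithm based on low-rank and sparse matrix decomposition, the proposed algorithm is particularly strong in noise suppression and gives better results on several performance metrics.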

6.
李轶南  张雄伟  贾冲  陈亮  曾理 《声学学报》2015,40(4):607-614
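Existing enhancement algorithms based on dictionary learning need prior information and are hard to run in real time. To address this, an unsupervised single-channel speech enhancement algorithm suited to real-time processing is proposed. First, the algorithm converts the problem of modeling background noise under unsupervised conditions into a sparse and low-rank noise decomposition of the noisy speech magnitude spectrum. Then an incremental nonnegative subspace method is used for online dictionary learning of the background noise, yielding an adaptive noise dictionary that reflects the time-varying character of the noise. Finally, the noisy speech is processed with the learned noise dictionary in a frame-by-frame iterative manner that lends itself to real-time operation. Experimental results show that, compared with multiband spectral subtraction and an enhancement algorithm based on low-rank and sparse matrix decomposition, the proposed algorithm is particularly strong in noise suppression and gives better results on several performance metrics.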

7.
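Because traditional spectral-subtraction speech enhancement leaves residual "musical noise", the sound quality perceived through a cochlear implant (CI) that uses traditional spectral subtraction for noise reduction also suffers. To improve the noise robustness of CIs, this paper proposes an adaptive variable-order spectral subtraction algorithm and applies it to speech enhancement in cochlear implants. Following the frequency bands assigned to the CI electrodes, the algorithm first divides the power spectrum of the captured noisy signal into Bark subbands, and then adaptively adjusts the spectral-subtraction order and coefficients in each Bark subband according to changes in SNR. Noise is thereby removed more evenly across subbands, and the "musical noise" of the traditional method is largely eliminated. Cochlear implant ACE simulation experiments and listening tests based on the algorithm show that, compared with traditional spectral subtraction, the improved algorithm suppresses background noise and residual noise better, and the simulated CI synthesized sound is perceived as better and clearer.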
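
The idea of choosing the subtraction order and coefficient per subband from the band SNR can be sketched as below. This is an illustration under stated assumptions and not the paper's algorithm: the SNR-to-coefficient mapping is the well-known Berouti-style over-subtraction rule used as a stand-in, the order is simply passed in as a parameter, and the function names, the spectral floor and the toy band are hypothetical.

```python
import math

def oversubtraction_factor(snr_db, alpha0=4.0):
    """Berouti-style rule: subtract more aggressively when the band SNR is low."""
    snr_db = max(-5.0, min(20.0, snr_db))
    return alpha0 - 3.0 * snr_db / 20.0

def subtract_band(noisy_mag, noise_mag, order=2.0, floor=0.01):
    """Generalized spectral subtraction on the magnitudes of one subband.

    noisy_mag, noise_mag: magnitude spectra (same length) for the bins of a band.
    order: exponent gamma (1 = magnitude subtraction, 2 = power subtraction).
    floor: spectral floor, as a fraction of the noisy spectrum raised to `order`.
    """
    noisy_pow = sum(y * y for y in noisy_mag)
    noise_pow = sum(n * n for n in noise_mag)
    # Band SNR estimate in dB (noisy power over noise power), guarded against zeros.
    snr_db = 10.0 * math.log10(max(noisy_pow, 1e-12) / max(noise_pow, 1e-12))
    alpha = oversubtraction_factor(snr_db)
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        diff = y ** order - alpha * n ** order
        # Flooring keeps bins from going negative and masks isolated peaks.
        diff = max(diff, floor * y ** order)
        cleaned.append(diff ** (1.0 / order))
    return cleaned

# Hypothetical three-bin band: one strong speech bin and two noise-dominated bins.
band = subtract_band([4.0, 1.0, 0.5], [1.0, 1.0, 0.5])
```

In this toy band the strong bin is only slightly attenuated, while the two noise-dominated bins fall to the spectral floor. A per-band rule for the order itself would replace the fixed `order` argument.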

8.
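A speech reconstruction method from Mel-frequency cepstral coefficients using an L1/2 sparsity constraint   Total citations: 1; self-citations: 0; citations by others: 1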
周健  刘荣敏  窦云峰  路成  陶亮 《声学学报》2018,43(6):991-999
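A method is proposed for reconstructing the time-domain speech signal from Mel-frequency cepstral coefficients using an L1/2 sparsity constraint. Estimating the speech magnitude spectrum from Mel-frequency cepstral coefficients is an underdetermined problem. Existing methods use either minimum mean-square-error estimation of the magnitude spectrum or L1 regularization as a sparsity constraint on the magnitude spectrum. Because L1/2 promotes sparsity more strongly than the L1 regularization model, this paper introduces an L1/2 regularization constraint when estimating the speech magnitude spectrum from Mel-frequency cepstral coefficients, uses the resulting sparse magnitude spectrum to estimate the phase spectrum, and finally reconstructs the time-domain speech signal from the estimated spectrum. Experimental results show that the speech estimated by the proposed algorithm has higher quality than that of the minimum mean-square-error magnitude spectrum method; in reconstruction experiments under noise, the proposed algorithm reconstructs speech of better quality than L1-regularized magnitude spectrum estimation and is more robust to noise.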

9.
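A minima controlled recursive averaging speech enhancement algorithm based on bidirectional search   Total citations: 4; self-citations: 0; citations by others: 4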
曾毓敏  王鹏 《声学学报》2010,35(1):81-87
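Better speech enhancement depends on estimating the noise accurately and on tracking and updating noise changes promptly. To improve the estimation and updating of non-stationary noise, this paper builds on the improved minima controlled recursive averaging (IMCRA) algorithm and proposes an improved algorithm that searches for the minima of the noise spectrum in both directions. By combining the characteristics of forward and backward searches for spectral minima, the algorithm improves the accuracy of the noise estimate and reduces the delay in tracking non-stationary noise. Simulations show that the improved algorithm is very effective when enhancing speech in non-stationary noise and at low SNR, and that it achieves a larger segmental SNR improvement than IMCRA.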
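
The two ingredients that minima controlled recursive averaging builds on, a first-order recursive average of the noisy power and a windowed search for its minimum, can be sketched as follows. This is a simplified illustration only: it searches the most recent frames in one direction and does not reproduce the bidirectional search proposed in the paper or the full IMCRA algorithm. The function name, `alpha`, `window` and the toy sequence are assumptions.

```python
def track_noise_floor(power, alpha=0.8, window=4):
    """Smooth a per-frame power sequence and track its running minimum.

    power: noisy-speech power in one frequency bin, one value per frame.
    alpha: recursive-averaging (smoothing) constant, 0 <= alpha < 1.
    window: number of most recent smoothed frames searched for the minimum.
    Returns (smoothed, minima), two lists with one value per frame.
    """
    smoothed, minima = [], []
    for p in power:
        prev = smoothed[-1] if smoothed else p
        # First-order recursive average: S(l) = alpha*S(l-1) + (1-alpha)*P(l).
        smoothed.append(alpha * prev + (1.0 - alpha) * p)
        # Windowed search: minimum over the last `window` smoothed frames.
        minima.append(min(smoothed[-window:]))
    return smoothed, minima

# Hypothetical bin power: a short burst (frames 2-3) on top of a constant floor of 1.0.
smoothed, minima = track_noise_floor([1.0, 1.0, 9.0, 9.0, 1.0, 1.0, 1.0, 1.0])
```

In this toy run the tracked minimum stays at the floor during the burst, but once the pre-burst frames leave the window it jumps above the floor and only slowly decays back. That lag is the kind of tracking delay the abstract aims to reduce.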

10.
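To recover clean speech from a noisy signal, a single-channel speech enhancement algorithm using gender-dependent models is proposed. In the training stage, gender-dependent deep neural network and nonnegative matrix factorization (DNN-NMF) models are trained to estimate the weight parameters of the nonnegative matrix factorization. In the test stage, an algorithm based on nonnegative matrix factorization and a group-sparsity penalty is proposed to determine the gender of the speaker in the test speech; the corresponding model is then used to estimate the weights, which are combined with the trained dictionaries to enhance the speech. Experimental results show that the proposed algorithm outperforms several algorithms based on nonnegative matrix factorization and several based on deep neural networks in both noise suppression and speech quality.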

11.
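To separate voices in the presence of noise, a single-channel voice separation algorithm using sparse nonnegative matrix factorization and a deep attractor network is proposed. First, dictionary matrices for voice and noise are obtained by training and are used as prior information to separate the coefficient matrices of voice and noise from the noisy mixed speech. Next, because the different source components in the voice coefficient matrix differ in their similarity in the embedding space, a deep attractor network is used to separate it into a coefficient matrix for each source. Finally, the separated coefficient matrices and the voice dictionary matrix are used to reconstruct clean separated speech. Experiments under different noise conditions show that the algorithm suppresses background noise while improving the overall quality of the separated speech, and that it outperforms the comparison algorithm that combines speech-noise and voice separation models.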
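
The first and last steps of this pipeline (estimating coefficient matrices against fixed, pre-trained dictionaries, then rebuilding the voice part from the voice dictionary alone) can be sketched with plain nonnegative matrix factorization. This is a hedged toy example, not the paper's method: it uses the standard multiplicative update for the Euclidean cost without the sparsity term, omits the deep attractor network entirely, and all names and matrices are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def estimate_activations(v, w, iterations=200, eps=1e-12):
    """Estimate nonnegative H with V ~= W H while the dictionary W stays fixed.

    Uses the multiplicative update for the Euclidean cost:
    H <- H * (W^T V) / (W^T W H).
    """
    wt = transpose(w)
    wtv = matmul(wt, v)
    wtw = matmul(wt, w)
    h = [[1.0] * len(v[0]) for _ in range(len(w[0]))]
    for _ in range(iterations):
        wtwh = matmul(wtw, h)
        h = [[h_kj * num / (den + eps)
              for h_kj, num, den in zip(h_row, n_row, d_row)]
             for h_row, n_row, d_row in zip(h, wtv, wtwh)]
    return h

# Hypothetical toy dictionary: columns 0-1 model "voice", column 2 models "noise".
w = [[1.0, 0.0, 0.2],
     [0.0, 1.0, 0.2],
     [0.0, 0.0, 1.0]]
true_h = [[2.0, 1.0], [1.0, 3.0], [1.0, 1.0]]
v = matmul(w, true_h)                          # synthetic noisy mixture (3 bins x 2 frames)
h = estimate_activations(v, w)
voice = matmul([row[:2] for row in w], h[:2])  # rebuild using the voice atoms only
```

Because the toy mixture is generated exactly from the dictionary, the estimated coefficients converge to `true_h`, and `voice` contains only the part explained by the voice atoms.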

12.
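Whispered speech enhancement based on a modified Mel-domain masking model and speech absence probability   Total citations: 1; self-citations: 0; citations by others: 1
A whispered speech enhancement method based on a modified Mel-domain auditory masking model and speech absence probability is proposed. The method modifies the Mel frequency scale according to the articulation characteristics of whispered speech and applies Mel-domain band filtering to each frame of the whispered signal. It uses the speech absence probability (SAP) to determine the auditory masking threshold of each band dynamically, and adaptively adjusts the spectral-subtraction coefficients for the different masking thresholds to enhance the whispered speech. Objective and subjective tests on the enhanced whispered speech show that, compared with other spectral subtraction methods, the method keeps residual noise and background noise below the masking threshold of the human ear, achieves lower speech distortion, and greatly improves subjective listening quality.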

13.
A single-channel algorithm is proposed for noise reduction in cochlear implants. The proposed algorithm is based on subspace principles and projects the noisy speech vector onto "signal" and "noise" subspaces. An estimate of the clean signal is made by retaining only the components in the signal subspace. The performance of the subspace reduction algorithm was evaluated using 14 subjects wearing the Clarion device. Results indicated that the subspace algorithm produced significant improvements in sentence recognition scores compared to the subjects' daily strategy, at least in stationary noise. Further work is needed to extend the subspace algorithm to nonstationary noise environments.

14.
Although the signal subspace approach has been studied extensively for speech enhancement, no good solution has been found for identifying the signal subspace dimension in the multichannel case. This paper presents a signal subspace dimension estimator based on the F-norm of the correlation matrix, with which subspace-based multichannel speech enhancement is robust to adverse acoustic environments such as room reverberation and low input signal-to-noise ratio (SNR). Experiments demonstrate that the presented method leads to more noise reduction and less speech distortion than traditional methods.

15.
This paper presents a subspace approach for voice activity detection (VAD). The proposed approach is based on an embedded prewhitening scheme for the simultaneous diagonalization of the clean speech and noise covariance matrices, which provides a decision rule based on a likelihood ratio test in the signal subspace domain. Experimental results show that the proposed subspace-based VAD algorithm outperforms the method using a Gaussian model in the conventional discrete Fourier transform domain at low signal-to-noise ratios.

16.
This paper presents a new method for speech enhancement based on time-frequency analysis and adaptive digital filtering. The proposed method for dual-channel speech enhancement tracks the frequencies of the corrupting signal with the discrete Gabor transform (DGT) and implements a multi-notch adaptive digital filter (MNADF) at those frequencies. Because it requires no a priori knowledge of the noise source statistics, the method differs from traditional speech enhancement methods. Specifically, the proposed method was applied to the case where speech quality and intelligibility deteriorate in the presence of background noise. Speech coders and automatic speech recognition (ASR) systems are designed to act on clean speech signals, so speech signals corrupted by noise must be enhanced before they are processed. The method uses a primary input containing the corrupted speech signal and a reference input containing only the noise. In this paper, an MNADF was designed instead of a single-notch adaptive digital filter, and the DGT was used to track the frequencies of the corrupting signal, because fast filtering and fast measurement of the time-dependent noise frequency are of great importance in speech enhancement. Different types of noise from the Noisex-92 database were used to degrade real speech signals. Objective measures (speech spectrograms, global signal-to-noise ratio (SNR), segmental SNR (segSNR), and the Itakura-Saito distance) as well as subjective listening tests consistently demonstrated superior enhancement performance of the proposed method over a traditional speech enhancement method such as spectral subtraction. Combining MNADF and DGT yielded excellent speech enhancement.
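
A single notch of the kind such a filter bank is built from can be sketched with the classic two-weight LMS noise canceller. This is an illustration under stated assumptions, not the paper's MNADF: the interferer frequency is assumed to be already known (the paper tracks it with the DGT), the cosine and sine references are synthesized rather than taken from a reference input, and the function name, step size `mu` and toy signal are hypothetical.

```python
import math

def lms_notch(primary, freq_hz, fs_hz, mu=0.05):
    """Cancel one sinusoidal interferer from `primary` with a two-weight LMS filter.

    primary: samples of the corrupted signal.
    freq_hz: frequency of the interferer to remove (assumed known here).
    fs_hz: sampling rate.
    mu: LMS step size.
    Returns the error signal, which is the notch-filtered output.
    """
    w_cos = w_sin = 0.0
    output = []
    for n, d in enumerate(primary):
        phase = 2.0 * math.pi * freq_hz * n / fs_hz
        ref_cos, ref_sin = math.cos(phase), math.sin(phase)
        # Current estimate of the interferer from the two quadrature references.
        estimate = w_cos * ref_cos + w_sin * ref_sin
        error = d - estimate
        # LMS weight update driven by the output error.
        w_cos += 2.0 * mu * error * ref_cos
        w_sin += 2.0 * mu * error * ref_sin
        output.append(error)
    return output

# Hypothetical test signal: a pure 1 kHz interferer with unknown amplitude and phase.
fs = 8000.0
tone = [0.5 * math.sin(2.0 * math.pi * 1000.0 * n / fs + 0.3) for n in range(2000)]
residual = lms_notch(tone, 1000.0, fs)
```

Once the two weights have adapted, the residual of the pure tone is close to zero. A multi-notch version would run one such weight pair per tracked frequency and subtract all of the estimates from the primary input.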

17.
An efficient method for calculating the Lagrange multipliers and the analytical gradients of one state included in a state average MCSCF wave function is presented. It is demonstrated that the state average energy of an ‘equal-weight’ scheme is invariant to rotations within the state average subspace and that the corresponding rotations should be eliminated from the Lagrangian equations. Finally, a diagnostic is presented, which gauges the energy difference between a state defined by a state average calculation and the corresponding fully variational multi-configurational SCF state.

18.
In this paper, a novel single microphone channel-based speech enhancement technique is presented. While most of the conventional nonnegative matrix factorization-based approaches focus on generating a basis matrix of speech and noise for enhancement, the proposed algorithm performs an additional process to reconstruct speech from noisy speech when these two elements are highly overlapped in selected spectral bands. This process involves a log-spectral amplitude based estimator, which provides the spectrotemporal speech presence probability to obtain a more accurate reconstruction. Moreover, the proposed algorithm applies an unsupervised learning method to the input noise, so it is adaptable to any type of environmental noise without a pre-trained dictionary. The experimental results demonstrate that the proposed algorithm obtains improved speech enhancement performance compared with conventional single channel-based approaches.

19.
A fictitious domain method is presented for solving elliptic partial differential equations using Galerkin spectral approximation. The fictitious domain approach consists in immersing the original domain into a larger and geometrically simpler one in order to avoid the use of boundary fitted or unstructured meshes. In the present study, boundary constraints are enforced using Lagrange multipliers and the novel aspect is that the Lagrange multipliers are associated with smooth forcing functions, compactly supported inside the fictitious domain. This allows the accuracy of the spectral method to be preserved, unlike the classical discrete Lagrange multipliers method, in which the forcing is defined on the boundaries. In order to have a robust and efficient method, equations for the Lagrange multipliers are solved directly with an influence matrix technique. Using a Fourier–Chebyshev approximation, the high-order accuracy of the method is demonstrated on one- and two-dimensional elliptic problems of second- and fourth-order. The principle of the method is general and can be applied to solve elliptic problems using any high order variational approximation.
