期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

黄建军张雄伟张亚非邹霞《声学学报》2012,37(5):539-547

针对以往语音增强算法在非平稳噪声环境下性能急剧下降的问题,基于时频字典学习方法提出了一种新的单通道语音增强算法。首先,提出采用时频字典学习方法对噪声的频谱结构的先验信息进行建模,并将其融入到卷积非负矩阵分解的框架下;然后,在固定噪声时频字典情况下,推导了时变增益和语音时频字典的乘性迭代求解公式;最后,利用该迭代公式更新语音和噪声的时变增益系数以及语音的时频字典,通过语音时频字典和时变增益的卷积运算重构出语音的幅度谱并用二值时频掩蔽方法消除噪声干扰。实验结果表明,在多项语音质量评价指标上,本文算法都取得了更好的结果。在非平稳噪声和低信噪比环境下,相比于多带谱减法和非负稀疏编码去噪算法,本文算法更有效地消除了噪声,增强后的语音具有更好的质量。相似文献

2.

面向语音增强的约束序贯高斯混合模型噪声功率谱估计 总被引：1，自引：0，他引：1

下载免费PDF全文

许春冬张震战鸽应冬文李军锋颜永红《声学学报》2017,42(5):633-640

提出了一种基于极大似然的噪声对数功率谱估计方法,采用高斯混合模型对每一个频带上的功率谱包络构建统计模型,将时序包络划分为语音和非语音类,它们分别对应于高斯混合模型的两个高斯分量,描述语音和非语音的统计分布,其中非语音高斯分量的均值即为噪声功率谱的最优估计.采用序贯学习的方法,在极大似然准则下逐帧更新模型参数,并逐帧给出噪声功率谱的最优估计值。此外,由于序贯更新过程中语音信号长时缺失,容易导致模型失稳,提出了一种在线的最小描述长度准则(MDL)来判断语音信号是否长时缺失,从而保证了模型的稳定性.实验表明,算法性能整体优于经典的MS和IMCRA算法。相似文献

3.

基于字典学习和稀疏表示的单通道语音增强算法综述* 总被引：1，自引：0，他引：1

下载免费PDF全文

叶中付朱媛媛贾翔宇《应用声学》2019,38(4):645-652

如何从带噪语音信号中恢复出干净的语音信号一直都是信号处理领域的热点问题。近年来研究者相继提出了一些基于字典学习和稀疏表示的单通道语音增强算法,这些算法利用语音信号在时频域上的稀疏特性,通过学习训练数据样本的结构特征和规律来构造相应的字典,再对带噪语音信号进行投影以估计出干净语音信号。针对训练样本与测试数据不匹配的情况,有监督类的非负矩阵分解方法与基于统计模型的传统语音增强方法相结合,在增强阶段对语音字典和噪声字典进行更新,从而估计出干净语音信号。本文首先介绍了单通道情况下语音增强的信号模型,然后对4种典型的增强方法进行了阐述,最后对未来可能的研究热点进行了展望。相似文献

4.

稀疏低秩噪声模型下无监督实时单通道语音增强算法

下载免费PDF全文

李轶南张雄伟贾冲陈亮曾理《声学学报》2015,40(4):607-614

针对现有基于字典学习的增强算法需要先验信息、不易实时处理的问题,提出一种便于实时处理的无监督的单通道语音增强算法。首先,该算法将无监督条件下背景噪声的建模问题转化为带噪语音幅度谱的稀疏低秩噪声分解;然后,采用增量非负子空间方法对背景噪声进行在线字典学习,获得能够体现背景噪声时变特性的自适应噪声字典;最后,利用所得的噪声字典,采用易于实时处理的逐帧迭代方式,对带噪语音进行处理。实验结果表明:相较于多带谱减法和基于低秩稀疏矩阵分解的增强算法,所提算法在噪声抑制方面的性能尤为显著,在多项性能评价指标上,均表现出更好的结果。相似文献

5.

稀疏低秩噪声模型下无监督实时单通道语音增强算法

《声学学报：英文版》2015,(4)

针对现有基于字典学习的增强算法需要先验信息、不易实时处理的问题,提出一种便于实时处理的无监督的单通道语音增强算法。首先,该算法将无监督条件下背景噪声的建模问题转化为带噪语音幅度谱的稀疏低秩噪声分解;然后,采用增量非负子空间方法对背景噪声进行在线字典学习,获得能够体现背景噪声时变特性的自适应噪声字典;最后,利用所得的噪声字典,采用易于实时处理的逐帧迭代方式,对带噪语音进行处理。实验结果表明:相较于多带谱减法和基于低秩稀疏矩阵分解的增强算法,所提算法在噪声抑制方面的性能尤为显著,在多项性能评价指标上,均表现出更好的结果。相似文献

6.

基于多域特征提取和深度学习的声源被动测距*

下载免费PDF全文

肖旭王同王文博苏林马力任群言《应用声学》2021,40(1):131-141

由于实际海洋环境中存在大量的非高斯噪声,一些基于高斯假设的传统去噪方法在实际海洋环境中性能下降甚至失效。针对非高斯噪声,如α稳定分布噪声、非平稳行船噪声下的脉冲信号的去噪与重构,该文提出一种基于深度学习的方法。去噪模型首先通过学习带噪信号短时傅里叶变换谱与残差谱之间的映射关系以去除环境噪声,之后对去噪信号的时频谱进行逆变换重构脉冲信号。仿真实验结果表明,深度学习模型在非高斯噪声环境下脉冲信号的去噪与重构任务中有着良好的表现,在实测样本上也表现出良好的泛化性,体现了一定的工程应用价值。相似文献

7.

基于双向搜索方法的最小值控制递归平均语音增强算法 总被引：4，自引：0，他引：4

曾毓敏王鹏《声学学报》2010,35(1):81-87

语音增强效果的提高,有赖于对噪声的准确估计和对噪声变化的及时跟踪与更新。为了提高对非平稳噪声的估计和更新能力,本文基于"改进的最小值控制递归平均"（IMCRA）算法,提出了噪声谱最小值双向搜索的改进算法。该算法结合前向搜索和后向搜索谱最小值方法的特点,有效提高噪声估计的准确性、减小非平稳噪声跟踪的延迟。实验仿真表明:在非平稳噪声环境和低信噪比条件的语音信号增强处理中,本文提出的改进算法非常有效,与IMCRA算法相比,它可以获得更好的分段信噪比的提高。相似文献

8.

舰船多辐射声源近距离通过特性的仿真

下载免费PDF全文

罗建湛雅倩马定坤《应用声学》2008,27(2):108-112

由于舰船三个特定部位(尾部、中后部、中部)的辐射噪声具有明显不同的功率谱特征,利用这些特征具有实际的意义。我们采用简化的近距离舰船辐射噪声三亮点模型来逼近舰船这三个特定部位的辐射噪声中的连续谱结构,采用时间滑动的卷积算法来重构舰船某一辐射噪声源的噪声序列,重构的噪声序列同时具有要求的幅度概率分布和功率谱形状。在重构不同辐射声源时域信号的基础上,对舰船辐射噪声的通过特性以及三个辐射噪声源在不同的接收位置的影响进行了仿真。仿真结果与实测的某型舰船的通过频谱结果相比较,表明其频域特征是大体一致的。相似文献

9.

采用L_1/2稀疏约束的梅尔倒谱系数语音重建方法 总被引：1，自引：0，他引：1

下载免费PDF全文

周健刘荣敏窦云峰路成陶亮《声学学报》2018,43(6):991-999

提出了一种利用L_1/2稀疏约束从梅尔倒谱系数重建语音时域信号方法。从梅尔倒谱系数估计语音幅度谱是一个欠定问题,现有的方法均采用幅度谱最小均方误差估计或采用L1正则化进行幅度谱的稀疏约束。相比于L₁正则化模型,L_1/2的稀疏约束特性更强,为此,本文在从梅尔倒谱系数估计语音幅度谱时引入L_1/2正则化约束,并利用求解的稀疏幅度谱估计相位谱,最后利用估计的频谱重建时域语音信号。实验结果表明,与幅度谱最小均方误差法相比,本文算法所估计出的语音信号具有更高的语音质量;在噪声环境下进行语音重建实验,与L₁正则化幅度谱估计方法相比,本文算法重建的语音质量更好,表现出更好抗噪性。相似文献

10.

联合深度编解码网络和时频掩蔽估计的单通道语音增强 总被引：1，自引：0，他引：1

下载免费PDF全文

时文华张雄伟邹霞孙蒙李莉《声学学报》2020,45(3):299-307

提出了一种联合深度编解码神经网络和时频掩蔽估计的语音增强方法。该方法利用深度编解码网络估计时频掩蔽表示,并联合带噪语音的幅度谱学习带噪语音与纯净语音幅度谱之间的非线性映射关系。深度编解码网络采用卷积-反卷积网络结构。在编码端,利用卷积网络的局部感知特性,对带噪语音的时频域结构特征进行建模,提取语音特征,同时抑制背景噪声。在解码端,利用编码端提取到的语音特征逐层恢复局部细节信息并重构语音信号。同时,在编解码端对应层之间引入跳跃连接,以减少由于池化和全连接操作导致的低层细节信息丢失的问题。在TIMIT语音库和不完全匹配噪声集下进行仿真实验,实验结果表明,该方法可以有效抑制噪声,且能较好地恢复出语音细节成分。相似文献

11.

Single-channel speech enhancement method using reconstructive NMF with spectrotemporal speech presence probabilities

Seongjae Lee David K. Han Hanseok Ko 《Applied Acoustics》2017

In this paper, a novel single microphone channel-based speech enhancement technique is presented. While most of the conventional nonnegative matrix factorization-based approaches focus on generating a basis matrix of speech and noise for enhancement, the proposed algorithm performs an additional process to reconstruct speech from noisy speech when these two elements are highly overlapped in selected spectral bands. This process involves a log-spectral amplitude based estimator, which provides the spectrotemporal speech presence probability to obtain a more accurate reconstruction. Moreover, the proposed algorithm applies an unsupervised learning method to the input noise, so it is adaptable to any type of environmental noise without a pre-trained dictionary. The experimental results demonstrate that the proposed algorithm obtains improved speech enhancement performance compared with conventional single channel-based approaches. 相似文献

12.

Noise estimation based on time–frequency correlation for speech enhancement

Wenhao Yuan Jiajun Lin Wei An Yu Wang Ning Chen 《Applied Acoustics》2013,74(5):770-781

As a fundamental part of speech enhancement, noise estimation is particularly challenging in highly non-stationary noise environments. In this work, we propose an effective algorithm on the basis of the “Improved Minima Controlled Recursive Averaging (IMCRA)” with the objective to improve the performance of noise estimation. The main contributions of this work are: (i) in the algorithm, a rough decision about speech presence is proposed by calculating the autocorrelation and cross-channel correlation of the T–F (Time–Frequency) units; (ii) with this decision, we refine the smoothing parameters for the smoothing of noisy power spectrum and the recursive averaging in noise spectrum estimation as well as the weighting factor for the a priori SNR (Signal to Noise Ratio) estimation in the IMCRA; (iii) we improve the search of local minima during spectral bursts by adding a minimum search with a shorter window. Extensive experiments are carried out to evaluate the performance of our proposed algorithm. The experimental results illustrate that, compared with the IMCRA, the proposed approach significantly improves the accuracy of noise spectrum estimation and the quality of enhanced speech in the typical noise situations. 相似文献

13.

An approach based on simplified KLT and wavelet transform for enhancing speech degraded by non-stationary wideband noise

Hong Wei Lou Guang Rui Hu 《Journal of sound and vibration》2003,268(4):717-729

It is well known that the non-stationary wideband noise is the most difficult to be removed in speech enhancement. In this paper a novel speech enhancement algorithm based on the dyadic wavelet transform and the simplified Karhunen-Loeve transform (KLT) is proposed to suppress the non-stationary wideband noise. The noisy speech is decomposed into components by the wavelet space and KLT-based vector space, and the components are processed and reconstructed, respectively, by distinguishing between voiced speech and unvoiced speech. There are no requirements of noise whitening and SNR pre-calculating. In order to evaluate the performance of this algorithm in more detail, a three-dimensional spectral distortion measure is introduced. Experiments and comparison between different speech enhancement systems by means of the distortion measure show that the proposed method has no drawbacks existing in the previous methods and performs better shaping and suppressing of the non-stationary wideband noise for speech enhancement. 相似文献

14.

An iterative noise cross-PSD estimation for two-microphone speech enhancement 总被引：2，自引：0，他引：2

Mohsen Rahmani Ahmad Akbari Beghdad Ayad 《Applied Acoustics》2009,70(3):514-521

Among various speech enhancement methods, dual-microphone methods are of a great importance for their low cost implementation and for exploiting spatial-filtering benefits of the microphone arrays. Coherence based methods are well-known as efficient two-microphone noise reduction techniques. These techniques do not work well, when received noise signals are correlated. These can be improved when the cross power spectral density (CPSD) of noise is available. In this paper, we propose an iterative approach for estimation of the noise CPSD to be employed in coherence based methods. We compare the proposed iterative noise CPSD estimation with a noise CPSD estimation technique based on voice activity detector (VAD), both of which are employed in a two-microphone speech enhancement, separately. Evaluation results show that the two-microphone speech enhancement scheme utilizing the proposed noise CPSD estimation technique performs superior than the enhancement system using the VAD-based noise CPSD estimation. 相似文献

15.

Speech enhancement using a generic noise codebook

S Srinivasan DH Rao Naidu 《The Journal of the Acoustical Society of America》2012,132(2):EL161-EL167

Although single-microphone noise reduction methods perform well in stationary noise environments, their performance in non-stationary conditions remains unsatisfactory. Use of prior knowledge about speech and noise power spectral densities in the form of trained codebooks has been previously shown to address this limitation. While it is possible to use trained speech codebooks in a practical system, the variety of noise types encountered in practice makes the use of trained noise codebooks less practical. This letter presents a method that uses a generic noise codebook for speech enhancement that can be generated on-the-fly and provides good performance. 相似文献

16.

联合精确比值掩蔽与深度神经网络的单通道语音增强方法

下载免费PDF全文

柏浩钧张天骐刘鉴兴叶绍鹏《声学学报》2022,47(3):394-404

针对目前有监督语音增强忽略了纯净语音、噪声与带噪语音之间的幅度谱相似性对增强效果影响等问题,提出了一种联合精确比值掩蔽(ARM)与深度神经网络(DNN)的语音增强方法。该方法利用纯净语音与带噪语音、噪声与带噪语音的幅度谱归一化互相关系数,设计了一种基于时频域理想比值掩蔽的精确比值掩蔽作为目标掩蔽;然后以纯净语音和噪声幅度谱为训练目标的DNN为基线,通过该DNN的输出来估计目标掩蔽,并对基线DNN和目标掩蔽进行联合优化,增强语音由目标掩蔽从带噪语音中估计得到;此外,考虑到纯净语音与噪声的区分性信息,采用一种区分性训练函数代替均方误差(MSE)函数作为基线DNN的目标函数,以使网络输出更加准确。实验表明,区分性训练函数提升了基线DNN以及整个联合优化网络的增强效果;在匹配噪声和不匹配噪声下,相比于其它常见DNN方法,本文方法取得了更高的平均客观语音质量评估(PESQ)和短时客观可懂度(STOI),增强后的语音保留了更多语音成分,同时对噪声的抑制效果更加明显。相似文献