Similar literature
1.
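To address the difficulty of converting noisy speech effectively, this paper proposes a noise-robust voice conversion algorithm based on joint dictionary optimization. In constructing the joint dictionary, the speech dictionary is optimized with the Backward Elimination (BE) algorithm, and a noise dictionary is introduced so that noisy speech matches the joint dictionary. Experimental results show that, while preserving conversion quality, backward elimination reduces the number of dictionary frames and lowers the computational cost. At low signal-to-noise ratios and under several noise types, the proposed algorithm converts better than the conventional NMF algorithm and an NMF conversion algorithm that denoises with spectral subtraction; introducing the noise dictionary improves the noise robustness of the voice conversion system.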
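The role of the noise dictionary can be illustrated with a small sketch. It assumes the usual exemplar-based setup, which the abstract does not spell out: a noisy frame is explained by fixed source-speech and noise atoms, and only the speech activations are carried over to a parallel target dictionary. The function names, the Euclidean multiplicative update, and the toy one-hot atoms are all illustrative assumptions, not the authors' implementation.

```python
def estimate_activations(frame, atoms, n_iter=100, eps=1e-12):
    """Non-negative activations h such that frame ~= sum_k h[k] * atoms[k].

    The atoms stay fixed; only h is updated, using the multiplicative
    rule for the Euclidean NMF cost: h <- h * (W^T v) / (W^T W h).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    wtv = [dot(atom, frame) for atom in atoms]              # W^T v
    wtw = [[dot(ai, aj) for aj in atoms] for ai in atoms]   # W^T W
    h = [1.0] * len(atoms)
    for _ in range(n_iter):
        denom = [dot(row, h) + eps for row in wtw]
        h = [hk * nk / dk for hk, nk, dk in zip(h, wtv, denom)]
    return h


def convert_frame(noisy_frame, source_atoms, noise_atoms, target_atoms):
    """Explain the noisy frame with [source | noise], rebuild with target atoms."""
    h = estimate_activations(noisy_frame, source_atoms + noise_atoms)
    h_speech = h[:len(source_atoms)]  # discard the noise activations
    n_bins = len(target_atoms[0])
    return [sum(hk * atom[i] for hk, atom in zip(h_speech, target_atoms))
            for i in range(n_bins)]


# Toy 3-bin example with hypothetical one-hot atoms.
source_atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
noise_atoms = [[0.0, 0.0, 1.0]]
target_atoms = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
converted = convert_frame([1.0, 2.0, 0.5], source_atoms, noise_atoms, target_atoms)
```

In this toy case the third bin is absorbed entirely by the noise atom, so it does not leak into the converted frame.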

2.
For cases where the target speaker's corpus is limited, this paper proposes a voice conversion algorithm that uses a unified tensor dictionary. First, parallel speech from N speakers is selected at random from the speech corpus to form the basis of the tensor dictionary. After multi-sequence dynamic time warping aligns the chosen speech, N two-dimensional basic dictionaries are generated, which together constitute the unified tensor dictionary. In the conversion stage, the dictionaries of the source and target speakers are each built as a linear combination of the N basic dictionaries, using the two speakers' speech. Experimental results show that with 14 basic speakers the algorithm achieves performance comparable to the traditional NMF-based method while using very little target-speaker corpus, which greatly facilitates the application of voice conversion systems.

3.
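Gu Dong, Jian Zhihua. Acta Acustica (《声学学报》), 2018, 43(5): 864-872
To handle cases where the target speaker's corpus may be insufficient, this paper proposes a unified tensor dictionary voice conversion algorithm for limited corpora. N speakers are selected from the corpus as the basic speakers of the speech tensor dictionary, and a multi-sequence dynamic time warping algorithm aligns the parallel speech segments of these N speakers, yielding a tensor dictionary composed of N two-dimensional basic dictionaries. In the conversion stage, the speech dictionaries of the source and target speakers are each constructed as a linear combination of the basic dictionaries in the tensor dictionary, which enables voice conversion. Experimental results show that once the number of basic speakers reaches 14, only a very small amount of target-speaker corpus is needed to match the conversion quality of the conventional algorithm based on non-negative matrix factorization, which greatly facilitates the application of voice conversion systems.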

4.
In this paper, a novel single-microphone, single-channel speech enhancement technique is presented. While most conventional approaches based on nonnegative matrix factorization focus on generating a basis matrix of speech and noise for enhancement, the proposed algorithm performs an additional step to reconstruct speech from noisy speech when the two overlap heavily in selected spectral bands. This step involves a log-spectral-amplitude-based estimator, which provides the spectrotemporal speech presence probability to obtain a more accurate reconstruction. Moreover, the proposed algorithm applies an unsupervised learning method to the input noise, so it can adapt to any type of environmental noise without a pre-trained dictionary. The experimental results demonstrate that the proposed algorithm improves speech enhancement performance compared with conventional single-channel approaches.

5.
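Li Yinan, Zhang Xiongwei, Jia Chong, Chen Liang, Zeng Li. Acta Acustica (《声学学报》), 2015, 40(4): 607-614
To address the problem that existing enhancement algorithms based on dictionary learning require prior information and are hard to run in real time, an unsupervised single-channel speech enhancement algorithm suited to real-time processing is proposed. First, the algorithm recasts the modeling of background noise under unsupervised conditions as a sparse low-rank noise decomposition of the noisy-speech magnitude spectrum. Then, an incremental non-negative subspace method performs online dictionary learning on the background noise, yielding an adaptive noise dictionary that captures the time-varying characteristics of the noise. Finally, using the resulting noise dictionary, the noisy speech is processed frame by frame in an iterative manner that is amenable to real-time processing. Experimental results show that, compared with multi-band spectral subtraction and an enhancement algorithm based on low-rank and sparse matrix decomposition, the proposed algorithm performs particularly well in noise suppression and gives better results on several performance metrics.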

6.
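To address the sharp performance degradation of earlier speech enhancement algorithms in non-stationary noise, a new single-channel speech enhancement algorithm based on time-frequency dictionary learning is proposed. First, time-frequency dictionary learning is used to model prior information about the spectral structure of the noise, and this model is incorporated into the convolutive non-negative matrix factorization framework. Then, with the noise time-frequency dictionary held fixed, multiplicative iterative update formulas are derived for the time-varying gains and the speech time-frequency dictionary. Finally, these formulas are used to update the time-varying gain coefficients of speech and noise and the speech time-frequency dictionary; the speech magnitude spectrum is reconstructed by convolving the speech time-frequency dictionary with the time-varying gains, and binary time-frequency masking removes the noise interference. Experimental results show that the proposed algorithm achieves better results on several speech quality metrics. In non-stationary noise and at low signal-to-noise ratios, it removes noise more effectively than multi-band spectral subtraction and non-negative sparse coding denoising, and the enhanced speech has better quality.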

7.
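To recover clean speech from a noisy signal, a single-channel speech enhancement algorithm using gender-dependent models is proposed. Specifically, in the training stage, gender-dependent deep neural network and non-negative matrix factorization (DNN-NMF) models are trained to estimate the weight parameters of the non-negative matrix factorization. In the test stage, an algorithm based on non-negative matrix factorization with a group-sparsity penalty determines the gender of the speaker in the test speech; the corresponding model then estimates the weights, which are combined with the trained dictionaries to enhance the speech. Experimental results show that the proposed algorithm outperforms several NMF-based and DNN-based algorithms in both noise suppression and speech quality.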

8.
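Recovering a clean speech signal from a noisy one has long been an active problem in signal processing. In recent years, researchers have proposed a number of single-channel speech enhancement algorithms based on dictionary learning and sparse representation. These algorithms exploit the sparsity of speech in the time-frequency domain: they construct dictionaries by learning the structural features and regularities of training samples, then project the noisy speech onto those dictionaries to estimate the clean speech. For cases where the training samples and test data are mismatched, supervised non-negative matrix factorization methods are combined with traditional enhancement methods based on statistical models, updating the speech and noise dictionaries during the enhancement stage to estimate the clean speech. This paper first introduces the signal model for single-channel speech enhancement, then describes four typical enhancement methods, and finally discusses possible future research directions.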

9.
This paper addresses the problem of improving speech quality with adaptive filtering algorithms. Recently, in Djendi and Bendoumia (2014) [1], we proposed a new two-channel backward algorithm for noise reduction and speech intelligibility enhancement. The main drawback of that two-channel subband algorithm is its poor performance when the number of subbands is high, which is clearly visible in the steady-state values. The source of the problem is the fixed step sizes of the cross-adaptive filtering algorithms, which distort the speech signal when chosen large and slow convergence when chosen small. In this paper, we propose four modifications of this algorithm that improve both the convergence speed and the steady-state values, even in very noisy conditions and with a high number of subbands. To confirm the performance of the four proposed variable-step-size SBBSS algorithms, we carried out several simulations in various noisy environments. In these simulations, we evaluated objective and subjective criteria (the system mismatch, the cepstral distance, the output signal-to-noise ratio, and the mean opinion score (MOS)) to compare the four proposed variable-step-size versions of the SBBSS algorithm with their original versions and with the two-channel fullband backward (2CFB) least mean square algorithm.

10.
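A recursive estimation algorithm for sliding-window cumulants is proposed and applied to speech endpoint detection, to address the degradation of conventional endpoint detection methods in noisy environments. After the noisy speech signal is windowed, the recursive sliding-window algorithm estimates its higher-order cumulants, which are then combined with an energy feature for endpoint detection. Experimental results show that the proposed recursive estimation is markedly more efficient than the conventional computation of higher-order cumulants, and that across different noise types and signal-to-noise ratios the proposed endpoint detector raises the point correct rate (Pc-point) by 6.07% on average compared with the G.729b algorithm. The endpoint detection algorithm based on sliding-window higher-order cumulants is computationally efficient and robust.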
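The abstract does not give the recursion itself, so the following is only a sketch of the general idea under stated assumptions: the signal is treated as zero-mean, the fourth-order cumulant is taken as c4 = E[x^4] - 3(E[x^2])^2, and the window statistics are kept as running sums so that each new sample costs constant time. The class name `SlidingCumulant4` and the helper `direct_cumulant4` are hypothetical.

```python
from collections import deque


class SlidingCumulant4:
    """Fourth-order cumulant over the last `size` samples, updated in O(1).

    Assumes a zero-mean signal, for which c4 = E[x^4] - 3 * (E[x^2])^2.
    """

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.sum2 = 0.0  # running sum of x^2 inside the window
        self.sum4 = 0.0  # running sum of x^4 inside the window

    def update(self, x):
        # Add the newest sample to the running sums.
        self.window.append(x)
        self.sum2 += x ** 2
        self.sum4 += x ** 4
        # Remove the sample that just left the window.
        if len(self.window) > self.size:
            old = self.window.popleft()
            self.sum2 -= old ** 2
            self.sum4 -= old ** 4
        n = len(self.window)
        m2 = self.sum2 / n
        m4 = self.sum4 / n
        return m4 - 3.0 * m2 ** 2


def direct_cumulant4(samples):
    """Non-recursive reference: recompute both moments from scratch."""
    n = len(samples)
    m2 = sum(x ** 2 for x in samples) / n
    m4 = sum(x ** 4 for x in samples) / n
    return m4 - 3.0 * m2 ** 2


signal = [0.5, -1.0, 2.0, -0.5, 1.5, -2.0, 0.25, 1.0]
tracker = SlidingCumulant4(size=4)
recursive = [tracker.update(x) for x in signal]
```

The direct version recomputes both sums for every window position, while the recursive version touches only the entering and leaving samples. On long signals the running sums can accumulate rounding error, so a practical implementation might refresh them periodically.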

11.
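A speech enhancement method combining magnitude-spectrum and power-spectrum dictionaries
An improved speech enhancement method based on sparse representation of spectral features is proposed from three angles: dual-path dictionary learning, noise power spectrum estimation, and speech magnitude spectrum reconstruction. In the dictionary learning stage, power-spectrum and magnitude-spectrum features are fused, and discriminative dictionaries are used to reduce the coherence between the speech and noise dictionaries. In the enhancement stage, a noise power spectrum estimation method is proposed to track non-stationary noise. Because magnitude-spectrum and power-spectrum features suit different noises to different degrees, a table of speech reconstruction weights is designed. The two signals recovered from the magnitude spectrum and the power spectrum are combined by adaptive weighting and, together with a phase compensation function, yield the enhanced speech. Experimental results show that, in stationary and non-stationary noise, the method achieves an average improvement of 31.6% over enhancement methods that use a single spectral feature.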

12.
This paper presents an accurate speech detection algorithm for improving the performance of speech recognition systems working in noisy environments. The proposed method is based on a hard-decision clustering approach in which a set of prototypes characterizes the noisy channel. The presence of speech is detected by a decision rule formulated in terms of an averaged distance between the observation vector and a cluster-based noise model. The algorithm benefits from contextual information, a strategy that considers not only a single speech frame but also a neighborhood of data in order to smooth the decision function and improve the robustness of speech detection. The proposed scheme has a low computational cost, which makes it suitable for real-time applications such as automated speech recognition systems. An exhaustive analysis is conducted on the AURORA 2 and AURORA 3 databases to assess the performance of the algorithm and to compare it with existing standard voice activity detection (VAD) methods. The results show significant improvements in detection accuracy and speech recognition rate over standard VADs such as ITU-T G.729, ETSI GSM AMR, and ETSI AFE for distributed speech recognition, and over a representative set of recently reported VAD algorithms.
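The decision rule described above might look roughly like the sketch below. It is one reading of the abstract rather than the published algorithm: the noise model is a list of prototype vectors, the per-frame score is the Euclidean distance to the nearest prototype, and the score is averaged over a symmetric context window before thresholding. The feature vectors, the `threshold` value, and the `context` width are made-up illustration values.

```python
import math


def nearest_prototype_distance(frame, prototypes):
    """Euclidean distance from a feature vector to the closest noise prototype."""
    return min(math.dist(frame, p) for p in prototypes)


def detect_speech(frames, prototypes, threshold, context=1):
    """Flag frame t as speech when the distance to the noise model,
    averaged over frames t-context .. t+context, exceeds `threshold`."""
    dists = [nearest_prototype_distance(f, prototypes) for f in frames]
    decisions = []
    for t in range(len(dists)):
        lo = max(0, t - context)
        hi = min(len(dists), t + context + 1)
        avg = sum(dists[lo:hi]) / (hi - lo)  # smoothed decision function
        decisions.append(avg > threshold)
    return decisions


# Hypothetical 2-D features: two noise prototypes, speech far from both.
prototypes = [[0.0, 0.0], [1.0, 0.0]]
frames = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0],
          [5.0, 5.0], [6.0, 5.0], [5.0, 6.0],
          [0.0, 0.0], [1.0, 0.0]]
flags = detect_speech(frames, prototypes, threshold=3.0, context=1)
```

Averaging over neighboring frames trades a little boundary precision for fewer isolated false alarms, which is the smoothing effect the abstract attributes to contextual information.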

13.
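Current supervised speech enhancement ignores how the magnitude-spectrum similarity among clean speech, noise, and noisy speech affects the enhancement result. To address this, a speech enhancement method combining an accurate ratio mask (ARM) with a deep neural network (DNN) is proposed. Using the normalized cross-correlation coefficients between the magnitude spectra of clean speech and noisy speech, and of noise and noisy speech, the method designs an accurate ratio mask, based on the time-frequency ideal ratio mask, as the target mask. A DNN trained to predict the clean-speech and noise magnitude spectra serves as the baseline; its outputs are used to estimate the target mask, the baseline DNN and the target mask are jointly optimized, and the enhanced speech is estimated from the noisy speech with the target mask. In addition, to exploit the discriminative information between clean speech and noise, a discriminative training function replaces the mean squared error (MSE) as the objective of the baseline DNN so that the network output is more accurate. Experiments show that the discriminative training function improves both the baseline DNN and the jointly optimized network. Under matched and mismatched noise, the proposed method achieves higher average perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) scores than other common DNN methods, and the enhanced speech retains more speech components while suppressing noise more effectively.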
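The two ingredients named in the abstract can be sketched as follows. The abstract does not say how the accurate ratio mask combines them, so the code stops short of that: it shows only one common form of the ideal ratio mask, (S^2 / (S^2 + N^2))^beta with beta = 0.5, and a normalized cross-correlation coefficient between two magnitude spectra. Function names and toy values are assumptions.

```python
import math


def ideal_ratio_mask(speech_mag, noise_mag, beta=0.5, eps=1e-12):
    """IRM per time-frequency bin: (S^2 / (S^2 + N^2)) ** beta."""
    return [[((s * s) / (s * s + n * n + eps)) ** beta
             for s, n in zip(s_row, n_row)]
            for s_row, n_row in zip(speech_mag, noise_mag)]


def normalized_cross_correlation(a, b, eps=1e-12):
    """Normalized cross-correlation coefficient of two magnitude spectra."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b)) + eps
    return num / den


def apply_mask(noisy_mag, mask):
    """Scale each noisy time-frequency bin by its mask value."""
    return [[m * y for m, y in zip(m_row, y_row)]
            for m_row, y_row in zip(mask, noisy_mag)]


# One frame, two frequency bins (hypothetical magnitudes).
speech = [[3.0, 0.0]]
noise = [[4.0, 1.0]]
noisy = [[5.0, 1.0]]
mask = ideal_ratio_mask(speech, noise)
enhanced = apply_mask(noisy, mask)
similarity = normalized_cross_correlation([1.0, 2.0], [2.0, 4.0])
```

A mask near 1 keeps a bin and a mask near 0 suppresses it; the correlation coefficient is close to 1 when two spectra have the same shape regardless of scale.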

14.
Because the performance of traditional endpoint detection algorithms degrades as the environmental noise level increases, a recursive algorithm for computing higher-order cumulants over a sliding window is proposed and applied to speech endpoint detection. Endpoint detection is then carried out in combination with an energy feature. Experimental results show that both the computational efficiency and the noise robustness of the proposed algorithm are markedly better than those of the traditional algorithm. The average probability of correct point detection (Pc-point) of the proposed voice activity detection (VAD) is 6.07% higher than that of the G.729b VAD across different noise types and signal-to-noise ratios (SNRs).

15.
Musical residual noise is a major problem for speech enhancement systems. This noise is very annoying to the human ear and can significantly degrade the perceptual quality of enhanced speech. In this study, we aim to reduce the amount of musical residual noise with a two-stage speech enhancement approach. In the first stage, a preprocessor enhances noisy speech using an algorithm that combines the two-step-decision-directed and Virag methods. In the second stage, the enhanced speech signal is post-processed by an iterative-directional-median filter to significantly reduce the residual noise while maintaining the harmonic spectra. Experimental results show that the proposed approach can significantly improve the performance of a speech enhancement system by reducing the amount of residual noise.

16.
Microphone-array-based speech enhancement is of great importance for speech communication and speech recognition. Reducing the aperture of the microphone array while improving the enhancement effect would greatly broaden the application areas of microphone arrays. An array crosstalk-resistant adaptive noise cancellation method is therefore presented, and an improved spectral subtraction algorithm is cascaded after it to obtain better enhancement results. Theoretical analysis and experiments indicate that the proposed scheme needs only a very small microphone array while achieving a higher SNR improvement. In addition, the proposed scheme can be used in many noisy environments and is easy to implement in real time.
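For readers unfamiliar with adaptive noise cancellation, the sketch below shows the plain two-input version using a normalized LMS (NLMS) update. It is not the crosstalk-resistant array method of the abstract, whose details are not given here; it assumes a reference channel that contains only noise, and the filter length, step size `mu`, and simulated acoustic path are illustrative choices.

```python
import random


def nlms_cancel(primary, reference, n_taps=2, mu=0.5, eps=1e-8):
    """Subtract the part of `primary` that is predictable from `reference`.

    Returns (error_signal, final_weights); the error signal is the
    enhanced output of the canceller.
    """
    w = [0.0] * n_taps
    buf = [0.0] * n_taps  # most recent reference samples, newest first
    out = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # noise estimate
        e = d - y                                    # enhanced sample
        norm = sum(b * b for b in buf) + eps
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out, w


rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
# Hypothetical acoustic path from the noise source to the primary microphone.
primary = [0.5 * noise[t] + (0.3 * noise[t - 1] if t > 0 else 0.0)
           for t in range(len(noise))]
residual, weights = nlms_cancel(primary, noise)
```

The toy primary channel contains no speech, so the residual should decay toward zero as the weights approach the simulated path (0.5, 0.3). When speech leaks into the reference channel this simple canceller starts removing speech as well, which is the crosstalk problem the abstract's method is designed to resist.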

17.
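To address the denoising of images with a low signal-to-noise ratio, an orthogonal matching pursuit (OMP) sparse-decomposition image denoising algorithm based on K-SVD (Singular Value Decomposition) and the residual ratio (Residual Ratio Iteration Termination) is proposed. The algorithm uses K-SVD to train a redundant dictionary generated from the discrete cosine transform (DCT) framework into an overcomplete dictionary that effectively reflects the structural features of the image, giving an effective image representation. The residual ratio then serves as the termination condition for the OMP iterations to denoise the image. Experiments show that, compared with conventional denoising based on Symlets wavelets, denoising based on the Contourlet transform, and sparse-representation denoising with a redundant DCT dictionary, the algorithm removes Gaussian white noise from low-SNR images more effectively and preserves the useful information of the original image.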

18.
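To separate voices in the presence of noise, a single-channel voice separation algorithm using sparse non-negative matrix factorization and a deep attractor network is proposed. First, dictionary matrices for voice and noise are obtained by training and used as prior information to separate the coefficient matrices of voice and noise from the noisy mixed speech. Then, because different source components in the voice coefficient matrix have different similarities in the embedding space, a deep attractor network separates it into a coefficient matrix for each source. Finally, the separated coefficient matrices and the voice dictionary matrix are used to reconstruct clean separated speech. Experimental results under different noise conditions show that the algorithm suppresses background noise while improving the overall quality of the separated speech, outperforming the comparison algorithms that combine a speech-noise separation model with a voice separation model.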

19.
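A study of voice activity detection in a moving-car environment
Voice activity detection is an important part of speech interaction and communication systems. Its role is to distinguish speech segments from background-noise segments in the input signal, relying mainly on time-frequency characteristics of speech and noise; among these, the periodicity and harmonic structure of voiced speech are widely used features. In a moving car, however, the noise is non-stationary and the signal-to-noise ratio is low, so such features are hard to detect reliably. This paper therefore proposes a fairly robust fast harmonic detection algorithm, based on the basic regularities of voiced harmonic structure and on the fact that the signal-to-noise ratio differs across frequency bands in time-varying noise. The algorithm takes small time-frequency blocks as its analysis unit and uses a set of harmonic templates whose fundamental frequencies vary on a logarithmic scale to search adaptively for regions with clear harmonic structure, from which voiced speech is detected. Experiments show that the algorithm distinguishes speech from non-speech fairly reliably in a moving-car environment.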

20.
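Lu Cheng, Tian Meng, Zhou Jian, Wang Huabin, Tao Liang. Acta Acustica (《声学学报》), 2017, 42(3): 377-384
To capture the inter-frame correlation of speech signals and to represent speech features with fewer speech bases, a convolutive non-negative matrix factorization method with an L1/2 sparsity constraint is proposed for single-channel speech enhancement. First, noise bases are learned from noise. Then, with the noise bases as prior information, convolutive non-negative matrix factorization with the L1/2 sparsity constraint learns the speech bases in the noisy speech. Finally, the learned speech bases and coefficients are used to reconstruct the clean speech signal. Experimental results in different noise environments show that the method outperforms convolutive non-negative matrix factorization with an L1 sparsity constraint and conventional statistical speech enhancement methods.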
