1.

Speech enhancement based on the discrete Gabor transform and multi-notch adaptive digital filters

Ergun Erçelebi 《Applied Acoustics》2004,65(8):739-762

This paper presents a new method for speech enhancement based on time-frequency analysis and adaptive digital filtering. The proposed dual-channel method tracks the frequencies of the corrupting signal with the discrete Gabor transform (DGT) and applies a multi-notch adaptive digital filter (MNADF) at those frequencies. Because it requires no a priori knowledge of the noise source statistics, the method differs from traditional speech enhancement methods. Specifically, the method was applied to the case where speech quality and intelligibility deteriorate in the presence of background noise. Speech coders and automatic speech recognition (ASR) systems are designed to act on clean speech signals, so speech corrupted by noise must be enhanced before it is processed. The method uses a primary input containing the corrupted speech signal and a reference input containing only the noise. An MNADF was designed instead of a single-notch adaptive digital filter, and the DGT was used to track the frequencies of the corrupting signal, because fast filtering and fast measurement of the time-dependent noise frequency are of great importance in speech enhancement. Different types of noise from the Noisex-92 database were used to degrade real speech signals. Objective measures (inspection of speech spectrograms, global signal-to-noise ratio (SNR), segmental SNR (segSNR), and the Itakura-Saito distance) as well as a subjective listening test consistently demonstrated superior enhancement performance of the proposed method over a traditional speech enhancement method such as spectral subtraction. Combining the MNADF and the DGT yielded excellent speech enhancement.
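The idea of notching out tracked noise frequencies can be illustrated with a classic LMS adaptive noise canceller. This is only a sketch under stated assumptions, not the paper's MNADF: the interference frequencies are taken as already known (the abstract tracks them with the DGT), the reference input is synthesized as cosine/sine pairs, and the function name `multi_notch_lms` and the step size `mu` are hypothetical.

```python
import numpy as np

def multi_notch_lms(primary, freqs_hz, fs, mu=0.01):
    """Remove sinusoidal interference at freqs_hz from `primary` with LMS.

    Assumption: the noise frequencies are given; each notch uses one
    cosine/sine reference pair with two adaptive weights.
    """
    n = np.arange(len(primary))
    # Reference matrix: a cos column and a sin column per tracked frequency.
    refs = np.concatenate(
        [np.stack([np.cos(2 * np.pi * f * n / fs),
                   np.sin(2 * np.pi * f * n / fs)], axis=1) for f in freqs_hz],
        axis=1,
    )
    w = np.zeros(refs.shape[1])
    out = np.empty(len(primary))
    for k in range(len(primary)):
        x = refs[k]
        e = primary[k] - w @ x   # error sample = enhanced output sample
        w += 2 * mu * e * x      # LMS weight update
        out[k] = e
    return out
```

After the weights converge, the output approximates the primary input with the components at the listed frequencies removed. A larger `mu` adapts faster but leaves more residual fluctuation.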

2.

Speech enhancement algorithm based on improved orthogonal matching pursuit

武正平马建芬张朝霞杨东东《应用声学》2018,37(6):934-939

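To improve the speech enhancement performance and computational speed of the conventional orthogonal matching pursuit (OMP) algorithm, this study proposes a speech enhancement algorithm based on an improved OMP algorithm, grounded in sparse coding theory. First, the K-singular value decomposition (K-SVD) algorithm is combined with OMP, and an energy threshold is set to improve the enhancement performance of OMP. Second, the computation of the sparse signal approximation in conventional OMP is modified to speed up the algorithm. The proposed algorithm is compared with the conventional K-SVD speech enhancement algorithm, using PESQ to evaluate the quality of the enhanced speech and NCM to evaluate its intelligibility. With NCM essentially unchanged, PESQ improves by about 12.47% on average, giving better enhancement. The improved OMP algorithm runs nearly twice as fast as conventional OMP.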
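For orientation, the following is a minimal sketch of textbook OMP with an energy-based stopping rule. It is not the improved algorithm of the paper, whose details the abstract does not give: the relative threshold `energy_tol`, the function name `omp`, and the assumption of unit-norm dictionary columns are illustrative choices.

```python
import numpy as np

def omp(D, y, max_atoms, energy_tol=1e-6):
    """Greedy sparse approximation y ~ D @ x (columns of D assumed unit-norm)."""
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    y_energy = float(y @ y)
    for _ in range(max_atoms):
        # Energy threshold (assumed form): stop once the residual energy is
        # a negligible fraction of the signal energy.
        if residual @ residual <= energy_tol * y_energy:
            break
        k = int(np.argmax(np.abs(D.T @ residual)))  # best-matching atom
        if k in support:
            break
        support.append(k)
        # Least-squares fit on the selected atoms (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    if support:
        x[support] = coef
    return x
```

In a denoising setting the threshold would typically be tied to the estimated noise energy, so that atoms are no longer added once only noise remains in the residual.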

3.

Remote-sensing image encryption in hybrid domains

Xiaoqiang Zhang Guiliang Zhu Shilong Ma 《Optics Communications》2012,285(7):1736-1743

Remote-sensing technology plays an important role in military and industrial fields. Remote-sensing images are the main means of acquiring information from satellites and often contain confidential information. To transmit and store remote-sensing images securely, we propose a new image encryption algorithm in hybrid domains. The algorithm combines the advantages of image encryption in the spatial domain and in the transform domain. First, the low-pass subband coefficients of the image's DWT (discrete wavelet transform) decomposition are sorted by a PWLCM system in the transform domain. Second, the image reconstructed by the IDWT (inverse discrete wavelet transform) is diffused with a 2D (two-dimensional) Logistic map and an XOR operation in the spatial domain. Experimental results and algorithm analyses show that the new algorithm has a large key space and can resist brute-force, statistical, and differential attacks. The proposed algorithm also has sufficient encryption efficiency to satisfy practical requirements.
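The two building blocks named in the abstract, chaotic-sort permutation and XOR diffusion, can be sketched as follows. This is a simplified illustration and deliberately departs from the paper: the permutation is applied directly to pixel bytes rather than to DWT low-pass coefficients, a 1D logistic map replaces the 2D Logistic map, and the key layout (`x0`, `p`, `y0`, `r`) and all function names are hypothetical. It is not a vetted cipher.

```python
import numpy as np

def pwlcm_sequence(x0, p, n):
    """Piecewise linear chaotic map sequence, with control parameter 0 < p < 0.5."""
    x, out = x0, np.empty(n)
    for i in range(n):
        y = x if x < 0.5 else 1.0 - x  # the map is symmetric about 0.5
        x = y / p if y < p else (y - p) / (0.5 - p)
        out[i] = x
    return out

def logistic_keystream(y0, r, n):
    """Byte keystream from the 1D logistic map (stand-in for the 2D map)."""
    x, out = y0, np.empty(n, dtype=np.uint8)
    for i in range(n):
        x = r * x * (1.0 - x)
        out[i] = int(x * 256) % 256
    return out

def encrypt(img, key):
    flat = img.ravel()
    # Permutation: sorting the chaotic sequence defines the scrambling order.
    perm = np.argsort(pwlcm_sequence(key["x0"], key["p"], flat.size), kind="stable")
    ks = logistic_keystream(key["y0"], key["r"], flat.size)
    permuted = flat[perm]
    cipher = np.empty_like(permuted)
    prev = np.uint8(0)
    for i in range(flat.size):
        # Diffusion: each cipher byte also depends on the previous cipher byte.
        prev = permuted[i] ^ ks[i] ^ prev
        cipher[i] = prev
    return cipher.reshape(img.shape)

def decrypt(cipher, key):
    flat = cipher.ravel()
    perm = np.argsort(pwlcm_sequence(key["x0"], key["p"], flat.size), kind="stable")
    ks = logistic_keystream(key["y0"], key["r"], flat.size)
    shifted = np.concatenate(([0], flat[:-1])).astype(np.uint8)
    plain = np.empty_like(flat)
    plain[perm] = flat ^ ks ^ shifted  # undo diffusion, then undo permutation
    return plain.reshape(cipher.shape)
```

Decryption regenerates the same permutation and keystream from the key, so the round trip is exact for `uint8` images.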

4.

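A review of single-channel speech enhancement algorithms based on dictionary learning and sparse representation  Total citations: 1, self-citations: 0, citations by others: 1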

叶中付朱媛媛贾翔宇《应用声学》2019,38(4):645-652

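Recovering clean speech from noisy speech has long been a central problem in signal processing. In recent years, researchers have proposed a number of single-channel speech enhancement algorithms based on dictionary learning and sparse representation. These algorithms exploit the sparsity of speech signals in the time-frequency domain: they learn the structural characteristics and regularities of training samples to construct corresponding dictionaries, and then project the noisy speech onto these dictionaries to estimate the clean speech. To handle mismatch between training samples and test data, supervised non-negative matrix factorization methods are combined with traditional speech enhancement methods based on statistical models, and the speech and noise dictionaries are updated during the enhancement stage to estimate the clean speech. This paper first introduces the signal model for single-channel speech enhancement, then describes four typical enhancement methods, and finally discusses possible future research directions.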

5.

Efficient computation of quadratic-phase integrals in optics

Ozaktas HM Koç A Sari I Kutay MA 《Optics letters》2006,31(1):35-37

We present a fast N log N time algorithm for computing quadratic-phase integrals. This three-parameter class of integrals models propagation in free space in the Fresnel approximation, passage through thin lenses, and propagation in quadratic graded-index media as well as any combination of any number of these and is therefore of importance in optics. By carefully managing the sampling rate, one need not choose N much larger than the space-bandwidth product of the signals, despite the highly oscillatory integral kernel. The only deviation from exactness arises from the approximation of a continuous Fourier transform with the discrete Fourier transform. Thus the algorithm computes quadratic-phase integrals with a performance similar to that of the fast-Fourier-transform algorithm in computing the Fourier transform, in terms of both speed and accuracy.

6.

An approach based on simplified KLT and wavelet transform for enhancing speech degraded by non-stationary wideband noise

Hong Wei Lou Guang Rui Hu 《Journal of sound and vibration》2003,268(4):717-729

It is well known that non-stationary wideband noise is the most difficult noise to remove in speech enhancement. This paper proposes a novel speech enhancement algorithm based on the dyadic wavelet transform and the simplified Karhunen-Loeve transform (KLT) to suppress non-stationary wideband noise. The noisy speech is decomposed into components in the wavelet space and the KLT-based vector space, and the components are processed and reconstructed separately for voiced and unvoiced speech. Neither noise whitening nor SNR pre-calculation is required. To evaluate the performance of the algorithm in more detail, a three-dimensional spectral distortion measure is introduced. Experiments and comparisons between different speech enhancement systems by means of the distortion measure show that the proposed method avoids the drawbacks of previous methods and performs better shaping and suppression of non-stationary wideband noise.

7.

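Speech enhancement using a context-dependent attention mechanism and recurrent neural networks  Total citations: 1, self-citations: 1, citations by others: 0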

蓝天惠国强李萌吕忆蓝刘峤《声学学报》2020,45(6):897-905

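A speech enhancement method using a context-dependent attention mechanism and a recurrent neural network is proposed. In the training stage, the method jointly trains a multilayer perceptron that computes the attention scores and a deep recurrent network that enhances the speech. In the test stage, it computes an attention vector for each speech frame and concatenates it with that frame as input to the deep recurrent network for enhancement. In experiments at different signal-to-noise ratios, the method improves speech quality and intelligibility more than the baseline models: at -6 dB, short-time objective intelligibility (STOI) and perceptual evaluation of speech quality (PESQ) improve by 0.16 and 0.77, respectively, relative to the noisy speech. Under unseen noise conditions, its performance remains the best or close to the best. The attention mechanism can therefore effectively strengthen the model's use of contextual information and thus improve its enhancement performance.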
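The frame-level attention step can be sketched as below. This is a guess at one plausible form, since the abstract does not specify the architecture: the causal context window, the single-hidden-layer MLP scorer, and the names `attention_context`, `W1`, `b1`, and `w2` are all assumptions, and the weights here are random rather than trained.

```python
import numpy as np

def attention_context(frames, t, W1, b1, w2, context=3):
    """Concatenate frame t with an attention-weighted summary of its context.

    Assumptions: the context is the current frame plus up to `context`
    past frames, and scores come from a one-hidden-layer MLP.
    """
    lo = max(0, t - context)
    ctx = frames[lo:t + 1]                                # (n_ctx, dim)
    query = np.repeat(frames[t][None, :], len(ctx), axis=0)
    hidden = np.tanh(np.concatenate([query, ctx], axis=1) @ W1 + b1)
    scores = hidden @ w2                                  # one score per context frame
    weights = np.exp(scores - scores.max())               # numerically stable softmax
    weights /= weights.sum()
    attn = weights @ ctx                                  # attention vector
    # The enhancement network would receive this concatenated input.
    return np.concatenate([frames[t], attn]), weights
```

For feature dimension `dim` and hidden size `h`, `W1` has shape `(2 * dim, h)`, `b1` has shape `(h,)`, and `w2` has shape `(h,)`. The returned vector has length `2 * dim`.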

8.

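Whispered speech enhancement based on a modified Mel-domain masking model and speech absence probability  Total citations: 1, self-citations: 0, citations by others: 1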

陶智赵鹤鸣吴迪陈大庆张晓俊《声学学报》2009,34(4):370-377

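A whispered speech enhancement method based on a modified Mel-domain auditory masking model and speech absence probability is proposed. The method modifies the Mel frequency scale according to the articulation characteristics of whispered speech and applies Mel-domain band filtering to each frame of the whispered speech signal. The auditory masking threshold of each band is determined dynamically from the speech absence probability (SAP), and the spectral subtraction coefficients are adapted to the different masking thresholds to enhance the whispered speech. Objective and subjective tests on the enhanced whispered speech show that, compared with other spectral subtraction methods, the proposed method keeps residual noise and background noise below the masking threshold of the human ear, yields lower speech distortion, and greatly improves subjective listening quality.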

9.

Single-channel speech enhancement using convolutive non-negative matrix factorization with an L_{1/2} sparsity constraint

路成田猛周健王华彬陶亮《声学学报》2017,42(3):377-384

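To capture the inter-frame correlation of speech signals and to represent speech features with fewer speech bases, a convolutive non-negative matrix factorization method with an L_{1/2} sparsity constraint is proposed for single-channel speech enhancement. First, noise bases are obtained by noise learning. Then, with the noise bases as prior information, convolutive non-negative matrix factorization with the L_{1/2} sparsity constraint is used to learn the speech basis components in the noisy speech. Finally, the clean speech signal is reconstructed from the learned speech bases and coefficients. Experimental results in different noise environments show that the proposed method outperforms convolutive non-negative matrix factorization with an L_1 sparsity constraint as well as traditional statistical speech enhancement methods.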
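To make the role of the L_{1/2} penalty concrete, here is a sketch of ordinary (non-convolutive) NMF with multiplicative updates for the cost 0.5 * ||V - WH||_F^2 + lam * sum(sqrt(H)). It is a simplification: the paper uses the convolutive variant with pre-learned noise bases, which this sketch omits, and the function name `nmf_l_half` and its parameter values are hypothetical.

```python
import numpy as np

def nmf_l_half(V, rank, lam=0.01, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF, V ~ W @ H, with an L_{1/2} sparsity penalty on H.

    Assumption: Euclidean cost with multiplicative updates; not the
    convolutive model described in the abstract.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        # The penalty gradient (lam / 2) * H^(-1/2) enters the denominator,
        # shrinking small activations toward zero.
        H *= (W.T @ V) / (W.T @ W @ H + 0.5 * lam / np.sqrt(np.maximum(H, eps)) + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Remove the scale ambiguity: unit-norm bases, activations rescaled
        # so that W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H
```

In an enhancement setting `V` would be a magnitude spectrogram, the columns of `W` spectral bases, and `H` their activations over time. A larger `lam` gives sparser activations at the cost of reconstruction accuracy.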

10.

A subspace approach based on embedded prewhitening for voice activity detection

Kim DK Chang JH 《The Journal of the Acoustical Society of America》2011,130(5):EL304-EL310

This paper presents a subspace approach to voice activity detection (VAD). The approach is based on an embedded prewhitening scheme that simultaneously diagonalizes the clean speech and noise covariance matrices, providing a decision rule based on a likelihood ratio test in the signal subspace domain. Experimental results show that the proposed subspace-based VAD algorithm outperforms a method that uses a Gaussian model in the conventional discrete Fourier transform domain under low signal-to-noise ratio conditions.

11.

A low-complexity dereverberation algorithm based on Kalman filtering  Total citations: 1, self-citations: 1, citations by others: 0

齐园蕾杨飞然杨军《应用声学》2018,37(4):559-566

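In applications such as teleconferencing and smart speakers, the microphone is often in the far field of the sound source. Reverberation masks subsequently arriving direct-sound signals, degrading the speech quality of the microphone signal and the accuracy of speech recognition systems. Multichannel linear prediction is a classical blind dereverberation algorithm, but it often has high computational complexity. This paper proposes a simplified Kalman filter update algorithm that diagonalizes the error covariance matrix of the Kalman filter state vector, thereby reducing the complexity of adaptive multichannel linear prediction dereverberation. A comparison with an existing block-diagonal simplification shows that the proposed simplification further reduces the complexity of the original Kalman filter algorithm while maintaining speech quality.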

12.

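Single-channel speech enhancement algorithm combining a deep neural network and convex optimization  Total citations: 1, self-citations: 1, citations by others: 0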

张晓艳张天骐葛宛营白杨柳《声学学报》2021,46(3):471-480

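The accuracy of noise estimation directly affects the quality of a speech enhancement algorithm. To improve the noise suppression of current speech enhancement algorithms and to solve the unconstrained optimization problem effectively, a time-frequency mask optimization algorithm combining a deep neural network (DNN) and convex optimization is proposed for single-channel speech enhancement. First, the energy spectrum of the noisy speech is extracted as the input feature of the DNN. Next, the intra-band cross-correlation coefficient (ICC factor) between the noise and the noisy speech is used as the training target of the DNN; ...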

13.

A discrete fractional angular transform

Zhengjun Liu Muhammad Ashfaq Ahmad Shutian Liu 《Optics Communications》2008,281(6):1424-1429

A new discrete fractional transform defined by two parameters (angle and fractional order) is presented. All eigenvectors of the transform are obtained from an angle by a recursive method. The transform is named the discrete fractional angular transform (DFAT). The computational load of the DFAT kernel matrix is lower than that of other fractional-order transforms, a characteristic with important practical applications in signal and image processing. Numerical results and the mathematical properties of the transform are also given. Like the fractional Fourier transform, this transform can be applied to one- and two-dimensional signal processing.

14.

Noise reduction of speech signals using time-varying and multi-band adaptive gain control for smart digital hearing protectors

Narimene Lezzoum Ghyslain Gagnon Jérémie Voix 《Applied Acoustics》2016

In this paper, a single-channel speech enhancement algorithm based on non-linear, multi-band Adaptive Gain Control (AGC) is proposed. The algorithm requires neither Signal-to-Noise Ratio (SNR) estimation nor noise parameter estimation. It reduces the background noise in the temporal domain rather than the spectral domain using a non-linear and automatically adjustable gain function for multi-band AGC. The gain function varies in time and is deduced from the temporal envelope of each frequency band, so that frequency regions where noise is present are compressed heavily and frequency regions where speech is present are compressed lightly. Objective evaluation using the PESQ (Perceptual Evaluation of Speech Quality) metric shows that the proposed algorithm performs better than three benchmarks: spectral subtraction, the Wiener filter based on a priori SNR estimation, and a band-pass modulation filtering algorithm. In addition, blind subjective tests show that the proposed algorithm introduces less musical noise than the benchmark algorithms and was preferred 78.8% of the time in terms of signal quality. The proposed algorithm is implemented on a miniature low-power digital signal processor to validate its feasibility and complexity for smart hearing protection in noisy environments.

15.

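Minimum variance distortionless response beamforming under an approximate narrowband assumption  Total citations: 1, self-citations: 0, citations by others: 1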

王子腾孙兴伟李军锋颜永红《声学学报》2020,45(2):161-168

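When the minimum variance distortionless response (MVDR) beamformer is applied to wideband signals such as speech, it relies on the narrowband assumption so that filtering can be carried out separately in each frequency subband. Under the narrowband assumption the speech covariance matrix is a rank-1 matrix. In practice, however, the narrowband signal model is only an approximation of the actual signal model, and because of estimation errors in the statistics, the rank of the estimated speech covariance matrix is generally greater than 1. This paper proposes estimating the relative transfer function from the generalized principal eigenvector of the speech and noise covariance matrices and using it to reconstruct the speech covariance matrix as a rank-1 matrix. Experiments on the REVERB and CHiME-4 datasets show that, after low-rank approximation of the speech covariance matrix, the MVDR beamformer is more robust to estimation errors: the output SNR improves by an average of 0.8 dB and 1.4 dB, respectively, and speech recognition accuracy also improves.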
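The per-subband computation described above can be sketched as follows. This is an illustrative reading of the abstract rather than the authors' implementation: the covariance matrices are assumed to be already estimated for one frequency bin, and the function name `mvdr_rank1` and the reference-microphone convention are assumptions.

```python
import numpy as np

def mvdr_rank1(phi_s, phi_n, ref=0):
    """MVDR weights for one frequency bin from a generalized-eigenvector RTF.

    phi_s, phi_n: Hermitian speech and noise covariance matrices (M x M).
    """
    # Principal generalized eigenvector v: phi_s @ v = lambda * phi_n @ v.
    vals, vecs = np.linalg.eig(np.linalg.solve(phi_n, phi_s))
    v = vecs[:, np.argmax(vals.real)]
    # Map back through phi_n and normalize to the reference microphone to
    # obtain the relative transfer function (RTF) estimate.
    h = phi_n @ v
    h = h / h[ref]
    # Standard MVDR solution: w = phi_n^{-1} h / (h^H phi_n^{-1} h).
    num = np.linalg.solve(phi_n, h)
    w = num / (h.conj() @ num)
    return w, h
```

By construction the weights satisfy the distortionless constraint `w^H h = 1`. A rank-1 speech covariance could then be rebuilt as a scalar power estimate times `np.outer(h, h.conj())`.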

16.

Single-channel speech enhancement method combining accurate ratio masking and a deep neural network

柏浩钧张天骐刘鉴兴叶绍鹏《声学学报》2022,47(3):394-404

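Current supervised speech enhancement methods neglect the effect that the magnitude-spectrum similarity among clean speech, noise, and noisy speech has on enhancement performance. To address this, a speech enhancement method combining an accurate ratio mask (ARM) and a deep neural network (DNN) is proposed. Using the normalized cross-correlation coefficients between the magnitude spectra of clean speech and noisy speech and between those of noise and noisy speech, the method designs an accurate ratio mask, based on the time-frequency-domain ideal ratio mask, as the target mask. A DNN whose training targets are the clean speech and noise magnitude spectra serves as the baseline; its outputs are used to estimate the target mask, the baseline DNN and the target mask are jointly optimized, and the enhanced speech is estimated from the noisy speech by the target mask. In addition, to account for discriminative information between clean speech and noise, a discriminative training function replaces the mean squared error (MSE) function as the objective of the baseline DNN, making the network output more accurate. Experiments show that the discriminative training function improves the enhancement performance of both the baseline DNN and the whole jointly optimized network. Under matched and mismatched noise, the proposed method achieves higher average perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) scores than other common DNN methods; the enhanced speech retains more speech components while the noise is suppressed more strongly.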
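The ARM builds on the ideal ratio mask (IRM), which is simple to state in code. The sketch below shows only the standard IRM in its common square-root energy-ratio form; the ARM's cross-correlation weighting is not specified in the abstract and is not reproduced here. The function names are hypothetical.

```python
import numpy as np

def ideal_ratio_mask(speech_mag, noise_mag, eps=1e-12):
    """Standard IRM per time-frequency bin: sqrt(S^2 / (S^2 + N^2))."""
    s2 = np.asarray(speech_mag, dtype=float) ** 2
    n2 = np.asarray(noise_mag, dtype=float) ** 2
    return np.sqrt(s2 / (s2 + n2 + eps))  # eps avoids division by zero

def apply_mask(mask, noisy_mag):
    """Estimate the clean magnitude spectrum by element-wise masking."""
    return mask * noisy_mag
```

The mask lies in [0, 1]: it is close to 1 where speech dominates a bin and close to 0 where noise dominates. During training the mask is computed from known clean speech and noise; at test time a network predicts it from the noisy input.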

17.

Keyword-dependent single-channel speech enhancement for user-defined voice wake-up

刘作桢吴愁黎塔赵庆卫《声学学报》2023,48(2):415-424

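A single-channel speech enhancement method for user-defined voice wake-up is proposed. The method stores keyword phoneme information in a text encoding matrix in advance and adds an attention-based phoneme bias module to a conventional speech enhancement model. The module uses intermediate features of the speech enhancement model to retrieve the phoneme information of the current frame from the text encoding matrix and feeds it into the subsequent computation of the model, improving the enhancement of keyword-related phonemes. Experimental results in different noise environments show that the method suppresses noise in the keyword segments more effectively. Compared with a conventional speech enhancement method and other text-dependent speech enhancement methods, the proposed method achieves relative improvements in user-defined voice wake-up performance of 14.3% and 7.6%, respectively.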

18.

Single-channel speech enhancement method using reconstructive NMF with spectrotemporal speech presence probabilities

Seongjae Lee David K. Han Hanseok Ko 《Applied Acoustics》2017

In this paper, a novel single-microphone speech enhancement technique is presented. While most conventional nonnegative matrix factorization-based approaches focus on generating a basis matrix of speech and noise for enhancement, the proposed algorithm performs an additional process to reconstruct speech from noisy speech when speech and noise overlap heavily in selected spectral bands. This process involves a log-spectral amplitude based estimator, which provides the spectrotemporal speech presence probability to obtain a more accurate reconstruction. Moreover, the proposed algorithm applies an unsupervised learning method to the input noise, so it is adaptable to any type of environmental noise without a pre-trained dictionary. The experimental results demonstrate that the proposed algorithm obtains improved speech enhancement performance compared with conventional single-channel approaches.

19.

Noise reduction using three-step gain factor and iterative-directional-median filter

《Applied Acoustics》2014

Musical residual noise is a major problem for a speech enhancement system. This noise is very annoying to the human ear and can significantly deteriorate the perception quality of enhanced speech. In this study, we aim at reducing the quantity of musical residual noise by a two-stage speech enhancement approach. In the first stage a preprocessor enhances noisy speech using an algorithm which combines the two-step-decision-directed and the Virag methods. In the second stage the enhanced speech signal is post-processed by an iterative-directional-median filter to significantly reduce the quantity of residual noise, while maintaining the harmonic spectra. Experimental results show that the proposed approach can significantly improve the performance of a speech enhancement system by reducing the quantity of residual noise.

20.

Fast image forgery detection algorithm coupling the discrete wavelet transform with electrostatic field theory

刘欢刘朝涛黄丽《应用声学》2016,24(3):44-47

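Current image forgery detection algorithms mainly locate forged regions in the spatial domain of the image, which makes it difficult to reduce the image dimensionality and leads to high complexity; they also cannot effectively detect forged regions that have undergone geometric transformations, so their robustness is poor. To address these shortcomings, this paper proposes an image forgery detection algorithm that couples the discrete wavelet transform with electrostatic field theory. First, the discrete wavelet transform is applied to extract the low-frequency subband of the forged image, reducing the image dimensionality. Then, based on electrostatic field theory, the extracted subband is mapped into a virtual electric field, robust features are extracted, and the features are reorganized with the Radix sort algorithm to form a feature matrix. Finally, a same-affine-transformation rule is defined and used to process the sorted matrix, completing the detection of forged regions. Experimental results show that, compared with current copy-move forgery detection techniques, the proposed algorithm achieves higher localization efficiency and detection accuracy and is robust against geometric-transformation tampering.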