Similar literature
Found 20 similar articles (search time: 171 ms)
1.
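吴礼福, 王华, 程义, 郭业才. 《应用声学》 (Journal of Applied Acoustics), 2016, 35(4): 288-293
Reverberation is an important phenomenon in room acoustics, and reverberation time must be measured or estimated in both room design and audio signal processing. This paper improves a blind reverberation time estimation method based on maximum-likelihood estimation, that is, a method that estimates reverberation time from the reverberant speech produced when a talker speaks naturally in the room. The method first determines the optimal boundaries of a speech decay segment, then computes two additional parameters for that segment and uses them to select the segments that meet the criteria, and finally applies maximum-likelihood estimation to the selected segments to obtain the reverberation time estimate. Simulations under five different reverberation times show that, compared with existing methods, the improved method yields estimates with a smaller bias from the true reverberation time and a lower variance, and hence higher estimation accuracy.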

2.
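孙兴伟, 李军锋, 颜永红. 《声学学报》 (Acta Acustica), 2021, 46(6): 1234-1241
This paper proposes a dereverberation algorithm that combines a convolutional neural network encoder-decoder model with a reverberation-time attention mechanism. The encoder-decoder model performs the dereverberation, and the reverberation-time attention mechanism counteracts the effect of changing reverberant environments on dereverberation performance. The encoder processes the magnitude spectrum of the reverberant speech with convolution kernels of different sizes to obtain encoded features that contain multi-scale contextual information. An attention module then selectively weights the encoded features according to the reverberation time of the environment to produce weighted features. Finally, the decoder uses the weighted features to reconstruct the magnitude spectrum of the dereverberated speech. In simulated and real reverberant environments, the algorithm improves the speech-to-reverberation modulation energy ratio over the baseline system by 0.36 dB and 0.66 dB, respectively. The experimental results show that the algorithm adapts to changes in the reverberant environment and is more robust than the baseline system in real reverberant environments.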

3.
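周立君, 刘宇, 白璐, 茹志兵, 于帅. 《应用光学》 (Journal of Applied Optics), 2020, 41(1): 120-126
This paper studies a sample generation and automatic annotation method based on generative adversarial networks (GANs) and cross-domain adaptive transfer learning. Using an adaptive transfer learning network and a small existing set of visible-light image samples, the method mines the intrinsic correlation between target features in infrared and visible-light images, builds an adaptive conversion transfer learning network model, and generates annotated target images. The proposed method addresses the scarcity of infrared image samples and the time cost of annotating them, and provides sufficient sample data for subsequent multi-band cooperative target detection and recognition. In the experiments, the automatic annotation algorithm was tested on 1,000 captured and 1,000 generated images of armored targets: annotation accuracy exceeded 95% on the real armored-target images and 83% on the generated ones. A classifier trained on a mixed dataset of real and generated images performed essentially the same as one trained on real images only.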

4.
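Multichannel inverse-filtering speech dereverberation based on skewness maximization*   (Total citations: 1; self-citations: 1; citations by others: 0)
Room reverberation degrades speech quality and speech intelligibility. Higher-order statistics are important measures of non-Gaussianity, and speech dereverberation can be achieved by exploiting the non-Gaussian nature of speech. This paper proposes a multichannel speech dereverberation method based on higher-order statistics. For the first time, the method constructs its cost function from the skewness, a third-order statistic, of the linear prediction residual of the multichannel speech signals, and adaptively updates the inverse filters to maximize the skewness of the linear prediction residual of the dereverberated, reconstructed signal. In addition, drawing on the speech production model, it proposes a joint estimation of linear prediction and room impulse response inverse filtering under the skewness criterion, which further improves dereverberation performance. Experimental results show that the method dereverberates better than the existing method based on the kurtosis, a fourth-order statistic, of the linear prediction residual, and is more robust to noise.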
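The two statistics contrasted in this abstract can be sketched as follows. This is only an illustration of the moment definitions, assuming the common normalization by powers of the variance; the abstract does not state the exact form of the cost function, and the function names `skewness` and `kurtosis` are hypothetical. The adaptive inverse-filter update itself is not shown.

```python
import numpy as np

def skewness(residual):
    """Normalized third-order central moment (assumed normalization)."""
    r = np.asarray(residual, dtype=float)
    r = r - r.mean()                 # remove the mean before taking moments
    m2 = np.mean(r ** 2)             # variance
    m3 = np.mean(r ** 3)             # third central moment
    return m3 / m2 ** 1.5

def kurtosis(residual):
    """Normalized fourth-order central moment (assumed normalization)."""
    r = np.asarray(residual, dtype=float)
    r = r - r.mean()
    m2 = np.mean(r ** 2)
    m4 = np.mean(r ** 4)             # fourth central moment
    return m4 / m2 ** 2
```

In a skewness-based scheme, a value like `skewness(lp_residual)` would be the quantity the inverse filter is adapted to increase, where `lp_residual` stands for the linear prediction residual of the filtered signal.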

5.
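谢江荣, 李范鸣, 卫红, 李冰. 《光学学报》 (Acta Optica Sinica), 2019, 39(3): 142-148
A model for infrared target simulation is proposed. With a trained conditional deep convolutional generative adversarial network, random noise and a class label are the only inputs needed to automatically generate simulated infrared target images of the desired class. The model parameters were trained separately on the handwritten-digit dataset (MNIST) and on an infrared dataset, and automatic generation experiments on both produced highly realistic sample images. The features extracted by the discriminator network were used in classification experiments, and the images synthesized by the proposed method were used for data augmentation to improve classifier performance. The results show that the proposed method can effectively imitate infrared radiation characteristics.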

6.
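Sound beam tracing theory for shallow-water reverberation modeling   (Total citations: 3; self-citations: 0; citations by others: 3)
A method for computing shallow-water reverberation intensity and a method for simulating reverberation time series are established on the basis of sound beam tracing theory. A brief theoretical derivation of the reverberation intensity calculation is given, and the model predictions are compared with experimental values. A reverberation time-series simulation model is established, and its implementation framework and method are described. The correlation properties of the simulated reverberation sequences are tested and analyzed against experimental data and results from the literature. The results show that the reverberation intensity model predicts shallow-water reverberation intensity well, and that the sequence simulation model can generate reverberation sequences with different envelope distributions whose correlation properties agree with experimental and published results.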

7.
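俞悟周, 王佐民. 《应用声学》 (Journal of Applied Acoustics), 1998, 17(5): 11-16, 48
This paper proposes using nonlinear filtering to suppress the residual noise in room impulse responses obtained with the M-sequence correlation method in environments with strong background noise. This extends the dynamic range of the reverberation decay curve and makes accurate reverberation time measurement possible under strong background noise. Several factors that affect reverberation time measurement with the M-sequence correlation method are discussed first. Nonlinear filtering is then applied to further suppress the influence of background noise. The results show that the nonlinear filtering is highly effective. The method was also used to measure reverberation time under strong non-white background noise, and the results agree well with those of conventional measurements.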
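The M-sequence correlation method that this paper builds on can be sketched roughly as below. This shows only the basic measurement idea (excite with a maximum-length sequence, then circularly cross-correlate the recording with the excitation), not the paper's nonlinear filter. The function names, the 10-bit register and the tap choice are illustrative assumptions.

```python
import numpy as np

def mls(nbits=10, taps=(10, 7)):
    """One period of a +/-1 maximum-length sequence from a Fibonacci LFSR.

    The default taps (10, 7) are a commonly tabulated maximal-length choice
    for a 10-bit register; other register sizes need their own taps.
    """
    reg = [1] * nbits                       # any non-zero start state works
    bits = []
    for _ in range(2 ** nbits - 1):
        bits.append(reg[-1])                # output the oldest bit
        feedback = 0
        for t in taps:
            feedback ^= reg[t - 1]          # XOR of the tapped stages
        reg = [feedback] + reg[:-1]         # shift the register by one
    return 1.0 - 2.0 * np.array(bits)       # map {0, 1} -> {+1, -1}

def recover_ir(recorded, excitation):
    """Estimate the impulse response from one steady-state period.

    Uses circular cross-correlation via the FFT. Because the periodic
    autocorrelation of an MLS is L at lag 0 and -1 elsewhere, the estimate
    carries a small constant offset of sum(h) / (L + 1).
    """
    L = len(excitation)
    spectrum = np.fft.rfft(recorded) * np.conj(np.fft.rfft(excitation))
    return np.fft.irfft(spectrum, n=L) / (L + 1)
```

Here `recorded` is assumed to be exactly one period of the room's steady-state response to the periodically repeated sequence. Additive background noise is uncorrelated with the excitation and is spread over the whole estimated response, which is the residual noise the paper's nonlinear filter targets.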

8.
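To overcome the overfitting of machine learning methods in single-well reservoir production forecasting and to improve forecasting accuracy in oilfield production, a single-well reservoir production forecasting model based on a conditional generative adversarial network (CGAN) is proposed. The model builds its generator and discriminator networks from basic neural networks such as long short-term memory and fully connected layers. The generator takes the factors that influence production as its conditional input and generates predicted production data; a logarithmic loss function evaluates the deviation between predicted and real data. Adversarial training of the CGAN, combined with Bayesian hyperparameter optimization, is used to optimize the model structure and improve the model's generalization. A database of single-well production under different geological and production conditions for the same well pattern was built with the Eclipse numerical simulation software, and influencing factors such as geological and production conditions were used as the conditional input for single-well production forecasting. The results show that, compared with the predictions of fully connected neural network (FCNN), random forest (RF) and long short-term memory (LSTM) models, the CGAN model improves the mean absolute percentage error on the test set by 2.59%, 0.81% and 1.72%, respectively, and has the smallest overfitting ratio (1.027). This indicates that the CGAN reduces the overfitting of machine-learning production forecasting models and improves their generalization and forecasting accuracy, which validates the proposed algorithm and is of value for guiding efficient oilfield development and safeguarding China's energy security.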

9.
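李楠. 《光学技术》 (Optical Technique), 2022, (6): 755-762
When an airborne infrared detection system detects targets against a near-ground background, the ground severely interferes with dim, small targets and degrades the performance of conventional detection methods. To address this problem, a generative adversarial network is used to build a dim small target detection method for airborne infrared detection systems operating against near-ground backgrounds. A deep autoencoder serves as the network framework of the GAN; an inception mechanism is introduced for multi-scale feature extraction from the visual information, and residual blocks are introduced to alleviate the vanishing gradient problem. During adversarial training, the generator considers two loss functions, a movement loss and an adversarial loss, which improves its training. Experiments on a public UAV-borne infrared detection dataset show that the proposed method successfully detects dim small infrared targets against near-ground backgrounds, and that both its average precision and its detection speed are better than those of the compared methods.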

10.
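王新, 夏广远. 《应用声学》 (Journal of Applied Acoustics), 2023, 42(5): 954-962
This work addresses the need to detect leakage caused by loosened pipeline flange connections, with the aims of overcoming the shortage of data samples and removing the tedious manual selection of feature indicators. Generative adversarial networks (GANs), used as a data augmentation tool, have been shown to generate samples similar to real data, while convolutional neural networks (CNNs), a deep learning method, offer an effective way to extract features from data automatically. On this basis, leakage detection for loosened flange connections in aluminum alloy pipelines was studied using a GAN and a CNN. First, a test rig for pipeline leak calibration and data acquisition was built, and raw leakage signals at different leak levels were acquired with acoustic emission. Second, a GAN was used to generate samples that augment the original data, and statistical features were introduced to assess the quality of the generated samples. Finally, the generated samples and the original data were arranged into different training sets, and an intelligent classification model based on a CNN was built and applied to pipeline leak detection. The classification results were also compared with those of SVM, an intelligent classification method for small samples. The experimental results show that the intelligent classification model built on the GAN and CNN significantly improves the detection accuracy for leakage from loosened pipeline flange connections.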

11.
An acoustic vector sensor provides measurements of both the pressure and particle velocity of a sound field in which it is placed. These measurements are vectorial in nature and can be used for the purpose of source localization. A straightforward approach towards determining the direction of arrival (DOA) utilizes the acoustic intensity vector, which is the product of pressure and particle velocity. The accuracy of an intensity vector based DOA estimator in the presence of noise has been analyzed previously. In this paper, the effects of reverberation upon the accuracy of such a DOA estimator are examined. It is shown that particular realizations of reverberation differ from an ideal isotropically diffuse field, and induce an estimation bias which is dependent upon the room impulse responses (RIRs). The limited knowledge available pertaining to the RIRs is expressed statistically by employing the diffuse qualities of reverberation to extend Polack's statistical RIR model. Expressions for evaluating the typical bias magnitude as well as its probability distribution are derived.
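The intensity-vector estimator described in this abstract can be illustrated with a minimal two-dimensional sketch. It assumes a single plane wave, in which particle velocity is in phase with pressure and points along the propagation direction, so the source lies opposite to the time-averaged intensity vector. The function name and the two-component (x, y) sensor layout are assumptions made for illustration only.

```python
import numpy as np

def intensity_doa(p, vx, vy):
    """Azimuth of arrival (radians) from the time-averaged intensity vector.

    p      : pressure samples
    vx, vy : particle-velocity samples along x and y
    Assumes one dominant plane wave. The intensity vector points along the
    propagation direction, so the source direction is its negative.
    """
    ix = np.mean(np.asarray(p) * np.asarray(vx))   # x component of intensity
    iy = np.mean(np.asarray(p) * np.asarray(vy))   # y component of intensity
    return np.arctan2(-iy, -ix)
```

In a reverberant room the averages `ix` and `iy` also pick up contributions from reflections, which is the source of the bias that the paper analyzes.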

12.
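A frequency-domain method of synthesizing room frequency responses for artificial reverberation   (Total citations: 1; self-citations: 1; citations by others: 0)
A method of synthesizing the room frequency response in the frequency domain is presented for convolution-based artificial reverberation. Assuming that the late part of the room frequency response is a Gaussian random process in the frequency domain, an autoregressive moving-average (ARMA) model is used to parameterize its autocovariance function and power spectral density. After the ARMA parameters are solved for, the late part of the room frequency response is obtained by inverse filtering; it is combined with the early part, and an inverse Fourier transform yields the complete room impulse response. Simulations show that the reverberation produced by the method is close to that of the image source method and clearly better than that of the feedback delay network method, while its computational complexity is lower than that of the image source method, which makes it suitable for real-time use.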

13.
Room impulse responses (RIRs) are used very widely to characterize the acoustic conditions of rooms, such as in the derivation of reverberation time, early decay time and clarity index. This study investigates the subjective decay rate (or reverberance) of RIRs when directly listened to (rather than convolved with a dry signal such as speech or music). Through a subjective experiment, it investigates the effects of gain (or listening level) and background noise level on the reverberance of RIRs that had been measured in three concert auditoria. The task of the experiment was to match the decay rate of RIRs to that of a reference RIR by ear, by adjusting the RIRs’ exponential decay rate. Based on objective loudness modeling, gain should have a positive effect on reverberance, and background noise a negative effect. This is confirmed by the results of the experiment. Furthermore, the objectively calculated loudness decay function provides an effective predictor of subjective decay rate, which performs better than conventional early decay time or reverberation time for the RIRs tested.

14.
The methods investigated for the room volume estimation are based on geometrical acoustics, eigenmode, and diffuse field models and no data other than the room impulse response are available. The measurements include several receiver positions in a total of 12 rooms of vastly different sizes and acoustic characteristics. The limitations in identifying the pivotal specular reflections of the geometrical acoustics model in measured room impulse responses are examined both theoretically and experimentally. The eigenmode method uses the theoretical expression for the Schroeder frequency and the difficulty of accurately estimating this frequency from the varying statistics of the room transfer function is highlighted. Reliable results are only obtained with the diffuse field model and a part of the observed variance in the experimental results is explained by theoretical expressions for the standard deviation of the reverberant sound pressure and the reverberation time. The limitations due to source and receiver directivity are discussed and a simple volume estimation method based on an approximate relationship with the reverberation time is also presented.

15.
The reverberation time (RT) is an important parameter for characterizing the quality of an auditory space. Sounds in reverberant environments are subject to coloration. This affects speech intelligibility and sound localization. Many state-of-the-art audio signal processing algorithms, for example in hearing-aids and telephony, are expected to have the ability to characterize the listening environment, and turn on an appropriate processing strategy accordingly. Thus, a method for characterization of room RT based on passively received microphone signals represents an important enabling technology. Current RT estimators, such as Schroeder's method, depend on a controlled sound source, and thus cannot produce an online, blind RT estimate. Here, a method for estimating RT without prior knowledge of sound sources or room geometry is presented. The diffusive tail of reverberation was modeled as an exponentially damped Gaussian white noise process. The time-constant of the decay, which provided a measure of the RT, was estimated using a maximum-likelihood procedure. The estimates were obtained continuously, and an order-statistics filter was used to extract the most likely RT from the accumulated estimates. The procedure was illustrated for connected speech. Results obtained for simulated and real room data are in good agreement with the real RT values.
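The decay model in this abstract lends itself to a compact sketch. Under the stated assumption that a decay segment is Gaussian white noise with an exponentially damped envelope, sample n has variance sigma^2 * a^(2n), and the log-likelihood can be maximized over the decay factor a after substituting the closed-form estimate of sigma^2. The grid search below and the name `ml_rt60` are illustrative choices rather than the authors' implementation; the segment selection and order-statistics filtering steps are not shown.

```python
import numpy as np

def ml_rt60(y, fs, t60_grid):
    """Grid-search maximum-likelihood RT60 for one free-decay segment.

    Model (assumption): y[n] ~ N(0, sigma^2 * a^(2n)), with a chosen so
    that the envelope drops by 60 dB after t60 seconds.
    """
    y = np.asarray(y, dtype=float)
    n = np.arange(y.size)
    best_t60, best_ll = None, -np.inf
    for t60 in t60_grid:
        log_a = -3.0 * np.log(10.0) / (t60 * fs)       # ln(a) for this candidate
        # Closed-form ML estimate of sigma^2 given a
        sigma2 = np.mean(np.exp(-2.0 * log_a * n) * y ** 2)
        # Log-likelihood with sigma^2 substituted, constants dropped
        ll = -0.5 * y.size * np.log(sigma2) - log_a * n.sum()
        if ll > best_ll:
            best_t60, best_ll = t60, ll
    return best_t60
```

For example, `ml_rt60(segment, 8000, np.arange(0.1, 1.5, 0.01))` would search reverberation times from 0.1 s to about 1.5 s in 10 ms steps, where `segment` stands for a decaying portion of the microphone signal sampled at 8 kHz.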

16.
The decay function for the evaluation of the reverberation time is often obtained by backward integration of the squared room impulse response, as suggested by M.R. Schroeder more than four decades ago. Since then much work has been published about its implementation. However, soon after the method was first put to use, it was realized that the effects of the background noise contaminating the room impulse response required careful consideration to achieve better results. This paper describes an alternative method dealing with the problem of the backward integration of noisy room impulse responses. This method is based on the processing of two impulse responses sequentially recorded for a fixed source and receiver arrangement in a room. Statistical criteria are proposed to identify how the effect of the noise corrupts the level decay curve, using a noise-free synthesized room impulse response as well as measurements performed in a real room.
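For reference, the classical Schroeder backward integration that this abstract starts from can be sketched as below. The two-response noise-handling method proposed in the paper is not shown. The function names and the -5 dB to -25 dB fitting range are assumptions made for illustration.

```python
import numpy as np

def schroeder_edc_db(h):
    """Energy decay curve in dB by backward integration of h squared."""
    energy = np.asarray(h, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]        # energy remaining after each sample
    return 10.0 * np.log10(edc / edc[0])       # normalize to 0 dB at the start

def rt60_from_edc(edc_db, fs, lo_db=-5.0, hi_db=-25.0):
    """Reverberation time from a straight-line fit to part of the decay curve.

    The fit range (here -5 dB to -25 dB) is extrapolated to a 60 dB decay.
    """
    idx = np.where((edc_db <= lo_db) & (edc_db >= hi_db))[0]
    slope, _ = np.polyfit(idx / fs, edc_db[idx], 1)   # slope in dB per second
    return -60.0 / slope
```

With a noisy measurement the tail of the decay curve flattens towards the noise floor, so the fit range has to stay above it. That sensitivity to background noise is the problem the paper addresses.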

17.
This paper presents a method of calculating sound build up, steady state and sound reduction phenomena from the impulse response of rooms. The noise components of both the testing signal and the room response are omitted and wave phenomena occurring in the room are also neglected. A situation corresponding to the geometrical propagation of sound is thus simulated. The resulting formulae are an extension of corresponding methods for the numerical modelling of acoustical fields in rooms. In this way, as well as the impulse response, sound build up and reverberation curves may also be obtained. An example using the ray tracing technique is presented.

18.
In everyday listening, both background noise and reverberation degrade the speech signal. Psychoacoustic evidence suggests that human speech perception under reverberant conditions relies mostly on monaural processing. While speech segregation based on periodicity has achieved considerable progress in handling additive noise, little research in monaural segregation has been devoted to reverberant scenarios. Reverberation smears the harmonic structure of speech signals, and our evaluations using a pitch-based segregation algorithm show that an increase in the room reverberation time causes degraded performance due to weakened periodicity in the target signal. We propose a two-stage monaural separation system that combines the inverse filtering of the room impulse response corresponding to target location and a pitch-based speech segregation method. As a result of the first stage, the harmonicity of a signal arriving from target direction is partially restored while signals arriving from other directions are further smeared, and this leads to improved segregation. A systematic evaluation of the system shows that the proposed system results in considerable signal-to-noise ratio gains across different conditions. Potential applications of this system include robust automatic speech recognition and hearing aid design.

19.
The localization of sound sources, and particularly speech, has numerous industrial applications. This has motivated a continuous effort to develop robust direction-of-arrival detection algorithms that overcome the limitations imposed by real scenarios, such as multiple reflections and undesirable noise sources. Time difference of arrival-based methods, and particularly generalized cross-correlation approaches, have been widely investigated in acoustic signal processing, but the technical literature says little about their evaluation in real environments when only two microphones are used. In this work, four generalized cross-correlation methods for localization of speech sources with two microphones have been analyzed in different real scenarios with a stationary noise source. Furthermore, these scenarios have been acoustically characterized, in order to relate the behavior of these cross-correlation methods to the acoustic properties of noisy scenarios. The aim of this study is not only to assess the accuracy and reliability of a set of well-known localization algorithms, but also to determine how the acoustic properties of the room under analysis influence the final results, by including factors beyond the reverberation time and signal-to-noise ratio in the analysis. The results show how the acoustic properties analysed influence the performance of these methods.
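The abstract does not name the four generalized cross-correlation methods it compares. As an illustration of the family, the sketch below uses the PHAT weighting, a widely used member, to estimate the time difference of arrival between two microphone signals. The function name and parameters are assumptions, not taken from the paper.

```python
import numpy as np

def gcc_phat(x1, x2, fs, max_tau=None):
    """Delay of x1 relative to x2, in seconds, using GCC with PHAT weighting.

    A positive result means x1 lags x2. max_tau optionally limits the
    search to physically possible delays (microphone spacing / speed of sound).
    """
    n = len(x1) + len(x2)                      # zero-pad to avoid circular wrap
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12             # PHAT: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so that lags run from -max_shift to +max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / fs
```

For two microphones a distance d apart, a far-field delay `tau` corresponds to an arrival angle of roughly `arcsin(c * tau / d)` relative to broadside, with `c` the speed of sound. Reflections add competing correlation peaks, which is why room acoustics affects these estimators.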

20.
The reliability of algorithms for room acoustic simulations has often been confirmed by verifying predicted room acoustical parameters. This paper presents a complementary perceptual validation procedure consisting of two experiments, dealing respectively with speech intelligibility and with sound source front–back localisation. The evaluated simulation algorithm, implemented in the software ODEON®, is a hybrid method based on an image source algorithm for predicting early sound reflections and on ray-tracing for the later part, using a stochastic scattering process with secondary sources. The binaural room impulse response (BRIR) is calculated from a simulated room impulse response in which information about the arrival time, intensity and spatial direction of each sound reflection is collected and convolved with a measured Head Related Transfer Function (HRTF). The listening stimuli for the speech intelligibility and localisation tests are auralised convolutions of anechoic sound samples with measured and simulated BRIRs. Perception tests were performed with human subjects in two acoustical environments, i.e. an anechoic and a reverberant room, by presenting the stimuli to subjects in a natural way, and via headphones by using two non-individualized HRTFs (artificial head and hearing aids placed on the ears of the artificial head) of both a simulated and a real room. Very good correspondence is found between the results obtained with simulated and measured BRIRs, both for speech intelligibility in the presence of noise and for sound source localisation tests. In the anechoic room an increase in speech intelligibility is observed when noise and signal are presented from sources located at different angles. This improvement is not so evident in the reverberant room, with the sound sources at 1-m distance from the listener. Interestingly, front–back localisation performance is better in the reverberant room than in the anechoic room. The correlation between people’s ability to localise sound sources on the one hand, and their ability to recognise binaurally received speech in reverberation on the other, is found to be weak.
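The auralisation step mentioned in this abstract, convolving an anechoic sample with a BRIR, can be sketched as follows. This is a generic illustration that assumes the BRIR is available as separate left-ear and right-ear impulse responses of equal length; it does not reproduce ODEON's processing, and the function name is hypothetical.

```python
import numpy as np

def auralise(dry, brir_left, brir_right):
    """Convolve a mono anechoic signal with a two-channel BRIR.

    Returns an array of shape (len(dry) + len(brir) - 1, 2) holding the
    left and right ear signals. Both BRIR channels must have equal length.
    """
    left = np.convolve(dry, brir_left)         # left-ear signal
    right = np.convolve(dry, brir_right)       # right-ear signal
    return np.stack([left, right], axis=1)
```

For long impulse responses an FFT-based convolution would be faster, but the direct form keeps the sketch short.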
