1.

Speech enhancement using linearly constrained adaptive constant directivity beam-formers

Avid Avokh 《Applied Acoustics》2010,71(3):262-268

This paper extends previous work on constant directivity beam-formers (CDBs) to the case of multiple desired speech sources by designing a linearly constrained adaptive CDB (LCA-CDB) that preserves the beam-pattern in multiple look directions. The proposed LCA-CDB also adaptively minimizes the transient noise power at the beam-former output and produces controlled nulls (controlled in both amplitude and angle) in the beam-pattern. This strengthens the system in removing permanent directional noises and producing a frequency-invariant beam-pattern with multiple main-lobes and controlled nulls in arbitrary frequency bands. By simulating the system and the acoustical situations, the authors aim to demonstrate the capability of the proposed method to enhance broadband and telephony speech in the presence of various noise sources (transient noise, permanent noise and uncorrelated white Gaussian noise). The simulation results confirm the efficiency of the proposed method in suppressing environmental noise.

2.

Uncertainties caused by source directivity in room-acoustic investigations

San Martín R Arana M 《The Journal of the Acoustical Society of America》2008,123(6):EL133-EL138

Although deviations in the measurement of acoustic parameters should be lower than the subjectively perceivable change in the corresponding parameter measured, this study reflects that directionality of sound sources could cause wide audience areas to break away from this criterion at high frequencies, even when using dodecahedron loudspeakers which meet the requirements of the ISO 3382 standard. The directivity of four different acoustic sources was measured and the influence of its accurate orientation spatially quantified in five enclosures for speech and music. By means of simulation software, the number of receivers affected by uncertainties greater than difference limens was established.

3.

Monaural room acoustic parameters from music and speech

Kendrick P Cox TJ Li FF Zhang Y Chambers JA 《The Journal of the Acoustical Society of America》2008,124(1):278-287

This paper compares two methods for extracting room acoustic parameters from reverberated speech and music. An approach which uses statistical machine learning, previously developed for speech, is extended to work with music. For speech, reverberation time estimations are within a perceptual difference limen of the true value. For music, virtually all early decay time estimations are within a difference limen of the true value. The estimation accuracy is not good enough in other cases due to differences between the simulated data set used to develop the empirical model and real rooms. The second method carries out a maximum likelihood estimation on decay phases at the end of notes or speech utterances. This paper extends the method to estimate parameters relating to the balance of early and late energies in the impulse response. For reverberation time and speech, the method provides estimations which are within the perceptual difference limen of the true value. For other parameters such as clarity, the estimations are not sufficiently accurate due to the natural reverberance of the excitation signals. Speech is a better test signal than music because of the greater periods of silence in the signal, although music is needed for low frequency measurement.

4.

Beamformer performance with acoustic vector sensors in air

Lockwood ME Jones DL 《The Journal of the Acoustical Society of America》2006,119(1):608-619

For some time, compact acoustic vector sensors (AVSs) capable of sensing particle velocity in three orthogonal directions have been used in underwater acoustic sensing applications. Potential advantages of using AVSs in air include substantial noise reduction with a very small aperture and few channels. For this study, a four-microphone array approximating a small (1 cm³) AVS in air was constructed using three gradient microphones and one omnidirectional microphone. This study evaluates the signal extraction performance of one nonadaptive and four adaptive beamforming algorithms. Test signals, consisting of two to five speech sources, were processed with each algorithm, and the signal extraction performance was quantified by calculating the signal-to-noise ratio (SNR) of the output. For a three-microphone array, robust and nonrobust versions of a frequency-domain minimum-variance (FMV) distortionless-response beamformer produced SNR improvements of 11 to 14 dB, and a generalized sidelobe canceller (GSC) produced improvements of 5.5 to 8.5 dB. In comparison, a two-microphone omnidirectional array with a spacing of 15 cm yielded slightly lower SNR improvements for similar multi-interferer speech signals.
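
The SNR-improvement figures quoted in this abstract can be read as output SNR minus input SNR, both in dB. The sketch below illustrates only that arithmetic; the function names and the assumption that the target and interference components are available separately are hypothetical and are not taken from the paper.

```python
import math

def power(x):
    """Mean-square power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def snr_db(signal, noise):
    """SNR in dB, given separately available signal and noise components."""
    return 10.0 * math.log10(power(signal) / power(noise))

def snr_improvement_db(sig_in, noise_in, sig_out, noise_out):
    """Improvement = output SNR minus input SNR (both in dB)."""
    return snr_db(sig_out, noise_out) - snr_db(sig_in, noise_in)

# Toy data: the target is unchanged, the interference amplitude drops tenfold.
target = [1.0, -1.0, 1.0, -1.0]
interference_in = [1.0, 1.0, -1.0, -1.0]
interference_out = [0.1 * v for v in interference_in]
gain_db = snr_improvement_db(target, interference_in, target, interference_out)
```

A tenfold drop in interference amplitude corresponds to a 20 dB improvement, which gives a feel for the scale of the 5.5 to 14 dB figures reported above.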

5.

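Whispered speech enhancement based on a modified Mel-domain masking model and speech absence probability

Tao Zhi (陶智), Zhao Heming (赵鹤鸣), Wu Di (吴迪), Chen Daqing (陈大庆), Zhang Xiaojun (张晓俊) 《Acta Acustica (声学学报)》2009,34(4):370-377

A whispered speech enhancement method based on a modified Mel-domain auditory masking model and speech absence probability is proposed. The method modifies the Mel frequency scale according to the articulatory characteristics of whispered speech and applies Mel-domain band filtering to each frame of the whispered speech signal. At the same time, it uses the speech absence probability (SAP) to determine the auditory masking threshold of each band dynamically, and adaptively adjusts the spectral subtraction coefficients according to the masking thresholds to enhance the whispered speech. Objective and subjective tests on the enhanced whispered speech show that, compared with other spectral subtraction methods, the proposed method keeps residual and background noise below the masking threshold of the human ear, introduces less speech distortion, and greatly improves subjective listening quality.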

6.

Syllable intelligibility for temporally filtered LPC cepstral trajectories.

T Arai M Pavel H Hermansky C Avendano 《The Journal of the Acoustical Society of America》1999,105(5):2783-2791

The intelligibility of syllables whose cepstral trajectories were temporally filtered was measured. The speech signals were transformed to their LPC cepstral coefficients, and these coefficients were passed through different filters. These filtered trajectories were recombined with the residuals and the speech signal reconstructed. The intelligibility of the reconstructed speech segments was then measured in two perceptual experiments for Japanese syllables. The effect of various low-pass, high-pass, and bandpass filtering is reported, and the results summarized using a theoretical approach based on the independence of the contributions in different modulation bands. The overall results suggest that speech intelligibility is not severely impaired as long as the filtered spectral components have a rate of change between 1 and 16 Hz.

7.

Perceived naturalness of spectrally distorted speech and music

Moore BC Tan CT 《The Journal of the Acoustical Society of America》2003,114(1):408-419

We determined how the perceived naturalness of music and speech (male and female talkers) signals was affected by various forms of linear filtering, some of which were intended to mimic the spectral "distortions" introduced by transducers such as microphones, loudspeakers, and earphones. The filters introduced spectral tilts and ripples of various types, variations in upper and lower cutoff frequency, and combinations of these. All of the differently filtered signals (168 conditions) were intermixed in random order within one block of trials. Levels were adjusted to give approximately equal loudness in all conditions. Listeners were required to judge the perceptual quality (naturalness) of the filtered signals on a scale from 1 to 10. For spectral ripples, perceived quality decreased with increasing ripple density up to 0.2 ripple/ERB(N) and with increasing ripple depth. Spectral tilts also degraded quality, and the effects were similar for positive and negative tilts. Ripples and/or tilts degraded quality more when they extended over a wide frequency range (87-6981 Hz) than when they extended over subranges. Low- and mid-frequency ranges were roughly equally important for music, but the mid-range was most important for speech. For music, the highest quality was obtained for the broadband signal (55-16,854 Hz). Increasing the lower cutoff frequency from 55 Hz resulted in a clear degradation of quality. There was also a distinct degradation as the upper cutoff frequency was decreased from 16,854 Hz. For speech, there was a marked degradation when the lower cutoff frequency was increased from 123 to 208 Hz and when the upper cutoff frequency was decreased from 10,869 Hz. Typical telephone bandwidth (313 to 3547 Hz) gave very poor quality.

8.

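Coherent detection of the frequencies of underwater multi-source acoustic signals based on the Hilbert transform

Zhang Xiaolin (张晓琳), Li Kaiqin (李开琴), Liu Gang (刘刚), Zhang Lieshan (张烈山), Tang Wenyan (唐文彦) 《Acta Photonica Sinica (光子学报)》2017,46(7)

To detect and identify underwater low- and mid-frequency acoustic signals, the characteristics of coherent detection signals from multiple underwater sound sources are studied. A characteristic expression for the coherent detection signal under spectral overlap is derived theoretically, and a demodulation method based on the Hilbert transform is proposed that recovers the emission frequency of each source when the spectra of the coherent detection signals overlap. The method filters and smooths the detection signal and then applies the Hilbert transform to obtain its analytic form. The squared modulus of the analytic signal is then filtered and smoothed a second time to separate the overlapping frequency bands, the resulting signal is spectrum-analyzed, and the emission frequency of each underwater source is calculated from the frequency shift values. A laser coherent detection system was built in an optical darkroom, and experiments were carried out on underwater acoustic signals of 2 to 6 kHz. The results show that the method effectively separates the overlapping frequency bands in the detection signal and accurately extracts the emission frequency of each underwater acoustic signal, with a frequency extraction repeatability no greater than 2.5 Hz.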
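
The central step described in this abstract, forming the analytic signal with a Hilbert transform and then analyzing the spectrum of its squared modulus, can be sketched on a toy signal. The example below is an illustration under stated assumptions only: the two-tone test signal, the sampling rate `fs`, and the helper names are hypothetical, the filtering and smoothing stages are omitted, and it does not model the laser interferometric signal of the paper. For two tones at f1 and f2, the squared modulus of the analytic signal contains a component at f2 − f1.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for a short demo)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def analytic_signal(x):
    """Analytic signal via the frequency domain; len(x) is assumed even."""
    n = len(x)
    spec = dft(x)
    gain = [0.0] * n          # negative-frequency bins are zeroed
    gain[0] = 1.0             # DC kept as is
    gain[n // 2] = 1.0        # Nyquist kept as is
    for k in range(1, n // 2):
        gain[k] = 2.0         # positive frequencies doubled
    return dft([s * g for s, g in zip(spec, gain)], inverse=True)

def dominant_bin(x):
    """Index of the largest non-DC spectral peak in the lower half spectrum."""
    mean = sum(x) / len(x)
    spec = dft([v - mean for v in x])
    return max(range(1, len(x) // 2), key=lambda k: abs(spec[k]))

# Hypothetical test signal: fs = n = 256, so one DFT bin equals 1 Hz.
fs = 256
n = 256
f1, f2 = 40, 52
x = [math.cos(2 * math.pi * f1 * t / fs) + math.cos(2 * math.pi * f2 * t / fs)
     for t in range(n)]

z = analytic_signal(x)
env_sq = [abs(v) ** 2 for v in z]   # squared modulus of the analytic signal
peak = dominant_bin(env_sq)         # bin of the difference frequency f2 - f1
```

In practice a library routine such as an FFT-based Hilbert transform would replace the naive DFT used here; the naive version only keeps the sketch free of dependencies.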

9.

Shear Wave Field Radiated by an Electromagnetic Acoustic Transducer

Wu Di (吴迪), Li Mingxuan (李明轩), Wang Xiaomin (王小民) 《Chinese Physics Letters (中国物理快报)》2006,23(12):3294-3296

The horizontally polarized ultrasonic shear wave field emitted by an electromagnetic acoustic transducer (EMAT) is studied, with the surface force distribution on the EMAT approximately described as an inhomogeneous horizontal shear force. The shear wave directivity pattern is plotted by numerical calculation based on the strictly analytic solutions of the wave field that we presented previously. An experimental system with EMAT generation and piezoelectric transducer reception is set up to check the theoretical predictions by measuring ultrasonic signals transmitted through an aluminium block. The directivity pattern obtained from the experimental results conforms to the theoretical prediction, which lays a foundation for engineering applications of EMATs.

10.

Accurate analysis of multitone signals using a DFT

Burgess JC 《The Journal of the Acoustical Society of America》2004,116(1):389-395

Optimum data windows make it possible to determine accurately the amplitude, phase, and frequency of one or more tones (sinusoidal components) in a signal. Procedures presented in this paper can be applied to noisy signals, signals having moderate nonstationarity, and tones close in frequency. They are relevant to many areas of acoustics where sounds are quasistationary. Among these are acoustic probes transmitted through media and natural sounds, such as animal vocalization, speech, and music. The paper includes criteria for multitone FFT block design and an example of application to sound transmission in the atmosphere.

11.

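A discussion of the acoustic design of gymnasiums

Liu Xiaotu (柳孝图) 《Journal of Applied Acoustics (应用声学)》1996,15(1):20-25

This paper analyzes the geometric features shared by multi-purpose gymnasiums in China and the problems these features cause when architectural acoustics is applied, discusses the relevant acoustic criteria, and uses engineering practice as an example to show that good acoustic design of a gymnasium depends on combining architectural acoustic design with electro-acoustic design.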

13.

Acoustic comfort in large dining spaces

Xi Chen Jian Kang 《Applied Acoustics》2017

This study carried out a questionnaire field investigation in two typical large dining spaces. The results suggest that the acoustic comfort of diners influences their comfort evaluation of the overall dining environment, and that background noise is an important factor affecting their acoustic comfort evaluation. The role of various individual sound sources in background noise has been investigated, considering general background music, speech sound, activity sound, and mechanical noise, and it has been revealed that background music, other diners' speech sound and the impact sound of tableware have a dominant impact on the acoustic comfort evaluation of diners. Diners' acoustic comfort evaluation is higher when background music is present in the background noise than when it is absent. The loudness, articulation, noise level and preference degree of various individual sound sources are factors which affect diners' acoustic comfort evaluation of sound sources. In terms of demographic and social factors, gender and the frequency of dining out have a significant impact on diners' acoustic comfort evaluation.

14.

Identification of two simultaneous partial discharge sources in an oil-pressboard insulation system using acoustic emission techniques

Prasanta Kundu N.K. Kishore A.K. Sinha 《Applied Acoustics》2012,73(4):395-401

Insulation failure is one of the major causes of catastrophic failure of transformers. It is established that partial discharge (PD) causes insulation degradation and premature failure of insulation. In power apparatus, more than one PD source may be active simultaneously. The nature of insulation degradation for multiple PD sources is different from that due to a single PD source. Identifying and classifying the active PD sources would therefore help in assessing the severity of insulation degradation. This paper presents a method for identification and classification of two simultaneously active PD sources using acoustic emission techniques. The acoustic emission (AE) signals are measured for laboratory simulated PD in an oil-pressboard insulation system for three different electrode systems. The measurements of partial discharge acoustic emission (PDAE) signals are carried out for a single PD source and for two simultaneous PD sources. The measured signals are analyzed using discrete wavelet transform (DWT), box counting fractal dimension and lacunarity. Box counting fractal dimension and lacunarity are calculated for the DWT decomposed signal of the major frequency band. Energy distribution in different frequency bands of the DWT decomposed signal, along with box counting fractal dimension and lacunarity, is used for classification of two simultaneous PD sources.

15.

Improvements in intelligibility of noisy reverberant speech using a binaural subband adaptive noise-cancellation processing scheme.

P W Shields D R Campbell 《The Journal of the Acoustical Society of America》2001,110(6):3232-3242

This article reports on the performance of an adaptive subband noise cancellation scheme, which performs binaural preprocessing of speech signals for a hearing-aid application. The multi-microphone subband adaptive (MMSBA) signal processing scheme uses the least mean squares (LMS) algorithm in frequency-limited subbands. The use of subbands enables a diverse processing mechanism to be employed, splitting the two-channel wide-band signal into smaller frequency-limited subbands, which can be processed according to their individual signal characteristics. The frequency delimiting used a linear- or cochlear-spaced subband distribution. The effect of the processing scheme on speech intelligibility was assessed in a trial involving 15 hearing-impaired volunteers with moderate sensorineural hearing loss. The acoustic material consisted of speech and speech-shaped noise signals, generated using simulated and real-room acoustic environments, at signal-to-noise ratios (SNRs) in the range -6 to +3 dB. The results show that the MMSBA scheme delivered average speech intelligibility improvements of 11.5%, with a maximum of 37.25%, in noisy reverberant conditions. There was no significant reduction in mean speech intelligibility due to processing, in any of the test conditions.
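
The LMS algorithm named in this abstract can be illustrated with a minimal single-band noise canceller. This is a sketch under stated assumptions, not the MMSBA scheme itself: it uses one full-band filter rather than subbands, and the tap count, step size `mu`, synthetic "speech" sinusoid, and noise leakage path are all hypothetical values chosen for the demonstration.

```python
import math
import random

def lms_cancel(primary, reference, taps=4, mu=0.01):
    """LMS noise canceller: subtract an adaptively filtered reference
    from the primary channel and return the error (enhanced) signal."""
    w = [0.0] * taps            # adaptive filter weights
    buf = [0.0] * taps          # most recent reference samples, newest first
    out = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # noise estimate
        e = d - y                                    # enhanced output sample
        w = [wi + 2.0 * mu * e * bi for wi, bi in zip(w, buf)]
        out.append(e)
    return out

def power(x):
    """Mean-square power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

rng = random.Random(0)          # fixed seed keeps the demo repeatable
n = 4000
speech = [0.5 * math.sin(2 * math.pi * 0.01 * k) for k in range(n)]
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Hypothetical two-tap path by which the noise leaks into the primary channel.
leak = [0.6 * noise[k] + 0.3 * (noise[k - 1] if k > 0 else 0.0) for k in range(n)]
primary = [s + v for s, v in zip(speech, leak)]

enhanced = lms_cancel(primary, noise)

# Compare residual noise power after the filter has had time to converge.
tail = slice(n // 2, n)
noise_before = power([p - s for p, s in zip(primary[tail], speech[tail])])
noise_after = power([e - s for e, s in zip(enhanced[tail], speech[tail])])
```

A subband variant would run one such filter per frequency-limited band, which lets the step size and filter length be chosen per band according to that band's signal characteristics.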

16.

Across-ear interference from parametrically degraded synthetic speech signals in a dichotic cocktail-party listening task

Brungart DS Simpson BD Darwin CJ Arbogast TL Kidd G 《The Journal of the Acoustical Society of America》2005,117(1):292-304

Recent results have shown that listeners attending to the quieter of two speech signals in one ear (the target ear) are highly susceptible to interference from normal or time-reversed speech signals presented in the unattended ear. However, speech-shaped noise signals have little impact on the segregation of speech in the opposite ear. This suggests that there is a fundamental difference between the across-ear interference effects of speech and nonspeech signals. In this experiment, the intelligibility and contralateral-ear masking characteristics of three synthetic speech signals with parametrically adjustable speech-like properties were examined: (1) a modulated noise-band (MNB) speech signal composed of fixed-frequency bands of envelope-modulated noise; (2) a modulated sine-band (MSB) speech signal composed of fixed-frequency amplitude-modulated sinewaves; and (3) a "sinewave speech" signal composed of sine waves tracking the first four formants of speech. In all three cases, a systematic decrease in performance in the two-talker target-ear listening task was found as the number of bands in the contralateral speech-like masker increased. These results suggest that speech-like fluctuations in the spectral envelope of a signal play an important role in determining the amount of across-ear interference that a signal will produce in a dichotic cocktail-party listening task.

17.

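A multichannel blind speech extraction method based on pattern recognition

Xu Shun (徐舜), Liu Yulin (刘郁林), Bai Sen (柏森) 《Journal of Applied Acoustics (应用声学)》2008,27(3):173-180

Blind separation algorithms can estimate the original sources from the observed signals alone, without knowledge of the mixing system parameters, but the separated signals have an inherent permutation ambiguity. As a result, the same signal often fails to line up across two successive batch-processing runs, so continuous source signals are hard to obtain. To address this problem in blind audio source separation, and based on the differences between the features of speech and other audio signals, this paper proposes a modified autocorrelation function whose value serves as one feature primitive describing the temporal correlation of a sound signal, and uses the average glottal waveform shape parameter as a second feature primitive describing the physiological effects of speech production. With these two parameters as a two-dimensional pattern feature for distinguishing audio signals, a fuzzy clustering algorithm is used to extract multiple channels of blindly separated speech. The method effectively overcomes the uncertainty in signal ordering in batch-mode blind source separation and extracts multiple channels of continuous speech by choosing suitable thresholds. Simulations present experimental results for blindly extracting two channels of continuous speech from five mixed audio signals.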

18.

The directivity pattern of an interferometer for measuring thermal acoustic radiation

V. I. Passechnik 《Acoustical Physics》2002,48(5):589-597

The directivity patterns of a pair of piezoelectric transducers for measuring the spatial correlation function of sound pressures produced by sources of thermal acoustic radiation in the megahertz frequency range are calculated. Sources in the form of a heated plane or strip are considered. The signal detection by two circular or rectangular piezoelectric transducers and by focusing transducers is studied. It is demonstrated that, for measuring the correlation function, the piezoelectric transducers must partially overlap. To determine the directivity pattern with a strong dependence on the distance between the heated object and the pair of piezoelectric transducers, focusing piezoelectric transducers should be used. The results obtained offer possibilities for a noninvasive measurement of the absorption coefficient of a medium and also for the realization of the previously proposed [20] passive acoustic thermotomograph, which does not use a priori information on the absorption coefficient of the medium.

19.

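Place-pitch perception in the electrical hearing of post-lingually deafened cochlear implant users

Ping Lichuan (平利川), Yuan Meng (原猛), Tang Guofang (唐国芳), Feng Haihong (冯海泓) 《Acta Acustica (声学学报)》2012,37(2):204-208

Place-pitch perception in the electrical hearing of cochlear implant (CI) recipients was investigated systematically, and its relationship to speech recognition and music perception was examined. Four adult post-lingually deafened CI recipients took part. Place-pitch perception ability was measured with an electrode pitch-ranking test; a speech test and a musical pitch discrimination test were used to assess speech recognition and music perception, respectively. As the stimulation site moved from the apex to the base of the cochlea, all subjects perceived pitch changing from "low" to "high", although individual differences were large. Speech recognition results were related to place-pitch perception ability, but because of a ceiling effect the correspondence was not clear. Musical pitch discrimination scores corresponded well with place-pitch perception ability. The results indicate that the acoustic features conveyed by current CI sound coding strategies already allow recipients to achieve good speech recognition, and that speech recognition in quiet places little demand on place-pitch perception ability. Current coding strategies do not, however, encode music signals effectively; when recipients listen to complex signals such as music, their place-pitch perception ability determines their listening performance to some extent.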

20.

Place-pitch perception in electrical hearing with post-lingually deafened cochlear implant users

《Chinese Journal of Acoustics (声学学报：英文版)》2012,(4):482-488

The main goal of this study was to systematically investigate place-pitch perception in electrical hearing and the relationship between place-pitch perception ability, speech understanding and musical pitch discrimination in cochlear implant (CI) users. An electrode pitch ranking test was carried out to evaluate the place-pitch perception ability of CI users. Four post-lingually deafened CI users were recruited. They also participated in a speech recognition test and a musical pitch discrimination test. Results showed that place pitch was generally ordered from apical to basal electrodes: the apical electrodes were judged lower in pitch than the basal electrodes. Large individual differences were found. Comparing pitch and speech performance, the speech recognition result was related to the place-pitch perception ability of CI users, but this relationship was limited by ceiling effects. However, a correlation was found between the musical pitch discrimination result and the place-pitch ability of CI users. This indicates that the current signal processing of CI systems can provide sufficient information for speech understanding but not for music perception. To a certain extent, the music perception of CI users was determined by their place-pitch abilities.