Found 20 similar documents (search time: 15 ms)
1.
Rhebergen KS Versfeld NJ Dreschler WA 《The Journal of the Acoustical Society of America》2006,120(6):3988-3997
The extension to the speech intelligibility index (SII; ANSI S3.5-1997 (1997)) proposed by Rhebergen and Versfeld [Rhebergen, K.S., and Versfeld, N.J. (2005). J. Acoust. Soc. Am. 117(4), 2181-2192] is able to predict for normal-hearing listeners the speech intelligibility in both stationary and fluctuating noise maskers with reasonable accuracy. The extended SII model was validated with speech reception threshold (SRT) data from the literature. However, further validation is required and the present paper describes SRT experiments with nonstationary noise conditions that are critical to the extended model. From these data, it can be concluded that the extended SII model is able to predict the SRTs for the majority of conditions, but that predictions are better when the extended SII model includes a function to account for forward masking.
2.
Siren noises usually severely disturb the intelligibility of voice communication inside the cabs of police, paramedic, and fire vehicles. It is often desirable to remove such unwanted noise from the speech signal. In this paper, a new method is proposed to adaptively cancel siren noises and enhance speech signals. Based on the characteristics of siren noises, an anti-speech filter and a time delayer are employed in single- and dual-channel noise cancellation systems to reduce the siren noises. Experimental results demonstrate the effectiveness of the proposed method in canceling siren noises, and the quality of the enhanced speech signal is satisfactory.
3.
Auditory and nonauditory factors affecting speech reception in noise by older listeners (total citations: 2, self-citations: 0, citations by others: 2)
George EL Zekveld AA Kramer SE Goverts ST Festen JM Houtgast T 《The Journal of the Acoustical Society of America》2007,121(4):2362-2375
Speech reception thresholds (SRTs) for sentences were determined in stationary and modulated background noise for two age-matched groups of normal-hearing (N = 13) and hearing-impaired listeners (N = 21). Correlations were studied between the SRT in noise and measures of auditory and nonauditory performance, after which stepwise regression analyses were performed within both groups separately. Auditory measures included the pure-tone audiogram and tests of spectral and temporal acuity. Nonauditory factors were assessed by measuring the text reception threshold (TRT), a visual analogue of the SRT, in which partially masked sentences were adaptively presented. Results indicate that, for the normal-hearing group, the variance in speech reception is mainly associated with nonauditory factors, both in stationary and in modulated noise. For the hearing-impaired group, speech reception in stationary noise is mainly related to the audiogram, even when audibility effects are accounted for. In modulated noise, both auditory (temporal acuity) and nonauditory factors (TRT) contribute to explaining interindividual differences in speech reception. Age was not a significant factor in the results. It is concluded that, under some conditions, nonauditory factors are relevant for the perception of speech in noise. Further evaluation of nonauditory factors might enable adapting the expectations from auditory rehabilitation in clinical settings.
4.
Noordhoek IM Houtgast T Festen JM 《The Journal of the Acoustical Society of America》2000,107(3):1685-1696
In a previous study [Noordhoek et al., J. Acoust. Soc. Am. 105, 2895-2902 (1999)], an adaptive test was developed to determine the speech-reception bandwidth threshold (SRBT), i.e., the width of a speech band around 1 kHz required for a 50% intelligibility score. In this test, the band-filtered speech is presented in complementary bandstop-filtered noise. In the present study, the performance of 34 hearing-impaired listeners was measured on this SRBT test and on more common SRT (speech-reception threshold) tests, namely the SRT in quiet, the standard SRT in noise (standard speech spectrum), and the spectrally adapted SRT in noise (fitted to the individual's dynamic range). The aim was to investigate to what extent the performance on these tests could be explained simply from audibility, as estimated with the SII (speech intelligibility index) model, or require the assumption of suprathreshold deficits. For most listeners, an elevated SRT in quiet or an elevated standard SRT in noise could be explained on the basis of audibility. For the spectrally adapted SRT in noise, and especially for the SRBT, the data of most listeners could not be explained from audibility, suggesting that the effects of suprathreshold deficits may be present. Possibly, such a deficit is an increased downward spread of masking.
5.
I M Noordhoek T Houtgast J M Festen 《The Journal of the Acoustical Society of America》1999,105(5):2895-2902
An adaptive test has been developed to determine the minimum bandwidth of speech that a listener needs to reach 50% intelligibility. Measuring this speech-reception bandwidth threshold (SRBT), in addition to the more common speech-reception threshold (SRT) in noise, may be useful in investigating the factors underlying impaired suprathreshold speech perception. Speech was bandpass filtered (center frequency: 1 kHz) and complementary bandstop-filtered noise was added. To obtain reference values, the SRBT was measured in 12 normal-hearing listeners at four sound-pressure levels, in combination with three overall spectral tilts. Plotting SRBT as a function of sound-pressure level resulted in U-shaped curves. The narrowest SRBT (1.4 octave) was obtained at an A-weighted sound-pressure level of 55 dB. The required bandwidth increases with increasing level, probably due to upward spread of masking. At a lower level (40 dBA) listeners also need a broader band, because parts of the speech signal will be below threshold. The SII (Speech Intelligibility Index) model reasonably predicts the data, although it seems to underestimate upward spread of masking.
6.
Speech-in-noise measurements are important in clinical practice and have been the subject of research for a long time. The results of these measurements are often described in terms of the speech reception threshold (SRT) and SNR loss. Using the basic concepts that underlie several models of speech recognition in steady-state noise, the present study shows that these measures are ill-defined, most importantly because the slope of the speech recognition functions for hearing-impaired listeners always decreases with hearing loss. This slope can be determined from the slope of the normal-hearing speech recognition function when the SRT for the hearing-impaired listener is known. The SII function (i.e., the speech intelligibility index (SII) against SNR) is important and provides insights into many potential pitfalls when interpreting SRT data. Standardized SNR loss, sSNR loss, is introduced as a universal measure of hearing loss for speech in steady-state noise. Experimental data demonstrate that, unlike the SRT or SNR loss, sSNR loss is invariant to the target point chosen, the scoring method, or the type of speech material.
7.
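Speech separation with an adaptive noise cancellation system (total citations: 1, self-citations: 1, citations by others: 1)
A speech separation algorithm is studied for an adaptive noise cancellation system (CTRANC) that suppresses interfering speech. Using the interference-suppression property of CTRANC and the short-time stationarity of speech signals, and drawing on optimal control theory, a new speech separation method and a formula for the iteration step size of its adaptive filter are derived. Experimental results show that, in the two-talker case, this algorithm gives the adaptive speech separation system good stability and good real-time tracking and convergence performance, and that the separated speech has satisfactory intelligibility.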
8.
A key problem for telecommunication or human-machine communication systems concerns speech enhancement in noise. In this domain, a certain number of techniques exist, all of them based on an acoustic-only approach: that is, the processing of the corrupted audio signal using audio information (from the corrupted signal only or additive audio information). In this paper, an audio-visual approach to the problem is considered, since it has been demonstrated in several studies that viewing the speaker's face improves message intelligibility, especially in noisy environments. A speech enhancement prototype system that takes advantage of visual inputs is developed. A filtering approach is proposed that uses enhancement filters estimated with the help of lip shape information. The estimation process is based on linear regression or simple neural networks using a training corpus. A set of experiments assessed by Gaussian classification and perceptual tests demonstrates that it is indeed possible to enhance simple stimuli (vowel-plosive-vowel sequences) embedded in white Gaussian noise.
9.
Effect of spectral envelope smearing on speech reception. I. (total citations: 2, self-citations: 0, citations by others: 2)
M ter Keurs J M Festen R Plomp 《The Journal of the Acoustical Society of America》1992,91(5):2872-2880
The effect of reduced spectral contrast on the speech-reception threshold (SRT) for sentences in noise and on phoneme identification was investigated with 16 normal-hearing subjects. Signal processing was performed by smoothing the envelope of the squared short-time fast Fourier transform (FFT) by convolving it with a Gaussian-shaped filter, and overlapping additions to reconstruct a continuous signal. Spectral energy in the frequency region from 100 to 8000 Hz was smeared over bandwidths of 1/8, 1/4, 1/3, 1/2, 1, 2, and 4 oct for the SRT experiment. Vowel and consonant identification was studied for smearing bandwidths of 1/8, 1/2, and 2 oct. Results showed the SRT in noise to increase as the spectral energy was smeared over bandwidths exceeding the ear's critical bandwidth. Vowel identification suffered more from this type of processing than consonant identification. Vowels were primarily confused with the back vowels /ɔ, u/, and consonants were confused where place of articulation is concerned.
10.
11.
12.
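This paper analyzes how the detection error of a Shack-Hartmann wavefront sensor (S-H WFS) measuring the atmospheric turbulence wavefront phase under real atmospheric conditions propagates through an adaptive optics system (AOS), together with the resulting residual control variance. A mathematical model for quantitative analysis is derived and analysis results are given. The results show that, when the S-H WFS is used to detect atmospheric turbulence with a faint beacon, the residual slope control error of the AOS contains, in addition to the errors analyzed previously [1], a constant error term caused by the centroid position of the sky-background light spot. Moreover, the effective control bandwidth of the system decreases as the beacon detection contrast falls, which greatly reduces the correction capability of the AOS. The analysis also shows that the fainter the beacon, the stricter the requirement on the aberrations of the S-H WFS calibration optics.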
13.
The effect of head-induced interaural time delay (ITD) and interaural level differences (ILD) on binaural speech intelligibility in noise was studied for listeners with symmetrical and asymmetrical sensorineural hearing losses. The material, recorded with a KEMAR manikin in an anechoic room, consisted of speech, presented from the front (0 degrees), and noise, presented at azimuths of 0 degrees, 30 degrees, and 90 degrees. Derived noise signals, containing either only ITD or only ILD, were generated using a computer. For both groups of subjects, speech-reception thresholds (SRT) for sentences in noise were determined as a function of: (1) noise azimuth, (2) binaural cue, and (3) an interaural difference in overall presentation level, simulating the effect of a monaural hearing aid. Comparison of the mean results with corresponding data obtained previously from normal-hearing listeners shows that the hearing impaired have a 2.5 dB higher SRT in noise when both speech and noise are presented from the front, and 2.6-5.1 dB less binaural gain when the noise azimuth is changed from 0 degrees to 90 degrees. The gain due to ILD varies among the hearing-impaired listeners between 0 dB and normal values of 7 dB or more. It depends on the high-frequency hearing loss at the side presented with the most favorable signal-to-noise (S/N) ratio. The gain due to ITD is nearly normal for the symmetrically impaired (4.2 dB, compared with 4.7 dB for the normal hearing), but only 2.5 dB in the case of asymmetrical impairment. When ITD is introduced in noise already containing ILD, the resulting gain is 2-2.5 dB for all groups. The only marked effect of the interaural difference in overall presentation level is a reduction of the gain due to ILD when the level at the ear with the better S/N ratio is decreased. This implies that an optimal monaural hearing aid (with a moderate gain) will hardly interfere with unmasking through ITD, while it may increase the gain due to ILD by preventing or diminishing threshold effects.
14.
The present study investigated changes in the prosodic and acoustic-phonetic features of isolated words by four male talkers speaking in quiet and in pink noise at a level of 95 dB SPL. Speech samples were collected both with and without an oxygen mask. Changes in duration, fundamental frequency, total energy, and formant center frequency were analyzed. In addition to the expected changes of increased pitch and amplitude associated with speaking in noise without an oxygen mask, significant effects were found (particularly in the formant center frequencies) as a result of using the oxygen mask. When the oxygen mask was employed, no further significant changes were caused by adding noise to the speaking situation.
15.
16.
17.
Responses of auditory-nerve fibers in anesthetized cats to nine different spoken stop- and nasal-consonant/vowel syllables presented at 70 dB SPL in various levels of speech-shaped noise [signal-to-noise (S/N) ratios of 30, 20, 10, and 0 dB] are reported. The temporal aspects of speech encoding were analyzed using spectrograms. The responses of the "lower-spontaneous-rate" fibers (less than 20/s) were found to be more limited than those of the high-spontaneous-rate fibers. The lower-spontaneous-rate fibers did not encode noise-only portions of the stimulus at the lowest noise level (S/N = 30 dB) and only responded to the consonant if there was a formant or major spectral peak near its characteristic frequency. The fibers' responses at the higher noise levels were compared to those obtained at the lowest noise level using the covariance as a quantitative measure of signal degradation. The lower-spontaneous-rate fibers were found to preserve more of their initial temporal encoding than high-spontaneous-rate fibers of the same characteristic frequency. The auditory-nerve fibers' responses were also analyzed for rate-place encoding of the stimuli. The results are similar to those found for temporal encoding.
18.
19.
20.
The purpose of the present study was to examine the benefits of providing audible speech to listeners with sensorineural hearing loss when the speech is presented in a background noise. Previous studies have shown that when listeners have a severe hearing loss in the higher frequencies, providing audible speech (in a quiet background) to these higher frequencies usually results in no improvement in speech recognition. In the present experiments, speech was presented in a background of multitalker babble to listeners with various severities of hearing loss. The signal was low-pass filtered at numerous cutoff frequencies and speech recognition was measured as additional high-frequency speech information was provided to the hearing-impaired listeners. It was found in all cases, regardless of hearing loss or frequency range, that providing audible speech resulted in an increase in recognition score. The change in recognition as the cutoff frequency was increased, along with the amount of audible speech information in each condition (articulation index), was used to calculate the "efficiency" of providing audible speech. Efficiencies were positive for all degrees of hearing loss. However, the gains in recognition were small, and the maximum score obtained by any listener was low, due to the noise background. An analysis of error patterns showed that due to the limited speech audibility in a noise background, even severely impaired listeners used additional speech audibility in the high frequencies to improve their perception of the "easier" features of speech including voicing.