Similar literature
 20 similar articles found (search took 15 ms)
1.
In a previous study [Noordhoek et al., J. Acoust. Soc. Am. 105, 2895-2902 (1999)], an adaptive test was developed to determine the speech-reception bandwidth threshold (SRBT), i.e., the width of a speech band around 1 kHz required for a 50% intelligibility score. In this test, the band-filtered speech is presented in complementary bandstop-filtered noise. In the present study, the performance of 34 hearing-impaired listeners was measured on this SRBT test and on more common SRT (speech-reception threshold) tests, namely the SRT in quiet, the standard SRT in noise (standard speech spectrum), and the spectrally adapted SRT in noise (fitted to the individual's dynamic range). The aim was to investigate to what extent the performance on these tests could be explained simply from audibility, as estimated with the SII (speech intelligibility index) model, or required the assumption of suprathreshold deficits. For most listeners, an elevated SRT in quiet or an elevated standard SRT in noise could be explained on the basis of audibility. For the spectrally adapted SRT in noise, and especially for the SRBT, the data of most listeners could not be explained from audibility, suggesting that the effects of suprathreshold deficits may be present. Possibly, such a deficit is an increased downward spread of masking.

2.
Speech-in-noise measurements are important in clinical practice and have been the subject of research for a long time. The results of these measurements are often described in terms of the speech reception threshold (SRT) and SNR loss. Using the basic concepts that underlie several models of speech recognition in steady-state noise, the present study shows that these measures are ill-defined, most importantly because the slope of the speech recognition functions for hearing-impaired listeners always decreases with hearing loss. This slope can be determined from the slope of the normal-hearing speech recognition function when the SRT for the hearing-impaired listener is known. The SII function (i.e., the speech intelligibility index (SII) against SNR) is important and provides insights into many potential pitfalls when interpreting SRT data. Standardized SNR loss, sSNR loss, is introduced as a universal measure of hearing loss for speech in steady-state noise. Experimental data demonstrate that, unlike the SRT or SNR loss, sSNR loss is invariant to the target point chosen, the scoring method, or the type of speech material.

3.
Speech reception thresholds (SRTs) for sentences were determined in stationary and modulated background noise for two age-matched groups of normal-hearing (N = 13) and hearing-impaired listeners (N = 21). Correlations were studied between the SRT in noise and measures of auditory and nonauditory performance, after which stepwise regression analyses were performed within both groups separately. Auditory measures included the pure-tone audiogram and tests of spectral and temporal acuity. Nonauditory factors were assessed by measuring the text reception threshold (TRT), a visual analogue of the SRT, in which partially masked sentences were adaptively presented. Results indicate that, for the normal-hearing group, the variance in speech reception is mainly associated with nonauditory factors, both in stationary and in modulated noise. For the hearing-impaired group, speech reception in stationary noise is mainly related to the audiogram, even when audibility effects are accounted for. In modulated noise, both auditory (temporal acuity) and nonauditory factors (TRT) contribute to explaining interindividual differences in speech reception. Age was not a significant factor in the results. It is concluded that, under some conditions, nonauditory factors are relevant for the perception of speech in noise. Further evaluation of nonauditory factors might enable adapting the expectations from auditory rehabilitation in clinical settings.

4.
The extension to the speech intelligibility index (SII; ANSI S3.5-1997 (1997)) proposed by Rhebergen and Versfeld [Rhebergen, K.S., and Versfeld, N.J. (2005). J. Acoust. Soc. Am. 117(4), 2181-2192] is able to predict for normal-hearing listeners the speech intelligibility in both stationary and fluctuating noise maskers with reasonable accuracy. The extended SII model was validated with speech reception threshold (SRT) data from the literature. However, further validation is required and the present paper describes SRT experiments with nonstationary noise conditions that are critical to the extended model. From these data, it can be concluded that the extended SII model is able to predict the SRTs for the majority of conditions, but that predictions are better when the extended SII model includes a function to account for forward masking.

5.
Some current single-microphone hearing aids employ techniques for adaptively varying the frequency-gain characteristics in an attempt to improve speech reception in noise. The potential benefit of this strategy depends on the spectral spread of masking and the degree to which it can be reduced by changing the frequency-gain characteristic. In this study these benefits were examined for subjects with normal hearing under static listening conditions. In the unprocessed condition, subjects were presented with nonsense syllables in an octave-band noise centered on 0.5, 1, or 2 kHz. The frequency-gain characteristic was then modified with the goal of reducing the intensity of the frequency region containing the octave-band noise. This processing resulted in increases as large as 60 percentage points in consonant-correct scores with the low- and mid-frequency octave noise bands, and a small increase with the high-frequency noise. Masking patterns produced by the octave noises were also measured and were related to the intelligibility results via an analysis based on Articulation Theory. The Articulation Index was also used to compare the effectiveness of three adaptive rules. A simple multiband volume control is expected to provide much of the benefit of more sophisticated systems without the need for separate estimation of input speech and noise spectra.

6.
Effect of spectral envelope smearing on speech reception. I.
The effect of reduced spectral contrast on the speech-reception threshold (SRT) for sentences in noise and on phoneme identification was investigated with 16 normal-hearing subjects. Signal processing was performed by smoothing the envelope of the squared short-time fast Fourier transform (FFT) by convolving it with a Gaussian-shaped filter, and by overlapping additions to reconstruct a continuous signal. Spectral energy in the frequency region from 100 to 8000 Hz was smeared over bandwidths of 1/8, 1/4, 1/3, 1/2, 1, 2, and 4 oct for the SRT experiment. Vowel and consonant identification was studied for smearing bandwidths of 1/8, 1/2, and 2 oct. Results showed the SRT in noise to increase as the spectral energy was smeared over bandwidths exceeding the ear's critical bandwidth. Vowel identification suffered more from this type of processing than consonant identification. Vowels were primarily confused with the back vowels /c,u/, and consonants were confused where place of articulation is concerned.

7.
A method is described to select sentence materials for efficient measurement of the speech reception threshold (SRT). The first part of the paper addresses the creation of the sentence materials, the recording procedure, and a listening experiment to evaluate the new speech materials. The result is a set of 1272 sentences, where every sentence has been uttered by two male and two female speakers. In the second part of the paper, a method is described to select subsets with properties that are desired for an efficient measurement of the SRT. For two speakers, this method has been applied to obtain two subsets for measurement of the SRT in stationary noise with the long-term average spectrum of speech. Lastly, a listening experiment has been conducted where the two subsets (each comprising 39 lists of 13 sentences each) are directly compared to the existing sets of Plomp and Mimpen [Audiology 18, 43-52 (1979)] and Smoorenburg [J. Acoust. Soc. Am. 91, 421-437 (1992)]. One of the outcomes is that the newly developed sets can be considered as equivalent to these existing sets.

8.
The threshold filter is a frequently used technique in ultrasound B-scan to reject the small echoes contributed from backscattering that blur the tissue interface and reduce the image contrast. Note that using a threshold based on one value would simultaneously destroy local waveform features of the reflection echoes with amplitudes larger than the threshold value. To resolve this problem, we developed an adaptive threshold filter based on the noise-assisted empirical mode decomposition (EMD). Computer simulations at 7.5 MHz using a single-element transducer with a bandwidth of 60% and a pulse length of 0.5 μs were carried out to explore the feasibility of the algorithm. Image measurements on the carotid artery using a 7.5 MHz, 128-element, 1D linear array transducer with the same characteristics as those in simulations were used to verify the performance of the algorithm in practice. Compared to the result from the conventional threshold technique, the adaptive threshold filter is able to successfully suppress the smaller backscattering signals without changing the local waveform features of the preserved significant echoes due to reflection.

9.
This study compares performance of 24 young normal-hearing (aged 18-28 years) and 24 elderly (aged 61-85 years) listeners on auditive (sensitivity, frequency selectivity, and temporal resolution), cognitive (memory performance, processing speed, and divided attention ability), and speech perception tests (at the phoneme, spondee, and sentence level). Its principal aim is to assess whether the tests selected yield meaningful results. The results obtained will be used to reduce the test battery in order to be manageable in a second study on a much larger number of elderly listeners. The relationships between the tests are explored by multivariate statistical methods. The results show that: (a) in young listeners, individual differences in speech perception performance are remarkably small, resulting in low correlations between the tests, while in the elderly, tests of phoneme, spondee, and sentence perception overlap considerably; (b) speech perception in the elderly seems to be largely determined by hearing loss at the higher frequencies, whereas the effects of other auditive and cognitive factors seem to be relatively small or absent; and (c) performance in the elderly is only partly correlated with age.

10.
In psychoacoustic studies there is often a need to assess performance indices quickly and reliably. The aim of this study was to establish a quick and reliable procedure for evaluating thresholds in backward masking and frequency discrimination tasks. Based on simulations, four procedures likely to produce the best results were selected, and data were collected from 20 naive adult listeners on each. Each procedure used one of two adaptive methods (staircase or maximum-likelihood estimation, each targeting the 79% correct point on the psychometric function) and one of two response paradigms (3-interval, 2-alternative forced-choice AXB or 3-interval, 3-alternative forced-choice oddball). All procedures yielded statistically equivalent threshold estimates in both backward masking and frequency discrimination, with a trend to lower thresholds for oddball procedures in frequency discrimination. Oddball procedures were both more efficient and more reliable (test-retest) in backward masking, but all four procedures were equally efficient and reliable in frequency discrimination. Fitted psychometric functions yielded similar thresholds to averaging over reversals in staircase procedures. Learning was observed across threshold-assessment blocks and experimental sessions. In four additional groups, each of ten listeners, trained on the different procedures, no differences in performance improvement or rate of learning were observed, suggesting that learning is independent of procedure.
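The abstract does not say which staircase rule was used. As an illustration only, the sketch below assumes the common 3-down 1-up rule, whose equilibrium point is 0.5^(1/3), about 79% correct, and simulates it against a hypothetical listener. The logistic psychometric function, its parameters, the step size, and all function names are assumptions made for this sketch, not details from the study.

```python
import math
import random


def p_correct(level, threshold=0.0, slope=1.0, guess=1.0 / 3.0):
    """Hypothetical logistic psychometric function with a guessing floor.

    guess=1/3 mimics a 3-alternative task; it is an assumption of this sketch.
    """
    return guess + (1.0 - guess) / (1.0 + math.exp(-slope * (level - threshold)))


def target_probability(n_down=3):
    """Proportion correct at which an n-down 1-up track is in equilibrium."""
    return 0.5 ** (1.0 / n_down)


def three_down_one_up(start, step, n_trials, rng):
    """Run a simulated track and return the stimulus levels at reversals."""
    level, run, direction, reversals = start, 0, 0, []
    for _ in range(n_trials):
        if rng.random() < p_correct(level):
            run += 1
            if run < 3:
                continue          # level changes only after 3 correct in a row
            run, move = 0, -1     # three correct: make the task harder
        else:
            run, move = 0, +1     # any error: make the task easier
        if direction and move != direction:
            reversals.append(level)   # track changed direction at this level
        direction = move
        level += move * step
    return reversals


rng = random.Random(1)            # fixed seed so the sketch is repeatable
reversals = three_down_one_up(start=3.0, step=0.25, n_trials=3000, rng=rng)
estimate = sum(reversals[4:]) / len(reversals[4:])   # drop early reversals
print(round(target_probability(), 3), round(p_correct(estimate), 3))
```

Averaging the reversal levels, as done here, corresponds to the "averaging over reversals" threshold estimate that the abstract compares with fitted psychometric functions.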

11.
Speech produced in the presence of noise (Lombard speech) is more intelligible in noise than speech produced in quiet, but the origin of this advantage is poorly understood. Some of the benefit appears to arise from auditory factors such as energetic masking release, but a role for linguistic enhancements similar to those exhibited in clear speech is possible. The current study examined the effect of Lombard speech in noise and in quiet for Spanish learners of English. Non-native listeners showed a substantial benefit of Lombard speech in noise, although not quite as large as that displayed by native listeners tested on the same task in an earlier study [Lu and Cooke (2008), J. Acoust. Soc. Am. 124, 3261-3275]. The difference between the two groups is unlikely to be due to energetic masking. However, Lombard speech was less intelligible in quiet for non-native listeners than normal speech. The relatively small difference in Lombard benefit in noise for native and non-native listeners, along with the absence of Lombard benefit in quiet, suggests that any contribution of linguistic enhancements in the Lombard benefit for natives is small.

12.
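Speech enhancement at low input signal-to-noise ratios (SNRs) tends to remove the weak components of unvoiced speech, which distorts the envelope of the reconstructed signal. To overcome this problem, this paper proposes a new speech enhancement method. Based on a model of speech perception, the method uses an incomplete wavelet packet decomposition to approximate the critical bands of speech and processes voiced and unvoiced speech separately according to subband energy. For the threshold calculation, it proposes a wavelet packet adaptive threshold algorithm that separates voiced from unvoiced speech and is based on subband signal energy. Simulation experiments, objective evaluation, and listening tests show that, at low input SNRs, the algorithm reduces envelope distortion of the reconstructed signal more effectively than conventional algorithms and clearly improves the output SNR without impairing speech intelligibility or naturalness. Combining the algorithm with energy spectral subtraction as a second enhancement stage further improves the quality of the denoised speech.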

13.
While a large portion of the variance among listeners in speech recognition is associated with the audibility of components of the speech waveform, it is not possible to predict individual differences in the accuracy of speech processing strictly from the audiogram. This has suggested that some of the variance may be associated with individual differences in spectral or temporal resolving power, or acuity. Psychoacoustic measures of spectral-temporal acuity with nonspeech stimuli have been shown, however, to correlate only weakly (or not at all) with speech processing. In a replication and extension of an earlier study [Watson et al., J. Acoust. Soc. Am. Suppl. 1 71, S73 (1982)], 93 normal-hearing college students were tested on speech perception tasks (nonsense syllables, words, and sentences in a noise background) and on six spectral-temporal discrimination tasks using simple and complex nonspeech sounds. Factor analysis showed that the abilities that explain performance on the nonspeech tasks are quite distinct from those that account for performance on the speech tasks. Performance was significantly correlated among speech tasks and among nonspeech tasks. Either (a) auditory spectral-temporal acuity for nonspeech sounds is orthogonal to speech processing abilities, or (b) the appropriate tasks or types of nonspeech stimuli that challenge the abilities required for speech recognition have yet to be identified.

14.
Quantifying the intelligibility of speech in noise for non-native listeners
When listening to languages learned at a later age, speech intelligibility is generally lower than when listening to one's native language. The main purpose of this study is to quantify speech intelligibility in noise for specific populations of non-native listeners, only broadly addressing the underlying perceptual and linguistic processing. An easy method is sought to extend these quantitative findings to other listener populations. Dutch subjects listening to German and English speech, ranging from reasonable to excellent proficiency in these languages, were found to require a 1-7 dB better speech-to-noise ratio to obtain 50% sentence intelligibility than native listeners. Also, the psychometric function for sentence recognition in noise was found to be shallower for non-native than for native listeners (worst-case slope around the 50% point of 7.5%/dB, compared to 12.6%/dB for native listeners). Differences between native and non-native speech intelligibility are largely predicted by linguistic entropy estimates as derived from a letter guessing task. Less effective use of context effects (especially semantic redundancy) explains the reduced speech intelligibility for non-native listeners. While measuring speech intelligibility for many different populations of listeners (languages, linguistic experience) may be prohibitively time consuming, obtaining predictions of non-native intelligibility from linguistic entropy may help to extend the results of this study to other listener populations.
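To make the two reported slopes concrete, the following sketch assumes a logistic psychometric function parameterised by its 50% point (the SRT) and its slope at that point. The logistic form, the SRT values, and the function names are illustrative assumptions; only the two slopes (12.6%/dB and 7.5%/dB) and the 1-7 dB range come from the abstract.

```python
import math


def logistic_pf(snr_db, srt_db, slope_per_db):
    """Proportion of sentences correct; slope_per_db is the slope at 50%."""
    return 1.0 / (1.0 + math.exp(-4.0 * slope_per_db * (snr_db - srt_db)))


def snr_for(p, srt_db, slope_per_db):
    """Invert the logistic: SNR (dB) at which proportion p is reached."""
    return srt_db + math.log(p / (1.0 - p)) / (4.0 * slope_per_db)


NATIVE_SLOPE = 0.126                 # 12.6 %/dB, from the abstract
NON_NATIVE_SLOPE = 0.075             # 7.5 %/dB, worst case from the abstract
NATIVE_SRT = -4.0                    # hypothetical SRT in dB, illustration only
NON_NATIVE_SRT = NATIVE_SRT + 4.0    # hypothetical shift within the 1-7 dB range

cases = (
    ("native", NATIVE_SRT, NATIVE_SLOPE),
    ("non-native", NON_NATIVE_SRT, NON_NATIVE_SLOPE),
)
for label, srt, slope in cases:
    # Extra SNR needed to move from 50% to 80% sentence intelligibility.
    extra_db = snr_for(0.8, srt, slope) - srt
    print(f"{label}: SRT {srt:+.1f} dB, {extra_db:.1f} dB above SRT for 80% correct")
```

Under these assumptions, a shallower slope means that a listener needs a larger SNR improvement above the SRT to reach the same high intelligibility score, in addition to the shift in the SRT itself.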

15.
This study examined proportional frequency compression as a strategy for improving speech recognition in listeners with high-frequency sensorineural hearing loss. This method of frequency compression preserved the ratios between the frequencies of the components of natural speech, as well as the temporal envelope of the unprocessed speech stimuli. Nonsense syllables spoken by a female and a male talker were used as the speech materials. Both frequency-compressed speech and the control condition of unprocessed speech were presented with high-pass amplification. For the materials spoken by the female talker, significant increases in speech recognition were observed in slightly less than one-half of the listeners with hearing impairment. For the male-talker materials, one-fifth of the hearing-impaired listeners showed significant recognition improvements. The increases in speech recognition due solely to frequency compression were generally smaller than those solely due to high-pass amplification. The results indicate that while high-pass amplification is still the most effective approach for improving speech recognition of listeners with high-frequency hearing loss, proportional frequency compression can offer significant improvements in addition to those provided by amplification for some patients.

16.
Three experiments investigated the roles of interaural time differences (ITDs) and level differences (ILDs) in spatial unmasking in multi-source environments. In experiment 1, speech reception thresholds (SRTs) were measured in virtual-acoustic simulations of an anechoic environment with three interfering sound sources of either speech or noise. The target source lay directly ahead, while three interfering sources were (1) all at the target's location (0 degrees, 0 degrees, 0 degrees), (2) at locations distributed across both hemifields (-30 degrees, 60 degrees, 90 degrees), (3) at locations in the same hemifield (30 degrees, 60 degrees, 90 degrees), or (4) co-located in one hemifield (90 degrees, 90 degrees, 90 degrees). Sounds were convolved with head-related impulse responses (HRIRs) that were manipulated to remove individual binaural cues. Three conditions used HRIRs with (1) both ILDs and ITDs, (2) only ILDs, and (3) only ITDs. The ITD-only condition produced the same pattern of results across spatial configurations as the combined cues, but with smaller differences between spatial configurations. The ILD-only condition yielded similar SRTs for the (-30 degrees, 60 degrees, 90 degrees) and (0 degrees, 0 degrees, 0 degrees) configurations, as expected for best-ear listening. In experiment 2, pure-tone BMLDs were measured at third-octave frequencies against the ITD-only, speech-shaped noise interferers of experiment 1. These BMLDs were 4-8 dB at low frequencies for all spatial configurations. In experiment 3, SRTs were measured for speech in diotic, speech-shaped noise. Noises were filtered to reduce the spectrum level at each frequency according to the BMLDs measured in experiment 2. SRTs were as low as or lower than those of the corresponding ITD-only conditions from experiment 1. Thus, an explanation of speech understanding in complex listening environments based on the combination of best-ear listening and binaural unmasking (without involving sound-localization) cannot be excluded.

17.
The acoustic-reflex thresholds (ART) for multicomponent tonal complexes of varying bandwidth and spectral density were obtained from 20 normal-hearing (air-conduction thresholds less than or equal to 20 dB HL at 250-8000 Hz) young adults ranging in age from 20-30 years and 20 normal-hearing, old subjects ranging in age from 60-71 years. The results revealed that the ART decreased with spectral density, plateauing after seven components in the young group and after five components in the old group; the decrease in the acoustic-reflex threshold as a result of the increase in spectral density was less in the old than in the young group. The bandwidth effect (when bandwidth was plotted in hertz or octaves) on the acoustic-reflex threshold was present in the young adults, but substantially reduced in the elderly, as evidenced by the statistically significant interaction between subject group and signal bandwidth. The spectral density results are discussed in terms of their theoretic implications for the energy summation capacity and frequency resolution of the auditory system. The bandwidth results are discussed in terms of their theoretic implications for the frequency-resolving power of the auditory system.

18.
The minimum standard deviations achievable for concurrent estimates of thresholds and psychometric function slopes as well as the optimal target values for adaptive procedures are calculated as functions of stimulus level and track length on the basis of the binomial theory. The optimum pair of targets for a concurrent estimate is found at the correct response probabilities p1 = 0.19 and p2 = 0.81 for the logistic psychometric function. An adaptive procedure that converges at these optimal targets is introduced and tested with Monte Carlo simulations. The efficiency increases rapidly when each subject's response consists of more than one statistically independent Bernoulli trial. Sentence intelligibility tests provide more than one Bernoulli trial per sentence when each word is scored separately. The number of within-sentence trials can be quantified by the j factor [Boothroyd and Nittrouer, J. Acoust. Soc. Am. 84, 101-114 (1988)]. The adaptive procedure was evaluated with 10 normal-hearing and 11 hearing-impaired listeners using two German sentence tests that differ in j factors. The expected advantage of the sentence test with the higher j factor was not observed, possibly due to training effects. Hence, the number of sentences required for a reliable speech reception threshold (approximately 1 dB standard deviation) concurrently with a slope estimate (approximately 20%-30% relative standard deviation) is at least N = 30 if word scoring for short, meaningful sentences (j approximately 2) is performed.
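A small sketch may help with two quantities mentioned above. It assumes the usual form of the j-factor relation, p_whole = p_part ** j, linking the probability of recognising a whole sentence to that of recognising its parts, and a logistic psychometric function for locating the two target points. The threshold and slope values, and all function names, are hypothetical and chosen only for illustration.

```python
import math


def j_factor(p_part, p_whole):
    """Effective number of independent parts, from p_whole = p_part ** j."""
    return math.log(p_whole) / math.log(p_part)


def whole_probability(p_part, j):
    """Probability of getting the whole sentence right, given its parts."""
    return p_part ** j


def level_for(p, l50, slope_at_50):
    """Stimulus level (dB) at which a logistic function reaches proportion p."""
    return l50 + math.log(p / (1.0 - p)) / (4.0 * slope_at_50)


L50 = -6.0      # hypothetical speech reception threshold in dB SNR
SLOPE = 0.15    # hypothetical slope at the 50% point, in proportion per dB

# The two target probabilities named in the abstract sit symmetrically
# around the threshold on a logistic function.
lo = level_for(0.19, L50, SLOPE)
hi = level_for(0.81, L50, SLOPE)
print(f"targets at {lo:.2f} dB and {hi:.2f} dB, midpoint {(lo + hi) / 2:.2f} dB")

# With j of about 2, a word score of 0.9 implies a whole-sentence score of 0.81.
print(round(whole_probability(0.9, 2), 2))
```

Because the two targets are symmetric on the logistic function, their midpoint gives the threshold and their separation reflects the slope, which is consistent with estimating both concurrently from one adaptive procedure.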

19.
The word recognition ability of 4 normal-hearing and 13 cochlearly hearing-impaired listeners was evaluated. Filtered and unfiltered speech in quiet and in noise were presented monaurally through headphones. The noise varied over listening situations with regard to spectrum, level, and temporal envelope. Articulation index theory was applied to predict the results. Two calculation methods were used, both based on the ANSI S3.5-1969 20-band method [S3.5-1969 (American National Standards Institute, New York)]. Method I was almost identical to the ANSI method. Method II included a level- and hearing-loss-dependent calculation of masking of stationary and on-off gated noise signals and of self-masking of speech. Method II provided the best prediction capability, and it is concluded that speech intelligibility of cochlearly hearing-impaired listeners may also, to a first approximation, be predicted from articulation index theory.

20.
A digital processing method is described for altering spectral contrast (the difference in amplitude between spectral peaks and valleys) in natural utterances. Speech processed with programs implementing the contrast alteration procedure was presented to listeners with moderate to severe sensorineural hearing loss. The task was a three-alternative (/b/, /d/, or /g/) stop consonant identification task for consonants at a fixed location in short nonsense utterances. Overall, tokens with enhanced contrast showed moderate gains in percentage correct stop consonant identification when compared to unaltered tokens. Conversely, reducing spectral contrast generally reduced percentage correct stop consonant identification. Contrast alteration effects were inconsistent for utterances containing /d/. The observed contrast effects also interacted with token intelligibility.
