20 similar documents found
1.
Auditive and cognitive factors in speech perception by elderly listeners. II: Multivariate analyses (Total citations: 2; self-citations: 0; citations by others: 2)
In part I of this study [van Rooij et al., J. Acoust. Soc. Am. 86, 1294-1309 (1989)], the validity and manageability of a test battery comprising auditive (sensitivity, frequency resolution, and temporal resolution), cognitive (memory performance, processing speed, and intellectual abilities), and speech perception tests (at the phoneme, spondee, and sentence level) were investigated. In the present article, the results of a selection of these tests for 72 elderly subjects (aged 60-93 years) are analyzed by multivariate statistical techniques. The results show that the deterioration of speech perception in the elderly consists of two statistically independent components: (a) a large component mainly representing the progressive high-frequency hearing loss with age that accounts for approximately two-thirds of the systematic variance of the tests of speech perception and (b) a smaller component (accounting for one-third of the systematic variance of the speech perception tests) mainly representing a general performance decrement due to reduced mental efficiency, which is indicated by a general slowing of performance and a reduced memory capacity. Although both components are correlated with age, it was found that the balance between auditive and cognitive contributions to speech perception performance did not change with age.
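A minimal sketch of the kind of multivariate decomposition described above: principal component analysis of a subjects-by-tests score matrix, reporting how much variance each orthogonal component explains. The matrix shape and the random scores are placeholders, not the study's data.

```python
# Hypothetical sketch: decompose a subjects-by-tests score matrix into
# orthogonal components and report the variance each one explains.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_subjects, n_tests = 72, 10                     # 72 subjects; 10 tests is assumed
scores = rng.normal(size=(n_subjects, n_tests))  # placeholder for real test scores

# Standardize each test, then extract the first two components.
z = (scores - scores.mean(axis=0)) / scores.std(axis=0)
pca = PCA(n_components=2).fit(z)

for i, ratio in enumerate(pca.explained_variance_ratio_, start=1):
    print(f"Component {i}: {ratio:.1%} of total variance")
# The loadings in pca.components_ show which tests dominate each component,
# e.g., high-frequency thresholds vs. speed/memory measures in the study above.
```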
2.
The present study evaluated auditory-visual speech perception in cochlear-implant users as well as normal-hearing and simulated-implant controls to delineate relative contributions of sensory experience and cues. Auditory-only, visual-only, or auditory-visual speech perception was examined in the context of categorical perception, in which an animated face mouthing ba, da, or ga was paired with synthesized phonemes from an 11-token auditory continuum. A three-alternative, forced-choice method was used to yield percent identification scores. Normal-hearing listeners showed sharp phoneme boundaries and strong reliance on the auditory cue, whereas actual and simulated implant listeners showed much weaker categorical perception but stronger dependence on the visual cue. The implant users were able to integrate both congruent and incongruent acoustic and optical cues to derive relatively weak but significant auditory-visual integration. This auditory-visual integration was correlated with the duration of the implant experience but not the duration of deafness. Compared with the actual implant performance, acoustic simulations of the cochlear implant could predict the auditory-only performance but not the auditory-visual integration. These results suggest that both altered sensory experience and impoverished acoustic cues contribute to the auditory-visual speech perception in cochlear-implant users.
3.
Auditive and cognitive factors in speech perception by elderly listeners. I: Development of test battery (Total citations: 2; self-citations: 0; citations by others: 2)
J C van Rooij, R Plomp, J F Orlebeke. The Journal of the Acoustical Society of America, 1989, 86(4): 1294-1309
This study compares performance of 24 young normal-hearing (aged 18-28 years) and 24 elderly (aged 61-85 years) listeners on auditive (sensitivity, frequency selectivity, and temporal resolution), cognitive (memory performance, processing speed, and divided attention ability), and speech perception tests (at the phoneme, spondee, and sentence level). Its principal aim is to assess whether the tests selected yield meaningful results. The results obtained will be used to reduce the test battery to a manageable size for a second study on a much larger number of elderly listeners. The relationships between the tests are explored by multivariate statistical methods. The results show that: (a) in young listeners, individual differences in speech perception performance are remarkably small, resulting in low correlations between the tests, while in the elderly, tests of phoneme, spondee, and sentence perception overlap considerably; (b) speech perception in the elderly seems to be largely determined by hearing loss at the higher frequencies, whereas the effects of other auditive and cognitive factors seem to be relatively small or absent; and (c) performance in the elderly is only partly correlated with age.
4.
5.
The effect of head-induced interaural time delay (ITD) and interaural level differences (ILD) on binaural speech intelligibility in noise was studied for listeners with symmetrical and asymmetrical sensorineural hearing losses. The material, recorded with a KEMAR manikin in an anechoic room, consisted of speech, presented from the front (0 degree), and noise, presented at azimuths of 0 degree, 30 degrees, and 90 degrees. Derived noise signals, containing either only ITD or only ILD, were generated using a computer. For both groups of subjects, speech-reception thresholds (SRT) for sentences in noise were determined as a function of: (1) noise azimuth, (2) binaural cue, and (3) an interaural difference in overall presentation level, simulating the effect of a monaural hearing aid. Comparison of the mean results with corresponding data obtained previously from normal-hearing listeners shows that the hearing impaired have a 2.5 dB higher SRT in noise when both speech and noise are presented from the front, and 2.6-5.1 dB less binaural gain when the noise azimuth is changed from 0 degree to 90 degrees. The gain due to ILD varies among the hearing-impaired listeners between 0 dB and normal values of 7 dB or more. It depends on the high-frequency hearing loss at the side presented with the most favorable signal-to-noise (S/N) ratio. The gain due to ITD is nearly normal for the symmetrically impaired (4.2 dB, compared with 4.7 dB for the normal hearing), but only 2.5 dB in the case of asymmetrical impairment. When ITD is introduced in noise already containing ILD, the resulting gain is 2-2.5 dB for all groups. The only marked effect of the interaural difference in overall presentation level is a reduction of the gain due to ILD when the level at the ear with the better S/N ratio is decreased. This implies that an optimal monaural hearing aid (with a moderate gain) will hardly interfere with unmasking through ITD, while it may increase the gain due to ILD by preventing or diminishing threshold effects.
6.
Previous work has established that naturally produced clear speech is more intelligible than conversational speech for adult hearing-impaired listeners and normal-hearing listeners under degraded listening conditions. The major goal of the present study was to investigate the extent to which naturally produced clear speech is an effective intelligibility enhancement strategy for non-native listeners. Thirty-two non-native and 32 native listeners were presented with naturally produced English sentences. Factors that varied were speaking style (conversational versus clear), signal-to-noise ratio (-4 versus -8 dB) and talker (one male versus one female). Results showed that while native listeners derived a substantial benefit from naturally produced clear speech (an improvement of about 16 rau units on a keyword-correct count), non-native listeners exhibited only a small clear speech effect (an improvement of only 5 rau units). This relatively small clear speech effect for non-native listeners is interpreted as a consequence of the fact that clear speech is essentially native-listener oriented, and therefore is only beneficial to listeners with extensive experience with the sound structure of the target language.
7.
A J Klein, J H Mills, W Y Adkins. The Journal of the Acoustical Society of America, 1990, 87(3): 1266-1271
Upward spreading of masking, measured in terms of absolute masked threshold, is greater in hearing-impaired listeners than in listeners with normal hearing. The purpose of this study was to make further observations on upward-masked thresholds and speech recognition in noise in elderly listeners. Two age groups were used: One group consisted of listeners who were more than 60 years old, and the second group consisted of listeners who were less than 36 years old. Both groups had listeners with normal hearing as well as listeners with mild to moderate sensorineural loss. The masking paradigm consisted of a continuous low-pass-filtered (1000-Hz) noise, which was mixed with the output of a self-tracking, sweep-frequency Bekesy audiometer. Thresholds were measured in quiet and with maskers at 70 and 90 dB SPL. The upward-masked thresholds were similar for young and elderly hearing-impaired listeners. A few elderly listeners had lower upward-masked thresholds compared with the young control group; however, their on-frequency masked thresholds were nearly identical to the control group. A significant correlation was found between upward-masked thresholds and the Speech Perception in Noise (SPIN) test in elderly listeners.
8.
Won JH, Drennan WR, Nie K, Jameyson EM, Rubinstein JT. The Journal of the Acoustical Society of America, 2011, 130(1): 376-388
The goals of the present study were to measure acoustic temporal modulation transfer functions (TMTFs) in cochlear implant listeners and examine the relationship between modulation detection and speech recognition abilities. The effects of automatic gain control, presentation level, and number of channels on modulation detection thresholds (MDTs) were examined using the listeners' clinical sound processor. The general form of the TMTF was low-pass, consistent with previous studies. The operation of automatic gain control had no effect on MDTs when the stimuli were presented at 65 dBA. MDTs were dependent neither on the presentation levels (ranging from 50 to 75 dBA) nor on the number of channels. Significant correlations were found between MDTs and speech recognition scores. The rates of decay of the TMTFs were predictive of speech recognition abilities. Spectral-ripple discrimination was evaluated to examine the relationship between temporal and spectral envelope sensitivities. No correlations were found between the two measures, and 56% of the variance in speech recognition was predicted jointly by the two tasks. The present study suggests that temporal modulation detection measured with the sound processor can serve as a useful measure of the ability of clinical sound processing strategies to deliver clinically pertinent temporal information.
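A small sketch of how the rate of decay of a low-pass TMTF might be summarized: fit a slope in dB per octave to modulation detection thresholds above an assumed knee frequency. The threshold values and the 50 Hz knee are illustrative placeholders, not measured data.

```python
# Hypothetical sketch: summarize a low-pass TMTF by the slope (dB/octave) of
# modulation detection thresholds above an assumed knee frequency.
import numpy as np

mod_freqs = np.array([10, 50, 100, 150, 200, 300])   # modulation rates in Hz
mdt_db = np.array([-22, -20, -16, -13, -11, -8])     # thresholds in dB (20*log10 m)

knee = 50.0                                          # assumed start of the roll-off
hi = mod_freqs >= knee
slope, _ = np.polyfit(np.log2(mod_freqs[hi]), mdt_db[hi], 1)
print(f"Thresholds rise by about {slope:.1f} dB per octave above {knee:.0f} Hz,")
print("i.e., modulation sensitivity decays at that rate.")
```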
9.
van Wijngaarden SJ, Steeneken HJ, Houtgast T. The Journal of the Acoustical Society of America, 2002, 111(4): 1906-1916
When listening to languages learned at a later age, speech intelligibility is generally lower than when listening to one's native language. The main purpose of this study is to quantify speech intelligibility in noise for specific populations of non-native listeners, only broadly addressing the underlying perceptual and linguistic processing. An easy method is sought to extend these quantitative findings to other listener populations. Dutch subjects listening to German and English speech, ranging from reasonable to excellent proficiency in these languages, were found to require a 1-7 dB better speech-to-noise ratio to obtain 50% sentence intelligibility than native listeners. Also, the psychometric function for sentence recognition in noise was found to be shallower for non-native than for native listeners (worst-case slope around the 50% point of 7.5%/dB, compared to 12.6%/dB for native listeners). Differences between native and non-native speech intelligibility are largely predicted by linguistic entropy estimates as derived from a letter guessing task. Less effective use of context effects (especially semantic redundancy) explains the reduced speech intelligibility for non-native listeners. While measuring speech intelligibility for many different populations of listeners (languages, linguistic experience) may be prohibitively time consuming, obtaining predictions of non-native intelligibility from linguistic entropy may help to extend the results of this study to other listener populations.
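A brief sketch of the psychometric function implied by the slopes quoted above, using a logistic form. Only the slope values (12.6 %/dB native, 7.5 %/dB non-native) come from the abstract; the logistic shape and the SRT values plugged in are assumptions for illustration.

```python
# Hypothetical sketch: logistic psychometric function for sentence recognition
# in noise, parameterized by the SRT (50% point) and the slope at that point.
import numpy as np

def psychometric(snr_db, srt_db, slope_pct_per_db):
    """Proportion of sentences correct at a given speech-to-noise ratio."""
    k = 4.0 * slope_pct_per_db / 100.0     # logistic steepness from midpoint slope
    return 1.0 / (1.0 + np.exp(-k * (snr_db - srt_db)))

snr = np.arange(-10, 11)
native = psychometric(snr, srt_db=-5.0, slope_pct_per_db=12.6)     # SRT value assumed
non_native = psychometric(snr, srt_db=-1.0, slope_pct_per_db=7.5)  # 1-7 dB poorer SRT
print(f"Native at -3 dB SNR: {native[snr == -3][0]:.0%}")
print(f"Non-native at -3 dB SNR: {non_native[snr == -3][0]:.0%}")
```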
10.
Speech can remain intelligible for listeners with normal hearing when processed by narrow bandpass filters that transmit only a small fraction of the audible spectrum. Two experiments investigated the basis for the high intelligibility of narrowband speech. Experiment 1 confirmed reports that everyday English sentences can be recognized accurately (82%-98% words correct) when filtered at center frequencies of 1500, 2100, and 3000 Hz. However, narrowband low predictability (LP) sentences were less accurately recognized than high predictability (HP) sentences (20% lower scores), and excised narrowband words were even less intelligible than LP sentences (a further 23% drop). While experiment 1 revealed similar levels of performance for narrowband and broadband sentences at conversational speech levels, experiment 2 showed that speech reception thresholds were substantially (>30 dB) poorer for narrowband sentences. One explanation for this increased disparity between narrowband and broadband speech at threshold (compared to conversational speech levels) is that spectral components in the sloping transition bands of the filters provide important cues for the recognition of narrowband speech, but these components become inaudible as the signal level is reduced. Experiment 2 also showed that performance was degraded by the introduction of a speech masker (a single competing talker). The elevation in threshold was similar for narrowband and broadband speech (11 dB, on average), but because the narrowband sentences required considerably higher sound levels to reach their thresholds in quiet compared to broadband sentences, their target-to-masker ratios were very different (+23 dB for narrowband sentences and -12 dB for broadband sentences). As in experiment 1, performance was better for HP than LP sentences. The LP-HP difference was larger for narrowband than broadband sentences, suggesting that context provides greater benefits when speech is distorted by narrow bandpass filtering.
11.
Arehart KH, Kates JM, Anderson MC, Harvey LO. The Journal of the Acoustical Society of America, 2007, 122(2): 1150-1164
Noise and distortion reduce speech intelligibility and quality in audio devices such as hearing aids. This study investigates the perception and prediction of sound quality by both normal-hearing and hearing-impaired subjects for conditions of noise and distortion related to those found in hearing aids. Stimuli were sentences subjected to three kinds of distortion (additive noise, peak clipping, and center clipping), with eight levels of degradation for each distortion type. The subjects performed paired comparisons for all possible pairs of 24 conditions. A one-dimensional coherence-based metric was used to analyze the quality judgments. This metric was an extension of a speech intelligibility metric presented in Kates and Arehart (2005) [J. Acoust. Soc. Am. 117, 2224-2237] and is based on dividing the speech signal into three amplitude regions, computing the coherence for each region, and then combining the three coherence values across frequency in a calculation based on the speech intelligibility index. The one-dimensional metric accurately predicted the quality judgments of normal-hearing listeners and listeners with mild-to-moderate hearing loss, although some systematic errors were present. A multidimensional analysis indicates that several dimensions are needed to describe the factors used by subjects to judge the effects of the three distortion types.
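A simplified, hypothetical sketch in the spirit of the three-region coherence approach described above: frames of the clean signal are split into low-, mid-, and high-level regions, and clean-versus-degraded coherence is averaged per region. The region boundaries, frame length, and unweighted frequency average are assumptions; the actual metric's SII-style band-importance weighting is not reproduced.

```python
# Hypothetical sketch: per-amplitude-region coherence between a clean and a
# degraded signal (here the degradation is peak clipping plus a little noise).
import numpy as np
from scipy.signal import coherence

def region_coherence(clean, degraded, fs, frame=512):
    # Frame the signals and rank frames by clean-signal RMS level.
    n = (len(clean) // frame) * frame
    c = clean[:n].reshape(-1, frame)
    d = degraded[:n].reshape(-1, frame)
    rms_db = 20 * np.log10(np.sqrt(np.mean(c ** 2, axis=1)) + 1e-12)
    edges = np.percentile(rms_db, [33, 67])      # assumed region boundaries
    labels = np.digitize(rms_db, edges)          # 0 = low, 1 = mid, 2 = high level
    out = []
    for region in range(3):
        sel = labels == region
        _, cxy = coherence(c[sel].ravel(), d[sel].ravel(), fs=fs, nperseg=frame)
        out.append(cxy.mean())                   # unweighted frequency average
    return out                                   # [low, mid, high] coherence

fs = 16000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t) * np.hanning(len(t))
degraded = np.clip(clean + 0.05 * np.random.default_rng(0).normal(size=len(t)), -0.5, 0.5)
print(region_coherence(clean, degraded, fs))
```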
12.
Speech-in-noise measurements are important in clinical practice and have been the subject of research for a long time. The results of these measurements are often described in terms of the speech reception threshold (SRT) and SNR loss. Using the basic concepts that underlie several models of speech recognition in steady-state noise, the present study shows that these measures are ill-defined, most importantly because the slope of the speech recognition functions for hearing-impaired listeners always decreases with hearing loss. This slope can be determined from the slope of the normal-hearing speech recognition function when the SRT for the hearing-impaired listener is known. The SII-function (i.e., the speech intelligibility index (SII) against SNR) is important and provides insights into many potential pitfalls when interpreting SRT data. Standardized SNR loss, sSNR loss, is introduced as a universal measure of hearing loss for speech in steady-state noise. Experimental data demonstrate that, unlike the SRT or SNR loss, sSNR loss is invariant to the target point chosen, the scoring method, or the type of speech material.
13.
Frequency resolution was evaluated for two normal-hearing and seven hearing-impaired subjects with moderate, flat sensorineural hearing loss by measuring percent correct detection of a 2000-Hz tone as the width of a notch in band-reject noise increased. The level of the tone was fixed for each subject at a criterion performance level in broadband noise. Discrimination of synthetic speech syllables that differed in spectral content in the 2000-Hz region was evaluated as a function of the notch width in the same band-reject noise. Recognition of natural speech consonant/vowel syllables in quiet was also tested; results were analyzed for percent correct performance and relative information transmitted for voicing and place features. In the hearing-impaired subjects, frequency resolution at 2000 Hz was significantly correlated with the discrimination of synthetic speech information in the 2000-Hz region and was not related to the recognition of natural speech nonsense syllables unless (a) the speech stimuli contained the vowel /i/ rather than /a/, and (b) the score reflected information transmitted for place of articulation rather than percent correct.
14.
Auditory and nonauditory factors affecting speech reception in noise by older listeners (Total citations: 2; self-citations: 0; citations by others: 2)
George EL, Zekveld AA, Kramer SE, Goverts ST, Festen JM, Houtgast T. The Journal of the Acoustical Society of America, 2007, 121(4): 2362-2375
Speech reception thresholds (SRTs) for sentences were determined in stationary and modulated background noise for two age-matched groups of normal-hearing (N = 13) and hearing-impaired listeners (N = 21). Correlations were studied between the SRT in noise and measures of auditory and nonauditory performance, after which stepwise regression analyses were performed within both groups separately. Auditory measures included the pure-tone audiogram and tests of spectral and temporal acuity. Nonauditory factors were assessed by measuring the text reception threshold (TRT), a visual analogue of the SRT, in which partially masked sentences were adaptively presented. Results indicate that, for the normal-hearing group, the variance in speech reception is mainly associated with nonauditory factors, both in stationary and in modulated noise. For the hearing-impaired group, speech reception in stationary noise is mainly related to the audiogram, even when audibility effects are accounted for. In modulated noise, both auditory (temporal acuity) and nonauditory factors (TRT) contribute to explaining interindividual differences in speech reception. Age was not a significant factor in the results. It is concluded that, under some conditions, nonauditory factors are relevant for the perception of speech in noise. Further evaluation of nonauditory factors might enable adapting the expectations from auditory rehabilitation in clinical settings.
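A compact sketch of a forward stepwise regression of the kind mentioned above, selecting predictors of an SRT-like outcome one at a time while the best candidate remains significant. The predictor names and synthetic data are placeholders, not the study's variables.

```python
# Hypothetical sketch: forward stepwise regression of an SRT-like outcome on
# candidate auditory/nonauditory predictors (placeholder names and data).
import numpy as np
import pandas as pd
import statsmodels.api as sm

def forward_stepwise(y, X, alpha=0.05):
    """Add predictors one at a time while the best candidate stays significant."""
    selected, remaining = [], list(X.columns)
    while remaining:
        pvals = {}
        for cand in remaining:
            fit = sm.OLS(y, sm.add_constant(X[selected + [cand]])).fit()
            pvals[cand] = fit.pvalues[cand]
        best = min(pvals, key=pvals.get)
        if pvals[best] >= alpha:
            break
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(1)
X = pd.DataFrame(rng.normal(size=(34, 3)),
                 columns=["pta_high_freq", "temporal_acuity", "trt"])  # placeholder predictors
srt = 0.8 * X["pta_high_freq"] + 0.4 * X["trt"] + rng.normal(scale=0.5, size=34)
print(forward_stepwise(srt, X))   # e.g. ['pta_high_freq', 'trt']
```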
15.
A glimpsing model of speech perception in noise (Total citations: 5; self-citations: 0; citations by others: 5)
Cooke M. The Journal of the Acoustical Society of America, 2006, 119(3): 1562-1573
Do listeners process noisy speech by taking advantage of "glimpses," spectrotemporal regions in which the target signal is least affected by the background? This study used an automatic speech recognition system, adapted for use with partially specified inputs, to identify consonants in noise. Twelve masking conditions were chosen to create a range of glimpse sizes. Several different glimpsing models were employed, differing in the local signal-to-noise ratio (SNR) used for detection, the minimum glimpse size, and the use of information in the masked regions. Recognition results were compared with behavioral data. A quantitative analysis demonstrated that the proportion of the time-frequency plane glimpsed is a good predictor of intelligibility. Recognition scores in each noise condition confirmed that sufficient information exists in glimpses to support consonant identification. Close fits to listeners' performance were obtained at two local SNR thresholds: one at around 8 dB and another in the range -5 to -2 dB. A transmitted information analysis revealed that cues to voicing are degraded more in the model than in human auditory processing.
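A minimal sketch of the glimpse criterion above: the proportion of spectro-temporal cells whose local SNR exceeds a threshold. The STFT settings are assumptions, and the 8 dB default is simply one of the threshold values mentioned in the abstract.

```python
# Hypothetical sketch: fraction of the time-frequency plane "glimpsed",
# i.e., cells where the local SNR exceeds a threshold.
import numpy as np
from scipy.signal import stft

def glimpse_proportion(speech, noise, fs, local_snr_db=8.0, nperseg=256):
    _, _, S = stft(speech, fs=fs, nperseg=nperseg)
    _, _, N = stft(noise, fs=fs, nperseg=nperseg)
    local_snr = 20 * np.log10((np.abs(S) + 1e-12) / (np.abs(N) + 1e-12))
    return float(np.mean(local_snr > local_snr_db))

fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 300 * t)                       # stand-in for a speech token
noise = np.random.default_rng(0).normal(scale=0.3, size=fs)
print(f"Glimpsed proportion: {glimpse_proportion(speech, noise, fs):.2f}")
```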
16.
Binaural speech intelligibility of individual listeners under realistic conditions was predicted using a model consisting of a gammatone filter bank, an independent equalization-cancellation (EC) process in each frequency band, a gammatone resynthesis, and the speech intelligibility index (SII). Hearing loss was simulated by adding uncorrelated masking noises (according to the pure-tone audiogram) to the ear channels. Speech intelligibility measurements were carried out with 8 normal-hearing and 15 hearing-impaired listeners, collecting speech reception threshold (SRT) data for three different room acoustic conditions (anechoic, office room, cafeteria hall) and eight directions of a single noise source (speech in front). Artificial EC processing errors derived from binaural masking level difference data using pure tones were incorporated into the model. Except for an adjustment of the SII-to-intelligibility mapping function, no model parameter was fitted to the SRT data of this study. The overall correlation coefficient between predicted and observed SRTs was 0.95. The dependence of the SRT of an individual listener on the noise direction and on room acoustics was predicted with a median correlation coefficient of 0.91. The effect of individual hearing impairment was predicted with a median correlation coefficient of 0.95. However, for mild hearing losses the release from masking was overestimated.
17.
Litvak LM, Spahr AJ, Saoji AA, Fridman GY. The Journal of the Acoustical Society of America, 2007, 122(2): 982-991
Spectral resolution has been reported to be closely related to vowel and consonant recognition in cochlear implant (CI) listeners. One measure of spectral resolution is spectral modulation threshold (SMT), which is defined as the smallest detectable spectral contrast in the spectral ripple stimulus. SMT may be determined by the activation pattern associated with electrical stimulation. In the present study, broad activation patterns were simulated using a multi-band vocoder to determine if similar impairments in speech understanding scores could be produced in normal-hearing listeners. Tokens were first decomposed into 15 logarithmically spaced bands and then re-synthesized by multiplying the envelope of each band by matched filtered noise. Various amounts of current spread were simulated by adjusting the drop-off of the noise spectrum away from the peak (40-5 dB/octave). The average SMT (0.25 and 0.5 cycles/octave) increased from 6.3 to 22.5 dB, while average vowel identification scores dropped from 86% to 19% and consonant identification scores dropped from 93% to 59%. In each condition, the impairments in speech understanding were generally similar to those found in CI listeners with similar SMTs, suggesting that variability in spread of neural activation largely accounts for the variability in speech perception of CI listeners.
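A simplified, hypothetical sketch of a noise-band vocoder in which each band's noise carrier is spectrally shaped to fall off away from the band centre at a chosen rate, a rough stand-in for the simulated current spread described above. Band edges, filter orders, and the envelope cutoff are assumptions, not the study's parameters.

```python
# Hypothetical sketch: noise vocoder with adjustable carrier spectral slope
# (dB/octave away from each band centre) to mimic broader activation patterns.
import numpy as np
from scipy.signal import butter, filtfilt

def vocode(x, fs, n_bands=15, f_lo=250.0, f_hi=7000.0, spread_db_per_oct=20.0):
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)        # log-spaced band edges
    rng = np.random.default_rng(0)
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Analysis band and its temporal envelope (rectify + 50 Hz low-pass).
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        env_b, env_a = butter(2, 50 / (fs / 2))
        env = filtfilt(env_b, env_a, np.abs(filtfilt(b, a, x)))
        # Noise carrier whose spectrum decays away from the band centre.
        fc = np.sqrt(lo * hi)
        octaves = np.abs(np.log2(np.maximum(freqs, 1.0) / fc))
        gain = 10 ** (-spread_db_per_oct * octaves / 20)
        carrier = np.fft.irfft(np.fft.rfft(rng.normal(size=len(x))) * gain, n=len(x))
        out += env * carrier
    return out

fs = 16000
t = np.arange(fs) / fs
token = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)  # stand-in token
processed = vocode(token, fs, spread_db_per_oct=10.0)  # shallower slope -> more smearing
```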
18.
T. Houtgast. Applied Acoustics, 1981, 14(1): 15-25
Intelligibility tests were performed by teachers and pupils in classrooms under a variety of (road traffic) noise conditions. The intelligibility scores are found to deteriorate at (indoor) noise levels exceeding a critical value of -15 dB with regard to a teacher's long-term (reverberant) speech level. The implications for external noise levels are discussed: typically, an external noise level of 50 dB(A) would imply that the critical indoor level is exceeded for about 20 per cent of teachers.
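As a worked illustration with hypothetical numbers: under the -15 dB criterion, a teacher whose long-term reverberant speech level at the pupils' seats is 60 dB(A) can tolerate indoor noise up to about 45 dB(A), while a quieter teacher at 57 dB(A) can tolerate only about 42 dB(A); because speech levels vary across teachers, a single external noise level maps onto the percentage of teachers for whom the criterion is exceeded rather than a simple pass/fail.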
19.
George EL, Festen JM, Houtgast T. The Journal of the Acoustical Society of America, 2006, 120(4): 2295-2311
The Speech Reception Threshold for sentences in stationary noise and in several amplitude-modulated noises was measured for 8 normal-hearing listeners, 29 sensorineural hearing-impaired listeners, and 16 normal-hearing listeners with simulated hearing loss. This approach makes it possible to determine whether the reduced benefit from masker modulations, as often observed for hearing-impaired listeners, is due to a loss of signal audibility, or due to suprathreshold deficits, such as reduced spectral and temporal resolution, which were measured in four separate psychophysical tasks. Results show that the reduced masking release can only partly be accounted for by reduced audibility, and that, when considering suprathreshold deficits, the normal effects associated with a raised presentation level should be taken into account. In this perspective, reduced spectral resolution does not appear to qualify as an actual suprathreshold deficit, while reduced temporal resolution does. Temporal resolution and age are shown to be the main factors governing masking release for speech in modulated noise, accounting for more than half of the intersubject variance. Their influence appears to be related to the processing of mainly the higher stimulus frequencies. Results based on calculations of the Speech Intelligibility Index in modulated noise confirm these conclusions.
20.
Ison JR, Virag TM, Allen PD, Hammond GR. The Journal of the Acoustical Society of America, 2002, 112(1): 238-246
Listeners asked to detect tones masked by noise hear frequent signals but miss infrequent probes, suggesting that they attend to spectral regions where they expect the signals to occur. The narrow detection pattern centered on the frequent target approximates that obtained in notched noise, indicating that attention is focused on the auditory filter. We measured attention bands in young and elderly listeners (n=5, 4; 20-25 and 62-82 years of age) for targets (800 or 1200 Hz) and infrequent probe signals (target +/-25-100 Hz) masked in wideband noise. We anticipated that their width would increase with age, as has been reported for auditory filters. A yes-no single-interval procedure provided detection probabilities and detection response speeds. Both measures showed near-linear declines with decreasing signal level, and graded decay functions as probe frequency deviated from the target frequency. Probes deviating from the target by 25 to 50 Hz were equivalent to a 2-dB reduction in signal level for both measures. The equivalent rectangular bandwidth (ERB) for detection approximated 11% of the signal frequency for each age group. Confidence intervals (95%) showed that the elderly ERB could be at most only about 20% larger than that of younger listeners.
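A quick worked example from the figures above: an ERB of about 11% of the signal frequency corresponds to roughly 0.11 × 800 ≈ 88 Hz at the 800 Hz target and 0.11 × 1200 ≈ 132 Hz at 1200 Hz, so probes offset by 25-50 Hz fall well inside a single attention band, while the 100 Hz offsets lie near or beyond its edges.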