Similar Documents
20 similar documents found.
1.
The goal of this study was to measure the ability of adult hearing-impaired listeners to discriminate formant frequency for vowels in isolation, in syllables, and in sentences. Vowel formant discrimination for F1 and F2 was measured for the vowels /ɪ, ɛ, æ/. Four experimental factors were manipulated: linguistic context (isolated vowels, syllables, and sentences), signal level (70 and 95 dB SPL), formant frequency, and cognitive load. To assess effects of cognitive load, a complex identification task was added to the formant discrimination task for sentences only. Results showed significant elevation in formant thresholds as formant frequency and linguistic context increased. Higher signal level also elevated formant thresholds, primarily for F2. However, the additional identification task had no observable effect on formant discrimination. In comparable conditions, these hearing-impaired listeners had elevated formant-discrimination thresholds relative to young normal-hearing listeners, primarily for F2. Altogether, the poorer formant discrimination of these adult hearing-impaired listeners was caused mainly by hearing loss rather than by the cognitive difficulty of the tasks used in this study.
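Discrimination thresholds like those above are typically estimated with an adaptive tracking procedure. The sketch below shows a generic two-down/one-up staircase (Levitt, 1971) converging on the 70.7%-correct point; the `respond` callback, starting values, and step sizes are illustrative assumptions, not parameters from this study.

```python
def staircase_threshold(respond, start_delta_hz=100.0, start_step_hz=20.0,
                        min_delta_hz=1.0, reversals_needed=8,
                        max_trials=400):
    """Generic two-down/one-up adaptive staircase (Levitt, 1971), which
    converges on the 70.7%-correct point. respond(delta_hz) -> bool
    stands in for one trial at a formant shift of delta_hz."""
    delta, step = start_delta_hz, start_step_hz
    run, reversals, last_dir = 0, [], 0
    for _ in range(max_trials):
        if len(reversals) >= reversals_needed:
            break
        if respond(delta):              # correct response
            run += 1
            if run < 2:
                continue                # need two correct in a row
            run, direction = 0, -1
            delta = max(delta - step, min_delta_hz)   # make task harder
        else:                           # incorrect response
            run, direction = 0, +1
            delta += step               # make task easier
        if last_dir and direction != last_dir:
            reversals.append(delta)     # record reversal point
            step = max(step / 2.0, min_delta_hz)      # shrink step size
        last_dir = direction
    tail = reversals[-6:] or [delta]
    return sum(tail) / len(tail)        # mean of the last reversals
```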

2.
Quantifying the intelligibility of speech in noise for non-native listeners
When listening to a language learned at a later age, speech intelligibility is generally lower than when listening to one's native language. The main purpose of this study is to quantify speech intelligibility in noise for specific populations of non-native listeners, only broadly addressing the underlying perceptual and linguistic processing. An easy method is sought to extend these quantitative findings to other listener populations. Dutch subjects listening to German and English speech, with proficiency in these languages ranging from reasonable to excellent, were found to require a 1-7 dB better speech-to-noise ratio than native listeners to obtain 50% sentence intelligibility. Also, the psychometric function for sentence recognition in noise was found to be shallower for non-native than for native listeners (worst-case slope around the 50% point of 7.5%/dB, compared to 12.6%/dB for native listeners). Differences between native and non-native speech intelligibility are largely predicted by linguistic entropy estimates derived from a letter-guessing task. Less effective use of context effects (especially semantic redundancy) explains the reduced speech intelligibility for non-native listeners. While measuring speech intelligibility for many different populations of listeners (languages, linguistic experience) may be prohibitively time consuming, obtaining predictions of non-native intelligibility from linguistic entropy may help to extend the results of this study to other listener populations.
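The slopes quoted above can be visualized with a logistic psychometric function; a minimal sketch follows, in which the SRT values are made-up placeholders (the paper reports only the 1-7 dB native/non-native gap, not absolute SRTs).

```python
import numpy as np

def psychometric(snr_db, srt_db, slope_pct_per_db):
    """Logistic sentence-intelligibility function.

    For a logistic 1 / (1 + exp(-k*(x - m))), the slope at the 50% point
    is k/4 (in proportion per dB), so k = 4 * slope."""
    k = 4.0 * slope_pct_per_db / 100.0   # %/dB -> proportion/dB, then k
    return 1.0 / (1.0 + np.exp(-k * (snr_db - srt_db)))

# Curves with the reported midpoint slopes; SRTs are placeholders.
snr = np.linspace(-10.0, 10.0, 41)
native = psychometric(snr, srt_db=-4.0, slope_pct_per_db=12.6)
non_native = psychometric(snr, srt_db=0.0, slope_pct_per_db=7.5)
```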

3.
The goal of this study was to measure the ability of normal-hearing listeners to discriminate formant frequency for vowels in isolation and in sentences at three signal levels. Results showed significant elevation in formant thresholds as formant frequency and linguistic context increased. Signal level showed a rollover effect, especially for F2: formant thresholds at 85 dB SPL were lower than thresholds at 70 or 100 dB SPL for both isolated vowels and sentences. This rollover effect could be due to reduced frequency selectivity and to forward/backward masking in sentences at high signal levels for normal-hearing listeners.

4.

Background
The present experiments were designed to test how the linguistic feature of case is processed in Japanese by native and non-native listeners. We used a miniature version of Japanese as a model to compare sentence-comprehension mechanisms in native speakers and in non-native learners who had been trained until they mastered the system. In the first experiment we auditorily presented native Japanese speakers with sentences containing incorrect double nominatives, sentences containing incorrect double accusatives, and correct sentences. In the second experiment we tested the trained non-natives with the same material. Based on previous research in German, we expected an N400-P600 biphasic ERP response, with specific modulations depending on the violated case and on whether the listeners were native or non-native.

5.
Previous work has established that naturally produced clear speech is more intelligible than conversational speech for adult hearing-impaired listeners, and for normal-hearing listeners under degraded listening conditions. The major goal of the present study was to investigate the extent to which naturally produced clear speech is an effective intelligibility-enhancement strategy for non-native listeners. Thirty-two non-native and 32 native listeners were presented with naturally produced English sentences. The factors varied were speaking style (conversational versus clear), signal-to-noise ratio (-4 versus -8 dB), and talker (one male versus one female). Results showed that while native listeners derived a substantial benefit from naturally produced clear speech (an improvement of about 16 rau units on a keyword-correct count), non-native listeners exhibited only a small clear-speech effect (an improvement of only 5 rau units). This relatively small clear-speech effect for non-native listeners is interpreted as a consequence of the fact that clear speech is essentially native-listener oriented, and is therefore beneficial only to listeners with extensive experience with the sound structure of the target language.
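The rau units above come from the rationalized arcsine transform (Studebaker, 1985), which rescales percent-correct scores so that differences are comparable across the scale. A minimal sketch; the 60-vs-76 example is ours, not the paper's.

```python
import math

def rau(correct, total):
    """Rationalized arcsine units (Studebaker, 1985): an arcsine transform
    rescaled so values track percent correct near mid-scale but remain
    roughly interval-scaled at the extremes (about -23 to +123)."""
    theta = (math.asin(math.sqrt(correct / (total + 1.0)))
             + math.asin(math.sqrt((correct + 1.0) / (total + 1.0))))
    return (146.0 / math.pi) * theta - 23.0

# Illustration (our numbers): near mid-scale, 60 vs 76 of 100 keywords
# correct differ by roughly 16 rau, like the native clear-speech benefit.
print(rau(76, 100) - rau(60, 100))   # ~15.9
```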

6.
Within the framework of a study on the merits of a frequency-dependent automatic gain control in hearing aids, the effect of varying the slope of the amplitude-frequency response on the speech-reception threshold (SRT) for sentences in noise was studied for normal-hearing listeners. Speech and noise were both subjected to the same amplitude-frequency response. In the first experiment, the effect of a constant slope was investigated (20 listeners). Over a range of about -7 to +10 dB/oct, the SRT in noise remained constant. In the second experiment, a single change in the slope of the amplitude-frequency response was introduced halfway through the sentence. The effect of varying the transition time over a range down to 0.125 s appeared to be very small. In the third experiment, the slope varied continuously with range and variation frequency (0.25-2 Hz) as the parameters. The masked SRT increased gradually with variation frequency. The results indicate that the masked SRT for sentences is remarkably resistant to dynamic variations in the slope of the amplitude-frequency response.
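A constant spectral slope in dB/octave, as manipulated here, can be imposed digitally with a simple FFT-domain filter. A minimal sketch under that assumption; the 1-kHz pivot frequency is our arbitrary choice, not from the study.

```python
import numpy as np

def apply_tilt(signal, fs, slope_db_per_oct, pivot_hz=1000.0):
    """Impose a constant spectral tilt: gain at frequency f is
    slope * log2(f / pivot_hz) dB, i.e. 0 dB at the pivot and a straight
    line on a log-frequency axis elsewhere."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    freqs[0] = freqs[1]                        # avoid log2(0) at DC
    gain_db = slope_db_per_oct * np.log2(freqs / pivot_hz)
    spectrum *= 10.0 ** (gain_db / 20.0)
    return np.fft.irfft(spectrum, n=len(signal))
```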

7.
Two signal-processing algorithms, derived from those described by Stubbs and Summerfield [R. J. Stubbs and Q. Summerfield, J. Acoust. Soc. Am. 84, 1236-1249 (1988)], were used to separate the voiced speech of two talkers speaking simultaneously, at similar intensities, in a single channel. Both algorithms use fundamental frequency (F0) as the basis for segregation. One attenuates the interfering voice by filtering the cepstrum of the signal. The other is a hybrid algorithm that combines cepstral filtering with the technique of harmonic selection [T. W. Parsons, J. Acoust. Soc. Am. 60, 911-918 (1976)]. The algorithms were evaluated and compared in perceptual experiments involving listeners with normal hearing and listeners with cochlear hearing impairments. In experiment 1 the processing was used to separate voiced sentences spoken on a monotone. Both algorithms gave significant increases in intelligibility to both groups of listeners, equivalent to an increase of 3-4 dB in the effective signal-to-noise ratio (SNR). In experiment 2 the processing was used to separate voiced sentences spoken with time-varying intonation. For normal-hearing listeners, cepstral filtering gave a significant increase in intelligibility, while the hybrid algorithm gave an increase that was on the margins of significance (p = 0.06); the improvements were equivalent to an increase of 2-3 dB in the effective SNR. For impaired listeners, no intelligibility improvements were demonstrated with intoned sentences. The decrease in performance for intoned material is attributed to limitations of the algorithms when F0 is nonstationary.
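Cepstral filtering of the kind named above works by liftering out the quefrency peak produced by the unwanted talker's F0. The following is only a schematic illustration of that idea, not the Stubbs and Summerfield algorithm itself; the single-frame processing, notch width, and phase handling are our simplifications.

```python
import numpy as np

def lifter_out_interferer(frame, fs, interferer_f0, notch_ms=0.4):
    """Attenuate the cepstral peak at the interfering talker's quefrency
    (1/F0) and resynthesize with the modified magnitude spectrum and the
    original phase. Purely schematic."""
    spectrum = np.fft.fft(frame)
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    cepstrum = np.fft.ifft(log_mag).real
    q = np.arange(len(frame)) / fs                  # quefrency axis (s)
    target = 1.0 / interferer_f0
    half = notch_ms / 2000.0                        # half-width in seconds
    notch = np.abs(q - target) < half
    notch |= np.abs(q - (len(frame) / fs - target)) < half   # mirror peak
    cepstrum[notch] = 0.0                           # lifter out the peak
    new_mag = np.exp(np.fft.fft(cepstrum).real)
    return np.fft.ifft(new_mag * np.exp(1j * np.angle(spectrum))).real
```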

8.
Despite a lack of traditional speech features, novel sentences restricted to a narrow spectral slit can retain nearly perfect intelligibility [R. M. Warren et al., Percept. Psychophys. 57, 175-182 (1995)]. The current study employed 514 listeners to elucidate the cues allowing this high intelligibility, and to examine generally the use of narrow-band temporal speech patterns. When 1/3-octave sentences were processed to preserve the overall temporal pattern of amplitude fluctuation but eliminate contrasting amplitude patterns within the band, sentence intelligibility dropped from values near 100% to values near zero (experiment 1). However, when a 1/3-octave speech band was partitioned to create a contrasting pair of independently amplitude-modulated 1/6-octave patterns, some intelligibility was restored (experiment 2). A third experiment showed that temporal patterns can also be integrated across wide frequency separations, or across the two ears. Despite the linguistic content of a single temporal pattern, open-set intelligibility does not occur with one pattern alone. Instead, a contrast between at least two temporal patterns is required for the comprehension of novel sentences and their component words. These contrasting patterns can reside together within a narrow range of frequencies, or they can be integrated across frequencies or ears. This view of speech perception, in which across-frequency changes in energy are seen as systematic changes in the temporal fluctuation patterns at two or more fixed loci, is more in line with the physiological encoding of complex signals.
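The experiment-1 manipulation, keeping a band's overall amplitude fluctuation while removing within-band contrasts, is commonly implemented by imposing the band's Hilbert envelope on a noise carrier. A minimal sketch under that assumption; the filter order and the 1500-Hz center frequency are illustrative, not taken from the study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def envelope_only_band(speech, fs, fc=1500.0):
    """Keep a 1/3-octave band's overall amplitude fluctuation but destroy
    within-band contrasts by imposing the band's Hilbert envelope on a
    noise carrier confined to the same band."""
    lo, hi = fc / 2 ** (1 / 6), fc * 2 ** (1 / 6)   # 1/3-octave edges
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    band = sosfiltfilt(sos, speech)
    envelope = np.abs(hilbert(band))                # overall fluctuation
    carrier = sosfiltfilt(sos, np.random.randn(len(speech)))
    carrier /= np.sqrt(np.mean(carrier ** 2))       # unit-RMS carrier
    return sosfiltfilt(sos, envelope * carrier)     # confine to the band
```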

9.
Speech intelligibility and localization in a multi-source environment.
Natural environments typically contain sound sources other than the source of interest, which may interfere with listeners' ability to extract information about the primary source. Studies of speech intelligibility and localization by normal-hearing listeners in the presence of competing speech are reported in this work. One, two, or three competing sentences [IEEE Trans. Audio Electroacoust. 17(3), 225-246 (1969)] were presented from various locations in the horizontal plane, in several spatial configurations relative to a target sentence. Target and competing sentences were spoken by the same male talker and presented at the same level. All experiments were conducted both in an actual sound field and in a virtual sound field. In the virtual sound field, both binaural and monaural conditions were tested. In the speech intelligibility experiment, there were significant improvements in performance when the target and competing sentences were spatially separated. Speech-intelligibility performance was similar in the actual sound-field and virtual sound-field binaural listening conditions. Although most of these improvements were evident monaurally when using the better ear, binaural listening was necessary for large improvements in some situations. In the localization experiment, target source identification was measured in a seven-alternative absolute identification paradigm with the same competing-sentence configurations as in the speech study. Localization performance was significantly better in the actual sound-field than in the virtual sound-field binaural listening conditions. Under binaural conditions, localization performance was very good, even in the presence of three competing sentences; under monaural conditions, performance was much worse. For the localization experiment, there was no significant effect of the number or configuration of the competing sentences tested. For these experiments, performance in the speech intelligibility experiment was not limited by localization ability.

10.
It is hypothesized that in sine-wave replicas of natural speech, lexical tone recognition would be severely impaired due to the loss of F0 information, but that linguistic information at the sentence level could be retrieved even with limited tone information. Forty-one native Mandarin-Chinese-speaking listeners participated in the experiments. Results showed that sine-wave tone-recognition performance was on average only 32.7% correct. However, sine-wave sentence-recognition performance was very accurate, approximately 92% correct on average. Therefore the functional load of lexical tones in sentence recognition is limited, and the high recognition accuracy for sine-wave sentences is likely attributable to perceptual organization influenced by top-down processes.
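Sine-wave replicas replace the speech signal with a few sinusoids that track the formant frequencies, discarding F0 and harmonic structure, which is why lexical tone suffers. A minimal synthesis sketch, assuming formant frequency and amplitude tracks are already available per sample (the tracking step itself is not shown):

```python
import numpy as np

def sine_wave_replica(freq_tracks_hz, amp_tracks, fs):
    """Sum time-varying sinusoids that follow the formant tracks; F0 and
    harmonic structure are discarded, as in sine-wave speech.

    freq_tracks_hz, amp_tracks: arrays of shape (n_formants, n_samples)
    holding instantaneous frequency (Hz) and amplitude per sample."""
    phase = 2.0 * np.pi * np.cumsum(freq_tracks_hz, axis=1) / fs
    return np.sum(amp_tracks * np.sin(phase), axis=0)
```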

11.
Native and nonnative listeners categorized final /v/ versus /f/ in English nonwords. Fricatives followed phonetically long (originally /v/-preceding) or short (originally /f/-preceding) vowels. Vowel duration was constant for each participant and sometimes mismatched other voicing cues. Previous results showed that English but not Dutch listeners (whose L1 has no final voicing contrast) nevertheless used the misleading vowel duration for /v/-/f/ categorization. New analyses showed that Dutch listeners did use vowel duration initially, but quickly reduced its use, whereas the English listeners used it consistently throughout the experiment. Thus, nonnative listeners adapted to the stimuli more flexibly than native listeners did.

12.
The speech-reception threshold (SRT) for sentences presented in a fluctuating interfering background sound of 80 dBA SPL was measured for 20 normal-hearing listeners and 20 listeners with sensorineural hearing impairment. The interfering sounds ranged from steady-state noise, via modulated noise, to a single competing voice. Two voices were used, one male and one female, and the spectrum of the masker was shaped according to these voices. For both voices, the SRT was measured both in noise spectrally shaped according to the target voice and in noise shaped according to the other voice. The results show that, for normal-hearing listeners, the SRT for sentences in modulated noise is 4-6 dB lower than in steady-state noise; for sentences masked by a competing voice, this difference is 6-8 dB. For listeners with moderate sensorineural hearing loss, elevated thresholds were obtained without an appreciable effect of masker fluctuations. The implications of these results for estimating a hearing handicap in everyday conditions are discussed. Using the articulation index (AI), it is shown that hearing-impaired individuals perform more poorly than the loss of audibility of parts of the speech signal would suggest. Finally, three mechanisms that contribute to the absence of unmasking by masker fluctuations in hearing-impaired listeners are discussed. The low sensation level at which the impaired listeners receive the masker seems a major determinant; the second and third factors are reduced temporal resolution and a reduction in comodulation masking release, respectively.
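The articulation index mentioned above sums frequency-band audibilities weighted by band importance. A simplified sketch follows; the -15 to +15 dB audibility range is the later SII convention, and the paper's exact AI procedure may differ.

```python
import numpy as np

def articulation_index(band_snr_db, band_importance):
    """Band-audibility articulation-index sketch: each band contributes
    its importance weight times an audibility factor that maps SNR
    linearly over a 30-dB range (-15..+15 dB, the SII convention)."""
    snr = np.clip(np.asarray(band_snr_db, dtype=float), -15.0, 15.0)
    audibility = (snr + 15.0) / 30.0
    return float(np.sum(np.asarray(band_importance, dtype=float)
                        * audibility))
```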

13.
Previous research has shown that speech-recognition differences between native and proficient non-native listeners emerge under suboptimal conditions. Current evidence suggests that the key deficit underlying this disproportionate effect of unfavorable listening conditions on non-native listeners is their less effective use of compensatory information at higher levels of processing to recover from information loss at the phoneme-identification level. The present study investigated whether this non-native disadvantage could be overcome if enhancements at various levels of processing were presented in combination. Native and non-native listeners were presented with English sentences in which the final word varied in predictability and which were produced in either plain or clear speech. Results showed that, relative to the low-predictability plain-speech baseline condition, non-native listeners' final-word recognition improved only when both semantic and acoustic enhancements were available (high-predictability clear speech). In contrast, the native listeners benefited from each source of enhancement separately and in combination. These results suggest that native and non-native listeners apply similar strategies for speech-in-noise perception: the crucial difference lies in the signal clarity required for contextual information to be effective, rather than in an inability of non-native listeners to take advantage of this contextual information per se.

14.
Speech can remain intelligible for listeners with normal hearing when processed by narrow bandpass filters that transmit only a small fraction of the audible spectrum. Two experiments investigated the basis for the high intelligibility of narrowband speech. Experiment 1 confirmed reports that everyday English sentences can be recognized accurately (82%-98% words correct) when filtered at center frequencies of 1500, 2100, and 3000 Hz. However, narrowband low predictability (LP) sentences were less accurately recognized than high predictability (HP) sentences (20% lower scores), and excised narrowband words were even less intelligible than LP sentences (a further 23% drop). While experiment 1 revealed similar levels of performance for narrowband and broadband sentences at conversational speech levels, experiment 2 showed that speech reception thresholds were substantially (>30 dB) poorer for narrowband sentences. One explanation for this increased disparity between narrowband and broadband speech at threshold (compared to conversational speech levels) is that spectral components in the sloping transition bands of the filters provide important cues for the recognition of narrowband speech, but these components become inaudible as the signal level is reduced. Experiment 2 also showed that performance was degraded by the introduction of a speech masker (a single competing talker). The elevation in threshold was similar for narrowband and broadband speech (11 dB, on average), but because the narrowband sentences required considerably higher sound levels to reach their thresholds in quiet compared to broadband sentences, their target-to-masker ratios were very different (+23 dB for narrowband sentences and -12 dB for broadband sentences). As in experiment 1, performance was better for HP than LP sentences. The LP-HP difference was larger for narrowband than broadband sentences, suggesting that context provides greater benefits when speech is distorted by narrow bandpass filtering.

15.
This study examined within- and across-electrode-channel processing of temporal gaps in successful users of MED-EL COMBI 40+ cochlear implants. The first experiment tested across-ear gap duration discrimination (GDD) in four listeners with bilateral implants. The results demonstrated that across-ear GDD thresholds are elevated relative to monaural, within-electrode-channel thresholds; the size of the threshold shift was approximately the same as for monaural, across-electrode-channel configurations. Experiment 1 also demonstrated a decline in GDD performance for channel-asymmetric markers. The second experiment tested the effect of envelope fluctuation on gap detection (GD) for monaural markers carried on a single electrode channel. Results from five cochlear implant listeners indicated that envelopes associated with 50-Hz wide bands of noise resulted in poorer GD thresholds than envelopes associated with 300-Hz wide bands of noise. In both cases GD thresholds improved when envelope fluctuations were compressed by an exponent of 0.2. The results of both experiments parallel those found for acoustic hearing, therefore suggesting that temporal processing of gaps is largely limited by factors central to the cochlea.
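Envelope compression by an exponent, as in the second experiment, can be sketched in the acoustic domain as follows. Note that the study compressed the envelopes driving a cochlear-implant electrode channel, so this Hilbert-envelope version is only a loose acoustic analogue of that manipulation.

```python
import numpy as np
from scipy.signal import hilbert

def compress_envelope(signal, exponent=0.2):
    """Compress envelope fluctuations by raising the Hilbert envelope to
    a power < 1, keep the temporal fine structure, and restore the
    original RMS level."""
    analytic = hilbert(signal)
    envelope = np.abs(analytic)
    fine_structure = np.cos(np.angle(analytic))
    out = (envelope ** exponent) * fine_structure
    return out * np.sqrt(np.mean(signal ** 2) / np.mean(out ** 2))
```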

16.
Listeners' ability to understand speech in adverse listening conditions is partially due to the redundant nature of speech. Natural redundancies are often lost or altered when speech is filtered, as is done in AI/SII experiments. It is therefore important to study how listeners recognize speech when the signal is unfiltered and the entire broadband spectrum is present. A correlational method [R. A. Lutfi, J. Acoust. Soc. Am. 97, 1333-1334 (1995); V. M. Richards and S. Zhu, J. Acoust. Soc. Am. 95, 423-424 (1994)] has been used to determine how listeners use spectral cues to perceive nonsense syllables when the full speech spectrum is present [K. A. Doherty and C. W. Turner, J. Acoust. Soc. Am. 100, 3769-3773 (1996); C. W. Turner et al., J. Acoust. Soc. Am. 104, 1580-1585 (1998)]. The experiments in this study measured spectral-weighting strategies for more naturally occurring speech stimuli, specifically sentences, using a correlational method with normal-hearing listeners. Results indicate that listeners placed the greatest weight on spectral information in bands 2 (562-1113 Hz) and 5 (2807-11,000 Hz). Spectral-weighting strategies for sentences were also compared to weighting strategies for nonsense syllables measured in a previous study (C. W. Turner et al., 1998), and the weighting strategies for sentences differed from those reported for nonsense syllables.
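In a correlational-method experiment, the level of each band is perturbed randomly on every trial, and a band's perceptual weight is estimated from the correlation between its perturbations and response accuracy. A minimal sketch of that analysis; the array shapes and the normalization to relative weights are our choices.

```python
import numpy as np

def spectral_weights(band_levels_db, correct):
    """Relative perceptual weights from a correlational-method experiment:
    correlate each band's trial-by-trial level perturbation with binary
    response accuracy, then normalize the |r| values.

    band_levels_db: (n_trials, n_bands); correct: (n_trials,) of 0/1."""
    x = np.asarray(band_levels_db, dtype=float)
    y = np.asarray(correct, dtype=float)
    x_c = x - x.mean(axis=0)                   # center perturbations
    y_c = y - y.mean()                         # center responses
    denom = np.sqrt((x_c ** 2).sum(axis=0) * (y_c ** 2).sum()) + 1e-12
    r = (x_c * y_c[:, None]).sum(axis=0) / denom
    weights = np.abs(r)
    return weights / weights.sum()             # relative weights
```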

17.
Audio recordings were made while six vocally untrained individuals read sentences aloud after breathing to three different lung-volume levels: typical, high, and low. A perceptual experiment was conducted on these speech samples, using a two-alternative forced-choice design in which listeners heard matched pairs of sentences and were asked to identify which sentence in the pair departed from normal-sounding speech. The results showed that listeners can accurately discriminate speech produced at both lung-volume extremes. The percentage of correct identification was higher for speech produced at low lung volumes than for speech produced at high lung volumes. Factors such as order of presentation and removal of SPL as an acoustic cue made little difference in listeners' ability to discriminate lung-volume level from the speech signals.

18.
Four experiments were carried out to examine listener- and talker-related factors that may influence degree of perceived foreign accent. In each, native English listeners rated English sentences for degree of accent. It was found that degree of accent is influenced by range effects. The larger the proportion of native (or near-native) speakers included in a set of sentences being evaluated, the more strongly accented listeners judged sentences spoken by non-native speakers to be. Foreign accent ratings were not stable. Listeners judged a set of non-native-produced sentences to be more strongly accented after, as compared to before, they became familiar with those sentences. One talker-related effect noted in the study was the finding that adults' pronunciation of an L2 may improve over time. Late L2 learners who had lived in the United States for an average of 14.3 years received significantly higher scores than late learners who had resided in the United States for 0.7 years. Another talker-related effect pertained to the age of L2 learning (AOL). Native Spanish subjects with an AOL of five to six years were not found to have an accent (i.e., to receive significantly lower scores than native English speakers), whereas native Chinese subjects with an average AOL of 7.6 years did have a measurable accent. The paper concludes with the presentation of several hypotheses concerning the relationship between AOL and degree of foreign accent.

19.
The purpose of this study was to examine the contribution of information provided by vowels versus consonants to sentence intelligibility in young normal-hearing (YNH) and typical elderly hearing-impaired (EHI) listeners. Sentences were presented in three conditions, unaltered or with either the vowels or the consonants replaced with speech shaped noise. Sentences from male and female talkers in the TIMIT database were selected. Baseline performance was established at a 70 dB SPL level using YNH listeners. Subsequently EHI and YNH participants listened at 95 dB SPL. Participants listened to each sentence twice and were asked to repeat the entire sentence after each presentation. Words were scored correct if identified exactly. Average performance for unaltered sentences was greater than 94%. Overall, EHI listeners performed more poorly than YNH listeners. However, vowel-only sentences were always significantly more intelligible than consonant-only sentences, usually by a ratio of 2:1 across groups. In contrast to written English or words spoken in isolation, these results demonstrated that for spoken sentences, vowels carry more information about sentence intelligibility than consonants for both young normal-hearing and elderly hearing-impaired listeners.

20.
The goal of this study was to establish the ability of normal-hearing listeners to discriminate formant frequency in vowels in everyday speech. Vowel formant discrimination in syllables, phrases, and sentences was measured for high-fidelity (nearly natural) speech synthesized by STRAIGHT [Kawahara et al., Speech Commun. 27, 187-207 (1999)]. Thresholds were measured for changes in F1 and F2 for the vowels /ɪ, ɛ, æ, ʌ/ in /bVd/ syllables. Experimental factors manipulated included phonetic context (syllables, phrases, and sentences), sentence discrimination with the addition of an identification task, and word position. Results showed that neither longer phonetic context nor the addition of the identification task significantly affected thresholds, while thresholds for word-final position were significantly better than for either initial or middle position in sentences. Results suggest that an average of 0.37 bark is required for normal-hearing listeners to discriminate vowel formants in sentences of modest length, elevated by 84% compared to isolated vowels. Vowel formant discrimination in several phonetic contexts was slightly elevated for STRAIGHT-synthesized speech compared to the formant-synthesized stimuli reported by Kewley-Port and Zheng [J. Acoust. Soc. Am. 106, 2945-2958 (1999)]. These elevated thresholds appeared to be related to the greater spectral-temporal variability of the high-fidelity speech produced by STRAIGHT relative to formant-synthesized speech.
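The 0.37-bark criterion uses the Bark auditory-frequency scale. One common Hz-to-Bark approximation (Traunmüller, 1990) is sketched below; other published Bark formulas differ slightly, and the study does not specify which it used. The 500-to-520 Hz example is ours.

```python
def hz_to_bark(f_hz):
    """Hz-to-Bark approximation (Traunmüller, 1990)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

# Example: an F1 step from 500 to 520 Hz is about 0.17 bark, well under
# the 0.37-bark sentence threshold reported above.
print(hz_to_bark(520.0) - hz_to_bark(500.0))
```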
