Found 20 similar articles; search took 15 ms
1.
Semantic and phonetic enhancements for speech-in-noise recognition by native and non-native listeners (Cited by: 1; self-citations: 0; citations by others: 1)
Previous research has shown that speech recognition differences between native and proficient non-native listeners emerge under suboptimal conditions. Current evidence has suggested that the key deficit that underlies this disproportionate effect of unfavorable listening conditions for non-native listeners is their less effective use of compensatory information at higher levels of processing to recover from information loss at the phoneme identification level. The present study investigated whether this non-native disadvantage could be overcome if enhancements at various levels of processing were presented in combination. Native and non-native listeners were presented with English sentences in which the final word varied in predictability and which were produced in either plain or clear speech. Results showed that, relative to the low-predictability-plain-speech baseline condition, non-native listener final word recognition improved only when both semantic and acoustic enhancements were available (high-predictability-clear-speech). In contrast, the native listeners benefited from each source of enhancement separately and in combination. These results suggest that native and non-native listeners apply similar strategies for speech-in-noise perception: The crucial difference is in the signal clarity required for contextual information to be effective, rather than in an inability of non-native listeners to take advantage of this contextual information per se.
2.
Cutler A, Garcia Lecumberri ML, Cooke M. The Journal of the Acoustical Society of America, 2008, 124(2): 1264-1268
Speech recognition in noise is harder in second (L2) than first languages (L1). This could be because noise disrupts speech processing more in L2 than L1, or because L1 listeners recover better though disruption is equivalent. Two similar prior studies produced discrepant results: Equivalent noise effects for L1 and L2 (Dutch) listeners, versus larger effects for L2 (Spanish) than L1. To explain this, the latter experiment was presented to listeners from the former population. Larger noise effects on consonant identification emerged for L2 (Dutch) than L1 listeners, suggesting that task factors rather than L2 population differences underlie the results discrepancy.
3.
CY Lee, Y Zhang, X Li, L Tao, ZS Bond. The Journal of the Acoustical Society of America, 2012, 132(2): 1130-1140
Speaker variability and noise are two common sources of acoustic variability. The goal of this study was to examine whether these two sources of acoustic variability affected native and non-native perception of Mandarin fricatives to different degrees. Multispeaker Mandarin fricative stimuli were presented to 40 native and 52 non-native listeners in two presentation formats (blocked by speaker and mixed across speakers). The stimuli were also mixed with speech-shaped noise to create five levels of signal-to-noise ratio. The results showed that noise affected non-native identification disproportionately. By contrast, the effect of speaker variability was comparable between the native and non-native listeners. Confusion patterns were interpreted with reference to the results of acoustic analysis, suggesting native and non-native listeners used distinct acoustic cues for fricative identification. It was concluded that not all sources of acoustic variability are treated equally by native and non-native listeners. Whereas noise compromised non-native fricative perception disproportionately, speaker variability did not pose a special challenge to the non-native listeners.
4.
The amount of acoustic information that native and non-native listeners need for syllable identification was investigated by comparing the performance of monolingual English speakers and native Spanish speakers with either an earlier or a later age of immersion in an English-speaking environment. Duration-preserved silent-center syllables retaining 10, 20, 30, or 40 ms of the consonant-vowel and vowel-consonant transitions were created for the target vowels /i, ɪ, eɪ, ɛ, æ/ and /a/, spoken by two males in /bVb/ context. Duration-neutral syllables were created by editing the silent portion to equate the duration of all vowels. Listeners identified the syllables in a six-alternative forced-choice task. The earlier learners identified the whole-word and 40 ms duration-preserved syllables as accurately as the monolingual listeners, but identified the silent-center syllables significantly less accurately overall. Only the monolingual listener group identified syllables significantly more accurately in the duration-preserved than in the duration-neutral condition, suggesting that the non-native listeners were unable to recover from the syllable disruption sufficiently to access the duration cues in the silent-center syllables. This effect was most pronounced for the later learners, who also showed the most vowel confusions and the greatest decrease in performance from the whole word to the 40 ms transition condition.
5.
Munson B, Donaldson GS, Allen SL, Collison EA, Nelson DA. The Journal of the Acoustical Society of America, 2003, 113(2): 925-935
Many studies have noted great variability in speech perception ability among postlingually deafened adults with cochlear implants. This study examined phoneme misperceptions for 30 cochlear implant listeners using either the Nucleus-22 or Clarion version 1.2 device to examine whether listeners with better overall speech perception differed qualitatively from poorer listeners in their perception of vowel and consonant features. In the first analysis, simple regressions were used to predict the mean percent-correct scores for consonants and vowels for the better group of listeners from those of the poorer group. A strong relationship between the two groups was found for consonant identification, and a weak, nonsignificant relationship was found for vowel identification. In the second analysis, it was found that less information was transmitted for consonant and vowel features to the poorer listeners than to the better listeners; however, the pattern of information transmission was similar across groups. Taken together, results suggest that the performance difference between the two groups is primarily quantitative. The results underscore the importance of examining individuals' perception of individual phoneme features when attempting to relate speech perception to other predictor variables.
6.
Brouwer S, Van Engen KJ, Calandruccio L, Bradlow AR. The Journal of the Acoustical Society of America, 2012, 131(2): 1449-1464
This study examined whether speech-on-speech masking is sensitive to variation in the degree of similarity between the target and the masker speech. Three experiments investigated whether speech-in-speech recognition varies across different background speech languages (English vs Dutch) for both English and Dutch targets, as well as across variation in the semantic content of the background speech (meaningful vs semantically anomalous sentences), and across variation in listener status vis-à-vis the target and masker languages (native, non-native, or unfamiliar). The results showed that the more similar the target speech is to the masker speech (e.g., same vs different language, same vs different levels of semantic content), the greater the interference on speech recognition accuracy. Moreover, the listener's knowledge of the target and the background language modulates the size of the release from masking. These factors had an especially strong effect on masking effectiveness in highly unfavorable listening conditions. Overall this research provided evidence that the degree of target-masker similarity plays a significant role in speech-in-speech recognition. The results also give insight into how listeners assign their resources differently depending on whether they are listening to their first or second language.
7.
This study investigated how native language background interacts with speaking style adaptations in determining levels of speech intelligibility. The aim was to explore whether native and high-proficiency non-native listeners benefit similarly from native and non-native clear speech adjustments. The sentence-in-noise perception results revealed that fluent non-native listeners gained a large clear speech benefit from native clear speech modifications. Furthermore, proficient non-native talkers in this study implemented conversational-to-clear speaking style modifications in their second language (L2) that resulted in significant intelligibility gain for both native and non-native listeners. The results of the accentedness ratings obtained for native and non-native conversational and clear speech sentences showed that while intelligibility was improved, the presence of foreign accent remained constant in both speaking styles. This suggests that objective intelligibility and subjective accentedness are two independent dimensions of non-native speech. Overall, these results provide strong evidence that greater experience in L2 processing leads to improved intelligibility in both production and perception domains. These results also demonstrated that speaking style adaptations along with less signal distortion can contribute significantly towards successful native and non-native interactions.
8.
Tuinman A, Mitterer H, Cutler A. The Journal of the Acoustical Society of America, 2011, 130(3): 1643-1652
In sequences such as law and order, speakers of British English often insert /r/ between law and and. Acoustic analyses revealed such "intrusive" /r/ to be significantly shorter than canonical /r/. In a 2AFC experiment, native listeners heard British English sentences in which /r/ duration was manipulated across a word boundary [e.g., saw (r)ice], and orthographic and semantic factors were varied. These listeners responded categorically on the basis of acoustic evidence for /r/ alone, reporting ice after short /r/s, rice after long /r/s; orthographic and semantic factors had no effect. Dutch listeners proficient in English who heard the same materials relied less on durational cues than the native listeners, and were affected by both orthography and semantic bias. American English listeners produced intermediate responses to the same materials, being sensitive to duration (less so than native, more so than Dutch listeners), and to orthography (less so than the Dutch), but insensitive to the semantic manipulation. Listeners from language communities without common use of intrusive /r/ may thus interpret intrusive /r/ as canonical /r/, with a language difference increasing this propensity more than a dialect difference. Native listeners, however, efficiently distinguish intrusive from canonical /r/ by exploiting the relevant acoustic variation.
9.
This study examined the effect of presumed mismatches between speech input and the phonological representations of English words by native speakers of English (NE) and Spanish (NS). The English test words, which were produced by a NE speaker and a NS speaker, varied orthogonally in lexical frequency and neighborhood density and were presented to NE listeners and to NS listeners who differed in English pronunciation proficiency. It was hypothesized that mismatches between phonological representations and speech input would impair word recognition, especially for items from dense lexical neighborhoods which are phonologically similar to many other words and require finer sound discrimination. Further, it was assumed that L2 phonological representations would change with L2 proficiency. The results showed the expected mismatch effect only for words from dense neighborhoods. For Spanish-accented stimuli, the NS groups recognized more words from dense neighborhoods than the NE group did. For native-produced stimuli, the low-proficiency NS group recognized fewer words than the other two groups. The high-proficiency NS participants' performance was as good as the NE group's for words from sparse neighborhoods, but not for words from dense neighborhoods. These results are discussed in relation to the development of phonological representations of L2 words.
10.
Speech produced in the presence of noise (Lombard speech) is more intelligible in noise than speech produced in quiet, but the origin of this advantage is poorly understood. Some of the benefit appears to arise from auditory factors such as energetic masking release, but a role for linguistic enhancements similar to those exhibited in clear speech is possible. The current study examined the effect of Lombard speech in noise and in quiet for Spanish learners of English. Non-native listeners showed a substantial benefit of Lombard speech in noise, although not quite as large as that displayed by native listeners tested on the same task in an earlier study [Lu and Cooke (2008), J. Acoust. Soc. Am. 124, 3261-3275]. The difference between the two groups is unlikely to be due to energetic masking. However, Lombard speech was less intelligible in quiet for non-native listeners than normal speech. The relatively small difference in Lombard benefit in noise for native and non-native listeners, along with the absence of Lombard benefit in quiet, suggests that any contribution of linguistic enhancements in the Lombard benefit for natives is small.
11.
Effect of stimulation rate on phoneme recognition by Nucleus-22 cochlear implant listeners (Cited by: 3; self-citations: 0; citations by others: 3)
This study investigated the effect of pulsatile stimulation rate on medial vowel and consonant recognition in cochlear implant listeners. Experiment 1 measured phoneme recognition as a function of stimulation rate in six Nucleus-22 cochlear implant listeners using an experimental four-channel continuous interleaved sampler (CIS) speech processing strategy. Results showed that all stimulation rates from 150 to 500 pulses/s/electrode produced equally good performance, while stimulation rates lower than 150 pulses/s/electrode produced significantly poorer performance. Experiment 2 measured phoneme recognition by implant listeners and normal-hearing listeners as a function of the low-pass cutoff frequency for envelope information. Results from both acoustic and electric hearing showed no significant difference in performance for all cutoff frequencies higher than 20 Hz. Both vowel and consonant scores dropped significantly when the cutoff frequency was reduced from 20 Hz to 2 Hz. The results of these two experiments suggest that temporal envelope information can be conveyed by relatively low stimulation rates. The pattern of results for both electrical and acoustic hearing is consistent with a simple model of temporal integration with an equivalent rectangular duration (ERD) of the temporal integrator of about 7 ms.
12.
van Wijngaarden SJ, Steeneken HJ, Houtgast T. The Journal of the Acoustical Society of America, 2002, 111(4): 1906-1916
When listening to languages learned at a later age, speech intelligibility is generally lower than when listening to one's native language. The main purpose of this study is to quantify speech intelligibility in noise for specific populations of non-native listeners, only broadly addressing the underlying perceptual and linguistic processing. An easy method is sought to extend these quantitative findings to other listener populations. Dutch subjects listening to German and English speech, ranging from reasonable to excellent proficiency in these languages, were found to require a 1-7 dB better speech-to-noise ratio to obtain 50% sentence intelligibility than native listeners. Also, the psychometric function for sentence recognition in noise was found to be shallower for non-native than for native listeners (worst-case slope around the 50% point of 7.5%/dB, compared to 12.6%/dB for native listeners). Differences between native and non-native speech intelligibility are largely predicted by linguistic entropy estimates as derived from a letter guessing task. Less effective use of context effects (especially semantic redundancy) explains the reduced speech intelligibility for non-native listeners. While measuring speech intelligibility for many different populations of listeners (languages, linguistic experience) may be prohibitively time consuming, obtaining predictions of non-native intelligibility from linguistic entropy may help to extend the results of this study to other listener populations.
13.
This study examined the effect of linguistic experience on perception of the English /s/-/z/ contrast in word-final position. The durations of the periodic ("vowel") and aperiodic ("fricative") portions of stimuli, ranging from peas to peace, were varied in a 5 × 5 factorial design. Forced-choice identification judgments were elicited from two groups of native speakers of American English differing in dialect, and from two groups each of native speakers of French, Swedish, and Finnish differing in English-language experience. The results suggested that the non-native subjects used cues established for the perception of phonetic contrasts in their native language to identify fricatives as /s/ or /z/. Lengthening vowel duration increased /z/ judgments in all eight subject groups, although the effect was smaller for native speakers of French than for native speakers of the other languages. Shortening fricative duration, on the other hand, significantly decreased /z/ judgments only by the English and French subjects. It did not influence voicing judgments by the Swedish and Finnish subjects, even those who had lived for a year or more in an English-speaking environment. These findings raise the question of whether adults who learn a foreign language can acquire the ability to integrate multiple acoustic cues to a phonetic contrast which does not exist in their native language.
14.
Gordon-Salant S, Yeni-Komshian GH, Fitzgibbons PJ. The Journal of the Acoustical Society of America, 2010, 128(5): 3152-3160
This study investigated the effects of age and hearing loss on perception of accented speech presented in quiet and noise. The relative importance of alterations in phonetic segments vs. temporal patterns in a carrier phrase with accented speech also was examined. English sentences recorded by a native English speaker and a native Spanish speaker, together with hybrid sentences that varied the native language of the speaker of the carrier phrase and the final target word of the sentence were presented to younger and older listeners with normal hearing and older listeners with hearing loss in quiet and noise. Effects of age and hearing loss were observed in both listening environments, but varied with speaker accent. All groups exhibited lower recognition performance for the final target word spoken by the accented speaker compared to that spoken by the native speaker, indicating that alterations in segmental cues due to accent play a prominent role in intelligibility. Effects of the carrier phrase were minimal. The findings indicate that recognition of accented speech, especially in noise, is a particularly challenging communication task for older people.
15.
D H Klatt. The Journal of the Acoustical Society of America, 1968, 44(2): 401-407
16.
English consonant recognition in noise and in reverberation by Japanese and American listeners (Cited by: 1; self-citations: 0; citations by others: 1)
English consonant recognition in undegraded and degraded listening conditions was compared for listeners whose primary language was either Japanese or American English. There were ten subjects in each of the two groups, termed the non-native (Japanese) and the native (American) subjects, respectively. The Modified Rhyme Test was degraded either by a babble of voices (S/N = -3 dB) or by a room reverberation (reverberation time, T = 1.2 s). The Japanese subjects performed at a lower level than the American subjects in both noise and reverberation, although the performance difference in the undegraded, quiet condition was relatively small. There was no difference between the scores obtained in noise and in reverberation for either group. A limited-error analysis revealed some differences in type of errors for the groups of listeners. Implications of the results are discussed in terms of the effects of degraded listening conditions on non-native listeners' speech perception.
17.
Garcia Lecumberri ML, Cooke M. The Journal of the Acoustical Society of America, 2006, 119(4): 2445-2454
Spoken communication in a non-native language is especially difficult in the presence of noise. This study compared English and Spanish listeners' perceptions of English intervocalic consonants as a function of masker type. Three maskers (stationary noise, multitalker babble, and competing speech) provided varying amounts of energetic and informational masking. Competing English and Spanish speech maskers were used to examine the effect of masker language. Non-native performance fell short of that of native listeners in quiet, but a larger performance differential was found for all masking conditions. Both groups performed better in competing speech than in stationary noise, and both suffered most in babble. Since babble is a less effective energetic masker than stationary noise, these results suggest that non-native listeners are more adversely affected by both energetic and informational masking. A strong correlation was found between non-native performance in quiet and degree of deterioration in noise, suggesting that non-native phonetic category learning can be fragile. A small effect of language background was evident: English listeners performed better when the competing speech was Spanish.
18.
Previous studies have shown improved sensitivity to native-language contrasts and reduced sensitivity to non-native phonetic contrasts when comparing 6-8 and 10-12-month-old infants. This developmental pattern is interpreted as reflecting the onset of language-specific processing around the first birthday. However, generalization of this finding is limited by the fact that studies have yielded inconsistent results and that insufficient numbers of phonetic contrasts have been tested developmentally; this is especially true for native-language phonetic contrasts. Three experiments assessed the effects of language experience on affricate-fricative contrasts in a cross-language study of English and Mandarin adults and infants. Experiment 1 showed that English-speaking adults score lower than Mandarin-speaking adults on Mandarin alveolo-palatal affricate-fricative discrimination. Experiment 2 examined developmental change in the discrimination of this contrast in English- and Mandarin-learning infants between 6 and 12 months of age. The results demonstrated that native-language performance significantly improved with age while performance on the non-native contrast decreased. Experiment 3 replicated the perceptual improvement for a native contrast: 6-8 and 10-12-month-old English-learning infants showed a performance increase at the older age. The results add to our knowledge of the developmental patterns of native and non-native phonetic perception.
19.
Strange W, Akahane-Yamada R, Kubo R, Trent SA, Nishi K. The Journal of the Acoustical Society of America, 2001, 109(4): 1691-1704
This study investigated the extent to which adult Japanese listeners' perceived phonetic similarity of American English (AE) and Japanese (J) vowels varied with consonantal context. Four AE speakers produced multiple instances of the 11 AE vowels in six syllabic contexts /b-b, b-p, d-d, d-t, g-g, g-k/ embedded in a short carrier sentence. Twenty-four native speakers of Japanese were asked to categorize each vowel utterance as most similar to one of 18 Japanese categories [five one-mora vowels, five two-mora vowels, plus /ei, ou/ and one-mora and two-mora vowels in palatalized consonant CV syllables, C(j)a(a), C(j)u(u), C(j)o(o)]. They then rated the "category goodness" of the AE vowel to the selected Japanese category on a seven-point scale. None of the 11 AE vowels was assimilated unanimously to a single J response category in all context/speaker conditions; consistency in selecting a single response category ranged from 77% for /eI/ to only 32% for /ae/. Median ratings of category goodness for modal response categories were somewhat restricted overall, ranging from 5 to 3. Results indicated that temporal assimilation patterns (judged similarity to one-mora versus two-mora Japanese categories) differed as a function of the voicing of the final consonant, especially for the AE vowels, /see text/. Patterns of spectral assimilation (judged similarity to the five J vowel qualities) of /see text/ also varied systematically with consonantal context and speakers. On the basis of these results, it was predicted that relative difficulty in the identification and discrimination of AE vowels by Japanese speakers would vary significantly as a function of the contexts in which they were produced and presented.
20.
Thresholds of ongoing interaural time difference (ITD) were obtained from normal-hearing and hearing-impaired listeners who had high-frequency, sensorineural hearing loss. Several stimuli (a 500-Hz sinusoid, a narrow-band noise centered at 500 Hz, a sinusoidally amplitude-modulated 4000-Hz tone, and a narrow-band noise centered at 4000 Hz) and two criteria [equal sound-pressure level (Eq SPL) and equal sensation level (Eq SL)] for determining the level of stimuli presented to each listener were employed. The ITD thresholds and slopes of the psychometric functions were elevated for hearing-impaired listeners for the two high-frequency stimuli in comparison to: the listener's own low-frequency thresholds; and data obtained from normal-hearing listeners for stimuli presented with Eq SPL interaurally. The two groups of listeners required similar ITDs to reach threshold when stimuli were presented at Eq SLs to each ear. For low-frequency stimuli, the ITD thresholds of the hearing-impaired listeners were generally slightly greater than those obtained from the normal-hearing listeners. Whether these stimuli were presented at either Eq SPL or Eq SL did not differentially affect the ITD thresholds across groups.