Similar Articles
1.
This study investigated the effects of age and hearing loss on the perception of accented speech presented in quiet and noise. The relative importance of alterations in phonetic segments vs. temporal patterns in a carrier phrase with accented speech was also examined. English sentences recorded by a native English speaker and a native Spanish speaker, together with hybrid sentences that varied the native language of the speaker of the carrier phrase and of the final target word, were presented in quiet and noise to younger and older listeners with normal hearing and to older listeners with hearing loss. Effects of age and hearing loss were observed in both listening environments, but varied with speaker accent. All groups exhibited lower recognition performance for the final target word spoken by the accented speaker than for that spoken by the native speaker, indicating that alterations in segmental cues due to accent play a prominent role in intelligibility. Effects of the carrier phrase were minimal. The findings indicate that recognition of accented speech, especially in noise, is a particularly challenging communication task for older people.

2.
Rapid adaptation to foreign-accented English
This study explored the perceptual benefits of brief exposure to non-native speech. Native English listeners were exposed to English sentences produced by non-native speakers. Perceptual processing speed was tracked by measuring reaction times to visual probe words following each sentence. Three experiments using Spanish- and Chinese-accented speech indicate that processing speed is initially slower for accented speech than for native speech but that this deficit diminishes within one minute of exposure. Control conditions rule out explanations for the adaptation effect based on practice with the task and general strategies for dealing with difficult speech. Further results suggest that adaptation can occur within as few as two to four sentence-length utterances. The findings emphasize the flexibility of human speech processing and require models of spoken word recognition that can rapidly accommodate significant acoustic-phonetic deviations from native language speech patterns.

3.
Accented speech recognition is more challenging than standard speech recognition due to the effects of phonetic and acoustic confusions. Phonetic confusion in accented speech occurs when an expected phone is pronounced as a different one, which leads to erroneous recognition. Acoustic confusion occurs when the pronounced phone lies acoustically between two baseform models and can be equally recognized as either one. We propose that these confusions must be analyzed and modeled separately in order to improve accented speech recognition without degrading standard speech recognition. Since low phonetic confusion units in accented speech do not give rise to automatic speech recognition errors, we focus on analyzing and reducing phonetic and acoustic confusability under high phonetic confusion conditions. We propose using a likelihood ratio test to measure phonetic confusion, and an asymmetric acoustic distance to measure acoustic confusion. Only accent-specific phonetic units with low acoustic confusion are used in an augmented pronunciation dictionary, while phonetic units with high acoustic confusion are reconstructed using decision tree merging. Experimental results show that our approach is effective and superior to methods modeling phonetic confusion or acoustic confusion alone in accented speech, with a significant 5.7% absolute WER reduction, without degrading standard speech recognition.
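The likelihood-ratio measure of phonetic confusion described above can be sketched numerically. This is a minimal illustration, assuming diagonal-Gaussian phone models with scalar mean and variance; the function names and model form are assumptions for exposition, not the paper's implementation.

```python
import numpy as np

def gaussian_loglik(frames, mean, var):
    """Total log-likelihood of acoustic frames under a (scalar) Gaussian phone model."""
    frames = np.asarray(frames, dtype=float)
    return float(np.sum(-0.5 * (np.log(2 * np.pi * var) + (frames - mean) ** 2 / var)))

def phonetic_confusion_llr(frames, baseform_model, surface_model):
    """Log-likelihood ratio between an accented 'surface' phone model and the
    canonical baseform model. A large positive value means the accented
    realisation is better explained by the surface phone -- a phonetic
    confusion candidate for the augmented pronunciation dictionary."""
    return gaussian_loglik(frames, *surface_model) - gaussian_loglik(frames, *baseform_model)
```

For frames drawn near the surface phone's mean, the ratio is strongly positive; near the baseform mean it is negative, so thresholding it separates high- from low-confusion units.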

4.
This study examined proportional frequency compression as a strategy for improving speech recognition in listeners with high-frequency sensorineural hearing loss. This method of frequency compression preserved the ratios between the frequencies of the components of natural speech, as well as the temporal envelope of the unprocessed speech stimuli. Nonsense syllables spoken by a female and a male talker were used as the speech materials. Both frequency-compressed speech and the control condition of unprocessed speech were presented with high-pass amplification. For the materials spoken by the female talker, significant increases in speech recognition were observed in slightly less than one-half of the listeners with hearing impairment. For the male-talker materials, one-fifth of the hearing-impaired listeners showed significant recognition improvements. The increases in speech recognition due solely to frequency compression were generally smaller than those solely due to high-pass amplification. The results indicate that while high-pass amplification is still the most effective approach for improving speech recognition of listeners with high-frequency hearing loss, proportional frequency compression can offer significant improvements in addition to those provided by amplification for some patients.
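Proportional compression maps every component at frequency f to f × ratio, so the ratios between component frequencies (e.g., formant spacing) are preserved. A minimal sketch of this remapping on a single magnitude spectrum frame, assuming linear interpolation between bins; this is an illustrative reconstruction, not the study's processing chain:

```python
import numpy as np

def compress_spectrum(mag, ratio):
    """Proportionally compress a magnitude spectrum: output bin k draws its
    value from input bin k / ratio (ratio < 1 shifts energy downward while
    preserving frequency ratios between components)."""
    n = len(mag)
    out = np.zeros(n)
    src = np.arange(n) / ratio          # source position for each output bin
    valid = src < n - 1                 # positions that fall inside the input
    lo = src[valid].astype(int)
    frac = src[valid] - lo
    out[valid] = (1 - frac) * mag[lo] + frac * mag[lo + 1]  # linear interpolation
    return out
```

With ratio = 0.5, peaks at bins 80 and 160 land at bins 40 and 80: both moved, but their 2:1 frequency ratio is intact.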

5.
At high presentation levels, normally aided ears yield better performance for speech identification than normally unaided ears, while at low presentation levels the converse is true [S. Gatehouse, J. Acoust. Soc. Am. 86, 2103-2106 (1989)]. To explain this process further, the speech identification abilities of four subjects with bilateral symmetric sensorineural hearing impairment were investigated following provision of a single hearing aid. Results showed significant increases in the benefit from amplifying speech in the aided ear, but not in the control ear. In addition, a headphone simulation of the unaided condition for the fitted ear shows a decrease in speech identification. The benefits from providing a particular frequency spectrum do not emerge immediately, but over a time course of at least 6-12 weeks. The findings support the existence of perceptual acclimatization effects, and call into question short-term methods of hearing aid evaluation and selection by comparative speech identification tests.

6.
The intelligibility of speech pronounced by non-native talkers is generally lower than speech pronounced by native talkers, especially under adverse conditions, such as high levels of background noise. The effect of foreign accent on speech intelligibility was investigated quantitatively through a series of experiments involving voices of 15 talkers, differing in language background, age of second-language (L2) acquisition and experience with the target language (Dutch). Overall speech intelligibility of L2 talkers in noise is predicted with a reasonable accuracy from accent ratings by native listeners, as well as from the self-ratings for proficiency of L2 talkers. For non-native speech, unlike native speech, the intelligibility of short messages (sentences) cannot be fully predicted by phoneme-based intelligibility tests. Although incorrect recognition of specific phonemes certainly occurs as a result of foreign accent, the effect of reduced phoneme recognition on the intelligibility of sentences may range from severe to virtually absent, depending on (for instance) the speech-to-noise ratio. Objective acoustic-phonetic analyses of accented speech were also carried out, but satisfactory overall predictions of speech intelligibility could not be obtained with relatively simple acoustic-phonetic measures.

7.
Perceptual coherence, the process by which the individual elements of complex sounds are bound together, was examined in adult listeners with longstanding childhood hearing losses, listeners with adult-onset hearing losses, and listeners with normal hearing. It was hypothesized that perceptual coherence would vary in strength between the groups due to their substantial differences in hearing history. Bisyllabic words produced by three talkers as well as comodulated three-tone complexes served as stimuli. In the first task, the second formant of each word was isolated and presented for recognition. In the second task, an isolated formant was paired with an intact word and listeners indicated whether or not the isolated second formant was a component of the intact word. In the third task, the middle component of the three-tone complex was presented in the same manner. For the speech stimuli, results indicate normal perceptual coherence in the listeners with adult-onset hearing loss but significantly weaker coherence in the listeners with childhood hearing losses. No differences were observed across groups for the nonspeech stimuli. These results suggest that perceptual coherence is relatively unaffected by hearing loss acquired during adulthood but appears to be impaired when hearing loss is present in early childhood.

8.
Whether or not categorical perception results from the operation of a special, language-specific, speech mode remains controversial. In this cross-language (Mandarin Chinese, English) study of the categorical nature of tone perception, we compared native Mandarin and English speakers' perception of a physical continuum of fundamental frequency contours ranging from a level to rising tone in both Mandarin speech and a homologous (nonspeech) harmonic tone. This design permits us to evaluate the effect of language experience by comparing Chinese and English groups; to determine whether categorical perception is speech-specific or domain-general by comparing speech to nonspeech stimuli for both groups; and to examine whether categorical perception involves a separate categorical process, distinct from regions of sensory discontinuity, by comparing speech to nonspeech stimuli for English listeners. Results show evidence of strong categorical perception of speech stimuli for Chinese but not English listeners. Categorical perception of nonspeech stimuli was comparable to that for speech stimuli for Chinese but weaker for English listeners, and perception of nonspeech stimuli was more categorical for English listeners than was perception of speech stimuli. These findings lead us to adopt a memory-based, multistore model of perception in which categorization is domain-general but influenced by long-term categorical representations.

9.
An articulation index calculation procedure developed for use with individual normal-hearing listeners [C. Pavlovic and G. Studebaker, J. Acoust. Soc. Am. 75, 1606-1612 (1984)] was modified to account for the deterioration in suprathreshold speech processing produced by sensorineural hearing impairment. Data from four normal-hearing and four hearing-impaired subjects were used to relate the loss in hearing sensitivity to the deterioration in speech processing in quiet and in noise. The new procedure requires only hearing threshold measurements and consists of the following two modifications of the original AI procedure of Pavlovic and Studebaker (1984): The speech and noise spectrum densities are integrated over bandwidths which are, when expressed in decibels, larger than the critical bandwidths by 10% of the hearing loss. This is in contrast to the unmodified procedure, where integration is performed over critical bandwidths. The contribution of each frequency to the AI is the product of its contribution in the unmodified AI procedure and a "speech desensitization factor." The desensitization factor is specified as a function of the hearing loss. The predictive accuracies of both the unmodified and the modified calculation procedures were assessed by comparing the expected and observed speech recognition scores of four hearing-impaired subjects under various conditions of speech filtering and noise masking. The modified procedure appears accurate for general applications. In contrast, the unmodified procedure appears accurate only for applications where results obtained under various conditions on a single listener are compared to each other.
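The band-importance form of such an AI calculation can be sketched as a weighted sum. This is a schematic illustration only: the linear desensitization function below is a hypothetical placeholder, not the published factor, and real procedures also handle the widened integration bandwidths described above.

```python
def modified_ai(importance, audibility, hearing_loss_db, desensitize):
    """AI = sum_i I_i * A_i * D(HL_i): each band contributes its importance
    weight I_i times its audibility A_i (clamped to 0..1) times a speech
    desensitization factor D that decreases with hearing loss in that band."""
    return sum(I * min(max(a, 0.0), 1.0) * desensitize(hl)
               for I, a, hl in zip(importance, audibility, hearing_loss_db))

# Hypothetical placeholder: linear roll-off, fully desensitized at 100 dB HL.
def linear_desensitize(hl_db):
    return max(0.0, 1.0 - hl_db / 100.0)
```

With D ≡ 1 this reduces to the unmodified importance-weighted audibility sum; the desensitization factor is what lowers predicted scores for impaired ears even when speech is fully audible.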

10.
The effects of intensity on monosyllabic word recognition were studied in adults with normal hearing and mild-to-moderate sensorineural hearing loss. The stimuli were bandlimited NU#6 word lists presented in quiet and talker-spectrum-matched noise. Speech levels ranged from 64 to 99 dB SPL and S/N ratios from 28 to -4 dB. In quiet, the performance of normal-hearing subjects remained essentially constant; in noise, at a fixed S/N ratio, it decreased as a linear function of speech level. Hearing-impaired subjects performed like normal-hearing subjects tested in noise when the data were corrected for the effects of audibility loss. From these and other results, it was concluded that: (1) speech intelligibility in noise decreases when speech levels exceed 69 dB SPL and the S/N ratio remains constant; (2) the effects of speech and noise level are synergistic; (3) the deterioration in intelligibility can be modeled as a relative increase in the effective masking level; (4) normal-hearing and hearing-impaired subjects are affected similarly by increased signal level when differences in speech audibility are considered; (5) the negative effects of increasing speech and noise levels on speech recognition are similar for all adult subjects, at least up to 80 years; and (6) the effective dynamic range of speech may be larger than the commonly assumed value of 30 dB.

11.
Previous research has demonstrated reduced speech recognition when speech is presented at higher-than-normal levels (e.g., above conversational speech levels), particularly in the presence of speech-shaped background noise. Persons with hearing loss frequently listen to speech-in-noise at these levels through hearing aids, which incorporate multiple-channel, wide dynamic range compression. This study examined the interactive effects of signal-to-noise ratio (SNR), speech presentation level, and compression ratio on consonant recognition in noise. Nine subjects with normal hearing identified CV and VC nonsense syllables in a speech-shaped noise at two SNRs (0 and +6 dB), three presentation levels (65, 80, and 95 dB SPL) and four compression ratios (1:1, 2:1, 4:1, and 6:1). Stimuli were processed through a simulated three-channel, fast-acting, wide dynamic range compression hearing aid. Consonant recognition performance decreased as compression ratio increased and presentation level increased. Interaction effects were noted between SNR and compression ratio, as well as between presentation level and compression ratio. Performance decrements due to increases in compression ratio were larger at the better (+6 dB) SNR and at the lowest (65 dB SPL) presentation level. At higher levels (95 dB SPL), such as those experienced by persons with hearing loss, increasing compression ratio did not significantly affect speech intelligibility.
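The compression-ratio parameter manipulated above follows the standard static input/output rule of a wide dynamic range compressor. A minimal sketch, assuming a single channel, a 0 dB linear gain below threshold, and an illustrative 45 dB SPL compression threshold (the study's own kneepoints are not stated in the abstract):

```python
import numpy as np

def wdrc_gain_db(level_db, threshold_db=45.0, ratio=2.0):
    """Static WDRC rule: below the compression threshold the channel is
    linear; above it, every `ratio` dB of input level increase yields only
    1 dB of output increase. Returns the gain applied, in dB (<= 0 here)."""
    level = np.asarray(level_db, dtype=float)
    over = np.maximum(level - threshold_db, 0.0)     # dB above the kneepoint
    out = level - over * (1.0 - 1.0 / ratio)          # compressed output level
    return out - level
```

At a 2:1 ratio, a 65 dB SPL input sits 20 dB above the kneepoint and is attenuated by 10 dB; raising the ratio to 6:1 increases that attenuation, which is how higher ratios flatten the consonant-to-vowel level differences that the recognition task depends on.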

12.
Four experiments were carried out to examine listener- and talker-related factors that may influence degree of perceived foreign accent. In each, native English listeners rated English sentences for degree of accent. It was found that degree of accent is influenced by range effects. The larger the proportion of native (or near-native) speakers included in a set of sentences being evaluated, the more strongly accented listeners judged sentences spoken by non-native speakers to be. Foreign accent ratings were not stable. Listeners judged a set of non-native-produced sentences to be more strongly accented after, as compared to before, they became familiar with those sentences. One talker-related effect noted in the study was the finding that adults' pronunciation of an L2 may improve over time. Late L2 learners who had lived in the United States for an average of 14.3 years received significantly higher scores than late learners who had resided in the United States for 0.7 years. Another talker-related effect pertained to the age of L2 learning (AOL). Native Spanish subjects with an AOL of five to six years were not found to have an accent (i.e., to receive significantly lower scores than native English speakers), whereas native Chinese subjects with an average AOL of 7.6 years did have a measurable accent. The paper concludes with the presentation of several hypotheses concerning the relationship between AOL and degree of foreign accent.

13.
Articulation index (AI) theory was used to evaluate stop-consonant recognition of normal-hearing listeners and listeners with high-frequency hearing loss. From results reported in a companion article [Dubno et al., J. Acoust. Soc. Am. 85, 347-354 (1989)], a transfer function relating the AI to stop-consonant recognition was established, and a frequency importance function was determined for the nine stop-consonant-vowel syllables used as test stimuli. The calculations included the rms and peak levels of the speech that had been measured in 1/3 octave bands; the internal noise was estimated from the thresholds for each subject. The AI model was then used to predict performance for the hearing-impaired listeners. A majority of the AI predictions for the hearing-impaired subjects fell within +/- 2 standard deviations of the normal-hearing listeners' results. However, as observed in previous data, the AI tended to overestimate performance of the hearing-impaired listeners. The accuracy of the predictions decreased with the magnitude of high-frequency hearing loss. Thus, with the exception of performance for listeners with severe high-frequency hearing loss, the results suggest that poorer speech recognition among hearing-impaired listeners results from reduced audibility within critical spectral regions of the speech stimuli.

14.
To examine spectral and threshold effects for speech and noise at high levels, recognition of nonsense syllables was assessed for low-pass-filtered speech and speech-shaped maskers and high-pass-filtered speech and speech-shaped maskers at three speech levels, with signal-to-noise ratio held constant. Subjects were younger adults with normal hearing and older adults with normal hearing but significantly higher average quiet thresholds. A broadband masker was always present to minimize audibility differences between subject groups and across presentation levels. For subjects with lower thresholds, the declines in recognition of low-frequency syllables in low-frequency maskers were attributed to nonlinear growth of masking which reduced "effective" signal-to-noise ratio at high levels, whereas the decline for subjects with higher thresholds was not fully explained by nonlinear masking growth. For all subjects, masking growth did not entirely account for declines in recognition of high-frequency syllables in high-frequency maskers at high levels. Relative to younger subjects with normal hearing and lower quiet thresholds, older subjects with normal hearing and higher quiet thresholds had poorer consonant recognition in noise, especially for high-frequency speech in high-frequency maskers. Age-related effects on thresholds and task proficiency may be determining factors in the recognition of speech in noise at high levels.

15.
Previous research on foreign accent perception has largely focused on speaker-dependent factors such as age of learning and length of residence. Factors that are independent of a speaker's language learning history have also been shown to affect perception of second language speech. The present study examined the effects of two such factors--listening context and lexical frequency--on the perception of foreign-accented speech. Listeners rated foreign accent in two listening contexts: auditory-only, where listeners only heard the target stimuli, and auditory + orthography, where listeners were presented with both an auditory signal and an orthographic display of the target word. Results revealed that higher frequency words were consistently rated as less accented than lower frequency words. The effect of the listening context emerged in two interactions: the auditory + orthography context reduced the effects of lexical frequency, but increased the perceived differences between native and non-native speakers. Acoustic measurements revealed some production differences for words of different levels of lexical frequency, though these differences could not account for all of the observed interactions from the perceptual experiment. These results suggest that factors independent of the speakers' actual speech articulations can influence the perception of degree of foreign accent.

16.
In orthogonal polynomial compression, the short-term speech spectrum is first approximated by a family of orthogonal polynomials. The coefficients of each polynomial, which vary over time, are then adjusted in terms of their average value and range of variation. These adjustments can be used to compress (or expand) temporal variations in the average level, slope, and various forms of curvature of the short-term speech spectrum. The analysis and reconstruction of the short-term speech spectrum using orthogonal polynomials was implemented using a digital master hearing aid. This method of compression was evaluated on eight sensorineurally hearing-impaired listeners. Speech recognition scores were obtained for a range of compression conditions and input levels with and without frequency shaping. The results showed significant advantages over conventional linear amplification when temporal variations in the average level of the short-term spectrum were compressed, a result comparable to that obtained with conventional amplitude compression. A subset of the subjects showed further improvement when temporal variations in spectrum slope were compressed, but these subjects also showed similar improvements when frequency shaping was combined with level-only compression. None of the subjects showed improved speech recognition scores when variations in quadratic curvature were compressed in addition to level and slope compression.
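The decomposition described above can be sketched with Legendre polynomials over a normalized frequency axis: coefficient 0 tracks average level, coefficient 1 spectral slope, and higher orders curvature. The choice of Legendre polynomials and the compression-about-the-mean rule below are illustrative assumptions, not necessarily the master-hearing-aid implementation.

```python
import numpy as np

def spectrum_poly_coeffs(spectrum_db, order=3):
    """Least-squares Legendre fit of one short-term log-spectrum frame over a
    normalised frequency axis [-1, 1]. Returns coefficients lowest-order
    first: c[0] ~ average level, c[1] ~ slope, c[2:] ~ curvature terms."""
    x = np.linspace(-1.0, 1.0, len(spectrum_db))
    return np.polynomial.legendre.legfit(x, spectrum_db, order)

def compress_coeff_track(track, factor):
    """Compress the temporal variation of one coefficient about its mean:
    factor < 1 shrinks the range of variation, factor > 1 expands it."""
    track = np.asarray(track, dtype=float)
    return track.mean() + factor * (track - track.mean())
```

Applying `compress_coeff_track` to the frame-by-frame level coefficient with factor < 1 corresponds to the "level-only" compression condition; applying it to the slope coefficient as well corresponds to the level-plus-slope condition.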

17.
This study examined the perceptual specialization for native-language speech sounds, by comparing native Hindi and English speakers in their perception of a graded set of English /w/-/v/ stimuli that varied in similarity to natural speech. The results demonstrated that language experience does not affect general auditory processes for these types of sounds; there were strong cross-language differences for speech stimuli, and none for stimuli that were nonspeech. However, the cross-language differences extended into a gray area of speech-like stimuli that were difficult to classify, suggesting that the specialization occurred in phonetic processing prior to categorization.

18.
Effects of single-channel speech enhancement algorithms on the intelligibility of Mandarin speech
杨琳, 张建平, 颜永红. Acta Acustica (声学学报), 2010, 35(2): 248-253
This study examined the effects of several currently common single-channel speech enhancement algorithms on the intelligibility of Mandarin speech. Speech corrupted by different types of noise was processed with five single-channel enhancement algorithms and then presented to listeners with normal hearing for identification, to assess the intelligibility of the enhanced speech. The results show that the enhancement algorithms did not improve speech intelligibility. An analysis of the specific errors revealed that identification errors arose mainly from phoneme errors and had little to do with lexical tone. Moreover, compared with identification results for English, some enhancement algorithms affected Mandarin and English intelligibility significantly differently.

19.
Effects of age and mild hearing loss on speech recognition in noise
Using an adaptive strategy, the effects of mild sensorineural hearing loss and adult listeners' chronological age on speech recognition in babble were evaluated. The signal-to-babble ratio required to achieve 50% recognition was measured for three speech materials presented at soft to loud conversational speech levels. Four groups of subjects were tested: (1) normal-hearing listeners less than 44 years of age, (2) subjects less than 44 years old with mild sensorineural hearing loss and excellent speech recognition in quiet, (3) normal-hearing listeners older than 65 years, and (4) subjects older than 65 years with mild hearing loss and excellent performance in quiet. Groups 1 and 3, and groups 2 and 4, were matched on the basis of pure-tone thresholds and thresholds for each of the three speech materials presented in quiet. In addition, groups 1 and 2 were similar in terms of mean age and age range, as were groups 3 and 4. Differences in performance in noise as a function of age were observed for both normal-hearing and hearing-impaired listeners despite equivalent performance in quiet. Subjects with mild hearing loss performed significantly worse than their normal-hearing counterparts. These results and their implications are discussed.

20.
The purpose of the present study was to examine the benefits of providing audible speech to listeners with sensorineural hearing loss when the speech is presented in a background noise. Previous studies have shown that when listeners have a severe hearing loss in the higher frequencies, providing audible speech (in a quiet background) to these higher frequencies usually results in no improvement in speech recognition. In the present experiments, speech was presented in a background of multitalker babble to listeners with various severities of hearing loss. The signal was low-pass filtered at numerous cutoff frequencies and speech recognition was measured as additional high-frequency speech information was provided to the hearing-impaired listeners. It was found in all cases, regardless of hearing loss or frequency range, that providing audible speech resulted in an increase in recognition score. The change in recognition as the cutoff frequency was increased, along with the amount of audible speech information in each condition (articulation index), was used to calculate the "efficiency" of providing audible speech. Efficiencies were positive for all degrees of hearing loss. However, the gains in recognition were small, and the maximum score obtained by any listener was low, due to the noise background. An analysis of error patterns showed that due to the limited speech audibility in a noise background, even severely impaired listeners used additional speech audibility in the high frequencies to improve their perception of the "easier" features of speech, including voicing.
