Similar Articles
20 similar articles found.
1.
In order to determine the effects of hearing loss and spectral shaping on a dynamic spectral speech cue, behavioral identification and neural response patterns of stop-consonant stimuli varying along the /b-d-g/ place-of-articulation continuum were measured from 11 young adults (mean age = 27 years) and 10 older adults (mean age = 55.2 years) with normal hearing, and compared to those from 10 older adults (mean age = 61.3 years) with mild-to-moderate hearing impairment. Psychometric functions and N1-P2 cortical evoked responses were obtained using consonant-vowel (CV) stimuli with frequency-independent (unshaped) amplification as well as with frequency-dependent (shaped) amplification that enhanced F2 relative to the rest of the stimulus. Results indicated that behavioral identification and neural response patterns of stop-consonant CVs were affected primarily by aging and secondarily by age-related hearing loss. Further, enhancing the audibility of the F2 transition cue with spectrally shaped amplification partially reduced the effects of age-related hearing loss on categorization ability but not neural response patterns of stop-consonant CVs. These findings suggest that aging affects excitatory and inhibitory processes and may contribute to the perceptual differences of dynamic spectral cues seen in older versus young adults. Additionally, age and age-related hearing loss may have separate influences on neural function.

2.
In contrast to the availability of consonant confusion studies with adults, to date, no investigators have compared children's consonant confusion patterns in noise to those of adults in a single study. To examine whether children's error patterns are similar to those of adults, three groups of children (24 each aged 4-5, 6-7, and 8-9 years) and 24 adult native speakers of American English (AE) performed a recognition task for 15 AE consonants in /ɑ/-consonant-/ɑ/ nonsense syllables presented in a background of speech-shaped noise. Three signal-to-noise ratios (SNR: 0, +5, and +10 dB) were used. Although performance improved as a function of age, overall consonant recognition accuracy as a function of SNR improved at a similar rate for all groups. Detailed analyses using phonetic features (manner, place, and voicing) revealed that stop consonants were the most problematic for all groups. In addition, for the younger children, front consonants presented in the 0 dB SNR condition were more error prone than others. These results suggest that children's use of phonetic cues does not develop at the same rate for all phonetic features.
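The SNR manipulation used in studies like this is standard: the masking noise is scaled so that the speech-to-noise power ratio hits the target value before the two signals are summed. A minimal sketch, assuming a mono float array and a noise recording at least as long as the speech (the helper name is illustrative):

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals snr_db, then mix.
    Assumes `noise` is at least as long as `speech`."""
    noise = noise[:len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Required noise power for the target SNR: P_n = P_s / 10^(SNR/10)
    target_p_noise = p_speech / (10 ** (snr_db / 10))
    return speech + noise * np.sqrt(target_p_noise / p_noise)

# e.g., the three conditions above: 0, +5, and +10 dB SNR
# stimuli = [mix_at_snr(syllable, ssn, snr) for snr in (0, 5, 10)]
```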

3.
The purpose of this study is to specify the contribution of certain frequency regions to consonant place perception for normal-hearing listeners and listeners with high-frequency hearing loss, and to characterize the differences in stop-consonant place perception among these listeners. Stop-consonant recognition and error patterns were examined at various speech-presentation levels and under conditions of low- and high-pass filtering. Subjects included 18 normal-hearing listeners and a homogeneous group of 10 young, hearing-impaired individuals with high-frequency sensorineural hearing loss. Differential filtering effects on consonant place perception were consistent with the spectral composition of acoustic cues. Differences in consonant recognition and error patterns between normal-hearing and hearing-impaired listeners were observed when the stimulus bandwidth included regions of threshold elevation for the hearing-impaired listeners. Thus place-perception differences among listeners are, for the most part, associated with stimulus bandwidths corresponding to regions of hearing loss.
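Low- and high-pass filtered stimulus conditions of this kind can be produced with ordinary IIR filtering. A minimal sketch, assuming a Butterworth design and an illustrative cutoff; the study's exact filter slopes and cutoffs may differ:

```python
from scipy.signal import butter, sosfiltfilt

def filtered_condition(speech, fs, cutoff_hz, kind="low"):
    """Return a low- or high-pass filtered stimulus (kind: "low" or "high")."""
    sos = butter(6, cutoff_hz / (fs / 2), btype=kind, output="sos")
    return sosfiltfilt(sos, speech)  # zero-phase, so no timing distortion

# e.g., a complementary 2-kHz low-/high-pass pair at fs = 16 kHz:
# lp = filtered_condition(x, 16000, 2000, "low")
# hp = filtered_condition(x, 16000, 2000, "high")
```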

4.
Many studies have noted great variability in speech perception ability among postlingually deafened adults with cochlear implants. This study examined phoneme misperceptions for 30 cochlear implant listeners using either the Nucleus-22 or Clarion version 1.2 device to examine whether listeners with better overall speech perception differed qualitatively from poorer listeners in their perception of vowel and consonant features. In the first analysis, simple regressions were used to predict the mean percent-correct scores for consonants and vowels for the better group of listeners from those of the poorer group. A strong relationship between the two groups was found for consonant identification, and a weak, nonsignificant relationship was found for vowel identification. In the second analysis, it was found that less information was transmitted for consonant and vowel features to the poorer listeners than to the better listeners; however, the pattern of information transmission was similar across groups. Taken together, results suggest that the performance difference between the two groups is primarily quantitative. The results underscore the importance of examining individuals' perception of individual phoneme features when attempting to relate speech perception to other predictor variables.
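The feature-level analysis referred to here is the classic relative information-transfer measure in the Miller and Nicely tradition: collapse the phoneme confusion matrix by one feature (e.g., voicing), then compute the proportion of input entropy transmitted. A sketch, assuming a square matrix of response counts; the function name and feature labels are illustrative:

```python
import numpy as np

def feature_information_transfer(confusions, feature_of):
    """Relative information transmitted for one phoneme feature (0..1).
    `confusions[i, j]` counts stimulus i heard as response j; `feature_of[i]`
    gives phoneme i's feature category."""
    labels = np.asarray(feature_of)
    cats = np.unique(labels)
    # Collapse the phoneme-level confusions into a feature-level matrix.
    m = np.zeros((len(cats), len(cats)))
    for a, ca in enumerate(cats):
        for b, cb in enumerate(cats):
            m[a, b] = confusions[np.ix_(labels == ca, labels == cb)].sum()
    p = m / m.sum()
    px = p.sum(axis=1, keepdims=True)  # stimulus-category probabilities
    py = p.sum(axis=0, keepdims=True)  # response-category probabilities
    nz = p > 0
    mutual_info = (p[nz] * np.log2(p[nz] / (px * py)[nz])).sum()
    px_nz = px[px > 0]
    input_entropy = -(px_nz * np.log2(px_nz)).sum()
    return mutual_info / input_entropy  # 1.0 = feature perfectly transmitted

# e.g., voicing for the consonant set /p t b d/:
# it = feature_information_transfer(conf_matrix, ["vl", "vl", "vd", "vd"])
```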

5.
The purpose of this study was to determine whether children give more perceptual weight than do adults to dynamic spectral cues versus static cues. Listeners were 10 children between the ages of 3;8 and 4;1 (mean 3;11) and 10 adults between the ages of 23;10 and 32;0 (mean 25;11). Three experimental stimulus conditions were presented, each containing stimuli of 30 ms duration. The first condition consisted of unchanging formant onset frequencies ranging in value from frequencies for [i] to those for [a], appropriate for a bilabial stop consonant context. The other two conditions consisted of either an [i] or [a] onset frequency with a 25 ms portion of a formant transition whose trajectory was toward one of a series of target frequencies ranging from those for [i] to those for [a]. Results indicated that the children weighted both the [a]- and [i]-onset formant frequency cues differently than the adults did when identifying the vowels. The adults gave more equal weight to the [i]-onset and [a]-onset dynamic cues, as reflected in category boundaries, than the children did. For the [i]-onset condition, slope analyses indicated that the children were less confident than the adults in their vowel perception.

6.
Speech waveform envelope cues for consonant recognition
This study investigated the cues for consonant recognition that are available in the time-intensity envelope of speech. Twelve normal-hearing subjects listened to three sets of spectrally identical noise stimuli created by multiplying noise with the speech envelopes of 19 /aCa/ natural-speech nonsense syllables. The speech envelope for each of the three noise conditions was derived using a different low-pass filter cutoff (20, 200, and 2000 Hz). Average consonant identification performance was above chance for the three noise conditions and improved significantly with the increase in envelope bandwidth from 20 to 200 Hz. SINDSCAL multidimensional scaling analysis of the consonant confusion data identified three speech envelope features that divided the 19 consonants into four envelope feature groups ("envemes"). The enveme groups in combination with visually distinctive speech feature groupings ("visemes") can distinguish most of the 19 consonants. These results suggest that near-perfect consonant identification performance could be attained by subjects who receive only enveme and viseme information and no spectral information.
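The stimulus construction described above amounts to envelope extraction at a chosen bandwidth followed by noise modulation. A minimal sketch, assuming a Hilbert envelope and a Butterworth lowpass, which is one common realization rather than necessarily the study's exact method:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def envelope_noise(speech, fs, cutoff_hz):
    """Multiply noise by the speech envelope, lowpass-filtered at cutoff_hz,
    so only time-intensity cues survive."""
    env = np.abs(hilbert(speech))                 # instantaneous amplitude
    b, a = butter(4, cutoff_hz / (fs / 2), btype="low")
    env = filtfilt(b, a, env)                     # limit to the cue bandwidth
    env = np.clip(env, 0.0, None)                 # filtering can ring below zero
    noise = np.random.default_rng(0).standard_normal(len(speech))
    return env * noise

# e.g., the three conditions in the study: 20, 200, and 2000 Hz cutoffs
# stimuli = [envelope_noise(x, 16000, fc) for fc in (20, 200, 2000)]
```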

7.
The purpose of this study was to examine the effect of spectral-cue audibility on the recognition of stop consonants in normal-hearing and hearing-impaired adults. Subjects identified six synthetic CV speech tokens in a closed-set response task. Each syllable differed only in the initial 40-ms consonant portion of the stimulus. In order to relate performance to spectral-cue audibility, the initial 40 ms of each CV were analyzed via FFT and the resulting spectral array was passed through a sliding-filter model of the human auditory system to account for logarithmic representation of frequency and the summation of stimulus energy within critical bands. This allowed the spectral data to be displayed in comparison to a subject's sensitivity thresholds. For normal-hearing subjects, an orderly function relating the percentage of audible stimulus to recognition performance was found, with perfect discrimination performance occurring when the bulk of the stimulus spectrum was presented at suprathreshold levels. For the hearing-impaired subjects, however, it was found in many instances that suprathreshold presentation of stop-consonant spectral cues did not yield recognition equivalent to that found for the normal-hearing subjects. These results demonstrate that while the audibility of individual stop consonants is an important factor influencing recognition performance in hearing-impaired subjects, it is not always sufficient to explain the effects of sensorineural hearing loss.
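The core of such an audibility analysis is summing FFT energy within critical-band-like channels so the stimulus spectrum can be compared against a listener's thresholds band by band. A sketch under assumed band edges (the edges shown are Bark-like values for illustration, not the study's model):

```python
import numpy as np

def critical_band_levels(segment, fs, band_edges_hz):
    """Sum windowed-FFT energy of a stimulus segment (e.g., the initial 40 ms)
    within critical-band-like channels; returns one level per band in dB."""
    spec = np.abs(np.fft.rfft(segment * np.hanning(len(segment)))) ** 2
    freqs = np.fft.rfftfreq(len(segment), 1 / fs)
    levels = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        band_energy = spec[(freqs >= lo) & (freqs < hi)].sum()
        levels.append(10 * np.log10(band_energy + 1e-12))  # dB re arbitrary ref
    return np.array(levels)

# e.g., Bark-like edges in Hz (an assumption for illustration):
# edges = [100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270,
#          1480, 1720, 2000, 2320, 2700, 3150, 3700, 4400]
```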

8.
Auditory perception of vowels and consonants in speech
This paper reviews research on the auditory perception of vowels and consonants in speech. More than 80 years ago, authoritative experiments based on nonsense syllables indicated that consonants are the more important for human speech perception; owing to the experimenters' academic standing and authority, this conclusion became received wisdom, until experiments based on natural sentences challenged it nearly 20 years ago and set off a new round of research. This paper systematically surveys the relative importance of vowels and consonants for speech perception, the influence of their steady-state information and dynamic boundary information on speech perception, and potential applications of this line of research, and closes with a summary and outlook.

9.
The multidimensional phoneme identification model is applied to consonant confusion matrices obtained from 28 postlingually deafened cochlear implant users. This model predicts consonant matrices based on these subjects' ability to discriminate a set of postulated spectral, temporal, and amplitude speech cues as presented to them by their device. The model produced confusion matrices that matched many aspects of individual subjects' consonant matrices, including information transfer for the voicing, manner, and place features, despite individual differences in age at implantation, implant experience, device and stimulation strategy used, as well as overall consonant identification level. The model was able to match the general pattern of errors between consonants, but not the full complexity of all consonant errors made by each individual. The present study represents an important first step in developing a model that can be used to test specific hypotheses about the mechanisms cochlear implant users employ to understand speech.

10.
When it comes to making decisions regarding vowel quality, adults seem to weight dynamic syllable structure more strongly than static structure, although disagreement exists over the nature of the most relevant kind of dynamic structure: spectral change intrinsic to the vowel, or structure arising from movements between consonant and vowel constrictions. Results have been even less clear regarding the signal components children use in making vowel judgments. In this experiment, listeners of four different ages (adults, and 3-, 5-, and 7-year-old children) were asked to label stimuli that sounded either like steady-state vowels or like CVC syllables which sometimes had middle sections masked by coughs. Four vowel contrasts were used, crossed for type (front/back or closed/open) and consonant context (strongly or only slightly constraining of vowel tongue position). All listeners recognized vowel quality with high levels of accuracy in all conditions, but children were disproportionately hampered by strong coarticulatory effects when only steady-state formants were available. Results clarified past studies, showing that dynamic structure is critical to vowel perception for listeners of all ages, but particularly for young children, and that it is the dynamic structure arising from vocal-tract movement between consonant and vowel constrictions that is most important.

11.
Cochlear implants provide users with limited spectral and temporal information. In this study, the amount of spectral and temporal information was systematically varied through simulations of cochlear implant processors using a noise-excited vocoder. Spectral information was controlled by varying the number of channels between 1 and 16, and temporal information was controlled by varying the lowpass cutoff frequencies of the envelope extractors from 1 to 512 Hz. Consonants and vowels processed using those conditions were presented to seven normal-hearing native-English-speaking listeners for identification. The results demonstrated that both spectral and temporal cues were important for consonant and vowel recognition, with the spectral cues having a greater effect than the temporal cues for the ranges of numbers of channels and lowpass cutoff frequencies tested. The lowpass cutoff for asymptotic performance in consonant and vowel recognition was 16 and 4 Hz, respectively. The number of channels at which performance plateaued for consonants and vowels was 8 and 12, respectively. Within the above-mentioned ranges of lowpass cutoff frequency and number of channels, the temporal and spectral cues showed a tradeoff for phoneme recognition. Information transfer analyses showed different relative contributions of spectral and temporal cues in the perception of various phonetic/acoustic features.
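A noise-excited vocoder of the kind simulated here splits speech into bands, lowpass-filters each band's envelope at the chosen cutoff, and re-imposes the envelopes on bandlimited noise carriers. A compact sketch; the band edges, filter orders, and analysis range are assumptions (fs of at least 16 kHz assumed for the 7-kHz upper edge):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocoder(speech, fs, n_channels, env_cutoff_hz, f_lo=100.0, f_hi=7000.0):
    """Noise-excited vocoder: n_channels log-spaced bands, envelopes
    lowpassed at env_cutoff_hz, noise carriers filtered to the same bands."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # assumed analysis range
    rng = np.random.default_rng(0)
    out = np.zeros_like(speech, dtype=float)
    sos_env = butter(2, env_cutoff_hz / (fs / 2), btype="low", output="sos")
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band", output="sos")
        band = sosfiltfilt(sos, speech)
        env = sosfiltfilt(sos_env, np.abs(hilbert(band)))  # smoothed envelope
        env = np.clip(env, 0.0, None)
        carrier = sosfiltfilt(sos, rng.standard_normal(len(speech)))
        out += env * carrier
    return out

# e.g., one cell of the study's grid: 8 channels, 16-Hz envelope cutoff
# y = noise_vocoder(x, 16000, 8, 16)
```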

12.
Speech perception requires the integration of information from multiple phonetic and phonological dimensions. A sizable literature exists on the relationships between multiple phonetic dimensions and single phonological dimensions (e.g., spectral and temporal cues to stop consonant voicing). A much smaller body of work addresses relationships between phonological dimensions, and much of this has focused on sequences of phones. However, strong assumptions about the relevant set of acoustic cues and/or the (in)dependence between dimensions limit previous findings in important ways. Recent methodological developments in the general recognition theory framework enable tests of a number of these assumptions and provide a more complete model of distinct perceptual and decisional processes in speech sound identification. A hierarchical Bayesian Gaussian general recognition theory model was fit to data from two experiments investigating identification of English labial stop and fricative consonants in onset (syllable initial) and coda (syllable final) position. The results underscore the importance of distinguishing between conceptually distinct processing levels and indicate that, for individual subjects and at the group level, integration of phonological information is partially independent with respect to perception and that patterns of independence and interaction vary with syllable position.

13.
This study investigated age-related differences in sensitivity to temporal cues in modified natural speech sounds. Listeners included young noise-masked subjects, elderly normal-hearing subjects, and elderly hearing-impaired subjects. Four speech continua were presented to listeners, with stimuli from each continuum varying in a single temporal dimension. The acoustic cues varied in separate continua were voice-onset time, vowel duration, silence duration, and transition duration. In separate conditions, the listeners identified the word stimuli, discriminated two stimuli in a same-different paradigm, and discriminated two stimuli in a 3-interval, 2-alternative forced-choice procedure. Results showed age-related differences in the identification function crossover points for the continua that varied in silence duration and transition duration. All listeners demonstrated shorter difference limens (DLs) for the three-interval paradigm than the two-interval paradigm, with older hearing-impaired listeners showing larger DLs than the other listener groups for the silence duration cue. The findings support the general hypothesis that aging can influence the processing of specific temporal cues that are related to consonant manner distinctions.
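The crossover points reported in identification experiments like this are the 50% points of fitted psychometric functions. One common way to obtain them is a two-parameter logistic fit along each continuum; a sketch using SciPy's curve_fit, with made-up data in the usage comment purely for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, crossover, slope):
    """Two-parameter logistic psychometric function."""
    return 1.0 / (1.0 + np.exp(-slope * (x - crossover)))

def fit_crossover(cue_values, prop_responses):
    """Fit identification proportions along a temporal-cue continuum and
    return the 50% crossover point and the slope."""
    p0 = [np.median(cue_values), 1.0]  # rough starting guesses
    params, _ = curve_fit(logistic, cue_values, prop_responses, p0=p0, maxfev=10000)
    return params  # (crossover, slope)

# e.g., a silence-duration continuum (ms) and proportion of "stop" responses:
# crossover, slope = fit_crossover([0, 10, 20, 30, 40, 50],
#                                  [0.02, 0.10, 0.35, 0.70, 0.90, 0.98])
```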

14.
The aim of the study was to establish whether /u/-fronting, a sound change in progress in Standard Southern British English, could be linked synchronically to the fronting effects of a preceding anterior consonant in both speech production and speech perception. For the production study, which consisted of acoustic analyses of isolated monosyllables produced by two different age groups, it was shown that in younger speakers /u/ was phonetically fronted and the coarticulatory influence of consonants on /u/ was smaller than in older speakers. For the perception study, responses were elicited from the same subjects to two minimal word-pair continua that differed in the direction of the consonants' coarticulatory fronting effects on /u/. Consistent with their speech production, young listeners' /u/ category boundary was shifted toward /i/, and they compensated perceptually less for the fronting effects of the consonants on /u/ than older listeners did. The findings support Ohala's model in which certain sound changes can be linked to the listener's failure to compensate for coarticulation. The results are also shown to be consistent with episodic models of speech perception in which phonological frequency effects bring about a realignment of the variants of a phonological category in speech production and perception.

15.
Previous studies have demonstrated that normal-hearing listeners can understand speech using the recovered "temporal envelopes," i.e., amplitude modulation (AM) cues recovered from frequency modulation (FM). This study evaluated this mechanism in cochlear implant (CI) users for consonant identification. Stimuli containing only FM cues were created using 1-, 2-, 4-, and 8-band FM vocoders to determine whether consonant identification performance would improve as the recovered AM cues became more available. A consistent improvement was observed as the band number decreased from 8 to 1, supporting the hypotheses that (1) the CI sound processor generates recovered AM cues from broadband FM, and (2) CI users can use the recovered AM cues to recognize speech. The correlation between the intact and the recovered AM components at the output of the sound processor was also generally higher when the band number was low, supporting the consonant identification results. Moreover, CI subjects who were better at using recovered AM cues from broadband FM showed better identification performance with intact (unprocessed) speech stimuli. This suggests that speech perception performance variability in CI users may be partly caused by differences in their ability to use AM cues recovered from FM speech cues.
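An FM-only band can be sketched as keeping a band's instantaneous phase while flattening its envelope, so AM is discarded and only FM survives; narrower analysis bands (higher band counts) constrain the FM and limit how much AM can later be recovered. A minimal sketch under assumed band edges and filter orders:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def fm_only_band(speech, fs, lo, hi):
    """Keep a band's FM (instantaneous phase) but discard its AM by
    replacing the envelope with a constant (the band's RMS level)."""
    sos = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band", output="sos")
    band = sosfiltfilt(sos, speech)
    phase = np.angle(hilbert(band))      # carries the FM cue
    rms = np.sqrt(np.mean(band ** 2))
    return rms * np.cos(phase)           # flat-envelope, FM-preserving band

def fm_vocoder(speech, fs, n_bands, f_lo=80.0, f_hi=7000.0):
    """Sum FM-only bands; n_bands of 1, 2, 4, or 8 as in the study."""
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)
    return sum(fm_only_band(speech, fs, lo, hi)
               for lo, hi in zip(edges[:-1], edges[1:]))
```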

16.
Multichannel cochlear implant users vary greatly in their word-recognition abilities. This study examined whether their word recognition was related to the use of either highly dynamic or relatively steady-state vowel cues contained in /bVb/ and /wVb/ syllables. Nine conditions were created containing different combinations of formant transition, steady-state, and duration cues. Because processor strategies differ, the ability to perceive static and dynamic information may depend on the type of cochlear implant used. Ten Nucleus and ten Ineraid subjects participated, along with 12 normal-hearing control subjects. Vowel identification did not differ between the implanted groups, but both were significantly poorer at identifying vowels than the normal-hearing group. Vowel identification was best when at least two kinds of cues were available. Using only one type of cue, performance was better with excised vowels containing steady-state formants than in "vowelless" syllables, where the center vocalic portion was deleted and the transitions were joined. In the latter syllable type, Nucleus subjects identified vowels significantly better when /b/ was the initial consonant; the other two groups were not affected by the specific consonantal context. Cochlear implant subjects' word recognition was positively correlated with the use of dynamic vowel cues, but not with steady-state cues.

17.
The perception of voicing in final velar stop consonants was investigated by systematically varying vowel duration, change in offset frequency of the final first-formant (F1) transition, and rate of frequency change in the final F1 transition for several vowel contexts. Consonant-vowel-consonant (CVC) continua were synthesized for each of three vowels, [i, I, ae], which represent a range of relatively low to relatively high F1 steady-state values. Subjects responded to the stimuli under both an open- and a closed-response condition. Results show that both vowel duration and F1 offset properties influence perception of final consonant voicing, with the salience of the F1 offset property higher for vowels with high F1 steady-state frequencies than for those with low F1 steady-state frequencies, and the opposite holding for the vowel duration property. When F1 onset and offset frequencies were controlled, the rate of F1 transition change had inconsistent and minimal effects on perception of final consonant voicing. Thus the findings suggest that it is the termination value of the F1 offset transition, rather than the rate and/or duration of frequency change, that cues voicing in final velar stop consonants during the transition period preceding closure.

18.
Weak consonants (e.g., stops) are more susceptible to noise than vowels, owing partially to their lower intensity. This raises the question whether hearing-impaired (HI) listeners are able to perceive (and utilize effectively) the high-frequency cues present in consonants. To answer this question, HI listeners were presented with clean (noise absent) weak consonants in otherwise noise-corrupted sentences. Results indicated that HI listeners received significant benefit in intelligibility (4 dB decrease in speech reception threshold) when they had access to clean consonant information. At extremely low signal-to-noise ratio (SNR) levels, however, HI listeners received only 64% of the benefit obtained by normal-hearing listeners. This lack of equitable benefit was investigated in Experiment 2 by testing the hypothesis that the high-frequency cues present in consonants were not audible to HI listeners. This was tested by selectively amplifying the noisy consonants while leaving the noisy sonorant sounds (e.g., vowels) unaltered. Listening tests indicated small (~10%), but statistically significant, improvements in intelligibility at low SNR conditions when the consonants were amplified in the high-frequency region. Selective consonant amplification provided reliable low-frequency acoustic landmarks that in turn facilitated a better lexical segmentation of the speech stream and contributed to the small improvement in intelligibility.
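The selective amplification manipulation can be sketched as adding a gained high-frequency component to labeled consonant segments only, leaving sonorant segments untouched. The split frequency, gain, and the assumption that segment boundaries are known from a transcription are all illustrative choices, not the paper's exact parameters:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def amplify_consonants(noisy, fs, consonant_spans, gain_db=10.0, split_hz=2000.0):
    """Boost the high-frequency region of labeled consonant segments.
    `consonant_spans` holds (start_sample, end_sample) pairs."""
    sos_hi = butter(4, split_hz / (fs / 2), btype="high", output="sos")
    out = noisy.astype(float).copy()
    gain = 10 ** (gain_db / 20)
    for start, end in consonant_spans:
        seg = noisy[start:end]
        hi = sosfiltfilt(sos_hi, seg)
        # Adding (gain - 1) * highpassed part boosts only the >split_hz region.
        out[start:end] = seg + (gain - 1.0) * hi
    return out
```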

19.
Dichotic pitch perception reflects the auditory system's use of binaural cues to perceptually separate different sound sources and to determine the spatial location of sounds. Several studies were conducted to identify factors that influence children's dichotic pitch perception thresholds. An initial study of school children revealed an age-related improvement in thresholds for lateralizing dichotic pitch tones. In subsequent studies potential sensory and nonsensory limitations on young children's performance of dichotic pitch lateralization tasks were examined. A training study showed that with sufficient practice, young children lateralize dichotic pitch stimuli as well as adults, indicating an age difference in perceptual learning of the lateralization task. Changing the task requirements so that young children made a judgment about the pitch of dichotic pitch tones, rather than the spatial location of the tones, also resulted in significantly better thresholds. These findings indicate that nonsensory factors limit young children's performance of dichotic pitch tasks.

20.
Overlap-masking degrades speech intelligibility in reverberation [R. H. Bolt and A. D. MacDonald, J. Acoust. Soc. Am. 21(6), 577-580 (1949)]. To reduce this degradation, steady-state suppression has been proposed as a preprocessing technique [Arai et al., Proc. Autumn Meet. Acoust. Soc. Jpn., 2001; Acoust. Sci. Tech. 23(8), 229-232 (2002)]. This technique automatically suppresses steady-state portions of speech, which carry more energy but are less crucial for speech perception. The present paper explores the effect of steady-state suppression on identification of syllables preceded by /a/ under various reverberant conditions. In each of two perception experiments, stimuli were presented to 22 subjects with normal hearing. The stimuli consisted of monosyllables in a carrier phrase, with and without steady-state suppression, presented under different reverberant conditions using artificial impulse responses. The results indicate that steady-state suppression statistically improves consonant identification for reverberation times of 0.7 to 1.2 s. Analysis of confusion matrices shows that identification of voiced consonants, stop and nasal consonants, and bilabial, alveolar, and velar consonants was especially improved by steady-state suppression. Steady-state suppression is thus demonstrated to be an effective preprocessing method for improving syllable identification by reducing overlap-masking under specific reverberant conditions.
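Steady-state suppression needs two parts: a detector for spectrally stable stretches and an attenuator. The sketch below uses a simple frame-to-frame spectral-change criterion as a stand-in for the authors' detector; the frame size, change threshold, and attenuation level are all assumptions for illustration:

```python
import numpy as np

def suppress_steady_state(speech, fs, frame_ms=20, atten_db=-10.0, thresh=0.2):
    """Attenuate non-overlapping frames whose short-time spectrum changes
    little from the previous frame (i.e., steady-state portions)."""
    frame = int(fs * frame_ms / 1000)
    win = np.hanning(frame)
    out = speech.astype(float).copy()
    prev = None
    for start in range(0, len(speech) - frame + 1, frame):
        mag = np.abs(np.fft.rfft(speech[start:start + frame] * win))
        if prev is not None:
            # Normalized spectral change; small values mean a steady frame.
            change = np.linalg.norm(mag - prev) / (np.linalg.norm(prev) + 1e-12)
            if change < thresh:
                out[start:start + frame] *= 10 ** (atten_db / 20)
        prev = mag
    return out
```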
