Similar literature
 20 similar documents found; search time 46 ms
1.
Cues to the voicing distinction for final /f,s,v,z/ were assessed for 24 impaired- and 11 normal-hearing listeners. In base-line tests the listeners identified the consonants in recorded /dʌC/ syllables. To assess the importance of various cues, tests were conducted of the syllables altered by deletion and/or temporal adjustment of segments containing acoustic patterns related to the voicing distinction for the fricatives. The results showed that decreasing the duration of /ʌ/ preceding /v/ or /z/, and lengthening the /ʌ/ preceding /f/ or /s/, considerably reduced the correctness of voicing perception for the hearing-impaired group, while showing no effect for the normal-hearing group. For the normals, voicing perception deteriorated for /f/ and /s/ when the frications were deleted from the syllables, and for /v/ and /z/ when the vowel offsets were removed from the syllables with duration-adjusted vowels and deleted frications. We conclude that some hearing-impaired listeners rely to a greater extent on vowel duration as a voicing cue than do normal-hearing listeners.

2.
It has been proposed that young children may have a perceptual preference for transitional cues [Nittrouer, S. (2002). J. Acoust. Soc. Am. 112, 711-719]. According to this proposal, this preference can manifest itself either as heavier weighting of transitional cues by children than by adults, or as heavier weighting of transitional cues than of other, more static, cues by children. This study tested this hypothesis by examining adults' and children's cue weighting for the contrasts /saI/-/ʃaI/, /de/-/be/, /ta/-/da/, and /ti/-/di/. Children were found to weight transitions more heavily than did adults for the fricative contrast /saI/-/ʃaI/, and were found to weight transitional cues more heavily than nontransitional cues for the voice-onset-time contrast /ta/-/da/. However, these two patterns of cue weighting were not found to hold for the contrasts /de/-/be/ and /ti/-/di/. Consistent with several studies in the literature, results suggest that children do not always show a bias towards vowel-formant transitions, but that cue weighting can differ according to segmental context, and possibly the physical distinctiveness of available acoustic cues.

3.
An examination of the effect of phrase-final lengthening on the temporal correlates of voicing in syllable-final /s/ and /z/ was conducted. Discriminant analyses revealed that a combination of vowel duration, frication duration, and the duration of simultaneous voicing and frication was quite successful in determining voicing independently of phrase-final lengthening. Two perceptual experiments revealed that human listeners' recognition of the segments does benefit from hearing the syllables in sentential context as opposed to when they are excised from context and presented in isolation. The benefit was greatest for /s/ in phrase-final position and /z/ in phrase-internal position. This suggests that the presence of sentential context allows listeners to factor out the influence of phrase-final lengthening on vowel duration and to more accurately interpret this cue to voicing of the final fricative. These findings extend previous results on rate-dependent processing of overall speaking rate to the processing of local speaking rate. By doing so, they provide further evidence of the importance of extended phonetic context in speech recognition.

4.
The primary aim of this study was to determine if adults whose native language permits neither voiced nor voiceless stops to occur in word-final position can master the English word-final /t/-/d/ contrast. Native English-speaking listeners identified the voicing feature in word-final stops produced by talkers in five groups: native speakers of English, experienced and inexperienced native Spanish speakers of English, and experienced and inexperienced native Mandarin speakers of English. Contrary to hypothesis, the experienced second language (L2) learners' stops were not identified significantly better than stops produced by the inexperienced L2 learners; and their stops were correctly identified significantly less often than stops produced by the native English speakers. Acoustic analyses revealed that the native English speakers made vowels significantly longer before /d/ than /t/, produced /t/-final words with a higher F1 offset frequency than /d/-final words, produced more closure voicing in /d/ than /t/, and sustained closure longer for /t/ than /d/. The L2 learners produced the same kinds of acoustic differences between /t/ and /d/, but theirs were usually of significantly smaller magnitude. Taken together, the results suggest that only a few of the 40 L2 learners examined in the present study had mastered the English word-final /t/-/d/ contrast. Several possible explanations for this negative finding are presented. Multiple regression analyses revealed that the native English listeners made perceptual use of the small, albeit significant, vowel duration differences produced in minimal pairs by the nonnative speakers. A significantly stronger correlation existed between vowel duration differences and the listeners' identifications of final stops in minimal pairs when the perceptual judgments were obtained in an "edited" condition (where post-vocalic cues were removed) than in a "full cue" condition. This suggested that listeners may modify their identification of stops based on the availability of acoustic cues.

5.
The two principal sources of sound in speech, voicing and frication, occur simultaneously in voiced fricatives as well as at the vowel-fricative boundary in phonologically voiceless fricatives. Instead of simply overlapping, the two sources interact. This paper is an acoustic study of one such interaction effect: the amplitude modulation of the frication component when voicing is present. Corpora of sustained and fluent-speech English fricatives were recorded and analyzed using a signal-processing technique designed to extract estimates of modulation depth. Results reveal a pattern, consistent across speaking style, speaker, and place of articulation, for modulation at f0 to rise at low voicing strengths and subsequently saturate. Voicing strength needed to produce saturation varied 60-66 dB across subjects and experimental conditions. Modulation depths at saturation varied little across speakers but significantly for place of articulation (with [z] showing particularly strong modulation), clustering at approximately 0.4-0.5 (a 40%-50% fluctuation above and below unmodulated amplitude); spectral analysis of modulating signals revealed weak but detectable modulation at the second and third harmonics (i.e., 2f0 and 3f0).

6.
Perceptual equivalence of acoustic cues that differentiate /r/ and /l/   Total citations: 1, self-citations: 0, citations by others: 1
The perceptual effects of orthogonal variations in two acoustic parameters which differentiate American English prevocalic /r/ and /l/ were examined. A spectral cue (frequency onset and transition of F2 and F3) and a temporal cue (relative duration of initial steady state and transition of F1) were varied in synthetic versions of "rock" and "lock." Four temporal variations in each of ten stimuli of a spectral-cue continuum were generated. Phonetic identification and oddity discrimination tasks with the four series showed systematic displacement of perceptual boundaries and discrimination peaks, thus reflecting a trading relation between the two cues. The perceptual equivalence of spectral and temporal cues was investigated by comparing the accuracy of discrimination of three types of stimulus comparisons: phonetically facilitating two-cue pairs, one-cue pairs, and phonetically conflicting two-cue pairs. As predicted, discrimination accuracy was ordered facilitating two-cue pairs > one-cue pairs > conflicting two-cue pairs, indicating that perceivers discriminated on the basis of an integrated phonetic percept.

7.
Discharge patterns of auditory-nerve fibers in anesthetized cats were obtained for two stimulus levels in response to synthetic stimuli with dynamic characteristics appropriate for selected consonants. A set of stimuli was constructed by preceding a signal that was identified as /da/ by another sound that was systematically manipulated so that the entire complex would sound like either /da/, /ada/, /na/, /ša/, /sa/, or others. Discharge rates of auditory-nerve fibers in response to the common /da/-like formant transitions depended on the preceding context. Average discharge rates during these transitions decreased most for fibers whose CFs were in frequency regions where the context had considerable energy. Some effect of the preceding context on fine time patterns of response to the transitions was also found, but the identity of the largest response components (which often corresponded to the formant frequencies) was in general unaffected. Thus the response patterns during the formant transitions contain cues about both the nature of the transitions and the preceding context. A second set of stimuli sounding like /š/ and /č/ was obtained by varying the duration of the rise in amplitude at the onset of a filtered noise burst. At both 45 and 60 dB SPL, there were fibers which showed a more prominent peak in discharge rate at stimulus onset for /č/ than for /š/, but the CF regions that reflected the clearest distinctions depended on stimulus level. The peaks in discharge rate that occur in response to rapid changes in amplitude or spectrum might be used by the central processor as pointers to portions of speech signals that are rich in phonetic information.

8.
Acoustic analyses were undertaken to explore the durational characteristics of the fricatives [f, θ, s, v, ð, z] as cues to initial consonant voicing in English. Based on reports on the perception of voiced-voiceless fricatives, it was expected that there would be clear-cut duration differences distinguishing voiced and voiceless fricatives. Preliminary results for three speakers indicate that, although differences emerged in the overall mean duration of voiced and voiceless fricatives, contrary to expectations, there was a great deal of overlap in the duration distribution of voiced and voiceless fricative tokens. Further research is needed to examine the role of duration as a cue to syllable-initial fricative consonant voicing in English.

9.
This paper investigates the perception of non-native phoneme contrasts which exist in the native language, but not in the position tested. Like English, Dutch contrasts voiced and voiceless obstruents. Unlike English, Dutch allows only voiceless obstruents in word-final position. Dutch and English listeners' accuracy on English final voicing contrasts and their use of preceding vowel duration as a voicing cue were tested. The phonetic structure of Dutch should provide the necessary experience for a native-like use of this cue. Experiment 1 showed that Dutch listeners categorized English final /z/-/s/, /v/-/f/, /b/-/p/, and /d/-/t/ contrasts in nonwords as accurately as initial contrasts, and as accurately as English listeners did, even when release bursts were removed. In experiment 2, English listeners used vowel duration as a cue for one final contrast, although it was uninformative and sometimes mismatched other voicing characteristics, whereas Dutch listeners did not. Although it should be relatively easy for them, Dutch listeners did not use vowel duration. Nevertheless, they attained native-like accuracy, and sometimes even outperformed the native listeners who were liable to be misled by uninformative vowel duration information. Thus, native-like use of cues for non-native but familiar contrasts in unfamiliar positions may hardly ever be attained.

10.
The role of the release burst as a cue to the perception of stop consonants following [s] was investigated in a series of studies. Experiment 1 demonstrated that silent closure duration and burst duration can be traded as cues for the "say"-"stay" distinction. Experiment 2 revealed a similar trading relation between closure duration and burst amplitude. Experiments 3 and 4 suggested, perhaps surprisingly, that absolute, not relative, burst amplitude is important. Experiment 5 demonstrated that listeners' sensitivity to bursts in a labeling task is at least equal to their sensitivity in a burst detection task. Experiments 6 and 7 replicated the trading relation between closure duration and burst amplitude for labial stops in the "slit"-"split" and "slash"-"splash" distinctions, although burst amplification, in contrast to attenuation, had no effect. All experiments revealed that listeners are remarkably sensitive to the presence of even very weak release bursts.

11.
In English, voiced and voiceless syllable-initial stop consonants differ in both fundamental frequency at the onset of voicing (onset F0) and voice onset time (VOT). Although both correlates, alone, can cue the voicing contrast, listeners weight VOT more heavily when both are available. Such differential weighting may arise from differences in the perceptual distance between voicing categories along the VOT versus onset F0 dimensions, or it may arise from a bias to pay more attention to VOT than to onset F0. The present experiment examines listeners' use of these two cues when classifying stimuli in which perceptual distance was artificially equated along the two dimensions. Listeners were also trained to categorize stimuli based on one cue at the expense of another. Equating perceptual distance eliminated the expected bias toward VOT before training, but successfully learning to base decisions more on VOT and less on onset F0 was easier than vice versa. Perceptual distance along both dimensions increased for both groups after training, but only VOT-trained listeners showed a decrease in Garner interference. Results lend qualified support to an attentional model of phonetic learning in which learning involves strategic redeployment of selective attention across integral acoustic cues.

12.
This study investigated the role of the amplitude envelope in the vicinity of consonantal release in the perception of the stop-glide contrast. Three sets of acoustic [b-w] continua, each in the vowel environments [a] and [i], were synthesized using parameters derived from natural speech. In the first set, amplitude, formant frequency, and duration characteristics were interpolated between exemplar stop and glide endpoints. In the second set, formant frequency and duration characteristics were interpolated, but all stimuli were given a stop amplitude envelope. The third set was like the second, except that all stimuli were given a glide amplitude envelope. Subjects were given both forced-choice and free-identification tasks. The results of the forced-choice task indicated that amplitude cues were able to override transition slope, duration, and formant frequency cues in the perception of the stop-glide contrast. However, results from the free-identification task showed that, although presence of a stop amplitude envelope turned all stimuli otherwise labeled as glides to stops, the presence of a glide amplitude envelope changed stimuli labeled otherwise as stops to fricatives rather than to glides. These results support the view that the amplitude envelope in the vicinity of the consonantal release is a critical acoustic property for the continuant/noncontinuant contrast. The results are discussed in relation to a theory of acoustic invariance.

13.
To investigate possible auditory factors in the perception of stops and glides (e.g., /b/ vs /w/), two-category labeling performance was compared on several series of /ba/-/wa/ stimuli and on corresponding nonspeech stimulus series that modeled the first-formant trajectories and amplitude rise times of the speech items. In most respects, performance on the speech and nonspeech stimuli was closely parallel. Transition duration proved to be an effective cue for both the stop/glide distinction and the nonspeech distinction between abrupt and gradual onsets, and the category boundaries along the transition-duration dimension did not differ significantly in the two cases. When the stop/glide distinction was signaled by variation in transition duration, there was a reliable stimulus-length effect: A longer vowel shifted the category boundary toward greater transition durations. A similar effect was observed for the corresponding nonspeech stimuli. Variation in rise time had only a small effect in signaling both the stop/glide distinction and the nonspeech distinction between abrupt and gradual onsets. There was, however, one discrepancy between the speech and nonspeech performance. When the stop/glide distinction was cued by rise-time variation, there was a stimulus-length effect, but no such effect occurred for the corresponding nonspeech stimuli. On balance, the results suggest that there are significant auditory commonalities between the perception of stops and glides and the perception of acoustically analogous nonspeech stimuli.

14.
The effects of mild-to-moderate hearing impairment on the perceptual importance of three acoustic correlates of stop consonant place of articulation were examined. Normal-hearing and hearing-impaired adults identified a stimulus set comprising all possible combinations of the levels of three factors: formant transition type (three levels), spectral tilt type (three levels), and abruptness of frequency change (two levels). The levels of these factors correspond to those appropriate for /b/, /d/, and /g/ in the /æ/ environment. Normal-hearing subjects responded primarily in accord with the place of articulation specified by the formant transitions. Hearing-impaired subjects showed less-than-normal reliance on formant transitions and greater-than-normal reliance on spectral tilt and abruptness of frequency change. These results suggest that hearing impairment affects the perceptual importance of cues to stop consonant identity, increasing the importance of information provided by both temporal characteristics and gross spectral shape and decreasing the importance of information provided by the formant transitions.

15.
Two experiments determined the just noticeable difference (jnd) in onset frequency for speech formant transitions followed by an 1800-Hz steady state. Influences of transition duration (30, 45, 60, and 120 ms), transition-onset region (above or below 1800 Hz), and the rate of transition were examined. An overall improvement in discrimination with duration was observed, suggesting better frequency resolution and, consequently, better use of pitch/timbre cues with longer transitions. In addition, falling transitions (with onsets above 1800 Hz) were better discriminated than rising, and changing onset to produce increments in transition rate-of-change in frequency yielded smaller jnd's than changing onset to produce decrements. The shortest transitions displayed additional rate-related effects. This last observation may be due to differences in the degree of dispersion of activity in the cochlea when high-rate transitions are effectively treated as non-time-varying, wideband events. The other results may reflect mechanisms that extract the temporal envelopes of signals: Envelope slope and magnitude differences are proposed to provide discriminative cues that supplement or supplant weaker spectrally based pitch/timbre cues for transitions in the short-to-moderate duration range. It is speculated that these cues may also support some speech perceptual decisions.

16.
The purpose of this study was to determine whether children give more perceptual weight than do adults to dynamic spectral cues versus static cues. Listeners were ten children between the ages of 3;8 and 4;1 (mean 3;11) and ten adults between the ages of 23;10 and 32;0 (mean 25;11). Three experimental stimulus conditions were presented, with each containing stimuli of 30 ms duration. The first experimental condition consisted of unchanging formant onset frequencies ranging in value from frequencies for [i] to those for [a], appropriate for a bilabial stop consonant context. The second two experimental conditions consisted of either an [i] or [a] onset frequency with a 25 ms portion of a formant transition whose trajectory was toward one of a series of target frequencies ranging from those for [i] to those for [a]. Results indicated that the children attended differently than the adults to both the [a] and [i] formant-onset frequency cues when identifying the vowels. The adults gave more equal weight to the [i]-onset and [a]-onset dynamic cues, as reflected in category boundaries, than the children did. For the [i]-onset condition, children were less confident than adults in vowel perception, as reflected in slope analyses.

17.
This paper reports acoustic measurements and results from a series of perceptual experiments on the voiced-voiceless distinction for syllable-final stop consonants in absolute final position and in the context of a following syllable beginning with a different stop consonant. The focus is on temporal cues to the distinction, with vowel duration and silent closure duration as the primary and secondary dimensions, respectively. The main results are that adding a second syllable to a monosyllable increases the number of voiced stop consonant responses, as does shortening of the closure duration in disyllables. Both of these effects are consistent with temporal regularities in speech production: Vowel durations are shorter in the first syllable of disyllables than in monosyllables, and closure durations are shorter for voiced than for voiceless stops in disyllabic utterances of this type. While the perceptual effects thus may derive from two separate sources of tacit phonetic knowledge available to listeners, the data are also consistent with an interpretation in terms of a single effect: one of temporal proximity of following context.

18.
Experiments in lateralization were performed to evaluate the relative contribution of envelope and phase cues in binaural hearing with particular reference to the effects of frequency, amplitude, shape of rise/decay, and duration of peak amplitude. Pure-tone signals were presented with interaural phase shifts ranging between 90 degrees and 360 degrees. For a given value of phase shift, the leading signal was presented randomly to the right or left ear over a block of 100 trials, and the laterality of the resultant image was judged. Rise/decay time was varied from 5 to 200 ms across blocks. The results confirmed our previous finding that a rise/decay time of at least 200 ms is required to secure a psychophysically steady-state signal. This value will, however, depend on the values chosen for the other signal parameters. Within limits, decreasing intensity could be compensated for by decreasing rise/decay, suggesting the psychophysical importance of the initial segment of the signal (precedence effect). For low frequencies of 650 to 1250 Hz, performance is sensitive to interaural phase shift and largely independent of frequency. For higher frequencies of 1500 and 2000 Hz, lateralization is independent of the phase cue and also largely insensitive to change in rise/decay time. Finally, performance remains unchanged with variation in peak duration ranging from 25 to 200 ms.

19.
The experiments reported employed nonspeech analogs of speech stimuli to examine the perceptual interaction between first-formant onset frequency and voice-onset time, acoustic cues to the voicing distinction in English initial stop consonants. The nonspeech stimuli comprised two pure tones varying in relative onset time, and listeners were asked to judge the simultaneity of tone onsets. These judgments were affected by the frequency of the lower tone in a manner that parallels the influence of first-formant onset frequency on voicing judgments. This effect was shown to occur regardless of prior learning and to be systematic over a wide range of lower tone frequencies including frequencies beyond the range of possible first-formant frequencies of speech, suggesting that the effect in speech is not attributable to (tacit) knowledge of production constraints, as some current theories suggest.

20.
Several types of measurements were made to determine the acoustic characteristics that distinguish between voiced and voiceless fricatives in various phonetic environments. The selection of measurements was based on a theoretical analysis that indicated the acoustic and aerodynamic attributes at the boundaries between fricatives and vowels. As expected, glottal vibration extended over a longer time in the obstruent interval for voiced fricatives than for voiceless fricatives, and there were more extensive transitions of the first formant adjacent to voiced fricatives than for the voiceless cognates. When two fricatives with different voicing were adjacent, there were substantial modifications of these acoustic attributes, particularly for the syllable-final fricative. In some cases, these modifications lead to complete assimilation of the voicing feature. Several perceptual studies with synthetic vowel-consonant-vowel stimuli and with edited natural stimuli examined the role of consonant duration, extent and location of glottal vibration, and extent of formant transitions on the identification of the voicing characteristics of fricatives. The perceptual results were in general consistent with the acoustic observations and with expectations based on the theoretical model. The results suggest that listeners base their voicing judgments of intervocalic fricatives on an assessment of the time interval in the fricative during which there is no glottal vibration. This time interval must exceed about 60 ms if the fricative is to be judged as voiceless, except that a small correction to this threshold is applied depending on the extent to which the first-formant transitions are truncated at the consonant boundaries.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  ICP filing no. 京ICP备09084417号