Similar literature
20 similar documents found
1.
This study presents EMA (electromagnetic articulography) data on articulation of the vowel /a/ at different prosodic boundaries in French. Three speakers of metropolitan French produced utterances containing the vowel /a/, preceded by /t/ and followed by one of six consonants /b d g f s S/ (three stops and three fricatives), with different prosodic boundaries intervening between the /a/ and the six different consonants. The prosodic boundaries investigated are the Utterance, the Intonational phrase, the Accentual phrase, and the Word. Data for the Tongue Tip, Tongue Body, and Jaw are presented. The articulatory data presented here were recorded at the same time as the acoustic data presented in Tabain [J. Acoust. Soc. Am. 113, 516-531 (2003)]. Analyses show that there is a strong effect on peak displacement of the vowel according to the prosodic hierarchy, with the stronger prosodic boundaries inducing a much lower Tongue Body and Jaw position than the weaker prosodic boundaries. Durations of both the opening movement into and the closing movement out of the vowel are also affected. Peak velocity of the articulatory movements is also examined, and, contrary to results for phrase-final lengthening, it is found that peak velocity of the opening movement into the vowel tends to increase with the higher prosodic boundaries, together with the increased magnitude of the movement between the consonant and the vowel. Results for the closing movement out of the vowel and into the consonant are not so clear. Since one speaker shows evidence of utterance-level articulatory declension, it is suggested that the competing constraints of articulatory declension and prosodic effects might explain some previous results on phrase-final lengthening.

2.
Although some cochlear implant (CI) listeners can show good word recognition accuracy, it is not clear how they perceive and use the various acoustic cues that contribute to phonetic perceptions. In this study, the use of acoustic cues was assessed for normal-hearing (NH) listeners in optimal and spectrally degraded conditions, and also for CI listeners. Two experiments tested the tense/lax vowel contrast (varying in formant structure, vowel-inherent spectral change, and vowel duration) and the word-final fricative voicing contrast (varying in F1 transition, vowel duration, consonant duration, and consonant voicing). Identification results were modeled using mixed-effects logistic regression. These experiments suggested that under spectrally-degraded conditions, NH listeners decrease their use of formant cues and increase their use of durational cues. Compared to NH listeners, CI listeners showed decreased use of spectral cues like formant structure and formant change and consonant voicing, and showed greater use of durational cues (especially for the fricative contrast). The results suggest that although NH and CI listeners may show similar accuracy on basic tests of word, phoneme or feature recognition, they may be using different perceptual strategies in the process.

3.
There exists no clear understanding of the importance of spectral tilt for perception of stop consonants. It is hypothesized that spectral tilt may be particularly salient when formant patterns are ambiguous or degraded. Here, it is demonstrated that relative change in spectral tilt over time, not absolute tilt, significantly influences perception of /b/ vs /d/. Experiments consisted of burstless synthesized stimuli that varied in spectral tilt and onset frequency of the second formant. In Experiment 1, tilt of the consonant at voice onset was varied. In Experiment 2, tilt of the vowel steady state was varied. Results of these experiments were complementary and revealed a significant contribution of relative spectral tilt change only when formant information was ambiguous. Experiments 3 and 4 replicated Experiments 1 and 2 in an /aba/-/ada/ context. The additional tilt contrast provided by the initial vowel modestly enhanced effects. In Experiment 5, there was no effect for absolute tilt when consonant and vowel tilts were identical. Consistent with earlier studies demonstrating contrast between successive local spectral features, perceptual effects of gross spectral characteristics are likewise relative. These findings have implications for perception in nonlaboratory environments and for listeners with hearing impairment.

4.
The phonetic identification ability of an individual (SS) who exhibits the best, or equal to the best, speech understanding of patients using the Symbion four-channel cochlear implant is described. It has been found that SS: (1) can use aspects of signal duration to form categories that are isomorphic with the phonetic categories established by listeners with normal auditory function; (2) can combine temporal and spectral cues in a normal fashion to form categories; (3) can use aspects of fricative noises to form categories that correspond to normal phonetic categories; (4) uses information from both F1 and higher formants in vowel identification; and (5) appears to identify stop consonant place of articulation on the basis of information provided by the center frequency of the burst and by the abruptness of frequency change following signal onset. SS has difficulty identifying stop consonants from the information provided by formant transitions and cannot differentially identify signals that have identical F1's and relatively low-frequency F2's. SS's performance suggests that simple speech processing strategies (filtering of the signal into four bands) and monopolar electrode design are viable options in the design of cochlear prostheses.

5.
The effects of mild-to-moderate hearing impairment on the perceptual importance of three acoustic correlates of stop consonant place of articulation were examined. Normal-hearing and hearing-impaired adults identified a stimulus set comprising all possible combinations of the levels of three factors: formant transition type (three levels), spectral tilt type (three levels), and abruptness of frequency change (two levels). The levels of these factors correspond to those appropriate for /b/, /d/, and /g/ in the /ae/ environment. Normal-hearing subjects responded primarily in accord with the place of articulation specified by the formant transitions. Hearing-impaired subjects showed less-than-normal reliance on formant transitions and greater-than-normal reliance on spectral tilt and abruptness of frequency change. These results suggest that hearing impairment affects the perceptual importance of cues to stop consonant identity, increasing the importance of information provided by both temporal characteristics and gross spectral shape and decreasing the importance of information provided by the formant transitions.

6.
The effect of speaking rate variations on second formant (F2) trajectories was investigated for a continuum of rates. F2 trajectories for the schwa preceding a voiced bilabial stop, and one of three target vocalic nuclei following the stop, were generated for utterances of the form "Put a bV here," where V was /i/, /ae/, or /oI/. Discrete spectral measures at the vowel-consonant and consonant-vowel interfaces, as well as vowel target values, were examined as potential parameters of rate variation; several different whole-trajectory analyses were also explored. Results suggested that a discrete measure at the vowel-consonant (schwa-consonant) interface, the F2off value, was in many cases a good index of rate variation, provided the rates were not unusually slow (vowel durations less than 200 ms). The relationship of the spectral measure at the consonant-vowel interface, F2 onset, as well as that of the "target" for this vowel, was less clearly related to rate variation. Whole-trajectory analyses indicated that the rate effect cannot be captured by linear compressions and expansions of some prototype trajectory. Moreover, the effect of rate manipulation on formant trajectories interacts with speaker and vocalic nucleus type, making it difficult to specify general rules for these effects. However, there is evidence that a small number of speaker strategies may emerge from a careful qualitative and quantitative analysis of whole formant trajectories. Results are discussed in terms of models of speech production and a group of speech disorders that is usually associated with anomalies of speaking rate, and hence of formant frequency trajectories.

7.
This study explores the effects of prosodic boundaries on nasality at intonational phrase, word, and syllable boundaries. The subjects were recorded saying phrases that contained a syllable-final nasal consonant followed by a syllable-initial stop. The timing, duration, and magnitude of the nasal airflows measured were used to determine the extent of nasality across boundaries. Nasal amplitudes were found to vary in a speaker-dependent manner among boundary types. However, the patterns of nasal contours and temporal aspects of the airflow parameters consistently varied with boundary type across all the speakers. In general, the duration of nasal airflow and nasal plateau were the longest at the intonational phrase boundary, followed by word boundary and then syllable boundary. In addition to the hierarchical influence of boundary strength, there were unique phonetic markings associated with individual boundaries. In particular, two nasal rises interrupted by nasal inhalation occurred only across an intonational phrase boundary. Also, unexpectedly, a word boundary was marked by the longest postboundary vowel, whereas a syllable boundary was marked with the shortest nasal duration. The results here support the hierarchical effect of boundary on both domain-edge strengthening and cross-boundary coarticulation.

8.
This study assessed the acoustic and perceptual effect of noise on vowel and stop-consonant spectra. Multi-talker babble and speech-shaped noise were added to vowel and stop stimuli at -5 to +10 dB S/N, and the effect of noise was quantified in terms of (a) spectral envelope differences between the noisy and clean spectra in three frequency bands, (b) presence of reliable F1 and F2 information in noise, and (c) changes in burst frequency and slope. Acoustic analysis indicated that F1 was detected more reliably than F2 and the largest spectral envelope differences between the noisy and clean vowel spectra occurred in the mid-frequency band. This finding suggests that in extremely noisy conditions listeners must be relying on relatively accurate F1 frequency information along with partial F2 information to identify vowels. Stop consonant recognition remained high even at -5 dB despite the disruption of burst cues due to additive noise, suggesting that listeners must be relying on other cues, perhaps formant transitions, to identify stops.

9.
Three alternative speech coding strategies suitable for use with cochlear implants were compared in a study of three normally hearing subjects using an acoustic model of a multiple-channel cochlear implant. The first strategy (F2) presented the amplitude envelope of the speech and the second formant frequency. The second strategy (F0 F2) included the voice fundamental frequency, and the third strategy (F0 F1 F2) presented the first formant frequency as well. Discourse level testing with the speech tracking method showed a clear superiority of the F0 F1 F2 strategy when the auditory information was used to supplement lipreading. Tracking rates averaged over three subjects for nine 10-min sessions were 40 wpm for F2, 52 wpm for F0 F2, and 66 wpm for F0 F1 F2. Vowel and consonant confusion studies and a test of prosodic information were carried out with auditory information only. The vowel test showed a significant difference between the strategies, but no differences were found for the other tests. It was concluded that the amplitude and duration cues common to all three strategies accounted for the levels of consonant and prosodic information received by the subjects, while the different tracking rates were a consequence of the better vowel recognition and the more natural quality of the F0 F1 F2 strategy.

10.
The perception of voicing in final velar stop consonants was investigated by systematically varying vowel duration, change in offset frequency of the final first formant (F1) transition, and rate of frequency change in the final F1 transition for several vowel contexts. Consonant-vowel-consonant (CVC) continua were synthesized for each of three vowels, [i,I,ae], which represent a range of relatively low to relatively high-F1 steady-state values. Subjects responded to the stimuli under both an open- and closed-response condition. Results of the study show that both vowel duration and F1 offset properties influence perception of final consonant voicing, with the salience of the F1 offset property higher for vowels with high-F1 steady-state frequencies than low-F1 steady-state frequencies, and the opposite occurring for the vowel duration property. When F1 onset and offset frequencies were controlled, rate of the F1 transition change had inconsistent and minimal effects on perception of final consonant voicing. Thus the findings suggest that it is the termination value of the F1 offset transition rather than rate and/or duration of frequency change, which cues voicing in final velar stop consonants during the transition period preceding closure.

11.
Five commonly used methods for determining the onset of voicing of syllable-initial stop consonants were compared. The speech and glottal activity of 16 native speakers of Cantonese with normal voice quality were investigated during the production of consonant-vowel (CV) syllables in Cantonese. Syllables consisted of the initial consonants /ph/, /th/, /kh/, /p/, /t/, and /k/ followed by the vowel /a/. All syllables had a high level tone, and were all real words in Cantonese. Measurements of voicing onset were made based on the onset of periodicity in the acoustic waveform, and on spectrographic measures of the onset of a voicing bar (f0), the onset of the first formant (F1), second formant (F2), and third formant (F3). These measurements were then compared against the onset of glottal opening as determined by electroglottography. Both accuracy and variability of each measure were calculated. Results suggest that the presence of aspiration in a syllable decreased the accuracy and increased the variability of spectrogram-based measurements, but did not strongly affect measurements made from the acoustic waveform. Overall, the acoustic waveform provided the most accurate estimate of voicing onset; measurements made from the amplitude waveform were also the least variable of the five measures. These results can be explained as a consequence of differences in spectral tilt of the voicing source in breathy versus modal phonation.

12.
Acoustic lengthening at prosodic boundaries is well explored, and the articulatory bases for this lengthening are becoming better understood. However, the temporal scope of prosodic boundary effects has not been examined in the articulatory domain. The few acoustic studies examining the distribution of lengthening indicate that boundary effects extend from one to three syllables before the boundary, and that effects diminish as distance from the boundary increases. This diminishment is consistent with the pi-gesture model of prosodic influence [Byrd and Saltzman, J. Phonetics 31, 149-180 (2003)]. The present experiment tests the preboundary and postboundary scope of articulatory lengthening at an intonational phrase boundary. Movement-tracking data are used to evaluate durations of consonant closing and opening movements, acceleration durations, and consonant spatial magnitude. Results indicate that prosodic boundary effects exist locally near the phrase boundary in both directions, diminishing in magnitude more remotely for those subjects who exhibit extended effects. Small postboundary effects that are compensatory in direction are also observed.

13.
Changes in the speech spectrum of vowels and consonants before and after tonsillectomy were investigated to find out the impact of the operation on speech quality. Speech recordings obtained from patients were analyzed using the Kay Elemetrics, Multi-Dimensional Voice Processing (MDVP Advanced) software. Examination of the time-course changes after the operation revealed that certain speech parameters changed. These changes were mainly F3 (formant center frequency) and B3 (formant bandwidth) for the vowel /o/ and a slight decrease in B1 and B2 for the vowel /a/. The noise-to-harmonic ratio (NHR) also decreased slightly, suggesting less nasalized vowels. It was also observed that the fricative, glottal consonant /h/ was affected. The larger the tonsil had been, the more changes were seen in the speech spectrum. The changes in the speech characteristics (except F3 and B3 for the vowel /o/) tended to recover, suggesting an involvement of auditory feedback and/or replacement of the tonsils by new soft tissue. Although the changes were minimal and, therefore, had little effect on the extracted acoustic parameters, they cannot be disregarded for those relying on their voice for professional reasons, that is, singers, professional speakers, and so forth.

14.
The purpose of this study was to determine whether children give more perceptual weight than do adults to dynamic spectral cues versus static cues. Listeners were ten children between the ages of 3;8 and 4;1 (mean 3;11) and ten adults between the ages of 23;10 and 32;0 (mean 25;11). Three experimental stimulus conditions were presented, with each containing stimuli of 30 ms duration. The first experimental condition consisted of unchanging formant onset frequencies ranging in value from frequencies for [i] to those for [a], appropriate for a bilabial stop consonant context. The second two experimental conditions consisted of either an [i] or [a] onset frequency with a 25 ms portion of a formant transition whose trajectory was toward one of a series of target frequencies ranging from those for [i] to those for [a]. Results indicated that the children attended differently than the adults to both the [a] and [i] formant onset frequency cues when identifying the vowels. The adults gave more equal weight to the [i]-onset and [a]-onset dynamic cues, as reflected in category boundaries, than the children did. For the [i]-onset condition, children were not as confident as adults in vowel perception, as reflected in slope analyses.

15.
The formant hypothesis of vowel perception, where the lowest two or three formant frequencies are essential cues for vowel quality perception, is widely accepted. There has, however, been some controversy suggesting that formant frequencies are not sufficient and that the whole spectral shape is necessary for perception. Three psychophysical experiments were performed to study this question. In the first experiment, the first or second formant peak of stimuli was suppressed as much as possible while still maintaining the original spectral shape. The responses to these stimuli were not radically different from the ones for the unsuppressed control. In the second experiment, F2-suppressed stimuli, whose amplitude ratios of high- to low-frequency components were systematically changed, were used. The results indicate that the ratio changes can affect perceived vowel quality, especially its place of articulation. In the third experiment, the full-formant stimuli, whose amplitude ratios were changed from the original and whose F2's were kept constant, were used. The results suggest that the amplitude ratio is equal to or more effective than F2 as a cue for place of articulation. We conclude that formant frequencies are not exclusive cues and that the whole spectral shape can be crucial for vowel perception.

16.
Earlier work [Nittrouer et al., J. Speech Hear. Res. 32, 120-132 (1989)] demonstrated greater evidence of coarticulation in the fricative-vowel syllables of children than in those of adults when measured by anticipatory vowel effects on the resonant frequency of the fricative back cavity. In the present study, three experiments showed that this increased coarticulation led to improved vowel recognition from the fricative noise alone: Vowel identification by adult listeners was better overall for children's productions and was successful earlier in the fricative noise. This enhanced vowel recognition for children's samples was obtained in spite of the fact that children's and adults' samples were randomized together, therefore indicating that listeners were able to normalize the vowel information within a fricative noise where there often was acoustic evidence of only one formant associated primarily with the vowel. Correct vowel judgments were found to be largely independent of fricative identification. However, when another coarticulatory effect, the lowering of the main spectral prominence of the fricative noise for /u/ versus /i/, was taken into account, vowel judgments were found to interact with fricative identification. The results show that listeners are sensitive to the greater coarticulation in children's fricative-vowel syllables, and that, in some circumstances, they do not need to make a correct identification of the most prominently specified phone in order to make a correct identification of a coarticulated one.

17.
The addition of low-passed (LP) speech or even a tone following the fundamental frequency (F0) of speech has been shown to benefit speech recognition for cochlear implant (CI) users with residual acoustic hearing. The mechanisms underlying this benefit are still unclear. In this study, eight bimodal subjects (CI users with acoustic hearing in the non-implanted ear) and eight simulated bimodal subjects (using vocoded and LP speech) were tested on vowel and consonant recognition to determine the relative contributions of acoustic and phonetic cues, including F0, to the bimodal benefit. Several listening conditions were tested (CI/Vocoder, LP, T(F0-env), CI/Vocoder + LP, CI/Vocoder + T(F0-env)). Compared with CI/Vocoder performance, LP significantly enhanced both consonant and vowel perception, whereas a tone following the F0 contour of target speech and modulated with an amplitude envelope of the maximum frequency of the F0 contour (T(F0-env)) enhanced only consonant perception. Information transfer analysis revealed a dual mechanism in the bimodal benefit: The tone representing F0 provided voicing and manner information, whereas LP provided additional manner, place, and vowel formant information. The data in actual bimodal subjects also showed that the degree of the bimodal benefit depended on the cutoff and slope of residual acoustic hearing.

18.
Several experiments are described in which synthetic monophthongs from series varying between /i/ and /u/ are presented following filtered precursors. In addition to F(2), target stimuli vary in spectral tilt by applying a filter that either raises or lowers the amplitudes of higher formants. Previous studies have shown that both of these spectral properties contribute to identification of these stimuli in isolation. However, in the present experiments we show that when a precursor sentence is processed by the same filter used to adjust spectral tilt in the target stimulus, listeners identify synthetic vowels on the basis of F(2) alone. Conversely, when the precursor sentence is processed by a single-pole filter with center frequency and bandwidth identical to that of the F(2) peak of the following vowel, listeners identify synthetic vowels on the basis of spectral tilt alone. These results show that listeners ignore spectral details that are unchanged in the acoustic context. Instead of identifying vowels on the basis of incorrect acoustic information, however (e.g., all vowels are heard as /i/ when second formant is perceptually ignored), listeners discriminate the vowel stimuli on the basis of the more informative spectral property.

19.
Previous work has demonstrated that normal-hearing individuals use fine-grained phonetic variation, such as formant movement and duration, when recognizing English vowels. The present study investigated whether these cues are used by adult postlingually deafened cochlear implant users, and normal-hearing individuals listening to noise-vocoder simulations of cochlear implant processing. In Experiment 1, subjects gave forced-choice identification judgments for recordings of vowels that were signal processed to remove formant movement and/or equate vowel duration. In Experiment 2, a goodness-optimization procedure was used to create perceptual vowel space maps (i.e., best exemplars within a vowel quadrilateral) that included F1, F2, formant movement, and duration. The results demonstrated that both cochlear implant users and normal-hearing individuals use formant movement and duration cues when recognizing English vowels. Moreover, both listener groups used these cues to the same extent, suggesting that postlingually deafened cochlear implant users have category representations for vowels that are similar to those of normal-hearing individuals.

20.
Acoustic coupling between the vocal tract and the lower (subglottal) airway results in the introduction of pole-zero pairs corresponding to resonances of the uncoupled lower airway. If the second formant (F2) passes through the second subglottal resonance, a discontinuity in amplitude occurs. This work explores the hypothesis that this F2 discontinuity affects how listeners perceive the distinctive feature [back] in transitions from a front vowel (high F2) to a labial stop (low F2). Two versions of the utterances "apter" and "up there" were synthesized with an F2 discontinuity at different locations in the initial VC transition. Subjects heard portions of the utterances with and without the discontinuity, and were asked to identify whether the utterances were real words or not. Results show that the frequency of the F2 discontinuity in an utterance influences the perception of backness in the vowel. Discontinuities of this sort are proposed to play a role in shaping vowel inventories in the world's languages [K. N. Stevens, J. Phonetics 17, 3-46 (1989)]. The results support a model of lexical access in which articulatory-acoustic discontinuities subserve phonological feature identification.
