Similar articles
20 similar articles found.
1.
The contribution of the nasal murmur and vocalic formant transition to the perception of the [m]-[n] distinction by adult listeners was investigated for speakers of different ages in both consonant-vowel (CV) and vowel-consonant (VC) syllables. Three children in each of the speaker groups 3, 5, and 7 years old, and three adult females and three adult males produced CV and VC syllables consisting of either [m] or [n] and followed or preceded by [i ae u a], respectively. Two productions of each syllable were edited into seven murmur and transition segments. Across speaker groups, a segment including the last 25 ms of the murmur and the first 25 ms of the vowel yielded higher perceptual identification of place of articulation than any other segment edited from the CV syllable. In contrast, the corresponding vowel+murmur segment in the VC syllable position improved nasal identification relative to other segment types for only the adult talkers. Overall, the CV syllable was perceptually more distinctive than the VC syllable, but this distinctiveness interacted with speaker group and stimulus duration. As predicted by previous studies and the current results of perceptual testing, acoustic analyses of adult syllable productions showed systematic differences between labial and alveolar places of articulation, but these differences were only marginally observed in the youngest children's speech. Also predicted by the current perceptual results, these acoustic properties differentiating place of articulation of nasal consonants were reliably different for CV syllables compared to VC syllables. A series of comparisons of perceptual data across speaker groups, segment types, and syllable shape provided strong support, in adult speakers, for the "discontinuity hypothesis" [K. N. Stevens, in Phonetic Linguistics: Essays in Honor of Peter Ladefoged, edited by V. A. Fromkin (Academic, London, 1985), pp. 243-255], according to which spectral discontinuities at acoustic boundaries provide critical cues to the perception of place of articulation. In child speakers, the perceptual support for the "discontinuity hypothesis" was weaker and the results indicative of developmental changes in speech production.

2.
Coarticulation studies in speech of deaf individuals have so far focused on intrasyllabic patterning of various consonant-vowel sequences. In this study, both inter- and intrasyllabic patterning were examined in disyllables /ə#CVC/ and the effects of phonetic context, speaking rate, and segment type were explored. Systematic observation of F2 and durational measurements in disyllables minimally contrasting in vocalic ([i], [u], [a]) and in consonant ([b], [d]) context, respectively, was made at selected locations in the disyllable, in order to relate inferences about articulatory adjustments with their temporal coordinates. Results indicated that intervocalic coarticulation across hearing and deaf speakers varied as a function of the phonetic composition of disyllables (b_b or d_d). The deaf speakers showed reduced intervocalic coarticulation for bilabial but not for alveolar disyllables compared to the hearing speakers. Furthermore, they showed less marked consonant influences on the schwa and stressed vowel of disyllables compared to the hearing controls. Rate effects were minimal and did not alter the coarticulatory patterns observed across hearing status. The above findings modify the conclusions drawn from previous studies and suggest that the speech of deaf and hearing speakers is guided by different gestural organization.

3.
Earlier work [Nittrouer et al., J. Speech Hear. Res. 32, 120-132 (1989)] demonstrated greater evidence of coarticulation in the fricative-vowel syllables of children than in those of adults when measured by anticipatory vowel effects on the resonant frequency of the fricative back cavity. In the present study, three experiments showed that this increased coarticulation led to improved vowel recognition from the fricative noise alone: Vowel identification by adult listeners was better overall for children's productions and was successful earlier in the fricative noise. This enhanced vowel recognition for children's samples was obtained in spite of the fact that children's and adults' samples were randomized together, therefore indicating that listeners were able to normalize the vowel information within a fricative noise where there often was acoustic evidence of only one formant associated primarily with the vowel. Correct vowel judgments were found to be largely independent of fricative identification. However, when another coarticulatory effect, the lowering of the main spectral prominence of the fricative noise for /u/ versus /i/, was taken into account, vowel judgments were found to interact with fricative identification. The results show that listeners are sensitive to the greater coarticulation in children's fricative-vowel syllables, and that, in some circumstances, they do not need to make a correct identification of the most prominently specified phone in order to make a correct identification of a coarticulated one.

4.
Previous studies of vowel perception have shown that adult speakers of American English and of North German identify native vowels by exploiting at least three types of acoustic information contained in consonant-vowel-consonant (CVC) syllables: target spectral information reflecting the articulatory target of the vowel, dynamic spectral information reflecting CV- and -VC coarticulation, and duration information. The present study examined the contribution of each of these three types of information to vowel perception in prelingual infants and adults using a discrimination task. Experiment 1 examined German adults' discrimination of four German vowel contrasts (see text), originally produced in /dVt/ syllables, in eight experimental conditions in which the type of vowel information was manipulated. Experiment 2 examined German-learning infants' discrimination of the same vowel contrasts using a comparable procedure. The results show that German adults and German-learning infants appear able to use either dynamic spectral information or target spectral information to discriminate contrasting vowels. With respect to duration information, the removal of this cue selectively affected the discriminability of two of the vowel contrasts for adults. However, for infants, removal of contrastive duration information had a larger effect on the discrimination of all contrasts tested.

5.
The influence of vocalic context on various temporal and spectral properties of preceding acoustic segments was investigated in utterances containing [ə#CV] sequences produced by two girls aged 4;8 and 9;5 years and by their father. The younger (but not the older) child's speech showed a systematic lowering of [s] noise and [tʰ] release burst spectra before [u] as compared to [i] and [ae]. The older child's speech, on the other hand, showed an orderly relationship of the second-formant frequency in [ə] to the transconsonantal vowel. Both children tended to produce longer [s] noises and voice onset times as well as higher second-formant peaks at constriction noise offset before [i] than before [u] and [ae]. All effects except the first were shown by the adult who, in addition, produced first-formant frequencies in [ə] that anticipated the transconsonantal vowel. These observations suggest that different forms of anticipatory coarticulation may have different causes and may follow different developmental patterns. A strategy for future research is suggested.

6.
Vocal microtremor designates a normal slow modulation of the vocal cycle lengths of speakers who do not suffer from pathological tremor of the limbs and whose voices are not perceived as tremulous. Vocal microtremor is therefore distinct from pathological vocal tremor. The objective is to report data about the modulation frequency and modulation level owing to vocal microtremor. The modulation data have been obtained for vowels [a], [i], and [u] sustained by normophonic and mildly dysphonic male and female speakers. The results are the following. First, modulation frequencies and relative modulation levels do not differ significantly for male and female speakers, normophonic and mildly dysphonic speakers, as well as for vowel timbres [a], [i], and [u]. Second, the typical interquartile intervals of the modulation frequency and modulation level are equal to 2.0-4.7 Hz and 0.4%-1.3%, respectively. Third, dissimilarities between data reported by different studies are due to different cutoff frequencies below which spectral peaks are considered not to contribute to vocal microtremor.

7.
The goal of this study is to investigate coarticulatory resistance and aggressiveness for the jaw in Catalan consonants and vowels and, more specifically, for the alveolopalatal nasal /ɲ/ and for dark /l/ for which there is little or no data on jaw position and coarticulation. Jaw movement data for symmetrical vowel-consonant-vowel sequences with the consonants /p, n, l, s, ʃ, ɲ, k/ and the vowels /i, a, u/ were recorded by three Catalan speakers with a midsagittal magnetometer. Data reveal that jaw height is greater for /s, ʃ/ than for /p, ɲ/, which is greater than for /n, l, k/ during the consonant, and for /i, u/ than for /a/ during the vowel. Differences in coarticulatory variability among consonants and vowels are inversely related to differences in jaw height, i.e., fricatives and high vowels are most resistant, and /n, l, k/ and the low vowel are least resistant. Moreover, coarticulation resistant phonetic segments exert more prominent effects and, thus, are more aggressive than segments specified for a lower degree of coarticulatory resistance. Data are discussed in the light of the degree of articulatory constraint model of coarticulation.

8.
This study assessed the acoustic coarticulatory effects of phrasal accent on [V1.CV2] sequences, when separately applied to V1 or V2, surrounding the voiced stops [b], [d], and [g]. Three adult speakers each produced 360 tokens (six V1 contexts × ten V2 contexts × three stops × two emphasis conditions). Realizing that anticipatory coarticulation of V2 onto the intervocalic C can be influenced by prosodic effects, as well as by vowel context effects, a modified locus equation regression metric was used to isolate the effect of phrasal accent on consonantal F2 onsets, independently of prosodically induced vowel expansion effects. The analyses revealed two main emphasis-dependent effects: systematic differences in F2 onset values and the expected expansion of vowel space. By accounting for the confounding variable of stress-induced vowel space expansion, a small but consistent coarticulatory effect of emphatic stress on the consonant was uncovered in lingually produced stops, but absent in labial stops. Formant calculations based on tube models indicated similarly increased F2 onsets when stressed /d/ and /g/ were simulated with deeper occlusions resulting from more forceful closure movements during phrasal accented speech.
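
The locus equation mentioned in this abstract is, in its standard form, a straight-line regression of the F2 onset frequency on the F2 frequency of the following vowel, fitted per consonant. The sketch below shows only that standard form with an ordinary least-squares fit; it is not the modified metric used in the study, whose details the abstract does not give. The function name `fit_locus_equation` and the input values are hypothetical.

```python
from typing import Sequence, Tuple


def fit_locus_equation(
    f2_vowel_hz: Sequence[float], f2_onset_hz: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares fit of F2_onset = slope * F2_vowel + intercept.

    Standard locus equation only; the study's modified metric is not
    reproduced here. Inputs are per-token frequencies in Hz.
    """
    if len(f2_vowel_hz) != len(f2_onset_hz) or len(f2_vowel_hz) < 2:
        raise ValueError("need two equal-length sequences with at least 2 tokens")
    n = len(f2_vowel_hz)
    mean_x = sum(f2_vowel_hz) / n
    mean_y = sum(f2_onset_hz) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in f2_vowel_hz)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(f2_vowel_hz, f2_onset_hz))
    if sxx == 0.0:
        raise ValueError("F2 vowel values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical tokens for one stop consonant (Hz), for illustration only.
slope, intercept = fit_locus_equation(
    [1000.0, 1500.0, 2000.0, 2500.0],
    [1400.0, 1650.0, 1900.0, 2150.0],
)
```

A slope near 1 is conventionally read as strong vowel-dependent coarticulation of the consonant, and a slope near 0 as a consonant onset that resists the vowel context.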

9.
A perceptual analysis of the French vowel [u] produced by 10 speakers under normal and perturbed conditions (Savariaux et al., 1995) is presented which aims at characterizing in the perceptual domain the task of a speaker for this vowel, and, then, at understanding the strategies developed by the speakers to deal with the lip perturbation. Identification and rating tests showed that the French [u] is perceptually fairly well described in the [F1, (F2-F0)] plane, and that the parameter (((F2-F0) + F1)/2) (all frequencies in bark) provides a good overall correlate of the "grave" feature classically used to describe the vowel [u] in all languages. This permitted reanalysis of the behavior of the speakers during the perturbation experiment. Three of them succeed in producing a good [u] in spite of the lip tube, thanks to a combination of limited changes on F1 and (F2-F0), but without producing the strong backward movement of the tongue, which would be necessary to keep the [F1,F2] pattern close to the one measured in normal speech. The only speaker who strongly moved his tongue back and maintained F1 and F2 at low values did not produce a perceptually well-rated [u], but additional tests demonstrate that this gesture allowed him to preserve the most important phonetic features of the French [u], which is primarily a back and rounded vowel. It is concluded that speech production is clearly guided by perceptual requirements, and that the speakers have a good representation of them, even if they are not all able to meet them in perturbed conditions.
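
The parameter ((F2-F0) + F1)/2 in this abstract is stated with all frequencies in bark, so a computation needs a Hz-to-Bark conversion first. The abstract does not say which conversion was used; the sketch below assumes the Traunmüller (1990) approximation, and the function names and formant values are hypothetical.

```python
def hz_to_bark(f_hz: float) -> float:
    """Hz to Bark using the Traunmüller (1990) approximation.

    Assumption: the abstract does not name its conversion formula.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def grave_correlate(f0_hz: float, f1_hz: float, f2_hz: float) -> float:
    """((F2 - F0) + F1) / 2 with every frequency converted to Bark first."""
    f0, f1, f2 = (hz_to_bark(f) for f in (f0_hz, f1_hz, f2_hz))
    return ((f2 - f0) + f1) / 2.0


# Hypothetical values (Hz) for a back rounded vowel and a front vowel.
u_like = grave_correlate(f0_hz=120.0, f1_hz=300.0, f2_hz=750.0)
i_like = grave_correlate(f0_hz=120.0, f1_hz=280.0, f2_hz=2200.0)
```

With these illustrative inputs the [u]-like token gets the lower value, which matches the abstract's use of the parameter as a correlate of the "grave" feature.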

10.
This study complements earlier experiments on the perception of the [m]-[n] distinction in CV syllables [B. H. Repp, J. Acoust. Soc. Am. 79, 1987-1999 (1986); B. H. Repp, J. Acoust. Soc. Am. 82, 1525-1538 (1987)]. Six talkers produced VC syllables consisting of [m] or [n] preceded by [i, a, u]. In listening experiments, these syllables were truncated from the beginning and/or from the end, or waveform portions surrounding the point of closure were replaced with noise, so as to map out the distribution of the place of articulation information for consonant perception. These manipulations revealed that the vocalic formant transitions alone conveyed about as much place of articulation information as did the nasal murmur alone, and both signal portions were about as informative in VC as in CV syllables. Nevertheless, full VC syllables were less accurately identified than full CV syllables, especially in female speech. The reason for this was hypothesized to be the relative absence of a salient spectral change between the vowel and the murmur in VC syllables. This hypothesis was supported by the relative ineffectiveness of two additional manipulations meant to disrupt the perception of relational spectral information (channel separation or temporal separation of vowel and murmur) and by subjects' poor identification scores for brief excerpts including the point of maximal spectral change. While, in CV syllables, the abrupt spectral change from the murmur to the vowel provides important additional place of articulation information, for VC syllables it seems as if the formant transitions in the vowel and the murmur spectrum functioned as independent cues.

11.
The experiment reported here explores the ability of 4- to 5-day-old neonates to discriminate consonantal place of articulation and vowel quality using shortened CV syllables similar to those used by Blumstein and Stevens [J. Acoust. Soc. Am. 67, 648-662 (1980)], without vowel steady-state information. The results show that the initial 34-44 ms of CV stimuli provide infants with sufficient information to discriminate place of articulation differences in stop consonants ([ba] vs [da], [ba] vs [ga], [bi] vs [di], and [bi] vs [gi]) and following vowel quality ([ba] vs [bi], [da] vs [di], and [ga] vs [gi]). These results suggest that infants can discriminate syllables on the basis of the onset properties of CV signals. Furthermore, this experiment indicates that neonates require little or no exposure to speech to succeed in such a discrimination task.

12.
The purpose of this study was to identify and compare the temporal characteristics of nasalization in relation to (1) languages, (2) vowel contexts, and (3) age groups. Two distinct acoustic energies from the mouth and nose were recorded during speech production (/pamap, pimip, pumup/) using two microphones to obtain the absolute and proportional measurements on the acoustic temporal characteristics of nasalization. Twenty-eight normal adults (14 American English and 14 Korean speakers) and 28 normal children (14 American English and 14 Korean speakers) participated in this study. In both languages, adults showed shorter duration of nasalization than children within all three vowel contexts. The high vowel context revealed longer duration of nasalization than the low vowel context in both languages. There was no significant difference of temporal characteristics of nasalization between American English and Korean. Nasalization showed different timing characteristics between children and adults across vowel contexts. The results are discussed in association with developmental coarticulation and the relationship between acoustic consequences of articulatory events and vowel height.

13.
Hearing talkers produce shorter vowel and word durations in multisyllabic contexts than in monosyllabic contexts. This investigation determined whether a similar effect occurs for deaf talkers, a population often characterized as lacking coarticulation in their speech. Four prelingually deafened adults and two hearing controls produced three sets of word sequences. Each set included a kernel word and six derived forms (e.g., "speed," "speedy," "speeding," etc.). The derived forms were created by adding unstressed and stressed syllables to the kernel form. A spectrographic analysis indicated that the deaf subjects did not always decrease word and vowel durations for the derivatives. Unlike hearing speakers, they often did not reduce vowel segments more than consonant segments. Three explanations are forwarded for the shortening effects. One relates to the implementation of temporal rules, the second concerns the organization imposed upon the articulators to produce speech, and the third suggests a language-independent vocal tract characteristic. The role of auditory information in developing the shortening effects is also considered.

14.
The responses of four high-spontaneous fibers from a damaged cat cochlea responding to naturally uttered consonant-vowel (CV) syllables [m], [p], and [t], each with [a], [i], and [u] in four different levels of noise were simulated using a two-stage computer model. At the lowest noise level [+30 dB signal-to-noise (S/N) ratio], the responses of the models of the three fibers from a heavily damaged portion of the cochlea [characteristic frequencies (CFs) from 1.6 to 2.14 kHz] showed quite different response patterns from those of fibers in normal cochleas: There was little response to the noise alone, the consonant portions of the syllables evoked small-amplitude wide-bandwidth complexes, and the vowel-segment response synchrony was often masked by low-frequency components, especially the first formant. At the next level of noise (S/N = 20 dB), spectral information regarding the murmur segments of the [m] syllables was essentially lost. At the highest noise levels used (S/N = +10 and 0 dB), the noise was almost totally disruptive of coding of the spectral peaks of the consonant portions of the stop CVs. Possible implications of the results with regard to the understanding of speech by hearing-impaired listeners are discussed.

15.
This study explores the following hypothesis: forward looping movements of the tongue that are observed in VCV sequences are due partly to the anatomical arrangement of the tongue muscles, how they are used to produce a velar closure, and how the tongue interacts with the palate during consonantal closure. The study uses an anatomically based two-dimensional biomechanical tongue model. Tissue elastic properties are accounted for in finite-element modeling, and movement is controlled by constant-rate control parameter shifts. Tongue raising and lowering movements are produced by the model mainly with the combined actions of the genioglossus, styloglossus, and hyoglossus. Simulations of V1CV2 movements were made, where C is a velar consonant and V is [a], [i], or [u]. Both vowels and consonants are specified in terms of targets, but for the consonant the target is virtual, and cannot be reached because it is beyond the surface of the palate. If V1 is the vowel [a] or [u], the resulting trajectory describes a movement that begins to loop forward before consonant closure and continues to slide along the palate during the closure. This pattern is very stable when moderate changes are made to the specification of the target consonant location and agrees with data published in the literature. If V1 is the vowel [i], looping patterns are also observed, but their orientation was quite sensitive to small changes in the location of the consonant target. These findings also agree with patterns of variability observed in measurements from human speakers, but they contradict data published by Houde [Ph.D. dissertation (1967)]. These observations support the idea that the biomechanical properties of the tongue could be the main factor responsible for the forward loops when V1 is a back vowel, regardless of whether V2 is a back vowel or a front vowel. In the [i] context it seems that additional factors have to be taken into consideration in order to explain the observations made on some speakers.

16.
A stratified random sample of 20 males and 20 females matched for physiological factors and cultural-linguistic markers was examined to determine differences in fundamental frequency and spectral characteristics during prolongation of three vowels: [a], [i], and [u]. The ethnic-gender breakdown included four sets of five male and five female subjects comprising Caucasian and African-American speakers of standard American English, native Hindi Indian speakers, and native Mandarin Chinese speakers. Acoustic measures were analyzed using the Multidimensional Voice Program (Kay Elemetrics, Lincoln Park, NJ) (Model 4305) from which fundamental frequency and associated acoustic spectra were extracted from a 200-ms sample of each vowel token. Statistically significant group differences for the main effects of culture, race, and gender were found. The acoustic differences found are attributed to biomechanical, physiological, cultural, and linguistic factors.

17.
Four experiments explored the relative contributions of spectral content and phonetic labeling in effects of context on vowel perception. Two 10-step series of CVC syllables ([bVb] and [dVd]) varying acoustically in F2 midpoint frequency and varying perceptually in vowel height from [delta] to [epsilon] were synthesized. In a forced-choice identification task, listeners more often labeled vowels as [delta] in [dVd] context than in [bVb] context. To examine whether spectral content predicts this effect, nonspeech-speech hybrid series were created by appending 70-ms sine-wave glides following the trajectory of CVC F2's to 60-ms members of a steady-state vowel series varying in F2 frequency. In addition, a second hybrid series was created by appending constant-frequency sine-wave tones equivalent in frequency to CVC F2 onset/offset frequencies. Vowels flanked by frequency-modulated glides or steady-state tones modeling [dVd] were more often labeled as [delta] than were the same vowels surrounded by nonspeech modeling [bVb]. These results suggest that spectral content is important in understanding vowel context effects. A final experiment tested whether spectral content can modulate vowel perception when phonetic labeling remains intact. Voiceless consonants, with lower-amplitude more-diffuse spectra, were found to exert less of an influence on vowel perception than do their voiced counterparts. The data are discussed in terms of a general perceptual account of context effects in speech perception.

18.
The purpose of this study was to determine whether children give more perceptual weight than do adults to dynamic spectral cues versus static cues. Listeners were ten children between the ages of 3;8 and 4;1 (mean 3;11) and ten adults between the ages of 23;10 and 32;0 (mean 25;11). Three experimental stimulus conditions were presented, with each containing stimuli of 30 ms duration. The first experimental condition consisted of unchanging formant onset frequencies ranging in value from frequencies for [i] to those for [a], appropriate for a bilabial stop consonant context. The second two experimental conditions consisted of either an [i] or [a] onset frequency with a 25 ms portion of a formant transition whose trajectory was toward one of a series of target frequencies ranging from those for [i] to those for [a]. Results indicated that the children attended differently than the adults to both the [a] and [i] formant onset frequency cues when identifying the vowels. The adults gave more equal weight to the [i]-onset and [a]-onset dynamic cues, as reflected in category boundaries, than the children did. For the [i]-onset condition, children were less confident than adults in vowel perception, as reflected in slope analyses.

19.
The goal of this study was to determine whether acoustic properties could be derived for English labial and alveolar nasal consonants that remain stable across vowel contexts, speakers, and syllable positions. In experiment I, critical band analyses were conducted of five tokens each of [m] and [n] followed by the vowels [i e a o u] spoken by three speakers. Comparison of the nature of the changes in the spectral patterns from the murmur to the release showed that, for labials, there was a greater change in energy in the region of Bark 5-7 relative to that of Bark 11-14, whereas, for alveolars, there was a greater change in energy from the murmur to the release in the region of Bark 11-14 relative to that of Bark 5-7. Quantitative analyses of each token indicated that over 89% of the utterances could be appropriately classified for place of articulation by comparing the proportion of energy change in these spectral regions. In experiment II, the spectral patterns of labial and alveolar nasals produced in the context of [s] + nasal ([m n]) + vowel ([i e a o u]) by two speakers were explored. The same analysis procedures were used as in experiment I. Eighty-four percent of the utterances were appropriately classified, although labial consonants were less consistently classified than in experiment I. The properties associated with nasal place of articulation found in this study are discussed in relation to those associated with place of articulation in stop consonants and are considered from the viewpoint of a more general theory of acoustic invariance.
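
The classification rule described in this abstract compares how much the energy changes from murmur to release in Bark bands 5-7 against Bark bands 11-14. The sketch below is one possible reading of that rule, not the study's procedure: it assumes per-band energies in dB are already available, takes the mean change in each region, and labels the token by whichever region changed more. The function name, the dB inputs, and the tie handling are all assumptions.

```python
from typing import Iterable, Mapping

LOW_BANDS = range(5, 8)     # Bark bands 5-7
HIGH_BANDS = range(11, 15)  # Bark bands 11-14


def classify_nasal_place(
    murmur_db: Mapping[int, float], release_db: Mapping[int, float]
) -> str:
    """Label a nasal as 'labial' or 'alveolar' from band energy changes.

    murmur_db and release_db map a Bark band index to its energy in dB.
    Assumption: mean dB change per region stands in for the study's
    'proportion of energy change'; ties are labeled 'alveolar'.
    """

    def mean_change(bands: Iterable[int]) -> float:
        band_list = list(bands)
        return sum(release_db[b] - murmur_db[b] for b in band_list) / len(band_list)

    low_change = mean_change(LOW_BANDS)
    high_change = mean_change(HIGH_BANDS)
    return "labial" if low_change > high_change else "alveolar"


# Hypothetical token: flat murmur, release rises mostly in Bark 5-7.
murmur = {band: 50.0 for band in range(1, 16)}
release = dict(murmur)
for band in LOW_BANDS:
    release[band] += 12.0
for band in HIGH_BANDS:
    release[band] += 4.0
label = classify_nasal_place(murmur, release)
```

Keeping the two band ranges as module-level constants makes it easy to try other spectral regions, for example when testing whether the same contrast holds in a different syllable position.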

20.
A model of a spectrum target prediction mechanism is proposed and evaluated by comparing predicted values with results of psychoacoustic experiments. When the trajectory of the cepstrally smoothed LPC spectrum is approximated by a second-order critically damped system, the proposed model can estimate target values using short-period spectrum sequences (50 ms) without being given the onset positions of the spectral transition. Additionally, this model decreases the length of transitional sounds and recovers vowel characteristics neutralized by coarticulation. Moreover, this model compensates for the transitions of syllables and extracts stable characteristics from syllable transitions. This model is applicable to coarticulation recovery in speech signal processing.
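
One way to see why a critically damped second-order model allows target estimation without knowing the transition onset: such a trajectory satisfies x'' + 2ωx' + ω²(x − T) = 0 at every instant, so the target is T = x + 2x'/ω + x''/ω² wherever the derivatives can be measured. The sketch below applies that identity to three equally spaced samples using central differences. It is an illustration under stated assumptions, not the model from the abstract: it treats a single scalar track rather than a spectrum, and it assumes the natural frequency `omega` is known. All names and values are hypothetical.

```python
import math


def estimate_target(
    x_prev: float, x_mid: float, x_next: float, dt: float, omega: float
) -> float:
    """Estimate the target T of a critically damped trajectory.

    Uses T = x + 2*x'/omega + x''/omega**2 with central differences.
    Assumption: omega (rad/s) is known in advance.
    """
    velocity = (x_next - x_prev) / (2.0 * dt)
    acceleration = (x_next - 2.0 * x_mid + x_prev) / (dt * dt)
    return x_mid + 2.0 * velocity / omega + acceleration / (omega * omega)


def critically_damped(t: float, start: float, target: float, omega: float) -> float:
    """Trajectory starting at rest at `start` and approaching `target`."""
    return target + (start - target) * (1.0 + omega * t) * math.exp(-omega * t)


# Hypothetical formant-like track: 1500 Hz moving toward 500 Hz.
omega, dt = 100.0, 0.001
samples = [critically_damped(t, 1500.0, 500.0, omega) for t in (0.009, 0.010, 0.011)]
estimate = estimate_target(samples[0], samples[1], samples[2], dt, omega)
```

The three samples are taken mid-transition, where the track is still far from 500 Hz, yet the estimate lands close to the target because the identity holds at any point on the trajectory. With noisy measurements the second difference would amplify the noise, so a practical version would fit over a longer window.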
