Similar articles
1.
The conditions under which listeners do and do not compensate for coarticulatory vowel nasalization were examined through a series of experiments on listeners' perception of naturally produced American English oral and nasal vowels spliced into three contexts: oral (C_C), nasal (N_N), and isolation. Two perceptual paradigms, a rating task in which listeners judged the relative nasality of stimulus pairs and a 4IAX discrimination task in which listeners judged vowel similarity, were used with two listener groups, native English speakers and native Thai speakers. Thai and English speakers were chosen because their languages differ in the temporal extent of anticipatory vowel nasalization. Listeners' responses were highly context dependent. For both perceptual paradigms and both language groups, listeners were less accurate at judging vowels in nasal than in non-nasal (oral or isolation) contexts; nasal vowels in nasal contexts were the most difficult to judge. Response patterns were generally consistent with the hypothesis that, given an appropriate and detectable nasal consonant context, listeners compensate for contextual vowel nasalization and attribute the acoustic effects of the nasal context to their coarticulatory source. However, the results also indicated that listeners do not hear nasal vowels in nasal contexts as oral; listeners retained some sensitivity to vowel nasalization in all contexts, indicating partial compensation for coarticulatory vowel nasalization. Moreover, there were small but systematic differences between the native Thai- and native English-speaking groups. These differences are as expected if perceptual compensation is partial and the extent of compensation is linked to patterns of coarticulatory nasalization in the listeners' native language.

2.
Acoustic characteristics of American English sentence stress produced by native Mandarin speakers are reported. Fundamental frequency (F0), vowel duration, and vowel intensity in the sentence-level stress produced by 40 Mandarin speakers were compared to those of 40 American English speakers. Results obtained from two methods of stress calculation indicated that Mandarin speakers of American English are able to differentiate stressed and unstressed words according to features of F0, duration, and intensity. Although the group of Mandarin speakers was able to signal stress in their sentence productions, the acoustic characteristics of stress were not identical to those of the American speakers. Mandarin speakers were found to produce stressed words with a significantly higher F0 and shorter duration compared to the American speakers. The groups also differed in production of unstressed words, with Mandarin speakers using a higher F0 and greater intensity compared to American speakers. Although the acoustic differences observed may reflect an interference of L1 Mandarin in the production of L2 American English, the outcome of this study suggests no critical divergence between these speakers in the way they implement American English sentence stress.

3.
This study examines cross-linguistic variation in the location of shared vowels in the vowel space across five languages (Cantonese, American English, Greek, Japanese, and Korean) and three age groups (2-year-olds, 5-year-olds, and adults). The vowels /a/, /i/, and /u/ were elicited in familiar words using a word repetition task. The productions of target words were recorded and transcribed by native speakers of each language. For correctly produced vowels, first and second formant frequencies were measured. In order to remove the effect of vocal tract size on these measurements, a normalization approach that calculates distance and angular displacement from the speaker centroid was adopted. Language-specific differences in the location of shared vowels in the formant values as well as the shape of the vowel spaces were observed for both adults and children.
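
The centroid-based normalization mentioned in abstract 3 can be pictured with a minimal sketch. The abstract does not say how the centroid is computed, which frequency scale is used, or how the angle is defined, so everything below is an assumption made for illustration: the centroid is taken as the mean of one speaker's (F1, F2) values in Hz, distance is Euclidean, and the angle is measured with `atan2` in the F1/F2 plane. The function name `centroid_normalize` and the formant values are hypothetical.

```python
import math

def centroid_normalize(formants):
    """Map each vowel's (F1, F2) to (distance, angle) from the speaker centroid.

    formants: dict of vowel label -> (F1, F2) in Hz for ONE speaker.
    Assumption: the centroid is the plain mean of that speaker's vowels.
    """
    n = len(formants)
    f1_centroid = sum(f1 for f1, _ in formants.values()) / n
    f2_centroid = sum(f2 for _, f2 in formants.values()) / n

    normalized = {}
    for vowel, (f1, f2) in formants.items():
        d1 = f1 - f1_centroid  # displacement along F1
        d2 = f2 - f2_centroid  # displacement along F2
        distance = math.hypot(d1, d2)  # Euclidean distance from the centroid
        angle = math.atan2(d1, d2)     # radians; this angle convention is assumed
        normalized[vowel] = (distance, angle)
    return normalized

# Hypothetical corner-vowel values for a single speaker (not from the study).
speaker = {"a": (800.0, 1200.0), "i": (300.0, 2300.0), "u": (300.0, 700.0)}
for vowel, (dist, ang) in centroid_normalize(speaker).items():
    print(f"{vowel}: distance={dist:.1f} Hz, angle={math.degrees(ang):.1f} deg")
```

Because each vowel is expressed relative to the speaker's own centroid, shifts that move all of a speaker's formants together cancel out. Raw Hz distances still scale with vocal tract size, so an actual analysis would probably work on a transformed frequency scale or rescale the distances; the abstract does not specify which.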

4.
Coarticulatory influences on the perceived height of nasal vowels
Certain of the complex spectral effects of vowel nasalization bear a resemblance to the effects of modifying the tongue or jaw position with which the vowel is produced. Perceptual evidence suggests that listener misperceptions of nasal vowel height arise as a result of this resemblance. Whereas previous studies examined isolated nasal vowels, this research focused on the role of phonetic context in shaping listeners' judgments of nasal vowel height. Identification data obtained from native American English speakers indicated that nasal coupling does not necessarily lead to listener misperceptions of vowel quality when the vowel's nasality is coarticulatory in nature. The perceived height of contextually nasalized vowels (in a [bVnd] environment) did not differ from that of oral vowels (in a [bVd] environment) produced with the same tongue-jaw configuration. In contrast, corresponding noncontextually nasalized vowels (in a [bVd] environment) were perceived as lower in quality than vowels in the other two conditions. Presumably the listeners' lack of experience with distinctive vowel nasalization prompted them to resolve the spectral effects of noncontextual nasalization in terms of tongue or jaw height, rather than velic height. The implications of these findings with respect to sound changes affecting nasal vowel height are also discussed.

5.
Cross-language perception studies report influences of speech style and consonantal context on perceived similarity and discrimination of non-native vowels by inexperienced and experienced listeners. Detailed acoustic comparisons of distributions of vowels produced by native speakers of North German (NG), Parisian French (PF) and New York English (AE) in citation (di)syllables and in sentences (surrounded by labial and alveolar stops) are reported here. Results of within- and cross-language discriminant analyses reveal striking dissimilarities across languages in the spectral/temporal variation of coarticulated vowels. As expected, vocalic duration was most important in differentiating NG vowels; it did not contribute to PF vowel classification. Spectrally, NG long vowels showed little coarticulatory change, but back/low short vowels were fronted/raised in alveolar context. PF vowels showed greater coarticulatory effects overall; back and front rounded vowels were fronted, low and mid-low vowels were raised in both sentence contexts. AE mid to high back vowels were extremely fronted in alveolar contexts, with little change in mid-low and low long vowels. Cross-language discriminant analyses revealed varying patterns of spectral (dis)similarity across speech styles and consonantal contexts that could, in part, account for AE listeners' perception of German and French front rounded vowels, and "similar" mid-high to mid-low vowels.

6.
This study examined the effect of linguistic experience on perception of the English /s/-/z/ contrast in word-final position. The durations of the periodic ("vowel") and aperiodic ("fricative") portions of stimuli, ranging from peas to peace, were varied in a 5 × 5 factorial design. Forced-choice identification judgments were elicited from two groups of native speakers of American English differing in dialect, and from two groups each of native speakers of French, Swedish, and Finnish differing in English-language experience. The results suggested that the non-native subjects used cues established for the perception of phonetic contrasts in their native language to identify fricatives as /s/ or /z/. Lengthening vowel duration increased /z/ judgments in all eight subject groups, although the effect was smaller for native speakers of French than for native speakers of the other languages. Shortening fricative duration, on the other hand, significantly decreased /z/ judgments only by the English and French subjects. It did not influence voicing judgments by the Swedish and Finnish subjects, even those who had lived for a year or more in an English-speaking environment. These findings raise the question of whether adults who learn a foreign language can acquire the ability to integrate multiple acoustic cues to a phonetic contrast which does not exist in their native language.

7.
Recent studies have demonstrated that mothers exaggerate phonetic properties of infant-directed (ID) speech. However, these studies focused on a single acoustic dimension (frequency), whereas speech sounds are composed of multiple acoustic cues. Moreover, little is known about how mothers adjust phonetic properties of speech to children with hearing loss. This study examined mothers' production of frequency and duration cues to the American English tense/lax vowel contrast in speech to profoundly deaf (N = 14) and normal-hearing (N = 14) infants, and to an adult experimenter. First and second formant frequencies and vowel duration of tense (/i/, /u/) and lax (/I/, /ʊ/) vowels were measured. Results demonstrated that for both infant groups mothers hyperarticulated the acoustic vowel space and increased vowel duration in ID speech relative to adult-directed speech. Mean F2 values were decreased for the /u/ vowel and increased for the /I/ vowel, and vowel duration was longer for the /i/, /u/, and /I/ vowels in ID speech. However, neither acoustic cue differed between speech to hearing-impaired infants and speech to normal-hearing infants. These results suggest that both formant frequencies and vowel duration that differentiate American English tense/lax vowel contrasts are modified in ID speech regardless of the hearing status of the addressee.

8.
Acoustic and perceptual similarities between Japanese and American English (AE) vowels were investigated in two studies. In study 1, a series of discriminant analyses were performed to determine acoustic similarities between Japanese and AE vowels, each spoken by four native male speakers using F1, F2, and vocalic duration as input parameters. In study 2, the Japanese vowels were presented to native AE listeners in a perceptual assimilation task, in which the listeners categorized each Japanese vowel token as most similar to an AE category and rated its goodness as an exemplar of the chosen AE category. Results showed that the majority of AE listeners assimilated all Japanese vowels into long AE categories, apparently ignoring temporal differences between 1- and 2-mora Japanese vowels. In addition, not all perceptual assimilation patterns reflected context-specific spectral similarity patterns established by discriminant analysis. It was hypothesized that this incongruity between acoustic and perceptual similarity may be due to differences in distributional characteristics of native and non-native vowel categories that affect the listeners' perceptual judgments.

9.
Previous studies of vowel perception have shown that adult speakers of American English and of North German identify native vowels by exploiting at least three types of acoustic information contained in consonant-vowel-consonant (CVC) syllables: target spectral information reflecting the articulatory target of the vowel, dynamic spectral information reflecting CV- and -VC coarticulation, and duration information. The present study examined the contribution of each of these three types of information to vowel perception in prelingual infants and adults using a discrimination task. Experiment 1 examined German adults' discrimination of four German vowel contrasts (see text), originally produced in /dVt/ syllables, in eight experimental conditions in which the type of vowel information was manipulated. Experiment 2 examined German-learning infants' discrimination of the same vowel contrasts using a comparable procedure. The results show that German adults and German-learning infants appear able to use either dynamic spectral information or target spectral information to discriminate contrasting vowels. With respect to duration information, the removal of this cue selectively affected the discriminability of two of the vowel contrasts for adults. However, for infants, removal of contrastive duration information had a larger effect on the discrimination of all contrasts tested.

10.
Cross-generational and cross-dialectal variation in vowels among speakers of American English was examined in terms of vowel identification by listeners and vowel classification using pattern recognition. Listeners from Western North Carolina and Southeastern Wisconsin identified 12 vowel categories produced by 120 speakers stratified by age (old adults, young adults, and children), gender, and dialect. The vowels /?, o, ?, u/ were well identified by both groups of listeners. The majority of confusions were for the front /i, ?, e, ?, ?/, the low back /ɑ, ?/ and the monophthongal North Carolina /a?/. For selected vowels, generational differences in acoustic vowel characteristics were perceptually salient, suggesting listeners' responsiveness to sound change. Female exemplars and native-dialect variants produced higher identification rates. Linear discriminant analyses which examined dialect and generational classification accuracy showed that sampling the formant pattern at vowel midpoint only is insufficient to separate the vowels. Two sample points near onset and offset provided enough information for successful classification. The models trained on one dialect classified the vowels from the other dialect with much lower accuracy. The results strongly support the importance of dynamic information in accurate classification of cross-generational and cross-dialectal variations.

11.
Changes in magnitude and variability of duration, fundamental frequency, formant frequencies, and spectral envelope of children's speech are investigated as a function of age and gender using data obtained from 436 children, ages 5 to 17 years, and 56 adults. The results confirm that the reduction in magnitude and within-subject variability of both temporal and spectral acoustic parameters with age is a major trend associated with speech development in normal children. Between ages 9 and 12, both magnitude and variability of segmental durations decrease significantly and rapidly, converging to adult levels around age 12. Within-subject fundamental frequency and formant-frequency variability, however, may reach adult range about 2 or 3 years later. Differentiation of male and female fundamental frequency and formant frequency patterns begins at around age 11, becoming fully established around age 15. During that time period, changes in vowel formant frequencies of male speakers are approximately linear with age, while such a linear trend is less obvious for female speakers. These results support the hypothesis of uniform axial growth of the vocal tract for male speakers. The study also shows evidence for an apparent overshoot in acoustic parameter values, somewhere between ages 13 and 15, before converging to the canonical levels for adults. For instance, teenagers around age 14 differ from adults in that, on average, they show shorter segmental durations and exhibit less within-subject variability in durations, fundamental frequency, and spectral envelope measures.

12.
For each of five vowels [i e a o u] following [t], a continuum from non-nasal to nasal was synthesized. Nasalization was introduced by inserting a pole-zero pair in the vicinity of the first formant in an all-pole transfer function. The frequencies and spacing of the pole and zero were systematically varied to change the degree of nasalization. The selection of stimulus parameters was determined from acoustic theory and the results of pilot experiments. The stimuli were presented for identification and discrimination to listeners whose language included a non-nasal–nasal vowel opposition (Gujarati, Hindi, and Bengali) and to American listeners. There were no significant differences between language groups in the 50% crossover points of the identification functions. Some vowels were more influenced by range and context effects than were others. The language groups showed some differences in the shape of the discrimination functions for some vowels. On the basis of the results, it is postulated that (1) there is a basic acoustic property of nasality, independent of the vowel, to which the auditory system responds in a distinctive way regardless of language background; and (2) there are one or more additional acoustic properties that may be used to various degrees in different languages to enhance the contrast between a nasal vowel and its non-nasal congener. A proposed candidate for the basic acoustic property is a measure of the degree of prominence of the spectral peak in the vicinity of the first formant. Additional secondary properties include shifts in the center of gravity of the low-frequency spectral prominence, leading to a change in perceived vowel height, and changes in overall spectral balance.
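
The idea in abstract 12 of adding a pole-zero pair to an all-pole transfer function can be sketched numerically. This is only an illustration under stated assumptions, not the synthesizer used in the study: the sampling rate, formant frequencies, bandwidths, and pole/zero placements are hypothetical values, and each resonance is modeled as a standard second-order digital section with a conjugate root pair.

```python
import cmath
import math

FS = 10000.0  # sampling rate in Hz (assumed for illustration)

def second_order(freq_hz, bw_hz, z):
    """Evaluate 1 - 2 r cos(theta) z^-1 + r^2 z^-2 for a conjugate root pair."""
    r = math.exp(-math.pi * bw_hz / FS)    # root radius set by the bandwidth
    theta = 2 * math.pi * freq_hz / FS     # root angle set by the center frequency
    return 1 - 2 * r * math.cos(theta) / z + (r * r) / (z * z)

def magnitude_db(f_hz, formants, pole=None, zero=None):
    """Magnitude (dB) at f_hz of an all-pole vowel filter, optionally with
    one extra pole pair and one zero pair (the 'nasal' pole-zero pair)."""
    z = cmath.exp(2j * math.pi * f_hz / FS)
    h = 1.0 + 0j
    for freq, bw in formants:
        h /= second_order(freq, bw, z)     # each formant is a pole pair
    if pole is not None:
        h /= second_order(*pole, z)        # extra pole pair
    if zero is not None:
        h *= second_order(*zero, z)        # zero pair (anti-resonance)
    return 20 * math.log10(abs(h))

# Hypothetical oral vowel: (frequency, bandwidth) in Hz for F1..F3.
formants = [(500.0, 80.0), (1500.0, 100.0), (2500.0, 120.0)]
for f in (300.0, 500.0, 700.0, 900.0):
    oral = magnitude_db(f, formants)
    nasal = magnitude_db(f, formants, pole=(400.0, 100.0), zero=(700.0, 100.0))
    print(f"{f:6.0f} Hz  oral={oral:7.2f} dB  nasalized={nasal:7.2f} dB")
```

When the pole and zero are given the same frequency and bandwidth they cancel and the oral spectrum is unchanged; moving them apart reshapes the spectrum near the first formant, which is one way to picture how varying their frequencies and spacing could yield a continuum of nasalization.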

13.
Because they consist, in large part, of random turbulent noise, fricatives present a challenge to attempts to specify the phonetic correlates of phonological features. Previous research has focused on temporal properties, acoustic power, and a variety of spectral properties of fricatives in a number of contexts [Jongman et al., J. Acoust. Soc. Am. 108, 1252-1263 (2000); Jesus and Shadle, J. Phonet. 30, 437-467 (2002); Crystal and House, J. Acoust. Soc. Am. 83, 1553-1573 (1988a)]. However, no systematic investigation of the effects of focus and prosodic context on fricative production has been carried out. Manipulation of explicit focus can serve to selectively exaggerate linguistically relevant properties of speech in much the same manner as stress [de Jong, J. Acoust. Soc. Am. 97, 491-504 (1995); de Jong, J. Phonet. 32, 493-516 (2004); de Jong and Zawaydeh, J. Phonet. 30, 53-75 (2002)]. This experimental technique was exploited to investigate acoustic power along with temporal and spectral characteristics of American English fricatives in two prosodic contexts, to probe whether native speakers selectively attend to subsegmental features, and to consider variability in fricative production across speakers. While focus in general increased noise power and duration, speakers did not selectively enhance spectral features of the target fricatives.

14.
This study investigated the extent to which adult Japanese listeners' perceived phonetic similarity of American English (AE) and Japanese (J) vowels varied with consonantal context. Four AE speakers produced multiple instances of the 11 AE vowels in six syllabic contexts /b-b, b-p, d-d, d-t, g-g, g-k/ embedded in a short carrier sentence. Twenty-four native speakers of Japanese were asked to categorize each vowel utterance as most similar to one of 18 Japanese categories [five one-mora vowels, five two-mora vowels, plus /ei, ou/ and one-mora and two-mora vowels in palatalized consonant CV syllables, C(j)a(a), C(j)u(u), C(j)o(o)]. They then rated the "category goodness" of the AE vowel to the selected Japanese category on a seven-point scale. None of the 11 AE vowels was assimilated unanimously to a single J response category in all context/speaker conditions; consistency in selecting a single response category ranged from 77% for /eI/ to only 32% for /ae/. Median ratings of category goodness for modal response categories were somewhat restricted overall, ranging from 5 to 3. Results indicated that temporal assimilation patterns (judged similarity to one-mora versus two-mora Japanese categories) differed as a function of the voicing of the final consonant, especially for the AE vowels, /see text/. Patterns of spectral assimilation (judged similarity to the five J vowel qualities) of /see text/ also varied systematically with consonantal context and speakers. On the basis of these results, it was predicted that relative difficulty in the identification and discrimination of AE vowels by Japanese speakers would vary significantly as a function of the contexts in which they were produced and presented.

15.
The primary aim of this study was to determine if adults whose native language permits neither voiced nor voiceless stops to occur in word-final position can master the English word-final /t/-/d/ contrast. Native English-speaking listeners identified the voicing feature in word-final stops produced by talkers in five groups: native speakers of English, experienced and inexperienced native Spanish speakers of English, and experienced and inexperienced native Mandarin speakers of English. Contrary to hypothesis, the experienced second language (L2) learners' stops were not identified significantly better than stops produced by the inexperienced L2 learners; and their stops were correctly identified significantly less often than stops produced by the native English speakers. Acoustic analyses revealed that the native English speakers made vowels significantly longer before /d/ than /t/, produced /t/-final words with a higher F1 offset frequency than /d/-final words, produced more closure voicing in /d/ than /t/, and sustained closure longer for /t/ than /d/. The L2 learners produced the same kinds of acoustic differences between /t/ and /d/, but theirs were usually of significantly smaller magnitude. Taken together, the results suggest that only a few of the 40 L2 learners examined in the present study had mastered the English word-final /t/-/d/ contrast. Several possible explanations for this negative finding are presented. Multiple regression analyses revealed that the native English listeners made perceptual use of the small, albeit significant, vowel duration differences produced in minimal pairs by the nonnative speakers. A significantly stronger correlation existed between vowel duration differences and the listeners' identifications of final stops in minimal pairs when the perceptual judgments were obtained in an "edited" condition (where post-vocalic cues were removed) than in a "full cue" condition. This suggested that listeners may modify their identification of stops based on the availability of acoustic cues.

16.
Native speakers of Mandarin Chinese have difficulty producing native-like English stress contrasts. Acoustically, English lexical stress is multidimensional, involving manipulation of fundamental frequency (F0), duration, intensity and vowel quality. Errors in any or all of these correlates could interfere with perception of the stress contrast, but it is unknown which correlates are most problematic for Mandarin speakers. This study compares the use of these correlates in the production of lexical stress contrasts by 10 Mandarin and 10 native English speakers. Results showed that Mandarin speakers produced significantly less native-like stress patterns, although they did use all four acoustic correlates to distinguish stressed from unstressed syllables. Mandarin and English speakers' use of amplitude and duration were comparable for both stressed and unstressed syllables, but Mandarin speakers produced stressed syllables with a higher F0 than English speakers. There were also significant differences in formant patterns across groups, such that Mandarin speakers produced English-like vowel reduction in certain unstressed syllables, but not in others. Results suggest that Mandarin speakers' production of lexical stress contrasts in English is influenced partly by native-language experience with Mandarin lexical tones, and partly by similarities and differences between Mandarin and English vowel inventories.

17.
Perception of second language speech sounds is influenced by one's first language. For example, speakers of American English have difficulty perceiving dental versus retroflex stop consonants in Hindi although English has both dental and retroflex allophones of alveolar stops. Japanese, unlike English, has a contrast similar to Hindi, specifically, the Japanese /d/ versus the flapped /r/ which is sometimes produced as a retroflex. This study compared American and Japanese speakers' identification of the Hindi contrast in CV syllable contexts where C varied in voicing and aspiration. The study then evaluated the participants' improvement in identifying the distinction after training with a computer-interactive program. Training sessions progressively increased in difficulty by decreasing the extent of vowel truncation in stimuli and by adding new speakers. Although all participants improved significantly, Japanese participants were more accurate than Americans in distinguishing the contrast on pretest, during training, and on posttest. Transfer was observed to three new consonantal contexts, a new vowel context, and a new speaker's productions. Some abstract aspect of the contrast was apparently learned during training. It is suggested that allophonic experience with dental and retroflex stops may be detrimental to perception of the new contrast.

18.
Vowel durations typically vary according to both intrinsic (segment-specific) and extrinsic (contextual) specifications. It can be argued that such variations are due to both predisposition and cognitive learning. The present report utilizes acoustic phonetic measurements from Swedish and American children aged 24 and 30 months to investigate the hypothesis that default behaviors may precede language-specific learning effects. The predicted pattern is the presence of final consonant voicing effects in both languages as a default, and subsequent learning of intrinsic effects most notably in the Swedish children. The data, from 443 monosyllabic tokens containing high-front vowels and final stop consonants, are analyzed in statistical frameworks at group and individual levels. The results confirm that Swedish children show an early tendency to vary vowel durations according to final consonant voicing, followed only six months later by a stage at which the intrinsic influence of vowel identity grows relatively more robust. Measures of vowel formant structure from selected 30-month-old children also revealed a tendency for children of this age to focus on particular acoustic contrasts. In conclusion, the results indicate that early acquisition of vowel specifications involves an interaction between language-specific features and articulatory predispositions associated with phonetic context.

19.
Current theories of cross-language speech perception claim that patterns of perceptual assimilation of non-native segments to native categories predict relative difficulties in learning to perceive (and produce) non-native phones. Cross-language spectral similarity of North German (NG) and American English (AE) vowels produced in isolated hVC(a) (di)syllables (study 1) and in hVC syllables embedded in a short sentence (study 2) was determined by discriminant analyses, to examine the extent to which acoustic similarity was predictive of perceptual similarity patterns. The perceptual assimilation of NG vowels to native AE vowel categories by AE listeners with no German language experience was then assessed directly. Both studies showed that acoustic similarity of AE and NG vowels did not always predict perceptual similarity, especially for "new" NG front rounded vowels and for "similar" NG front and back mid and mid-low vowels. Both acoustic and perceptual similarity of NG and AE vowels varied as a function of the prosodic context, although vowel duration differences did not affect perceptual assimilation patterns. When duration and spectral similarity were in conflict, AE listeners assimilated vowels on the basis of spectral similarity in both prosodic contexts.

20.
As part of an investigation of the temporal implementation rules of English, measurements were made of voice-onset time for initial English stops and the duration of the following voiced vowel in monosyllabic words for New York City speakers. It was found that the VOT of a word-initial consonant was longer before a voiceless final cluster than before a single nasal, and longer before tense vowels than lax vowels. The vowels were also longer in environments where VOT was longer, but VOT did not maintain a constant ratio with the vowel duration, even for a single place of articulation. VOT was changed by a smaller proportion than the following voiced vowel in both cases. VOT changes associated with the vowel were consistent across place of articulation of the stop. In the final experiment, when vowel tensity and final consonant effects were combined, it was found that the proportion of vowel duration change that carried over to the preceding VOT is different for the two phonetic changes. These results imply that temporal implementation rules simultaneously influence several acoustic intervals including both VOT and the "inherent" interval corresponding to a segment, either by independent control of the relevant articulatory variables or by some unknown common mechanism.

