Similar literature
 20 similar articles found
1.
The role of language-specific factors in phonetically based trading relations was examined by assessing the ability of 20 native Japanese speakers to identify and discriminate stimuli of two synthetic /r/-/l/ series that varied temporal and spectral parameters independently. Results of forced-choice identification and oddity discrimination tasks showed that the nine Japanese subjects who were able to identify /r/ and /l/ reliably demonstrated a trading relation similar to that of Americans. Discrimination results reflected the perceptual equivalence of temporal and spectral parameters. Discrimination by the 11 Japanese subjects who were unable to identify the /r/-/l/ series differed significantly from that of the skilled Japanese subjects and native English speakers. However, their performance could not be predicted on the basis of acoustic dissimilarity alone. These results provide evidence that the trading relation between temporal and spectral cues for the /r/-/l/ contrast is not solely attributable to general auditory or language-universal phonetic processing constraints, but rather is also a function of phonemic processes that can be modified in the course of learning a second language.

2.
Eight monolingual Japanese listeners were trained to identify English /r/ and /l/ by using 560 training tokens produced by ten talkers in three positions (200 word-initial, 200 consonant cluster, and 160 intervocalic tokens). Their baseline performance and transfer of learning were measured using 200 word-initial and 200 consonant cluster tokens produced by ten additional talkers. Long-term training (15 days) with feedback indeed increased sensitivity to the nontraining tokens, but tremendous individual differences were found in terms of initial and final sensitivity and response bias. Even after training, however, there remained some tokens for each subject that were misidentified at a level significantly below chance, suggesting that truly nativelike identification of /r/ and /l/ may never be achieved by adult Japanese learners of English.

3.
Recent work [Iverson et al. (2003) Cognition, 87, B47-57] has suggested that Japanese adults have difficulty learning English /r/ and /l/ because they are overly sensitive to acoustic cues that are not reliable for /r/-/l/ categorization (e.g., F2 frequency). This study investigated whether cue weightings are altered by auditory training, and compared the effectiveness of different training techniques. Separate groups of subjects received High Variability Phonetic Training (natural words from multiple talkers), and three techniques in which the natural recordings were altered via signal processing (All Enhancement, with F3 contrast maximized and closure duration lengthened; Perceptual Fading, with F3 enhancement reduced during training; and Secondary Cue Variability, with variation in F2 and durations increased during training). The results demonstrated that all of the training techniques improved /r/-/l/ identification by Japanese listeners, but there were no differences between the techniques. Training also altered the use of secondary acoustic cues; listeners became biased to identify stimuli as English /l/ when the cues made them similar to the Japanese /r/ category, and reduced their use of secondary acoustic cues for stimuli that were dissimilar to Japanese /r/. The results suggest that both category assimilation and perceptual interference affect English /r/ and /l/ acquisition.

4.
Humans were trained to categorize problem non-native phonemes using an animal psychoacoustic procedure that trains monkeys to greater than 90% correct in phoneme identification [Sinnott and Gilmore, Percept. Psychophys. 66, 1341-1350 (2004)]. This procedure uses a manual left versus right response on a lever, a continuously repeated stimulus on each trial, extensive feedback for errors in the form of a repeated correction procedure, and training until asymptotic levels of performance. Here, Japanese listeners categorized the English liquid contrast /r-l/, and English listeners categorized the Middle Eastern dental-retroflex contrast /d-D/. Consonant-vowel stimuli were constructed using four talkers and four vowels. Native listeners and phoneme contrasts familiar to all listeners were included as controls. Responses were analyzed using percent correct, response time, and vowel context effects as measures. All measures indicated nativelike Japanese perception of /r-l/ after 32 daily training sessions, but this was not the case for English perception of /d-D/. Results are related to the concept of "robust" (more easily recovered) versus "fragile" (more easily lost) phonetic contrasts [Burnham, Appl. Psycholing. 7, 207-240 (1986)].

5.
This study examined the effect of linguistic experience on perception of the English /s/-/z/ contrast in word-final position. The durations of the periodic ("vowel") and aperiodic ("fricative") portions of stimuli, ranging from peas to peace, were varied in a 5 × 5 factorial design. Forced-choice identification judgments were elicited from two groups of native speakers of American English differing in dialect, and from two groups each of native speakers of French, Swedish, and Finnish differing in English-language experience. The results suggested that the non-native subjects used cues established for the perception of phonetic contrasts in their native language to identify fricatives as /s/ or /z/. Lengthening vowel duration increased /z/ judgments in all eight subject groups, although the effect was smaller for native speakers of French than for native speakers of the other languages. Shortening fricative duration, on the other hand, significantly decreased /z/ judgments only by the English and French subjects. It did not influence voicing judgments by the Swedish and Finnish subjects, even those who had lived for a year or more in an English-speaking environment. These findings raise the question of whether adults who learn a foreign language can acquire the ability to integrate multiple acoustic cues to a phonetic contrast which does not exist in their native language.

6.
This study examined the perceptual specialization for native-language speech sounds, by comparing native Hindi and English speakers in their perception of a graded set of English /w/-/v/ stimuli that varied in similarity to natural speech. The results demonstrated that language experience does not affect general auditory processes for these types of sounds; there were strong cross-language differences for speech stimuli, and none for stimuli that were nonspeech. However, the cross-language differences extended into a gray area of speech-like stimuli that were difficult to classify, suggesting that the specialization occurred in phonetic processing prior to categorization.

7.
This study examined the production of English /b/ and the perception of short-lag English /b d g/ tokens by four groups of bilinguals who differed according to their age of arrival (AOA) in Canada from Italy and amount of self-reported native language (L1) use. A clear difference emerged between early bilinguals (mean AOA = 8 years) and late bilinguals (mean AOA = 20 years). The late bilinguals showed a stronger L1 influence than the early bilinguals did on both the production and perception of English stops. In experiment 2, the late bilinguals produced a larger percentage of prevoiced English /b/ tokens than early bilinguals and native English (NE) speakers did. In experiment 3, the late bilinguals misidentified short-lag English /b d g/ tokens as /p t k/ more often than the early bilinguals and NE speakers did. Experiment 4 revealed that the frequencies with which the bilinguals prevoiced /b d g/ in Italian and English were correlated. The observed differences between the early and late bilinguals were attributed to differences in the quantity and quality of English phonetic input they had received, not to a greater likelihood by the early than late bilinguals to establish new phonetic categories for English /b d g/.

8.
The American English phoneme /r/ has long been associated with large amounts of articulatory variability during production. This paper investigates the hypothesis that the articulatory variations used by a speaker to produce /r/ in different contexts exhibit systematic tradeoffs, or articulatory trading relations, that act to maintain a relatively stable acoustic signal despite the large variations in vocal tract shape. Acoustic and articulatory recordings were collected from seven speakers producing /r/ in five phonetic contexts. For every speaker, the different articulator configurations used to produce /r/ in the different phonetic contexts showed systematic tradeoffs, as evidenced by significant correlations between the positions of transducers mounted on the tongue. Analysis of acoustic and articulatory variabilities revealed that these tradeoffs act to reduce acoustic variability, thus allowing relatively large contextual variations in vocal tract shape for /r/ without seriously degrading the primary acoustic cue. Furthermore, some subjects appeared to use completely different articulatory gestures to produce /r/ in different phonetic contexts. When viewed in light of current models of speech movement control, these results appear to favor models that utilize an acoustic or auditory target for each phoneme over models that utilize a vocal tract shape target for each phoneme.

9.
Native English speakers were trained to identify Japanese vowel length in three types of training differing in sentential speaking rate: slow-only, fast-only, and slow-fast. Following Pisoni and Lively's high phonetic variability hypothesis [Pisoni, D. B., and Lively, S. E., Speech Perception and Linguistic Experience, 433-459 (1995)], higher stimulus variability by means of training with two rates was hypothesized to aid learners in adapting to speech rate variation more effectively than training with only one rate. Trained participants identified the length of the second vowel of disyllables, short or long, embedded in a sentence of the respective rate, and received immediate feedback. The three trained groups' abilities before and after training were examined with tests containing sentences of slow, normal, and fast rates, and were compared with those of a control group that was not trained. A robust effect of slow-fast training, a marginal effect of slow-only training, but no significant effect of fast-only training were found in the overall test scores. Slow-fast and slow-only training showed small advantages over fast-only training on the fast-rate test scores, while effects for all three training types were found on the slow- and normal-rate test scores. The degree to which the results support the high phonetic variability hypothesis is discussed.

10.
Perception of second language speech sounds is influenced by one's first language. For example, speakers of American English have difficulty perceiving dental versus retroflex stop consonants in Hindi although English has both dental and retroflex allophones of alveolar stops. Japanese, unlike English, has a contrast similar to Hindi, specifically, the Japanese /d/ versus the flapped /r/ which is sometimes produced as a retroflex. This study compared American and Japanese speakers' identification of the Hindi contrast in CV syllable contexts where C varied in voicing and aspiration. The study then evaluated the participants' increase in identifying the distinction after training with a computer-interactive program. Training sessions progressively increased in difficulty by decreasing the extent of vowel truncation in stimuli and by adding new speakers. Although all participants improved significantly, Japanese participants were more accurate than Americans in distinguishing the contrast on pretest, during training, and on posttest. Transfer was observed to three new consonantal contexts, a new vowel context, and a new speaker's productions. Some abstract aspect of the contrast was apparently learned during training. It is suggested that allophonic experience with dental and retroflex stops may be detrimental to perception of the new contrast.

11.
This study examined the ability of six-month-old infants to recognize the perceptual similarity of syllables sharing a phonetic segment when variations were introduced in phonetic environment and talker. Infants in a "phonetic" group were visually reinforced for head turns when a change occurred from a background category of labial nasals to a comparison category of alveolar nasals. The infants were initially trained on a [ma]-[na] contrast produced by a male talker. Novel tokens differing in vowel environment and talker were introduced over several stages of increasing complexity. In the most complex stage infants were required to make a head turn when a change occurred from [ma,mi,mu] to [na,ni,nu], with the tokens in each category produced by both male and female talkers. A "nonphonetic" control group was tested using the same pool of stimuli as the phonetic condition. The only difference was that the stimuli in the background and comparison categories were chosen in such a way that the sounds could not be organized by acoustic or phonetic characteristics. Infants in the phonetic group transferred training to novel tokens produced by different talkers and in different vowel contexts. However, infants in the nonphonetic control group had difficulty learning the phonetically unrelated tokens that were introduced as the experiment progressed. These findings suggest that infants recognize the similarity of nasal consonants sharing place of articulation independent of variation in talker and vowel context.

12.
This study examined imitation of a voice onset time (VOT) continuum ranging from /da/ to /ta/ by subjects differing in age and/or linguistic experience. The subjects did not reproduce the incremental increases in VOT linearly, but instead showed abrupt shifts in VOT between two or three VOT response "modes." The location of the response shifts occurred at the same location as phoneme boundaries obtained in a previous identification experiment. This supports the view that the stimuli were categorized before being imitated. Children and adults who spoke just Spanish generally produced only lead and short-lag VOT responses. English monolinguals tended to produce stops with only short-lag and long-lag VOT values. The native Spanish adults and children who spoke English, on the other hand, produced stops with VOT values falling into all three modal VOT ranges. This was interpreted to mean that they had established a phonetic category [tʰ] with which to implement the voiceless aspirated realizations of /t/ in English. Their inability to produce English /p,t,k/ with the same values as native speakers of English must therefore be attributed to the information specified in their new English phonetic categories (which might be incorrect as the result of exposure to Spanish-accented English), to partially formed phonetic realization rules, or both.

13.
Perceptual equivalence of acoustic cues that differentiate /r/ and /l/   Total citations: 1, self-citations: 0, citations by others: 1
The perceptual effects of orthogonal variations in two acoustic parameters which differentiate American English prevocalic /r/ and /l/ were examined. A spectral cue (frequency onset and transition of F2 and F3) and a temporal cue (relative duration of initial steady state and transition of F1) were varied in synthetic versions of "rock" and "lock." Four temporal variations in each of ten stimuli of a spectral-cue continuum were generated. Phonetic identification and oddity discrimination tasks with the four series showed systematic displacement of perceptual boundaries and discrimination peaks, thus reflecting a trading relation between the two cues. The perceptual equivalence of spectral and temporal cues was investigated by comparing the accuracy of discrimination of three types of stimulus comparisons: phonetically facilitating two-cue pairs, one-cue pairs, and phonetically conflicting two-cue pairs. As predicted, discrimination accuracy was ordered facilitating two-cue pairs > one-cue pairs > conflicting two-cue pairs, indicating that perceivers discriminated on the basis of an integrated phonetic percept.

14.

Background

Tone languages such as Thai and Mandarin Chinese use differences in fundamental frequency (F0, pitch) to distinguish lexical meaning. Previous behavioral studies have shown that native speakers of a non-tone language have difficulty discriminating among tone contrasts and are sensitive to different F0 dimensions than speakers of a tone language. The aim of the present ERP study was to investigate the effect of language background and training on the non-attentive processing of lexical tones. EEG was recorded from 12 adult native speakers of Mandarin Chinese, 12 native speakers of American English, and 11 Thai speakers while they were watching a movie and were presented with multiple tokens of low-falling, mid-level and high-rising Thai lexical tones. High-rising or low-falling tokens were presented as deviants among mid-level standard tokens, and vice versa. EEG data and data from a behavioral discrimination task were collected before and after a two-day perceptual categorization training task.

Results

Behavioral discrimination improved after training in both the Chinese and the English groups. Low-falling tone deviants versus standards elicited a mismatch negativity (MMN) in all language groups. Before, but not after training, the English speakers showed a larger MMN compared to the Chinese, even though English speakers performed worst in the behavioral tasks. The MMN was followed by a late negativity, which became smaller with improved discrimination. The high-rising deviants versus standards elicited a late negativity, which was left-lateralized only in the English and Chinese groups.

Conclusion

Results showed that native speakers of English, Chinese and Thai recruited largely similar mechanisms when non-attentively processing Thai lexical tones. However, native Thai speakers differed from the Chinese and English speakers with respect to the processing of late F0 contour differences (high-rising versus mid-level tones). In addition, native speakers of a non-tone language (English) were initially more sensitive to F0 onset differences (low-falling versus mid-level contrast), which was suppressed as a result of training. This result converges with results from previous behavioral studies and supports the view that attentive as well as non-attentive processing of F0 contrasts is affected by language background, but is malleable even in adult learners.

15.
Using only three measures of the waveform, the zero-crossing rate, the logarithm of the root-mean-square (rms) energy, and the derivative of the log rms energy with respect to time [termed rate of rise (ROR)], voiceless plosives (including affricates) can be distinguished from voiceless fricatives in word-initial, medial, and final positions. Peaks in the ROR contour are considered for significance to the plosive/fricative distinction by examining the log rms energy and zero-crossing rate. Then, the magnitude of the first significant peak in the ROR contour is used as the primary classifier. The algorithm was tested on 1364 tokens (720 word-initial tokens produced by four female and four male speakers; 360 word-medial tokens produced by two males and two females; 320 word-final tokens produced by two males and two females). Data from two male and two female speakers (360 word-initial tokens) were used as a training set, and the remaining data were used as a test set. The overall rate of correct classification was 96.8%. Implications of this result are discussed.
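
The three measures named in abstract 15 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 5 ms frame length, every threshold, the rule for what counts as a "significant" peak, and the synthetic test signal (`synth`) are all assumptions made here for illustration, since the abstract does not give those details.

```python
import math
import random

FRAME_S = 0.005  # analysis frame length in seconds (assumed, not from the abstract)

def frame_features(signal, sample_rate):
    """Per-frame zero-crossing rate (crossings/s) and log rms energy (dB)."""
    n = int(sample_rate * FRAME_S)
    zcr, log_rms = [], []
    for start in range(0, len(signal) - n + 1, n):
        frame = signal[start:start + n]
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        zcr.append(crossings / FRAME_S)
        rms = math.sqrt(sum(x * x for x in frame) / n)
        log_rms.append(20 * math.log10(max(rms, 1e-6)))  # clamp to avoid log(0)
    return zcr, log_rms

def rate_of_rise(log_rms):
    """Finite-difference derivative of log rms energy, in dB per second."""
    return [(b - a) / FRAME_S for a, b in zip(log_rms, log_rms[1:])]

def classify(signal, sample_rate, min_db=-40.0, min_zcr=2000.0, ror_threshold=2000.0):
    """Label a token by the size of its first significant ROR peak.

    All three thresholds are hypothetical values chosen for the synthetic
    signals below; a real system would fit them on a training set.
    """
    zcr, log_rms = frame_features(signal, sample_rate)
    ror = rate_of_rise(log_rms)
    for i, value in enumerate(ror):
        left = ror[i - 1] if i > 0 else float("-inf")
        right = ror[i + 1] if i + 1 < len(ror) else float("-inf")
        is_peak = value > 0 and value >= left and value > right
        # Assumed significance rule: the frame after the rise must be
        # loud enough and noise-like (high zero-crossing rate).
        if is_peak and log_rms[i + 1] >= min_db and zcr[i + 1] >= min_zcr:
            return "plosive" if value >= ror_threshold else "fricative"
    return "unknown"

def synth(sample_rate, onset_ms, seed=0):
    """Hypothetical test token: noise rising from -60 dB to -6 dB over onset_ms."""
    rng = random.Random(seed)
    lead = int(0.05 * sample_rate)             # 50 ms of low-level background
    rise = int(onset_ms / 1000 * sample_rate)  # 0 gives an abrupt, burst-like onset
    hold = int(0.1 * sample_rate)              # 100 ms of steady noise
    out = []
    for i in range(lead + rise + hold):
        if i < lead:
            db = -60.0
        elif i < lead + rise:
            db = -60.0 + 54.0 * (i - lead) / rise
        else:
            db = -6.0
        out.append(10 ** (db / 20) * rng.uniform(-1, 1))
    return out

burst_like = classify(synth(16000, onset_ms=0), 16000)
gradual = classify(synth(16000, onset_ms=100), 16000)
print(burst_like, gradual)
```

In this sketch the abrupt onset produces a much larger ROR peak than the 100 ms ramp, which is the contrast the abstract relies on; real speech would need thresholds estimated from labeled tokens.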

16.
This paper investigates the functional relationship between articulatory variability and stability of acoustic cues during American English /r/ production. The analysis of articulatory movement data on seven subjects shows that the extent of intrasubject articulatory variability along any given articulatory direction is strongly and inversely related to a measure of acoustic stability (the extent of acoustic variation that displacing the articulators in this direction would produce). The presence and direction of this relationship is consistent with a speech motor control mechanism that uses a third formant frequency (F3) target; i.e., the final articulatory variability is lower for those articulatory directions most relevant to determining the F3 value. In contrast, no consistent relationship across speakers and phonetic contexts was found between hypothesized vocal-tract target variables and articulatory variability. Furthermore, simulations of two speakers' productions using the DIVA model of speech production, in conjunction with a novel speaker-specific vocal-tract model derived from magnetic resonance imaging data, mimic the observed range of articulatory gestures for each subject, while exhibiting the same articulatory/acoustic relations as those observed experimentally. Overall these results provide evidence for a common control scheme that utilizes an acoustic, rather than articulatory, target specification for American English /r/.

17.
The perceptual mechanisms of assimilation and contrast in the phonetic perception of vowels were investigated. In experiment 1, 14 stimulus continua were generated using an /i/-/e/-/a/ vowel continuum. They ranged from a continuum with both ends belonging to the same phonemic category in Japanese, to a continuum with both ends belonging to different phonemic categories. The AXB method was employed and the temporal position of X was changed under three conditions. In each condition ten subjects were required to judge whether X was similar to A or to B. The results demonstrated that assimilation to the temporally closer sound occurs if the phonemic categories of A and B are the same and that contrast to the temporally closer sound occurs if A and B belong to different phonemic categories. It was observed that the transition from assimilation to contrast is continuous except in the /i'/-X-/e/ condition. In experiment 2, the total duration of t1 (between A and X) and t2 (between X and B) was changed under five conditions. One stimulus continuum consisted of the same phonemic category in Japanese and the other consisted of different phonemic categories. Six subjects were required to make similarity judgements of X. The results demonstrated that the occurrence of assimilation and contrast to the temporally closer sound seemed to be constant under each of the five conditions. The present findings suggest that assimilation and contrast are determined by three factors: the temporal position of the three stimuli, the acoustic distance between the three stimuli on the stimulus continuum, and the phonemic categories of the three stimuli.

18.
Chinese words may begin with /t/ and /d/, but a /t/-/d/ contrast does not exist in word-final position. The question addressed by experiment 1 was whether Chinese speakers of English could identify the final stop in words like beat and bead. The Chinese subjects examined approached the near-perfect identification rates of native English adults and children for words that were unedited, but performed poorly for words from which final release bursts had been removed. Removing closure voicing had a small effect on the Chinese but not the English listeners' sensitivity. A regression analysis indicated that the Chinese subjects' native language (Mandarin, Taiwanese, Shanghainese) and their scores on an English comprehension test accounted for a significant amount of variance in sensitivity to the (burstless) /t/-/d/ contrast. In experiment 2, a small amount of feedback training administered to Chinese subjects led to a small, nonsignificant increase in sensitivity to the English /t/-/d/ contrast. In experiment 3, more training trials were presented for a smaller number of words. A slightly larger and significant effect of training was obtained. The Chinese subjects who were native speakers of a language that permits obstruents in word-final position seemed to benefit more from the training than those whose native language (L1) has no word-final obstruents. This was interpreted to mean that syllable-processing strategies established during L1 acquisition may influence later L2 learning.

19.
The mental organization of linguistic knowledge and its involvement in speech processing can be investigated using the mismatch negativity (MMN) component of the auditory event-related potential. A contradiction arises, however, between the technical need for strict control of acoustic stimulus properties and the quest for naturalness and acoustic variability of the stimuli. Here, two methods of preparing speech stimulus material were compared. Focussing on the automatic processing of a phonotactic restriction in German, two corresponding sets of various vowel-fricative syllables were used as stimuli. The former syllables were naturally spoken while the latter ones were created by means of cross-splicing. Phonetically, natural and spliced syllables differed with respect to the appropriateness of coarticulatory information about the forthcoming fricative within the vowels. Spliced syllables containing clearly misleading phonetic information were found to elicit larger N2 responses compared to their natural counterparts. Furthermore, MMN results found for the natural syllables could not be replicated with these spliced stimuli. These findings indicate that the automatic processing of the stimuli was considerably affected by the stimulus preparation method. Thus, in spite of its unquestioned benefits for MMN experiments, the splicing technique may lead to interference effects on the linguistic factors under investigation.

20.
Seven listener groups, varying in terms of the nasal consonant inventory of their native language, orthographically labeled and rated a set of naturally produced non-native nasal consonants varying in place of articulation. The seven listener groups included speakers of Malayalam, Marathi, Punjabi, Tamil, Oriya, Bengali, and American English. The stimulus set included bilabial, dental, alveolar, and retroflex nasals from Malayalam, Marathi, and Oriya. The stimulus set and nasal consonant inventories of the seven listener groups were described by both phonemic and allophonic representations. The study was designed to determine the extent to which phonemic and allophonic representations of perceptual categories can be used to predict a listener group's identification of non-native sounds. The results of the experiment showed that allophonic representations were more successful in predicting the native category that listeners used to label a non-native sound in a majority of trials. However, both representations frequently failed to accurately predict the goodness of fit between a non-native sound and a perceptual category. The results demonstrate that the labeling and rating of non-native stimuli were conditioned by a degree of language-specific phonetic detail that corresponds to perceptually relevant cues to native language contrasts.
