Similar Articles
20 similar articles found (search time: 124 ms)
1.
To examine the relationship between the identification and discrimination of non-native sounds, nasal consonants varying in place of articulation from Malayalam, Marathi, and Oriya were presented in two experiments to seven listener groups varying in their native nasal consonant inventory: Malayalam, Marathi, Punjabi, Tamil, Oriya, Bengali, and American English. The experiments consisted of a categorial AXB discrimination test and a forced-choice identification test with category goodness ratings. The identification test results were used to classify the non-native contrasts as one of five "assimilation types" of the Perceptual Assimilation Model (PAM) that are predicted to vary in their relative discriminability: two-category (TC), uncategorizable-categorizable (UC), both uncategorizable (UU), category-goodness (CG), and single-category (SC). The results showed that the mean percent correct discrimination scores of the assimilation types, but not the range of scores, were accurately predicted. Furthermore, differences in category goodness ratings in the CG and SC assimilations that were predicted to correlate with discrimination showed a weak, but significant correlation (r = 0.31, p < 0.05). The implications of the results for models of cross-language speech perception were discussed, and an alternative model of cross-language speech perception was outlined, in which the discriminability of non-native contrasts is a function of the similarity of non-native sounds to each other in a multidimensional, phonologized perceptual space.
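Below is a minimal Python sketch of how identification data can be mapped onto the five PAM assimilation types named above. The 0.5 categorization threshold, the goodness-difference criterion, and the example proportions are hypothetical illustrations, not values taken from the study.

```python
# Illustrative classifier for PAM assimilation types (TC, UC, UU, CG, SC).
# The 0.5 categorization threshold and the goodness-difference criterion are
# hypothetical choices, not the criteria used in the study above.

def modal_category(ident_proportions, threshold=0.5):
    """Return the modal native category if its proportion reaches the
    threshold, otherwise None (the sound counts as uncategorized)."""
    category, proportion = max(ident_proportions.items(), key=lambda kv: kv[1])
    return category if proportion >= threshold else None

def classify_assimilation(ident_a, ident_b, goodness_a, goodness_b,
                          threshold=0.5, goodness_criterion=1.0):
    """Classify a non-native contrast (sounds A and B) into one of the five
    PAM assimilation types from identification proportions and mean
    category-goodness ratings."""
    cat_a = modal_category(ident_a, threshold)
    cat_b = modal_category(ident_b, threshold)
    if cat_a is None and cat_b is None:
        return "UU"                      # both uncategorizable
    if cat_a is None or cat_b is None:
        return "UC"                      # uncategorizable-categorizable
    if cat_a != cat_b:
        return "TC"                      # two-category
    if abs(goodness_a - goodness_b) >= goodness_criterion:
        return "CG"                      # category-goodness difference
    return "SC"                          # single-category

# Example with made-up identification proportions for two non-native nasals:
ident_a = {"m": 0.82, "n": 0.18}
ident_b = {"m": 0.74, "n": 0.26}
print(classify_assimilation(ident_a, ident_b, goodness_a=5.6, goodness_b=3.1))  # -> "CG"
```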

2.
The conditions under which listeners do and do not compensate for coarticulatory vowel nasalization were examined through a series of experiments on listeners' perception of naturally produced American English oral and nasal vowels spliced into three contexts: oral (C_C), nasal (N_N), and isolation. Two perceptual paradigms, a rating task in which listeners judged the relative nasality of stimulus pairs and a 4IAX discrimination task in which listeners judged vowel similarity, were used with two listener groups, native English speakers and native Thai speakers. Thai and English speakers were chosen because their languages differ in the temporal extent of anticipatory vowel nasalization. Listeners' responses were highly context dependent. For both perceptual paradigms and both language groups, listeners were less accurate at judging vowels in nasal than in non-nasal (oral or isolation) contexts; nasal vowels in nasal contexts were the most difficult to judge. Response patterns were generally consistent with the hypothesis that, given an appropriate and detectable nasal consonant context, listeners compensate for contextual vowel nasalization and attribute the acoustic effects of the nasal context to their coarticulatory source. However, the results also indicated that listeners do not hear nasal vowels in nasal contexts as oral; listeners retained some sensitivity to vowel nasalization in all contexts, indicating partial compensation for coarticulatory vowel nasalization. Moreover, there were small but systematic differences between the native Thai- and native English-speaking groups. These differences are as expected if perceptual compensation is partial and the extent of compensation is linked to patterns of coarticulatory nasalization in the listeners' native language.

3.
The perceptual mechanisms of assimilation and contrast in the phonetic perception of vowels were investigated. In experiment 1, 14 stimulus continua were generated using an /i/-/e/-/a/ vowel continuum. They ranged from a continuum with both ends belonging to the same phonemic category in Japanese, to a continuum with both ends belonging to different phonemic categories. The AXB method was employed and the temporal position of X was changed under three conditions. In each condition ten subjects were required to judge whether X was similar to A or to B. The results demonstrated that assimilation to the temporally closer sound occurs if the phonemic categories of A and B are the same and that contrast to the temporally closer sound occurs if A and B belong to different phonemic categories. It was observed that the transition from assimilation to contrast is continuous except in the /i/-X-/e/ condition. In experiment 2, the total duration of t1 (between A and X) and t2 (between X and B) was changed under five conditions. One stimulus continuum consisted of the same phonemic category in Japanese and the other consisted of different phonemic categories. Six subjects were required to make similarity judgements of X. The results demonstrated that the occurrence of assimilation and contrast to the temporally closer sound seemed to be constant under each of the five conditions. The present findings suggest that assimilation and contrast are determined by three factors: the temporal position of the three stimuli, the acoustic distance between the three stimuli on the stimulus continuum, and the phonemic categories of the three stimuli.

4.
There is limited documentation available on how sensorineurally hearing-impaired listeners use the various sources of phonemic information that are known to be distributed across time in the speech waveform. In this investigation, a group of normally hearing listeners and a group of sensorineurally hearing-impaired listeners (with and without the benefit of amplification) identified various consonant and vowel productions that had been systematically varied in duration. The consonants (presented in a /haCa/ environment) and the vowels (presented in a /bVd/ environment) were truncated in steps to eliminate various segments from the end of the stimulus. The results indicated that normally hearing listeners could extract more phonemic information, especially cues to consonant place, from the earlier occurring portions of the stimulus waveforms than could the hearing-impaired listeners. The use of amplification partially decreased the performance differences between the normally hearing listeners and the unaided hearing-impaired listeners. The results are relevant to current models of normal speech perception that emphasize the need for the listener to make phonemic identifications as quickly as possible.
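As an illustration of the kind of truncation manipulation described above, the following Python sketch removes fixed-duration segments from the end of a waveform in successive steps; the 20-ms step size, 5-ms offset ramp, and placeholder signal are assumptions for the example only.

```python
# Generate progressively truncated stimuli by removing fixed-duration segments
# from the end of a waveform. Step size and ramp are illustrative values.
import numpy as np

def truncate_in_steps(waveform, sample_rate, step_ms=20, ramp_ms=5):
    """Return a list of stimuli, each step_ms shorter than the previous one,
    with a short cosine offset ramp to avoid a click at the cut point."""
    step = int(sample_rate * step_ms / 1000)
    ramp = int(sample_rate * ramp_ms / 1000)
    stimuli = []
    end = len(waveform)
    while end > ramp:
        cut = waveform[:end].astype(float)
        cut[-ramp:] *= 0.5 * (1 + np.cos(np.linspace(0, np.pi, ramp)))  # fade out
        stimuli.append(cut)
        end -= step
    return stimuli

fs = 16000
tone = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)  # 1-s placeholder signal
print(len(truncate_in_steps(tone, fs)))              # number of truncation steps
```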

5.
6.
Classic non-native speech perception findings suggested that adults have difficulty discriminating segmental distinctions that are not employed contrastively in their own language. However, recent reports indicate a gradient of performance across non-native contrasts, ranging from near-chance to near-ceiling. Current theoretical models argue that such variations reflect systematic effects of experience with phonetic properties of native speech. The present research addressed predictions from Best's perceptual assimilation model (PAM), which incorporates both contrastive phonological and noncontrastive phonetic influences from the native language in its predictions about discrimination levels for diverse types of non-native contrasts. We evaluated the PAM hypotheses that discrimination of a non-native contrast should be near-ceiling if perceived as phonologically equivalent to a native contrast, lower though still quite good if perceived as a phonetic distinction between good versus poor exemplars of a single native consonant, and much lower if both non-native segments are phonetically equivalent in goodness of fit to a single native consonant. Two experiments assessed native English speakers' perception of Zulu and Tigrinya contrasts expected to fit those criteria. Findings supported the PAM predictions, and provided evidence for some perceptual differentiation of phonological, phonetic, and nonlinguistic information in perception of non-native speech. Theoretical implications for non-native speech perception are discussed, and suggestions are made for further research.

7.
Quantifying the intelligibility of speech in noise for non-native listeners (total citations: 3; self-citations: 0; citations by others: 3)
When listening to languages learned at a later age, speech intelligibility is generally lower than when listening to one's native language. The main purpose of this study is to quantify speech intelligibility in noise for specific populations of non-native listeners, only broadly addressing the underlying perceptual and linguistic processing. An easy method is sought to extend these quantitative findings to other listener populations. Dutch subjects listening to German and English speech, ranging from reasonable to excellent proficiency in these languages, were found to require a 1-7 dB better speech-to-noise ratio to obtain 50% sentence intelligibility than native listeners. Also, the psychometric function for sentence recognition in noise was found to be shallower for non-native than for native listeners (worst-case slope around the 50% point of 7.5%/dB, compared to 12.6%/dB for native listeners). Differences between native and non-native speech intelligibility are largely predicted by linguistic entropy estimates as derived from a letter guessing task. Less effective use of context effects (especially semantic redundancy) explains the reduced speech intelligibility for non-native listeners. While measuring speech intelligibility for many different populations of listeners (languages, linguistic experience) may be prohibitively time consuming, obtaining predictions of non-native intelligibility from linguistic entropy may help to extend the results of this study to other listener populations.
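The speech reception threshold and psychometric slope reported above can be estimated by fitting a logistic function to sentence scores measured at several speech-to-noise ratios. The following Python sketch shows one way to do this with invented data; the parameterization and starting values are illustrative, not those used in the study.

```python
# Estimate the speech reception threshold (50% point) and the slope in %/dB
# from sentence scores at several SNRs, using a logistic psychometric function.
# The data points below are invented for illustration.
import numpy as np
from scipy.optimize import curve_fit

def logistic(snr_db, srt_db, slope_per_db):
    """Proportion correct as a logistic function of SNR; slope_per_db is the
    slope (in proportion/dB) at the 50% point."""
    return 1.0 / (1.0 + np.exp(-4.0 * slope_per_db * (snr_db - srt_db)))

snr = np.array([-8.0, -6.0, -4.0, -2.0, 0.0, 2.0])             # dB, invented
prop_correct = np.array([0.05, 0.18, 0.42, 0.71, 0.90, 0.97])  # invented scores

(srt, slope), _ = curve_fit(logistic, snr, prop_correct, p0=[-3.0, 0.10])
print(f"SRT = {srt:.1f} dB SNR, slope = {100 * slope:.1f} %/dB at the 50% point")
```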

8.
Timbre is typically investigated as a perceptual attribute that differentiates a sound source at one pitch and loudness. Yet the perceptual usefulness of timbre is that it allows the listener to recognize one sound source at different pitches. This paper investigated the ability of listeners to identify which pitch in an ascending or descending sequence of three or six stimuli was sung by a different singer. For three-note sequences, the task was extremely difficult, and with rare exceptions, listeners chose the most dissimilarly pitched stimulus as coming from the oddball singer. For six-note sequences, the detection of the oddball singer was much improved in spite of the added complexity of the task. These results support the idea that timbre should be understood as a transformation that connects the different sounds of one source and that a "rich" set of sounds is necessary to discover the trajectory.

9.
Traditionally, timbre has been defined as that perceptual attribute that differentiates two sounds when pitch and loudness are equal and thus is a measure of dissimilarity. By such a definition, each voice possesses a set of timbres, and the identity of any voice or voice category across different pitch-loudness-vowel combinations must be due to an abstraction of the pattern of timbre transformation. Using stimuli produced across the singing range by singers from different voice categories, this study sought to examine how timbre and pitch interact in the perception of dissimilarity. This study also investigated whether listener experience affects the perception of timbre as a function of pitch. The resulting multidimensional scaling (MDS) representations showed that for all stimuli and listeners, dimension 1 correlated with pitch, whereas dimension 2 correlated with spectral centroid and separated vocal stimuli into the categories mezzo-soprano and soprano. Dimension 3 appeared highly idiosyncratic depending on the nature of the stimuli and on the experience of the listener. Inexperienced listeners appeared to rely more heavily on pitch in making dissimilarity judgments than did experienced listeners. The resulting MDS representations of dissimilarity across pitch provide a glimpse of the timbre transformation of voice categories across pitch.
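A multidimensional scaling analysis of the sort described above can be sketched as follows in Python: a metric MDS solution is fitted to a dissimilarity matrix, and each resulting dimension is correlated with a stimulus property such as pitch or spectral centroid. All matrix values and stimulus properties below are random placeholders, not data from the study.

```python
# Fit a 3-dimensional MDS solution to a dissimilarity matrix and correlate
# each dimension with pitch and spectral centroid. All data are placeholders.
import numpy as np
from sklearn.manifold import MDS
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_stimuli = 12
pitch_hz = np.linspace(220, 880, n_stimuli)                  # invented pitches
centroid_hz = 1500 + 300 * rng.standard_normal(n_stimuli)    # invented centroids

# Symmetric dissimilarity matrix with zero diagonal (placeholder ratings).
raw = rng.random((n_stimuli, n_stimuli))
dissim = (raw + raw.T) / 2
np.fill_diagonal(dissim, 0.0)

mds = MDS(n_components=3, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissim)

for dim in range(3):
    r_pitch, _ = pearsonr(coords[:, dim], pitch_hz)
    r_centroid, _ = pearsonr(coords[:, dim], centroid_hz)
    print(f"dimension {dim + 1}: r(pitch) = {r_pitch:.2f}, r(centroid) = {r_centroid:.2f}")
```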

10.
Traditionally, timbre has been defined as that perceptual attribute that differentiates two sounds when pitch and loudness are equal, and thus is a measure of dissimilarity. By such a definition, each voice possesses a set of timbres, and the ability to identify any voice or voice category across different pitch-loudness-vowel combinations must be due to an ability to "link" these timbres by abstracting the "timbre transformation," the manner in which timbre subtly changes across pitch and loudness for a specific voice or voice category. Using stimuli produced across the singing range by singers from different voice categories, this study sought to examine how timbre and pitch interact in the perception of dissimilarity in male singing voices. This study also investigated whether or not listener experience affects the perception of timbre as a function of pitch. The resulting multidimensional scaling (MDS) representations showed that for all stimuli and listeners, dimension 1 correlated with pitch, while dimension 2 correlated with spectral centroid and separated vocal stimuli into the categories baritone and tenor. Dimension 3 appeared highly idiosyncratic depending on the nature of the stimuli and on the experience of the listener. Inexperienced listeners appeared to rely more heavily on pitch in making dissimilarity judgments than did experienced listeners. The resulting MDS representations of dissimilarity across pitch provide a glimpse of the timbre transformation of voice categories across pitch.

11.
Acoustic and perceptual similarities between Japanese and American English (AE) vowels were investigated in two studies. In study 1, a series of discriminant analyses were performed to determine acoustic similarities between Japanese and AE vowels, each spoken by four native male speakers using F1, F2, and vocalic duration as input parameters. In study 2, the Japanese vowels were presented to native AE listeners in a perceptual assimilation task, in which the listeners categorized each Japanese vowel token as most similar to an AE category and rated its goodness as an exemplar of the chosen AE category. Results showed that the majority of AE listeners assimilated all Japanese vowels into long AE categories, apparently ignoring temporal differences between 1- and 2-mora Japanese vowels. In addition, not all perceptual assimilation patterns reflected context-specific spectral similarity patterns established by discriminant analysis. It was hypothesized that this incongruity between acoustic and perceptual similarity may be due to differences in distributional characteristics of native and non-native vowel categories that affect the listeners' perceptual judgments.
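A discriminant analysis over F1, F2, and vocalic duration, as in study 1 above, can be sketched in Python as follows. The training tokens, category labels, and the test token are invented numbers used purely for illustration.

```python
# Linear discriminant analysis over F1, F2, and vocalic duration.
# Training tokens (AE /i/ and /u/) and the test token are invented values.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Columns: F1 (Hz), F2 (Hz), duration (ms); labels are AE vowel categories.
X_train = np.array([
    [280, 2300, 220], [300, 2250, 240], [310, 2350, 210],   # /i/ tokens
    [320,  900, 230], [340,  950, 250], [330,  880, 215],   # /u/ tokens
])
y_train = ["i", "i", "i", "u", "u", "u"]

lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)

japanese_token = np.array([[305, 2200, 120]])   # hypothetical short vowel token
print(lda.predict(japanese_token))              # most similar AE category
print(lda.predict_proba(japanese_token))        # posterior probabilities
```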

12.
Speaker variability and noise are two common sources of acoustic variability. The goal of this study was to examine whether these two sources of acoustic variability affected native and non-native perception of Mandarin fricatives to different degrees. Multispeaker Mandarin fricative stimuli were presented to 40 native and 52 non-native listeners in two presentation formats (blocked by speaker and mixed across speakers). The stimuli were also mixed with speech-shaped noise to create five levels of signal-to-noise ratio. The results showed that noise affected non-native identification disproportionately. By contrast, the effect of speaker variability was comparable between the native and non-native listeners. Confusion patterns were interpreted with reference to the results of acoustic analysis, suggesting native and non-native listeners used distinct acoustic cues for fricative identification. It was concluded that not all sources of acoustic variability are treated equally by native and non-native listeners. Whereas noise compromised non-native fricative perception disproportionately, speaker variability did not pose a special challenge to the non-native listeners.
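Mixing stimuli with noise at fixed signal-to-noise ratios, as described above, amounts to scaling the noise relative to the speech RMS. The following Python sketch uses white noise as a stand-in for speech-shaped noise, and the five SNR levels shown are illustrative, not those of the study.

```python
# Mix a speech token with noise at a target signal-to-noise ratio.
# White noise is a stand-in for speech-shaped noise; SNR levels are illustrative.
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that the speech/noise RMS ratio equals snr_db,
    then return the mixture."""
    rms_speech = np.sqrt(np.mean(speech ** 2))
    rms_noise = np.sqrt(np.mean(noise ** 2))
    target_rms_noise = rms_speech / (10 ** (snr_db / 20))
    return speech + noise * (target_rms_noise / rms_noise)

fs = 16000
rng = np.random.default_rng(1)
speech = np.sin(2 * np.pi * 200 * np.arange(fs) / fs)   # placeholder "speech"
noise = rng.standard_normal(fs)

for snr in (-10, -5, 0, 5, 10):                         # five illustrative SNR levels
    mixture = mix_at_snr(speech, noise, snr)
```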

13.
Native American English and non-native (Dutch) listeners identified either the consonant or the vowel in all possible American English CV and VC syllables. The syllables were embedded in multispeaker babble at three signal-to-noise ratios (0, 8, and 16 dB). The phoneme identification performance of the non-native listeners was less accurate than that of the native listeners. All listeners were adversely affected by noise. With these isolated syllables, initial segments were harder to identify than final segments. Crucially, the effects of language background and noise did not interact; the performance asymmetry between the native and non-native groups was not significantly different across signal-to-noise ratios. It is concluded that the frequently reported disproportionate difficulty of non-native listening under disadvantageous conditions is not due to a disproportionate increase in phoneme misidentifications.

14.
In vowel perception, nasalization and height (the inverse of the first formant, F1) interact. This paper asks whether the interaction results from a sensory process, decision mechanism, or both. Two experiments used vowels varying in height, degree of nasalization, and three other stimulus parameters: the frequency region of F1, the location of the nasal pole/zero complex relative to F1, and whether a consonant following the vowel was oral or nasal. A fixed-classification experiment, designed to estimate basic sensitivity between stimuli, measured accuracy for discriminating stimuli differing in F1, in nasalization, and on both dimensions. A configuration derived by a multidimensional scaling analysis revealed a perceptual interaction that was stronger for stimuli in which the nasal pole/zero complex was below rather than above the oral pole, and that was present before both nasal and oral consonants. Phonetic identification experiments, designed to measure trading relations between the two dimensions, required listeners to identify height and nasalization in vowels varying in both. Judgments of nasalization depended on F1 as well as on nasalization, whereas judgments of height depended primarily on F1, and on nasalization more when the nasal complex was below than above the oral pole. This pattern was interpreted as a decision-rule interaction that is distinct from the interaction in basic sensitivity. Final consonant nasality had little effect in the classification experiment; in the identification experiment, nasal judgments were more likely when the following consonant was nasal.

15.
The reiterant speech of ten native speakers of French was analyzed to develop baseline measures for syllable and consonant/vowel timing for a series of two-, three-, four-, and five-syllable French words spoken in isolation. Ten native speakers of English, who learned French as a second language, produced reiterant versions of both the French words and a comparable set of English words. The native speakers of English were divided into two groups on the basis of their second language experience. The first group consisted of four university-level teachers, who were relatively experienced learners of French, and the second group of six less experienced learners of French. The French reiterant imitations of the two groups of native speakers of English were compared to the native French speakers' productions. The timing patterns of the experienced group of non-native speakers did not differ significantly from those of the native French speakers, whereas there was a significant difference between these two groups and the group of six less experienced second-language learners. Deviations from the French baseline measures produced by the less experienced group are discussed in terms of the influence of the timing patterns of English and the literature on a sensitive period for second language acquisition.

16.
Coarticulatory influences on the perceived height of nasal vowels (total citations: 1; self-citations: 0; citations by others: 1)
Certain of the complex spectral effects of vowel nasalization bear a resemblance to the effects of modifying the tongue or jaw position with which the vowel is produced. Perceptual evidence suggests that listener misperceptions of nasal vowel height arise as a result of this resemblance. Whereas previous studies examined isolated nasal vowels, this research focused on the role of phonetic context in shaping listeners' judgments of nasal vowel height. Identification data obtained from native American English speakers indicated that nasal coupling does not necessarily lead to listener misperceptions of vowel quality when the vowel's nasality is coarticulatory in nature. The perceived height of contextually nasalized vowels (in a [bVnd] environment) did not differ from that of oral vowels (in a [bVd] environment) produced with the same tongue-jaw configuration. In contrast, corresponding noncontextually nasalized vowels (in a [bVd] environment) were perceived as lower in quality than vowels in the other two conditions. Presumably the listeners' lack of experience with distinctive vowel nasalization prompted them to resolve the spectral effects of noncontextual nasalization in terms of tongue or jaw height, rather than velic height. The implications of these findings with respect to sound changes affecting nasal vowel height are also discussed.

17.
This study assessed the extent to which second-language learners are sensitive to phonetic information contained in visual cues when identifying a non-native phonemic contrast. In experiment 1, Spanish and Japanese learners of English were tested on their perception of a labial/labiodental consonant contrast in audio (A), visual (V), and audio-visual (AV) modalities. Spanish students showed better performance overall, and much greater sensitivity to visual cues than Japanese students. Both learner groups achieved higher scores in the AV than in the A test condition, thus showing evidence of audio-visual benefit. Experiment 2 examined the perception of the less visually salient /l/-/r/ contrast in Japanese and Korean learners of English. Korean learners obtained much higher scores in auditory and audio-visual conditions than in the visual condition, while Japanese learners generally performed poorly in both modalities. Neither group showed evidence of audio-visual benefit. These results show the impact of the language background of the learner and visual salience of the contrast on the use of visual cues for a non-native contrast. Significant correlations between scores in the auditory and visual conditions suggest that increasing auditory proficiency in identifying a non-native contrast is linked with an increasing proficiency in using visual cues to the contrast.

18.
Studies comparing native and non-native listener performance on speech perception tasks can distinguish the roles of general auditory and language-independent processes from those involving prior knowledge of a given language. Previous experiments have demonstrated a performance disparity between native and non-native listeners on tasks involving sentence processing in noise. However, the effects of energetic and informational masking have not been explicitly distinguished. Here, English and Spanish listener groups identified keywords in English sentences in quiet and masked by either stationary noise or a competing utterance, conditions known to produce predominantly energetic and informational masking, respectively. In the stationary noise conditions, non-native listeners suffered more from increasing levels of noise for two of the three keywords scored. In the competing talker condition, the performance differential also increased with masker level. A computer model of energetic masking in the competing talker condition ruled out the possibility that the native advantage could be explained wholly by energetic masking. Both groups drew equal benefit from differences in mean F0 between target and masker, suggesting that processes which make use of this cue do not engage language-specific knowledge.

19.
Previous work has demonstrated that normal-hearing individuals use fine-grained phonetic variation, such as formant movement and duration, when recognizing English vowels. The present study investigated whether these cues are used by adult postlingually deafened cochlear implant users, and normal-hearing individuals listening to noise-vocoder simulations of cochlear implant processing. In Experiment 1, subjects gave forced-choice identification judgments for recordings of vowels that were signal processed to remove formant movement and/or equate vowel duration. In Experiment 2, a goodness-optimization procedure was used to create perceptual vowel space maps (i.e., best exemplars within a vowel quadrilateral) that included F1, F2, formant movement, and duration. The results demonstrated that both cochlear implant users and normal-hearing individuals use formant movement and duration cues when recognizing English vowels. Moreover, both listener groups used these cues to the same extent, suggesting that postlingually deafened cochlear implant users have category representations for vowels that are similar to those of normal-hearing individuals.
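Noise-vocoder simulations of cochlear implant processing, as used above, typically split the signal into a small number of frequency bands, extract each band's envelope, and use it to modulate band-limited noise. The Python sketch below follows that general recipe; the channel count, band edges, and envelope cutoff are illustrative assumptions, not the settings of the study.

```python
# A minimal noise vocoder: band-split the signal, extract each band's envelope,
# modulate band-limited noise with it, and sum the channels. All filter
# settings are illustrative choices only.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, band_edges_hz=(100, 500, 1200, 2500, 5000),
                 env_cutoff_hz=50):
    rng = np.random.default_rng(0)
    env_sos = butter(4, env_cutoff_hz, btype="low", fs=fs, output="sos")
    output = np.zeros_like(signal, dtype=float)
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        band_sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, signal)
        envelope = sosfiltfilt(env_sos, np.abs(hilbert(band)))  # smoothed envelope
        carrier = sosfiltfilt(band_sos, rng.standard_normal(len(signal)))
        output += np.clip(envelope, 0, None) * carrier
    return output

fs = 16000
t = np.arange(fs) / fs
vowel_like = np.sin(2 * np.pi * 120 * t) * (1 + 0.5 * np.sin(2 * np.pi * 3 * t))
vocoded = noise_vocode(vowel_like, fs)
```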

20.
In an investigation of contextual influences on sound categorization, 64 Peruvian Spanish listeners categorized vowels on an /i/ to /e/ continuum. First, to measure the influence of the stimulus range (broad acoustic context) and the preceding stimuli (local acoustic context), listeners were presented with different subsets of the Spanish /i/-/e/ continuum in separate blocks. Second, the influence of the number of response categories was measured by presenting half of the participants with /i/ and /e/ as responses, and the other half with /i/, /e/, /a/, /o/, and /u/. The results showed that the perceptual category boundary between /i/ and /e/ shifted depending on the stimulus range and that the formant values of locally preceding items had a contrastive influence. Categorization was less susceptible to broad and local acoustic context effects, however, when listeners were presented with five rather than two response options. Vowel categorization depends not only on the acoustic properties of the target stimulus, but also on its broad and local acoustic context. The influence of such context is in turn affected by the number of internal referents that are available to the listener in a task.
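A category boundary on an /i/-/e/ continuum, and its shift across stimulus-range conditions, can be estimated as the 50% crossover of a logistic function fitted to /i/ responses. The Python sketch below uses invented response proportions purely to illustrate the calculation.

```python
# Locate the /i/-/e/ category boundary as the 50% crossover of a logistic
# function fitted to /i/ responses along the continuum, so that boundary
# shifts between conditions can be compared. Response data are invented.
import numpy as np
from scipy.optimize import curve_fit

def logistic(step, boundary, slope):
    return 1.0 / (1.0 + np.exp(-slope * (step - boundary)))

steps = np.arange(1, 8)                                              # 7-step continuum
full_range = np.array([0.97, 0.93, 0.80, 0.55, 0.25, 0.08, 0.03])    # invented /i/ proportions
narrow_range = np.array([0.99, 0.96, 0.88, 0.70, 0.40, 0.15, 0.05])  # invented /i/ proportions

for label, props in (("full range", full_range), ("narrow range", narrow_range)):
    (boundary, _), _ = curve_fit(logistic, steps, props, p0=[4.0, -1.5])
    print(f"{label}: /i/-/e/ boundary at continuum step {boundary:.2f}")
```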

