
1.

The use of visual cues in the perception of non-native consonant contrasts

Hazan V Sennema A Faulkner A Ortega-Llebaria M Iba M Chunge H 《The Journal of the Acoustical Society of America》2006,119(3):1740-1751

This study assessed the extent to which second-language learners are sensitive to phonetic information contained in visual cues when identifying a non-native phonemic contrast. In experiment 1, Spanish and Japanese learners of English were tested on their perception of a labial/labiodental consonant contrast in audio (A), visual (V), and audio-visual (AV) modalities. Spanish students showed better performance overall, and much greater sensitivity to visual cues than Japanese students. Both learner groups achieved higher scores in the AV than in the A test condition, thus showing evidence of audio-visual benefit. Experiment 2 examined the perception of the less visually-salient /l/-/r/ contrast in Japanese and Korean learners of English. Korean learners obtained much higher scores in auditory and audio-visual conditions than in the visual condition, while Japanese learners generally performed poorly in both modalities. Neither group showed evidence of audio-visual benefit. These results show the impact of the language background of the learner and visual salience of the contrast on the use of visual cues for a non-native contrast. Significant correlations between scores in the auditory and visual conditions suggest that increasing auditory proficiency in identifying a non-native contrast is linked with an increasing proficiency in using visual cues to the contrast.

2.

The role of medial consonant transitions in word perception.

L A Streeter G N Nigro 《The Journal of the Acoustical Society of America》1979,65(6):1533-1541

In VCV nonsense forms (such as /ɛdɛ/), while both the CV transition and the VC transition are perceptible in isolation, the CV transition dominates identification of the stop consonant. Thus, the question arises, what role, if any, do VC transitions play in word perception? Stimuli were two-syllable English words in which the medial consonant was either a stop or a fricative (e.g., "feeding" and "gravy"). Each word was constructed in three ways: (1) the VC transition was incompatible with the CV in either place, manner of articulation, or both; (2) the VC transition was eliminated and the steady-state portion of the first vowel was substituted in its place; and (3) the original word. All versions of a particular word were identical with respect to duration, pitch contour, and amplitude envelope. While an intelligibility test revealed no differences among the three conditions, data from a paired comparison preference task and an unspeeded lexical decision task indicated that incompatible VC transitions hindered word perception, but lack of VC transitions did not. However, there were clear differences among the three conditions in the speeded lexical decision task for word stimuli, but not for nonword stimuli that were constructed in an analogous fashion. We discuss the use of lexical tasks for speech quality assessment and possible processes by which listeners recognize spoken words.

3.

Effect of masker type on native and non-native consonant perception in noise

Garcia Lecumberri ML Cooke M 《The Journal of the Acoustical Society of America》2006,119(4):2445-2454

Spoken communication in a non-native language is especially difficult in the presence of noise. This study compared English and Spanish listeners' perceptions of English intervocalic consonants as a function of masker type. Three maskers (stationary noise, multitalker babble, and competing speech) provided varying amounts of energetic and informational masking. Competing English and Spanish speech maskers were used to examine the effect of masker language. Non-native performance fell short of that of native listeners in quiet, but a larger performance differential was found for all masking conditions. Both groups performed better in competing speech than in stationary noise, and both suffered most in babble. Since babble is a less effective energetic masker than stationary noise, these results suggest that non-native listeners are more adversely affected by both energetic and informational masking. A strong correlation was found between non-native performance in quiet and degree of deterioration in noise, suggesting that non-native phonetic category learning can be fragile. A small effect of language background was evident: English listeners performed better when the competing speech was Spanish.

4.

Effects of filtering and vowel environment on consonant perception

T Gay 《The Journal of the Acoustical Society of America》1970,48(4):993-998

5.

The perception of phonemic contrasts in a non-native dialect

Dufour S Nguyen N Frauenfelder UH 《The Journal of the Acoustical Society of America》2007,121(4):EL131-EL136

This study examined the impact on speech processing of regional phonetic/phonological variation in the listener's native language. The perception of the /e/-/ɛ/ and /o/-/ɔ/ contrasts, produced by standard but not southern French native speakers, was investigated in these two populations. A repetition priming experiment showed that the latter but not the former perceived words such as /epe/ and /epɛ/ as homophones. In contrast, both groups perceived the two words of /o/-/ɔ/ minimal pairs (/pom/-/pɔm/) as being distinct. Thus, standard-French words can be perceived differently depending on the listener's regional accent.

6.

The effect of emphatic stress on consonant vowel coarticulation

Lindblom B Agwuele A Sussman HM Cortes EE 《The Journal of the Acoustical Society of America》2007,121(6):3802-3813

This study assessed the acoustic coarticulatory effects of phrasal accent on [V1.CV2] sequences, when separately applied to V1 or V2, surrounding the voiced stops [b], [d], and [g]. Three adult speakers each produced 360 tokens (six V1 contexts × ten V2 contexts × three stops × two emphasis conditions). Realizing that anticipatory coarticulation of V2 onto the intervocalic C can be influenced by prosodic effects, as well as by vowel context effects, a modified locus equation regression metric was used to isolate the effect of phrasal accent on consonantal F2 onsets, independently of prosodically induced vowel expansion effects. The analyses revealed two main emphasis-dependent effects: systematic differences in F2 onset values and the expected expansion of vowel space. By accounting for the confounding variable of stress-induced vowel space expansion, a small but consistent coarticulatory effect of emphatic stress on the consonant was uncovered in lingually produced stops, but absent in labial stops. Formant calculations based on tube models indicated similarly increased F2 onsets when stressed /d/ and /g/ were simulated with deeper occlusions resulting from more forceful closure movements during phrasal accented speech.

7.

Effect of burst amplitude on the perception of stop consonant place of articulation

R N Ohde K N Stevens 《The Journal of the Acoustical Society of America》1983,74(3):706-714

We have examined the effects of the relative amplitude of the release burst on perception of the place of articulation of utterance-initial voiceless and voiced stop consonants. The amplitude of the burst, which occurs within the first 10-15 ms following consonant release, was systematically varied in 5-dB steps from -10 to +10 dB relative to a "normal" burst amplitude for two labial-to-alveolar synthetic speech continua, one comprising voiceless stops and the other, voiced stops. The distribution of spectral energy in the bursts for the labial and alveolar stops at the ends of the continuum was consistent with the spectrum shapes observed in natural utterances, and intermediate shapes were used for intermediate stimuli on the continuum. The results of identification tests with these stimuli showed that the relative amplitude of the burst significantly affected the perception of the place of articulation of both voiceless and voiced stops, but the effect was greater for the former than the latter. The results are consistent with a view that two basic properties contribute to the labial-alveolar distinction in English. One of these is determined by the time course of the change in amplitude in the high-frequency range (above 2500 Hz) in the few tens of ms following consonantal release, and the other is determined by the frequencies of spectral peaks associated with the second and third formants in relation to the first formant.

8.

Across-talker effects on non-native listeners' vowel perception in noise

Bent T Kewley-Port D Ferguson SH 《The Journal of the Acoustical Society of America》2010,128(5):3142-3151

This study explored how across-talker differences influence non-native vowel perception. American English (AE) and Korean listeners were presented with recordings of 10 AE vowels in /bVd/ context. The stimuli were mixed with noise and presented for identification in a 10-alternative forced-choice task. The two listener groups heard recordings of the vowels produced by 10 talkers at three signal-to-noise ratios. Overall the AE listeners identified the vowels 22% more accurately than the Korean listeners. There was a wide range of identification accuracy scores across talkers for both AE and Korean listeners. At each signal-to-noise ratio, the across-talker intelligibility scores were highly correlated for AE and Korean listeners. Acoustic analysis was conducted for 2 vowel pairs that exhibited variable accuracy across talkers for Korean listeners but high identification accuracy for AE listeners. Results demonstrated that Korean listeners' error patterns for these four vowels were strongly influenced by variability in vowel production that was within the normal range for AE talkers. These results suggest that non-native listeners are strongly influenced by across-talker variability perhaps because of the difficulty they have forming native-like vowel categories.

9.

Discrimination of non-native consonant contrasts varying in perceptual assimilation to the listener's native phonological system (total citations: 2; self-citations: 0; citations by others: 2)

Best CT McRoberts GW Goodell E 《The Journal of the Acoustical Society of America》2001,109(2):775-794

Classic non-native speech perception findings suggested that adults have difficulty discriminating segmental distinctions that are not employed contrastively in their own language. However, recent reports indicate a gradient of performance across non-native contrasts, ranging from near-chance to near-ceiling. Current theoretical models argue that such variations reflect systematic effects of experience with phonetic properties of native speech. The present research addressed predictions from Best's perceptual assimilation model (PAM), which incorporates both contrastive phonological and noncontrastive phonetic influences from the native language in its predictions about discrimination levels for diverse types of non-native contrasts. We evaluated the PAM hypotheses that discrimination of a non-native contrast should be near-ceiling if perceived as phonologically equivalent to a native contrast, lower though still quite good if perceived as a phonetic distinction between good versus poor exemplars of a single native consonant, and much lower if both non-native segments are phonetically equivalent in goodness of fit to a single native consonant. Two experiments assessed native English speakers' perception of Zulu and Tigrinya contrasts expected to fit those criteria. Findings supported the PAM predictions, and provided evidence for some perceptual differentiation of phonological, phonetic, and nonlinguistic information in perception of non-native speech. Theoretical implications for non-native speech perception are discussed, and suggestions are made for further research.

10.

Acoustic properties and perception of stop consonant release transients

B H Repp H B Lin 《The Journal of the Acoustical Society of America》1989,85(1):379-396

This study focuses on the initial component of the stop consonant release burst, the release transient. In theory, the transient, because of its impulselike source, should contain much information about the vocal tract configuration at release, but it is usually weak in intensity and difficult to isolate from the accompanying frication in natural speech. For this investigation, a human talker produced isolated release transients of /b,d,g/ in nine vocalic contexts by whispering these syllables very quietly. He also produced the corresponding CV syllables with regular phonation for comparison. Spectral analyses showed the isolated transients to have a clearly defined formant structure, which was not seen in natural release bursts, whose spectra were dominated by the frication noise. The formant frequencies varied systematically with both consonant place of articulation and vocalic context. Perceptual experiments showed that listeners can identify both consonants and vowels from isolated transients, though not very accurately. Knowing one of the two segments in advance did not help, but when the transients were followed by a compatible synthetic, steady-state vowel, consonant identification improved somewhat. On the whole, isolated transients, despite their clear formant structure, provided only partial information for consonant identification, but no less so, it seems, than excerpted natural release bursts. The information conveyed by artificially isolated transients and by natural (frication-dominated) release bursts appears to be perceptually equivalent.

11.

Differential cue weighting in perception and production of consonant voicing

AA Shultz AL Francis F Llanos 《The Journal of the Acoustical Society of America》2012,132(2):EL95-EL101

This study examines English speakers' relative weighting of two voicing cues in production and perception. Participants repeated words differing in initial consonant voicing ([b] or [p]) and labeled synthesized tokens ranging between [ba] and [pa] orthogonally according to voice onset time (VOT) and onset f0. Discriminant function analysis and logistic regression were used to calculate individuals' relative weighting of each cue. Production results showed a significant negative correlation of VOT and onset f0, while perception results showed a trend toward a positive correlation. No significant correlations were found across perception and production, suggesting a complex relationship between the two domains.

12.

The clear speech effect for non-native listeners

Bradlow AR Bent T 《The Journal of the Acoustical Society of America》2002,112(1):272-284

Previous work has established that naturally produced clear speech is more intelligible than conversational speech for adult hearing-impaired listeners and normal-hearing listeners under degraded listening conditions. The major goal of the present study was to investigate the extent to which naturally produced clear speech is an effective intelligibility enhancement strategy for non-native listeners. Thirty-two non-native and 32 native listeners were presented with naturally produced English sentences. Factors that varied were speaking style (conversational versus clear), signal-to-noise ratio (-4 versus -8 dB) and talker (one male versus one female). Results showed that while native listeners derived a substantial benefit from naturally produced clear speech (an improvement of about 16 rau units on a keyword-correct count), non-native listeners exhibited only a small clear speech effect (an improvement of only 5 rau units). This relatively small clear speech effect for non-native listeners is interpreted as a consequence of the fact that clear speech is essentially native-listener oriented, and therefore is only beneficial to listeners with extensive experience with the sound structure of the target language.

13.

Speech perception by infants: categorization based on nasal consonant place of articulation

J Hillenbrand 《The Journal of the Acoustical Society of America》1984,75(5):1613-1622

This study examined the ability of six-month-old infants to recognize the perceptual similarity of syllables sharing a phonetic segment when variations were introduced in phonetic environment and talker. Infants in a "phonetic" group were visually reinforced for head turns when a change occurred from a background category of labial nasals to a comparison category of alveolar nasals. The infants were initially trained on a [ma]-[na] contrast produced by a male talker. Novel tokens differing in vowel environment and talker were introduced over several stages of increasing complexity. In the most complex stage infants were required to make a head turn when a change occurred from [ma,mi,mu] to [na,ni,nu], with the tokens in each category produced by both male and female talkers. A "nonphonetic" control group was tested using the same pool of stimuli as the phonetic condition. The only difference was that the stimuli in the background and comparison categories were chosen in such a way that the sounds could not be organized by acoustic or phonetic characteristics. Infants in the phonetic group transferred training to novel tokens produced by different talkers and in different vowel contexts. However, infants in the nonphonetic control group had difficulty learning the phonetically unrelated tokens that were introduced as the experiment progressed. These findings suggest that infants recognize the similarity of nasal consonants sharing place of articulation independent of variation in talker and vowel context.

14.

The influence of linguistic and musical experience on Cantonese word learning

Cooper A Wang Y 《The Journal of the Acoustical Society of America》2012,131(6):4756-4769

Adult non-native speech perception is subject to influence from multiple factors, including linguistic and extralinguistic experience such as musical training. The present research examines how linguistic and musical factors influence non-native word identification and lexical tone perception. Groups of native tone language (Thai) and non-tone language listeners (English), each subdivided into musician and non-musician groups, engaged in Cantonese tone word training. Participants learned to identify words minimally distinguished by five Cantonese tones during training, also completing musical aptitude and phonemic tone identification tasks. First, the findings suggest that either musical experience or a tone language background leads to significantly better non-native word learning proficiency, as compared to those with neither musical training nor tone language experience. Moreover, the combination of tone language and musical experience did not provide an additional advantage for Thai musicians above and beyond either experience alone. Musicianship was found to be more advantageous than a tone language background for tone identification. Finally, tone identification and musical aptitude scores were significantly correlated with word learning success for English but not Thai listeners. These findings point to a dynamic influence of musical and linguistic experience, both at the tone identification level and at the word learning stage.

15.

English-learning infants' perception of word stress patterns

Skoruppa K Cristià A Peperkamp S Seidl A 《The Journal of the Acoustical Society of America》2011,130(1):EL50-EL55

Adult speakers of different free stress languages (e.g., English, Spanish) differ both in their sensitivity to lexical stress and in their processing of suprasegmental and vowel quality cues to stress. In a head-turn preference experiment with a familiarization phase, both 8-month-old and 12-month-old English-learning infants discriminated between initial stress and final stress among lists of Spanish-spoken disyllabic nonwords that were segmentally varied (e.g. ['nila, 'tuli] vs [lu'ta, pu'ki]). This is evidence that English-learning infants are sensitive to lexical stress patterns, instantiated primarily by suprasegmental cues, during the second half of the first year of life.

16.

Stress perception of disyllabic prosodic words in continuous speech (total citations: 5; self-citations: 1; citations by others: 4)

Wang Yunjia, Chu Min, He Lin, Feng Yongqiang 《Acta Acustica (声学学报)》2003,28(6):534-539

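A stress perception experiment was carried out on 1,898 disyllabic prosodic words from 300 utterances drawn from the Mandarin speech corpus of Microsoft Research Asia. The results show that stress perception for disyllabic words in continuous speech differs from that for isolated words: it is significantly affected by the prosodic boundary at which the word occurs. In the perception experiment, the difference in stress scores between the two syllables of a word was positively correlated with both the difference in their pitch peaks and the difference in their durations, but the correlation with the pitch-peak difference was stronger than that with the duration difference. The pitch-peak difference and the duration difference were uncorrelated in non-prepausal position and weakly positively correlated in prepausal position. The results also show that the perceived stress of a syllable is significantly affected by its tone type.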

17.

The effect of overlap-masking on binaural reverberant word intelligibility

Libbey B Rogers PH 《The Journal of the Acoustical Society of America》2004,116(5):3141-3151

Reverberation interferes with the ability to understand speech in rooms. Overlap-masking explains this degradation by assuming reverberant phonemes endure in time and mask subsequent reverberant phonemes. Most listeners benefit from binaural listening when reverberation exists, indicating that the listener's binaural system processes the two channels to reduce the reverberation. This paper investigates the hypothesis that the binaural word intelligibility advantage found in reverberation is a result of binaural overlap-masking release with the reverberation acting as masking noise. The tests utilize phonetically balanced word lists (ANSI-S3.2 1989) that are presented diotically and binaurally with recorded reverberation and reverberation-like noise. A small room, 62 m³, reverberates the words. These are recorded using two microphones without additional noise sources. The reverberation-like noise is a modified form of these recordings and has a similar spectral content. It does not contain binaural localization cues due to a phase randomization procedure. Listening to the reverberant words binaurally improves the intelligibility by 6.0% over diotic listening. The binaural intelligibility advantage for reverberation-like noise is only 2.6%. This indicates that binaural overlap-masking release is insufficient to explain the entire binaural word intelligibility advantage in reverberation.

18.

Sentential, lexical, and acoustic effects on the perception of word boundaries

Mattys SL Melhorn JF 《The Journal of the Acoustical Society of America》2007,122(1):554-567

This study investigates the effects of sentential context, lexical knowledge, and acoustic cues on the segmentation of connected speech. Listeners heard near-homophonous phrases (e.g., [plʌmpaɪ] for "plum pie" versus "plump eye") in isolation, in a sentential context, or in a lexically biasing context. The sentential context and the acoustic cues were piloted to provide strong versus mild support for one segmentation alternative (plum pie) or the other (plump eye). The lexically biasing context favored one segmentation or the other (e.g., [skʌmpaɪ] for "scum pie" versus *"scump eye," and [lʌmpaɪ] for "lump eye" versus *"lum pie," with the asterisk denoting a lexically unacceptable parse). A forced-choice task, in which listeners indicated which of two words they thought they heard (e.g., "pie" or "eye"), revealed compensatory mechanisms between the sources of information. The effect of both sentential and lexical contexts on segmentation responses was larger when the acoustic cues were mild than when they were strong. Moreover, lexical effects were accompanied with a reduction in sensitivity to the acoustic cues. Sentential context only affected the listeners' response criterion. The results highlight the graded, interactive, and flexible nature of multicue segmentation, as well as functional differences between sentential and lexical contributions to this process.

19.

The effect of cross-channel synchrony on the perception of temporal regularity

Krumbholz K Bleeck S Patterson RD Senokozlieva M Seither-Preisler A Lütkenhöner B 《The Journal of the Acoustical Society of America》2005,118(2):946-954

Temporal models of pitch are based on the assumption that the auditory system measures the time intervals between neural events, and that pitch corresponds to the most common time interval. The current experiments were designed to test whether time intervals are analyzed independently in each peripheral channel, or whether the time-interval analysis in one channel is affected by synchronous activity in other channels. Regular and irregular click trains were filtered into narrow frequency bands to produce target and flanker stimuli. The threshold for discriminating a regular target from an irregular distracter click train was measured in the presence of an irregular masker click train in the target band, as a function of the frequency separation between the target band and a flanker band. The flanker click train was either regular or irregular. The threshold for detecting the regular target was 5-7 dB lower when the flanker was regular. The data indicate that the detection of temporal regularity (and thus, pitch) involves cross-channel processes that can operate over widely separated channels. Model simulations suggest that these cross-channel processes occur after the time-interval extraction stage and that they depend on the similarity, or consistency, of the time-interval patterns in the relevant channels.

20.

Stop consonant place perception with single-formant stimuli: evidence for the role of the front-cavity resonance.

G M Kuhn 《The Journal of the Acoustical Society of America》1979,65(3):774-788

The third formant and the second formant were found on average to cue the place of articulation of intervocalic stop consonants equally well when the stop consonants occurred before the vowel /i/. This result and others provide some support for the notion that the fundamental resonance of the front cavity plays an important role in the perception of the phonetic dimension of place of articulation.