Similar Documents
20 similar documents retrieved.
1.
This study addresses three issues that are relevant to coarticulation theory in speech production: whether the degree of articulatory constraint model (DAC model) accounts for patterns of the directionality of tongue dorsum coarticulatory influences; the extent to which those patterns in tongue dorsum coarticulatory direction are similar to those for the tongue tip; and whether speech motor control and phonemic planning use a fixed or a context-dependent temporal window. Tongue dorsum and tongue tip movement data on vowel-to-vowel coarticulation are reported for Catalan VCV sequences with the vowels /i/, /a/, and /u/ and the consonants /p/, /n/, dark /l/, /s/, /ʃ/, alveolopalatal /ɲ/, and /k/. Electromagnetic midsagittal articulometry recordings were carried out for three speakers using the Carstens articulograph. Trajectory data are presented for the vertical dimension of the tongue dorsum, and for the horizontal dimension of the tongue dorsum and tip. In agreement with predictions of the DAC model, results show that directionality patterns of tongue dorsum coarticulation can be accounted for to a large extent by the articulatory requirements of consonantal production. While dorsals exhibit analogous trends in coarticulatory direction for all articulators and articulatory dimensions, this holds mostly for the tongue dorsum and tip along the horizontal dimension in the case of lingual fricatives and apicolaminal consonants. This finding results from different articulatory strategies: while dorsal consonants are implemented through homogeneous tongue body activation, the tongue tip and tongue dorsum act more independently for more anterior consonantal productions. Discontinuous coarticulatory effects reported in the present investigation suggest that phonemic planning is adaptive rather than context independent.

2.
The goal of this study is to investigate coarticulatory resistance and aggressiveness for the jaw in Catalan consonants and vowels and, more specifically, for the alveolopalatal nasal /ɲ/ and for dark /l/, for which there is little or no data on jaw position and coarticulation. Jaw movement data for symmetrical vowel-consonant-vowel sequences with the consonants /p, n, l, s, ʃ, ɲ, k/ and the vowels /i, a, u/ were recorded from three Catalan speakers with a midsagittal magnetometer. The data reveal that jaw height during the consonant is greater for /s, ʃ/ than for /p, ɲ/, which in turn is greater than for /n, l, k/, and during the vowel is greater for /i, u/ than for /a/. Differences in coarticulatory variability among consonants and vowels are inversely related to differences in jaw height, i.e., fricatives and high vowels are most resistant, and /n, l, k/ and the low vowel are least resistant. Moreover, coarticulation-resistant phonetic segments exert more prominent effects and, thus, are more aggressive than segments specified for a lower degree of coarticulatory resistance. The data are discussed in the light of the degree of articulatory constraint model of coarticulation.

3.
Cross-language perception studies report influences of speech style and consonantal context on perceived similarity and discrimination of non-native vowels by inexperienced and experienced listeners. Detailed acoustic comparisons of distributions of vowels produced by native speakers of North German (NG), Parisian French (PF) and New York English (AE) in citation (di)syllables and in sentences (surrounded by labial and alveolar stops) are reported here. Results of within- and cross-language discriminant analyses reveal striking dissimilarities across languages in the spectral/temporal variation of coarticulated vowels. As expected, vocalic duration was most important in differentiating NG vowels; it did not contribute to PF vowel classification. Spectrally, NG long vowels showed little coarticulatory change, but back/low short vowels were fronted/raised in alveolar context. PF vowels showed greater coarticulatory effects overall; back and front rounded vowels were fronted, low and mid-low vowels were raised in both sentence contexts. AE mid to high back vowels were extremely fronted in alveolar contexts, with little change in mid-low and low long vowels. Cross-language discriminant analyses revealed varying patterns of spectral (dis)similarity across speech styles and consonantal contexts that could, in part, account for AE listeners' perception of German and French front rounded vowels, and "similar" mid-high to mid-low vowels.
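As a concrete illustration of the kind of within-language discriminant analysis described above, the sketch below classifies vowel tokens from their spectral and temporal measures. All category means, variances, and token values are invented placeholders; a real analysis would use F1/F2 and duration measured from the recordings.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical vowel tokens; columns are F1 (Hz), F2 (Hz), duration (ms).
means = {"i": (300, 2300, 120), "a": (750, 1300, 140), "u": (320, 800, 125)}
X = np.vstack([rng.normal(m, (40, 120, 15), size=(30, 3)) for m in means.values()])
y = np.repeat(list(means.keys()), 30)

# Within-language discriminant analysis: how separable are the categories?
lda = LinearDiscriminantAnalysis()
scores = cross_val_score(lda, X, y, cv=5)
print(f"mean cross-validated classification accuracy: {scores.mean():.2f}")
```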

4.
A number of studies, involving English, Swedish, French, and Spanish, have shown that, for sequences of rounded vowels separated by nonlabial consonants, both EMG activity and lip protrusion diminish during the intervocalic consonant interval, producing a "trough" pattern. A two-part study was conducted to (a) compare patterns of protrusion movement (upper and lower lip) and EMG activity (orbicularis oris) for speakers of English and Turkish, a language where phonological rules constrain vowels within a word to agree in rounding and (b) determine which of two current models of coarticulation, the "look-ahead" and "coproduction" models, best explained the data. Results showed Turkish speakers producing "plateau" patterns of movement rather than troughs, and unimodal rather than bimodal patterns of EMG activity. In the second part of the study, one prediction of the coproduction model, that articulatory gestures have stable profiles across contexts, was tested by adding and subtracting movement data signals to synthesize naturally occurring patterns. Results suggest English and Turkish may have different modes of coarticulatory organization.
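The coproduction test in the second part of the study, summing movement signals built from stable gesture profiles, can be sketched as follows. The Gaussian gesture shape, timing, and amplitude values are illustrative assumptions, not the study's measured profiles.

```python
import numpy as np

t = np.linspace(0, 1, 500)  # normalized time over the V1-C-V2 sequence

def gesture(center, width=0.18, amplitude=1.0):
    """A stable, bell-shaped protrusion profile (an assumed shape)."""
    return amplitude * np.exp(-0.5 * ((t - center) / width) ** 2)

# Coproduction: overlapping vowel gestures simply superimpose.
v1, v2 = gesture(0.25), gesture(0.75)
combined = v1 + v2

# With this spacing, the summed signal dips at the consonant (a "trough");
# closer spacing or wider profiles yield a plateau instead.
print(f"protrusion at consonant midpoint: {combined[len(t) // 2]:.2f}")
print(f"protrusion at gesture peaks:      {combined.max():.2f}")
```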

5.
Coarticulation studies in the speech of deaf individuals have so far focused on intrasyllabic patterning of various consonant-vowel sequences. In this study, both inter- and intrasyllabic patterning were examined in disyllables /ə#CVC/, and the effects of phonetic context, speaking rate, and segment type were explored. Systematic observation of F2 and durational measurements in disyllables minimally contrasting in vocalic ([i], [u], [a]) and consonantal ([b], [d]) context, respectively, was made at selected locations in the disyllable, in order to relate inferences about articulatory adjustments to their temporal coordinates. Results indicated that intervocalic coarticulation across hearing and deaf speakers varied as a function of the phonetic composition of the disyllables (b_b or d_d). The deaf speakers showed reduced intervocalic coarticulation for bilabial but not for alveolar disyllables compared to the hearing speakers. Furthermore, they showed less marked consonant influences on the schwa and stressed vowel of disyllables compared to the hearing controls. Rate effects were minimal and did not alter the coarticulatory patterns observed across hearing status. The above findings modify the conclusions drawn from previous studies and suggest that the speech of deaf and hearing speakers is guided by different gestural organization.

6.
Within the debate on the mechanisms underlying infants' perceptual acquisition, one hypothesis proposes that infants' perception is directly affected by the acoustic implementation of sound categories in the speech they hear. In consonance with this view, the present study shows that individual variation in fine-grained, subphonemic aspects of the acoustic realization of /s/ in caregivers' speech predicts infants' discrimination of this sound from the highly similar /ʃ/, suggesting that learning based on acoustic cue distributions may indeed drive natural phonological acquisition.
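One way to make the "acoustic cue distributions" idea concrete is to measure how well a single assumed cue, such as the spectral centroid of the frication noise, separates a caregiver's /s/ and /ʃ/ tokens. The sketch below computes a d'-style separation index over invented centroid values; which cues the study actually measured is not specified here.

```python
import numpy as np

# Invented placeholder measurements for one caregiver's tokens (Hz).
s_centroids = np.array([7900.0, 8200.0, 8100.0, 7600.0, 8400.0])   # /s/
sh_centroids = np.array([4400.0, 4900.0, 4600.0, 5100.0, 4700.0])  # /ʃ/

# d'-style index: mean separation scaled by pooled within-category spread.
pooled_sd = np.sqrt((s_centroids.var(ddof=1) + sh_centroids.var(ddof=1)) / 2)
d_prime = (s_centroids.mean() - sh_centroids.mean()) / pooled_sd
print(f"cue separation d' = {d_prime:.1f}")  # larger = more discriminable input
```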

7.
The present study investigated anticipatory labial coarticulation in the speech of adults and children. CV syllables, composed of [s], [t], and [d] before [i] and [u], were produced by four adult speakers and eight child speakers aged 3-7 years. Each stimulus was computer edited to include only the aperiodic portion of fricative-vowel and stop-vowel syllables. LPC spectra were then computed for each excised segment. Analyses of the effect of the following vowel on the spectral peak associated with the second formant frequency and on the characteristic spectral prominence for each consonant were performed. Perceptual data were obtained by presenting the aperiodic consonantal segments to subjects who were instructed to identify the following vowel as [i] or [u]. Both the acoustic and the perceptual data show strong coarticulatory effects for the adults and comparable, although less consistent, coarticulation in the speech stimuli of the children. The results are discussed in terms of the articulatory and perceptual aspects of coarticulation in language learning.  相似文献   
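A minimal sketch of the LPC analysis step described above: fit an all-pole model to an excised aperiodic segment and read off spectral peaks, such as the one associated with F2. The sampling rate, LPC order, and the noise segment are assumptions standing in for the study's real recordings.

```python
import numpy as np
from scipy.signal import find_peaks

def lpc(x, order):
    """Autocorrelation-method LPC coefficients via the Levinson-Durbin recursion."""
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]
    a = np.zeros(order + 1)
    a[0], err = 1.0, r[0]
    for i in range(1, order + 1):
        k = -(r[i] + np.dot(a[1:i], r[1:i][::-1])) / err
        a[1:i + 1] += k * a[:i][::-1].copy()  # copy() avoids in-place aliasing
        err *= 1.0 - k * k
    return a

sr = 10000                          # assumed sampling rate (Hz)
rng = np.random.default_rng(2)
segment = rng.standard_normal(512)  # placeholder for an excised aperiodic segment

a = lpc(segment * np.hamming(len(segment)), order=12)
freqs = np.fft.rfftfreq(2048, d=1.0 / sr)
spectrum_db = -20 * np.log10(np.abs(np.fft.rfft(a, n=2048)) + 1e-12)
peaks, _ = find_peaks(spectrum_db)  # candidate formant-like spectral peaks
print("LPC spectral peaks (Hz):", np.round(freqs[peaks]).astype(int))
```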

8.
Speech perception requires the integration of information from multiple phonetic and phonological dimensions. A sizable literature exists on the relationships between multiple phonetic dimensions and single phonological dimensions (e.g., spectral and temporal cues to stop consonant voicing). A much smaller body of work addresses relationships between phonological dimensions, and much of this has focused on sequences of phones. However, strong assumptions about the relevant set of acoustic cues and/or the (in)dependence between dimensions limit previous findings in important ways. Recent methodological developments in the general recognition theory framework enable tests of a number of these assumptions and provide a more complete model of distinct perceptual and decisional processes in speech sound identification. A hierarchical Bayesian Gaussian general recognition theory model was fit to data from two experiments investigating identification of English labial stop and fricative consonants in onset (syllable initial) and coda (syllable final) position. The results underscore the importance of distinguishing between conceptually distinct processing levels and indicate that, for individual subjects and at the group level, integration of phonological information is partially independent with respect to perception and that patterns of independence and interaction vary with syllable position.
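For readers unfamiliar with general recognition theory (GRT), the toy sketch below illustrates its core representation, not the hierarchical Bayesian model fitted in the study: each stimulus, defined by two binary phonological dimensions, maps to a bivariate Gaussian in a two-dimensional perceptual space, and identification follows maximum likelihood. The dimension labels, means, and covariances are all invented for illustration; a nonzero off-diagonal covariance term encodes a failure of perceptual independence.

```python
from scipy.stats import multivariate_normal

# Four stimuli crossing two hypothetical binary dimensions (manner x voicing).
stimuli = {
    ("stop", "voiceless"): multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 1.0]]),
    ("stop", "voiced"):    multivariate_normal([0, 2], [[1.0, 0.0], [0.0, 1.0]]),
    ("fric", "voiceless"): multivariate_normal([2, 0], [[1.0, 0.0], [0.0, 1.0]]),
    ("fric", "voiced"):    multivariate_normal([2, 2], [[1.0, 0.0], [0.0, 1.0]]),
}

def identify(percept):
    """Maximum-likelihood identification over the four category distributions."""
    return max(stimuli, key=lambda s: stimuli[s].pdf(percept))

print(identify([0.4, 1.6]))  # most likely category for this noisy percept
```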

9.
The conditions under which listeners do and do not compensate for coarticulatory vowel nasalization were examined through a series of experiments of listeners' perception of naturally produced American English oral and nasal vowels spliced into three contexts: oral (C_C), nasal (N_N), and isolation. Two perceptual paradigms, a rating task in which listeners judged the relative nasality of stimulus pairs and a 4IAX discrimination task in which listeners judged vowel similarity, were used with two listener groups, native English speakers and native Thai speakers. Thai and English speakers were chosen because their languages differ in the temporal extent of anticipatory vowel nasalization. Listeners' responses were highly context dependent. For both perceptual paradigms and both language groups, listeners were less accurate at judging vowels in nasal than in non-nasal (oral or isolation) contexts; nasal vowels in nasal contexts were the most difficult to judge. Response patterns were generally consistent with the hypothesis that, given an appropriate and detectable nasal consonant context, listeners compensate for contextual vowel nasalization and attribute the acoustic effects of the nasal context to their coarticulatory source. However, the results also indicated that listeners do not hear nasal vowels in nasal contexts as oral; listeners retained some sensitivity to vowel nasalization in all contexts, indicating partial compensation for coarticulatory vowel nasalization. Moreover, there were small but systematic differences between the native Thai- and native English-speaking groups. These differences are as expected if perceptual compensation is partial and the extent of compensation is linked to patterns of coarticulatory nasalization in the listeners' native language.

10.
This study investigated some effects of postlingual deafness on speech by exploring selected properties of consonants, vowels, and suprasegmentals in the speech of seven totally, postlingually deafened individuals. The observed speech properties included parameters that function as phonological contrasts in English, as well as parameters that constitute primarily phonetic distinctions. The results demonstrated that postlingual deafness affects the production of all classes of speech sounds, suggesting that auditory feedback is implicated in regulating the phonetic precision of consonants, vowels, and suprasegmentals over the long term. In addition, the results are discussed in relation to factors that may influence the degree of speech impairment, such as age at onset of deafness.

11.
This paper examines whether correlations between speech perception and speech production exist, and, if so, whether they might provide a way of evaluating different acoustic metrics. The cues listeners use for many phonemic distinctions are not known, often because many different acoustic cues are highly correlated with one another, making it difficult to distinguish among them. Perception-production correlations may provide a new means of doing so. In the present paper, correlations were examined between acoustic measures taken on listeners' perceptual prototypes for a given speech category and on their average production of members of that category. Significant correlations were found for VOT among stop consonants, and for spectral peaks (but not centroids or skewness) for voiceless fricatives. These results suggest that correlations between speech perception and production may provide a methodology for evaluating different proposed acoustic metrics.
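The correlational method is simple to state in code: pair each participant's perceptual-prototype measurement (e.g., the VOT of their best-rated stop) with the mean of the same measure over their own productions, then correlate across participants. The sketch below uses invented VOT values as placeholders.

```python
import numpy as np
from scipy.stats import pearsonr

# One value per participant; placeholders for measured VOTs (ms).
prototype_vot = np.array([62, 75, 58, 80, 70, 66, 73, 60])   # perception
production_vot = np.array([65, 79, 55, 84, 74, 63, 76, 58])  # production

r, p = pearsonr(prototype_vot, production_vot)
print(f"perception-production correlation: r = {r:.2f}, p = {p:.3f}")
```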

12.
Articulatory dynamics of loud and normal speech
A comparison was made between normal and loud productions of bilabial stops and stressed vowels. Simultaneous recordings of lip and jaw movement and the accompanying audio signal were made for four native speakers of Swedish. The stimuli consisted of 12 Swedish vowels appearing in an /i'b_b/ frame and were produced with both normal and increased vocal effort. The displacement, velocity, and relative timing associated with the individual articulators as well as their coarticulatory interactions were studied together with changes in acoustic segmental duration. It is shown that the production of loud as compared with normal speech is characterized by amplification of normal movement patterns that are predictable for the above articulatory parameters. In addition, it was observed that the acoustic durations of bilabial stops were shortened, whereas stressed vowels were lengthened during loud speech production. Two interpretations of the data are offered, viewing loud articulatory behavior as a response to production demands and perceptual constraints, respectively.
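The kinematic parameters mentioned above (displacement and velocity) are typically derived from sampled movement trajectories by numerical differentiation. A minimal sketch, with a synthetic trajectory and an assumed sampling rate standing in for the recorded lip and jaw data:

```python
import numpy as np

sr = 500  # assumed kinematic sampling rate (Hz)
t = np.arange(0, 0.4, 1.0 / sr)

# Synthetic stand-in: one smooth opening-closing movement cycle (mm).
position = 10 * (1 - np.cos(2 * np.pi * t / 0.4)) / 2

velocity = np.gradient(position, 1.0 / sr)  # numerical derivative, mm/s
print(f"peak displacement: {position.max():.1f} mm")
print(f"peak velocity:     {np.abs(velocity).max():.0f} mm/s")
```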

13.
This study investigated whether any perceptually useful coarticulatory information is carried by the release burst of the first of two successive, nonhomorganic stop consonants. The CV portions of natural VCCV utterances were replaced with matched synthetic stimuli from a continuum spanning the three places of stop articulation. There was a sizable effect of coarticulatory cues in the natural-speech portion on the perception of the second stop consonant. Moreover, when the natural VC portions including the final release burst were presented in isolation, listeners were significantly better than chance at guessing the identity of the following, "missing" syllable-initial stop. The hypothesis that the release burst of a syllable-final stop contains significant coarticulatory information about the place of articulation of a following, nonhomorganic stop was further confirmed in acoustic analyses which revealed significant effects of CV context on the spectral properties of the release bursts. The relationship between acoustic stimulus properties and listeners' perceptual responses was not straightforward, however.

14.
There is size information in natural sounds. For example, as humans grow in height, their vocal tracts increase in length, producing a predictable decrease in the formant frequencies of speech sounds. Recent studies have shown that listeners can make fine discriminations about which of two speakers has the longer vocal tract, supporting the view that the auditory system discriminates changes on the acoustic-scale dimension. Listeners can also recognize vowels scaled well beyond the range of vocal tracts normally experienced, indicating that perception is robust to changes in acoustic scale. This paper reports two perceptual experiments designed to extend research on acoustic scale and size perception to the domain of musical sounds: The first study shows that listeners can discriminate the scale of musical instrument sounds reliably, although not quite as well as for voices. The second experiment shows that listeners can recognize the family of an instrument sound which has been modified in pitch and scale beyond the range of normal experience. We conclude that processing of acoustic scale in music perception is very similar to processing of acoustic scale in speech perception.
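The link between vocal-tract length and formant frequency that underlies this size information can be illustrated with the standard quarter-wavelength resonator approximation, F_n = (2n - 1)c / 4L. This is a textbook idealization for a uniform tube, not the paper's own analysis.

```python
# Approximate speed of sound in warm, moist air (cm/s).
c = 35000.0

def formants(tract_length_cm, n=3):
    """First n resonances of a uniform quarter-wavelength tube of length L."""
    return [(2 * k - 1) * c / (4 * tract_length_cm) for k in range(1, n + 1)]

# Longer tract -> uniformly lower formants (the acoustic-scale shift).
print("17.5 cm tract:", [round(f) for f in formants(17.5)])  # adult-like
print("12.0 cm tract:", [round(f) for f in formants(12.0)])  # child-like
```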

15.
Perception of second language speech sounds is influenced by one's first language. For example, speakers of American English have difficulty perceiving dental versus retroflex stop consonants in Hindi although English has both dental and retroflex allophones of alveolar stops. Japanese, unlike English, has a contrast similar to Hindi, specifically, the Japanese /d/ versus the flapped /r/ which is sometimes produced as a retroflex. This study compared American and Japanese speakers' identification of the Hindi contrast in CV syllable contexts where C varied in voicing and aspiration. The study then evaluated the participants' increase in identifying the distinction after training with a computer-interactive program. Training sessions progressively increased in difficulty by decreasing the extent of vowel truncation in stimuli and by adding new speakers. Although all participants improved significantly, Japanese participants were more accurate than Americans in distinguishing the contrast on pretest, during training, and on posttest. Transfer was observed to three new consonantal contexts, a new vowel context, and a new speaker's productions. Some abstract aspect of the contrast was apparently learned during training. It is suggested that allophonic experience with dental and retroflex stops may be detrimental to perception of the new contrast.

16.
This study examined the effect of presumed mismatches between speech input and the phonological representations of English words by native speakers of English (NE) and Spanish (NS). The English test words, which were produced by a NE speaker and a NS speaker, varied orthogonally in lexical frequency and neighborhood density and were presented to NE listeners and to NS listeners who differed in English pronunciation proficiency. It was hypothesized that mismatches between phonological representations and speech input would impair word recognition, especially for items from dense lexical neighborhoods, which are phonologically similar to many other words and require finer sound discrimination. Further, it was assumed that L2 phonological representations would change with L2 proficiency. The results showed the expected mismatch effect only for words from dense neighborhoods. For Spanish-accented stimuli, the NS groups recognized more words from dense neighborhoods than the NE group did. For native-produced stimuli, the low-proficiency NS group recognized fewer words than the other two groups. The high-proficiency NS participants' performance was as good as the NE group's for words from sparse neighborhoods, but not for words from dense neighborhoods. These results are discussed in relation to the development of phonological representations of L2 words.
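Neighborhood density, the variable manipulated above, is conventionally the count of lexicon entries that differ from a word by a single phoneme substitution, deletion, or insertion. A minimal sketch over a toy phoneme-string lexicon:

```python
def is_neighbor(a, b):
    """True if phoneme strings a and b differ by exactly one edit."""
    if a == b:
        return False
    if len(a) == len(b):                      # one substitution
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) == 1:             # one insertion/deletion
        short, long_ = sorted((a, b), key=len)
        return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))
    return False

# Toy lexicon of broad transcriptions (illustrative, not from the study).
lexicon = ["kæt", "bæt", "kæp", "kɑt", "æt", "kæts", "dɔg"]
density = {w: sum(is_neighbor(w, v) for v in lexicon) for w in lexicon}
print(density)  # e.g., "kæt" sits in a dense neighborhood, "dɔg" in a sparse one
```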

17.
Icelandic has a phonologic contrast of quantity, distinguishing long and short vowels and consonants. Perceptual studies have shown that a major cue for quantity in perception is relational, involving the vowel-to-rhyme ratio. This cue is approximately invariant under transformations of rate, thus yielding a higher-order invariant for the perception of quantity in Icelandic. Recently it has, however, been shown that vowel spectra can also influence the perception of quantity. This holds for vowels which have different spectra in their long and short varieties. This finding raises the question of whether the durational contrast is less well articulated in those cases where vowel spectra provide another cue for quantity. To test this possibility, production measurements were carried out on vowels and consonants in words which were spoken by a number of speakers at different utterance rates in two experiments. A simple neural network was then trained on the production measurements. Using the network to classify the training stimuli shows that the durational distinctions between long and short phonemes are as clearly articulated whether or not there is a secondary, spectral, cue to quantity.
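The classification step can be sketched with a small network trained on durational measurements to recover long versus short quantity. The features (vowel duration, rhyme duration, and the relational vowel-to-rhyme ratio) and all token values below are illustrative assumptions; the original network's architecture and data are not reproduced here.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
n = 80  # hypothetical tokens per quantity category

# Long vowels: high vowel-to-rhyme ratio; short vowels: low ratio (ms).
v_long = rng.normal(160, 25, n); r_long = v_long + rng.normal(60, 15, n)
v_short = rng.normal(80, 20, n); r_short = v_short + rng.normal(130, 25, n)

X = np.column_stack([
    np.concatenate([v_long, v_short]),
    np.concatenate([r_long, r_short]),
    np.concatenate([v_long / r_long, v_short / r_short]),  # relational cue
])
y = np.array(["long"] * n + ["short"] * n)

clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(8,), max_iter=1000,
                                  random_state=0))
clf.fit(X, y)
print(f"training accuracy: {clf.score(X, y):.2f}")
```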

18.
The question of whether visual information can affect ongoing speech production arises from numerous studies demonstrating an interaction between auditory and visual information during speech perception. In a preliminary study, the effect of delayed visual feedback on speech production was examined. Two of the 13 subjects demonstrated speech errors that were directly related to the delayed visual signal. However, in the main experiment, providing immediate visual feedback of the articulators did not diminish the effects of delayed auditory feedback for 11 speakers.

19.
20.
A new processor, called the spectral maxima sound processor (SMSP), has been developed for the University of Melbourne/Nucleus Limited multielectrode cochlear implant. The SMSP analyses sound signals by means of a bandpass filterbank having 16 channels which are allocated tonotopically to the implanted electrodes. Every 4 ms, typically, the six channels with the largest amplitudes are selected, and six corresponding electrodes are activated. In an ongoing study the performance of the SMSP is being compared with that of the Mini Speech Processor (MSP). Some results of speech perception tests from the first two SMSP users are presented, in which scores for the recognition of vowels, consonants, and words all showed significant increases over the corresponding MSP scores.
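The processing scheme as described, a 16-channel bandpass filterbank scanned every 4 ms for its six largest-amplitude channels, can be sketched as follows. Only the 16-channel/6-maxima/4-ms structure comes from the abstract; the filter design (Butterworth), the frequency range, and RMS envelope estimation are assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfilt

sr = 16000
edges = np.geomspace(250, 7000, 17)  # 16 log-spaced channel bands (assumed)
banks = [butter(2, (lo, hi), btype="bandpass", fs=sr, output="sos")
         for lo, hi in zip(edges[:-1], edges[1:])]

def smsp_frame_maxima(signal, frame_ms=4, n_maxima=6):
    """Per 4-ms frame, return indices of the 6 largest-amplitude channels."""
    channels = np.array([sosfilt(sos, signal) for sos in banks])
    frame_len = int(sr * frame_ms / 1000)
    n_frames = channels.shape[1] // frame_len
    frames = channels[:, :n_frames * frame_len].reshape(16, n_frames, frame_len)
    rms = np.sqrt((frames ** 2).mean(axis=2))      # channel amplitude per frame
    return np.argsort(rms, axis=0)[-n_maxima:, :]  # channels -> electrodes

rng = np.random.default_rng(4)
maxima = smsp_frame_maxima(rng.standard_normal(sr // 10))  # 100 ms of noise
print("selected channels in first frame:", sorted(maxima[:, 0]))
```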

