Similar Documents
20 similar documents found
1.
The timing of upper lip protrusion movements and accompanying acoustic events was examined for multiple repetitions of word pairs such as "lee coot" and "leaked coot" by four speakers of American English. The duration of the intervocalic consonant string was manipulated by using various combinations of /s/, /t/, /k/, /h/, and /#/. Pairwise comparisons were made of consonant string duration (acoustic /i/ offset to acoustic /u/ onset) with durations of: protrusion movement beginning to acoustic /u/ onset, maximum acceleration of the movement to acoustic /u/ onset, and acoustic /u/ onset to movement end. There were some consonant-specific protrusion effects, primarily on the movement beginning event for /s/. Inferences from measures of the maximum acceleration and movement end events for the non-/s/ subset suggested the simultaneous and variable expression of three competing constraints: (1) end the protrusion movement during the voiced part of the /u/; (2) use a preferred movement duration; and (3) begin the /u/-related protrusion movement when permitted by relaxation of the perceptually motivated constraint that the preceding /i/ be unrounded. The subjects differed in the degree of expression of each constraint, but the results generally indicate that anticipatory coarticulation of lip protrusion is influenced both by acoustic-phonetic context dependencies and dynamical properties of movements. Because of the extensive variation in the data and the small number of subjects, these ideas are tentative; additional work is needed to explore them further.
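The pairwise interval measures described above reduce to differences between event timestamps. A minimal sketch (the event labels and millisecond values below are hypothetical illustrations, not data from the study):

```python
# Hypothetical event times (ms) for one token; all values are invented.
events = {
    "i_offset": 0.0,      # acoustic /i/ offset (consonant string onset)
    "move_begin": 40.0,   # protrusion movement beginning
    "peak_accel": 90.0,   # maximum acceleration of the movement
    "u_onset": 260.0,     # acoustic /u/ onset
    "move_end": 320.0,    # protrusion movement end
}

def interval(a, b):
    """Duration (ms) from event a to event b."""
    return events[b] - events[a]

consonant_string = interval("i_offset", "u_onset")  # acoustic measure
begin_to_u = interval("move_begin", "u_onset")      # movement beginning
accel_to_u = interval("peak_accel", "u_onset")      # peak acceleration
u_to_end = interval("u_onset", "move_end")          # movement end
```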

2.
The influence of vocalic context on various temporal and spectral properties of preceding acoustic segments was investigated in utterances containing [ə#CV] sequences produced by two girls aged 4;8 and 9;5 years and by their father. The younger (but not the older) child's speech showed a systematic lowering of [s] noise and [tʰ] release burst spectra before [u] as compared to [i] and [æ]. The older child's speech, on the other hand, showed an orderly relationship of the second-formant frequency in [ə] to the transconsonantal vowel. Both children tended to produce longer [s] noises and voice onset times as well as higher second-formant peaks at constriction noise offset before [i] than before [u] and [æ]. All effects except the first were shown by the adult who, in addition, produced first-formant frequencies in [ə] that anticipated the transconsonantal vowel. These observations suggest that different forms of anticipatory coarticulation may have different causes and may follow different developmental patterns. A strategy for future research is suggested.

3.
Earlier work [Nittrouer et al., J. Speech Hear. Res. 32, 120-132 (1989)] demonstrated greater evidence of coarticulation in the fricative-vowel syllables of children than in those of adults when measured by anticipatory vowel effects on the resonant frequency of the fricative back cavity. In the present study, three experiments showed that this increased coarticulation led to improved vowel recognition from the fricative noise alone: Vowel identification by adult listeners was better overall for children's productions and was successful earlier in the fricative noise. This enhanced vowel recognition for children's samples was obtained in spite of the fact that children's and adults' samples were randomized together, therefore indicating that listeners were able to normalize the vowel information within a fricative noise where there often was acoustic evidence of only one formant associated primarily with the vowel. Correct vowel judgments were found to be largely independent of fricative identification. However, when another coarticulatory effect, the lowering of the main spectral prominence of the fricative noise for /u/ versus /i/, was taken into account, vowel judgments were found to interact with fricative identification. The results show that listeners are sensitive to the greater coarticulation in children's fricative-vowel syllables, and that, in some circumstances, they do not need to make a correct identification of the most prominently specified phone in order to make a correct identification of a coarticulated one.

4.
One purpose of the present investigation was to attempt to better understand articulatory movement characteristics of children's speech, particularly as they might relate to the question of why acoustic measures of children's segment durations are often longer than those of adults. In order to address this issue and to consider other general characteristics of children's speech production development, a variety of data was obtained from three groups of children and from a group of adults using strain gauge instrumentation to monitor superior-inferior lip and jaw displacement and peak velocity. Results indicate that the children's peak velocity and articulatory displacement measures were in many respects quite similar to those of the adults, although certain differences were observed. For a number of measures, there were also few peak velocity or displacement differences observed among the three age groups of children, despite the fact that they spanned about a six-year age range. In general, it appears that even when children and adults produce consonant sounds that are perceptually "correct," articulatory differences can be observed among their productions.

5.
Historically, studies of vocal vibrato have concentrated on pulse rate as being a primary factor in determining whether a given vocal movement is a good or bad vibrato or a tremolo or wobble. More recently, investigators have been studying the extent of frequency variation and amplitude variation around their respective means in order to determine their influence on the perception of vibrato. The present study is an additional attempt to understand the three parameters comprising vibrato, their interrelationship, and their relationship to perception. Samples of sustained sung tones were obtained primarily from recordings. The samples were digitized using a 16-bit A/D converter at a sampling frequency of 10 kHz. Each digitized sample was converted to a useful format for marking purposes in order to derive information on vibrato pulse rate, the mean frequency of the tone, the semitone deviation around the mean, percent frequency deviation, and percent amplitude variation around the mean amplitude. Data presentation utilizes representative samples of good vibrato, tremolo, and wobble and describes differences in waveforms which may impact perception.
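The vibrato parameters named above (pulse rate, semitone deviation around the mean, percent frequency deviation) can be estimated from a sampled f0 track. A minimal sketch under stated assumptions: the zero-crossing rate counting and the synthetic 440 Hz test tone are illustrative choices, not the study's marking procedure:

```python
import math

def vibrato_params(f0, fs):
    """Estimate vibrato rate (Hz), maximum semitone deviation around the
    mean frequency, and percent frequency deviation from an f0 track
    sampled fs times per second."""
    mean_f0 = sum(f0) / len(f0)
    dev = [f - mean_f0 for f in f0]
    # Two zero crossings of the mean-removed track per vibrato cycle.
    crossings = sum(1 for a, b in zip(dev, dev[1:])
                    if a * b < 0 or (a == 0 and b != 0))
    rate = crossings / 2.0 / (len(f0) / fs)
    semitone_dev = max(12 * abs(math.log2(f / mean_f0)) for f in f0)
    pct_dev = 100.0 * max(abs(d) for d in dev) / mean_f0
    return rate, semitone_dev, pct_dev

# Synthetic sung tone: 440 Hz mean, 5.5 Hz vibrato, +/-3 percent extent,
# f0 sampled 100 times per second for 2 s.
fs = 100
f0 = [440.0 * (1 + 0.03 * math.sin(2 * math.pi * 5.5 * n / fs))
      for n in range(2 * fs)]
rate, st_dev, pct = vibrato_params(f0, fs)
```

On real sung tones the f0 track would first have to be extracted from the waveform; the zero-crossing count is a crude rate estimator that works only on a clean track.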

6.
7.
Training American listeners to perceive Mandarin tones has been shown to be effective, with trainees' identification improving by 21%. Improvement also generalized to new stimuli and new talkers, and was retained when tested six months after training [Y. Wang et al., J. Acoust. Soc. Am. 106, 3649-3658 (1999)]. The present study investigates whether the tone contrasts gained perceptually transferred to production. Before their perception pretest and after their post-test, the trainees were recorded producing a list of Mandarin words. Their productions were first judged by native Mandarin listeners in an identification task. Identification of trainees' post-test tone productions improved by 18% relative to their pretest productions, indicating significant tone production improvement after perceptual training. Acoustic analyses of the pre- and post-training productions further reveal the nature of the improvement, showing that post-training tone contours approximate native norms to a greater degree than pretraining tone contours. Furthermore, pitch height and pitch contour are not mastered in parallel, with the former being more resistant to improvement than the latter. These results are discussed in terms of the relationship between non-native tone perception and production as well as learning at the suprasegmental level.
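One way to quantify how closely a learner's tone contour "approximates native norms" is an RMS distance in semitones between time-aligned pitch contours. A hedged sketch: the metric and all contour values below are hypothetical stand-ins, not the study's analysis:

```python
import math

def contour_distance(f0_a, f0_b):
    """RMS distance (semitones) between two equal-length pitch contours;
    smaller means closer in both pitch height and contour shape."""
    assert len(f0_a) == len(f0_b)
    diffs = [12 * math.log2(a / b) for a, b in zip(f0_a, f0_b)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical contours (Hz): a native norm for a rising tone versus a
# learner's pre- and post-training attempts (invented numbers).
native = [180, 200, 230, 270]
pre    = [210, 215, 225, 240]
post   = [185, 205, 235, 265]
```

Using semitones rather than raw Hz makes the comparison independent of the talkers' overall pitch register, which matters when comparing learners against native talkers.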

8.
This paper reports two series of experiments that examined the phonetic correlates of lexical stress in Vietnamese compounds in comparison to their phrasal constructions. In the first series of experiments, acoustic and perceptual characteristics of Vietnamese compound words and their phrasal counterparts were investigated on five likely acoustic correlates of stress or prominence (f0 range and contour, duration, intensity and spectral slope, vowel reduction), elicited under two distinct speaking conditions: a "normal speaking" condition and a "maximum contrast" condition which encouraged speakers to employ prosodic strategies for disambiguation. The results suggested that Vietnamese lacks phonetic resources for distinguishing compounds from phrases lexically and that native speakers may employ a phrase-level prosodic disambiguation strategy (juncture marking), when required to do so. However, in a second series of experiments, minimal pairs of bisyllabic coordinative compounds with reversible syllable positions were examined for acoustic evidence of asymmetrical prominence relations. Clear evidence of asymmetric prominences in coordinative compounds was found, supporting independent results obtained from an analysis of reduplicative compounds and tone sandhi in Vietnamese [Nguyễn and Ingram, 2006]. A reconciliation of these apparently conflicting findings on word stress in Vietnamese is presented and discussed.

9.
Acoustic and kinematic analyses, as well as perceptual evaluation, were conducted on the speech of Parkinsonian and normal geriatric adults. As a group, the Parkinsonian speakers had very limited jaw movement compared to the normal geriatrics. For opening gestures, jaw displacements and velocities produced by the Parkinsonian subjects were about half those produced by the normal geriatrics. Lower lip movement amplitude and velocity also were reduced for the Parkinsonian speakers relative to the normal geriatrics, but the magnitude of the reduction was not as great as that seen in the jaw. Lower lip closing velocities expressed as a function of movement amplitude were greater for the Parkinsonian speakers than for the normal geriatrics. This increased velocity of lower lip movement may reflect a difference in the control of lip elevation for the Parkinsonian speakers, an effect that increased with the severity of dysarthria. Acoustically, the Parkinsonian subjects had reduced durations of vocalic segments, reduced formant transitions, and increased voice onset time compared to the normal geriatrics. These effects were greater for the more severe, compared to the milder, dysarthrics and were most apparent in the more complex, vocalic gestures.
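Kinematic measures like peak velocity are typically derived from a sampled displacement trace by numerical differentiation. A minimal sketch, assuming a central-difference derivative and an invented cosine-shaped opening gesture (the 500 Hz rate and 10 mm excursion are illustrative, not the study's recordings):

```python
import math

def peak_velocity(displacement, fs):
    """Peak absolute velocity (units/s) from a sampled displacement
    trace, via a first central difference."""
    v = [(displacement[i + 1] - displacement[i - 1]) * fs / 2.0
         for i in range(1, len(displacement) - 1)]
    return max(abs(x) for x in v)

# Hypothetical jaw-opening trace at 500 samples/s: a smooth 10 mm
# opening over 100 ms (51 samples).
fs = 500
trace = [5.0 * (1 - math.cos(math.pi * n / 50)) for n in range(51)]
pv = peak_velocity(trace, fs)  # mm/s, peaks mid-gesture
```

Real articulatory data would normally be low-pass filtered before differentiation, since differencing amplifies measurement noise.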

10.
Acoustic analyses and perception experiments were conducted to determine the effects of brief deprivation of auditory feedback on fricatives produced by cochlear implant users. The words /si/ and /ʃi/ were recorded by four children and four adults with their cochlear implant speech processor turned on or off. In the processor-off condition, word durations increased significantly for a majority of talkers. These increases were greater for children compared to adults, suggesting that children may rely on auditory feedback to a greater extent than adults. Significant differences in spectral measures of /ʃ/ were found between processor-on and processor-off conditions for two of the four children and for one of the four adults. These talkers also demonstrated a larger /s/-/ʃ/ contrast in centroid values compared to the other talkers within their respective groups. This finding may indicate that talkers who produce fine spectral distinctions are able to perceive these distinctions through their implants and to use this feedback to fine tune their speech. Two listening experiments provided evidence that some of the acoustic changes were perceptible to normal-hearing listeners. Taken together, these experiments indicate that for certain cochlear-implant users the brief absence of auditory feedback may lead to perceptible modifications in fricative consonants.
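The centroid measure used to index the /s/-/ʃ/ contrast is an amplitude-weighted mean frequency of the fricative noise spectrum. A sketch under stated assumptions: a plain O(n²) DFT is used for self-containment, and the two pure-tone frames standing in for /s/ (energy near 6 kHz) and /ʃ/ (near 3 kHz) are invented test signals, not fricative recordings:

```python
import math

def spectral_centroid(frame, fs):
    """Amplitude-weighted mean frequency (Hz) of a frame, computed
    from a naive DFT magnitude spectrum (DC bin excluded)."""
    n = len(frame)
    num = den = 0.0
    for k in range(1, n // 2):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n)
                 for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n)
                  for t in range(n))
        mag = math.hypot(re, im)
        num += (k * fs / n) * mag
        den += mag
    return num / den

fs = 16000
n = 256
# Invented stand-ins: /s/-like energy near 6 kHz, /ʃ/-like near 3 kHz.
s_frame = [math.sin(2 * math.pi * 6000 * t / fs) for t in range(n)]
sh_frame = [math.sin(2 * math.pi * 3000 * t / fs) for t in range(n)]
```

A higher centroid for /s/ than for /ʃ/ is the expected ordering; a talker's /s/-/ʃ/ centroid difference is then a one-number index of how distinct the two fricatives are.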

11.
Effects of noise on speech production: acoustic and perceptual analyses
Acoustical analyses were carried out on a set of utterances produced by two male speakers talking in quiet and in 80, 90, and 100 dB SPL of masking noise. In addition to replicating previous studies demonstrating increases in amplitude, duration, and vocal pitch while talking in noise, these analyses also found reliable differences in the formant frequencies and short-term spectra of vowels. Perceptual experiments were also conducted to assess the intelligibility of utterances produced in quiet and in noise when they were presented at equal S/N ratios for identification. In each experiment, utterances originally produced in noise were found to be more intelligible than utterances produced in quiet. The results of the acoustic analyses showed clear and consistent differences in the acoustic-phonetic characteristics of speech produced in quiet versus noisy environments. Moreover, these acoustic differences produced reliable effects on intelligibility. The findings are discussed in terms of: (1) the nature of the acoustic changes that take place when speakers produce speech under adverse conditions such as noise, psychological stress, or high cognitive load; (2) the role of training and feedback in controlling and modifying a talker's speech to improve performance of current speech recognizers; and (3) the development of robust algorithms for recognition of speech in noise.
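Presenting utterances "at equal S/N ratios" means scaling the masking noise relative to each utterance's level. A minimal sketch of that step, assuming an RMS-based SNR definition (the sine-plus-uniform-noise signals are invented placeholders for a speech token and a masker):

```python
import math
import random

def rms(x):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def scale_noise_for_snr(speech, noise, snr_db):
    """Scale noise so that mixing it with speech yields the target
    speech-to-noise ratio in dB."""
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20.0))
    return [gain * v for v in noise]

# Invented stand-ins for a speech token and a noise masker.
random.seed(0)
speech = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(8000)]
noise = [random.uniform(-1, 1) for _ in range(8000)]
scaled = scale_noise_for_snr(speech, noise, 10.0)
snr = 20 * math.log10(rms(speech) / rms(scaled))  # should be ~10 dB
```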

12.
Acoustic and perceptual similarities between Japanese and American English (AE) vowels were investigated in two studies. In study 1, a series of discriminant analyses were performed to determine acoustic similarities between Japanese and AE vowels, each spoken by four native male speakers using F1, F2, and vocalic duration as input parameters. In study 2, the Japanese vowels were presented to native AE listeners in a perceptual assimilation task, in which the listeners categorized each Japanese vowel token as most similar to an AE category and rated its goodness as an exemplar of the chosen AE category. Results showed that the majority of AE listeners assimilated all Japanese vowels into long AE categories, apparently ignoring temporal differences between 1- and 2-mora Japanese vowels. In addition, not all perceptual assimilation patterns reflected context-specific spectral similarity patterns established by discriminant analysis. It was hypothesized that this incongruity between acoustic and perceptual similarity may be due to differences in distributional characteristics of native and non-native vowel categories that affect the listeners' perceptual judgments.
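The discriminant analyses classify each vowel token in an (F1, F2, duration) space. As a crude, hedged stand-in for that idea (nearest-centroid classification rather than a true discriminant analysis; the standardized centroid values and the test token below are invented):

```python
import math

# Hypothetical per-category centroids in a standardized
# (F1, F2, duration) space -- invented, not the study's measurements.
centroids = {
    "i":  (-1.0,  1.2,  0.8),   # low F1, high F2, long
    "u":  (-0.9, -1.1,  0.7),   # low F1, low F2, long
    "ae": ( 1.2,  0.4,  0.9),   # high F1, mid F2, long
}

def closest_category(token):
    """Assign a standardized (F1, F2, duration) token to the nearest
    category centroid by Euclidean distance."""
    return min(centroids, key=lambda c: math.dist(token, centroids[c]))

# A hypothetical Japanese /u/ token: low F1, fairly low F2, short.
label = closest_category((-0.8, -0.9, -0.5))
```

Standardizing each dimension first matters because raw F1/F2 (Hz) and duration (ms) live on very different scales; a true linear discriminant analysis additionally weights dimensions by their between- versus within-category variance.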

13.
Several types of measurements were made to determine the acoustic characteristics that distinguish between voiced and voiceless fricatives in various phonetic environments. The selection of measurements was based on a theoretical analysis that indicated the acoustic and aerodynamic attributes at the boundaries between fricatives and vowels. As expected, glottal vibration extended over a longer time in the obstruent interval for voiced fricatives than for voiceless fricatives, and there were more extensive transitions of the first formant adjacent to voiced fricatives than for the voiceless cognates. When two fricatives with different voicing were adjacent, there were substantial modifications of these acoustic attributes, particularly for the syllable-final fricative. In some cases, these modifications lead to complete assimilation of the voicing feature. Several perceptual studies with synthetic vowel-consonant-vowel stimuli and with edited natural stimuli examined the role of consonant duration, extent and location of glottal vibration, and extent of formant transitions on the identification of the voicing characteristics of fricatives. The perceptual results were in general consistent with the acoustic observations and with expectations based on the theoretical model. The results suggest that listeners base their voicing judgments of intervocalic fricatives on an assessment of the time interval in the fricative during which there is no glottal vibration. This time interval must exceed about 60 ms if the fricative is to be judged as voiceless, except that a small correction to this threshold is applied depending on the extent to which the first-formant transitions are truncated at the consonant boundaries.
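The decision rule the perceptual results suggest can be written down directly. A deliberately simplified sketch: it encodes only the roughly-60-ms threshold and omits the F1-truncation correction the abstract mentions:

```python
def judge_fricative_voicing(unvoiced_ms, threshold_ms=60.0):
    """Simplified rule from the perceptual results: an intervocalic
    fricative tends to be heard as voiceless when the interval without
    glottal vibration exceeds roughly 60 ms. The small correction for
    truncated first-formant transitions is not modeled here."""
    return "voiceless" if unvoiced_ms > threshold_ms else "voiced"
```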

14.
For each of five vowels [i e a o u] following [t], a continuum from non-nasal to nasal was synthesized. Nasalization was introduced by inserting a pole-zero pair in the vicinity of the first formant in an all-pole transfer function. The frequencies and spacing of the pole and zero were systematically varied to change the degree of nasalization. The selection of stimulus parameters was determined from acoustic theory and the results of pilot experiments. The stimuli were presented for identification and discrimination to listeners whose language included a non-nasal--nasal vowel opposition (Gujarati, Hindi, and Bengali) and to American listeners. There were no significant differences between language groups in the 50% crossover points of the identification functions. Some vowels were more influenced by range and context effects than were others. The language groups showed some differences in the shape of the discrimination functions for some vowels. On the basis of the results, it is postulated that (1) there is a basic acoustic property of nasality, independent of the vowel, to which the auditory system responds in a distinctive way regardless of language background; and (2) there are one or more additional acoustic properties that may be used to various degrees in different languages to enhance the contrast between a nasal vowel and its non-nasal congener. A proposed candidate for the basic acoustic property is a measure of the degree of prominence of the spectral peak in the vicinity of the first formant. Additional secondary properties include shifts in the center of gravity of the low-frequency spectral prominence, leading to a change in perceived vowel height, and changes in overall spectral balance.
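The effect of inserting a pole-zero pair near F1 can be illustrated by evaluating the magnitude response directly. A sketch under stated assumptions: the specific center frequencies and the 100 Hz bandwidth are invented illustrative values, and only a single oral formant is modeled rather than a full all-pole vowel:

```python
import math

def pair_response(f, fc, bw, zero=False):
    """Magnitude at f (Hz) contributed by a conjugate pole pair (or
    zero pair, if zero=True) with center frequency fc and bandwidth
    bw, normalized to unity gain at 0 Hz."""
    s = 2j * math.pi * f
    r = complex(-math.pi * bw, 2 * math.pi * fc)
    val = abs((s - r) * (s - r.conjugate())) / abs(r * r.conjugate())
    return val if zero else 1.0 / val

def nasal_magnitude(f, f1=500.0, fp=650.0, fz=800.0, bw=100.0):
    """|H(f)| near F1 for one oral pole at f1 plus an added nasal
    pole-zero pair; setting fz == fp cancels the pair (non-nasal),
    and widening the fp-fz spacing increases nasalization."""
    return (pair_response(f, f1, bw)
            * pair_response(f, fp, bw)
            * pair_response(f, fz, bw, zero=True))
```

Evaluating `nasal_magnitude` across a continuum of fp-fz spacings is one way to see how the spectral zero flattens the F1 region, i.e., reduces the prominence of the low-frequency spectral peak that the abstract proposes as the basic acoustic property of nasality.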

15.
Current theories of cross-language speech perception claim that patterns of perceptual assimilation of non-native segments to native categories predict relative difficulties in learning to perceive (and produce) non-native phones. Cross-language spectral similarity of North German (NG) and American English (AE) vowels produced in isolated hVC(a) (di)syllables (study 1) and in hVC syllables embedded in a short sentence (study 2) was determined by discriminant analyses, to examine the extent to which acoustic similarity was predictive of perceptual similarity patterns. The perceptual assimilation of NG vowels to native AE vowel categories by AE listeners with no German language experience was then assessed directly. Both studies showed that acoustic similarity of AE and NG vowels did not always predict perceptual similarity, especially for "new" NG front rounded vowels and for "similar" NG front and back mid and mid-low vowels. Both acoustic and perceptual similarity of NG and AE vowels varied as a function of the prosodic context, although vowel duration differences did not affect perceptual assimilation patterns. When duration and spectral similarity were in conflict, AE listeners assimilated vowels on the basis of spectral similarity in both prosodic contexts.

16.
Closants, or consonantlike sounds in infant vocalizations, were described acoustically using 16-kHz spectrograms and LPC or FFT analyses based on waveforms sampled at 20 or 40 kHz. The two major closant types studied were fricatives and trills. Compared to similar fricative sounds in adult speech, the fricative sounds of the 3-, 6-, 9-, and 12-month-old infants had primary spectral components at higher frequencies, i.e., up to and above 14 kHz. Trill rate varied from 16 to 180 Hz with a mean of about 100 Hz, approximately four times the mean trill rate reported for adult talkers. Acoustic features are described for various places of articulation for fricatives and trills. The discussion of the data emphasizes dimensions of acoustic contrast that appear in infant vocalizations during the first year of life, and implications of the spectral data for auditory and motor self-stimulation by normal-hearing and hearing-impaired infants.
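Trill rate is typically estimated as the reciprocal of the mean interval between successive trill bursts. A minimal sketch (the burst times below are invented, chosen to land near the roughly 100 Hz mean rate reported for the infants):

```python
def trill_rate(burst_times_s):
    """Mean trill rate (Hz) from successive burst times (s): the
    reciprocal of the mean inter-burst interval."""
    intervals = [b - a for a, b in zip(burst_times_s, burst_times_s[1:])]
    return 1.0 / (sum(intervals) / len(intervals))

# Hypothetical burst times roughly 10 ms apart -> about 100 Hz.
bursts = [0.000, 0.010, 0.020, 0.031, 0.039, 0.050]
```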

17.
The aim of the study was to establish whether /u/-fronting, a sound change in progress in standard southern British, could be linked synchronically to the fronting effects of a preceding anterior consonant both in speech production and speech perception. For the production study, which consisted of acoustic analyses of isolated monosyllables produced by two different age groups, it was shown for younger speakers that /u/ was phonetically fronted and that the coarticulatory influence of consonants on /u/ was less than in older speakers. For the perception study, responses were elicited from the same subjects to two minimal word-pair continua that differed in the direction of the consonants' coarticulatory fronting effects on /u/. Consistent with their speech production, young listeners' /u/ category boundary was shifted toward /i/ and they compensated perceptually less for the fronting effects of the consonants on /u/ than older listeners. The findings support Ohala's model in which certain sound changes can be linked to the listener's failure to compensate for coarticulation. The results are also shown to be consistent with episodic models of speech perception in which phonological frequency effects bring about a realignment of the variants of a phonological category in speech production and perception.
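A listener's category boundary on a continuum is conventionally located at the 50% crossover of the identification function. A sketch of that computation by linear interpolation; the 7-step continuum and the two response curves below are invented illustrations of a boundary shifted toward /i/ for younger listeners, not the study's data:

```python
def crossover_50(steps, prop_u):
    """Category boundary: the stimulus step at which the proportion of
    /u/ responses crosses 0.5, by linear interpolation between the
    adjacent continuum steps."""
    pts = list(zip(steps, prop_u))
    for (x0, p0), (x1, p1) in zip(pts, pts[1:]):
        if (p0 - 0.5) * (p1 - 0.5) <= 0:
            return x0 + (0.5 - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("identification function never crosses 50%")

# Hypothetical identification functions on a 7-step /i/-/u/ continuum.
steps = [1, 2, 3, 4, 5, 6, 7]
young = [0.05, 0.15, 0.45, 0.70, 0.90, 0.97, 1.00]
older = [0.02, 0.05, 0.20, 0.40, 0.65, 0.90, 0.99]
```

A lower crossover step for the younger group corresponds to a /u/ boundary shifted toward /i/; fitting a logistic (probit) curve would give a smoother boundary estimate than piecewise interpolation.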

18.
The purpose of this study was to use vocal tract simulation and synthesis as means to determine the acoustic and perceptual effects of changing both the cross-sectional area and location of vocal tract constrictions for six different vowels. Area functions at and near vocal tract constrictions are considered critical to the acoustic output and are also the central point of hypotheses concerning speech targets. Area functions for the six vowels [symbol: see text] were perturbed by changing the cross-sectional area of the constriction (Ac) and the location of the constriction (Xc). Perturbations for Ac were performed for different values of Xc, producing several series of acoustic continua for the different vowels. Acoustic simulations for the different area functions were made using a frequency domain model of the vocal tract. Each simulated vowel was then synthesized as a 1-s duration steady-state segment. The phoneme boundaries of the perturbed synthesized vowels were determined by formal perception tests. Results of the perturbation analyses showed that formants for each of the vowels were more sensitive to changes in constriction cross-sectional area than changes in constriction location. Vowel perception, however, was highly resistant to both types of changes. Results are discussed in terms of articulatory precision and constriction-related speech production strategies.
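The study's frequency-domain vocal tract model computes formants from an arbitrary area function; the simplest special case of such a model, shown here only as an orientation point, is the uniform tube closed at the glottis and open at the lips, whose resonances fall at odd quarter-wavelength frequencies (the 17.5 cm length and 350 m/s sound speed are textbook values, not the study's parameters):

```python
def uniform_tube_formants(length_cm, c_cm_s=35000.0, n_formants=3):
    """Resonances of a uniform tube closed at one end and open at the
    other: f_k = (2k - 1) * c / (4 * L), k = 1, 2, 3, ..."""
    return [(2 * k - 1) * c_cm_s / (4.0 * length_cm)
            for k in range(1, n_formants + 1)]

# A roughly adult-male tract length gives the familiar 500/1500/2500 Hz.
formants = uniform_tube_formants(17.5)
```

Perturbing Ac or Xc in a real area function moves these formants away from the uniform-tube values; the study's finding is that the formants move more per unit change in Ac than in Xc.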

19.
Trained choral tenors performed a series of vocal tasks before and after a "live" performance. Acoustic (perturbation, harmonic-to-noise ratio, pitch and amplitude ranges) and perceptual analyses (auditory and proprioceptive/kinesthetic) were undertaken to detect changes from pre- to postperformance. Individuality of response to the performance was revealed, with the majority of subjects showing vocal deterioration after performance. The most sensitive vocal tasks were the comfortably pitched notes, high soft notes, and the bottom notes in scale singing. The most sensitive acoustic measure in detecting change from pre- to postperformance was harmonic-to-noise ratio. In contrast to the demonstrated acoustic changes, no significant differences in perceptual ratings were evident after the performance. Perceptual ratings did not reflect the acoustic analysis results. The present study highlights the need to establish further normative data for the singing voice and to consider individual differences in vocal characteristics in future studies of the singing voice.
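One common way to compute a harmonic-to-noise ratio is from the peak of the normalized autocorrelation at the pitch lag, in the spirit of Boersma's r / (1 - r) formulation. A simplified sketch (no windowing or lag-domain normalization, and the clean/noisy test tones are synthetic, not singing voice; the exact method behind the study's HNR values is not specified in the abstract):

```python
import math
import random

def hnr_db(x, fs, f0_min=60.0, f0_max=400.0):
    """Harmonic-to-noise ratio (dB) from the maximum of the normalized
    autocorrelation over the pitch-lag search range."""
    mean = sum(x) / len(x)
    x = [v - mean for v in x]
    r0 = sum(v * v for v in x)
    lo, hi = int(fs / f0_max), int(fs / f0_min)
    r_max = max(
        sum(x[i] * x[i + lag] for i in range(len(x) - lag)) / r0
        for lag in range(lo, hi + 1))
    return 10 * math.log10(r_max / (1 - r_max))

# Synthetic sustained tone at 200 Hz, with and without added noise.
fs = 8000
random.seed(1)
clean = [math.sin(2 * math.pi * 200 * t / fs) for t in range(800)]
noisy = [v + random.gauss(0, 0.3) for v in clean]
```

A lower HNR for the noisy signal is the expected direction; a postperformance drop in HNR is what the tenor study reports as its most sensitive acoustic measure.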

20.