Similar articles
 Found 20 similar articles (search time: 156 ms)
1.
Acoustic effects of the time-varying glottal area due to vocal fold vibration on the laryngeal cavity resonance were investigated based on vocal tract area functions and acoustic analysis. The laryngeal cavity consists of the vestibular and ventricular parts of the larynx, and gives rise to a regional acoustic resonance within the vocal tract, with this resonance imparting an extra formant to the vocal tract resonance pattern. Vocal tract transfer functions of the five Japanese vowels uttered by three male subjects were calculated under open- and closed-glottis conditions. The results revealed that the resonance appears in the frequency region from 3.0 to 3.7 kHz when the glottis is closed and disappears when it is open. Real spectra estimated from open- and closed-glottis periods of vowel sounds also showed the on-off pattern of the resonance within a pitch period. Furthermore, a time-domain acoustic analysis of vowels indicated that the resonance component could be observed as a pitch-synchronized rise-and-fall pattern of the bandpass amplitude. The cyclic nature of the resonance can be explained as the laryngeal cavity acting as a closed tube that generates the resonance during a closed-glottis period, but damps it out during an open-glottis period.

2.
Cavities branching off the main vocal tract are ubiquitous in nonhumans. Mammalian air sacs exist in human relatives, including all four great apes, but only a substantially reduced version exists in humans. The present paper focuses on acoustical functions of the air sacs. It investigates the hypotheses that the air sacs affect the amplitude of utterances and/or the position of formants. A multilayer synthetic model of the vocal folds coupled with a vocal tract model was utilized. As an air sac model, four configurations were considered: open and closed uniform tube-like side branches, a rigid cavity, and an inflatable cavity. Results suggest that some air sac configurations can enhance the sound level. Furthermore, an air sac model introduces one or more additional resonance frequencies, shifting formants of the main vocal tract to some extent but not as strongly as previously suggested. In addition, the dynamic range of vocalization can be extended by the air sacs. A further new finding is an increased variability of the vocal tract impedance, leading to strong nonlinear source-filter interaction effects. The experiments demonstrated that air-sac-like structures can destabilize the sound source. The results were validated by a transmission line computational model.

3.
Three-dimensional vocal tract shapes and consequent area functions representing the vowels [i, ae, a, u] have been obtained from one male and one female speaker using magnetic resonance imaging (MRI). The two speakers were trained vocal performers and both were adept at manipulation of vocal tract shape to alter voice quality. Each vowel was performed three times, once in each of three voice qualities: normal, yawny, and twangy. The purpose of the study was to determine some ways in which the vocal tract shape can be manipulated to alter voice quality while retaining a desired phonetic quality. To summarize any overall tract shaping tendencies, mean area functions were subsequently computed across the four vowels produced within each specific voice quality. Relative to normal speech, both the vowel area functions and mean area functions showed, in general, that the oral cavity is widened and tract length increased for the yawny productions. The twangy vowels were characterized by shortened tract length, widened lip opening, and a slightly constricted oral cavity. The resulting acoustic characteristics of these articulatory alterations consisted of the first two formants (F1 and F2) being close together for all yawny vowels and far apart for all the twangy vowels.

4.
This study sought to compare formant frequencies estimated from natural phonation to those estimated using two methods of artificial laryngeal stimulation: (1) stimulation of the vocal tract using an artificial larynx placed on the neck and (2) stimulation of the vocal tract using an artificial larynx with an attached tube placed in the oral cavity. Twenty males between the ages of 18 and 45 performed the following three tasks on the vowels /a/ and /i/: (1) 4 seconds of sustained vowel, (2) 2 seconds of sustained vowel followed by 2 seconds of artificial phonation via a neck placement, and (3) 4 seconds of sustained vowel, the last two of which were accompanied by artificial phonation via an oral placement. Frequencies for formants 1-4 were measured for each task at second 1 and second 3 using linear predictive coding. These measures were compared across second 1 and second 3, as well as across all three tasks. Neither of the methods of artificial laryngeal stimulation tested in this study yielded formant frequency estimates that consistently agreed with those obtained from natural phonation for both vowels and all formants. However, when estimating mean formant frequency data for samples of large N, each of the methods agreed with mean estimations obtained from natural phonation for specific vowels and formants. The greatest agreement was found for a neck placement of the artificial larynx on the vowel /a/.

5.
Speakers of rhotic dialects of North American English show a range of different tongue configurations for /r/. These variants produce acoustic profiles that are indistinguishable for the first three formants [Delattre, P., and Freeman, D. C., (1968). "A dialect study of American English r's by x-ray motion picture," Linguistics 44, 28-69; Westbury, J. R. et al. (1998), "Differences among speakers in lingual articulation for American English /r/," Speech Commun. 26, 203-206]. It is puzzling why this should be so, given the very different vocal tract configurations involved. In this paper, two subjects whose productions of "retroflex" /r/ and "bunched" /r/ show similar patterns of F1-F3 but very different spacing between F4 and F5 are contrasted. Using finite element analysis and area functions based on magnetic resonance images of the vocal tract for sustained productions, the results of computer vocal tract models are compared to actual speech recordings. In particular, formant-cavity affiliations are explored using formant sensitivity functions and vocal tract simple-tube models. The difference in F4/F5 patterns between the subjects is confirmed for several additional subjects with retroflex and bunched vocal tract configurations. The results suggest that the F4/F5 differences between the variants can be largely explained by differences in whether the long cavity behind the palatal constriction acts as a half- or a quarter-wavelength resonator.
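The half- versus quarter-wavelength distinction in this abstract can be illustrated with the textbook formulas for an idealized lossless uniform tube. This is only a sketch: the speed of sound and the 10 cm cavity length are assumed values for illustration, not figures from the paper, and the function names are hypothetical.

```python
SPEED_OF_SOUND_CM_PER_S = 35000.0  # assumed value for warm, humid air


def half_wave_modes(length_cm, n_modes=3, c=SPEED_OF_SOUND_CM_PER_S):
    """Resonances of a tube whose two ends match (both closed or both open).

    Idealized formula: f_n = n * c / (2 * L), n = 1, 2, 3, ...
    """
    return [n * c / (2.0 * length_cm) for n in range(1, n_modes + 1)]


def quarter_wave_modes(length_cm, n_modes=3, c=SPEED_OF_SOUND_CM_PER_S):
    """Resonances of a tube closed at one end and open at the other.

    Idealized formula: f_n = (2n - 1) * c / (4 * L), n = 1, 2, 3, ...
    """
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, n_modes + 1)]


if __name__ == "__main__":
    back_cavity_cm = 10.0  # hypothetical cavity length, not taken from the paper
    print("half-wave modes (Hz):   ", half_wave_modes(back_cavity_cm))
    print("quarter-wave modes (Hz):", quarter_wave_modes(back_cavity_cm))
```

For the same cavity length the two boundary conditions give different mode frequencies and different spacing between neighbouring modes, which is the kind of effect the abstract invokes to explain the F4/F5 difference.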

6.
A 3D cine-MRI technique was developed based on a synchronized sampling method [Masaki et al., J. Acoust. Soc. Jpn. E 20, 375-379 (1999)] to measure the temporal changes in the vocal tract area function during a short utterance /aiueo/ in Japanese. A time series of head-neck volumes was obtained after 640 repetitions of the utterance produced by a male speaker, from which area functions were extracted frame-by-frame. A region-based analysis showed that the volumes of the front and back cavities tend to change reciprocally and that the areas near the larynx and posterior edge of the hard palate were almost constant throughout the utterance. The lower four formants were calculated from all the area functions and compared with those of natural speech sounds. The mean absolute percent error between calculated and measured formants among all the frames was 4.5%. The comparison of vocal tract shapes for the five vowels with those from the static MRI method suggested a problem of MRI observation of the vocal tract: data from static MRI tend to result in a deviation from natural vocal tract geometry because of the gravity effect.
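The mean absolute percent error quoted above can be computed as in the following sketch. It assumes that the error is normalized by the measured formant, which the abstract does not state, and the formant values in the usage section are made up for illustration.

```python
def mean_absolute_percent_error(calculated, measured):
    """Mean of |calculated - measured| / measured, expressed in percent.

    Assumption: the measured value is the reference (denominator).
    """
    if len(calculated) != len(measured):
        raise ValueError("calculated and measured must have the same length")
    if not measured:
        raise ValueError("at least one formant pair is required")
    total = sum(abs(c - m) / m for c, m in zip(calculated, measured))
    return 100.0 * total / len(measured)


if __name__ == "__main__":
    # Hypothetical F1-F4 values in Hz for one frame; not data from the paper.
    calculated_hz = [700.0, 1200.0, 2500.0, 3500.0]
    measured_hz = [730.0, 1150.0, 2600.0, 3400.0]
    error = mean_absolute_percent_error(calculated_hz, measured_hz)
    print(f"mean absolute percent error: {error:.2f}%")
```

To reproduce a figure pooled "among all the frames", the formant pairs from every frame would be concatenated into the two lists before calling the function.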

7.
The fundamental frequency of vocal fold oscillation (F0) is controlled by laryngeal mechanics and aerodynamic properties. The change in F0 per unit change of transglottal pressure (dF/dP), measured using a shutter valve, has been studied and found to have a nonlinear, V-shaped relationship with F0. On the other hand, the vocal tract is also known to affect vocal fold oscillation. This study examined the effect of artificially lengthening the vocal tract on dF/dP. dF/dP was measured in six men using two mouthpieces of different lengths. Results: The dF/dP graph for the longer vocal tract was shifted leftward relative to the shorter one. Conclusion: In the one-mass model, the nadir of the "V" on the dF/dP graph was strongly influenced by the resonance around the first formant frequency. However, a more precise model is needed to account for the effects of viscosity and turbulence.

8.
According to recent model investigations, vocal tract resonance is relevant to vocal registers. However, no experimental corroboration of this claim has been published so far. In the present investigation, ten professional tenors' vocal tract configurations were analyzed using MRI volumetry. All subjects produced a sustained tone on the pitch F4 (349 Hz) on the vowel /a/ (1) in modal and (2) in falsetto register. The area functions were estimated from the MRI data and their associated formant frequencies were calculated. In a second condition, the same subjects repeated the same tasks in a sound-treated room and their formant frequencies were estimated by means of inverse filtering. In both recordings, similar formant frequencies were observed. Vocal tract shapes differed between modal and falsetto register. In modal register, as compared to falsetto, the lip opening and the oral cavity were wider and the first formant frequency was higher. In this sense, the presented results are in agreement with the claim that the formant frequencies differ between registers.

9.
This study was aimed at identifying acoustic and physiological measures useful for monitoring voice changes in postnasopharyngeal patients with nonlaryngeal malignancies, and providing evidence of the vocal tract's effect on voice through comparisons between individuals with and without an intact vocal tract. Simultaneous acoustic-electroglottographic signals recorded during phonation of vowels /i/ and /a/ sustained at habitual, high, and low pitch levels were compared among 10 postradiotherapy patients with nasopharyngeal carcinoma (NPC), 10 voice patients (VPs) with an intact vocal tract, and 10 healthy individuals with normal voice (NORM). Results from a series of discriminant analyses revealed that the NPC group generally exhibited lower signal-to-noise ratio (SNR) and open quotient (OQ) and higher Formant 1 frequency (F1) and speed quotient (SQ) than the NORM group. Unlike both VP and NORM groups, the NPC group failed to show a pitch effect on all voice measures, including OQ, SQ, percent jitter, percent shimmer, and SNR, suggesting an effect of radiotherapy and/or vocal tract on laryngeal behaviors. For the vowel /i/, on the other hand, only the NPC and NORM groups showed a pattern of pitch-dependent F1 raising, a reflection of increased pharyngeal narrowing. These findings suggested that the pitch effect on laryngeal behaviors differed not only between individuals with an intact vocal tract and those without but also between those with structural and dynamic changes of the vocal tract.

10.
The vocal tract shape is three-dimensionally complex. For accurate acoustic analysis, a finite-difference time-domain method was introduced in the present study. By this method, transfer functions of the vocal tract for the five Japanese vowels were calculated from three-dimensionally reconstructed magnetic resonance imaging (MRI) data. The calculated transfer functions were compared with those obtained from acoustic measurements of vocal tract physical models precisely constructed from the same MRI data. Calculated transfer functions agreed well with measured ones up to 10 kHz. Acoustic effects of the piriform fossae, epiglottic valleculae, and inter-dental spaces were also examined. They caused spectral changes by generating dips. The amount of change was significant for the piriform fossae, while it was almost negligible for the other two. The piriform fossae and valleculae generated spectral dips for all the vowels. The dip frequencies of the piriform fossae were almost stable, while those of the valleculae varied among vowels. The inter-dental spaces generated very small spectral dips below 2.5 kHz for the high and middle vowels. In addition, transverse resonances within the oral cavity generated small spectral dips above 4 kHz for the low vowels.

11.
The purpose of this study was to use vocal tract simulation and synthesis as means to determine the acoustic and perceptual effects of changing both the cross-sectional area and location of vocal tract constrictions for six different vowels. Area functions at and near vocal tract constrictions are considered critical to the acoustic output and are also the central point of hypotheses concerning speech targets. Area functions for the six vowels, [symbol: see text] were perturbed by changing the cross-sectional area of the constriction (Ac) and the location of the constriction (Xc). Perturbations for Ac were performed for different values of Xc, producing several series of acoustic continua for the different vowels. Acoustic simulations for the different area functions were made using a frequency domain model of the vocal tract. Each simulated vowel was then synthesized as a 1-s duration steady-state segment. The phoneme boundaries of the perturbed synthesized vowels were determined by formal perception tests. Results of the perturbation analyses showed that formants for each of the vowels were more sensitive to changes in constriction cross-sectional area than to changes in constriction location. Vowel perception, however, was highly resistant to both types of changes. Results are discussed in terms of articulatory precision and constriction-related speech production strategies.

12.
A theory of interaction between the source of sound in phonation and the vocal tract filter is developed. The degree of interaction is controlled by the cross-sectional area of the laryngeal vestibule (epilarynx tube), which raises the inertive reactance of the supraglottal vocal tract. Both subglottal and supraglottal reactances can enhance the driving pressures of the vocal folds and the glottal flow, thereby increasing the energy level at the source. The theory predicts that instabilities in vibration modes may occur when harmonics pass through formants during pitch or vowel changes. Unlike in most musical instruments (e.g., woodwinds and brasses), a stable harmonic source spectrum is not obtained by tuning harmonics to vocal tract resonances, but rather by placing harmonics into favorable reactance regions. This allows for positive reinforcement of the harmonics by supraglottal inertive reactance (and to a lesser degree by subglottal compliant reactance) without the risk of instability. The traditional linear source-filter theory is encumbered with possible inconsistencies in the glottal flow spectrum, which is shown to be influenced by interaction. In addition, the linear theory does not predict bifurcations in the dynamical behavior of vocal fold vibration due to acoustic loading by the vocal tract.

13.
The skilled use of nonperiodic phonation techniques in combination with spectrum analysis has been proposed here as a practical method for locating formant frequencies in the singing voice. The study addresses the question of the degree of similarity between sung phonations and their nonperiodic imitations, with respect to both frequency of the first two formants as well as posture of the vocal tract. Using magnetic resonance imaging (MRI), linear predictive coding (LPC), and spectrum analysis, two types of nonperiodic phonation (ingressive and vocal fry) are compared with singing phonations to determine the degree of similarity/difference in acoustic and spatial dimensions of the vocal tract when these phonation types are used to approximate the postures of singing. In comparing phonation types, the close similarity in acoustic data in combination with the relative dissimilarity in spatial data indicates that the accurate imitations are not primarily the result of imitating the singing postures, but have instead an aural basis.

14.
The didjeridu, or yidaki, is a simple tube about 1.5 m long, played with the lips, as in a tuba, but mostly producing just a tonal, rhythmic drone sound. The acoustic impedance spectra of performers' vocal tracts were measured while they played and compared with the radiated sound spectra. When the tongue is close to the hard palate, the vocal tract impedance has several maxima in the range 1-3 kHz. These maxima, if sufficiently large, produce minima in the spectral envelope of the sound because the corresponding frequency components of acoustic current in the flow entering the instrument are small. In the ranges between the impedance maxima, the lower impedance of the tract allows relatively large acoustic current components that correspond to strong formants in the radiated sound. Broad, weak formants can also be observed when groups of even or odd harmonics coincide with bore resonances. Schlieren photographs of the jet entering the instrument and high speed video images of the player's lips show that the lips are closed for about half of each cycle, thus generating high levels of upper harmonics of the lip frequency. Examples of the spectra of "circular breathing" and combined playing and vocalization are shown.

15.
The laryngeal neuromuscular mechanisms for modulating glottal posture and fundamental frequency are of interest in understanding normal laryngeal physiology and treating vocal pathology. The intrinsic laryngeal muscles in an in vivo canine model were electrically activated in a graded fashion to investigate their effects on onset frequency, phonation onset pressure, vocal fold strain, and glottal distance at the vocal processes. Muscle activation plots for these laryngeal parameters were evaluated for the interaction of the following pairs of muscle activation conditions: (1) cricothyroid (CT) versus all laryngeal adductors (TA/LCA/IA), (2) CT versus LCA/IA, (3) CT versus thyroarytenoid (TA), and (4) TA versus LCA/IA (LCA: lateral cricoarytenoid muscle, IA: interarytenoid). Increases in onset frequency and strain were primarily affected by CT activation. Onset pressure correlated with activation of all adductors in activation condition 1, but primarily with CT activation in conditions 2 and 3. TA and CT were antagonistic for strain. LCA/IA activation primarily closed the cartilaginous glottis while TA activation closed the mid-membranous glottis.

16.
Peta White. Journal of Voice, 1999, 13(4): 570-582
High-pitched productions present difficulties in formant frequency analysis due to wide harmonic spacing and poorly defined formants. As a consequence, there is little reliable data regarding children's spoken or sung vowel formants. Twenty-nine 11-year-old Swedish children were asked to produce 4 sustained spoken and sung vowels. In order to circumvent the problem of wide harmonic spacing, F1 and F2 measurements were taken from vowels produced with a sweeping F0. Experienced choir singers were selected as subjects in order to minimize the larynx height adjustments associated with pitch variation in less skilled subjects. Results showed significantly higher formant frequencies for speech than for singing. Formants were consistently higher in girls than in boys, suggesting longer vocal tracts in these preadolescent boys. Furthermore, formant scaling demonstrated vowel-dependent differences between boys and girls, suggesting non-uniform differences in male and female vocal tract dimensions. These vowel-dependent sex differences were not consistent with adult data.

17.
In this article, an implementation of a vocal tract model and its validation are described. The model uses a transmission line model to calculate pole and zero frequencies for a vocal tract with a closed side-branch such as a sublingual cavity. In the validation study, calculated pole and zero frequencies from the model are compared with frequencies estimated using elementary acoustic formulas for a variety of vocal tract configurations.
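A heavily simplified version of this kind of check can be sketched with lossless chain (ABCD) matrices. The sketch below is not the article's implementation: it assumes hard walls, no losses, an ideal open end at the lips (zero pressure, no radiation load), and assumed values for air density and sound speed. The element names and functions are hypothetical. With an ideal open end the volume-velocity transfer function is 1/D, where D is the lower-right entry of the cascaded matrix, so poles are the roots of D and a closed side branch contributes a zero at its quarter-wavelength frequency.

```python
import math

RHO = 0.00114  # air density in g/cm^3 (assumed)
C = 35000.0    # speed of sound in cm/s (assumed)


def element_matrix(kind, length_cm, area_cm2, freq_hz):
    """Chain matrix of a lossless series tube or of a closed shunt side branch."""
    k = 2.0 * math.pi * freq_hz / C   # wavenumber
    z0 = RHO * C / area_cm2           # characteristic impedance
    if kind == "tube":
        cos_kl, sin_kl = math.cos(k * length_cm), math.sin(k * length_cm)
        return ((cos_kl, 1j * z0 * sin_kl), (1j * sin_kl / z0, cos_kl))
    if kind == "branch":
        # Input admittance of a closed tube: Y = j * tan(kL) / z0
        return ((1.0, 0.0), (1j * math.tan(k * length_cm) / z0, 1.0))
    raise ValueError(f"unknown element kind: {kind}")


def chain_d(elements, freq_hz):
    """D entry of the cascaded matrix, glottis to lips (real in the lossless case)."""
    a, b, c, d = 1.0, 0.0, 0.0, 1.0
    for kind, length_cm, area_cm2 in elements:
        (a2, b2), (c2, d2) = element_matrix(kind, length_cm, area_cm2, freq_hz)
        a, b, c, d = a * a2 + b * c2, a * b2 + b * d2, c * a2 + d * c2, c * b2 + d * d2
    return complex(d).real


def find_poles(elements, f_max_hz, step_hz=10.0):
    """Pole frequencies: sign changes of D refined by bisection."""
    poles = []
    for i in range(int(f_max_hz / step_hz)):
        lo, hi = (i + 0.5) * step_hz, (i + 1.5) * step_hz
        d_lo = chain_d(elements, lo)
        if d_lo * chain_d(elements, hi) >= 0.0:
            continue
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            d_mid = chain_d(elements, mid)
            if d_lo * d_mid <= 0.0:
                hi = mid
            else:
                lo, d_lo = mid, d_mid
        root = 0.5 * (lo + hi)
        # A sign change at a branch singularity has huge |D|; keep true roots only.
        if abs(chain_d(elements, root)) < 1e-6:
            poles.append(root)
    return poles


def branch_zero_hz(branch_length_cm):
    """Elementary formula for the lowest zero of a closed side branch: c / (4 L)."""
    return C / (4.0 * branch_length_cm)


if __name__ == "__main__":
    uniform = [("tube", 17.5, 3.0)]
    print("uniform tube poles (Hz):", [round(p, 1) for p in find_poles(uniform, 3000.0)])
    # Hypothetical geometry: a 2 cm closed branch 2 cm behind the lips.
    with_branch = [("tube", 15.5, 3.0), ("branch", 2.0, 1.0), ("tube", 2.0, 3.0)]
    print("poles with branch (Hz): ", [round(p, 1) for p in find_poles(with_branch, 3000.0)])
    print("branch zero (Hz):       ", branch_zero_hz(2.0))
```

For the uniform tube the numerical poles can be compared directly with the elementary formula (2n - 1) c / (4 L), which mirrors the validation strategy the abstract describes on a much smaller scale.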

18.
The relation between the spatial configuration of the vocal tract as determined by magnetic resonance imaging (MRI) and the acoustical signal produced was investigated. A male subject carried out a set of phonatory tasks, comprising the utterance of the sustained vowels /i/ and /a/, each in a single articulation, and the vowel /epsilon/ with his larynx positioned variously on a vertical axis. Two- and three-dimensional measurements of the vocal tract were performed. The results of these measurements were used to calculate resonance frequencies, according to predictions from acoustical theory. Finally, calculated frequencies were compared with actually measured resonance frequencies in the audio signal. We found a strong relation between the acoustical signal produced and the spatial configuration for the first resonance frequencies of the articulations of the vowel /epsilon/, and first two resonance frequencies of the vowels /a/ and /i/. The capability to determine vocal tract dimensions accurately is a major advantage of this imaging technique.

19.
Acoustic radiation impedance of the mouth is an important parameter when the vocal tract is modelled by an equivalent electrical circuit. If the vocal tract is closed by a cavity, as when the speaker wears some kind of mask, the total impedance acoustically loading the vocal tract becomes a series connection of the mouth radiation impedance and the mask impedance. In that case the mouth radiation impedance has to be changed compared to free field conditions. This paper introduces a simplified approach to the modelling of that change by an appropriate reduction coefficient. The analysis, based on an experiment performed by measurement in a vocal tract physical model and accompanied by analytical estimation, has shown that the value of such a reduction coefficient is 0.5. The results reveal that for a vocal tract closed with a mask cavity the change in mouth radiation impedance introduced in an equivalent electrical circuit can be approximated by the value for free field radiation decreased by about 50%.
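The series connection and the 0.5 reduction coefficient can be written out as a short sketch. The abstract does not say which free-field model it uses, so the code assumes the common low-frequency approximation for a piston in an infinite baffle, Z = (rho c / A) [ (ka)^2 / 2 + j 8ka / (3 pi) ], valid for ka much smaller than 1. The constants, function names, and the mask impedance value are hypothetical.

```python
import math

RHO = 0.00114  # air density in g/cm^3 (assumed)
C = 35000.0    # speed of sound in cm/s (assumed)


def free_field_radiation_impedance(mouth_area_cm2, freq_hz):
    """Low-frequency baffled-piston approximation (an assumed free-field model)."""
    radius_cm = math.sqrt(mouth_area_cm2 / math.pi)
    ka = 2.0 * math.pi * freq_hz / C * radius_cm
    z0 = RHO * C / mouth_area_cm2
    return z0 * complex(ka ** 2 / 2.0, 8.0 * ka / (3.0 * math.pi))


def total_load_with_mask(mouth_area_cm2, freq_hz, mask_impedance, reduction=0.5):
    """Series connection: reduced mouth radiation impedance plus mask impedance.

    The default reduction of 0.5 is the coefficient reported in the abstract;
    mask_impedance must be supplied by the caller (it is not modelled here).
    """
    z_mouth = reduction * free_field_radiation_impedance(mouth_area_cm2, freq_hz)
    return z_mouth + mask_impedance


if __name__ == "__main__":
    area_cm2, freq_hz = 5.0, 500.0   # hypothetical mouth opening and frequency
    z_mask = complex(0.5, -2.0)      # hypothetical mask impedance, not from the paper
    print("free field:", free_field_radiation_impedance(area_cm2, freq_hz))
    print("with mask: ", total_load_with_mask(area_cm2, freq_hz, z_mask))
```

Keeping the reduction coefficient as a parameter makes it easy to compare the reported value of 0.5 against the unmodified free-field load (reduction=1.0).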

20.

Objective

To analyze the vocal tract morphometry of women with vocal nodules (VN) compared with normal subjects by means of magnetic resonance imaging (MRI) at rest position.

Study Design

Prospective study.

Methods

The present research included 20 young adult women, aged 18–40 years: 10 dysphonic patients with VN and 10 normal subjects. All participants were tested using MRI; 12 measurements of the vocal tract were performed: nine in median sagittal section and three in axial section.

Results

The 12 measurements were smaller in the dysphonic group; statistical significance was obtained for three parameters: in the sagittal plane, the laryngeal vestibule area was significantly smaller in the dysphonic group, with P = 0.012∗ (∗ = statistical significance); in the axial section, the distance between the right and left vocal processes of the arytenoid cartilages and the distance between the anterior commissure of the glottis and the laryngeal posterior wall were also significantly smaller in the dysphonic group, with P = 0.036∗ and 0.010∗, respectively. Significant differences in the vocal tract morphometry of individuals with VN were observed compared with normal subjects, at rest position.

Conclusions

Results obtained from this study suggest that patients with VN may present a constantly increased tension of the laryngeal muscles, even at rest; moreover, reduced anterior-posterior dimension of the larynx may be a morphological characteristic of patients with VN.
