Similar literature
Found 20 similar articles (search time: 15 ms)
1.
SUMMARY: Inverse filtering (IF) is a common method used to estimate the source of voiced speech, the glottal flow. This investigation aims to compare two IF methods: one manual and the other semiautomatic. Glottal flows were estimated from speech pressure waveforms of six female and seven male subjects producing sustained vowel /a/ in breathy, normal, and pressed phonation. The closing phase characteristics of the glottal pulse were parameterized using two time-based parameters: the closing quotient (ClQ) and the normalized amplitude quotient (NAQ). The information given by these two parameters indicates a strong correlation between the two IF methods. The results are encouraging in showing that the parameterization of the voice source in different speech sounds can be performed independently of the technique used for inverse filtering.

2.
The normalized amplitude quotient (NAQ), defined as the ratio between the peak-to-peak amplitude of the flow pulse and the negative peak amplitude of the differentiated flow glottogram, normalized with respect to period time, has been shown to be related to glottal adduction. Glottal adduction, in turn, affects mode of phonation and hence perceived phonatory pressedness. The relationship between NAQ and perceived phonatory pressedness was analyzed in material collected from a professional female singer and singing teacher who sang a triad pattern in breathy, flow, neutral, and pressed phonation in three different loudness conditions (soft, middle, loud). In addition, she sang the same triad pattern in four different styles of singing (classical, pop, jazz, and blues) in the same three loudness conditions. A panel of experts rated the degree of perceived phonatory press along visual analogue scales. Comparing the mean pressedness ratings with the mean NAQ values for the various triads showed that about 73% of the variation in perceived pressedness could be accounted for by variations of NAQ.
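
As a rough illustration of the NAQ definition given in this abstract, the sketch below computes NAQ for a single glottal cycle from sampled flow. The function name `naq`, the first-difference approximation of the flow derivative, and the synthetic triangular pulse are assumptions made for illustration; they are not the procedure used in the study.

```python
def naq(flow, fs):
    """Normalized amplitude quotient for one glottal cycle (illustrative sketch).

    flow: flow samples assumed to span exactly one period
    fs:   sampling rate in Hz
    """
    period = len(flow) / fs                     # period time T in seconds
    f_ac = max(flow) - min(flow)                # peak-to-peak flow amplitude
    # First differences approximate the differentiated flow glottogram.
    deriv = [(b - a) * fs for a, b in zip(flow, flow[1:])]
    d_peak = -min(deriv)                        # magnitude of the negative peak
    return f_ac / (d_peak * period)


# Hypothetical test pulse: 60-sample linear rise, 20-sample linear fall,
# then a closed phase of zeros (100 samples in total).
pulse = [i / 60 for i in range(61)] + [1 - j / 20 for j in range(1, 21)] + [0.0] * 19

print(naq(pulse, fs=10000))
```

For this idealized triangular pulse the result is close to 0.2, which is the share of the period taken up by the linear closing phase (20 of 100 samples).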

3.
A new method, "simultaneous inverse filtering and model matching" (SIM), is proposed that allows one to calculate voice source measures without any user interaction. It is based on the discrete all-pole modeling (DAP) technique for inverse filtering (IF), which is modified to include a model of the glottal flow as an integral part [LF model, Fant et al., STL-QPSR (Stockholm) 4/1985, 1-13 (1986)]. As the correct LF parameters are initially unknown, they are estimated in an iterative procedure using multi-dimensional optimization techniques that are initialized according to the results of an exhaustive search. The error criteria applied reflect how well the IF is performed after the spectral contribution of the glottal flow has been removed. The resulting optimal LF parameter constellation serves as the basis to calculate 11 voice source measures. The performance was evaluated using synthesized signals and recordings of natural utterances. For the synthesized signals, the accuracy to reproduce the original parameters was high (correlations exceeding 0.88) for measures where the starting point of the glottal cycle did not enter explicitly. Errors were smaller compared to conventional estimation methods where the measures were estimated from the IF signal. The analysis of natural utterances indicates that problems still exist with regard to robustness, but that under advantageous conditions the open quotient, the speed quotient, the closing quotient, the parabolic spectral parameter, and the negative peak amplitude of the glottal flow derivative can indeed be determined automatically by the SIM method.

4.
Electroglottography is a common method for providing noninvasive measurements of glottal activity. The derivative of the electroglottographic signal, however, has not attracted much attention, although it yields reliable indicators of glottal closing instants. The purpose of this paper is to provide a guide to the usefulness of this signal. The main features that are to be found in this signal are presented on the basis of an extensive analysis of a database of items sung by 18 trained singers. Glottal opening and closing instants are related to peaks in the signal; the latter can be used to measure glottal parameters such as fundamental frequency and open quotient. In some cases, peaks are doubled or imprecise, which points to special (but by no means uncommon) glottal configurations. A correlation-based algorithm for the automatic measurement of fundamental frequency and open quotient using the derivative of electroglottographic signals is proposed. It is compared to three other electroglottographic-based methods with regard to the measurement of open quotient in inverse-filtered derived glottal flow. It is shown that agreement with the glottal-flow measurements is much better than most threshold-based measurements in the case of sustained sounds.

5.
A new set of parameters is described for analysis and synthesis of glottal area, vocal fold contact area, and glottal volume flow. Parameters are all nondimensionalized and consist of an abduction quotient, a shape quotient, a phase quotient, and a load quotient in addition to fundamental frequency and vibrational amplitude. The parameters show promise in interpretation of electroglottographic, photoglottographic, and inverse filtered volume velocity waveforms in terms of the glottal configuration. Some comparisons between modeled and measured glottographic waveforms are made.

6.
A finite-volume computational model that solves the time-dependent glottal airflow within a forced-oscillation model of the glottis was employed to study glottal flow separation. Tracheal input velocity was independently controlled with a sinusoidally varying parabolic velocity profile. Control parameters included flow rate (Reynolds number), oscillation frequency and amplitude of the vocal folds, and the phase difference between the superior and inferior glottal margins. Results for static divergent glottal shapes suggest that velocity increase caused glottal separation to move downstream, but reduction in velocity increase and velocity decrease moved the separation upstream. At the fixed frequency, an increase of amplitude of the glottal walls moved the separation further downstream during glottal closing. Increase of Reynolds number caused the flow separation to move upstream in the glottis. The flow separation cross-sectional ratio ranged from approximately 1.1 to 1.9 (average of 1.47) for the divergent shapes. Results suggest that there may be a strong interaction of rate of change of airflow, inertia, and wall movement. Flow separation appeared to be "delayed" during the vibratory cycle, leading to movement of the separation point upstream of the glottal end only after a significant divergent angle was reached, and to persist upstream into the convergent phase of the cycle.

7.
The perception of modal and falsetto registers was analyzed in material consisting of a total of 104 vowel sounds sung by 13 choir singers, 52 sung in modal register and 52 in falsetto register. These vowel sounds were classified by 16 expert listeners in a forced choice test, and the number of votes for modal was compared to the voice source parameters: (1) closed quotient (Q_closed), (2) level difference between the two lowest source spectrum partials (H1-H2), (3) AC amplitude, (4) maximum flow declination rate (MFDR), and (5) normalized amplitude quotient (NAQ = (AC amplitude/MFDR) × fundamental frequency). Tones with a high value of Q_closed and low values of H1-H2 and of NAQ were typically associated with a high number of votes for modal register, and vice versa, with Q_closed showing the strongest correlation. Some singer subjects produced tones that could not be classified as either falsetto or modal register, suggesting that classification of registers is not always feasible.

8.
This investigation aims to describe voice function in four nonclassical styles of singing: Rock, Pop, Soul, and Swedish Dance Band. A male singer, professionally experienced in performing in these genres, sang representative tunes, both with their original lyrics and on the syllable /pae/. In addition, he sang tones in a triad pattern ranging from the pitch Bb2 to the pitch C4 on the syllable /pae/ in pressed and neutral phonation. An expert panel was successful in classifying the samples, thus suggesting that the samples were representative of the various styles. Subglottal pressure was estimated from oral pressure during the occlusion for the consonant [p]. Flow glottograms were obtained from inverse filtering. The four lowest formant frequencies differed between the styles. The mean of the subglottal pressure and the mean of the normalized amplitude quotient (NAQ), that is, the ratio between the flow pulse amplitude and the product of period and maximum flow declination rate, were plotted against the mean of fundamental frequency. In these graphs, Rock and Swedish Dance Band assumed opposite extreme positions with respect to subglottal pressure and mean phonation frequency, whereas the mean NAQ values differed less between the styles.

9.
Vocal warm-up was studied in terms of changes in voice parameters during a 45-minute vocal loading session in the morning. The voices of a randomly chosen group of 40 female and 40 male young students were loaded by having them read a novel aloud. The exposure groups (5 females and 5 males per cell) consisted of eight combinations of the following factors: (1) low (25 ± 5%) or high (65 ± 5%) relative humidity of ambient air; (2) low [< 65 dB(SPL)] or high [> 65 dB(SPL)] speech output level during vocal loading; (3) sitting or standing posture during vocal loading. Two sets of voice samples were recorded: a resting sample before the loading session and a loading sample after the loading session. The material recorded consisted of /pa:ppa/ words produced normally, as softly as possible, and as loudly as possible, in this order, by all subjects. The long /a/ vowel of the test word was inverse-filtered to obtain the glottal flow waveform. Time domain parameters of the glottal flow [open quotient (OQ), closing quotient (CQ), speed quotient (SQ), fundamental frequency (F0)], amplitude domain parameters of the glottal flow [glottal flow (f_AC) and its logarithm, minimum of the first derivative of the glottal flow (d_peak) and its logarithm, amplitude quotient (AQ), and a new parameter, CQAQ], intraoral pressure (p), and sound pressure level (SPL) values of the phonations were analyzed. Voice range profiles (VRP) and the singer's formant (g/G, a/A, c1/c, e1/e, g1/g for females/males) of the loud phonation were also measured. Statistically significant differences between the preloading and postloading samples could be seen in many parameters, but the differences depended on gender and the type of phonation. In females the values of CQ, AQ, and CQAQ decreased and the values of SQ and p increased in normal phonations; the values of f_AC, d_peak, and SPL increased in soft phonations; the values of AQ and CQAQ decreased in loud phonations; the harmonic energy in the singer's formant region increased significantly at every pitch. In males the values of OQ and AQ decreased and the values of d_peak, F0, p, and SPL increased in normal phonations; the values of f_AC and p increased in soft phonations. The changes could be interpreted as signs of a shift toward hyperfunctional voice production. Low humidity was associated with more hyperfunctional changes than high humidity. High output was associated with more hyperfunctional changes than low output. Sitting position was associated with an increasing trend at both margins of male VRP, whereas the case was the opposite for standing position.

10.
Phonation threshold pressure has previously been defined as the minimum lung pressure required to initiate phonation. By modeling the dependence of this pressure on fundamental frequency, it is shown that relatively simple aerodynamic relations for time-varying flow in the glottis are obtained. Lung pressure and peak glottal flow are nearly linearly related, but not proportional. For this reason, traditional power law relations between vocal power and lung pressure may not hold. Glottal impedance for time-varying flow should be defined differentially rather than as a simple ratio between lung pressure and peak flow. It is shown that the peak flow, the peak flow derivative, the open quotient, and the speed quotient of inverse-filtered glottal flow waveforms all depend explicitly on phonation threshold pressure. Data from singers are compared with those from nonsingers. The primary difference is that singers obtain two to three times greater peak flow for a given lung pressure, suggesting that they adjust their glottal or vocal tract impedance for optimal flow transfer between the source and the resonator.

11.
Noninvasive measures of vocal fold activity are useful for describing normal and disordered voice production. Measures of open and speed quotient from glottal airflow and electroglottographic (EGG) waveforms have been used to describe timing events associated with vocal fold vibration. To date, there has been little consistency in the measurement criteria used to calculate quotient values. In this study, criteria of 20% and 50% were applied to the AC amplitude of glottal airflow and inverted EGG waveforms for measurement of open quotient. Criteria of 20%, 50%, and 80%, and a midslope criterion that segmented the waveform between 20% and 80% of the waveform amplitude, were used for the calculation of speed quotient. Subjects produced waveforms at sound pressure levels (SPL) of 70, 75, 80, and 85 dB. Results indicated that approximations of open quotient obtained from the glottal airflow waveform significantly decreased using both the 20% and 50% criteria as SPL increased from 80 to 85 dB. No significant changes were found in open quotient from the EGG waveform as a function of SPL. Results of speed quotient measures from the glottal airflow and EGG waveforms showed a generally increasing trend as SPL increased, although the differences were not statistically significant. The data suggest that the signal type, measurement criterion, and SPL must be considered in interpreting quotient measures.
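
One plausible reading of an amplitude criterion for open quotient can be sketched as follows: the cycle counts as "open" wherever the waveform exceeds its minimum plus a given fraction of the AC (peak-to-peak) amplitude. The function name `open_quotient`, this interpretation of the criterion, and the synthetic pulse are assumptions for illustration only; the abstract does not spell out the measurement procedure.

```python
def open_quotient(cycle, criterion=0.5):
    """Fraction of one cycle spent above min + criterion * AC amplitude.

    cycle:     samples assumed to span exactly one period
    criterion: fraction of the AC amplitude (0.2 or 0.5 in the abstract)
    """
    lo, hi = min(cycle), max(cycle)
    threshold = lo + criterion * (hi - lo)
    open_samples = sum(1 for x in cycle if x > threshold)
    return open_samples / len(cycle)


# Hypothetical test pulse: 60-sample rise, 20-sample fall, closed phase of zeros.
pulse = [i / 60 for i in range(61)] + [1 - j / 20 for j in range(1, 21)] + [0.0] * 19

for c in (0.2, 0.5):
    print(c, open_quotient(pulse, c))
```

On the same waveform, the lower criterion yields the larger open quotient, which illustrates why values obtained under different criteria are not directly comparable.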

12.
This study was primarily motivated by the need to establish the correspondence between auditory abilities and laryngeal function. Just noticeable differences (JNDs) were obtained for the open quotient and speed quotient of the glottal flow waveform. The quotients were synthesized for both the glottal flow alone, and for the output pressure signal after the glottal flow signal was applied to the synthesis vocal tract for the vowel /a/. Six adult men and five adult women, all teachers of singing, participated as listeners. An adaptive auditory listening procedure was used to estimate JNDs for the four types of stimuli. The group average JND values were as follows. For the standard open quotient value of 0.6000, JND = 0.0264 (SD = 0.010) for the glottal flow and JND = 0.0344 (SD = 0.020) for the output pressure. For the open quotient, there was no statistically significant difference between genders or between the types of signals. For the standard speed quotient value of 2.000, JND = 0.154 (SD = 0.043) for the glottal flow and JND = 0.319 (SD = 0.167) for the output pressure. For the speed quotient, there was no statistically significant difference between genders, but the difference between types of stimulus (glottal flow versus output pressure) was significant (p < 0.006). The variance among the JND values was significantly larger for the output pressure stimuli compared to the glottal flow stimuli for both the open quotient and the speed quotient.

13.
This study presents an approach to visualizing intensity regulation in speech. The method expresses a voice sample in a two-dimensional space using amplitude-domain values extracted from the glottal flow estimated by inverse filtering. The two-dimensional presentation is obtained by expressing a time-domain measure of the glottal pulse, the amplitude quotient (AQ), as a function of the negative peak amplitude of the flow derivative (d_peak). The regulation of vocal intensity was analyzed with the proposed method from voices varying from extremely soft to very loud with a SPL range of approximately 55 dB. When vocal intensity was increased, the speech samples first showed a rapidly decreasing trend as expressed on the proposed AQ-d_peak graph. When intensity was further raised, the location of the samples converged toward a horizontal line, the asymptote of a hypothetical hyperbola. This behavior of the AQ-d_peak graph indicates that the intensity regulation strategy changes from laryngeal to respiratory mechanisms, and the method chosen makes it possible to quantify how control mechanisms underlying the regulation of vocal intensity change gradually between the two means. The proposed presentation constitutes an easy-to-implement method to visualize the function of voice production in intensity regulation because the only information needed is the glottal flow waveform estimated by inverse filtering the acoustic speech pressure signal.

14.
Vocal fold vibratory asymmetry is often associated with inefficient sound production through its impact on source spectral tilt. This association is investigated in both a computational voice production model and a group of 47 human subjects. The model provides indirect control over the degree of left-right phase asymmetry within a nonlinear source-filter framework, and high-speed videoendoscopy provides in vivo measures of vocal fold vibratory asymmetry. Source spectral tilt measures are estimated from the inverse-filtered spectrum of the simulated and recorded radiated acoustic pressure. As expected, model simulations indicate that increasing left-right phase asymmetry induces steeper spectral tilt. Subject data, however, reveal that none of the vibratory asymmetry measures correlates with spectral tilt measures. Probing further into physiological correlates of spectral tilt that might be affected by asymmetry, the glottal area waveform is parameterized to obtain measures of the open phase (open/plateau quotient) and closing phase (speed/closing quotient). Subjects' left-right phase asymmetry exhibits low, but statistically significant, correlations with speed quotient (r=0.45) and closing quotient (r=-0.39). Results call for future studies into the effect of asymmetric vocal fold vibration on glottal airflow and the associated impact on voice source spectral properties and vocal efficiency.

15.
Five professional operatic baritone singers' voice-source characteristics were analyzed by means of inverse filtering of the flow signal as captured by a flow mask. The subjects sang a long sustained diminuendo, from loudest to softest, three times on the vowels [a:] and [ae:] at fundamental frequencies representing 25%, 50%, and 75% of their total pitch range as measured in semitones. During the diminuendos, they repeatedly inserted the consonant [p] so that associated subglottal pressures could be estimated from the oral pressure during the p-occlusions. Pooling the three takes of each condition, ten subglottal pressures, equidistantly spaced between highest and lowest, were selected for analysis. Sound-pressure levels (SPL), peak-to-peak glottal airflow, maximum flow declination rate, closed quotient, glottal dc flow, and the level difference between the two lowest partials of the source spectrum (H1-H2) were determined. All parameters except the glottal dc flow showed a systematic variation with subglottal pressure or the fractional excess pressure over threshold. The results are given in terms of equations representing the average across subjects for the relation between subglottal pressure and each of the mentioned voice-source parameters.

16.
This study aims to explore the perceptual relevance of the variations of glottal flow parameters and to what extent a small variation can be detected. Just Noticeable Differences (JNDs) have been measured for three values of open quotient (0.4, 0.6, and 0.8) and two values of asymmetry coefficient (2/3 and 0.8), and the effect of changes of vowel, pitch, vibrato, and amplitude parameters has been tested. Two main groups of subjects have been analyzed: a group of 20 untrained subjects and a group of 10 trained subjects. The results show that the JND for open quotient is highly dependent on the target value: the JND increases when the open quotient target value is increased. The relative JND is constant: ΔOq/Oq = 14% for the untrained and 10% for the trained. In the same way, the JND for asymmetry coefficient is also slightly dependent on the target value: an increase of the asymmetry coefficient value leads to a decrease of the JND. The results show that there is no effect from the selected vowel or frequency (two values have been tested), but that the addition of a vibrato has a small effect on the JND of open quotient. The choice of an amplitude parameter also has a great effect on the JND of open quotient.
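
The constant relative JND reported in this abstract implies an absolute JND that scales with the target open quotient. A minimal arithmetic sketch follows; the names `RELATIVE_JND`, `TARGETS`, and `absolute_jnd` are hypothetical, and the values it prints are derived from the stated percentages rather than taken from the study's data.

```python
# Relative JND for open quotient (ΔOq/Oq), as stated in the abstract.
RELATIVE_JND = {"untrained": 0.14, "trained": 0.10}

# Open quotient target values tested in the study.
TARGETS = (0.4, 0.6, 0.8)


def absolute_jnd(target_oq, group):
    """Absolute JND implied by a constant relative JND (derived, not measured)."""
    return RELATIVE_JND[group] * target_oq


for group in RELATIVE_JND:
    for oq in TARGETS:
        print(f"{group:9s} Oq={oq:.1f}  implied JND={absolute_jnd(oq, group):.3f}")
```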

17.
Journal of Voice, 2020, 34(4): 645.e19-645.e39
Intraglottal pressure is the driving force of vocal fold vibration. Its time course during the open phase of the vibratory cycle is essential in the mechanics of phonation, but measuring it directly is difficult and may hinder spontaneous voicing. However, it can be computed from the in vivo measured transglottal flow and glottal area (hence the air particle velocity) on the basis of the Bernoulli energy law and the interaction with the inertance of the vocal tract. As to sustained modal phonation, calculations are presented for the two possible shapes of glottal duct: convergent and divergent, including absolute calibration in order to obtain quantitative physical values. Whatever the glottal duct configuration, the calculations based on measured values of glottal area and air flow show that the integrated intraglottal pressure during the opening phase systematically exceeds that during the closing phase, which is the basic condition for sustaining vocal fold oscillation. The key point is that the airflow curve is skewed to the right relative to the glottal area curve. The skewing results from air compressibility and vocal tract inertance. The intraglottal pressure becomes negative during the closing phase. As to the soft (or physiological) voice onset, a similar approach shows that the integrated pressure differences (opening phase − closing phase) actually increase as the onset progresses, and this applies to the results based on Bernoulli's energy law as well as to those based on the interaction with the inertance of the vocal tract. Furthermore and similarly, the phase lead of the pressure wave with respect to the glottal opening progressively increases. The underlying explanation lies in the progressively increasing skewing of the airflow curve to the right with respect to the glottal area curve.

18.
Vocal intensity is studied as a function of fundamental frequency and lung pressure. A combination of analytical and empirical models is used to predict sound pressure levels from glottal waveforms of five professional tenors and twenty-five normal control subjects. The glottal waveforms were obtained by inverse filtering the mouth flow. Empirical models describe features of the glottal flow waveform (peak flow, peak flow derivative, open quotient, and speed quotient) in terms of lung pressure and phonation threshold pressure, a key variable that incorporates the F0 dependence of many of the features of the glottal flow. The analytical model describes the contributions of the vocal tract to sound pressure level (SPL). Results show that SPL increases with F0 at a rate of 8-9 dB/octave provided that lung pressure is raised proportional to phonation threshold pressure. The SPL also increases at a rate of 8-9 dB per doubling of excess pressure over threshold, a new quantity that assumes considerable importance in vocal intensity calculations. For the same excess pressure over threshold, the professional tenors produced 10-12 dB greater intensity than the male nonsingers, primarily because their peak airflow was much higher for the same pressure. A simple set of rules is devised for predicting SPL from source waveforms.
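
The "dB per doubling" figure in this abstract can be read as a logarithmic rule of thumb. The sketch below is an assumption-laden illustration: the function name `spl_change` is hypothetical, the default of 8.5 dB is simply the midpoint of the reported 8-9 dB range, and the paper's actual prediction rules are not reproduced here.

```python
import math


def spl_change(excess_ratio, db_per_doubling=8.5):
    """SPL change in dB when excess pressure over threshold is scaled by excess_ratio.

    db_per_doubling: assumed midpoint of the 8-9 dB range quoted in the abstract.
    """
    return db_per_doubling * math.log2(excess_ratio)


print(spl_change(2.0))        # one doubling of excess pressure
print(spl_change(4.0, 8.0))   # two doublings at the lower end of the range
print(spl_change(0.5, 9.0))   # halving at the upper end of the range
```

Under the abstract's stated condition that lung pressure is raised in proportion to phonation threshold pressure, the same rate per octave of F0 would apply.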

19.
Journal of Voice, 2023, 37(3): 444-451
Objective: A single injection of basic fibroblast growth factor (bFGF) into the vocal folds of patients with glottal insufficiency has been shown to be effective for a few years. However, the long-term therapeutic effect of a single injection of bFGF into the vocal folds has yet to be demonstrated. In this study, the therapeutic effect of a single injection of bFGF into the vocal folds was investigated over several years by monitoring patients for 36 months following this treatment. Methods: Nineteen patients with glottal insufficiency received injections of bFGF diluted to 20 μg/mL in the superficial layer of the lamina propria of the bilateral vocal folds. The following parameters were evaluated at preinjection baseline and 6, 12, 18, 24, and 36 months later, and statistical comparisons were performed. The parameters evaluated were: the Grade, Rough, Breathy, Asthenic, and Strained (GRBAS) scale score; maximum phonation time; acoustic analysis; and glottal wave analysis (GWA) and kymograph edge analysis (KEA) using high-speed digital imaging (HSDI). The amplitude perturbation quotient (APQ) and period perturbation quotient (PPQ) were measured by acoustic analysis. The mean minimum glottal area during vocalization and mean minimum distance between the vocal folds were measured by GWA. The amplitudes of the bilateral vocal folds were measured by KEA. Results: Postinjection, the GRBAS scale score decreased from 6 months after injection, and maximum phonation time was prolonged. The mean minimum glottal area during vocalization and the mean minimum distance between the vocal folds calculated by GWA of HSDI decreased significantly after 6 months. These effects persisted until 36 months postinjection. APQ and PPQ derived from acoustic analysis tended to decrease, but not significantly. There was no clear change in the amplitudes of the bilateral vocal folds calculated by KEA of HSDI before and after injection. Conclusions: These results suggest that the effects of a single injection of bFGF into the vocal folds persist for 36 months.

20.
Seventeen healthy women, 45 to 61 years old, were examined using videofiberstroboscopy during phonation at three loudness levels. Two phoniatricians evaluated glottal closure using category and ratio scales. Transglottal airflow was studied by inverse filtering of the oral airflow signal recorded in a flow mask (Glottal Enterprises System) during the spoken phrase /ba:pa:pa:pa:p/ at three loudness levels. Subglottal pressure was estimated from the intraoral pressure during p occlusion. Running speech and the repeated /pa:/ syllables were perceptually evaluated by three speech pathologists regarding breathiness, hypo-, and hyperfunction, using continuous scales. Incomplete glottal closure was found in 35 of 46 phonations (76%). The degree of glottal closure increased significantly with raised loudness. Half of the women closed the glottis completely during loud phonation. Posterior glottal chink (PGC) was the most common gap configuration and was found in 28 of 46 phonations (61%). One third of the PGCs were in the cartilaginous glottis (PGCc) only. Two thirds extended into the membranous portion (PGCm); most of these occurred during soft phonation. Peak flow, peak-to-peak (AC) flow, and the maximum rate of change for the flow in the closing phase increased significantly with raised loudness. Minimum flow decreased significantly from normal to loud voice. Breathiness decreased with increased loudness. The results suggest that the incomplete closure patterns PGCc and PGCm during soft phonation ought primarily to be regarded as normal for Swedish women in this age group.
