Similar literature
20 similar documents found (search time: 15 ms)
1.
Dynamic high-pass filtering with a -3 dB frequency that is a factor of ten or more below the voice fundamental frequency has a negligible effect on the amplitudes of the Fourier components of an EGG waveform. However, such a filter can significantly distort the waveform due to distortion in the phase or time alignment of these Fourier components. Such high-pass filtering can be introduced purposefully to stabilize the waveform by attenuating low-frequency noise, or may be an undesired effect of using an amplification or data acquisition system designed for acoustic signals. For a given voice fundamental frequency, the amount of distortion depends greatly on the order or attenuation characteristics of the filter and on the type of EGG waveform. Both a high-order filter and a breathy voice tend to increase the amount of distortion. If the characteristics of the high-pass filter are known, there are a number of digital filter techniques that can be used to reduce the phase distortion. However, it is shown that a relatively simple analogue network can also be used to obtain a correction that suffices for most applications. If the precise characteristics of the filter are not known, the response to a square wave can be used to adjust the compensator parameters for an optimal correction.
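
The amplitude and phase claim above can be checked numerically. The following is a minimal sketch, assuming a first-order high-pass filter (the abstract does not state the filter order) and an illustrative fundamental frequency of 100 Hz; the function name `gain_db_and_phase_deg` is hypothetical.

```python
import cmath
import math


def gain_db_and_phase_deg(f, fc):
    """Gain (dB) and phase lead (degrees) of an assumed first-order
    high-pass filter H(f) = (j f/fc) / (1 + j f/fc)."""
    x = 1j * f / fc
    h = x / (1 + x)
    return 20 * math.log10(abs(h)), math.degrees(cmath.phase(h))


fo = 100.0      # assumed voice fundamental frequency in Hz
fc = fo / 10.0  # -3 dB frequency a factor of ten below fo

for k in (1, 2, 3):
    f = k * fo
    gain_db, phase_deg = gain_db_and_phase_deg(f, fc)
    # Convert the phase lead into a time advance for this harmonic.
    advance_ms = 1000.0 * phase_deg / (360.0 * f)
    print(f"harmonic {k}: gain {gain_db:.3f} dB, "
          f"phase {phase_deg:.2f} deg, advance {advance_ms:.3f} ms")
```

Under these assumptions the fundamental loses well under 0.1 dB of amplitude but is advanced by several degrees of phase, and each harmonic is shifted in time by a different amount, which is the kind of misalignment the abstract describes.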

2.
Electroglottography (EGG) is a method to monitor the vibrations of the vocal folds by measuring the varying impedance to a weak alternating current through the tissues of the neck. The paper is an attempt to give a state-of-the-art report of how electroglottography is used in the clinic. It is based on a search of the pertinent literature as well as on an inquiry to 17 well-known specialists in the field. The EGG techniques are described and limitations to the method are pointed out. Attempts to document voice quality by EGG are recognized and computerized methods to obtain information about vibratory perturbations and/or the vibratory frequency of the vocal folds are described. The author's personal conclusion is that the EGG signal is especially well suited for measurements of the glottal vibratory period. In the clinic such measurements are useful for periodicity analysis, as a basis for recording intonation contours, and to establish the characteristics of the voice fundamental frequency.

3.
Subharmonics are an important class of voice signals, relevant for speech, pathological voice, singing, and animal bioacoustics. They arise from special cases of amplitude (AM) or frequency modulation (FM) of the time-domain signal. Surprisingly, to date there is only one open source subharmonics detector available to the scientific community: Sun’s subharmonic-to-harmonic ratio (SHR). Here, this algorithm was subjected to a formal evaluation with two data sets of synthesized and empirical speech samples. Both data sets consisted of electroglottographic (EGG) signals, i.e., a physiological correlate of vocal fold oscillation that bypasses vocal tract acoustics. Data Set I contained 2560 synthesized EGG signals with varying degrees of AM and FM, fundamental frequency (fo), periodicity, and signal-to-noise ratio (SNR). Data Set II was made up of 25 EGG samples extracted from the CMU Arctic speech database. For a “ground truth” of subharmonicity, these samples were manually annotated by a group of five external experts. Analysis of the synthesized data suggested that the SHR metric is relatively robust as long as the subharmonic modulation extent is below 0.35 and 0.7 for the FM and AM scenarios, respectively. In the CMU Arctic speech data samples, the SHR analysis reached a maximum sensitivity of about 87% at a specificity of over 90%, but only for adaptive algorithm parameter settings. In contrast, the algorithm’s default parameter settings could only successfully classify about 9% of all subharmonic instances. The SHR is a useful metric for assessing the degree of subharmonics contained in voice signals, but only at adaptive parameter settings. In particular, the frequency ceiling should be set to five times the highest fo, and the frame length to at least five times the largest fundamental period of the analyzed signal. For subharmonic classification a threshold of SHR ≥ 0.01 is recommended.
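
The parameter recommendations at the end of this abstract reduce to simple arithmetic. The sketch below only illustrates that arithmetic; the function names are hypothetical and it does not implement or call the SHR algorithm itself.

```python
def recommended_shr_settings(fo_min_hz, fo_max_hz):
    """Return (frequency ceiling in Hz, minimum frame length in seconds)
    following the abstract's rule of thumb."""
    # Ceiling: five times the highest fundamental frequency.
    frequency_ceiling_hz = 5.0 * fo_max_hz
    # Frame: at least five times the largest fundamental period (1 / lowest fo).
    min_frame_length_s = 5.0 / fo_min_hz
    return frequency_ceiling_hz, min_frame_length_s


def is_subharmonic(shr, threshold=0.01):
    """Classify a frame as subharmonic when SHR >= threshold."""
    return shr >= threshold


# Illustrative values only: a signal whose fo ranges from 80 Hz to 250 Hz.
ceiling, frame = recommended_shr_settings(80.0, 250.0)
print(ceiling, frame)
```

For the illustrative 80 to 250 Hz range, this gives a ceiling of 1250 Hz and a frame of at least 62.5 ms.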

4.
Journal of Voice, 2020, 34(4): 503-526
Electroglottography (EGG) is a low-cost, noninvasive technology for measuring changes of relative vocal fold contact area during laryngeal voice production. EGG was introduced about 60 years ago and has gone through a “golden era” of increased scientific attention in the late 1980s and early 90s. During that period, four eminent review papers were written. Here, an update to these reviews is given, recapitulating some earlier landmark contributions and documenting noteworthy developments during the past 25 years. After presenting an algorithmic bibliographic analysis, some methodological aspects pertaining to measurement technology, qualitative and quantitative analysis, and respective interpretation are discussed. In particular, the interpretation of landmarks in the (first derivative of the) EGG waveform is critically examined. It is argued that because of inferior-superior and anterior-posterior phase differences of vocal fold vibration, vocal fold (de)contacting does not occur instantaneously, but over an interval of time. For this reason, instants of vocal fold closing and opening cannot be resolved exactly from the EGG signal. Consequently, any quantitative analysis parameter relying on the determination of (de)contacting events (such as the EGG contact quotient) should be interpreted with care. Finally, recent developments are reviewed for the various fields of application of EGG, including basic voice science and voice production physiology, speech signal processing and classification, clinical practice including swallowing, phonetics, hearing sciences, psychology, singing, trumpet playing, and mammalian and avian bioacoustics. Overall, EGG has over the past six decades developed into a mature technology with a wide range of applications. However, due to current limitations, the full potential of the methodology has as yet not been fully exploited.
Future development may occur on three levels: (a) rigorous validation of existent measurement approaches; (b) introduction and rigorous validation of novel quantitative and interpretative approaches; and (c) advancement of the measurement technology itself.

5.
Aerobic instructors frequently experience vocal fatigue and are at risk for the development of vocal fold pathology. Six female aerobic instructors, three with self-reported voice problems and three without, served as subjects. Measures of vocal function (perturbation and EGG) were obtained before and after a 30-minute exercise session. Results showed that the group with self-reported voice problems had greater amounts of jitter, lower harmonic-to-noise ratios, and less periodicity in sustained vowels overall, but no significant differences in measures of perturbation and EGG were found before and immediately after instruction. Measures of vocal parameters showed that subjects with self-reported voice problems projected with relatively greater vocal intensity and phonated for a greater percentage of time across beginning, middle, and ending periods of aerobic instruction than subjects with no reported voice problems.

6.
7.
Several studies revealed a high percentage of voice problems in future teachers. The influence of vocal constitution on vocal endurance is, however, still unclear. The goal of this study was to evaluate whether the increase of voice fundamental frequency (F0) during teaching is caused by (1) autonomic regulation patterns under stress, (2) anxiety as an emotional factor, or (3) limitations in voice constitution. Thirty-three subjects with either normal voice constitution (n = 15, group 1) or constitutional hypofunction (n = 18, group 2) assessed by voice range profile measurements were enrolled in this study. Furthermore, they underwent a standardized baseline test to register selected autonomic test parameters and were classified into autonomic outlet types (AOT) as proposed by Johannes et al. Later the subjects were examined during 1 hour of teaching (field study). The parameters tested included heart rate, pulse transition time, finger temperature, and voice fundamental frequency. To measure situational anxiety and general anxiety proneness, a state-trait anxiety inventory was taken. Eleven subjects per group were identified as autonomically stable (AOT 1), two per group as responding cardiovascularly (AOT 2), and two of group 1 and four of group 2, respectively, as having higher heart rate and higher blood pressure responses to stress (AOT 4). One subject had to be excluded because of missing data. However, statistical analyses showed no differences between AOT groups regarding the voice constitution groups. Increased fundamental frequencies of speaking voice after 30 and 45 minutes of teaching were found in group 2 (constitutional hypofunction). No effect of state or trait anxiety on voice endurance could be detected. Thus, the increase of fundamental frequency of voice has to be regarded as a consequence of vocal fatigue. A constitutionally weak voice seems to be a risk factor for developing a professional voice disorder.

8.
Inspiratory phonation (IP) is the production of voice as air is taken into the lungs. Although IP is promoted as a laryngeal assessment and voice treatment technique, it has been described quantitatively in very few speakers. This study quantified changes in laryngeal adduction, fundamental frequency, and intensity during IP relative to expiratory phonation (EP). We hypothesized that IP would increase laryngeal abduction and fundamental frequency. The experiment was a within-subjects, repeated measures design with each subject serving as her own control. Participants were 10 females (ages 19-50 years) who underwent simultaneous transoral videostrobolaryngoscopy and acoustic voice recording. We found that membranous vocal fold contact decreased significantly during IP relative to EP, while the trends for change of ventricular fold squeeze during IP varied across individuals. Vocal fundamental frequency increased significantly during IP relative to EP, but intensity did not vary consistently across conditions. Without teaching or coaching, changes that occurred during IP did not carry over to EP produced immediately following IP within the same respiratory cycle.

9.
HearFones (HF) have been designed to enhance auditory feedback during phonation. This study investigated the effects of HF (1) on sound perceivable by the subject, (2) on voice quality in reading and singing, and (3) on voice production in speech and singing at the same pitch and sound level.

Test 1: Text reading was recorded with two identical microphones in the ears of a subject. One ear was covered with HF, and the other was free. Four subjects attended this test. Tests 2 and 3: A reading sample was recorded from 13 subjects and a song from 12 subjects without and with HF on. Test 4: Six females repeated [pa:p:a] in speaking and singing modes without and with HF at the same pitch and sound level.

Long-term average spectra were made (Tests 1–3), and formant frequencies, fundamental frequency, and sound level were measured (Tests 2 and 3). Subglottic pressure was estimated from oral pressure in [p], and simultaneously electroglottography (EGG) was registered during voicing on [a:] (Test 4). Voice quality in speech and singing was evaluated by three professional voice trainers (Tests 2–4).

HF seemed to enhance sound perceivable across the whole range studied (0–8 kHz), with the greatest enhancement (up to ca. 25 dB) being at 1–3 kHz and at 4–7 kHz. The subjects tended to decrease loudness with HF (when sound level was not being monitored). In more than half of the cases, voice quality was evaluated as “less strained” and “better controlled” with HF. When pitch and loudness were constant, no clear differences were heard, but the closed quotient of the EGG signal was higher and the signal more skewed, suggesting a better glottal closure and/or diminished activity of the thyroarytenoid muscle.


10.
Control of fundamental frequency by the human voice may be accomplished using a variety of physiological mechanisms. Most of these mechanisms are myoelastic and include the length, mass, and longitudinal tension of the vocal folds. In this paper, a review of some of the data concerning the mechanisms for the creation of tension in the vocal folds is presented. The physiological mechanisms for the development of passive tension are reviewed and the potential contribution of passive tension in the control of fundamental frequency is discussed. Some of the mechanisms for the development of active muscle tension are described and the biomechanical characteristics of muscle are considered in the effort to explain the physiological control of fundamental frequency in the human voice.

11.
A method for analyzing and displaying electroglottographic (EGG) signals (and their first derivative, DEGG) is introduced: the electroglottographic wavegram ("wavegram" hereafter). To construct a wavegram, the time-varying fundamental frequency is measured and consecutive individual glottal cycles are identified. Each cycle is locally normalized in duration and amplitude, the signal values are encoded by color intensity, and the cycles are concatenated to display the entire voice sample in a single image, similar to sound spectrography. The wavegram provides an intuitive means for quickly assessing vocal fold contact phenomena and their variation over time. Variations in vocal fold contact appear here as a sequence of events rather than single phenomena, taking place over a certain period of time, and changing with pitch, loudness and register. Multiple DEGG peaks are revealed in wavegrams to behave systematically, indicating subtle changes of vocal fold oscillatory regime. As such, EGG wavegrams promise to reveal more information on vocal fold contacting and de-contacting events than previous methods.
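
The construction steps in this abstract (identify cycles, normalize each in duration and amplitude, concatenate) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the cycle boundaries are already known as sample indices, uses linear interpolation for the duration normalization, and leaves the color encoding to whatever plotting tool renders the resulting matrix. All function names are hypothetical.

```python
def resample_linear(values, n_out):
    """Linearly interpolate `values` onto n_out (>= 2) evenly spaced points."""
    if len(values) == 1:
        return [float(values[0])] * n_out
    out = []
    for i in range(n_out):
        pos = i * (len(values) - 1) / (n_out - 1)
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def normalize_amplitude(values):
    """Scale one cycle so that its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def wavegram(signal, cycle_starts, n_points=50):
    """Return one duration- and amplitude-normalized column per glottal cycle.

    `cycle_starts` holds the sample index at which each cycle begins; the
    last entry marks the end of the final cycle.
    """
    columns = []
    for start, end in zip(cycle_starts, cycle_starts[1:]):
        cycle = signal[start:end]
        columns.append(normalize_amplitude(resample_linear(cycle, n_points)))
    return columns


# Two toy cycles of different length and amplitude.
toy_signal = [0, 1, 2, 1, 0, 2, 4, 6, 4, 2, 0]
columns = wavegram(toy_signal, cycle_starts=[0, 4, 10], n_points=5)
print(len(columns), [len(c) for c in columns])
```

Each column has the same length and the same 0 to 1 range regardless of the original cycle duration or amplitude, so stacking the columns side by side yields an image with time on one axis and normalized cycle phase on the other.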

12.
A method of measuring the rate of change of fundamental frequency has been developed in an effort to find acoustic voice parameters that could be useful in psychiatric research. A minicomputer program was used to extract seven parameters from the fundamental frequency contour of tape-recorded speech samples: (1) the average rate of change of the fundamental frequency and (2) its standard deviation, (3) the absolute rate of fundamental frequency change, (4) the total reading time, (5) the percent pause time of the total reading time, (6) the mean, and (7) the standard deviation of the fundamental frequency distribution. The method is demonstrated on (a) a material consisting of synthetic speech and (b) voice recordings of depressed patients who were examined during depression and after improvement.

13.
14.
The purpose of this study was to take a critical look at a voice therapy technique known as the yawn-sigh. The voiced sigh as an approach in voice therapy has had increased use in recent years, particularly with problems of vocal hyperfunction. In this study, the physiology of the yawn-sigh was studied with video nasoendoscopy in eight normal subjects; their taped voices were also studied acoustically for possible fundamental frequency and formant changes in producing selected vowels under normal and sigh conditions. Although each subject was given a model by the examiner of a yawn-sigh, one of the eight subjects could not produce a true yawn-sigh. Endoscopic findings for seven of the eight subjects performing the yawn-sigh demonstrated retracted elevation of the tongue, a lower positioning of the larynx, and a widened pharynx. Acoustic analyses for the seven subjects producing the sigh found a marked lowering of the second and third formants. Implications for using the yawn-sigh in voice therapy are given, such as using a modified “silent” yawn-sigh, as an easy method for producing greater vocal tract relaxation.

15.
Noninvasive measures of vocal fold activity are useful for describing normal and disordered voice production. Measures of open and speed quotient from glottal airflow and electroglottographic (EGG) waveforms have been used to describe timing events associated with vocal fold vibration. To date, there has been little consistency in the measurement criteria used to calculate quotient values. In this study, criteria of 20% and 50% were applied to the AC amplitude of glottal airflow and inverted EGG waveforms for measurement of open quotient. Criteria of 20%, 50%, and 80%, and a midslope criterion that segmented the waveform between 20% and 80% of the waveform amplitude, were used for the calculation of speed quotient. Subjects produced waveforms at sound pressure levels (SPL) of 70, 75, 80 and 85 dB. Results indicated that approximations of open quotient obtained from the glottal airflow waveform significantly decreased using both the 20% and 50% criteria as SPL increased from 80 to 85 dB. No significant changes were found in open quotient from the EGG waveform as a function of SPL. Results of speed quotient measures from the glottal airflow and EGG waveforms showed a generally increasing trend as SPL increased, although the differences were not statistically significant. The data suggest that the signal type, measurement criterion and SPL must be considered in interpreting quotient measures.
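
To illustrate why the measurement criterion matters, here is a minimal sketch of a criterion-level open quotient for a single cycle. It assumes a waveform in which larger values correspond to a more open glottis (glottal airflow or inverted EGG) and defines the open phase as the samples above a given fraction of the cycle's peak-to-peak amplitude; the study's exact procedure may differ, and the function name and toy data are hypothetical.

```python
def open_quotient(cycle, criterion=0.5):
    """Fraction of the cycle spent above `criterion` of the AC amplitude."""
    lo, hi = min(cycle), max(cycle)
    threshold = lo + criterion * (hi - lo)
    open_samples = sum(1 for v in cycle if v > threshold)
    return open_samples / len(cycle)


# One toy cycle: closed phase followed by a roughly triangular open phase.
toy_cycle = [0, 0, 0, 0, 3, 6, 10, 6, 3, 0]
print(open_quotient(toy_cycle, criterion=0.2))
print(open_quotient(toy_cycle, criterion=0.5))
```

On this toy cycle the 20% criterion counts five of ten samples as open while the 50% criterion counts only three, so the same waveform yields different quotient values depending on the criterion chosen.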

16.
17.
"Throaty" voice quality has been regarded by voice pedagogues as undesired and even harmful. This study attempts to identify acoustic and physiological correlates of this quality. One male and one female subject read a text habitually and with a throaty voice quality. Oral pressure during p-occlusion was measured as an estimate of subglottal pressure. Long-term average spectrum analysis described the average spectrum characteristics. Sixteen syllables, perceptually evaluated with regard to throaty quality by five experts, were selected for analysis. Formant frequencies and voice source characteristics were measured by means of inverse filtering, and the vocal tract shape of the throaty and normal versions of the vowels [a,u,i,ae] of the male subject were recorded by magnetic resonance imaging. From this material, area functions were derived and their resonance frequencies were determined. The throaty versions of these four vowels all showed a pharynx that was narrower than in the habitually produced versions. To test the relevance of formant frequencies to perceived throaty quality, experts rated degree of throatiness in synthetic vowel samples, in which the measured formant frequency values of the subject were used. The main acoustic correlates of throatiness seemed to be an increase of F1, a decrease of F4, and in front vowels a decrease of F2, which presumably results from a narrowing of the pharynx. In the male subject, voice source parameters suggested a more hyperfunctional voice in throaty samples.

18.
The voice conversion (VC) technique recently has emerged as a new branch of speech synthesis dealing with speaker identity. In this work, a linear prediction (LP) analysis is carried out on speech signals to obtain acoustical parameters related to speaker identity: the speech fundamental frequency (pitch), voicing decision, signal energy, and vocal tract parameters. Once these parameters are established for two different speakers designated as source and target speakers, statistical mapping functions can then be applied to modify the established parameters. The mapping functions are derived from these parameters in such a way that the source parameters resemble those of the target. Finally, the modified parameters are used to produce the new speech signal. To illustrate the feasibility of the proposed approach, simple-to-use voice conversion software has been developed. This VC technique has shown satisfactory results, with the synthesized speech signal virtually matching that of the target speaker.

19.
The purpose of this study was to examine the phonatory characteristics of pig, sheep, and cow excised larynges and to find out which of these animal species is the best model for human phonation. Excised pig, sheep, and cow larynges were prepared and mounted over a tapered tube on the excised bench that supplied pressurized, heated, and humidified air in a manner similar to that for excised canine models. Each excised larynx was subjected to a series of pressure-flow experiments with adduction as the major control parameter. The subglottal pressure, electroglottograph (EGG), mean flow rate, audio signal, and sound pressure level were recorded during each experiment. The EGG signal was used to extract the fundamental frequency. It was found that pressure-frequency relations were nonlinear for these species, with a large rate of frequency change for the pig. The average oscillation frequencies for these species were 220 ± 57 Hz for the pig, 102 ± 33 Hz for the sheep, and 73 ± 10 Hz for the cow. The average phonation threshold pressure was 7.4 ± 2.0 cm H2O for the pig, 6.9 ± 2.9 cm H2O for the sheep, and 4.4 ± 2.3 cm H2O for the cow.

20.
Time normalization in voice analysis. (Total citations: 2; self-citations: 0; citations by others: 2)
The harmonics-to-noise ratio (HNR) has been widely accepted for quantifying the irregular or noise component of voice. HNR, however, is usually inflated by cycle-to-cycle variations of fundamental frequency period because zero padding is used for time normalization of the wavelet. In this study, a new method was developed for analyzing waveform perturbations of voice. In this method, noise components of voice were calculated from the discrepancies between wavelets after they had been optimally aligned in time. The optimal time normalization of wavelets was accomplished using procedures of dynamic time warping (DTW). This method was evaluated using both synthetic and natural voices, and significant reductions in noise were obtained. The harmonics-to-noise ratio obtained using DTW for time normalization was also shown to be independent of fundamental frequency perturbations.
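
The contrast between zero padding and DTW alignment can be illustrated on two toy wavelets of different length. This is a minimal sketch using the textbook DTW recurrence with an absolute-difference local cost; it is not the paper's procedure, and the function names and sample values are hypothetical.

```python
def zero_padded_residual(a, b):
    """Sum of absolute differences after padding the shorter wavelet with zeros."""
    n = max(len(a), len(b))
    a = list(a) + [0.0] * (n - len(a))
    b = list(b) + [0.0] * (n - len(b))
    return sum(abs(x - y) for x, y in zip(a, b))


def dtw_residual(a, b):
    """Minimum accumulated absolute difference over all monotonic alignments."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


# The second wavelet is the first one stretched by one sample, with no added noise.
wavelet_a = [0, 1, 2, 1, 0]
wavelet_b = [0, 1, 1, 2, 1, 0]
print(zero_padded_residual(wavelet_a, wavelet_b))
print(dtw_residual(wavelet_a, wavelet_b))
```

With zero padding, the difference in period length alone produces a nonzero residual that would be counted as noise; after DTW alignment the residual for this toy pair vanishes, since the two wavelets differ only in timing.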
