1.

The Performance of the Time-Frequency Analysis Software (TF32) in the Acoustic Analysis of the Synthesized Pathological Voice

Yaser S. Natour Ahmad F. Saleem 《Journal of voice》2009,23(4):414-424

The purpose of this study was to examine the measurement capabilities of the algorithms used in the Time-Frequency Analysis Software Program for 32-bit Windows (TF32) for measuring fundamental frequency (F0), its dependent measures, and signal-to-noise ratio (SNR). The stability, accuracy, and linearity of its algorithm in response to systematic changes in aspiration noise and/or spectral slope (to mimic the perceptual characteristics of breathiness, roughness, and hoarseness) were evaluated using its analysis output for five female and five male synthesized voices. TF32 was used to calculate F0, Jitter%, Shimmer%, and SNR for each of the synthesized signals. The findings indicate that although TF32 produced stable results for male synthesized samples, the results were not accurate when measuring F0, Jitter%, and Shimmer% with the addition of noise and variations in open quotient, independently and in combination. In contrast, TF32 was neither stable nor accurate in making the same measurements for female synthesized samples. However, TF32 was stable and accurate in measuring SNR for male and most female voices. These results point to an inappropriate F0 extraction algorithm in TF32 and stress the need for further research to remediate the algorithm or to identify a superior one.

2.

A Comparative Study of Acoustic Voice Measurements by Means of Dr. Speech and Computerized Speech Lab

Ilse Smits Piet Ceuppens Marc S. De Bodt 《Journal of voice》2005,19(2):38-196

In this study, the calculations and results of acoustic voice analysis produced by two different analysis systems (Doctor Speech (DRS), Tiger Electronics, Neu-Anspach, Germany, and Computerized Speech Lab (CSL), Kay Elemetrics Corporation, Lincoln Park, NJ) are compared. A group of 120 normal voices was selected for analysis of the objective parameters: fundamental frequency (F0), variation of F0 (F0SD), jitter, shimmer, and harmonics-to-noise ratio (HNR). The subject group was a random selection of normal adult voices. The aim of this comparison was to identify differences and similarities in data measurements between the two systems to make data transfer possible. A significant correlation was found for F0, HNR, and relative shimmer. The correlation for jitter (relative and absolute) and F0SD was weak. DRS and CSL are not comparable in absolute figures, but their judgment against normative data is identical. Further research is necessary to explore the effect on pathological voices or child voices.

3.

Effects of reverberation on perceptual segregation of competing voices

Culling JF Hodder KI Toh CY 《The Journal of the Acoustical Society of America》2003,114(5):2871-2876

Two experiments investigated the effect of reverberation on listeners' ability to perceptually segregate two competing voices. Culling et al. [Speech Commun. 14, 71-96 (1994)] found that for competing synthetic vowels, masked identification thresholds were increased by reverberation only when combined with modulation of fundamental frequency (F0). The present investigation extended this finding to running speech. Speech reception thresholds (SRTs) were measured for a male voice against a single interfering female voice within a virtual room with controlled reverberation. The two voices were either (1) co-located in virtual space at 0 degrees azimuth or (2) separately located at +/-60 degrees azimuth. In experiment 1, target and interfering voices were either normally intonated or resynthesized with a fixed F0. In anechoic conditions, SRTs were lower for normally intonated and for spatially separated sources, while, in reverberant conditions, the SRTs were all the same. In experiment 2, additional conditions employed inverted F0 contours. Inverted F0 contours yielded higher SRTs in all conditions, regardless of reverberation. The results suggest that reverberation can seriously impair listeners' ability to exploit differences in F0 and spatial location between competing voices. The levels of reverberation employed had no effect on speech intelligibility in quiet.

4.

Frequency and amplitude perturbation analysis of electroglottograph during sustained phonation

T Haji S Horiguchi T Baer W J Gould 《The Journal of the Acoustical Society of America》1986,80(1):58-62

Electroglottography (EGG) was used to monitor vocal fold vibration patterns in normal subjects and patients with various laryngeal disorders. In order to evaluate the regularity of vocal fold vibration, frequency and amplitude perturbation of EGG waves during sustained phonation were measured with a laboratory computer. The data were compared to the degree of hoarseness evaluated by auditory perception and by sound spectrographic analysis. Frequency and amplitude perturbation measures showed some overlap between normal and pathological groups. However, there was a close relation between perturbation analysis of EGG waves and degree of hoarseness (Spearman's rank correlation coefficient rs = 0.73, p < 0.0005). Amplitude perturbation was found to be a more sensitive measure of the irregularity of vocal fold vibration than frequency perturbation.

5.

The correlogram: a visual display of periodicity

Granqvist S Hammarberg B 《The Journal of the Acoustical Society of America》2003,114(5):2934-2945

Fundamental frequency (F0) extraction is often used in voice quality analysis. In pathological voices with a high degree of instability in F0, it is common for F0 extraction algorithms to fail. In such cases, the faulty F0 values might spoil the possibilities for further data analysis. This paper presents the correlogram, a new method of displaying periodicity. The correlogram is based on the waveform-matching techniques often used in F0 extraction programs, but with no mechanism to select an actual F0 value. Instead, several candidates for F0 are shown as dark bands. The result is presented as a 3D plot with time on the x axis, correlation delay inverted to frequency on the y axis, and correlation on the z axis. The z axis is represented in a gray scale as in a spectrogram. Delays corresponding to integer multiples of the period time will receive high correlation, thus resulting in candidates at F0, F0/2, F0/3, etc. While the correlogram adds little to F0 analysis of normal voices, it is useful for analysis of pathological voices since it illustrates the full complexity of the periodicity in the voice signal. Also, in combination with manual tracing, the correlogram can be used for semimanual F0 extraction. If so, F0 extraction can be performed on many voices that cause problems for conventional F0 extractors. To demonstrate the properties of the method it is applied to synthetic and natural voices, among them six pathological voices, which are characterized by roughness, vocal fry, gratings/scrape, hypofunctional breathiness and voice breaks, or combinations of these.
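The display described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `correlogram`, the window and hop lengths, the frequency limits, and the use of normalized cross-correlation as the waveform-matching measure are all assumptions made for illustration. It computes, for each analysis time, the correlation between a window of the signal and the same window delayed by each candidate period, and reports each delay as a frequency.

```python
import numpy as np

def correlogram(signal, fs, f_min=50.0, f_max=500.0, window_s=0.02, hop_s=0.01):
    """Return (times, freqs, corr).

    corr[i, j] is the normalized correlation between the window starting at
    times[i] and the same-length window delayed by 1 / freqs[j] seconds.
    All parameter defaults are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    win = int(round(window_s * fs))        # window length in samples
    hop = int(round(hop_s * fs))           # step between analysis times
    min_lag = int(np.floor(fs / f_max))    # shortest delay (highest frequency)
    max_lag = int(np.ceil(fs / f_min))     # longest delay (lowest frequency)
    lags = np.arange(min_lag, max_lag + 1)
    starts = np.arange(0, len(signal) - win - max_lag + 1, hop)

    corr = np.zeros((len(starts), len(lags)))
    for i, s in enumerate(starts):
        a = signal[s:s + win]
        for j, lag in enumerate(lags):
            b = signal[s + lag:s + lag + win]
            denom = np.sqrt(np.dot(a, a) * np.dot(b, b))
            corr[i, j] = np.dot(a, b) / denom if denom > 0 else 0.0

    # Delay is "inverted to frequency" for the y axis: f = fs / lag.
    return starts / fs, fs / lags, corr
```

No single F0 value is selected. Plotting `corr` in gray scale against `times` and `freqs` would show bands at every delay that is an integer multiple of the period, which is how candidates at F0, F0/2, F0/3, and so on arise.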

6.

Relationship between subjective voice complaints and acoustic parameters in female teachers' voices

Leena Rantala Erkki Vilkman 《Journal of voice》1999,13(4):484-495

The aim of the study was to identify the acoustic correlates of female teachers' subjective voice complaints by recording their voices in their working environment. The subjects made recordings during lessons (N = 10) and breaks (N = 11). The subjects were divided into 2 groups: those with few voice complaints (FC group) and those with many voice complaints (MC group). The speech sample recorded during the breaks was a maximally sustained /a/, from which fundamental frequency (F0), jitter, and shimmer were analyzed. The classroom samples were analyzed for F0, sound pressure level (SPL), and F0 time (the active vibration time of the vocal folds). Additionally, an index for assessing voice loading is presented. The results revealed a tendency of the MC group to have higher F0 and lower SPL and perturbation values than the FC group. The index values correlated moderately with the subjective vocal complaints.

7.

The Effectiveness of Oral Resonance Therapy on the Perception of Femininity of Voice in Male-to-Female Transsexuals

Lisa Carew Georgia Dacakis Jennifer Oates 《Journal of voice》2007,21(5):591-603

Ten male-to-female transsexuals participated in five sessions of oral resonance voice therapy targeting lip spreading and forward tongue carriage. Acoustic analysis of recordings made pre- and posttherapy found that participant formant frequency values (F1, F2, and F3, from the vowels /a/, /i/, and /ʊ/), as well as fundamental frequency (F0), underwent a general increase posttherapy. F3 values, in particular, increased significantly posttreatment. Trends in listener ratings of these recordings showed that the majority of participants were perceived to sound more feminine following treatment. Participants' self-ratings of their voices pre- and posttreatment also indicated that participants perceived their voices as sounding more feminine and that they were more satisfied with their voices following treatment. The present study supports the findings of previous studies that have demonstrated that resonance characteristics in male-to-female transsexuals can be changed to more closely approximate those of females through oral resonance therapy. This intervention study also demonstrates that a spontaneous increase in F0 is achieved during the course of therapy. Further, this study provides preliminary evidence to suggest that oral resonance therapy may be effective in increasing femininity of voice in male-to-female transsexual clients.

8.

Vocal stability in functional dysphonic versus healthy voices at different times of voice loading

C. Jilek J. Marienhagen T. Hacki 《Journal of voice》2004,18(4):443-453

Functional (nonorganic) dysphonia is often characterized by vocal instability. The purpose of this prospective study was to examine whether there is a difference in vocal instability of functional dysphonic voices compared with healthy ones, that is, whether electroglottographic perturbation values differ (1) between healthy and dysphonic voices and (2) between two subgroups of the dysphonic voices (hypertonic and hypotonic dysphonic voices). Twenty-three patients with hypertonic functional dysphonia, 9 with hypotonic functional dysphonia, and 31 healthy nonsmokers were each examined electroglottographically before (Ex 1), immediately after (Ex 2), and 1 hour after (Ex 3) voice loading. Perturbations of frequency, amplitude, quasi-open-quotient, and contact-index were calculated from the EGG signal. At all three times of examination, hypertonic dysphonic voices showed higher perturbations than healthy voices, and they had higher perturbations than hypotonic dysphonic voices before and 1 hour after voice loading. Hypotonic dysphonic voices showed higher perturbations than healthy voices only 1 hour after voice loading. Voice loading induced different reactions in dysphonic voices: Some voices showed increased perturbations, and others exhibited normal or even decreased perturbation immediately after voice loading. Examination of electroglottographically derived perturbations immediately after voice loading therefore does not seem to be useful. Differentiation of hypertonic and hypotonic dysphonic voices was possible with an estimated sensitivity of 88.9% and a specificity of 87.0% by using the sum of the amplitude-perturbation and the quasi-open-quotient-perturbation measured before voice loading.

9.

Processing unattended speech

Rivenez M Darwin CJ Guillaume A 《The Journal of the Acoustical Society of America》2006,119(6):4027-4040

Three experiments examine the effect of a difference in fundamental frequency (F0) range between two simultaneous voices on the processing of unattended speech. Previous experiments have only found evidence for the processing of nominally unattended speech when it has consisted of isolated words which could have attracted the listener's attention. A paradigm recently used by Dupoux et al. [J. Exp. Psychol.: Human Percept. Perform. 29(1), 172-184 (2003)] was modified so that participants had to detect a target word belonging to a specific category presented in a rapid list of words in the attended ear. In the unattended ear, concatenated sentences were presented, some containing a repetition prime presented just before a target word. Primes speeded category detection by 25 ms when the two messages were in different F0 ranges. This priming effect was unaffected by whether the target was presented to the left or the right ear, but disappeared when there was no F0 range difference between the messages. Finally, it was replicated when participants were compelled to focus on the attended message in order to perform a second task. The results demonstrate that repetition priming can be produced by words in unattended continuous speech provided that there is a difference in F0 range between the voices.

10.

The effects of vowels on voice perturbation measures

Mehmet Akif Kiliç Fatih Öğüt Gürsel Dursun Erdoğan Okur Ilhami Yildirim Raşit Midilli 《Journal of voice》2004,18(3):318-324

This study examines voice perturbation parameters of the sustained [a] in English and of the eight vowels in Turkish to discover whether any difference exists between these languages, and whether a correlation exists between voice perturbation parameters and articulatory and acoustic properties of the Turkish vowels. Eight Turkish vowels uttered by 26 healthy, nonsmoking male volunteers who are native Turkish speakers were compared with a voice database that includes samples of normal and disordered voices belonging to American English speakers. Fundamental frequencies, the first and second formants, and perturbation parameters, such as jitter percent, pitch perturbation quotient, shimmer percent, and amplitude perturbation quotient of the sustained vowels, were measured. Also, the first and second formants of the sustained [a] in English were measured, and the other parameters were obtained from the database. When the voice perturbation parameters in Turkish and English were compared, statistically significant differences were not found. However, when the Turkish vowels were compared with each other, statistically significant differences were found among perturbation values. Categorical comparisons of the Turkish vowels, such as high-low, rounded-unrounded, and front-back, revealed significant differences in perturbation values. In correlation analysis, a weak linear inverse relation between jitter percent and the first formant (r=-0.260, p<0.05) was found.