1.

Acoustic Analyses of Sustained and Running Voices From Patients With Laryngeal Pathologies

Yu Zhang Jack J. Jiang 《Journal of voice》2008,22(1):1-9

In this paper, we investigated the acoustic characteristics of sustained and running vowels from normal subjects and patients with laryngeal pathologies. Perturbation methods (including jitter and shimmer), signal-to-noise ratio (SNR), and nonlinear dynamic methods (such as correlation dimension and second-order entropy) were used to analyze sustained and running vowels. We found that the sustained vowels and running voices from normal subjects and patients with laryngeal pathologies had low-dimensional dynamic characteristics. For sustained vowels, the analyses of jitter, shimmer, correlation dimension, and second-order entropy revealed significant differences between normal and pathological voices. For running voices, jitter and shimmer did not statistically discriminate between normal and pathological voices, but a significant difference was found for SNR, correlation dimension, and second-order entropy. The results suggest that nonlinear dynamic analysis and traditional SNR analysis may be valuable for the analysis of sustained and running vowels; perturbation analysis may be applicable for the analysis of sustained vowels but should be applied with caution for running voice analysis.

2.

Vocal Parameters of Aerobic Instructors with and without Voice Problems

Virginia Wolfe PhD Joanne Long Heather Conner Youngblood Henry Williford Michelle Scharff Olson 《Journal of voice》2002,16(1):52-60

Aerobic instructors frequently experience vocal fatigue and are at risk for the development of vocal fold pathology. Six female aerobic instructors, three with self-reported voice problems and three without, served as subjects. Measures of vocal function (perturbation and EGG) were obtained before and after a 30-minute exercise session. Results showed that the group with self-reported voice problems had greater amounts of jitter, lower harmonic-to-noise ratios, and less periodicity in sustained vowels overall, but no significant differences in measures of perturbation and EGG were found before and immediately after instruction. Measures of vocal parameters showed that subjects with self-reported voice problems projected with relatively greater vocal intensity and phonated for a greater percentage of time across beginning, middle, and ending periods of aerobic instruction than subjects with no reported voice problems.

3.

Acoustic Characteristics of Vowels by Normal Malaysian Malay Young Adults

Hua Nong Ting See Yan Chia Badrulzaman Abdul Hamid Siti Zamratol-Mai Sarah Mukari 《Journal of voice》2011,25(6):e305

The acoustic characteristics of sustained vowels have been widely investigated across various languages and ethnic groups. These acoustic measures, including fundamental frequency (F₀), jitter (Jitt), relative average perturbation (RAP), five-point period perturbation quotient (PPQ5), shimmer (Shim), and 11-point amplitude perturbation quotient (APQ11), are not well established for Malaysian Malay young adults. This article studies the acoustic measures of Malaysian Malay adults using acoustical analysis. The study analyzed six sustained Malay vowels of 60 normal native Malaysian Malay adults with a mean age of 21.19 years. The F₀ values of Malaysian Malay males and females were reported as 134.85 ± 18.54 and 238.27 ± 24.06 Hz, respectively. Malaysian Malay females had significantly higher F₀ than males for all the vowels. However, no significant differences were observed between the genders for the perturbation measures in all the vowels, except RAP in /e/. No significant F₀ differences between the vowels were observed. Significant differences between the vowels were reported for all perturbation measures in Malaysian Malay males. As for Malaysian Malay females, significant differences between the vowels were reported for Shim and APQ11. Multiethnic comparisons indicate that F₀ varies between Malaysian Malay and other ethnic groups. However, the perturbation measures cannot be directly compared, because the measures vary significantly across different speech analysis software packages.
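The perturbation measures named in this abstract can be illustrated with a minimal sketch. It assumes the commonly used textbook definitions (local perturbation as the mean absolute difference between consecutive cycles, and a k-point quotient as the mean absolute deviation of each cycle from its k-point moving average, both divided by the overall mean). The function names and sample values are hypothetical, and, as the abstract itself cautions, individual analysis packages may compute these measures differently.

```python
def mean(xs):
    """Arithmetic mean of a non-empty sequence."""
    return sum(xs) / len(xs)


def local_perturbation(values):
    """Mean absolute difference between consecutive cycles, relative to the mean.

    Applied to cycle periods this is a jitter-like ratio; applied to cycle
    peak amplitudes it is a shimmer-like ratio (both as fractions, not %).
    """
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def perturbation_quotient(values, k):
    """k-point perturbation quotient (assumed definition).

    k=3 on periods corresponds to RAP, k=5 to PPQ5, and k=11 on
    amplitudes to APQ11.
    """
    if k % 2 == 0 or len(values) < k:
        raise ValueError("k must be odd and no longer than the sequence")
    half = k // 2
    deviations = [
        abs(values[i] - mean(values[i - half:i + half + 1]))
        for i in range(half, len(values) - half)
    ]
    return mean(deviations) / mean(values)


# Hypothetical cycle periods in milliseconds, for illustration only.
periods_ms = [5.0, 5.1, 5.0, 5.1, 5.0]
jitter = local_perturbation(periods_ms)
rap = perturbation_quotient(periods_ms, 3)
```

Multiplying either result by 100 gives the percentage form in which such measures are usually reported.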

4.

Multimodal Standardization of Voice Among Four Multicultural Populations: Formant Structures

Mary V. Andrianopoulos Keith Darrow Jie Chen 《Journal of voice》2001,15(1):61-77

A stratified random sample of 20 males and 20 females matched for physiologic factors and cultural-linguistic markers was examined to determine differences in formant frequencies during prolongation of three vowels: [a], [i], and [u]. The ethnic and gender breakdown included four sets of 5 male and 5 female subjects comprised of Caucasian and African American speakers of Standard American English, native Hindi Indian speakers, and native Mandarin Chinese speakers. Acoustic measures were analyzed using the Computerized Speech Lab (4300B) from which formant histories were extracted from a 200-ms sample of each vowel token to obtain first formant (F1), second formant (F2), and third formant (F3) frequencies. Significant group differences for the main effect of culture and race were found. For the main effect of gender, sexual dimorphism in vowel formants was evidenced for all cultures and races across all three vowels. The acoustic differences found are attributed to cultural-linguistic factors.

5.

Measures of vocal function during changes in vocal effort level (total citations: 4; self-citations: 0; citations by others: 4)

Daniel Zaoming Huang Fred D. Minifie Hidelzi Kasuya Sarah Xiao Lin 《Journal of voice》1995,9(4):429-438

The purpose of this article is to present the results of a controlled study of the day-to-day variabilities of three acoustic parameters (jitter, shimmer, and normalized noise energy), and two electroglottographic parameters (contact quotient and contact quotient perturbation) for vowels produced at three vocal efforts (low, normal, high). Data were obtained with use of a sophisticated bilinear interpolation pitch detection method. A repeated measures design required subjects to produce the vowels // and /a/ five times a day over 3 days at each vocal effort level. The jitter, shimmer, and normalized noise energy values from acoustic measures and contact quotient and contact quotient perturbation values varied significantly among the three vocal effort levels. The clinical implication of this finding is that vocal effort must be controlled in order to obtain consistent clinical measures. Furthermore, day-to-day variability must be taken into account if representative measures are to be obtained for clinical use.

6.

Effects of singing training on the speaking voice of voice majors

Ana P. Mendes W. S. Brown Jr. Howard B. Rothman Christine Sapienza 《Journal of voice》2004,18(1):83-89

This longitudinal study gathered data with regard to the question: Does singing training have an effect on the speaking voice? Fourteen voice majors (12 females and two males; age range 17 to 20 years) were recorded once a semester for four consecutive semesters, while sustaining vowels and reading the "Rainbow Passage." Acoustic measures included speaking fundamental frequency (SFF) and sound pressure level (SPL). Perturbation measures included jitter, shimmer, and harmonic-to-noise ratio. Temporal measures included sentence, consonant, and diphthong durations. Results revealed that, as the number of semesters increased, the SFF increased while jitter and shimmer slightly decreased. Repeated measure analysis, however, indicated that none of the acoustic, temporal, or perturbation differences were statistically significant. These results confirm earlier cross-sectional studies that compared singers with nonsingers, in that singing training mostly affects the singing voice and rarely the speaking voice.

7.

Voice F0 responses to pitch-shifted voice feedback during English speech

Chen SH Liu H Xu Y Larson CR 《The Journal of the Acoustical Society of America》2007,121(2):1157-1163

Previous studies have demonstrated that motor control of segmental features of speech relies to some extent on sensory feedback. Control of voice fundamental frequency (F0) has been shown to be modulated by perturbations in voice pitch feedback during various phonatory tasks and in Mandarin speech. The present study was designed to determine if voice F0 is modulated in a task-dependent manner during production of suprasegmental features of English speech. English speakers received pitch-modulated voice feedback (±50, 100, and 200 cents, 200 ms duration) during a sustained vowel task and a speech task. Response magnitudes during speech (mean 31.5 cents) were larger than during the vowels (mean 21.6 cents), response magnitudes increased as a function of stimulus magnitude during speech but not vowels, and responses to downward pitch-shift stimuli were larger than those to upward stimuli. Response latencies were shorter in speech (mean 122 ms) compared to vowels (mean 154 ms). These findings support previous research suggesting the audio-vocal system is involved in the control of suprasegmental features of English speech by correcting for errors between voice pitch feedback and the desired F0.
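The stimulus and response magnitudes in this abstract are given in cents, a logarithmic unit in which 1200 cents equal one octave (a frequency ratio of 2) and 100 cents equal one equal-tempered semitone. As a minimal sketch of that standard conversion, the helpers below translate between cents and frequency; the function names and the 200 Hz baseline are illustrative assumptions and do not come from the study.

```python
import math


def cents_between(f_ref_hz, f_hz):
    """Interval from f_ref_hz to f_hz in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f_hz / f_ref_hz)


def shift_by_cents(f_hz, cents):
    """Frequency obtained by shifting f_hz by the given number of cents."""
    return f_hz * 2.0 ** (cents / 1200.0)


# Hypothetical baseline F0 of 200 Hz (an assumption for illustration only).
baseline_hz = 200.0
# Feedback frequencies for the shift sizes listed in the abstract.
feedback_hz = {c: shift_by_cents(baseline_hz, c)
               for c in (-200, -100, -50, 50, 100, 200)}
# Frequency corresponding to the reported mean speech response of 31.5 cents.
response_hz = shift_by_cents(baseline_hz, 31.5)
```

Because the unit is logarithmic, a given number of cents corresponds to a larger change in hertz for a higher-pitched voice than for a lower-pitched one.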

8.

Intra- and intersubject variability in acoustic measures of normal voice

R.E. Stone Jr. Cheryl L. Rainey 《Journal of voice》1991,5(3)

Twenty-four normal adult women read part of the Rainbow Passage and sustained vowels three trials each. Utterances were assessed for selected parameters measured by Visi-Pitch (average and SD of fundamental frequency (F₀), average and SD of dBA, perturbation, and percent voiced/unvoiced/pause). Assessment of each parameter included measures of central tendency, dispersion, and distribution characteristics (skewness and kurtosis) of the data and of the ranges of values that would include 95% of the scores (95% fiduciary limits). Generally, differences for the group between the three trials were not significant. Intersubject variability for only a few parameters was less than 20% of the parameter's mean. For vowels, variability of jitter was 30–48% of the mean. Eight subjects provided performances 2 months later to obtain an estimate of intrasubject variability over time. There were desirable intrasubject correlations between performances for mean F₀, jitter in reading and on vowels /i/ and /a/, and percent of voicing. Inter- and intrasubject variability seems restricted and the data appear to resemble a normally distributed function for mean F₀ on reading, jitter on /i/, and percent of voicing. Thus, these parameters may have statistical merit for use in vocal testing.

9.

Intraspeaker variability in fundamental frequency stability: an age-related phenomenon? (total citations: 1; self-citations: 0; citations by others: 1)

S E Linville 《The Journal of the Acoustical Society of America》1988,83(2):741-745

The purpose of this investigation was to gather information on the extent to which intraspeaker variability on measures of jitter (%) and fundamental frequency standard deviation (F0 s.d.) is age related in women. Fifteen repeat productions of the vowels /i/, /a/, and /u/ from 22 young women (18-22 years) were analyzed for F0 s.d. and jitter. Findings for these young speakers were compared with those for elderly speakers tested previously (Linville and Korabic, 1987). Results indicate that the aging process brings about increases in the variability individual women demonstrate on measures of F0 stability when producing sustained vowels as steadily as possible. Further, young speakers differed markedly from elderly speakers in the pattern of frequency instability variations observed across the three vowels tested.

10.

Voice of Postradiotherapy Nasopharyngeal Carcinoma Patients: Evidence of Vocal Tract Effect

Emily Lin Tzer-Zen Hwang Jeremy Hornibrook Tika Ormond 《Journal of voice》2008,22(3):351-364

This study was aimed at identifying acoustic and physiological measures useful for monitoring voice changes in postnasopharyngeal patients with nonlaryngeal malignancies, and providing evidence of vocal tract effect on voice through comparisons between individuals with and without intact vocal tract. Simultaneous acoustic-electroglottographic signals recorded during phonation of vowels /i/ and /a/ sustained at habitual, high, and low pitch levels were compared among 10 postradiotherapy patients with nasopharyngeal carcinoma (NPC), 10 voice patients (VPs) with intact vocal tract, and 10 healthy individuals with normal voice (NORM). Results from a series of discriminant analyses revealed that the NPC group generally exhibited lower signal-to-noise ratio (SNR) and open quotient (OQ) and higher Formant 1 frequency (F1) and speed quotient (SQ) than the NORM group. Unlike both VP and NORM groups, the NPC group failed to show a pitch effect on all voice measures, including OQ, SQ, percent jitter, percent shimmer, and SNR, suggesting an effect of radiotherapy and/or vocal tract on laryngeal behaviors. For the vowel /i/, on the other hand, only the NPC and NORM groups showed a pattern of pitch-dependent F1 raising, a reflection of increased pharyngeal narrowing. These findings suggested that the pitch effect on laryngeal behaviors differed not only between individuals with intact vocal tract and those without but also between those with structural and dynamic changes of vocal tract.

11.

Comparisons of jitter, shimmer, and signal-to-noise ratio from directly digitized versus taped voice samples

Marylou Pausewang Gelfer Dawn M. Fendel 《Journal of voice》1995,9(4):378-382

The purpose of this study was to compare jitter, shimmer, and signal-to-noise ratio (SNR) measures obtained from tape-recorded samples with the same measures made on directly digitized voice samples, with use of the CSpeech acoustic analysis program. Subjects included 30 young women who phonated the vowel /a/ at a comfortable pitch and loudness level. Voice samples were simultaneously recorded and digitized, and the resulting perturbation measures for the two conditions were compared. Results indicated that there were small but statistically significant differences between percent jitter, percent shimmer, and SNR calculated from taped samples compared with the same measures calculated from directly digitized samples. It was concluded that direct digitization for clinical measures of vocal perturbation was most desirable, but that taped samples could be used, if necessary, with some caution.

12.

Compensatory responses to loudness-shifted voice feedback during production of Mandarin speech

Liu H Zhang Q Xu Y Larson CR 《The Journal of the Acoustical Society of America》2007,122(4):2405-2412

Previous studies have demonstrated that perturbations in voice pitch or loudness feedback lead to compensatory changes in voice F0 or amplitude during production of sustained vowels. Responses to pitch-shifted auditory feedback have also been observed during English and Mandarin speech. The present study investigated whether Mandarin speakers would respond to amplitude-shifted feedback during meaningful speech production. Native speakers of Mandarin produced two-syllable utterances with focus on the first syllable, the second syllable, or none of the syllables, as prompted by corresponding questions. Their acoustic speech signal was fed back to them with loudness shifted by ±3 dB for 200 ms durations. The responses to the feedback perturbations had mean latencies of approximately 142 ms and magnitudes of approximately 0.86 dB. Response magnitudes were greater and latencies were longer when emphasis was placed on the first syllable than when there was no emphasis. Since amplitude is not known for being highly effective in encoding linguistic contrasts, the fact that subjects reacted to amplitude perturbation just as fast as they reacted to F0 perturbations in previous studies provides clear evidence that a highly automatic feedback mechanism is active in controlling both F0 and amplitude of speech production.
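To put the decibel figures in this abstract on a linear scale, the sketch below converts level changes to amplitude ratios. It assumes the shifts are expressed relative to signal amplitude (sound pressure), so that the standard relation dB = 20 log10(ratio) applies; the abstract does not spell this out, and the function names are hypothetical.

```python
import math


def db_to_amplitude_ratio(db):
    """Linear amplitude ratio for a level change in dB (20*log10 convention)."""
    return 10.0 ** (db / 20.0)


def amplitude_ratio_to_db(ratio):
    """Level change in dB for a linear amplitude ratio."""
    return 20.0 * math.log10(ratio)


# Amplitude scaling implied by the +3 dB and -3 dB feedback shifts.
gain_up = db_to_amplitude_ratio(3.0)
gain_down = db_to_amplitude_ratio(-3.0)
# Amplitude scaling corresponding to the reported mean response of 0.86 dB.
response_gain = db_to_amplitude_ratio(0.86)
```

Under this assumption the mean compensatory response is a much smaller amplitude change than the ±3 dB perturbation that triggered it.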

13.

The relationship of vocal tract shape to three voice qualities

Story BH Titze IR Hoffman EA 《The Journal of the Acoustical Society of America》2001,109(4):1651-1667

Three-dimensional vocal tract shapes and consequent area functions representing the vowels [i, ae, a, u] have been obtained from one male and one female speaker using magnetic resonance imaging (MRI). The two speakers were trained vocal performers and both were adept at manipulation of vocal tract shape to alter voice quality. Each vowel was performed three times, each with one of the three voice qualities: normal, yawny, and twangy. The purpose of the study was to determine some ways in which the vocal tract shape can be manipulated to alter voice quality while retaining a desired phonetic quality. To summarize any overall tract shaping tendencies, mean area functions were subsequently computed across the four vowels produced within each specific voice quality. Relative to normal speech, both the vowel area functions and mean area functions showed, in general, that the oral cavity is widened and tract length increased for the yawny productions. The twangy vowels were characterized by shortened tract length, widened lip opening, and a slightly constricted oral cavity. The resulting acoustic characteristics of these articulatory alterations consisted of the first two formants (F1 and F2) being close together for all yawny vowels and far apart for all the twangy vowels.

14.

Perceptual sensitivity to first harmonic amplitude in the voice source

Kreiman J Gerratt BR 《The Journal of the Acoustical Society of America》2010,128(4):2085-2089

Little is known about the perceptual importance of changes in the shape of the source spectrum, although many measures have been proposed and correlations with different vocal qualities (breathiness, roughness, nasality, strain...) have frequently been reported. This study investigated just-noticeable differences in the relative amplitudes of the first two harmonics (H1-H2) for speakers of Mandarin and English. Listeners heard pairs of vowels that differed only in the amplitude of the first harmonic and judged whether or not the voice tokens were identical in voice quality. Across voices and listeners, just-noticeable-differences averaged 3.18 dB. This value is small relative to the range of values across voices, indicating that H1-H2 is a perceptually valid acoustic measure of vocal quality. For both groups of listeners, differences in the amplitude of the first harmonic were easier to detect when the source spectral slope was steeply falling so that F0 dominated the spectrum. Mandarin speakers were significantly more sensitive (by about 1 dB) to differences in first harmonic amplitudes than were English speakers. Two explanations for these results are possible: Mandarin speakers may have learned to hear changes in harmonic amplitudes due to changes in voice quality that are correlated with the tones of Mandarin; or Mandarin speakers' experience with tonal contrasts may increase their sensitivity to small differences in the amplitude of F0 (which is also the first harmonic).

15.

Photoglottographic measures in Parkinson's disease (total citations: 1; self-citations: 0; citations by others: 1)

Emily Lin Jack Jiang Stephen Hone David G. Hanson 《Journal of voice》1999,13(1):25-35

This study examines the usefulness of photoglottographic measures in reflecting the phonatory effect of Parkinson's disease. In the first experiment, data obtained by photoglottography were compared between 15 male patients with Parkinson's disease and 15 normal male speakers of similar age. Six photoglottographic parameters, mean open quotient (OQ), mean speed quotient (SQ), perturbation of open quotient (POQ), perturbation of speed quotient (PSQ), frequency perturbation ratio (FPR), and amplitude perturbation ratio (APR), in sustained vowel phonation were investigated. Increased SQ (t = -2.731, df = 28, P = 0.011) and POQ (t = -2.584, df = 28, P = 0.015) were significantly associated with data from patients in comparison to normal speakers. The FPR, APR, and OQ were not significantly different between normal subjects and patients. A follow-up experiment, including 12 female and 19 male patients with Parkinson's disease, was designed to evaluate the sensitivity of SQ and POQ in detecting vocal dysfunction. The sensitivity of SQ was found to be relatively high (93.5%), while that of POQ was low (45.2%). Methodological issues regarding the effects of gender, age, stage of the disease, and treatment on photoglottographic measures in Parkinson's disease were discussed.
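Sensitivity is the proportion of affected individuals that a measure flags as abnormal. The abstract reports only percentages, so the sketch below back-calculates which whole-number patient counts are consistent with them. It assumes that all 31 follow-up patients (12 female plus 19 male) form the denominator; the resulting counts are an inference for illustration, not figures stated by the authors, and the function names are hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of affected individuals correctly flagged by a measure."""
    return true_positives / (true_positives + false_negatives)


def counts_matching(percent, n_patients):
    """Whole-number detected counts whose rate rounds to the given percent."""
    return [
        k for k in range(n_patients + 1)
        if round(100.0 * k / n_patients, 1) == percent
    ]


n_patients = 12 + 19  # follow-up cohort size given in the abstract

# Counts consistent with the reported sensitivities (inferred, not reported).
sq_counts = counts_matching(93.5, n_patients)
poq_counts = counts_matching(45.2, n_patients)
```

Each reported percentage matches exactly one whole-number count out of 31, which supports the assumption that the full follow-up cohort was the denominator.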

16.

Effects of head extension and tongue protrusion on voice perturbation measures

Emily Lin Jack Jiang Stephen D. Noon David G. Hanson 《Journal of voice》2000,14(1):8

Head extension with protruded tongue is the position for video-laryngoscopy and simultaneous glottographic recordings including photoglottographic signals. This study investigated the effect of head extension and tongue protrusion on the measures of fundamental frequency, frequency perturbation (jitter), and amplitude perturbation (shimmer). Acoustic signals recorded during sustained vowels were obtained from 49 women and 66 men with no speech or voice disorders in different head-tongue positions. Head extension was associated with increased fundamental frequency and decreased shimmer. In men, head extension did not appear to affect jitter. When the tongue was protruded, head extension tended to lower jitter. For both genders, tongue protrusion was associated with decreased fundamental frequency with head extension. In the men, tongue protrusion tended to increase shimmer when the head was in the neutral position. In the women, tongue protrusion was associated with increased jitter and increased shimmer and was most evident in the head-neutral position. These findings supported a physical linkage hypothesis of the relationship between vocal tract configuration and vocal fold vibration, suggesting that head-tongue position must be taken into account when comparing voice measures.

17.

Can intrinsic vowel F0 be explained by source/tract coupling?

W G Ewan 《The Journal of the Acoustical Society of America》1979,66(2):358-362

There is extensive evidence that in the same phonetic environment the voice fundamental frequency (F0) of vowels varies directly with vowel "height." This F0 difference between vowels could be caused by acoustic interaction between the first vowel formant and the vibrating vocal folds. Since higher vowels have lower first formants than low vowels, the acoustic interaction should be greatest for high vowels, whose first formant frequencies are closer in frequency to F0. Ten speakers were used to see if acoustic interaction could cause the F0 differences. The consonant [m] was recorded in the utterances [umu] and [ama]. Although the formant structure of [m] in [umu] and [ama] should not differ significantly, the F0 of each [m] allophone was significantly different. However, the F0 of each [m] allophone did not differ significantly from the F0 of the following vowel. These results did not support acoustic interaction. However, it is quite reasonable to conclude that the F0 variation of [m] was caused by coarticulatory anticipation of the tongue and jaw for the following vowel. Another experiment is offered in order to help explain the physical causes of intrinsic vowel F0. In this experiment F0 lowering was found at the beginning of vowels following Arabic pharyngeal approximants. This finding indicates that the F0 of pharyngeal constricting vowels, e.g., [ae] and [a], might be lowered as a result of similar articulatory movements, viz. tongue compression and active pharyngeal constriction.

18.

Decomposition of Vocal Cycle Length Perturbations into Vocal Jitter and Vocal Microtremor, and Comparison of Their Size in Normophonic Speakers

J. Schoentgen 《Journal of voice》2003,17(2):114-125

A statistical method that enables raw vocal cycle length perturbations to be decomposed into perturbations ascribed to vocal jitter and vocal tremor is presented, together with a comparison of the size of jitter and tremor. The method is based on a time series model that splits the vocal cycle length perturbations into uncorrelated cycle-to-cycle perturbations ascribed to vocal jitter and supra-cycle perturbations ascribed to vocal tremor. The corpus was composed of 114 vocal cycle length time series for sustained vowels [a], [i], and [u] produced by 22 male and 16 female normophonic speakers. The results were the following. First, 100 out of 114 time series were decomposed successfully by means of the time series model. Second, vocal perturbations ascribed to tremor were significantly larger than perturbations ascribed to jitter. Third, the correlation between vocal jitter and vocal tremor was moderate, but statistically significant. Fourth, small but statistically significant differences were observed among the three vowel timbres in the relative jitter and the arithmetic difference of jitter and tremor. Fifth, the differences between male and female speakers were not statistically significant in the relative raw perturbations, the relative jitter, or the modulation level owing to tremor.

19.

An Empirical Study of the Formant Editing Method for Distinguishing Oral and Nasal Formants in Nasalized Vowels

Zhao Qinghua, Yang Junjie 《应用声学》 (Journal of Applied Acoustics) 2021,40(6):937-945

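This study addresses the problem of how to accurately distinguish the oral and nasal formants of nasalized vowels when such vowels are used to construct vowel acoustic space plots in forensic speaker identification. The formants of speech samples were edited on a computer speech workstation, and the resulting samples were arranged into different comparison groups for listening tests. The results show that the changes in the speech after the oral and the nasal formants were attenuated separately follow a regular pattern, and that the method can accurately determine which formant orders of a nasalized vowel are oral and which are nasal. The discrimination method established here, which combines "formant editing" with "auditory perception", can provide a basis for identifying oral and nasal formants in models that study acoustic features through vowel acoustic space plots, both in forensic speaker identification and in related fields such as speech perception and recognition.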

20.

Comparison of Singer's Formant, Speaker's Ring, and LTA Spectrum Among Classical Singers and Untrained Normal Speakers

Viviane M. Oliveira Barrichelo Reinhardt J. Heuer Carole M. Dean Robert T. Sataloff 《Journal of voice》2001,15(3):344-350

Many studies have described and analyzed the singer's formant. A similar phenomenon produced by trained speakers led some authors to examine the speaker's ring. If we consider these phenomena as resonance effects associated with vocal tract adjustments and training, can we hypothesize that trained singers can carry over their singing formant ability into speech, also obtaining a speaker's ring? Can we find similar differences for energy distribution in continuous speech? Forty classically trained singers and forty untrained normal speakers performed an all-voiced reading task and produced a sample of a sustained spoken vowel /a/. The singers were also requested to perform a sustained sung vowel /a/ at a comfortable pitch. The reading was analyzed by the long-term average spectrum (LTAS) method. The sustained vowels were analyzed through power spectrum analysis. The data suggest that singers show more energy concentration in the singer's formant/speaker's ring region in both sung and spoken vowels. The singers' spoken vowel energy in the speaker's ring area was found to be significantly larger than that of the untrained speakers. The LTAS showed similar findings suggesting that those differences also occur in continuous speech. This finding supports the value of further research on the effect of singing training on the resonance of the speaking voice.