
1.

Incorporation of phonetic constraints in acoustic-to-articulatory inversion

Potard B Laprie Y Ouni S 《The Journal of the Acoustical Society of America》2008,123(4):2310-2323

This study investigates the use of constraints upon articulatory parameters in the context of acoustic-to-articulatory inversion. These speaker-independent constraints, referred to as phonetic constraints, were derived from standard phonetic knowledge for French vowels and express authorized domains for one or several articulatory parameters. They were tested in an existing inversion framework that utilizes Maeda's articulatory model and a hypercubic articulatory-acoustic table. Phonetic constraints give rise to a phonetic score that reflects the phonetic consistency of vocal tract shapes recovered by inversion. Inversion has been applied to vowels articulated by a speaker whose corresponding x-ray images are also available. Constraints were evaluated by measuring the distance between vocal tract shapes recovered through inversion and real vocal tract shapes obtained from x-ray images, by investigating the spreading of inverse solutions in terms of place of articulation and constriction degree, and finally by studying the articulatory variability. Results show that these constraints capture interdependencies and synergies between speech articulators and favor vocal tract shapes close to those realized by the human speaker. In addition, this study shows how acoustic-to-articulatory inversion can be used to explore acoustical and compensatory articulatory properties of an articulatory model.

2.

Voice and Lifestyle Behaviors of Speech-Language Pathology Students: Impact of History Gathering Method on Self-Reported Data

Jeff Searl Troy Dargin 《Journal of voice》2021,35(1):158.e9-158.e20

Objectives: This study described voice use and lifestyle information from student speech-language pathologists (SLP) and assessed the impact of history gathering method on the acquired data. Methods: One hundred sixty-two SLP students completed a detailed history form and estimated voice and lifestyle parameters at study intake and subsequently tracked the same parameters daily for three consecutive weeks. Nonparametric statistical comparisons were applied to assess differences in estimates at intake versus the 3-week log. Results: Voice problems diagnosed by a physician or SLP were reported by 11% of the students. A similar percentage reported frequent loud talking and heavy occupational voice demands beyond clinical training use. Furthermore, high stress was reported by 49%, frequent anxiety by 53%, and depression by 17%. Comparing data from study intake relative to the 3-week log, SLP students statistically significantly overestimated speaking time, and underestimated singing, secondhand smoke exposure time, and hours of sleep. Additionally, they overestimated water intake and daily stress, and underestimated caffeine and alcohol intake, at the study onset versus the log. The experience of vocal fatigue was common within the 3-week log, but how a student identified at study intake on this parameter (experiencing it frequently or not) did not differentiate how many days of vocal fatigue were reported in 3 weeks. Conclusions: SLP students engage in some voice use and lifestyle behaviors that place them at risk for voice problems. The method of soliciting information about the voice and lifestyle of SLP students impacted the information obtained. Optimal methods of gathering accurate and reliable clinical history and voice use data are needed.

3.

Articulatory tradeoffs reduce acoustic variability during American English /r/ production.

F H Guenther C Y Espy-Wilson S E Boyce M L Matthies M Zandipour J S Perkell 《The Journal of the Acoustical Society of America》1999,105(5):2854-2865

The American English phoneme /r/ has long been associated with large amounts of articulatory variability during production. This paper investigates the hypothesis that the articulatory variations used by a speaker to produce /r/ in different contexts exhibit systematic tradeoffs, or articulatory trading relations, that act to maintain a relatively stable acoustic signal despite the large variations in vocal tract shape. Acoustic and articulatory recordings were collected from seven speakers producing /r/ in five phonetic contexts. For every speaker, the different articulator configurations used to produce /r/ in the different phonetic contexts showed systematic tradeoffs, as evidenced by significant correlations between the positions of transducers mounted on the tongue. Analysis of acoustic and articulatory variabilities revealed that these tradeoffs act to reduce acoustic variability, thus allowing relatively large contextual variations in vocal tract shape for /r/ without seriously degrading the primary acoustic cue. Furthermore, some subjects appeared to use completely different articulatory gestures to produce /r/ in different phonetic contexts. When viewed in light of current models of speech movement control, these results appear to favor models that utilize an acoustic or auditory target for each phoneme over models that utilize a vocal tract shape target for each phoneme.

4.

On the perception of similarity among talkers

Remez RE Fellowes JM Nagel DS 《The Journal of the Acoustical Society of America》2007,122(6):3688-3696

A listener who recognizes a talker notices characteristic attributes of the talker's speech despite the novelty of each utterance. Accounts of talker perception have often presumed that consistent aspects of an individual's speech, termed indexical properties, are ascribable to a talker's unique anatomy or consistent vocal posture distinct from acoustic correlates of phonetic contrasts. Accordingly, the perception of a talker is acknowledged to occur independently of the perception of a linguistic message. Alternatively, some studies suggest that attention to attributes of a talker includes indexical linguistic attributes conveyed in the articulation of consonants and vowels. This investigation sought direct evidence of attention to phonetic attributes of speech in perceiving talkers. Natural samples and sine-wave replicas derived from them were used in three experiments assessing the perceptual properties of natural and sine-wave sentences; of temporally veridical and reversed natural and sine-wave sentences; and of an acoustic correlate of vocal tract scale to judgments of sine-wave talker similarity. The results revealed that the subjective similarity of individual talkers is preserved in the absence of natural vocal quality, and that local phonetic segmental attributes as well as global characteristics of speech can be exploited when listeners notice characteristics of talkers.

5.

Large scale data acquisition of simultaneous MRI and speech

《Applied Acoustics》2014

We describe an arrangement for simultaneous recording of speech and vocal tract geometry in patients undergoing surgery involving this area. Experimental design is considered from an articulatory phonetic point of view. The speech signals are recorded with an acoustic-electrical arrangement. The vocal tract is simultaneously imaged with MRI. A MATLAB-based system controls the timing of speech recording and MR image acquisition. The speech signals are cleaned of acoustic MRI noise by an adaptive signal processing algorithm. Finally, a vowel data set from pilot experiments is qualitatively compared both with validation data from the anechoic chamber and with Helmholtz resonances of the vocal tract volume, obtained using FEM.

6.

The relationship of vocal tract shape to three voice qualities

Story BH Titze IR Hoffman EA 《The Journal of the Acoustical Society of America》2001,109(4):1651-1667

Three-dimensional vocal tract shapes and consequent area functions representing the vowels [i, ae, a, u] have been obtained from one male and one female speaker using magnetic resonance imaging (MRI). The two speakers were trained vocal performers and both were adept at manipulation of vocal tract shape to alter voice quality. Each vowel was performed three times, each with one of the three voice qualities: normal, yawny, and twangy. The purpose of the study was to determine some ways in which the vocal tract shape can be manipulated to alter voice quality while retaining a desired phonetic quality. To summarize any overall tract shaping tendencies, mean area functions were subsequently computed across the four vowels produced within each specific voice quality. Relative to normal speech, both the vowel area functions and mean area functions showed, in general, that the oral cavity is widened and tract length increased for the yawny productions. The twangy vowels were characterized by shortened tract length, widened lip opening, and a slightly constricted oral cavity. The resulting acoustic characteristics of these articulatory alterations consisted of the first two formants (F1 and F2) being close together for all yawny vowels and far apart for all the twangy vowels.

7.

Effect of Fasting on Voice in Women

Abdul-Latif Hamdan Abla Sibai Charbel Rameh 《Journal of voice》2007,21(4):495-501

OBJECTIVE/HYPOTHESIS: To study the effect of fasting on voice in women: abstinence from food and water intake between 14 and 18 hours. STUDY DESIGN: A prospective study on female subjects. MATERIAL AND METHOD: A total of 28 female subjects were included in this study. Their age ranged between 21 and 45 years. Subjects with vocal symptoms or vocal fold lesions were excluded. The subjects were tested when they were not fasting and while fasting after the first week of intermittent fasting during Ramadan. Each subject was first asked about her vocal symptoms and the ease of phonation or phonatory effort. Then each underwent acoustic analysis and laryngeal video-endostroboscopy. RESULTS: Vocal fatigue was the most common reported complaint (53.6%) followed by deepening of the voice (21.4%) and harshness (10.2%). Self-reported phonatory effort was significantly affected by fasting (P value < 0.001). Out of the 28 subjects, 23 had an increase in their phonatory effort. Vocal acoustic parameters did not change markedly except for the maximum phonation time, which decreased significantly. Laryngeal video-endostroboscopy did not reveal any significant changes during fasting. All stroboscopic parameters were the same except for a decrease in the amplitude of the mucosal waves in one subject and the presence of a posterior chink in three subjects. CONCLUSION: Fasting affects voice. There is an increase in the phonatory effort, and vocal fatigue is the most common symptom.

8.

Rhesus macaques spontaneously perceive formants in conspecific vocalizations

Fitch WT Fritz JB 《The Journal of the Acoustical Society of America》2006,120(4):2132-2141

We provide a direct demonstration that nonhuman primates spontaneously perceive changes in formant frequencies in their own species-typical vocalizations, without training or reinforcement. Formants are vocal tract resonances leading to distinctive spectral prominences in the vocal signal, and provide the acoustic determinant of many key phonetic distinctions in human languages. We developed algorithms for manipulating formants in rhesus macaque calls. Using the resulting computer-manipulated calls in a habituation/dishabituation paradigm, with blind video scoring, we show that rhesus macaques spontaneously respond to a change in formant frequencies within the normal macaque vocal range. Lack of dishabituation to a "synthetic replica" signal demonstrates that dishabituation was not due to an artificial quality of synthetic calls, but to the formant shift itself. These results indicate that formant perception, a significant component of human voice and speech perception, is a perceptual ability shared with other primates.

9.

Auditory contrast and speaker quality variation in vowel perception

R A Fox 《The Journal of the Acoustical Society of America》1985,77(4):1552-1559

Selective adaptation and anchoring effects in speech perception have generated several different hypotheses regarding the nature of contextual contrast, including auditory/phonetic feature detector fatigue, response bias, and auditory contrast. In the present study three different seven-step [hɪd]-[hɛd] continua were constructed to represent a low F0 (long vocal tract source), a high F0 (long vocal tract source), and a high F0 (short vocal tract source), respectively. Subjects identified the tokens from each of the stimulus continua under two conditions: an equiprobable control and an anchoring condition which included an endpoint stimulus from one of the three continua occurring at least three times more often than any other single stimulus. Differential contrast effects were found depending on whether the anchor differed from the test stimuli in terms of F0, absolute formant frequencies, or both. Results were inconsistent with both the feature detector fatigue and response bias hypotheses. Rather, the obtained data suggest that vowel contrast occurs on the basis of normalized formant values, thus supporting a version of the auditory-contrast theory.

10.

Estimation of vocal dysperiodicities in disordered connected speech by means of distant-sample bidirectional linear predictive analysis

Bettens F Grenez F Schoentgen J 《The Journal of the Acoustical Society of America》2005,117(1):328-337

The article presents an analysis of vocal dysperiodicities in connected speech produced by dysphonic speakers. The processing is based on a comparison of the present speech fragment with future and past fragments. The size of the dysperiodicity estimate is zero for periodic speech signals. A feeble increase of the vocal dysperiodicity is guaranteed to produce a feeble increase of the estimate. No spurious noise boosting occurs owing to cycle insertion and omission errors, or phonetic segment boundary artifacts. Additional objectives of the study have been investigating whether deviations from periodicity are larger or more commonplace in connected speech than in sustained vowels, and whether sentences that comprise frequent voice onsets and offsets are noisier than sentences that comprise few. The corpora contain sustained vowels as well as grammatically and phonetically matched sentences. An acoustic marker that correlates with the perceived degree of hoarseness summarizes the size of the dysperiodicities. The marker values for sustained vowels have been highly correlated with those for connected speech, and the marker values for sentences that comprise few voiced/unvoiced transients have been highly correlated with the marker values for sentences that comprise many.

11.

The mutual roles of temporal glimpsing and vocal characteristics in cocktail-party listening

Vestergaard MD Fyson NR Patterson RD 《The Journal of the Acoustical Society of America》2011,130(1):429-439

At a cocktail party, listeners must attend selectively to a target speaker and segregate their speech from distracting speech sounds uttered by other speakers. To solve this task, listeners can draw on a variety of vocal, spatial, and temporal cues. Recently, Vestergaard et al. [J. Acoust. Soc. Am. 125, 1114-1124 (2009)] developed a concurrent-syllable task to control temporal glimpsing within segments of concurrent speech, and this allowed them to measure the interaction of glottal pulse rate and vocal tract length and reveal how the auditory system integrates information from independent acoustic modalities to enhance recognition. The current paper shows how the interaction of these acoustic cues evolves as the temporal overlap of syllables is varied. Temporal glimpses as short as 25 ms are observed to improve syllable recognition substantially when the target and distracter have similar vocal characteristics, but not when they are dissimilar. The effect of temporal glimpsing on recognition performance is strongly affected by the form of the syllable (consonant-vowel versus vowel-consonant), but it is independent of other phonetic features such as place and manner of articulation.

12.

Investigating the Effects of Caffeine on Phonation

Elizabeth Erickson-Levendoski Mahalakshmi Sivasankar 《Journal of voice》2011,25(5):e215

Objective

A core component of vocal hygiene programs is the avoidance of agents that may dry the vocal folds. Clinicians commonly recommend that individuals reduce caffeine intake because of its presumed dehydrating effects on the voice. However, there is little evidence that ingestion of caffeine is detrimental to voice production. The first objective of this study was to evaluate whether caffeine adversely affects voice production. The second objective was to evaluate if caffeine exacerbates the adverse phonatory effects of vocal loading.

Study Design

Prospective, double-blinded, sham-controlled study.

Methods

Sixteen healthy adults participated in two sessions where they consumed caffeine (caffeine concentration = 480 mg) or sham (caffeine concentration = 24 mg) beverages. Voice measures (phonation threshold pressure and perceived phonatory effort) were collected. Subjects then completed a vocal loading challenge and voice measures were obtained again.

Results

There were no significant differences in voice measures between the caffeine and sham conditions. Ingestion of caffeine did not adversely affect voice production (P > 0.05) or exacerbate the detrimental phonatory effects of vocal loading (P > 0.05).

Conclusions

Our findings contribute to emerging knowledge on the effects of caffeine on voice production. Recommendations to completely eliminate caffeine from the diet, as a component of a vocal hygiene program, should be evaluated on an individual basis.

13.

Investigation of Anti-Hyaluronidase Treatment on Vocal Fold Wound Healing

Bernard Rousseau Ichiro Tateya XinHong Lim Alejandro Munoz-del-Rio Diane M. Bless 《Journal of voice》2006,20(3):443-451

SUMMARY: Phytochemical constituents of medicinal plants demonstrate inhibition of tissue and bacterial hyaluronidase. Echinacoside is a caffeoyl conjugate of Echinacea with known anti-hyaluronidase properties. The purpose of this study was to investigate the effects of Echinacea on vocal fold wound healing and functional voice outcomes. Pig animal model. Methods: Vocal fold injury was induced in 18 pigs by unilateral vocal fold stripping. The uninjured vocal fold served as control. Three groups of six pigs randomly received a topical application of 300, 600, or 1,200 mg of standardized Echinacea on the injured side. Animals were euthanized after 3, 10, and 15 days of wound healing. Phonation threshold pressure and vocal economy measurements were obtained from excised larynges. Treatment outcomes were examined by comparing the animals receiving treatment with a set of 19 untreated and 5 historical controls. Treatment effects on wound healing were evaluated by histologic staining for hyaluronan and collagen. Treated larynges revealed improved vocal economy and phonation threshold pressure compared with untreated larynges. Histologically, treated vocal folds revealed stable hyaluronan content and no significant accumulation of collagen compared with control. Findings provide a favorable outcome of anti-hyaluronidase treatment on acute vocal fold wound healing and functional measures of voice.

14.

Analysis and Evaluation of a Voice-Training Program in Future Professional Voice Users

Bernadette Timmermans Marc S. De Bodt Floris L. Wuyts Paul H. Van de Heyning 《Journal of voice》2005,19(2):202-210

The goal of this study is to analyze and evaluate the effectiveness of a voice-training program. Twenty-three professional voice users received voice training for 2 years and vocal hygiene education for 1 year. The voice-training program consisted of lectures, technical workshops, and vocal coaching. The European Laryngological Society (ELS) protocol, including the Dysphonia Severity Index (DSI) and the Voice Handicap Index (VHI), was applied before training and again after 9 and 18 months of voice training. A questionnaire on daily habits was presented at study onset and after 18 months. The DSI improvement is more significant after 9 months (P=0.005) than it is after 18 months (P=0.2). On the other hand, the perceptual evaluation remained unchanged after 9 months, whereas it improved significantly after 18 months. The results of the daily habit questionnaire are disturbing: the prevalence of smoking, vocal abuse, stress, and late meals was not influenced by the lectures and remained high. This study emphasizes the need for a well-organized voice-training program that is most effective after 9 months. Given the low effectiveness of the vocal hygiene program, the concept needs revision.

15.

Changes in sustained production tasks among women with bilateral vocal nodules before and after voice therapy

Kathleen Treole Michael D. Trudeau 《Journal of voice》1997,11(4):462-469

The purpose of the current study was to determine how maximum phonation duration (MPD) of five notes (C4, D4, E4, F4, 134) sustained on /o/, sustained vowels (/i/, //, /a/, /u/), and s/z ratio (sustained /s/ and /z/) changed during voice therapy for vocal nodules. Voices of adult females before treatment and after resolution of vocal nodules were analyzed via the PM Pitch Analyzer. Treatment included tension reduction, abuse identification and elimination, laryngeal strengthening, and home exercises. Results indicate there was no significant difference in maximum phonation duration or s/z ratio before and after treatment. Results revealed that females with vocal nodules demonstrate measurements before therapy similar to measures considered to be normal in persons without vocal nodules. Application of findings to clinical practice is discussed.

16.

The effect of topical anesthesia on vocal fold motion

AD Rubin A Shah CA Moyer MM Johns 《Journal of voice》2009,23(1):128-131

The objective of this study was to determine if topical anesthesia to the larynx and pharynx affects vocal fold motion during dynamic voice evaluation with transnasal flexible endoscopy. Transnasal dynamic laryngeal examinations of 10 patients with no voice complaints were evaluated by five blinded fellowship-trained laryngologists. Each patient was examined before and after application of topical anesthetic. Reviewers rated briskness of right and left vocal fold movement and longitudinal tension on a visual analogue scale. Statistical comparisons were made between individual subject scores before and after anesthetic application. Inter-rater reliability was also assessed. No statistical difference was observed between subject scores before and after anesthetic application. Average intraclass correlation coefficients were 0.643 and 0.591 for pre- and postanesthesia scores, respectively. Application of topical anesthesia to the larynx and pharynx does not affect vocal fold motion.

17.

Chaotic vibrations of a vocal fold model with a unilateral polyp

Zhang Y Jiang JJ 《The Journal of the Acoustical Society of America》2004,115(3):1266-1269

A nonlinear model was proposed to study chaotic vibrations of vocal folds with a unilateral vocal polyp. The model study found that the vocal polyp affected glottal closure and caused aperiodic vocal fold vibrations. Using nonlinear dynamic methods, aperiodic vibrations of the vocal fold model with a polyp were attributed to low-dimensional chaos. Bifurcation diagrams showed that vocal polyp size, stiffness, and damping had important effects on vocal fold vibrations. An increase in polyp size tended to induce subharmonic patterns and chaos. This study provides a theoretical basis to model aperiodic vibrations of vocal folds with a laryngeal mass.

18.

Interarticulator cohesion within coronal consonant production

Mooshammer C Hoole P Geumann A 《The Journal of the Acoustical Society of America》2006,120(2):1028-1039

If more than one articulator is involved in the execution of a phonetic task, then the individual articulators have to be temporally coordinated with each other in a lawful manner. The present study aims at analyzing tongue-jaw cohesion in the temporal domain for the German coronal consonants [s, ʃ, t, d, n, l], i.e., consonants produced with the same set of articulators (the tongue blade and the jaw) but differing in manner of articulation. The stability of obtained interaction patterns is evaluated by varying the degree of vocal effort: comfortable and loud. Tongue and jaw movements of five speakers of German were recorded by means of electromagnetic midsagittal articulography (EMMA) during [aCa] sequences. The results indicate that (1) tongue-jaw coordination varies with manner of articulation, i.e., a later onset and offset of the jaw target for the stops compared to the fricatives, the nasal and the lateral; (2) the obtained patterns are stable across vocal effort conditions; (3) the sibilants are produced with smaller standard deviations for latencies and target positions; and (4) adjustments to the lower jaw positions during the surrounding vowels in loud speech occur during the closing and opening movement intervals and not the consonantal target phases.

19.

Vocal tract changes caused by phonation into a tube: a case study using computer tomography and finite-element modeling

Vampola T Laukkanen AM Horácek J Svec JG 《The Journal of the Acoustical Society of America》2011,129(1):310-315

Phonation into a glass tube is a voice training and therapy method that leads to beneficial effects in voice production. It has not been known, however, what changes occur in the vocal tract during and after the phonation into a tube. This pilot study examined the vocal tract shape in a female subject before, during, and after phonation into a tube using computer tomography (CT). Three-dimensional finite-element models (FEMs) of the vocal tract were derived from the CT images and used to study changes in vocal tract input impedance. When phonating on vowel [a:] the data showed tightened velopharyngeal closure and enlarged cross-sectional areas of the oropharyngeal and oral cavities during and after the tube-phonation. FEM calculations revealed an increased input inertance of the vocal tract and an increased acoustic energy radiated out of the vocal tract after the tube-phonation. The results indicate that the phonation into a tube causes changes in the vocal tract which persist after the tube is removed. These effects may help improve voice production in patients and voice professionals.

20.

Perceived phonatory effort and phonation threshold pressure across a prolonged voice loading task: a study of vocal fatigue

Ann Chang Michael P. Karnell 《Journal of voice》2004,18(4):454-466

Although the problem of vocal fatigue is not uncommon in people with voice disorders, research on objective quantifiable indicators of vocal fatigue is limited. It has been suggested that a speaker's perception of increased phonatory effort associated with periods of prolonged voice use is related to increased lung pressure required to initiate and sustain phonation. The purpose of this study was to examine the relationship between perceived phonatory effort (PPE), which was used as a subjective index of vocal fatigue, and phonation threshold pressure (PTP), a quantifiable measure defined as the minimal lung pressure required to initiate and sustain vocal fold oscillation. PTP and PPE were recorded before, during, and after five adult male and five adult female speakers engaged in a prolonged oral reading task designed to induce vocal fatigue. The results supported a direct, moderately strong relationship between PTP and PPE, particularly when PTP was measured during speech produced at comfortable and low-speaking pitch levels. No gender effects were found. PTP returned to baseline levels within 1 hour after the fatiguing task. PPE returned to baseline within 1 day. The data support the use of PTP as an objective index of vocal fatigue.