Similar literature
20 similar articles found (search time: 31 ms)
1.
This study was designed to determine if differences exist in pars recta and pars obliqua muscle activity during speech and singing. Hooked wire electrodes were implanted in the muscle bundles under direct vision during thyroid surgery in two men and three women. It was found that the pars recta and pars obliqua do not function in a similar manner across fundamental frequencies (ƒ0s), tasks, or subjects. Large inter- and intrasubject variability was evident in the contribution of the cricothyroid bundles to fundamental frequency (ƒ0) control. It is speculated that the effect of pars recta and pars obliqua contraction may be a function of individual anatomic variations.

2.
A voice range profile (VRP) was obtained from each of eight professional actors and compared with two speech range profiles (SRPs). One speech profile was obtained during the dramatic reading of a scene in the laboratory and the other during a performance on stage in a professional theater. The objective was to determine the pitch and loudness ranges used by the actors in speech relative to the VRP. The principal question of interest was whether the actors stayed within the center of the VRP, or whether they tended to drift toward the boundaries of intensity and frequency. A second question was whether the performance within the laboratory accurately reflects that of a stage performance. The results suggest that some subjects tend to exceed the center of the VRP during the stage performance. It is hypothesized that these actors may stress their vocal mechanism during performance and are more likely candidates for vocal injury.

3.
Electrolarynxes have been used as one of the rehabilitation methods for laryngectomees. Earlier electrolarynxes could not alter frequency and intensity simultaneously during conversation. Recently, we developed an electrolarynx named “Evada” (prototype so far) using a force sensing resistor (FSR) sensor that can control both frequency and intensity simultaneously during conversation. Employing three types of electrolarynxes (Evada, Servox-inton, Nu-vois), this study was undertaken to examine the functional characteristics of Evada for the normal control group and for laryngectomees. Five laryngectomees and five normal adults were asked to produce three sentences (a declarative sentence, “You stay here.”; an interrogative sentence, “You stay here?”; and an imperative sentence, “You! Stay here.”) using the three types of electrolarynxes. Frequency and intensity changes between the first and last vowels in the three sentences were calculated and analyzed statistically by paired t test. The frequency changes in the interrogative and imperative sentences were more prominent in Evada than in Servox-inton and Nu-vois. The intensity changes in the interrogative and imperative sentences were also more prominent in Evada than in Servox-inton and Nu-vois. Evada controls frequency and/or intensity by having the subject press the control button(s). Therefore, Evada appears to be better at producing intonation and contrastive stress than Nu-vois and Servox-inton.
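
As an illustration of the paired t test mentioned in this abstract, the following is a minimal sketch using only the Python standard library. The function name and the sample values are hypothetical and are not taken from the study; the abstract does not report the raw data or the exact pairing that was used.

```python
import math
import statistics


def paired_t_statistic(x, y):
    """Return (t, degrees_of_freedom) for paired samples x and y.

    Each x[i] and y[i] must come from the same subject, for example the
    F0 change produced by one speaker with two different devices.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    differences = [a - b for a, b in zip(x, y)]
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(differences)))
    return t, len(differences) - 1


# Hypothetical F0 changes (Hz) for five speakers with two devices.
device_a = [30, 42, 35, 38, 40]
device_b = [10, 15, 12, 9, 14]
t_value, dof = paired_t_statistic(device_a, device_b)
print(f"t = {t_value:.2f}, df = {dof}")
```

The resulting t value would then be compared against a t distribution with the returned degrees of freedom to obtain a p value.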

4.
The performance of the human pitch control system was characterized by measurement of the speed of pitch shift and pitch shift response speed (inverse of reaction time) at various initial pitch and loudness levels. Data from three nonsinger adult male subjects and one professional singer suggest a strong inverse correlation (r greater than 0.78) between initial pitch and rate of pitch rise. This study showed no significant relation between initial loudness and rate of pitch rise. Also, vocal response speed showed no significant relation with either initial pitch or loudness. However, it is suggested that pitch shift response speed might be related to the second formant frequency of the target vowel. A composite index of pitch control performance capacity was defined as the product of response speed and vocal fold contractile velocity. From experimental data, the composite index was able to reflect a distinct 74% superior performance by the professional singer (relative to the average maximum performance capacity of nonsingers). It is suggested that the product-based composite index of performance capacity can serve as a sensitive means for vocal proficiency determination.
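
The composite index described in this abstract can be sketched as follows. This is only an illustration of the stated definition (response speed, taken as the inverse of reaction time, multiplied by contractile velocity); the function names, the units, and all numeric values are assumptions, since the abstract gives none of them.

```python
def composite_index(reaction_time_s, contractile_velocity):
    """Product of response speed (1 / reaction time) and contractile velocity.

    The velocity unit is left unspecified here because the abstract does
    not state one; any consistent unit works for relative comparisons.
    """
    if reaction_time_s <= 0:
        raise ValueError("reaction time must be positive")
    response_speed = 1.0 / reaction_time_s
    return response_speed * contractile_velocity


def percent_superiority(singer_index, nonsinger_indices):
    """Percentage by which one index exceeds the mean of a reference group."""
    reference = sum(nonsinger_indices) / len(nonsinger_indices)
    return (singer_index / reference - 1.0) * 100.0


# Hypothetical values chosen only to show the arithmetic.
nonsingers = [composite_index(0.20, 10.0), composite_index(0.25, 12.5)]
singer = composite_index(0.16, 13.92)
print(f"{percent_superiority(singer, nonsingers):.0f}% above the nonsinger mean")
```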

5.
SUMMARY: This study identified that, physiologically, the superior pharyngeal constrictor muscle at the level of the base of the tongue contributes to retrusive movement of the tongue with constriction of the mid-pharyngeal cavity and, along with the genioglossus muscle, possesses unique properties in terms of motor speech control. In a kinematic study involving trans-nasal fiberscopy and lateral X-ray fluorography, retrusive movement of the tongue was highly correlated with constrictive movement of the mid-pharyngeal cavity. An electromyographic study revealed that the superior pharyngeal constrictor muscle at the level of the base of the tongue contributes to retrusive movement of the tongue and that the genioglossus muscle contributes to protrusive movement. We also noted that this relationship between the activities of these two muscles changed in response to postural changes during vowel production, without changes in the acoustic features. These findings suggest that these two muscles not only act antagonistically to produce retrusive and protrusive movement of the tongue, but also complement each other to conserve the shape of the vocal tract for speech production. The functional relationship between these two muscles could contribute to the consecutive movement of human speech production under various conditions and might be useful when applying rehabilitation approaches for patients with neurological speech and swallowing disorders.

6.
Although advances in techniques for image acquisition and analysis have facilitated the direct measurement of three-dimensional vocal tract air space shapes associated with specific speech phonemes, little information is available with regard to changes in three-dimensional (3-D) vocal tract shape as a function of vocal register, pitch, and loudness. In this study, 3-D images of the vocal tract during falsetto and chest register phonations at various pitch and loudness conditions were obtained using electron beam computed tomography (EBCT). Detailed measurements and differences in vocal tract configuration and formant characteristics derived from the eight measured vocal tract shapes are reported.

7.
The forces and torques governing effective two-dimensional (2D) translation and rotation of the laryngeal cartilages (cricoid, thyroid, and arytenoids) are quantified on the basis of more complex three-dimensional movement. The motions between these cartilages define the elongation and adduction (collectively referred to as posturing) of the vocal folds. Activations of the five intrinsic laryngeal muscles, the cricothyroid, thyroarytenoid, lateral cricoarytenoid, posterior cricoarytenoid, and interarytenoid are programmed as inputs, in isolation and in combination, to produce the dynamics of 2D posturing. Parameters for the muscles are maximum active stress, passive stress, activation time, contraction time, and maximum shortening velocity. The model accepts measured electromyographic signals as inputs. A repeated adductory-abductory gesture in the form /hi-hi-hi-hi-hi/ is modeled with electromyographic inputs. Movement and acoustic outputs are compared between simulation and measurement.

8.
Measurements on the inverse filtered airflow waveform and of estimated average transglottal pressure and glottal airflow were made from syllable sequences in low, normal, and high pitch for 25 male and 20 female speakers. Correlation analyses indicated that several of the airflow measurements were more directly related to voice intensity than to fundamental frequency (F0). Results suggested that pressure may have different influences in low and high pitch in this speech task. It is suggested that unexpected results of increased pressure in low pitch were related to maintaining voice quality, that is, avoiding vocal fry. In high pitch, the increased pressure may serve to maintain vocal fold vibration. The findings suggested different underlying laryngeal mechanisms and vocal adjustments for increasing and decreasing F0 from normal pitch.

9.
Previous studies have demonstrated that motor control of segmental features of speech relies to some extent on sensory feedback. Control of voice fundamental frequency (F0) has been shown to be modulated by perturbations in voice pitch feedback during various phonatory tasks and in Mandarin speech. The present study was designed to determine if voice F0 is modulated in a task-dependent manner during production of suprasegmental features of English speech. English speakers received pitch-modulated voice feedback (±50, 100, and 200 cents, 200 ms duration) during a sustained vowel task and a speech task. Response magnitudes during speech (mean 31.5 cents) were larger than during the vowels (mean 21.6 cents), response magnitudes increased as a function of stimulus magnitude during speech but not vowels, and responses to downward pitch-shift stimuli were larger than those to upward stimuli. Response latencies were shorter in speech (mean 122 ms) compared to vowels (mean 154 ms). These findings support previous research suggesting the audio-vocal system is involved in the control of suprasegmental features of English speech by correcting for errors between voice pitch feedback and the desired F0.
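
Because the stimulus and response magnitudes above are given in cents, a short sketch of the standard conversion may help: an interval in cents is 1200 times the base-2 logarithm of the frequency ratio, so 100 cents is one equal-tempered semitone. The function names and the 200 Hz baseline are illustrative assumptions, not values from the study.

```python
import math


def cents_between(f_hz, reference_hz):
    """Interval from reference_hz to f_hz in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f_hz / reference_hz)


def shift_by_cents(f_hz, cents):
    """Frequency obtained by shifting f_hz up (or down, if negative) by cents."""
    return f_hz * 2.0 ** (cents / 1200.0)


# Hypothetical 200 Hz voice with the stimulus magnitudes named in the abstract.
baseline_hz = 200.0
for stimulus in (-200, -100, -50, 50, 100, 200):
    shifted = shift_by_cents(baseline_hz, stimulus)
    print(f"{stimulus:+4d} cents -> {shifted:.2f} Hz")
```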

10.
Journal of Voice, 2019, 33(6): 851-859
Purpose: The pitch-shift reflex (PSR) is the adaptation of the fundamental frequency during phonation and speech and describes the auditory feedback control. Speakers without voice and speech disorders mostly compensate for a pitch change in the auditory feedback by adapting their fundamental frequency in the opposite direction. Dysphonic patients often display problems with the auditory perception and control of their voice during therapy. Our study focuses on the auditory and kinesthetic control mechanisms of patients with muscle tension dysphonia (MTD) and speakers without voice and speech problems. The main purpose of the study is to compare the functionality of the control mechanisms within phonation and speech between patients with MTD and normal speakers. Method: Sixty-one healthy subjects (17 male, 44 female) and 22 patients with MTD (7 male, 15 female) participated in two paradigms, a sustained phonation (vowel /a/) and speech ([ˈmama]). Within both paradigms the fundamental frequency of the auditory feedback was increased synthetically. For the analysis of the PSR, the electroencephalogram, electroglottography, the voice signal, and high-speed endoscopy data were recorded simultaneously. The PSR in the electroencephalogram was detected via the N100 and the mismatch negativity. Statistical tests were applied for the detection of the PSR in the physiological response within the electroglottography, voice, and high-speed endoscopy signals. The results were compared between both groups. Results: No differences were found between the controls and patients with MTD regarding latency and magnitude of the perception of the pitch shift in either paradigm, but differences were found in the magnitude of the behavioral response. Differences were also found for both groups between the “no pitch” and “pitch” conditions of the two paradigms regarding vocal fold dynamics and voice quality. Patients with MTD showed more vibrational irregularities during the PSR than the controls, especially regarding the symmetry of vocal fold dynamics. Conclusion: Patients with MTD seem to have a disturbed interaction between the auditory and kinesthetic feedback, inducing the execution of an overriding behavioral response.

11.
Understanding speech in background noise, talker identification, and vocal emotion recognition are challenging for cochlear implant (CI) users due to poor spectral resolution and limited pitch cues with the CI. Recent studies have shown that bimodal CI users, that is, those CI users who wear a hearing aid (HA) in their non-implanted ear, receive benefit for understanding speech both in quiet and in noise. This study compared the efficacy of talker-identification training in two groups of young normal-hearing adults, listening to either acoustic simulations of unilateral CI or bimodal (CI+HA) hearing. Training resulted in improved identification of talkers for both groups with better overall performance for simulated bimodal hearing. Generalization of learning to sentence and emotion recognition also was assessed in both subject groups. Sentence recognition in quiet and in noise improved for both groups, no matter if the talkers had been heard during training or not. Generalization to improvements in emotion recognition for two unfamiliar talkers also was noted for both groups with the simulated bimodal-hearing group showing better overall emotion-recognition performance. Improvements in sentence recognition were retained a month after training in both groups. These results have potential implications for aural rehabilitation of conventional and bimodal CI users.

12.
Active and passive characteristics of the canine cricothyroid muscle were investigated through a series of experiments conducted in vitro and compared with their counterparts in the thyroarytenoid muscle. Samples from separate portions of canine cricothyroid muscle, namely, the pars recta and pars obliqua, were dissected from dog larynges excised a few minutes before death and kept in Krebs-Ringer solution at a temperature of 37 ± 1 °C and a pH of 7.4 ± 0.05. Active tetanic stress was obtained in isometric and isotonic conditions by applying field stimulation to the muscle samples through a pair of parallel-plate platinum electrodes and using a train of square pulses of 0.1-ms duration and 85-V amplitude. Force and elongation of the samples were obtained electronically with a dual-servo system (ergometer). The results indicate that the dynamic response of the canine cricothyroid muscle is almost twice as slow as that of the thyroarytenoid muscle. The average 50% tetanic contraction times for pars recta and pars obliqua were 84 ms and 109 ms, respectively, in comparison to 50 ms for thyroarytenoid. The examination of force-velocity response of this muscle indicates a maximum shortening velocity of 2 to 3 times its length per second, which is about half of the thyroarytenoid shortening speed. The passive properties of the pars recta and pars obliqua portions are similar to those of thyroarytenoid muscle.

13.
Both in normal speech voice and in some types of pathological voice, adjacent vocal cycles may alternate in amplitude or period, or both. When this occurs, the determination of voice fundamental frequency (defined as number of vocal cycles per second) becomes difficult. The present study attempts to address this issue by investigating how human listeners perceive the pitch of alternate cycles. As stimuli, vowels /a/ and /i/ were synthesized with fundamental frequencies at 140 Hz and 220 Hz, and the effect of alternate cycles was simulated with both amplitude- and frequency-modulation of the glottal volume velocity waveform. Subjects were asked to judge the pitch of the modulated vowels in reference to vowels without modulation. The results showed that (a) perceived pitch became lower as the amount of modulation increased, and the effect seems to be more dramatic than would be predicted by existing hypotheses, (b) perceived pitch differed across vowels, fundamental frequencies, and modulation types, that is, amplitude versus frequency modulation, and (c) the prediction of perceived pitch was best made in the frequency domain in terms of subharmonic-to-harmonic ratio. These findings provide useful information on how we should assess the pitch of alternate cycles. They may also be helpful in developing more robust pitch determination algorithms.
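
To see why alternating cycle amplitudes introduce energy at half the fundamental frequency, the toy sketch below builds an impulse train whose every second pulse is attenuated and compares the spectral magnitude at F0/2 with that at F0. This is a deliberately simplified stand-in, not the subharmonic-to-harmonic ratio measure used in the study, and all names and parameter values are assumptions. For this idealized signal the ratio works out analytically to depth / (2 - depth).

```python
import cmath


def alternating_pulse_train(n_cycles, samples_per_cycle, depth):
    """One impulse per cycle; odd-numbered cycles are scaled by (1 - depth)."""
    signal = [0.0] * (n_cycles * samples_per_cycle)
    for cycle in range(n_cycles):
        amplitude = 1.0 if cycle % 2 == 0 else 1.0 - depth
        signal[cycle * samples_per_cycle] = amplitude
    return signal


def dft_magnitude(signal, k):
    """Magnitude of DFT bin k, computed directly from the definition."""
    n = len(signal)
    total = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, x in enumerate(signal))
    return abs(total)


def subharmonic_to_fundamental(depth, n_cycles=8, samples_per_cycle=16):
    """Ratio of spectral magnitude at F0/2 to that at F0 (n_cycles must be even)."""
    signal = alternating_pulse_train(n_cycles, samples_per_cycle, depth)
    fundamental = dft_magnitude(signal, n_cycles)        # bin at F0
    subharmonic = dft_magnitude(signal, n_cycles // 2)   # bin at F0 / 2
    return subharmonic / fundamental


for depth in (0.0, 0.25, 0.5, 0.75):
    print(f"depth {depth:.2f}: ratio {subharmonic_to_fundamental(depth):.3f}")
```

The ratio grows with modulation depth, which is consistent in direction with the reported lowering of perceived pitch as modulation increases.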

14.
A method for the analysis of vocal tract parameters is developed, aimed at the quantitative analysis of rigidity from speech signals of Parkinsonian patients. The cross-sectional area function of the vocal tract is calculated using pitch synchronous autoregressive moving average (ARMA) analysis. In Parkinsonian subjects, the changes in cross-sectional area during the utterance of sustained sounds are attributed to both Parkinsonian tremor and rigidity. In order to isolate the effects of the rigidity on the vocal tract from those of the tremor, an adaptive tremor cancellation (ATC) algorithm is developed, based on the correlation of tremor signals extracted from different locations of the speech production system.

15.
This study examined speech breathing patterns during reading by women with bilateral vocal fold nodules judged as mildly dysphonic and by women without vocal nodules. Although it might be predictable that the speech breathing patterns of individuals with laryngeal dysfunction will differ from those without laryngeal dysfunction, there is a lack of empirical data to support such assumptions. The results of the current study indicated that glottal airflow was greater during reading for the women with vocal nodules and that a larger volume of air was expended both per syllable and per breath group during reading. The rate of speech did not significantly differ between the two groups of women. There was no significant difference for the average duration of the breath groups and no significant difference for the number of syllables spoken per breath group. Additionally, both groups of women demonstrated a similar pattern of inspiratory pause location during the reading. The results suggest that speech breathing patterns associated with dysphonia be examined independently to distinguish specifically the nature of the interaction between the laryngeal dysfunction and the speech breathing pattern. Certainly, more information on how the severity of a voice disorder influences speech breathing is necessary.

16.
Recent simulations of continuous interleaved sampling (CIS) cochlear implant speech processors have used acoustic stimulation that provides only weak cues to pitch, periodicity, and aperiodicity, although these are regarded as important perceptual factors of speech. Four-channel vocoders simulating CIS processors have been constructed, in which the salience of speech-derived periodicity and pitch information was manipulated. The highest salience of pitch and periodicity was provided by an explicit encoding, using a pulse carrier following fundamental frequency for voiced speech, and a noise carrier during voiceless speech. Other processors included noise-excited vocoders with envelope cutoff frequencies of 32 and 400 Hz. The use of a pulse carrier following fundamental frequency gave substantially higher performance in identification of frequency glides than did vocoders using envelope-modulated noise carriers. The perception of consonant voicing information was improved by processors that preserved periodicity, and connected discourse tracking rates were slightly faster with noise carriers modulated by envelopes with a cutoff frequency of 400 Hz compared to 32 Hz. However, consonant and vowel identification, sentence intelligibility, and connected discourse tracking rates were generally similar through all of the processors. For these speech tasks, pitch and periodicity beyond the weak information available from 400 Hz envelope-modulated noise did not contribute substantially to performance.

17.
Over the last few decades, researchers have been investigating the mechanisms involved in speech production. Image analysis can be a valuable aid in the understanding of the morphology of the vocal tract. The application of magnetic resonance imaging to study these mechanisms has been proven to be reliable and safe. We have applied deformable models in magnetic resonance images to conduct an automatic study of the vocal tract; mainly, to evaluate the shape of the vocal tract in the articulation of some European Portuguese sounds, and then to successfully automatically segment the vocal tract's shape in new images. Thus, a point distribution model has been built from a set of magnetic resonance images acquired during artificially sustained articulations of 21 sounds, which successfully extracts the main characteristics of the movements of the vocal tract. The combination of that statistical shape model with the gray levels of its points is subsequently used to build active shape models and active appearance models. Those models have then been used to segment the modeled vocal tract into new images in a successful and automatic manner. The computational models have thus been revealed to be useful for the specific area of speech simulation and rehabilitation, namely to simulate and recognize the compensatory movements of the articulators during speech production.

18.
Acoustic analysis of the speaking voice after thyroidectomy (cited 1 time: 0 self-citations, 1 citation by others)
Voices of 47 female patients were analyzed before and after thyroidectomy, with preservation of the recurrent and superior laryngeal nerves and normal vocal fold motility during the observation period. A mean decrease of the speaking fundamental frequency (SFF) of 12 Hz was found on day 4; in 8 patients the postoperative vocal pitch was more than 2 semitones lower. The distance between the highest and lowest F0 during speaking was diminished (speech was more monotone) and the vocal jitter was elevated. In the frequency spectrum, there was a diminished prominence of the harmonics. The other spectral parameters (such as the slope of the spectrum and the H1/H2 ratio) were unchanged. All changes had disappeared by the fifteenth day, except for a lower SFF (>2 semitones) in 2 cases. It is concluded that after normal dissection of the laryngeal nerves, and in the absence of vocal fold paresis, other reasons for voice changes immediately after thyroidectomy remain: alterations in the neck muscles, in the laryngeal mucosa, and in the patient's general condition. Although the effects seem limited and of short duration, knowledge of them is helpful when informing the patient before thyroid surgery.

19.
Relying on a corpus of thirty narrative discourses, the roles of pitch and duration of prosodic words in sentence accent were studied in discourse context. First, the pitch was normalized. Then, according to pitch range, sentences and prosodic words were each classified into three ranks: strengthened, normal, and weakened. At the same time, sentence accent was classified into two levels, primary and secondary, by perceptual evaluation. The results showed that the pitch range of prosodic words relative to that of the sentence contributed dominantly to sentence accent. Furthermore, the roles of pitch and duration in sentence accent were affected interactively by the rank of the sentence and of the prosodic words. In normal prosodic words, primary sentence accents were realized by pitch and duration acting together, while secondary sentence accents mainly depended on the variation of pitch. In strengthened prosodic words, the role of duration in sentence accent was more significant when the pitch range of the sentence was more compressed. Finally, it was found that the correlation between pitch and duration was influenced primarily by the strength of prosodic words: in weakened, normal, and strengthened prosodic words, the correlations between pitch and duration were positive, null, and negative, respectively.

20.
When listening to natural speech, listeners are fairly adept at using cues such as pitch, vocal tract length, prosody, and level differences to extract a target speech signal from an interfering speech masker. However, little is known about the cues that listeners might use to segregate synthetic speech signals that retain the intelligibility characteristics of speech but lack many of the features that listeners normally use to segregate competing talkers. In this experiment, intelligibility was measured in a diotic listening task that required the segregation of two simultaneously presented synthetic sentences. Three types of synthetic signals were created: (1) sine-wave speech (SWS); (2) modulated noise-band speech (MNB); and (3) modulated sine-band speech (MSB). The listeners performed worse for all three types of synthetic signals than they did with natural speech signals, particularly at low signal-to-noise ratio (SNR) values. Of the three synthetic signals, the results indicate that SWS signals preserve more of the voice characteristics used for speech segregation than MNB and MSB signals. These findings have implications for cochlear implant users, who rely on signals very similar to MNB speech and thus are likely to have difficulty understanding speech in cocktail-party listening environments.
