Similar articles
20 similar articles found.
1.
This article reviews recent experimental and clinical literature on the central neural mechanisms involved in vocalization. Various parts of the cerebral cortex, limbic system, basal ganglia, and extrapyramidal system have been shown in human and animal studies to be important in vocalization, but the exact function of these areas with regard to vocal control is unclear. The limbic system and diencephalon project to the midbrain periaqueductal gray (PAG), which may be important for coordination of various muscle groups involved in vocalization. The PAG neurons project to the reticular formation, nucleus retroambiguus, and nucleus ambiguus. Neurons in the nucleus retroambiguus seem to be involved in control of neurons related to the respiratory or laryngeal systems. Different types of motoneurons of the laryngeal muscles in the nucleus ambiguus are related to various functions such as vocalization, swallowing, and respiration.

2.
The present study describes the laryngeal and respiratory muscle activity associated with vocalizations in macaque monkeys. During the bark vocalization, a short, aperiodic call, the cricothyroid, thyroarytenoid, rectus abdominis, and intercostals were active while the posterior cricoarytenoid and diaphragm were quiet. During the coo vocalization, a longer, clear call, the cricothyroid, thyroarytenoid, intercostals, rectus abdominis, and diaphragm were active. In one monkey, the posterior cricoarytenoid was also active during the call, while in another monkey it was not. Laryngeal muscle activity was correlated with the amplitude and duration of the coo call. Results suggest that the amplitude and duration differences between calls are determined primarily by laryngeal modification of the airflow, and that the differences in posterior cricoarytenoid activity may be due to differences in voice intensity.

3.
Electrical stimulation of the midbrain was used to elicit a variety of vocalizations from six anesthetized dogs. This study was conducted to investigate the ranges of and relationships between fundamental frequency of the vocalizations (F0) and tracheal pressure (Pt) produced during the vocalizations. The vocalizations were described according to type (growl, howl, and whine); F0 and Pt, as well as patterns of laryngeal muscle activity, were examined for each vocalization type. Natural-sounding growl and howl vocalizations were elicited from five dogs; three dogs also produced whines. With few exceptions, F0 was categorically different for the three vocalization types (low for growls, average for howls, very high for whines). Pt values overlapped for the three vocalization types, although, on average, howls were produced with greater Pt than growls. Patterns and degrees of laryngeal muscle activity varied across and within vocalization types, but general findings were consistent with the presumed function of most of the muscles. Laryngeal muscle activity may help explain some of the variability in the acoustic and aerodynamic data.

4.
The cricothyroid muscle in voicing control (total citations: 1; self-citations: 0; citations by others: 1)
Initiation and maintenance of vibrations of the vocal folds require suitable conditions of adduction, longitudinal tension, and transglottal airflow. Thus manipulation of adduction/abduction, stiffening/slackening, or degree of transglottal flow may, in principle, be used to determine the voicing status of a speech segment. This study explores the control of voicing and voicelessness in speech with particular reference to the role of changes in the longitudinal tension of the vocal folds, as indicated by cricothyroid (CT) muscle activity. Electromyographic recordings were made from the CT muscle in two speakers of American English and one speaker of Dutch. The linguistic material consisted of reiterant speech made up of CV syllables where the consonants were voiced and voiceless stops, fricatives, and affricates. Comparison of CT activity associated with the voiced and voiceless consonants indicated a higher level for the voiceless consonants than for their voiced cognates. Measurements of the fundamental frequency (F0) at the beginning of a vowel following the consonant show the common pattern of higher F0 after voiceless consonants. For one subject, there was no difference in cricothyroid activity for voiced and voiceless affricates; in this case, the consonant-induced variations in the F0 of the following vowel were also less robust. Consideration of timing relationships between the EMG curves for voiced and voiceless consonants suggests that the differences most likely reflect control of vocal-fold tension for maintenance or suppression of phonatory vibrations. The same mechanism also seems to contribute to the well-known difference in F0 at the beginning of vowels following voiced and voiceless consonants.

5.
In normal speech, coordinated activities of the intrinsic laryngeal muscles suspend the glottal sound during voiceless consonants, automatically realizing voicing control. In electrolaryngeal speech, however, the lack of voicing control is one cause of unclear voice: voiceless consonants tend to be misheard as the corresponding voiced consonants. In the present work, we developed an intra-oral vibrator with an intra-oral pressure sensor that detected the utterance of voiceless phonemes during intra-oral electrolaryngeal speech, and demonstrated that voicing control based on intra-oral pressure could improve the intelligibility of the speech. The test voices were obtained from one electrolaryngeal speaker and one normal speaker. We first used speech analysis software to investigate how the voice onset time (VOT) and first formant (F1) transition of the test consonant-vowel syllables contributed to voiceless/voiced contrasts, and developed an adequate voicing control strategy. We then compared the intelligibility of consonant-vowel syllables in intra-oral electrolaryngeal speech with and without online voicing control. The increase in intra-oral pressure, typically with a peak ranging from 10 to 50 gf/cm², could reliably identify the utterance of voiceless consonants. The speech analysis and intelligibility test then demonstrated that a short VOT caused the misidentification of the voiced consonants due to a clear F1 transition. Finally, taking these results together, the online voicing control, which suspended the prosthetic tone while the intra-oral pressure exceeded 2.5 gf/cm² and during the 35 milliseconds that followed, proved effective in improving the voiceless/voiced contrast.
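
The gating rule stated at the end of this abstract (tone suspended while intra-oral pressure exceeds 2.5 gf/cm² and for the following 35 ms) can be illustrated with a minimal sketch. The function name `tone_enabled`, the per-sample list representation, and the sample rate are assumptions made for illustration only; the abstract does not describe how the authors implemented the control.

```python
def tone_enabled(pressure_gf_cm2, sample_rate_hz, threshold=2.5, hold_ms=35.0):
    """Return one boolean per pressure sample: True where the tone is on.

    Hypothetical helper: the threshold (2.5 gf/cm^2) and hold time (35 ms)
    come from the abstract; everything else is an illustrative assumption.
    """
    hold_samples = int(round(hold_ms * sample_rate_hz / 1000.0))
    remaining = 0  # samples left in the post-threshold hold period
    enabled = []
    for p in pressure_gf_cm2:
        if p > threshold:
            # Pressure above threshold: suspend the tone and re-arm the hold.
            remaining = hold_samples
            enabled.append(False)
        elif remaining > 0:
            # Pressure has dropped, but the hold period is still running.
            remaining -= 1
            enabled.append(False)
        else:
            enabled.append(True)
    return enabled


# Example: 1 kHz sampling, a 2-sample pressure burst followed by silence.
pressure = [0.0, 0.0, 3.0, 3.0] + [0.0] * 40
mask = tone_enabled(pressure, sample_rate_hz=1000)
```

With these inputs the tone is off for the two burst samples and for the 35 samples after them, then resumes.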

6.
Mammalian vocal production mechanisms are still poorly understood despite their significance for theories of human speech evolution. In particular, it is still unclear to what degree mammals are capable of actively controlling vocal-tract filtering, a defining feature of human speech production. To address this issue, a detailed acoustic analysis of the alarm vocalizations of free-ranging Diana monkeys was conducted. These vocalizations are especially interesting because they convey semantic information about two of the monkeys' natural predators, the leopard and the crowned eagle. Here, vocal tract and sound source parameters in Diana monkey alarm vocalizations are described. It is found that a vocalization-initial downward formant transition distinguishes most reliably between eagle and leopard alarm vocalizations. This finding is discussed as an indication of articulation and alternatively as the result of a strong nasalization effect. It is suggested that the formant modulation is the result of active vocal filtering used by the monkeys to encode semantic information, an ability previously thought to be restricted to human speech.

7.
Respiratory abdominal movements associated with vocalization were recorded in awake squirrel monkeys. Several call types, such as peeping, trilling, cackling, and err-chucks, were accompanied by large vocalization-correlated respiratory movements (VCRM) that started before vocalization. During purring, in contrast, only small VCRM were recorded that started later after vocal onset. VCRM during trill calls, a vocalization type with repetitive frequency modulation, showed a modulation in the rhythm of the frequency changes. A correlation with amplitude modulation was also present, but more variable. As high frequencies need a higher lung pressure for production than low frequencies, the modulation of VCRM seems to serve to optimize the lung pressure in relation to the vocalization frequency. The modulation, furthermore, may act as a mechanism to produce different trill variants. During err-chucks and staccato peeps, which show a large amplitude modulation, a nonmodulated VCRM occurred. This indicates the existence of a laryngeal amplitude-controlling mechanism that is independent from respiration.

8.
Supraglottic activity was rated from flexible endoscopic video recordings of subjects with normal laryngeal structure and function as they sustained vowels and repeated syllables and sentences. Judges rated these recordings for false vocal fold (FVF) adduction and anterior-to-posterior (A-P) compression at the initiation of the speech task, throughout the whole speech task (static supraglottic activity), and as brief individual adductions within a speech task (dynamic supraglottic activity). Significant differences in A-P (p < 0.0003) and FVF (p < 0.0000001) compression were found between tasks. Dynamic FVF activity was associated with glottal stops. Static A-P and FVF activities were present in males significantly more (p < 0.0001) than females. FVF activity associated with speech initiation was found in females significantly more (p = 0.0256) than males. Supraglottic activity plays a role in normal speech production, and should not necessarily be considered suggestive of a voice use pattern with excessive muscle tension.

9.
The objective of this study was to investigate the underlying laryngeal mechanisms during the specific human 4-kHz vocalization. The laryngeal configuration during this vocalization was measured using high-resolution computerized tomographic scan and videostrobolaryngoscopy. The color Doppler imaging (CDI) of medical ultrasound was used to detect the vibrations of glottal and supraglottal mucosa. During the 4-kHz vocalization, the ventricular folds were adducted in the shape of a bimodal chink and the vocal folds were shaped as a "V" with an opening at the posterior glottis. In the coronal view, the laryngeal ventricles had collapsed and a divergent-shaped conduit was observed at the posterior portion of the larynx. The surface mucosa vibration detected by CDI was noted over the bilateral ventricular folds and aryepiglottic folds. The vibration displacement was estimated to be on the order of 0.1 mm. This vibration amplitude was too small to be detected in videostrobolaryngoscopy. The laryngeal configuration and CDI data suggested a diffuser jet with periodic vorticity bursts in the larynx producing the 4-kHz voice.

10.
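Detection of Mandarin voiced speech in noisy environments (total citations: 1; self-citations: 0; citations by others: 1)
To detect Mandarin voiced speech at low signal-to-noise ratios (SNR) and in complex noise environments, a robust voiced-speech detection method is proposed based on the harmonic structure of voiced speech. An improved spectral tracking algorithm yields a cluster of spectral lines that characterizes the harmonic properties of voiced speech; harmonic features extracted from this cluster serve as the basis for detecting Mandarin voiced speech. In comparative voiced-speech detection experiments across different SNRs and noise environments, the method outperformed traditional methods in every condition, with a correct detection rate about 30% higher than that of traditional methods at 0 dB SNR. The experimental results show that the method detects voiced speech well both at low SNR and in nonstationary, complex noise environments.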

11.
The voiced bilabial fricative /β:/ has been used as a vocal exercise. The present study investigated the effects of the exercise on voice production and voice source. This study compared vowel phonation on the syllable /a:p/ with the production of the exercise and vowel phonation before and immediately after the exercise. The methods were (a) dual-channel electroglottography, from which the vertical laryngeal position was derived, (b) electromyography using surface electrodes, and (c) inverse filtering of the acoustic signal to obtain an estimate of the voice source. In the production of /β:/ as compared with vowel phonation in most of the cases, the vertical laryngeal position seemed to be higher, the muscular activity of the larynx lower, and the slope of the voice source spectrum steeper. In vowel phonation after the exercise, the muscular activity seemed to be lower in most cases, although the voice source remained unchanged. This seems to indicate improved vocal economy.

12.
Auditory feedback influences human speech production, as demonstrated by studies using rapid pitch and loudness changes. Feedback has also been investigated using the gradual manipulation of formants in adaptation studies with whispered speech. In the work reported here, the first formant of steady-state isolated vowels was unexpectedly altered within trials for voiced speech. This was achieved using a real-time formant tracking and filtering system developed for this purpose. The first formant of the vowel /ɛ/ was manipulated 100% toward either /æ/ or /ɪ/, and participants responded by altering their production, with average F1 compensation as large as 16.3% and 10.6% of the applied formant shift, respectively. Compensation was estimated to begin <460 ms after stimulus onset. The rapid formant compensations found here suggest that auditory feedback control is similar for both F0 and formants.

13.
For stimuli modeling stop consonants varying in the acoustic correlates of voice onset time (VOT), human listeners are more likely to perceive stimuli with lower f0's as voiced consonants, a pattern of perception that follows regularities in English speech production. The present study examines the basis of this observation. One hypothesis is that lower f0's enhance perception of voiced stops by virtue of perceptual interactions that arise from the operating characteristics of the auditory system. A second hypothesis is that this perceptual pattern develops as a result of experience with f0-voicing covariation. In a test of these hypotheses, Japanese quail learned to respond to stimuli drawn from a series varying in VOT through training with one of three patterns of f0-voicing covariation. Voicing and f0 varied in the natural pattern (shorter VOT, lower f0), in an inverse pattern (shorter VOT, higher f0), or in a random pattern (no f0-voicing covariation). Birds trained with stimuli that had no f0-voicing covariation exhibited no effect of f0 on response to novel stimuli varying in VOT. For the other groups, birds' responses followed the experienced pattern of covariation. These results suggest f0 does not exert an obligatory influence on categorization of consonants as [VOICE] and emphasize the learnability of covariation among acoustic characteristics of speech.

14.
Irregularities in voiced speech are often observed as a consequence of vocal fold lesions, paralyses, and other pathological conditions. Many of these instabilities are related to the intrinsic nonlinearities in the vibrations of the vocal folds. In this paper, bifurcations in voice signals are analyzed using narrow-band spectrograms. We study sustained phonation of patients with laryngeal paralysis and data from an excised larynx experiment. These spectrograms are compared with computer simulations of an asymmetric 2-mass model of the vocal folds. © 1995 American Institute of Physics.

15.
The well-known speech production model is considered, where the speech signal is modeled as the output of an all-pole filter driven either by some white noise sequence (unvoiced speech) or by the sum of a periodic excitation and a noise sequence (voiced speech). Approximate maximum-likelihood (ML) estimation algorithms for the unvoiced case are well known. The ML estimator of the parameters is obtained for the voiced speech model. These parameters consist of the parameters of the periodic excitation (pitch parameters) and the parameters of the filter [linear prediction coefficient (LPC) parameters]. The results of the application of the algorithm on simulated and on real speech data are presented.
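
The source-filter model described in this abstract can be sketched as a short synthesis routine: an all-pole filter driven by white noise alone (unvoiced) or by a periodic excitation plus noise (voiced). This is only a forward simulation of the model, not the ML estimator the abstract refers to. The function name `synthesize`, the use of a unit impulse train as the periodic excitation, and the sign convention s[n] = e[n] - Σ a_k s[n-k] are assumptions for illustration.

```python
import random


def synthesize(lpc, n_samples, pitch_period=None, pulse_gain=1.0,
               noise_std=0.1, seed=0):
    """Generate samples from an all-pole model (hypothetical helper).

    lpc          -- coefficients [a_1, ..., a_p]; assumed convention:
                    s[n] = e[n] - sum_k a_k * s[n - k]
    pitch_period -- None for unvoiced speech (noise only); an integer number
                    of samples for voiced speech (impulse train plus noise)
    """
    rng = random.Random(seed)
    s = []
    for n in range(n_samples):
        # Excitation: white Gaussian noise, plus a pulse at each pitch period.
        e = rng.gauss(0.0, noise_std)
        if pitch_period is not None and n % pitch_period == 0:
            e += pulse_gain
        # All-pole recursion over the previously generated samples.
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * s[n - k]
        s.append(acc)
    return s


unvoiced = synthesize([-0.5], 200)                  # noise-driven
voiced = synthesize([-0.5], 200, pitch_period=80)   # pulses plus noise
```

An estimator of the kind the abstract describes would work in the opposite direction, recovering `lpc` and the pitch parameters from observed samples.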

16.
Although both perceived vocal effort and intensity are known to influence the perceived distance of speech, little is known about the processes listeners use to integrate these two parameters into a single estimate of talker distance. In this series of experiments, listeners judged the distances of prerecorded speech samples presented over headphones in a large open field. In the first experiment, virtual synthesis techniques were used to simulate speech signals produced by a live talker at distances ranging from 0.25 to 64 m. In the second experiment, listeners judged the apparent distances of speech stimuli produced over a 60-dB range of different vocal effort levels (production levels) and presented over a 34-dB range of different intensities (presentation levels). In the third experiment, the listeners judged the distances of time-reversed speech samples. The results indicate that production level and presentation level influence distance perception differently for each of three distinct categories of speech. When the stimulus was high-level voiced speech (produced above 66 dB SPL 1 m from the talker's mouth), the distance judgments doubled with each 8-dB increase in production level and each 12-dB decrease in presentation level. When the stimulus was low-level voiced speech (produced at or below 66 dB SPL at 1 m), the distance judgments doubled with each 15-dB increase in production level but were relatively insensitive to changes in presentation level at all but the highest intensity levels tested. When the stimulus was whispered speech, the distance judgments were unaffected by changes in production level and only decreased with increasing presentation level when the intensity of the stimulus exceeded 66 dB SPL. The distance judgments obtained in these experiments were consistent across a range of different talkers, listeners, and utterances, suggesting that voice-based distance cueing could provide a robust way to control the apparent distances of speech sounds in virtual audio displays.
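
The doubling rule reported for high-level voiced speech (judged distance doubles per 8-dB increase in production level and per 12-dB decrease in presentation level) can be written as a small formula. The sketch below assumes the two effects combine multiplicatively, which the abstract does not state; the function name `distance_ratio` and its parameters are hypothetical.

```python
def distance_ratio(delta_production_db, delta_presentation_db,
                   production_db_per_doubling=8.0,
                   presentation_db_per_doubling=12.0):
    """Factor by which judged distance changes (hypothetical helper).

    Defaults are the high-level voiced speech values from the abstract.
    Assumption: the production and presentation effects multiply.
    """
    exponent = (delta_production_db / production_db_per_doubling
                - delta_presentation_db / presentation_db_per_doubling)
    return 2.0 ** exponent


# Talker produces speech 8 dB louder, playback is 12 dB quieter.
ratio = distance_ratio(8.0, -12.0)
```

For low-level voiced speech, the abstract's 15-dB figure could be passed as `production_db_per_doubling=15.0`, with the presentation term left at zero change since judgments there were reported as largely insensitive to presentation level.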

17.
The present study investigated the acoustic characteristics of Mandarin laryngeal and esophageal speech. Eight normal laryngeal and seven esophageal speakers participated in the acoustic experiments. Results from acoustic analyses of the syllables /ma/ and /ba/ indicated that F0, intensity, and signal-to-noise ratio of laryngeal speech were significantly higher than those of esophageal speech. However, opposite results were found for vowel duration, jitter, and shimmer. In reading, mean F0, intensity, and words per minute were greater and the number of pauses was smaller in laryngeal speech than in esophageal speech. Similar patterns of F0 contours and vowel duration as a function of tone were found for laryngeal and esophageal speakers. Long-term spectral analysis indicated that first and second formant frequencies were higher in esophageal speech than in normal laryngeal speech.

18.
A technique has been developed to obtain a quantitative measure of correlation between electromyographic (EMG) activity of various laryngeal muscles, subglottal air pressure, and the fundamental frequency of vibration of the vocal folds (F0). Data were collected and analyzed on one subject, a native speaker of American English. The results show that an analysis of this type can provide a useful measure of correlation between the physiological and acoustical events in speech and, furthermore, can yield detailed insights into the organization and nature of the speech production process. In particular, based on these results, a model is suggested of F0 control involving laryngeal state functions that seems to agree with present knowledge of laryngeal control and experimental evidence.

19.
Electromyographic recordings of intrinsic laryngeal muscles were made during respiration, phonation, speech, and swallow in three subjects in two conditions: with and without intravenous administration of 5 mg of diazepam. The mean activity in microvolts of the thyroarytenoid and cricothyroid muscles was measured during respiration and the percentage increase over resting levels during inspiration, expiration, swallow, phonation, and speech. All subjects demonstrated significant (p ≤ 0.001) reductions in mean activity during respiration with diazepam. Significant (p ≤ 0.001) diazepam, subject, and subject by diazepam interactions were found in the percentage increase in muscle activation on each task. In one subject muscle activity consistently decreased, while in the other two subjects it consistently increased with diazepam. Although all had significant muscle relaxant effects, individuals differed in their diazepam responses during muscle activation. These differences may relate to a subject's age.

20.
Monaural speech segregation has proven to be extremely challenging. While efforts in computational auditory scene analysis have led to considerable progress in voiced speech segregation, little attention has been given to unvoiced speech, which lacks harmonic structure, has weaker energy, and is hence more susceptible to interference. This study proposes a new approach to the problem of segregating unvoiced speech from nonspeech interference. The study first addresses the question of how much speech is unvoiced. The segregation process occurs in two stages: segmentation and grouping. In segmentation, the proposed model decomposes an input mixture into contiguous time-frequency segments by a multiscale analysis of event onsets and offsets. Grouping of unvoiced segments is based on Bayesian classification of acoustic-phonetic features. The proposed model for unvoiced speech segregation joins an existing model for voiced speech segregation to produce an overall system that can deal with both voiced and unvoiced speech. Systematic evaluation shows that the proposed system extracts a majority of unvoiced speech without including much interference, and it performs substantially better than spectral subtraction.
