Similar literature
 20 similar documents found (search time: 15 ms)
1.
2.
This paper investigates the mechanisms controlling the phonemic quantity contrast and speech rate in nonsense p(1)Np(2)a words read by five Slovak speakers at normal and fast speech rates. N represents a syllable nucleus, which in Slovak corresponds to long and short vowels and liquid consonants. The movements of the lips and the tongue were recorded with an electromagnetometry system. Together with the acoustic durations of p(1), N, and p(2), gestural characteristics of three core movements were extracted: p(1) lip opening, tongue movement for (N)ucleus, and p(2) lip closing. The results show that, although consonantal and vocalic nuclei are predictably different on many kinematic measures, their common phonological behavior as syllabic nuclei may be linked to a stable temporal coordination of the consonantal gestures flanking the nucleus. The functional contrast between phonemic duration and speech rate was reflected in the bias in the control mechanisms they employed: the strategies robustly used for signaling phonemic duration, such as the degree of coproduction of the two lip movements, showed a minimal effect of speech rate, while measures greatly affected by speech rate, such as p(2) acoustic duration, or the degree of p(1)-N gestural coproduction, tended to be minimally influenced by phonemic quantity.

3.
In this study, the effect of articulation rate and speaking style on the perceived speech rate is investigated. The articulation rate is measured both in terms of the intended phones, i.e., phones present in the assumed canonical form, and as the number of actual, realized phones per second. The combination of these measures reflects the deletion of phones, which is related to speaking style. The effect of the two rate measures on the perceived speech rate is compared in two listening experiments on the basis of a set of intonation phrases with carefully balanced intended and realized phone rates, selected from a German database of spontaneous speech. Because the balance between input-oriented (effort) and output-oriented (communicative) constraints may be different at fast versus slow speech rates, the effect of articulation rate is compared both for fast and for slow phrases from the database. The effect of the listeners' own speaking habits is also investigated to evaluate whether listeners' perception is based on a projection of their own behavior as speakers. It is shown that listener judgments reflect both the intended and realized phone rates, and that their judgments are independent of the constraint balance and their own speaking habits.
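
The two rate measures named in this abstract can be illustrated with a minimal sketch. The function name `phone_rates`, its parameters, and the "deletion ratio" it returns are assumptions made for illustration; the abstract does not specify how the study computed or combined the measures.

```python
def phone_rates(n_intended, n_realized, duration_s):
    """Return (intended rate, realized rate, deletion ratio) for one phrase.

    n_intended: phones in the assumed canonical form (hypothetical input)
    n_realized: phones actually produced (hypothetical input)
    duration_s: phrase duration in seconds
    """
    if duration_s <= 0 or n_intended <= 0:
        raise ValueError("duration and intended phone count must be positive")
    intended_rate = n_intended / duration_s   # intended phones per second
    realized_rate = n_realized / duration_s   # realized phones per second
    # Share of canonical phones that were deleted (an illustrative summary,
    # not necessarily the combination used in the study).
    deletion_ratio = (n_intended - n_realized) / n_intended
    return intended_rate, realized_rate, deletion_ratio


if __name__ == "__main__":
    # Made-up phrase: 30 canonical phones, 24 realized, lasting 2 seconds.
    print(phone_rates(30, 24, 2.0))
```

Two phrases can share the same realized rate while differing in intended rate, which is the kind of balance the abstract describes when it separates the two measures.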

4.
Medial movements of the lateral pharyngeal wall at the level of the velopharyngeal port were examined by using a computerized ultrasound system. Subjects produced CVNVC sequences involving all combinations of the vowels /a/ and /u/ and the nasal consonants /n/ and /m/. The effects of both vowels on the CVN and NVC gestures (opening and closing of the velopharyngeal port, respectively) were assessed in terms of movement amplitude, duration, and movement onset time. The amplitude of both opening and closing gestures of the lateral pharyngeal wall was less in the context of the vowel /u/ than the vowel /a/. In addition, the onset of the opening gesture towards the nasal consonant was related to the identity of both the initial and the final vowels. The characteristics of the functional coupling of the velum and lateral pharyngeal wall in speech are discussed.

5.
The study of acoustical characteristics is important for speech and speaker recognition in Chinese whispered speech. In this paper, the characteristics of whispered speech are introduced and the acoustical characteristics of Chinese whispered speech are discussed. Whispered speech has no fundamental frequency, so other characteristics, such as duration and formant frequency, are extracted and analyzed. Experiments with six simple Chinese whispered vowels show that duration and formant frequency can be used as the main acoustical characteristics in Chinese whispered speech recognition.

6.
Deleted segments of speech can be restored perceptually if they are replaced by a louder noise. An earlier study of this "phonemic restoration effect" found that, when recorded discourse was interrupted periodically by noise, the durational limit for illusory continuity corresponded to the average word duration. The present study employed a different passage of discourse recorded by a different speaker. Durational limits for apparent continuity of discourse interrupted by noise were measured at the normal (original) playback speed, as well as at rates that were 15% greater and 15% less. At the normal playback rate, once again the limit of continuity approximated the average word duration, but of especial interest was the finding that changes in playback rate produced proportional changes in continuity limits. These results, together with other evidence, suggest that phonemic restorations represent a special linguistic application of a general auditory mechanism (auditory induction) producing appropriate syntheses of obliterated sounds, and that for discourse the limits of illusory continuity correspond to a fixed amount of verbal information, and not a fixed temporal value.

7.
A model is presented which predicts the movements of flesh points on the tongue, lips, and jaw during speech production, from time-aligned phonetic strings. Starting from a database of x-ray articulator trajectories, means and variances of articulator positions and curvatures at the midpoints of phonemes are extracted from the data set. During prediction, the amount of articulatory effort required in a particular phonetic context is estimated from the relative local curvature of the articulator trajectory concerned. Correlations between position and curvature are used to directly predict variations from mean articulator positions due to coarticulatory effects. Use of the explicit coarticulation model yields a significant increase in articulatory modeling accuracy with respect to x-ray traces, as compared with the use of mean articulator positions alone.

8.
9.
This paper describes two electromagnetic midsagittal articulometer (EMMA) systems that were developed for transducing articulatory movements during speech production. Alternating magnetic fields are generated by transmitter coils that are mounted in an assembly that fits on the head of a speaker. The fields induce alternating voltages in a number of small transducer coils that are attached to articulators in the midline plane, inside and outside the vocal tract. The transducers are connected by fine lead wires to receiver electronics whose output voltages are processed to yield measures of transducer locations as a function of time. Measurement error can arise with this method, because as the articulators move and change shape, the transducers can undergo a varying amount of rotational misalignment with respect to the transmitter axes; both systems are designed to correct for transducer misalignment. For this purpose, one system uses two transmitters and biaxial transducers; the other uses three transmitters and single-axis transducers. The systems have been compared with one another in terms of their performance, human subjects compatibility, and ease of use. Both systems can produce useful midsagittal-plane data on articulatory movement, and each one has a specific set of advantages and limitations. (Two commercially available systems are also described briefly for comparison purposes.) If appropriate experimental controls are used, the three-transmitter system is preferable for practical reasons.

10.
Acoustic measurements were conducted to determine the degree to which vowel duration, closure duration, and their ratio distinguish voicing of word-final stop consonants across variations in sentential and phonetic environments. Subjects read CVC test words containing three different vowels and ending in stops of three different places of articulation. The test words were produced either in nonphrase-final or phrase-final position and in several local phonetic environments within each of these sentence positions. Our measurements revealed that vowel duration most consistently distinguished voicing categories for the test words. Closure duration failed to consistently distinguish voicing categories across the contextual variables manipulated, as did the ratio of closure and vowel duration. Our results suggest that vowel duration is the most reliable correlate of voicing for word-final stops in connected speech.

11.
The effect of accentuation and word duration on the naturalness of speech.   Cited by: 1 (self-citations: 0; citations by others: 1)
In this study the effect of appropriate word duration and correct (pitch) accentuation on the naturalness of speech was investigated. In the stimulus material, the information value of the target word determined the correctness of accentuation ([new, +accent] and [old, -accent] were defined as correct). Appropriate word duration was defined as either "in agreement with accentuation" ([long, +accent] and [short, -accent]) or "in agreement with information value" ([long, new] and [short, old]). Listeners were asked to give naturalness judgments along a scale from 1 (very unnatural) to 10 (very natural) on fragments consisting of two sentences. Duration and accentuation of the target word, which always occurred in the second sentence, were manipulated separately and in combinations. Judgments show that accentuation that is not in agreement with information value causes a significant decrease of naturalness. When accentuation is in agreement with information value but duration is inappropriate for both factors, the perceived naturalness decreases significantly. However, listeners were unable to give consistent naturalness judgments on the manipulated word durations in fragments with incorrect accent distributions. Based on these results and the findings of an earlier production study [W. Eefting, J. Acoust. Soc. Am. 89, 412-424 (1991)], which showed that duration is not involved in the realization of pitch accent, the following is suggested. Speakers adapt both accentuation and word duration in order to indicate that a word contains relevant information. Presence of an accent distinguishes the word from its (less relevant) environment. A longer duration provides the listener with the extra time that is needed in order to process the word's content adequately.

12.
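Xiao Dong, Mo Fuyuan, Chen Geng, Ma Li. Applied Acoustics (《应用声学》), 2016, 35(1): 77-83
Transition segments play a non-negligible role in speech clarity, intelligibility, and auditory perception. In parametric speech coding, whether speech frames containing transition segments are handled appropriately is key to whether the synthesized speech is clear and intelligible. Taking mixed excitation linear prediction coding as a reference, this paper classifies speech frames into four categories (silence, unvoiced, voiced, and transition) and processes each category separately. Building on earlier work on low-bit-rate speech coding (1 kbps), it compares the effect of eight transition-frame classification methods on the PESQ MOS of the synthesized speech. The analysis shows that different transition frames contribute differently to PESQ MOS. Transition frames that change from unvoiced or silent to voiced contribute the most; the contribution of transition frames lying between voiced consonants and vowels should not be ignored either.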

13.
14.
The purpose of this experiment was to study the effects of changes in speaking rate on both the attainment of acoustic vowel targets and the relative time and speed of movements toward these presumed targets. Four speakers produced a number of different CVC and CVCVC utterances at slow and fast speaking rates. Spectrographic measurements showed that the midpoint formant frequencies of the different vowels did not vary as a function of rate. However, for fast speech the onset frequencies of second formant transitions were closer to their target frequencies while CV transition rates remained essentially unchanged, indicating that movement toward the vowel simply began earlier for fast speech. Changes in both speaking rate and lexical stress had different effects. For stressed vowels, an increase in speaking rate was accompanied primarily by a decrease in duration. However, destressed vowels, even if they were of the same duration as quickly produced stressed vowels, were reduced in overall amplitude, fundamental frequency, and to some extent, vowel color. These results suggest that speaking rate and lexical stress are controlled by two different mechanisms.

15.
16.
A pulsed master-oscillator power fiber amplifier (MOPFA) system based on Yb3+-doped large mode area (LMA) double-clad optical fiber was developed. The system generated pulses of changeable duration ranging from about 8.5 to 250.0 ns at the repetition rate of up to 500 kHz. The laser system emitted up to 22 W of average output power at the wavelength of 1064 nm.

17.
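Liao Fengchai, Li Peng, Xu Bo. Acta Acustica (《声学学报》), 2009, 34(3): 281-288
Building on delay-and-sum beamforming and Wiener filtering, a microphone-array post-filtering speech enhancement algorithm based on energy-loss-rate estimation is proposed. The algorithm estimates the weight coefficients of the Wiener filter by detecting the change in energy before and after beamforming for each nested subarray of a linear, non-uniformly spaced microphone array, thereby achieving speech enhancement. Experimental evaluation on a simulated data set shows that, compared with the original speech, the enhanced speech improves on average by 17.1 dB in signal-to-noise ratio, 1.001 in log-spectral distance, and 0.935 in perceptual quality, indicating good application prospects.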
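
As a rough illustration of the two ingredients this abstract names, the sketch below implements a plain delay-and-sum beamformer and compares signal energy before and after beamforming. It is not the authors' algorithm: integer sample delays, the function names, and the particular energy ratio are all assumptions, and the mapping from that ratio to Wiener filter weights is not reproduced because the abstract does not give it.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay and average them.

    channels: list of equal-rate sample lists, one per microphone
    delays:   per-channel delay in samples (assumed known, non-negative)
    """
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[d + i] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(n)
    ]


def energy(samples):
    """Sum of squared samples."""
    return sum(v * v for v in samples)


def energy_ratio(channels, delays):
    """Beamformer output energy divided by mean aligned input energy.

    A value near 1 means the channels add coherently; values well below 1
    mean energy was lost in the sum, as happens for uncorrelated noise.
    """
    out = delay_and_sum(channels, delays)
    n = len(out)
    mean_in = sum(
        energy(ch[d:d + n]) for ch, d in zip(channels, delays)
    ) / len(channels)
    return energy(out) / mean_in if mean_in > 0 else 0.0


if __name__ == "__main__":
    # The same made-up signal arriving with different delays at three microphones.
    mics = [[0, 0, 1, 2, 3, 4], [0, 1, 2, 3, 4, 0], [1, 2, 3, 4, 0, 0]]
    print(delay_and_sum(mics, [2, 1, 0]))
    print(energy_ratio(mics, [2, 1, 0]))
```

The contrast between coherent and incoherent channels in this ratio is the kind of before/after energy change a post-filter could exploit, under the assumptions stated above.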

18.
19.
The purpose of this investigation was to study the effects of consonant environment on vowel duration for normally hearing males, hearing-impaired males with intelligible speech, and hearing-impaired males with semi-intelligible speech. The results indicated that the normally hearing and intelligible hearing-impaired speakers exhibited similar trends with respect to consonant influence on vowel duration; i.e., vowels were longer in a voiced environment than in a voiceless one, and longer in a fricative environment than in a plosive one. The semi-intelligible hearing-impaired speakers, however, failed to demonstrate a consonant effect on vowel duration, and produced the vowels with significantly longer durations when compared with the other two groups of speakers. These data provide information regarding temporal conditions which may contribute to the decreased intelligibility of hearing-impaired persons.

20.
The purpose of the present study was to investigate kinematic characteristics of the speech of children and adults under three speaking conditions. Subjects produced speech stimuli in a normal manner, at a faster than normal rate, and while holding a bite block between their molars to restrict mandibular movement. Using a strain gauge monitoring system, superior-inferior lip and jaw movement data were collected from 24 subjects, six in each of three groups of normally developing children and an adult control group. For the normal condition, it was found that net peak velocity (i.e., the sum of the peak velocities of the individual articulators) was quite comparable among the three groups of children and the adults. Net peak velocity increased significantly for all four groups of subjects when they spoke at a fast rate, but it did not increase significantly in the bite block condition. For most measures, there were typically no differences in peak velocity across the various speaking conditions when comparing the three groups of children to one another. In general, articulatory displacement data showed patterns quite similar to those of the peak velocity data. In addition to the displacement and peak velocity data, pilot data are discussed concerning temporal properties of articulatory phases and also concerning maximum, nonspeech articulatory gestures.
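
The abstract defines net peak velocity as the sum of the peak velocities of the individual articulators. A minimal sketch of that definition follows; the finite-difference velocity estimate, the function names, the sample rate, and the example trajectories are all assumptions, since the abstract does not describe how velocities were derived from the strain gauge signals.

```python
def peak_velocity(positions_mm, sample_rate_hz):
    """Largest absolute sample-to-sample speed, in mm/s (finite differences)."""
    return max(
        abs(b - a) * sample_rate_hz
        for a, b in zip(positions_mm, positions_mm[1:])
    )


def net_peak_velocity(trajectories, sample_rate_hz):
    """Sum of per-articulator peak velocities, following the abstract's definition."""
    return sum(
        peak_velocity(p, sample_rate_hz) for p in trajectories.values()
    )


if __name__ == "__main__":
    # Made-up superior-inferior positions (mm) sampled at a hypothetical 100 Hz.
    example = {
        "upper_lip": [0.0, 0.5, 1.5, 2.0],
        "lower_lip": [0.0, 1.0, 3.0, 4.0],
        "jaw": [0.0, 0.5, 1.0, 1.5],
    }
    print(net_peak_velocity(example, 100.0))
```

Because the peaks are summed regardless of when each occurs, the measure describes combined articulator speed rather than the velocity of any single moment.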
