Similar literature
 Found 20 similar documents (search time: 109 ms)
1.
In intonation research, prominence-lending pitch movements have been described either on a linear or on a logarithmic frequency scale. An experiment was carried out to check whether pitch movements in speech intonation are perceived on one of these two scales or on a psychoacoustic scale representing the frequency selectivity of the auditory system. This last scale is intermediate between the other two. Subjects matched the excursion size of prominence-lending pitch movements in utterances resynthesized in different pitch registers. Their task was to adjust the excursion size in a comparison stimulus so that it lent the same prominence to the corresponding syllable as in a fixed test stimulus. The comparison stimulus and the test stimulus had pitches running parallel on either the logarithmic frequency scale, the psychoacoustic scale, or the linear frequency scale. In one-half of the experimental sessions, the test stimulus was presented in the low register and the comparison stimulus in the high register; in the other half, the assignment was reversed. The result is that, in all cases, stimuli are matched in such a way that the average excursion sizes in different registers are equal on the psychoacoustic scale.
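The three candidate scales in the abstract above can be compared numerically. The following is a minimal sketch, not the authors' procedure: it assumes semitones for the logarithmic scale and the Glasberg and Moore ERB-rate formula for the psychoacoustic scale (the abstract does not say which formula was used), and the function names and the 100, 130, and 200 Hz values are purely illustrative.

```python
import math

def hz_to_semitones(f_hz, ref_hz=1.0):
    """Logarithmic scale: semitones relative to ref_hz."""
    return 12.0 * math.log2(f_hz / ref_hz)

def semitones_to_hz(st, ref_hz=1.0):
    return ref_hz * 2.0 ** (st / 12.0)

def hz_to_erb_rate(f_hz):
    """Psychoacoustic scale (assumption: Glasberg and Moore ERB-rate formula)."""
    return 21.4 * math.log10(0.00437 * f_hz + 1.0)

def erb_rate_to_hz(e):
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def matched_peak(low_base, low_peak, high_base, to_scale, from_scale):
    """Peak frequency in the high register whose excursion equals the
    low-register excursion when both are measured on the given scale."""
    excursion = to_scale(low_peak) - to_scale(low_base)
    return from_scale(to_scale(high_base) + excursion)

# Hypothetical pitch movement: 100 -> 130 Hz, transferred to a 200 Hz baseline.
linear_peak = matched_peak(100.0, 130.0, 200.0, lambda f: f, lambda x: x)
log_peak = matched_peak(100.0, 130.0, 200.0, hz_to_semitones, semitones_to_hz)
erb_peak = matched_peak(100.0, 130.0, 200.0, hz_to_erb_rate, erb_rate_to_hz)

print(f"linear: {linear_peak:.1f} Hz, ERB-rate: {erb_peak:.1f} Hz, log: {log_peak:.1f} Hz")
```

Under these assumptions the ERB-rate match falls between the linear and the logarithmic match, which is what "intermediate between the other two" means in practice.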

2.
3.
In this study, the effect of articulation rate and speaking style on the perceived speech rate is investigated. The articulation rate is measured both in terms of the intended phones, i.e., phones present in the assumed canonical form, and as the number of actual, realized phones per second. The combination of these measures reflects the deletion of phones, which is related to speaking style. The effect of the two rate measures on the perceived speech rate is compared in two listening experiments on the basis of a set of intonation phrases with carefully balanced intended and realized phone rates, selected from a German database of spontaneous speech. Because the balance between input-oriented (effort) and output-oriented (communicative) constraints may be different at fast versus slow speech rates, the effect of articulation rate is compared for both fast and slow phrases from the database. The effect of the listeners' own speaking habits is also investigated to evaluate whether listeners' perception is based on a projection of their own behavior as speakers. It is shown that listener judgments reflect both the intended and realized phone rates, and that these judgments are independent of the constraint balance and of the listeners' own speaking habits.
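The two rate measures described above can be illustrated with a small calculation. This is only a sketch under stated assumptions: the function name `phone_rates`, the returned deletion ratio, and the example counts are hypothetical and do not come from the study.

```python
def phone_rates(intended_phones, realized_phones, duration_s):
    """Return (intended rate, realized rate, deletion ratio) for one phrase.

    intended_phones: phone count of the assumed canonical form
    realized_phones: phone count actually produced
    duration_s:      phrase duration in seconds (pauses excluded)
    """
    if duration_s <= 0 or intended_phones <= 0:
        raise ValueError("duration and intended phone count must be positive")
    intended_rate = intended_phones / duration_s
    realized_rate = realized_phones / duration_s
    # Share of canonical phones that were deleted (assumed summary measure).
    deletion_ratio = (intended_phones - realized_phones) / intended_phones
    return intended_rate, realized_rate, deletion_ratio

# Hypothetical phrase: 30 canonical phones, 24 realized, spoken in 2.0 s.
intended, realized, deleted = phone_rates(30, 24, 2.0)
print(intended, realized, deleted)
```

A phrase with many deletions has a high intended rate but a comparatively low realized rate, which is how the pair of measures separates speaking style from raw tempo.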

4.
Infant-directed speech (IDS) is believed to facilitate language learning. However, the benefit may be due either to clearer acoustic correlates of linguistic structures or simply to the increased attention that the exaggerated prosody of IDS induces in infants. This study investigated the pure effect of IDS pitch on lexical tone learning, with attentional/affective factors removed by using artificial neural networks. Following training with the pitch of Mandarin tones in IDS versus adult-directed speech, the networks yielded equal tonal categorization for both registers. IDS pitch produced no additional linguistic support. IDS pitch appears to play strictly the non-linguistic role of attention/affect, which may indirectly benefit learning.

5.
6.
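Pitch declination is common in Mandarin Chinese utterances, yet the specific pitch behavior of prosodic words within an utterance has received little study. The conversational data used in this study were drawn from the 973 telephone speech corpus and comprise 69 dialogues involving 79 speakers; the read-speech data consist of news broadcasts by two radio presenters, totaling 221 utterances. The high pitch points, low pitch points, and pitch ranges of prosodic words within utterances were analyzed. The results show that in most utterances of both conversational and read speech the pitch runs from high to low, although the downward trend is less evident in the first half of longer conversational utterances. Compared with read speech, the pitch range of prosodic words in conversation is generally smaller. The last prosodic word of a conversational utterance has a relatively large pitch range, whereas the pitch ranges of prosodic words within read utterances mostly do not differ. These results should help in modeling the pitch register and pitch range of utterance-internal prosodic words in speech synthesis.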

7.
The ability of five profoundly hearing-impaired subjects to "track" connected speech and to make judgments about the intonation and stress in spoken sentences was evaluated under a variety of auditory-visual conditions. These included speechreading alone, speechreading plus speech (low-pass filtered at 4 kHz), and speechreading plus a tone whose frequency, intensity, and temporal characteristics were matched to the speaker's fundamental frequency (F0). In addition, several frequency transfer functions were applied to the normal F0 range, resulting in new ranges that were both transposed and expanded with respect to the original F0 range. Three of the five subjects were able to use several of the tonal representations of F0 nearly as well as speech to improve their speechreading rates and to make appropriate judgments concerning sentence intonation and stress. The remaining two subjects greatly improved their identification performance for intonation and stress patterns when expanded F0 signals were presented alone (i.e., without speechreading), but had difficulty integrating visual and auditory information at the connected discourse level, despite intensive training in the connected discourse tracking procedure lasting from 27.8 to 33.8 h.

8.
There is a tendency across languages to use a rising pitch contour to convey question intonation and a falling pitch contour to convey a statement. In a lexical tone language such as Mandarin Chinese, rising and falling pitch contours are also used to differentiate lexical meaning. How, then, does the multiplexing of the F0 channel affect the perception of question and statement intonation in a lexical tone language? This study investigated the effects of lexical tones and focus on the perception of intonation in Mandarin Chinese. The results show that lexical tones and focus impact the perception of sentence intonation. Question intonation was easier for native speakers to identify on a sentence with a final falling tone and more difficult to identify on a sentence with a final rising tone, suggesting that tone identification intervenes in the mapping of F0 contours to intonational categories and that tone and intonation interact at the phonological level. In contrast, there is no evidence that the interaction between focus and intonation goes beyond the psychoacoustic level. The results provide insights that will be useful for further research on tone and intonation interactions in both acoustic modeling studies and neurobiological studies.

9.
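Stress is an important intonational feature, and stress synthesis can improve the naturalness and expressiveness of speech. To capture the local prominence of stress, this paper proposes a representation of acoustic-feature prominence and analyzes the acoustic-feature prominence of stressed syllables in different prosodic positions (initial, medial, and final in the prosodic word; initial, medial, and final in the prosodic phrase; and so on). Stress at the end of a prosodic unit (the final syllable of a prosodic word, or the final prosodic word of a prosodic phrase) was found to have a lower prominence of maximum fundamental frequency than stress elsewhere. A nonlinear algorithm for generating the acoustic parameters of stress, based on acoustic-feature prominence, is proposed; it addresses the problem that conventional linear modification algorithms change the parameters either too little or too much. Using this algorithm, a hidden Markov model based speech synthesis system that supports stress synthesis was built. Experiments show that the system can effectively synthesize stressed speech and improves the naturalness and expressiveness of the synthesized speech.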

10.
Standard continuous interleaved sampling processing and a modified processing strategy designed to enhance temporal cues to voice pitch were compared on tests of intonation perception and vowel perception, both in implant users and in acoustic simulations. In standard processing, 400 Hz low-pass envelopes modulated either pulse trains (implant users) or noise carriers (simulations). In the modified strategy, slow-rate envelope modulations, which convey dynamic spectral variation crucial for speech understanding, were extracted by low-pass filtering (32 Hz). In addition, during voiced speech, higher-rate temporal modulation in each channel was provided by 100% amplitude modulation by a sawtooth-like waveform whose periodicity followed the fundamental frequency (F0) of the input. Channel levels were determined by the product of the lower- and higher-rate modulation components. Both in acoustic simulations and in implant users, the ability to use intonation information to identify sentences as question or statement was significantly better with modified processing. However, while there was no difference in vowel recognition in the acoustic simulation, implant users performed worse with modified processing both in vowel recognition and in formant frequency discrimination. It appears that, while enhancing pitch perception, modified processing harmed the transmission of spectral information.
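The product rule for channel levels in the modified strategy can be sketched as follows. This is an illustrative reading of the abstract, not the published implementation: the falling-ramp shape of the "sawtooth-like" modulator, the pass-through behavior for unvoiced speech, and all names and numbers are assumptions.

```python
def sawtooth_modulator(t_s, f0_hz):
    """Falling ramp from 1 toward 0 within each F0 period (100% depth).
    The exact waveform shape is an assumption."""
    phase = (t_s * f0_hz) % 1.0
    return 1.0 - phase

def channel_level(slow_envelope, t_s, f0_hz, voiced=True):
    """Channel level = slow (32 Hz low-pass) envelope times the F0-rate modulator.
    Assumption: unvoiced segments pass the slow envelope through unchanged."""
    if not voiced:
        return slow_envelope
    return slow_envelope * sawtooth_modulator(t_s, f0_hz)

# Hypothetical channel: constant slow envelope of 0.8, F0 = 100 Hz, 1 kHz sampling.
sample_rate_hz = 1000
levels = [channel_level(0.8, n / sample_rate_hz, 100.0) for n in range(20)]
print([round(v, 2) for v in levels])
```

Because the modulator repeats once per F0 period, the periodicity of the channel level carries the voice pitch while the slow envelope still scales the overall level.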

11.
12.
In the experiments reported here, perceived speaker identity was controlled by manipulating the fundamental frequency (F0) range of carrier phrases in which speech tokens were embedded. In the first experiment, words from two "hood"-"hud" continua were synthesized with different F0. The words were then embedded in synthetic carrier phrases with intonation contours which reduced perceived speaker identity differences for test items with different F0. The results indicated that when perceived speaker identity differences were reduced, the effect of F0 on vowel identification was also reduced. Experiment 2 indicated that when items presented in carrier phrases are matched for speaker identity and F0 with items in isolation, there is no effect of presentation in a carrier phrase. Experiment 3 involved the presentation of vowels from the "hood"-"hud" continuum in two different intonational contexts which were judged to have been produced by different speakers, even though the F0 of the test word was identical in the two contexts. There was a shift in identification as a result of the intonational context, which was interpreted as evidence for the role of perceived identity in vowel normalization. Overall, the experiments suggest that perceived speaker identity is a better predictor of vowel normalization effects than is intrinsic F0. This indicates that the role of F0 in vowel normalization is mediated through perceived speaker identity.

13.
An illusion is explored in which a spoken phrase is perceptually transformed to sound like song rather than speech, simply by repeating it several times over. In experiment I, subjects listened to ten presentations of the phrase and judged how it sounded on a five-point scale with endpoints marked "exactly like speech" and "exactly like singing." The initial and final presentations of the phrase were identical. When the intervening presentations were also identical, judgments moved solidly from speech to song. However, this did not occur when the intervening phrases were transposed slightly or when the syllables were presented in jumbled orderings. In experiment II, the phrase was presented either once or ten times, and subjects repeated it back as they finally heard it. Following one presentation, the subjects repeated the phrase back as speech; however, following ten presentations they repeated it back as song. The pitch values of the subjects' renditions following ten presentations were closer to those of the original spoken phrase than were the pitch values following a single presentation. Furthermore, the renditions following ten presentations were even closer to a hypothesized representation in terms of a simple tonal melody than they were to the original spoken phrase.

14.
A voice range profile (VRP) was obtained from each of eight professional actors and compared with two speech range profiles (SRPs). One speech profile was obtained during the dramatic reading of a scene in the laboratory and the other during a performance on stage in a professional theater. The objective was to determine the pitch and loudness ranges used by the actors in speech relative to the VRP. The principal question of interest was whether the actors stayed within the center of the VRP or whether they tended to drift toward the boundaries of intensity and frequency. A second question was whether performance in the laboratory accurately reflects performance on stage. The results suggest that some subjects tend to exceed the center of the VRP during the stage performance. It is hypothesized that these actors may stress their vocal mechanism during performance and are more likely candidates for vocal injury.

15.
Cochlear implants are largely unable to encode voice pitch information, which hampers the perception of some prosodic cues, such as intonation. This study investigated whether children with a cochlear implant in one ear were better able to detect differences in intonation when a hearing aid was added in the other ear ("bimodal fitting"). Fourteen children with normal hearing and 19 children with bimodal fitting participated in two experiments. The first experiment assessed the just noticeable difference in F0 by presenting listeners with a naturally produced bisyllabic utterance with an artificially manipulated pitch accent. The second experiment assessed the ability to distinguish between questions and affirmations in Dutch words, again by using artificial manipulation of F0. For the implanted group, performance significantly improved in each experiment when the hearing aid was added. However, even with a hearing aid, the implanted group required exaggerated F0 excursions to perceive a pitch accent and to identify a question. These exaggerated excursions are close to the maximum excursions typically used by Dutch speakers. Nevertheless, the results of this study showed that, compared to the implant-only condition, bimodal fitting improved the perception of intonation.

16.
The corruption of intonation contours has detrimental effects on sentence-based speech recognition in normal-hearing listeners [Binns and Culling (2007). J. Acoust. Soc. Am. 122, 1765-1776]. This paper examines whether this finding also applies to cochlear implant (CI) recipients. The subjects' F0 discrimination and speech perception in the presence of noise were measured, using sentences with regular and inverted F0 contours. The results revealed that speech recognition for regular contours was significantly better than for inverted contours. This difference was related to the subjects' F0 discrimination, providing further evidence that the perception of intonation patterns is important for CI-mediated speech recognition in noise.

17.
Three experiments were conducted to study the effect of segmental and suprasegmental corrections on the intelligibility and judged quality of deaf speech. By means of digital signal processing techniques, including LPC analysis, transformations of separate speech sounds, temporal structure, and intonation were carried out on 30 Dutch sentences spoken by ten deaf children. The transformed sentences were tested for intelligibility and acceptability by presenting them to inexperienced listeners. In experiment 1, LPC-based reflection coefficients describing segmental characteristics of deaf speakers were replaced by those of hearing speakers. A complete segmental correction caused a dramatic increase in intelligibility from 24% to 72%, which was due in large part to the correction of vowels. Experiment 2 revealed that correction of temporal structure and intonation caused only a small improvement, from 24% to about 34%. Combination of segmental and suprasegmental corrections yielded almost perfectly understandable sentences, due to a more than additive effect of the two corrections. Quality judgments, collected in experiment 3, were in close agreement with the intelligibility measures. The results show that, in order for these speakers to become more intelligible, improving their articulation is more important than improving their production of temporal structure and intonation.

18.
Congenital amusia is a lifelong disorder of music processing that has been ascribed to impaired pitch perception and memory. The present study tested a large group of amusics (n=17) and provided evidence that their pitch deficit affects pitch processing in speech to a lesser extent: Fine-grained pitch discrimination was better in spoken syllables than in acoustically matched tones. Unlike amusics, control participants performed fine-grained pitch discrimination better for musical material than for verbal material. These findings suggest that pitch extraction can be influenced by the nature of the material (music vs speech), and that amusics' pitch deficit is not restricted to musical material, but extends to segmented speech events.

19.
This paper addresses a classical but important problem: the coupling of lexical tones and sentence intonation in tonal languages, such as Chinese, focusing particularly on voice fundamental frequency (F0) contours of speech. It is important because it forms the basis of speech synthesis technology and prosody analysis. We provide a solution to the problem with a constrained tone transformation technique based on structural modeling of the F0 contours. This consists of transforming target values in pairs from norms to variants. These targets are intended to sparsely specify the prosodic contributions to the F0 contours, while the alignment of target pairs between norms and variants is based on underlying lexical tone structures. When the norms take the citation forms of lexical tones, the technique makes it possible to separate sentence intonation from observed F0 contours. When the norms take normative F0 contours, it is possible to measure intonation variations from the norms to the variants, both having identical lexical tone structures. This paper explains the underlying scientific and linguistic principles and presents an algorithm that was implemented on computers. The method's capability of separating and combining tone and intonation is evaluated through analysis and re-synthesis of several hundred observed F0 contours.

20.
The significance of auditory and kinesthetic feedback to pitch control in singing was described in a previous report of this project for students at the beginning of their professional solo singer education.(1) As it seems reasonable to assume that pitch control can be improved by training, the same students were reinvestigated after 3 years of professional singing education. As in the previous study, the singers sang an ascending and descending triad pattern with and without masking noise, in legato and staccato, and in a slow and a fast tempo. Fundamental frequency and interval sizes between adjacent tones were determined and compared with their equivalents in equal-tempered tuning. The average deviations from these values were used as estimates of intonation accuracy. Intonation accuracy was reduced by masking noise, by staccato as opposed to legato singing, and by fast as opposed to slow performance. The contribution of the auditory feedback to pitch control was not significantly improved after education, whereas the kinesthetic feedback circuit was improved in slow legato and slow staccato tasks. The results support the assumption that the kinesthetic feedback contributes substantially to intonation accuracy.
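The intonation accuracy estimate described above (average deviation of sung interval sizes from equal-tempered tuning) can be sketched in a few lines. This is an illustration under assumptions rather than the study's analysis: it assumes a major triad (0, 4, 7 semitones), measures deviations in cents, and uses hypothetical function names and frequencies.

```python
import math

def interval_cents(f_from_hz, f_to_hz):
    """Signed interval size in cents between two adjacent tones."""
    return 1200.0 * math.log2(f_to_hz / f_from_hz)

def interval_deviations(sung_hz, target_semitones):
    """Deviation (cents) of each sung interval from its equal-tempered size."""
    deviations = []
    for i in range(len(sung_hz) - 1):
        sung = interval_cents(sung_hz[i], sung_hz[i + 1])
        target = 100.0 * (target_semitones[i + 1] - target_semitones[i])
        deviations.append(sung - target)
    return deviations

def mean_abs_deviation(deviations):
    return sum(abs(d) for d in deviations) / len(deviations)

# Assumed pattern: ascending and descending major triad, in semitones above the root.
triad = [0, 4, 7, 4, 0]
# Hypothetical singer using pure (just) intervals above a 220 Hz root.
sung = [220.0, 275.0, 330.0, 275.0, 220.0]

devs = interval_deviations(sung, triad)
accuracy = mean_abs_deviation(devs)
print([round(d, 2) for d in devs], round(accuracy, 2))
```

A smaller mean absolute deviation means the sung intervals lie closer to equal temperament; a perfectly equal-tempered rendition gives zero.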
