Similar literature
1.
Relying on a corpus of thirty narrative discourses, the roles of pitch and duration of prosodic words in sentence accent were studied in discourse context. First, the pitch was normalized. Then, according to pitch range, sentences and prosodic words were each classified into three ranks: strengthened, normal, and weakened. At the same time, sentence accent was classified into two levels, primary and secondary, by perceptual evaluation. The results showed that the pitch range of prosodic words relative to that of the sentence contributed dominantly to sentence accent. Furthermore, the roles of pitch and duration in sentence accent were affected interactively by the ranks of the sentence and the prosodic words. In normal prosodic words, primary sentence accents were realized jointly by pitch and duration, while secondary sentence accents depended mainly on variation in pitch. In strengthened prosodic words, the role of duration in sentence accent was more significant when the pitch range of the sentence was more compressed. Finally, it was found that the correlation between pitch and duration was influenced primarily by the strength of prosodic words: in weakened, normal, and strengthened prosodic words, the correlations between pitch and duration were positive, null, and negative, respectively.

2.
Pitch accent in spoken-word recognition in Japanese   (Cited by: 2; self-citations: 0; citations by others: 2)
Three experiments addressed the question of whether pitch-accent information may be exploited in the process of recognizing spoken words in Tokyo Japanese. In a two-choice classification task, listeners judged from which of two words, differing in accentual structure, isolated syllables had been extracted (e.g., ka from baka HL or gaka LH); most judgments were correct, and listeners' decisions were correlated with the fundamental frequency characteristics of the syllables. In a gating experiment, listeners heard initial fragments of words and guessed what the words were; their guesses overwhelmingly had the same initial accent structure as the gated word even when only the beginning CV of the stimulus (e.g., na- from nagasa HLL or nagashi LHH) was presented. In addition, listeners were more confident in guesses with the same initial accent structure as the stimulus than in guesses with different accent. In a lexical decision experiment, responses to spoken words (e.g., ame HL) were speeded by previous presentation of the same word (e.g., ame HL) but not by previous presentation of a word differing only in accent (e.g., ame LH). Together these findings provide strong evidence that accentual information constrains the activation and selection of candidates for spoken-word recognition.

3.
I. Introduction. Research on Chinese synthesis discloses that only when both the segmental and suprasegmental features of the synthetic speech are similar to those of the natural one will the synthetic speech sound intelligible and natural [1]. Among existing synthetic techniques, the approach based on acoustic parameters can adjust both the segmental and suprasegmental features of synthetic units flexibly and can be considered the most reasonable synthetic technique in theory. However, the parameter-based synthesizer is over-dependent on the developments of paramet…

4.
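Ni Chongjia, Liu Wenju, Xu Bo. 声学学报 (Acta Acustica), 2012, 37(5): 553-560
Although automatic accent labeling has been widely studied for both Chinese and English, comparisons between the two are rarely reported. Based on the Chinese prosodically annotated corpus ASCCD and the English prosodically annotated Boston University Radio News Corpus, this paper compares the similarities and differences of automatic accent labeling in Chinese and English and examines how well different features generalize across corpora of different languages. Through automatic accent labeling experiments based on ensembles of classification and regression trees, feature analysis, and a mutual-information-based acoustic comparison, the following conclusions are drawn: under the same conditions, the accuracy of automatic accent labeling is lower for Chinese than for English; in automatic accent labeling, lexical and syntactic features are more important than acoustic features; different acoustic information sources play different roles, and duration-related features are important for both Chinese and English; and most features provide higher mutual information in English than the corresponding features do in Chinese.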

5.
This study evaluated the reliability of pitch judgments as a basic step toward increasing interrater and intrarater reliability of multidimensional perceptual judgments of the speaking voice. Forty-five undergraduate university students studying speech/language pathology made piano-to-piano tone pitch matches and vowel-to-piano pitch matches using a computer software program. The mean percentage correct of piano-to-piano tone matches was 91.3% and of vowel-to-piano matches was 75.6%. Subjects who scored 100% correct were significantly faster at the pitch matching task. Further research of perceptual judgments of pitch and its contribution to multidimensional rating tasks is warranted.

6.
Three groups of nine 5-11-month-old infants provided evidence of discrimination of speechlike stimuli differing only in vowel duration. Ease of discrimination was directly related to the magnitude of the ratio of the longer to shorter vowel. Group one infants discriminated three vowel duration contrasts (with ratios of 0.33, 0.67, and 1.0) embedded in a synthetic [mad] syllable; group two discriminated these same duration contrasts within the bisyllable [ samad ], and group three in the trisyllable [ masamad ]. In all cases, the contrasting durations were carried by the last vowel of the synthetic word. These same three infant groups failed to provide evidence of discrimination of a final position released stop consonant contrast ([mat] versus [mad]) cued by voice excitation during closure of the [d] and not the [t]. These results suggest that vowel duration may be a primary cue for infants' perception of the voicing of final position stop consonants.

7.
Cochlear implants are largely unable to encode voice pitch information, which hampers the perception of some prosodic cues, such as intonation. This study investigated whether children with a cochlear implant in one ear were better able to detect differences in intonation when a hearing aid was added in the other ear ("bimodal fitting"). Fourteen children with normal hearing and 19 children with bimodal fitting participated in two experiments. The first experiment assessed the just noticeable difference in F0, by presenting listeners with a naturally produced bisyllabic utterance with an artificially manipulated pitch accent. The second experiment assessed the ability to distinguish between questions and affirmations in Dutch words, again by using artificial manipulation of F0. For the implanted group, performance significantly improved in each experiment when the hearing aid was added. However, even with a hearing aid, the implanted group required exaggerated F0 excursions to perceive a pitch accent and to identify a question. These exaggerated excursions are close to the maximum excursions typically used by Dutch speakers. Nevertheless, the results of this study showed that compared to the implant only condition, bimodal fitting improved the perception of intonation.

8.
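Prosodic realization and perception of focus in Uyghur   (Cited by: 1; self-citations: 0; citations by others: 1)
Through strictly controlled phonetic experiments, this study examined how focus modulates pitch and duration in Uyghur declarative sentences. Two target sentences were designed, and speakers were asked to emphasize the relevant word in each sentence naturally according to the context; the perception of focus was then examined as well. The results show that: (1) taking sentence-final focus as the baseline, the prosodic encoding of focus in Uyghur resembles the "tri-zone" adjustment pattern of Beijing Mandarin and English, with raised pitch and expanded pitch range on the focused word and a sharp pitch drop (narrowed pitch range) after the focus, while pre-focus pitch changes little; (2) both the focused word and the pre-focus words are lengthened, while post-focus words show no obvious change; (3) the average accuracy of focus perception reaches about 90%, indicating that the prosodic encoding of focus is an effective perceptual cue; (4) the perception experiment and intonation analysis also show that the intonation of "neutral focus" in Uyghur differs from that of English and Chinese, being closer to sentence-initial focus than to sentence-final focus. In addition, the paper specifically discusses the distribution and origin of the "sharp post-focus pitch drop" in the languages of China.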

9.
Vowel perception strategies were assessed for two "average" and one "star" single-channel 3M/House and three "average" and one "star" Nucleus 22-channel cochlear implant patients and six normal-hearing control subjects. All subjects were tested by computer with real and synthetic speech versions of [symbol: see text], presented randomly. Duration, fundamental frequency, and first, second, and third formant frequency cues to the vowels were systematically manipulated. Results showed high accuracy for the normal-hearing subjects in all conditions but that of the first formant alone. "Average" single-channel patients classified only real speech [hVd] syllables differently from synthetic steady state syllables. The "star" single-channel patient identified the vowels at much better than chance levels, with a results pattern suggesting effective use of first formant and duration information. Both "star" and "average" Nucleus users showed similar response patterns, performing better than chance in most conditions, and identifying the vowels using duration and some frequency information from all three formants.

10.
The present study explores the use of extrinsic context in perceptual normalization for the purpose of identifying lexical tones in Cantonese. In each of four experiments, listeners were presented with a target word embedded in a semantically neutral sentential context. The target word was produced with a mid level tone and it was never modified throughout the study, but on any given trial the fundamental frequency of part or all of the context sentence was raised or lowered to varying degrees. The effect of perceptual normalization of tone was quantified as the proportion of non-mid level responses given in F0-shifted contexts. Results showed that listeners' tonal judgments (i) were proportional to the degree of frequency shift, (ii) were not affected by non-pitch-related differences in talker, (iii) and were affected by the frequency of both the preceding and following context, although (iv) following context affected tonal decisions more strongly than did preceding context. These findings suggest that perceptual normalization of lexical tone may involve a "moving window" or "running average" type of mechanism, that selectively weights more recent pitch information over older information, but does not depend on the perception of a single voice.

11.
Pitch judgments for dichotic chords composed of two pure tones often show a bias in favor of the chord-component going to one ear. This "ear-advantage for pitch" (EAP) varies between subjects, but is very stable within subject and is thought to reflect differences in spectral "sensitivity" of the two auditory pathways. The present study explored this hypothesis by examining pitch judgments of complexes composed of tones dichotically paired with frequency-varying signals. The direction and strength of EAP was first established using pure tones, then rising tone glides were introduced into the channel going to the nondominant ear. Since a glide possesses less energy at a given frequency than a pure tone of equal duration, an increase in EAP was expected. An increase in EAP was consistently observed for only one subject; three subjects showed small, variable effects; and six subjects displayed a decrease in EAP. The results suggested that factors other than relative spectral sensitivity affect observed EAP.

12.
We derive model-independent, "naturalness" upper bounds on the magnetic moments μ_ν of Dirac neutrinos generated by physics above the scale of electroweak symmetry breaking. In the absence of fine-tuning of effective operator coefficients, we find that current information on neutrino mass implies that [EQUATION: SEE TEXT] Bohr magnetons. This bound is several orders of magnitude stronger than those obtained from analyses of solar and reactor neutrino data and astrophysical observations.

13.
In a series of sodium aluminoborate glasses, we have applied triple-quantum magic-angle spinning (3QMAS) 17O NMR to obtain high-resolution information about the connections among various network structural units, to explore the mixing of aluminum and boron species. Oxygen-17 3QMAS spectra reveal changes in connectivities between AlO4 ([4]Al), AlO5 and AlO6 ([5,6]Al), BO3 ([3]B) and BO4 ([4]B) units, by quantifying populations of bridging oxygens such as Al-O-Al, Al-O-B and B-O-B and of non-bridging oxygens. Several linkages such as [4]Al-O-[4]Al and three-coordinated oxygen associated with [5,6]Al in Al-O-Al, [4]Al-O-[4]B, [4]Al-O-[3]B and [5,6]Al-O-[3]B in Al-O-B as well as [4]B-O-[3]B and [3]B-O-[3]B in B-O-B can be distinguished for the first time. The fractions of these linkages were calculated from models of random mixing and of mixing with maximum avoidance of tetrahedral-tetrahedral linkages. The results suggest that the structure of all of the glasses in this study is well approximated by the latter model. However, the energetic "penalty" for formation of [4]Al-O-[4]B may be somewhat less than for [4]Al-O-[4]Al and [4]B-O-[4]B. In general, the new results presented here are similar to those obtained on glasses in this system by 27Al{11B} REDOR NMR (J. Phys. Chem. B 104 (2000) 6541), but provide considerably more detail on network connectivity and ordering schemes.

14.
Acoustic cues related to the voice source, including harmonic structure and spectral tilt, were examined for relevance to prosodic boundary detection. The measurements considered here comprise five categories: duration, pitch, harmonic structure, spectral tilt, and amplitude. Distributions of the measurements and statistical analysis show that the measurements may be used to differentiate between prosodic categories. Detection experiments on the Boston University Radio Speech Corpus show equal error detection rates around 70% for accent and boundary detection, using only the acoustic measurements described, without any lexical or syntactic information. Further investigation of the detection results shows that duration and amplitude measurements, and, to a lesser degree, pitch measurements, are useful for detecting accents, while all voice source measurements except pitch measurements are useful for boundary detection.

15.
Listeners detected interaural differences of time in trains of high-frequency clicks. The manipulated variables were the number of clicks in the train and the period between clicks. Thresholds were compared to an optimal integrator, where the binaural information accrued from each click in the stimulus train is equivalent. In agreement with data reported in the past, integration is optimal only when the period between clicks exceeds approximately 10 ms and when the duration of the entire stimulus train is less than about 250 ms. The first constraint represents a limitation due to a form of "binaural adaptation" and the second is due to a limited "integration period."
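The comparison with an optimal integrator can be illustrated with a small sketch. It assumes the common reading of "optimal integration" in which each click contributes equal, independent information, so that the threshold falls with the square root of the number of clicks; the abstract does not state this formula. The function names, the microsecond units, and the approximation of train duration as the number of clicks times the period are all illustrative assumptions, and the 10 ms and 250 ms limits are the approximate values quoted in the abstract.

```python
import math


def optimal_threshold(single_click_threshold_us: float, n_clicks: int) -> float:
    """Predicted threshold for an n-click train under equal, independent
    information per click (assumed sqrt(n) improvement)."""
    if n_clicks < 1:
        raise ValueError("n_clicks must be at least 1")
    return single_click_threshold_us / math.sqrt(n_clicks)


def within_optimal_regime(n_clicks: int, period_ms: float) -> bool:
    """Check the two approximate limits quoted in the abstract.

    Train duration is approximated here as n_clicks * period_ms,
    which is an assumption of this sketch.
    """
    train_duration_ms = n_clicks * period_ms
    return period_ms > 10.0 and train_duration_ms < 250.0


if __name__ == "__main__":
    # Hypothetical single-click threshold of 100 microseconds.
    for n in (1, 4, 16):
        print(n, optimal_threshold(100.0, n))
    print(within_optimal_regime(8, 20.0))  # 160 ms train, 20 ms period
    print(within_optimal_regime(8, 5.0))   # period below the ~10 ms limit
```

Under these assumptions, measured thresholds that sit above the predicted curve indicate less-than-optimal integration, which is what the abstract reports for short click periods and long trains.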

16.
In the past 10 years, a Chinese text-to-speech system including a phonetic library, a static tone model, and basic synthesis rules has been established at IAAS. Chinese synthesis with unrestricted vocabulary has been achieved, but further steps must be taken to improve the naturalness of synthesized Chinese. The effects of segmental and suprasegmental features of synthetic speech on naturalness have been studied by means of subjective assessment. The results show that rhythm in the time domain and coarticulation are fundamental to improving the naturalness of synthetic speech, and that the fundamental frequency curve determined by the tone model alone is suitable only for synthesizing short Chinese sentences. If synthesis of linguistic units larger than a simple sentence is considered, the fundamental frequency curve should be carefully manipulated. This paper presents the experimental method and results, and discusses how to improve the naturalness of synthetic Chinese.

17.
Can native listeners rapidly adapt to suprasegmental mispronunciations in foreign-accented speech? To address this question, an exposure-test paradigm was used to test whether Dutch listeners can improve their understanding of non-canonical lexical stress in Hungarian-accented Dutch. During exposure, one group of listeners heard a Dutch story with only initially stressed words, whereas another group also heard 28 words with canonical second-syllable stress (e.g., EEKhorn, "squirrel" was replaced by koNIJN "rabbit"; capitals indicate stress). The 28 words, however, were non-canonically marked by the Hungarian speaker with high pitch and amplitude on the initial syllable, both of which are stress cues in Dutch. After exposure, listeners' eye movements were tracked to Dutch target-competitor pairs with segmental overlap but different stress patterns, while they listened to new words from the same Hungarian speaker (e.g., HERsens, herSTEL, "brain," "recovery"). Listeners who had previously heard non-canonically produced words distinguished target-competitor pairs better than listeners who had only been exposed to Hungarian accent with canonical forms of lexical stress. Even a short exposure thus allows listeners to tune into speaker-specific realizations of words' suprasegmental make-up, and use this information for word recognition.

18.
The present study had two main purposes. One was to examine if listeners perceive gradually increasing durations of a voiceless fricative categorically ("fluent" versus "stuttered") or continuously (gradient perception from fluent to stuttered). The second purpose was to investigate whether there are gender differences in how listeners perceive various durations of sounds as "prolongations." Forty-four listeners were instructed to rate the duration of the /ʃ/ in the word "shape" produced by a normally fluent speaker. The target word was embedded in the middle of an experimental phrase and the initial /ʃ/ sound was digitally manipulated to create a range of fluent to stuttered sounds. This was accomplished by creating 20 ms stepwise increments for sounds ranging from 120 to 500 ms in duration. Listeners were instructed to give a rating of 1 for a fluent word and a rating of 100 for a stuttered word. The results showed listeners perceived the range of sounds continuously. Also, there was a significant gender difference in that males rated fluent sounds higher than females but female listeners rated stuttered sounds higher than males. The implications of these results are discussed.
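As a quick check on the stimulus design described above, the duration continuum can be enumerated directly. This is only an arithmetic sketch: the function name and the assumption that both endpoints (120 ms and 500 ms) are included are illustrative and not stated in the abstract.

```python
def stimulus_durations(start_ms: int = 120, stop_ms: int = 500, step_ms: int = 20) -> list[int]:
    """Fricative durations in ms, from start_ms to stop_ms inclusive,
    in steps of step_ms (inclusive endpoints are an assumption)."""
    return list(range(start_ms, stop_ms + 1, step_ms))


if __name__ == "__main__":
    durations = stimulus_durations()
    print(len(durations), durations[0], durations[-1])
```

With inclusive endpoints, the continuum would contain 20 stimuli.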

19.
Stops in Swiss German contrast only in quantity in all word positions; aspiration and voicing play no role. As in most languages with consonant quantity contrast, geminate stops are produced with significantly longer closure duration (CD) than singletons in an intersonorant context. This holds word medially as well as phrase medially, e.g., [oni tto:s] "without roar" versus [oni to:s] "without can." Since the stops are voiceless, no CD cue distinguishes geminates from singletons phrase initially. Nevertheless, do speakers utilize articulatory means to maintain the contrast? By using electropalatography, the articulatory and acoustic properties of word-initial alveolar stops were investigated in phrase-initial and phrase-medial contexts. The results are threefold. First, as expected, CD and contact duration of the articulators mirror each other within a phrase: Geminates are longer than singletons. Second, phrase initially, the contact data unequivocally establish a quantity distinction. This means that, even without acoustic CD cues for perception, geminates are articulated with substantially longer oral closure than singletons. Third, stops are longer in phrase-initial than phrase-medial position, indicating articulatory strengthening. Nevertheless, the difference between geminates and singletons phrase initially is proportionately less than in phrase-medial position.

20.
How are listeners able to identify whether the pitch of a brief isolated sample of an unknown voice is high or low in the overall pitch range of that speaker? Does the speaker's voice quality convey crucial information about pitch level? Results and statistical models of two experiments that provide answers to these questions are presented. First, listeners rated the pitch levels of vowels taken over the full pitch ranges of male and female speakers. The absolute f0 of the samples was by far the most important determinant of listeners' ratings, but with some effect of the sex of the speaker. Acoustic measures of voice quality had only a very small effect on these ratings. This result suggests that listeners have expectations about f0s for average speakers of each sex, and judge voice samples against such expectations. Second, listeners judged speaker sex for the same speech samples. Again, absolute f0 was the most important determinant of listeners' judgments, but now voice quality measures also played a role. Thus it seems that pitch level judgments depend on voice quality mostly indirectly, through its information about sex. Absolute f0 is the most important information for deciding both pitch level and speaker sex.
