Similar literature
 Found 20 similar articles (search time: 46 ms)
1.
Intonation perception of English speech was examined for English- and Chinese-native listeners. The F0 contour was manipulated from falling to rising patterns for the final words of three sentences. The listeners' task was to identify and discriminate the intonation of each sentence (question versus statement). English and Chinese listeners differed significantly in their identification functions, specifically in the categorical boundary and the slope. In the discrimination functions, Chinese listeners showed greater peakedness than their English peers. The cross-linguistic differences in intonation perception were similar to previous findings on the perception of lexical tones, likely because of differences in the listeners' language backgrounds.

2.
There is a tendency across languages to use a rising pitch contour to convey question intonation and a falling pitch contour to convey a statement. In a lexical tone language such as Mandarin Chinese, rising and falling pitch contours are also used to differentiate lexical meaning. How, then, does the multiplexing of the F0 channel affect the perception of question and statement intonation in a lexical tone language? This study investigated the effects of lexical tones and focus on the perception of intonation in Mandarin Chinese. The results show that lexical tones and focus impact the perception of sentence intonation. Question intonation was easier for native speakers to identify on a sentence with a final falling tone and more difficult to identify on a sentence with a final rising tone, suggesting that tone identification intervenes in the mapping of F0 contours to intonational categories and that tone and intonation interact at the phonological level. In contrast, there is no evidence that the interaction between focus and intonation goes beyond the psychoacoustic level. The results provide insights that will be useful for further research on tone and intonation interactions in both acoustic modeling studies and neurobiological studies.

3.
Intonation stylization is studied using "chironomy," i.e., the analogy between hand gestures and prosodic movements. An intonation mimicking paradigm is used. The task of the ten subjects is to copy the intonation pattern of sentences with the help of a stylus on a graphic tablet, using a system for real-time manual intonation modification. Gestural imitation is compared to vocal imitation of the same sentences (seven for a male speaker, seven for a female speaker). Distance measures between gestural copies, vocal imitations, and original sentences are computed for performance assessment. Perceptual testing is also used for assessing the quality of gestural copies. The perceptual difference between natural and stylized contours is measured using a mean opinion score paradigm for 15 subjects. The results indicate that intonation contours can be stylized accurately by chironomic imitation. The results of vocal imitation and chironomic imitation are comparable, but subjects show better imitation results in vocal imitation. The best stylized contours using chironomy seem perceptually indistinguishable or almost indistinguishable from natural contours, particularly for female speech. This indicates that chironomic stylization is effective, and that hand movements can be analogous to intonation movements.

4.
5.
Four experiments were performed to evaluate a new wearable vibrotactile speech perception aid that extracts fundamental frequency (F0) and displays the extracted F0 as a single-channel temporal or an eight-channel spatio-temporal stimulus. Specifically, we investigated the perception of intonation (i.e., question versus statement) and emphatic stress (i.e., stress on the first, second, or third word) under Visual-Alone (VA), Visual-Tactile (VT), and Tactile-Alone (TA) conditions and compared performance using the temporal and spatio-temporal vibrotactile display. Subjects were adults with normal hearing in experiments I-III and adults with severe to profound hearing impairments in experiment IV. Both versions of the vibrotactile speech perception aid successfully conveyed intonation. Vibrotactile stress information was also successfully conveyed, but it did not enhance performance in VT conditions beyond performance in VA conditions. In experiment III, which involved only intonation identification, a reliable advantage for the spatio-temporal display was obtained. Differences between subject groups were obtained for intonation identification, with more accurate VT performance by those with normal hearing. Possible effects of long-term hearing status are discussed.

6.
7.
Two experiments were conducted to explore the effectiveness of a single vibrotactile stimulator to convey intonation (question versus statement) and contrastive stress (on one of the first three words of four 4- or 5-word sentences). In experiment I, artificially deafened normal-hearing subjects judged stress and intonation in counterbalanced visual-alone and visual-tactile conditions. Six voice fundamental frequency-to-tactile transformations were tested. Two sentence types were voiced throughout, and two contained unvoiced consonants. Benefits to speechreading were significant, but small. No differences among transformations were observed. In experiment II, only the tactile stimuli were presented. Significant differences emerged among the transformations, with larger differences for intonation than for stress judgments. Surprisingly, tactile-alone intonation identification was more accurate than visual-tactile for several transformations.

8.
In the experiments reported here, perceived speaker identity was controlled by manipulating the fundamental frequency (F0) range of carrier phrases in which speech tokens were embedded. In the first experiment, words from two "hood"-"hud" continua were synthesized with different F0. The words were then embedded in synthetic carrier phrases with intonation contours which reduced perceived speaker identity differences for test items with different F0. The results indicated that when perceived speaker identity differences were reduced, the effect of F0 on vowel identification was also reduced. Experiment 2 indicated that when items presented in carrier phrases are matched for speaker identity and F0 with items in isolation, there is no effect for presentation in a carrier phrase. Experiment 3 involved the presentation of vowels from the "hood"-"hud" continuum in two different intonational contexts which were judged to have been produced by different speakers, even though the F0 of the test word was identical in the two contexts. There was a shift in identification as a result of the intonational context which was interpreted as evidence for the role of perceived identity in vowel normalization. Overall, the experiments suggest that perceived speaker identity is a better predictor of vowel normalization effects than is intrinsic F0. This indicates that the role of F0 in vowel normalization is mediated through perceived speaker identity.

9.
Categorical perception was investigated in a series of experiments on the perception of melodic musical intervals (sequential frequency ratios). When procedures equivalent to those typically used in speech-perception experiments were employed (i.e., determination of identification and discrimination functions for stimuli separated by equal physical increments), musical intervals were perceived categorically by trained musicians. When a variable-step-size (adaptive) discrimination procedure was used, evidence of categorical perception (in the form of smaller interval-width DL's for ratios at identification category boundaries than for ratios within categories), although present initially, largely disappeared after subjects had reached asymptotic performance. However, equal-step-size discrimination functions obtained after observers had reached asymptotic performance in the adaptive paradigm were not substantially different from those initially obtained. The results of other experiments imply that this dependence of categorical perception on procedure may be related to differences in stimulus uncertainty between the procedures. An experiment on the perception of melodic intervals by musically untrained observers showed no evidence for the existence of "natural" categories for musical intervals.

10.
This paper addresses a classical but important problem: the coupling of lexical tones and sentence intonation in tonal languages, such as Chinese, focusing particularly on voice fundamental frequency (F0) contours of speech. It is important because it forms the basis of speech synthesis technology and prosody analysis. We provide a solution to the problem with a constrained tone transformation technique based on structural modeling of the F0 contours. This consists of transforming target values in pairs from norms to variants. These targets are intended to sparsely specify the prosodic contributions to the F0 contours, while the alignment of target pairs between norms and variants is based on underlying lexical tone structures. When the norms take the citation forms of lexical tones, the technique makes it possible to separate sentence intonation from observed F0 contours. When the norms take normative F0 contours, it is possible to measure intonation variations from the norms to the variants, both having identical lexical tone structures. This paper explains the underlying scientific and linguistic principles and presents an algorithm that was implemented on computers. The method's capability of separating and combining tone and intonation is evaluated through analysis and re-synthesis of several hundred observed F0 contours.

11.
An acoustic analysis of a German read-speech corpus showed that utterance-final /t/ aspirations differ systematically depending on the accompanying nuclear accent contour. Two contours were included: terminal-falling early and late F0 peaks in terms of the Kiel Intonation Model. They correspond to H+L*L-% and L*+HL-% within the autosegmental metrical (AM) model. Aspirations in early-peak contexts were characterized by (a) "short", (b) "high-intensity" noise with (c) "low" frequency values for the spectral energy maximum above the lower spectral energy boundary. The opposite holds for aspirations accompanying late-peak productions. Starting from the acoustic analysis, a perception experiment was performed using a variant of the semantic differential paradigm. The stimuli were varied in the duration and intensity pattern as well as the spectral energy pattern of the final /t/ aspiration. Results revealed that the different noise patterns found in connection with early and late peak productions were able to change the attitudinal meaning of the stimuli toward the meaning profile of the respective F0 peak category. This suggests that final aspirations can be part of the coding of meanings that have so far been associated solely with intonation contours. Hence, the traditionally separated segmental and suprasegmental coding levels seem to be more intertwined than previously thought.

12.
The ability of five profoundly hearing-impaired subjects to "track" connected speech and to make judgments about the intonation and stress in spoken sentences was evaluated under a variety of auditory-visual conditions. These included speechreading alone, speechreading plus speech (low-pass filtered at 4 kHz), and speechreading plus a tone whose frequency, intensity, and temporal characteristics were matched to the speaker's fundamental frequency (F0). In addition, several frequency transfer functions were applied to the normal F0 range, resulting in new ranges that were both transposed and expanded with respect to the original F0 range. Three of the five subjects were able to use several of the tonal representations of F0 nearly as well as speech to improve their speechreading rates and to make appropriate judgments concerning sentence intonation and stress. The remaining two subjects greatly improved their identification performance for intonation and stress patterns when expanded F0 signals were presented alone (i.e., without speechreading), but had difficulty integrating visual and auditory information at the connected discourse level, despite intensive training in the connected discourse tracking procedure lasting from 27.8 to 33.8 h.

13.
The corruption of intonation contours has detrimental effects on sentence-based speech recognition in normal-hearing listeners [Binns and Culling (2007). J. Acoust. Soc. Am. 122, 1765-1776]. This paper examines whether this finding also applies to cochlear implant (CI) recipients. The subjects' F0-discrimination and speech perception in the presence of noise were measured, using sentences with regular and inverted F0-contours. The results revealed that speech recognition for regular contours was significantly better than for inverted contours. This difference was related to the subjects' F0-discrimination, providing further evidence that the perception of intonation patterns is important for CI-mediated speech recognition in noise.

14.
Previous studies have reported that rise time of sawtooth waveforms may be discriminated in either a categorical-like manner under some experimental conditions or according to Weber's law under other conditions. In the present experiments, rise time discrimination was examined with two experimental procedures: the traditional labeling and ABX tasks used in speech perception studies and an adaptive tracking procedure used in psychophysical studies. Rise time varied from 0 to 80 ms in 10-ms intervals for sawtooth signals of 1-s duration. Discrimination functions for subjects who simply discriminated the signals on any basis whatsoever as well as functions for subjects who practiced labeling the endpoint stimuli as "pluck" and "bow" before ABX discrimination were not categorical in the ABX task. In the adaptive tracking procedure, the Weber fraction obtained from the jnds of rise time was found to be a constant above 20-ms rise time. The results from the two discrimination paradigms were then compared by predicting a jnd for rise time from the ABX discrimination data by reference to the underlying psychometric function. Using this method of analysis, discrimination results from previous studies were shown to be quite similar to the discrimination results observed in this study. Taken together, the results demonstrate clearly that rise time discrimination of sawtooth signals follows predictions derived from Weber's law.

15.
Spectral-ripple discrimination has been used widely for psychoacoustical studies in normal-hearing, hearing-impaired, and cochlear implant listeners. The present study investigated the perceptual mechanism for spectral-ripple discrimination in cochlear implant listeners. The main goal of this study was to determine whether cochlear implant listeners use a local intensity cue or global spectral shape for spectral-ripple discrimination. The effect of electrode separation on spectral-ripple discrimination was also evaluated. Results showed that it is highly unlikely that cochlear implant listeners depend on a local intensity cue for spectral-ripple discrimination. A phenomenological model of spectral-ripple discrimination, as an "ideal observer," showed that a perceptual mechanism based on discrimination of a single intensity difference cannot account for performance of cochlear implant listeners. Spectral modulation depth and electrode separation were found to significantly affect spectral-ripple discrimination. The evidence supports the hypothesis that spectral-ripple discrimination involves integrating information from multiple channels.

16.
Thresholds for formant frequency discrimination have been established using optimal listening conditions. In normal conversation, the ability to discriminate formant frequency is probably substantially degraded. The purpose of the present study was to change the listening procedures in several substantial ways from optimal towards more ordinary listening conditions, including a higher level of stimulus uncertainty, increased levels of phonetic context, and the addition of a sentence identification task. Four vowels synthesized from a female talker were presented in isolation, or in the phonetic context of /bVd/ syllables, three-word phrases, or nine-word sentences. In the first experiment, formant resolution was estimated under medium stimulus uncertainty for three levels of phonetic context. Some undesirable training effects were obtained and led to the design of a new protocol for the second experiment to reduce this problem and to manipulate both length of phonetic context and level of difficulty in the simultaneous sentence identification task. Similar results were obtained in both experiments. The effect of phonetic context on formant discrimination is reduced as context lengthens, such that no difference was found between vowels embedded in the phrase or sentence contexts. The addition of a challenging sentence identification task to the discrimination task did not degrade performance further, and a stable pattern for formant discrimination in sentences emerged. This norm for the resolution of vowel formants under these more ordinary listening conditions was shown to be nearly a constant at 0.28 barks. Analysis of vowel spaces from 16 American English talkers determined that the closest vowels, on average, were 0.56 barks apart, that is, a factor of 2 larger than the norm obtained in these vowel formant discrimination tasks.

17.
The questions of the identification of complex biological systems (complexity) as special self-organizing systems or systems of the third type first defined by W. Weaver in 1948 continue to be of interest. No reports on the evaluation of entropy for systems of the third type were found among the publications currently available to the authors. The present study addresses the parameters of muscle biopotentials recorded using surface interference electromyography and presents the results of calculation of the Shannon entropy, autocorrelation functions, and statistical distribution functions for electromyograms of subjects in different physiological states (rest and tension of muscles). The results do not allow for statistically reliable discrimination between the functional states of muscles. However, the data obtained by calculating electromyogram quasiattractor parameters and matrices of paired comparisons of electromyogram samples (calculation of the number k of “coinciding” pairs among the electromyogram samples) provide an integral characteristic that allows the identification of substantial differences between the state of rest and the different states of functional activity. Modifications and implementation of new methods in combination with the novel methods of the theory of chaos and self-organization are obviously essential. The stochastic approach paradigm is not applicable to systems of the third type due to continuous and chaotic changes of the parameters of the state vector x(t) of an organism or the contrasting constancy of these parameters (in the case of entropy).
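
As a rough illustration of the Shannon entropy calculation mentioned in this abstract, the sketch below bins a sampled signal into equal-width amplitude bins and computes the entropy of the resulting histogram. The function name `shannon_entropy`, the binning scheme, and the sample values are assumptions made for illustration; the abstract does not say how the authors estimated the distribution.

```python
import math
from collections import Counter


def shannon_entropy(samples, n_bins, lo, hi):
    """Shannon entropy (in bits) of samples binned into n_bins equal-width bins.

    Hypothetical helper: assumes every sample lies in the closed range [lo, hi].
    """
    width = (hi - lo) / n_bins
    counts = Counter()
    for x in samples:
        # Clamp the index so that a sample equal to hi lands in the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    total = len(samples)
    # H = -sum(p * log2(p)) over the occupied bins only.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Made-up amplitudes, one per bin: the histogram is uniform over 4 bins.
uniform_like = [0.1, 0.3, 0.6, 0.9]
entropy_bits = shannon_entropy(uniform_like, n_bins=4, lo=0.0, hi=1.0)
```

With one sample in each of four bins the histogram is uniform, so the entropy equals log2(4) = 2 bits; a signal confined to a single bin gives 0 bits.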

18.
In tonal languages, there are potential conflicts between F0-based changes arising from the coexistence of intonation and lexical tones. In the present study, the interaction of tone and intonation in Cantonese was examined using acoustic and perceptual analyses. The acoustic patterns of tones at the initial, medial, and final positions of questions and statements were measured. Results showed that intonation affects both the F0 level and contour, while the duration of the six tones varied as a function of position within intonation contexts. All six tones at the final position of questions showed a rising F0 contour, regardless of their canonical form. Listeners were overall more accurate in the identification of tones presented within the original carrier than of the same tones in isolation. However, a large proportion of tones 33, 21, 23, and 22 at the final position of questions were misperceived as tone 25 both within the original carrier and as isolated words. These results suggest that although the intonation context provided cues for correct tone identification, the intonation-induced changes in F0 contour cannot always be perceptually compensated for, resulting in some erroneous perception of the identity of Cantonese tones.

19.
In tone languages there are potential conflicts in the perception of lexical tone and intonation, as both depend mainly on the differences in fundamental frequency (F0) patterns. The present study investigated the acoustic cues associated with the perception of sentences as questions or statements in Cantonese, as a function of the lexical tone in sentence final position. Cantonese listeners performed intonation identification tasks involving complete sentences, isolated final syllables, and sentences without the final syllable (carriers). Sensitivity (d' scores) was similar for complete sentences and final syllables but was significantly lower for carriers. Sensitivity was also affected by tone identity. These findings show that the perception of questions and statements relies primarily on the F0 characteristics of the final syllables (local F0 cues). A measure of response bias (c) provided evidence for a general bias toward the perception of statements. Logistic regression analyses showed that utterances were accurately classified as questions or statements by using average F0 and F0 interval. Average F0 of carriers (global F0 cue) was also found to be a reliable secondary cue. These findings suggest that the use of F0 cues for the perception of intonation question in tonal languages is likely to be language-specific.
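
The sensitivity (d') and response bias (c) measures named in this abstract are standard signal detection quantities. The sketch below shows the usual equal-variance Gaussian formulas, d' = z(H) - z(F) and c = -(z(H) + z(F)) / 2. Treating a "question" response to a question as a hit and a "question" response to a statement as a false alarm is an assumption made here, and the rates are made up rather than taken from the study.

```python
from statistics import NormalDist


def dprime_and_criterion(hit_rate, false_alarm_rate):
    """Return (d', c) under the equal-variance Gaussian signal detection model.

    Hypothetical helper: both rates must lie strictly between 0 and 1,
    otherwise the inverse normal CDF is undefined.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa             # sensitivity
    criterion = -0.5 * (z_hit + z_fa)  # response bias
    return d_prime, criterion


# Made-up rates: "question" answered to 70% of questions and 10% of statements.
d_prime, criterion = dprime_and_criterion(0.70, 0.10)
```

Under the labelling assumed here, a positive c means the listener is reluctant to answer "question", which corresponds to a bias toward statements.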

20.
Psychometric functions for pulsed pure-tone frequency discrimination were obtained from hearing-impaired listeners at frequencies with normal hearing and at frequencies with mild or moderate hearing losses. The general form of psychometric functions at hearing-impaired frequencies was found to be the same as at normal-hearing frequencies, i.e., d' was linear with the frequency difference between tones, in Hz. For all but one psychometric function, the addition of an intercept term to the fitting equation did not account for significantly more variance than did the slope term alone. Therefore, it was concluded that psychometric functions for frequency discrimination can be adequately described with only one parameter: the slope of the psychometric function. Deficits in discrimination at hearing-loss frequencies were manifested by more gradual slopes of psychometric functions. Procedures for normalizing psychometric functions are presented, which facilitate comparisons of normal and impaired frequency discrimination data across studies and frequencies. Comparisons of dlf's (difference limen for frequency) obtained with adaptive and fixed procedures show a bias toward larger dlf's with adaptive procedures, but only at higher frequencies. A discussion of equal-interval and equal-ratio adaptive stepping rules indicates that an equal-ratio rule may be preferable.
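
The one-parameter description in this abstract (d' proportional to the frequency difference, with no intercept) can be sketched as a least-squares fit of a line through the origin. The function names, the data points, and the choice of d' = 1 as the threshold criterion are assumptions for illustration, not values or procedures from the study.

```python
def fit_slope_through_origin(delta_f_hz, d_primes):
    """Least-squares slope for the model d' = slope * delta_f (no intercept).

    Hypothetical helper: slope = sum(x * y) / sum(x * x).
    """
    numerator = sum(x * y for x, y in zip(delta_f_hz, d_primes))
    denominator = sum(x * x for x in delta_f_hz)
    return numerator / denominator


def dlf_at_criterion(slope, d_criterion=1.0):
    """Frequency difference (Hz) at which the fitted line reaches d_criterion."""
    return d_criterion / slope


# Made-up data: frequency differences in Hz and the d' measured at each.
delta_f = [1.0, 2.0, 4.0]
d_vals = [0.5, 1.0, 2.0]
slope = fit_slope_through_origin(delta_f, d_vals)  # d' per Hz
dlf = dlf_at_criterion(slope)                      # Hz at d' = 1
```

A shallower slope gives a larger difference limen under this model, which matches the abstract's remark that discrimination deficits appear as more gradual slopes.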


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号