Similar literature
 20 similar documents found; search took 15 ms
1.
Summary: Both bottom-up sensory information and top-down influences contribute to perception. We studied the perceptual alternations of a multistable ambiguous pattern. We observed that the alternation of percepts can be influenced by subliminal visual stimuli that either oppose or support the previous percept. We also investigated the effect of a top-down volitional factor on perceptual alternation. By combining this top-down factor with bottom-up stimulation, we ascertained that the two factors interact non-linearly. Paper presented at the International Workshop "Fluctuations in Physics and Biology: Stochastic Resonance, Signal Processing and Related Phenomena", Elba, 5–10 June 1994.

2.
The present study explores the use of extrinsic context in perceptual normalization for the purpose of identifying lexical tones in Cantonese. In each of four experiments, listeners were presented with a target word embedded in a semantically neutral sentential context. The target word was produced with a mid level tone and it was never modified throughout the study, but on any given trial the fundamental frequency of part or all of the context sentence was raised or lowered to varying degrees. The effect of perceptual normalization of tone was quantified as the proportion of non-mid level responses given in F0-shifted contexts. Results showed that listeners' tonal judgments (i) were proportional to the degree of frequency shift, (ii) were not affected by non-pitch-related differences in talker, and (iii) were affected by the frequency of both the preceding and following context, although (iv) following context affected tonal decisions more strongly than did preceding context. These findings suggest that perceptual normalization of lexical tone may involve a "moving window" or "running average" type of mechanism that selectively weights more recent pitch information over older information, but does not depend on the perception of a single voice.
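
The "running average" idea in this abstract can be illustrated with a small sketch. It is not the authors' model: the exponential weighting, the `decay` parameter, and both function names are assumptions made here purely for illustration, and the sketch only handles a single ordered sequence of context F0 values.

```python
def running_reference_f0(context_f0, decay=0.5):
    """Recency-weighted average of context F0 values (Hz).

    The value k steps before the end of the list gets weight decay**k,
    so more recent pitch information counts more. `decay` is a
    hypothetical parameter, not a value taken from the study.
    """
    if not context_f0:
        raise ValueError("context_f0 must not be empty")
    n = len(context_f0)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    weighted_sum = sum(w * f for w, f in zip(weights, context_f0))
    return weighted_sum / sum(weights)


def relative_height(target_f0, context_f0, decay=0.5):
    """Target F0 divided by the recency-weighted context reference."""
    return target_f0 / running_reference_f0(context_f0, decay)
```

Under these assumptions, raising the context F0 lowers `relative_height` for an unchanged target, which matches the direction of the contrastive shift the abstract describes.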

3.
In VCV nonsense forms (such as /ɛdɛ/), while both the CV transition and the VC transition are perceptible in isolation, the CV transition dominates identification of the stop consonant. Thus, the question arises, what role, if any, do VC transitions play in word perception? Stimuli were two-syllable English words in which the medial consonant was either a stop or a fricative (e.g., "feeding" and "gravy"). Each word was constructed in three ways: (1) the VC transition was incompatible with the CV in either place, manner of articulation, or both; (2) the VC transition was eliminated and the steady-state portion of the first vowel was substituted in its place; and (3) the original word. All versions of a particular word were identical with respect to duration, pitch contour, and amplitude envelope. While an intelligibility test revealed no differences among the three conditions, data from a paired comparison preference task and an unspeeded lexical decision task indicated that incompatible VC transitions hindered word perception, but lack of VC transitions did not. However, there were clear differences among the three conditions in the speeded lexical decision task for word stimuli, but not for nonword stimuli that were constructed in an analogous fashion. We discuss the use of lexical tasks for speech quality assessment and possible processes by which listeners recognize spoken words.

4.
It is hypothesized that in sine-wave replicas of natural speech, lexical tone recognition would be severely impaired due to the loss of F0 information, but the linguistic information at the sentence level could be retrieved even with limited tone information. Forty-one native Mandarin-Chinese-speaking listeners participated in the experiments. Results showed that sine-wave tone-recognition performance was on average only 32.7% correct. However, sine-wave sentence-recognition performance was very accurate, approximately 92% correct on average. Therefore the functional load of lexical tones on sentence recognition is limited, and the high-level recognition of sine-wave sentences is likely attributed to the perceptual organization that is influenced by top-down processes.

5.
Much research has explored how spoken word recognition is influenced by the architecture and dynamics of the mental lexicon (e.g., Luce and Pisoni, 1998; McClelland and Elman, 1986). A more recent question is whether the processes underlying word recognition are unique to the auditory domain, or whether visually perceived (lipread) speech may also be sensitive to the structure of the mental lexicon (Auer, 2002; Mattys, Bernstein, and Auer, 2002). The current research was designed to test the hypothesis that both aurally and visually perceived spoken words are isolated in the mental lexicon as a function of their modality-specific perceptual similarity to other words. Lexical competition (the extent to which perceptually similar words influence recognition of a stimulus word) was quantified using metrics that are well-established in the literature, as well as a statistical method for calculating perceptual confusability based on the phi-square statistic. Both auditory and visual spoken word recognition were influenced by modality-specific lexical competition as well as stimulus word frequency. These findings extend the scope of activation-competition models of spoken word recognition and reinforce the hypothesis (Auer, 2002; Mattys et al., 2002) that perceptual and cognitive properties underlying spoken word recognition are not specific to the auditory domain. In addition, the results support the use of the phi-square statistic as a better predictor of lexical competition than metrics currently used in models of spoken word recognition.
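
As a rough illustration of a phi-square confusability measure, the sketch below computes the standard phi-square (chi-square divided by the total count) for two rows of a confusion matrix. The abstract does not say exactly how the statistic was applied or transformed, so the function name `phi_square` and the choice to compare two response-count vectors are assumptions.

```python
def phi_square(responses_a, responses_b):
    """Phi-square (chi-square / N) for two response-count vectors.

    Each vector holds how often each response category was given to one
    stimulus. A value of 0 means the two response distributions are
    identical; 1 means the two stimuli never drew the same response.
    """
    if len(responses_a) != len(responses_b):
        raise ValueError("vectors must have the same number of categories")
    total_a, total_b = sum(responses_a), sum(responses_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each stimulus needs at least one response")
    n = total_a + total_b
    chi2 = 0.0
    for a, b in zip(responses_a, responses_b):
        column_total = a + b
        if column_total == 0:
            continue  # category never used for either stimulus
        for observed, row_total in ((a, total_a), (b, total_b)):
            expected = row_total * column_total / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2 / n
```

A low value for a pair of words would then mark them as perceptually confusable, which is the sense in which such a measure could feed a lexical competition metric.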

6.
Previous research on foreign accent perception has largely focused on speaker-dependent factors such as age of learning and length of residence. Factors that are independent of a speaker's language learning history have also been shown to affect perception of second language speech. The present study examined the effects of two such factors, listening context and lexical frequency, on the perception of foreign-accented speech. Listeners rated foreign accent in two listening contexts: auditory-only, where listeners only heard the target stimuli, and auditory + orthography, where listeners were presented with both an auditory signal and an orthographic display of the target word. Results revealed that higher frequency words were consistently rated as less accented than lower frequency words. The effect of the listening context emerged in two interactions: the auditory + orthography context reduced the effects of lexical frequency, but increased the perceived differences between native and non-native speakers. Acoustic measurements revealed some production differences for words of different levels of lexical frequency, though these differences could not account for all of the observed interactions from the perceptual experiment. These results suggest that factors independent of the speakers' actual speech articulations can influence the perception of degree of foreign accent.

7.
The theory of relational acoustic invariance [Pickett, E. R., et al. (1999). Phonetica 56, 135-157] was tested with the Japanese stop quantity distinction in disyllables spoken at various rates. The questions were whether the perceptual boundary between the two phonemic categories of single and geminate stops is invariant across rates, and whether there is a close correspondence between the perception and production boundaries. The durational ratio of stop closure to word (where the "word" was defined as disyllables) was previously found to be an invariant parameter that classified the two categories in production, but the present study found that this ratio varied with different speaking rates in perception. However, regression and discriminant analyses of perception and production data showed that treating stop closure as a function of word duration with an intercept term represented the perception and production boundaries very well. This result indicated that the durational ratio of adjusted stop closure (i.e., closure with an added constant) to the word was invariant and distinguished the two phonemic categories clearly. Taken together, the results support the relational acoustic invariance theory, and help refine the theory with regard to exactly what form 'invariance' can take.
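
The step from a regression with an intercept to an invariant adjusted ratio can be written out as follows. The symbols $C$ (stop closure duration), $W$ (word duration), $a$ (slope), $b$ (intercept) and $k$ (added constant) are notation assumed here for illustration; the abstract gives no symbols or fitted values.

```latex
% Category boundary as a linear function of word duration, with intercept:
C = aW + b
% The plain ratio then depends on W, and hence on speaking rate, whenever b \neq 0:
\frac{C}{W} = a + \frac{b}{W}
% Adding the constant k = -b to the closure gives a rate-independent ratio:
\frac{C + k}{W} = \frac{C - b}{W} = a
```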

8.
In a variety of experiments and paradigms, researchers have attempted to determine whether or not speech perception is specialized by comparing perception of speech syllables to perception of nonspeech analogs. While nonspeech analogs appear optimal as comparisons to speech because they are acoustically similar without being recognized as speechlike, it is argued that the comparison they offer is confounded and uninterpretable. Two experiments are designed to show that, in auditory perception generally where acoustic signals are causal consequences of mechanical events, perceptual experiences are of the mechanical events themselves, not of the acoustic signal. This has two consequences. One is that there is a confounding in comparisons of speech with sine wave analogs that, whereas the one perceived as speech also has a definite causal source, the other, perceived as nonspeech, has an indeterminate or ambiguous source. A second is that response patterns in classification tasks such as those used in the literature comparing speech to nonspeech will reflect properties of the perceived sound-producing event; they will not provide a clear window on auditory system processes used to recover event properties. Experiment 3 is designed to show that perception of many acoustic-signal-producing events can appear to be special by the logic of speech-sine wave comparisons, even events that cannot plausibly be supposed to involve a specialization.

9.
Studies with adults have demonstrated that acoustic cues cohere in speech perception such that two stimuli cannot be discriminated if separate cues bias responses equally, but oppositely, in each. This study examined whether this kind of coherence exists for children's perception of speech signals, a test that first required that a contrast be found for which adults and children show similar cue weightings. Accordingly, experiment 1 demonstrated that adults, 7-, and 5-year-olds weight F2-onset frequency and gap duration similarly in "spa" versus "sa" decisions. In experiment 2, listeners of these same ages made "same" or "not-the-same" judgments for pairs of stimuli in an AX paradigm when only one cue differed, when the two cues were set within a stimulus to bias the phonetic percept towards the same category (relative to the other stimulus in the pair), and when the two cues were set within a stimulus to bias the phonetic percept towards different categories. Unexpectedly, adults' results contradicted earlier studies: They were able to discriminate stimuli when the two cues conflicted in how they biased phonetic percepts. Results for 7-year-olds replicated those of adults, but were not as strong. Only the results of 5-year-olds revealed the kind of perceptual coherence reported by earlier studies for adults. Thus, it is concluded that perceptual coherence for speech signals is present from an early age, and in fact listeners learn to overcome it under certain conditions.

10.
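Based on articulation tests of Chinese monosyllables in reverberant environments, multidimensional scaling and cluster analysis were used to obtain the perceptual space structure and hierarchical relationships of initials and finals under reverberation. The main perceptual features of initials in reverberant environments were found to be the place of articulation of the tongue (place of frication) and the aspirated/unaspirated distinction, with place of articulation being the most important perceptual feature of initials; the main perceptual feature of finals was the tongue position of the vowel in the initial portion. The voiceless/voiced feature of initials and the coda of finals played almost no role in speech perception in reverberant environments. The experimental results also reveal a correlation between the perceptual features of speech and the physical transmission conditions.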

11.
Studies of the effects of lexical neighbors upon the recognition of spoken words have generally assumed that the most salient competitors differ by a single phoneme. The present study employs a procedure that induces the listeners to perceive and call out the salient competitors. By presenting a recording of a monosyllable repeated over and over, perceptual adaptation is produced, and perception of the stimulus is replaced by perception of a competitor. Reports from groups of subjects were obtained for monosyllables that vary in their frequency-weighted neighborhood density. The findings are compared with predictions based upon the neighborhood activation model.

12.
The present study investigated the extent to which native English listeners' perception of Japanese length contrasts can be modified with perceptual training, and how their performance is affected by factors that influence segment duration, which is a primary correlate of Japanese length contrasts. Listeners were trained in a minimal-pair identification paradigm with feedback, using isolated words contrasting in vowel length, produced at a normal speaking rate. Experiment 1 tested listeners using stimuli varying in speaking rate, presentation context (in isolation versus embedded in carrier sentences), and type of length contrast. Experiment 2 examined whether performance varied by the position of the contrast within the word, and by whether the test talkers were professionally trained or not. Results showed that trained listeners improved overall performance to a greater extent than untrained control participants. Training improved perception of trained contrast types, generalized to nonprofessional talkers' productions, and improved performance in difficult within-word positions. However, training did not enable listeners to cope with speaking rate variation, and did not generalize to untrained contrast types. These results suggest that perceptual training improves non-native listeners' perception of Japanese length contrasts only to a limited extent.

13.
14.
The present article aims at exploring the invariant parameters involved in the perceptual normalization of French vowels. A set of 490 stimuli, including the ten French vowels /i y u e ø o ɛ œ ɔ a/ produced by an articulatory model, simulating seven growth stages and seven fundamental frequency values, has been submitted as a perceptual identification test to 43 subjects. The results confirm the important effect of the tonality distance between F1 and f0 in perceived height. It does not seem, however, that height perception involves a binary organization determined by the 3-3.5-Bark critical distance. Regarding place of articulation, the tonotopic distance between F1 and F2 appears to be the best predictor of the perceived front-back dimension. Nevertheless, the role of the difference between F2 and F3 remains important. Roundedness is also examined and correlated to the effective second formant, involving spectral integration of higher formants within the 3.5-Bark critical distance. The results shed light on the issue of perceptual invariance, and can be interpreted as perceptual constraints imposed on speech production.
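
To make the Bark distances mentioned in this abstract concrete, the sketch below converts frequencies to Bark with Traunmüller's (1990) approximation and takes a difference. The abstract does not say which Hz-to-Bark conversion the authors used, so this formula is an assumption, and its low- and high-frequency corrections are left out; the function names are illustrative.

```python
def hz_to_bark(frequency_hz):
    """Approximate critical-band rate in Bark (Traunmüller, 1990).

    The end corrections for very low and very high frequencies
    are omitted in this sketch.
    """
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 26.81 * frequency_hz / (1960.0 + frequency_hz) - 0.53


def bark_distance(upper_hz, lower_hz):
    """Distance in Bark between two frequencies, e.g. F1 and f0."""
    return hz_to_bark(upper_hz) - hz_to_bark(lower_hz)
```

For example, `bark_distance(300.0, 100.0)` gives the F1 minus f0 distance for a vowel with F1 at 300 Hz and f0 at 100 Hz, which can then be compared against the 3 to 3.5 Bark critical distance discussed in the abstract.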

15.
The effect of rate of stimulation on spectral shape perception was measured in six users of the Nucleus CI24 cochlear implant. Three spectral shapes were created by using three profiles of current across seven electrode positions. Each current profile was replicated in three stimuli that interleaved stimulus pulses across the seven electrodes with cycle rates (rate per electrode) of 450, 900, and 1800 Hz. The stimulus space resulting from a multidimensional scaling experiment showed a clear dimension related to the rate of stimulation that was orthogonal to the dimension related to the spectral shapes. A second experiment was performed with the same subjects to investigate whether the perceptual dimension related to rate in Experiment 1 could be attributed to different perceptual flatness of the profiles at different rates. In Experiment 2, the rate of stimulation was fixed at 900 Hz and three profiles were created for each spectral shape that differed in flatness. This experiment did not, however, result in an independent perceptual dimension related to the flatness of the profile. In conclusion, rate of stimulation provided an independent perceptual dimension in the multiple-electrode stimuli, in spite of the rates being not discriminable or barely discriminable in single-electrode stimulation.

16.
Context is important for recovering language information from talker-induced variability in acoustic signals. In tone perception, previous studies reported similar effects of speech and nonspeech contexts in Mandarin, supporting a general perceptual mechanism underlying tone normalization. However, no supportive evidence was obtained in Cantonese, also a tone language. Moreover, no study has compared speech and nonspeech contexts in the multi-talker condition, which is essential for exploring the normalization mechanism of inter-talker variability in speaking F0. The other question is whether a talker's full F0 range and mean F0 equally facilitate normalization. To answer these questions, this study examines the effects of four context conditions (speech/nonspeech × F0 contour/mean F0) in the multi-talker condition in Cantonese. Results show that raising and lowering the F0 of speech contexts change the perception of identical stimuli from mid level tone to low and high level tone, respectively, whereas nonspeech contexts only mildly increase the identification preference. This supports a speech-specific mechanism of tone normalization. Moreover, speech context with flattened F0 trajectory, which neutralizes cues of a talker's full F0 range, fails to facilitate normalization in some conditions, implying that a talker's mean F0 is less efficient for minimizing talker-induced lexical ambiguity in tone perception.

17.

Background

How does the brain repair obliterated speech and cope with acoustically ambivalent situations? A widely discussed possibility is to use top-down information for solving the ambiguity problem. In the case of speech, this may lead to a match of bottom-up sensory input with lexical expectations resulting in resonant states which are reflected in the induced gamma-band activity (GBA).

Methods

In the present EEG study, we compared the subject's pre-attentive GBA responses to obliterated speech segments presented after a series of correct words. The words were a minimal pair in German and differed with respect to the degree of specificity of segmental phonological information.

Results

The induced GBA was larger when the expected lexical information was phonologically fully specified compared to the underspecified condition. Thus, the degree of specificity of phonological information in the mental lexicon correlates with the intensity of the matching process of bottom-up sensory input with lexical information.

Conclusions

These results together with those of a behavioural control experiment support the notion of multi-level mechanisms involved in the repair of deficient speech. The delineated alignment of pre-existing knowledge with sensory input is in accordance with recent ideas about the role of internal forward models in speech perception.

18.
The mechanism and temporal characteristics of gloss perception are not entirely clear. In addition, the formulation for predicting gloss perception from photometric values has not been established. In the present study, we conducted an experiment to measure several temporal characteristics of gloss perception in order to clarify the mechanism. All stimuli were rendered as computer graphics with Phong and Lambert models to provide gloss perception to human observers. We measured perceptual glossiness with a magnitude estimation method and perceptual diffuse/specular reflectance of test stimuli with a matching method under several stimulus conditions, such as reflectance coefficients and stimulus duration. The results showed that the perceptual specular component and perceptual glossiness increase with decreasing stimulus duration. Finally, we proposed a formulation to predict perceptual glossiness as a function of stimulus duration.
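
For readers unfamiliar with the rendering models named in this abstract, the sketch below evaluates a textbook Lambert diffuse term plus a Phong specular term for a single surface point. It is not the authors' rendering setup: the parameter names (`k_diffuse`, `k_specular`, `shininess`), the single white light, and the omission of an ambient term are all assumptions.

```python
import math


def _normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0:
        raise ValueError("zero-length vector")
    return tuple(c / length for c in v)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def phong_intensity(normal, to_light, to_viewer,
                    k_diffuse, k_specular, shininess):
    """Lambert diffuse plus Phong specular intensity at one point.

    All direction vectors point away from the surface point.
    """
    n = _normalize(normal)
    l = _normalize(to_light)
    v = _normalize(to_viewer)
    n_dot_l = _dot(n, l)
    if n_dot_l <= 0:
        return 0.0  # light is behind the surface: no diffuse, no highlight
    # Mirror reflection of the light direction about the normal.
    r = tuple(2 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    diffuse = k_diffuse * n_dot_l
    specular = k_specular * max(0.0, _dot(r, v)) ** shininess
    return diffuse + specular
```

In this formulation the diffuse and specular reflectance coefficients are separate inputs, which is what makes it possible to vary them independently as stimulus conditions.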

19.
Traditionally, timbre has been defined as that perceptual attribute that differentiates two sounds when pitch and loudness are equal, and thus is a measure of dissimilarity. By such a definition, each voice possesses a set of timbres, and the ability to identify any voice or voice category across different pitch-loudness-vowel combinations must be due to an ability to "link" these timbres by abstracting the "timbre transformation," the manner in which timbre subtly changes across pitch and loudness for a specific voice or voice category. Using stimuli produced across the singing range by singers from different voice categories, this study sought to examine how timbre and pitch interact in the perception of dissimilarity in male singing voices. This study also investigated whether or not listener experience affects the perception of timbre as a function of pitch. The resulting multidimensional scaling (MDS) representations showed that for all stimuli and listeners, dimension 1 correlated with pitch, while dimension 2 correlated with spectral centroid and separated vocal stimuli into the categories baritone and tenor. Dimension 3 appeared highly idiosyncratic depending on the nature of the stimuli and on the experience of the listener. Inexperienced listeners appeared to rely more heavily on pitch in making dissimilarity judgments than did experienced listeners. The resulting MDS representations of dissimilarity across pitch provide a glimpse of the timbre transformation of voice categories across pitch.
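
Spectral centroid, the acoustic correlate reported for dimension 2, is commonly computed as the amplitude-weighted mean frequency of a spectrum. The sketch below uses that common definition on a list of partials; the abstract does not state the exact analysis settings, so the amplitude (rather than power) weighting and the function name are assumptions.

```python
def spectral_centroid(frequencies_hz, amplitudes):
    """Amplitude-weighted mean frequency of a set of spectral components."""
    if len(frequencies_hz) != len(amplitudes):
        raise ValueError("need one amplitude per frequency")
    total_amplitude = sum(amplitudes)
    if total_amplitude <= 0:
        raise ValueError("total amplitude must be positive")
    weighted = sum(f * a for f, a in zip(frequencies_hz, amplitudes))
    return weighted / total_amplitude
```

A voice with relatively more energy in its upper partials yields a higher centroid, which is the kind of difference that could separate voice categories along one perceptual dimension.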

20.
We provide a direct demonstration that nonhuman primates spontaneously perceive changes in formant frequencies in their own species-typical vocalizations, without training or reinforcement. Formants are vocal tract resonances leading to distinctive spectral prominences in the vocal signal, and provide the acoustic determinant of many key phonetic distinctions in human languages. We developed algorithms for manipulating formants in rhesus macaque calls. Using the resulting computer-manipulated calls in a habituation/dishabituation paradigm, with blind video scoring, we show that rhesus macaques spontaneously respond to a change in formant frequencies within the normal macaque vocal range. Lack of dishabituation to a "synthetic replica" signal demonstrates that dishabituation was not due to an artificial quality of synthetic calls, but to the formant shift itself. These results indicate that formant perception, a significant component of human voice and speech perception, is a perceptual ability shared with other primates.
