Similar Documents (20 results)
1.
Navigation in virtual environments relies on an accurate spatial rendering. A virtual object is localized according to its position in the environment, which is usually defined by the following three coordinates: azimuth, elevation and distance. Even though several studies have investigated the perception of auditory and visual cues in azimuth and elevation, little has been done on the distance dimension. This study aims at investigating the way humans estimate visual and auditory egocentric distances of virtual objects. Subjects were asked to estimate the egocentric distance of 2–20 m distant objects in three contexts: auditory perception alone, visual perception alone, and the combination of both (with coherent and incoherent visual and auditory cues). Even though egocentric distance was under-estimated in all contexts, the results showed a higher influence of visual information than auditory information on the perceived distance. Specifically, the bimodal incoherent condition gave perceived distances equivalent to those in the visual-only condition only when the visual target was closer to the subject than the auditory target.
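The under-estimation reported here is often summarized by fitting a compressive power function to the estimates (perceived = k · d^a, with a < 1 indicating compression). The abstract does not state which analysis was used, so the sketch below, with invented data, only illustrates that general approach.

```python
# Hypothetical sketch: quantifying distance under-estimation with a power-law fit,
# perceived = k * physical**a (a < 1 indicates compression). All values are invented.
import numpy as np

physical = np.array([2.0, 4.0, 6.0, 10.0, 15.0, 20.0])   # target distances (m)
perceived = np.array([1.8, 3.3, 4.6, 7.1, 9.8, 12.0])    # hypothetical mean estimates (m)

# Fit log(perceived) = log(k) + a * log(physical) by least squares.
a, log_k = np.polyfit(np.log(physical), np.log(perceived), 1)
k = np.exp(log_k)
print(f"exponent a = {a:.2f}, gain k = {k:.2f}")  # a < 1 -> compressive under-estimation
```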

2.
Event-related fMRI of auditory and visual oddball tasks
Functional magnetic resonance imaging (fMRI) was used to investigate the spatial distribution of cortical activation in frontal and parietal lobes during auditory and visual oddball tasks in 10 healthy subjects. The purpose of the study was to compare activation within auditory and visual modalities and identify common patterns of activation across these modalities. Each subject was scanned eight times, four times each for the auditory and visual conditions. The tasks consisted of a series of trials presented every 1500 ms of which 4-6% were target trials. Subjects kept a silent count of the number of targets detected during each scan. The data were analyzed by correlating the fMRI signal response of each pixel to a reference hemodynamic response function that modeled expected responses to each target stimulus. The auditory and visual targets produced target-related activation in frontal and parietal cortices with high spatial overlap particularly in the middle frontal gyrus and in the anterior cingulate. Similar convergence zones were detected in parietal cortex. Temporal differences were detected in the onset of the activation in frontal and parietal areas with an earlier onset in parietal areas than in the middle frontal areas. Based on consistent findings with previous event-related oddball tasks, the high degree of spatial overlap in frontal and parietal areas appears to be due to modality independent or amodal processes related to procedural aspects of the tasks that may involve memory updating and non-specific response organization.
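As a rough illustration of the analysis step described above (correlating each pixel's time course with a reference hemodynamic response), the Python sketch below builds a reference by convolving target onsets with a canonical double-gamma HRF and thresholds the per-voxel correlation. The HRF shape, TR, threshold, and data are assumptions for illustration, not the study's actual parameters.

```python
# Minimal sketch of a correlation analysis: convolve target onsets with an assumed
# double-gamma HRF, then correlate each voxel's time series with the reference.
import numpy as np
from scipy.stats import gamma, pearsonr

TR = 1.5                          # seconds per volume (assumed)
n_vols = 200
t = np.arange(0, 30, TR)
hrf = gamma.pdf(t, 6) - 0.35 * gamma.pdf(t, 16)   # canonical double-gamma shape
hrf /= hrf.max()

onsets = np.zeros(n_vols)
onsets[np.random.default_rng(0).choice(n_vols, size=10, replace=False)] = 1  # ~5% targets
reference = np.convolve(onsets, hrf)[:n_vols]

# Fake data: voxels x time; real data would come from the EPI volumes.
data = np.random.default_rng(1).normal(size=(500, n_vols))
data[0] += 5.0 * reference        # one "active" voxel for illustration

r = np.array([pearsonr(vox, reference)[0] for vox in data])
active = np.where(r > 0.5)[0]     # simple threshold; the study used a correlation criterion
print(f"voxels exceeding r > 0.5: {active}")
```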

3.
Subjects presented with coherent auditory and visual streams generally fuse them into a single percept. This results in enhanced intelligibility in noise, or in visual modification of the auditory percept in the McGurk effect. It is classically considered that processing is done independently in the auditory and visual systems before interaction occurs at a certain representational stage, resulting in an integrated percept. However, some behavioral and neurophysiological data suggest the existence of a two-stage process. A first stage would involve binding together the appropriate pieces of audio and video information before fusion per se in a second stage. Then it should be possible to design experiments leading to unbinding. It is shown here that if a given McGurk stimulus is preceded by an incoherent audiovisual context, the amount of McGurk effect is largely reduced. Various kinds of incoherent contexts (acoustic syllables dubbed on video sentences or phonetic or temporal modifications of the acoustic content of a regular sequence of audiovisual syllables) can significantly reduce the McGurk effect even when they are short (less than 4 s). The data are interpreted in the framework of a two-stage "binding and fusion" model for audiovisual speech perception.

4.
The existence of auditory cues such as intonation, rhythm, and pausing that facilitate end-of-utterance detection is by now well established. It has been argued repeatedly that speakers may also employ visual cues to indicate that they are at the end of their utterance. This raises at least two questions, which are addressed in the current paper. First, which modalities do speakers use for signalling finality and nonfinality, and second, how sensitive are observers to these signals. Our goal is to investigate the relative contribution of three different conditions to end-of-utterance detection: the two unimodal ones, vision only and audio only, and their bimodal combination. Speaker utterances were collected via a novel semicontrolled production experiment, in which participants provided lists of words in an interview setting. The data thus collected were used in two perception experiments, which systematically compared responses to unimodal (audio only and vision only) and bimodal (audio-visual) stimuli. Experiment I is a reaction time experiment, which revealed that humans are significantly quicker in end-of-utterance detection when confronted with bimodal or audio-only stimuli than with vision-only stimuli. No significant differences in reaction times were found between the bimodal and audio-only conditions, and therefore a second experiment was conducted. Experiment II is a classification experiment, and showed that participants perform significantly better in the bimodal condition than in the two unimodal ones. Both the first and the second experiment revealed interesting differences between speakers in the various conditions, which indicates that some speakers are more expressive in the visual and others in the auditory modality.

5.
In face-to-face speech communication, the listener extracts and integrates information from the acoustic and optic speech signals. Integration occurs within the auditory modality (i.e., across the acoustic frequency spectrum) and across sensory modalities (i.e., across the acoustic and optic signals). The difficulties experienced by some hearing-impaired listeners in understanding speech could be attributed to losses in the extraction of speech information, the integration of speech cues, or both. The present study evaluated the ability of normal-hearing and hearing-impaired listeners to integrate speech information within and across sensory modalities in order to determine the degree to which integration efficiency may be a factor in the performance of hearing-impaired listeners. Auditory-visual nonsense syllables consisting of eighteen medial consonants surrounded by the vowel [a] were processed into four nonoverlapping acoustic filter bands between 300 and 6000 Hz. A variety of one, two, three, and four filter-band combinations were presented for identification in auditory-only and auditory-visual conditions; a visual-only condition was also included. Integration efficiency was evaluated using a model of optimal integration. Results showed that normal-hearing and hearing-impaired listeners integrated information across the auditory and visual sensory modalities with a high degree of efficiency, independent of differences in auditory capabilities. However, across-frequency integration for auditory-only input was less efficient for hearing-impaired listeners. These individuals exhibited particular difficulty extracting information from the highest frequency band (4762-6000 Hz) when speech information was presented concurrently in the next lower-frequency band (1890-2381 Hz). Results suggest that integration of speech information within the auditory modality, but not across auditory and visual modalities, affects speech understanding in hearing-impaired listeners.
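The integration-efficiency analysis above relied on a formal model of optimal integration. As a much simpler stand-in (an assumption, not the model used in the study), the sketch below predicts auditory-visual accuracy from the unimodal accuracies under independent-errors probability summation and compares the observed bimodal score with that prediction.

```python
# Crude benchmark for integration efficiency (assumed, not the study's model):
# predicted AV accuracy if the two modalities fail independently, compared with
# the observed AV accuracy. All proportions are hypothetical.
def predicted_av(p_a: float, p_v: float) -> float:
    """Probability-summation prediction: correct if either modality alone succeeds."""
    return p_a + p_v - p_a * p_v

p_auditory, p_visual, p_av_observed = 0.55, 0.30, 0.75   # hypothetical proportions correct
p_av_predicted = predicted_av(p_auditory, p_visual)
efficiency = p_av_observed / p_av_predicted               # simple efficiency index
print(f"predicted AV = {p_av_predicted:.2f}, efficiency = {efficiency:.2f}")
```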

6.
A single pool of untrained subjects was tested for interactions across two bimodal perception conditions: audio-tactile, in which subjects heard and felt speech, and visual-tactile, in which subjects saw and felt speech. Identifications of English obstruent consonants were compared in bimodal and no-tactile baseline conditions. Results indicate that tactile information enhances speech perception by about 10 percent, regardless of which other mode (auditory or visual) is active. However, within-subject analysis indicates that individual subjects who benefit more from tactile information in one cross-modal condition tend to benefit less from tactile information in the other.
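A minimal sketch of the within-subject comparison reported above, with invented scores: compute each subject's tactile benefit in the audio-tactile and visual-tactile conditions and correlate the two across subjects (a negative correlation would correspond to the reported trade-off).

```python
# Invented per-subject benefits illustrating the reported pattern: ~10% mean tactile
# enhancement in each bimodal condition, with an inverse relation across conditions.
import numpy as np

rng = np.random.default_rng(42)
n_subjects = 12
benefit_audio_tactile = rng.normal(0.10, 0.04, n_subjects)   # proportion-correct gain
benefit_visual_tactile = 0.20 - benefit_audio_tactile + rng.normal(0, 0.02, n_subjects)

r = np.corrcoef(benefit_audio_tactile, benefit_visual_tactile)[0, 1]
print(f"mean AT benefit = {benefit_audio_tactile.mean():.2f}, "
      f"mean VT benefit = {benefit_visual_tactile.mean():.2f}, r = {r:.2f}")
```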

7.
Delayed auditory feedback (DAF) regarding speech can cause dysfluency. The purpose of this study was to explore whether providing visual feedback in addition to DAF would ameliorate speech disruption. Speakers repeated sentences and heard their auditory feedback delayed with and without simultaneous visual feedback. DAF led to increased sentence durations and an increased number of speech disruptions. Although visual feedback did not reduce DAF effects on duration, a promising but nonsignificant trend was observed for fewer speech disruptions when visual feedback was provided. This trend was significant in speakers who were overall less affected by DAF. The results suggest the possibility that speakers strategically use alternative sources of feedback.

8.
This paper addresses the change in the just noticeable difference (JND) of auditory perception under synchronous visual stimuli. In psychoacoustic experiments, the loudness, subjective-duration, and pitch JNDs of pure tones were measured in an auditory-only mode and in a visual-auditory mode with visual stimuli differing in attributes such as color, illumination, quality, and motion state. Statistical analyses of the experimental data indicate that, compared with the JNDs in the auditory-only mode, the JNDs measured with visual stimuli are often larger: the average increments for subjective duration, pitch, and loudness are 45.1%, 14.8%, and 12.3%, respectively. The conclusion is that JND-based auditory perceptual ability often decreases in the presence of visual stimuli, and that the amount of the increase depends on the attributes of the visual stimuli: the more comfortable the visual stimuli make subjects feel, the smaller the change in the auditory JND.

9.
赵志军  谢凌云 《声学学报》2013,38(5):624-631
Audiovisual interaction is becoming increasingly important, yet the influence of visual stimuli on auditory perception has not been studied comprehensively or in depth. Taking as its subject the change in just noticeable differences (JNDs) of subjective auditory perception under visual stimulation, this study applied visual stimuli differing in four attributes (color, quality, brightness, and motion state) during subjective listening experiments while measuring the JNDs of loudness, subjective duration, and pitch for pure-tone signals. By comparison with the corresponding JNDs measured without visual stimuli, the effects of the different visual conditions on loudness, subjective-duration, and pitch perception were analyzed. The experimental data show that the JNDs of subjective auditory perception increase when visual stimuli are applied: the JNDs of subjective duration, pitch, and loudness rose by 45.1%, 14.8%, and 12.3% on average, respectively. Further analysis indicates that basic auditory perceptual ability tends to decline under visual stimulation. Different levels of the same visual attribute affect auditory perception to different degrees, and the changes in subjective perception show a certain regularity: the more comfortable the visual stimulus, the smaller the change in the auditory JND.
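Neither abstract specifies the psychophysical procedure used to obtain these JNDs, so the following is only a generic example of how such a threshold is often measured: a 2-down/1-up adaptive staircase (converging on about 70.7% correct) run against a simulated listener.

```python
# Generic 2-down/1-up adaptive staircase for estimating a JND (e.g., pitch
# discrimination of a pure tone). Textbook procedure offered as a sketch;
# the paper's actual psychoacoustic method is not described in the abstract.
import random

def simulated_listener(delta: float, true_jnd: float = 4.0) -> bool:
    """Return True (correct) with probability rising as delta exceeds the true JND."""
    p_correct = 0.5 + 0.5 * min(delta / (2 * true_jnd), 1.0)
    return random.random() < p_correct

def run_staircase(start: float = 20.0, step: float = 2.0, n_reversals: int = 8) -> float:
    delta, correct_streak, direction = start, 0, 0
    reversals = []
    while len(reversals) < n_reversals:
        if simulated_listener(delta):
            correct_streak += 1
            if correct_streak == 2:              # 2-down rule: make the task harder
                correct_streak = 0
                if direction == +1:
                    reversals.append(delta)      # direction changed: record a reversal
                direction = -1
                delta = max(delta - step, 0.1)
        else:
            correct_streak = 0                   # 1-up rule: make the task easier
            if direction == -1:
                reversals.append(delta)
            direction = +1
            delta += step
    return sum(reversals[-6:]) / 6               # JND ~ mean of the last reversals

random.seed(0)
print(f"estimated JND: {run_staircase():.1f} Hz")
```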

10.
11.
Opera performance conveys both visual and auditory information to an audience, and so opera theaters should be evaluated in both domains. This study investigates the effect of static visual and auditory cues on seat preference in an opera theater. Acoustical parameters were measured and visibility was analyzed for nine seats. Subjective assessments for visual-only, auditory-only, and auditory-visual preferences for these seat positions were made through paired-comparison tests. In the cases of visual-only and auditory-only subjective evaluations, preference judgment tests on a rating scale were also employed. Visual stimuli were based on still photographs, and auditory stimuli were based on binaural impulse responses convolved with a solo tenor recording. For the visual-only experiment, preference is predicted well by measurements related to the angle of seats from the theater midline at the center of the stage, the size of the photographed stage view, the visual obstruction, and the distance from the stage. Sound pressure level was the dominant predictor of auditory preference in the auditory-only experiment. In the cross-modal experiments, both auditory and visual preferences were shown to contribute to overall impression, but auditory cues were more influential than the static visual cues. The results show that both a positive visual-only and a positive auditory-only evaluation contribute positively to the assessment of seat quality.
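Paired-comparison preferences like those described above are usually converted to scale values with a model such as Thurstone Case V or Bradley-Terry; the abstract does not say which was used, so the sketch below simply illustrates Case V scaling on an invented win matrix for three seats.

```python
# Sketch: turning a paired-comparison win matrix into preference scale values with
# Thurstone Case V scaling (an assumed analysis; the abstract does not specify one).
import numpy as np
from scipy.stats import norm

# wins[i, j] = number of trials where seat i was preferred over seat j (invented data).
wins = np.array([[0, 8, 9],
                 [2, 0, 7],
                 [1, 3, 0]], dtype=float)
n_trials = wins + wins.T
p = np.divide(wins, n_trials, out=np.full_like(wins, 0.5), where=n_trials > 0)
np.fill_diagonal(p, 0.5)
p = np.clip(p, 0.01, 0.99)               # avoid infinite z-scores

z = norm.ppf(p)                           # pairwise dominance in z units
scale = z.mean(axis=1)                    # Case V: scale value = row mean of z
print("preference scale values:", np.round(scale, 2))
```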

12.

Background  

Recent findings of a tight coupling between visual and auditory association cortices during multisensory perception in monkeys and humans raise the question of whether consistent paired presentation of simple visual and auditory stimuli prompts conditioned responses in unimodal auditory regions or multimodal association cortex once visual stimuli are presented in isolation in a post-conditioning run. To address this issue, fifteen healthy participants took part in a "silent" sparse temporal event-related fMRI study. In the first (visual control) habituation phase they were presented with briefly flashing red visual stimuli. In the second (auditory control) habituation phase they heard brief telephone ringing. In the third (conditioning) phase we presented the visual stimulus (CS) simultaneously paired with the auditory stimulus (UCS). In the fourth phase participants either viewed flashes paired with the auditory stimulus (maintenance, CS-) or viewed the visual stimulus in isolation (extinction, CS+) according to a 5:10 partial reinforcement schedule. The participants had no other task than attending to the stimuli and indicating the end of each trial by pressing a button.

13.
Much research has explored how spoken word recognition is influenced by the architecture and dynamics of the mental lexicon (e.g., Luce and Pisoni, 1998; McClelland and Elman, 1986). A more recent question is whether the processes underlying word recognition are unique to the auditory domain, or whether visually perceived (lipread) speech may also be sensitive to the structure of the mental lexicon (Auer, 2002; Mattys, Bernstein, and Auer, 2002). The current research was designed to test the hypothesis that both aurally and visually perceived spoken words are isolated in the mental lexicon as a function of their modality-specific perceptual similarity to other words. Lexical competition (the extent to which perceptually similar words influence recognition of a stimulus word) was quantified using metrics that are well-established in the literature, as well as a statistical method for calculating perceptual confusability based on the phi-square statistic. Both auditory and visual spoken word recognition were influenced by modality-specific lexical competition as well as stimulus word frequency. These findings extend the scope of activation-competition models of spoken word recognition and reinforce the hypothesis (Auer, 2002; Mattys et al., 2002) that perceptual and cognitive properties underlying spoken word recognition are not specific to the auditory domain. In addition, the results support the use of the phi-square statistic as a better predictor of lexical competition than metrics currently used in models of spoken word recognition.
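The phi-square statistic mentioned above is, in general form, a chi-square statistic normalized by the number of observations. The sketch below computes such a phi-square dissimilarity between the confusion-matrix rows of two stimulus words; the exact formulation in the cited work may differ, and the counts are invented.

```python
# Sketch of a phi-square distance between two stimulus items, computed from their
# response distributions in a confusion matrix (rows = stimuli, columns = responses).
# Uses the generic definition phi^2 = chi^2 / N; details may differ from the cited work.
import numpy as np
from scipy.stats import chi2_contingency

def phi_square(row_a: np.ndarray, row_b: np.ndarray) -> float:
    """Phi-square dissimilarity between two response-count distributions."""
    table = np.vstack([row_a, row_b])
    keep = table.sum(axis=0) > 0            # drop empty response categories
    chi2, _, _, _ = chi2_contingency(table[:, keep])
    return chi2 / table.sum()               # 0 = identical distributions

# Invented confusion-matrix rows for two words (counts of each response category).
word_1 = np.array([40, 5, 3, 2, 0])
word_2 = np.array([10, 25, 8, 5, 2])
print(f"phi-square(word_1, word_2) = {phi_square(word_1, word_2):.3f}")
```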

14.
潘杨  孟子厚 《声学学报》2013,38(2):215-223
In an audiovisual interaction environment, when visual and auditory masking stimuli are perceived simultaneously, the interaction between vision and hearing and the resulting changes in masking were studied experimentally. Audiovisual interaction scenes were classified by attentional state; the experiments measured the changes in visual and auditory masking thresholds in the different scenes and examined the mutual influence of the different visual and auditory masking stimuli. The results show that, compared with the thresholds obtained in single-task experiments, the visual and auditory masking thresholds change correspondingly in the different audiovisual interaction environments. From these results it can be inferred that the allocation of attention in an audiovisual interaction environment affects visual and auditory masking effects, and that the interaction of audiovisual masking stimuli further affects the attention evoked by visual and auditory stimuli.

15.

Background  

To investigate the long-latency activities common to all sensory modalities, electroencephalographic responses to auditory (1000 Hz pure tone), tactile (electrical stimulation to the index finger), visual (simple figure of a star), and noxious (intra-epidermal electrical stimulation to the dorsum of the hand) stimuli were recorded from 27 scalp electrodes in 14 healthy volunteers.

16.
Despite many studies investigating auditory spatial impressions in rooms, few have addressed the impact of simultaneous visual cues on localization and the perception of spaciousness. The current research presents an immersive audiovisual environment in which participants were instructed to make auditory width judgments in dynamic bi-modal settings. The results of these psychophysical tests suggest the importance of congruent audio-visual presentation to the ecological interpretation of an auditory scene. Supporting data were accumulated in five rooms of ascending volumes and varying reverberation times. Participants were given an audiovisual matching test in which they were instructed to pan the auditory width of a performing ensemble to a varying set of audio and visual cues in rooms. Results show that both auditory and visual factors affect the collected responses and that the two sensory modalities coincide in distinct interactions. The greatest differences between the panned audio stimuli given a fixed visual width were found in the physical space with the largest volume and the greatest source distance. These results suggest, in this specific instance, a predominance of auditory cues in the spatial analysis of the bi-modal scene.

17.
Vowel perception studies were conducted on a group of four adolescent children with congenital profound sensorineural hearing impairments in the three conditions of audition alone, vision alone, and audition plus vision. Data were analyzed using the ALSCAL multidimensional scaling procedure to identify the underlying dimensions and individual differences in dimension emphasis. The three dimensions obtained from the analysis of data for the audition alone condition were interpreted as the parameters of first and second formant frequencies, and vowel length. The one dimension for the vision alone condition was interpreted as the parameter of the width of the internal lip opening. The three dimensions for the audition plus vision condition were interpreted as the parameters of first formant frequency, vowel length, and the width of the internal lip opening. Subject variations in parameter preferences were observed for the audition alone and audition plus vision conditions but not for the vision alone condition.
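ALSCAL fits a weighted individual-differences MDS model; as a simplified stand-in, the sketch below embeds an invented vowel dissimilarity matrix in two dimensions with ordinary metric MDS from scikit-learn, after which the dimensions would be interpreted post hoc (e.g., as formant frequencies or lip-opening width).

```python
# Rough analogue of the scaling step: embed a vowel dissimilarity matrix with metric
# MDS. ALSCAL's individual-differences model is more elaborate; this is only a sketch,
# and the dissimilarities below are invented.
import numpy as np
from sklearn.manifold import MDS

vowels = ["i", "e", "a", "o", "u"]
dissim = np.array([[0.0, 0.3, 0.8, 0.9, 0.7],
                   [0.3, 0.0, 0.6, 0.8, 0.8],
                   [0.8, 0.6, 0.0, 0.4, 0.6],
                   [0.9, 0.8, 0.4, 0.0, 0.3],
                   [0.7, 0.8, 0.6, 0.3, 0.0]])

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissim)
for v, (x, y) in zip(vowels, coords):
    print(f"{v}: ({x:+.2f}, {y:+.2f})")   # dimensions interpreted post hoc
```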

18.
Derived-band auditory brainstem responses (ABRs) were obtained in 43 normal-hearing and 80 cochlear hearing-impaired individuals using clicks and high-pass noise masking. The response times across the cochlea [the latency difference between wave V's of the 5.7- and 1.4-kHz center frequency (CF) derived bands] were calculated for five levels of click stimulation ranging from 53 to 93 dB p.-p.e. SPL (23 to 63 dB nHL) in 10-dB steps. Cochlear response times appeared to shorten significantly with hearing loss, especially when the average pure tone (1 to 8 kHz) hearing loss exceeded 30 dB. Examination of derived-band latencies indicates that this shortening is due to a dramatic decrease of wave V latency in the lower CF derived band. Estimates of cochlear filter times in terms of the number of periods to maximum response (Nmax) were calculated from derived-band latencies corrected for gender-dependent cochlear transport and neural conduction times. Nmax decreased as a function of hearing loss, especially for the low CF derived bands. The functions were similar for both males and females. These results are consistent with broader cochlear tuning due to peripheral hearing loss. Estimating filter response times from ABR latencies enhances objective noninvasive diagnosis and allows delineation of the differential effects of pathology on the underlying cochlear mechanisms involved in cochlear transport and filter build-up times.
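The Nmax measure described above expresses the filter build-up time in periods of the derived band's center frequency, after subtracting cochlear transport and neural conduction times from the wave V latency. The sketch below shows that arithmetic with illustrative values; the study's gender-dependent correction terms are not reproduced here.

```python
# Sketch of the Nmax calculation: subtract assumed transport and neural conduction
# times from the derived-band wave V latency, then express the remaining filter time
# in periods of the band's center frequency. Numeric values are illustrative only.
def n_max(wave_v_latency_ms: float, cf_hz: float,
          transport_ms: float, neural_ms: float) -> float:
    filter_time_ms = wave_v_latency_ms - transport_ms - neural_ms
    return filter_time_ms * 1e-3 * cf_hz        # periods to maximum response

# Example: a 1.4-kHz derived band with hypothetical correction terms.
print(f"Nmax = {n_max(wave_v_latency_ms=8.5, cf_hz=1400, transport_ms=1.5, neural_ms=5.0):.1f}")
```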

19.
Functional magnetic resonance imaging (fMRI) has rapidly become the most widely used imaging method for studying brain functions in humans. This is a result of its extreme flexibility of use and of the astonishingly detailed spatial and temporal information it provides. Nevertheless, until very recently, the study of the auditory system has progressed at a considerably slower pace compared to other functional systems. Several factors have limited fMRI research in the auditory field, including some intrinsic features of auditory functional anatomy and some peculiar interactions between fMRI technique and audition. A well known difficulty arises from the high intensity acoustic noise produced by gradient switching in echo-planar imaging (EPI), as well as in other fMRI sequences more similar to conventional MR sequences. The acoustic noise interacts in an unpredictable way with the experimental stimuli both from a perceptual point of view and in the evoked hemodynamics. To overcome this problem, different approaches have been proposed recently that generally require careful tailoring of the experimental design and the fMRI methodology to the specific requirements posed by auditory research. The novel methodological approaches can make the fMRI exploration of auditory processing much easier and more reliable, and thus may permit filling the gap with other fields of neuroscience research. As a result, some fundamental neural underpinnings of audition are being clarified, and the way sound stimuli are integrated in the auditory gestalt is beginning to be understood.

20.
Dynamic range and asymmetry of the auditory filter
This experiment was designed to measure the shape and asymmetry of the auditory filter over a wider dynamic range than has been measured previously. Thresholds were measured for 2-kHz sinusoidal signals in the presence of two 800-Hz-wide noise bands, one above and one below the signal frequency. The spectrum level of the noise was 45 dB (re 20 μPa), and the noise bands were placed both symmetrically and asymmetrically about the signal frequency. The deviation of the signal frequency from the nearer edge of each noise band varied from 0 to 0.8 times the signal frequency. Each ear of six subjects was tested, and the subjects' ages ranged from 22 to 74 years. The auditory filters derived from the data were somewhat asymmetric, with steeper slopes on the high-frequency side; the degree of asymmetry varied across subjects. The asymmetry could be characterized as a uniform stretching of the (linear) frequency scale on one side of the filter. The dynamic range of the auditory filter exceeded 60 dB in the younger listeners, but the dynamic range and sharpness of the filter tended to decrease with increasing age.
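Notched-noise thresholds like these are commonly fitted with a rounded-exponential (roex) filter, W(g) = (1 + pg) exp(-pg), with separate lower- and upper-side slopes to capture asymmetry. The abstract does not give fitted parameters, so the sketch below only evaluates such an asymmetric filter with illustrative values.

```python
# Sketch of an asymmetric rounded-exponential (roex) auditory filter, a common model
# for notched-noise data: W(g) = (1 + p*g) * exp(-p*g) with separate lower- and
# upper-side slopes. Parameter values are illustrative, not fitted to the study's data.
import numpy as np

def roex_weight(f_hz: np.ndarray, fc_hz: float, p_lower: float, p_upper: float) -> np.ndarray:
    g = np.abs(f_hz - fc_hz) / fc_hz                 # normalized frequency deviation
    p = np.where(f_hz < fc_hz, p_lower, p_upper)     # steeper upper side -> larger p_upper
    return (1 + p * g) * np.exp(-p * g)

fc = 2000.0                                          # 2-kHz signal frequency
freqs = np.linspace(1000, 3000, 5)
weights = roex_weight(freqs, fc, p_lower=25.0, p_upper=35.0)
for f, w in zip(freqs, weights):
    print(f"{f:6.0f} Hz: attenuation = {10 * np.log10(w):6.1f} dB")
```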
