Similar articles
 Found 20 similar articles; search took 407 ms
1.
Coding and computation with neural spike trains   Total citations: 1, self-citations: 0, citations by others: 1
We study a simple model for the statistics of neural spike trains as they encode a continuously varying signal. The model is motivated with reference to several recent experiments on sensory neurons, and we show how analogies between the relevant probabilistic issues in neural coding and statistical mechanics can be exploited. Results are given for the information capacity of the code, for the optimal structure of code-reading algorithms, and for the time delays which arise in optimal processing of the coded signal. In addition, we show how simple analog computations can be expressed directly in terms of transformations of the spike train. The rules for reading the code and for optimal analog computation depend on the context for behavioral decision making: the relative weights assigned to different types of errors and the relative importance of different signals. We find that there is a conflict between minimizing this context dependence of the code and maximizing its information capacity; a compromise can be achieved by appropriate preprocessing (filtering) of the encoded signal. Experiments on auditory and visual neurons qualitatively confirm the predicted filtering. Similarly, the structure of the optimal multiplier neuron is shown to depend upon the intensity and spectral content of incoming signals, and these predictions compare favorably with experiments on a movement-sensitive cell in the fly visual system.

2.
Echolocating bats transmit ultrasonic vocalizations and use information contained in the reflected sounds to analyze the auditory scene. Auditory scene analysis, a phenomenon that applies broadly to all hearing vertebrates, involves the grouping and segregation of sounds to perceptually organize information about auditory objects. The perceptual organization of sound is influenced by the spectral and temporal characteristics of acoustic signals. In the case of the echolocating bat, its active control over the timing, duration, intensity, and bandwidth of sonar transmissions directly impacts its perception of the auditory objects that comprise the scene. Here, data are presented from perceptual experiments, laboratory insect capture studies, and field recordings of sonar behavior of different bat species, to illustrate principles of importance to auditory scene analysis by echolocation in bats. In the perceptual experiments, FM bats (Eptesicus fuscus) learned to discriminate between systematic and random delay sequences in echo playback sets. The results of these experiments demonstrate that the FM bat can assemble information about echo delay changes over time, a requirement for the analysis of a dynamic auditory scene. Laboratory insect capture experiments examined the vocal production patterns of flying E. fuscus taking tethered insects in a large room. In each trial, the bats consistently produced echolocation signal groups with a relatively stable repetition rate (within 5%). Similar temporal patterning of sonar vocalizations was also observed in the field recordings from E. fuscus, thus suggesting the importance of temporal control of vocal production for perceptually guided behavior. It is hypothesized that a stable sonar signal production rate facilitates the perceptual organization of echoes arriving from objects at different directions and distances as the bat flies through a dynamic auditory scene. Field recordings of E. fuscus, Noctilio albiventris, N. leporinus, Pipistrellus pipistrellus, and Cormura brevirostris revealed that spectral adjustments in sonar signals may also be important to permit tracking of echoes in a complex auditory scene.

3.
In the auditory system, there should be elements that convert temporal parameters into spatial ones. To simulate such conversion, various neural networks are used. In this study, we modeled this conversion, carried out by one complex neuron on the basis of learning without a teacher. We postulate that conversion of the time code into a spatial code is observed at the input of the model. We assume that every aciculum of a complex neuron responds as a coincidence detector, and that after each coincidence at any synapse, the neuron generates a spike. Every spike at the output of a neuron changes the weight of all acicula according to the Hebb principle. Training of the model is done without a teacher, simply through the model's repeated exposure to a certain type of signal. In the given case, such signals are the actual activity of the cochlear nucleus of the frog, which arises as a response to an amplitude-modulated tone. After the action of such signals, the model behaved as a detector of the modulation frequency used during training. This held up to modulation frequencies near 40 Hz. At higher modulation frequencies, the model even extracted signals with a doubled modulation period.

4.
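Zhang Hong, Ding Jiong, Tong Qinye, Cheng Qianliu. Acta Physica Sinica (物理学报), 2015, 64(18): 188701-188701
The neural information system is in essence a quantitative system, a fact that deserves sufficient attention. Reports of quantitative studies of the nervous system are relatively rare. This gap will hamper further research, for example on binaural sound localization. Binaural localization is a quantitative measurement, and qualitative analysis cannot meet its requirements. Physiological experiments have found a monotonically increasing relationship between the intensity of the sound input signal and the output frequency of the auditory nerve, so in this paper changes in sound intensity are simplified to changes in neural spike frequency. Based on the circle map and the principles of symbolic dynamics, a quantitative model of the neural circuit is built. In the model, the ipsilateral input pathway uses excitatory coupling and the contralateral input pathway uses inhibitory coupling; taking into account the quantal release characteristic of synaptic connections between neurons, the connections are implemented with a chemical coupling model, and a coupling coefficient expresses the degree of coupling between neurons. The Hodgkin-Huxley model is used to simulate the relationship between the input and output spike trains of the auditory neural circuit. Within the parameter range simulated so far, the model shows a monotonically increasing or decreasing relationship between changes in the input signal and changes in the output spike frequency. For single-input single-output neurons, symbolic dynamics is used for symbolization; for multiple-input single-output neurons, the generation time of each output spike is analyzed to determine where it changes, so that the corresponding change in the interaural amplitude difference is obtained from the spike train and used to localize the sound source. As the number of output spikes increases, the symbolic sequence lengthens; the symbolic sequence is sensitive to changes in the input signal, so a high measurement precision can be obtained. The simulation results show that the model is quantitative and that neural spike trains can distinguish the magnitude of signals.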

5.
The spike train regularity of a noisy model of the neural auditory system under the influence of two sinusoidal signals with different frequencies is investigated. As the ratio m/n of the input signal frequencies increases (m and n are natural numbers), the regularity is found to grow linearly at a fixed difference (m - n). It is shown that the spike train regularity in the model is high for harmonious chords of input tones and low for dissonant ones.

6.
Vocal recognition is common among songbirds, and provides an excellent model system to study the perceptual and neurobiological mechanisms for processing natural vocal communication signals. Male European starlings, a species of songbird, learn to recognize the songs of multiple conspecific males by attending to stereotyped acoustic patterns, and these learned patterns elicit selective neuronal responses in auditory forebrain neurons. The present study investigates the perceptual grouping of spectrotemporal acoustic patterns in starling song at multiple temporal scales. The results show that permutations in sequencing of submotif acoustic features have significant effects on song recognition, and that these effects are specific to songs that comprise learned motifs. The observations suggest that (1) motifs form auditory objects embedded in a hierarchy of acoustic patterns, (2) object-based song perception emerges without explicit reinforcement, and (3) multiple temporal scales within the acoustic pattern hierarchy convey information about the individual identity of the singer. The authors discuss the results in the context of auditory object formation and talker recognition.

7.
Amplitude modulation is an important parameter defining vertebrate acoustic communication signals. Nesting male plainfin midshipman fish, Porichthys notatus, emit simple, long duration hums in which modulation is strikingly absent. Envelope modulation is, however, introduced when the hums of adjacent males overlap to produce acoustic beats. Hums attract gravid females and can be mimicked with continuous tones at the fundamental frequency. While individual hums have flat envelopes, other midshipman signals are amplitude modulated. This study used one-choice playback tests with gravid females to examine the role of envelope modulation in hum recognition. Various pulse train and two-tone beat stimuli resembling natural communication signals were presented individually, and the responses compared to those for continuous pure tones. The effectiveness of pulse trains was graded and depended upon both pulse duration and the ratio of pulse to gap length. Midshipman were sensitive to beat modulations from 0.5 to 10 Hz, with fewer fish approaching the beat than the pure tone. Reducing the degree of modulation increased the effectiveness of beat stimuli. Hence, the lack of modulation in the midshipman's advertisement call corresponds to the importance of envelope modulation for the categorization of communication signals even in this relatively simple system.

8.
Vocal quality factors: analysis, synthesis, and perception.   Total citations: 4, self-citations: 0, citations by others: 4
The purpose of this study was to examine several factors of vocal quality that might be affected by changes in vocal fold vibratory patterns. Four voice types were examined: modal, vocal fry, falsetto, and breathy. Three categories of analysis techniques were developed to extract source-related features from speech and electroglottographic (EGG) signals. Four factors were found to be important for characterizing the glottal excitations for the four voice types: the glottal pulse width, the glottal pulse skewness, the abruptness of glottal closure, and the turbulent noise component. The significance of these factors for voice synthesis was studied and a new voice source model that accounted for certain physiological aspects of vocal fold motion was developed and tested using speech synthesis. Perceptual listening tests were conducted to evaluate the auditory effects of the source model parameters upon synthesized speech. The effects of the spectral slope of the source excitation, the shape of the glottal excitation pulse, and the characteristics of the turbulent noise source were considered. Applications for these research results include synthesis of natural sounding speech, synthesis and modeling of vocal disorders, and the development of speaker independent (or adaptive) speech recognition systems.

9.
Because of their dynamic properties, most sounds can best be characterized in the combined frequency-time (FT) domain. Powerful frequency-time characterizations are the Wigner distribution function (WDF) and the Rihacek energy density function (RDF). In the present paper several new concepts are introduced, such as using the WDF to characterize the tuning of auditory neurons under wideband noise stimulation and a new method to quantify phase lock of auditory neurons to a wideband noise. No appreciable differences were found between the WDF and RDF in narrow-band signal representations. However, the differences between the WDF and RDF increase as the bandwidth of the signal increases. When signals are buried in uncorrelated background noise, the average FT function of these signals may be obtained through averaging the FT functions for each signal plus noise segment. The WDF takes at least twice as long to compute as the RDF. The FT functions can be used to characterize (linear) filters by averaging FT functions of input-noise segments that precede threshold crossings of the filter's output signal. Both the WDF and the RDF were used to characterize auditory neurons from the midbrain in anurans; the WDF always had a smaller bandwidth than the RDF. By comparing the spectrum of the reverse correlation function and the average spectrum of the noise segments preceding the spikes, a quantification of the amount of phase lock of the auditory neuron to the noise is obtained.

10.
We investigated the influences of different types of temporal correlations in the input signal on the functions and coding properties of neurons in the primary visual cortex (V1). We found that the temporal transfer functions of V1 neurons exhibit higher gain, and the spike responses exhibit higher coding efficiency and information transmission rates, for the 1/f (natural long-term correlation) signals than for 1/f^0 (no correlation) and 1/f^2 (stronger long-term correlation) signals. These results suggest that the intermediate long-term correlation ubiquitous to natural signals may play an important role in shaping and optimizing the machinery of neurons in their adaptation to the natural environment.

11.
At a cocktail party, listeners must attend selectively to a target speaker and segregate their speech from distracting speech sounds uttered by other speakers. To solve this task, listeners can draw on a variety of vocal, spatial, and temporal cues. Recently, Vestergaard et al. [J. Acoust. Soc. Am. 125, 1114-1124 (2009)] developed a concurrent-syllable task to control temporal glimpsing within segments of concurrent speech, and this allowed them to measure the interaction of glottal pulse rate and vocal tract length and reveal how the auditory system integrates information from independent acoustic modalities to enhance recognition. The current paper shows how the interaction of these acoustic cues evolves as the temporal overlap of syllables is varied. Temporal glimpses as short as 25 ms are observed to improve syllable recognition substantially when the target and distracter have similar vocal characteristics, but not when they are dissimilar. The effect of temporal glimpsing on recognition performance is strongly affected by the form of the syllable (consonant-vowel versus vowel-consonant), but it is independent of other phonetic features such as place and manner of articulation.

12.
This study describes the effects on the spike count, spike timing, and entrainment of cat auditory cortex neurons of parametric variations in the repetition rate and amplitude of a brief, characteristic frequency tone pulse. Data were obtained from single neurons in barbiturate-anesthetized cats to which signals were presented monaurally to the ear contralateral to the recording electrode. All neurons showed low-pass sensitivity to tone repetition rate. In cells with a monotonic rate response, the effect of an increasing stimulus level was to elevate the response rate and to extend performance to higher repetition rates. In nonmonotonic cells, cutoff frequencies (for repetition rate) varied with overall spike count. Latent periods increased with increases in repetition rate. This effect developed over the first few stimulus trials at any given repetition rate. Spike entrainment to the tone pulses varied with both repetition rate and signal level. Increases in signal level improved entrainment for responses to stimuli presented at low repetition rates, but entrainment at high repetition rates always saturated at significantly imperfect levels.

13.
In contrast to humans and songbirds, there is limited evidence of vocal learning in nonhuman primates. While previous studies suggested that primate vocalizations exhibit developmental changes, detailed analyses of the extent and time course of such changes across a species' vocal repertoire remain limited. In a highly vocal primate, the common marmoset (Callithrix jacchus), we studied developmental changes in the acoustic structure of species-specific communication sounds produced in a social setting. We performed detailed acoustic analyses of the spectral and temporal characteristics of marmoset vocalizations during development, comparing differences between genders and twin pairs, as well as with vocalizations from adult marmosets residing in the same colony. Our analyses revealed significant changes in spectral and temporal features as well as variability of particular call types over time. Infant and juvenile vocalizations changed progressively toward the vocalizations produced by adult marmosets. Call types observed early in development that were unique to infants disappeared gradually with age, while vocal exchanges with conspecifics emerged. Our observations clearly indicate that marmoset vocalizations undergo both qualitative and quantitative postnatal changes, establishing the basis for further studies to delineate contributions from maturation of the vocal apparatus and behavioral experience.

14.
Forward masking, as measured behaviorally, is defined as an increase in a signal's detection threshold resulting from a preceding masker. Previously, forward masking in the auditory nerve has been measured as a reduction in the neural response to a signal when preceded by a masker. However, detection threshold depends on both the magnitude of the response to the signal and the variance of the response. Thus changes in detectability cannot be inferred from response reduction alone. Relkin and Pelli (1987) have described a two-interval forced-choice procedure that may be used to measure the threshold for the detection of a probe signal in recordings of spike counts in single auditory neurons. These methods have been used to study the forward masking of characteristic frequency probe tones by characteristic frequency maskers as masker intensity was varied. Although the masker does reduce the detectability of the probe tone, it was found that the threshold shifts are much less than those observed behaviorally, particularly for intense maskers. In part, the small threshold shifts can be attributed to the reduction in response variance following the masker, which is the result of the adaptation of spontaneous activity. These results imply that behavioral forward masking must result from suboptimal processing of spike counts from auditory neurons at a location central to the auditory nerve.
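
The point that detectability depends on response variance as well as response magnitude can be illustrated with a small sketch. It is not the Relkin and Pelli procedure: it assumes spike counts in the probe and blank intervals are independent and roughly Gaussian, and every number in it is invented for illustration. The function name `two_interval_percent_correct` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_interval_percent_correct(mean_probe, var_probe, mean_blank, var_blank):
    """Probability that the probe interval yields the larger spike count.

    Assumes independent, approximately Gaussian counts in the two intervals
    (an illustrative simplification, not taken from the abstract).
    """
    # The difference of two independent Gaussians has the summed variance.
    z = (mean_probe - mean_blank) / sqrt(var_probe + var_blank)
    return NormalDist().cdf(z)


# Hypothetical counts: the masker halves the probe-driven increase
# (4 spikes -> 2 spikes) but also cuts the count variance (16 -> 4).
unmasked = two_interval_percent_correct(12.0, 16.0, 8.0, 16.0)
masked = two_interval_percent_correct(5.0, 4.0, 3.0, 4.0)

print(f"unmasked: {unmasked:.3f}, masked: {masked:.3f}")
```

With these made-up values the response reduction and the variance reduction cancel, so the probe is equally detectable in both conditions, which is why response reduction alone cannot stand in for a threshold shift.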

15.
Speech intelligibility is known to be relatively unaffected by certain deformations of the acoustic spectrum. These include translations, stretching or contracting dilations, and shearing of the spectrum (represented along the logarithmic frequency axis). It is argued here that such robustness reflects a synergy between vocal production and auditory perception. Thus, on the one hand, it is shown that these spectral distortions are produced by common and unavoidable variations among different speakers pertaining to the length, cross-sectional profile, and losses of their vocal tracts. On the other hand, it is argued that these spectral changes leave the auditory cortical representation of the spectrum largely unchanged except for translations along one of its representational axes. These assertions are supported by analyses of production and perception models. On the production side, a simplified sinusoidal model of the vocal tract is developed which analytically relates a few "articulatory" parameters, such as the extent and location of the vocal tract constriction, to the spectral peaks of the acoustic spectra synthesized from it. The model is evaluated by comparing the identification of synthesized sustained vowels to labeled natural vowels extracted from the TIMIT corpus. On the perception side a "multiscale" model of sound processing is utilized to elucidate the effects of the deformations on the representation of the acoustic spectrum in the primary auditory cortex. Finally, the implications of these results for the perception of generally identifiable classes of sound sources beyond the specific case of speech and the vocal tract are discussed.

16.
The fundamental frequency of vocal fold oscillation (F0) is controlled by laryngeal mechanics and aerodynamic properties. F0 change per unit change of transglottal pressure (dF/dP) using a shutter valve has been studied and found to have a nonlinear, V-shaped relationship with F0. On the other hand, the vocal tract is also known to affect vocal fold oscillation. This study examined the effect of an artificially lengthened vocal tract on dF/dP. dF/dP was measured in six men using two mouthpieces of different lengths. Results: The dF/dP graph for the longer vocal tract was shifted leftward relative to the shorter one. Conclusion: Using the one-mass model, the nadir of the "V" on the dF/dP graph was strongly influenced by the resonance around the first formant frequency. However, a more precise model is needed to account for the effects of viscosity and turbulence.

17.
A blind method for suppressing late reverberation from speech and audio signals is presented. The proposed technique operates in both the spectral and the sub-band domains, employing a single input channel. First, a preliminary rough estimate of the clean signal is required, and for this any standard technique may be applied; here the estimate is obtained through spectral subtraction. Then, an auditory masking model is employed in sub-bands to extract the reverberation masking index (RMI), which identifies signal regions with perceived alterations due to late reverberation. Using a selective signal processing technique, only these regions are suppressed through sub-band temporal envelope filtering based on analytical expressions. Objective and subjective measures indicate that the proposed method achieves significant late reverberation suppression for both speech and music signals over a wide range of reverberation time (RT) scenarios.

18.
Despite much research, the relationship between vocal acoustic signals and perceived voice quality is not well understood. The present study used an auditory model proposed by Moore et al. to study how changes in the acoustic spectrum may relate to changes in perceptual ratings of breathiness. Perceptual ratings of breathiness were obtained using a multidimensional scaling (MDS) design. The stimulus distances on the dominant MDS dimension were correlated with several commonly used acoustic measures for voice quality. These distances were also compared with measures obtained from the output of the auditory model. Results show that the partial loudness of the harmonic energy obtained with the aspiration noise acting as a masker was the most important predictor of perceptual ratings of breathiness. Results also demonstrate that measures obtained from the auditory spectrum were better predictors of perceptual ratings of breathiness than were commonly used acoustic spectral measures.

19.
The effects of variations in vocal effort corresponding to common conversation situations on spectral properties of vowels were investigated. A database was recorded in which three degrees of vocal effort were suggested to the speakers by varying the distance to their interlocutor in three steps (close, 0.4 m; normal, 1.5 m; and far, 6 m). The speech materials consisted of isolated French vowels, uttered by ten naive speakers in a quiet furnished room. Manual measurements of fundamental frequency F0, of the frequencies and amplitudes of the first three formants (F1, F2, F3, A1, A2, and A3), and of total amplitude were carried out. The speech materials were perceptually validated in three respects: identity of the vowel, gender of the speaker, and vocal effort. Results indicated that the speech materials were appropriate for the study. Acoustic analysis showed that F0 and F1 were highly correlated with vocal effort and varied at rates close to 5 Hz/dB for F0 and 3.5 Hz/dB for F1. Statistically, F2 and F3 did not vary significantly with vocal effort. Formant amplitudes A1, A2, and A3 increased significantly; the amplitudes in the high-frequency range increased more than those in the lower part of the spectrum, revealing a change in spectral tilt. On average, when the overall amplitude is increased by 10 dB, A1, A2, and A3 are increased by 11, 12.4, and 13 dB, respectively. Using "auditory" dimensions, such as the F1-F0 difference, and a "spectral center of gravity" between adjacent formants for representing vowel features did not reveal a better constancy of these parameters with respect to the variations of vocal effort and speaker. Thus a global view is evoked, in which all of the aspects of the signal should be processed simultaneously.

20.
To study the mechanisms that govern the coding of temporal features of complex sound signals, responses of single neurons located in the dorsal nucleus of the medulla oblongata (the cochlear nucleus) of a curarized grass frog (Rana temporaria) to pure tone bursts and amplitude modulated tone bursts with a modulation frequency of 20 Hz and modulation depths of 10 and 80% were recorded. The carrier frequency was equal to the characteristic frequency of a neuron, the average signal level was 20–30 dB above the threshold, and the signal duration was equal to ten full modulation periods. Of the 133 neurons studied, 129 neurons responded to 80% modulated tone bursts by discharges that were phase-locked with the envelope waveform. At this modulation depth, the best phase locking was observed for neurons with the phasic type of response to tone bursts. For tonic neurons with low characteristic frequencies, along with the reproduction of the modulation, phase locking with the carrier frequency of the signal was observed. At 10% amplitude modulation, phasic neurons usually responded to only the onset of a tone burst. Almost all tonic units showed a tendency to reproduce the envelope, although the efficiency of the reproduction was low, and for half of these neurons, it was below the reliability limit. Some neurons exhibited a more efficient reproduction of the weak modulation. For almost half of the neurons, a reliable improvement was observed in the phase locking of the response during the tone burst presentation (from the first to the tenth modulation period). The cooperative histogram of a set of neurons responding to 10% modulated tone bursts within narrow ranges of frequencies and intensities retains the information on the dynamics of the envelope variation. The data are compared with the results obtained from the study of the responses to similar signals in the acoustic midbrain center of the same species and also with the psychophysical effect of a differential sensitivity increase in the process of adaptation.
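
Phase locking of spikes to a modulation envelope is often summarized with a single number. The sketch below uses vector strength, a common choice; the abstract does not say which measure the authors used, so this is an assumption. The 20 Hz modulation frequency and the ten-period duration follow the abstract, while the spike times and the function name `vector_strength` are hypothetical.

```python
import cmath
import math


def vector_strength(spike_times_s, mod_freq_hz):
    """Vector strength of spike times relative to a periodic envelope.

    Returns 1.0 when every spike falls at the same envelope phase and a value
    near 0.0 when spikes are spread evenly over the modulation cycle.
    """
    if not spike_times_s:
        raise ValueError("at least one spike is required")
    # Map each spike to a unit phasor at its phase within the modulation cycle.
    phasors = [cmath.exp(2j * math.pi * mod_freq_hz * t) for t in spike_times_s]
    return abs(sum(phasors)) / len(phasors)


MOD_FREQ_HZ = 20.0  # modulation frequency from the abstract
N_PERIODS = 10      # ten full modulation periods, as in the abstract

# Invented spike trains: one spike per period at a fixed phase, and one spike
# per period whose phase advances by a tenth of a cycle each period.
locked = [k / MOD_FREQ_HZ + 0.010 for k in range(N_PERIODS)]
drifting = [k / MOD_FREQ_HZ + 0.005 * k for k in range(N_PERIODS)]

print(vector_strength(locked, MOD_FREQ_HZ))
print(vector_strength(drifting, MOD_FREQ_HZ))
```

Computing the same quantity period by period would be one way to look at the improvement in phase locking from the first to the tenth modulation period that the abstract reports.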
