Similar Literature
20 similar documents found (search time: 484 ms)
1.
The neural processes underlying concurrent sound segregation were examined by using event-related brain potentials. Participants were presented with complex sounds composed of multiple harmonics, one of which could be mistuned so that it was no longer an integer multiple of the fundamental. In separate blocks of trials, short-, middle-, and long-duration sounds were presented, and participants indicated whether they heard one sound (i.e., a buzz) or two sounds (i.e., a buzz plus another sound with a pure-tone quality). The auditory stimuli were also presented while participants watched a silent movie in order to evaluate the extent to which the mistuned harmonic could be automatically detected. The perception of the mistuned harmonic as a separate sound was associated with a biphasic negative-positive potential that peaked at about 150 and 350 ms after sound onset, respectively. Long-duration sounds also elicited a sustained potential that was greater in amplitude when the mistuned harmonic was perceptually segregated from the complex sound. The early negative wave, referred to as the object-related negativity (ORN), was present during both active and passive listening, whereas the positive wave and the mistuning-related changes in sustained potentials were present only when participants attended to the stimuli. These results are consistent with a two-stage model of auditory scene analysis in which the acoustic wave is automatically decomposed into perceptual groups that can be identified by higher executive functions. The ORN and the positive waves were little affected by sound duration, indicating that concurrent sound segregation depends on transient neural responses elicited by the discrepancy between the mistuned harmonic and the harmonic frequency expected on the basis of the fundamental frequency of the incoming stimulus.

2.
The ability to separate simultaneous auditory objects is crucial to infant auditory development. Music in particular relies on the ability to separate musical notes, chords, and melodic lines. Little research addresses how infants process simultaneous sounds. The present study used a conditioned head-turn procedure to examine whether 6-month-old infants are able to discriminate a complex tone (240 Hz, 500 ms, six harmonics in random phase with a 6 dB roll-off per octave) from a version with the third harmonic mistuned. Adults perceive such stimuli as containing two auditory objects, one with the pitch of the mistuned harmonic and the other with pitch corresponding to the fundamental of the complex tone. Adult thresholds were between 1% and 2% mistuning. Infants performed above chance levels for 8%, 6%, and 4% mistunings, with no significant difference between conditions. However, performance was not significantly different from chance for 2% mistuning and significantly worse for 2% compared to all larger mistunings. These results indicate that 6-month-old infants are sensitive to violations of harmonic structure and suggest that they are able to separate two simultaneously sounding objects.
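The stimulus specification above is concrete enough to sketch in code. The following is a minimal synthesis of the 240-Hz complex (six harmonics in random phase, 6 dB/octave roll-off, 500 ms, third harmonic mistuned); the 44.1-kHz sampling rate and the 8% mistuning value are illustrative choices, not details of the study's apparatus:

```python
import numpy as np

def mistuned_complex(f0=240.0, n_harmonics=6, mistuned_harmonic=3,
                     mistuning_pct=8.0, dur=0.5, fs=44100, seed=0):
    """Harmonic complex with random-phase partials, a 6 dB/octave
    amplitude roll-off (amplitude ~ 1/n), and one harmonic mistuned
    by a percentage of its nominal frequency."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(dur * fs)) / fs
    x = np.zeros_like(t)
    for n in range(1, n_harmonics + 1):
        f = n * f0
        if n == mistuned_harmonic:
            f *= 1.0 + mistuning_pct / 100.0  # shift off the harmonic grid
        amp = 1.0 / n          # -6 dB per octave: amplitude halves per doubling
        phase = rng.uniform(0, 2 * np.pi)
        x += amp * np.sin(2 * np.pi * f * t + phase)
    return x / np.max(np.abs(x))  # normalize to avoid clipping
```

Sweeping `mistuning_pct` through the values reported above (2%, 4%, 6%, 8%) reproduces the stimulus continuum spanning the adult and infant thresholds.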

3.
The cortical mechanisms of perceptual segregation of concurrent sound sources were examined, based on binaural detection of interaural timing differences. Auditory event-related potentials were measured from 11 healthy subjects. Binaural stimuli were created by introducing a dichotic delay to a narrow frequency region within a 500-ms broadband noise, resulting in the perception of a centrally located noise and a right-lateralized pitch (dichotic pitch). In separate listening conditions, subjects actively discriminated and responded to randomly interleaved binaural and control stimuli, or ignored random stimuli while watching silent cartoons. In a third listening condition, subjects ignored stimuli presented in homogeneous blocks. For all listening conditions, the dichotic pitch stimulus elicited an object-related negativity (ORN) at a latency of about 150-250 ms after stimulus onset. When subjects were required to actively respond to stimuli, the ORN was followed by a P400 wave with a latency of about 320-420 ms. These results support and extend a two-stage model of auditory scene analysis in which acoustic streams are automatically parsed into component sound sources based on source-relevant cues, followed by a controlled process involving identification and generation of a behavioral response.
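The dichotic-pitch construction lends itself to a short sketch: identical noise is sent to both ears, except that a narrow frequency band in one channel receives an interaural time delay, applied here as a frequency-domain phase shift. The band edges, the 0.5-ms delay, and the sample rate are illustrative assumptions; the study's exact parameters are not given in the abstract:

```python
import numpy as np

def dichotic_pitch(dur=0.5, fs=44100, band=(570.0, 630.0),
                   itd=0.0005, seed=1):
    """Identical broadband noise to both ears, except that one narrow
    band in the right channel is delayed by `itd` seconds.  Heard
    binaurally, the delayed band stands out as a faint lateralized
    pitch against a centered noise."""
    rng = np.random.default_rng(seed)
    n = int(dur * fs)
    noise = rng.standard_normal(n)
    spec = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    right = spec.copy()
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    # a pure delay is a linear phase shift; magnitudes are untouched
    right[in_band] *= np.exp(-2j * np.pi * freqs[in_band] * itd)
    left_t = np.fft.irfft(spec, n)
    right_t = np.fft.irfft(right, n)
    scale = np.max(np.abs(np.c_[left_t, right_t]))
    return left_t / scale, right_t / scale
```

Because the two channels differ only in the phase of the delayed band, each ear alone hears plain noise; the pitch exists only in the binaural comparison, which is what makes the stimulus useful for isolating central segregation mechanisms.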

4.
Three experiments investigated how the onset asynchrony and ear of presentation of a single mistuned frequency component influence its contribution to the pitch of an otherwise harmonic complex tone. Subjects matched the pitch of the target complex by adjusting the pitch of a second similar but strictly periodic complex tone. When the mistuned component (the 4th harmonic of a 155 Hz fundamental) started 160 ms or more before the remaining harmonics but stopped simultaneously with them, it made a reduced contribution to the pitch of the complex. It made no contribution if it started more than 300 ms before. Pitch shifts and their reduction with onset time were larger for short (90 ms) sounds than for long (410 ms) sounds. Pitch shifts were slightly larger when the mistuned component was presented to the same ear as the remaining 11 in-tune harmonics than to the opposite ear. Adding a "captor" complex tone with a fundamental of 200 Hz and a missing 3rd harmonic to the contralateral ear did not augment the effect of onset time, even though the captor was synchronous with the mistuned harmonic, the mistuned component was equal in frequency to the missing 3rd harmonic of the captor complex tone, and it was played to the same ear as the captor. The results show that a difference in onset time can prevent a resolved frequency component from contributing to the pitch of a complex tone even though it is present throughout that complex tone.

5.
Simultaneous tones that are harmonically related tend to be grouped perceptually to form a unitary auditory image. A partial that is mistuned stands out from the other tones, and harmonic complexes with different fundamental frequencies can readily be perceived as separate auditory objects. These phenomena are evidence for the strong role of harmonicity in perceptual grouping and segregation of sounds. This study measured the discriminability of harmonicity directly. In a two-interval, two-alternative forced-choice (2I2AFC) paradigm, the listener chose which of two sounds, signal or foil, was composed of tones that more closely matched an exact harmonic relationship. In one experiment, the signal was varied from perfectly harmonic to highly inharmonic by adding frequency perturbation to each component. The foil always had 100% perturbation. Group mean performance decreased from greater than 90% correct for 0% signal perturbation to near chance for 80% signal perturbation. In the second experiment, harmonicity discrimination was disrupted by a masker presented simultaneously with the signals and foils. Both monaural and dichotic conditions were tested. Signal level was varied relative to masker level to obtain psychometric functions from which slopes and midpoints were estimated. Dichotic presentation of these audible stimuli improved performance by 3-10 dB, due primarily to a release from "informational masking" by the perceptual segregation of the signal from the masker.
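One way to realize the perturbation manipulation is to jitter each partial away from its exact harmonic frequency by a random fraction of the harmonic spacing, with the perturbation percentage scaling the maximum deviation. This parameterization is an assumption for illustration; the abstract does not specify the paper's exact perturbation rule:

```python
import numpy as np

def perturbed_partials(f0=200.0, n_partials=10, perturbation_pct=50.0, seed=2):
    """Partial frequencies for a complex tone whose components are
    jittered away from exact harmonics.  At 0% the tone is perfectly
    harmonic; at 100% each partial may deviate by up to half the
    harmonic spacing (an illustrative parameterization)."""
    rng = np.random.default_rng(seed)
    n = np.arange(1, n_partials + 1)
    max_dev = 0.5 * f0 * perturbation_pct / 100.0
    jitter = rng.uniform(-max_dev, max_dev, size=n_partials)
    return n * f0 + jitter
```

Under this scheme the signal and foil of a 2I2AFC trial would simply use different `perturbation_pct` values (e.g., the signal's tested value versus the foil's fixed 100%), and the listener picks the interval closer to exact harmonicity.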

6.
When a partial of a periodic complex is mistuned, its change in pitch is greater than expected. Two experiments examined whether these partial-pitch shifts are related to the computation of global pitch. In experiment 1, stimuli were either harmonic or frequency-shifted (25% of F0) complexes. One partial was mistuned by +/-4% and played with leading and lagging portions of 500 ms each, relative to the other components (1 s), in both monaural and dichotic contexts. Subjects indicated whether the mistuned partial was higher or lower in pitch when concurrent with the other components. Responses were positively correlated with the direction of mistuning in all conditions. In experiment 2, stimuli from each condition were compared with synchronous equivalents. Subjects matched a pure tone to the pitch of the mistuned partial (component 4). The results showed that partial-pitch shifts are not reduced in size by asynchrony. Similar asynchronies are known to produce a near-exclusion of a mistuned partial from the global-pitch computation. This mismatch indicates that global and partial pitch are derived from different processes. The similarity of the partial-pitch shifts observed for harmonic and frequency-shifted stimuli suggests that they arise from a grouping mechanism that is sensitive to spectral regularity.

7.
The auditory system continuously parses the acoustic environment into auditory objects, usually representing separate sound sources. Sound sources typically show characteristic emission patterns. These regular temporal sound patterns are possible cues for distinguishing sound sources. The present study was designed to test whether regular patterns are used as cues for source distinction and to specify the role that detecting these regularities may play in the process of auditory stream segregation. Participants were presented with tone sequences, and they were asked to continuously indicate whether they perceived the tones in terms of a single coherent sequence of sounds (integrated) or as two concurrent sound streams (segregated). Unknown to the participants, in some stimulus conditions, regular patterns were present in one or both putative streams. In all stimulus conditions, participants' perception switched back and forth between the two sound organizations. Importantly, regular patterns occurring in either one or both streams prolonged the mean duration of two-stream percepts, whereas the duration of one-stream percepts was unaffected. These results suggest that temporal regularities are utilized in auditory scene analysis. It appears that the role of this cue lies in stabilizing streams once they have been formed on the basis of simpler acoustic cues.

8.
Hearing a mistuned harmonic in an otherwise periodic complex tone
The ability of a listener to detect a mistuned harmonic in an otherwise periodic tone is representative of the capacity to segregate auditory entities on the basis of steady-state signal cues. Using a task in which listeners matched the pitch of a mistuned harmonic, this ability was studied to determine its dependence on mistuned-harmonic number, fundamental frequency, signal level, and signal duration. The results considerably augment the data previously obtained from discrimination experiments and from experiments in which listeners counted apparent sources. Although previous work has emphasized the role of spectral resolution in the segregation process, the present work suggests that neural synchrony is an important consideration; our data show that listeners lose the ability to segregate mistuned harmonics at high frequencies where synchronous neural firing vanishes. The functional form of this loss is insensitive to the spacing of the harmonics. The matching experiment also permits the measurement of the pitches of mistuned harmonics. The data exhibit shifts of a form that argues against models of pitch shifts that are based entirely upon partial masking.

9.
This paper describes a neurocognitive model of pitch segregation in which it is proposed that recognition mechanisms initiate early in auditory processing pathways, so that long-term memory templates may be employed to segregate and integrate auditory features. In this model, neural representations of pitch height are primed by the location and pattern of excitation across auditory filter channels in relation to long-term memory templates for common stimuli. Since waveform-driven pitch mechanisms may produce information at multiple frequencies for tonal stimuli, pitch priming was assumed to include competitive inhibition that would allow only one pitch estimate at any time. Consequently, concurrent pitch information must be relayed to short-term memory via a parallel mechanism that employs pitch information contained in the long-term memory template of the chord. Pure tones, harmonic complexes, and two-pitch chords of harmonic complexes were correctly classified by correlating templates comprising auditory nerve excitation and off-frequency inhibition with the excitation patterns of stimuli. The model then replicated behavioral data for pitch matching of concurrent vowels. Comparison of model outputs to the behavioral data suggests that an inability to recognize a stimulus was associated with poor pitch segregation due to the use of inappropriate pitch-priming strategies.

10.
Recovery of auditory brainstem responses (ABR) in a bottlenose dolphin was studied in conditions of double-pip stimulation when the two stimuli in a pair differed in frequency and intensity. When the conditioning and test stimuli were of equal frequencies, the test response was markedly suppressed at short interstimulus intervals; complete recovery appeared at intervals from about 2 ms (when the two stimuli were of equal intensity) to 10-20 ms (when the conditioning stimulus exceeded the test by up to 40 dB). When the two stimuli were of different frequencies, the suppression diminished and was almost absent at a half-octave difference, even if the conditioning stimulus exceeded the test one by 40 dB. Frequency-dependence curves (the dependence of ABR amplitude on the frequency difference between the two stimuli) had equivalent rectangular bandwidths ranging from +/-0.2 oct at test stimuli of 20 dB above threshold to +/-0.5 oct at test stimuli of 50 dB above threshold.

11.
Echolocating bats transmit ultrasonic vocalizations and use information contained in the reflected sounds to analyze the auditory scene. Auditory scene analysis, a phenomenon that applies broadly to all hearing vertebrates, involves the grouping and segregation of sounds to perceptually organize information about auditory objects. The perceptual organization of sound is influenced by the spectral and temporal characteristics of acoustic signals. In the case of the echolocating bat, its active control over the timing, duration, intensity, and bandwidth of sonar transmissions directly impacts its perception of the auditory objects that comprise the scene. Here, data are presented from perceptual experiments, laboratory insect capture studies, and field recordings of sonar behavior of different bat species, to illustrate principles of importance to auditory scene analysis by echolocation in bats. In the perceptual experiments, FM bats (Eptesicus fuscus) learned to discriminate between systematic and random delay sequences in echo playback sets. The results of these experiments demonstrate that the FM bat can assemble information about echo delay changes over time, a requirement for the analysis of a dynamic auditory scene. Laboratory insect capture experiments examined the vocal production patterns of flying E. fuscus taking tethered insects in a large room. In each trial, the bats consistently produced echolocation signal groups with a relatively stable repetition rate (within 5%). Similar temporal patterning of sonar vocalizations was also observed in the field recordings from E. fuscus, thus suggesting the importance of temporal control of vocal production for perceptually guided behavior. It is hypothesized that a stable sonar signal production rate facilitates the perceptual organization of echoes arriving from objects at different directions and distances as the bat flies through a dynamic auditory scene. Field recordings of E. fuscus, Noctilio albiventris, N. leporinus, Pipistrellus pipistrellus, and Cormura brevirostris revealed that spectral adjustments in sonar signals may also be important to permit tracking of echoes in a complex auditory scene.

12.
Vocal vibrato and tremor are characterized by oscillations in voice fundamental frequency (F0). These oscillations may be sustained by a control loop within the auditory system. One component of the control loop is the pitch-shift reflex (PSR). The PSR is a closed-loop negative feedback reflex that is triggered in response to discrepancies between intended and perceived pitch with a latency of approximately 100 ms. Consecutive compensatory reflexive responses lead to oscillations in pitch every approximately 200 ms, resulting in approximately 5-Hz modulation of F0. Pitch-shift reflexes were elicited experimentally in six subjects while they sustained /u/ vowels at a comfortable pitch and loudness. Auditory feedback was sinusoidally modulated at discrete integer frequencies (1 to 10 Hz) with +/-25 cents amplitude. Modulated auditory feedback induced oscillations in voice F0 output of all subjects at rates consistent with vocal vibrato and tremor. Transfer functions revealed peak gains at 4 to 7 Hz in all subjects, with an average peak gain at 5 Hz. These gains occurred in the modulation frequency region where the voice output and auditory feedback signals were in phase. A control loop in the auditory system may sustain vocal vibrato and tremorlike oscillations in voice F0.
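The sinusoidally modulated feedback described above can be sketched by integrating an instantaneous frequency whose log-frequency excursion is +/-25 cents around the base pitch. The 220-Hz base frequency and sample rate are illustrative assumptions; only the 25-cent depth and 1-10 Hz modulation rates come from the abstract:

```python
import numpy as np

def cents_modulated_tone(f0=220.0, fm=5.0, depth_cents=25.0,
                         dur=1.0, fs=44100):
    """Tone whose fundamental is sinusoidally modulated by
    +/- depth_cents cents at rate fm.  The phase is obtained by
    integrating the instantaneous frequency, so the frequency
    trajectory is smooth (no phase discontinuities)."""
    t = np.arange(int(dur * fs)) / fs
    # cents are a log-frequency unit: 1200 cents = one octave
    f_inst = f0 * 2.0 ** ((depth_cents / 1200.0) * np.sin(2 * np.pi * fm * t))
    phase = 2 * np.pi * np.cumsum(f_inst) / fs
    return np.sin(phase), f_inst
```

Varying `fm` over the integer rates 1-10 Hz would reproduce the modulation-frequency sweep used to estimate the transfer function of the feedback loop.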

13.
The cerebral magnetic field of the auditory steady-state response (SSR) to sinusoidally amplitude-modulated (SAM) tones was recorded in healthy humans. The waveforms of underlying cortical source activity were calculated at multiples of the modulation frequency using the method of source space projection, which improved the signal-to-noise ratio (SNR) by a factor of 2 to 4. Since the complex amplitudes of the cortical source activity were independent of the sensor position in relation to the subject's head, a comparison of the results across experimental sessions was possible. The effect of modulation frequency on the amplitude and phase of the SSR was investigated at 30 different values between 10 and 98 Hz. At modulation frequencies between 10 and 20 Hz, the SNR of harmonics near 40 Hz exceeded that of the fundamental SSR. Above 30 Hz the SSR showed an almost sinusoidal waveform with an amplitude maximum at 40 Hz. The amplitude decreased with increasing modulation frequency but was significantly different from the magnetoencephalographic (MEG) background activity up to 98 Hz. The phase response at the fundamental and first harmonic decreased monotonically with increasing modulation frequency. The group delay (apparent latency) showed peaks of 72 ms at 20 Hz, 48 ms at 40 Hz, and 26 ms at 80 Hz. The effects of stimulus intensity, modulation depth, and carrier frequency on the amplitude and phase of the SSR were also investigated. The SSR amplitude decreased linearly when stimulus intensity or modulation depth was decreased in logarithmic steps. SSR amplitude decreased by a factor of 3 when carrier frequency increased from 250 to 4000 Hz. From the phase characteristics, time delays were found in the range of 0 to 6 ms for stimulus intensity, modulation depth, and carrier frequency, and these were maximal at low frequencies, low intensities, or maximal modulation depth.
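A SAM tone of the kind used to drive the SSR has a standard closed form: a carrier multiplied by a raised-sine envelope. This sketch uses a 1-kHz carrier and 40-Hz modulation, illustrative values within the ranges reported above:

```python
import numpy as np

def sam_tone(fc=1000.0, fm=40.0, depth=1.0, dur=1.0, fs=44100):
    """Sinusoidally amplitude-modulated (SAM) tone: carrier fc
    modulated at rate fm with modulation depth `depth` in [0, 1].
    Normalized by (1 + depth) so the envelope peak is 1."""
    t = np.arange(int(dur * fs)) / fs
    env = 1.0 + depth * np.sin(2 * np.pi * fm * t)
    return env * np.sin(2 * np.pi * fc * t) / (1.0 + depth)
```

The spectrum of such a tone is the carrier at fc flanked by sidebands at fc - fm and fc + fm, each scaled by depth/2 relative to the carrier, which is why reducing modulation depth in logarithmic steps produces the roughly linear SSR amplitude decrease described above.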

14.
Frequency modulation coherence was investigated as a possible cue for the perceptual segregation of concurrent sound sources. Synthesized chords of 2-s duration and comprising six permutations of three sung vowels (/a/, /i/, /o/) at three fundamental frequencies (130.8, 174.6, and 233.1 Hz) were constructed. In one condition, no vowels were modulated, and, in a second, all three were modulated coherently such that the ratio relations among all frequency components were maintained. In a third group of conditions, one vowel was modulated, while the other two remained steady. In a fourth group, one vowel was modulated independently of the two other vowels, which were modulated coherently with one another. Subjects were asked to judge the perceived prominence of each of the three vowels in each chord. Judged prominence increased significantly when the target vowel was modulated compared to when it was not, with the greatest increase being found for higher fundamental frequencies. The increase in prominence with modulation was unaffected by whether the target was modulated coherently or not with nontarget vowels. The modulation and pitch position of nontarget vowels had no effect on target vowel prominence. These results are discussed in terms of possible concurrent auditory grouping principles.

15.
The segregation of concurrent vocal signals is an auditory processing task faced by all vocal species. To segregate concurrent signals, the auditory system must encode the spectral and temporal features of the fused waveforms such that at least one signal can be individually detected. In the plainfin midshipman fish (Porichthys notatus), the overlapping mate calls of neighboring males produce acoustic beats with amplitude and phase modulations at the difference frequencies (dF) between spectral components. Prior studies in midshipman have shown that midbrain neurons provide a combinatorial code of the temporal and spectral characteristics of beats via synchronization of spike bursts to dF and changes in spike rate and interspike intervals with changes in spectral composition. In the present study we examine the effects of changes in signal parameters of beats (overall intensity level and depth of modulation) on the spike train outputs of midbrain neurons. The observed changes in spike train parameters further support the hypothesis that midbrain neurons provide a combinatorial code of the spectral and temporal features of concurrent vocal signals.

16.
A four-hydrophone linear array was used to localize calling black drum and to estimate source levels and signal propagation. A total of 1025 source level estimates averaged 165 dB(RMS) re 1 μPa (standard deviation (SD) = 1.0). The authors suggest that the diverticulated morphology of the black drum swimbladder increases the bladder's surface area, thus contributing to sound amplitude. Call energy was greatest in the fundamental frequency (94 Hz), followed by the second (188 Hz) and third harmonics (282 Hz). A square root model best described propagation of the entire call, and separately of the fundamental frequency and second harmonic. A logarithmic model best described propagation of the third harmonic, which was the only component to satisfy the cut-off frequency equation. Peak auditory sensitivity was 300 Hz at a 94 dB re 1 μPa threshold, based on auditory evoked potential measurements of a single black drum. Based on the mean RMS source level, signal propagation, background levels, and hearing sensitivity, the communication range of black drum was estimated at 33-108 m and was limited by background levels rather than by auditory sensitivity. This estimate assumed the source and receiver were approximately 0.5 m above the bottom. Consecutive calls of an individual fish localized over 59 min demonstrated a mean calling period of 3.6 s (SD = 0.48), a mean swimming speed of 0.5 body lengths/s, and a total distance traveled of 1035 m.

17.
Listeners' auditory discrimination of vowel sounds depends in part on the order in which stimuli are presented. Such presentation order effects have been argued to be language independent, and to result from psychophysical (not speech- or language-specific) factors such as the decay of memory traces over time or increased weighting of later-occurring stimuli. In the present study, native Cantonese speakers' discrimination of a linguistic tone continuum is shown to exhibit order of presentation effects similar to those shown for vowels in previous studies. When presented with two successive syllables differing in fundamental frequency by approximately 4 Hz, listeners were significantly more sensitive to this difference when the first syllable was higher in frequency than the second. However, American English-speaking listeners with no experience listening to Cantonese showed no such contrast effect when tested in the same manner using the same stimuli. Neither English nor Cantonese listeners showed any order of presentation effects in the discrimination of a nonspeech continuum in which tokens had the same fundamental frequencies as the Cantonese speech tokens but had a qualitatively non-speech-like timbre. These results suggest that tone presentation order effects, unlike vowel effects, may be language specific, possibly resulting from the need to compensate for utterance-related pitch declination when evaluating fundamental frequency for tone identification.

18.
Thresholds were measured for the detection of inharmonicity in complex tones. Subjects were required to distinguish a complex tone whose partials were all at exact harmonic frequencies from a similar complex tone with one of the partials slightly mistuned. The mistuning which allowed 71% correct identification in a two-alternative forced-choice task was estimated for each partial in turn. In experiment I the fundamental frequency was either 100, 200, or 400 Hz, and the complex tones contained the first 12 harmonics at equal levels of 60 dB SPL per component. The stimulus duration was 410 ms. For each fundamental the thresholds were roughly constant when expressed in Hz, having a mean value of about 4 Hz (range 2.4-7.3 Hz). In experiment II the fundamental frequency was fixed at 200 Hz, and thresholds for inharmonicity were measured for stimulus durations of 50, 110, 410, and 1610 ms. For harmonics above the fifth, the thresholds increased from less than 1 Hz to about 40 Hz as duration was decreased from 1610 to 50 ms. For the lower harmonics (up to the fourth), thresholds changed much less with duration, and for the three shorter durations thresholds for each duration were roughly a constant proportion of the harmonic frequency. The results suggest that inharmonicity is detected in different ways for high and low harmonics. For low harmonics the inharmonic partial appears to "stand out" from the complex tone as a whole. For high harmonics the mistuning is detected as a kind of "beat" or "roughness," presumably reflecting a sensitivity to the changing phase of the mistuned harmonic relative to the other harmonics.

19.
Previous studies have demonstrated that perturbations in voice pitch or loudness feedback lead to compensatory changes in voice F0 or amplitude during production of sustained vowels. Responses to pitch-shifted auditory feedback have also been observed during English and Mandarin speech. The present study investigated whether Mandarin speakers would respond to amplitude-shifted feedback during meaningful speech production. Native speakers of Mandarin produced two-syllable utterances with focus on the first syllable, the second syllable, or none of the syllables, as prompted by corresponding questions. Their acoustic speech signal was fed back to them with loudness shifted by +/-3 dB for 200 ms durations. The responses to the feedback perturbations had mean latencies of approximately 142 ms and magnitudes of approximately 0.86 dB. Response magnitudes were greater and latencies were longer when emphasis was placed on the first syllable than when there was no emphasis. Since amplitude is not known for being highly effective in encoding linguistic contrasts, the fact that subjects reacted to amplitude perturbation just as fast as they reacted to F0 perturbations in previous studies provides clear evidence that a highly automatic feedback mechanism is active in controlling both F0 and amplitude of speech production.

20.
When a low harmonic in a harmonic complex tone is mistuned from its harmonic value by a sufficient amount, it is heard as a separate tone, standing out from the complex as a whole. This experiment estimated the degree of mistuning required for this phenomenon to occur, for complex tones with 10 or 12 equal-amplitude components (60 dB SPL per component). On each trial the subject was presented with a complex tone which either had all its partials at harmonic frequencies or had one partial mistuned from its harmonic frequency. The subject had to indicate whether he heard a single complex tone with one pitch or a complex tone plus a pure tone which did not "belong" to the complex. An adaptive procedure was used to track the degree of mistuning required to achieve a d' value of 1. Threshold was determined for each of the first six harmonics of each complex tone. In one set of conditions stimulus duration was held constant at 410 ms, and the fundamental frequency was either 100, 200, or 400 Hz. For most conditions the thresholds fell between 1% and 3% of the harmonic frequency, depending on the subject. However, thresholds tended to be greater for the first two harmonics of the 100-Hz fundamental and, for some subjects, thresholds increased for the fifth and sixth harmonics. In a second set of conditions the fundamental frequency was held constant at 200 Hz, and the duration was either 50, 110, 410, or 1610 ms. Thresholds increased by a factor of 3-5 as duration was decreased from 1610 ms to 50 ms. The results are discussed in terms of a hypothetical harmonic sieve and mechanisms for the formation of perceptual streams.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号