Similar literature
Found 20 similar articles (search time: 31 ms)
1.
The present study investigated the hypothesis that the cues for modulation rate discrimination for unresolved spectral components differ as a function of the spectral region occupied by the stimuli. Specifically, it was hypothesized that when components occupy relatively low spectral regions, phase locking to the fine structure and phase locking to the envelope are both useful cues. However, as the spectral region occupied by the components increases, phase locking to the fine structure becomes less robust, whereas phase locking to the envelope remains a potentially strong cue. Observers were asked to detect a decrease in modulation rate for carrier frequencies between 1500 and 6000 Hz. Both amplitude-modulated (AM) and quasifrequency-modulated (QFM) tones were used in order to produce stimuli having strong and weak envelope cues, respectively. Although there were marked individual differences, the results showed an interaction between modulation type and spectral region, with AM and QFM performance being relatively similar in the low spectral region, but with QFM showing a steeper reduction in performance as the spectral region of the carrier frequency increased. Overall, the data are consistent with an interpretation that pitch perception for unresolved components depends upon both fine structure and envelope cues, and that the relative importance of these cues depends upon the spectral region occupied by the stimuli.
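The contrast between strong (AM) and weak (QFM) envelope cues can be illustrated numerically. The sketch below is only an illustration and is not the stimulus-generation code of the study. It assumes three-component tones (a carrier plus two sidebands), a modulation index of 1, a 100 Hz modulation rate, and the common convention that QFM is obtained from AM by shifting the carrier phase by 90 degrees. None of these values is given in the abstract, and the function names are hypothetical.

```python
import cmath
import math

def three_component_tone(fc, fm, m, carrier_phase):
    """Return (frequency, amplitude, phase) triples: a carrier and two sidebands."""
    return [
        (fc, 1.0, carrier_phase),
        (fc - fm, m / 2.0, 0.0),
        (fc + fm, m / 2.0, 0.0),
    ]

def envelope(components, t):
    """Envelope at time t: magnitude of the sum of complex exponentials."""
    z = sum(a * cmath.exp(1j * (2 * math.pi * f * t + ph)) for f, a, ph in components)
    return abs(z)

def envelope_depth(components, fm, n_samples=400):
    """(max - min) / (max + min) of the envelope over one modulation period."""
    values = [envelope(components, k / (n_samples * fm)) for k in range(n_samples)]
    return (max(values) - min(values)) / (max(values) + min(values))

# FC is the lowest carrier mentioned in the abstract; FM and M are assumed values.
FC, FM, M = 1500.0, 100.0, 1.0
am = three_component_tone(FC, FM, M, carrier_phase=0.0)
qfm = three_component_tone(FC, FM, M, carrier_phase=math.pi / 2)  # assumed QFM convention

am_depth = envelope_depth(am, FM)
qfm_depth = envelope_depth(qfm, FM)
print(f"AM envelope depth:  {am_depth:.3f}")
print(f"QFM envelope depth: {qfm_depth:.3f}")
```

Under these assumptions the AM envelope swings between 0 and 2, whereas the QFM envelope only varies between 1 and the square root of 2, which is why QFM is described as carrying a weak envelope cue.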

2.
Three experiments were designed to provide psychophysical evidence for the existence of envelope information in the temporal fine structure (TFS) of stimuli that were originally amplitude modulated (AM). The original stimuli typically consisted of the sum of a sinusoidally AM tone and two unmodulated tones so that the envelope and TFS could be determined a priori. Experiment 1 showed that normal-hearing listeners not only perceive AM when presented with the Hilbert fine structure alone, but also show AM detection thresholds that are lower than those observed when presenting the original stimuli. Based on our analysis, envelope recovery resulted from the failure of the decomposition process to remove the spectral components related to the original envelope from the TFS and the introduction of spectral components related to the original envelope, suggesting that frequency- to amplitude-modulation conversion is not necessary to recover envelope information from TFS. Experiment 2 suggested that these spectral components interact in such a way that envelope fluctuations are minimized in the broadband TFS. Experiment 3 demonstrated that the modulation depth at the original carrier frequency is only slightly reduced compared to the depth of the original modulator. It also indicated that envelope recovery is not specific to the Hilbert decomposition.

3.
The relation between the auditory brain stem potential called the frequency-following response (FFR) and the low pitch of complex tones was investigated. Eleven complex stimuli were synthesized such that frequency content varied but waveform envelope periodicity was constant. This was accomplished by repeatedly shifting the components of a harmonic complex tone upward in frequency by delta f of 20 Hz, producing a series of six-component inharmonic complex tones with constant intercomponent spacing of 200 Hz. Pitch-shift functions were derived from pitch matches for these stimuli to a comparison pure tone for each of four normal-hearing adults with extensive musical training. The FFRs were recorded for the complex stimuli that were judged most divergent in pitch by each subject and for pure-tone signals that were judged equal in pitch to these complex stimuli. Spectral analyses suggested that the spectral content of the FFRs elicited by the complex stimuli did not vary consistently with component frequency or the first effect of pitch shift. Furthermore, complex and pure-tone signals judged equal in pitch did not elicit FFRs of similar spectral content.
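The stimulus series described in this abstract can be sketched as a list of component frequencies. This is a hedged illustration only: the 200 Hz spacing, the 20 Hz step, the six components, and the count of eleven stimuli come from the abstract, but the lowest harmonic number and the assumption that the eleven stimuli correspond to shifts of 0 to 200 Hz are hypothetical, and the function names are invented for the example.

```python
SPACING_HZ = 200      # intercomponent spacing (from the abstract)
SHIFT_STEP_HZ = 20    # delta f per step (from the abstract)
N_COMPONENTS = 6      # components per complex (from the abstract)
N_STIMULI = 11        # number of stimuli (from the abstract)
LOWEST_HARMONIC = 2   # hypothetical: the abstract does not give harmonic numbers

def shifted_complex(step):
    """Component frequencies after shifting the harmonic complex `step` times."""
    shift = step * SHIFT_STEP_HZ
    return [(LOWEST_HARMONIC + i) * SPACING_HZ + shift for i in range(N_COMPONENTS)]

def is_harmonic(freqs):
    """True if every component is an integer multiple of the component spacing."""
    return all(f % SPACING_HZ == 0 for f in freqs)

# Assumed mapping: stimulus k is shifted by k * 20 Hz, for k = 0..10.
stimuli = [shifted_complex(step) for step in range(N_STIMULI)]
for step, freqs in enumerate(stimuli):
    label = "harmonic" if is_harmonic(freqs) else "inharmonic"
    print(f"shift {step * SHIFT_STEP_HZ:3d} Hz: {freqs} ({label})")
```

Every complex in the list keeps the same 200 Hz spacing, so the envelope periodicity is constant while the frequency content changes, which is the property the abstract relies on.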

4.
The intelligibility of sentences processed to remove temporal envelope information, as far as possible, was assessed. Sentences were filtered into N analysis channels, and each channel signal was divided by its Hilbert envelope to remove envelope information but leave temporal fine structure (TFS) intact. Channel signals were combined to give TFS speech. The effect of adding low-level low-noise noise (LNN) to each channel signal before processing was assessed. The addition of LNN reduced the amplification of low-level signal portions that contained large excursions in instantaneous frequency, and improved the intelligibility of simple TFS speech sentences, but not more complex sentences. It also reduced the time needed to reach a stable level of performance. The recovery of envelope cues by peripheral auditory filtering was investigated by measuring the intelligibility of 'recovered-envelope speech', formed by filtering TFS speech with an array of simulated auditory filters, and using the envelopes at the output of these filters to modulate sinusoids with frequencies equal to the filter center frequencies (i.e., tone vocoding). The intelligibility of TFS speech and recovered-envelope speech fell as N increased, although TFS speech was still highly intelligible for values of N for which the intelligibility of recovered-envelope speech was low.
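The core step of this processing, dividing a channel signal by its Hilbert envelope, can be sketched for a single channel. This is a minimal illustration under stated assumptions, not the processing chain used in the study: it skips the analysis filterbank and the low-noise noise, uses a naive DFT from the standard library instead of a signal-processing package, and tests the decomposition on a synthetic AM tone whose parameters (256 samples, carrier in bin 32, modulator in bin 4, depth 0.5) are invented for the example. The `floor` guard against division by very small envelope values is also an assumption.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short demo signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[i] * cmath.exp(sign * 2j * math.pi * k * i / n) for i in range(n))
        out.append(acc / n if inverse else acc)
    return out

def analytic_signal(x):
    """Analytic signal: double positive-frequency bins, zero negative-frequency bins."""
    n = len(x)
    spectrum = dft(x)
    for k in range(1, n):
        if k < n / 2:
            spectrum[k] *= 2
        elif k > n / 2:
            spectrum[k] = 0
    return dft(spectrum, inverse=True)

def split_envelope_tfs(x, floor=1e-12):
    """Return (Hilbert envelope, TFS), where TFS is the signal divided by its envelope."""
    z = analytic_signal(x)
    env = [abs(v) for v in z]
    tfs = [s / max(e, floor) for s, e in zip(x, env)]
    return env, tfs

# Hypothetical test signal: an AM tone with whole numbers of cycles in the window.
N, CARRIER_BIN, MOD_BIN, DEPTH = 256, 32, 4, 0.5
carrier = [math.cos(2 * math.pi * CARRIER_BIN * i / N) for i in range(N)]
modulator = [1 + DEPTH * math.cos(2 * math.pi * MOD_BIN * i / N) for i in range(N)]
signal = [m * c for m, c in zip(modulator, carrier)]

env, tfs = split_envelope_tfs(signal)
env_error = max(abs(e - m) for e, m in zip(env, modulator))
tfs_error = max(abs(t - c) for t, c in zip(tfs, carrier))
print(f"max envelope error: {env_error:.2e}")
print(f"max TFS error:      {tfs_error:.2e}")
```

For this idealized signal the envelope equals the modulator and the TFS equals the unmodulated carrier. In real speech the envelope can approach zero, so the division strongly amplifies low-level portions, which is the problem the abstract addresses by adding low-noise noise.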

5.
Frequency difference limens for pure tones (DLFs) and for complex tones (DLCs) were measured for four groups of subjects: young normal hearing, young hearing impaired, elderly with near-normal hearing, and elderly hearing impaired. The auditory filters of the subjects had been measured in earlier experiments using the notched-noise method, for center frequencies (fc) of 100, 200, 400, and 800 Hz. The DLFs for both impaired groups were higher than for the young normal group at all fc's (50-4000 Hz). The DLFs at a given fc were generally only weakly correlated with the sharpness of the auditory filter at that fc, and some subjects with broad filters had near-normal DLFs at low frequencies. Some subjects in the elderly normal group had very large DLFs at low frequencies in spite of near-normal auditory filters. These results suggest a partial dissociation of frequency selectivity and frequency discrimination of pure tones. The DLCs for the two impaired groups were higher than those for the young normal group at all fundamental frequencies (fo) tested (50, 100, 200, and 400 Hz); the DLCs for the elderly normal group were intermediate. At fo = 50 Hz, DLCs for a complex tone containing only low harmonics (1-5) were markedly higher than for complex tones containing higher harmonics, for all subject groups, suggesting that pitch was conveyed largely by the higher, unresolved harmonics. For the elderly impaired group, and some subjects in the elderly normal group, DLCs were larger for a complex tone with lower harmonics (1-12) than for tones without lower harmonics (4-12 and 6-12) for fo's up to 200 Hz. Some elderly normal subjects had markedly larger-than-normal DLCs in spite of near-normal auditory filters. The DLCs tended to be larger for complexes with components added in alternating sine/cosine phase than for complexes with components added in cosine phase. Phase effects were significant for all groups, but were small for the young normal group. 
The results are not consistent with place-based models of the pitch perception of complex tones; rather, they suggest that pitch is at least partly determined by temporal mechanisms.

6.
A spectral discrimination task was used to estimate the frequency range over which information about the temporal envelope is consolidated. The standard consisted of n equal-intensity, random-phase sinusoids, symmetrically placed around a signal component. The signal was an intensity increment of the central sinusoid, which on average was 1000 Hz. Pitch cues were degraded by randomly selecting the center frequency of the complex, and single-channel energy cues were degraded with a roving-level procedure. Stimulus bandwidth was controlled by varying the number of tones and the frequency separation between tones. For a fixed frequency separation, thresholds increased as n increased until a certain bandwidth was reached, beyond which thresholds decreased. This discontinuity in threshold functions suggests that different auditory processes predominate at different bandwidths, presumably an envelope analysis at bandwidths less than the breakpoint and across-channel level comparisons for wider stimulus bandwidths. Estimates of the "transition bandwidth" for 46 listeners ranged from 100 to 1250 Hz. The results are consistent with a peripheral filtering system having multiple filterbanks.

7.
When all of the components in a harmonic complex tone are shifted in frequency by delta f, the pitch of the complex shifts roughly in proportion to delta f. For tones with a small number of components, the shift is usually somewhat larger than predicted from pitch theories, which has been attributed to the influence of combination tones [Smoorenburg, J. Acoust. Soc. Am. 48, 924-941 (1970)]. Experiment 1 assessed whether combination tones influence the pitch of complex tones with more than five harmonics, by using noise to mask the combination tones. The matching stimulus was a harmonic complex. Test complexes were bandpass filtered with passbands centered on harmonic numbers 5 (resolved), 11 (intermediate), or 16 (unresolved) and fundamental frequencies (F0s) were 100, 200, or 400 Hz. For the intermediate and unresolved conditions, the matching stimuli were filtered with the same passband to minimize differences in the excitation patterns of the test and matching stimuli. For the resolved condition, the matching stimulus had a passband centered above that of the test stimulus, to avoid common partials. For resolved and intermediate conditions, pitch shifts were observed that could generally be predicted from the frequencies of the partials. The shifts were unaffected by addition of noise to mask combination tones. For the unresolved condition, no pitch shift was observed, which suggests that pitch is not based on temporal fine structure for stimuli containing only high unresolved harmonics. Experiment 2 used three-component complexes resembling those of Schouten [J. Acoust. Soc. Am. 34, 1418-1424 (1962)]. Nominal harmonic numbers were 3, 4, 5 (resolved), 8, 9, 10 (intermediate), or 13, 14, 15 (unresolved) and F0s were 50, 100, 200, or 400 Hz. Clear shifts in the matches were found for all conditions, including unresolved. For the latter, subjects may have matched the "center of gravity" of the excitation patterns of the test and matching stimuli.

8.
This study measured the role of spectral details and temporal envelope (E) and fine structure (TFS) cues in reconstructing sentences from speech fragments. Four sets of sentences were processed using a 32-band vocoder. Twenty-one bands were either processed or removed, leading to sentences differing in their amount of spectral details, E, and TFS information. These sentences remained perfectly intelligible, but intelligibility fell significantly after the introduction of periodic silent gaps of 120 ms. While the role of E was unclear, the results unambiguously showed that TFS cues and spectral details influence the ability to reconstruct interrupted sentences.

9.
Harmonic complex tones comprising components in different spectral regions may differ considerably in timbre. While the pitch of "residue" tones of this type has been studied extensively, their timbral properties have received little attention. Discrimination of F0 for such tones is typically poorer than for complex tones with "corresponding" harmonics [A. Faulkner, J. Acoust. Soc. Am. 78, 1993-2004 (1985)]. The F0 DLs may be higher because timbre differences impair pitch discrimination. The present experiment explores effects of changes in spectral locus and F0 of harmonic complex tones on both pitch and timbre. Six normally hearing listeners indicated whether the second tone of a two-tone sequence was: (1) same, (2) higher in pitch, (3) lower in pitch, (4) same in pitch but different in "something else," (5) higher in pitch and different in "something else," or (6) lower in pitch and different in "something else" than the first. ("Something else" is assumed to represent timbre.) The tones varied in spectral loci of four equal-amplitude harmonics m, m + 1, m + 2, and m + 3 (m = 1,2,3,4,5,6) and ranged in F0 from 200 to 200 +/- 2n Hz (n = 0,1,2,4,8,16,32). Results show that changes in F0 primarily affect pitch, and changes in spectral locus primarily affect timbre. However, a change in spectral locus can also influence pitch. The direction of locus change was reported as the direction of pitch change, despite no change in F0 or changes in F0 in the opposite direction for delta F0 < or = 0-2%. This implies that listeners may be attending to the "spectral pitch" of components, or to changes in a timbral attribute like "sharpness," which are construed as changes in overall pitch in the absence of strong F0 cues. For delta F0 > or = 2%, the direction of reported pitch change accorded with the direction of F0 change, but the locus change continued to be reported as a timbre change.
Rather than spectral-pitch matching of corresponding components, a context-dependent spectral evaluation process is thus implied in discernment of changes in pitch and timbre. Relative magnitudes of change in derived features of the spectrum such as harmonic number and F0, and absolute features such as spectral frequencies are compared. What is called "spectral pitch" contributes to the overall pitch, but also appears to be an important dimension of the multidimensional percept, timbre.

10.
The precedence effect (PE) describes the ability to localize a direct, leading sound correctly when its delayed copy (lag) is present, though not separately audible. The relative contribution of binaural cues in the temporal fine structure (TFS) of lead-lag signals was compared to that of interaural level differences (ILDs) and interaural time differences (ITDs) carried in the envelope. In a localization dominance paradigm participants indicated the spatial location of lead-lag stimuli processed with a binaural noise-band vocoder whose noise carriers introduced random TFS. The PE appeared for noise bursts of 10 ms duration, indicating dominance of envelope information. However, for three test words the PE often failed even at short lead-lag delays, producing two images, one toward the lead and one toward the lag. When interaural correlation in the carrier was increased, the images appeared more centered, but often remained split. Although previous studies suggest dominance of TFS cues, no image is lateralized in accord with the ITD in the TFS. An interpretation in the context of auditory scene analysis is proposed: By replacing the TFS with that of noise the auditory system loses the ability to fuse lead and lag into one object, and thus to show the PE.  相似文献   

11.
平利川  原猛  冯海泓 《声学学报》2012,37(3):324-329
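This study systematically analyzes and discusses how spectral resolution and temporal envelope periodicity affect pitch discrimination for music with different timbres and frequency ranges. Musical tones from four instruments (piano, violin, trumpet, and clarinet) and specific complex tones were selected as test sources. A noise-modulated vocoder model was used to control the spectral resolution and temporal envelope periodicity of the music signals. Ten normal-hearing listeners took part in the pitch discrimination test. The results show that pitch discrimination accuracy tended to rise as spectral resolution increased, and that 16 frequency bands were already sufficient for good pitch discrimination; when temporal envelope periodicity information was increased, no consistent positive effect on pitch discrimination was observed.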

12.
The contribution of temporal fine structure (TFS) cues to consonant identification was assessed in normal-hearing listeners with two speech-processing schemes designed to remove temporal envelope (E) cues. Stimuli were processed vowel-consonant-vowel speech tokens. Derived from the analytic signal, carrier signals were extracted from the output of a bank of analysis filters. The "PM" and "FM" processing schemes estimated a phase- and frequency-modulation function, respectively, of each carrier signal and applied them to a sinusoidal carrier at the analysis-filter center frequency. In the FM scheme, processed signals were further restricted to the analysis-filter bandwidth. A third scheme retaining only E cues from each band was used for comparison. Stimuli processed with the PM and FM schemes were found to be highly intelligible (50-80% correct identification) over a variety of experimental conditions designed to affect the putative reconstruction of E cues subsequent to peripheral auditory filtering. Analysis of confusions between consonants showed that the contribution of TFS cues was greater for place than manner of articulation, whereas the converse was observed for E cues. Taken together, these results indicate that TFS cues convey important phonetic information that is not solely a consequence of E reconstruction.

13.
Thresholds (F0DLs) were measured for discrimination of the fundamental frequency (F0) of a group of harmonics (group B) embedded in harmonics with a fixed F0. Miyazono and Moore [(2009). Acoust. Sci. & Tech. 30, 383-386] found a large training effect for tones with high harmonics in group B, when the harmonics were added in cosine phase. It is shown here that this effect was due to use of a cue related to pitch pulse asynchrony (PPA). When PPA cues were disrupted by introducing a temporal offset between the envelope peaks of the harmonics in group B and the remaining harmonics, F0DLs increased markedly. Perceptual learning was examined using a training stimulus with cosine-phase harmonics, F0 = 50 Hz, and high harmonics in group B, under conditions where PPA was not useful. Learning occurred, and it transferred to other cosine-phase tones, but not to random-phase tones. A similar experiment with F0 = 100 Hz showed a learning effect which transferred to a cosine-phase tone with mainly high unresolved harmonics, but not to cosine-phase tones with low harmonics, and not to random-phase tones. The learning found here appears to be specific to tones for which F0 discrimination is based on distinct peaks in the temporal envelope.

14.
Envelope-induced pitch shifts were measured for exponentially decaying complex tones consisting of two sinusoidal components with frequencies f1 = nf0 + 50 Hz and f2 = (n + 1) f0 + 50 Hz, where n equals 3, 4, or 5 and exponential decay rates were 0, 0.5, 1, and 2 dB/ms. Four subjects adjusted a sinusoidal comparison tone to match the virtual pitch of the (missing) fundamental and the pitches of the lower and upper partials f1 and f2. Pitch shifts for f1 are generally less, and pitch shifts for f2 always greater, than envelope-induced shifts observed in isolated sinusoidal tones of comparable frequency and envelope decay rate. Pitch-shift functions for virtual pitch are similar in magnitude and shape to average pitch-shift functions of the partials, which supports the idea that virtual pitch depends on spectral pitch.

15.
The experiment compared the pitches of complex tones consisting of unresolved harmonics. The fundamental frequency (F0) of the tones was 250 Hz and the harmonics were bandpass filtered between 5500 and 7500 Hz. Two 20-ms complex-tone bursts were presented, separated by a brief gap. The gap was an integer number of periods of the waveform: 0, 4, or 8 ms. The envelope phase of the second tone burst was shifted, such that the interpulse interval (IPI) across the gap was reduced or increased by 0.25 or 0.75 periods (1 or 3 ms). A "no shift" control was also included, where the IPI was held at an integer number of periods. Pitch matches were obtained by varying the F0 of a comparison tone with the same temporal parameters as the standard but without the shift. Relative to the no-shift control, the variations in IPI produced substantial pitch shifts when there was no gap between the bursts, but little effect was seen for gaps of 4 or 8 ms. However, for some conditions with the same IPI in the shifted interval, an increase in the IPI of the comparison interval from 4 to 8 ms (gap increased from 0 to 4 ms) changed the pitch match. The presence of a pitch shift suggests that the pitch mechanism is integrating information across the two tone bursts. It is argued that the results are consistent with a pitch mechanism employing a long integration time for continuous stimuli that is reset in response to temporal discontinuities. For a 250-Hz F0, an 8-ms IPI may be sufficient for resetting. Pitch models based on a spectral analysis of the simulated neural spike train, on an autocorrelation of the spike train, and on the mean rate of pitch pulses, all failed to account for the observed pitch matches.
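The timing arithmetic in this abstract can be checked with a short sketch. The F0, the gaps, and the shift sizes come from the abstract; the formula for the interpulse interval across the gap (one period plus the gap plus the shift) is an assumed reading, chosen because it reproduces the 4 ms and 8 ms no-shift values the abstract mentions. The function name is hypothetical.

```python
F0_HZ = 250.0
PERIOD_MS = 1000.0 / F0_HZ                          # one waveform period in ms
GAPS_MS = [0.0, 4.0, 8.0]                           # gaps of 0, 1, or 2 periods
SHIFTS_PERIODS = [-0.75, -0.25, 0.0, 0.25, 0.75]    # 0.0 is the "no shift" control

def ipi_across_gap(gap_ms, shift_periods):
    """Assumed IPI between the last pulse of burst 1 and the first pulse of burst 2."""
    return PERIOD_MS + gap_ms + shift_periods * PERIOD_MS

table = {
    (gap, shift): ipi_across_gap(gap, shift)
    for gap in GAPS_MS
    for shift in SHIFTS_PERIODS
}
for (gap, shift), ipi in sorted(table.items()):
    print(f"gap {gap:3.0f} ms, shift {shift:+.2f} periods -> IPI {ipi:4.1f} ms")
```

With a 4 ms period, shifts of 0.25 and 0.75 periods correspond to the 1 ms and 3 ms changes quoted in the abstract.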

16.
Performance on 19 auditory discrimination and identification tasks was measured for 340 listeners with normal hearing. Test stimuli included single tones, sequences of tones, amplitude-modulated and rippled noise, temporal gaps, speech, and environmental sounds. Principal components analysis and structural equation modeling of the data support the existence of a general auditory ability and four specific auditory abilities. The specific abilities are (1) loudness and duration (overall energy) discrimination; (2) sensitivity to temporal envelope variation; (3) identification of highly familiar sounds (speech and nonspeech); and (4) discrimination of unfamiliar simple and complex spectral and temporal patterns. Examination of Scholastic Aptitude Test (SAT) scores for a large subset of the population revealed little or no association between general or specific auditory abilities and general intellectual ability. The findings provide a basis for research to further specify the nature of the auditory abilities. Of particular interest are results suggestive of a familiar sound recognition (FSR) ability, apparently specialized for sound recognition on the basis of limited or distorted information. This FSR ability is independent both of normal variation in spectral-temporal acuity and of general intellectual ability.

17.
Virtual pitch in a computational physiological model (total citations: 2; self-citations: 0; citations by others: 2)
A computational model of nervous activity in the auditory nerve, cochlear nucleus, and inferior colliculus is presented and evaluated in terms of its ability to simulate psychophysically-measured pitch perception. The model has a similar architecture to previous autocorrelation models except that the mathematical operations of autocorrelation are replaced by the combined action of thousands of physiologically plausible neuronal components. The evaluation employs pitch stimuli including complex tones with a missing fundamental frequency, tones with alternating phase, inharmonic tones with equally spaced frequencies and iterated rippled noise. Particular attention is paid to differences in response to resolved and unresolved component harmonics. The results indicate that the model is able to simulate qualitatively the related pitch-perceptions. This physiological model is similar in many respects to autocorrelation models of pitch and the success of the evaluations suggests that autocorrelation models may, after all, be physiologically plausible.

18.
The ability of normally hearing and hearing-impaired subjects to use temporal fine structure information in complex tones was measured. Subjects were required to discriminate a harmonic complex tone from a tone in which all components were shifted upwards by the same amount in Hz, in a three-alternative, forced-choice task. The tones either contained five equal-amplitude components (non-shaped stimuli) or contained many components, but were passed through a fixed bandpass filter to reduce excitation pattern changes (shaped stimuli). Components were centered at nominal harmonic numbers (N) 7, 11, and 18. For the shaped stimuli, hearing-impaired subjects performed much more poorly than normally hearing subjects, with most of the former scoring no better than chance when N=11 or 18, suggesting that they could not access the temporal fine structure information. Performance for the hearing-impaired subjects was significantly improved for the non-shaped stimuli, presumably because they could benefit from spectral cues. It is proposed that normal-hearing subjects can use temporal fine structure information provided the spacing between fine structure peaks is not too small relative to the envelope period, but subjects with moderate cochlear hearing loss make little use of temporal fine structure information for unresolved components.

19.
Within an auditory channel, the speech waveform contains both temporal envelope (E(O)) and temporal fine structure (TFS) information. Vocoder processing extracts a modified version of the temporal envelope (E') within each channel and uses it to modulate a channel carrier. The resulting signal, E'(Carr), has reduced information content compared to the original "E(O) + TFS" signal. The dynamic range over which listeners make additional use of E(O) + TFS over E'(Carr) cues was investigated in a competing-speech task. The target-and-background mixture was processed using a 30-channel vocoder. In each channel, E(O) + TFS replaced E'(Carr) at either the peaks or the valleys of the signal. The replacement decision was based on comparing the short-term channel level to a parametrically varied "switching threshold," expressed relative to the long-term channel level. Intelligibility was measured as a function of switching threshold, carrier type, target-to-background ratio, and replacement method. Scores showed a dependence on all four parameters. Derived intensity-importance functions (IIFs) showed that E(O) + TFS information from 8-13 dB below to 10 dB above the channel long-term level was important. When E(O) + TFS information was added at the peaks, IIFs peaked around -2 dB, but when E(O) + TFS information was added at the valleys, the peaks lay around +1 dB.

20.
The perception of pitch for pure tones with frequencies falling inside low- or high-frequency dead regions (DRs) was examined. Subjects adjusted a variable-frequency tone to match the pitch of a fixed tone. Matches within one ear were often erratic for tones falling in a DR, indicating unclear pitch percepts. Matches across ears of subjects with asymmetric hearing loss, and octave matches within ears, indicated that tones falling within a DR were perceived with an unclear pitch and/or a pitch different from "normal" whenever the tones fell more than 0.5 octave within a low- or high-frequency DR. One unilaterally impaired subject, with only a small surviving region between 3 and 4 kHz, matched a fixed 0.5-kHz tone in his impaired ear with, on average, a 3.75-kHz tone in his better ear. When asked to match the 0.5-kHz tone with an amplitude-modulated tone, he adjusted the carrier and modulation frequencies to about 3.8 and 0.5 kHz, respectively, suggesting that some temporal information was still available. Overall, the results indicate that the pitch of low-frequency tones is not conveyed solely by a temporal code. Possibly, there needs to be a correspondence between place and temporal information for a normal pitch to be perceived.
