Similar Documents
20 similar documents found.
1.
This investigation examined whether listeners with mild-moderate sensorineural hearing impairment have a deficit in the ability to integrate synchronous spectral information in the perception of speech. In stage 1, the bandwidth of filtered speech centered either on 500 or 2500 Hz was varied adaptively to determine the width required for approximately 15%-25% correct recognition. In stage 2, these criterion bandwidths were presented simultaneously and percent correct performance was determined in fixed block trials. Experiment 1 tested normal-hearing listeners in quiet and in masking noise. The main findings were (1) there was no correlation between the criterion bandwidths at 500 and 2500 Hz; (2) listeners achieved a high percent correct in stage 2 (approximately 80%); and (3) performance in quiet and noise was similar. Experiment 2 tested listeners with mild-moderate sensorineural hearing impairment. The main findings were (1) the impaired listeners showed high variability in stage 1, with some listeners requiring narrower and others requiring wider bandwidths than normal, and (2) hearing-impaired listeners achieved percent correct performance in stage 2 that was comparable to normal. The results indicate that listeners with mild-moderate sensorineural hearing loss do not have an essential deficit in the ability to integrate across-frequency speech information.
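Stage 1 of this design depends on an adaptive track that converges on a criterion bandwidth yielding roughly 15%-25% correct. As a rough illustration only (the abstract does not state the tracking rule), the sketch below uses a Kaernbach-style weighted up-down rule targeting 20% correct, with a hypothetical `run_trial` callback standing in for stimulus filtering and the listener's response.

```python
import numpy as np

def track_criterion_bandwidth(run_trial, start_oct=1.0, step_oct=0.05,
                              target_pc=0.20, n_trials=60):
    """Weighted up-down track on filter bandwidth (in octaves).

    `run_trial(bw_oct)` is assumed to present one filtered word at the given
    bandwidth and return True for a correct report.  The step asymmetry puts
    the equilibrium near `target_pc` correct; the actual rule and step sizes
    used in the study are not specified here, so treat this as illustrative.
    """
    # equilibrium when p * step_down = (1 - p) * step_up
    step_up = step_oct                                    # after an error: widen (easier)
    step_down = step_oct * (1 - target_pc) / target_pc    # after a hit: narrow (harder)
    bw = start_oct
    history = []
    for _ in range(n_trials):
        correct = run_trial(bw)
        bw = max(0.05, bw - step_down) if correct else bw + step_up
        history.append(bw)
    return np.mean(history[len(history) // 2:])           # average the later trials
```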

2.
Speech can remain intelligible for listeners with normal hearing when processed by narrow bandpass filters that transmit only a small fraction of the audible spectrum. Two experiments investigated the basis for the high intelligibility of narrowband speech. Experiment 1 confirmed reports that everyday English sentences can be recognized accurately (82%-98% words correct) when filtered at center frequencies of 1500, 2100, and 3000 Hz. However, narrowband low predictability (LP) sentences were less accurately recognized than high predictability (HP) sentences (20% lower scores), and excised narrowband words were even less intelligible than LP sentences (a further 23% drop). While experiment 1 revealed similar levels of performance for narrowband and broadband sentences at conversational speech levels, experiment 2 showed that speech reception thresholds were substantially (>30 dB) poorer for narrowband sentences. One explanation for this increased disparity between narrowband and broadband speech at threshold (compared to conversational speech levels) is that spectral components in the sloping transition bands of the filters provide important cues for the recognition of narrowband speech, but these components become inaudible as the signal level is reduced. Experiment 2 also showed that performance was degraded by the introduction of a speech masker (a single competing talker). The elevation in threshold was similar for narrowband and broadband speech (11 dB, on average), but because the narrowband sentences required considerably higher sound levels to reach their thresholds in quiet compared to broadband sentences, their target-to-masker ratios were very different (+23 dB for narrowband sentences and -12 dB for broadband sentences). As in experiment 1, performance was better for HP than LP sentences. The LP-HP difference was larger for narrowband than broadband sentences, suggesting that context provides greater benefits when speech is distorted by narrow bandpass filtering.

3.
The present study measured the recognition of spectrally degraded and frequency-shifted vowels in both acoustic and electric hearing. Vowel stimuli were passed through 4, 8, or 16 bandpass filters and the temporal envelopes from each filter band were extracted by half-wave rectification and low-pass filtering. The temporal envelopes were used to modulate noise bands which were shifted in frequency relative to the corresponding analysis filters. This manipulation not only degraded the spectral information by discarding within-band spectral detail, but also shifted the tonotopic representation of spectral envelope information. Results from five normal-hearing subjects showed that vowel recognition was sensitive to both spectral resolution and frequency shifting. The effect of a frequency shift did not interact with spectral resolution, suggesting that spectral resolution and spectral shifting are orthogonal in terms of intelligibility. High vowel recognition scores were observed for as few as four bands. Regardless of the number of bands, no significant performance drop was observed for tonotopic shifts equivalent to 3 mm along the basilar membrane, that is, for frequency shifts of 40%-60%. Similar results were obtained from five cochlear implant listeners, when electrode locations were fixed and the spectral location of the analysis filters was shifted. Changes in recognition performance in electrical and acoustic hearing were similar in terms of the relative location of electrodes rather than the absolute location of electrodes, indicating that cochlear implant users may at least partly accommodate to the new patterns of speech sounds after long-term exposure to their normal speech processor.
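The quoted equivalence between a 3 mm basal shift and a 40%-60% frequency shift can be checked against the Greenwood (1990) place-frequency map. The sketch below uses the commonly cited human constants (A ≈ 165.4, a ≈ 2.1, k ≈ 0.88, cochlear length ≈ 35 mm); these constants come from the general literature rather than from this study.

```python
import numpy as np

# Greenwood (1990) human place-frequency map: f(x) = A * (10**(a*x/L) - k),
# with x the distance from the apex in mm over a length L of about 35 mm.
A, a, k, LENGTH_MM = 165.4, 2.1, 0.88, 35.0

def greenwood_hz(x_mm):
    return A * (10 ** (a * x_mm / LENGTH_MM) - k)

def place_mm(f_hz):
    return LENGTH_MM / a * np.log10(f_hz / A + k)

for f in (1000.0, 2000.0, 4000.0):
    shifted = greenwood_hz(place_mm(f) + 3.0)        # move 3 mm toward the base
    print(f"{f:6.0f} Hz -> {shifted:7.0f} Hz  (+{100 * (shifted / f - 1):.0f}%)")
# At these mid-to-high frequencies the 3 mm shift works out to roughly +50%-60%,
# consistent with the 40%-60% figure quoted above.
```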

4.
This paper continues a line of research initiated by Kaernbach and Demany [J. Acoust. Soc. Am. 104, 2298-2306 (1998)], who employed filtered click sequences to explore the temporal mechanism involved in the pitch of unresolved harmonics. In a first experiment, the just noticeable difference (jnd) for the fundamental frequency (F0) of high-pass filtered and low-pass masked click trains was measured, with F0 (100 to 250 Hz) and the cutoff frequency (0.5 to 6 kHz) being varied orthogonally. The data confirm the result of Houtsma and Smurzynski [J. Acoust. Soc. Am. 87, 304-310 (1990)] that a pitch mechanism working on the temporal structure of the signal is responsible for analyzing frequencies higher than ten times the fundamental. Using high-pass filtered click trains, however, the jnd for the temporal analysis is 1.2%, as compared to 2%-3% found in studies using band-pass filtered stimuli. Two further experiments provide evidence that the pitch of this stimulus can convey musical information. A fourth experiment replicates the finding of Kaernbach and Demany on first- and second-order regularities with a cutoff frequency of 2 kHz and extends the paradigm to binaural aperiodic click sequences. The result suggests that listeners can detect first-order temporal regularities in monaural click streams as well as in binaurally fused click streams.

5.
A new technique is described for studying the ability of listeners to discriminate between sounds on the basis of spectral shape, a process called "auditory profile analysis." The advantage of the technique is that it reduces the range of the random rove in level necessary to provide a specified limit on the performance which listeners could achieve by "level detection;" that is, by employing a detection strategy based solely on comparisons of stimulus level. Thresholds were measured for the just-discriminable "ripple" (a pattern of alternating intensity increments and decrements) in an equal-amplitude, multitone reference spectrum for a group of normal-hearing listeners. Broadband, high-pass and low-pass filtered conditions were tested. The results indicated that the thresholds obtained using the new technique were well below the lowest level achievable by level detection (referred to as the "level-detection limit") in all conditions using a 20-dB random within-trial rove in overall level. The lowest threshold occurred for the broadband stimulus while the highest threshold was observed for the most extreme high-pass filtered condition. The new technique appears to be well-suited for study of profile analysis in hearing-impaired listeners where stimulus bandwidth and rove range are limited.
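For concreteness, a rippled profile stimulus of the kind described — an equal-amplitude multitone with alternating intensity increments and decrements plus a random within-trial rove in overall level — might be generated as follows. Component count, frequency range, and ripple depth are illustrative guesses; only the 20-dB rove range follows the text.

```python
import numpy as np

def rippled_complex(fs=44100, dur=0.4, n_comp=11, f_lo=200.0, f_hi=5000.0,
                    ripple_db=2.0, rove_db=20.0, rng=np.random.default_rng()):
    """Log-spaced multitone with alternating +/- level ripple and a random
    overall-level rove spanning `rove_db` (here +/-10 dB around nominal).

    Component count, frequency range, and ripple depth are illustrative;
    the abstract does not specify them.
    """
    t = np.arange(int(fs * dur)) / fs
    freqs = np.geomspace(f_lo, f_hi, n_comp)
    signs = np.where(np.arange(n_comp) % 2 == 0, 1.0, -1.0)   # up, down, up...
    amps = 10 ** (signs * ripple_db / 20.0)
    x = sum(a * np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi))
            for a, f in zip(amps, freqs))
    rove = 10 ** (rng.uniform(-rove_db / 2, rove_db / 2) / 20.0)
    return rove * x / n_comp
```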

6.
The four experiments reported here measure listeners' accuracy and consistency in adjusting a formant frequency of one- or two-formant complex sounds to match the timbre of a target sound. By presenting the target and the adjustable sound on different fundamental frequencies, listeners are prevented from performing the task by comparing the absolute or relative levels of resolved spectral components. Experiment 1 uses two-formant vowellike sounds. When the two sounds have the same F0, the variability of matches (within-subject standard deviation) for either the first or the second formant is around 1%-3%, which is comparable to existing data on formant frequency discrimination thresholds. With a difference in F0, variability increases to around 8% for first-formant matches, but to only about 4% for second-formant matches. Experiment 2 uses sounds with a single formant at 1100 or 1200 Hz with both sounds on either low or high fundamental frequencies. The increase in variability produced by a difference in F0 is greater for high F0's (where the harmonics close to the formant peak are resolved) than it is for low F0's (where they are unresolved). Listeners also showed systematic errors in their mean matches to sounds with different high F0's. The direction of the systematic errors was towards the most intense harmonic. Experiments 3 and 4 showed that introduction of a vibratolike frequency modulation (FM) on F0 reduces the variability of matches, but does not reduce the systematic error. The experiments demonstrate, for the specific frequencies and FM used, that there is a perceptual cost to interpolating a spectral envelope across resolved harmonics.  相似文献   

7.
The benefit of supplementing speechreading with information about the frequencies of the first and second formants from the voiced sections of the speech signal was studied by presenting short sentences to 18 normal-hearing listeners under the following three conditions: (a) speechreading combined with listening to the formant-frequency information, (b) speechreading only, and (c) formant-frequency information only. The formant frequencies were presented either as pure tones or as a complex speechlike signal, obtained by filtering a periodic pulse sequence of 250 Hz by a cascade of four second-order bandpass filters (with constant bandwidth); the center frequencies of two of these filters followed the frequencies of the first and second formants, whereas the frequencies of the others remained constant. The percentage of correctly identified syllables increased from 22.8% in the case of speechreading only to 82.0% in the case of speechreading while listening to the complex speechlike signal. Listening to the formant information only scored 33.2% correct. However, comparison with the best-scoring condition of our previous study [Breeuwer and Plomp, J. Acoust. Soc. Am. 76, 686-691 (1984)] indicates that information about the sound-pressure levels in two one-octave filter bands with center frequencies of 500 and 3160 Hz is a more effective supplement to speechreading than the formant-frequency information.
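The speechlike carrier described — a 250-Hz pulse train passed through a cascade of four second-order bandpass filters, two of which track F1 and F2 — can be sketched in a static form roughly as follows. The fixed center frequencies, bandwidth, and sampling rate are illustrative assumptions, and in the actual stimuli F1 and F2 varied over time.

```python
import numpy as np
from scipy.signal import lfilter

def formant_carrier(f1, f2, fixed=(3000.0, 3500.0), f0=250.0, bw=100.0,
                    fs=16000, dur=0.3):
    """250-Hz pulse train through a cascade of four second-order resonators;
    two are centered on (here static) F1 and F2, the other two stay fixed.

    Bandwidth, fixed center frequencies, and sampling rate are illustrative.
    """
    n = int(fs * dur)
    pulses = np.zeros(n)
    pulses[::int(fs / f0)] = 1.0                     # periodic pulse sequence
    y = pulses
    for fc in (f1, f2, *fixed):
        r = np.exp(-np.pi * bw / fs)                 # pole radius from bandwidth
        theta = 2 * np.pi * fc / fs
        a = [1.0, -2.0 * r * np.cos(theta), r * r]   # two-pole bandpass resonator
        y = lfilter([1.0 - r], a, y)
    return y
```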

8.
This study investigated the effect of pulsatile stimulation rate on medial vowel and consonant recognition in cochlear implant listeners. Experiment 1 measured phoneme recognition as a function of stimulation rate in six Nucleus-22 cochlear implant listeners using an experimental four-channel continuous interleaved sampler (CIS) speech processing strategy. Results showed that all stimulation rates from 150 to 500 pulses/s/electrode produced equally good performance, while stimulation rates lower than 150 pulses/s/electrode produced significantly poorer performance. Experiment 2 measured phoneme recognition by implant listeners and normal-hearing listeners as a function of the low-pass cutoff frequency for envelope information. Results from both acoustic and electric hearing showed no significant difference in performance for all cutoff frequencies higher than 20 Hz. Both vowel and consonant scores dropped significantly when the cutoff frequency was reduced from 20 Hz to 2 Hz. The results of these two experiments suggest that temporal envelope information can be conveyed by relatively low stimulation rates. The pattern of results for both electrical and acoustic hearing is consistent with a simple model of temporal integration with an equivalent rectangular duration (ERD) of the temporal integrator of about 7 ms.
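The 7-ms equivalent rectangular duration (ERD) quoted for the temporal integrator can be illustrated with a one-sided exponential smoothing window; under one common definition (the area of the window normalized to unit peak), the ERD of exp(-t/τ) equals τ, so τ is set to 7 ms below. The window shape itself is an assumption for illustration — the study specifies only the ERD.

```python
import numpy as np
from scipy.signal import lfilter

def leaky_integrate(envelope, fs, erd_ms=7.0):
    """Square an amplitude envelope to intensity and smooth it with a
    one-sided exponential window whose time constant equals the ERD.

    The exponential shape is an illustrative choice, not the study's model.
    """
    env = np.asarray(envelope, dtype=float)
    tau = erd_ms / 1000.0
    alpha = 1.0 - np.exp(-1.0 / (fs * tau))          # first-order IIR smoother
    return lfilter([alpha], [1.0, alpha - 1.0], env ** 2)
```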

9.
To determine how listeners weight different portions of the signal when integrating level information, they were presented with 1-s noise samples the levels of which randomly changed every 100 ms by repeatedly, and independently, drawing from a normal distribution. A given stimulus could be derived from one of two such distributions, a decibel apart, and listeners had to classify each sound as belonging to the "soft" or "loud" group. Subsequently, logistic regression analyses were used to determine to what extent each of the ten temporal segments contributed to the overall judgment. In Experiment 1, a nonoptimal weighting strategy was found that emphasized the beginning, and, to a lesser extent, the ending of the sounds. When listeners received trial-by-trial feedback, however, they approached equal weighting of all stimulus components. In Experiment 2, a spectral change was introduced in the middle of the stimulus sequence, changing from low-pass to high-pass noise, and vice versa. The temporal location of the stimulus change was strongly weighted, much as a new onset. These findings are not accounted for by current models of loudness or intensity discrimination, but are consistent with the idea that temporal weighting in loudness judgments is driven by salient events.  相似文献   
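The weighting analysis can be illustrated with a logistic regression of the binary "loud"/"soft" response on the ten segment levels, whose fitted coefficients then serve as the temporal weights. The sketch below uses simulated data and scikit-learn; it is a generic reconstruction of this type of analysis, not the authors' code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_trials, n_seg = 1000, 10

# Simulated experiment: each trial draws 10 segment levels (dB) around a base
# level taken from one of two distributions 1 dB apart; the simulated
# listener over-weights the onset and, to a lesser degree, the offset.
base = rng.choice([60.0, 61.0], size=n_trials)
levels = base[:, None] + rng.normal(0.0, 2.5, size=(n_trials, n_seg))
true_w = np.array([2.0, 1.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.2, 1.3])
internal = levels @ true_w + rng.normal(0.0, 4.0, size=n_trials)
response = (internal > np.median(internal)).astype(int)      # 1 = "loud"

# Fit the binary judgment on the segment levels; normalized coefficients
# estimate the relative temporal weights.
model = LogisticRegression(max_iter=1000).fit(levels, response)
weights = model.coef_.ravel()
print(np.round(weights / weights.sum(), 3))
```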

10.
Listeners' ability to discriminate interaural time difference (ITD) changes in low-frequency noise was determined as a function of differences in the noise spectra delivered to each ear. An ITD was applied to Gaussian noise, which was bandpass filtered using identical high-pass, but different low-pass cutoff frequencies across ears. Thus, one frequency region was dichotic, and a higher-frequency region monotic. ITD thresholds increased as bandwidth to one ear (i.e., monotic bandwidth) increased, despite the fact that the region of interaural spectral overlap remained constant. Results suggest that listeners can process ITD differences when the spectra at two ears are moderately different.
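A stimulus of the kind described — Gaussian noise sharing a high-pass edge across ears but low-pass filtered to different cutoffs in each ear, with an ITD applied as a waveform delay — might be generated as in the sketch below. Filter order and the specific cutoff and ITD values are illustrative.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def dichotic_itd_noise(fs=48000, dur=0.5, f_hp=100.0,
                       f_lp_left=500.0, f_lp_right=1000.0, itd_us=200.0,
                       rng=np.random.default_rng()):
    """Bandpass Gaussian noise with a common high-pass edge, different
    low-pass edges in the two ears, and an ITD applied as a whole-waveform
    delay of the left channel.  Filter order, cutoffs, and ITD are illustrative.
    """
    def bp(x, f_lp):
        sos = butter(4, [f_hp, f_lp], btype="bandpass", fs=fs, output="sos")
        return sosfiltfilt(sos, x)

    n = int(fs * dur)
    delay = int(round(itd_us * 1e-6 * fs))
    noise = rng.standard_normal(n + delay)
    left = bp(noise[:n], f_lp_left)                   # lags by `delay` samples
    right = bp(noise[delay:delay + n], f_lp_right)
    return left, right
```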

11.
Two experiments establish constraints on the ability of a common fundamental frequency (F0) to perceptually fuse low-pass filtered and complementary high-pass filtered speech presented to different ears. In experiment 1 the filter cut-off is set at 1 kHz. When the filters are sharp, giving little overlap in frequency between the two sounds, listeners report hearing two sounds even when the sounds at the two ears are on the same F0. Shallower filters give more fusion. In experiment 2, the filters' cut-off frequency is varied together with their slope. Fusion becomes more frequent when the signals at the two ears share low-frequency components. This constraint mirrors the natural filtering by head-shadow of sound sources presented to one side. The mechanisms underlying perceptual fusion may thus be similar to those underlying auditory localization.

12.
Cochlear implants provide users with limited spectral and temporal information. In this study, the amount of spectral and temporal information was systematically varied through simulations of cochlear implant processors using a noise-excited vocoder. Spectral information was controlled by varying the number of channels between 1 and 16, and temporal information was controlled by varying the lowpass cutoff frequencies of the envelope extractors from 1 to 512 Hz. Consonants and vowels processed using those conditions were presented to seven normal-hearing native-English-speaking listeners for identification. The results demonstrated that both spectral and temporal cues were important for consonant and vowel recognition with the spectral cues having a greater effect than the temporal cues for the ranges of numbers of channels and lowpass cutoff frequencies tested. The lowpass cutoff for asymptotic performance in consonant and vowel recognition was 16 and 4 Hz, respectively. The number of channels at which performance plateaued for consonants and vowels was 8 and 12, respectively. Within the above-mentioned ranges of lowpass cutoff frequency and number of channels, the temporal and spectral cues showed a tradeoff for phoneme recognition. Information transfer analyses showed different relative contributions of spectral and temporal cues in the perception of various phonetic/acoustic features.
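The simulation tool here is a noise-excited vocoder whose two free parameters are the number of channels and the envelope low-pass cutoff. A minimal sketch is given below; the channel spacing, band edges, and filter orders are illustrative choices, and only the two manipulated parameters follow the description above.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def noise_vocoder(x, fs, n_channels=8, env_cutoff_hz=16.0,
                  f_lo=200.0, f_hi=7000.0, rng=np.random.default_rng()):
    """Noise-excited vocoder: band-split, extract envelopes, re-impose on noise.

    Log-spaced channel edges, band limits, and filter orders are illustrative;
    only `n_channels` and `env_cutoff_hz` correspond to the parameters varied
    in the study.
    """
    x = np.asarray(x, dtype=float)
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    env_sos = butter(2, env_cutoff_hz, btype="low", fs=fs, output="sos")
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(3, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, x)
        env = sosfiltfilt(env_sos, np.maximum(band, 0.0))   # half-wave rectify + LPF
        carrier = sosfiltfilt(band_sos, rng.standard_normal(len(x)))
        mod = env * carrier
        rms = np.sqrt(np.mean(mod ** 2)) + 1e-12
        out += mod * np.sqrt(np.mean(band ** 2)) / rms      # match band level
    return out
```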

13.
Two experiments investigated pitch perception for stimuli where the place of excitation was held constant. Experiment 1 used pulse trains in which the interpulse interval alternated between 4 and 6 ms. In experiment 1a these "4-6" pulse trains were bandpass filtered between 3900 and 5300 Hz and presented acoustically against a noise background to normal listeners. The rate of an isochronous pulse train (in which all the interpulse intervals were equal) was adjusted so that its pitch matched that of the "4-6" stimulus. The pitch matches were distributed unimodally, had a mean of 5.7 ms, and never corresponded to either 4 ms or 10 ms (the period of the stimulus). In experiment 1b the pulse trains were presented both acoustically to normal listeners and electrically to users of the LAURA cochlear implant, via a single channel of their device. A forced-choice procedure was used to measure psychometric functions, in which subjects judged whether the 4-6 stimulus was higher or lower in pitch than isochronous pulse trains having periods of 3, 4, 5, 6, or 7 ms. For both groups of listeners, the point of subjective equality corresponded to a period of 5.6 to 5.7 ms. Experiment 1c confirmed that these psychometric functions were monotonic over the range 4-12 ms. In experiment 2, normal listeners adjusted the rate of an isochronous filtered pulse train to match the pitch of mixtures of pulse trains having rates of F1 and F2 Hz, passed through the same bandpass filter (3900-5400 Hz). The ratio F2/F1 was 1.29 and F1 was either 70, 92, 109, or 124 Hz. Matches were always close to F2 Hz. It is concluded that the results of both experiments are inconsistent with models of pitch perception which rely on higher-order intervals. Together with other published data on purely temporal pitch perception, these data are consistent with a model in which only first-order interpulse intervals contribute to pitch, and in which, over the range 0-12 ms, longer intervals receive higher weights than short intervals.
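The "4-6" stimulus — a click train whose interpulse interval alternates between 4 and 6 ms, bandpass filtered between 3900 and 5300 Hz — can be generated roughly as follows. Sampling rate, click amplitude, and filter order are illustrative, and the background noise used in the study is omitted.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def alternating_click_train(fs=48000, dur=1.0, intervals_ms=(4.0, 6.0),
                            band_hz=(3900.0, 5300.0)):
    """Click train with interpulse intervals alternating between 4 and 6 ms,
    bandpass filtered into the 3900-5300 Hz region.

    Sampling rate, click amplitude, and filter order are illustrative choices.
    """
    n = int(fs * dur)
    x = np.zeros(n)
    t, i = 0.0, 0
    while t < dur:
        x[int(t * fs)] = 1.0                      # unit-amplitude click
        t += intervals_ms[i % 2] / 1000.0         # alternate 4 ms, 6 ms
        i += 1
    sos = butter(4, band_hz, btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)
```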

14.
Speech recognition with altered spectral distribution of envelope cues.
Recognition of consonants, vowels, and sentences was measured in conditions of reduced spectral resolution and distorted spectral distribution of temporal envelope cues. Speech materials were processed through four bandpass filters (analysis bands), half-wave rectified, and low-pass filtered to extract the temporal envelope from each band. The envelope from each speech band modulated a band-limited noise (carrier bands). Analysis and carrier bands were manipulated independently to alter the spectral distribution of envelope cues. Experiment I demonstrated that the location of the cutoff frequencies defining the bands was not a critical parameter for speech recognition, as long as the analysis and carrier bands were matched in frequency extent. Experiment II demonstrated a dramatic decrease in performance when the analysis and carrier bands did not match in frequency extent, which resulted in a warping of the spectral distribution of envelope cues. Experiment III demonstrated a large decrease in performance when the carrier bands were shifted in frequency, mimicking the basal position of electrodes in a cochlear implant. Experiment IV showed a relatively minor effect of the overlap in the noise carrier bands, simulating the overlap in neural populations responding to adjacent electrodes in a cochlear implant. Overall, these results show that, for four bands, the frequency alignment of the analysis bands and carrier bands is critical for good performance, while the exact frequency divisions and overlap in carrier bands are not as critical.

15.
Experiment 1 measured rate discrimination of electric pulse trains by bilateral cochlear implant (CI) users, for standard rates of 100, 200, and 300 pps. In the diotic condition the pulses were presented simultaneously to the two ears. Consistent with previous results with unilateral stimulation, performance deteriorated at higher standard rates. In the signal interval of each trial in the dichotic condition, the standard rate was presented to the left ear and the (higher) signal rate was presented to the right ear; the non-signal intervals were the same as in the diotic condition. Performance in the dichotic condition was better for some listeners than in the diotic condition for standard rates of 100 and 200 pps, but not at 300 pps. It is concluded that the deterioration in rate discrimination observed for CI users at high rates cannot be alleviated by the introduction of a binaural cue, and is unlikely to be limited solely by central pitch processes. Experiment 2 was an analogous experiment in which 300-pps acoustic pulse trains were bandpass filtered (3900-5400 Hz) and presented in a noise background to normal-hearing listeners. Unlike the results of experiment 1, performance was better in the dichotic than in the diotic condition.

16.
The influence of pinnae-based spectral cues on sound localization
The role of pinnae-based spectral cues was investigated by requiring listeners to locate sound, binaurally, in the horizontal plane with and without partial occlusion of their external ears. The main finding was that the high frequencies were necessary for optimal performance. When the stimulus contained the higher audio frequencies, e.g., broadband and 4.0-kHz high-pass noise, localization accuracy was significantly superior to that recorded for stimuli consisting only of the lower frequencies (4.0- and 1.0-kHz low-pass noise). This finding was attributed to the influence of the spectral cues furnished by the pinnae, for when the stimulus composition included high frequencies, pinnae occlusion resulted in a marked decline in localization accuracy. Numerous front-rear reversals occurred. Moreover, the ability to distinguish among sounds originating within the same quadrant also suffered. Performance proficiency for the low-pass stimuli was not further degraded under conditions of pinnae occlusion. In locating the 4.0-kHz high-pass noise when both, neither, or only one ear was occluded, the data demonstrated unequivocally that the pinna-based cues of the "near" ear contributed powerfully toward localization accuracy.

17.
This study tested the relationship between frequency selectivity and the minimum spacing between harmonics necessary for accurate f0 discrimination. Fundamental frequency difference limens (f0 DLs) were measured for ten listeners with moderate sensorineural hearing loss (SNHL) and three normal-hearing listeners for sine- and random-phase harmonic complexes, bandpass filtered between 1500 and 3500 Hz, with f0's ranging from 75 to 500 Hz (or higher). All listeners showed a transition between small (good) f0 DLs at high f0's and large (poor) f0 DLs at low f0's, although the f0 at which this transition occurred (f0,tr) varied across listeners. Three measures thought to reflect frequency selectivity were significantly correlated with both f0,tr and the minimum f0 DL achieved at high f0's: (1) the maximum f0 for which f0 DLs were phase dependent, (2) the maximum modulation frequency for which amplitude modulation and quasi-frequency modulation were discriminable, and (3) the equivalent rectangular bandwidth of the auditory filter, estimated using the notched-noise method. These results provide evidence of a relationship between f0 discrimination performance and frequency selectivity in listeners with SNHL, supporting "spectral" and "spectro-temporal" theories of pitch perception that rely on sharp tuning in the auditory periphery to accurately extract f0 information.

18.
Two experiments were conducted to assess whether hearing-impaired listeners have a reduced ability to process suprathreshold complex patterns of modulation applied to a 4-kHz sinusoidal carrier. Experiment 1 examined the ability to "hear out" the modulation frequency of the central component of a three-component modulator, using the method described by Sek and Moore [J. Acoust. Soc. Am. 113, 2801-2811 (2003)]. Scores were around 70-80% correct when the components in the three-component modulator were widely spaced and when the frequencies of the target and comparison differed sufficiently, but decreased when the components in the modulator were closely spaced. Experiment 2 examined the ability to hear a change in the relative phase of the components in a three-component modulator with harmonically spaced components. The frequency of the central component, fc, was either 50 or 100 Hz. Scores were about 70% correct when the component spacing was ≤ 0.5fc, but decreased markedly for greater spacings. Performance was only slightly impaired by randomizing the overall modulation depth from one stimulus to the next. For both experiments, performance was only slightly worse than for normally hearing listeners, indicating that cochlear hearing loss does not markedly affect the ability to process suprathreshold complex patterns of modulation.

19.
Three studies demonstrate listeners' ability to use the rate of a sound's frequency change (velocity) to predict how the spectral path of the sound is likely to evolve, even in the event of an occlusion. Experiments 1 and 2 use a modified probe-signal method to measure attentional filters and demonstrate increased detection of sounds falling along implied paths of constant-linear velocity. Experiment 3 shows listeners perceive a suprathreshold tone as falling along a trajectory of constant velocity when the frequency is near to the region of greatest detection as measured in Experiments 1 and 2. Further, results show greater accuracy and decreased bias in the use of velocity information with increased exposure to a constant-velocity sound. As the duration of occlusion lengthens, results also show a downward shift (relative to a trajectory of constant velocity) in the frequency at which listeners' detection and experience of a continuous trajectory are greatest. A preliminary model of velocity processing is proposed to account for this downward shift. Results show listeners' use of velocity in extrapolating sounds with dynamically changing spectral and temporal properties and provide evidence for its role in perceptual auditory continuity within a noisy acoustic environment.

20.
Fundamental frequency (f0) difference limens (DLs) were measured as a function of f0 for sine- and random-phase harmonic complexes, bandpass filtered with 3-dB cutoff frequencies of 2.5 and 3.5 kHz (low region) or 5 and 7 kHz (high region), and presented at an average 15 dB sensation level (approximately 48 dB SPL) per component in a wideband background noise. Fundamental frequencies ranged from 50 to 300 Hz and 100 to 600 Hz in the low and high spectral regions, respectively. In each spectral region, f0 DLs improved dramatically with increasing f0 as approximately the tenth harmonic appeared in the passband. Generally, f0 DLs for complexes with similar harmonic numbers were similar in the two spectral regions. The dependence of f0 discrimination on harmonic number presents a significant challenge to autocorrelation (AC) models of pitch, in which predictions generally depend more on spectral region than harmonic number. A modification involving a "lag window" is proposed and tested, restricting the AC representation to a limited range of lags relative to each channel's characteristic frequency. This modified unitary pitch model was able to account for the dependence of f0 DLs on harmonic number, although this correct behavior was not based on peripheral harmonic resolvability.
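The proposed "lag window" modification — restricting each channel's autocorrelation to a limited range of lags relative to the channel's characteristic frequency — might look like the sketch below, where the window tapers off beyond a fixed number of CF periods. The window shape and its width in cycles are assumptions for illustration; the abstract does not give the parameterization.

```python
import numpy as np

def lag_windowed_acf(channel_output, fs, cf_hz, max_cycles=15.0):
    """Autocorrelation of one channel, weighted by a lag window that tapers
    off beyond `max_cycles` periods of the channel's characteristic frequency.

    The raised-cosine window shape and the cutoff in CF cycles are
    illustrative assumptions.
    """
    x = np.asarray(channel_output, dtype=float)
    x = x - np.mean(x)
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    acf /= acf[0] + 1e-12
    lags_s = np.arange(len(acf)) / fs
    cutoff_s = max_cycles / cf_hz
    # raised-cosine roll-off from 0.5*cutoff to cutoff, zero beyond
    w = np.clip((cutoff_s - lags_s) / (0.5 * cutoff_s), 0.0, 1.0)
    window = 0.5 - 0.5 * np.cos(np.pi * w)
    return lags_s, acf * window
```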
