Similar documents
20 similar documents found (search time: 578 ms).
1.
Ripple-spectrum stimuli were used to investigate the scale of spectral detail used by listeners in interpreting spectral cues for vertical-plane localization. In three experiments, free-field localization judgments were obtained for 250-ms, 0.6-16-kHz noise bursts with log-ripple spectra that varied in ripple density, peak-to-trough depth, and phase. When ripple density was varied and depth was held constant at 40 dB, listeners' localization error rates increased most (relative to rates for flat-spectrum targets) for densities of 0.5-2 ripples/oct. When depth was varied and density was held constant at 1 ripple/oct, localization accuracy was degraded only for ripple depths of 20 dB or greater. When phase was varied and density was held constant at 1 ripple/oct and depth at 40 dB, three of five listeners made errors at consistent locations unrelated to the ripple phase, whereas two listeners made errors at locations systematically modulated by ripple phase. Although the reported upper limit for ripple discrimination is 10 ripples/oct [Supin et al., J. Acoust. Soc. Am. 106, 2800-2804 (1999)], the present results indicate that details finer than 2 ripples/oct or coarser than 0.5 ripples/oct do not strongly influence processing of spectral cues for sound localization. The low spectral-frequency limit suggests that broad-scale spectral variation is discounted, even though components at this scale are among those contributing the most to the shapes of directional transfer functions.
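A minimal sketch of how such a log-ripple noise burst might be synthesized in the frequency domain (assuming NumPy; the sampling rate, normalization, and function name are illustrative and not taken from the paper):

```python
import numpy as np

def log_ripple_noise(fs=48000, dur=0.25, f_lo=600.0, f_hi=16000.0,
                     density=1.0, depth_db=40.0, phase=0.0, seed=0):
    """Noise burst whose log-magnitude spectrum ripples sinusoidally along
    log2(frequency): `density` ripples/oct, `depth_db` peak-to-trough depth."""
    rng = np.random.default_rng(seed)
    n = int(fs * dur)
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    mag = np.zeros_like(freqs)
    # Sinusoidal ripple in dB across log2 frequency, referenced to f_lo.
    ripple_db = (depth_db / 2.0) * np.sin(
        2.0 * np.pi * density * np.log2(freqs[band] / f_lo) + phase)
    mag[band] = 10.0 ** (ripple_db / 20.0)
    spec = mag * np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, freqs.size))  # random phase
    x = np.fft.irfft(spec, n)
    return x / np.max(np.abs(x))

burst = log_ripple_noise(density=1.0, depth_db=40.0, phase=np.pi / 2)
```

Adding pi to `phase` yields the peak/trough interchange used in the phase-reversal tasks of items 2 and 4 below.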

2.
Depth resolution of spectral ripples was measured in normal-hearing humans using a phase-reversal test. The principle of the test was to find the lowest ripple depth at which an interchange of peak and trough positions (the phase reversal) in the rippled spectrum is detectable. Using this test, ripple-depth thresholds were measured as a function of ripple density for octave-band rippled noise at center frequencies from 0.5 to 8 kHz. The ripple-depth threshold in the power domain was around 0.2 at low ripple densities of 4-5 relative units (center-frequency-to-ripple-spacing ratio), or 3-3.5 ripples/oct. The threshold increased as ripple density increased, reaching the highest possible level of 1.0 at ripple densities ranging from 7.5 relative units at a 0.5-kHz center frequency to 14.3 relative units at 8 kHz (5.2 to 10.0 ripples/oct, respectively). The relation between ripple-depth threshold and ripple density can be satisfactorily described by transfer of the signal through frequency-tuned auditory filters.

3.
In this study, auditory stream segregation based on differences in the rate of envelope fluctuations, in the absence of spectral and temporal fine-structure cues, was tested. The temporal sequences to be segregated were composed of fully amplitude-modulated (AM) bursts of broadband noises A and B. All sequences were built by reiterating an ABA triplet in which the A modulation rate was fixed at 100 Hz and the B modulation rate was varied. The first experiment measured the threshold difference in AM rate at which subjects perceived the sequence as two streams rather than one. The results of this first experiment revealed that subjects generally perceived the sequences as a single perceptual stream when the difference in AM rate between the A and B noises was smaller than 0.75 oct, and as two streams when the difference was larger than about 1.00 oct. These streaming thresholds were substantially larger than, and not related to, the subjects' modulation-rate discrimination thresholds. The results of a second experiment demonstrated that AM-rate-based streaming was adversely affected by decreases in AM depth, but that segregation remained possible as long as the AM of either the A or B noises was above the subject's AM-detection threshold. The results of a third experiment indicated that AM-rate-based streaming effects were still observed when the modulations applied to the A and B noises were set individually, either at a constant level in dB above AM-detection threshold or at levels at which they were of the same perceived strength. This finding suggests that AM-rate-based streaming is not necessarily mediated by perceived differences in AM depth. Altogether, the results of this study indicate that sequential sounds can be segregated on the sole basis of differences in the rate of their temporal fluctuations, in the absence of other temporal or spectral cues.
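A rough sketch of the ABA triplet construction described above (assuming NumPy; burst and gap durations are placeholders rather than the study's values):

```python
import numpy as np

def am_noise_burst(fs, dur, am_rate, seed):
    """Broadband noise, 100% sinusoidally amplitude-modulated at `am_rate` Hz."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * dur)) / fs
    return 0.5 * (1.0 + np.sin(2.0 * np.pi * am_rate * t)) * rng.standard_normal(t.size)

def aba_sequence(fs=44100, burst_dur=0.1, gap_dur=0.1,
                 a_rate=100.0, delta_oct=1.0, n_triplets=5):
    """Repeating A-B-A- triplets; the B noise's AM rate sits `delta_oct` octaves above A's."""
    b_rate = a_rate * 2.0 ** delta_oct
    gap = np.zeros(int(fs * gap_dur))
    triplets = []
    for k in range(n_triplets):
        a1 = am_noise_burst(fs, burst_dur, a_rate, seed=3 * k)
        b = am_noise_burst(fs, burst_dur, b_rate, seed=3 * k + 1)
        a2 = am_noise_burst(fs, burst_dur, a_rate, seed=3 * k + 2)
        triplets.append(np.concatenate([a1, b, a2, gap]))
    return np.concatenate(triplets)

seq = aba_sequence(delta_oct=0.75)  # near the reported one-stream/two-stream boundary
```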

4.
Spectral peak resolution was investigated in normal hearing (NH), hearing impaired (HI), and cochlear implant (CI) listeners. The task involved discriminating between two rippled noise stimuli in which the frequency positions of the log-spaced peaks and valleys were interchanged. The ripple spacing was varied adaptively from 0.13 to 11.31 ripples/octave, and the minimum ripple spacing at which a reversal in peak and trough positions could be detected was determined as the spectral peak resolution threshold for each listener. Spectral peak resolution was best, on average, in NH listeners, poorest in CI listeners, and intermediate for HI listeners. There was a significant relationship between spectral peak resolution and both vowel and consonant recognition in quiet across the three listener groups. The results indicate that the degree of spectral peak resolution required for accurate vowel and consonant recognition in quiet backgrounds is around 4 ripples/octave, and that spectral peak resolution poorer than around 1-2 ripples/octave may result in highly degraded speech recognition. These results suggest that efforts to improve spectral peak resolution for HI and CI users may lead to improved speech recognition.  相似文献   

5.
Standard continuous interleaved sampling (CIS) processing and a modified processing strategy designed to enhance temporal cues to voice pitch were compared on tests of intonation perception and vowel perception, both in implant users and in acoustic simulations. In standard processing, 400-Hz low-pass envelopes modulated either pulse trains (implant users) or noise carriers (simulations). In the modified strategy, slow-rate envelope modulations, which convey the dynamic spectral variation crucial for speech understanding, were extracted by low-pass filtering (32 Hz). In addition, during voiced speech, higher-rate temporal modulation in each channel was provided by 100% amplitude modulation with a sawtooth-like waveform whose periodicity followed the fundamental frequency (F0) of the input. Channel levels were determined by the product of the lower- and higher-rate modulation components. Both in acoustic simulations and in implant users, the ability to use intonation information to identify sentences as question or statement was significantly better with modified processing. However, while there was no difference in vowel recognition in the acoustic simulation, implant users performed worse with modified processing both in vowel recognition and in formant frequency discrimination. It appears that, while enhancing pitch perception, modified processing harmed the transmission of spectral information.
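For a single analysis channel, the level computation of the modified strategy might be sketched as follows (SciPy filters; `f0_track` and `voiced` are assumed per-sample arrays, and the standard 400-Hz path is omitted):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, sawtooth

def modified_channel_envelope(band_signal, f0_track, voiced, fs, lp_cutoff=32.0):
    """Slow (<= 32 Hz) envelope multiplied by a full-depth sawtooth modulator at F0."""
    # Slow-rate envelope: rectify, then low-pass at 32 Hz.
    sos = butter(4, lp_cutoff, btype="low", fs=fs, output="sos")
    slow_env = np.maximum(sosfiltfilt(sos, np.abs(band_signal)), 0.0)
    # Sawtooth-like modulator whose periodicity follows the running F0.
    phase = 2.0 * np.pi * np.cumsum(f0_track) / fs
    f0_mod = 0.5 * (1.0 + sawtooth(phase))      # 100% modulation depth, range 0..1
    f0_mod = np.where(voiced, f0_mod, 1.0)      # apply only during voiced speech
    return slow_env * f0_mod                    # channel level = product of the two
```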

6.
Budgerigars were trained to discriminate complex sounds with two different types of spectral profiles from flat-spectrum, wideband noise. In one case, complex sounds with a sinusoidal ripple in (log) amplitude across (log) frequency bandwidth were generated by combining 201 logarithmically spaced tones covering the frequency region from 500 Hz to 10 kHz. A second type of rippled stimulus was generated by delaying broadband noise and adding it to the original noise in an iterative fashion. In each case, thresholds for modulation depth (i.e., peak-to-valley in dB) were measured at several different ripple frequencies (i.e., cycles/octave for logarithmic profiles) or different repetition pitches (i.e., delay for rippled noises). Budgerigars were similar to humans in detecting ripple at low spatial frequencies, but were considerably more sensitive than humans in detecting ripples in log ripple spectra at high spatial frequencies. Budgerigars were also similar to humans in detecting linear ripple in broadband noise over a wide range of repetition pitches. Taken together, these data show that the avian auditory system is at least as good as, if not better than, the human auditory system at detecting spectral ripples in noise, despite gross anatomical differences in both the peripheral and central auditory nervous systems.
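The delay-and-add stimulus (iterated rippled noise) can be sketched as below; whether each iteration adds back the original noise or the running sum differs between variants, so the add-same form used here is an assumption (assumes NumPy):

```python
import numpy as np

def iterated_rippled_noise(fs=44100, dur=1.0, delay_ms=5.0, gain=1.0,
                           n_iter=8, seed=0):
    """Repeatedly add a delayed copy of the running waveform to itself;
    the repetition pitch corresponds to roughly 1/delay."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(int(fs * dur))
    d = int(round(fs * delay_ms / 1000.0))
    for _ in range(n_iter):
        y = x.copy()
        y[d:] += gain * x[:-d]
        x = y
    return x / np.max(np.abs(x))

irn = iterated_rippled_noise(delay_ms=5.0)  # repetition pitch near 200 Hz
```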

7.
When distortion product otoacoustic emissions (DPOAEs) are measured with high frequency resolution, the DPOAE shows quasi-periodic variations across frequency, called DPOAE fine structure. In this study the DPOAE fine structure was determined for 50 normal-hearing humans using fixed primary levels of L1/L2 = 65/45 dB. An algorithm was developed that characterizes the fine-structure ripples in terms of three parameters: ripple spacing, ripple height, and ripple prevalence. The characteristic patterns of fine structure can be found in the DPOAEs of all subjects, though the fine-structure characteristics are individual and vary from subject to subject. On average, the ripple spacing decreases with increasing frequency, from 1/8 oct at 1 kHz to 3/32 oct at 5 kHz. The ripple prevalence is two to three ripples per 1/3 oct, and ripple heights of up to 32 dB could be detected. The 50 normal-hearing subjects were divided into two groups, the subjects of group A having slightly better hearing levels than those of group B. The subjects of group A have significantly higher DPOAE levels. The overall prevalence of fine-structure ripples does not differ between the two groups, but the ripples are higher and narrower for subjects of group B than for group A.

8.
Vowel identity correlates well with the shape of the transfer function of the vocal tract, in particular the positions of the first two or three formant peaks. However, in voiced speech the transfer function is sampled at multiples of the fundamental frequency (F0), and the short-term spectrum contains peaks at those frequencies rather than at the formants. It is not clear how the auditory system estimates the original spectral envelope from the vowel waveform. Cochlear excitation patterns, for example, resolve harmonics in the low-frequency region, and their shape varies strongly with F0. The problem cannot be cured by smoothing: lag-domain components of the spectral envelope are aliased and cause F0-dependent distortion. The problem is severe at high F0s, where the spectral envelope is severely undersampled. This paper treats vowel identification as a process of pattern recognition with missing data. Matching is restricted to available data, and missing data are ignored using an F0-dependent weighting function that emphasizes regions near harmonics. The model is presented in two versions: a frequency-domain version based on short-term spectra or tonotopic excitation patterns, and a time-domain version based on autocorrelation functions. It accounts for the relative F0-independence observed in vowel identification.
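A toy illustration of the frequency-domain matching idea, using an F0-dependent weighting that emphasizes bins near harmonics (the Gaussian window shape and width are assumptions, not the model's actual weighting function):

```python
import numpy as np

def harmonic_weights(freqs, f0, rel_width=0.1):
    """Weights that peak at multiples of F0 and fall off between harmonics."""
    k = np.maximum(np.round(freqs / f0), 1.0)      # nearest harmonic number
    nearest = k * f0
    return np.exp(-0.5 * ((freqs - nearest) / (rel_width * f0)) ** 2)

def weighted_spectral_distance(obs_db, template_db, freqs, f0):
    """Compare an observed spectrum with a vowel template only where data are available."""
    w = harmonic_weights(freqs, f0)
    return np.sum(w * (obs_db - template_db) ** 2) / np.sum(w)

# Identification would pick the vowel template with the smallest weighted distance.
```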

9.
The present study measured the recognition of spectrally degraded and frequency-shifted vowels in both acoustic and electric hearing. Vowel stimuli were passed through 4, 8, or 16 bandpass filters and the temporal envelopes from each filter band were extracted by half-wave rectification and low-pass filtering. The temporal envelopes were used to modulate noise bands which were shifted in frequency relative to the corresponding analysis filters. This manipulation not only degraded the spectral information by discarding within-band spectral detail, but also shifted the tonotopic representation of spectral envelope information. Results from five normal-hearing subjects showed that vowel recognition was sensitive to both spectral resolution and frequency shifting. The effect of a frequency shift did not interact with spectral resolution, suggesting that spectral resolution and spectral shifting are orthogonal in terms of intelligibility. High vowel recognition scores were observed for as few as four bands. Regardless of the number of bands, no significant performance drop was observed for tonotopic shifts equivalent to 3 mm along the basilar membrane, that is, for frequency shifts of 40%-60%. Similar results were obtained from five cochlear implant listeners when electrode locations were fixed and the spectral location of the analysis filters was shifted. Changes in recognition performance in electric and acoustic hearing were similar in terms of the relative, rather than the absolute, location of electrodes, indicating that cochlear implant users may at least partly accommodate to the new patterns of speech sounds after long-term exposure to their normal speech processor.
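A simplified noise-vocoder sketch with frequency-shifted carrier bands (the band count, cutoffs, and envelope filter are illustrative; it also assumes the sampling rate is high enough that the shifted bands stay below the Nyquist frequency):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def shifted_noise_vocoder(x, fs, n_bands=8, f_lo=300.0, f_hi=5500.0,
                          shift_ratio=1.5, seed=0):
    """Half-wave-rectified, low-pass-filtered band envelopes modulate noise bands
    whose corner frequencies are shifted upward by `shift_ratio`."""
    rng = np.random.default_rng(seed)
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)
    env_sos = butter(2, 160.0, btype="low", fs=fs, output="sos")   # envelope smoother
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        env = sosfiltfilt(env_sos, np.maximum(sosfiltfilt(band_sos, x), 0.0))
        car_sos = butter(4, [lo * shift_ratio, hi * shift_ratio],
                         btype="band", fs=fs, output="sos")
        out += env * sosfiltfilt(car_sos, rng.standard_normal(len(x)))
    return out
```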

10.
Background noise reduces the depth of the low-frequency envelope modulations known to be important for speech intelligibility. The relative strength of the target and masker envelope modulations can be quantified using a modulation signal-to-noise ratio measure, (S/N)mod. Such a measure can be used in noise-suppression algorithms to extract target-relevant modulations from the corrupted (target + masker) envelopes for potential improvement in speech intelligibility. In the present study, envelopes are decomposed in the modulation spectral domain into a number of channels spanning the range of 0-30 Hz. Target-dominant modulations are identified and retained in each channel based on the (S/N)mod selection criterion, while modulations which potentially interfere with perception of the target (i.e., those dominated by the masker) are discarded. The impact of modulation-selective processing on the speech-reception threshold for sentences in noise is assessed with normal-hearing listeners. Results indicate that the intelligibility of noise-masked speech can be improved by as much as 13 dB when preserving target-dominant modulations, present up to a modulation frequency of 18 Hz, while discarding masker-dominant modulations from the mixture envelopes.
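Conceptually, the modulation-selective processing for one acoustic band might look like the following sketch (the modulation-channel edges, envelope sampling rate, and selection criterion value are assumptions; separate target and masker envelopes are taken as given):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def select_modulations(mix_env, tgt_env, msk_env, fs_env,
                       edges=(0, 4, 8, 12, 18, 24, 30), snr_crit_db=0.0):
    """Split the mixture envelope into modulation channels (0-30 Hz); retain channels
    whose (S/N)mod exceeds the criterion, discard masker-dominated channels."""
    kept = np.zeros_like(mix_env)
    for lo, hi in zip(edges[:-1], edges[1:]):
        if lo == 0:
            sos = butter(2, hi, btype="low", fs=fs_env, output="sos")
        else:
            sos = butter(2, [lo, hi], btype="band", fs=fs_env, output="sos")
        t = sosfiltfilt(sos, tgt_env)
        m = sosfiltfilt(sos, msk_env)
        snr_mod = 10.0 * np.log10(np.sum(t ** 2) / (np.sum(m ** 2) + 1e-12))
        if snr_mod >= snr_crit_db:              # target-dominant: keep this channel
            kept += sosfiltfilt(sos, mix_env)
    return kept
```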

11.
Sound coming directly from a source is often accompanied by reflections arriving from different directions. However, the "precedence effect" occurs when listeners judge such a source's direction: information in the direct, first-arriving sound tends to govern the direction heard for the overall sound. This paper asks whether the spectral envelope of the direct sound has a similar, dominant influence on the spectral envelope perceived for the whole sound. A continuum between two vowels was produced and then a "two-part" filter distorted each step. The beginning of this filter's unit-sample response simulated a direct sound with no distortion of the spectral envelope. The second part simulated a reflection pattern that distorted the spectral envelope. The reflections' frequency response was designed to give the spectral envelope of one of the continuum's end-points to the other end-point. Listeners' identifications showed that the reflections in two-part filters had a substantial influence because sounds tended to be identified as the positive vowel of the reflection pattern. This effect was not reduced when the interaural delays of the reflections and the direct sound were substantially different. Also, when the reflections were caused to precede the direct sound, the effects were much the same. By contrast, in measurements of lateralization the precedence effect was obtained. Here, the lateral position of the whole sound was largely governed by the interaural delay of the direct sound, and was hardly affected by the interaural delay of the reflections.  相似文献   

12.
The spectral envelope is a major determinant of the perceptual identity of many classes of sound including speech. When sounds are transmitted from the source to the listener, the spectral envelope is invariably and diversely distorted, by factors such as room reverberation. Perceptual compensation for spectral-envelope distortion was investigated here. Carrier sounds were distorted by spectral envelope difference filters whose frequency response is the spectral envelope of one vowel minus the spectral envelope of another. The filter /I/ minus /e/ and its inverse were used. Subjects identified a test sound that followed the carrier. The test sound was drawn from an /Itch/ to /etch/ continuum. Perceptual compensation produces a phoneme boundary difference between /I/ minus /e/ and its inverse. Carriers were the phrase "the next word is" spoken by the same (male) speaker as the test sounds, signal-correlated noise derived from this phrase, the same phrase spoken by a female speaker, male and female versions played backwards, and a repeated end-point vowel. The carrier and test were presented to the same ear, to different ears, and from different apparent directions (by varying interaural time delay). The results show that compensation is unlike peripheral phenomena, such as adaptation, and unlike phonetic perceptual phenomena. The evidence favors a central, auditory mechanism.  相似文献   

13.
Vowels are characterized by peaks in their spectral envelopes: the formants. To gain insight into the perception of speech as well as into the basic abilities of the ear, sensitivity to modulations in the positions of these formants is investigated. Frequency modulation detection thresholds (FMTs) were measured for the center frequency of formantlike harmonic complexes in the absence and in the presence of simultaneous off-frequency formants (maskers). Both the signals and the maskers were harmonic complexes which were band-pass filtered with a triangular spectral envelope, on a log-log scale, into either a LOW (near 500 Hz), a MID (near 1500 Hz), or a HIGH region (near 3000 Hz). They had a duration of 250 ms, and either an 80- or a 240-Hz fundamental. The modulation rate was 5 Hz for the signals and 10 Hz for the maskers. A pink noise background was presented continuously. In a first experiment no maskers were used. The measured FMTs were roughly two times larger than previously reported just-noticeable differences for formant frequency. In a second experiment, no significant differences were found between the FMTs in the absence of maskers and those in the presence of stationary (i.e., nonfrequency modulated) maskers. However, under many conditions the FMTs were increased by the presence of simultaneous modulated maskers. These results indicate that frequency modulation detection interference (FMDI) can exist for formantlike complex tones. The FMDI data could be divided into two groups. For stimuli characterized by a steep (200-dB/oct) slope, it was found that the size of the FMDI depended on which cues were used for detecting the signal and masker modulations. For stimuli with shallow (50-dB/oct) slopes, the FMDI was reduced when the signal and the masker had widely differing fundamentals, implying that the fundamental information is extracted before the interference occurs.  相似文献   

14.
Spectral resolution has been reported to be closely related to vowel and consonant recognition in cochlear implant (CI) listeners. One measure of spectral resolution is the spectral modulation threshold (SMT), defined as the smallest detectable spectral contrast in a spectral ripple stimulus. SMT may be determined by the activation pattern associated with electrical stimulation. In the present study, broad activation patterns were simulated using a multi-band vocoder to determine whether similar impairments in speech understanding could be produced in normal-hearing listeners. Tokens were first decomposed into 15 logarithmically spaced bands and then re-synthesized by multiplying the envelope of each band by matched filtered noise. Various amounts of current spread were simulated by adjusting the drop-off of the noise spectrum away from the peak (40 to 5 dB/octave). The average SMT (at 0.25 and 0.5 cycles/octave) increased from 6.3 to 22.5 dB, while average vowel identification scores dropped from 86% to 19% and consonant identification scores dropped from 93% to 59%. In each condition, the impairments in speech understanding were generally similar to those found in CI listeners with similar SMTs, suggesting that variability in the spread of neural activation largely accounts for the variability in speech perception of CI listeners.
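A rough way to build "current spread" carriers, in which the noise spectrum falls off away from the band centre at a chosen dB/octave slope (the Hilbert-envelope extraction, band edges, and sampling-rate assumption are illustrative rather than the study's exact procedure):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def spread_noise_carrier(fc, slope_db_oct, fs, n, seed=0):
    """Noise whose log-magnitude spectrum drops by `slope_db_oct` per octave
    on either side of the band centre `fc`."""
    rng = np.random.default_rng(seed)
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    atten_db = slope_db_oct * np.abs(np.log2(np.maximum(freqs, 1.0) / fc))
    spec = 10.0 ** (-atten_db / 20.0) * np.exp(1j * rng.uniform(0, 2 * np.pi, freqs.size))
    return np.fft.irfft(spec, n)

def spread_vocoder(x, fs, n_bands=15, f_lo=250.0, f_hi=8000.0, slope_db_oct=20.0):
    """15-band sketch: each band's envelope multiplies a spread carrier (fs > 16 kHz assumed)."""
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)
    out = np.zeros(len(x))
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        env = np.abs(hilbert(sosfiltfilt(sos, x)))
        out += env * spread_noise_carrier(np.sqrt(lo * hi), slope_db_oct, fs, len(x), seed=i)
    return out
```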

15.
The just-noticeable-difference in frequency (jndf) for complex signals with triangular spectral envelopes is found to depend on the envelope slope. For shallow slopes (less than 140 dB/oct), jndf increases with decreasing slope. Addition of noise also impairs frequency discrimination within a region of about 20 dB above masked threshold. This is found for both maskers used: a wideband noise and a narrow-band masker which is below the signal in frequency. When wideband noise is used, frequency discrimination of complex signals with shallow slopes deteriorates more rapidly with decreasing signal-to-noise ratio than it does when the signals have steep spectral slopes.  相似文献   

16.
We determined how the perceived naturalness of music and speech signals (male and female talkers) was affected by various forms of linear filtering, some of which were intended to mimic the spectral "distortions" introduced by transducers such as microphones, loudspeakers, and earphones. The filters introduced spectral tilts and ripples of various types, variations in upper and lower cutoff frequency, and combinations of these. All of the differently filtered signals (168 conditions) were intermixed in random order within one block of trials. Levels were adjusted to give approximately equal loudness in all conditions. Listeners were required to judge the perceptual quality (naturalness) of the filtered signals on a scale from 1 to 10. For spectral ripples, perceived quality decreased with increasing ripple density up to 0.2 ripple/ERB(N) and with increasing ripple depth. Spectral tilts also degraded quality, and the effects were similar for positive and negative tilts. Ripples and/or tilts degraded quality more when they extended over a wide frequency range (87-6981 Hz) than when they extended over subranges. Low- and mid-frequency ranges were roughly equally important for music, but the mid-range was most important for speech. For music, the highest quality was obtained for the broadband signal (55-16,854 Hz). Increasing the lower cutoff frequency from 55 Hz resulted in a clear degradation of quality, and there was also a distinct degradation as the upper cutoff frequency was decreased from 16,854 Hz. For speech, there was a marked degradation when the lower cutoff frequency was increased from 123 to 208 Hz and when the upper cutoff frequency was decreased from 10,869 Hz. Typical telephone bandwidth (313 to 3547 Hz) gave very poor quality.

17.
Effect of spectral envelope smearing on speech reception. I.
The effect of reduced spectral contrast on the speech-reception threshold (SRT) for sentences in noise and on phoneme identification was investigated with 16 normal-hearing subjects. Signal processing was performed by smoothing the envelope of the squared short-time fast Fourier transform (FFT) through convolution with a Gaussian-shaped filter, followed by overlap-add reconstruction of a continuous signal. Spectral energy in the frequency region from 100 to 8000 Hz was smeared over bandwidths of 1/8, 1/4, 1/3, 1/2, 1, 2, and 4 oct for the SRT experiment. Vowel and consonant identification was studied for smearing bandwidths of 1/8, 1/2, and 2 oct. Results showed the SRT in noise to increase as spectral energy was smeared over bandwidths exceeding the ear's critical bandwidth. Vowel identification suffered more from this type of processing than consonant identification. Vowels were primarily confused with the back vowels /ɔ, u/, and consonant confusions mainly involved place of articulation.
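A simplified version of the smearing operation (a fixed Gaussian bandwidth in Hz rather than the paper's octave-scaled, frequency-proportional smearing; assumes SciPy):

```python
import numpy as np
from scipy.signal import stft, istft
from scipy.ndimage import gaussian_filter1d

def smear_spectral_envelope(x, fs, smear_hz=250.0, nfft=1024):
    """Smooth the short-time power spectrum across frequency with a Gaussian,
    keep the original phases, and rebuild the waveform by overlap-add."""
    f, _, Z = stft(x, fs=fs, nperseg=nfft, noverlap=nfft // 2)
    power = np.abs(Z) ** 2
    sigma_bins = smear_hz / (f[1] - f[0])              # Gaussian width in FFT bins
    smeared = gaussian_filter1d(power, sigma_bins, axis=0)
    Z_smeared = np.sqrt(smeared) * np.exp(1j * np.angle(Z))
    _, y = istft(Z_smeared, fs=fs, nperseg=nfft, noverlap=nfft // 2)
    return y
```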

18.
To determine the minimum difference in amplitude between spectral peaks and troughs sufficient for vowel identification by normal-hearing and hearing-impaired listeners, four vowel-like complex sounds were created by summing the first 30 harmonics of a 100-Hz tone. The amplitudes of all harmonics were equal, except for two consecutive harmonics located at each of three "formant" locations. The amplitudes of these harmonics were equal to each other and ranged from 1 to 8 dB above the remaining components. Normal-hearing listeners achieved greater than 75% accuracy when peak-to-trough differences were 1-2 dB. Normal-hearing listeners who were tested in a noise background sufficient to raise their thresholds to the level of a flat, moderate hearing loss needed a 4-dB difference for identification. Listeners with a moderate, flat hearing loss required a 6- to 7-dB difference for identification. The results suggest, for normal-hearing listeners, that the peak-to-trough amplitude difference required for identification of this set of vowels is very near the threshold for detection of a change in the amplitude spectrum of a complex signal. Hearing-impaired listeners may have difficulty using closely spaced formants for vowel identification due to abnormal smoothing of the internal representation of the spectrum by broadened auditory filters.
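These vowel-like complexes are straightforward to reconstruct; a sketch with illustrative harmonic numbers for the three "formant" locations (the actual harmonic positions are not given in the abstract):

```python
import numpy as np

def vowel_like_complex(formant_harmonics=((5, 6), (10, 11), (22, 23)),
                       peak_db=4.0, f0=100.0, fs=16000, dur=0.5):
    """Sum of the first 30 harmonics of a 100-Hz tone; the two consecutive harmonics
    at each 'formant' location are raised by `peak_db` dB relative to the rest."""
    t = np.arange(int(fs * dur)) / fs
    peaks = {h for pair in formant_harmonics for h in pair}
    x = np.zeros_like(t)
    for k in range(1, 31):
        amp = 10.0 ** (peak_db / 20.0) if k in peaks else 1.0
        x += amp * np.sin(2.0 * np.pi * k * f0 * t)
    return x / np.max(np.abs(x))
```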

19.
The acoustic change complex (ACC) is a scalp-recorded negative-positive voltage swing elicited by a change during an otherwise steady-state sound. The ACC was obtained from eight adults in response to changes of amplitude and/or spectral envelope at the temporal center of a three-formant synthetic vowel lasting 800 ms. In the absence of spectral change, the group mean waveforms showed a clear ACC to amplitude increments of 2 dB or more and decrements of 3 dB or more. In the presence of a change of second formant frequency (from perceived /u/ to perceived /i/), amplitude increments increased the magnitude of the ACC but amplitude decrements had little or no effect. The fact that the just detectable amplitude change is close to the psychoacoustic limits of the auditory system augurs well for the clinical application of the ACC. The failure to find a condition under which the spectrally elicited ACC is diminished by a small change of amplitude supports the conclusion that the observed ACC to a change of spectral envelope reflects some aspect of cortical frequency coding. Taken together, these findings support the potential value of the ACC as an objective index of auditory discrimination capacity.  相似文献   

20.
Small-scale ripples arising from intensity or phase modulation of an incident high-power laser beam undergo strong self-focusing after propagation through a nonlinear medium. Once the nonlinear phase delay (the B-integral) exceeds a certain value, the near-field intensity distribution of the beam is severely degraded and filamentation damage can even result. Numerical simulations and experiments were used to study the threshold conditions under which a beam carrying a given intensity modulation develops self-focusing filamentation in a neodymium-glass medium, as well as the general behavior of nonlinear propagation. Such simulations can provide the D-B (ΔB) criterion used in the design of high-power laser systems.
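For reference, the nonlinear phase delay (B-integral) mentioned above is conventionally defined as the accumulated intensity-dependent phase along the propagation path (a standard definition, not a formula quoted from the paper):

```latex
B = \frac{2\pi}{\lambda} \int_{0}^{L} n_{2}\, I(z)\, \mathrm{d}z
```

where \lambda is the wavelength, n_2 the nonlinear refractive index of the medium, I(z) the local intensity, and L the path length through the medium.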
