Similar literature
 Found 20 similar documents; search took 31 ms
1.
Spectral sharpness and vowel dissimilarity   (Total citations: 1; self-citations: 0; citations by others: 1)
The effect of sharpening or smoothing the spectral envelopes of synthetic vowel-like sounds on the dissimilarities perceived among these sounds was investigated by means of triadic comparisons. When a spectral envelope (dB on a log-frequency scale) is considered the sum of a series of sinusoidal spectral modulations (or ripples) of different densities (the ripple spectrum), spectral sharpening or smoothing can be described as an amplification or attenuation of a part of the original ripple spectrum. For a set of nine sounds comprising different degrees of spectral sharpening of a single vowel, the perceived dissimilarities were found to be dominated by a specific part of the ripple spectrum, i.e., by spectral modulations with a density of about 2 ripples/oct. The possible role of lateral suppression in relation to this dominant region is discussed. For a set of 18 sounds comprising six vowels, each in three different versions (sharpened, normal, or smoothed), the dissimilarities were found to be determined mainly by the global shape of the spectral envelopes, i.e., by spectral modulations up to about 1.5-2 ripples/oct. Details of the spectral envelope (including the region of 2 ripples/oct where lateral suppression is effective) appear to be of minor influence on vowel dissimilarities.
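
The ripple-spectrum view of sharpening and smoothing described in this abstract can be sketched in a few lines. This is only an illustration, not the authors' procedure: it assumes the envelope is sampled uniformly on a log-frequency axis, treats it as periodic so that a plain discrete Fourier transform can serve as the ripple spectrum, and uses the hypothetical function names `ripple_spectrum` and `scale_ripple_band`.

```python
import cmath
import math


def ripple_spectrum(envelope_db):
    """Discrete Fourier transform of an envelope sampled on a log-frequency axis."""
    n = len(envelope_db)
    return [
        sum(envelope_db[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n)) / n
        for k in range(n)
    ]


def scale_ripple_band(envelope_db, octaves, low, high, gain):
    """Amplify (gain > 1) or attenuate (gain < 1) ripple components whose
    density in ripples/oct lies in [low, high]; `octaves` is the span of the envelope."""
    n = len(envelope_db)
    spectrum = ripple_spectrum(envelope_db)
    for k in range(1, n):  # bin 0 is the mean level and is left untouched
        cycles = min(k, n - k)  # positive and negative bins share one density
        density = cycles / octaves
        if low <= density <= high:
            spectrum[k] *= gain
    # Inverse transform back to an envelope in dB
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)).real
        for j in range(n)
    ]
```

With a gain above 1 on the band around 2 ripples/oct the envelope is sharpened in that region; a gain below 1 smooths it.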

2.
Reports using a variety of psychophysical tasks indicate that pitch perception by hearing-impaired listeners may be abnormal, contributing to difficulties in understanding speech and enjoying music. Pitches of complex sounds may be weaker and more indistinct in the presence of cochlear damage, especially when frequency regions are affected that form the strongest basis for pitch perception in normal-hearing listeners. In this study, the strength of the complex pitch generated by iterated rippled noise was assessed in normal-hearing and hearing-impaired listeners. Pitch strength was measured for broadband noises with spectral ripples generated by iteratively delaying a copy of a given noise and adding it back into the original. Octave-band-pass versions of these noises also were evaluated to assess frequency dominance regions for rippled-noise pitch. Hearing-impaired listeners demonstrated consistently weaker pitches in response to the rippled noises relative to pitch strength in normal-hearing listeners. However, in most cases, the frequency regions of pitch dominance, i.e., strongest pitch, were similar to those observed in normal-hearing listeners. Except where there exists a substantial sensitivity loss, contributions from normal pitch dominance regions associated with the strongest pitches may not be directly related to impaired spectral processing. It is suggested that the reduced strength of rippled-noise pitch in listeners with hearing loss results from impaired frequency resolution and possibly an associated deficit in temporal processing.
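
The delay-and-add construction of iterated rippled noise mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not specify the network configuration, so the code assumes the output of each stage is delayed and added to itself, uses Gaussian noise and a zero-padded delay, and the names `iterated_rippled_noise` and `normalized_autocorrelation` are hypothetical.

```python
import random


def iterated_rippled_noise(n_samples, delay_samples, iterations, gain=1.0, seed=0):
    """Gaussian noise passed `iterations` times through a delay-and-add stage."""
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for _ in range(iterations):
        # Zero-padded delayed copy of the current signal
        delayed = [0.0] * delay_samples + signal[: n_samples - delay_samples]
        signal = [s + gain * d for s, d in zip(signal, delayed)]
    return signal


def normalized_autocorrelation(signal, lag):
    """Autocorrelation at a positive lag, normalized by the zero-lag value."""
    numerator = sum(a * b for a, b in zip(signal[:-lag], signal[lag:]))
    denominator = sum(a * a for a in signal)
    return numerator / denominator
```

The height of the autocorrelation peak at the delay grows with the number of iterations, which is one commonly used correlate of pitch strength for such stimuli.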

3.
Depth resolution of spectral ripples was measured in normal humans using a phase-reversal test. The principle of the test was to find the lowest ripple depth at which an interchange of peak and trough position (the phase reversal) in the rippled spectrum is detectable. Using this test, ripple-depth thresholds were measured as a function of ripple density of octave-band rippled noise at center frequencies from 0.5 to 8 kHz. The ripple-depth threshold in the power domain was around 0.2 at low ripple densities of 4-5 relative units (center-frequency-to-ripple-spacing ratio) or 3-3.5 ripples/oct. The threshold increased with increasing ripple density. It reached the highest possible level of 1.0 at ripple density from 7.5 relative units at 0.5 kHz center frequency to 14.3 relative units at 8 kHz (5.2 to 10.0 ripples/oct, respectively). The interrelation between the ripple depth threshold and ripple density can be satisfactorily described by transfer of the signal by frequency-tuned auditory filters.
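
A rippled power spectrum and its phase-reversed counterpart, as used in the test described above, might be written as in the sketch below. It assumes a cosine ripple on a log-frequency axis with depth expressed in the power domain (0 to 1); the exact ripple shape used in the study is not stated in the abstract, and the function names are hypothetical. The conversion between ripples/oct and relative units follows from the small-spacing approximation that one ripple at density D spans about f·ln 2 / D Hz, which roughly reproduces the paired values quoted in the abstract (for example, 5.2 ripples/oct and 7.5 relative units).

```python
import math


def rippled_power(freq_hz, center_hz, ripples_per_octave, depth, reversed_phase=False):
    """Relative power at freq_hz for a cosine ripple on a log2-frequency axis.
    Reversing the phase by pi interchanges peaks and troughs."""
    octaves_from_center = math.log2(freq_hz / center_hz)
    phase = math.pi if reversed_phase else 0.0
    return 1.0 + depth * math.cos(
        2.0 * math.pi * ripples_per_octave * octaves_from_center + phase
    )


def relative_density(ripples_per_octave):
    """Approximate center-frequency-to-ripple-spacing ratio for a given density."""
    return ripples_per_octave / math.log(2.0)
```

At any frequency the standard and reversed spectra sum to 2, so the two stimuli differ only in where the peaks and troughs fall, not in overall power.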

4.
Spectral ripple discrimination thresholds were measured in 15 cochlear-implant users with broadband (350-5600 Hz) and octave-band noise stimuli. The results were compared with spatial tuning curve (STC) bandwidths previously obtained from the same subjects. Spatial tuning curve bandwidths did not correlate significantly with broadband spectral ripple discrimination thresholds but did correlate significantly with ripple discrimination thresholds when the rippled noise was confined to an octave-wide passband, centered on the STC's probe electrode frequency allocation. Ripple discrimination thresholds were also measured for octave-band stimuli in four contiguous octaves, with center frequencies from 500 Hz to 4000 Hz. Substantial variations in thresholds with center frequency were found in individuals, but no general trends of increasing or decreasing resolution from apex to base were observed in the pooled data. Neither ripple nor STC measures correlated consistently with speech measures in noise and quiet in the sample of subjects in this study. Overall, the results suggest that spectral ripple discrimination measures provide a reasonable measure of spectral resolution that correlates well with more direct, but more time-consuming, measures of spectral resolution, but that such measures do not always provide a clear and robust predictor of performance in speech perception tasks.

5.
Sound localization allows humans and animals to determine the direction of objects to seek or avoid and indicates the appropriate position to direct visual attention. Interaural time differences (ITDs) and interaural level differences (ILDs) are two primary cues that humans use to localize or lateralize sound sources. There is limited information about behavioral cue sensitivity in animals, especially animals with poor sound localization acuity and small heads, like budgerigars. ITD and ILD thresholds were measured behaviorally in dichotically listening budgerigars equipped with headphones in an identification task. Budgerigars were less sensitive than humans and cats, and more similar to rabbits, barn owls, and monkeys, in their abilities to lateralize dichotic signals. Threshold ITDs were relatively constant for pure tones below 4 kHz, and were immeasurable at higher frequencies. Threshold ILDs were relatively constant over a wide range of frequencies, similar to humans. Thresholds in both experiments were best for broadband noise stimuli. These lateralization results are generally consistent with the free field localization abilities of these birds, and add support to the idea that budgerigars may be able to enhance their cues to directional hearing (e.g., via connected interaural pathways) beyond what would be expected based on head size.

6.

Background  

In the field of auditory neuroscience, much research has focused on the neural processes underlying human sound localization. A recent magnetoencephalography (MEG) study investigated localization-related brain activity by measuring the N1m event-related response originating in the auditory cortex. It was found that the dynamic range of the right-hemispheric N1m response, defined as the mean difference in response magnitude between contralateral and ipsilateral stimulation, reflects cortical activity related to the discrimination of horizontal sound direction. Interestingly, the results also suggested that the presence of realistic spectral information within horizontally located spatial sounds resulted in a larger right-hemispheric N1m dynamic range. Spectral cues being predominant at high frequencies, the present study further investigated the issue by removing frequencies from the spatial stimuli with low-pass filtering. This resulted in a stepwise elimination of direction-specific spectral information. Interaural time and level differences were kept constant. The original, unfiltered stimuli were broadband noise signals presented from five frontal horizontal directions and binaurally recorded for eight human subjects with miniature microphones placed in each subject's ear canals. Stimuli were presented to the subjects during MEG registration and in a behavioral listening experiment.

7.
Ripple-spectrum stimuli were used to investigate the scale of spectral detail used by listeners in interpreting spectral cues for vertical-plane localization. In three experiments, free-field localization judgments were obtained for 250-ms, 0.6-16-kHz noise bursts with log-ripple spectra that varied in ripple density, peak-to-trough depth, and phase. When ripple density was varied and depth was held constant at 40 dB, listeners' localization error rates increased most (relative to rates for flat-spectrum targets) for densities of 0.5-2 ripples/oct. When depth was varied and density was held constant at 1 ripple/oct, localization accuracy was degraded only for ripple depths ≥ 20 dB. When phase was varied and density was held constant at 1 ripple/oct and depth at 40 dB, three of five listeners made errors at consistent locations unrelated to the ripple phase, whereas two listeners made errors at locations systematically modulated by ripple phase. Although the reported upper limit for ripple discrimination is 10 ripples/oct [Supin et al., J. Acoust. Soc. Am. 106, 2800-2804 (1999)], present results indicate that details finer than 2 ripples/oct or coarser than 0.5 ripples/oct do not strongly influence processing of spectral cues for sound localization. The low spectral-frequency limit suggests that broad-scale spectral variation is discounted, even though components at this scale are among those contributing the most to the shapes of directional transfer functions.

8.
The influence of pinnae-based spectral cues on sound localization   (Total citations: 1; self-citations: 0; citations by others: 1)
The role of pinnae-based spectral cues was investigated by requiring listeners to locate sound, binaurally, in the horizontal plane with and without partial occlusion of their external ears. The main finding was that the high frequencies were necessary for optimal performance. When the stimulus contained the higher audio frequencies, e.g., broadband and 4.0-kHz high-pass noise, localization accuracy was significantly superior to that recorded for stimuli consisting only of the lower frequencies (4.0- and 1.0-kHz low-pass noise). This finding was attributed to the influence of the spectral cues furnished by the pinnae, for when the stimulus composition included high frequencies, pinnae occlusion resulted in a marked decline in localization accuracy. Numerous front-rear reversals occurred. Moreover, the ability to distinguish among sounds originating within the same quadrant also suffered. Performance proficiency for the low-pass stimuli was not further degraded under conditions of pinnae occlusion. In locating the 4.0-kHz high-pass noise when both, neither, or only one ear was occluded, the data demonstrated unequivocally that the pinna-based cues of the "near" ear contributed powerfully toward localization accuracy.

9.
Detection and discrimination of spectral peaks and notches at 1 and 8 kHz   (Total citations: 1; self-citations: 0; citations by others: 1)
The ability of subjects to detect and discriminate spectral peaks and notches in noise stimuli was determined for center frequencies fc of 1 and 8 kHz. The signals were delivered using an insert earphone designed to produce a flat frequency response at the eardrum for frequencies up to 14 kHz. In experiment I, subjects were required to distinguish a broadband reference noise with a flat spectrum from a noise with either a peak or a notch at fc. The threshold peak height or notch depth was determined as a function of bandwidth of the peak or notch (0.125, 0.25, or 0.5 times fc). Thresholds increased with decreasing bandwidth, particularly for the notches. In experiment II, subjects were required to detect an increase in the height of a spectral peak or a decrease in the depth of a notch as a function of bandwidth. Performance was worse for notches than for peaks, particularly at narrow bandwidths. For both experiments I and II, randomizing (roving) the overall level of the stimuli had little effect at 1 kHz, but tended to impair performance at 8 kHz, particularly for notches. Experiments III-VI measured thresholds for detecting changes in center frequency of sinusoids, bands of noise, and spectral peaks or notches in a broadband background. Thresholds were lowest for the sinusoids and highest for the peaks and notches. The width of the bands, peaks, or notches had only a small effect on thresholds. For the notches at 8 kHz, thresholds for detecting glides in center frequency were lower than thresholds for detecting a difference in center frequency between two steady sounds. Randomizing the overall level of the stimuli made frequency discrimination of the sinusoids worse, but had little or no effect for the noise stimuli. In all six experiments, performance was generally worse at 8 kHz than at 1 kHz. The results are discussed in terms of their implications for the detectability of spectral cues introduced by the pinnae.

10.
Closants, or consonantlike sounds in infant vocalizations, were described acoustically using 16-kHz spectrograms and LPC or FFT analyses based on waveforms sampled at 20 or 40 kHz. The two major closant types studied were fricatives and trills. Compared to similar fricative sounds in adult speech, the fricative sounds of the 3-, 6-, 9-, and 12-month-old infants had primary spectral components at higher frequencies, i.e., up to and above 14 kHz. Trill rate varied from 16 to 180 Hz with a mean of about 100 Hz, approximately four times the mean trill rate reported for adult talkers. Acoustic features are described for various places of articulation for fricatives and trills. The discussion of the data emphasizes dimensions of acoustic contrast that appear in infant vocalizations during the first year of life, and implications of the spectral data for auditory and motor self-stimulation by normal-hearing and hearing-impaired infants.

11.
The auditory system calibrates to reliable properties of a listening environment in ways that enhance sensitivity to less predictable (more informative) aspects of sounds. These reliable properties may be spectrally local (e.g., peaks) or global (e.g., gross tilt), but the time course over which the auditory system registers and calibrates to these properties is unknown. Understanding temporal properties of this perceptual calibration is essential for revealing underlying mechanisms that serve to increase sensitivity to changing and informative properties of sounds. Relative influence of the second formant (F2) and spectral tilt was measured for identification of /u/ and /i/ following precursor contexts that were harmonic complexes with frequency-modulated resonances. Precursors filtered to match F2 or tilt of following vowels induced perceptual calibration (diminished influence) to F2 and tilt, respectively. Calibration to F2 was greatest for shorter duration precursors (250 ms), which implicates physiologic and/or perceptual mechanisms that are sensitive to onsets. In contrast, calibration to tilt was greatest for precursors with longer durations and higher repetition rates because greater opportunities to sample the spectrum result in more stable estimates of long-term global spectral properties. Possible mechanisms that promote sensitivity to change are discussed.

12.
The spectral properties of a complex stimulus (rippled noise) were varied over time, and listeners were asked to discriminate between this stimulus and a flat-spectrum, stationary noise. The spacing between the spectral peaks of rippled noise was changed sinusoidally as a function of time, or the location of the spectral peaks of rippled noise was moved up and down the spectrum as a sinusoidal function of time. In most conditions, listeners were able to make the discriminations up to rates of temporal modulation of 5-10 cycles per second. Beyond 5-10 cps the rippled noise with the temporally varying peaks was indiscriminable from a flat (nonrippled) noise. The results suggest that for temporal changes in the spectral peaks of rippled noise, listeners cannot monitor the output of a single (or small number of) auditory channel(s) (critical bands), or that the mechanism used to extract the perceptual information from these stimuli is slow. Temporal variations in the spectral properties of rippled noise may relate to temporal changes in the repetition pitch of complex sounds, the temporal properties of the coloration added to sound in a reverberant environment, and the nature of spectral peak changes such as those that occur in speech-formant transitions. The results are relevant to the general issue of the auditory system's ability to extract information from a complex spectral profile.

13.
Prolonged listening to a pulse train with repetition rates around 100 Hz induces a striking aftereffect, whereby subsequently presented sounds are heard with an unusually "metallic" timbre [Rosenblith et al., Science 106, 333-335 (1947)]. The mechanisms responsible for this auditory aftereffect are currently unknown. Whether the aftereffect is related to an alteration of the perception of temporal envelope fluctuations was evaluated. Detection thresholds for sinusoidal amplitude modulation (AM) imposed onto noise-burst carriers were measured for different AM frequencies (50-500 Hz), following the continuous presentation of a periodic pulse train, a temporally jittered pulse train, or an unmodulated noise. AM detection thresholds for AM frequencies of 100 Hz and above were significantly elevated compared to thresholds in quiet, following the presentation of the pulse-train inducers, and both induced a subjective auditory aftereffect. Unmodulated noise, which produced no audible aftereffect, left AM detection thresholds unchanged. Additional experiments revealed that, like the Rosenblith et al. aftereffect, the effect on AM thresholds does not transfer across ears, is not eliminated by protracted training, and can last several tens of seconds. The results suggest that the Rosenblith et al. aftereffect is related to a temporary alteration in the perception of fast temporal envelope fluctuations in sounds.
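
Sinusoidal amplitude modulation of a carrier, the stimulus manipulation whose detection thresholds are measured above, can be illustrated as below. This is a generic sketch rather than the study's stimulus code: the function name `apply_sam`, the sine starting phase, and the absence of any level compensation are assumptions.

```python
import math


def apply_sam(carrier, sample_rate_hz, am_frequency_hz, depth, phase=0.0):
    """Multiply a carrier by the envelope 1 + depth * sin(2*pi*fm*t + phase).
    depth = 0 leaves the carrier unmodulated; depth = 1 is full modulation."""
    return [
        sample
        * (1.0 + depth * math.sin(2.0 * math.pi * am_frequency_hz * n / sample_rate_hz + phase))
        for n, sample in enumerate(carrier)
    ]
```

In a detection task the carrier would be a noise burst, and the depth would be varied to find the smallest value the listener can distinguish from the unmodulated case.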

14.
Noise is an important theoretical constraint on the evolution of signal form and sensory performance. In order to determine environmental constraints on the communication of two freshwater gobies Padogobius martensii and Gobius nigricans, numerous noise spectra were measured from quiet areas and ones adjacent to waterfalls and rapids in two shallow stony streams. Propagation of goby sounds and waterfall noise was also measured. A quiet window around 100 Hz is present in many noise spectra from noisy locations. The window lies between two noise sources, a low-frequency one attributed to turbulence, and a high-frequency one (200-500 Hz) attributed to bubble noise from water breaking the surface. Ambient noise from a waterfall (frequencies below 1 kHz) attenuates as much as 30 dB between 1 and 2 m, after which values are variable without further attenuation (i.e., buried in the noise floor). Similarly, courtship sounds of P. martensii attenuate as much as 30 dB between 5 and 50 cm. Since gobies are known to court in noisy as well as quiet locations in these streams, their acoustic communication system (sounds and auditory system) must be able to cope with short-range propagation dictated by shallow depths and ambient noise in noisy locations.

15.
Spectral peak resolution was investigated in normal hearing (NH), hearing impaired (HI), and cochlear implant (CI) listeners. The task involved discriminating between two rippled noise stimuli in which the frequency positions of the log-spaced peaks and valleys were interchanged. The ripple spacing was varied adaptively from 0.13 to 11.31 ripples/octave, and the minimum ripple spacing at which a reversal in peak and trough positions could be detected was determined as the spectral peak resolution threshold for each listener. Spectral peak resolution was best, on average, in NH listeners, poorest in CI listeners, and intermediate for HI listeners. There was a significant relationship between spectral peak resolution and both vowel and consonant recognition in quiet across the three listener groups. The results indicate that the degree of spectral peak resolution required for accurate vowel and consonant recognition in quiet backgrounds is around 4 ripples/octave, and that spectral peak resolution poorer than around 1-2 ripples/octave may result in highly degraded speech recognition. These results suggest that efforts to improve spectral peak resolution for HI and CI users may lead to improved speech recognition.

16.
The nature of the neural processing underlying the extraction of pitch information from harmonic complex sounds is still unclear. Electrophysiological studies in the auditory nerve and many psychophysical and modeling studies suggest that pitch might be extracted successfully by applying a mechanism like autocorrelation to the temporal discharge patterns of auditory-nerve fibers. The current modeling study investigates the possible role of populations of sustained chopper (Chop-S) units located in the mammalian ventral cochlear nucleus (VCN) in this process. First, it is shown that computer simulations can predict responses to periodic and quasiperiodic sounds of individual Chop-S units recorded in the guinea-pig VCN. Second, it is shown that the fundamental period of a periodic or quasiperiodic sound is represented in the first-order, interspike interval statistics of a population of simulated Chop-S units. This is true across a wide range of characteristic frequencies when the chopping rate is equal to the f0 of the sound. The model was able to simulate the results of psychophysical studies involving the pitch height and pitch strength of iterated ripple noise, the dominance region of pitch, the effect of phase on pitch height and pitch strength, pitch of inharmonic stimuli, and of sinusoidally amplitude modulated noise. Simulation results indicate that changes in the interspike interval statistics of populations of Chop-S units compare well with changes in the pitch perceived by humans. It is proposed that Chop-S units in the ventral cochlear nucleus may play an important role in pitch extraction: They can convert a purely temporal pitch code as observed in the auditory nerve into a temporal place code of pitch in populations of cochlear-nucleus Chop-S units with different characteristic frequencies and chopping rates. Thus, populations of cochlear-nucleus Chop-S units, together with their target units presumably located in the inferior colliculus, may serve to establish a stable rate-place code of pitch at the level of the auditory cortex.

17.
The effect of spatial separation on the ability of human listeners to resolve a pair of concurrent broadband sounds was examined. Stimuli were presented in a virtual auditory environment using individualized outer ear filter functions. Subjects were presented with two simultaneous noise bursts that were either spatially coincident or separated (horizontally or vertically), and responded as to whether they perceived one or two source locations. Testing was carried out at five reference locations on the audiovisual horizon (0 degrees, 22.5 degrees, 45 degrees, 67.5 degrees, and 90 degrees azimuth). Results from experiment 1 showed that at more lateral locations, a larger horizontal separation was required for the perception of two sounds. The reverse was true for vertical separation. Furthermore, it was observed that subjects were unable to separate stimulus pairs if they delivered the same interaural differences in time (ITD) and level (ILD). These findings suggested that the auditory system exploited differences in one or both of the binaural cues to resolve the sources, and could not use monaural spectral cues effectively for the task. In experiments 2 and 3, separation of concurrent noise sources was examined upon removal of low-frequency content (and ITDs), onset/offset ITDs, both of these in conjunction, and all ITD information. While onset and offset ITDs did not appear to play a major role, differences in ongoing ITDs were robust cues for separation under these conditions, including those in the envelopes of high-frequency channels.

18.
Spectro-temporal modulation transfer functions and speech intelligibility   (Total citations: 6; self-citations: 0; citations by others: 6)
Detection thresholds for spectral and temporal modulations are measured using broadband spectra with sinusoidally rippled profiles that drift up or down the log-frequency axis at constant velocities. Spectro-temporal modulation transfer functions (MTFs) are derived as a function of ripple peak density (Ω cycles/octave) and drifting velocity (ω Hz). The MTFs exhibit a low-pass function with respect to both dimensions, with 50% bandwidths of about 16 Hz and 2 cycles/octave. The data replicate (as special cases) previously measured purely temporal MTFs (Ω = 0) [Viemeister, J. Acoust. Soc. Am. 66, 1364-1380 (1979)] and purely spectral MTFs (ω = 0) [Green, in Auditory Frequency Selectivity (Plenum, Cambridge, 1986), pp. 351-359]. A computational auditory model is presented that exhibits spectro-temporal MTFs consistent with the salient trends in the data. The model is used to demonstrate the potential relevance of these MTFs to the assessment of speech intelligibility in noise and reverberant conditions.
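
A drifting sinusoidal ripple profile of the kind described in this abstract can be written as a function of time and log-frequency position. The sketch below is illustrative only: the sine form, the sign convention for drift direction, and the names `moving_ripple_profile`, `velocity_hz`, and `density_cyc_per_oct` are assumptions, not details given in the abstract.

```python
import math


def moving_ripple_profile(t_s, x_oct, velocity_hz, density_cyc_per_oct, depth=1.0, phase=0.0):
    """Linear-amplitude spectral profile at time t_s (seconds) and position x_oct
    (octaves above the lowest component) for a sinusoidal ripple that drifts
    along the log-frequency axis at a constant velocity."""
    argument = 2.0 * math.pi * (velocity_hz * t_s + density_cyc_per_oct * x_oct) + phase
    return 1.0 + depth * math.sin(argument)
```

Setting the density to zero gives a profile that is flat across frequency and fluctuates only in time, and setting the velocity to zero gives a stationary ripple, which correspond to the purely temporal and purely spectral special cases mentioned in the abstract.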

19.
The fidelity of reproducing free-field sounds using a virtual auditory display was investigated in two experiments. In the first experiment, listeners directly compared stimuli from an actual loudspeaker in the free field with those from small headphones placed in front of the ears. Headphone stimuli were filtered using head-related transfer functions (HRTFs), recorded while listeners were wearing the headphones, in order to reproduce the pressure signatures of the free-field sounds at the eardrum. Discriminability was investigated for six sound-source positions using broadband noise as a stimulus. The results show that the acoustic percepts of real and virtual sounds were identical. In the second experiment, discrimination between virtual sounds generated with measured and interpolated HRTFs was investigated. Interpolation was performed using HRTFs measured for loudspeaker positions with different spatial resolutions. Broadband noise bursts with flat and scrambled spectra were used as stimuli. The results indicate that, for a spatial resolution of about 6 degrees, the interpolation does not introduce audible cues. For resolutions of 20 degrees or more, the interpolation introduces audible cues related to timbre and position. For intermediate resolutions (10-15 degrees) the data suggest that only timbre cues were used.

20.
A set of experiments was conducted to examine the loudness of sounds with temporally asymmetric amplitude envelopes. Envelopes were generated with fast-attack/slow-decay characteristics to produce F-S (or "fast-slow") stimuli, while temporally reversed versions of these same envelopes produced corresponding S-F ("slow-fast") stimuli. For sinusoidal (330-6000 Hz) and broadband noise carriers, S-F stimuli were louder than F-S stimuli of equal energy. The magnitude of this effect was sensitive to stimulus order, with the largest differences between F-S and S-F loudness occurring after exposure to a preceding F-S stimulus. These results are not compatible with automatic gain control, power-spectrum models of loudness, or predictions obtained using the auditory image model [Patterson et al., J. Acoust. Soc. Am. 98, 1890-1894 (1995)]. Rather, they are comparable to phenomena of perceptual constancy, and may be related to the parsing of auditory input into direct and reverberant sound.
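
The relation between F-S and S-F envelopes, one being the time reversal of the other and therefore carrying equal energy, can be sketched as follows. The envelope shape here (a linear attack followed by an exponential decay) and the names `fast_slow_envelope` and `energy` are assumptions for illustration; the abstract does not give the exact envelope functions used.

```python
import math


def fast_slow_envelope(n_samples, attack_samples, decay_time_constant):
    """Fast linear attack to 1.0 followed by a slow exponential decay.
    decay_time_constant is expressed in samples."""
    envelope = []
    for n in range(n_samples):
        if n < attack_samples:
            envelope.append((n + 1) / attack_samples)
        else:
            envelope.append(math.exp(-(n - attack_samples + 1) / decay_time_constant))
    return envelope


def energy(values):
    """Sum of squared sample values."""
    return sum(v * v for v in values)


fs_envelope = fast_slow_envelope(1000, 20, 200.0)  # F-S: fast attack, slow decay
sf_envelope = fs_envelope[::-1]                    # S-F: the same envelope reversed in time
```

Because reversal only reorders the samples, the two envelopes have the same energy and the same long-term power spectrum when applied to a stationary carrier, which is why a loudness difference between them is informative.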

Copyright©北京勤云科技发展有限公司  京ICP备09084417号