Similar literature
1.
Within an auditory channel, the speech waveform contains both temporal envelope (E_O) and temporal fine structure (TFS) information. Vocoder processing extracts a modified version of the temporal envelope (E') within each channel and uses it to modulate a channel carrier. The resulting signal, E'_Carr, has reduced information content compared to the original "E_O + TFS" signal. The dynamic range over which listeners make additional use of E_O + TFS over E'_Carr cues was investigated in a competing-speech task. The target-and-background mixture was processed using a 30-channel vocoder. In each channel, E_O + TFS replaced E'_Carr at either the peaks or the valleys of the signal. The replacement decision was based on comparing the short-term channel level to a parametrically varied "switching threshold," expressed relative to the long-term channel level. Intelligibility was measured as a function of switching threshold, carrier type, target-to-background ratio, and replacement method. Scores showed a dependence on all four parameters. Derived intensity-importance functions (IIFs) showed that E_O + TFS information from 8-13 dB below to 10 dB above the channel long-term level was important. When E_O + TFS information was added at the peaks, IIFs peaked around -2 dB, but when E_O + TFS information was added at the valleys, the peaks lay around +1 dB.

2.
The precedence effect (PE) describes the ability to localize a direct, leading sound correctly when its delayed copy (lag) is present, though not separately audible. The relative contribution of binaural cues in the temporal fine structure (TFS) of lead-lag signals was compared to that of interaural level differences (ILDs) and interaural time differences (ITDs) carried in the envelope. In a localization dominance paradigm participants indicated the spatial location of lead-lag stimuli processed with a binaural noise-band vocoder whose noise carriers introduced random TFS. The PE appeared for noise bursts of 10 ms duration, indicating dominance of envelope information. However, for three test words the PE often failed even at short lead-lag delays, producing two images, one toward the lead and one toward the lag. When interaural correlation in the carrier was increased, the images appeared more centered, but often remained split. Although previous studies suggest dominance of TFS cues, no image is lateralized in accord with the ITD in the TFS. An interpretation in the context of auditory scene analysis is proposed: By replacing the TFS with that of noise the auditory system loses the ability to fuse lead and lag into one object, and thus to show the PE.  相似文献   

3.
This study measured the role of spectral details and temporal envelope (E) and fine structure (TFS) cues in reconstructing sentences from speech fragments. Four sets of sentences were processed using a 32-band vocoder. Twenty-one bands were either processed or removed, leading to sentences differing in their amount of spectral details, E and TFS information. These sentences remained perfectly intelligible, but intelligibility fell significantly after the introduction of periodic 120-ms silent gaps. While the role of E was unclear, the results unambiguously showed that TFS cues and spectral details influence the ability to reconstruct interrupted sentences.

4.
Previous studies have demonstrated that normal-hearing listeners can understand speech using the recovered "temporal envelopes," i.e., amplitude modulation (AM) cues from frequency modulation (FM). This study evaluated this mechanism in cochlear implant (CI) users for consonant identification. Stimuli containing only FM cues were created using 1, 2, 4, and 8-band FM-vocoders to determine if consonant identification performance would improve as the recovered AM cues become more available. A consistent improvement was observed as the band number decreased from 8 to 1, supporting the hypothesis that (1) the CI sound processor generates recovered AM cues from broadband FM, and (2) CI users can use the recovered AM cues to recognize speech. The correlation between the intact and the recovered AM components at the output of the sound processor was also generally higher when the band number was low, supporting the consonant identification results. Moreover, CI subjects who were better at using recovered AM cues from broadband FM cues showed better identification performance with intact (unprocessed) speech stimuli. This suggests that speech perception performance variability in CI users may be partly caused by differences in their ability to use AM cues recovered from FM speech cues.  相似文献   

5.
The intelligibility of sentences processed to remove temporal envelope information, as far as possible, was assessed. Sentences were filtered into N analysis channels, and each channel signal was divided by its Hilbert envelope to remove envelope information but leave temporal fine structure (TFS) intact. Channel signals were combined to give TFS speech. The effect of adding low-level low-noise noise (LNN) to each channel signal before processing was assessed. The addition of LNN reduced the amplification of low-level signal portions that contained large excursions in instantaneous frequency, and improved the intelligibility of simple TFS speech sentences, but not more complex sentences. It also reduced the time needed to reach a stable level of performance. The recovery of envelope cues by peripheral auditory filtering was investigated by measuring the intelligibility of 'recovered-envelope speech', formed by filtering TFS speech with an array of simulated auditory filters, and using the envelopes at the output of these filters to modulate sinusoids with frequencies equal to the filter center frequencies (i.e., tone vocoding). The intelligibility of TFS speech and recovered-envelope speech fell as N increased, although TFS speech was still highly intelligible for values of N for which the intelligibility of recovered-envelope speech was low.  相似文献   
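The channel-wise processing described in this abstract (split into N analysis channels, divide each channel signal by its Hilbert envelope, sum the channels) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the brick-wall FFT band-pass filter, the band edges, and the small `eps` guard against division by zero are all assumptions, and the low-noise-noise step is omitted.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT; its magnitude is the Hilbert envelope."""
    n = len(x)
    spec = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spec * weights)

def bandpass_fft(x, fs, lo, hi):
    """Brick-wall band-pass filter (assumption: stands in for the real analysis filters)."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    spec[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spec, n=len(x))

def tfs_speech(x, fs, edges, eps=1e-8):
    """Divide each channel by its Hilbert envelope, then sum the channels."""
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = bandpass_fft(x, fs, lo, hi)
        env = np.abs(analytic_signal(band))
        out += band / (env + eps)  # envelope removed, fine structure kept
    return out
```

For example, `tfs_speech(x, 16000, [100, 500, 1500, 4000])` would process a signal `x` sampled at 16 kHz in three hypothetical channels. Within a single channel the output is the cosine of the instantaneous phase, so its magnitude never exceeds 1.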

6.
The intelligibility of speech signals processed to retain either temporal envelope (E) or fine structure (TFS) cues within 16 0.4-oct-wide frequency bands was evaluated when processed stimuli were periodically interrupted at different rates. The interrupted E- and TFS-coded stimuli were highly intelligible in all conditions. However, the different patterns of results obtained for E- and TFS-coded speech suggest that the two types of stimuli do not convey identical speech cues. When an effect of interruption rate was observed, the effect occurred at low interruption rates (<8 Hz) and was stronger for E- than TFS-coded speech, suggesting larger involvement of modulation masking with E-coded speech.  相似文献   

7.
Speech reception thresholds (SRTs) were measured with a competing talker background for signals processed to contain variable amounts of temporal fine structure (TFS) information, using nine normal-hearing and nine hearing-impaired subjects. Signals (speech and background talker) were bandpass filtered into channels. Channel signals for channel numbers above a "cut-off channel" (CO) were vocoded to remove TFS information, while channel signals for channel numbers of CO and below were left unprocessed. Signals from all channels were combined. As a group, hearing-impaired subjects benefited less than normal-hearing subjects from the additional TFS information that was available as CO increased. The amount of benefit varied between hearing-impaired individuals, with some showing no improvement in SRT and one showing an improvement similar to that for normal-hearing subjects. The reduced ability to take advantage of TFS information in speech may partially explain why subjects with cochlear hearing loss get less benefit from listening in a fluctuating background than normal-hearing subjects. TFS information may be important in identifying the temporal "dips" in such a background.  相似文献   

8.
The fused low pitch evoked by complex tones containing only unresolved high-frequency components demonstrates the ability of the human auditory system to extract pitch using a temporal mechanism in the absence of spectral cues. However, the temporal features used by such a mechanism have been a matter of debate. For stimuli with components lying exclusively in high-frequency spectral regions, the slowly varying temporal envelope of sounds is often assumed to be the only information contained in auditory temporal representations, and it has remained controversial to what extent the fast amplitude fluctuations, or temporal fine structure (TFS), of the conveyed signal can be processed. Using a pitch matching paradigm, the present study found that the low pitch of inharmonic transposed tones with unresolved components was consistent with the timing between the most prominent TFS maxima in their waveforms, rather than envelope maxima. Moreover, envelope cues did not take over as the absolute frequency or rank of the lowest component was raised and TFS cues thus became less effective. Instead, the low pitch became less salient. This suggests that complex pitch perception does not rely on envelope coding as such, and that TFS representation might persist at higher frequencies than previously thought.  相似文献   

9.
Frequency modulation detection limens (FMDLs) were measured for five hearing-impaired (HI) subjects for carrier frequencies f_c = 1000, 4000, and 6000 Hz, using modulation frequencies f_m = 2 and 10 Hz and levels of 20 dB sensation level and 90 dB SPL. FMDLs were smaller for f_m = 10 than for f_m = 2 Hz for the two higher f_c, but not for f_c = 1000 Hz. FMDLs were also determined with additional random amplitude modulation (AM), to disrupt excitation-pattern cues. The disruptive effect was larger for f_m = 10 than for f_m = 2 Hz. The smallest disruption occurred for f_m = 2 Hz and f_c = 1000 Hz. AM detection thresholds for normal-hearing and HI subjects were measured for the same f_c and f_m values. Performance was better for the HI subjects for both f_m. AM detection was much better for f_m = 10 than for f_m = 2 Hz. Additional tests showed that most HI subjects could discriminate temporal fine structure (TFS) at 800 Hz. The results are consistent with the idea that, for f_m = 2 Hz and f_c = 1000 Hz, frequency modulation (FM) detection was partly based on the use of TFS information. For higher carrier frequencies and for all carrier frequencies with f_m = 10 Hz, FM detection was probably based on place cues.

10.
This study investigated the ability to use temporal-envelope (E) cues in a consonant identification task when presented within one or two frequency bands. Syllables were split into five bands spanning the range 70-7300 Hz with each band processed to preserve E cues and degrade temporal fine-structure cues. Identification scores were measured for normal-hearing listeners in quiet for individual processed bands and for pairs of bands. Consistent patterns of results were obtained in both the single- and dual-band conditions: identification scores increased systematically with band center frequency, showing that E cues in the higher bands (1.8-7.3 kHz) convey greater information.  相似文献   

11.
The present study assessed the relative contribution of the "target" and "masker" temporal fine structure (TFS) when identifying consonants. Accordingly, the TFS of the target and that of the masker were manipulated simultaneously or independently. A 30-band vocoder was used to replace the original TFS of the stimuli with tones. Four masker types were used: a speech-shaped noise, a speech-shaped noise modulated by a speech envelope, a sentence, and a sentence played backward. When the TFS of the target and that of the masker were disrupted simultaneously, consonant recognition dropped significantly compared to the unprocessed condition for all masker types, except the speech-shaped noise. Disruption of only the target TFS led to a significant drop in performance with all masker types. In contrast, disruption of only the masker TFS had no effect on recognition. Overall, the present data are consistent with previous work showing that TFS information plays a significant role in speech recognition in noise, especially when the noise fluctuates over time. However, the present study indicates that listeners rely primarily on TFS information in the target and that the nature of the masker TFS has a very limited influence on the outcome of the unmasking process.

12.
Tone languages differ from English in that the pitch pattern of a single-syllable word conveys lexical meaning. In the present study, dependence of tonal-speech perception on features of the stimulation was examined using an acoustic simulation of a CIS-type speech-processing strategy for cochlear prostheses. Contributions of spectral features of the speech signals were assessed by varying the number of filter bands, while contributions of temporal envelope features were assessed by varying the low-pass cutoff frequency used for extracting the amplitude envelopes. Ten normal-hearing native Mandarin Chinese speakers were tested. When the low-pass cutoff frequency was fixed at 512 Hz, consonant, vowel, and sentence recognition improved as a function of the number of channels and reached plateau at 4 to 6 channels. Subjective judgments of sound quality continued to improve as the number of channels increased to 12, the highest number tested. Tone recognition, i.e., recognition of the four Mandarin tone patterns, depended on both the number of channels and the low-pass cutoff frequency. The trade-off between the temporal and spectral cues for tone recognition indicates that temporal cues can compensate for diminished spectral cues for tone recognition and vice versa. An additional tone recognition experiment using syllables of equal duration showed a marked decrease in performance, indicating that duration cues contribute to tone recognition. A third experiment showed that recognition of processed FM patterns that mimic Mandarin tone patterns was poor when temporal envelope and duration cues were removed.  相似文献   

13.
Three experiments were designed to provide psychophysical evidence for the existence of envelope information in the temporal fine structure (TFS) of stimuli that were originally amplitude modulated (AM). The original stimuli typically consisted of the sum of a sinusoidally AM tone and two unmodulated tones so that the envelope and TFS could be determined a priori. Experiment 1 showed that normal-hearing listeners not only perceive AM when presented with the Hilbert fine structure alone but AM detection thresholds are lower than those observed when presenting the original stimuli. Based on our analysis, envelope recovery resulted from the failure of the decomposition process to remove the spectral components related to the original envelope from the TFS and the introduction of spectral components related to the original envelope, suggesting that frequency- to amplitude-modulation conversion is not necessary to recover envelope information from TFS. Experiment 2 suggested that these spectral components interact in such a way that envelope fluctuations are minimized in the broadband TFS. Experiment 3 demonstrated that the modulation depth at the original carrier frequency is only slightly reduced compared to the depth of the original modulator. It also indicated that envelope recovery is not specific to the Hilbert decomposition.  相似文献   

14.
It has been proposed that the detection of frequency modulation (FM) of sinusoidal carriers can be mediated by two mechanisms: a place mechanism based on FM-induced amplitude modulation (AM) in the excitation pattern, and a temporal mechanism based on phase locking in the auditory nerve. The temporal mechanism appears to be "sluggish" and does not play a role for FM rates above about 10 Hz. It also does not play a role for high carrier frequencies (above about 5 kHz). This experiment provided a further test of the hypothesis that the effectiveness of the temporal mechanism depends upon the time spent close to frequency extremes during the modulation cycle. Psychometric functions for the detection of AM and FM were measured for two carrier frequencies, 1 and 6 kHz. The modulation waveform was quasitrapezoidal. Within each modulation period, P, a time T_ss was spent at each extreme of frequency or amplitude. The transitions between the extremes, with duration T_trans, had the form of a half-cycle of a cosine function. The modulation rate was 2, 5, 10, or 20 Hz, giving values of P of 500, 200, 100, and 50 ms. T_ss varied from 0 ms (sinusoidal modulation) up to 160, 80, 40, or 20 ms, for rates of 2, 5, 10, and 20 Hz, respectively. The detectability of AM was not greatly affected by modulation rate or by the value of T_ss, except for a slight improvement with increasing T_ss for the lowest modulation rates; this was true for both carrier frequencies. For FM of the 6-kHz carrier, the pattern of results was similar to that found for AM, which is consistent with an excitation-pattern model of FM detection. For FM of the 1-kHz carrier, performance improved markedly with increasing T_ss, especially for the lower FM rates; there was no change in performance with T_ss for the 20-Hz modulation rate. The results are consistent with the idea that detection of FM of a 1-kHz carrier is partly mediated by a sluggish temporal mechanism. That mechanism benefits from greater time spent at frequency extremes of the modulation cycle for rates up to 10 Hz.
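The quasitrapezoidal modulator described in this abstract can be sketched as below. The relation P = 2(T_ss + T_trans), meaning one steady state at each extreme plus two transitions per period, is an inference from the abstract and not something it states; the function name and the normalisation of the waveform to the range -1 to +1 are likewise assumptions.

```python
import numpy as np

def quasi_trapezoid(t, rate_hz, t_ss):
    """Modulator in [-1, 1]: hold each extreme for t_ss seconds, with
    half-cycle cosine transitions in between. t is a 1-D array of times (s)."""
    period = 1.0 / rate_hz
    half = period / 2.0
    t_trans = half - t_ss  # assumed: P = 2 * (T_ss + T_trans)
    if t_trans <= 0:
        raise ValueError("t_ss must be shorter than half the period")
    phase = np.mod(np.asarray(t, dtype=float), period)
    m = np.empty_like(phase)
    top = phase < t_ss
    fall = (phase >= t_ss) & (phase < half)
    bottom = (phase >= half) & (phase < half + t_ss)
    rise = phase >= half + t_ss
    m[top] = 1.0
    m[fall] = np.cos(np.pi * (phase[fall] - t_ss) / t_trans)
    m[bottom] = -1.0
    m[rise] = -np.cos(np.pi * (phase[rise] - half - t_ss) / t_trans)
    return m
```

With `t_ss = 0` the waveform reduces to a cosine at the modulation rate, matching the abstract's note that T_ss = 0 ms corresponds to sinusoidal modulation. At 2 Hz with `t_ss = 0.16`, 64% of each 500-ms period is spent at the extremes.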

15.
Previous studies have assessed the importance of temporal fine structure (TFS) for speech perception in noise by comparing the performance of normal-hearing listeners in two conditions. In one condition, the stimuli have useful information in both their temporal envelopes and their TFS. In the other condition, stimuli are vocoded and contain useful information only in their temporal envelopes. However, these studies have confounded differences in TFS with differences in the temporal envelope. The present study manipulated the analytic signal of stimuli to preserve the temporal envelope between conditions with different TFS. The inclusion of informative TFS improved speech-reception thresholds for sentences presented in steady and modulated noise, demonstrating that there are significant benefits of including informative TFS even when the temporal envelope is controlled. It is likely that the results of previous studies largely reflect the benefits of TFS, rather than uncontrolled effects of changes in the temporal envelope.  相似文献   

16.
A common problem when applying Raman scattering in applied research is spectral interference from laser‐induced fluorescence. Extensive work has been invested in developing spectral and polarization filters as well as modulation schemes to refine spontaneous Raman signals. This current work, however, focuses on utilizing the temporal domain using a picosecond laser system and ICCD cameras with relatively short decay of the camera gate to prevent the fluorescence tail from being captured in Raman experiments. Further, the approach of using an ICCD camera to perform temporal filtering is compared to earlier proposed detection schemes using streak cameras or Kerr gates. The temporal‐filtering scheme is evaluated in a spectroscopic investigation where a background subtraction algorithm is presented. The temporal‐filtering scheme was also evaluated for Raman imaging of a levitated water droplet surrounded by fluorescing toluene vapor. Furthermore, the temporal‐filter detection scheme was simulated in order to provide straightforward evaluation tools to estimate the potential of performing temporal filtering with a laser/camera system considering: laser‐pulse duration, time jitter, camera‐gate characteristics, gate delay times, fluorescence lifetimes, and relative signal strength between the Raman and fluorescence signal. The fluorescence signal was modeled with a closed two‐level system, and the simulated results were compared to results from an investigation of the rising slope of toluene fluorescence. These evaluation tools and experimental investigations may serve as guidelines for planning and performing Raman measurements in situations where traditional filter‐rejection schemes are insufficient. Copyright © 2013 John Wiley & Sons, Ltd.

17.
Speech recognition with altered spectral distribution of envelope cues.   Total citations: 8 (self-citations: 0; citations by others: 8)
Recognition of consonants, vowels, and sentences was measured in conditions of reduced spectral resolution and distorted spectral distribution of temporal envelope cues. Speech materials were processed through four bandpass filters (analysis bands), half-wave rectified, and low-pass filtered to extract the temporal envelope from each band. The envelope from each speech band modulated a band-limited noise (carrier bands). Analysis and carrier bands were manipulated independently to alter the spectral distribution of envelope cues. Experiment I demonstrated that the location of the cutoff frequencies defining the bands was not a critical parameter for speech recognition, as long as the analysis and carrier bands were matched in frequency extent. Experiment II demonstrated a dramatic decrease in performance when the analysis and carrier bands did not match in frequency extent, which resulted in a warping of the spectral distribution of envelope cues. Experiment III demonstrated a large decrease in performance when the carrier bands were shifted in frequency, mimicking the basal position of electrodes in a cochlear implant. And experiment IV showed a relatively minor effect of the overlap in the noise carrier bands, simulating the overlap in neural populations responding to adjacent electrodes in a cochlear implant. Overall, these results show that, for four bands, the frequency alignment of the analysis bands and carrier bands is critical for good performance, while the exact frequency divisions and overlap in carrier bands are not as critical.  相似文献   
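The processing chain in this abstract (band-pass analysis filters, half-wave rectification, low-pass filtering to obtain the envelope, then modulation of a band-limited noise carrier) can be sketched as follows. The sketch takes separate analysis and carrier band edges so that the two can be mismatched, as in the experiments. The brick-wall FFT filters, the 160 Hz envelope cutoff, and all names are assumptions; the abstract does not give these details.

```python
import numpy as np

def fft_filter(x, fs, lo, hi):
    """Brick-wall filter keeping lo <= f < hi (assumption: stands in for real filters)."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    spec[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spec, n=len(x))

def noise_vocoder(x, fs, analysis_edges, carrier_edges, env_cutoff=160.0, seed=0):
    """Envelope from each analysis band modulates noise in the paired carrier band."""
    rng = np.random.default_rng(seed)
    analysis = list(zip(analysis_edges[:-1], analysis_edges[1:]))
    carriers = list(zip(carrier_edges[:-1], carrier_edges[1:]))
    out = np.zeros(len(x))
    for (a_lo, a_hi), (c_lo, c_hi) in zip(analysis, carriers):
        band = fft_filter(x, fs, a_lo, a_hi)
        rectified = np.maximum(band, 0.0)                   # half-wave rectification
        env = fft_filter(rectified, fs, 0.0, env_cutoff)    # low-pass -> envelope
        env = np.maximum(env, 0.0)                          # clip filter ringing below zero
        noise = rng.standard_normal(len(x))
        carrier = fft_filter(noise, fs, c_lo, c_hi)         # band-limited noise carrier
        out += env * carrier
    return out
```

Passing the same edges for both arguments gives the matched condition. Shifting `carrier_edges` upward relative to `analysis_edges` gives a frequency-shifted condition of the kind described for Experiment III.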

18.
The present study examined the relative influence of the off- and on-frequency spectral components of modulated and unmodulated maskers on consonant recognition. Stimuli were divided into 30 contiguous equivalent rectangular bandwidths. The temporal fine structure (TFS) in each "target" band was either left intact or replaced with tones using vocoder processing. Recognition scores for 10, 15 and 20 target bands randomly located in frequency were obtained in quiet and in the presence of all 30 masker bands, only the off-frequency masker bands, or only the on-frequency masker bands. The amount of masking produced by the on-frequency bands was generally comparable to that produced by the broadband masker. However, the difference between these two conditions was often significant, indicating an influence of the off-frequency masker bands, likely through modulation interference or spectral restoration. Although vocoder processing systematically led to poorer consonant recognition scores, the deficit observed in noise could often be attributed to that observed in quiet. These data indicate that (i) speech recognition is affected by the off-frequency components of the background and (ii) the nature of the target TFS does not systematically affect speech recognition in noise, especially when energetic masking and/or the number of target bands is limited.

19.
Two synthetic vowels /i/ and /ae/ with a fundamental frequency of 100 Hz served as maskers for brief (5 or 15 ms) sinusoidal signals. Threshold was measured as a function of signal frequency, for signals presented immediately following the masker (forward masking, FM) or just before the cessation of the masker (simultaneous masking, SM). Three different overall masker levels were used: 50, 70, and 90 dB SPL. In order to compare the data from simultaneous and forward masking, and to compensate for the nonlinear characteristics of forward masking, each signal threshold was expressed as the level of a flat-spectrum noise which would give the same masking. The internal representation of the formant structure of the vowels, as inferred from the transformed masking patterns, was enhanced in FM and "blurred" in SM in comparison to the physical spectra, suggesting that suppression plays a role in enhancing spectral contrasts. The first two or three formants were usually visible in the masking patterns and the representation of the formant structure was impaired only slightly at high masker levels. For high levels, filtering out the relatively intense low-frequency components enhanced the representation of the higher formants in FM but not in SM, indicating a broadly tuned remote suppression from lower formants towards higher ones. The relative phase of the components in the masker had no effect on thresholds in forward masking, indicating that the detailed temporal structure of the masker waveform is not important.  相似文献   

20.
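The nuclear-signal readout method based on optical phase modulation modulates the detector signal into an optical fiber and uses the fiber as the transmission medium for the analog signal. In this readout scheme, the modulation driver module is responsible for generating and amplifying the carrier signal and is an important part of the scheme's readout electronics. To generate a carrier signal with low phase noise and a large, adjustable amplitude, this work proposes a design that combines a carrier generation circuit based on a phase-locked loop with a carrier amplification circuit based on an MMIC RF amplifier. The design is structurally simple, compact, and performs well. The loop filter of the carrier generation circuit was designed and simulated with the ADIsimPLL simulation software, the carrier amplification circuit was designed and simulated with the ADS simulation software, and the circuits were tested under laboratory conditions. The test results show that the phase noise of the 26 dBm output carrier signal is better than –110 dBc/Hz at 100 kHz, so the signal can be used for demodulation.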
