20 similar documents found; search time: 0 ms
1.
2.
Previous research has identified a "synchrony window" of several hundred milliseconds over which auditory-visual (AV) asynchronies are not reliably perceived. Individual variability in the size of this AV synchrony window has been linked with variability in AV speech perception measures, but it was not clear whether AV speech perception measures are related to synchrony detection for speech only or for both speech and nonspeech signals. An experiment was conducted to investigate the relationship between measures of AV speech perception and AV synchrony detection for speech and nonspeech signals. Variability in AV synchrony detection for both speech and nonspeech signals was found to be related to variability in measures of auditory-only (A-only) and AV speech perception, suggesting that temporal processing for both speech and nonspeech signals must be taken into account in explaining variability in A-only and multisensory speech perception.
3.
Whalen DH, Benson RR, Richardson M, Swainson B, Clark VP, Lai S, Mencl WE, Fulbright RK, Constable RT, Liberman AM. The Journal of the Acoustical Society of America 2006, 119(1):575-581
Primary auditory cortex (PAC), located in Heschl's gyrus (HG), is the earliest cortical level at which sounds are processed. Standard theories of speech perception assume that signal components are given representations in PAC that are then matched to speech templates in auditory association cortex. An alternative holds that speech activates a specialized system in cortex that does not use the primitives of PAC. Functional magnetic resonance imaging revealed different brain activation patterns in listening to speech and nonspeech sounds across different levels of complexity. Sensitivity to speech was observed in association cortex, as expected. Further, activation in HG increased with increasing levels of complexity with added fundamentals for both nonspeech and speech stimuli, but only for nonspeech when separate sources (release bursts/fricative noises or their nonspeech analogs) were added. These results are consistent with the existence of a specialized speech system which bypasses more typical processes at the earliest cortical level.
4.
Holt LL. The Journal of the Acoustical Society of America 2006, 119(6):4016-4026
The extent to which context influences speech categorization can inform theories of pre-lexical speech perception. Across three conditions, listeners categorized speech targets preceded by speech context syllables. These syllables were presented as the sole context or paired with nonspeech tone contexts previously shown to affect speech categorization. Listeners' context-dependent categorization across these conditions provides evidence that speech and nonspeech context stimuli jointly influence speech processing. Specifically, when the spectral characteristics of speech and nonspeech context stimuli are mismatched such that they are expected to produce opposing effects on speech categorization, the influence of nonspeech contexts may undermine, or even reverse, the expected effect of adjacent speech context. Likewise, when spectrally matched, the cross-class contexts may collaborate to increase effects of context. Similar effects are observed even when natural speech syllables, matched in source to the speech categorization targets, serve as the speech contexts. Results are well-predicted by spectral characteristics of the context stimuli.
5.
Diehl RL, Walsh MA, Kluender KR. The Journal of the Acoustical Society of America 1991, 89(6):2905-2909
Fowler [J. Acoust. Soc. Am. 88, 1236-1249 (1990)] makes a set of claims on the basis of which she denies the general interpretability of experiments that compare the perception of speech sounds to the perception of acoustically analogous nonspeech sounds. She also challenges a specific auditory hypothesis offered by Diehl and Walsh [J. Acoust. Soc. Am. 85, 2154-2164 (1989)] to explain the stimulus-length effect in the perception of stops and glides. It will be argued that her conclusions are unwarranted.
6.
Relations among temporal acuity, hearing loss, and the perception of speech distorted by noise and reverberation [cited by: 1 (self: 0, others: 1)]
Eight normal listeners and eight listeners with sensorineural hearing losses were compared on a gap-detection task and on a speech perception task. The minimum detectable gap (71% correct) was determined as a function of noise level, and a time constant was computed from these data for each listener. The time constants of the hearing-impaired listeners were significantly longer than those of the normal listeners. The speech consisted of sentences that were mixed with two levels of noise and subjected to two kinds of reverberation (real or simulated). The speech thresholds (minimum signal-to-noise ratio for 50% correct) were significantly higher for the hearing-impaired listeners than for the normal listeners for both kinds of reverberation. The longer reverberation times produced significantly higher thresholds than the shorter times. The time constant was significantly correlated with all the speech threshold measures (r = -0.58 to -0.74) and a measure of hearing threshold loss also correlated significantly with all the speech thresholds (r = 0.53 to 0.95). A principal components analysis yielded two factors that accounted for the intercorrelations. The factor loadings for the time constant were similar to those on the speech thresholds for real reverberation and the loadings for hearing loss were similar to those of the thresholds for simulated reverberation.
7.
Individual differences in the processing of speech and nonspeech sounds by normal-hearing listeners.
While a large portion of the variance among listeners in speech recognition is associated with the audibility of components of the speech waveform, it is not possible to predict individual differences in the accuracy of speech processing strictly from the audiogram. This has suggested that some of the variance may be associated with individual differences in spectral or temporal resolving power, or acuity. Psychoacoustic measures of spectral-temporal acuity with nonspeech stimuli have been shown, however, to correlate only weakly (or not at all) with speech processing. In a replication and extension of an earlier study [Watson et al., J. Acoust. Soc. Am. Suppl. 1, 71, S73 (1982)], 93 normal-hearing college students were tested on speech perception tasks (nonsense syllables, words, and sentences in a noise background) and on six spectral-temporal discrimination tasks using simple and complex nonspeech sounds. Factor analysis showed that the abilities that explain performance on the nonspeech tasks are quite distinct from those that account for performance on the speech tasks. Performance was significantly correlated among speech tasks and among nonspeech tasks. Either (a) auditory spectral-temporal acuity for nonspeech sounds is orthogonal to speech processing abilities, or (b) the appropriate tasks or types of nonspeech stimuli that challenge the abilities required for speech recognition have yet to be identified.
8.
Durlach NI, Mason CR, Gallun FJ, Shinn-Cunningham B, Colburn HS, Kidd G. The Journal of the Acoustical Society of America 2005, 118(4):2482-2497
Sensitivity d' and response bias beta were measured as a function of target level for the detection of a 1000-Hz tone in multitone maskers using a one-interval, two-alternative forced-choice (1I-2AFC) paradigm. Ten such maskers, each with eight randomly selected components in the region 200-5000 Hz, with 800-1250 Hz excluded to form a protected zone, were presented under two conditions: the fixed condition, in which the same eight-component masker is used throughout an experimental run, and the random condition, in which an eight-component masker is chosen randomly trial-to-trial from the given set of ten such maskers. Differences between the results obtained with these two conditions help characterize the listener's susceptibility to informational masking (IM). The d' results show great intersubject variability, but can be reasonably well fit by simple energy-detector models in which internal noise and filter bandwidth are used as fitting parameters. In contrast, the beta results are not well fit by these models. In addition to presentation of new data and its relation to energy-detector models, this paper provides comments on a variety of issues, problems, and research needs in the IM area.
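Under the equal-variance Gaussian model that such energy-detector analyses assume, d' and beta can be recovered from a listener's hit and false-alarm rates. A minimal sketch (the function name and the example rates are illustrative, not taken from the study):

```python
from statistics import NormalDist
import math

def dprime_beta(hit_rate, fa_rate):
    """Sensitivity d' and bias beta from hit/false-alarm rates,
    assuming equal-variance Gaussian signal and noise distributions."""
    z = NormalDist().inv_cdf          # inverse standard-normal CDF
    zh, zf = z(hit_rate), z(fa_rate)
    dprime = zh - zf
    # beta is the likelihood ratio at the decision criterion
    beta = math.exp((zf ** 2 - zh ** 2) / 2.0)
    return dprime, beta
```

For symmetric performance such as `dprime_beta(0.84, 0.16)`, d' is close to 2 and beta is exactly 1 (an unbiased criterion).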
9.
Espinoza-Varas B. The Journal of the Acoustical Society of America 1983, 74(6):1687-1694
This study presents a psychoacoustic analysis of the integration of spectral and temporal cues in the discrimination of simple nonspeech sounds. The experimental task was a same-different discrimination between a standard and a comparison pair of tones. Each pair consists of two 80-ms, 1500-Hz tone bursts separated by a 60-ms interval. The just-discriminable (d' = 2.0) increment in duration delta t, of one of the bursts was measured as a function of increments in the frequency delta f, of the other burst. A trade-off between the values of delta t and delta f required to perform at d' = 2.0 was observed, which suggests that listeners integrate the evidence from the two dimensions. Integration occurred with both sub- and supra-threshold values of delta t or delta f, regardless of the order in which the cues were presented. The performance associated with the integration of cues was found to be determined by the discriminability of delta t plus that of delta f, and thus, it is within the psychophysical limits of auditory processing. To a first approximation, the results agreed with the prediction of orthogonal vector summation of evidence stemming from signal detection theory. It is proposed that the ability to integrate spectral and temporal cues is in the repertoire of auditory processing capabilities. This integration does not appear to depend on perceiving sounds as members of phonetic classes.
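The orthogonal vector-summation prediction from signal detection theory says that two statistically independent cues combine as the Euclidean norm of their individual sensitivities. A minimal sketch (function name illustrative):

```python
import math

def combined_dprime(dprime_dt, dprime_df):
    """Orthogonal vector summation of evidence from two independent
    cues: d'_combined = sqrt(d'_dt**2 + d'_df**2)."""
    return math.hypot(dprime_dt, dprime_df)
```

Two individually sub-threshold cues of roughly d' = 1.4 each would combine to about d' = 2.0, the discriminability criterion used in the study.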
10.
Bhattacharya A, Vandali A, Zeng FG. The Journal of the Acoustical Society of America 2011, 130(5):2951-2960
The present study examined the effect of combined spectral and temporal enhancement on speech recognition by cochlear-implant (CI) users in quiet and in noise. The spectral enhancement was achieved by expanding the short-term Fourier amplitudes in the input signal. Additionally, a variation of the Transient Emphasis Spectral Maxima (TESM) strategy was applied to enhance the short-duration consonant cues that are otherwise suppressed when processed with spectral expansion. Nine CI users were tested on phoneme recognition tasks and ten CI users were tested on sentence recognition tasks both in quiet and in steady, speech-spectrum-shaped noise. Vowel and consonant recognition in noise were significantly improved with spectral expansion combined with TESM. Sentence recognition improved with both spectral expansion and spectral expansion combined with TESM. The amount of improvement varied with individual CI users. Overall, the present results suggest that customized processing is needed to optimize performance according to not only individual users but also listening conditions.
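Expansion of short-term Fourier amplitudes can be sketched as raising each frame's magnitude spectrum to an exponent greater than one, which deepens spectral contrast (peaks grow relative to valleys). The exponent, frame handling, and energy renormalisation below are illustrative choices, not the study's processing parameters:

```python
import numpy as np

def spectral_expand(frame, alpha=1.5):
    """Expand the short-term Fourier amplitudes of one windowed frame.
    alpha > 1 sharpens spectral peaks; phase is left untouched and the
    frame is resynthesised by inverse FFT."""
    spec = np.fft.rfft(frame)
    mag, phase = np.abs(spec), np.angle(spec)
    expanded = mag ** alpha
    # renormalise so the frame keeps roughly its original energy
    norm = np.linalg.norm(mag) / max(np.linalg.norm(expanded), 1e-12)
    return np.fft.irfft(expanded * norm * np.exp(1j * phase), n=len(frame))
```

On a two-tone frame the ratio of the strong to the weak spectral component grows after expansion, which is the intended contrast enhancement.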
11.
Recent evidence suggests that spectral change, as measured by cochlea-scaled entropy (CSE), predicts speech intelligibility better than the information carried by vowels or consonants in sentences. Motivated by this finding, the present study investigates whether intelligibility indices implemented to include segments marked with significant spectral change better predict speech intelligibility in noise than measures that include all phonetic segments paying no attention to vowels/consonants or spectral change. The prediction of two intelligibility measures [normalized covariance measure (NCM), coherence-based speech intelligibility index (CSII)] is investigated using three sentence-segmentation methods: relative root-mean-square (RMS) levels, CSE, and traditional phonetic segmentation of obstruents and sonorants. While the CSE method makes no distinction between spectral changes occurring within vowels/consonants, the RMS-level segmentation method places more emphasis on the vowel-consonant boundaries wherein the spectral change is often most prominent, and perhaps most robust, in the presence of noise. Higher correlation with intelligibility scores was obtained when including sentence segments containing a large number of consonant-vowel boundaries than when including segments with highest entropy or segments based on obstruent/sonorant classification. These data suggest that in the context of intelligibility measures the type of spectral change captured by the measure is important.
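The core of the NCM is, per frequency band, the normalized covariance between the clean and degraded temporal envelopes. This single-band sketch omits the band filtering, apparent-SNR mapping, and band-importance weighting of the full measure (function name illustrative):

```python
import numpy as np

def normalized_covariance(clean_env, degraded_env):
    """Normalized covariance r between the clean and degraded temporal
    envelopes of one band: r = cov(x, y) / (std(x) * std(y))."""
    x = np.asarray(clean_env, dtype=float)
    y = np.asarray(degraded_env, dtype=float)
    x = x - x.mean()
    y = y - y.mean()
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))
```

An undistorted envelope gives r = 1; additive distortion pulls r below 1, and the full NCM maps r per band to a transmission index before averaging across bands.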
12.
Context is important for recovering language information from talker-induced variability in acoustic signals. In tone perception, previous studies reported similar effects of speech and nonspeech contexts in Mandarin, supporting a general perceptual mechanism underlying tone normalization. However, no supportive evidence was obtained in Cantonese, also a tone language. Moreover, no study has compared speech and nonspeech contexts in the multi-talker condition, which is essential for exploring the normalization mechanism of inter-talker variability in speaking F0. The other question is whether a talker's full F0 range and mean F0 equally facilitate normalization. To answer these questions, this study examines the effects of four context conditions (speech/nonspeech × F0 contour/mean F0) in the multi-talker condition in Cantonese. Results show that raising and lowering the F0 of speech contexts change the perception of identical stimuli from mid level tone to low and high level tone, whereas nonspeech contexts only mildly increase the identification preference. This supports a speech-specific mechanism of tone normalization. Moreover, speech context with flattened F0 trajectory, which neutralizes cues of a talker's full F0 range, fails to facilitate normalization in some conditions, implying that a talker's mean F0 is less efficient for minimizing talker-induced lexical ambiguity in tone perception.
13.
14.
Elberling C, Don M, Cebulla M, Stürzebecher E. The Journal of the Acoustical Society of America 2007, 122(5):2772-2785
This study investigates the use of chirp stimuli to compensate for the cochlear traveling wave delay. The temporal dispersion in the cochlea is given by the traveling time, which in this study is estimated from latency-frequency functions obtained from (1) a cochlear model, (2) tone-burst auditory brain stem response (ABR) latencies, and (3) narrow-band ABR latencies. These latency-frequency functions are assumed to reflect the group delay of a linear system that modifies the phase spectrum of the applied stimulus. On the basis of this assumption, three chirps are constructed and evaluated in 49 normal-hearing subjects. The auditory steady-state responses to these chirps and to a click stimulus are compared at two levels of stimulation (30 and 50 dB nHL) and a rate of 90/s. The chirps give shorter detection time and higher signal-to-noise ratio than the click. The shorter detection time obtained by the chirps is equivalent to an increase in stimulus level of 20 dB or more. The results indicate that a chirp is a more efficient stimulus than a click for the recording of early auditory evoked responses in normal-hearing adults using transient sounds at a high rate of stimulation.
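The construction described here, treating the latency-frequency function as the group delay of a linear system and deriving the chirp's phase spectrum from it, can be sketched as follows. The delay function used in the accompanying test is an illustrative decreasing power-law form, not one of the study's three estimates:

```python
import numpy as np

def delay_compensating_chirp(fs, n, group_delay):
    """Build a flat-magnitude stimulus whose phase spectrum is the
    integral of a prescribed group-delay function, so each frequency
    is delayed by the prescribed amount. `group_delay` maps a vector
    of frequencies in Hz to delays in seconds."""
    f = np.fft.rfftfreq(n, d=1.0 / fs)
    tau = group_delay(f)
    # group delay tau = -dphi/domega, so phi(f) = -2*pi * int_0^f tau dnu
    # (trapezoidal integration over the FFT frequency grid)
    phi = -2.0 * np.pi * np.concatenate(
        ([0.0], np.cumsum(0.5 * (tau[1:] + tau[:-1]) * np.diff(f))))
    return np.fft.irfft(np.exp(1j * phi), n=n)
```

Because only the phase spectrum is shaped, the chirp keeps a flat magnitude spectrum, like the click it is compared against.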
15.
Eaves JM, Summerfield AQ, Kitterick PT. The Journal of the Acoustical Society of America 2011, 130(1):501-507
Previous studies have assessed the importance of temporal fine structure (TFS) for speech perception in noise by comparing the performance of normal-hearing listeners in two conditions. In one condition, the stimuli have useful information in both their temporal envelopes and their TFS. In the other condition, stimuli are vocoded and contain useful information only in their temporal envelopes. However, these studies have confounded differences in TFS with differences in the temporal envelope. The present study manipulated the analytic signal of stimuli to preserve the temporal envelope between conditions with different TFS. The inclusion of informative TFS improved speech-reception thresholds for sentences presented in steady and modulated noise, demonstrating that there are significant benefits of including informative TFS even when the temporal envelope is controlled. It is likely that the results of previous studies largely reflect the benefits of TFS, rather than uncontrolled effects of changes in the temporal envelope.
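The envelope/TFS decomposition at issue is conventionally obtained from the analytic signal; a minimal FFT-based sketch of that textbook decomposition (not the study's exact stimulus-processing chain):

```python
import numpy as np

def envelope_and_tfs(x):
    """Split a signal into temporal envelope and temporal fine
    structure (TFS) via the analytic signal, computed with an
    FFT-based Hilbert transform."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)                 # analytic-signal frequency weights
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0           # double positive frequencies
    else:
        h[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(X * h)   # negative frequencies zeroed
    envelope = np.abs(analytic)
    tfs = np.cos(np.angle(analytic))
    return envelope, tfs
```

Preserving `envelope` while swapping `tfs` between conditions is the kind of manipulation that holds the temporal envelope constant across stimuli with different fine structure.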
16.
Auditory temporal integration and the power function model [cited by: 2 (self: 0, others: 2)]
Gerken GM, Bhat VK, Hutchison-Clutter M. The Journal of the Acoustical Society of America 1990, 88(2):767-778
17.
Pichora-Fuller MK, Schneider BA, Benson NJ, Hamstra SJ, Storzer E. The Journal of the Acoustical Society of America 2006, 119(2):1143-1155
Gap detection thresholds for speech and analogous nonspeech stimuli were determined in younger and older adults with clinically normal hearing in the speech range. Gap detection thresholds were larger for older than for younger listeners in all conditions, with the size of the age difference increasing with stimulus complexity. For both ages, gap detection thresholds were far smaller when the markers before and after the gap were the same (spectrally symmetrical) compared to when they were different (spectrally asymmetrical) for both speech and nonspeech stimuli. Moreover, gap detection thresholds were smaller for nonspeech than for speech stimuli when the markers were spectrally symmetrical but the opposite was observed when the markers were spectrally asymmetrical. This pattern of results may reflect the benefit of activating well-learned gap-dependent phonemic contrasts. The stimulus-dependent age effects were interpreted as reflecting the differential effects of age-dependent losses in temporal processing ability on within- and between-channel gap detection.
18.
Auditory temporal processing was examined using a flutter-fusion paradigm in which two tones were separated by a silent interval. The listener's task was to judge when the two tones, presented in a background noise, fused perceptually. The fusion point was studied in a series of six experiments. In the first five experiments, the duration of the first stimulus (T1) was the dependent variable. In the last experiment, the duration of the second stimulus (T2) was the dependent variable. An inverse relationship was found between T1 duration and the interstimulus interval (ISI) such that, when ISI was decreased, T1 duration had to be increased to maintain fusion. When ISI was plotted as a function of T1 duration, the data were represented by a negative exponential equation. Increasing the level of the tones, increasing the bandwidth of the background noise, or presenting the stimuli dichotically lowered the duration of T1 necessary for fusion. Changing the frequency of the tones had no effect on fusion. Decreasing the duration of T2 and holding T1 constant also resulted in fusion. A neurophysiological model implicating ON and OFF neural response interactions is postulated to account for the data.
19.
Auditory and nonauditory factors affecting speech reception in noise by older listeners [cited by: 2 (self: 0, others: 2)]
George EL, Zekveld AA, Kramer SE, Goverts ST, Festen JM, Houtgast T. The Journal of the Acoustical Society of America 2007, 121(4):2362-2375
Speech reception thresholds (SRTs) for sentences were determined in stationary and modulated background noise for two age-matched groups of normal-hearing (N = 13) and hearing-impaired listeners (N = 21). Correlations were studied between the SRT in noise and measures of auditory and nonauditory performance, after which stepwise regression analyses were performed within both groups separately. Auditory measures included the pure-tone audiogram and tests of spectral and temporal acuity. Nonauditory factors were assessed by measuring the text reception threshold (TRT), a visual analogue of the SRT, in which partially masked sentences were adaptively presented. Results indicate that, for the normal-hearing group, the variance in speech reception is mainly associated with nonauditory factors, both in stationary and in modulated noise. For the hearing-impaired group, speech reception in stationary noise is mainly related to the audiogram, even when audibility effects are accounted for. In modulated noise, both auditory (temporal acuity) and nonauditory factors (TRT) contribute to explaining interindividual differences in speech reception. Age was not a significant factor in the results. It is concluded that, under some conditions, nonauditory factors are relevant for the perception of speech in noise. Further evaluation of nonauditory factors might enable adapting the expectations from auditory rehabilitation in clinical settings.
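An SRT of this kind is typically tracked adaptively: the SNR is lowered after each correctly repeated sentence and raised after each error, so the track converges on the 50%-correct point. A minimal sketch (step size, start level, and averaging rule are illustrative, not the study's procedure):

```python
def adaptive_srt(responses, start_snr=0.0, step=2.0):
    """1-up/1-down adaptive track converging on the 50%-correct SNR.
    `responses` is a sequence of booleans (True = sentence repeated
    correctly); the SRT estimate is the mean presented SNR after the
    first few warm-up trials."""
    snr, track = start_snr, []
    for correct in responses:
        track.append(snr)
        snr += -step if correct else step
    tail = track[4:]
    return sum(tail) / max(len(tail), 1)
```

A run that alternates correct and incorrect answers settles midway between the two tracked levels, as expected for a 50%-correct convergence point.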