Similar Articles
A total of 20 similar articles found (search time: 109 ms)
1.
In face-to-face speech communication, the listener extracts and integrates information from the acoustic and optic speech signals. Integration occurs within the auditory modality (i.e., across the acoustic frequency spectrum) and across sensory modalities (i.e., across the acoustic and optic signals). The difficulties experienced by some hearing-impaired listeners in understanding speech could be attributed to losses in the extraction of speech information, the integration of speech cues, or both. The present study evaluated the ability of normal-hearing and hearing-impaired listeners to integrate speech information within and across sensory modalities in order to determine the degree to which integration efficiency may be a factor in the performance of hearing-impaired listeners. Auditory-visual nonsense syllables consisting of eighteen medial consonants surrounded by the vowel [a] were processed into four nonoverlapping acoustic filter bands between 300 and 6000 Hz. A variety of one-, two-, three-, and four-filter-band combinations were presented for identification in auditory-only and auditory-visual conditions; a visual-only condition was also included. Integration efficiency was evaluated using a model of optimal integration. Results showed that normal-hearing and hearing-impaired listeners integrated information across the auditory and visual sensory modalities with a high degree of efficiency, independent of differences in auditory capabilities. However, across-frequency integration for auditory-only input was less efficient for hearing-impaired listeners. These individuals exhibited particular difficulty extracting information from the highest frequency band (4762-6000 Hz) when speech information was presented concurrently in the next lower-frequency band (1890-2381 Hz). Results suggest that integration of speech information within the auditory modality, but not across auditory and visual modalities, affects speech understanding in hearing-impaired listeners.
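As a concrete illustration of how integration efficiency can be scored, here is a minimal Python sketch using an independent-errors (probability-summation) benchmark. This benchmark is a stand-in chosen for illustration; the abstract does not specify the optimal-integration model actually used, and the scores below are hypothetical.

```python
# Minimal sketch of an independent-errors integration benchmark.
# This is NOT necessarily the optimal-integration model used in the
# study; it is the simpler probability-summation bound often used as
# a first reference point for auditory-visual integration.

def predicted_av(p_a: float, p_v: float) -> float:
    """Predicted auditory-visual proportion correct if the two
    modalities contribute independent chances to identify the token."""
    return 1.0 - (1.0 - p_a) * (1.0 - p_v)

def integration_efficiency(p_av_observed: float, p_a: float, p_v: float) -> float:
    """Ratio of observed AV performance to the independent-channels
    prediction; values near 1 suggest efficient integration."""
    return p_av_observed / predicted_av(p_a, p_v)

# Hypothetical scores, for illustration only:
p_a, p_v, p_av = 0.45, 0.30, 0.68
print(f"predicted AV: {predicted_av(p_a, p_v):.2f}")            # ~0.61
print(f"efficiency:   {integration_efficiency(p_av, p_a, p_v):.2f}")  # ~1.11
```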

2.
Cochlear implant (CI) users have been shown to benefit from residual low-frequency hearing, specifically in pitch-related tasks. It remains unclear whether this benefit is dependent on fundamental frequency (F0) or other acoustic cues. Three experiments were conducted to determine the role of F0, as well as its frequency-modulated (FM) and amplitude-modulated (AM) components, in speech recognition with a competing voice. For simulated CI listeners, the signal-to-noise ratio was varied to estimate the level yielding 50% correct responses. Simulation results showed that the F0 cue contributes a significant proportion of the benefit seen with combined acoustic and electric hearing, and additionally that this benefit is due to the FM rather than the AM component. In actual CI users, sentence recognition scores were collected with either the full F0 cue containing both the FM and AM components or the 500-Hz low-pass speech cue containing the F0 and additional harmonics. The F0 cue provided a benefit similar to the low-pass cue for speech in noise, but not in quiet. Poorer CI users benefited more from the F0 cue than better users. These findings suggest that F0 is critical to improving speech perception in noise in combined acoustic and electric hearing.
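To make the FM/AM decomposition of the F0 cue concrete, the sketch below resynthesizes a low-frequency tone in which either component can be flattened. The contour, envelope, and sampling rate are hypothetical; the study's actual stimulus processing may have differed.

```python
# Sketch: resynthesize an "F0 cue" tone whose frequency-modulated (FM)
# and amplitude-modulated (AM) components can be kept or flattened.
# Illustrative only; not the study's exact processing chain.
import numpy as np

fs = 16000
t = np.arange(0, 1.0, 1 / fs)

# Hypothetical F0 contour (Hz) and amplitude envelope for a voiced stretch.
f0 = 120 + 20 * np.sin(2 * np.pi * 3 * t)        # slow pitch movement
env = 0.5 * (1 + np.sin(2 * np.pi * 2 * t))      # slow level movement

def f0_cue(keep_fm: bool, keep_am: bool) -> np.ndarray:
    inst_f = f0 if keep_fm else np.full_like(t, f0.mean())
    phase = 2 * np.pi * np.cumsum(inst_f) / fs   # integrate frequency
    amp = env if keep_am else np.full_like(t, env.mean())
    return amp * np.sin(phase)

full_cue = f0_cue(keep_fm=True, keep_am=True)   # FM + AM
fm_only  = f0_cue(keep_fm=True, keep_am=False)  # flat amplitude
am_only  = f0_cue(keep_fm=False, keep_am=True)  # flat pitch
```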

3.
4.
Whether or not categorical perception results from the operation of a special, language-specific speech mode remains controversial. In this cross-language (Mandarin Chinese, English) study of the categorical nature of tone perception, we compared native Mandarin and English speakers' perception of a physical continuum of fundamental frequency contours ranging from a level tone to a rising tone in both Mandarin speech and a homologous (nonspeech) harmonic tone. This design permits us to evaluate the effect of language experience by comparing Chinese and English groups; to determine whether categorical perception is speech-specific or domain-general by comparing speech to nonspeech stimuli for both groups; and to examine whether categorical perception involves a separate categorical process, distinct from regions of sensory discontinuity, by comparing speech to nonspeech stimuli for English listeners. Results show evidence of strong categorical perception of speech stimuli for Chinese but not English listeners. Categorical perception of nonspeech stimuli was comparable to that for speech stimuli for Chinese listeners but weaker for English listeners, and perception of nonspeech stimuli was more categorical for English listeners than was perception of speech stimuli. These findings lead us to adopt a memory-based, multistore model of perception in which categorization is domain-general but influenced by long-term categorical representations.
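The standard quantitative test behind claims of categorical perception is that discrimination can be predicted from identification. A minimal sketch of the classic covert-labeling (Haskins-style) ABX prediction follows; whether this exact predictor was used in the study is an assumption.

```python
# Classic covert-labeling (Haskins-style) prediction of ABX
# discrimination from identification probabilities. This is the
# textbook benchmark for "categorical" discrimination; the study's
# exact predictor is not stated in the abstract.

def predicted_abx(p1: float, p2: float) -> float:
    """p1, p2: probability of labeling stimulus A / stimulus B as
    (say) 'rising tone'. Under covert labeling with guessing when
    labels agree, P(correct) = 1/2 + (p1 - p2)**2 / 2."""
    return 0.5 + 0.5 * (p1 - p2) ** 2

# Within-category pair (both usually labeled 'level'): near chance.
print(predicted_abx(0.10, 0.20))  # 0.505
# Across-boundary pair: well above chance.
print(predicted_abx(0.15, 0.90))  # 0.78125
```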

5.
Different patterns of performance across vowels and consonants in tests of categorization and discrimination indicate that vowels tend to be perceived more continuously, or less categorically, than consonants. The present experiments examined whether analogous differences in perception would arise in nonspeech sounds that share critical transient acoustic cues of consonants and steady-state spectral cues of simplified synthetic vowels. Listeners were trained to categorize novel nonspeech sounds varying along a continuum defined by a steady-state cue, a rapidly changing cue, or both cues. Listeners' categorization of stimuli varying on the rapidly changing cue showed a sharp category boundary, and post-training discrimination was well predicted from the assumption of categorical perception. Listeners more accurately discriminated but less accurately categorized steady-state nonspeech stimuli. When listeners categorized stimuli defined by both rapidly changing and steady-state cues, discrimination performance was accurate and the categorization function exhibited a sharp boundary. These data are similar to those found in experiments with dynamic vowels, which are defined by both steady-state and rapidly changing acoustic cues. A general account for the speech and nonspeech patterns is proposed based on the supposition that the perceptual trace of rapidly changing sounds decays faster than the trace of steady-state sounds.
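The proposed account can be made concrete with a toy model (purely illustrative, not the authors' formalism) in which sensory sensitivity decays exponentially over the retention interval, with a shorter time constant for rapidly changing sounds:

```python
# Toy illustration only: sensory d' decays exponentially over the
# inter-stimulus interval, with a shorter time constant for rapidly
# changing sounds than for steady-state sounds. All values hypothetical.
import math

def effective_dprime(d0: float, isi_s: float, tau_s: float) -> float:
    """Sensory sensitivity remaining after the trace decays."""
    return d0 * math.exp(-isi_s / tau_s)

d0, isi = 2.0, 0.5                                # hypothetical values
print(effective_dprime(d0, isi, tau_s=0.3))       # transient cue: ~0.38
print(effective_dprime(d0, isi, tau_s=2.0))       # steady-state:  ~1.56
```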

6.
Acoustic analyses and perception experiments were conducted to determine the effects of brief deprivation of auditory feedback on fricatives produced by cochlear implant users. The words /si/ and /Si/ were recorded by four children and four adults with their cochlear implant speech processor turned on or off. In the processor-off condition, word durations increased significantly for a majority of talkers. These increases were greater for children than for adults, suggesting that children may rely on auditory feedback to a greater extent than adults. Significant differences in spectral measures of /S/ were found between processor-on and processor-off conditions for two of the four children and for one of the four adults. These talkers also demonstrated a larger /s/-/S/ contrast in centroid values compared to the other talkers within their respective groups. This finding may indicate that talkers who produce fine spectral distinctions are able to perceive these distinctions through their implants and to use this feedback to fine-tune their speech. Two listening experiments provided evidence that some of the acoustic changes were perceptible to normal-hearing listeners. Taken together, these experiments indicate that for certain cochlear-implant users the brief absence of auditory feedback may lead to perceptible modifications in fricative consonants.
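The spectral measure at issue, the centroid, is straightforward to compute; a minimal sketch follows. The windowing and analysis parameters are assumptions, not the study's exact measurement settings.

```python
# Minimal spectral-centroid computation of the kind used to quantify
# the /s/-/S/ contrast. Window choice and preprocessing are assumptions.
import numpy as np

def spectral_centroid(frame: np.ndarray, fs: float) -> float:
    """Amplitude-weighted mean frequency (Hz) of one windowed frame."""
    windowed = frame * np.hamming(len(frame))
    mag = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1 / fs)
    return float(np.sum(freqs * mag) / np.sum(mag))

# Quick check on white noise: centroid lands near fs/4.
fs = 22050
frame = np.random.default_rng(0).standard_normal(1024)
print(round(spectral_centroid(frame, fs)))

# /s/ typically has a higher centroid than /S/ ("sh"); a shrinking
# centroid difference would indicate a reduced fricative contrast.
```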

7.
In normal-hearing listeners, musical background has been observed to change the sound representation in the auditory system and produce enhanced performance in some speech perception tests. Based on these observations, it has been hypothesized that musical background can influence cochlear-implant users' sound and speech perception and, by extension, their quality of life. To test this hypothesis, this study explored musical background [using the Dutch Musical Background Questionnaire (DMBQ)] and self-perceived sound and speech perception and quality of life [using the Nijmegen Cochlear Implant Questionnaire (NCIQ) and the Speech, Spatial and Qualities of Hearing Scale (SSQ)] in 98 postlingually deafened adult cochlear-implant recipients. In addition to self-perceived measures, speech perception scores (percentage of phonemes recognized in words presented in quiet) were obtained from patient records. The self-perceived hearing performance was associated with the objective speech perception. Forty-one respondents (44% of 94 respondents) indicated some form of formal musical training. Fifteen respondents (18% of 83 respondents) judged themselves as having musical training, experience, and knowledge. No association was observed between musical background (quantified by the DMBQ) and self-perceived hearing-related performance, quality of life (quantified by the NCIQ and SSQ), or speech perception in quiet.

8.
Dynamic-range compression acting independently at each ear in a bilateral hearing-aid or cochlear-implant fitting can alter interaural level differences (ILDs), potentially affecting spatial perception. The influence of compression on the lateral position of sounds was studied in normal-hearing listeners using virtual acoustic stimuli. In a lateralization task, listeners indicated the leftmost and rightmost extents of the auditory event and reported whether they heard (1) a single, stationary image, (2) a moving/gradually broadening image, or (3) a split image. Fast-acting compression significantly affected the perceived position of high-pass sounds. For sounds with abrupt onsets and offsets, compression shifted the entire image to a more central position. For sounds containing gradual onsets and offsets, including speech, compression increased the occurrence of moving and split images by up to 57 percentage points and increased the perceived lateral extent of the auditory event. The severity of the effects was reduced when undisturbed low-frequency binaural cues were made available. At high frequencies, listeners gave increased weight to ILDs relative to interaural time differences carried in the envelope when compression caused ILDs to change dynamically at low rates, although individual differences were apparent. Specific conditions are identified in which compression is likely to affect spatial perception.
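The basic mechanism is easy to see with a static compression curve applied independently at each ear; the threshold and ratio below are hypothetical hearing-aid-like values. Fast-acting compressors apply this kind of gain change dynamically, which is what makes ILDs vary over time.

```python
# Sketch of how independent dynamic-range compression at each ear
# shrinks an interaural level difference (ILD). Threshold and ratio
# are hypothetical values, not from the study.

def compress_db(level_db: float, threshold_db: float = 50.0,
                ratio: float = 3.0) -> float:
    """Static compression curve: above threshold, output level grows
    at 1/ratio of the input level."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

left_in, right_in = 70.0, 60.0           # 10 dB ILD favoring the left ear
left_out = compress_db(left_in)
right_out = compress_db(right_in)
print(left_out - right_out)              # 3.33 dB: the ILD shrinks threefold
```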

9.
Three experiments were conducted to study relative contributions of speaking rate, temporal envelope, and temporal fine structure to clear speech perception. Experiment I used uniform time scaling to match the speaking rate between clear and conversational speech. Experiment II decreased the speaking rate in conversational speech without processing artifacts by increasing silent gaps between phonetic segments. Experiment III created "auditory chimeras" by mixing the temporal envelope of clear speech with the fine structure of conversational speech, and vice versa. Speech intelligibility in normal-hearing listeners was measured over a wide range of signal-to-noise ratios to derive speech reception thresholds (SRT). The results showed that processing artifacts in uniform time scaling, particularly time compression, reduced speech intelligibility. Inserting gaps in conversational speech improved the SRT by 1.3 dB, but this improvement might be a result of increased short-term signal-to-noise ratios during level normalization. Data from auditory chimeras indicated that the temporal envelope cue contributed more to the clear speech advantage at high signal-to-noise ratios, whereas the temporal fine structure cue contributed more at low signal-to-noise ratios. Taken together, these results suggest that acoustic cues for the clear speech advantage are multiple and distributed.
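A sketch of the chimera construction is shown below: the Hilbert envelope of one signal is paired with the fine structure of the other, band by band. The band edges and filter order are assumptions; see Smith, Delgutte, and Oxenham (2002) for the original chimera recipe.

```python
# Sketch of "auditory chimera" construction: the temporal envelope of
# one signal rides on the temporal fine structure of another, per band.
# Band edges and filter order are assumptions for illustration.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def chimera(env_src: np.ndarray, tfs_src: np.ndarray, fs: float,
            edges=(80, 300, 800, 1800, 3600, 7000)) -> np.ndarray:
    out = np.zeros_like(env_src, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        a = hilbert(sosfiltfilt(sos, env_src))   # analytic signal, envelope donor
        b = hilbert(sosfiltfilt(sos, tfs_src))   # analytic signal, TFS donor
        # envelope of signal A on the fine structure of signal B
        out += np.abs(a) * np.cos(np.angle(b))
    return out

# e.g., clear-speech envelope on conversational-speech fine structure:
# mixed = chimera(clear, conversational, fs)    # equal-length arrays
```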

10.
Gross variations of the speech amplitude envelope, such as the duration of different segments and the gaps between them, carry information about prosody and some segmental features of vowels and consonants. The amplitude envelope is one parameter encoded by the Tickle Talker, an electrotactile speech processor for the hearing impaired which stimulates the digital nerve bundles with a pulsatile electric current. Psychophysical experiments measuring the duration discrimination and identification, gap detection, and integration times for pulsatile electrical stimulation are described and compared with similar auditory measures for normal and impaired hearing and electrical stimulation via a cochlear implant. The tactile duration limen of 15% for a 300-ms standard was similar to auditory measures. Tactile gap detection thresholds of 9 to 20 ms were larger than for normal-hearing listeners but shorter than for some hearing-impaired listeners and cochlear implant users. The electrotactile integration time of about 250 ms was shorter than previously measured tactile values but longer than auditory integration times. The results indicate that the gross amplitude envelope variations should be conveyed well by the Tickle Talker. Short bursts of low amplitude are the features most likely to be poorly perceived.

11.
Studies with adults have demonstrated that acoustic cues cohere in speech perception such that two stimuli cannot be discriminated if separate cues bias responses equally, but oppositely, in each. This study examined whether this kind of coherence exists for children's perception of speech signals, a test that first required that a contrast be found for which adults and children show similar cue weightings. Accordingly, experiment 1 demonstrated that adults, 7-, and 5-year-olds weight F2-onset frequency and gap duration similarly in "spa" versus "sa" decisions. In experiment 2, listeners of these same ages made "same" or "not-the-same" judgments for pairs of stimuli in an AX paradigm when only one cue differed, when the two cues were set within a stimulus to bias the phonetic percept towards the same category (relative to the other stimulus in the pair), and when the two cues were set within a stimulus to bias the phonetic percept towards different categories. Unexpectedly, adults' results contradicted earlier studies: They were able to discriminate stimuli when the two cues conflicted in how they biased phonetic percepts. Results for 7-year-olds replicated those of adults, but were not as strong. Only the results of 5-year-olds revealed the kind of perceptual coherence reported by earlier studies for adults. Thus, it is concluded that perceptual coherence for speech signals is present from an early age, and in fact listeners learn to overcome it under certain conditions.

12.
Sensitivity to acoustic cues in cochlear implant (CI) listening under natural conditions is a potentially complex interaction between a number of simultaneous factors, and may be difficult to predict. In the present study, sensitivity was measured under conditions that approximate those of natural listening. Synthesized words having increases in intensity or fundamental frequency (F0) in a middle stressed syllable were presented in soundfield to normal-hearing listeners and to CI listeners using their everyday speech processors and programming. In contrast to the extremely fine sensitivity to electrical current observed when direct stimulation of single electrodes is employed, difference limens (DLs) for intensity were larger for the CI listeners than for the normal-hearing listeners by a factor of 2.4. In accord with previous work, F0 DLs were larger by almost one order of magnitude. In a second experiment, it was found that the presence of concurrent intensity and F0 increments reduced the mean DL to half that of either cue alone for both groups of subjects, indicating that both groups combine concurrent cues with equal success. Although sensitivity to either cue in isolation was not related to word recognition in CI users, the listeners having lower combined-cue thresholds produced better word recognition scores.
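One common benchmark for the combined-cue result is the ideal-observer rule for independent cues, d'_comb = sqrt(d'_1^2 + d'_2^2). The sketch below applies it under the simplifying assumption that d' grows linearly with increment size; on that assumption, two equal cues predict only a 1/sqrt(2) reduction in DL, so the observed halving exceeds this particular benchmark. This is an illustrative comparison, not the authors' analysis.

```python
# Benchmark (not the authors' analysis): if intensity and F0 increments
# give independent sensitivities d1' and d2', an ideal observer obtains
# d_comb' = sqrt(d1'**2 + d2'**2). Assuming d' is linear in increment
# size, the combined-cue DL shrinks accordingly.
import math

def combined_dl(dl1: float, dl2: float) -> float:
    """Predicted DL when both cues increment together (ideal combination)."""
    return 1.0 / math.sqrt(1.0 / dl1**2 + 1.0 / dl2**2)

print(combined_dl(2.0, 2.0))   # 1.41: a 1/sqrt(2) reduction, not 1/2
```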

13.
This study evaluated the effects of time compression and expansion on sentence recognition by normal-hearing (NH) listeners and cochlear-implant (CI) recipients of the Nucleus-22 device. Sentence recognition was measured in five CI users using custom 4-channel continuous interleaved sampling (CIS) processors and five NH listeners using either 4-channel or 32-channel noise-band processors. For NH listeners, recognition was largely unaffected by time expansion, regardless of spectral resolution. However, recognition of time-compressed speech varied significantly with spectral resolution. When fine spectral resolution (32 channels) was available, speech recognition was unaffected even when the duration of sentences was shortened to 40% of their original length (equivalent to a mean duration of 40 ms/phoneme). However, a mean duration of 60 ms/phoneme was required to achieve the same level of recognition when only coarse spectral resolution (4 channels) was available. Recognition patterns were highly variable across CI listeners. The best CI listener performed as well as NH subjects listening to corresponding spectral conditions; however, three out of five CI listeners performed significantly poorer in recognizing time-compressed speech. Further investigation revealed that these three poorer-performing CI users also had more difficulty with simple temporal gap-detection tasks. The results indicate that limited spectral resolution reduces the ability to recognize time-compressed speech. Some CI listeners have more difficulty with time-compressed speech, as produced by rapid speakers, because of reduced spectral resolution and deficits in auditory temporal processing.
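For reference, uniform time scaling without a pitch change is typically done with a phase vocoder; a sketch using librosa follows. The library choice and file name are assumptions; the study used its own processing.

```python
# Sketch of uniform time scaling with a phase vocoder (librosa assumed
# available; "sentence.wav" is a hypothetical input file).
import librosa

y, sr = librosa.load("sentence.wav", sr=None)

# Duration 40% of the original  ->  stretch rate = 1 / 0.4 = 2.5
y_compressed = librosa.effects.time_stretch(y, rate=2.5)

# Expansion to 125% of the original duration  ->  rate = 1 / 1.25 = 0.8
y_expanded = librosa.effects.time_stretch(y, rate=0.8)
```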

14.
This study investigated the effects of simulated cochlear-implant processing on speech reception in a variety of complex masking situations. Speech recognition was measured as a function of target-to-masker ratio, processing condition (4, 8, 24 channels, and unprocessed) and masker type (speech-shaped noise, amplitude-modulated speech-shaped noise, single male talker, and single female talker). The results showed that simulated implant processing was more detrimental to speech reception in fluctuating interference than in steady-state noise. Performance in the 24-channel processing condition was substantially poorer than in the unprocessed condition, despite the comparable representation of the spectral envelope. The detrimental effects of simulated implant processing in fluctuating maskers, even with large numbers of channels, may be due to the reduction in the pitch cues used in sound source segregation, which are normally carried by the peripherally resolved low-frequency harmonics and the temporal fine structure. The results suggest that using steady-state noise to test speech intelligibility may underestimate the difficulties experienced by cochlear-implant users in fluctuating acoustic backgrounds.
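A minimal noise-band vocoder of the kind commonly used to simulate implant processing is sketched below. The band edges, filter orders, and 300-Hz envelope cutoff are assumptions; published simulations vary in these details.

```python
# Minimal noise-band vocoder sketch for simulating cochlear-implant
# processing. Parameter choices are assumptions, not the study's.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def noise_vocoder(x: np.ndarray, fs: float, n_channels: int = 8,
                  f_lo: float = 100.0, f_hi: float = 7000.0) -> np.ndarray:
    # Log-spaced analysis bands between f_lo and f_hi.
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    env_sos = butter(2, 300.0, btype="lowpass", fs=fs, output="sos")
    noise = np.random.default_rng(0).standard_normal(len(x))
    out = np.zeros_like(x, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, x)
        # Half-wave rectify, then low-pass, to extract the envelope.
        env = sosfiltfilt(env_sos, np.maximum(band, 0.0))
        env = np.clip(env, 0.0, None)
        # Modulate band-limited noise with the channel envelope.
        out += env * sosfiltfilt(band_sos, noise)
    return out / np.max(np.abs(out))             # simple normalization
```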

15.
People vary in the intelligibility of their speech. This study investigated whether the across-talker intelligibility differences observed in normal-hearing listeners are also found in cochlear implant (CI) users. Speech perception for male, female, and child pairs of talkers differing in intelligibility was assessed with actual and simulated CI processing and in normal hearing. While overall speech recognition was, as expected, poorer for CI users, differences in intelligibility across talkers were consistent across all listener groups. This suggests that the primary determinants of intelligibility differences are preserved in the CI-processed signal, though no single critical acoustic property could be identified.

16.
The present study systematically manipulated three acoustic cues to investigate their contributions to tonal contrasts in Mandarin: fundamental frequency (f0), amplitude envelope, and duration. Simplified stimuli with all possible combinations of these three cues were presented for identification to eight normal-hearing listeners, all native speakers of Mandarin from Taiwan. The f0 information was conveyed either by an f0-controlled sawtooth carrier or by a modulated noise, so as to compare the performance achievable with a clear indication of voice f0 against what is possible with purely temporal coding of f0. Tone recognition performance with explicit f0 was much better than that with any combination of the other acoustic cues (consistently greater than 90% correct, compared to 33%-65%; chance is 25%). In the absence of explicit f0, the temporal coding of f0 and the amplitude envelope both contributed somewhat to tone recognition, while duration had only a marginal effect. Performance based on these secondary cues varied greatly across listeners. These results explain the relatively poor perception of tone in cochlear implant users, given that cochlear implants currently provide only weak cues to f0, so that users must rely upon the purely temporal (and secondary) features for the perception of tone.
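The two f0 carriers are easy to sketch: a sawtooth whose instantaneous frequency follows the f0 contour (explicit f0, with harmonics) versus noise amplitude-modulated at the f0 rate (purely temporal coding). The contour values and durations below are hypothetical.

```python
# Sketch of the two F0 carriers: an f0-controlled sawtooth (explicit
# f0) versus noise amplitude-modulated at the f0 rate (temporal f0).
# Contour and duration are hypothetical illustrations.
import numpy as np
from scipy.signal import sawtooth

fs = 16000
t = np.arange(0, 0.4, 1 / fs)
f0 = np.linspace(180, 260, len(t))               # e.g., a rising tone
phase = 2 * np.pi * np.cumsum(f0) / fs           # integrate the contour

explicit_f0 = sawtooth(phase)                    # f0-controlled sawtooth
am = 0.5 * (1 + np.cos(phase))                   # modulator at the f0 rate
temporal_f0 = am * np.random.default_rng(0).standard_normal(len(t))
```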

17.
Traditional accounts of speech perception generally hold that listeners use isolable acoustic "cues" to label phonemes. For syllable-final stops, duration of the preceding vocalic portion and formant transitions at syllable's end have been considered the primary cues to voicing decisions. The current experiment tried to extend traditional accounts by asking two questions concerning voicing decisions by adults and children: (1) What weight is given to vocalic duration versus spectral structure, both at syllable's end and across the syllable? (2) Does the naturalness of stimuli affect labeling? Adults and children (4, 6, and 8 years old) labeled synthetic stimuli that varied in vocalic duration and spectral structure, either at syllable's end or earlier in the syllable. Results showed that all listeners weighted dynamic spectral structure, both at syllable's end and earlier in the syllable, more than vocalic duration, and listeners performed with these synthetic stimuli as listeners had performed previously with natural stimuli. The conclusion for accounts of human speech perception is that rather than simply gathering acoustic cues and summing them to derive strings of phonemic segments, listeners are able to attend to global spectral structure, and use it to help recover explicitly phonetic structure.

18.
Forward-masked psychophysical spatial tuning curves (fmSTCs) were measured in twelve cochlear-implant subjects, six using bipolar stimulation (Nucleus devices) and six using monopolar stimulation (Clarion devices). fmSTCs were measured at several probe levels on a middle electrode using a fixed-level probe stimulus and variable-level maskers. The average fmSTC slopes obtained in subjects using bipolar stimulation (3.7 dB/mm) were approximately three times steeper than the average slopes obtained in subjects using monopolar stimulation (1.2 dB/mm). Average spatial bandwidths were about half as wide for subjects with bipolar stimulation (2.6 mm) as for subjects with monopolar stimulation (4.6 mm). None of the tuning curve characteristics changed significantly with probe level. fmSTCs replotted in terms of acoustic frequency, using Greenwood's [J. Acoust. Soc. Am. 33, 1344-1356 (1961)] frequency-to-place equation, were compared with forward-masked psychophysical tuning curves obtained previously from normal-hearing and hearing-impaired acoustic listeners. The average tuning characteristics of fmSTCs in electric hearing were similar to the broad tuning observed in normal-hearing and hearing-impaired acoustic listeners at high stimulus levels. This suggests that spatial tuning is not the primary factor limiting speech perception in many cochlear implant users.
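The frequency-to-place conversion is Greenwood's exponential map. The sketch below uses the widely cited human parameter fit (A = 165.4, a = 2.1, k = 0.88, with place expressed as a proportion of a 35-mm cochlea, per Greenwood's 1990 update); the paper itself cites the 1961 equation, so treat the constants as an assumption.

```python
# Greenwood's frequency-to-place mapping, used to replot electrode
# positions as acoustic frequencies. Constants are the commonly used
# human fit (Greenwood, 1990); the study cites the 1961 equation.
import math

A, a, k = 165.4, 2.1, 0.88      # human parameters (assumed fit)
L = 35.0                        # assumed cochlear duct length in mm

def place_to_freq(x_mm_from_apex: float) -> float:
    """Characteristic frequency (Hz) at distance x from the apex."""
    return A * (10 ** (a * x_mm_from_apex / L) - k)

for x in (5, 15, 25):           # apical, middle, basal places
    print(f"{x} mm from apex -> {place_to_freq(x):7.0f} Hz")
```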

19.
The ability to match a pulsing electrode during multi-electrode stimulation through a research interface was measured in seven cochlear-implant (CI) users. Five listeners were relatively good at the task and two could not perform the task. Performance did not vary as a function of the number of electrodes or stimulation level. Performance on the matching task was not correlated to performance on an electrode-discrimination task. The listeners may have experienced the auditory enhancement effect, and this may have implications for speech recognition in noise for CI users.

20.
The speech perception of two multiple-channel cochlear implant patients was compared with that of three normally hearing listeners using an acoustic model of the implant for 22 different speech tests. The tests used included a minimal auditory capabilities battery, both closed-set and open-set word and sentence tests, speech tracking, and a 12-consonant confusion study using nonsense syllables. The acoustic model represented electrical current pulses by bursts of noise, and the effects of different electrodes were represented by using bandpass filters with different center frequencies. All subjects used a speech processor that coded the fundamental voicing frequency of speech as a pulse rate and the second formant frequency of speech as the electrode position in the cochlea, or the center frequency of the bandpass filter. Very good agreement was found for the two groups of subjects, indicating that the acoustic model is a useful tool for the development and evaluation of alternative cochlear implant speech processing strategies.
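The acoustic model described here maps F0 to a noise-burst rate and electrode place to a bandpass center frequency; a sketch of one such frame follows. All parameter values are illustrative, not the study's calibrated settings.

```python
# Sketch of the acoustic model described above: electrical pulses
# become noise bursts at the voicing rate (F0), and electrode position
# becomes the center frequency of a bandpass filter (F2). Parameter
# values are hypothetical illustrations.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def model_frame(f0: float, f2: float, fs: float = 16000,
                dur: float = 0.05, burst_ms: float = 2.0) -> np.ndarray:
    n = int(dur * fs)
    rng = np.random.default_rng(0)
    x = np.zeros(n)
    period, burst = int(fs / f0), int(burst_ms * fs / 1000)
    for start in range(0, n - burst, period):    # one noise burst per pulse
        x[start:start + burst] = rng.standard_normal(burst)
    sos = butter(4, [0.8 * f2, 1.2 * f2], btype="bandpass",
                 fs=fs, output="sos")            # "electrode place" band
    return sosfiltfilt(sos, x)

frame = model_frame(f0=120.0, f2=1500.0)         # voiced frame, mid F2
```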
