Similar Documents
20 similar documents found (search time: 31 ms)
1.
An auditory enhancement effect was evaluated in normal and hearing-impaired persons using a paradigm similar to that used by Viemeister and Bacon [J. Acoust. Soc. Am. 71, 1502-1507 (1982)]. Thresholds for a 2000-Hz probe were obtained in two forward-masking conditions: (1) the standard condition, in which the masker was a four-component harmonic complex including 2000 Hz, and (2) the enhancing condition, in which the same harmonic complex, with the 2000-Hz component removed, preceded the four-component masker. In addition, enhancement for speech was evaluated by asking subjects to identify flat-spectrum harmonic complexes that were preceded by inverse vowel spectra. Finally, suppression effects were evaluated by measuring forward-masked thresholds for a 2000-Hz probe as a function of the frequency of a suppressor added to a 2000-Hz masker. Across all subjects, there was evidence of enhancement and better vowel recognition in those persons who also demonstrated evidence of suppression; however, two of the normal-hearing persons demonstrated reduced enhancement yet normal suppression effects.

2.
This study complements earlier experiments on the perception of the [m]-[n] distinction in CV syllables [B. H. Repp, J. Acoust. Soc. Am. 79, 1987-1999 (1986); B. H. Repp, J. Acoust. Soc. Am. 82, 1525-1538 (1987)]. Six talkers produced VC syllables consisting of [m] or [n] preceded by [i, a, u]. In listening experiments, these syllables were truncated from the beginning and/or from the end, or waveform portions surrounding the point of closure were replaced with noise, so as to map out the distribution of the place of articulation information for consonant perception. These manipulations revealed that the vocalic formant transitions alone conveyed about as much place of articulation information as did the nasal murmur alone, and both signal portions were about as informative in VC as in CV syllables. Nevertheless, full VC syllables were less accurately identified than full CV syllables, especially in female speech. The reason for this was hypothesized to be the relative absence of a salient spectral change between the vowel and the murmur in VC syllables. This hypothesis was supported by the relative ineffectiveness of two additional manipulations meant to disrupt the perception of relational spectral information (channel separation or temporal separation of vowel and murmur) and by subjects' poor identification scores for brief excerpts including the point of maximal spectral change. While, in CV syllables, the abrupt spectral change from the murmur to the vowel provides important additional place of articulation information, in VC syllables the formant transitions in the vowel and the murmur spectrum seem to function as independent cues.

3.
This investigation examined whether listeners with mild-moderate sensorineural hearing impairment have a deficit in the ability to integrate synchronous spectral information in the perception of speech. In stage 1, the bandwidth of filtered speech centered either on 500 or 2500 Hz was varied adaptively to determine the width required for approximately 15%-25% correct recognition. In stage 2, these criterion bandwidths were presented simultaneously and percent correct performance was determined in fixed block trials. Experiment 1 tested normal-hearing listeners in quiet and in masking noise. The main findings were (1) there was no correlation between the criterion bandwidths at 500 and 2500 Hz; (2) listeners achieved a high percent correct in stage 2 (approximately 80%); and (3) performance in quiet and noise was similar. Experiment 2 tested listeners with mild-moderate sensorineural hearing impairment. The main findings were (1) the impaired listeners showed high variability in stage 1, with some listeners requiring narrower and others requiring wider bandwidths than normal, and (2) hearing-impaired listeners achieved percent correct performance in stage 2 that was comparable to normal. The results indicate that listeners with mild-moderate sensorineural hearing loss do not have an essential deficit in the ability to integrate across-frequency speech information.
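For illustration only, the adaptive tracking in stage 1 could be approximated with a weighted up-down staircase; the sketch below uses a hypothetical simulated listener in place of the study's recognition task, and the staircase rule is an assumption, not the authors' stated procedure.

```python
# Hedged sketch of a weighted up-down staircase (after Kaernbach, 1991)
# tracking the filter bandwidth that yields ~20% correct recognition.
# `run_trial` is a hypothetical simulated listener, not the study's task.
import math
import random

def run_trial(center_hz, bandwidth_hz):
    # Demonstration listener: percent correct rises with bandwidth
    # (logistic psychometric function with made-up parameters).
    p_correct = 1.0 / (1.0 + math.exp(-(bandwidth_hz - 800.0) / 150.0))
    return random.random() < p_correct

def track_criterion_bandwidth(center_hz, start_bw=1000.0, step=50.0,
                              target=0.20, n_trials=80):
    # Equilibrium when p * step_down = (1 - p) * step_up, so the
    # staircase converges near `target` proportion correct.
    down = step                          # narrow the band after a hit
    up = step * target / (1.0 - target)  # widen it after a miss
    bw, history = start_bw, []
    for _ in range(n_trials):
        bw = max(10.0, bw - down if run_trial(center_hz, bw) else bw + up)
        history.append(bw)
    return sum(history[-30:]) / 30.0     # mean of the converged tail

print(track_criterion_bandwidth(500.0))
```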

4.
In face-to-face speech communication, the listener extracts and integrates information from the acoustic and optic speech signals. Integration occurs within the auditory modality (i.e., across the acoustic frequency spectrum) and across sensory modalities (i.e., across the acoustic and optic signals). The difficulties experienced by some hearing-impaired listeners in understanding speech could be attributed to losses in the extraction of speech information, the integration of speech cues, or both. The present study evaluated the ability of normal-hearing and hearing-impaired listeners to integrate speech information within and across sensory modalities in order to determine the degree to which integration efficiency may be a factor in the performance of hearing-impaired listeners. Auditory-visual nonsense syllables consisting of eighteen medial consonants surrounded by the vowel [a] were processed into four nonoverlapping acoustic filter bands between 300 and 6000 Hz. A variety of one, two, three, and four filter-band combinations were presented for identification in auditory-only and auditory-visual conditions; a visual-only condition was also included. Integration efficiency was evaluated using a model of optimal integration. Results showed that normal-hearing and hearing-impaired listeners integrated information across the auditory and visual sensory modalities with a high degree of efficiency, independent of differences in auditory capabilities. However, across-frequency integration for auditory-only input was less efficient for hearing-impaired listeners. These individuals exhibited particular difficulty extracting information from the highest frequency band (4762-6000 Hz) when speech information was presented concurrently in the next lower-frequency band (1890-2381 Hz). Results suggest that integration of speech information within the auditory modality, but not across auditory and visual modalities, affects speech understanding in hearing-impaired listeners.

5.
Integral processing of phonemes: evidence for a phonetic mode of perception   (Total citations: 1; self-citations: 0, by others: 1)
To investigate the extent and locus of integral processing in speech perception, a speeded classification task was utilized with a set of noise-tone analogs of the fricative-vowel syllables [fæ], [ʃæ], [fu], and [ʃu]. Unlike the stimuli used in previous studies of selective perception of syllables, these stimuli did not contain consonant-vowel transitions. Subjects were asked to classify on the basis of one of the two syllable components. Some subjects were told that the stimuli were computer-generated noise-tone sequences. These subjects processed the noise and tone separably. Irrelevant variation of the noise did not affect reaction times (RTs) for the classification of the tone, and vice versa. Other subjects were instructed to treat the stimuli as speech. For these subjects, irrelevant variation of the fricative increased RTs for the classification of the vowel, and vice versa. A second experiment employed naturally spoken fricative-vowel syllables with the same task. Classification RTs showed a pattern of integrality in that irrelevant variation of either component increased RTs to the other. These results indicate that knowledge of coarticulation (or its acoustic consequences) is a basic element of speech perception. Furthermore, the use of this knowledge in phonetic coding is mandatory, even in situations where the stimuli do not contain coarticulatory information.

6.
The effective internal level of a 1-kHz tone at 50 dB SPL was estimated by measuring the forward masking produced on a 10-ms signal tone of the same frequency. Noise containing a spectral notch was then added to the masker tone, and its influence on the effective level of the tone was measured with a variety of noise levels, notch widths, and notch shapes. In experiment 1, the masker tone was centered in the spectral notch, itself centered in a 2-kHz band of noise. As the spectrum level in the noise passbands increased from 6 dB/Hz to 36 dB/Hz, signal threshold decreased, indicating a decrease in masking by the masker tone. This "unmasking" effect of the noise was attributed to suppression of the masker tone by the components in the noise. Unmasking was greatest with the narrowest spectral notch (250 Hz), and decreased to zero as the notch widened to 1500 Hz. Compared to its level when presented alone, the effective internal level of the masker tone could be reduced by up to 30 dB (250-Hz notch, 36 dB/Hz). The relative suppressive strength of individual noise components was estimated in experiment 2, in which the 1-kHz masker tone was located at one edge of a spectral notch, rather than in the center. Noise spectrum level was fixed at 16 dB/Hz. As notch width decreased to zero, on either the high-frequency or low-frequency side of the masker tone, its effective internal level was again reduced by approximately 30 dB. In a tentative analysis, the first derivative of the smoothed threshold function was taken, to provide an estimate of the relative contributions to suppression at 1 kHz of noise components between 250 and 1740 Hz. (ABSTRACT TRUNCATED AT 250 WORDS)
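A minimal numerical sketch of the tentative derivative analysis mentioned in the final sentence; the threshold values below are placeholders, not the paper's data.

```python
# Sketch: differentiate a smoothed threshold-versus-notch-edge function
# to estimate the relative suppressive contribution of noise components
# near each edge frequency. Data values are illustrative only.
import numpy as np

edge_hz = np.array([250., 500., 750., 1000., 1250., 1500., 1740.])
threshold_db = np.array([42., 40., 37., 33., 35., 38., 41.])  # illustrative

# Light 3-point smoothing before differentiating, since numerical
# derivatives amplify measurement noise (edge samples are underweighted
# by mode="same"; adequate for a sketch).
kernel = np.ones(3) / 3.0
smoothed = np.convolve(threshold_db, kernel, mode="same")

# A steeper change in threshold with notch edge implies a larger
# contribution of components near that frequency to suppression of the
# 1-kHz masker tone.
contribution = np.gradient(smoothed, edge_hz)
print(contribution)
```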

7.
The role of different modulation frequencies in the speech envelope was studied by means of the manipulation of vowel-consonant-vowel (VCV) syllables. The envelope of the signal was extracted from the speech and the fine structure was replaced by speech-shaped noise. The temporal envelopes in every critical band of the speech signal were notch filtered in order to assess the relative importance of different modulation frequency regions between 0 and 20 Hz. For this purpose notch filters around three center frequencies (8, 12, and 16 Hz) with three different notch widths (4-, 8-, and 12-Hz wide) were used. These stimuli were used in a consonant-recognition task in which ten normal-hearing subjects participated, and their results were analyzed in terms of recognition scores. More qualitative information was obtained with a multidimensional scaling method (INDSCAL) and sequential information analysis (SINFA). Consonant recognition is very robust to the removal of certain modulation frequency regions. Only when a wide notch around 8 Hz is applied does the speech signal become heavily degraded. As expected, the voicing information is lost, while there are different effects on plosiveness and nasality. Even the smallest filtering has a substantial effect on the transfer of the plosiveness feature, while on the other hand, filtering out only the low-modulation frequencies has a substantial effect on the transfer of nasality cues.
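A minimal sketch of the envelope manipulation described above for one band, under assumptions noted in the comments (Hilbert envelope, Butterworth band-stop filter, white-noise carrier); the paper's exact filters and speech-shaped noise are not reproduced here.

```python
# Hedged sketch (not the authors' code): extract the temporal envelope
# of one band, notch out a range of modulation frequencies, and reimpose
# the result on a noise carrier. White noise stands in for the paper's
# speech-shaped noise.
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def notch_envelope(band_signal, fs, notch_lo=4.0, notch_hi=12.0):
    env = np.abs(hilbert(band_signal))  # temporal envelope
    # Band-stop filter on the envelope, e.g. an 8-Hz-wide notch around 8 Hz.
    sos = butter(4, [notch_lo, notch_hi], btype="bandstop", fs=fs, output="sos")
    env_filt = np.maximum(sosfiltfilt(sos, env), 0.0)  # envelopes are nonnegative
    return env_filt * np.random.randn(len(band_signal))

fs = 16000
t = np.arange(fs) / fs  # 1 s of signal
demo = np.sin(2 * np.pi * 1000 * t) * (1 + 0.8 * np.sin(2 * np.pi * 8 * t))
out = notch_envelope(demo, fs)
```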

8.
Earlier work [Nittrouer et al., J. Speech Hear. Res. 32, 120-132 (1989)] demonstrated greater evidence of coarticulation in the fricative-vowel syllables of children than in those of adults when measured by anticipatory vowel effects on the resonant frequency of the fricative back cavity. In the present study, three experiments showed that this increased coarticulation led to improved vowel recognition from the fricative noise alone: Vowel identification by adult listeners was better overall for children's productions and was successful earlier in the fricative noise. This enhanced vowel recognition for children's samples was obtained in spite of the fact that children's and adults' samples were randomized together, therefore indicating that listeners were able to normalize the vowel information within a fricative noise where there often was acoustic evidence of only one formant associated primarily with the vowel. Correct vowel judgments were found to be largely independent of fricative identification. However, when another coarticulatory effect, the lowering of the main spectral prominence of the fricative noise for /u/ versus /i/, was taken into account, vowel judgments were found to interact with fricative identification. The results show that listeners are sensitive to the greater coarticulation in children's fricative-vowel syllables, and that, in some circumstances, they do not need to make a correct identification of the most prominently specified phone in order to make a correct identification of a coarticulated one.

9.
The detection of 500- or 2000-Hz pure-tone signals in unmodulated and modulated noise was investigated in normal-hearing and sensorineural hearing-impaired listeners, as a function of noise bandwidth. Square-wave modulation rates of 15 and 40 Hz were used in the modulated noise conditions. A notched noise measure of frequency selectivity and a gap detection measure of temporal resolution were also obtained on each subject. The modulated noise results indicated a masking release that increased as a function of increasing noise bandwidth, and as a function of decreasing modulation rate for both groups of listeners. However, the improvement of threshold with increasing modulated noise bandwidth was often greatly reduced among the sensorineural hearing-impaired listeners. It was hypothesized that the masking release in modulated noise may be due to several types of processes including across-critical band analysis (CMR), within-critical band analysis, and suppression. Within-band effects appeared to be especially large at the higher frequency region and lower modulation rate. In agreement with previous research, there was a significant correlation between frequency selectivity and masking release in modulated noise. At the 500-Hz region, masking release was correlated more highly with the filter skirt and tail measures than with the filter passband measure. At the 2000-Hz region, masking release was correlated more with the filter passband and skirt measures than with the filter tail measure. The correlation between gap detection and masking release was significant at the 40-Hz modulation rate, but not at the 15-Hz modulation rate. The results of this study suggest that masking release in modulated noise is limited by frequency selectivity at low modulation rates, and by both frequency selectivity and temporal resolution at high modulation rates. However, even when the present measures of frequency selectivity and temporal resolution are both taken into account, significant variance in masking release still remains unaccounted for.

10.
An important problem in speech perception is to determine how humans extract the perceptually invariant place of articulation information in the speech wave across variable acoustic contexts. Although analyses have been developed that attempted to classify the voiced stops /b/ versus /d/ from stimulus onset information, most of the human perceptual research to date suggests that formant transition information is more important than onset information. The purpose of the present study was to determine if animal subjects, specifically Japanese macaque monkeys, are capable of categorizing /b/ versus /d/ in synthesized consonant-vowel (CV) syllables using only formant transition information. Three monkeys were trained to differentiate CV syllables with a "go-left" versus a "go-right" label. All monkeys first learned to differentiate a /za/ versus /da/ manner contrast and easily transferred to three new vowel contexts /[symbol: see text], ɛ, ɪ/. Next, two of the three monkeys learned to differentiate a /ba/ versus /da/ stop place contrast, but were unable to transfer it to the different vowel contexts. These results suggest that animals may not use the same mechanisms as humans do for classifying place contrasts, and call for further investigation of animal perception of formant transition information versus stimulus onset information in place contrasts.

11.
On the role of spectral transition for speech perception   (Total citations: 2; self-citations: 0, by others: 2)
This paper examines the relationship between dynamic spectral features and the identification of Japanese syllables modified by initial and/or final truncation. The experiments confirm several main points. "Perceptual critical points," where the percent correct identification of the truncated syllable as a function of the truncation position changes abruptly, are related to maximum spectral transition positions. A speech wave of approximately 10 ms in duration that includes the maximum spectral transition position bears the most important information for consonant and syllable perception. Consonant and vowel identification scores simultaneously change as a function of the truncation position in the short period, including the 10-ms period for final truncation. This suggests that crucial information for both vowel and consonant identification is contained across the same initial part of each syllable. The spectral transition is more crucial than unvoiced and buzz bar periods for consonant (syllable) perception, although the latter features are of some perceptual importance. Also, vowel nuclei are not necessary for either vowel or syllable perception.

12.
The present study examined the benefits of providing amplified speech to the low- and mid-frequency regions of listeners with various degrees of sensorineural hearing loss. Nonsense syllables were low-pass filtered at various cutoff frequencies and consonant recognition was measured as the bandwidth of the signal was increased. In addition, error patterns were analyzed to determine the types of speech cues that were, or were not, transmitted to the listeners. For speech frequencies of 2800 Hz and below, a positive benefit of amplified speech was observed in every case, although the benefit provided was very often less than that observed in normal-hearing listeners who received the same increase in speech audibility. There was no dependence of this benefit upon the degree of hearing loss. Error patterns suggested that the primary difficulty that hearing-impaired individuals have in using amplified speech is due to their poor ability to perceive the place of articulation of consonants, followed by a reduced ability to perceive manner information.

13.
Spectro-temporal analysis in normal-hearing and cochlear-impaired listeners   (Total citations: 1; self-citations: 0, by others: 1)
Detection thresholds for a 1.0-kHz pure tone were determined in unmodulated noise and in noise modulated by a 15-Hz square wave. Comodulation masking release (CMR) was calculated as the difference in threshold between the modulated and unmodulated conditions. The noise bandwidth varied between 100 and 1000 Hz. Frequency selectivity was also examined using an abbreviated notched-noise masking method. The subjects in the main experiment consisted of 12 normal-hearing and 12 hearing-impaired subjects with hearing loss of cochlear origin. The most discriminating conditions were repeated on 16 additional hearing-impaired subjects. The CMR of the hearing-impaired group was reduced for the 1000-Hz noise bandwidth. The reduced CMR at this bandwidth correlated significantly with reduced frequency selectivity, consistent with the hypothesis that the across-frequency difference cue used in CMR is diminished by poor frequency selectivity. The results indicated that good frequency selectivity is a prerequisite, but not a guarantee, of large CMR.
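Since CMR is defined above as a simple threshold difference, the computation reduces to the following; the two threshold values are illustrative, not the study's data.

```python
# CMR as defined above: the detection-threshold difference (in dB)
# between the unmodulated and the modulated masker at the same noise
# bandwidth. Positive values mean modulation released masking.
def comodulation_masking_release(thr_unmod_db, thr_mod_db):
    return thr_unmod_db - thr_mod_db

print(comodulation_masking_release(55.0, 47.0))  # -> 8.0 dB of release
```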

14.
Previous research with speechlike signals has suggested that upward spread of masking from the first formant (F1) may interfere with the identification of place of articulation information signaled by changes in the upper formants. This suggestion was tested by presenting two-formant stop consonant-vowel syllables varying along a /ba/-/da/-/ga/ continuum to hearing-impaired listeners grouped according to etiological basis of the disorder. The syllables were presented monaurally at 80 dB and 100 dB SPL when formant amplitudes were equal and when F1 amplitude was reduced by 6, 12, and 18 dB. Noise-on-tone masking patterns were also generated using narrow bands of noise at 80 and 100 dB SPL to assess the extent of upward spread of masking. Upward spread of masking could be demonstrated in both speech and nonspeech tasks, irrespective of the subject's age, audiometric configuration, or etiology of hearing impairment. Attenuation of F1 had different effects on phonetic identification in different subject groups: while listeners with noise-induced hearing loss showed substantial improvement in identifying place of articulation, upward spread of masking did not consistently account for poor place identification in other types of sensorineural hearing impairment.

15.
The purpose of this investigation was to study the effects of consonant environment on vowel duration for normally hearing males, hearing-impaired males with intelligible speech, and hearing-impaired males with semi-intelligible speech. The results indicated that the normally hearing and intelligible hearing-impaired speakers exhibited similar trends with respect to consonant influence on vowel duration; i.e., vowels were longer in a voiced environment than in a voiceless one, and in a fricative environment than in a plosive one. The semi-intelligible hearing-impaired speakers, however, failed to demonstrate a consonant effect on vowel duration, and produced the vowels with significantly longer durations when compared with the other two groups of speakers. These data provide information regarding temporal conditions which may contribute to the decreased intelligibility of hearing-impaired persons.

16.
There is limited documentation available on how sensorineurally hearing-impaired listeners use the various sources of phonemic information that are known to be distributed across time in the speech waveform. In this investigation, a group of normally hearing listeners and a group of sensorineurally hearing-impaired listeners (with and without the benefit of amplification) identified various consonant and vowel productions that had been systematically varied in duration. The consonants (presented in a /haCa/ environment) and the vowels (presented in a /bVd/ environment) were truncated in steps to eliminate various segments from the end of the stimulus. The results indicated that normally hearing listeners could extract more phonemic information, especially cues to consonant place, from the earlier occurring portions of the stimulus waveforms than could the hearing-impaired listeners. The use of amplification partially decreased the performance differences between the normally hearing listeners and the unaided hearing-impaired listeners. The results are relevant to current models of normal speech perception that emphasize the need for the listener to make phonemic identifications as quickly as possible.

17.
This study focuses on the initial component of the stop consonant release burst, the release transient. In theory, the transient, because of its impulselike source, should contain much information about the vocal tract configuration at release, but it is usually weak in intensity and difficult to isolate from the accompanying frication in natural speech. For this investigation, a human talker produced isolated release transients of /b,d,g/ in nine vocalic contexts by whispering these syllables very quietly. He also produced the corresponding CV syllables with regular phonation for comparison. Spectral analyses showed the isolated transients to have a clearly defined formant structure, which was not seen in natural release bursts, whose spectra were dominated by the frication noise. The formant frequencies varied systematically with both consonant place of articulation and vocalic context. Perceptual experiments showed that listeners can identify both consonants and vowels from isolated transients, though not very accurately. Knowing one of the two segments in advance did not help, but when the transients were followed by a compatible synthetic, steady-state vowel, consonant identification improved somewhat. On the whole, isolated transients, despite their clear formant structure, provided only partial information for consonant identification, but no less so, it seems, than excerpted natural release bursts. The information conveyed by artificially isolated transients and by natural (frication-dominated) release bursts appears to be perceptually equivalent.

18.
This study investigated the acoustic characteristics of Mandarin laryngeal and esophageal speech. Eight normal laryngeal and seven esophageal speakers participated in the acoustic experiments. Results from acoustic analyses of the syllables /ma/ and /ba/ indicated that F0, intensity, and signal-to-noise ratio of laryngeal speech were significantly higher than those of esophageal speech. However, opposite results were found for vowel duration, jitter, and shimmer. Mean F0, intensity, and words per minute in reading were greater, but the number of pauses was smaller, in laryngeal speech than in esophageal speech. Similar patterns of F0 contours and vowel duration as a function of tone were found between laryngeal and esophageal speakers. Long-term spectral analysis indicated that higher first and second formant frequencies were associated with esophageal speech than with normal laryngeal speech.
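The jitter and shimmer measures reported above are conventionally computed from consecutive glottal periods and peak amplitudes; the sketch below shows the standard "local" formulation and is a generic illustration, not the authors' analysis pipeline.

```python
# Hedged sketch of local jitter and shimmer: mean absolute difference
# between consecutive glottal periods (or peak amplitudes) relative to
# the mean, in percent. Extraction of the periods and amplitudes from
# the waveform is assumed to be done elsewhere.
import numpy as np

def jitter_percent(periods_s):
    p = np.asarray(periods_s, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(p))) / np.mean(p)

def shimmer_percent(peak_amplitudes):
    a = np.asarray(peak_amplitudes, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(a))) / np.mean(a)

print(jitter_percent([0.0080, 0.0081, 0.0079, 0.0080, 0.0082]))  # ~1.9%
```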

19.
Twenty-one sensorineurally hearing-impaired adolescents were studied with an extensive battery of tone-perception, phoneme-perception, and speech-perception tests. Tests on loudness perception, frequency selectivity, and temporal resolution at the test frequencies of 500, 1000, and 2000 Hz were included. The mean values and the gradient across frequencies were used in further analysis. Phoneme-perception data were gathered by means of similarity judgments and phonemic confusions. Speech-reception thresholds were determined in quiet and in noise for unfiltered speech material, and with additional low-pass and high-pass filtering in noise. The results show that hearing loss for speech is related to both the frequency resolving power and temporal processing by the ear. Phoneme-perception parameters proved to be more related to the filtered-speech thresholds than to the thresholds for unfiltered speech. This finding may indicate that phoneme-perception parameters play only a secondary role, and for that reason their bridging function between tone perception and speech perception is only limited.

20.
Temporal integration for a 1000-Hz signal was determined for normal-hearing and cochlear hearing-impaired listeners in quiet and in masking noise of variable bandwidth. Critical ratio and 3-dB critical band measures of frequency resolution were derived from the masking data. Temporal integration for the normal-hearing listeners was markedly reduced in narrow-band noise, when contrasted with temporal integration in quiet or in wideband noise. The effect of noise bandwidth on temporal integration was smaller for the hearing-impaired group. Hearing-impaired subjects showed both reduced temporal integration and reduced frequency resolution for the 200-ms signal. However, a direct relation between temporal integration and frequency resolution was not indicated. Frequency resolution for the normal-hearing listeners did not differ from that of the hearing-impaired listeners for the 20-ms signal. It was suggested that some of the frequency resolution and temporal integration differences between normal-hearing and hearing-impaired listeners could be accounted for by off-frequency listening.
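For reference, the critical-ratio measure named above reduces to a level difference, and its implied rectangular bandwidth follows from an equal-power assumption; a hedged sketch with illustrative values:

```python
# Critical ratio: signal level at masked threshold (dB SPL) minus the
# noise spectrum level (dB/Hz). The implied rectangular bandwidth
# assumes signal power at threshold equals the noise power admitted by
# the critical band. Values below are illustrative only.
def critical_ratio_db(threshold_db_spl, noise_spectrum_level_db):
    return threshold_db_spl - noise_spectrum_level_db

def implied_bandwidth_hz(cr_db):
    return 10.0 ** (cr_db / 10.0)

cr = critical_ratio_db(58.0, 40.0)  # 18 dB
print(implied_bandwidth_hz(cr))     # ~63 Hz
```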
