Similar literature
 20 similar articles found (search time: 31 ms)
1.
Although listeners routinely perceive both the sex and individual identity of talkers from their speech, explanations of these abilities are incomplete. Here, variation in vocal production-related anatomy was assumed to affect vowel acoustics thought to be critical for indexical cueing. Integrating this approach with source-filter theory, patterns of acoustic parameters that should represent sex and identity were identified. Due to sexual dimorphism, the combination of fundamental frequency (F0, reflecting larynx size) and vocal tract length cues (VTL, reflecting body size) was predicted to provide the strongest acoustic correlates of talker sex. Acoustic measures associated with presumed variations in supralaryngeal vocal tract-related anatomy occurring within sex were expected to be prominent in individual talker identity. These predictions were supported by results of analyses of 2500 tokens of the /ɛ/ phoneme, extracted from the naturally produced speech of 125 subjects. Classification by talker sex was virtually perfect when F0 and VTL were used together, whereas talker classification depended primarily on the various acoustic parameters associated with vocal-tract filtering.
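As a rough illustration of how a vocal tract length cue can be related to formant frequencies, the sketch below uses the textbook uniform-tube (quarter-wavelength resonator) approximation, F_n = (2n - 1)c / (4L). This is an assumption made for illustration only: the abstract does not say how VTL was estimated, and the function name, the speed-of-sound constant, and the sample formant value are hypothetical.

```python
SPEED_OF_SOUND_CM_PER_S = 35000.0  # assumed approximate value for warm, moist air


def vtl_from_formant(formant_hz: float, formant_number: int) -> float:
    """Estimate vocal tract length (cm) from one formant.

    Assumes a uniform tube closed at the glottis and open at the lips,
    whose n-th resonance is F_n = (2n - 1) * c / (4 * L).
    """
    if formant_hz <= 0 or formant_number < 1:
        raise ValueError("formant_hz must be positive and formant_number >= 1")
    return (2 * formant_number - 1) * SPEED_OF_SOUND_CM_PER_S / (4.0 * formant_hz)


if __name__ == "__main__":
    # Hypothetical first formant of a neutral vowel.
    print(vtl_from_formant(500.0, 1))
```

Real vowels such as /ɛ/ deviate from the uniform-tube shape, so estimates of this kind are usually averaged over several formants rather than taken from one.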

2.
This study compared how normal-hearing listeners (NH) and listeners with moderate to moderately severe cochlear hearing loss (HI) use and combine information within and across frequency regions in the perceptual separation of competing vowels with fundamental frequency differences (ΔF0) ranging from 0 to 9 semitones. Following the procedure of Culling and Darwin [J. Acoust. Soc. Am. 93, 3454-3467 (1993)], eight NH listeners and eight HI listeners identified competing vowels with either a consistent or inconsistent harmonic structure. Vowels were amplified to ensure audibility for HI listeners. The contribution of frequency region depended on the value of ΔF0 between the competing vowels. When ΔF0 was small, both groups of listeners effectively utilized ΔF0 cues in the low-frequency region. In contrast, HI listeners derived significantly less benefit than NH listeners from ΔF0 cues conveyed by the high-frequency region at small ΔF0s. At larger ΔF0s, both groups combined ΔF0 cues from the low and high formant-frequency regions. Cochlear impairment appears to negatively impact the ability to use F0 cues for within-formant grouping in the high-frequency region. However, cochlear loss does not appear to disrupt the ability to use within-formant F0 cues in the low-frequency region or to group F0 cues across formant regions.
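The F0 differences in this abstract are expressed in semitones. A minimal sketch of the standard equal-tempered conversion (one semitone is a frequency ratio of 2^(1/12)) is given below; the function names and the 100-Hz base F0 are hypothetical and are not taken from the study.

```python
import math


def shift_by_semitones(f0_hz: float, semitones: float) -> float:
    """Return the frequency lying `semitones` above f0_hz."""
    return f0_hz * 2.0 ** (semitones / 12.0)


def semitone_difference(f_low_hz: float, f_high_hz: float) -> float:
    """Return the interval between two frequencies in semitones."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


if __name__ == "__main__":
    base_f0 = 100.0  # hypothetical F0 of the lower vowel, in Hz
    for step in (0, 1, 2, 4, 9):
        print(step, round(shift_by_semitones(base_f0, step), 2))
```

With the assumed 100-Hz base, the largest separation mentioned in the abstract (9 semitones) corresponds to a second vowel at roughly 168 Hz.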

3.
Listeners with sensorineural hearing loss are poorer than listeners with normal hearing at understanding one talker in the presence of another. This deficit is more pronounced when competing talkers are spatially separated, implying a reduced "spatial benefit" in hearing-impaired listeners. This study tested the hypothesis that this deficit is due to increased masking specifically during the simultaneous portions of competing speech signals. Monosyllabic words were compressed to a uniform duration and concatenated to create target and masker sentences with three levels of temporal overlap: 0% (non-overlapping in time), 50% (partially overlapping), or 100% (completely overlapping). Listeners with hearing loss performed particularly poorly in the 100% overlap condition, consistent with the idea that simultaneous speech sounds are most problematic for these listeners. However, spatial release from masking was reduced in all overlap conditions, suggesting that increased masking during periods of temporal overlap is only one factor limiting spatial unmasking in hearing-impaired listeners.

4.
Speech recognition in noise improves with combined acoustic and electric stimulation compared to electric stimulation alone [Kong et al., J. Acoust. Soc. Am. 117, 1351-1361 (2005)]. Here the contribution of fundamental frequency (F0) and low-frequency phonetic cues to speech recognition in combined hearing was investigated. Normal-hearing listeners heard vocoded speech in one ear and low-pass (LP) filtered speech in the other. Three listening conditions (vocode-alone, LP-alone, combined) were investigated. Target speech (average F0 = 120 Hz) was mixed with a time-reversed masker (average F0 = 172 Hz) at three signal-to-noise ratios (SNRs). LP speech aided performance at all SNRs. Low-frequency phonetic cues were then removed by replacing the LP speech with an LP equal-amplitude harmonic complex, frequency and amplitude modulated by the F0 and temporal envelope of voiced segments of the target. The combined hearing advantage disappeared at 10 and 15 dB SNR, but persisted at 5 dB SNR. A similar finding occurred when, additionally, F0 contour cues were removed. These results are consistent with a role for low-frequency phonetic cues, but not with a combination of F0 information between the two ears. The enhanced performance at 5 dB SNR with F0 contour cues absent suggests that voicing or glimpsing cues may be responsible for the combined hearing benefit.

5.
Several studies have demonstrated that when talkers are instructed to speak clearly, the resulting speech is significantly more intelligible than speech produced in ordinary conversation. These speech intelligibility improvements are accompanied by a wide variety of acoustic changes. The current study explored the relationship between acoustic properties of vowels and their identification in clear and conversational speech, for young normal-hearing (YNH) and elderly hearing-impaired (EHI) listeners. Monosyllabic words excised from sentences spoken either clearly or conversationally by a male talker were presented in 12-talker babble for vowel identification. While vowel intelligibility was significantly higher in clear speech than in conversational speech for the YNH listeners, no clear speech advantage was found for the EHI group. Regression analyses were used to assess the relative importance of spectral target, dynamic formant movement, and duration information for perception of individual vowels. For both listener groups, all three types of information emerged as primary cues to vowel identity. However, the relative importance of the three cues for individual vowels differed greatly between the YNH and EHI listeners. This suggests that hearing loss alters the way acoustic cues are used for identifying vowels.

6.
Temporal information provided by cochlear implants enables successful speech perception in quiet, but limited spectral information precludes comparable success in voice perception. Talker identification and speech decoding by young hearing children (5-7 yr), older hearing children (10-12 yr), and hearing adults were examined by means of vocoder simulations of cochlear implant processing. In Experiment 1, listeners heard vocoder simulations of sentences from a man, woman, and girl and were required to identify the talker from a closed set. Younger children identified talkers more poorly than older listeners, but all age groups showed similar benefit from increased spectral information. In Experiment 2, children and adults provided verbatim repetition of vocoded sentences from the same talkers. The youngest children had more difficulty than older listeners, but all age groups showed comparable benefit from increasing spectral resolution. At comparable levels of spectral degradation, performance on the open-set task of speech decoding was considerably more accurate than on the closed-set task of talker identification. Hearing children's ability to identify talkers and decode speech from spectrally degraded material sheds light on the difficulty of these domains for child implant users.

7.
This study examined vowel perception by young normal-hearing (YNH) adults, in various listening conditions designed to simulate mild-to-moderate sloping sensorineural hearing loss. YNH listeners were individually age- and gender-matched to young hearing-impaired (YHI) listeners tested in a previous study [Richie et al., J. Acoust. Soc. Am. 114, 2923-2933 (2003)]. YNH listeners were tested in three conditions designed to create equal audibility with the YHI listeners: a low signal level with and without a simulated hearing loss, and a high signal level with a simulated hearing loss. Listeners discriminated changes in synthetic vowel tokens /ɪ e ɛ ɑ æ/ when F1 or F2 varied in frequency. Comparison of YNH with YHI results failed to reveal significant differences between groups in terms of performance on vowel discrimination, in conditions of similar audibility by using both noise masking to elevate the hearing thresholds of the YNH and applying frequency-specific gain to the YHI listeners. Further, analysis of learning curves suggests that while the YHI listeners completed an average of 46% more test blocks than YNH listeners, the YHI achieved a level of discrimination similar to that of the YNH within the same number of blocks. Apparently, when age and gender are closely matched between young hearing-impaired and normal-hearing adults, performance on vowel tasks may be explained by audibility alone.

8.
The focus of this study was the release from informational masking that could be obtained in a speech task by viewing a video of the target talker. A closed-set speech recognition paradigm was used to measure informational masking in 23 children (ages 6-16 years) and 10 adults. An audio-only condition required attention to a monaural target speech message that was presented to the same ear with a time-synchronized distracter message. In an audiovisual condition, a synchronized video of the target talker was also presented to assess the release from informational masking that could be achieved by speechreading. Children required higher target/distracter ratios than adults to reach comparable performance levels in the audio-only condition, reflecting a greater extent of informational masking in these listeners. There was a monotonic age effect, such that even the children in the oldest age group (12-16.9 years) demonstrated performance somewhat poorer than adults. Older children and adults improved significantly in the audiovisual condition, producing a release from informational masking of 15 dB or more in some adult listeners. Audiovisual presentation produced no informational masking release for the youngest children. Across all ages, the benefit of a synchronized video was strongly associated with speechreading ability.

9.
Cochlear implants are largely unable to encode voice pitch information, which hampers the perception of some prosodic cues, such as intonation. This study investigated whether children with a cochlear implant in one ear were better able to detect differences in intonation when a hearing aid was added in the other ear ("bimodal fitting"). Fourteen children with normal hearing and 19 children with bimodal fitting participated in two experiments. The first experiment assessed the just noticeable difference in F0, by presenting listeners with a naturally produced bisyllabic utterance with an artificially manipulated pitch accent. The second experiment assessed the ability to distinguish between questions and affirmations in Dutch words, again by using artificial manipulation of F0. For the implanted group, performance significantly improved in each experiment when the hearing aid was added. However, even with a hearing aid, the implanted group required exaggerated F0 excursions to perceive a pitch accent and to identify a question. These exaggerated excursions are close to the maximum excursions typically used by Dutch speakers. Nevertheless, the results of this study showed that compared to the implant-only condition, bimodal fitting improved the perception of intonation.

10.
The purpose of this study is to specify the contribution of certain frequency regions to consonant place perception for normal-hearing listeners and listeners with high-frequency hearing loss, and to characterize the differences in stop-consonant place perception among these listeners. Stop-consonant recognition and error patterns were examined at various speech-presentation levels and under conditions of low- and high-pass filtering. Subjects included 18 normal-hearing listeners and a homogeneous group of 10 young, hearing-impaired individuals with high-frequency sensorineural hearing loss. Differential filtering effects on consonant place perception were consistent with the spectral composition of acoustic cues. Differences in consonant recognition and error patterns between normal-hearing and hearing-impaired listeners were observed when the stimulus bandwidth included regions of threshold elevation for the hearing-impaired listeners. Thus place-perception differences among listeners are, for the most part, associated with stimulus bandwidths corresponding to regions of hearing loss.

11.
Coherence masking protection (CMP) refers to the phenomenon in which a target formant is labeled at lower signal-to-noise levels when presented with a stable cosignal consisting of two other formants than when presented alone. This effect has been reported primarily for adults with first-formant (F1) targets and F2/F3 cosignals, but has also been found for children, in fact in greater magnitude. In this experiment, F2 was the target and F1/F3 was the cosignal. Results showed similar effects for each age group as had been found for F1 targets. Implications for auditory prostheses for listeners with hearing loss are discussed.

12.
The perception of fundamental pitch for two-harmonic complex tones was examined in musically experienced listeners with cochlear-based high-frequency hearing loss. Performance in a musical interval identification task was measured as a function of the average rank of the lowest harmonic for both monotic and dichotic presentation of the harmonics at 14 dB Sensation Level. Listeners with hearing loss demonstrated excellent musical interval identification at low fundamental frequencies and low harmonic numbers, but abnormally poor identification at higher fundamental frequencies and higher average ranks. The upper frequency limit of performance in the listeners with hearing loss was similar in both monotic and dichotic conditions. These results suggest that something other than frequency resolution per se limits complex-tone pitch perception in listeners with hearing loss.

13.
This study examined spatial release from masking (SRM) when a target talker was masked by competing talkers or by other types of sounds. The focus was on the role of interaural time differences (ITDs) and time-varying interaural level differences (ILDs) under conditions varying in the strength of informational masking (IM). In the first experiment, a target talker was masked by two other talkers that were either colocated with the target or were symmetrically spatially separated from the target with the stimuli presented through loudspeakers. The sounds were filtered into different frequency regions to restrict the available interaural cues. The largest SRM occurred for the broadband condition followed by a low-pass condition. However, even the highest frequency bandpass-filtered condition (3-6 kHz) yielded a significant SRM. In the second experiment the stimuli were presented via earphones. The listeners identified the speech of a target talker masked by one or two other talkers or noises when the maskers were colocated with the target or were perceptually separated by ITDs. The results revealed a complex pattern of masking in which the factors affecting performance in colocated and spatially separated conditions are to a large degree independent.

14.
This study examined proportional frequency compression as a strategy for improving speech recognition in listeners with high-frequency sensorineural hearing loss. This method of frequency compression preserved the ratios between the frequencies of the components of natural speech, as well as the temporal envelope of the unprocessed speech stimuli. Nonsense syllables spoken by a female and a male talker were used as the speech materials. Both frequency-compressed speech and the control condition of unprocessed speech were presented with high-pass amplification. For the materials spoken by the female talker, significant increases in speech recognition were observed in slightly less than one-half of the listeners with hearing impairment. For the male-talker materials, one-fifth of the hearing-impaired listeners showed significant recognition improvements. The increases in speech recognition due solely to frequency compression were generally smaller than those solely due to high-pass amplification. The results indicate that while high-pass amplification is still the most effective approach for improving speech recognition of listeners with high-frequency hearing loss, proportional frequency compression can offer significant improvements in addition to those provided by amplification for some patients.
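The defining property stated in this abstract, that proportional compression preserves the ratios between component frequencies, can be shown with a small sketch. It scales a list of component frequencies by one common factor; it is not the study's signal-processing method, and the function name, the compression factor, and the sample frequencies are hypothetical.

```python
def compress_proportionally(freqs_hz, factor):
    """Scale every component frequency by the same factor (0 < factor <= 1).

    Because all components share one multiplier, the ratio between any two
    components is unchanged after compression.
    """
    if not 0.0 < factor <= 1.0:
        raise ValueError("factor must be in the interval (0, 1]")
    return [f * factor for f in freqs_hz]


if __name__ == "__main__":
    original = [500.0, 1500.0, 4500.0]  # hypothetical spectral components, Hz
    compressed = compress_proportionally(original, 0.5)
    print(compressed)
    print(original[1] / original[0], compressed[1] / compressed[0])
```

A scheme that instead subtracted a fixed number of hertz from each component would move high-frequency energy downward as well, but it would change the ratios between components.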

15.
Cochlear implant (CI) users have been shown to benefit from residual low-frequency hearing, specifically in pitch-related tasks. It remains unclear whether this benefit is dependent on fundamental frequency (F0) or other acoustic cues. Three experiments were conducted to determine the role of F0, as well as its frequency modulated (FM) and amplitude modulated (AM) components, in speech recognition with a competing voice. In simulated CI listeners, the signal-to-noise ratio was varied to estimate the 50% correct response. Simulation results showed that the F0 cue contributes to a significant proportion of the benefit seen with combined acoustic and electric hearing, and additionally that this benefit is due to the FM rather than the AM component. In actual CI users, sentence recognition scores were collected with either the full F0 cue containing both the FM and AM components or the 500-Hz low-pass speech cue containing the F0 and additional harmonics. The F0 cue provided a benefit similar to the low-pass cue for speech in noise, but not in quiet. Poorer CI users benefited more from the F0 cue than better users. These findings suggest that F0 is critical to improving speech perception in noise in combined acoustic and electric hearing.

16.
The study measured listener sensitivity to increments in the inter-onset interval (IOI) separating pairs of successive 20-ms 4000-Hz tone pulses. A silent interval between the tone pulses was adjusted across conditions to create reference tonal IOI values of 25-600 ms. For each condition, a duration DL for increments of the tonal IOI was measured in listeners comprising young normal-hearing adults and two groups of older adults with and without high-frequency hearing loss. Discrimination performance of all listeners was poorest for the shorter reference IOIs, and improved to stable levels for longer reference intervals exceeding about 200 ms. Temporal sensitivity of the young listeners was significantly better than that of the elderly listeners in each condition, with the largest age-related differences observed for the shortest reference interval. Age-related differences were also observed for duration DLs measured using single 4000-Hz tone bursts set to three reference durations in the range 50-200 ms. The tone DLs of all listeners were smaller than the corresponding tone-pair IOI DLs, particularly for the shorter reference stimulus durations. There were no significant performance differences observed between the older listeners with and without hearing loss for either discrimination task.

17.
Procedures for enhancing the intelligibility of a target talker in the presence of a co-channel competing talker were evaluated in tests involving (i) continuously voiced sentences spoken on a monotone, (ii) continuously voiced sentences with time-varying intonation, and (iii) noncontinuously voiced sentences produced with natural intonation. The procedures were based on the methods of harmonic selection and cepstral filtering [R.J. Stubbs and Q. Summerfield, J. Acoust. Soc. Am. 87, 359-372 (1990)]. Target and competing voices were combined at signal-to-noise ratios (SNRs) between -10 dB and +10 dB. Subjects were a group with normal hearing and a heterogeneous group with mild-moderate cochlear hearing impairments. Processing enhanced the target voice over a range of SNRs for each type of sentence and for most listeners. Enhancement was greatest at negative SNRs. Among the impaired listeners, benefit was generally greater for those with milder losses. These results consolidate and extend previous demonstrations that voice-separation algorithms that exploit the harmonic structure of the voiced portions of speech can enhance intelligibility. However, practical application of such algorithms depends on a solution to the problem of tracking the fundamental-frequency contour of one voice in the presence of a competing voice.
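Combining a target and a competing voice at a given SNR, as described in this abstract, is commonly done by scaling the competing signal so that the ratio of the two RMS levels matches the requested value. The sketch below shows that calculation under the assumption of an RMS-based SNR definition; the abstract does not specify how levels were measured, and the function names and the sine-tone stand-ins for speech are hypothetical.

```python
import math


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def mix_at_snr(target, masker, snr_db):
    """Scale `masker` so that 20*log10(rms(target)/rms(masker)) equals snr_db,
    then add it to `target`. Returns (mixture, scaled_masker)."""
    if len(target) != len(masker):
        raise ValueError("target and masker must have the same length")
    gain = rms(target) / (rms(masker) * 10.0 ** (snr_db / 20.0))
    scaled_masker = [gain * m for m in masker]
    mixture = [t + m for t, m in zip(target, scaled_masker)]
    return mixture, scaled_masker


def make_tone(freq_hz, n_samples=8000, sample_rate_hz=8000):
    """Sine tone used here only as a stand-in for a speech signal."""
    step = 2.0 * math.pi * freq_hz / sample_rate_hz
    return [math.sin(step * i) for i in range(n_samples)]


if __name__ == "__main__":
    target = make_tone(100.0)
    masker = make_tone(150.0)
    for snr_db in (-10.0, 0.0, 10.0):
        _, scaled = mix_at_snr(target, masker, snr_db)
        print(snr_db, 20.0 * math.log10(rms(target) / rms(scaled)))
```

At negative SNRs the scaled competing signal is more intense than the target, which is the range where the abstract reports the greatest enhancement.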

18.
To identify a speaker's sex, listeners may rely on sex-based differences in average fundamental frequency (F0), but overlap in male and female F0 ranges undermines such judgments. To test accuracy of sex-identification throughout the F0 range, listeners were asked to judge sex based on audio recordings of /ɑ/ spoken on a number of overlapping steady F0s by 10 male and 10 female English speakers. In general, listeners performed above chance (71.6% correct). However, near range extrema, listeners followed an apparent bias toward hearing high F0s as female and low as male; confidence was high when accuracy was high and vice versa. At mid-range, listeners identified sex fairly accurately but were not very confident in their judgments. In a forced-choice task, vowels close in F0 (but beyond the difference limen) were presented in male-female or female-male pairs. Listeners weakly identified speaker sex (63.3% correct). Identification of the male voice was considerably above chance only when the male had the lower F0 of the pair. Reliance on stereotypes of speaking F0 may bias listeners to hear low F0s as male and high F0s as female, perhaps with a contribution from vocal-tract length information. No strong evidence for a contribution of voice quality was obtained.

19.
Although most recent multitalker research has emphasized the importance of binaural cues, monaural cues can play an equally important role in the perception of multiple simultaneous speech signals. In this experiment, the intelligibility of a target phrase masked by a single competing masker phrase was measured as a function of signal-to-noise ratio (SNR) with same-talker, same-sex, and different-sex target and masker voices. The results indicate that informational masking, rather than energetic masking, dominated performance in this experiment. The amount of masking was highly dependent on the similarity of the target and masker voices: performance was best when different-sex talkers were used and worst when the same talker was used for target and masker. Performance did not, however, improve monotonically with increasing SNR. Intelligibility generally plateaued at SNRs below 0 dB and, in some cases, intensity differences between the target and masking voices produced substantial improvements in performance with decreasing SNR. The results indicate that informational and energetic masking play substantially different roles in the perception of competing speech messages.

20.
The contributions of auditory and cognitive factors to age-dependent differences in auditory spatial attention were investigated. In conditions of real spatial separation, the target sentence was presented from a central location and competing sentences were presented from left and right locations. In conditions of simulated spatial separation, different apparent spatial locations of the target and competitors were induced using the precedence effect. The identity of the target was cued by a callsign presented either prior to or following each target sentence, and the probability that the target would be presented at the three locations was specified at the beginning of each block. Younger and older adults with normal hearing sensitivity below 4 kHz completed all 16 conditions (2 spatial separation methods × 2 callsign conditions × 4 probability conditions). Overall, younger adults performed better than older adults. For both age groups, performance improved with target location certainty, with a priori target cueing, and when location differences were real rather than simulated. For both age groups, the contributions of natural spatial cues were most pronounced when the target occurred at "unlikely" spatial listening locations. This suggests that both age groups benefit similarly from richer acoustical cues and a priori information in difficult listening environments.
