Similar Articles
20 similar articles found.
1.
Four experiments were performed to evaluate a new wearable vibrotactile speech perception aid that extracts fundamental frequency (F0) and displays the extracted F0 as a single-channel temporal or an eight-channel spatio-temporal stimulus. Specifically, we investigated the perception of intonation (i.e., question versus statement) and emphatic stress (i.e., stress on the first, second, or third word) under Visual-Alone (VA), Visual-Tactile (VT), and Tactile-Alone (TA) conditions and compared performance using the temporal and spatio-temporal vibrotactile displays. Subjects were adults with normal hearing in experiments I-III and adults with severe to profound hearing impairments in experiment IV. Both versions of the vibrotactile speech perception aid successfully conveyed intonation. Vibrotactile stress information was also conveyed successfully, but it did not enhance performance in VT conditions beyond performance in VA conditions. In experiment III, which involved only intonation identification, a reliable advantage for the spatio-temporal display was obtained. Differences between subject groups were obtained for intonation identification, with more accurate VT performance by those with normal hearing. Possible effects of long-term hearing status are discussed.
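As an illustration of how an extracted F0 value might drive an eight-channel spatio-temporal display, here is a minimal Python sketch; the log-scale channel mapping, the F0 range, and the treatment of unvoiced frames are assumptions, since the abstract does not specify the encoding.

import numpy as np

def f0_to_channel(f0_hz, f0_min=80.0, f0_max=300.0, n_channels=8):
    """Map an F0 estimate to one of n_channels vibrators.

    Hypothetical sketch: channels are assumed to divide the talker's F0
    range on a log scale, with unvoiced frames (f0_hz <= 0) mapped to no
    stimulation (None).
    """
    if f0_hz <= 0:
        return None  # unvoiced frame: no vibrator active
    log_pos = (np.log2(f0_hz) - np.log2(f0_min)) / (np.log2(f0_max) - np.log2(f0_min))
    return int(np.clip(log_pos * n_channels, 0, n_channels - 1))

# Example: a rising question contour sweeps across the vibrator array.
for f0 in (120, 150, 190, 240, 290):
    print(f0, "Hz ->", f0_to_channel(f0))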

2.
The ability of five profoundly hearing-impaired subjects to "track" connected speech and to make judgments about the intonation and stress in spoken sentences was evaluated under a variety of auditory-visual conditions. These included speechreading alone, speechreading plus speech (low-pass filtered at 4 kHz), and speechreading plus a tone whose frequency, intensity, and temporal characteristics were matched to the speaker's fundamental frequency (F0). In addition, several frequency transfer functions were applied to the normal F0 range resulting in new ranges that were both transposed and expanded with respect to the original F0 range. Three of the five subjects were able to use several of the tonal representations of F0 nearly as well as speech to improve their speechreading rates and to make appropriate judgments concerning sentence intonation and stress. The remaining two subjects greatly improved their identification performance for intonation and stress patterns when expanded F0 signals were presented alone (i.e., without speechreading), but had difficulty integrating visual and auditory information at the connected discourse level, despite intensive training in the connected discourse tracking procedure lasting from 27.8-33.8 h.
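The transposed-and-expanded F0 ranges could be produced by a transfer function of roughly the following form; this is a sketch under assumed parameter values, not the transfer functions actually used in the study.

import math

def transform_f0(f0_hz, orig_center=120.0, new_center=200.0, expansion=2.0):
    """Transpose and expand an F0 value into a new tonal range.

    A minimal sketch of a frequency transfer function of the kind described
    in the abstract: the deviation from the original range center is scaled
    in the log (semitone) domain and re-centered. The log-domain form and
    the parameter values are assumptions.
    """
    if f0_hz <= 0:
        return 0.0  # unvoiced: no tone presented
    semitones_from_center = 12.0 * math.log2(f0_hz / orig_center)
    return new_center * 2.0 ** (expansion * semitones_from_center / 12.0)

# Example: a talker's 100-160 Hz range is transposed upward and widened.
print([round(transform_f0(f), 1) for f in (100, 120, 160)])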

3.
Intrinsic fundamental frequency of vowels in sentence context
High vowels have a higher intrinsic fundamental frequency (F0) than low vowels. This phenomenon has been verified in several languages. However, most studies of intrinsic F0 of vowels have used words either in isolation or bearing the main phrasal stress in a carrier sentence. As a first step towards an understanding of how the intrinsic F0 of vowels interacts with intonation in running speech, this study examined F0 of the vowels [i,a,u] in four sentence positions. The four speakers used for this study showed a statistically significant main effect of intrinsic F0 (high vowels had higher F0). Three of the four speakers also showed an interaction between intrinsic F0 and sentence position such that no significant F0 difference was observed in the unaccented, sentence-final position. The interaction was shown not to be due to vowel neutralization or correlated with changes in the glottal waveform shape, as evidenced by measures of the first formant frequency and spectral slope. Comparison with studies of tone languages and speech of the deaf suggests that both the lack of accent and the lower F0 caused the reduction in the intrinsic F0 difference.

4.
The main goal of this study was to investigate the efficacy of four vibrotactile speechreading supplements. Three supplements provided single-channel encodings of fundamental frequency (F0). Two encodings involved scaling and shifting glottal pulses to pulse rate ranges suited to tactual sensing capabilities; the third transformed F0 to differential amplitude of two fixed-frequency sinewaves. The fourth supplement added to one of the F0 encodings a second vibrator indicating high-frequency speech energy. A second goal was to develop improved methods for experimental control. Therefore, a sentence corpus was recorded on videodisc using two talkers whose speech was captured by video, microphone, and electroglottograph. Other experimental control issues included use of visual-alone control subjects, a multiple-baseline, single-subject design replicated for each of 15 normal-hearing subjects, sentence and syllable pre- and post-tests balanced for difficulty, and a speechreading screening test for subject selection. Across 17 h of treatment and 5 h of visual-alone baseline testing, each subject performed open-set sentence identification. Covariance analyses showed that the single-channel supplements provided a small but significant benefit, whereas the two-channel supplement was not effective. All subjects improved in visual-alone speechreading and maintained individual differences across the experiment. Vibrotactile benefit did not depend on speechreading ability.
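The third encoding, which maps F0 onto the relative amplitude of two fixed-frequency sinewaves, might look like the following sketch; the carrier frequencies, the F0 range, and the linear amplitude trade are assumptions rather than the study's actual parameters.

import numpy as np

def differential_amplitude_encoding(f0_hz, f_low=25.0, f_high=250.0,
                                    f0_min=80.0, f0_max=300.0,
                                    dur=0.01, fs=8000):
    """Encode F0 as the relative amplitude of two fixed-frequency sinewaves.

    Hypothetical sketch: the position of F0 within its assumed range
    controls the amplitude trade-off between a low and a high carrier;
    unvoiced frames produce silence.
    """
    t = np.arange(int(dur * fs)) / fs
    if f0_hz <= 0:
        return np.zeros_like(t)  # unvoiced: no vibration
    w = np.clip((f0_hz - f0_min) / (f0_max - f0_min), 0.0, 1.0)
    return (1.0 - w) * np.sin(2 * np.pi * f_low * t) + w * np.sin(2 * np.pi * f_high * t)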

5.
Fundamental frequency (F0) information extracted from low-pass-filtered speech and aurally presented as frequency-modulated sinusoids can greatly improve speechreading performance [Grant et al., J. Acoust. Soc. Am. 77, 671-677 (1985)]. To use this source of information, listeners must be able to detect the presence or absence of F0 (i.e., voicing), discriminate changes in frequency, and make judgments about the linguistic meaning of perceived variations in F0. In the present study, normally hearing and hearing-impaired subjects were required to locate the stressed peak of an intonation contour according to the extent of frequency transition at the primary peak. The results showed that listeners with profound hearing impairments required frequency transitions that were 1.5-6 times greater than those required by normally hearing subjects. These results were consistent with the subjects' identification performance for intonation and stress patterns in natural speech, and suggest that natural variations in F0 may be too small for some impaired listeners to perceive and follow accurately.  相似文献   

6.
7.
Three experiments were performed to obtain vibrotactile sensitivity thresholds from hearing children and adults, and from deaf children. An adaptive two-interval forced-choice procedure was used to obtain estimates of the 70.7% point on the psychometric sensitivity curve. When hearing children of 5-6 and 9-10 years of age and adults were tested with sinusoids and haversine pulse stimuli, at 10, 100, 160, and 250 Hz or pps, respectively, only the 10-Hz stimulus resulted in an age effect. For this stimulus, young children were significantly less sensitive than adults. When sinusoids were again tested at 20, 40, 80, and 160 Hz, a small overall effect of age was observed with a significant effect only at 20 Hz. Two prelingually profoundly deaf children were tested with haversine pulse trains at 10, 50, 100, 160, and 250 pps. Both children were at least as sensitive to the tactile stimulation as were the hearing children and adults. Pulsatile stimulation, compared to sinusoidal stimulation, exhibited relatively flat threshold versus frequency functions. The present results, demonstrating no age effect for pulsatile stimulation and similar performance for deaf and hearing children, suggest that pulsatile stimulation would be appropriate in vibrotactile speech communication aids for the deaf.
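The 70.7% target is the convergence point of a two-down/one-up adaptive rule (Levitt, 1971), which is presumably the procedure meant here; the following sketch simulates such a track with an illustrative (hypothetical) observer.

import random

def two_down_one_up(true_threshold=1.0, start_level=4.0, step=1.25, n_trials=80):
    """Simulate a two-down/one-up adaptive track in a 2IFC task.

    The two-down/one-up rule converges on the 70.7% point of the
    psychometric function, matching the target quoted in the abstract. The
    simulated observer (level plus Gaussian internal noise compared against
    a threshold) is purely illustrative.
    """
    level, run, direction, reversals = start_level, 0, -1, []
    for _ in range(n_trials):
        correct = level + random.gauss(0.0, 0.5) > true_threshold
        if correct:
            run += 1
            if run == 2:                 # two correct in a row: step down
                run = 0
                if direction == +1:
                    reversals.append(level)
                direction = -1
                level /= step
        else:                            # any incorrect response: step up
            run = 0
            if direction == -1:
                reversals.append(level)
            direction = +1
            level *= step
    tail = reversals[-6:] or [level]     # average the last few reversals
    return sum(tail) / len(tail)

print(round(two_down_one_up(), 2))       # rough estimate of the 70.7% level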

8.
Two experiments were conducted to explore the effectiveness of a single vibrotactile stimulator to convey intonation (question versus statement) and contrastive stress (on one of the first three words of four 4- or 5-word sentences). In experiment I, artificially deafened normal-hearing subjects judged stress and intonation in counterbalanced visual-alone and visual-tactile conditions. Six voice fundamental frequency-to-tactile transformations were tested. Two sentence types were voiced throughout, and two contained unvoiced consonants. Benefits to speechreading were significant, but small. No differences among transformations were observed. In experiment II, only the tactile stimuli were presented. Significant differences emerged among the transformations, with larger differences for intonation than for stress judgments. Surprisingly, tactile-alone intonation identification was more accurate than visual-tactile for several transformations.

9.
This investigation determined whether prelingually deaf talkers could correctly produce stressed and unstressed syllables across known changes in stress patterning and phonetic composition. Three deaf and three hearing adults spoke sets of homogeneous syllable strings with stress patterns that they could tap successfully with a finger. Strain gauge transduction of lower lip and jaw movement indicated that both deaf and hearing subjects produced different displacements and durations for the stressed and unstressed syllables, regardless of the stress pattern. Jaw movement did not become more variable with changes in phonetic composition of the syllables. The results show no evidence that motoric abilities (as assessed in lip and jaw movements) limit deaf talkers in producing desired stress patterns.

10.
Acoustic correlates of contrastive stress, i.e., fundamental frequency (F0), duration, and intensity, and listener perceptions of stress, were investigated in a profoundly deaf subject (RS) pre/post single-channel cochlear implant and longitudinally, and compared to the overall patterns of an age-peer profoundly deaf subject (JM) and a normally hearing subject (DL). The stimuli were a group of general American English words in which a change of function from noun to verb is associated with a shift of stress from initial to final syllable, e.g., CON'trast versus conTRAST'. Pre-cochlear implant, RS was unable to produce contrastive stress correctly. Hearing one day post-stimulation resulted in significantly higher F0 for initial and final stressed versus unstressed syllables. Four months post-stimulation, RS maintained significantly higher F0 on stressed syllables, as well as generalization of significantly increased intensity and longer syllable duration differences for all stressed versus unstressed syllables. Perceptually, listeners judged RS's contrastive stress placement as incorrect pre-cochlear implant and as always correct post-cochlear implant. JM's contrastive stress was judged as 96% correct, and DL's contrastive stress placement was 100% correct. It was concluded that RS reacquired all acoustic correlates needed for appropriate differentiation of contrastive stress with longitudinal use of the cochlear implant.

11.
The ability to combine speechreading (i.e., lipreading) with prosodic information extracted from the low-frequency regions of speech was evaluated with three normally hearing subjects. The subjects were tested in a connected discourse tracking procedure which measures the rate at which spoken text can be repeated back without any errors. Receptive conditions included speechreading alone (SA), speechreading plus amplitude envelope cues (AM), speechreading plus fundamental frequency cues (FM), and speechreading plus intensity-modulated fundamental frequency cues (AM + FM). In a second experiment, one subject was further tested in a speechreading plus voicing duration cue condition (DUR). Speechreading performance was best in the AM + FM condition (83.6 words per minute) and worst in the SA condition (41.1 words per minute). Tracking levels in the AM, FM, and DUR conditions were 73.7, 73.6, and 65.4 words per minute, respectively. The average tracking rate obtained when subjects were allowed to listen to the talker's normal (unfiltered) speech (NS condition) was 108.3 words per minute. These results demonstrate that speechreaders can use information related to the rhythm, stress, and intonation patterns of speech to improve their speechreading performance.

12.
Cochlear implants are largely unable to encode voice pitch information, which hampers the perception of some prosodic cues, such as intonation. This study investigated whether children with a cochlear implant in one ear were better able to detect differences in intonation when a hearing aid was added in the other ear ("bimodal fitting"). Fourteen children with normal hearing and 19 children with bimodal fitting participated in two experiments. The first experiment assessed the just noticeable difference in F0, by presenting listeners with a naturally produced bisyllabic utterance with an artificially manipulated pitch accent. The second experiment assessed the ability to distinguish between questions and affirmations in Dutch words, again by using artificial manipulation of F0. For the implanted group, performance significantly improved in each experiment when the hearing aid was added. However, even with a hearing aid, the implanted group required exaggerated F0 excursions to perceive a pitch accent and to identify a question. These exaggerated excursions are close to the maximum excursions typically used by Dutch speakers. Nevertheless, the results of this study showed that compared to the implant only condition, bimodal fitting improved the perception of intonation.

13.
Three experiments used the Coordinated Response Measure task to examine the roles that differences in F0 and differences in vocal-tract length play in the ability to attend to one of two simultaneous speech signals. The first experiment asked how increases in the natural F0 difference between two sentences (originally spoken by the same talker) affected listeners' ability to attend to one of the sentences. The second experiment used differences in vocal-tract length, and the third used both F0 and vocal-tract length differences. Differences in F0 greater than 2 semitones produced systematic improvements in performance. Differences in vocal-tract length produced systematic improvements in performance when the ratio of lengths was 1.08 or greater, particularly when the shorter vocal tract belonged to the target talker. Neither of these manipulations produced improvements in performance as great as those produced by a different-sex talker. Systematic changes in both F0 and vocal-tract length that simulated an incremental shift in gender produced substantially larger improvements in performance than did differences in F0 or vocal-tract length alone. In general, shifting one of two utterances spoken by a female voice towards a male voice produces a greater improvement in performance than shifting male towards female. The increase in performance varied with the intonation patterns of individual talkers, being smallest for those talkers who showed the most variability in their intonation patterns between different utterances.
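For reference, F0 separations in semitones and the vocal-tract length ratio implied by spectral-envelope scaling can be computed as follows; the helper names and example values are illustrative only.

import math

def f0_difference_semitones(f0_a, f0_b):
    """Difference between two F0 values in semitones (12 per octave)."""
    return 12.0 * math.log2(f0_b / f0_a)

def vtl_ratio_from_scaling(spectral_scale_factor):
    """Vocal-tract length ratio implied by a spectral-envelope scale factor.

    Formant frequencies scale inversely with vocal-tract length, so scaling
    the spectral envelope up by a factor k simulates a vocal tract shorter
    by the same factor. This relation is standard; the study's exact
    processing chain is not specified here.
    """
    return 1.0 / spectral_scale_factor

# The abstract's benchmark values: a 2-semitone F0 difference and a
# vocal-tract length ratio of 1.08.
print(round(f0_difference_semitones(100.0, 100.0 * 2 ** (2 / 12)), 2))  # -> 2.0
print(round(vtl_ratio_from_scaling(1 / 1.08), 2))                       # -> 1.08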

14.
This paper addresses a classical but important problem: the coupling of lexical tones and sentence intonation in tonal languages such as Chinese, focusing particularly on voice fundamental frequency (F0) contours of speech. The problem is important because it underlies speech synthesis and prosody analysis. We provide a solution with a constrained tone transformation technique based on structural modeling of the F0 contours. This consists of transforming target values in pairs from norms to variants. These targets are intended to sparsely specify the prosodic contributions to the F0 contours, while the alignment of target pairs between norms and variants is based on underlying lexical tone structures. When the norms take the citation forms of lexical tones, the technique makes it possible to separate sentence intonation from observed F0 contours. When the norms take normative F0 contours, it is possible to measure intonation variations from the norms to the variants, both having identical lexical tone structures. This paper explains the underlying scientific and linguistic principles and presents an algorithm that was implemented on computers. The method's capability of separating and combining tone and intonation is evaluated through analysis and re-synthesis of several hundred observed F0 contours.
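A much-simplified version of the separation idea, assuming norms given as citation-form tone contours aligned frame by frame with the observed contour, is sketched below; the paper's constrained target-pair transformation is considerably more elaborate than this.

import numpy as np

def intonation_residual(observed_f0, norm_f0):
    """Estimate the intonation contribution to an observed F0 contour.

    Simplified sketch: subtracting the norm from the variant in the log-F0
    (semitone) domain leaves a residual attributable to sentence intonation.
    Frames where either contour is unvoiced are returned as NaN.
    """
    observed_f0 = np.asarray(observed_f0, dtype=float)
    norm_f0 = np.asarray(norm_f0, dtype=float)
    voiced = (observed_f0 > 0) & (norm_f0 > 0)
    residual = np.full_like(observed_f0, np.nan)
    residual[voiced] = 12.0 * np.log2(observed_f0[voiced] / norm_f0[voiced])
    return residual  # in semitones

# Example with made-up contours: a raised intonation trend over a level tone.
print(intonation_residual([220, 210, 200, 0], [200, 200, 200, 0]))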

15.
In tone languages there are potential conflicts in the perception of lexical tone and intonation, as both depend mainly on the differences in fundamental frequency (F0) patterns. The present study investigated the acoustic cues associated with the perception of sentences as questions or statements in Cantonese, as a function of the lexical tone in sentence-final position. Cantonese listeners performed intonation identification tasks involving complete sentences, isolated final syllables, and sentences without the final syllable (carriers). Sensitivity (d' scores) was similar for complete sentences and final syllables but was significantly lower for carriers. Sensitivity was also affected by tone identity. These findings show that the perception of questions and statements relies primarily on the F0 characteristics of the final syllables (local F0 cues). A measure of response bias (c) provided evidence for a general bias toward the perception of statements. Logistic regression analyses showed that utterances were accurately classified as questions or statements by using average F0 and F0 interval. Average F0 of carriers (global F0 cue) was also found to be a reliable secondary cue. These findings suggest that the use of F0 cues for the perception of question intonation in tonal languages is likely to be language-specific.
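The sensitivity and bias measures named here follow standard signal-detection definitions, which can be computed as in the sketch below; the example hit and false-alarm rates are made up.

from statistics import NormalDist

def dprime_and_c(hit_rate, false_alarm_rate):
    """Sensitivity (d') and response bias (c) from hit and false-alarm rates.

    Standard signal-detection formulas: d' = z(H) - z(F) and
    c = -(z(H) + z(F)) / 2. Rates of exactly 0 or 1 should be corrected
    (e.g., a 1/(2N) adjustment) before calling this.
    """
    z = NormalDist().inv_cdf
    zh, zf = z(hit_rate), z(false_alarm_rate)
    return zh - zf, -(zh + zf) / 2.0

# Example: 85% hits on questions, 30% false alarms on statements.
d, c = dprime_and_c(0.85, 0.30)
print(round(d, 2), round(c, 2))  # with "question" as the target, c > 0 indicates a bias toward "statement"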

16.
Cochlear implant (CI) users have been shown to benefit from residual low-frequency hearing, specifically in pitch related tasks. It remains unclear whether this benefit is dependent on fundamental frequency (F0) or other acoustic cues. Three experiments were conducted to determine the role of F0, as well as its frequency modulated (FM) and amplitude modulated (AM) components, in speech recognition with a competing voice. In simulated CI listeners, the signal-to-noise ratio was varied to estimate the 50% correct response. Simulation results showed that the F0 cue contributes to a significant proportion of the benefit seen with combined acoustic and electric hearing, and additionally that this benefit is due to the FM rather than the AM component. In actual CI users, sentence recognition scores were collected with either the full F0 cue containing both the FM and AM components or the 500-Hz low-pass speech cue containing the F0 and additional harmonics. The F0 cue provided a benefit similar to the low-pass cue for speech in noise, but not in quiet. Poorer CI users benefited more from the F0 cue than better users. These findings suggest that F0 is critical to improving speech perception in noise in combined acoustic and electric hearing.
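One way to isolate the FM and AM components of an F0 cue, in the spirit of the simulation conditions described above, is sketched below; the frame rate, the flattening strategies, and the parameter choices are assumptions rather than the study's actual processing.

import numpy as np

def synthesize_f0_cue(f0_track, envelope, fs=16000, use_fm=True, use_am=True):
    """Synthesize a pure-tone F0 cue with selectable FM and AM components.

    Hypothetical sketch: the FM component follows the frame-by-frame F0
    track and the AM component follows the speech amplitude envelope.
    Disabling FM fixes the tone at the mean voiced F0; disabling AM keeps
    only a voicing gate.
    """
    frame_rate = 100                                   # assumed 10-ms frames
    hop = fs // frame_rate
    n = len(f0_track) * hop
    f0 = np.repeat(np.asarray(f0_track, float), hop)[:n]
    env = np.repeat(np.asarray(envelope, float), hop)[:n]
    if not use_fm:
        f0 = np.where(f0 > 0, np.mean(f0[f0 > 0]), 0.0)  # flatten FM
    if not use_am:
        env = (env > 0).astype(float)                    # flatten AM to a gate
    phase = 2 * np.pi * np.cumsum(f0) / fs
    return env * np.sin(phase) * (f0 > 0)                # silent when unvoiced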

17.
18.
There is a tendency across languages to use a rising pitch contour to convey question intonation and a falling pitch contour to convey a statement. In a lexical tone language such as Mandarin Chinese, rising and falling pitch contours are also used to differentiate lexical meaning. How, then, does the multiplexing of the F0 channel affect the perception of question and statement intonation in a lexical tone language? This study investigated the effects of lexical tones and focus on the perception of intonation in Mandarin Chinese. The results show that lexical tones and focus impact the perception of sentence intonation. Question intonation was easier for native speakers to identify on a sentence with a final falling tone and more difficult to identify on a sentence with a final rising tone, suggesting that tone identification intervenes in the mapping of F0 contours to intonational categories and that tone and intonation interact at the phonological level. In contrast, there is no evidence that the interaction between focus and intonation goes beyond the psychoacoustic level. The results provide insights that will be useful for further research on tone and intonation interactions in both acoustic modeling studies and neurobiological studies.

19.
The purpose of this study was to compare the role of frequency selectivity in measures of auditory and vibrotactile temporal resolution. In the first experiment, temporal modulation transfer functions for a sinusoidally amplitude modulated (SAM) 250-Hz carrier revealed auditory modulation thresholds significantly lower than corresponding vibrotactile modulation thresholds at SAM frequencies greater than or equal to 100 Hz. In the second experiment, auditory and vibrotactile gap detection thresholds were measured by presenting silent gaps bounded by markers of the same or different frequency. The marker frequency F1 = 250 Hz preceded the silent gap and marker frequencies after the silent gap included F2 = 250, 255, 263, 310, and 325 Hz. Auditory gap detection thresholds were lower than corresponding vibrotactile thresholds for F2 markers less than or equal to 263 Hz, but were greater than the corresponding vibrotactile gap detection thresholds for F2 markers greater than or equal to 310 Hz. When the auditory gap detection thresholds were transformed into filter attenuation values, the results were modeled well by a constant-percentage (10%) bandwidth filter centered on F1. The vibrotactile gap detection thresholds, however, were independent of marker frequency separation. In a third experiment, auditory and vibrotactile rate difference limens (RDLs) were measured for a 250-Hz carrier at SAM rates less than or equal to 100 Hz. Auditory RDLs were lower than corresponding vibrotactile RDLs for standard rates greater than 10 Hz. Combination tones may have confounded auditory performance for standard rates of 80 and 100 Hz. The results from these experiments revealed that frequency selectivity influences auditory measures of temporal resolution, but there was no evidence of frequency selectivity affecting vibrotactile temporal resolution.
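The constant-percentage (10%) bandwidth filter model can be illustrated with a rounded-exponential (roex) filter shape, a common auditory-filter approximation; the roex form itself is an assumption here, since the abstract does not state the filter shape used.

import math

def roex_attenuation_db(f2, f1=250.0, bandwidth_fraction=0.10):
    """Attenuation of a gap marker at f2 by a filter centered on f1.

    Sketch using a roex(p) filter whose equivalent rectangular bandwidth is
    10% of f1; p = 4 * f1 / ERB is the standard slope parameter.
    """
    erb = bandwidth_fraction * f1
    p = 4.0 * f1 / erb
    g = abs(f2 - f1) / f1                  # normalized frequency deviation
    w = (1.0 + p * g) * math.exp(-p * g)   # filter weight (1.0 at the center)
    return -10.0 * math.log10(w)

# Attenuation grows quickly for the more remote F2 markers used in the study.
for f2 in (255, 263, 310, 325):
    print(f2, "Hz:", round(roex_attenuation_db(f2), 1), "dB")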

20.
The relationship between the ability to hear out partials in complex tones, discrimination of the fundamental frequency (F0) of complex tones, and frequency selectivity was examined for subjects with mild-to-moderate cochlear hearing loss. The ability to hear out partials was measured using a two-interval task. Each interval included a sinusoid followed by a complex tone; one complex contained a partial with the same frequency as the sinusoid, whereas in the other complex that partial was missing. Subjects had to indicate the interval in which the partial was present in the complex. The components in the complex were uniformly spaced on the ERB(N)-number scale. Performance was generally good for the two "edge" partials, but poorer for the inner partials. Performance for the latter improved with increasing spacing. F0 discrimination was measured for a bandpass-filtered complex tone containing low harmonics. The equivalent rectangular bandwidth (ERB) of the auditory filter was estimated using the notched-noise method for center frequencies of 0.5, 1, and 2 kHz. Significant correlations were found between the ability to hear out inner partials, F0 discrimination, and the ERB. The results support the idea that F0 discrimination of tones with low harmonics depends on the ability to resolve the harmonics.
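The ERB and the ERB_N-number scale referred to here follow Glasberg and Moore's (1990) formulas, sketched below for the three center frequencies tested in the study.

import math

def erb_hz(f_hz):
    """Equivalent rectangular bandwidth of the normal auditory filter (Hz).

    Glasberg and Moore's (1990) formula, ERB_N = 24.7 * (4.37 * F + 1)
    with F in kHz; this is the quantity the abstract estimates with the
    notched-noise method.
    """
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def erb_number(f_hz):
    """ERB_N-number (Cam) scale used to space the partials in the complexes."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

# ERB estimates at the three center frequencies used in the notched-noise measurements.
for fc in (500, 1000, 2000):
    print(fc, "Hz: ERB", round(erb_hz(fc), 1), "Hz, ERB-number", round(erb_number(fc), 1), "Cams")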

