Similar literature
Found 20 similar articles; search took 31 ms
1.
Humans and monkeys were compared in their differential sensitivity to various acoustic cues underlying voicing contrasts specified by voice-onset time (VOT) in utterance-initial stop consonants. A low-uncertainty repeating standard AX procedure and positive-reinforcement operant conditioning techniques were used to measure difference limens (DLs) along a VOT continuum from -70 ms (prevoiced /ba/) to 0 ms (/ba/) to +70 ms (/pa/). For all contrasts tested, human sensitivity was more acute than that of monkeys. For voicing lag, which spans a phonemic contrast in English, human DLs for a /ba/ (standard)-to-/pa/ (target) continuum averaged 8.3 ms compared to 17 ms for monkeys. Human DLs for a /pa/-to-/ba/ continuum averaged 11 ms compared to 25 ms for monkeys. Larger species differences occurred for voicing lead, which is phonemically nondistinctive in English. Human DLs for a /ba/-to-prevoiced /ba/ continuum averaged 8.2 ms and were four times lower than those of monkeys (35 ms). Monkeys did not reliably discriminate prevoiced /ba/-to-/ba/, whereas human DLs averaged 18 ms. The effects of eliminating cues in the English VOT contrasts were also examined. Removal of the aspiration noise in /pa/ greatly increased the DLs and reaction times for both humans and monkeys, but straightening out the F1 transition in /ba/ had only minor effects. Results suggest that quantitative differences in sensitivity should be considered when using monkeys to model the psychoacoustic level of human speech perception.

2.
3.
This study assessed the extent to which second-language learners are sensitive to phonetic information contained in visual cues when identifying a non-native phonemic contrast. In experiment 1, Spanish and Japanese learners of English were tested on their perception of a labial/labiodental consonant contrast in audio (A), visual (V), and audio-visual (AV) modalities. Spanish students showed better performance overall, and much greater sensitivity to visual cues than Japanese students. Both learner groups achieved higher scores in the AV than in the A test condition, thus showing evidence of audio-visual benefit. Experiment 2 examined the perception of the less visually salient /l/-/r/ contrast in Japanese and Korean learners of English. Korean learners obtained much higher scores in auditory and audio-visual conditions than in the visual condition, while Japanese learners generally performed poorly in both modalities. Neither group showed evidence of audio-visual benefit. These results show the impact of the language background of the learner and visual salience of the contrast on the use of visual cues for a non-native contrast. Significant correlations between scores in the auditory and visual conditions suggest that increasing auditory proficiency in identifying a non-native contrast is linked with an increasing proficiency in using visual cues to the contrast.

4.
5.
The present study systematically manipulated three acoustic cues--fundamental frequency (f0), amplitude envelope, and duration--to investigate their contributions to tonal contrasts in Mandarin. Simplified stimuli with all possible combinations of these three cues were presented for identification to eight normal-hearing listeners, all native speakers of Mandarin from Taiwan. The f0 information was conveyed either by an f0-controlled sawtooth carrier or a modulated noise so as to compare the performance achievable by a clear indication of voice f0 and what is possible with purely temporal coding of f0. Tone recognition performance with explicit f0 was much better than that with any combination of other acoustic cues (consistently greater than 90% correct compared to 33%-65%; chance is 25%). In the absence of explicit f0, the temporal coding of f0 and amplitude envelope both contributed somewhat to tone recognition, while duration had only a marginal effect. Performance based on these secondary cues varied greatly across listeners. These results explain the relatively poor perception of tone in cochlear implant users, given that cochlear implants currently provide only weak cues to f0, so that users must rely upon the purely temporal (and secondary) features for the perception of tone.

6.
In English, voiced and voiceless syllable-initial stop consonants differ in both fundamental frequency at the onset of voicing (onset F0) and voice onset time (VOT). Although both correlates, alone, can cue the voicing contrast, listeners weight VOT more heavily when both are available. Such differential weighting may arise from differences in the perceptual distance between voicing categories along the VOT versus onset F0 dimensions, or it may arise from a bias to pay more attention to VOT than to onset F0. The present experiment examines listeners' use of these two cues when classifying stimuli in which perceptual distance was artificially equated along the two dimensions. Listeners were also trained to categorize stimuli based on one cue at the expense of another. Equating perceptual distance eliminated the expected bias toward VOT before training, but successfully learning to base decisions more on VOT and less on onset F0 was easier than vice versa. Perceptual distance along both dimensions increased for both groups after training, but only VOT-trained listeners showed a decrease in Garner interference. Results lend qualified support to an attentional model of phonetic learning in which learning involves strategic redeployment of selective attention across integral acoustic cues.

7.
8.
The present study investigated the extent to which native English listeners' perception of Japanese length contrasts can be modified with perceptual training, and how their performance is affected by factors that influence segment duration, which is a primary correlate of Japanese length contrasts. Listeners were trained in a minimal-pair identification paradigm with feedback, using isolated words contrasting in vowel length, produced at a normal speaking rate. Experiment 1 tested listeners using stimuli varying in speaking rate, presentation context (in isolation versus embedded in carrier sentences), and type of length contrast. Experiment 2 examined whether performance varied by the position of the contrast within the word, and by whether the test talkers were professionally trained or not. Results showed that trained listeners improved overall performance to a greater extent than untrained control participants. Training improved perception of trained contrast types, generalized to nonprofessional talkers' productions, and improved performance in difficult within-word positions. However, training did not enable listeners to cope with speaking rate variation, and did not generalize to untrained contrast types. These results suggest that perceptual training improves non-native listeners' perception of Japanese length contrasts only to a limited extent.

9.
Zheng Kanglin, Wang Tao, Fan Ping, Li Ping. 《应用声学》 (Applied Acoustics), 2023, 42(1): 154-158
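Based on the random character of reflection and scattering when sound waves propagate through a mixture medium, this paper abstracts the mixture medium as a three-dimensional isotropic Markov chain, and abstracts sound propagation in the medium as a random process in which the wave performs a "random walk" through the three-dimensional Markov chain at the speed of sound. The probability that a given point in space receives the wave is used as an analogue of the received wave amplitude at that point, and the number of steps the wave takes to reach the point is used as an analogue of time on the received wave's time-domain curve. This theoretical model gives a reasonably good explanation of phenomena such as the "delayed peak" and the "coda wave" that occur when sound propagates through mixture media.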
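
The random-walk analogy in this abstract can be illustrated with a small sketch. It is not the authors' model: it assumes a simple cubic lattice with six equally likely unit steps and one step per time unit, and the function name `arrival_curve` and the receiver position are hypothetical choices made for illustration.

```python
from collections import defaultdict

# Six equally likely unit steps on a cubic lattice (an assumed discretization).
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def arrival_curve(receiver, n_steps):
    """Return [P_0, ..., P_n], where P_k is the exact probability that a
    walker started at the origin sits on `receiver` after k steps.

    In the abstract's analogy, P_k plays the role of received amplitude
    and k plays the role of time.
    """
    prob = {(0, 0, 0): 1.0}
    curve = [prob.get(receiver, 0.0)]
    for _ in range(n_steps):
        nxt = defaultdict(float)
        for (x, y, z), p in prob.items():
            share = p / 6.0  # isotropic: each neighbour gets an equal share
            for dx, dy, dz in STEPS:
                nxt[(x + dx, y + dy, z + dz)] += share
        prob = nxt
        curve.append(prob.get(receiver, 0.0))
    return curve


if __name__ == "__main__":
    curve = arrival_curve((3, 0, 0), 15)
    for step, p in enumerate(curve):
        print(step, round(p, 6))
```

For a receiver three lattice units away, the first nonzero value appears at step 3 (the direct path), the value at step 5 is already larger than at step 3, and nonzero values persist at later steps. In this toy setting those features correspond to the delayed peak and the coda described in the abstract.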

10.
Two experiments investigated the ability of 17 school-aged children to process purely temporal and spectro-temporal cues that signal changes in pitch. Percentage correct was measured for the discrimination of sinusoidal amplitude modulation rate (AMR) of broadband noise in experiment 1 and for the discrimination of fundamental frequency (F0) of broadband sine-phase harmonic complexes in experiment 2. The reference AMR was 100 Hz as was the reference F0. A child-friendly interface helped listeners to remain attentive to the task. Data were fitted using a maximum-likelihood technique that extracted threshold, slope, and lapse rate. All thresholds were subsequently standardized to a common d' value equal to 0.77. There were relatively large individual differences across listeners: eight had relatively adult-like thresholds in both tasks and nine had higher thresholds. However, these individual differences did not vary systematically with age, over the span of 6-16 yr. Thresholds were correlated across the two tasks and were about nine times finer for F0 discrimination than for AMR discrimination as has been previously observed in adults.
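
The maximum-likelihood fit of threshold, slope, and lapse rate mentioned in this abstract can be sketched roughly as follows. The abstract does not give the functional form, so the logistic shape, the 0.5 guess rate, and the names `psychometric` and `neg_log_likelihood` are all assumptions made for illustration, not the study's actual procedure.

```python
import math


def psychometric(x, threshold, slope, lapse, guess=0.5):
    """Probability correct at stimulus level x.

    Assumed form: a logistic core scaled between the guess rate and
    (1 - lapse). The 0.5 guess rate is a hypothetical two-alternative task.
    """
    core = 1.0 / (1.0 + math.exp(-slope * (x - threshold)))
    return guess + (1.0 - guess - lapse) * core


def neg_log_likelihood(params, data, guess=0.5):
    """Binomial negative log-likelihood of (level, n_correct, n_trials) data.

    Minimizing this over (threshold, slope, lapse) gives the
    maximum-likelihood estimates.
    """
    threshold, slope, lapse = params
    nll = 0.0
    for x, n_correct, n_trials in data:
        p = psychometric(x, threshold, slope, lapse, guess)
        p = min(max(p, 1e-9), 1.0 - 1e-9)  # keep the logarithms finite
        nll -= n_correct * math.log(p) + (n_trials - n_correct) * math.log(1.0 - p)
    return nll
```

Any general-purpose optimizer, or a coarse grid search over the three parameters, could be used to minimize `neg_log_likelihood`; which one the study used is not stated.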

11.
Recent studies have demonstrated that mothers exaggerate phonetic properties of infant-directed (ID) speech. However, these studies focused on a single acoustic dimension (frequency), whereas speech sounds are composed of multiple acoustic cues. Moreover, little is known about how mothers adjust phonetic properties of speech to children with hearing loss. This study examined mothers' production of frequency and duration cues to the American English tense/lax vowel contrast in speech to profoundly deaf (N = 14) and normal-hearing (N = 14) infants, and to an adult experimenter. First and second formant frequencies and vowel duration of tense (/i/, /u/) and lax (/I/, /ʊ/) vowels were measured. Results demonstrated that for both infant groups mothers hyperarticulated the acoustic vowel space and increased vowel duration in ID speech relative to adult-directed speech. Mean F2 values were decreased for the /u/ vowel and increased for the /I/ vowel, and vowel duration was longer for the /i/, /u/, and /I/ vowels in ID speech. However, neither acoustic cue differed in speech to hearing-impaired or normal-hearing infants. These results suggest that both formant frequencies and vowel duration that differentiate American English tense/lax vowel contrasts are modified in ID speech regardless of the hearing status of the addressee.

12.
Susceptibility to acoustic trauma in young and aged gerbils   (Total citations: 2; self-citations: 0; citations by others: 2)
The effect of age on susceptibility to noise-induced hearing loss (NIHL), the effect of gender on the interaction of age-related hearing loss (ARHL) and NIHL, and the relative contributions of ARHL and NIHL to total hearing loss are poorly understood. The issues are difficult to resolve empirically in human subjects because of lack of control over extrinsic variables and for ethical reasons. Accordingly, these issues were examined in a well-studied animal model of both ARHL and NIHL, the Mongolian gerbil. Animals were exposed to an intense tone (3.5 kHz, 113 dB SPL, 1 h) either as young adults (6-8 months) or near the end of the average lifespan of the species (34-38 months). Hearing thresholds were determined with the auditory brainstem response (ABR). ARHL was approximately 5-10 dB, with slightly more observed in males at 16 kHz (p<0.05). NIHL of approximately 15-20 dB was similar for the young and old groups, suggesting no differences in susceptibility as a function of age. There were no gender differences in NIHL. The relative contributions of ARHL and NIHL to total hearing loss in aged, noise-exposed gerbils were predicted by an addition of ARHL and NIHL in dB, similar to an international standard on hearing loss allocation, ISO-1999 [Determination of Occupational Noise Exposure and Estimation of Noise-Induced Hearing Impairment (1990)]. Previous evaluations of ISO-1999 using the gerbil animal model concluded that addition of ARHL and NIHL in dB overpredicts total hearing loss. However, in these studies, ARHL was large and nearly equal to NIHL. In the current study, where ARHL was much less than NIHL, addition of the two factors in dB, as recommended by ISO-1999, results in fairly accurate predictions of total hearing loss.
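
The dB-addition rule discussed in this abstract can be made concrete with a short sketch. The plain sum follows the abstract directly. The second function adds a compression term, ARHL × NIHL / 120, which is the form commonly cited for ISO 1999; it is included here as an assumption, since the abstract itself only mentions addition in dB. Function names and the 40 dB example values are hypothetical.

```python
def additive_total(arhl_db, nihl_db):
    """Total hearing loss predicted by simple addition in dB."""
    return arhl_db + nihl_db


def compressed_total(arhl_db, nihl_db):
    """Addition with a compression term (assumed form, see lead-in)."""
    return arhl_db + nihl_db - arhl_db * nihl_db / 120.0


if __name__ == "__main__":
    # Upper ends of the ranges reported in the abstract: ARHL 10 dB, NIHL 20 dB.
    print(additive_total(10, 20), compressed_total(10, 20))
    # Hypothetical case with large, nearly equal components.
    print(additive_total(40, 40), compressed_total(40, 40))
```

With small ARHL the two predictions differ by under 2 dB, while for large, nearly equal components they diverge by more than 10 dB. This is consistent with the abstract's remark that simple addition overpredicted in earlier studies where ARHL was large.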

13.
Beginning at the age of about 14 months, eight children who lived in a rhotic dialect region of the United States were recorded approximately every 2 months interacting with their parents. All were recorded until at least the age of 26 months, and some until the age of 31 months. Acoustic analyses of speech samples indicated that these young children acquired [ɹ] production ability at different ages for [ɹ] in different syllable positions. The children, as a group, had started to produce postvocalic and syllabic [ɹ] in an adult-like manner by the end of the recording sessions, but were not yet showing evidence of having acquired prevocalic [ɹ]. Articulatory limitations of young children are posited as a cause for the difference in development of [ɹ] according to syllable position. Specifically, it is speculated that adult-like prevocalic [ɹ] production requires two lingual constrictions: one in the mouth, and the other in the pharynx, while postvocalic and syllabic [ɹ] requires only one oral constriction. Two lingual constrictions could be difficult for young children to produce.

14.
Two recent accounts of the acoustic cues which specify place of articulation in syllable-initial stop consonants claim that they are located in the initial portions of the CV waveform and are context-free. Stevens and Blumstein [J. Acoust. Soc. Am. 64, 1358-1368 (1978)] have described the perceptually relevant spectral properties of these cues as static, while Kewley-Port [J. Acoust. Soc. Am. 73, 322-335 (1983)] describes these cues as dynamic. Three perceptual experiments were conducted to test predictions derived from these accounts. Experiment 1 confirmed that acoustic cues for place of articulation are located in the initial 20-40 ms of natural stop-vowel syllables. Next, short synthetic CV's modeled after natural syllables were generated using either a digital, parallel-resonance synthesizer in experiment 2 or linear prediction synthesis in experiment 3. One set of synthetic stimuli preserved the static spectral properties proposed by Stevens and Blumstein. Another set of synthetic stimuli preserved the dynamic properties suggested by Kewley-Port. Listeners in both experiments identified place of articulation significantly better from stimuli which preserved dynamic acoustic properties than from those based on static onset spectra. Evidently, the dynamic structure of the initial stop-vowel articulatory gesture can be preserved in context-free acoustic cues which listeners use to identify place of articulation.

15.
Sperm whales (Physeter macrocephalus) have learned to remove fish from demersal longline gear deployments off the eastern Gulf of Alaska, and are often observed to arrive at a site after a haul begins, suggesting a response to potential acoustic cues like fishing-gear strum, hydraulic winch tones, and propeller cavitation. Passive acoustic recorders attached to anchorlines have permitted continuous monitoring of the ambient noise environment before and during fishing hauls. Timing and tracking analyses of sperm whale acoustic activity during three encounters indicate that cavitation arising from changes in ship propeller speeds is associated with interruptions in nearby sperm whale dive cycles and changes in acoustically derived positions. This conclusion has been tested by cycling a vessel engine and noting the arrival of whales by the vessel, even when the vessel is not next to fishing gear. No evidence of response from activation of ship hydraulics or fishing gear strum has been found to date.

16.
This paper investigates the functional relationship between articulatory variability and stability of acoustic cues during American English /r/ production. The analysis of articulatory movement data on seven subjects shows that the extent of intrasubject articulatory variability along any given articulatory direction is strongly and inversely related to a measure of acoustic stability (the extent of acoustic variation that displacing the articulators in this direction would produce). The presence and direction of this relationship is consistent with a speech motor control mechanism that uses a third formant frequency (F3) target; i.e., the final articulatory variability is lower for those articulatory directions most relevant to determining the F3 value. In contrast, no consistent relationship across speakers and phonetic contexts was found between hypothesized vocal-tract target variables and articulatory variability. Furthermore, simulations of two speakers' productions using the DIVA model of speech production, in conjunction with a novel speaker-specific vocal-tract model derived from magnetic resonance imaging data, mimic the observed range of articulatory gestures for each subject, while exhibiting the same articulatory/acoustic relations as those observed experimentally. Overall these results provide evidence for a common control scheme that utilizes an acoustic, rather than articulatory, target specification for American English /r/.

17.
Although some cochlear implant (CI) listeners can show good word recognition accuracy, it is not clear how they perceive and use the various acoustic cues that contribute to phonetic perceptions. In this study, the use of acoustic cues was assessed for normal-hearing (NH) listeners in optimal and spectrally degraded conditions, and also for CI listeners. Two experiments tested the tense/lax vowel contrast (varying in formant structure, vowel-inherent spectral change, and vowel duration) and the word-final fricative voicing contrast (varying in F1 transition, vowel duration, consonant duration, and consonant voicing). Identification results were modeled using mixed-effects logistic regression. These experiments suggested that under spectrally-degraded conditions, NH listeners decrease their use of formant cues and increase their use of durational cues. Compared to NH listeners, CI listeners showed decreased use of spectral cues like formant structure and formant change and consonant voicing, and showed greater use of durational cues (especially for the fricative contrast). The results suggest that although NH and CI listeners may show similar accuracy on basic tests of word, phoneme or feature recognition, they may be using different perceptual strategies in the process.

18.
This study examined the impact on speech processing of regional phonetic/phonological variation in the listener's native language. The perception of the /e/-/ɛ/ and /o/-/ɔ/ contrasts, produced by standard but not southern French native speakers, was investigated in these two populations. A repetition priming experiment showed that the latter but not the former perceived words such as /epe/ and /epɛ/ as homophones. In contrast, both groups perceived the two words of /o/-/ɔ/ minimal pairs (/pom/-/pɔm/) as being distinct. Thus, standard-French words can be perceived differently depending on the listener's regional accent.

19.
This study examined production of word-final English /p/ and /b/ by subjects whose native language does not possess voiced stops in word-final position. Native Chinese adults resembled native English adults, native English children, and native Chinese children in producing /p/ with greater peak oral air pressure than /b/. However, unlike subjects in the other groups, the Chinese adults' /b/ was sometimes misidentified as /p/. This may have occurred, at least in part, because the Chinese adults produced a much smaller difference between /p/ and /b/ in labial closure duration and voicing than the other subjects. The English adults sustained voicing in /b/ significantly longer than subjects in the other three groups. To help determine the basis for this ability, the shape of oral air pressure waveforms was examined systematically. The percentage of "delayed" and "bimodal" waveforms, in which pressure stopped increasing, or decreased, prior to the release of labial constriction, was calculated for each group. Only the English adults showed more such waveforms for /b/ than /p/. Voicing continued 18 ms longer in /b/ tokens with delayed and bimodal waveforms than in tokens in which oral pressure increased continuously. The duration of closure voicing was correlated with the rate at which pressure increased in the English adults' /b/ waveforms. Previous aerodynamic modeling has shown that delayed and bimodal waveforms may result from an active enlargement of the supraglottal cavity. This, together with the pattern of between-group differences observed here, suggests that the English adults learned to enlarge the supraglottal cavity to sustain voicing in /b/. It appears that neither the children nor the Chinese adults had as yet acquired this skill.

20.
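This paper briefly reviews the development history and current state of surface acoustic wave (SAW) research in China, and identifies the priorities for its future development.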
