Similar articles
 20 similar articles found
1.
Auditory event-related potentials (ERPs) to speech sounds were recorded in a demanding selective attention task to measure how the mismatch negativity (MMN) was affected by attention, deviant feature, and task relevance, i.e., whether the feature was target or nontarget type. With vowel-consonant-vowel (VCV) disyllables randomly presented to the right and left ears, subjects attended to the VCVs in one ear. In different conditions, the subjects responded to either intensity or phoneme deviance in the consonant. The position of the deviance within the VCV also varied, being in the first (VC), second (CV), or both (VC and CV) formant-transition regions. The MMN amplitudes were larger for deviants in the attended ear. Task relevance affected the MMNs to intensity and phoneme deviants differently. Target-type intensity deviants yielded larger MMNs than nontarget types. For phoneme deviants there was no main effect of task relevance, but there was a critical interaction with deviance position. The both position gave the largest MMN amplitudes for target-type phoneme deviants, as it did for target- and nontarget-type intensity deviants. The MMN for nontarget-type phoneme deviants, however, showed an inverse pattern such that the MMN for the both position had the smallest amplitude despite its greater spectro-temporal deviance and its greater detectability when it was the target. These data indicate that the MMN reflects differences in phonetic structure as well as differences in acoustic spectral-energy structure of the deviant stimuli. Furthermore, the task relevance effects demonstrate that top-down controls not only affect the amplitude of the MMN, but can reverse the pattern of MMN amplitudes among different stimuli.  相似文献   

2.
This study used ultrasound imaging to examine the cross-sectional shape of the tongue during the production of the ten English vowels (see text) in two consonant contexts--/p/ and /s/--and at two scan angles--anterior and posterior. Results were compared with models of sagittal tongue shape. A newly built transducer holder and head restraint maintained the ultrasound transducer in a fixed position inferior to the mandible at a chosen location and angle. The transducer was free to move only in a superior/inferior direction, and demonstrated reliable tracking of the jaw. Since the tongue is anisotropic along its length, anterior and posterior scan angles were examined to identify differences in tongue shape. Similarly, the coarticulatory effects of the sibilant /s/ versus the bilabial /p/ were examined, to assess variability of intrinsic tongue shape for the vowels. Results showed that the subject's midsagittal tongue grooving was almost universal for the vowels. Posterior grooves were deeper than anterior grooves. In /s/ context, posterior tongue grooves were shallower than in /p/ context. Anteriorly, /s/ context caused deeper grooves for low vowels. Cross-sectional tongue shape varied with tongue position similarly to sagittal tongue shape.

3.
Healthy volunteers without symptoms of either gastroesophageal reflux or laryngopharyngeal reflux and without abnormalities on laryngologic examination were recruited for esophageal pH monitoring. Thirty subjects underwent ambulatory 24-hour double-channel pH probe monitoring to establish normative data for the upper probe, which was positioned just above the upper esophageal sphincter. Data were analyzed excluding meal periods plus 2 minutes of postprandial time. The mean, standard deviation, median, and 95th percentile were calculated for various reflux parameters for the following intervals: total study duration, upright time, supine time, and postprandial time. Normal subjects display physiologic reflux above the upper esophageal sphincter (median one event, 95th percentile 6.9 events), and 80.4% of these events occur in the upright position. The reflux area index (RAI) appears to be the most useful parameter to measure laryngopharyngeal reflux severity.  相似文献   
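The summary statistics named in this abstract (mean, standard deviation, median, 95th percentile) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `summarize`, the linear-interpolation percentile rule, and the example counts are all assumptions made for demonstration.

```python
import statistics


def summarize(values):
    """Return mean, sample SD, median, and 95th percentile of a list of numbers."""
    ordered = sorted(values)
    # 95th percentile by linear interpolation between the two closest ranks
    # (an assumed convention; the abstract does not state which rule was used).
    rank = 0.95 * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    p95 = ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])
    return {
        "mean": statistics.mean(ordered),
        "sd": statistics.stdev(ordered),
        "median": statistics.median(ordered),
        "p95": p95,
    }


# Hypothetical reflux-event counts for illustration only; not data from the study.
example_counts = [0, 0, 1, 1, 1, 2, 3, 4, 7]
summary = summarize(example_counts)
```

With skewed count data like this, the median and 95th percentile describe the normal range better than the mean, which may be why the abstract reports those two values.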

4.
A multistream phoneme recognition framework is proposed based on forming streams from different spectrotemporal modulations of speech. Phoneme posterior probabilities were estimated from each stream separately and combined at the output level. A statistical model of the final estimated posterior probabilities is used to characterize the system performance. During operation, the best fusion architecture is chosen automatically to maximize the similarity of the output statistics to those of the clean condition. Results on phoneme recognition from noisy speech indicate the effectiveness of the proposed method.
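Output-level combination of per-stream posteriors can be sketched as below. The abstract does not say which fusion rule the framework uses, so the weighted average here is an assumed stand-in, and `fuse_posteriors` and the example values are hypothetical.

```python
def fuse_posteriors(streams, weights=None):
    """Combine per-stream phoneme posteriors by a weighted average.

    streams: list of posterior vectors, one per stream, all over the same classes.
    weights: optional per-stream weights; defaults to equal weighting.
    """
    n_streams = len(streams)
    if weights is None:
        weights = [1.0 / n_streams] * n_streams
    n_classes = len(streams[0])
    fused = [
        sum(w * stream[k] for w, stream in zip(weights, streams))
        for k in range(n_classes)
    ]
    # Renormalize so the fused vector is again a probability distribution.
    total = sum(fused)
    return [p / total for p in fused]


# Hypothetical posteriors over three phoneme classes from two streams.
stream_a = [0.7, 0.2, 0.1]
stream_b = [0.5, 0.4, 0.1]
fused = fuse_posteriors([stream_a, stream_b])
```

Choosing among several such fusion rules by comparing output statistics with clean-condition statistics, as the abstract describes, would sit on top of a function like this one.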

5.
The association of chronic dysphonia with gastroesophageal reflux has been reported in the otolaryngologic literature; unfortunately, these reports are primarily anecdotal. Because of the difficulty in documenting reflux, patients are often left without a definitive diagnosis or therapy. The purpose of this paper is to report on an objective method of documenting gastroesophageal reflux disease by using ambulatory esophageal and hypopharyngeal pH monitoring. Seventy percent of the subjects who underwent simultaneous dual-probe pH monitoring evidenced reflux in the hypopharynx in both an upright and supine position. All of the subjects had erythema of the arytenoid cartilages on indirect examination, so this appears to be of clinical diagnostic significance.

6.
This study used glossometry to examine the position of the tongue and the velocity of its movements in vowels spoken normally and at a self-selected fast rate. The subject in experiment 1 showed lingual undershoot for stressed vowels in "a big again" and "a bob again." The tongue was lower for /I/ and higher for /a/ at the fast rate than at the normal rate. The stressed vowels exerted an effect on unstressed vowels: The tongue was lower in the schwas that preceded and followed /a/ than /I/. Only one of the three subjects in experiment 2 showed no lingual undershoot for fast-rate /I/. The tongue was higher at the fast rate than at the normal rate in the schwas flanking /I/ so that the displacement was less at the fast rate than at the normal rate. Another talker increased the peak velocity of tongue movements at the fast rate and showed no undershoot for /a/. Multiple regression analyses showed that the timing of movements for successive phonetic segments accounted well for undershoot in only one of the three subjects. The results suggest that in order to model the effects of speaking rate on the tongue movements used in forming stressed vowels, it will be necessary to take into account: (1) how much vowels are shortened at a fast rate; (2) how much the peak velocity of tongue movements is increased, if at all; and (3) the position of the tongue before and after the stressed vowels. All three factors are likely to be influenced by how clearly the talker wishes to speak.

7.
Two experiments are reported which explore variables that may complicate the interpretation of phoneme boundary data from hearing-impaired listeners. Fourteen synthetic consonant-vowel syllables comprising a /ba-da-ga/ continuum were used as stimuli. The first experiment examined the influence of presentation level and ear of presentation in normal-hearing subjects. Only small differences in the phoneme boundaries and labeling functions were observed between ears and across presentation levels. Thus monaural presentation and relatively high signal level do not appear to be complicating factors in research with hearing-impaired listeners, at least for these stimuli. The second experiment described a test procedure for obtaining phoneme boundaries in some hearing-impaired listeners that controlled for between-subject sources of variation unrelated to hearing impairment and delineated the effects of spectral shaping imposed by the hearing impairment on the labeling functions. Labeling data were obtained from unilaterally hearing-impaired listeners under three test conditions: in the normal ear without any signal distortion; in the normal ear listening through a spectrum shaper that was set to match the subject's suprathreshold audiometric configuration; and in the impaired ear. The reduction in the audibility of the distinctive acoustic/phonetic cues seemed to explain all or part of the effects of the hearing impairment on the labeling functions of some subjects. For many other subjects, however, other forms of distortion in addition to reduced audibility seemed to affect their labeling behavior.

8.
The purpose of this study was to determine the acoustic effects on voice of three tasks of cognitive workload and their possible relationship to stress. Acoustic analysis was used to measure stress and workload in four experimental tasks and two experiments. In the first experiment, subjects performed cognitive workload tasks under a stressful condition, performing the tasks as rapidly as possible without errors and with the knowledge that any errors committed would reduce their grade in a course. The second condition was to perform the same tasks but without the condition of stress related to the final grade. Four testing conditions were included. One was a baseline measure in which subjects spelled the Spanish alphabet. The second was the reading of a tongue twister, the third was the reading of a tongue twister with delayed auditory feedback, and the fourth was spelling the Spanish alphabet in reverse order. In each condition the subjects prolonged the vowel /a/ for approximately 5 s. All subjects performed a test to determine their overall level of anxiety. The results suggest that in conditions of experimentally induced stress there is an increase in the fundamental frequency (F0) relative to baseline, an increase in jitter and shimmer, an increase in the high-frequency harmonic energy, and a decrease in spectral noise.

9.
The American English phoneme /r/ has long been associated with large amounts of articulatory variability during production. This paper investigates the hypothesis that the articulatory variations used by a speaker to produce /r/ in different contexts exhibit systematic tradeoffs, or articulatory trading relations, that act to maintain a relatively stable acoustic signal despite the large variations in vocal tract shape. Acoustic and articulatory recordings were collected from seven speakers producing /r/ in five phonetic contexts. For every speaker, the different articulator configurations used to produce /r/ in the different phonetic contexts showed systematic tradeoffs, as evidenced by significant correlations between the positions of transducers mounted on the tongue. Analysis of acoustic and articulatory variabilities revealed that these tradeoffs act to reduce acoustic variability, thus allowing relatively large contextual variations in vocal tract shape for /r/ without seriously degrading the primary acoustic cue. Furthermore, some subjects appeared to use completely different articulatory gestures to produce /r/ in different phonetic contexts. When viewed in light of current models of speech movement control, these results appear to favor models that utilize an acoustic or auditory target for each phoneme over models that utilize a vocal tract shape target for each phoneme.  相似文献   

10.
This study was designed to test the hypothesis that the kinematic manipulations used by speakers in different speaking conditions are influenced by kinematic performance limits. A range of kinematic parameter values was elicited by having seven subjects produce cyclical CV movements of lips, tongue blade and tongue dorsum (/ba/, /da/, /ga/), at rates ranging from 1 to 6 Hz. The resulting measures were used to establish speaker- and articulator-specific kinematic performance spaces, defined by movement duration, displacement and peak speed. These data were compared with speech movement data produced by the subjects in several different speaking conditions in the companion study (Perkell et al., 2002). The amount of overlap of the speech data and cyclical data varied across speakers, from almost no overlap to complete overlap. Generally, for a given movement duration, speech movements were larger than cyclical movements, indicating that the speech movements were faster and were produced with greater effort, according to the performance space analysis. It was hypothesized that the cyclical movements of the tongue and lips were slower than the speech movements because they were more constrained by (coupled to) the relatively massive mandible. To test this hypothesis, a comparison was made of cyclical movements in maxillary versus mandibular frames of reference. The results indicate that the cyclical movements were not strongly constrained by mandible movements. The overall results generally indicate that the cyclical task did not succeed in defining the upper limits of kinematic performance spaces within which the speech data were confined. Thus, the hypothesis that performance limits influence speech kinematics could not be tested effectively. 
The differences between the speech and cyclical movements may be due to other factors, such as differences in speakers' "skill" with the two types of movement, or the size of the movements--the speech movements were larger, probably because of a well-defined target for the primary, stressed vowel.

11.
Spontaneous otoacoustic emissions (SOAEs) were studied in humans during and after postural changes. The subjects were tilted from upright to a recumbent position (head down 30 degrees) and upright again in a 6-min period. The SOAEs were recorded continuously and analyzed off-line. The tilting caused a change in the SOAE spectrum for all subjects. Frequency shifts of 10 Hz, together with changes of amplitude (5 dB) and width (5 Hz), were typically observed. However, these changes were observed in both directions (including the appearance and disappearance of emission peaks). The most substantial changes occurred in the frequency region below 2 kHz. An increase of the intracranial pressure, and consequently of the intracochlear fluid pressure, is thought to result in an increased stiffness of the cochlear windows, which is probably mainly responsible for the SOAE changes observed after the downward turn. The time for the spectrum to regain stability after a postural change differed between the two maneuvers: 1 min for the downward and less than 10 s for the upward turn.  相似文献   

12.
Differences in discriminability of stimuli near phoneme boundaries and findings from selective adaptation have been used to argue for the existence of neurophysiological mechanisms--feature detectors--which mediate the perception of speech and speechlike sounds. A detection theory model was used in order to discover whether or not the phoneme boundary effect and the shift in phoneme boundary after adaptation might rather be attributable to changes in response bias. This model was applied in the analysis of phoneme identifications of three sets of stimuli before and after adaptation. While the origins for the phoneme boundary effect appear to lie below the level of response bias, findings suggest that identification changes after adaptation may be due solely to shifts in criterion, rather than changes at the sensory level.  相似文献   
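The distinction this abstract draws between a sensory change and a response-bias change can be illustrated with the standard equal-variance Gaussian detection model. The abstract does not specify which detection theory model was used, so this is an assumed textbook form; the function name and the example rates are hypothetical.

```python
from statistics import NormalDist


def sensitivity_and_criterion(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    Both rates must lie strictly between 0 and 1, because the inverse
    normal CDF is undefined at the endpoints.
    """
    z = NormalDist().inv_cdf
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa            # sensitivity: separation of the two distributions
    criterion = -0.5 * (z_hit + z_fa)  # response bias: 0 means unbiased
    return d_prime, criterion


# Hypothetical rates for illustration only; not data from the study.
d_prime, criterion = sensitivity_and_criterion(0.84, 0.16)
```

In these terms, a boundary shift after adaptation that leaves `d_prime` unchanged while moving `criterion` would count as a change in response bias rather than a change at the sensory level.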

13.
Ultrasound imaging of the tongue is increasingly common in speech production research. However, there has been little standardization regarding the quantification and statistical analysis of ultrasound data. In linguistic studies, researchers may want to determine whether the tongue shape for an articulation under two different conditions (e.g., consonants in word-final versus word-medial position) is the same or different. This paper demonstrates how the smoothing spline ANOVA (SS ANOVA) can be applied to the comparison of tongue curves [Gu, Smoothing Spline ANOVA Models (Springer, New York, 2002)]. The SS ANOVA is a technique for determining whether or not there are significant differences between the smoothing splines that are the best fits for two data sets being compared. If the interaction term of the SS ANOVA model is statistically significant, then the groups have different shapes. Since the interaction may be significant even if only a small section of the curves is different (e.g., the tongue root is the same, but the tip of one group is raised), Bayesian confidence intervals are used to determine which sections of the curves are statistically different. SS ANOVAs are illustrated with some data comparing obstruents produced in word-final and word-medial coda position.

14.
Hearing thresholds measured with high-frequency resolution show a quasiperiodic change in level called threshold fine structure (or microstructure). The effect of this fine structure on loudness perception over a range of stimulus levels was investigated in 12 subjects. Three different approaches were used. Individual hearing thresholds and equal loudness contours were measured in eight subjects using loudness-matching paradigms. In addition, the loudness growth of sinusoids was observed at frequencies associated with individual minima or maxima in the hearing threshold from five subjects using a loudness-matching paradigm. At low levels, loudness growth depended on the position of the test- or reference-tone frequency within the threshold fine structure. The slope of loudness growth differs by 0.2 dB/dB when an identical test tone is compared with two different reference tones, i.e., a difference in loudness growth of 2 dB per 10-dB change in stimulus. Finally, loudness growth was measured for the same five subjects using categorical loudness scaling as a direct-scaling technique with no reference tone instead of the loudness-matching procedures. Overall, an influence of hearing-threshold fine structure on loudness perception of sinusoids was observable for stimulus levels up to 40 dB SPL--independent of the procedure used. Possible implications of fine structure for loudness measurements and other psychoacoustic experiments, such as different compression within threshold minima and maxima, are discussed.  相似文献   

15.
Previous research has shown that speech recognition differences between native and proficient non-native listeners emerge under suboptimal conditions. Current evidence has suggested that the key deficit that underlies this disproportionate effect of unfavorable listening conditions for non-native listeners is their less effective use of compensatory information at higher levels of processing to recover from information loss at the phoneme identification level. The present study investigated whether this non-native disadvantage could be overcome if enhancements at various levels of processing were presented in combination. Native and non-native listeners were presented with English sentences in which the final word varied in predictability and which were produced in either plain or clear speech. Results showed that, relative to the low-predictability-plain-speech baseline condition, non-native listener final word recognition improved only when both semantic and acoustic enhancements were available (high-predictability-clear-speech). In contrast, the native listeners benefited from each source of enhancement separately and in combination. These results suggest that native and non-native listeners apply similar strategies for speech-in-noise perception: The crucial difference is in the signal clarity required for contextual information to be effective, rather than in an inability of non-native listeners to take advantage of this contextual information per se.

16.
Jenny Iwarsson. Journal of Voice, 2001, 15(3): 384-394
The configuration of the body resulting from inhalatory behavior is sometimes considered a factor of relevance to voice production in singing and speaking pedagogy and in clinical voice therapy. The present investigation compares two different inhalatory behaviors: (1) with a "paradoxical" inward movement of the abdominal wall, and (2) with an expansion of the abdominal wall, both with regard to the effect on vertical laryngeal position during the subsequent phonation. Seventeen male and 17 female healthy, vocally untrained subjects participated. No instructions were given regarding movements of the rib cage. Inhaled air volume, as measured by respiratory inductive plethysmography, was controlled to reach 70% inspiratory capacity. Vertical laryngeal position was recorded by two-channel electroglottography during the subsequent vowel production. A significant effect was found; the abdomen-out condition was associated with a higher laryngeal position than the abdomen-in condition. This result apparently contradicted a hypothesis that an expansion of the abdominal wall would allow the diaphragm to descend deeper in the torso, thereby increasing the tracheal pull, which would result in a lower laryngeal position. In a post-hoc experiment including 6 of the subjects, body posture was studied by digital video recordings, revealing that the two inhalatory modes were clearly associated with postural changes affecting laryngeal position. The "paradoxical" inward movement of the abdominal wall was associated with a recession of the chin toward the neck, such that the larynx appeared in a lower position in the neck, for reasons of a postural change. The results suggest that the laryngeal position can be affected by the inhalatory behavior if no attention is paid to posture, implying that instructions from clinicians and pedagogues regarding breathing behavior must be carefully formulated and adjusted in order to ensure that the intended goals are reached.

17.
Two multichannel tactile devices for the hearing impaired were compared in speech perception tasks of varying levels of complexity. Both devices implemented the "vocoder" principle in their stimulus processing: One device had a 16-element linear vibratory array worn on the forearm and displayed activity in 16 overlapping frequency channels; the other device delivered tactile stimulation to a linear array of 16 electrodes worn on the abdomen. Subjects were tested in several phoneme discrimination tasks, ranging from discrimination of pairs of words differing in only one phoneme under tactile aid alone conditions to identification of stimuli in a larger set under tactile aid alone, lipreading alone, and lipreading plus tactile aid conditions. Results showed both devices to be better transmitters of manner and voicing features of articulation than of place features, when tested in single-item tasks. No systematic differences in performance with the two devices were observed. However, in a connected discourse tracking task, the vibrotactile vocoder in conjunction with lipreading yielded much greater improvements over lipreading alone than did the electrotactile vocoder. One possible explanation for this difference in performance, the inclusion of a noise suppression circuit in the electrotactile aid, was evaluated, but did not appear to account for the differences observed. Results are discussed in terms of additional differences between the two devices that may influence performance.  相似文献   

18.
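A Preliminary Study of a Spectrum-Based Method for Objectifying Tongue Color in Traditional Chinese Medicine   Cited by: 5 (self-citations: 0, citations by others: 5)
To reflect the physiological and pathological changes of the tongue more objectively, a new spectrum-based method for objectifying tongue color is explored. The reflectance spectrum of the tongue body is measured under the standard illumination and viewing conditions recommended by the International Commission on Illumination (CIE), and the measured spectrum is normalized to remove, as far as possible, the influence of variations in the light source and other measurement conditions. Applying spectroscopic methods to tongue diagnosis allows tongue-color information to be analyzed precisely at multiple wavelengths. This characterizes tongue color more objectively and accurately and provides more microscopic information related to a person's health, such as tongue tissue composition, microcirculation state, and tongue tissue structure. It overcomes the shortcomings of earlier methods, which could not reflect tissue composition, structure, and microcirculation state accurately and in detail, and whose quantitative indices were imprecise and of little practical use. Preliminary experiments collecting spectral data from objects and human tongues show that, compared with the existing colorimetric method, the spectral method reflects the physiological and pathological information carried by the tongue more comprehensively, deeply, and objectively and highlights subtle differences in tongue color. It can also be characterized by accurately measurable feature parameters and linked to existing physical standards, thereby advancing research on the objectification of tongue color.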
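The normalization step mentioned in this abstract can be sketched as follows. The abstract does not state which normalization was applied, so dividing by the summed reflectance is an assumption chosen for illustration; `normalize_spectrum` and the sample values are hypothetical.

```python
def normalize_spectrum(reflectance):
    """Scale a reflectance spectrum so its samples sum to 1.

    This removes any overall intensity factor, such as a brighter or
    dimmer light source, while keeping the spectral shape.
    """
    total = sum(reflectance)
    if total == 0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [r / total for r in reflectance]


# Hypothetical reflectance samples at four wavelengths; not measured data.
spectrum = [0.2, 0.4, 0.6, 0.8]
# The same surface under a light source twice as bright.
brighter = [2 * r for r in spectrum]

normalized = normalize_spectrum(spectrum)
normalized_brighter = normalize_spectrum(brighter)
```

After normalization the two spectra coincide, which is the sense in which normalizing reduces the influence of the light source.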

19.
This investigation evaluated the effect of method of elicitation on the maximum sustained durations of /s/, /z/, and /a/ in eighty female subjects. The effect of order of elicitation on the maximum sustained durations of /s/ and /z/, and the effects of method and order of elicitation on the s/z ratio, were analyzed as well. Comparisons of three methods of measuring maximum phoneme duration (MPD) were performed. This study found method of elicitation of MPD to have an effect on the maximum sustained durations of /s/, /z/, and /a/ but not on the s/z ratio. The three methods of measuring MPD were found to correlate highly with one another. Results of this study point to the need for standardized procedures for elicitation of MPD.
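The s/z ratio discussed in this abstract can be computed as in the sketch below. Taking the longest trial as the maximum phoneme duration is only one possible measurement rule and is assumed here; the abstract compares three methods without naming them. The function name and durations are hypothetical.

```python
def s_z_ratio(s_durations, z_durations):
    """Ratio of the longest sustained /s/ to the longest sustained /z/.

    Each argument is a list of trial durations in seconds.
    """
    return max(s_durations) / max(z_durations)


# Hypothetical trial durations in seconds; not data from the study.
s_trials = [18.2, 20.0, 19.1]
z_trials = [16.0, 20.0, 17.5]
ratio = s_z_ratio(s_trials, z_trials)
```

Because the ratio divides one duration by another, an elicitation method that lengthens /s/ and /z/ by a similar proportion would change both durations while leaving the ratio roughly unchanged, consistent with the finding reported above.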

20.
This paper examines some of the factors that can affect the magnitude of comodulation masking release (CMR). In experiment I, psychometric functions were measured for the detection of a 1-kHz sinusoidal signal in a "multiplied" narrow-band noise centered at 1 kHz (reference condition) and the same noise with two comodulated flanking bands added. The functions were slightly steeper for the comodulated than for the reference masker. Thus CMRs measured at a high percent correct point were slightly (0.4 dB) larger than CMRs measured at a low percent correct point. Large individual differences were found for the reference masker but not for the comodulated masker. Experiment II compared CMRs obtained with narrow-band Gaussian noise and multiplied noise, using a single flanking band. For a flanking band remote from the signal frequency, the CMRs were smaller and more variable for the multiplied noise than for the Gaussian noise. This variability arose mainly from individual differences in the reference condition. Experiment III compared growth-of-masking functions for a signal centered in Gaussian noise and multiplied noise. Thresholds were lower for the multiplied than for the Gaussian noise, and the differences were greatest at high noise levels. The results are consistent with the idea that, for multiplied noise, some subjects can detect a change in the distribution of the envelope of the stimulus, when the signal is added to the masker. Such subjects have low thresholds in the reference condition, and give small CMRs. Other subjects are relatively insensitive to this cue. They have higher thresholds in the reference condition, and give larger CMRs. For Gaussian noise, thresholds for the reference condition are relatively stable across subjects and CMRs tend to be substantial, even for flanking-band frequencies remote from the signal frequency.  相似文献   
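The relation between reference thresholds and CMR described in this abstract follows from the usual definition of CMR as the reference threshold minus the comodulated threshold. The sketch below uses that definition with hypothetical listener values; the function name and numbers are illustrative assumptions.

```python
def cmr_db(reference_threshold_db, comodulated_threshold_db):
    """Comodulation masking release in dB.

    Positive values mean the signal is easier to detect when
    comodulated flanking bands are added to the masker.
    """
    return reference_threshold_db - comodulated_threshold_db


# Hypothetical thresholds in dB; not data from the study.
# Listener A detects the envelope-distribution cue, so the reference threshold is low.
listener_a = cmr_db(reference_threshold_db=60.0, comodulated_threshold_db=55.0)
# Listener B is insensitive to that cue, so the reference threshold is higher.
listener_b = cmr_db(reference_threshold_db=66.0, comodulated_threshold_db=55.0)
```

With similar comodulated thresholds, the listener with the lower reference threshold necessarily shows the smaller CMR, which matches the pattern of individual differences reported for multiplied noise.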
