Found 20 similar articles (search time: 15 ms)
1.
A wavelet representation of speech was used to display the instantaneous amplitude and phase within 1/4-octave frequency bands, representing the envelope and the carrier within each band. Adding stationary noise alters the wavelet pattern, which can be understood as a combination of three simultaneously occurring subeffects: two effects on the wavelet levels (one systematic and one stochastic) and one effect on the wavelet phases. Specific types of signal processing were applied to speech, which allowed each effect to be either included or excluded. The impact of each effect (and of combinations) on speech intelligibility was measured with CVC (consonant-vowel-consonant) words. It appeared that the systematic level effect (i.e., the increase of each speech wavelet intensity with the mean noise intensity) has the most degrading effect on speech intelligibility, which is in accordance with measures such as the modulation transfer function and the speech transmission index. However, the introduction of stochastic level fluctuations and the disturbance of the carrier phase also contribute seriously to reduced intelligibility in noise. It is argued that these stochastic effects are responsible for the limited success of spectral subtraction as a means to improve speech intelligibility. The results can provide clues for effective noise suppression with respect to intelligibility.
2.
The application of the ideal binary mask to an auditory mixture has been shown to yield substantial improvements in intelligibility. This mask is commonly applied to the time-frequency (T-F) representation of a mixture signal and eliminates portions of a signal below a signal-to-noise-ratio (SNR) threshold while allowing others to pass through intact. The factors influencing intelligibility of ideal binary-masked speech are not well understood and are examined in the present study. Specifically, the effects of the local SNR threshold, input SNR level, masker type, and errors introduced in estimating the ideal mask are examined. Consistent with previous studies, intelligibility of binary-masked stimuli is quite high even at -10 dB SNR for all maskers tested. Performance was affected the most when masker-dominated T-F units were wrongly labeled as target-dominated T-F units. Performance plateaued near 100% correct for SNR thresholds ranging from -20 to 5 dB. The existence of the plateau region suggests that it is the pattern of the ideal binary mask that matters the most rather than the local SNR of each T-F unit. This pattern directs the listener's attention to where the target is and enables them to segregate speech effectively in multitalker environments.
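The masking rule described in this abstract can be sketched in a few lines. This is a minimal illustration, not the study's implementation: the function names, the toy 2x2 power values and the -6 dB default threshold are all assumptions, and the target and masker powers are taken as known separately before mixing (which is what makes the mask "ideal").

```python
import math


def ideal_binary_mask(target_power, masker_power, threshold_db=-6.0):
    """Return a 0/1 mask: 1 where the local SNR of a T-F unit exceeds threshold_db.

    target_power and masker_power are 2-D lists (frequency channels x time
    frames) of non-negative powers. The threshold default is illustrative.
    """
    mask = []
    for t_row, m_row in zip(target_power, masker_power):
        row = []
        for t, m in zip(t_row, m_row):
            if t == 0.0:
                local_snr_db = -math.inf   # no target energy: always masker-dominated
            elif m == 0.0:
                local_snr_db = math.inf    # no masker energy: always target-dominated
            else:
                local_snr_db = 10.0 * math.log10(t / m)
            row.append(1 if local_snr_db > threshold_db else 0)
        mask.append(row)
    return mask


def apply_mask(mixture, mask):
    """Keep the mixture T-F units where the mask is 1 and zero out the rest."""
    return [[x * g for x, g in zip(x_row, g_row)]
            for x_row, g_row in zip(mixture, mask)]


# Hypothetical toy example: 2 frequency channels x 2 time frames.
target = [[4.0, 0.1], [0.0, 1.0]]
masker = [[1.0, 1.0], [2.0, 1.0]]
mask_strict = ideal_binary_mask(target, masker, threshold_db=-6.0)
mask_lenient = ideal_binary_mask(target, masker, threshold_db=-20.0)
```

Lowering the threshold lets more units through. The plateau reported in the abstract suggests that, over a wide range of thresholds, the resulting mask pattern matters more than the exact threshold value.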
3.
A single-channel algorithm is proposed for noise reduction in cochlear implants. The proposed algorithm is based on subspace principles and projects the noisy speech vector onto "signal" and "noise" subspaces. An estimate of the clean signal is made by retaining only the components in the signal subspace. The performance of the subspace reduction algorithm is evaluated using 14 subjects wearing the Clarion device. Results indicated that the subspace algorithm produced significant improvements in sentence recognition scores compared to the subjects' daily strategy, at least in stationary noise. Further work is needed to extend the subspace algorithm to nonstationary noise environments.
4.
5.
I. Introduction. Acoustic shock waves (ASW) are an important phenomenon in nonlinear acoustics. Experimental results have shown that when an aircraft engine inlet operates near the sonic condition, the very strong noise generated by the fans can be reduced greatly owing to the formation of ASW at the throat of the inlet [1]. An ASW is a discontinuity of acoustic variables, which is different from the shock waves occurring in high-speed steady flow in ducts. The former's intensity is much less than the latter's. Furthermore, the position and intensity of an ASW always change with time. …
6.
A comparative study of phase-shifting algorithms in digital speckle pattern interferometry (total citations: 1; self-citations: 0; citations by others: 1)
Digital speckle pattern interferometry (DSPI) is a tool for making qualitative as well as quantitative measurements of the deformation of objects. Phase-shifting algorithms in DSPI are useful for extracting quantitative deformation data from the system. Comparative studies of the different phase-shifting algorithms in DSPI for object deformation measurement are presented. Static and quasi-dynamic deformation of the object can be measured using these algorithms. An error-compensating five-step phase-shifting method is used for the algorithms.
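As a rough illustration of a five-step phase-shifting calculation, the sketch below uses the widely cited Hariharan five-frame formula with nominal phase steps of π/2. Whether the paper uses exactly this variant is an assumption, and the function names and simulated intensity values are hypothetical. The deformation phase at a pixel is taken as the wrapped difference between the phases recovered before and after loading.

```python
import math


def five_step_phase(i1, i2, i3, i4, i5):
    """Phase from five frames with nominal shifts -pi, -pi/2, 0, +pi/2, +pi.

    Hariharan five-frame formula: tan(phi) = 2*(I2 - I4) / (2*I3 - I1 - I5).
    atan2 keeps the correct quadrant, so the result lies in (-pi, pi].
    """
    return math.atan2(2.0 * (i2 - i4), 2.0 * i3 - i1 - i5)


def simulate_frames(phase, bias=1.0, modulation=0.5):
    """Simulate the five intensities at one pixel (illustrative values only)."""
    shifts = [-math.pi, -math.pi / 2.0, 0.0, math.pi / 2.0, math.pi]
    return [bias + modulation * math.cos(phase + s) for s in shifts]


def wrap(angle):
    """Wrap an angle to (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


# Hypothetical speckle phases at one pixel before and after deformation.
phase_before = 0.4
phase_after = 1.1
recovered_before = five_step_phase(*simulate_frames(phase_before))
recovered_after = five_step_phase(*simulate_frames(phase_after))
deformation_phase = wrap(recovered_after - recovered_before)
```

The result is still wrapped to one fringe, so a full-field measurement would need a separate phase-unwrapping step, which is not shown here.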
7.
8.
An adaptive leaky normalized least-mean-square (NLMS) algorithm has been developed to optimize stability and performance of active noise cancellation systems. The research addresses LMS filter performance issues related to insufficient excitation, nonstationary noise fields, and time-varying signal-to-noise ratio. The adaptive leaky NLMS algorithm is based on a Lyapunov tuning approach in which three candidate algorithms, each of which is a function of the instantaneous measured reference input, measurement noise variance, and filter length, are shown to provide varying degrees of tradeoff between stability and noise reduction performance. Each algorithm is evaluated experimentally for reduction of low frequency noise in communication headsets, and stability and noise reduction performance are compared with that of traditional NLMS and fixed-leakage NLMS algorithms. Acoustic measurements are made in a specially designed acoustic test cell which is based on the original work of Ryan et al. ["Enclosure for low frequency assessment of active noise reducing circumaural headsets and hearing protection," Can. Acoust. 21, 19-20 (1993)] and which provides a highly controlled and uniform acoustic environment. The stability and performance of the active noise reduction system, including a prototype communication headset, are investigated for a variety of noise sources ranging from stationary tonal noise to highly nonstationary measured F-16 aircraft noise over a 20 dB dynamic range. Results demonstrate significant improvements in stability of Lyapunov-tuned LMS algorithms over traditional leaky or nonleaky normalized algorithms, while providing noise reduction performance equivalent to that of the NLMS algorithm for idealized noise fields.
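For orientation, the sketch below shows the fixed-leakage NLMS baseline that the abstract compares against, not the Lyapunov-tuned adaptive-leakage algorithm itself, whose tuning laws are not given here. The update form `w <- (1 - step*leak)*w + step*e*x/(eps + x.x)` is one common formulation. The function name, parameter values and toy 4-tap path are assumptions made for illustration.

```python
import random


def leaky_nlms(reference, desired, num_taps=4, step=0.5, leak=0.0, eps=1e-8):
    """Fixed-leakage NLMS adaptive filter (leak=0.0 gives plain NLMS).

    Returns the final weights and the error signal. All defaults are
    illustrative and are not taken from the paper.
    """
    weights = [0.0] * num_taps
    taps = [0.0] * num_taps  # most recent reference samples, newest first
    errors = []
    for x, d in zip(reference, desired):
        taps = [x] + taps[:-1]
        y = sum(w * t for w, t in zip(weights, taps))
        e = d - y
        norm = eps + sum(t * t for t in taps)  # input-power normalization
        weights = [(1.0 - step * leak) * w + step * e * t / norm
                   for w, t in zip(weights, taps)]
        errors.append(e)
    return weights, errors


# Hypothetical noise-free identification of a 4-tap path from white input.
random.seed(0)
true_path = [0.5, -0.3, 0.2, 0.1]
reference = [random.uniform(-1.0, 1.0) for _ in range(2000)]
desired = []
history = [0.0] * len(true_path)
for x in reference:
    history = [x] + history[:-1]
    desired.append(sum(c * h for c, h in zip(true_path, history)))

w_plain, _ = leaky_nlms(reference, desired, leak=0.0)
w_leaky, _ = leaky_nlms(reference, desired, leak=1e-3)
```

Leakage pulls the weights slightly toward zero. That costs a small bias in exchange for robustness when the reference is poorly exciting, which is the stability and performance tradeoff the abstract refers to.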
9.
Cornelis B Moonen M Wouters J 《The Journal of the Acoustical Society of America》2012,131(6):4743-4755
This paper evaluates noise reduction techniques in bilateral and binaural hearing aids. Adaptive implementations (on a real-time test platform) of the bilateral and binaural speech distortion weighted multichannel Wiener filter (SDW-MWF) and a competing bilateral fixed beamformer are evaluated. As the SDW-MWF relies on a voice activity detector (VAD), a realistic binaural VAD is also included. The test subjects (both normal hearing subjects and hearing aid users) are tested by an adaptive speech reception threshold (SRT) test in different spatial scenarios, including a realistic cafeteria scenario with nonstationary noise. The main conclusions are: (a) The binaural SDW-MWF can further improve the SRT (up to 2 dB) over the improvements achieved by bilateral algorithms, although a significant difference is only achievable if the binaural SDW-MWF uses a perfect VAD. However, in the cafeteria scenario only the binaural SDW-MWF achieves a significant SRT improvement (2.6 dB with perfect VAD, 2.2 dB with real VAD), for the group of hearing aid users. (b) There is no significant degradation when using a real VAD at the input signal-to-noise ratio (SNR) levels where the hearing aid users reach their SRT. (c) The bilateral SDW-MWF achieves no SRT improvements compared to the bilateral fixed beamformer.
10.
The effect of head-induced interaural time delay (ITD) and interaural level differences (ILD) on binaural speech intelligibility in noise was studied for listeners with symmetrical and asymmetrical sensorineural hearing losses. The material, recorded with a KEMAR manikin in an anechoic room, consisted of speech, presented from the front (0 degrees), and noise, presented at azimuths of 0 degrees, 30 degrees, and 90 degrees. Derived noise signals, containing either only ITD or only ILD, were generated using a computer. For both groups of subjects, speech-reception thresholds (SRT) for sentences in noise were determined as a function of: (1) noise azimuth, (2) binaural cue, and (3) an interaural difference in overall presentation level, simulating the effect of a monaural hearing aid. Comparison of the mean results with corresponding data obtained previously from normal-hearing listeners shows that the hearing impaired have a 2.5 dB higher SRT in noise when both speech and noise are presented from the front, and 2.6-5.1 dB less binaural gain when the noise azimuth is changed from 0 degrees to 90 degrees. The gain due to ILD varies among the hearing-impaired listeners between 0 dB and normal values of 7 dB or more. It depends on the high-frequency hearing loss at the side presented with the most favorable signal-to-noise (S/N) ratio. The gain due to ITD is nearly normal for the symmetrically impaired (4.2 dB, compared with 4.7 dB for the normal hearing), but only 2.5 dB in the case of asymmetrical impairment. When ITD is introduced in noise already containing ILD, the resulting gain is 2-2.5 dB for all groups. The only marked effect of the interaural difference in overall presentation level is a reduction of the gain due to ILD when the level at the ear with the better S/N ratio is decreased. This implies that an optimal monaural hearing aid (with a moderate gain) will hardly interfere with unmasking through ITD, while it may increase the gain due to ILD by preventing or diminishing threshold effects.
11.
I. Introduction. Anti-sound is also called active noise control (ANC). Its basic idea, presented in Lueg's patent in 1936 [1], is that noise reduction is obtained by use of the signal processing of the preliminary sound source (i.e. noise source) to form coherence influence between the preliminary sound source and the secondary sound source (i.e. anti-sound source). There are some advantages of anti-sound, such as active control, less effect on the characteristics of the noise source, and more reduction of low-frequency noise. In recent years, there were a lot of theoretical and e…
12.
Scott SK Rosen S Lang H Wise RJ 《The Journal of the Acoustical Society of America》2006,120(2):1075-1083
Functional imaging studies of speech perception in the human brain have identified a key role for auditory association areas in the temporal lobes (bilateral superior temporal gyri and sulci) in the perceptual processing of the speech signal. This is extended to suggest some functional specialization within this bilateral system, with a particular role for the left anterior superior temporal sulcus (STS) in processing intelligible speech. In the current study, noise-vocoded speech was used to vary the intelligibility of speech parametrically. This replicated the finding of a selective response to intelligibility in speech in the left anterior superior temporal sulcus, in contrast to the posterior superior temporal sulcus, which showed a response profile insensitive to the degree of intelligibility. These results are related to theories of functional organization in the human auditory system, which have indicated that there are separate processing streams, with different functional roles, running anterior and posterior to primary auditory cortex. Specifically, it is suggested that an anterior stream processing intelligibility can be distinguished from a posterior stream associated with transient representations, important in spoken repetition and working memory.
13.
14.
van Wijngaarden SJ Steeneken HJ Houtgast T 《The Journal of the Acoustical Society of America》2002,112(6):3004-3013
The intelligibility of speech pronounced by non-native talkers is generally lower than speech pronounced by native talkers, especially under adverse conditions, such as high levels of background noise. The effect of foreign accent on speech intelligibility was investigated quantitatively through a series of experiments involving voices of 15 talkers, differing in language background, age of second-language (L2) acquisition and experience with the target language (Dutch). Overall speech intelligibility of L2 talkers in noise is predicted with a reasonable accuracy from accent ratings by native listeners, as well as from the self-ratings for proficiency of L2 talkers. For non-native speech, unlike native speech, the intelligibility of short messages (sentences) cannot be fully predicted by phoneme-based intelligibility tests. Although incorrect recognition of specific phonemes certainly occurs as a result of foreign accent, the effect of reduced phoneme recognition on the intelligibility of sentences may range from severe to virtually absent, depending on (for instance) the speech-to-noise ratio. Objective acoustic-phonetic analyses of accented speech were also carried out, but satisfactory overall predictions of speech intelligibility could not be obtained with relatively simple acoustic-phonetic measures.
15.
van Wijngaarden SJ Steeneken HJ Houtgast T 《The Journal of the Acoustical Society of America》2002,111(4):1906-1916
When listening to languages learned at a later age, speech intelligibility is generally lower than when listening to one's native language. The main purpose of this study is to quantify speech intelligibility in noise for specific populations of non-native listeners, only broadly addressing the underlying perceptual and linguistic processing. An easy method is sought to extend these quantitative findings to other listener populations. Dutch subjects listening to German and English speech, ranging from reasonable to excellent proficiency in these languages, were found to require a 1-7 dB better speech-to-noise ratio to obtain 50% sentence intelligibility than native listeners. Also, the psychometric function for sentence recognition in noise was found to be shallower for non-native than for native listeners (worst-case slope around the 50% point of 7.5%/dB, compared to 12.6%/dB for native listeners). Differences between native and non-native speech intelligibility are largely predicted by linguistic entropy estimates as derived from a letter guessing task. Less effective use of context effects (especially semantic redundancy) explains the reduced speech intelligibility for non-native listeners. While measuring speech intelligibility for many different populations of listeners (languages, linguistic experience) may be prohibitively time consuming, obtaining predictions of non-native intelligibility from linguistic entropy may help to extend the results of this study to other listener populations.
16.
T. Houtgast 《Applied Acoustics》1981,14(1):15-25
Intelligibility tests were performed by teachers and pupils in classrooms under a variety of (road traffic) noise conditions. The intelligibility scores are found to deteriorate at (indoor) noise levels exceeding a critical value of -15 dB with regard to a teacher's long-term (reverberant) speech level. The implications for external noise levels are discussed: typically, an external noise level of 50 dB(A) would imply that the critical indoor level is exceeded for about 20 per cent of teachers.
17.
Large-scale outdoor field measurements were carried out on a residential building to assess the noise levels caused by pass-by trains that run on a nearby viaduct. The experimental results were compared with different schemes for predicting noise from trains. The octave band sound power levels of the train passing by, which are required as input parameters for the Nordic prediction method for train noise (NMT), CSTB 92 and ISO 9613-2 provided in the Mithra software, were determined by an inversion method. The method of calculation of railway noise (CRN) from the UK gives the best agreement with the measured results. The NMT prediction scheme also provides a good prediction of the general trend of the experimental data, but it always overestimates the measured noise levels. As far as the quantitative agreement with experimental data is concerned, the CSTB 92 and ISO 9613-2 prediction schemes are comparatively less satisfactory.
18.
Hu Y 《The Journal of the Acoustical Society of America》2010,127(5):3145-3153
Recent research results show that combined electric and acoustic stimulation (EAS) significantly improves speech recognition in noise, and it is generally established that access to the improved F0 representation of target speech, along with the glimpse cues, provides the EAS benefits. Under noisy listening conditions, noise signals degrade these important cues by introducing undesired temporal-frequency components and corrupting the harmonic structure. In this study, the potential of combining noise reduction and harmonics regeneration techniques was investigated to further improve speech intelligibility in noise by providing improved beneficial cues for EAS. Three hypotheses were tested: (1) noise reduction methods can improve speech intelligibility in noise for EAS; (2) harmonics regeneration after noise reduction can further improve speech intelligibility in noise for EAS; and (3) harmonics sideband constraints in the frequency domain (or equivalently, amplitude modulation in the temporal domain), even deterministic ones, can provide additional benefits. Test results demonstrate that combining noise reduction and harmonics regeneration can significantly improve speech recognition in noise for EAS, and that it is also beneficial to preserve the harmonics sidebands under adverse listening conditions. This finding warrants further work on the development of algorithms that regenerate harmonics and the related sidebands for EAS processing under noisy conditions.
19.
George EL Festen JM Houtgast T 《The Journal of the Acoustical Society of America》2008,124(2):1269-1277
Listening conditions in everyday life typically include a combination of reverberation and nonstationary background noise. It is well known that sentence intelligibility is adversely affected by these factors. To assess their combined effects, an approach is introduced which combines two methods of predicting speech intelligibility, the extended speech intelligibility index (ESII) and the speech transmission index. First, the effects of reverberation on nonstationary noise (i.e., reduction of masker modulations) and on speech modulations are evaluated separately. Subsequently, the ESII is applied to predict the speech reception threshold (SRT) in the masker with reduced modulations. To validate this approach, SRTs were measured for ten normal-hearing listeners, in various combinations of nonstationary noise and artificially created reverberation. After taking the characteristics of the speech corpus into account, results show that the approach accurately predicts SRTs in nonstationary noise and reverberation for normal-hearing listeners. Furthermore, it is shown that, when reverberation is present, the benefit from masker fluctuations may be substantially reduced.
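The reverberation side of this approach rests on the modulation transfer function (MTF) that underlies the speech transmission index. The sketch below shows the classical closed-form MTF for an exponentially decaying reverberant field combined with stationary noise, followed by the usual conversion to a transmission index. It is background only, not the paper's combined ESII procedure, which handles nonstationary maskers. The function names and example values are assumptions.

```python
import math


def modulation_transfer(mod_freq_hz, rt60_s, snr_db):
    """Classical MTF: modulation reduction from reverberation times that from noise.

    m(F) = 1 / sqrt(1 + (2*pi*F*T / 13.8)^2) * 1 / (1 + 10^(-SNR/10)),
    where T is the reverberation time in seconds and the noise is stationary.
    """
    reverb = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * mod_freq_hz * rt60_s / 13.8) ** 2)
    noise = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb * noise


def transmission_index(m):
    """Map a modulation index m to a 0..1 transmission index.

    Uses the apparent SNR 10*log10(m / (1 - m)), clipped to [-15, +15] dB.
    """
    if m >= 1.0:
        return 1.0
    if m <= 0.0:
        return 0.0
    apparent_snr_db = 10.0 * math.log10(m / (1.0 - m))
    clipped = max(-15.0, min(15.0, apparent_snr_db))
    return (clipped + 15.0) / 30.0


# Hypothetical conditions: 4 Hz modulation, with and without reverberation.
m_dry_noisy = modulation_transfer(4.0, rt60_s=0.0, snr_db=0.0)
m_short_reverb = modulation_transfer(4.0, rt60_s=0.5, snr_db=100.0)
m_long_reverb = modulation_transfer(4.0, rt60_s=1.0, snr_db=100.0)
```

Longer reverberation times lower the modulation index at a given modulation frequency. The same smoothing applied to a fluctuating masker is what reduces the benefit from masker fluctuations mentioned in the abstract.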
20.
Hilkhuysen G Gaubitch N Brookes M Huckvale M 《The Journal of the Acoustical Society of America》2012,131(1):531-539
The effects on speech intelligibility of three different noise reduction algorithms (spectral subtraction, minimal mean squared error spectral estimation, and subspace analysis) were evaluated in two types of noise (car and babble) over a 12 dB range of signal-to-noise ratios (SNRs). Results from these listening experiments showed that most algorithms deteriorated intelligibility scores. Modeling of the results with a logit-shaped psychometric function showed that the degradation in intelligibility scores was largely congruent with a constant shift in SNR, although some additional degradation was observed at two SNRs, suggesting a limited interaction between the effects of noise suppression and SNR.
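The "constant shift in SNR" interpretation can be made concrete with a logistic psychometric function. This is a minimal sketch under stated assumptions: the parameterization by SRT and midpoint slope, the function names, and the numbers (-5 dB SRT, 10%/dB slope, 1 dB shift) are illustrative and are not taken from the paper.

```python
import math


def psychometric(snr_db, srt_db, slope_per_db):
    """Logistic proportion-correct curve.

    srt_db is the SNR giving 50% correct. slope_per_db is the slope at that
    point as a proportion per dB; a logistic with rate k has midpoint slope
    k / 4, hence k = 4 * slope_per_db.
    """
    k = 4.0 * slope_per_db
    return 1.0 / (1.0 + math.exp(-k * (snr_db - srt_db)))


def with_processing(snr_db, srt_db, slope_per_db, shift_db):
    """Model a processing penalty as a constant SNR shift.

    Processed speech at snr_db is assumed to score like unprocessed speech
    at (snr_db - shift_db).
    """
    return psychometric(snr_db - shift_db, srt_db, slope_per_db)


# Hypothetical listener: SRT of -5 dB, slope of 10%/dB, 1 dB processing penalty.
unprocessed_at_srt = psychometric(-5.0, srt_db=-5.0, slope_per_db=0.10)
processed_at_srt = with_processing(-5.0, srt_db=-5.0, slope_per_db=0.10, shift_db=1.0)
processed_recovered = with_processing(-4.0, srt_db=-5.0, slope_per_db=0.10, shift_db=1.0)
```

Under this model the processed curve is the unprocessed curve moved sideways, so the loss in percentage points is largest near the SRT and shrinks toward floor and ceiling.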