Similar literature
 20 similar articles found (search time: 62 ms)
1.
Wan B, Small GW. The Analyst, 2011, 136(2): 309-316
A novel synthetic data generation methodology is described for use in the development of pattern recognition classifiers that are employed for the automated detection of volatile organic compounds (VOCs) during infrared remote sensing measurements. The approach used is passive Fourier transform infrared spectrometry implemented in a downward-looking mode on an aircraft platform. A key issue in developing this methodology in practice is the need for example data that can be used to train the classifiers. To replace the time-consuming and costly collection of training data in the field, this work implements a strategy for taking laboratory analyte spectra and superimposing them on background spectra collected from the air. The resulting synthetic spectra can be used to train the classifiers. This methodology is tested by developing classifiers for ethanol and methanol, two prevalent VOCs in wide industrial use. The classifiers are successfully tested with data collected from the aircraft during controlled releases of ethanol and during a methanol release from an industrial facility. For both ethanol and methanol, missed detections in the aircraft data are in the range of 4 to 5%, with false positive detections ranging from 0.1 to 0.3%.
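
The superposition strategy described in this abstract can be illustrated with a minimal Python sketch. It assumes a purely additive model in which a scaled laboratory analyte signature is added to a field background spectrum; the abstract does not give the actual radiometric model, and the function names `make_synthetic`, `build_training_set`, and `error_rates` are hypothetical.

```python
import random


def make_synthetic(background, analyte, scale):
    """Add a scaled analyte signature to a background spectrum.

    Assumption: a simple additive model on a shared wavenumber grid.
    """
    if len(background) != len(analyte):
        raise ValueError("spectra must share the same wavenumber grid")
    return [b + scale * a for b, a in zip(background, analyte)]


def build_training_set(backgrounds, analyte, n_active, rng, scale_range=(0.2, 1.0)):
    """Return (spectrum, label) pairs.

    Every background is kept as an analyte-inactive pattern (label 0);
    n_active synthetic analyte-active patterns (label 1) are appended.
    """
    data = [(list(b), 0) for b in backgrounds]
    for _ in range(n_active):
        bg = rng.choice(backgrounds)
        scale = rng.uniform(*scale_range)  # hypothetical signal-strength range
        data.append((make_synthetic(bg, analyte, scale), 1))
    return data


def error_rates(y_true, y_pred):
    """Return (missed-detection rate, false-positive rate) as fractions."""
    actives = [p for t, p in zip(y_true, y_pred) if t == 1]
    inactives = [p for t, p in zip(y_true, y_pred) if t == 0]
    missed = sum(1 for p in actives if p == 0) / len(actives)
    false_pos = sum(1 for p in inactives if p == 1) / len(inactives)
    return missed, false_pos
```

The two rates returned by `error_rates` correspond to the missed-detection and false-positive figures quoted in the abstract, expressed as fractions rather than percentages.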

2.
High-pass (HP) digital filtering and second-derivative (SD) filtering are evaluated as methods of removing background contributions from spectra collected by passive Fourier transform infrared spectrometry. In measurements performed with a downward-looking spectrometer mounted on an aircraft platform, the effects of non-constant background radiance from the ground make it challenging to build automated classifiers for detecting an analyte of interest. Applying HP digital filtering to the spectra to remove background contributions is evaluated as a strategy to help improve classifier performance. This methodology is tested by building classifiers for detecting heated ethanol plumes released from a portable emission stack. The classifiers are trained with data collected on the ground with the spectrometer viewing the plumes against a synthetic backdrop designed to simulate a terrestrial radiance source. The resulting classifiers are tested with data collected by the same spectrometer mounted on an aircraft flying over the emission stack. Support vector machines are employed as a classification algorithm with HP filtered spectra used as input patterns. Butterworth filters are used to implement HP digital filtering, while Savitzky-Golay filters are used to implement SD filtering. Significant improvement in classification performance is achieved by use of the HP filters. Because of variation in backgrounds between the training and prediction data, the best classifier obtained with unfiltered spectra is unable to detect ethanol in 37% of the test cases. HP filtering of spectra with an optimized Butterworth filter (order 8, cutoff frequency 0.060) improves the prediction results, resulting in no missed ethanol detections and false positive rates of less than 0.4%.
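
To make the second-derivative filtering concrete, the following Python sketch applies the standard 5-point quadratic Savitzky-Golay second-derivative coefficients, (2, -1, -2, -1, 2)/7, to a spectrum. The window size and the name `second_derivative` are assumptions made for illustration; the abstract does not state the Savitzky-Golay settings used, and the Butterworth high-pass filter is not reproduced here.

```python
# 5-point quadratic/cubic Savitzky-Golay coefficients for the second
# derivative, assuming unit spacing between spectral points.
SG5_SECOND_DERIV = (2 / 7, -1 / 7, -2 / 7, -1 / 7, 2 / 7)


def second_derivative(spectrum):
    """Return the Savitzky-Golay second derivative of a spectrum.

    Only positions where the full 5-point window fits are returned, so the
    output is 4 points shorter than the input.
    """
    half = len(SG5_SECOND_DERIV) // 2
    out = []
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        out.append(sum(c * v for c, v in zip(SG5_SECOND_DERIV, window)))
    return out
```

A constant offset or a linearly sloping baseline is mapped to zero by this filter, which is the sense in which derivative filtering removes slowly varying background. In practice, library routines such as SciPy's `savgol_filter` and `butter` would typically be used instead of hand-written coefficients.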

3.
Wavelet analysis is developed as a preprocessing tool for use in removing background information from near-infrared (near-IR) single-beam spectra before the construction of multivariate calibration models. Three data sets collected with three different near-IR spectrometers are investigated that involve the determination of physiological levels of glucose (1-30 mM) in a simulated biological matrix containing alanine, ascorbate, lactate, triacetin, and urea in phosphate buffer. A factorial design is employed to optimize the specific wavelet function used and the level of decomposition applied, in addition to the spectral range and number of latent variables associated with a partial least-squares calibration model. The prediction performance of the computed models is studied with separate data acquired after the collection of the calibration spectra. This evaluation includes one data set collected over a period of more than 6 months. Preprocessing with wavelet analysis is also compared to the calculation of second-derivative spectra. Over the three data sets evaluated, wavelet analysis is observed to produce better-performing calibration models, with improvements in concentration predictions on the order of 30% being realized relative to models based on either second-derivative spectra or spectra preprocessed with simple additive and multiplicative scaling correction. This methodology allows the construction of stable calibrations directly with single-beam spectra, thereby eliminating the need for the collection of a separate background or reference spectrum.

4.
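The absorbance of mixtures of tyrosine, tryptophan, and phenylalanine was measured over the wavelength range 200-400 nm. The spectral data were preprocessed by continuous wavelet transform (CWT) and then modeled by support vector regression (SVR), establishing an SVR-based ultraviolet spectrophotometric method for the simultaneous determination of tyrosine, tryptophan, and phenylalanine. The method was applied to simulated samples. The recoveries of the predicted results for tyrosine, tryptophan, and phenylalanine were between 98% and 102%, indicating that the determinations are accurate.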

5.
The application of chemometrics to extract information on the cis/trans structure of alkenes from infrared (IR) spectra is introduced. For data from the OMNIC IR spectral database, two feature selection methods, Fisher ratios and genetic algorithm-partial least squares (GA-PLS), and two classification methods, support vector machine (SVM) and probabilistic neural network (PNN), were used to obtain optimized classifiers. Finally, spectra from other IR databases were used to evaluate the optimized classifiers. Both the SVM and PNN optimized classifiers were shown to give good predictions of the cis and trans structures of alkenes.

6.
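This paper develops a nondestructive method for discriminating green tea varieties by combining near-infrared (NIR) spectroscopy with chemometrics. NIR spectra were acquired for green tea samples of 8 varieties, the effects of single and optimized combined spectral preprocessing methods on the spectra were compared, and variety discrimination models were built with unsupervised principal component analysis (PCA) and supervised linear discriminant analysis (LDA). The results show that, compared with single preprocessing methods, the optimized combined preprocessing gives better discrimination accuracy. Standard normal variate (SNV) preprocessing removes the spectral scattering effects caused by the nonuniform size of the tea samples, while first-derivative preprocessing removes the varying background, reduces the influence of baseline drift, and highlights the useful information in the spectra. Combining the two preprocessing methods with unsupervised PCA allows fairly accurate discrimination of green tea varieties, with an accuracy of 75.0%. In addition, applying supervised LDA to the raw spectral data achieves a discrimination accuracy of 100%, but this method requires prior knowledge of the classes. NIR spectroscopy combined with chemometrics can therefore provide rapid, nondestructive discrimination of green tea varieties.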
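
The combination of standard normal variate (SNV) scaling and first-derivative preprocessing named in this abstract can be sketched in Python as follows. This is a minimal illustration assuming evenly spaced spectral points and a simple central-difference derivative; the abstract does not state how the derivative was computed, and the function names are hypothetical.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: centre each spectrum on its own mean and
    scale it by its own (sample) standard deviation."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(v - mean) / sd for v in spectrum]


def first_derivative(spectrum):
    """Central-difference first derivative, assuming unit point spacing.

    The two end points are dropped because they lack a neighbour on one side.
    """
    return [(spectrum[i + 1] - spectrum[i - 1]) / 2.0
            for i in range(1, len(spectrum) - 1)]


def preprocess(spectrum):
    """Apply SNV followed by the first derivative."""
    return first_derivative(snv(spectrum))
```

SNV makes a spectrum insensitive to a positive multiplicative factor and an additive offset, which is how it counters scatter differences between samples, while the derivative cancels any remaining constant baseline.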

7.
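A near-infrared chemical fingerprint analysis method based on the wavelet transform   Total citations: 6, self-citations: 0, citations by others: 6
A new rapid, nondestructive method for chemical fingerprint analysis is proposed. Exploiting the multiresolution analysis property of the wavelet transform, near-infrared (NIR) spectra are decomposed by wavelets to extract the chemical feature information of the samples, which is then visualized digitally to construct NIR fingerprints in which sample pattern features can be recognized directly. The method was applied to quality testing of the Chinese medicinal material Danshen (Salvia miltiorrhiza); the results agree with those obtained from chromatographic fingerprints, and the method quickly and effectively identifies differences among Danshen quality patterns. It is expected to develop into a rapid method for analyzing and testing the quality of natural products.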

8.
A novel method based on continuous wavelet transform (CWT) was proposed as a preprocessing tool for the near-infrared (NIR) spectra. Due to the property of the vanishing moments of the wavelet, the fluctuating background of the NIR spectra can be successfully removed through convolution of the spectra with an appropriate wavelet function. The vanishing moments of a wavelet and the scale parameter are two key factors that govern the result of the background elimination. The result of its application to both the simulated spectra and the NIR spectra of tobacco samples demonstrates that CWT is a competitive tool for removing fluctuating background in spectra.
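
The role of vanishing moments can be illustrated with a short Python sketch that convolves a spectrum with a sampled Mexican-hat wavelet at a single scale. The choice of wavelet, the scale, the kernel width, and the function names are assumptions made for illustration and are not taken from the paper. The sampled kernel is re-centred so that it sums exactly to zero; together with its symmetry this makes it annihilate any constant or linear background.

```python
import math


def mexican_hat_kernel(scale, half_width):
    """Sampled Mexican-hat wavelet (second derivative of a Gaussian),
    with the mean subtracted so the discrete kernel sums to zero."""
    raw = []
    for k in range(-half_width, half_width + 1):
        t = k / scale
        raw.append((1.0 - t * t) * math.exp(-0.5 * t * t))
    mean = sum(raw) / len(raw)
    return [w - mean for w in raw]


def cwt_single_scale(signal, kernel):
    """Convolve a signal with a symmetric kernel at one scale.

    Only positions where the kernel fits entirely inside the signal are
    returned, so the output is shorter than the input.
    """
    half = len(kernel) // 2
    out = []
    for i in range(half, len(signal) - half):
        out.append(sum(w * signal[i + k - half] for k, w in enumerate(kernel)))
    return out
```

Because the transform is linear, the coefficients of a peak riding on a sloping baseline equal the coefficients of the peak alone; the scale parameter then controls which peak widths are emphasized.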

9.
Aiming at the prediction of pleiotropic effects of drugs, we have investigated the multilabel classification of drugs that have one or more of 100 different kinds of activity labels. Structural feature representation of each drug molecule was based on the topological fragment spectra method, which was proposed in our previous work. Support vector machine (SVM) was used for the classification and the prediction of their activity classes. Multilabel classification was carried out by a set of the SVM classifiers. The collective SVM classifiers were trained with a training set of 59,180 compounds and validated by another set (validation set) of 29,590 compounds. For a test set that consists of 9,864 compounds, the classifiers correctly classified 80.8% of the drugs into their own active classes. The SVM classifiers also successfully performed predictions of the activity spectra for multilabel compounds.

10.
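The discrete wavelet transform, wavelet packet transform, Fourier transform, and discrete cosine transform were each combined with principal component regression (PCR) to form four discrete-transform PCR methods, and computer programs for these methods were written. The methods were applied to the overlapping ultraviolet absorption spectra of mixtures of p-nitrotoluene, p-nitrophenol, and p-nitroaniline. The results show that the discrete-transform PCR methods outperform plain PCR: the relative standard error of prediction between the predicted and actual mass concentrations of the samples decreased from 3.81% to about 1.11%.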
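
As an illustration of using a discrete transform to condense a spectrum before regression, the Python sketch below performs a Haar discrete wavelet decomposition and keeps only the approximation coefficients. The Haar wavelet, the number of levels, and the function names are assumptions; the abstract does not say which wavelet or which coefficients the authors retained.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail) coefficient lists, each half as long as
    the input, which must have even length.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_compress(signal, levels):
    """Keep only the approximation coefficients after `levels` Haar steps,
    shrinking the number of variables by a factor of 2**levels."""
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = haar_step(approx)
    return approx
```

The compressed coefficients would then replace the raw absorbances as the predictor variables of the regression model.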

11.
Locally linear embedding (LLE) is introduced here as a nonlinear compression method for near-infrared reflectance spectra of endometrial tissue sections. The LLE has been evaluated by using support vector machine (SVM) classifiers and the projected difference resolution (PDR) method. Synthetic data sets devised to resemble near-infrared spectra of tissue samples were used to characterize the performance of the LLE. The LLE was compared with the principal component compression (PCC) method to evaluate nonlinear versus linear compression. For a set of real tissue samples, if the compressed data were not range-scaled prior to SVM classification, the principal component compressed data gave an average prediction rate of 39 ± 2%, while the LLE gave 94 ± 2%; if range-scaled after compression, the LLE and PCC performed comparably, with maximum average prediction values of 94 ± 2% and 93 ± 2%, respectively. The SVM without compression yielded a classification rate of 92 ± 2%. The prediction accuracy was consistent with the PDR results. Without second-derivative preprocessing, the classification rates were 90 ± 3%, 89 ± 2%, and 78 ± 2% for SVM classification with LLE compression, PCC compression, and no compression, respectively.

12.
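The absorption spectra of mixtures of phenol, aniline, and benzoic acid were measured over the wavelength range 200-400 nm. The spectral data were processed by discrete wavelet transform (DWT) and then modeled by support vector regression (SVR), establishing a combined discrete wavelet transform-support vector regression (DWT-SVR) method. The method was applied to the simultaneous determination of phenol, aniline, and benzoic acid in simulated samples and polluted water samples, with satisfactory results.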

13.
The quantification of diclofenac sodium (DS) in tablets was performed using partial least squares (PLS) models based on FTIR ATR (Fourier transform infrared attenuated total reflection) and FT-Raman spectra. Separate calibration models were built for two groups of tablets, standard and sustained release, containing different excipients. To compare the predictive ability of these models the relative standard errors of prediction (RSEP) were calculated. In the case of DS determination from the Raman data, RSEP error values in the range of 2.4-2.8% (2.7-2.9%) for the calibration (validation) data sets were obtained. For ATR models constructed using spectra registered three times for each sample, RSEP errors in the range of 3.6-3.7% (4.2-4.3%) were found. These errors decreased to 2.8% (3.0%) when spectra collected six times were applied. Five commercial products containing 25, 50, 75 and 100 mg of DS per tablet were quantified. Concentrations derived from the elaborated models correlated strongly with the results of reference analyses and gave recoveries of 99.1-101.3% and 99.1-101.7% for the ATR and Raman data, respectively. Although both spectroscopic techniques can be used as fast and convenient alternatives to the standard pharmacopeial methods of DS quantification in solid dosage forms, in the case of the ATR technique, it is necessary to repeat measurements at least a few times to obtain acceptable quantification errors.
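
The abstract reports RSEP and recovery values without defining them. The Python sketch below uses one commonly used definition of the relative standard error of prediction, RSEP = 100 × sqrt(Σ(predicted − reference)² / Σ reference²), and the usual recovery ratio; both definitions are assumptions here, since the paper may use a different normalization, and the function names are hypothetical.

```python
import math


def rsep(predicted, reference):
    """Relative standard error of prediction, in percent.

    Assumed definition: 100 * sqrt(sum((p - r)^2) / sum(r^2)).
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    num = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    den = sum(r ** 2 for r in reference)
    return 100.0 * math.sqrt(num / den)


def recovery(predicted, reference):
    """Per-sample recovery in percent: 100 * predicted / reference."""
    return [100.0 * p / r for p, r in zip(predicted, reference)]
```

With this definition, predictions that are uniformly 10% high give an RSEP of 10% and recoveries of 110%.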

14.
Seventeen preprocessing methods have been applied to 524 low-resolution mass spectra of steroids before computing classifiers, which can recognize substructures in a steroid molecule. Best classification results have been obtained by normalization of peak height to local ion current (predictive abilities 85%) and with “significant” spectra that contain only the “most important” peaks (predictive abilities 84%).

15.
To date, few efforts have been made to take simultaneous advantage of the local nature of spectral data in both the time and frequency domains in a single regression model. We describe here the use of a novel chemometrics algorithm using the wavelet transform. We call the algorithm dual-domain regression, as the regression step defines a weighted model in the time domain based on the contributions of parallel, frequency-domain models made from wavelet coefficients reflecting different scales. In principle, any regression method can be used, and implementations of the algorithm using partial least squares regression and principal component regression are reported here. The performance of the models produced from the algorithm is generally superior to that of regular partial least squares (PLS) or principal component regression (PCR) models applied to data restricted to a single domain. Dual-domain PLS and PCR algorithms are applied to near-infrared (NIR) spectral datasets of Cargill corn samples and sets of spectra collected on batch chemical reactions run in different reactors to illustrate the improved robustness of the modeling.

16.
We present a numerical comparison of various preprocessing strategies for dealing with data from voltammetric electronic tongues in order to reduce the high dimensionality of the response matrices. Different modelling tools are presented and briefly described. We then compare combinations of four preprocessing strategies (principal component analysis, fast Fourier transform, discrete wavelet transform, voltammogram-windowed slicing integral) with four modelling alternatives (principal component regression, partial least squares regression, multi-way partial least squares regression, artificial neural networks) by employing data from a voltammetric bioelectronic tongue, an array formed by enzyme-modified biosensors and applied to the discrimination and quantification of phenolic compounds.

17.
Assessment of liver fibrosis is of paramount importance to guide the therapeutic strategy in patients with chronic hepatitis C (CHC). In this pilot study, we investigated the potential of serum Fourier transform infrared (FTIR) spectroscopy for differentiating CHC patients with extensive hepatic fibrosis from those without fibrosis. Twenty-three serum samples from CHC patients were selected according to the degree of hepatic fibrosis as evaluated by the FibroTest: 12 from patients with no hepatic fibrosis (F0) and 11 from patients with extensive fibrosis (F3–F4). The FTIR spectra (ten per sample) were acquired in the transmission mode and data homogeneity was tested by cluster analysis to exclude outliers. After selection of the most discriminant wavelengths using an ANOVA-based algorithm, the support vector machine (SVM) method was used as a supervised classification model to classify the spectra into two classes of hepatic fibrosis, F0 and F3–F4. Given the small number of samples, a leave-one-out cross-validation algorithm was used. When SVM was applied to all spectra (n = 230), the sensitivity and specificity of the classifier were 90.1% and 100%, respectively. When SVM was applied to the subset of 219 spectra, i.e., excluding the outliers, the sensitivity and specificity of the classifier were 95.2% and 100%, respectively. This pilot study strongly suggests that the serum from CHC patients exhibits infrared spectral characteristics, allowing patients with extensive fibrosis to be differentiated from those with no hepatic fibrosis.
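
The leave-one-out validation and the sensitivity and specificity figures in this abstract can be illustrated with a small Python sketch. A nearest-centroid rule is swapped in for the SVM purely to keep the example dependency-free, so this is not the authors' classifier. The sketch also leaves out one spectrum at a time, whereas the abstract does not say whether spectra or whole serum samples were held out. All function names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to `sample`.

    Stand-in classifier: the study itself used a support vector machine.
    """
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(sample, centroid))

    return min(centroids, key=lambda label: dist2(centroids[label]))


def leave_one_out(xs, ys):
    """Predict each item with a model trained on all the other items."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        preds.append(nearest_centroid_predict(train_x, train_y, xs[i]))
    return preds


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) as fractions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)
```

Here the positive class would correspond to extensive fibrosis (F3–F4), so sensitivity is the fraction of F3–F4 spectra recognized and specificity is the fraction of F0 spectra recognized.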

18.
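Traditional methods for detecting citrus Huanglongbing (citrus greening disease) suffer from low accuracy and poor stability. This paper proposes a near-infrared method for identifying citrus Huanglongbing based on least angle regression combined with a kernel extreme learning machine (LAR-KELM(RBF)). The spectral data are preprocessed by wavelet transform, spectral wavelengths are then selected with the least angle regression (LAR) algorithm, and the samples are finally classified by the kernel extreme learning machine (KELM(RBF)). Experiments on near-infrared spectra of citrus leaves verified the performance of the LAR-KELM(RBF) algorithm, whose highest classification accuracy was 99.91% with a standard deviation of 0.11. Results with training sets of different sizes show that the LAR-KELM(RBF) model has higher classification accuracy and better stability than the extreme learning machine (ELM), waveform-superposition extreme learning machine (SWELM), back-propagation neural network (BP, 2 layers), KELM(RBF), and support vector machine (SVM) models, and can be widely applied to the detection and identification of citrus Huanglongbing.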

19.
This paper describes the calibration process of a Visible-Near Infrared sensor for the condition monitoring of a gas engine's lubricating oil, correlating oil transmittance spectra with the degradation of the oil via a regression model. Chemometric techniques were applied to determine different parameters: Base Number (BN), Acid Number (AN), insolubles in pentane and viscosity at 40 °C. A Visible-Near Infrared (400-1100 nm) sensor developed at the Tekniker research center was used to obtain the spectra of artificial and real gas engine oils. To improve the sensor's data, different preprocessing methods were applied, such as Savitzky-Golay smoothing, moving average with Multivariate Scatter Correction, or Standard Normal Variate to eliminate the scatter effect. A combination of these preprocessing methods was applied to each parameter. The regression models were developed by Partial Least Squares Regression (PLSR). In the end, it was shown that only some models were valid, fulfilling a set of quality requirements. The paper shows which models achieved the established validation requirements and which preprocessing methods perform better. A discussion follows regarding the potential improvement in the robustness of the models.

20.
Artificial neural network (ANN) and hybrid principal component analysis-artificial neural network (PCA-ANN) classifiers have been successfully implemented for classification of static time-of-flight secondary ion mass spectrometry (ToF-SIMS) mass spectra collected from complex Cu–Fe sulphides (chalcopyrite, bornite, chalcocite and pyrite) at different flotation conditions. ANNs are very good pattern classifiers because of their ability to learn and generalise patterns that are not linearly separable, their fault and noise tolerance, and their high parallelism. In the first approach, fragments from the whole ToF-SIMS spectrum were used as input to the ANN; the model yielded high overall correct classification rates of 100% for feed samples, 88% for conditioned feed samples and 91% for Eh modified samples. In the second approach, the hybrid pattern classifier PCA-ANN was used. PCA is a very effective multivariate data analysis tool applied to enhance species features and reduce data dimensionality. Principal component (PC) scores, which accounted for 95% of the raw spectral data variance, were used as input to the ANN; the model yielded high overall correct classification rates of 88% for conditioned feed samples and 95% for Eh modified samples.
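
The step of keeping the principal component scores that account for 95% of the variance can be sketched as a cumulative explained-variance rule. The Python snippet below assumes the PCA eigenvalues (component variances) are already available and only shows the selection; the function name and the example eigenvalues are hypothetical.

```python
def components_for_variance(eigenvalues, target=0.95):
    """Return the smallest number of leading components whose cumulative
    share of the total variance reaches `target`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k
    return len(ordered)
```

For example, with component variances of 6.0, 3.0, 0.6 and 0.4, the first two components explain 90% of the variance and the first three explain 96%, so three score vectors would be passed to the network.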
