Similar literature
1.
Two novel algorithms that employ the idea of stacked generalization, or stacked regression, are reported in the present paper: stacked partial least squares (SPLS) and stacked moving-window partial least squares (SMWPLS). The new algorithms establish parallel, conventional PLS models based on all intervals of a set of spectra, and take advantage of the information in the whole spectrum by combining these parallel models in a way that emphasizes intervals highly related to the target property. It is theoretically and experimentally illustrated that the predictive ability of these two stacked methods, which combine all subsets or intervals of the whole spectrum, is never poorer than that of a PLS model based only on the best interval. These two stacking algorithms generate more parsimonious regression models with better predictive power than conventional PLS, and perform best when the spectral information is neither isolated to a single, small region nor spread uniformly over the response. A simulated data set is employed not only to demonstrate this improvement, but also to demonstrate that stacked regressions have the potential to predict property information from an outlier spectrum in the prediction set. Moisture, oil, protein and starch in Cargill corn samples have been successfully predicted by these new algorithms, as has the hydroxyl number of terpolymer samples measured on different instruments, both including and excluding an outlier spectrum. Copyright © 2009 John Wiley & Sons, Ltd.
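
As a rough illustration of how stacking might combine interval models, the sketch below weights each interval's prediction by the inverse square of its cross-validation error, so that intervals with lower error dominate. The weighting rule, the function names (`stacking_weights`, `stacked_prediction`) and all numbers are illustrative assumptions; the abstract does not state the exact weighting scheme.

```python
def stacking_weights(cv_errors):
    """Return weights proportional to 1 / error**2, normalized to sum to 1.

    Assumed rule for illustration only; not taken from the abstract.
    """
    inverse_squares = [1.0 / (e * e) for e in cv_errors]
    total = sum(inverse_squares)
    return [s / total for s in inverse_squares]


def stacked_prediction(interval_predictions, cv_errors):
    """Combine one prediction per interval model into a single weighted value."""
    weights = stacking_weights(cv_errors)
    return sum(w * p for w, p in zip(weights, interval_predictions))


# Hypothetical values: three interval models predicting the same sample.
predictions = [10.2, 9.8, 12.5]   # prediction from each interval model
cv_errors = [0.5, 0.5, 2.0]       # cross-validation error of each interval model

weights = stacking_weights(cv_errors)
combined = stacked_prediction(predictions, cv_errors)
print(weights, combined)
```

In this toy case the third interval has four times the error of the others, so its weight is sixteen times smaller and the combined value stays close to the two low-error predictions.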

2.
Background: Identification of potential drug-target interaction pairs is very important for pharmaceutical innovation and drug discovery. Numerous machine learning-based and network-based algorithms have been developed for predicting drug-target interactions. However, the large-scale pharmacological, genomic and chemical data that have emerged recently provide new opportunities for further improving the accuracy of drug-target interaction prediction. Results: In this work, based on the assumption that similar drugs tend to interact with similar proteins and vice versa, we developed a novel computational method (namely MKLC-BiRW) to predict new drug-target interactions. MKLC-BiRW integrates diverse drug-related and target-related heterogeneous information sources by using multiple kernel learning and clustering methods to generate the drug and target similarity matrices, in which the low-similarity elements are set to zero to build the drug and target similarity correction networks. By combining these drug and target similarity correction networks with the known drug-target interaction bipartite graph, MKLC-BiRW constructs a heterogeneous network on which the bi-random walk algorithm is adopted to infer potential drug-target interactions. Conclusions: Compared with other existing state-of-the-art methods, MKLC-BiRW achieves the best performance in terms of AUC and AUPR. MKLC-BiRW can effectively predict potential drug-target interactions.

3.
The estimation of the prediction region of partial least squares (PLS) is necessary in many engineering applications. However, research in this area focuses on the estimation of prediction intervals only. In this work, a new recursive formulation of PLS is proposed to facilitate the calculation of the Jacobian matrix of the estimated coefficient matrix. Furthermore, the computational complexity analysis indicates that the proposed algorithm is O(m²N + mpN + mpN² + mN³ + mpN⁴) per component. The prediction region of the multivariate PLS is obtained through local linearization. The new formulation provides one way to obtain the prediction region of the multivariate PLS. Simulation and near-infrared spectra of corn case studies indicate the utility of the proposed method. Copyright © 2013 John Wiley & Sons, Ltd.

4.
Changeable size moving window partial least squares (CSMWPLS) and searching combination moving window partial least squares (SCMWPLS) are proposed to search for an optimized spectral interval and an optimized combination of spectral regions from the informative regions obtained by a previously proposed spectral interval selection method, moving window partial least squares (MWPLSR) [Anal. Chem. 74 (2002) 3555]. The use of informative regions aims to construct better PLS models than those based on all spectral points. The purpose of CSMWPLS and SCMWPLS is to optimize the informative regions and their combination to further improve the prediction ability of the PLS models. The results of their application to an open-path (OP)/FT-IR spectral data set show that the proposed methods, especially SCMWPLS, can find an optimized combination with which one can improve, often significantly, the performance of the corresponding PLS model, in terms of a low root mean square error of prediction (RMSEP) with a reasonable number of latent variables (LVs), compared with the results obtained using whole spectra or a direct combination of informative regions for a compound. The regions making up the combinations obtained can easily be explained by the existence of IR absorption bands in those spectral regions.
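
For reference, RMSEP is conventionally computed over the prediction set as shown below. The symbols are notation introduced here rather than taken from the abstract: n is the number of prediction samples, y_i the reference value and ŷ_i the model prediction for sample i.

```latex
\mathrm{RMSEP} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^{2}}
```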

5.
In the current study, robust boosting partial least squares (RBPLS) regression has been proposed to model the activities of a series of 4H-1,2,4-triazoles as angiotensin II antagonists. RBPLS works by sequentially applying the PLS method to robustly reweighted versions of the training compounds, and then combining the resulting predictors through a weighted median. In PLS modeling, an F-statistic has been introduced to automatically determine the number of PLS components. The results obtained by RBPLS have been compared to those of boosting partial least squares (BPLS) regression and partial least squares (PLS) regression, showing the good performance of RBPLS in improving the QSAR modeling. In addition, the interaction of angiotensin II antagonists is a complex one, including topological, spatial, thermodynamic and electronic effects.
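
The weighted-median combination step mentioned above can be sketched as follows. This is a minimal illustration, assuming each boosted model contributes one prediction and one positive weight; the function name `weighted_median` and the example numbers are hypothetical, and the abstract does not say how RBPLS derives its weights.

```python
def weighted_median(values, weights):
    """Return the smallest value whose cumulative weight reaches half the total weight."""
    pairs = sorted(zip(values, weights))
    half = 0.5 * sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    raise ValueError("weights must be positive and non-empty")


# Hypothetical predictions of one compound's activity from four boosted PLS models.
predictions = [6.1, 5.4, 7.9, 5.8]
model_weights = [0.4, 0.1, 0.2, 0.3]  # illustrative: larger weight for a more trusted model

print(weighted_median(predictions, model_weights))
```

Unlike a weighted mean, the result is always one of the individual predictions, so a single extreme predictor (7.9 here) cannot pull the combined value away from the bulk of the models.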

6.
The integration of multiple data sources has emerged as a pivotal aspect to assess complex systems comprehensively. This new paradigm requires the ability to separate common and redundant from specific and complementary information during the joint analysis of several data blocks. However, inherent problems encountered when analysing single tables are amplified with the generation of multiblock datasets. Finding the relationships between data layers of increasing complexity constitutes therefore a challenging task. In the present work, an algorithm is proposed for the supervised analysis of multiblock data structures. It associates the advantages of interpretability from the orthogonal partial least squares (OPLS) framework and the ability of common component and specific weights analysis (CCSWA) to weight each data table individually in order to grasp its specificities and handle efficiently the different sources of Y-orthogonal variation.

7.
In this paper, fault detection and identification methods based on semi-supervised Laplacian regularization kernel partial least squares (LRKPLS) are proposed. In the Laplacian regularization learning framework, unlabeled and labeled samples are used to improve the estimate of the data manifold so that a more robust data model can be established. We show that LRKPLS can avoid the over-fitting problem that may be caused by insufficient samples and the presence of outliers. Moreover, the proposed LRKPLS approach places no special restriction on the data distribution; in other words, it can be used in the case of nonlinear or non-Gaussian data. On the basis of LRKPLS, corresponding fault detection and identification methods are proposed. These methods are used to monitor a numerical example and the Hot Galvanizing Pickling Waste Liquor Treatment Process (HGPWLTP), and the case studies show the effectiveness of the proposed approaches. Copyright © 2016 John Wiley & Sons, Ltd.

8.
In this paper, a methodology to evaluate the probability of false non-compliance and false compliance for screening methods that give first- or second-order multivariate signals is proposed. For this task, 120 samples of 6 different kinds of milk have been measured by excitation-emission fluorescence. The samples have been spiked with different amounts of three sulfonamides (sulfadiazine, sulfamerazine and sulfamethazine). These substances have been classified in group B1 (veterinary medicines and contaminants) of annex I of Directive 96/23/EC. The European Union (Commission Regulation EC no. 281/96) has set the maximum residue level (MRL) of total sulfonamides at 100 μg kg⁻¹ in muscle, liver, kidney and milk. The work shows that excitation-emission fluorescence together with the partial least squares class modeling (PLS-CM) procedure may be a suitable and cheap screening method for the total amount of sulfonamides in milk. Three PLS-CM models have been built: for the emission and excitation spectra (first-order signals) and for the excitation-emission matrices (second-order signals). In all cases, the method reaches probabilities of false compliance below 5%, as required by Decision 2002/657/EC. With the same fluorescence signals, the total quantity of sulfonamide was calibrated using 2-PLS, 3-PLS and PARAFAC regressions. Using this quantitative approach, the capability of detection, CCβ, around the MRL has been estimated at between 114.3 and 115.1 μg kg⁻¹ for a probability of false non-compliance and false compliance equal to 5%.

9.
A partial least squares (PLS-1) calibration model based on spectrophotometric measurement is described for the simultaneous determination of CN⁻ and SCN⁻ ions. The method is based on the difference in the rates of the reactions of CN⁻ and SCN⁻ ions with chloramine-T in a pH 4.0 buffer solution at 30 °C. The cyanogen chloride (CNCl) produced reacts with pyridine, and the product condenses with barbituric acid to form a final colored product. The absorption kinetic profiles of the solutions were monitored by measuring the absorbance at 578 nm over the time range 20-180 s after initiation of the reaction, at 2 s intervals. The experimental calibration matrix for the PLS-1 calibration was designed with 31 samples. Cross-validation was used to select the number of factors. The results showed that simultaneous determination could be performed in the ranges 10.0-900.0 and 50.0-1200.0 ng mL⁻¹ for CN⁻ and SCN⁻ ions, respectively. The proposed method was successfully applied to the simultaneous determination of cyanide and thiocyanate in water samples.
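
To make the PLS-1 calibration and the cross-validated choice of the number of factors more concrete, here is a minimal pure-Python sketch of single-response PLS with NIPALS-style deflation and a leave-one-out error. The function names (`pls1_fit`, `pls1_predict`, `loo_rmse`) and the tiny data set are hypothetical, and leave-one-out is only one possible cross-validation scheme; the abstract does not say which variant was used.

```python
def pls1_fit(X, y, n_components):
    """Fit a PLS-1 model by extracting components and deflating X and y."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                 # centered y
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X-space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:  # nothing left in y to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict y for one sample by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat


def loo_rmse(X, y, n_components):
    """Leave-one-out root mean square error for a given number of factors."""
    errors = []
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        errors.append(pls1_predict(model, X[i]) - y[i])
    return (sum(e * e for e in errors) / len(errors)) ** 0.5


# Hypothetical noise-free data generated from y = 2*x1 + x2 + 0.5.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
y = [4.5, 5.5, 10.5, 11.5]

for k in (1, 2):
    print(k, loo_rmse(X, y, k))
```

On this toy data the leave-one-out error drops to essentially zero with two factors, which is the kind of comparison used to pick the number of factors; real kinetic profiles are noisy, so the error curve typically flattens or rises instead.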

10.
The fluorescence spectrum, as well as its first and second derivative spectra in the region of 220–900 nm, was used to determine the concentration of triglyceride in human serum. Nonlinear partial least squares regression with a cubic B-spline-function-based nonlinear transformation was employed as the chemometric method. Window genetic algorithms partial least squares (WGAPLS) was proposed as a new wavelength selection method to find the optimized combination of spectral wavelengths. The study shows that when WGAPLS is applied within the optimized regions ascertained by changeable size moving window partial least squares (CSMWPLS) or searching combination moving window partial least squares (SCMWPLS), the calibration and prediction performance of the model can be further improved at a reasonable number of latent variables. SCMWPLS should start from the sub-region found by CSMWPLS with the smallest root mean square error of calibration (RMSEC). In addition, WGAPLS should be used within the region of smallest RMSEC, whether it is the sub-region found by CSMWPLS or the region combination found by SCMWPLS. Moreover, the prediction ability of the nonlinear models was significantly better than that of the linear models. The prediction performance of the three spectra was in the following order: second derivative spectrum < original spectrum < first derivative spectrum. Wavelengths within the regions of 300–367 nm and 386–392 nm in the first derivative of the original fluorescence spectrum were the optimized wavelength combination for the prediction model. Copyright © 2012 John Wiley & Sons, Ltd.

11.
Pérez NF  Boqué R  Ferré J 《Talanta》2010,83(2):475-481
A novel method for establishing multivariate specifications of food commodities is proposed. The specifications are established for discriminant partial least squares (DPLS) by setting limits on the predictions of the DPLS model together with Hotelling T² and the squared prediction error (SPE). These limits can be tuned depending on whether the type I error (i.e. a correct sample is declared out-of-specification) or the type II error (i.e. an out-of-specification sample is declared within specifications) needs to be minimized. The methodology is illustrated with a set of NIR spectra of Italian olive oils from five regions, with Liguria as the class of interest. The results demonstrate the possibility of establishing multivariate specifications for olive oils from the Liguria region on the basis of spectral data, obtaining type I and type II errors lower than 5%.

12.
Ren S  Gao L 《Talanta》2000,50(6):1163-1173
The mathematical bases and program algorithms of the discrete wavelet transform (DWT), multiresolution and Mallat's pyramid algorithm were described. Multiresolution analysis (MRA) based on the Daubechies orthogonal wavelet basis was studied as a tool for removing noise and irrelevant information from spectrophotometric spectra. After wavelet MRA pre-treatment, eight error functions were calculated for deducing the number of factors. A partial least squares method based on wavelet MRA (WPLS) was developed to perform simultaneous spectrophotometric determination of Fe(II) and Fe(III) with overlapping peaks. Data reduction was performed using wavelet MRA and the principal component analysis (PCA) algorithm. Two programs, SPWMRA and SPWPLS, were designed to perform wavelet MRA and simultaneous multicomponent determination. Experimental results showed the WPLS method to be successful even where there was severe overlap of spectra.
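
A small sketch may help show what a Mallat-style pyramid decomposition does to a spectrum. It uses the Haar wavelet, the simplest member of the Daubechies family, rather than whichever Daubechies basis the paper used, and the hard-threshold denoising step, the function names and the example values are assumptions made for illustration. The signal length is assumed to be divisible by 2 raised to the number of levels.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)


def haar_step(signal):
    """One pyramid level: split a signal into approximation and detail coefficients."""
    approx = [(signal[i] + signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one pyramid level."""
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) * SCALE, (a - d) * SCALE])
    return signal


def haar_decompose(signal, levels):
    """Repeatedly split the approximation, keeping the details of every level."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def haar_reconstruct(approx, details):
    """Rebuild the signal from the coarsest approximation and all detail levels."""
    for detail in reversed(details):
        approx = haar_inverse_step(approx, detail)
    return approx


def denoise(signal, levels, threshold):
    """Zero out small detail coefficients (hard thresholding) and reconstruct."""
    approx, details = haar_decompose(signal, levels)
    kept = [[d if abs(d) >= threshold else 0.0 for d in detail] for detail in details]
    return haar_reconstruct(approx, kept)


# Hypothetical absorbance values: a flat baseline with small alternating noise.
noisy = [5.1, 4.9, 5.1, 4.9]
print(denoise(noisy, levels=1, threshold=0.5))
```

With a threshold of zero the transform reconstructs the input exactly, which is a convenient sanity check before choosing a threshold for real spectra.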

13.
《Analytica chimica acta》2002,452(2):311-319
The characterisation of adsorption or impregnation processes using conventional or supercritical fluid technologies is becoming an increasingly important part of the research on drug formulations. The complexity of the relationships between these adsorption processes and the experimental variables potentially influencing them, however, makes these studies more problematic. In this paper, a chemometric approach based on nonlinear partial least squares (NL-PLS) modelling is applied to characterise the effect of the experimental variables on the supercritical impregnation process. Various adsorbent materials such as silica gel, zeolite and amberlite were investigated using the following model compounds as adsorbates: benzoic, salicylic and acetylsalicylic acids.

14.
Partial least squares (PLS) was applied to the simultaneous determination of the divalent ions of copper, nickel, cobalt and zinc based on the formation of their complexes with 2-carboxy-2′-hydroxy-5′-sulfoformazyl benzene (zincon). The absorption spectra were recorded from 515 to 750 nm. The effect of pH on the sensitivity and selectivity was studied in the range 3.0-10.0, and pH 8.0 was chosen according to the net analyte signal (NAS) as a function of pH. The concentration ranges for Cu²⁺, Ni²⁺, Co²⁺ and Zn²⁺ in the calibration solution sets were 0-2.6, 0-4.6, 0-3.0 and 0-4.92 ppm, respectively. The root mean square differences (RMSD) for copper, nickel, cobalt and zinc were 0.0181, 0.0488, 0.0309 and 0.0463, respectively.

15.
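The simultaneous determination of the carbamate insecticides propoxur (PRO) and isoprocarb (ISO) by differential kinetic spectrophotometry was studied. Both propoxur and isoprocarb hydrolyze under alkaline conditions, and the phenolates produced by hydrolysis react with a mixture of p-aminophenol and potassium periodate to form a blue compound; this is a kinetic reaction with a moderate rate. In this experiment, kinetic-absorption spectral data were collected at multiple wavelengths between 538 nm and 700 nm and at multiple time points to form the measurement matrix. Principal component-partial least squares (PC-PLS) was used to resolve the measured data. The contents of propoxur and isoprocarb in environmental water samples were determined, and good analytical results were obtained. A new method for the simultaneous determination of propoxur and isoprocarb that is easy to implement and highly accurate is thus proposed.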

16.
17.
Near-infrared (NIR) spectroscopy, which is generally used for online monitoring in food analysis and production processes, was applied to determine the internal quality of toothpaste samples. It is acknowledged that the spectra can be significantly influenced by non-linearities introduced by light scatter; therefore, four data preprocessing methods, namely offset correction, first derivative, standard normal variate (SNV) and multiplicative scatter correction (MSC), were employed before the data analysis. A partial least squares (PLS) multivariate calibration model was established and then used to predict the pH values of toothpaste samples of different brands. The results showed that the spectral data processed by MSC gave the best predictions of the pH value of the toothpaste samples.
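
Two of the scatter-correction steps named above, SNV and MSC, can be sketched in a few lines. This is a minimal illustration using their common textbook definitions, with the mean spectrum as the MSC reference; the function names and the toy spectra are hypothetical, and the paper's exact implementation is not given in the abstract.

```python
import math


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own mean and std."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / std for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum of the set."""
    n_spectra, n_points = len(spectra), len(spectra[0])
    reference = [sum(s[j] for s in spectra) / n_spectra for j in range(n_points)]
    ref_mean = sum(reference) / n_points
    ref_var = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_points
        # Least-squares fit of this spectrum against the reference: s ≈ intercept + slope * reference.
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(reference, s)) / ref_var
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


# Hypothetical spectra: the same band shape with different offsets and scaling.
base = [1.0, 2.0, 4.0, 3.0]
spectra = [base, [2.0 * v + 0.5 for v in base], [0.5 * v - 0.1 for v in base]]

print(msc(spectra))
print(snv(spectra[1]))
```

Because the three toy spectra differ only by an additive offset and a multiplicative factor, both corrections map them onto a single common shape, which is the effect they are meant to have on light-scatter variation.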

18.
Wei Huang 《Talanta》2010,82(4):1516-5905
Fluorescence spectroscopy provides high sensitivity in quantitative analysis. However, due to spectral interference, it is difficult to determine the individual components of fluorescent multi-component mixtures in such complicated and important body matrices as blood, urine and feces without any pre-separation. In this study, a simple and rapid approach based on non-linear variable-angle synchronous fluorescence spectrometry coupled with partial least squares analysis (NLVASF/PLS) was developed for the simultaneous determination of protoporphyrin IX (PP), uroporphyrin III (UP) and coproporphyrin III (CP). The detection limits were 0.18, 0.29 and 0.24 nmol L⁻¹ for PP, UP and CP, respectively. With this method, the individual components of blood porphyrins were quantified simultaneously in one scan taking only about 30 s. The recoveries of this method were above 80% in human whole blood samples. This method provides a potential tool for the determination of porphyrins in whole blood and the differential diagnosis of porphyria, especially for rapid routine screening of large numbers of samples.

19.
A new differential pulse voltammetric method for dopamine (DA) determination at a bare glassy carbon electrode has been developed. Dopamine, ascorbic acid (AA) and uric acid (UA) usually coexist in physiological samples. Because AA and UA can be oxidized at potentials close to that of DA, it is difficult to determine dopamine electrochemically, although resolution can be achieved using modified electrodes. Additionally, oxidized dopamine mediates AA oxidation, and the electrode surface can be easily fouled by the AA oxidation product. In this work, a chemometrics strategy, partial least squares (PLS) regression, has been applied to determine dopamine in the presence of AA and UA without electrode modification. The method is based on the electrooxidation of dopamine at a glassy carbon electrode in pH 7 phosphate buffer. The dopamine calibration curve was linear over the range of 1–313 μM and the limit of detection was 0.25 μM. The relative standard error (RSE %) was 5.28%. The method has been successfully applied to the measurement of dopamine in human plasma and urine.

20.