Similar literature
 Found 20 similar documents; search took 109 ms
1.
A modified partial least squares (PLS) algorithm is presented on the basis of a novel weight updating strategy. The new weights can handle situations in which directions in X space have large variance unrelated to Y, where linear PLS may not work well. In the proposed algorithm, the slice transform technique is introduced to provide a piecewise linear representation of the weight vectors. Then, the corresponding mapping functions are estimated by a least squares criterion of the inner relation between the observed variables and the score of response variables. Finally, the weight vectors are updated by the obtained mapping functions, and the corresponding scores and loadings are calculated with the new weights. Optimal piecewise linear replacements of the PLS weights are achieved by the proposed method. The predictive performances of the new approach and other methods are compared statistically using the Wilcoxon signed rank test. Experimental results show that the proposed method can achieve simpler models, whereas the model performances are at least comparable with PLS and other methods. Copyright © 2012 John Wiley & Sons, Ltd.

2.
The nearest shrunken centroid (NSC) Classifier is successfully applied for class prediction in a wide range of studies based on microarray data. The contribution from seemingly irrelevant variables to the classifier is minimized by the so‐called soft‐thresholding property of the approach. In this paper, we first show that for the two‐class prediction problem, the NSC Classifier is similar to a one‐component discriminant partial least squares (PLS) model with soft‐shrinkage of the loading weights. Then we introduce the soft‐threshold‐PLS (ST‐PLS) as a general discriminant‐PLS model with soft‐thresholding of the loading weights of multiple latent components. This method is especially suited for classification and variable selection when the number of variables is large compared to the number of samples, which is typical for gene expression data. A characteristic feature of ST‐PLS is the ability to identify important variables in multiple directions in the variable space. Both the ST‐PLS and the NSC classifiers are applied to four real data sets. The results indicate that ST‐PLS performs better than the shrunken centroid approach if there are several directions in the variable space which are important for classification, and there are strong dependencies between subsets of variables. Copyright © 2007 John Wiley & Sons, Ltd.
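
The soft‐thresholding of loading weights mentioned in this abstract can be illustrated with the generic soft‐threshold operator. This is only a minimal sketch: the function name `soft_threshold`, the fixed threshold `delta`, and the example weights are assumptions made for illustration, and the abstract does not specify how ST‐PLS chooses its threshold.

```python
def soft_threshold(weights, delta):
    """Shrink every weight toward zero by delta (hypothetical helper).

    Weights whose magnitude is at most delta are set exactly to zero,
    which is what removes seemingly irrelevant variables.
    """
    shrunk = []
    for w in weights:
        magnitude = abs(w) - delta
        if magnitude <= 0:
            shrunk.append(0.0)
        else:
            shrunk.append(magnitude if w > 0 else -magnitude)
    return shrunk


# Illustrative loading weights (made-up numbers, not from the paper).
w = [0.8, -0.5, 0.1, -0.05]
w_st = soft_threshold(w, 0.2)
print(w_st)
```

The two small weights are zeroed while the large ones keep their sign and lose `delta` in magnitude; in practice the shrunken vector would typically be rescaled to unit norm before scores are computed.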

3.
Target projection (TP), also called target rotation (TR), was introduced to facilitate interpretation of latent‐variable regression models. Orthogonal partial least squares (OPLS) regression and PLS post‐processing by similarity transform (PLS + ST) represent two alternative algorithms for the same purpose. In addition, OPLS and PLS + ST provide components to explain systematic variation in X orthogonal to the response. We show that, for the same number of components, OPLS and PLS + ST provide score and loading vectors for the predictive latent variable that are the same as for TP except for a scaling factor. Furthermore, we show how the TP approach can be extended to become a hybrid of latent‐variable (LV) regression and exploratory LV analysis and thus embrace systematic variation in X unrelated to the response. Principal component analysis (PCA) of the residual variation after removal of the target component is here used to extract the orthogonal components, but X‐tended TP (XTP) permits other criteria for decomposition of the residual variation. If PCA is used for decomposing the orthogonal variation in XTP, the variance of the major orthogonal components obtained for OPLS and XTP is observed to be almost the same, showing the close relationship between the methods. The XTP approach is tested and compared with OPLS for a three‐component mixture analyzed by infrared spectroscopy and a multicomponent mixture measured by near infrared spectroscopy in a reactor. Copyright © 2008 John Wiley & Sons, Ltd.

4.
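Constructing a support vector machine‐partial least squares method for modeling drug structure‐activity relationships   Cited 6 times (self‐citations: 0, citations by others: 6)
李剑, 陈德钊, 成忠, 叶子青. 《分析化学》 (Chinese Journal of Analytical Chemistry), 2006, 34(2): 263-266
While sample data are still being accumulated for the study of drug structure‐activity relationships, models must be built from small samples. Overfitting arises easily in this setting and degrades the predictive performance and stability of the model. To address this, partial least squares (PLS) can be used to extract optimal components in pairs from the sample data, eliminating multicollinearity among the independent variables and effectively reducing dimensionality. A least squares support vector machine (LS‐SVM) then performs nonlinear regression on each pair of components, and an error‐based correction strategy adjusts the result so that it expresses the nonlinear relationship between the independent and dependent variables more effectively. The resulting algorithm is named EB-LSSVM-PLS; the models it builds show high prediction accuracy and good stability. Applied to QSAR modeling of novel flavanone derivatives, it gave satisfactory results, with generalization performance better than that of other methods.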

5.
The aim of this paper is to characterize metabolism disorders in Kunming mice induced by S180 and H22 tumor cells. Metabolic fingerprint based on high performance liquid chromatography‐diode array detector (HPLC‐DAD) was developed to map the disturbed metabolic responses. In vivo testing of the antitumor activity of paclitaxel (Taxol) was carried out by inhibiting the growth of S180 and H22 tumor cells. Based on 27 common peaks, principal component analysis (PCA) and partial least squares‐discriminant analysis (PLS‐DA) were used to distinguish the abnormal from control and to find significant endogenous compounds (SECs) which have significant contributions to classification. The tumor growth inhibition ratios (TIRs) of Taxol groups were used to validate the predictive accuracies of the PLS‐DA models. The predictive accuracies of PLS‐DA models for S180 and H22 tumor model groups were 97.6 and 100%, respectively. Nine (S180) and seven (H22) SECs were discovered, including uric acid and cytidine. In addition, the correlations between relative tumor weights (RTWs) and chromatographic data for the SECs were significant (p < 0.05). Investigations on the stability and precision of the established metabolic fingerprints demonstrate that the experiment is well controlled and reliable. This work shows that the platform of HPLC‐DAD coupled with chemometric methods provides a promising method for the study of metabolism disorders induced by tumor cells. Copyright © 2011 John Wiley & Sons, Ltd.

6.
After showing that plain covariance or correlation‐based criteria are generally not suitable to deal with multiple‐block component models in an exploratory framework, we propose an extended criterion: multiple co‐structure (MCS). MCS combines the goodness‐of‐fit indicator of the component model with a flexible measure of structural relevance of the components. Thus, it allows tracking of various kinds of interpretable structures within the data, on top of variance–maximizing components: variable‐bundles, components close to satisfying relevant structural constraints, and so on. MCS is to be maximised under unit‐norm constraints on coefficient‐vectors. We provide a dedicated ascent algorithm for it. This algorithm is nested into a more general one, named THEME (thematic equation model explorator), which calculates several components per data‐array and extracts nested structural component models. The method is tested on simulated data and applied to physicochemical data. Copyright © 2015 John Wiley & Sons, Ltd.

7.
In several scientific applications, data are generated from two or more diverse sources (views) with the goal of predicting an outcome of interest. Often it is the case that the outcome is not associated with any single view. However, the synergy of all measurements from each view may yield a more predictive classifier. For example, consider a drug discovery application in which individual molecules are described partially by several assay screens based on diverse profiles and partially by their chemical structural fingerprints. A common classification problem is to determine whether the molecule is associated with a particular disease. In this paper, a co‐training algorithm is developed to utilize data from diverse sources to predict the common class variable. Novel enhancements for variable importance, robustness to a mislabeled class variable, and a technique to handle unbalanced classes are applied to the motivating data set, highlighting that the approach attains strong performance and provides useful diagnostics for data analytic purposes. In addition, comparisons to a framework with data fusion using partial least squares (PLS) are also assessed on real data. An R package for performing the proposed approach is provided as Supporting information. Copyright © 2003 John Wiley & Sons, Ltd.

8.
Run to run (R2R) optimization based on unfolded Partial Least Squares (u‐PLS) is a promising approach for improving the performance of batch and fed‐batch processes as it is able to continuously adapt to changing processing conditions. Using this technique, the regression coefficients of PLS are used to modify the input profile of the process in order to optimize the yield. When this approach was initially proposed, it was observed that the optimization performed better when PLS was combined with a smoothing technique, in particular a sliding window filtering, which constrained the regression coefficients to be smooth. In the present paper, this result is further investigated and some modifications to the original approach are proposed. Also, the suitability of different smoothing techniques in combination with PLS is studied for both end‐of‐batch quality prediction and R2R optimization. The smoothing techniques considered in this paper include the original filtering approach, the introduction of smoothing constraints in the PLS calibration (Penalized PLS), and the use of functional analysis (Functional PLS). Two fed‐batch process simulators are used to illustrate the results. Copyright © 2015 John Wiley & Sons, Ltd.
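
The idea of sliding window filtering of regression coefficients can be sketched with a centred moving average. This is an illustrative assumption only: the abstract does not state the window shape or width, and `moving_average` and the example coefficients below are hypothetical.

```python
def moving_average(coefficients, window):
    """Smooth a coefficient profile with a centred moving average.

    Near the ends the window is truncated, so each output value is the
    mean of the coefficients that actually fall inside the window.
    """
    half = window // 2
    smoothed = []
    for i in range(len(coefficients)):
        lo = max(0, i - half)
        hi = min(len(coefficients), i + half + 1)
        segment = coefficients[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# A jagged, made-up coefficient profile over five time points.
raw = [0.0, 3.0, 0.0, 3.0, 0.0]
smooth = moving_average(raw, 3)
print(smooth)
```

The smoothed profile oscillates far less than the raw one, which is the property the abstract attributes to combining PLS with a smoothing technique.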

9.
Co‐crystallisation of, in particular, 4‐iodotetrafluorophenol with a series of secondary and tertiary cyclic amines results in deprotonation of the phenol and formation of the corresponding ammonium phenate. Careful examination of the X‐ray single‐crystal structures shows that the phenate anion develops a C=O double bond and that the C–C bond lengths in the ring suggest a Meisenheimer‐like delocalisation. This delocalisation is supported by the geometry of the phenate anion optimised at the MP2(Full) level of theory within the aug‐cc‐pVDZ basis (aug‐cc‐pVDZ‐PP on I) and by natural bond orbital (NBO) analyses. With sp2 hybridisation at the phenate oxygen atom, there is strong preference for the formation of two non‐covalent interactions with the oxygen sp2 lone pairs and, in the case of secondary amines, this occurs through hydrogen bonding to the ammonium hydrogen atoms. However, where tertiary amines are concerned, there are insufficient hydrogen atoms available and so an electrophilic iodine atom from a neighbouring 4‐iodotetrafluorophenate group forms an I···O halogen bond to give the second interaction. However, in some co‐crystals with secondary amines, it is also found that in addition to the two hydrogen bonds forming with the phenate oxygen sp2 lone pairs, there is an additional intermolecular I···O halogen bond in which the electrophilic iodine atom interacts with the C=O π‐system. All attempts to reproduce this behaviour with 4‐bromotetrafluorophenol were unsuccessful. These structural motifs are significant as they reproduce extremely well, in low‐molar‐mass synthetic systems, motifs found by Ho and co‐workers when examining halogen‐bonding interactions in biological systems. The analogy is cemented through the structures of co‐crystals of 1,4‐diiodotetrafluorobenzene with acetamide and with N‐methylbenzamide, which, as designed models, demonstrate the orthogonality of hydrogen and halogen bonding proposed in Ho’s biological study.

10.
11.
In the present study, boosting has been combined with partial least‐squares discriminant analysis (PLS‐DA) to develop a new pattern recognition method called boosting partial least‐squares discriminant analysis (BPLS‐DA). BPLS‐DA is implemented by first constructing a series of PLS‐DA models on various weighted versions of the original calibration set and then combining the predictions from the constructed PLS‐DA models by weighted majority vote to obtain the integrated result. Coupled with near infrared (NIR) spectroscopy, BPLS‐DA has been applied to discriminate different tea varieties. For comparison with BPLS‐DA, conventional principal component analysis, linear discriminant analysis (LDA), and PLS‐DA have also been investigated. Experimental results have shown that the inter‐variety difference can be accurately and rapidly distinguished via NIR spectroscopy coupled with BPLS‐DA. Moreover, the introduction of boosting drastically enhances the performance of an individual PLS‐DA, and BPLS‐DA is a well‐performing pattern recognition technique superior to LDA. Copyright © 2012 John Wiley & Sons, Ltd.
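
The final combination step described here, a weighted majority vote over the predictions of several models, can be sketched as follows. The function name, the class labels, and the model weights are hypothetical; how BPLS‐DA derives its weights is not given in the abstract.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, model_weights):
    """Return the label with the largest total weight.

    predictions[i] is the label predicted by model i and
    model_weights[i] is that model's voting weight.
    """
    totals = defaultdict(float)
    for label, weight in zip(predictions, model_weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Three hypothetical models vote on one sample.
winner = weighted_majority_vote(["green", "black", "black"], [0.9, 0.3, 0.4])
print(winner)
```

Here a single highly weighted model outvotes two weaker ones, which is the difference between a weighted vote and a simple head count.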

12.
A new method for highly sensitive determination of glutamate was developed and evaluated based on CE, using a dual‐enzyme co‐immobilized capillary microreactor combined with substrate recycling. The capillary microreactor was prepared by covalently co‐immobilizing glutamate dehydrogenase (GDH) and glutamic pyruvic transaminase (GPT) on the inner surface of a capillary and was characterized by SEM, ultraviolet‐visible spectroscopy, and fluorescence spectroscopy. The GDH‐GPT co‐immobilized capillary microreactor showed great stability and reproducibility. The apparent Km for glutamate with GDH‐GPT coupled reaction was determined to be 0.61±0.06 mM but 2.56±0.24 mM when only GDH was immobilized. Glutamate determination was based on on‐column monitoring of the UV absorption at 340 nm of the reaction product, reduced nicotinamide adenine dinucleotide, whose peak area was directly related to the glutamate concentration. The response of the present co‐immobilized GDH‐GPT assay for glutamate is greatly enhanced over the single‐enzyme system, and a 15.7‐fold improvement in sensitivity was obtained. The detection limit of the proposed method is 0.15 μM glutamate (S/N=3). Selectivity for glutamate is good over most of the 20 amino acids. Finally, this method was successfully applied to determine the glutamate content in rat plasma and serum samples.

13.
Summary: Pulse field gradient‐NMR (PFG‐NMR) spectroscopy is determined to be a more suitable method for the investigation of self‐association processes in multi‐component (co)polymer systems than light scattering methods. Here the co‐micellization of mixtures of the diblock copolymer polystyrene‐block‐(hydrogenated polyisoprene) (PS‐HPI) and the triblock copolymer polystyrene‐block‐(hydrogenated polybutadiene)‐block‐polystyrene (PS‐HPB‐PS) in decane is investigated by PFG‐NMR spectroscopy and the results compared to those experimentally determined by static (SLS) and dynamic (DLS) light scattering. As expected, diffusion coefficients determined by PFG‐NMR spectroscopy are systematically lower than those from DLS. The PFG‐NMR measurements provided higher values of c(X)/c_tot than the model calculations, illustrating that the basic assumption used in the calculations, i.e., that the number concentration of co‐micelles in mixed solutions follows the dilution with a triblock copolymer solution, 1 − X, is not fully valid at high X (weight fraction of PS‐HPB) values.

Comparison of PFG‐NMR spectroscopy and SLS (c/c_tot = equilibrium concentration of free PS‐HPB‐PS over the total concentration of copolymers in solution, X = weight fraction of PS‐HPB).


14.
A novel cocasting approach is presented for improving electroactivity of solution‐cast films of conducting polymers. Solutions of the n‐doping polymer poly(benzimidazobenzophenanthroline) (BBL) were co‐deposited with the ionic liquid electrolyte 1‐ethyl‐3‐methyl‐imidazolium bis(trifluoromethylsulfonyl)imide (EMIBTI). The resultant co‐continuous mixture yielded highly porous polymer films (CC‐BBL) upon removal of solvent and EMIBTI. Electrochemical quartz crystal microgravimetry revealed that the n‐doping process in neat ionic liquid is anion‐dominant, which is contrary to what is observed in dilute electrolyte solutions. The CC‐BBL films exhibit a thirty‐fold increase in initial current response and capacity relative to non‐cocast BBL films. While current response and capacity of the non‐cocast BBL improve with cycling, they level out after 800 cycles at 35% of those of the CC‐BBL. CC‐BBL shows high n‐doping stability; no decrease in electroactivity is seen after 1000 cycles. © 2012 Wiley Periodicals, Inc. J Polym Sci Part B: Polym Phys, 2012

15.
16.
The on‐line monitoring of batch processes based on principal component analysis (PCA) has been widely studied. Nonetheless, researchers have paid less attention to the on‐line application of partial least squares (PLS). In this paper, the influence of several issues on the predictive power of a PLS model for the on‐line estimation of key variables in a batch process is studied. Some of the conclusions can help to better understand the capabilities of the proposals presented for on‐line PCA‐based monitoring. Issues like the convenience of batch‐wise or variable‐wise unfolding, the method for the imputation of future measurements and the use of several sub‐models are addressed. This is the first time that the adaptive hierarchical (or multi‐block) approach is extended to PLS modelling. Also, the formulation of the so‐called trimmed scores regression (TSR), a powerful imputation method defined for PCA, is extended for its application with PLS modelling. Data from two processes, one simulated and one real, are used to illustrate the results. Copyright © 2008 John Wiley & Sons, Ltd.
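
The two unfolding schemes named in this abstract can be sketched on a small three‐way array. The layout `data[batch][time][variable]`, the function names, and the column ordering within each unfolded row are assumptions for illustration; conventions differ between authors.

```python
def unfold_batch_wise(data):
    """One row per batch: all time points and variables side by side."""
    return [[value for time_slice in batch for value in time_slice]
            for batch in data]


def unfold_variable_wise(data):
    """One row per (batch, time point) pair: columns are the variables."""
    return [list(time_slice) for batch in data for time_slice in batch]


# 2 batches x 3 time points x 2 variables (made-up numbers).
data = [
    [[1, 2], [3, 4], [5, 6]],
    [[7, 8], [9, 10], [11, 12]],
]
bw = unfold_batch_wise(data)      # 2 rows x 6 columns
vw = unfold_variable_wise(data)   # 6 rows x 2 columns
print(len(bw), len(bw[0]), len(vw), len(vw[0]))
```

Batch‐wise unfolding needs the whole batch trajectory to fill a row, which is why on‐line use requires imputing future measurements, whereas variable‐wise unfolding yields a complete row at every time point.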

17.
With the advancement of modern techniques, complex‐valued data have become more important in chemistry and many other areas. The data collected are often multi‐dimensional. This imposes an increasing demand on the tools used for the analysis of complex‐valued data. In multivariate data analysis, projection pursuit is a useful and important technique that in many cases gives better results than principal component analysis. One important projection pursuit variant uses the real‐valued kurtosis as its projection index and has been shown to be a powerful approach to address different problems. However, using the complex‐valued kurtosis as a projection index to deal with complex‐valued data is rare. This is, to a great extent, due to the lack of simple and fast optimization algorithms. In this work, simple and rapidly executed optimization algorithms for the complex‐valued kurtosis used as a projection index are proposed. The developed algorithms have a variety of advantages: no requirement for sphering or strong‐uncorrelation transformation of the data in advance, no assumption for the latent components (source signals) to be circular or non‐circular, search for maxima or minima on users' requirements, and users having the option to choose uncorrelated scores or orthogonal projection vectors. The mathematical development of the algorithms is described and simulated and real experimental data are employed to demonstrate the utility of the proposed algorithms. Copyright © 2015 John Wiley & Sons, Ltd.
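
As background for the real‐valued case mentioned in this abstract, the kurtosis of the scores along a candidate direction can be computed as the fourth central moment divided by the squared variance. This sketch covers only the real‐valued index and only its evaluation, not the optimization algorithms the paper proposes; the function names and data are hypothetical.

```python
def project(rows, direction):
    """Scores of each data row along a projection direction."""
    return [sum(x * d for x, d in zip(row, direction)) for row in rows]


def kurtosis(scores):
    """Fourth central moment divided by the squared second central moment."""
    n = len(scores)
    mean = sum(scores) / n
    centered = [s - mean for s in scores]
    m2 = sum(c ** 2 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    return m4 / (m2 ** 2)


# Two tight clusters separated along the first variable (made-up data).
rows = [[-1.0, 0.3], [-1.0, -0.3], [1.0, 0.3], [1.0, -0.3]]
k_first = kurtosis(project(rows, [1.0, 0.0]))
print(k_first)
```

A two‐cluster projection gives the lowest possible value of this index, which is why searching for kurtosis minima is a common way to reveal group structure.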

18.
Different published versions of partial least squares discriminant analysis (PLS‐DA) are shown as special cases of an approach exploiting prior probabilities in the estimated between groups covariance matrix used for calculation of loading weights. With prior probabilities included in the calculation of both PLS components and canonical variates, a complete strategy for extracting appropriate decision spaces with multicollinear data is obtained. This idea easily extends to weighted linear dummy regression so that the corresponding fitted values also span the canonical space. Two different choices of prior probabilities are applied with a real dataset to illustrate the effect for the obtained decision spaces. Copyright © 2007 John Wiley & Sons, Ltd.

19.
This paper is about how to incorporate interaction effects in multi‐block methodologies. The method proposed is inspired by polynomial regression modelling in the case with only a few independent variables but extends/generalises the idea to situations where the blocks are potentially very large with respect to the number of variables. The method follows a so‐called type I sums of squares strategy where the linear effects (main effects) are incorporated sequentially and before the interactions. The sequential and orthogonalised partial least squares (SO‐PLS) technique is used as a basis for the proposal. The SO‐PLS method is based on sequential estimation of each new block by the PLS regression method after orthogonalisation with respect to blocks already fitted. The new method preserves the invariance already established for SO‐PLS and can be used for blocks with different dimensionality. The method is tested on one real data set with two independent blocks with different complexity and on a simulated data set with a large number of variables in each block. Copyright © 2011 John Wiley & Sons, Ltd.

20.
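Application of an adaptive fuzzy partial least squares method to modeling drug structure‐activity relationships   Cited 2 times (self‐citations: 0, citations by others: 2)
As a local approximation method, the adaptive neuro‐fuzzy inference system (ANFIS) is well suited to quantitative structure‐activity relationship (QSAR) modeling of drugs. The parameters describing drug molecular structure are numerous and often coupled, which makes modeling harder and degrades the predictive performance of the model. To address this, ANFIS is combined with partial least squares (PLS): PLS first extracts components from the sample data, ANFIS then realizes the nonlinear mapping between each pair of components, and the extracted components are further corrected on the basis of the output error so that they explain the dependent variable optimally. The resulting method is named EB-AFPLS. It has been applied successfully to QSAR modeling of HIV-1 protease inhibitors with good results, showing strong learning ability, and the predictive performance of the resulting models is also better than that of other methods.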
