Similar literature
 20 similar documents found
1.
2.
3.
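Quantitative structure-activity relationship study of the toxicity of polychlorinated naphthalenes using support vector machines   Total citations: 2, self-citations: 0, citations by others: 2
Partial least squares (PLS) with leave-one-out cross-validation was used to select descriptors, including polarizability, molecular weight, net charges on selected atoms, and electrostatic potential, from more than 90 quantum chemical parameters. Support vector machine (SVM) was then applied to build a quantitative structure-activity relationship model for each of three sets of toxicity data on 20 polychlorinated naphthalene congeners. The squared cross-validated correlation coefficients of the resulting models were 0.805, 0.890, and 0.936, respectively. A comparison with the results of PLS modelling showed that the predictive ability of SVM is better than that of PLS.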
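
The leave-one-out statistic reported in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the common definition Q² = 1 − PRESS / SS_total, and it uses a one-descriptor least-squares line as a stand-in for the SVM or PLS model. The function names and the data in the usage note are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = a + b*x by ordinary least squares (stand-in for the real model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def loo_q2(xs, ys):
    """Leave-one-out cross-validated Q^2, assumed here to be 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model with sample i held out, then predict sample i.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    y_mean = mean(ys)
    ss_total = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - press / ss_total
```

For example, `loo_q2([1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1])` (made-up values) gives a Q² just below 1, because each held-out point is predicted well by a line fitted to the other four.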

4.
In this paper, a support vector machine was trained to learn the relationship between the pair-coupled amino acid composition and the content of protein secondary structural elements, including α-helix, 3₁₀-helix, π-helix, β-strand, β-bridge, turn, bend, and the remaining random coil. Self-consistency and cross-validation tests were performed to assess the performance of the method. The results are superior to or competitive with those of popular theoretical and experimental methods.

5.
6.
A total of 200 properties related to structural characteristics were used to represent the structures of 400 HA-coded proteins of influenza virus as training samples. Recognition models for HA proteins of avian influenza virus (AIV) were developed using support vector machine (SVM) and linear discriminant analysis (LDA). With LDA, the identification accuracy (Ria) is 99.8% for the training samples and 99.5% by leave-one-out cross-validation. With SVM, Ria is 99.8% for the training samples and 99.3% by leave-one-out cross-validation. A further 200 external HA proteins of influenza virus were used to validate the external predictive power of the resulting models. The external Ria is 95.5% by LDA and 96.5% by SVM, which shows that HA proteins of AIVs are well recognized by both SVM and LDA, and that SVM performs better than LDA.

7.
8.
9.
Several methods have been proposed for protein–sugar binding site prediction using machine learning algorithms. However, they are not effective at learning the varied properties of binding site residues that arise from the different interactions between proteins and sugars. In this study, we classified sugars into acidic and nonacidic sugars and showed that their binding sites have different amino acid occurrence frequencies. Using this result, we developed sugar-binding residue predictors dedicated to the two classes of sugars: an acidic sugar binding predictor and a nonacidic sugar binding predictor. We also developed a combination predictor that combines the results of the two predictors. We showed that when a sugar is known to be acidic, the acidic sugar binding predictor achieves the best performance, and that when a sugar is known to be nonacidic or its class is unknown, the combination predictor achieves the best performance. Our method uses only amino acid sequences for prediction. A support vector machine was used as the machine learning algorithm, and the position-specific scoring matrix created by the position-specific iterative basic local alignment search tool was used as the feature vector. We evaluated the performance of the predictors using five-fold cross-validation. We have released our system as an open source freeware tool in a GitHub repository (https://doi.org/10.5281/zenodo.61513).
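
The five-fold cross-validation mentioned in this abstract can be sketched as a plain index split. This is an illustrative sketch under assumptions: the abstract does not say whether folds were formed per residue or per protein, nor how samples were shuffled, so the function names, the seed, and the shuffling step below are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Split sample indices into k disjoint folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_samples, k=5, seed=0):
    """Yield (train, test) index lists, using each fold exactly once as the test set."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Each sample appears in exactly one test fold, so averaging a score over the five test folds uses every sample once for evaluation and four times for training.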

10.
11.
A partial least squares (PLS-1) calibration model based on spectrophotometric measurement is described for the simultaneous determination of CN⁻ and SCN⁻ ions. The method is based on the difference in the rates of reaction of CN⁻ and SCN⁻ ions with chloramine-T in a pH 4.0 buffer solution at 30 °C. The cyanogen chloride (CNCl) produced reacts with pyridine, and the product condenses with barbituric acid to form a final colored product. The absorption kinetic profiles of the solutions were monitored by measuring the absorbance at 578 nm from 20 to 180 s after initiation of the reaction, at 2 s intervals. The experimental calibration matrix for PLS-1 calibration was designed with 31 samples. Cross-validation was used to select the number of factors. The results showed that simultaneous determination could be performed in the ranges 10.0-900.0 and 50.0-1200.0 ng mL⁻¹ for CN⁻ and SCN⁻ ions, respectively. The proposed method was successfully applied to the simultaneous determination of cyanide and thiocyanate in water samples.

12.
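The application of support vector machine (SVM) regression, a method based on statistical learning theory, to quantitative X-ray fluorescence spectrometry was studied. Thirty-nine farmland soil samples were used as the experimental material, 32 of which formed the calibration set. Three SVM kernel functions (Linear, Poly, and RBF) were used to build regression models relating the As content to the fluorescence spectral data. The three models were then used to predict the As content of the 7 soil samples in the prediction set. The correlation coefficients R² between the As content predicted by the models and the As content determined by inductively coupled plasma optical emission spectrometry were all greater than 0.99, and the ratio of performance to deviation (RPD) was greater than 3 in every case, indicating that the SVM models are of practical value. To further examine the predictive performance of the SVM regression models, their predictions were compared with those of the well-established PLS regression model. The SVM predictions were better, indicating that SVM regression can also be used for quantitative prediction in portable X-ray fluorescence spectrometry.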
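
The two figures of merit quoted in this abstract can be sketched as below. The abstract does not give its formulas, so this assumes the common conventions: R² as the squared Pearson correlation between reference and predicted values, and RPD as the sample standard deviation of the reference values divided by the root-mean-square error of prediction (RMSEP). All function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def r_squared(reference, predicted):
    """Squared Pearson correlation between reference and predicted values."""
    mr, mp = mean(reference), mean(predicted)
    cov = sum((r - mr) * (p - mp) for r, p in zip(reference, predicted))
    var_r = sum((r - mr) ** 2 for r in reference)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_r * var_p)


def rmsep(reference, predicted):
    """Root-mean-square error of prediction."""
    return sqrt(mean([(r - p) ** 2 for r, p in zip(reference, predicted)]))


def rpd(reference, predicted):
    """Assumed definition: SD of the reference values divided by RMSEP."""
    return stdev(reference) / rmsep(reference, predicted)
```

Under these definitions, an RPD above 3 means the prediction error is less than one third of the natural spread of the reference values in the prediction set.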

13.
14.
Li-Juan Tang  Hai-Long Wu 《Talanta》2009,79(2):260-1694
One problem with discriminant analysis of microarray data is that each sample is represented by a large number of genes that may be irrelevant, insignificant, or redundant. Methods of variable selection are, therefore, of great significance in microarray data analysis. To circumvent the problem, a new gene mining approach is proposed based on the similarity between the probability density functions of each gene for the class of interest and for the other classes. This method identifies significant genes that are informative for discriminating each individual class rather than maximizing the separability of all classes. One can then select genes containing important information about particular subtypes of diseases. Based on the significant genes mined for individual classes, a support vector machine with local kernel transform is constructed for the classification of different diseases. The combination of the gene mining approach with the support vector machine is demonstrated for cancer classification using two public data sets. The results reveal that significant genes are identified for each cancer, and the classification model shows satisfactory performance in training and prediction for both data sets.

15.
16.
17.
18.
《Analytical letters》2012,45(12):1910-1921
Multiblock partial least squares (MB-PLS) is applied to the analysis of corn and tobacco samples using near-infrared diffuse reflection spectroscopy. In the model, the spectra are separated into several sub-blocks along the wavenumber axis, and a different number of latent variables is used for each sub-block. Compared with ordinary PLS, the importance and contribution of each sub-block can be balanced by the super-weights and by the use of different numbers of latent variables. Therefore, the predictions obtained by the MB-PLS model are superior to those of ordinary PLS, especially for the large tobacco data sets with a large number of variables.

19.
《Analytica chimica acta》2002,452(2):311-319
The characterisation of adsorption or impregnation processes using conventional or supercritical fluid technologies is becoming an increasingly important part of the research on drug formulations. The complexity of the relationships between these adsorption processes and the experimental variables potentially influencing them, however, makes these studies more problematic. In this paper, a chemometric approach based on nonlinear partial least squares (NL-PLS) modelling is applied to characterise the effect of the experimental variables on the supercritical impregnation process. Various adsorbent materials such as silica gel, zeolite and amberlite were investigated using the following model compounds as adsorbates: benzoic, salicylic and acetylsalicylic acids.

20.
Simultaneous anodic stripping voltammetric determination of Pb and Cd is restricted on gold electrodes because the two peaks overlap. This work describes the quantitative determination of a binary mixture of Pb and Cd at low concentration levels (up to 15.0 and 10.0 µg L⁻¹ for Pb and Cd, respectively) by differential pulse anodic stripping voltammetry (DPASV; deposition time of 30 s), using a green electrode (vibrating gold microwire electrode) without purging in a chloride medium (0.5 M NaCl) under moderately acidic conditions (HCl 1.0 mM), assisted by chemometric tools. The application of multivariate curve resolution alternating least squares (MCR‐ALS) to the resolution and quantification of both metals is shown. The optimized MCR‐ALS models showed good prediction ability, with concentration prediction errors of 12.4 and 11.4 % for Pb and Cd, respectively. The quantitative results obtained by MCR‐ALS were compared with those obtained by partial least squares (PLS) and classical least squares (CLS) regression. For both metals, the PLS and MCR‐ALS results are comparable and superior to those of CLS. For Cd, CLS was unsuitable because of the peak shift problem. MCR‐ALS provides an additional advantage over PLS, since it estimates the pure response of the analyte signals. Finally, the multivariate calibration models, based on either MCR‐ALS or PLS regression, allowed the quantification of Pb and Cd in surface river water samples, with satisfactory results.
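
The classical least squares (CLS) baseline named in this abstract can be sketched for a two-component mixture. This is a minimal illustration under assumptions, not the authors' implementation: it assumes the measured voltammogram is a linear combination of two fixed, known unit-concentration responses, and the function and parameter names are hypothetical.

```python
def cls_two_components(mixture, pure_a, pure_b):
    """Least-squares (c_a, c_b) such that mixture ~ c_a * pure_a + c_b * pure_b.

    pure_a and pure_b are the unit-concentration responses of the two analytes,
    sampled at the same potentials as the mixture signal.
    """
    # Entries of the 2x2 normal equations.
    saa = sum(a * a for a in pure_a)
    sbb = sum(b * b for b in pure_b)
    sab = sum(a * b for a, b in zip(pure_a, pure_b))
    sam = sum(a * m for a, m in zip(pure_a, mixture))
    sbm = sum(b * m for b, m in zip(pure_b, mixture))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("pure responses are collinear; concentrations are not identifiable")
    # Cramer's rule for the two unknown concentrations.
    c_a = (sam * sbb - sbm * sab) / det
    c_b = (sbm * saa - sam * sab) / det
    return c_a, c_b
```

Because the sketch treats the pure responses as fixed shapes, a peak that shifts between calibration and sample no longer matches its pure response, which is consistent with the abstract's remark that CLS was unsuitable for Cd.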
