Similar literature
1.
Cross‐validation has become one of the principal methods for tuning the meta‐parameters of predictive models. Extensions of the cross‐validation idea have been proposed to select the number of components in principal components analysis (PCA). Element‐wise k‐fold (ekf) cross‐validation is among the most widely used algorithms for PCA cross‐validation. It is the method implemented in the PLS_Toolbox, and a numerical experiment has reported that it outperforms other methods under most circumstances. The ekf algorithm is based on missing data imputation, and it can be implemented with any imputation method. In this paper, the ekf algorithm with the simplest missing data imputation method, trimmed score imputation, is analyzed. A theoretical study is carried out to identify the situations in which applying ekf is adequate and, more importantly, those in which it is not. The results show that the ekf method may be unable to assess the extent to which a model represents a test set and may lead to discarding principal components that carry important information. In a second paper of this series, other imputation methods are studied within the ekf algorithm. Copyright © 2012 John Wiley & Sons, Ltd.
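The ekf idea summarized in this abstract can be illustrated with a minimal sketch. It assumes NumPy is available, uses a hypothetical function name `ekf_press`, and simplifies the element folds to one variable at a time; it is not the PLS_Toolbox implementation. For each left‐out block of rows, a PCA model is fitted on the remaining rows; each variable of the left‐out rows is then treated as missing, its score is computed with the missing entry set to zero (trimmed score imputation), and the squared reconstruction error of that entry is accumulated into PRESS.

```python
import numpy as np

def ekf_press(X, max_components, row_folds=4):
    """PRESS for 1..max_components PCs by element-wise k-fold CV (illustrative)."""
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    press = np.zeros(max_components)
    for fold in range(row_folds):
        test = np.arange(n_rows) % row_folds == fold
        mean = X[~test].mean(axis=0)
        # Loadings of the PCA model fitted on the training rows only.
        _, _, vt = np.linalg.svd(X[~test] - mean, full_matrices=False)
        Xt = X[test] - mean
        for a in range(1, max_components + 1):
            P = vt[:a].T                  # loadings, shape (n_cols, a)
            for j in range(n_cols):       # treat variable j as missing
                x_obs = Xt.copy()
                x_obs[:, j] = 0.0         # trimmed score: missing entry set to zero
                scores = x_obs @ P        # scores from the observed entries only
                x_hat_j = scores @ P[j]   # reconstruction of the missing entry
                press[a - 1] += np.sum((Xt[:, j] - x_hat_j) ** 2)
    return press
```

The number of components would then be read off where PRESS stops decreasing, which is exactly the decision the abstract warns can be misleading.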

2.
In quantitative structure‐activity relationship (QSAR) studies, local lazy regression (LLR) predicts the activity of a query molecule from the information in its local neighborhood, without the need to build QSAR models a priori. When a prediction is required for a query compound, a set of local models with different numbers of nearest neighbors is built. The leave‐one‐out cross‐validation (LOO‐CV) procedure is usually used to assess the predictive ability of each model, and the model giving the lowest LOO‐CV error or the highest LOO‐CV correlation coefficient is chosen as the best model. However, it has been shown that a good LOO cross‐validation statistic is a necessary but not a sufficient condition for a model to have high predictive power. In this work, a new strategy is proposed to improve the predictive ability of LLR models and to assess the accuracy of a query prediction. The bandwidth of LLR (the value of k neighbors) is optimized by considering the predictive ability of the local models on an external validation set. This approach was applied to the QSAR study of a series of thienopyrimidinone antagonists of melanin‐concentrating hormone receptor 1. The results obtained with the new strategy show an evident improvement over the commonly used LOO‐CV LLR methods and the traditional global linear model. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2010

3.
4.
An evaluation of the computational performance and the precision of the cross‐validation error of five partial least squares (PLS) algorithms (NIPALS, modified NIPALS, Kernel, SIMPLS and bidiagonal PLS), all available and widely used in the literature, is presented. When dealing with large data sets, computational time is an important issue, mainly in cross‐validation and variable selection. In the present paper, the PLS algorithms are compared in terms of run time and of the relative error in the precision obtained when performing leave‐one‐out cross‐validation on simulated and real data sets. The simulated data sets were investigated through factorial and Latin square experimental designs. The evaluations were based on the number of rows, the number of columns and the number of latent variables. The results for both simulated and real data sets show that the differences in run time are statistically significant. Bidiagonal PLS is the fastest algorithm, followed by Kernel and SIMPLS. Regarding cross‐validation error, all algorithms showed similar results. However, in some situations, for example when many latent variables were involved, discrepancies were observed, especially for SIMPLS. Copyright © 2010 John Wiley & Sons, Ltd.

5.
A reliable selection of a representative subset of chemical compounds has been reported to be crucial for numerous tasks in computational chemistry and chemoinformatics. We investigated the usability of an approach based on the k‐medoid algorithm for this task, in particular for experimental design and for the split between training and validation sets. We therefore compared the performance of models derived from such a selection with that of models derived using several other approaches, such as space‐filling design and D‐optimal design. We validated the performance on four datasets with different endpoints, representing toxicity, physicochemical properties and others. Compared with the models derived from the compounds selected by the other examined approaches, those derived with the k‐medoid selection show high reliability for experimental design, as their performance was consistently among the best for all examined datasets. Of all the models derived with all examined approaches, those derived with the k‐medoid approach were the only ones that showed significantly improved performance compared with a random selection for all datasets, over the whole examined range of selected compounds and for each dimensionality of the search space. Copyright © 2012 John Wiley & Sons, Ltd.

6.
Recent developments in fragment‐based methods make it increasingly feasible to apply high‐level ab initio electronic structure techniques to molecular crystals. Such studies remain computationally demanding, however. Here, we describe a straightforward algorithm for exploiting space‐group symmetry in fragment‐based methods, which often provides computational speed‐ups of several fold or more. This algorithm does not require a priori specification of the space group or symmetry operators. Rather, the symmetrically equivalent fragments are identified automatically by aligning the individual fragments along their principal axes of inertia and testing for equivalence with other fragments. The symmetry operators relating equivalent fragments can then be worked out easily. Implementation of this algorithm for computing energies, nuclear gradients with respect to both atomic coordinates and lattice parameters, and the nuclear Hessian is described. © 2014 Wiley Periodicals, Inc.

7.
D‐Fructose‐6‐phosphate aldolase (FSA) is a unique catalyst for asymmetric cross‐aldol additions of glycolaldehyde. A combination of structure‐guided saturation mutagenesis, site‐directed mutagenesis, and computational modeling was applied to construct a set of FSA variants that improved the catalytic efficiency towards glycolaldehyde dimerization by up to 1800‐fold. A combination of mutations at positions L107, A129, and A165 provided a toolbox of FSA variants that expand the synthetic possibilities for the preparation of aldose‐like carbohydrate compounds. The new FSA variants were applied as highly efficient catalysts for cross‐aldol additions of glycolaldehyde to N‐carbobenzyloxyaminoaldehydes to furnish 80–98 % aldol adduct under optimized reaction conditions. Donor competition experiments showed high selectivity for glycolaldehyde relative to dihydroxyacetone or hydroxyacetone. These results demonstrate the exceptional malleability of the active site in FSA, which can be remodeled to accept a wide spectrum of donor and acceptor substrates with high efficiency and selectivity.

8.
The reaction of 4,4′‐biphenol with two bromoalkanes (bromoethane and 1‐bromobutane) to synthesize two symmetric products (4,4′‐diethanoxy biphenyl and 4,4′‐dibutanoxy biphenyl) and one asymmetric product (4‐ethanoxy, 4′‐butanoxy biphenyl) was successfully carried out under two‐phase phase‐transfer catalysis conditions. A rational mechanism and kinetic model were developed by considering the reactions in both the aqueous phase and the organic phase. The first active catalyst (QO(Ph)2OQ) was also synthesized under two‐phase reaction conditions and was identified instrumentally. The experimental data were explained satisfactorily by the pseudo‐steady‐state hypothesis. Two sets of rate constants of the organic reactions, i.e. primary (k1 and k2) and secondary (k11, k12, k21, and k22) rate constants, appear in the kinetic model. The two primary rate constants were obtained individually from experimental data for the synthesis of the symmetric products. The ratios of the four secondary rate constants were obtained from the reaction synthesizing the asymmetric product and were determined from the initial yield rates of the symmetric products. The effects of the ratio of bromoethane to 1‐bromobutane, temperature, organic solvent, amount of catalyst, and amount of sodium hydroxide on the reaction rate and the selectivity of products were investigated in detail. The results were explained satisfactorily by the interaction between the reactants and the environmental species. © 2003 Wiley Periodicals, Inc. Int J Chem Kinet 35: 139–153, 2003

9.
A strategy that uses neutral model compounds to measure the lipophilicity of ionizable basic compounds by reversed‐phase high‐performance liquid chromatography is proposed in this paper. The applicability of the novel protocol was justified by theoretical derivation. Linear relationships between the logarithm of the apparent n‐octanol/water partition coefficients (logKow′′) and the logarithm of the retention factors corresponding to a 100% aqueous mobile phase (logkw) were established for a basic training set, a neutral training set and a mixed training set of the two. As predicted by theory, the good linearity and external validation results indicated that the logKow′′–logkw relationships obtained from a neutral model training set were always reliable regardless of mobile phase pH. These relationships were then used to determine the logKow of harmaline, a weakly dissociable alkaloid. As far as we know, this is the first report of experimental logKow data for harmaline (logKow = 2.28 ± 0.08). Introducing neutral compounds into a basic model training set, or using neutral model compounds alone, is recommended for measuring the lipophilicity of weakly ionizable basic compounds, especially those with high hydrophobicity, because it offers more suitable choices of model compounds and convenient control of mobile phase pH.

10.
11.
Sulfenic acids play a prominent role in biology as key participants in cellular signaling relating to redox homeostasis, in the formation of protein‐disulfide linkages, and as the central players in the fascinating organosulfur chemistry of the Allium species (e.g., garlic). Despite their relevance, direct measurements of their reaction kinetics have proven difficult owing to their high reactivity. Herein, we describe the results of hydrocarbon autoxidations inhibited by the persistent 9‐triptycenesulfenic acid, which yields a second‐order rate constant of 3.0×10⁶ M⁻¹ s⁻¹ for its reaction with peroxyl radicals in PhCl at 30 °C. This rate constant drops 19‐fold in CH3CN, and is subject to a significant primary deuterium kinetic isotope effect, kH/kD=6.1, supporting a formal H‐atom transfer (HAT) mechanism. Analogous autoxidations inhibited by the Allium‐derived (S)‐benzyl phenylmethanethiosulfinate and a corresponding deuterium‐labeled derivative unequivocally demonstrate the role of sulfenic acids in the radical‐trapping antioxidant activity of thiosulfinates, through the rate‐determining Cope elimination of phenylmethanesulfenic acid (kH/kD≈4.5) and its subsequent formal HAT reaction with peroxyl radicals (kH/kD≈3.5). The rate constant that we derived from these experiments for the reaction of phenylmethanesulfenic acid with peroxyl radicals was 2.8×10⁷ M⁻¹ s⁻¹, a value 10‐fold larger than that we measured for the reaction of 9‐triptycenesulfenic acid with peroxyl radicals. We propose that whereas phenylmethanesulfenic acid can adopt the optimal syn geometry for a 5‐centre proton‐coupled electron‐transfer reaction with a peroxyl radical, 9‐triptycenesulfenic acid is too sterically hindered, and undergoes the reaction instead through the less energetically favorable anti geometry, which is reminiscent of a conventional HAT.

12.
N‐(1H‐Imidazoline‐2‐yl)‐1H‐benzimidazol‐2‐amine was synthesized under microwave irradiation (MWI) conditions. A dynamic 1H NMR investigation of the compound is reported over the temperature range 223–333 K in DMF‐d7. Physical parameters such as the coalescence temperature (Tc), the free energy of activation (ΔG‡) and the rate constant (k) were calculated from its 1H NMR spectra at various temperatures. The electrochemical behavior of the compound was investigated by cyclic voltammetry (CV) and square wave voltammetry (SWV).
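The abstract does not say which relations were used to obtain k and ΔG‡ from the dynamic NMR spectra. As a sketch under stated assumptions, the commonly used approximations are shown here: the coalescence rate constant k = πΔν/√2 for an equally populated, uncoupled two‐site exchange, and the Eyring equation with a transmission coefficient of 1. The function names and the input values in the usage note are hypothetical.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
R = 8.314462618       # gas constant, J/(mol*K)

def rate_at_coalescence(delta_nu_hz):
    """Exchange rate constant (s^-1) at coalescence for two equal, uncoupled sites."""
    return math.pi * delta_nu_hz / math.sqrt(2.0)

def eyring_delta_g(k, temperature_k):
    """Free energy of activation (J/mol) from the Eyring equation."""
    return R * temperature_k * math.log(K_B * temperature_k / (H * k))
```

For example, a hypothetical peak separation of 100 Hz coalescing at 298 K gives k of about 222 s⁻¹ and a ΔG‡ of roughly 59.6 kJ/mol.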

13.
Reaction orders for the key components in the palladium(II)‐catalyzed oxidative cross‐coupling between phenylboronic acid and ethyl thiophen‐3‐yl acetate were obtained by the method of initial rates. The reaction rate depended not only on the concentration of palladium trifluoroacetate (reaction order: 0.97) and phenylboronic acid (reaction order: 1.26), but also on the concentration of the thiophene (reaction order: 0.55) and silver oxide (reaction order: −1.27). NMR spectroscopy titration studies established the existence of 1:1 complexes between the silver salt and both phenylboronic acid and ethyl thiophen‐3‐yl acetate. A low inverse kinetic isotope effect (kH/kD=0.93) was determined upon employing the 4‐deuterated isotopomer of ethyl thiophen‐3‐yl acetate and monitoring its reaction to the 4‐phenyl‐substituted product. A Hammett analysis performed with para‐substituted 2‐phenylthiophenes gave a negative ρ value for oxidative cross‐coupling with phenylboronic acid. Based on the kinetic data and additional evidence, a mechanism is suggested that invokes transfer of the phenyl group from phenylboronic acid to a 1:1 complex of palladium trifluoroacetate and thiophene as the rate‐determining step. Proposals for the structure of relevant intermediates are made and discussed.
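In the method of initial rates, the order in one component is the slope of log(initial rate) against log(concentration) while the other concentrations are held fixed. The sketch below fits that slope by ordinary least squares; the function name `reaction_order` is hypothetical, and any data passed to it here are illustrative rather than values from the paper.

```python
import math

def reaction_order(concentrations, initial_rates):
    """Least-squares slope of ln(rate) versus ln(concentration)."""
    xs = [math.log(c) for c in concentrations]
    ys = [math.log(r) for r in initial_rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

A negative slope, such as the −1.27 reported for silver oxide, means that the initial rate falls as the concentration of that component rises.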

14.
The values of pseudo first‐order rate constants (kobs) for the cleavage of N‐(2‐hydroxyphenyl)phthalamic acid (7), obtained at 4.9 × 10⁻² M HCl, 35°C, and within the CH3CN content range 2–80% (v/v) in mixed aqueous solvent, are smaller than kobs for the cleavage of N‐(2‐methoxyphenyl)phthalamic acid (8), obtained under almost similar experimental conditions, by nearly 1.5‐ to 2‐fold. These observations show the absence of the expected intramolecular general acid catalysis due to the 2‐OH group in 7. The values of kobs for the cleavage of 7 and 8 decrease by more than 20‐fold with the increase in the content of CH3CN from 2 to 80–82% (v/v) in mixed aqueous solvent. The kinetic data reveal that in the acidic aqueous cleavage of 7, N‐cyclization (leading to the formation of imide) and O‐cyclization (leading to the formation of phthalic anhydride) vary from ~10 to 15% and ~90 to 85%, respectively, with the increase in CH3CN content from 2 to 80% (v/v). A similar increase in CH3CN content causes an increase in N‐cyclization from ~0 to 5% and a decrease in O‐cyclization from ~100 to 95% in the acidic aqueous cleavage of 8. Some speculative, yet conceivable, reasons for nearly 10 and 0% N‐cyclization in the cleavage of 7 and 8, respectively, at low CH3CN content have been described. © 2006 Wiley Periodicals, Inc. Int J Chem Kinet 38: 746–758, 2006

15.
Solid‐phase extraction (SPE) is a very popular technique, commonly used for the prepurification, concentration, and isolation of different organic compounds from various matrices. In this work, the SPE process was optimized. The breakthrough volume of octadecylsilane‐based solid sorbents was determined, and three methods were compared: (1) a calculation method, in which the breakthrough volume was calculated from the retention factor k determined by micro‐TLC, and frontal analysis, in which the breakthrough volume was determined either (2) as the volume of the whole elution peak or (3) as the center of gravity of the peak. For the calculation method, the k values of key estrogens and progestogens were derived from the micro‐TLC experiment reported previously. By combining these three methods, we can identify the start of elution, the maximum concentration of analyte in the eluate, and the whole eluent volume, which is necessary to achieve appropriate selectivity and high extraction recovery. The proposed calculation method makes it possible to estimate the beginning of the steroid peak, when the analyte appears in the eluate flowing from the sorbent. This observation advances the SPE optimization protocol that was described before and was based on the correlation between raw kSPE and kmicro‐TLC data.

16.
The 2,6,8‐triaryl‐3‐iodoquinolin‐4(1H)‐ones derived from the 2,6,8‐triarylquinolin‐4(1H)‐ones were found to undergo Suzuki–Miyaura cross‐coupling with arylboronic acids to afford the corresponding 2,3,6,8‐tetraarylquinolin‐4(1H)‐ones. Sonogashira cross‐coupling of the 2,6,8‐triaryl‐3‐iodoquinolin‐4(1H)‐ones with terminal acetylene in DMF–water (4:1, v/v) in the presence of triethylamine, on the other hand, afforded the 2‐substituted 4,6,8‐triaryl‐1H‐furo[3,2‐c]quinolines in a single‐pot operation.

17.
Malonylation is a recently discovered post‐translational modification (PTM) in which a malonyl group attaches to a lysine (K) amino acid residue of a protein. In this work, a novel machine learning model, SPRINT‐Mal, is developed to predict malonylation sites by employing sequence and predicted structural features. Evolutionary information and physicochemical properties are found to be the two most discriminative features, whereas a structural feature called half‐sphere exposure provides additional improvement to the prediction performance. SPRINT‐Mal trained on mouse data yields robust performance for 10‐fold cross‐validation and an independent test set, with Area Under the Curve (AUC) values of 0.74 and 0.76 and Matthews’ Correlation Coefficient (MCC) values of 0.213 and 0.20, respectively. Moreover, SPRINT‐Mal achieved comparable performance when tested on H. sapiens proteins without species‐specific training, but not on proteins from the bacterium S. erythraea. This suggests similar underlying physicochemical mechanisms between mouse and human but not between mouse and bacterium. SPRINT‐Mal is freely available as an online server at: http://sparks-lab.org/server/SPRINT-Mal/ . © 2018 Wiley Periodicals, Inc.
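For readers less familiar with the Matthews correlation coefficient quoted above, the sketch below computes it from the four confusion‐matrix counts of a binary classifier. The function name `mcc` is illustrative, and returning 0 when a row or column of the confusion matrix is empty is a common convention adopted here, not something stated in the abstract.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention when a marginal total is zero
    return (tp * tn - fp * fn) / denom
```

The value ranges from −1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction right), so it stays informative when negative sites far outnumber positive ones.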

18.
We have carried out relative rate experiments (T = 294 ± 2 K, atmospheric pressure) to investigate the OH‐oxidation of o‐, m‐, and p‐ethyltoluene and n‐nonane (k1, k2, k3, and k4, respectively). The experiments were performed in a 2‐m³ smog chamber with Teflon‐coated walls. The rate constants obtained are (in cm³ molecule⁻¹ s⁻¹, with two‐sigma uncertainties): k1 = (1.36 ± 0.07) × 10⁻¹¹; k2 = (2.12 ± 0.26) × 10⁻¹¹; k3 = (1.47 ± 0.04) × 10⁻¹¹; and k4 = (0.95 ± 0.02) × 10⁻¹¹. The measured rate constants are in accordance with previously published data, so that a coherent group of values for the compounds studied can be established. Atmospheric implications, ozone, and particle production are discussed. In addition, we have determined the amount of o‐, m‐, and p‐ethyltoluenes in different types of gasoline. © 2004 Wiley Periodicals, Inc. Int J Chem Kinet 36: 367–378, 2004
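In a relative rate experiment, the target and a reference compound are consumed by OH in the same chamber, so that ln([target]0/[target]t) = (k_target/k_ref) · ln([ref]0/[ref]t), provided reaction with OH is the only loss process for both. The sketch below fits that ratio as a slope through the origin. The function name and any input data are hypothetical, and the abstract does not name the reference compound that was used.

```python
import math

def relative_rate_constant(target, reference, k_reference):
    """Rate constant of the target from paired concentration series.

    target, reference: concentrations at the same sampling times,
    with the initial value first. k_reference: known OH rate constant
    of the reference compound (same units as the returned value).
    """
    xs = [math.log(reference[0] / c) for c in reference[1:]]
    ys = [math.log(target[0] / c) for c in target[1:]]
    # Least-squares slope of a line forced through the origin.
    slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return slope * k_reference
```

Because only concentration ratios enter the fit, the absolute OH concentration in the chamber never needs to be known.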

19.
Using factor analysis and stepwise linear regression methods, two parameters (CMR and ECCR) were selected from eight solute‐related structure parameters as the most retention‐influencing parameters. The relationships between the retention data (k′) and the two structure parameters were established for 13 O‐aryl,O‐(1‐methylthioethylideneamino)phosphate compounds under a wide range of experimental conditions. The retention data (k′) of another seven compounds with similar structures were predicted using these QSRR equations. Good agreement was obtained between the experimental k′ values and the predicted ones.

20.
1,3‐Dipolar cycloaddition of methyl diazoacetate to methyl acrylate was investigated by kinetic 1H NMR spectroscopy. It was established that the mechanism of the process includes parallel formation of trans‐ and cis‐dimethyl‐4,5‐dihydro‐3H‐pyrazol‐3,5‐dicarboxylates as a result of [3 + 2]‐cycloaddition of methyl diazoacetate to methyl acrylate; the corresponding rate constants were denoted k1t and k1c. The rate of isomerization of the 3H‐pyrazolines to 4,5‐dihydro‐1H‐pyrazol‐3,5‐dicarboxylate (3H → 1H‐pyrazoline rearrangement) was found to be sensitive to both the methyl acrylate (k2t, k2c) and the 1H‐pyrazoline concentrations (k3t, k3c). Kinetic analysis showed that the proposed scheme is valid for various reagent concentrations. The numerical solution of the system of differential equations corresponding to the reaction scheme was used to determine the complete set of reaction rate constants (k × 10⁵, M⁻¹·s⁻¹; 298 K; solvent, benzene‐d6): k1t = 2.3 ± 0.3, k1c = 1.6 ± 0.2, k2t = 1.1 ± 0.3, k2c = 1.8 ± 0.5, k3t = 1.2 ± 0.4, k3c = 2.2 ± 0.7.
