Similar literature
5.
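Quantitative structure-activity/property relationship (QSAR/QSPR) studies rest on the premise that a compound's properties are correlated with its structure. Any method that describes the structure of a compound (yielding X) can therefore be used to build a mathematical model against the compound's properties (taken as Y), and that model can then be used to predict unknown compounds. Many variables can be derived from (that is, used to describe) a compound's structure; from a statistical standpoint, one wants to capture as much information as possible with as few variables as possible (as in multiple regression analysis). Too many variables not only increase the computational load and can make the resulting model unstable, degrading its predictions^[1], but different combinations of variables can also give very different results, so the variables need to be compressed and selected. Although variable selection is time-consuming and complex, its quality is critical to the stability and accuracy of the model; in a sense, it can decide whether a QSAR/QSPR study succeeds or fails. The simplest selection method is exhaustive enumeration of variable combinations, but its computational cost is very high, and it is infeasible in practice when the number of variables is large. Methods for variable selection have been reported, but the problem still calls for further study. This paper focuses on comparing the orthogonal transformation method with best-subset regression, and obtains instructive results.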
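The exhaustive approach mentioned in the abstract above can be illustrated with a minimal sketch. It assumes ordinary least squares with an intercept and uses the residual sum of squares as the selection criterion; the function name `best_subset` and the criterion are illustrative assumptions, not the procedure used in the paper. Because every k-variable subset out of p descriptors is fitted, the number of fits is C(p, k), which is why the method becomes impractical as p grows.

```python
from itertools import combinations
from math import comb

import numpy as np


def best_subset(X, y, k):
    """Fit every k-variable subset by least squares; return (rss, columns) of the best.

    Hypothetical helper for illustration only: the selection criterion
    (residual sum of squares) is an assumption, not taken from the source.
    """
    n, p = X.shape
    best_rss, best_cols = np.inf, None
    for cols in combinations(range(p), k):
        # Design matrix: intercept column plus the chosen descriptors.
        A = np.column_stack([np.ones(n), X[:, list(cols)]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        rss = float(np.sum((y - A @ coef) ** 2))
        if rss < best_rss:
            best_rss, best_cols = rss, cols
    return best_rss, best_cols


def n_subsets(p, k):
    """Number of least-squares fits the exhaustive search performs."""
    return comb(p, k)
```

For example, `n_subsets(50, 5)` is 2,118,760, so even a modest descriptor pool requires millions of fits for a single subset size.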

20.
The relevance of terms other than linear ones when deriving quantitative structure-activity relationship/quantitative structure-property relationship (QSAR/QSPR) models has rarely been considered so far. In this study, the impact of quadratic and interaction terms has been taken into account. The first effect of including such highly structured terms is a significant extension of the parametric domain, which grows from the initial N to N(N + 3)/2 parameters. This substantial enlargement beyond the conventional linear boundaries involves a higher computational cost because of the increased combinatorial number of resulting theoretical QSAR/QSPR models. To address this issue, novel genetic-algorithm-based software, MGZ (multigenetic zooming), was developed and used for both variable selection and model building. To speed up the entire process of domain searching, MGZ was supported with multiple independent evolving populations and genetic storms to further QSAR/QSPR analyses. In addition, a novel fitness function was developed to score models on the basis of their inner predictive capability (assessed on the training set), structure complexity, and presence of nonlinear terms. The models were further validated by monitoring model redundancy and performing intensive randomization runs. The Selwood data set was used as a reference set to derive QSAR models. Furthermore, a QSPR study was conducted on the solubility data set of a large array of organic compounds. The results reported in the present paper demonstrate that our approach is successful in finding linear models that are at least as good as the models previously derived using standard statistical approaches, and in deriving new nonlinear models with good statistical figures.
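The count N(N + 3)/2 quoted above is consistent with N linear terms, N squared terms, and N(N - 1)/2 pairwise interaction terms, since N + N + N(N - 1)/2 = N(N + 3)/2. The abstract does not spell out this decomposition, so the following is a sketch under that assumption; the function name `expand_terms` is hypothetical and is not part of MGZ.

```python
from itertools import combinations

import numpy as np


def expand_terms(X):
    """Append squared and pairwise-product columns to a descriptor matrix.

    Assumed decomposition (not stated in the source): N linear terms,
    N quadratic terms, and N(N - 1)/2 interaction terms.
    """
    n, N = X.shape
    squares = X ** 2
    pairs = [X[:, i] * X[:, j] for i, j in combinations(range(N), 2)]
    # With a single descriptor there are no pairs, so use an empty block.
    cross = np.column_stack(pairs) if pairs else np.empty((n, 0))
    return np.hstack([X, squares, cross])


def n_parameters(N):
    """Size of the expanded parametric domain, N(N + 3)/2."""
    return N * (N + 3) // 2
```

With N = 10 descriptors the candidate pool grows to 65 terms, which illustrates why the combinatorial search space expands so sharply once nonlinear terms are admitted.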
