
1.

Induction of decision trees using genetic programming for modelling ecotoxicity data: adaptive discretization of real-valued endpoints

Wang XZ Buontempo FV Young A Osborn D 《SAR and QSAR in environmental research》2006,17(5):451-471

Recent literature has demonstrated that genetic programming can be applied to the induction of decision trees for modelling toxicity endpoints. Other decision tree induction techniques are based on recursive partitioning, using a greedy search to choose the best splitting attribute and value at each node, and therefore necessarily miss regions of the search space; the genetic programming based approach can overcome this problem. However, the method still requires the often continuous-valued toxicity endpoints to be discretized before tree induction. This work introduces YAdapt, a novel extension of the method that models the original continuous endpoint by adaptively finding suitable ranges to describe it during the tree induction process. This removes the need for discretization prior to tree induction and allows the ordinal nature of the endpoint to be taken into account in the models built.
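To make the "greedy search" in the abstract above concrete, here is a minimal sketch of how a recursive-partitioning learner might pick a single split. It is an illustration only and is not YAdapt or the authors' genetic programming method; the function names `gini` and `best_greedy_split` and the use of Gini impurity as the criterion are assumptions made for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_greedy_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) for the best single split.

    rows: list of equal-length numeric feature lists.
    labels: class labels, i.e. an endpoint that was discretized beforehand.
    Candidate thresholds are midpoints between consecutive distinct values.
    If no split lowers the impurity, feature_index and threshold are None.
    """
    n = len(rows)
    best = (None, None, gini(labels))
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best


# Hypothetical toy data: two descriptors, endpoint already binned into classes.
rows = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
labels = ["low", "low", "high", "high"]
split = best_greedy_split(rows, labels)
```

The choice is made one node at a time from the local impurity alone, which is why combinations of splits that only pay off further down the tree can be missed. Note also that the labels must be classes, which is the prior discretization step the abstract says YAdapt removes.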

2.

A novel kernel Fisher discriminant analysis: constructing informative kernel by decision tree ensemble for metabolomics data analysis

Cao DS Zeng MM Yi LZ Wang B Xu QS Hu QN Zhang LX Lu HM Liang YZ 《Analytica chimica acta》2011,(1):97-104

Data from high-throughput metabolomics experiments are becoming increasingly large and complex, which poses considerable challenges for existing statistical modeling. There is therefore a need to develop statistically efficient approaches for mining the metabolite information contained in the metabolomics data under investigation. In this work, we developed a novel kernel Fisher discriminant analysis (KFDA) algorithm by constructing an informative kernel based on a decision tree ensemble. The constructed kernel can effectively encode the similarities of metabolomics samples between informative metabolites/biomarkers in specific parts of the measurement space. At the same time, informative metabolites or potential biomarkers can be discovered through variable importance ranking while the kernel is being built. Moreover, with such a kernel, KFDA can also handle nonlinear relationships in the metabolomics data to some extent. Finally, two real metabolomics datasets and one simulated dataset were used to demonstrate the performance of the proposed approach in comparison with other approaches.
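The abstract above does not spell out how the kernel is built from the tree ensemble. One common construction, shown here purely as an assumed illustration, is a proximity kernel: the similarity of two samples is the fraction of trees in which they land in the same leaf. The function name `proximity_kernel` and the input layout are hypothetical.

```python
def proximity_kernel(leaf_ids):
    """Build a sample-by-sample similarity matrix from tree leaf assignments.

    leaf_ids[t][i] is the leaf that tree t assigns to sample i.
    Returns an n x n list of lists where K[i][j] is the fraction of trees
    in which samples i and j fall into the same leaf.
    """
    n_trees = len(leaf_ids)
    n = len(leaf_ids[0])
    counts = [[0] * n for _ in range(n)]
    for leaves in leaf_ids:
        for i in range(n):
            for j in range(n):
                if leaves[i] == leaves[j]:
                    counts[i][j] += 1
    return [[c / n_trees for c in row] for row in counts]


# Hypothetical leaf assignments: 2 trees, 3 samples.
K = proximity_kernel([[0, 0, 1], [2, 3, 3]])
```

A matrix of this kind is symmetric with ones on the diagonal, so it can be passed to a kernel method in place of a standard kernel such as the RBF kernel.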

3.

Greener chemicals for the future: QSAR modelling of the PBT index using ETA descriptors

P. De 《SAR and QSAR in environmental research》2018,29(4):319-337


4.

Variable selection for multivariate calibration using a genetic algorithm: prediction of additive concentrations in polymer films from Fourier transform-infrared spectral data

Riccardo Leardi, Randy J. Pell 《Analytica chimica acta》2002,461(2):189-200

Variable selection using a genetic algorithm is combined with partial least squares (PLS) for the prediction of additive concentrations in polymer films from Fourier transform-infrared (FT-IR) spectral data. An approach based on iterative application of the genetic algorithm is proposed. This approach allows all variables to be considered while minimizing the risk of overfitting. We demonstrate that the variables selected by the genetic algorithm are consistent with expert knowledge. This very exciting result convincingly demonstrates that the algorithm can select the correct variables in an automated fashion.
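As a rough sketch of the idea in the abstract above, a genetic algorithm for variable selection can evolve boolean masks over the variables. This is not the authors' algorithm: it omits their iterative scheme, and the fitness below is a hypothetical toy score standing in for the cross-validated PLS performance that would be used in practice. All names and parameter values (`ga_select`, `toy_fitness`, `p_mut`, and so on) are assumptions.

```python
import random


def ga_select(n_vars, fitness, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Evolve a boolean mask over n_vars variables that maximizes fitness(mask).

    Requires n_vars >= 2 and pop_size >= 4.
    """
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_vars)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]        # truncation selection
        children = [scored[0][:]]                # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)       # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each gene with probability p_mut.
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best


# Hypothetical toy fitness: reward three "informative" variables and
# penalize every other selected variable. A real run would score a PLS model.
INFORMATIVE = {1, 4, 6}


def toy_fitness(mask):
    hits = sum(1 for i, g in enumerate(mask) if g and i in INFORMATIVE)
    extras = sum(1 for i, g in enumerate(mask) if g and i not in INFORMATIVE)
    return hits - 0.5 * extras


selected = ga_select(8, toy_fitness)
```

Because the best mask of each generation is copied unchanged into the next, the best fitness never decreases from one generation to the next. The penalty on extra variables is one simple way to discourage the overfitting the abstract mentions.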