
1.

Variable ranking based on the estimated degree of separation for two distributions of data by the length of the receiver operating characteristic curve

Waleed M. Maswadeh A. Peter Snyder 《Analytica chimica acta》2015

Variable responses are fundamental for all experiments, and they can consist of information-rich, redundant, and low signal intensities. A dataset can consist of a collection of variable responses over multiple classes or groups. Usually, variables that contain very little information are removed from a dataset; sometimes all the variables are used in the data analysis phase. It is common practice to discriminate between two distributions of data; however, there is no formal algorithm to arrive at a degree of separation (DS) between two distributions of data. The DS is defined herein as the average of the sum of the areas from the probability density functions (PDFs) of A and B that contain at least a given percentage of A and/or B. Thus, DS90 is the average of the sum of the PDF areas of A and B that contain ≥90% of A and/or B. To arrive at a DS value, two synthesized PDFs or very large experimental datasets are required. Experimentally, it is common practice to generate relatively small datasets. Therefore, the challenge was to find a statistical parameter that can be used on small datasets to estimate, and correlate highly with, the DS90 parameter. Established statistical methods include the overlap area of the two data distribution profiles, Welch’s t-test, the Kolmogorov–Smirnov (K–S) test, the Mann–Whitney–Wilcoxon test, and the area under the receiver operating characteristic (ROC) curve (AUC). The area between the ROC curve and diagonal (ACD) and the length of the ROC curve (LROC) are introduced. The established, ACD, and LROC methods were correlated to the DS90 when applied to many pairs of synthesized PDFs. The LROC method provided the best linear correlation with, and estimation of, the DS90. The estimated DS90 from the LROC (DS90–LROC) is applied, as an example, to a database of three Italian wines consisting of thirteen variable responses for variable ranking consideration. An important highlight of the DS90–LROC method is that it uses the LROC methodology to test all variables one at a time with all pairs of classes in a dataset.
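The three ROC-based quantities named in this abstract (AUC, ACD, and LROC) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `roc_points` and `roc_summary`, the evenly spaced threshold grid, and the reading of ACD as |AUC − 0.5| are all assumptions made for illustration. The grid matters because a raw staircase ROC curve built from untied samples consists only of horizontal and vertical steps and therefore always has length 2, so the length is only informative when the curve is evaluated on binned or smoothed distributions.

```python
import math

def roc_points(neg, pos, n_thresholds=50):
    """ROC points (FPR, TPR) on an evenly spaced threshold grid (an assumption).

    A value is called positive when it is >= the threshold.
    """
    lo = min(min(neg), min(pos))
    hi = max(max(neg), max(pos))
    pts = [(0.0, 0.0)]  # explicit start of the curve
    for i in range(n_thresholds + 1):
        t = hi - (hi - lo) * i / n_thresholds  # sweep from high to low
        fpr = sum(x >= t for x in neg) / len(neg)
        tpr = sum(x >= t for x in pos) / len(pos)
        pts.append((fpr, tpr))
    pts.append((1.0, 1.0))  # explicit end of the curve
    return pts

def roc_summary(neg, pos, n_thresholds=50):
    """Trapezoidal AUC, |AUC - 0.5| as ACD (assumed), and polyline length as LROC."""
    pts = roc_points(neg, pos, n_thresholds)
    auc = 0.0
    length = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2
        length += math.hypot(x1 - x0, y1 - y0)
    return {"auc": auc, "acd": abs(auc - 0.5), "lroc": length}
```

Under these assumptions the length runs from √2 (the curve lies on the diagonal, no separation) to 2 (the curve follows the left and top edges, complete separation).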

2.

Gene expression analysis of combined RNA-seq experiments using a receiver operating characteristic calibrated procedure

《Computational Biology and Chemistry》2021


3.

Feature selection for descriptor based classification models. 1. Theory and GA-SEC algorithm

Wegner JK Fröhlich H Zell A 《Journal of chemical information and computer sciences》2004,44(3):921-930

The paper describes different aspects of classification models based on molecular data sets with the focus on feature selection methods. Especially model quality and avoiding a high variance on unseen data (overfitting) will be discussed with respect to the feature selection problem. We present several standard approaches and modifications of our Genetic Algorithm based on the Shannon Entropy Cliques (GA-SEC) algorithm and the extension for classification problems using boosting.

4.

Feature selection for descriptor based classification models. 2. Human intestinal absorption (HIA)

Wegner JK Fröhlich H Zell A 《Journal of chemical information and computer sciences》2004,44(3):931-939


5.

Emerging chemical patterns: a new methodology for molecular classification and compound selection

Auer J Bajorath J 《Journal of chemical information and modeling》2006,46(6):2502-2514

A concept termed Emerging Chemical Patterns (ECPs) is introduced as a novel approach to molecular classification. The methodology makes it possible to extract key molecular features from very few known active compounds and classify molecules according to different potency levels. The approach was developed in light of the situation often faced during the early stages of lead optimization efforts: too few active reference molecules are available to build computational models for the prediction of potent compounds. The ECP method generates high-resolution signatures of active compounds. Predictive ECP models can be built based on the information provided by sets of only three molecules with potency in the nanomolar and micromolar range. In addition to individual compound predictions, an iterative ECP scheme has been designed. When applied to different sets of active molecules, iterative ECP classification produced compound selection sets with increases in average potency of up to 3 orders of magnitude.

6.

An attempt for a new classification of the methods used to evaluate non-isothermal kinetic parameters

E. Urbanovici E. Segal 《Thermochimica Acta》1985,94(2):409-410


7.

Influence of variable selection on partial least squares discriminant analysis models for explosive residue classification

Frank C. De Lucia Jr. Jennifer L. Gottfried 《Spectrochimica Acta Part B: Atomic Spectroscopy》2011,66(2):122-128

Using a series of thirteen organic materials that includes novel high-nitrogen energetic materials, conventional organic military explosives, and benign organic materials, we have demonstrated the importance of variable selection for maximizing residue discrimination with partial least squares discriminant analysis (PLS-DA). We built several PLS-DA models using different variable sets based on laser induced breakdown spectroscopy (LIBS) spectra of the organic residues on an aluminum substrate under an argon atmosphere. The model classification results for each sample are presented and the influence of the variables on these results is discussed. We found that using the whole spectra as the data input for the PLS-DA model gave the best results. However, variables due to the surrounding atmosphere and the substrate contribute to discrimination when the whole spectra are used, indicating this may not be the most robust model. Further iterative testing with additional validation data sets is necessary to determine the most robust model.

8.

Determination of the LOQ in real-time PCR by receiver operating characteristic curve analysis: application to qPCR assays for Fusarium verticillioides and F. proliferatum

Nutz S Döll K Karlovsky P 《Analytical and bioanalytical chemistry》2011,401(2):717-726

Real-time PCR (qPCR) is the principal technique for the quantification of pathogen biomass in host tissue, yet no generic methods exist for the determination of the limit of quantification (LOQ) and the limit of detection (LOD) in qPCR. We suggest using the Youden index in the context of the receiver operating characteristic (ROC) curve analysis for this purpose. The LOQ was defined as the amount of target DNA that maximizes the sum of sensitivity and specificity. The LOD was defined as the lowest amount of target DNA that was amplified with a false-negative rate below a given threshold. We applied this concept to qPCR assays for Fusarium verticillioides and Fusarium proliferatum DNA in maize kernels. Spiked matrix and field samples characterized by melting curve analysis of PCR products were used as the source of true positives and true negatives. On the basis of the analysis of sensitivity and specificity of the assays, we estimated the LOQ values as 0.11 pg of DNA for spiked matrix and 0.62 pg of DNA for field samples for F. verticillioides. The LOQ values for F. proliferatum were 0.03 pg for spiked matrix and 0.24 pg for field samples. The mean LOQ values correspond to approximately eight genomes for F. verticillioides and three genomes for F. proliferatum. We demonstrated that the ROC analysis concept, developed for qualitative diagnostics, can be used for the determination of performance parameters of quantitative PCR.
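The Youden-index criterion described in this abstract (pick the cutoff that maximizes sensitivity plus specificity) can be sketched as follows. This is an illustrative sketch only: the function name `youden_cutoff`, the rule that a sample counts as positive when its value is at or above the cutoff, and the tie-breaking toward the lowest cutoff are assumptions, and the numbers in the usage note are invented rather than taken from the paper.

```python
def youden_cutoff(positives, negatives):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    positives / negatives: measured amounts for known true positives and
    true negatives. A sample is called positive when value >= cutoff
    (an assumed decision rule). Ties keep the lowest cutoff.
    """
    best = None
    for c in sorted(set(positives) | set(negatives)):
        sensitivity = sum(v >= c for v in positives) / len(positives)
        specificity = sum(v < c for v in negatives) / len(negatives)
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (c, j)
    return best
```

For the hypothetical values `youden_cutoff([0.5, 0.8, 1.2, 2.0], [0.05, 0.1, 0.2, 0.6])`, the function returns `(0.5, 0.75)`: at a cutoff of 0.5 all four positives are detected and three of the four negatives are rejected.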

9.

Pharmacophore modeling methods in focused library selection - applications in the context of a new classification scheme

Luu TT Malcolm N Nadassy K 《Combinatorial chemistry & high throughput screening》2011,14(6):488-499

A pharmacophore is a model which represents the key physico-chemical interactions that mediate biological activity. There is a long history of using pharmacophore modeling methods to select subsets of compounds, focused towards a specific target of interest. This paper will review existing computational methods for deriving and comparing pharmacophore models. We outline a new classification of pharmacophore methods based on the abstraction of the underlying chemical interactions which embody a pharmacophore, and the methods available to quantitatively compare them. Within the context of this classification, example studies, using specific pharmacophore modeling methods for focused library selection, will be discussed.

10.

Statistical tests for the selection of the optimum parameters set in models describing response surfaces in reversed-phase liquid chromatography

A. Pappa-Louisi P. Nikitas 《Chromatographia》2003,57(3-4):169-176

An appropriate procedure based on statistical criteria is suggested for the determination of the optimum set of model parameters for a given chromatographic system. The criteria employed are the t-ratio test, the rate of change in the sum of squares of residuals, the standard error of the fit, the F-test, and the C_P-test. The suggested procedure has been evaluated using two different models, one based on partition and the other on adsorption mechanisms, which describe the combined effect of pH and organic modifier content on the retention of ionogenic solutes in reversed-phase liquid chromatography. It is shown that all the criteria give almost converged results and therefore we may simply use the F-test, which seems to be the most sensitive and reliable criterion excluding any personal judgement. It is also found that the retention models tested show a different behavior towards their simplification. In particular, the use of a reduced equation of the partition model, selected on the basis of the suggested procedure, is necessary for the prediction of meaningful retention surfaces, whereas the decrease in the number of the adjustable parameters in the adsorption model offers only noise reduction and fitting simplicity, because no version of this model predicts abnormal retention surfaces.
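The abstract does not state which form of the F-test was used. As a hedged illustration, the standard partial F-test for comparing a reduced least-squares model with a fuller nested one can be sketched as below; the function name `partial_f_statistic` and its argument names are hypothetical, and the statistic still has to be compared against an F distribution with (p_full − p_reduced, n − p_full) degrees of freedom, which is not done here.

```python
def partial_f_statistic(ssr_reduced, ssr_full, p_reduced, p_full, n):
    """Partial F statistic for nested least-squares models.

    ssr_reduced, ssr_full: residual sums of squares of the two fits
    p_reduced, p_full:     number of adjustable parameters in each model
    n:                     number of data points
    """
    if not (p_reduced < p_full < n):
        raise ValueError("require p_reduced < p_full < n")
    extra = (ssr_reduced - ssr_full) / (p_full - p_reduced)  # gain per added parameter
    noise = ssr_full / (n - p_full)                          # residual variance of full model
    return extra / noise
```

A large value suggests that the extra parameters of the fuller model reduce the residuals by more than noise alone would explain; a small value favors the reduced model.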

11.

Two new 2-D coordination polymers based on a purine-containing carboxylate

《Journal of Coordination Chemistry》2012,65(21):3721-3730

A purine-containing multifunctional ligand, 2-(6-oxo-6H-purin-1(9H)-yl)acetic acid (HL), and two new 2-D coordination polymers, [Co(L)₂(H₂O)₂]ₙ·2nH₂O (1) and [Ni(L)₂(H₂O)₂]ₙ·2nH₂O (2), were synthesized and characterized. Polymers 1 and 2 have isomorphous structures with (4,4)-connected topologies composed of left- and right-handed metal–organic helices sharing common metal centers. Two helical conformations in the same net are stabilized by strong π–π stacking interactions between purine groups. Through direct and water-mediated interlayer hydrogen-bond interactions those layers are assembled into stable 3-D supermolecules where slight differences in the strength of hydrogen bonds and coordination bonds result in their decomposition behaviors.

12.

The use of artificial neural networks for the selection of the most appropriate thermal parameters and for the classification of a set of phenylcarbamic acid derivates

Umbreit MH Nowicki P Klos J Cizmarik J 《Combinatorial chemistry & high throughput screening》2006,9(6):455-464

The objective of this work was to apply artificial neural networks (ANNs) to the classification of a group of 43 derivatives of phenylcarbamic acid. To find the appropriate clusters, Kohonen topological maps were employed. As input data, thermal parameters obtained during DSC and TG analysis were used. Input feature selection (IFS) algorithms were used in order to give an estimate of the relative importance of various input variables. Additionally, sensitivity analysis was carried out to eliminate less important thermal variables. As a result, one classification model was obtained, which can assign our compounds to an appropriate class. Because the classes contain groups of structurally related molecules, it is possible to predict the structure of the compounds (for example, the position of the alkoxy substituent in the phenyl ring) on the basis of the obtained parameters.

13.

Rivality index neighbourhood algorithm with density and distances weighted schemes for the building of robust QSAR classification models with high reliable applicability domain

I. Luque Ruiz M.Á. Gómez-Nieto 《SAR and QSAR in environmental research》2013,24(8):587-615

The rivality index (RI) is a normalized distance measurement between a molecule and its first nearest neighbours, providing a robust prediction of the activity of a molecule based on the known activity of its nearest neighbours. Negative values of the RI describe molecules that would be correctly classified by a statistical algorithm and, vice versa, positive values of this index describe those molecules detected as outliers by the classification algorithms. In this paper, we have described a classification algorithm based on the RI and we have proposed four weighted schemes (kernels) for its calculation, based on measuring different characteristics of the neighbourhood of each molecule of the dataset at established values of the threshold of neighbours. The results obtained have demonstrated that the proposed classification algorithm, based on the RI, generates more reliable and robust classification models than many of the most widely used and well-known machine learning algorithms. These results have been validated and corroborated by using 20 balanced and unbalanced benchmark datasets of different sizes and modelability. The classification models generated provide valuable information about the molecules of the dataset, the applicability domain of the models and the reliability of the predictions.
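The abstract gives the sign convention of the RI but not its formula. The sketch below shows one plausible form that matches that convention, stated here as an assumption rather than as the authors' definition: the distance to the nearest neighbour of the same class minus the distance to the nearest neighbour of a different class, divided by their sum. The function name `rivality_index`, the Euclidean metric, and the toy coordinates in the usage note are hypothetical, and the four weighted kernels mentioned in the abstract are not reproduced.

```python
import math

def rivality_index(i, points, labels):
    """Assumed form: (d_same - d_other) / (d_same + d_other), in [-1, 1].

    d_same:  distance from points[i] to its nearest same-class neighbour
    d_other: distance from points[i] to its nearest different-class neighbour
    Requires at least one other point in each group and d_same + d_other > 0.
    """
    d_same = min(
        math.dist(points[i], p)
        for j, p in enumerate(points)
        if j != i and labels[j] == labels[i]
    )
    d_other = min(
        math.dist(points[i], p)
        for j, p in enumerate(points)
        if labels[j] != labels[i]
    )
    return (d_same - d_other) / (d_same + d_other)
```

With the toy data `points = [(0, 0), (1, 0), (5, 0), (6, 0), (0.5, 0)]` and `labels = ["a", "a", "b", "b", "b"]`, the point at index 2 sits among its own class and gets a negative value, while the class "b" point at index 4, which lies between the two class "a" points, gets a positive value, in line with the abstract's description of outliers.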

14.

The classification of solvents based on solvatochromic characteristics: the choice of optimal parameters for artificial neural networks

Yaroslava Pushkarova Yuriy Kholin 《Central European Journal of Chemistry》2012,10(4):1318-1327

The Taft-Kamlet-Abboud hydrogen-bond acidity, hydrogen-bond basicity and polarity-polarizability are widely used as empirical characteristics of solvent-solute interactions. These solvatochromic parameters are determined from the absorption band positions of solvatochromic probes in the standard medium and in the medium under study. The practice of solvatochromic probing is growing rapidly, and the values of solvatochromic parameters are refined from time to time. As these values are rather close for many media, the classification of media based on these values can be tedious. This increases the choice of algorithms that can be employed in order to decrease the ambiguity of classification. The classification algorithms stable to small variations of solvatochromic parameters are of special interest. Artificial neural networks (ANN) have proved to be a powerful tool for supervised classification. The paper focuses on the search for optimal parameters of probabilistic, dynamic, Elman, feed-forward, and cascade ANN for the classification of solvents on the basis of their solvatochromic characteristics. Also, the influence of data variation on the stability of classification is examined. The dynamic and probabilistic neural networks have been found to be error-free and stable; they have significantly become such a common tool for supervised classification as linear discriminant analysis.

15.

Verification of calorimetric models based on physical parameters by frequential characteristics

E. Cesari J. Ortín J. Viñals J. Hatt W. Zielenkiewicz V. Torra 《Thermochimica Acta》1983,71(3):351-357

Quantitative criteria to ascertain the quality of calorimetric models based on physical parameters are presented. These include not only a comparison between model and experimental pulse responses, especially for the larger time constants, but also an analysis of their spectra up to the frequential limit brought about by the experimental noise. A calorimetric model based on the physical parameters of a Unipan 600 calorimeter is used to reconstruct a given power dissipation. The results are then compared to those given by other methods, i.e. dynamic optimization, inverse filtering and harmonic analysis.

16.

Evaluating the applicability domain in the case of classification predictive models for carcinogenicity based on the counter propagation artificial neural network

Fjodorova N Novič M Roncaglioni A Benfenati E 《Journal of computer-aided molecular design》2011,25(12):1147-1158


17.

A new sensor for ammonia based on cyanidin-sensitized titanium dioxide film operating at room temperature

Huang Xiao-wei Zou Xiao-bo Shi Ji-yong Zhao Jie-wen Li Yanxiao Hao Limin Zhang Jianchun 《Analytica chimica acta》2013

The design and fabrication of an ammonia sensor operating at room temperature, based on pigment-sensitized TiO₂ films, are described. TiO₂ was prepared by the sol–gel method and deposited on glass slides containing gold electrodes. Then, the film was immersed in a 2.5 × 10⁻⁴ M ethanol solution of cyanidin to absorb the pigment. The hybrid organic–inorganic film formed here can detect ammonia reversibly at room temperature. The relative change in resistance of the films at a potential difference of 1.5 V is determined when the films are exposed to atmospheres containing ammonia vapors with concentrations over the range 10–50 ppm. The relative change in resistance, S, of the films increased almost linearly with increasing concentrations of ammonia (r = 0.92). The response time to increasing concentrations of ammonia is about 180–220 s, and the corresponding values for decreasing concentrations are 240–270 s. At low humidity, ammonia could be ionized by the cyanidin on the TiO₂ film and thereby decrease the proton concentration at the surface. Consequently, more positively charged holes at the surface of the TiO₂ have to be extracted to neutralize the adsorbed cyanidin and water film. The resistance response of the sensors to ammonia was nearly independent of temperature from 10 to 50 °C. These results are not actually as good as those reported in the literature, but this preliminary work proposes simpler and cheaper processes to realize an NH₃ sensor for room temperature applications.

18.

Regression models based on new local strategies for near infrared spectroscopic data

F. Allegrini J.A. Fernández Pierna W.D. Fragoso A.C. Olivieri V. Baeten P. Dardenne 《Analytica chimica acta》2016

In this work, a comparative study of two novel algorithms to perform sample selection in local regression based on Partial Least Squares Regression (PLS) is presented. These methodologies were applied for Near Infrared Spectroscopy (NIRS) quantification of five major constituents in corn seeds and are compared and contrasted with global PLS calibrations. Validation results show a significant improvement in the prediction quality when local models implemented by the proposed algorithms are applied to large data bases.

19.

Influence of operating parameters on reproducibility in capillary electrophoresis.

S C Smith J K Strasters M G Khaledi 《Journal of chromatography. A》1991,559(1-2):57-68

The reproducibility of two migration parameters (retention time and mobility) of a seven-component test mixture was examined under various operating conditions using laboratory-built capillary electrophoresis systems. It was found that the frequency of rinsing the capillary and the solutions used for rinsing had the greatest effect on migration reproducibility. In addition, it was found that the migration behavior of solutes that interact with micelles is not repeatable unless the proper rinse protocol is applied. Inconsistent migration behavior is linked to inconsistent total current of the system. Preliminary investigations indicate that the fluctuations in total current were associated with non-equilibrium conditions between the buffer and the capillary wall.

20.

Effects of design and operating parameters on entropy generation of a dual cycle

Ebrahimi Rahim Dehkordi Nader Sakenian 《Journal of Thermal Analysis and Calorimetry》2018,133(3):1609-1616

In this paper, a performance analysis based on the entropy generation and thermal efficiency has been carried out for the irreversible dual cycle. The...