This article discusses problems of validating classification models, especially in datasets where sample sizes are small and the number of variables is large. It describes the use of percentage correctly classified (%CC) as an indicator of the success of a classification model. For small datasets, %CC should not be used uncritically, and its interpretation depends on sample size. The article illustrates the use of a common classification method, discriminant partial least squares (D-PLS), on a randomly generated dataset of 200 samples and 200 variables.
An aim of the classifier is to determine whether the null hypothesis (there is no distinction between two classes) can be rejected. Autoprediction gives an 84.5% CC. It is shown that, if there is variable selection, it must be performed independently on the training set to obtain a CC close to 50% on the test set; otherwise, over-optimistic and false conclusions can be reached about the ability to classify samples into groups.
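The effect described above can be reproduced with a minimal simulation. The sketch below is illustrative only and is not the article's procedure: it assumes a nearest-centroid classifier as a simple stand-in for D-PLS, a hypothetical choice of 20 selected variables, a ranking of variables by absolute class-mean difference, and an average over 30 random datasets. All function names are made up for this example.

```python
import numpy as np


def select_variables(X, y, k):
    """Indices of the k variables with the largest absolute class-mean difference."""
    diff = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    return np.argsort(np.abs(diff))[-k:]


def centroid_predict(X_train, y_train, X_new):
    """Nearest-centroid classifier (assumed stand-in for D-PLS)."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = ((X_new - c0) ** 2).sum(axis=1)
    d1 = ((X_new - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)


def percent_cc(y_true, y_pred):
    """Percentage correctly classified."""
    return 100.0 * np.mean(y_true == y_pred)


def run_once(rng, n=200, p=200, k=20):
    # Pure noise: the null hypothesis (no class distinction) is true by construction.
    X = rng.standard_normal((n, p))
    y = np.repeat([0, 1], n // 2)

    # Stratified 50/50 split into training and test sets.
    idx0 = rng.permutation(n // 2)
    idx1 = rng.permutation(n // 2) + n // 2
    half = n // 4
    train = np.concatenate([idx0[:half], idx1[:half]])
    test = np.concatenate([idx0[half:], idx1[half:]])

    # Autoprediction: model built and assessed on the same samples.
    auto = percent_cc(y, centroid_predict(X, y, X))

    # Variables selected on ALL samples, so the test set leaks into the model.
    leaky_vars = select_variables(X, y, k)
    leaky_pred = centroid_predict(X[train][:, leaky_vars], y[train],
                                  X[test][:, leaky_vars])
    leaky = percent_cc(y[test], leaky_pred)

    # Variables selected on the training set only.
    clean_vars = select_variables(X[train], y[train], k)
    clean_pred = centroid_predict(X[train][:, clean_vars], y[train],
                                  X[test][:, clean_vars])
    clean = percent_cc(y[test], clean_pred)
    return auto, leaky, clean


def simulate(n_repeats=30, seed=0):
    rng = np.random.default_rng(seed)
    results = np.array([run_once(rng) for _ in range(n_repeats)])
    return results.mean(axis=0)


auto_cc, leaky_cc, clean_cc = simulate()
print(f"autoprediction %CC:            {auto_cc:.1f}")
print(f"test %CC, selection on all:    {leaky_cc:.1f}")
print(f"test %CC, selection on train:  {clean_cc:.1f}")
```

Under these assumptions, only the last figure should sit near 50%, because the test samples played no part in choosing the variables.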
Finally, two aims of determining the quality of a model are frequently confused, namely optimisation (often used to determine the most appropriate number of components in a model) and independent validation; to overcome this, the data should be split into three groups.
There are often difficulties with model building if validation and optimisation have been done on different groups of samples, especially using iterative methods, with each group being modelled using different properties, such as a different number of components or different variables.
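One way to keep optimisation and independent validation apart is sketched below. It is a minimal illustration under stated assumptions, not the article's protocol: the 50/25/25 split fractions, the nearest-centroid classifier (standing in for D-PLS), the candidate numbers of selected variables (standing in for the number of components), and all function names are hypothetical.

```python
import numpy as np


def three_way_split(n_samples, rng, fractions=(0.5, 0.25, 0.25)):
    """Shuffle sample indices and cut them into training, optimisation and test groups."""
    order = rng.permutation(n_samples)
    n_train = int(round(fractions[0] * n_samples))
    n_opt = int(round(fractions[1] * n_samples))
    return order[:n_train], order[n_train:n_train + n_opt], order[n_train + n_opt:]


def top_k_variables(X, y, k):
    """Indices of the k variables with the largest absolute class-mean difference."""
    diff = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    return np.argsort(np.abs(diff))[-k:]


def percent_cc(X_train, y_train, X_new, y_new):
    """%CC of a nearest-centroid model built on the training data."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = ((X_new - c0) ** 2).sum(axis=1)
    d1 = ((X_new - c1) ** 2).sum(axis=1)
    pred = (d1 < d0).astype(int)
    return 100.0 * np.mean(pred == y_new)


rng = np.random.default_rng(1)
X = rng.standard_normal((200, 200))   # random data, no real class structure
y = np.arange(200) % 2                # two balanced classes
train, opt, test = three_way_split(len(y), rng)

# Optimisation: choose the tuning parameter using the optimisation group only.
candidates = [5, 10, 20, 50]
opt_scores = {}
for k in candidates:
    cols = top_k_variables(X[train], y[train], k)
    opt_scores[k] = percent_cc(X[train][:, cols], y[train],
                               X[opt][:, cols], y[opt])
best_k = max(opt_scores, key=opt_scores.get)

# Validation: the test group is touched once, after every choice has been fixed.
cols = top_k_variables(X[train], y[train], best_k)
test_cc = percent_cc(X[train][:, cols], y[train], X[test][:, cols], y[test])
print(f"chosen k = {best_k}, optimisation %CC = {opt_scores[best_k]:.1f}, "
      f"test %CC = {test_cc:.1f}")
```

The optimisation-group score of the chosen setting is the maximum over several candidates and is therefore optimistically biased; in this sketch only the test-group %CC serves as the independent estimate.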
The densities of H2O, D2O, and MeOH solutions in acetonitrile, at solute mole fractions up to 0.07 and at 278.15, 288.15, 298.15, 308.15, and 318.15 K, were measured using vibrating-tube densimetry with an error of 8·10⁻⁶ g cm⁻³. The limiting partial molar volumes of the H/D isotopomers of water and of MeOH in acetonitrile (V̄2) and the isotope effects in V̄2 and in the excess molar volumes of acetonitrile–water mixtures were calculated. Molecules of H2O, D2O, and MeOH form associates with acetonitrile molecules via hydrogen bonds. The associates have packing volumes close to those in the individual solutes. The water and methanol molecules were assumed to be incorporated into the acetonitrile structure without substantial changes in the latter. However, this process results in some compression of the system with a simultaneous increase in its expansibility.
Using a precise scanning microcalorimetry technique, the heat capacity differences between water and dilute aqueous solutions of ethanol, n-propanol, n-butanol and n-pentanol were measured from 5 to 125 °C, and the partial molar heat capacities of these substances in water were determined. It was found that the heat capacity increment for an alcohol dissolved in water is proportional to the number of –CH2– groups and decreases with increasing temperature. The heat capacity increment of hydration of non-polar groups is shown to be positive and large at room temperature and to decrease in magnitude as the temperature increases. In contrast, the heat capacity increment of hydration of polar groups is negative at room temperature and increases as the temperature increases. From the temperature dependence of the heat capacity increment, one can assume that the water molecules solvated by the non-polar groups of the alcohols behave in a non-cooperative manner.
n-electron valence state perturbation theory (NEVPT) is a form of multireference perturbation theory in which all the zero-order wave functions are of multireference nature, being generated as eigenfunctions of a two-electron model Hamiltonian. The absence of intruder states makes NEVPT an interesting choice for the calculation of electronically excited states. Test calculations have been performed on several valence and Rydberg transitions for the formaldehyde and acetone molecules; the results are in good accordance with the best calculations and with the existing experimental data. Contribution to the Jacopo Tomasi Honorary Issue.
Starting from the natural neo-clerodane diterpenoid teubotrin (1), several neo-clerodane derivatives (3–7, 9–11) have been obtained. The naturally occurring diterpenoid teuscordinon (12) has also been synthesized from teubotrin (1), showing thereby how some of these transformations can be useful for the synthesis of other natural neo-clerodane diterpenes. The latter are of interest due to their activity as insect antifeedants and other important biological properties.
Assuming the separation of the intermolecular scattering function into radial and angular parts and using Egelstaff et al.'s orientational model for tetrachlorides, the structure of liquid vanadium tetrachloride has been studied. It has been observed that such a separation is approximate for this liquid and that the introduction of a third correction term is required to account for the molecular structure function. The chlorine-chlorine partial structure and the effective angle-averaged intermolecular chlorine-chlorine potential in the liquid have been evaluated. Without the third correction term, introduced to generate the molecular structure function theoretically, the centre structure function has been obtained in an approximate way from the experimentally observed molecular structure function, and from it the centre radial distribution function, the centre direct correlation function and the angle-averaged vanadium-vanadium effective potential have been evaluated.