Application of Shannon-like diversity measures to cell-based chemistry spaces |
| |
Authors: | Veerabahu Shanmugasundaram; Gerald M. Maggiora |
| |
Institution: | 1. Anti Bacterials Chemistry/Discovery Technologies, World Wide Medicinal Chemistry, Pfizer Pharma Therapeutics Research & Development, Groton, USA; 2. Department of Pharmacology & Toxicology, University of Arizona, College of Pharmacy, Tucson, USA; 3. Structural & Computational Chemistry, Pharmacia Corporation, Kalamazoo, USA |
| |
Abstract: | The use of multi-dimensional “chemistry spaces” to represent large compound collections has become widespread in pharmaceutical
research. In such spaces compounds are treated as points. Points in close proximity represent similar compounds, while distant
points represent dissimilar compounds. Assessing the diversity of a compound collection, thus, is tantamount to characterizing
the distribution of points in chemistry space. To facilitate procedures such as selecting subsets of compounds for screening,
acquiring compounds, and designing combinatorial libraries, chemistry spaces have been partitioned into sets of non-overlapping,
multi-dimensional cells, which are generated by dividing each axis into a number of equal-sized bins. This leads to a lattice
of $(N_{bins})^{N_{dim}}$ cells, where $N_{bins}$ is the number of bins on each axis and $N_{dim}$ is the dimensionality of the space.
One diversity measure that is typically used in cell-based chemistry spaces is identical in form to Shannon entropy, $D_{N_{cpd}}^{cpd}$.
A normalized form of this measure, $D_{rel}^{cpd}$, enables comparison between compound collections that occupy different numbers
of cells. Although $D_{rel}^{cpd}$ characterizes the uniformity and “spreadout” of the corresponding compound collection, it treats
cells as positionally independent. Some of the lost positional information can be recaptured by another diversity measure, which is
also related in form to Shannon entropy. This new measure, $D_{N_{bin}}^{cell}(\lambda)$, characterizes the distribution of occupied
cells along each axis of chemistry space. The normalized measure $\left\langle D_{rel}^{cell} \right\rangle$ is then given by the
average over all axes. Examples illustrating the applicability of these two Shannon-like measures to
compound collections are presented. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|
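The abstract names the two Shannon-like measures but does not give their formulas, so the following Python sketch is only one plausible reading, not the authors' definitions. It assumes that the compound-based measure is the Shannon entropy of the fraction of compounds in each occupied cell, normalized by the logarithm of the number of occupied cells, and that the cell-based measure for an axis is the Shannon entropy of the fraction of occupied cells falling in each bin of that axis, normalized by the logarithm of the number of bins and then averaged over axes. All function names (`cell_index`, `shannon`, `compound_diversity`, `axis_cell_diversity`) are hypothetical.

```python
import math
from collections import Counter


def cell_index(point, lo, hi, n_bins):
    """Map a point to its cell, using n_bins equal-sized bins on every axis."""
    idx = []
    for x, a, b in zip(point, lo, hi):
        k = int((x - a) / (b - a) * n_bins)
        idx.append(min(max(k, 0), n_bins - 1))  # clamp so x == hi lands in the last bin
    return tuple(idx)


def shannon(counts):
    """Shannon entropy (natural log) of a collection of non-negative counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def compound_diversity(cells):
    """Return (D, D_rel) for the distribution of compounds over occupied cells.

    Assumption: D_rel divides by ln(number of occupied cells).
    """
    occupancy = Counter(cells)
    d = shannon(occupancy.values())
    n_occ = len(occupancy)
    d_rel = d / math.log(n_occ) if n_occ > 1 else 0.0
    return d, d_rel


def axis_cell_diversity(cells, n_bins):
    """Average over axes of the normalized entropy of occupied cells per bin.

    Assumption: each axis entropy is divided by ln(n_bins); requires n_bins > 1.
    """
    occupied = set(cells)
    n_dim = len(next(iter(occupied)))
    rel = []
    for axis in range(n_dim):
        per_bin = Counter(cell[axis] for cell in occupied)  # occupied cells in each bin
        rel.append(shannon(per_bin.values()) / math.log(n_bins))
    return sum(rel) / n_dim
```

Under these assumptions, four compounds placed one per cell in a 2 x 2 lattice give a relative value of 1 for both measures, while a collection confined to a single cell gives 0 for both.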