Similar articles
Found 20 similar articles (search time: 62 ms)
1.
2.
3.
The advent of focused libraries and virtual screening has reduced the disadvantages of combinatorial chemistry and turned it into a practical, cost-effective tool in drug discovery. Usually, genetic algorithms (GAs) are used to quickly find high-scoring molecules by sampling a small subset of the total combinatorial space. Scoring functions therefore play an essential role in focused library design. Reported here is our initial attempt to establish a new approach for generating a target-focused library by combining scores for structural diversity and binding affinity with our newly improved drug-likeness scoring functions. A software package, named LD1.0, was developed on the basis of the new approach. One test on a cyclooxygenase-2 (COX2)-focused library successfully reproduced structures that have been experimentally studied as COX2-selective inhibitors. Another test, on a peroxisome proliferator-activated receptor gamma-focused library design, not only reproduced the key fragments in the approved thiazolidinedione (TZD) drugs but also generated some new structures that are more active than the approved drugs or published ligands. Both tests took approximately 15% of the running time of the ordinary molecular docking method. Thus, our new approach is an effective, reliable, and practical way to build a properly sized focused library with a high hit rate, novel structures, and a good ADME/T profile.

4.
5.
Compound subsets, which may be screened where it is not feasible or desirable to screen all available compounds, may be designed using rational or random selection. Literature on the relative performance of random versus rational selection reports conflicting observations, possibly because some random subsets might be more representative than others and perform better than subsets designed by rational means, or vice versa. To address this possibility, we simulated a large number of rationally designed subsets for evaluation against an equally large number of randomly generated subsets. We found that our rationally designed subsets give higher mean hit rates than the random ones. We also compared subsets comprising random plates with subsets of random compounds and found that, while the mean hit rate of both is the same, the former shows more variation in the hit rate. The choice of compound file, rational subset method, and ratio of the subset size to the compound file size are key factors in the relative performance of random and rational selection, and statistical simulation is a viable way to identify the selection approach appropriate for a subset.
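The comparison of random plates with random compounds can be illustrated by a small Monte Carlo sketch. It is not the authors' simulation: the plate layout, the hit model (actives concentrated on every fourth plate, each compound there active with probability 0.2), and the function names `make_compound_file` and `simulate` are all assumptions made for illustration. Under this assumed clustering, both sampling schemes should give about the same mean hit rate, while the plate-based subsets vary more.

```python
import random
import statistics


def make_compound_file(n_plates, plate_size, rng, enriched_rate=0.2):
    """Build a toy compound file as a list of plates of booleans (True = active).

    Assumption: actives cluster on plates. Every fourth plate is "enriched"
    and the remaining plates contain no actives at all.
    """
    plates = []
    for i in range(n_plates):
        p = enriched_rate if i % 4 == 0 else 0.0
        plates.append([rng.random() < p for _ in range(plate_size)])
    return plates


def hit_rate(flags):
    """Fraction of active compounds in a subset."""
    return sum(flags) / len(flags)


def simulate(plates, n_subset_plates, n_trials, rng):
    """Return hit rates of random-plate subsets and equally sized random-compound subsets."""
    plate_size = len(plates[0])
    all_compounds = [c for plate in plates for c in plate]
    subset_size = n_subset_plates * plate_size
    by_plate, by_compound = [], []
    for _ in range(n_trials):
        chosen = rng.sample(plates, n_subset_plates)
        by_plate.append(hit_rate([c for plate in chosen for c in plate]))
        by_compound.append(hit_rate(rng.sample(all_compounds, subset_size)))
    return by_plate, by_compound


if __name__ == "__main__":
    rng = random.Random(0)
    plates = make_compound_file(n_plates=100, plate_size=80, rng=rng)
    by_plate, by_compound = simulate(plates, n_subset_plates=10, n_trials=500, rng=rng)
    print("random plates:    mean", statistics.mean(by_plate),
          "sd", statistics.pstdev(by_plate))
    print("random compounds: mean", statistics.mean(by_compound),
          "sd", statistics.pstdev(by_compound))
```

The spread difference here comes entirely from the assumed plate-level clustering of actives; with actives scattered uniformly across plates, the two schemes would behave alike.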

6.
In this paper, we propose an algorithm for the design of lead generation libraries required in combinatorial drug discovery. This algorithm simultaneously addresses the two key criteria of diversity and representativeness of compounds in the resulting library and is computationally efficient when applied to a large class of lead generation design problems. Additional constraints on experimental resources are also incorporated in the framework presented in this paper. A computationally efficient, scalable algorithm is developed, in which the ability of the deterministic annealing algorithm to identify clusters is exploited to truncate computations over the entire data set to computations over individual clusters. An analysis of this algorithm quantifies the tradeoff between the error due to truncation and computational effort. Results on test data sets corroborate the analysis and show improvements by factors of 10 or more, depending on the data set.

7.
Serial analysis of gene expression (SAGE) is a powerful tool for obtaining gene expression profiles. Clustering analysis is a valuable technique for analyzing SAGE data. In this paper, we propose an adaptive clustering method for SAGE data analysis, namely, PoissonAPS. The method incorporates a novel clustering algorithm, Affinity Propagation (AP). While the AP algorithm has demonstrated good performance on many different data sets, it also faces several limitations. PoissonAPS overcomes the limitations of AP by using the clustering validation measure as a cost function for merging and splitting, and as a result, it can automatically cluster SAGE data without user-specified parameters. We evaluated PoissonAPS and compared its performance with other methods on several real-life SAGE data sets. The experimental results show that PoissonAPS can produce meaningful and interpretable clusters for SAGE data.

8.
"Tailoring" combinatorial libraries was developed several years ago as a very general and intuitive method to design diverse compound collections while controlling the profile of other pharmaceutically relevant properties. The candidate substituents were assigned to "categorical bins" according to their properties, and successive steps of D-optimal design were performed to generate diverse substituent sets consistent with required membership quotas from each bin. This serial algorithm was expedient to implement from existing D-optimal design codes, but was order-dependent and did not generally locate the very best possible design. A new "parallel" Fedorov search algorithm has now been implemented that can find the most diverse property-tailored design. An ambiguous mass penalty has been added, whereby most duplicate masses can be eliminated with little loss of library diversity. Sensitivity analysis has also been added to quantitatively explore the diversity trade-offs due to increasing or decreasing each specific kind of bias.

9.
We address the problem of designing a general-purpose combinatorial library to screen for pharmaceutical leads. Conventional approaches focus on diversity as the primary factor in designing such libraries. We suggest making screening libraries out of a set of pharmaceutically relevant scaffolds, with multiple analogs per scaffold. The rationale for this rests on the fact that even though the hit rate in active series is much higher than in the database as a whole, often a large fraction of the compounds in active series are inactive. This is especially true when the series has not been optimized for the target under study. We introduce the concept of hit rate within a series and use historical screening data to arrive at a crude estimate for it. We then use simple probability arguments to show that 50-100 compounds are required in each series in order to be nearly certain of finding at least one active compound in each true active series for any given target.
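The kind of probability argument described above can be sketched as follows. If each compound in a truly active series is independently active with probability p, the chance that n compounds contain at least one active is 1 - (1 - p)^n. The abstract does not state the estimated within-series hit rate, so the value p = 0.05 used below is purely a hypothetical illustration, as are the function names and the independence assumption.

```python
import math


def prob_at_least_one_active(n: int, p: float) -> float:
    """P(at least one active among n compounds), assuming independent hits with rate p."""
    return 1.0 - (1.0 - p) ** n


def compounds_needed(p: float, confidence: float) -> int:
    """Smallest n such that 1 - (1 - p)**n >= confidence."""
    if not (0.0 < p < 1.0) or not (0.0 < confidence < 1.0):
        raise ValueError("p and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    p = 0.05  # hypothetical within-series hit rate, not taken from the abstract
    for confidence in (0.95, 0.99):
        n = compounds_needed(p, confidence)
        print(f"p={p}, confidence={confidence}: n={n}, "
              f"check={prob_at_least_one_active(n, p):.4f}")
```

With this assumed hit rate, 59 compounds give at least a 95% chance and 90 compounds at least a 99% chance of seeing one active, which falls in the 50-100 range quoted above; a different hit rate would shift these numbers.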

10.
11.
Early results from screening combinatorial libraries have been disappointing, with libraries either failing to deliver the improved hit rates that were expected or resulting in hits with characteristics that make them undesirable as lead compounds. Consequently, the focus in library design has shifted toward designing libraries that are optimized on multiple properties simultaneously, for example, diversity and "druglike" physicochemical properties. Here we describe the program MoSELECT, which is based on a multiobjective genetic algorithm and is able to suggest a family of solutions to multiobjective library design, where all the solutions are equally valid and each represents a different compromise between the objectives. MoSELECT also allows the relationships between the different objectives to be explored, with competing objectives easily identified. The library designer can then make an informed choice on which solution(s) to explore. Various performance characteristics of MoSELECT are reported based on a number of different combinatorial libraries.
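The idea of a family of equally valid compromise solutions can be made concrete with a Pareto (non-dominated) filter. The sketch below is not MoSELECT and contains no genetic algorithm; it only shows how, given candidate libraries already scored on several objectives, the non-dominated set is identified. The objective tuples, the convention that larger is better for every objective, and the function names are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one.

    Assumes every objective is to be maximized.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(solutions: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the solutions that no other solution dominates, in input order."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]


if __name__ == "__main__":
    # Hypothetical (diversity, druglikeness) scores for five candidate libraries.
    candidates = [(0.9, 0.2), (0.7, 0.6), (0.4, 0.8), (0.5, 0.5), (0.3, 0.3)]
    for solution in pareto_front(candidates):
        print(solution)
```

In this toy data the first three candidates form the front: each trades diversity against druglikeness differently, and none is better than another on both objectives, which is the sense in which such solutions are "equally valid".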

12.
The scaffold diversity of 7 representative commercial and proprietary compound libraries is explored for the first time using both Murcko frameworks and Scaffold Trees. We show that Level 1 of the Scaffold Tree is useful for the characterization of scaffold diversity in compound libraries and offers advantages over the use of Murcko frameworks. This analysis also demonstrates that the majority of compounds in the libraries we analyzed contain only a small number of well-represented scaffolds and that a high percentage of singleton scaffolds represent the remaining compounds. We use Tree Maps to clearly visualize the scaffold space of representative compound libraries, for example, to display highly populated scaffolds and clusters of structurally similar scaffolds. This study further highlights the need for diversification of compound libraries used in hit discovery by focusing library enrichment on the synthesis of compounds with novel or underrepresented scaffolds.

13.
14.
15.
16.
17.
18.
Optimizable k-dissimilarity (OptiSim) selection entails drawing a series of subsamples of size k from a population and choosing the "best" candidate from each such subsample for inclusion in the selection set. By varying the size of the subsample, one can control the balance between representativeness and diversity in the selection set obtained. In the original formulation, a uniform random sampling from among valid candidates was used to draw the subsamples from a single target population. Here we describe in detail two key modifications that serve to extend the OptiSim methodology to vector selection for interdependent variables, specifically as applied to the design of combinatorial sublibraries. The first modification involves pivoting between variables: subsamples are drawn from each reagent pool in turn, with the viability of each candidate being evaluated in isolation as well as in terms of the products it will produce from complementary reagents already selected. The filters applied may be static or dynamic in nature, with molecular weight and hydrophobicity being examples of the former and structural diversity with respect to reagents already selected being an example of the latter. The second key modification is adding the ability to bias the selection of candidate reagents for inclusion in the subsamples. Taken together, these modifications support the efficient generation of multiblock and other sparse matrix designs that are both representative and diverse, and for which "backfilling" of designs edited to remove undesirable reagents or products is straightforward. The method is intrinsically fast and efficient, since enumeration of the full combinatorial is not required; only those candidates actually considered for inclusion need be evaluated. Moreover, because the subsample selection step is separate from the diversity-based selection of the "best" candidate, incorporating such bias in favor of a competing criterion such as low price provides a "natural," nonparametric mechanism for generating designs that are likely to be "good" in a double-objective, Pareto sense.
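The basic subsample-then-pick-best loop described at the start of this abstract can be sketched as below. This is a simplified illustration of the original single-population formulation only: it omits the reagent-pool pivoting, the static and dynamic filters, and the biased subsampling that the abstract introduces. Treating "best" as the candidate farthest (in Euclidean distance) from its nearest already-selected point, using coordinate tuples as descriptors, and the name `optisim_select` are all assumptions.

```python
import math
import random
from typing import List, Sequence


def optisim_select(points: Sequence[Sequence[float]], n_select: int, k: int,
                   seed: int = 0) -> List[int]:
    """Pick n_select indices by repeatedly drawing k random candidates and keeping
    the one most dissimilar to the current selection (simplified OptiSim-style loop).
    """
    if not points or n_select < 1 or k < 1:
        raise ValueError("need at least one point, n_select >= 1 and k >= 1")
    rng = random.Random(seed)
    remaining = list(range(len(points)))

    # Seed the selection with one uniformly random candidate.
    first = rng.choice(remaining)
    selected = [first]
    remaining.remove(first)

    while len(selected) < n_select and remaining:
        # Uniform random subsample of (at most) k not-yet-selected candidates.
        subsample = rng.sample(remaining, min(k, len(remaining)))
        # "Best" = largest distance to its nearest already-selected point.
        best = max(subsample,
                   key=lambda i: min(math.dist(points[i], points[j]) for j in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (5.0, 5.0)]
    print(optisim_select(pts, n_select=3, k=2, seed=1))
```

With k = 1 the loop reduces to plain random selection, which favors representativeness; with k equal to the number of remaining candidates it becomes a greedy maximum-dissimilarity selection, which favors diversity. Intermediate k values sit between the two, which is the balance the abstract refers to.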

19.
20.
Recent trends in the computer-aided design of diverse and focussed combinatorial libraries are surveyed. First, chemical data input, storage and retrieval including chemical database management and virtual chemical structure enumeration are outlined as background. Then, the optimization of ADMET parameters, diversity maximization, molecular similarity search, QSAR-based virtual screening, pharmacophore search and molecular docking are discussed.
