One class classification as a practical approach for accelerating π–π co-crystal discovery
Authors: Aikaterini Vriza, Angelos B. Canaj, Rebecca Vismara, Laurence J. Kershaw Cook, Troy D. Manning, Michael W. Gaultois, Peter A. Wood, Vitaliy Kurlin, Neil Berry, Matthew S. Dyer, Matthew J. Rosseinsky
Affiliation: Department of Chemistry and Materials Innovation Factory, University of Liverpool, 51 Oxford Street, Liverpool L7 3NY, UK; Leverhulme Research Centre for Functional Materials Design, University of Liverpool, Oxford Street, Liverpool L7 3NY, UK; Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, UK; Materials Innovation Factory, Computer Science Department, University of Liverpool, Liverpool L69 3BX, UK
Abstract: Machine learning models have brought major changes to decision-making in materials design. One concern for data-driven approaches is the lack of negative data from unsuccessful synthetic attempts, which can produce inherently imbalanced datasets. We propose one-class classification as an effective tool for addressing this limitation in materials design problems. In this approach, a model learns from a single well-defined class, without counterexamples. We carry out an extensive study of different one-class classification algorithms to identify the most appropriate workflow for guiding the discovery of emerging materials that belong to a relatively small class, here the weakly bound polyaromatic hydrocarbon co-crystals. The two-step approach presented in this study first trains the model on all known molecular combinations that form this class of co-crystals, extracted from the Cambridge Structural Database (1722 molecular combinations), and then scores possible but as yet unknown pairs from the ZINC15 database (21 736 possible molecular combinations). By focusing on the highest-ranking pairs, which are predicted to have a higher probability of forming co-crystals, materials discovery can be accelerated: the vast molecular space is reduced and the synthetic efforts of chemists are directed. In addition, interpretability techniques are used to seek a more detailed understanding of the molecular properties that drive co-crystallization. The applicability of the methodology is demonstrated by the discovery of two novel co-crystals, namely pyrene-6H-benzo[c]chromen-6-one (1) and pyrene-9,10-dicyanoanthracene (2).
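
The two-step idea described in the abstract (fit on known positives only, then score and rank unlabeled candidates) can be illustrated with a minimal sketch. This is not the authors' workflow or model: it swaps in a simple k-nearest-neighbour distance score as the one-class scorer, and the 2-D descriptors, the pair names (`pair_A`, `pair_B`, `pair_C`), and the functions `knn_score` and `rank_candidates` are all hypothetical placeholders invented for illustration.

```python
import math
import random


def knn_score(x, train, k=3):
    """One-class score: negative mean distance from x to its k nearest
    training points. Higher means more similar to the known class."""
    dists = sorted(math.dist(x, t) for t in train)
    return -sum(dists[:k]) / k


def rank_candidates(train, candidates, k=3):
    """Score every candidate against the positive-only training set and
    return (name, score) tuples, best-scoring first."""
    scored = [(name, knn_score(vec, train, k)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


random.seed(0)

# Step 1: "known" positives only. Toy 2-D descriptors clustered near (1, 1);
# real molecular-pair descriptors would be far higher-dimensional (assumption).
known = [(random.gauss(1.0, 0.1), random.gauss(1.0, 0.1)) for _ in range(50)]

# Step 2: unlabeled candidate pairs to be scored (hypothetical names and values).
candidates = {
    "pair_A": (1.0, 1.05),   # inside the known cluster
    "pair_B": (3.0, -2.0),   # far from anything known
    "pair_C": (1.4, 0.8),    # on the fringe of the cluster
}

ranking = rank_candidates(known, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

No negative examples are needed at any point: candidates are ordered purely by how closely they resemble the known class, and the top of the list is what would be passed on for experimental screening.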

Machine learning using one class classification on a database of existing co-crystals enables the identification of co-formers which are likely to form stable co-crystals, resulting in the synthesis of two co-crystals of polyaromatic hydrocarbons.