Cross-Classified Multilevel Modelling of the Effectiveness of Similarity-Based Virtual Screening

Peter Willett, Laura Sbaffi, Andrew Bell, Lucyantie Mazalan

The screening effectiveness of a chemical similarity search depends on a range of factors, including the bioactivity of interest, the types of similarity coefficient and fingerprint that comprise the similarity measure, and the nature of the reference structure that is being searched against a database. This study introduces the use of cross-classified multilevel modelling as a way to investigate the relative importance of these four factors when carrying out similarity searches on the ChEMBL database. Two principal conclusions can be drawn from the analyses: that the fingerprint plays a more important role than the similarity coefficient in determining the effectiveness of a similarity search, and that comparative studies of similarity measures should involve many more reference structures than has been the case in much previous work.

We describe the use of cross-classified multilevel modelling to analyse the results of similarity-based virtual screening searches using 2D fingerprints. We show that the choice of fingerprint is more important than the choice of similarity coefficient, and that multiple reference structures need to be used in benchmark studies such as this.
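To make the terms above concrete, here is a minimal sketch of a similarity search: a reference structure's fingerprint is compared against every database fingerprint with a similarity coefficient, and the database is ranked by score. It assumes the Tanimoto coefficient and fingerprints stored as sets of set-bit indices; the abstract does not say which coefficients or fingerprints the study used, and the function names and toy molecules are hypothetical, not taken from the paper or from ChEMBL.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a 2D fingerprint is represented by the indices of its set bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: bits shared by both / bits set in either."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_search(
    reference: Fingerprint,
    database: Dict[str, Fingerprint],
    top_n: int,
) -> List[Tuple[str, float]]:
    """Rank database molecules by similarity to the reference structure."""
    scored = [(name, tanimoto(reference, fp)) for name, fp in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical toy data, for illustration only.
reference = frozenset({1, 2, 3, 4})
database = {
    "mol_a": frozenset({1, 2, 3, 4}),
    "mol_b": frozenset({1, 2, 5}),
    "mol_c": frozenset({7, 8}),
}
hits = similarity_search(reference, database, top_n=2)
```

In a benchmark of the kind described, the effectiveness of each search (for example, how many known actives appear among the top-ranked hits) would be recorded for every combination of bioactivity, coefficient, fingerprint and reference structure, and those results would form the input to the multilevel model.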

Publisher URL: http://onlinelibrary.wiley.com/resolve/doi

DOI: 10.1002/cmdc.201700487

