Breaking the polar-nonpolar division in solvation free energy prediction

Guo-Wei Wei, Bao Wang, Chengzhang Wang, Kedi Wu

Implicit solvent models divide solvation free energies into additive polar and nonpolar contributions, even though polar and nonpolar interactions are inseparable and nonadditive. We present a feature functional theory (FFT) framework to break this ad hoc division. The essential ideas of FFT are as follows: (i) representability assumption: there exists a microscopic feature vector that uniquely characterizes a molecule and distinguishes it from any other; (ii) feature-function relationship assumption: the macroscopic features of a molecule, including its solvation free energy, are functionals of its microscopic feature vectors; and (iii) similarity assumption: molecules with similar microscopic features have similar macroscopic properties, such as solvation free energies. Based on these assumptions, solvation free energy prediction is carried out with the following protocol. First, we construct a molecular microscopic feature vector that efficiently characterizes the solvation process, using quantum mechanics and Poisson–Boltzmann theory. Second, microscopic feature vectors are combined with macroscopic features, that is, physical observables, to form extended feature vectors. Third, we partition a solvation dataset into queries according to molecular composition. Fourth, for each target molecule, we adopt a machine learning algorithm to search for its nearest neighbors, based on the selected microscopic feature vectors. Finally, from the extended feature vectors of the nearest neighbors obtained, we construct a functional of solvation free energy, which is employed to predict the solvation free energy of the target molecule. The proposed FFT model has been extensively validated on a large dataset of 668 molecules. The leave-one-out test gives an optimal root-mean-square error (RMSE) of 1.05 kcal/mol. FFT predictions for the SAMPL0, SAMPL1, SAMPL2, SAMPL3, and SAMPL4 challenge sets give RMSEs of 0.61, 1.86, 1.64, 0.86, and 1.14 kcal/mol, respectively.
Using a test set of 94 molecules and its associated training set, the present approach was carefully compared with a classic solvation model based on weighted solvent accessible surface area. © 2017 Wiley Periodicals, Inc.

The traditional polar and nonpolar division of solvation free energies is ad hoc and inaccurate. This work offers a feature functional theory (FFT)-based approach to describe the nonlinear and nonadditive interactions between polar and nonpolar components. FFT utilizes machine learning algorithms to convolve both polar and nonpolar features for solvation free energy predictions. A root-mean-square error of 1.05 kcal/mol is achieved in the leave-one-out solvation free energy prediction of a relatively large dataset of 668 molecules.
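The nearest-neighbor and leave-one-out steps of the protocol can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the distance metric, the form of the functional, or the feature definitions, so the sketch assumes Euclidean distance and substitutes an inverse-distance weighted average of neighbor energies for the functional. It also skips the partitioning of the dataset by molecular composition. The function names (`nearest_neighbors`, `predict_energy`, `leave_one_out_rmse`) and the toy feature vectors and energies are hypothetical and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(target, training, k):
    """Return the k training entries whose feature vectors lie closest to target.

    Each training entry is a (feature_vector, solvation_free_energy) pair.
    """
    return sorted(training, key=lambda item: euclidean(target, item[0]))[:k]


def predict_energy(target, training, k=3, eps=1e-9):
    """Predict an energy as an inverse-distance weighted mean over k neighbors.

    The weighted mean is a stand-in for the functional described in the
    abstract, chosen only to keep the example short.
    """
    neighbors = nearest_neighbors(target, training, k)
    weights = [1.0 / (euclidean(target, feats) + eps) for feats, _ in neighbors]
    weighted_sum = sum(w * energy for w, (_, energy) in zip(weights, neighbors))
    return weighted_sum / sum(weights)


def leave_one_out_rmse(dataset, k=3):
    """Hold out each molecule in turn, predict it from the rest, return the RMSE."""
    squared_errors = []
    for i, (feats, energy) in enumerate(dataset):
        training = dataset[:i] + dataset[i + 1:]
        predicted = predict_energy(feats, training, k)
        squared_errors.append((predicted - energy) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Toy data for illustration only: two made-up features per molecule and a
    # made-up energy in kcal/mol. None of these values come from the paper.
    toy_dataset = [
        ([0.10, 1.20], -2.0),
        ([0.20, 1.10], -2.4),
        ([0.90, 0.30], -6.1),
        ([1.00, 0.20], -6.5),
        ([0.50, 0.70], -4.0),
    ]
    print("leave-one-out RMSE:", leave_one_out_rmse(toy_dataset, k=2))
```

In this sketch the similarity assumption appears as the weighting: the closer a neighbor's feature vector is to the target's, the more its energy contributes to the prediction.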

Publisher URL: http://onlinelibrary.wiley.com/resolve/doi

DOI: 10.1002/jcc.25107