Selection of putative binding poses is a challenging part of virtual screening for protein–protein interactions. Predictive models to filter out binding candidates with the highest binding affinities comprise scoring
functions that assign a score to each binding pose. Existing scoring
functions are typically deduced by collecting statistical information about interfaces of native conformations of protein complexes along with interfaces of a large generated set of non-native conformations. However, the obtained scoring
functions become biased toward the method used to generate the non-native conformations, i.e., they may not recognize near-native interfaces generated with a different method. The present study demonstrates that knowledge of only native protein–protein interfaces is sufficient to construct well-discriminative predictive models for the selection of binding candidates. Here we introduce a new scoring method that comprises a knowledge-based potential called
KSENIA deduced from structural information about the native interfaces of 844 crystallographic protein–protein complexes. We derive
KSENIA using
convex optimization with a training set composed of native protein complexes and their near-native conformations obtained using deformations along the low-frequency normal modes. As a result, our knowledge-based potential has only marginal bias toward a method used to generate putative binding poses. Furthermore,
KSENIA is smooth by construction, which allows it to be used along with rigid-body optimization to refine the binding poses. Using several test benchmarks, we demonstrate that our method reliably discriminates native and near-native conformations of protein complexes from non-native ones. Our methodology can be easily adapted to the recognition of other types of molecular interactions, such as protein–ligand, protein–RNA, etc.
KSENIA will be made publicly available as a part of the SAMSON software platform at
https://team.inria.fr/nano-d/software.
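
To make the idea of deriving a scoring function by convex optimization concrete, the following is a minimal, generic sketch and not the KSENIA formulation itself. It assumes a scoring function that is linear in a feature vector describing an interface, and fits its weights by minimizing an L2-regularized hinge loss with subgradient descent. The feature vectors, the two synthetic classes ("native-like" and "perturbed"), the loss, the sign convention (higher score means more native-like), and all names (`train_linear_scorer`, `score`, `lam`, `lr`) are illustrative assumptions.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def train_linear_scorer(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit w, b by full-batch subgradient descent on the convex objective
    (lam / 2) * ||w||^2 + mean(max(0, 1 - y_i * (w . x_i + b))).
    Hypothetical helper for illustration only."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        # Gradient of the regularization term.
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        # Subgradient of the hinge term: only margin violators contribute.
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def score(x, w, b):
    """Linear score of one feature vector (assumed: higher = more native-like)."""
    return dot(w, x) + b


# Synthetic stand-ins for interface descriptors (assumption, not real data).
random.seed(0)
d = 5
native = [[random.gauss(1.0, 0.3) for _ in range(d)] for _ in range(50)]
perturbed = [[random.gauss(-1.0, 0.3) for _ in range(d)] for _ in range(50)]
X = native + perturbed
y = [1.0] * 50 + [-1.0] * 50

w, b = train_linear_scorer(X, y)
accuracy = sum((score(x, w, b) > 0) == (yi > 0) for x, yi in zip(X, y)) / len(X)
print(f"training accuracy: {accuracy:.2f}")
```

Because the objective is convex in the weights, any local minimum found this way is also a global one, which is the general property that makes convex formulations attractive for fitting scoring functions. A linear score of this kind is also smooth in its features, although whether it is smooth in the atomic coordinates depends on how the features are defined.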