Study of the Applicability Domain of the QSAR Classification Models by Means of the Rivality and Modelability Indexes

Author
Luque Ruiz, Irene
Gómez-Nieto, Miguel Ángel
Publisher
MDPI
Date
2018
Subject
QSAR
Classification model
Applicability domain
Rivality index
Modelability index
Abstract
The reliability of a QSAR classification model depends on its capacity to achieve confident
predictions of new compounds not considered in the building of the model. The results of this
external validation process show the applicability domain (AD) of the QSAR model and, therefore,
the robustness of the model in predicting the property/activity of new molecules. In this paper we
propose using the rivality and modelability indexes to study whether a dataset can be correctly
modeled by a QSAR algorithm and to predict how reliably the resulting model will predict the
property/activity of new molecules. The calculation of these indexes
has a very low computational cost, not requiring the building of a model, thus being good tools for
the analysis of the datasets in the first stages of the building of QSAR classification models. In our
study, we selected two benchmark datasets with a similar number of molecules but very
different modelability, and we corroborated the predictive capacity of the rivality and
modelability indexes against the classification models built using Support Vector Machine and
Random Forest algorithms with 5-fold cross-validation and leave-one-out techniques. The results
have shown the excellent ability of both indexes to predict outliers and the applicability domain of
the QSAR classification models. In all cases, these values accurately predicted the statistical parameters
of the QSAR models generated by the algorithms.
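
The validation protocol named in the abstract (Support Vector Machine and Random Forest classifiers evaluated with 5-fold cross-validation and leave-one-out) can be sketched with scikit-learn as follows. This is a minimal illustration and not the authors' code. The synthetic dataset, the hyperparameters, the stratified and shuffled folds, and the two reported statistics (accuracy and Matthews correlation coefficient) are all assumptions made for the example. The rivality and modelability indexes are not computed here, because the abstract does not define them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, matthews_corrcoef
from sklearn.model_selection import LeaveOneOut, StratifiedKFold, cross_val_predict
from sklearn.svm import SVC

# Assumption: a synthetic stand-in for a molecular descriptor matrix X
# and binary activity labels y; a real study would load a benchmark dataset.
X, y = make_classification(
    n_samples=120, n_features=20, n_informative=8, random_state=0
)

# Assumption: illustrative hyperparameters, not those used in the paper.
models = {
    "SVM": SVC(kernel="rbf", C=1.0, gamma="scale"),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Assumption: stratified, shuffled folds for the 5-fold scheme.
schemes = {
    "5-fold": StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    "LOO": LeaveOneOut(),
}

results = {}
for model_name, model in models.items():
    for scheme_name, cv in schemes.items():
        # Each sample is predicted by a model fitted without that sample.
        y_pred = cross_val_predict(model, X, y, cv=cv)
        results[(model_name, scheme_name)] = {
            "accuracy": accuracy_score(y, y_pred),
            "mcc": matthews_corrcoef(y, y_pred),
        }

for key, scores in results.items():
    print(key, scores)
```

Because `cross_val_predict` returns one out-of-sample prediction per molecule, the same predictions can also be inspected sample by sample, for instance to list the molecules that every model misclassifies.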