Trainable Models

Overview

The ‘Trainable Model’ concept utilizing a novel similarity based analysis methodology allows the user to:

Assess the quality of the predictions by means of the Reliability Index (RI) estimation. This index provides values in a range from 0 to 1 and serves as an evaluation of whether a submitted compound falls within the Model Applicability Domain. Estimation of the Reliability Index takes into account the following two aspects: similarity of the tested compound to the training set and the consistency of experimental values for similar compounds.
Instantly expand the Model Applicability Domain with the help of any user-defined proprietary ‘in-house’ data of experimental values for the property of interest.

Each ‘Trainable Model’ consists of the following parts:

A structure based QSAR/QSPR for the prediction of the property of interest derived from a literature training set – the baseline QSAR/QSPR.
A user defined data set with experimental values for the property of interest – the Self-training Library.
A special similarity based routine which identifies the most similar compounds contained in the Self-training Library and considering their experimental values calculates systematic deviations produced by the baseline QSAR/QSPR for each submitted molecule – the training engine.

The current version of ACD/Percepta has implemented ‘Trainable Model’ methodology for the prediction of the following properties:

P-gp Specificity
- Trainable P-gpS
  Calculates the probability of a compound being a P-gp substrate.
- Trainable P-gpI
  Predicts the probability for a compound to act as a P-gp inhibitor.
Solubility
- Trainable LogSw
  Calculates quantitative solubility in pure water (LogSw, mmol/ml).
- Trainable LogS
  Calculates quantitative solubility in buffer at selected pH values (LogS, mmol/ml at pH=1.7, 6.5 and 7.4).
- Trainable Qual.S
  Estimates probabilities for the solubility of the compound in buffer (S, mg/ml at pH=7.4) to exceed selected thresholds (0.1, 1 and 10 mg/ml).
Plasma Protein Binding
- Trainable LogKa
  Predicts the compound's equilibrium binding constant to human serum albumin in the blood plasma (LogK_a^HSA).
- Trainable PPB
  Estimates the fraction of the compound bound to the blood plasma proteins (%PPB)
Partitioning
- Trainable LogP
  Calculates the logarithm of the otanol-water partitioning coefficient for the neutral form of the compound (LogP)
- Trainable LogD
  Predicts the logarithm of the apparent octanol water partition coefficient at selected pH values (LogD at pH=1.7, 6.5 and 7.4) taking into account all the species (including ionized) of the compound present in the system.
Ionization constants
- Trainable pKa Full
  Calculates pKa constants for all ionization stages
Cytochrome P450 Inhibitor Specificity
Calculates probability of a compound being an inhibitor of a particular cytochrome P450 enzyme with IC₅₀ below one of the two selected thresholds (general inhibition models - IC₅₀ < 50 μM; efficient inhibition - IC₅₀ < 10 μM). Predictions are available for five P450 isoforms :
- Trainable CYP1A2 I
- Trainable CYP2C19 I
- Trainable CYP2C9 I
- Trainable CYP2D6 I
- Trainable CYP3A4 I
Cytochrome P450 Substrate Specificity Calculates probability of a compound being metabolized by a particular cytochrome P450 enzyme. Predictions are available for five P450 isoforms:
- Trainable CYP1A2 S
- Trainable CYP2C19 S
- Trainable CYP2C9 S
- Trainable CYP2D6 S
- Trainable CYP3A4 S

Built-in Self-training Libraries

As a starting point for the calculations a number of Built-in Self-training Libraries with experimental values of the corresponding properties is provided for each ‘Trainable Model’ in ACD/Percepta:

Trainable P-gpS
- Built-in P-gpS Self-training Library - 1596 compounds.
Trainable P-gpI
- Built-in P-gpI Self-training Library - 2006 compounds.
Trainable LogSw
- Built-in LogSw Self-training Library - 6807 compounds.
Trainable LogS
- Built-in LogS(pH=1.7) Self-training Library - 6807 compounds.
- Built-in LogS(pH=6.5) Self-training Library - 6807 compounds.
- Built-in LogS(pH=7.4) Self-training Library - 6807 compounds.
Trainable Qual.S
- Built-in Qualitative Solubility (S(7.4) > 0.1 mg/ml) Self-training Library - 7587 compounds.
- Built-in Qualitative Solubility (S(7.4) > 1 mg/ml) Self-training Library - 8163 compounds.
- Built-in Qualitative Solubility (S(7.4) > 10 mg/ml) Self-training Library - 7973 compounds.
Trainable LogKa
- Built-in LogKa(HSA) Self-training Library - 334 compounds.
Trainable PPB
- Built-in %PPB Self-training Library - 1453 compounds.
Trainable LogP
- Built-in LogP Self-training Library - 16277 compounds.
Trainable LogD
- Built-in LogD(pH=1.7) Self-training Library - 16277 compounds.
- Built-in LogD(pH=6.5) Self-training Library - 16277 compounds.
- Built-in LogD(pH=7.4) Self-training Library - 16321 compounds.
Trainable pKa Full
- Built-in pKa Self-training Library - 20264 entries.
Trainable CYP1A2 I
- Built-in CYP1A2 Inhibition (IC50 < 10 uM) Self-training Library - 5815 compounds.
- Built-in CYP1A2 Inhibition (IC50 < 50 uM) Self-training Library - 4867 compounds.
Trainable CYP2C19 I
- Built-in CY2C19 Inhibition (IC50 < 10 uM) Self-training Library - 6833 compounds.
- Built-in CYP2C19 Inhibition (IC50 < 50 uM) Self-training Library - 6899 compounds.
Trainable CYP2C9 I
- Built-in CY2C9 Inhibition (IC50 < 10 uM) Self-training Library - 7677 compounds.
- Built-in CYP2C9 Inhibition (IC50 < 50 uM) Self-training Library - 7666 compounds.
Trainable CYP2D6 I
- Built-in CY2D6 Inhibition (IC50 < 10 uM) Self-training Library - 7507 compounds.
- Built-in CYP2D6 Inhibition (IC50 < 50 uM) Self-training Library - 7707 compounds.
Trainable CYP3A4 I
- Built-in CY3A4 Inhibition (IC50 < 10 uM) Self-training Library - 7927 compounds.
- Built-in CYP3A4 Inhibition (IC50 < 50 uM) Self-training Library - 6684 compounds.
Trainable CYP1A2 S
- Built-in CYP1A2 Substrates Self-training Library - 935 compounds.
Trainable CYP2C19 S
- Built-in CYP2C19 Substrates Self-training Library - 794 compounds.
Trainable CYP2C9 S
- Built-in CYP2C9 Substrates Self-training Library - 867 compounds.
Trainable CYP2D6 S
- Built-in CYP2D6 Substrates Self-training Library - 1001 compounds.
Trainable CYP3A4 S
- Built-in CYP1A2 Substrates Self-training Library - 960 compounds.

Note: The size of Built-in pKa Self-training Library is given not as a number of compounds, but rather as a total number of entries, since experimental data for several ionogenic centers in the same molecule may be present in the library.

Each library comes in two identical copies – ‘Read-only’ and ‘Editable’. The user is free to edit the contents of the ‘Editable’ version while no alterations are allowed to the ‘Read-only’ library which can be considered as a backup copy of the original data. Otherwise these Built-in Self-training Libraries have the same functionality – both can be used in calculations or as an initial data source for the creation of user-defined Self-training Libraries.

Trainable Models

Overview

Built-in Self-training Libraries

Navigation menu

Search