hERG Inhibition
Revision as of 08:21, 25 May 2016
Overview
Cardiotoxicity of drug-like compounds associated with inhibition of the human ether-a-go-go-related gene (hERG) channel is an increasingly common cause of drug candidate attrition. The hERG potassium channel is required for normal cardiac repolarization, and its blockade can lead to prolongation of the cardiac QT interval and life-threatening arrhythmias.
The hERG Inhibition module lets you quickly identify hERG inhibitors. Training the model with ‘in-house’ experimental (screening) data on hERG inhibition, which are usually very large, expands the Applicability Domain of the model and produces reliable predictions for compounds synthesized in your company. Training also customizes the model so that it correctly handles data originating from the particular screening protocol used in your company, which may differ significantly from the standard protocols described in the literature.
Features
- Predicts the probability that a compound inhibits the hERG channel at clinically relevant concentrations (Ki < 10 μM).
- The predictive model is based on a data set of more than 6500 compounds with experimental results collected from published hERG inhibition studies utilizing either patch-clamp or competitive binding methods.
- Calculates a Reliability Index (RI) for each prediction, which indicates whether the tested compound belongs to the Applicability Domain of the predictive model.
- Performs a similarity search and displays the 5 most similar structures from the training set of the model along with their names, experimental results, and literature references.
- Allows training of the model with ‘in-house’ data generated by an ‘in-house’ screening protocol.
Interface
- Estimated probability of a compound being a hERG channel inhibitor.
- Indication of the prediction reliability along with the Reliability Index value:
  - RI < 0.3 – Not Reliable
  - RI in range 0.3-0.5 – Borderline Reliability
  - RI in range 0.5-0.75 – Moderate Reliability
  - RI >= 0.75 – High Reliability
- Up to 5 similar structures from the training set with names, experimental results (Inhibitor, Non-inhibitor, Inconclusive data), and references.
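The reliability categories listed above can be expressed as a simple threshold lookup. The following is a minimal sketch, not part of the product: the function name `reliability_category` is hypothetical, and it assumes that each range includes its lower bound (so RI = 0.5 counts as Moderate Reliability), which the list does not state explicitly.

```python
def reliability_category(ri: float) -> str:
    """Map a Reliability Index (expected range 0..1) to its interface category.

    Assumption: each range includes its lower bound and excludes its upper bound.
    """
    if not 0.0 <= ri <= 1.0:
        raise ValueError("RI is expected to lie between 0 and 1")
    if ri < 0.3:
        return "Not Reliable"
    if ri < 0.5:
        return "Borderline Reliability"
    if ri < 0.75:
        return "Moderate Reliability"
    return "High Reliability"


if __name__ == "__main__":
    for value in (0.1, 0.3, 0.6, 0.9):
        print(value, reliability_category(value))
```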
Technical information
Experimental data
The data set used for model development consisted of 663 binary values (inhibitor, non-inhibitor). These were collected from original publications and cover two types of experiments:
- Electrophysiological patch-clamp assay - hERG current inhibition expressed as IC50 constants (512 compounds).
- Radioligand (dofetilide, astemizole, MK-499) displacement assay providing Ki values (161 compounds).
Assignment of qualitative categories
The following criteria were applied for conversion of continuous data representing strength of compounds' interaction with hERG channel to binary representation:
- In patch-clamp studies, compounds that exhibited IC50 < 10 μM were considered hERG inhibitors, while those with IC50 > 10 μM were considered hERG non-inhibitors.
- For data coming from the radioligand displacement assay, the corresponding thresholds were as follows: Ki < 0.5 μM - inhibitors, Ki > 100 μM - non-inhibitors, while compounds in the intermediate range (0.5 μM < Ki < 100 μM) were labeled inconclusive.
Stricter criteria were applied to radioligand displacement data than to patch-clamp data because the former method does not provide a direct measure of hERG channel inhibition, but rather represents hERG binding affinity. To ensure high quality of the data set, only sufficiently strong or weak binders were considered inhibitors or non-inhibitors respectively, while no definitive categories were assigned to compounds with moderate binding affinities.
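The conversion rules above can be sketched as two small functions, one per assay type. This is an illustration only: the function names are hypothetical, and the handling of values that fall exactly on a threshold (IC50 = 10 μM, Ki = 0.5 μM or 100 μM) is an assumption, since the criteria do not cover those cases.

```python
from typing import Optional


def label_patch_clamp(ic50_um: float) -> Optional[str]:
    """Binary label from a patch-clamp IC50 value given in micromolar."""
    if ic50_um < 10.0:
        return "inhibitor"
    if ic50_um > 10.0:
        return "non-inhibitor"
    # Assumption: exactly 10 uM is not covered by the stated criteria.
    return None


def label_radioligand(ki_um: float) -> str:
    """Label from a radioligand displacement Ki value given in micromolar."""
    if ki_um < 0.5:
        return "inhibitor"
    if ki_um > 100.0:
        return "non-inhibitor"
    # Assumption: the boundary values themselves are treated as inconclusive.
    return "inconclusive"


if __name__ == "__main__":
    print(label_patch_clamp(2.5))    # strong blocker in a patch-clamp study
    print(label_radioligand(12.0))   # moderate binder, no definitive category
```

Note how the same measured value can be labeled differently depending on the assay: a compound at 5 μM is an inhibitor by patch-clamp IC50 but inconclusive by Ki.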
Model features & prediction accuracy
The predictive model of hERG inhibition was derived using GALAS (Global, Adjusted Locally According to Similarity) modeling methodology (please refer to [1] for more details).
Each GALAS model consists of two parts:
- Global baseline statistical model employing binomial PLS with multiple bootstrapping and a predefined set of fragmental descriptors, which reflects general trends in hERG inhibition.
- Similarity-based routine that performs local correction of baseline predictions taking into account the differences between baseline and experimental values for the most similar training set compounds.
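The idea of a local, similarity-based correction can be illustrated with a toy calculation. The sketch below is not the GALAS algorithm, whose details are given in [1]; it is a generic similarity-weighted residual correction, and the function name, the tuple layout of `neighbours`, and the use of a 0..1 probability scale are all assumptions made for illustration.

```python
from typing import Sequence, Tuple

# Each neighbour: (similarity in 0..1, experimental value, baseline prediction).
Neighbour = Tuple[float, float, float]


def locally_corrected(baseline: float, neighbours: Sequence[Neighbour]) -> float:
    """Shift a baseline prediction by the similarity-weighted mean residual
    (experimental minus baseline) of the most similar training compounds."""
    total_weight = sum(sim for sim, _, _ in neighbours)
    if total_weight == 0:
        # No similar compounds: the baseline prediction is left unchanged.
        return baseline
    delta = sum(sim * (exp - base) for sim, exp, base in neighbours) / total_weight
    # Keep the corrected value on the probability scale.
    return min(1.0, max(0.0, baseline + delta))


if __name__ == "__main__":
    # The baseline underestimates a close analogue that is a known inhibitor (1.0).
    neighbours = [(1.0, 1.0, 0.5), (0.5, 0.0, 0.2)]
    print(locally_corrected(0.4, neighbours))
```

In this simplified scheme the size of the correction depends only on the relative similarities; a practical method would also need to damp the correction when even the closest neighbours are dissimilar.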
The GALAS methodology also provides the basis for estimating the reliability of predictions by means of a calculated Reliability Index (RI) value ranging from 0 to 1, which takes into account the following two criteria:
- Similarity of the tested compound to the training set molecules (a prediction is unreliable if no similar compounds have been found).
- Consistency of experimental values and baseline model predictions for the most similar compounds from the training set (discrepant data for similar molecules, i.e. alternating hERG blockers and hERG non-blockers, lead to lower RI values).
The GALAS method also provides the basis for model Trainability. The 'Trainable model' methodology addresses the fact that the chemical space of ‘in-house’ libraries is considerably wider than that of publicly available data, which limits the applicability of most third-party QSARs to ‘in-house’ data. The ‘Training engine’ corrects for systematic deviations produced by the baseline QSAR model, based on analysis of similar compounds from the experimental data library. Expanding this Self-training Library with user-defined experimental data for new compounds leads to an instant improvement in prediction accuracy for the respective compound classes. Moreover, adding 'in-house' data adapts the existing model to the particular experimental protocol used in your company and avoids potential issues related to discrepancies between the experimental methods used to determine drug interactions with hERG (see the Model Trainability Demonstration section).
The accuracy of predictions for compounds within the model Applicability Domain (as indicated by Reliability Index values) is comparable to screening results. Unreliable predictions may be instantly improved by adding experimental data for a few similar compounds to the model Self-training Library.
The table below shows performance of the model on the internal validation set consisting of 151 molecules. Predictions for 103 compounds (68.2% of the validation set) within Model Applicability Domain (indicated by Reliability Index (RI) value > 0.3) are highly accurate:
Experimental | Predicted True | Predicted False | Statistic | Value |
---|---|---|---|---|
True | 60 | 4 | Sensitivity | 93.4% |
False | 5 | 34 | Specificity | 87.2% |
All | | | Accuracy | 91.3% |
- Only compounds within Applicability Domain (RI > 0.3) were considered in testing.
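The summary statistics can be recomputed from the four counts in the table. The sketch below assumes that table rows are experimental classes and columns are predicted classes, so that 60 compounds are true positives, 4 false negatives, 5 false positives, and 34 true negatives; the function name is hypothetical. Under this reading the accuracy and specificity round to the tabulated 91.3% and 87.2%, while the recomputed sensitivity (60/64) comes out slightly higher than the tabulated figure.

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,      # correct predictions / all predictions
        "sensitivity": tp / (tp + fn),      # inhibitors correctly identified
        "specificity": tn / (tn + fp),      # non-inhibitors correctly identified
    }


if __name__ == "__main__":
    metrics = classification_metrics(tp=60, fn=4, fp=5, tn=34)
    for name, value in metrics.items():
        print(f"{name}: {value:.1%}")
    # Share of the 151-compound validation set inside the Applicability Domain.
    print(f"coverage: {(60 + 4 + 5 + 34) / 151:.1%}")
```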
Model Trainability Demonstration
Trainability of the described predictive model of hERG inhibition was tested using an external data set derived from an HTS fluorescence assay that has recently become available in the PubChem database. The validation procedure was performed as follows:
- HTS fluorescence assay data for 1609 compounds were extracted from the PubChem database. Quantitative values provided in the PubChem database (PubChem scores - fluorescence increase over negative control compared to the reference compound terfenadine) were converted to binary representation: compounds with a PubChem score > 40% were considered hERG inhibitors; those with a PubChem score from -20 to 20% were considered non-inhibitors.
- Part of this external data library was reserved as a test set. The remaining data were added to the Self-training Library in three steps.
- The resulting models containing different portions of HTS data were validated against the reserved test set.
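The conversion of PubChem scores to binary labels in the first step can be sketched as follows. This is a minimal illustration: the function name is hypothetical, the -20% and 20% bounds are assumed to be inclusive, and compounds whose score falls outside both ranges are assumed to be left unlabeled, since the procedure does not say how they were handled.

```python
from typing import List, Optional, Tuple


def label_pubchem_score(score_percent: float) -> Optional[str]:
    """Binary hERG label from a PubChem HTS score given in percent."""
    if score_percent > 40.0:
        return "inhibitor"
    if -20.0 <= score_percent <= 20.0:
        return "non-inhibitor"
    # Assumption: scores in the remaining ranges get no binary label.
    return None


def labeled_subset(scores: List[float]) -> List[Tuple[float, str]]:
    """Keep only the scores that receive a binary label."""
    pairs = [(s, label_pubchem_score(s)) for s in scores]
    return [(s, label) for s, label in pairs if label is not None]


if __name__ == "__main__":
    print(labeled_subset([85.0, 30.0, 5.0, -35.0, -10.0]))
```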
When calculations for the test set are made using the Built-in Self-training Library, predicted values for many compounds are marked ‘Not reliable’ (i.e. they fall outside of the Model Applicability Domain, red bars in the figure). However, as discussed above, prediction accuracy remains high if calculations of at least borderline reliability (RI ≥ 0.3) are considered. The key point is the appearance of a considerable number of moderate (RI ≥ 0.5) and high quality (RI ≥ 0.7) predictions when even a small part of the external data set is added to the Self-training Library (green bars in the figure). The percentage of reliable predictions rises further as the Library is expanded, while the same or better overall accuracy of calculations is maintained:
Library | N (RI > 0.5) | Accuracy (RI > 0.5) | N (RI > 0.7) | Accuracy (RI > 0.7) |
---|---|---|---|---|
Built-in | 104 | 96.15% | 6 | 83.33% |
Built-in + 320 | 302 | 99.01% | 150 | 99.33% |
Built-in + 623 | 345 | 99.13% | 177 | 99.48% |
Built-in + 935 | 376 | 98.34% | 192 | 99.48% |
These results demonstrate the ability of our ‘Trainable model’ methodology to adapt the existing model to the particular chemical space represented by an external compound set. They also indicate that our Training engine corrects for the differences in experimental estimation that arise when data from different assays are combined, and is therefore particularly suitable for analysis of ‘in-house’ data.