ACD/Qualitative Solubility GALAS

From ACD Percepta

Revision as of 13:34, 29 May 2012

Overview


This module assigns a compound to one of five classes according to its solubility in buffer at pH = 7.4: extremely insoluble, highly insoluble, insoluble, slightly soluble, or soluble. The qualitative assessment is based on the cumulative result of several probabilistic models, each predicting the probability that the compound’s solubility in buffer at pH = 7.4 exceeds one of four established thresholds (0.01, 0.1, 1, and 10 mg/ml). Each probabilistic prediction is accompanied by the corresponding Reliability Index (RI) value.
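The page does not describe how the individual probabilities are combined into a single class. As a minimal sketch of one plausible scheme, the example below assumes that each sub-model output is cut at p = 0.5 and that the class is set by the number of consecutive thresholds the compound is predicted to exceed. The function name `classify`, the 0.5 cut-off, and the cascade rule are illustrative assumptions, not the documented ACD/Percepta algorithm.

```python
# Hypothetical sketch: combine four threshold probabilities into one class.
# The cascade rule and the 0.5 cut-off are assumptions made for illustration.

THRESHOLDS_MG_ML = (0.01, 0.1, 1.0, 10.0)
CLASSES = (
    "extremely insoluble",
    "highly insoluble",
    "insoluble",
    "slightly soluble",
    "soluble",
)


def classify(probabilities, cutoff=0.5):
    """Return a class name from P(S7.4 > threshold) for each threshold.

    `probabilities` must be ordered like THRESHOLDS_MG_ML (lowest first).
    """
    if len(probabilities) != len(THRESHOLDS_MG_ML):
        raise ValueError("expected one probability per threshold")
    exceeded = 0
    for p in probabilities:
        if p <= cutoff:
            break  # stop at the first threshold predicted not to be exceeded
        exceeded += 1
    return CLASSES[exceeded]


# A compound predicted to exceed 0.01 and 0.1 mg/ml but not 1 mg/ml:
print(classify((0.97, 0.88, 0.35, 0.05)))
```

Under these assumptions the example compound falls between 0.1 and 1 mg/ml, which corresponds to the insoluble class. A real implementation would presumably also take the Reliability Index of each prediction into account.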


Interface


Qualitative solubility.png


  1. Qualitative estimate of the compound's solubility in buffer at pH = 7.4 (extremely insoluble, highly insoluble, insoluble, slightly soluble, soluble), based on the cumulative result of several probabilistic models
  2. Classification basis - results of individual probabilistic models
  3. Up to 5 similar structures from the training set with experimental values

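Note: Solubility classes are defined as follows (S7.4 is the solubility in buffer at pH = 7.4):

  • Extremely insoluble – S7.4 < 0.01 mg/ml
  • Highly insoluble – 0.01 mg/ml < S7.4 < 0.1 mg/ml
  • Insoluble – 0.1 mg/ml < S7.4 < 1 mg/ml
  • Slightly soluble – 1 mg/ml < S7.4 < 10 mg/ml
  • Soluble – S7.4 > 10 mg/ml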



Technical information


Solubility classes are defined as follows (S7.4 is the solubility in buffer at pH = 7.4):

Class                  Solubility range
Extremely Insoluble    S7.4 < 0.01 mg/ml
Highly Insoluble       0.01 mg/ml < S7.4 < 0.1 mg/ml
Insoluble              0.1 mg/ml < S7.4 < 1 mg/ml
Slightly Soluble       1 mg/ml < S7.4 < 10 mg/ml
Soluble                S7.4 > 10 mg/ml
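The class boundaries above can be expressed as a simple lookup on a measured value. The sketch below is only an illustration: the function name `solubility_class` is hypothetical, and because the definitions use strict inequalities, assigning a value that sits exactly on a boundary to the higher class is an assumption.

```python
from bisect import bisect_right

# Class boundaries in mg/ml, taken from the class definitions.
BOUNDS_MG_ML = (0.01, 0.1, 1.0, 10.0)
CLASS_NAMES = (
    "Extremely Insoluble",
    "Highly Insoluble",
    "Insoluble",
    "Slightly Soluble",
    "Soluble",
)


def solubility_class(s74_mg_ml):
    """Map a solubility at pH 7.4 (mg/ml) to its qualitative class.

    Assumption: a value exactly equal to a boundary goes to the higher class.
    """
    if s74_mg_ml < 0:
        raise ValueError("solubility cannot be negative")
    # bisect_right counts how many boundaries are <= the value.
    return CLASS_NAMES[bisect_right(BOUNDS_MG_ML, s74_mg_ml)]


for value in (0.005, 0.05, 0.5, 5.0, 25.0):
    print(value, "mg/ml ->", solubility_class(value))
```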







Training set size:

Sub-model and threshold                        Number of compounds
non-Extremely Insoluble (S7.4 > 0.01 mg/ml)    5,310
non-Highly Insoluble (S7.4 > 0.1 mg/ml)        5,310
non-Insoluble (S7.4 > 1 mg/ml)                 5,692
Soluble (S7.4 > 10 mg/ml)                      5,561






Internal validation set size:

Sub-model and threshold                        Number of compounds
non-Extremely Insoluble (S7.4 > 0.01 mg/ml)    2,277
non-Highly Insoluble (S7.4 > 0.1 mg/ml)        2,277
non-Insoluble (S7.4 > 1 mg/ml)                 2,441
Soluble (S7.4 > 10 mg/ml)                      2,378






Main sources of experimental data:

  • Reference books:
    • The Merck Index. An Encyclopedia of Chemicals, Drugs, and Biologicals, O'Neil, M.J., Smith, A., Heckelman, P.E., Budavari, S., Eds. 13th Edition, Merck & Co., Inc., Whitehouse Station, NJ, 2001
    • Therapeutic Drugs, Dollery, C., Ed. 2nd Edition, Churchill Livingstone, New York, NY, 1999
    • Clarke's Isolation and Identification of Drugs, Moffat, A.C., Jackson, J.V., Moss, M.S., Widdop, B., Eds. 2nd Edition, The Pharmaceutical Press, London, 1986
  • Various articles from peer-reviewed scientific journals*

* - Most of the analyzed literature consisted of articles by other authors reporting solubility models. Each such publication contained a larger collection of experimental data (usually on the order of tens or hundreds of compounds) compiled from the corresponding original experimental articles.

Internal Validation

Each sub-model has been internally validated using its own internal validation set, which constitutes ca. 30% of the entire dataset available for that particular threshold model.
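The 30% figure can be checked against the training and internal validation set sizes listed under Technical information. The short sketch below assumes that the "entire dataset" for each sub-model is the sum of its training and validation sets; the dictionary and function names are illustrative.

```python
# Set sizes copied from the training and internal validation tables.
SET_SIZES = {
    "non-Extremely Insoluble (> 0.01 mg/ml)": (5310, 2277),
    "non-Highly Insoluble (> 0.1 mg/ml)": (5310, 2277),
    "non-Insoluble (> 1 mg/ml)": (5692, 2441),
    "Soluble (> 10 mg/ml)": (5561, 2378),
}


def validation_fraction(n_training, n_validation):
    """Fraction of the combined dataset held out for internal validation.

    Assumption: the full dataset is training set plus validation set.
    """
    return n_validation / (n_training + n_validation)


for name, (n_train, n_valid) in SET_SIZES.items():
    print(f"{name}: {validation_fraction(n_train, n_valid):.1%}")
```

Under that assumption every sub-model comes out within a tenth of a percentage point of 30%.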


Table 1. Performance statistics for the various fractions of the internal validation set of the non-Extremely Insoluble (S7.4 > 0.01 mg/ml) sub-model of the ACD/Qualitative Solubility predictor. Coverage is given relative to the entire internal validation set (N = 2,277).

Subset     N      Coverage  Observed*  Calculated probability (p)    Accuracy  Sensitivity  Specificity
                                       > 0.5          < 0.5
RI > 0.3   2,146  94.2%     True       1,800 (83.9%)  51 (2.4%)      94.3%     97.2%        75.9%
                            False      71 (3.3%)      224 (10.4%)
RI > 0.5   1,800  79.1%     True       1,559 (86.6%)  24 (1.3%)      96.2%     98.5%        79.3%
                            False      45 (2.5%)      172 (9.6%)
RI > 0.75  1,054  46.3%     True       936 (88.8%)    5 (0.5%)       98.5%     99.5%        90.3%
                            False      11 (1.0%)      102 (9.7%)

* - True means that the compound's solubility in buffer at pH = 7.4 exceeds the threshold indicated in the table title, while False means that it is below that threshold.
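The page does not spell out how accuracy, sensitivity, and specificity are calculated. The sketch below assumes the standard confusion-matrix definitions, which reproduce the values reported for the RI > 0.3 subset in Table 1; the function and argument names are illustrative.

```python
def performance(tp, fn, fp, tn):
    """Standard confusion-matrix statistics (assumed definitions).

    tp: observed True,  p > 0.5    fn: observed True,  p < 0.5
    fp: observed False, p > 0.5    tn: observed False, p < 0.5
    """
    total = tp + fn + fp + tn
    return {
        "n": total,
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Counts for the RI > 0.3 subset of Table 1.
stats = performance(tp=1800, fn=51, fp=71, tn=224)
coverage = stats["n"] / 2277  # relative to the whole internal validation set

print(f"N = {stats['n']}, coverage = {coverage:.1%}")
for key in ("accuracy", "sensitivity", "specificity"):
    print(f"{key}: {stats[key]:.1%}")
```

The same function applied to the other subsets and sub-models gives the remaining tabulated figures, which makes it a convenient consistency check when transcribing the tables.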


Table 2. Performance statistics for the various fractions of the internal validation set of the non-Highly Insoluble (S7.4 > 0.1 mg/ml) sub-model of the ACD/Qualitative Solubility predictor. Coverage is given relative to the entire internal validation set (N = 2,277).

Subset     N      Coverage  Observed*  Calculated probability (p)    Accuracy  Sensitivity  Specificity
                                       > 0.5          < 0.5
RI > 0.3   2,037  89.5%     True       1,473 (72.3%)  60 (2.9%)      92.6%     96.1%        82.1%
                            False      90 (4.4%)      414 (20.3%)
RI > 0.5   1,628  71.5%     True       1,236 (75.9%)  29 (1.8%)      95.4%     97.7%        87.3%
                            False      46 (2.8%)      317 (19.5%)
RI > 0.75  908    39.9%     True       725 (79.8%)    4 (0.4%)       98.6%     99.5%        95.0%
                            False      9 (1.0%)       170 (18.7%)

* - True means that the compound's solubility in buffer at pH = 7.4 exceeds the threshold indicated in the table title, while False means that it is below that threshold.


Table 3. Performance statistics for the various fractions of the internal validation set of the non-Insoluble (S7.4 > 1 mg/ml) sub-model of the ACD/Qualitative Solubility predictor. Coverage is given relative to the entire internal validation set (N = 2,441).

Subset     N      Coverage  Observed*  Calculated probability (p)    Accuracy  Sensitivity  Specificity
                                       > 0.5          < 0.5
RI > 0.3   2,153  88.2%     True       1,142 (53.0%)  100 (4.6%)     89.0%     91.9%        85.1%
                            False      136 (6.3%)     775 (36.0%)
RI > 0.5   1,634  66.9%     True       918 (56.2%)    47 (2.9%)      93.0%     95.1%        90.0%
                            False      67 (4.1%)      602 (36.8%)
RI > 0.75  847    34.7%     True       525 (62.0%)    7 (0.8%)       97.4%     98.7%        95.2%
                            False      15 (1.8%)      300 (35.4%)

* - True means that the compound's solubility in buffer at pH = 7.4 exceeds the threshold indicated in the table title, while False means that it is below that threshold.


Table 4. Performance statistics for the various fractions of the internal validation set of the Soluble (S7.4 > 10 mg/ml) sub-model of the ACD/Qualitative Solubility predictor. Coverage is given relative to the entire internal validation set (N = 2,378).

Subset     N      Coverage  Observed*  Calculated probability (p)    Accuracy  Sensitivity  Specificity
                                       > 0.5          < 0.5
RI > 0.3   2,114  88.9%     True       688 (32.5%)    98 (4.6%)      90.7%     87.5%        92.5%
                            False      99 (4.7%)      1,229 (58.1%)
RI > 0.5   1,649  69.3%     True       560 (34.0%)    47 (2.9%)      93.2%     92.3%        93.8%
                            False      65 (3.9%)      977 (59.2%)
RI > 0.75  869    36.5%     True       351 (40.4%)    9 (1.0%)       97.4%     97.5%        97.2%
                            False      14 (1.6%)      495 (57.0%)

* - True means that the compound's solubility in buffer at pH = 7.4 exceeds the threshold indicated in the table title, while False means that it is below that threshold.