pKa: Difference between revisions

Revision as of 13:22, 1 March 2021

Overview

The acid dissociation constant, K_a, is a measure of the tendency of a molecule or ion to release a proton (H⁺) from its ionization center(s); the higher the K_a (the lower the pK_a), the more readily the proton is lost. It is related to the ionization ability of chemical species and is a core property that defines chemical and biological behaviour.

Features

Includes two different predictive algorithms – ACD/pKa Classic and ACD/pKa GALAS.
Calculates accurate acid and base pK_a constants (pK_a = -log K_a) under standard conditions (25°C and zero ionic strength) in aqueous solutions for every ionizable group within organic structures.
Provides confidence intervals for all estimations indicating their accuracy.
Gives explicit insight into the processes running during each ionization stage.
Contains a number of other useful features depending on the selected prediction algorithm.

Interface

ACD/pKa Classic

Ionizable groups are highlighted using color shading (red for acid, blue for base, purple for amphoteric ionization centers). More intense shading denotes the strongest acid and base groups.
Strongest acid and base pK_a values including reliability range in ±log units
List of pK_a constants for all stages of ionization
List of dissociation stages (DS) corresponding to different pK_a values.
Hover over to see the screentip showing the respective dissociation reaction:
Click the appropriate tab to display the protocol, according to which the pK_a value for that dissociation stage was calculated.
Click the structure fragment to see it highlighted in the Structure pane.

ACD/pKa GALAS

Ionizable groups are highlighted using color shading (red for acid, blue for base, purple for amphoteric ionization centers). More intense shading denotes the strongest acid and base groups.
Strongest acid and base pK_a values including reliability range in ±log units
List of pK_a constants for all stages of ionization
List of partial ionization reactions (microstages) responsible for each ionization stage. Contribution of each microstage to the final pK_a value is given in percent
Hover over to see the screentip:

a. Color shading marks the ionization center

b. Dissociation reaction and its pK_a microconstant

Click the appropriate tab to select the type of plot to be displayed
Net charge vs. pH plot
Protonation states of the molecule. The selected protonation state (PS2 in this example) is displayed in the screentip with ionized atoms marked by color-shading:

Click to view the Net Charge vs. pH table. Fractions of the ionic species having a particular net charge are displayed at selected points on the pH scale including physiologically relevant pH values (1.7, 4.6, 6.5, 7.4)
Click and drag the slider to see calculated fractions of different ionic forms at precise pH value displayed on the right.
Calculated fractions of different ionic forms at selected pH.

Protonation State vs. pH plot
Click the label of a protonation state to show / hide its curve on the plot
Fractions of different protonation states at selected pH
Click to view the Protonation State vs. pH table

Ionogenic Group State vs. pH plot
Click the label of an ionogenic group to toggle its curve. Hover over the label to view a screentip with the selected ionogenic group shaded (G1 in this example):
TC – total charge of all ionogenic groups in the molecule

Click to view the Ionogenic Group State vs. pH table

Technical information

Introduction to pK_a

The pK_a is a measure of the tendency of a molecule or ion to keep a proton, H⁺, at its ionization center(s). It is related to the ionization capabilities of chemical species. The more readily a species ionizes, the more likely it is to be taken up into aqueous solution, because water is a very polar solvent (dielectric constant ε²⁰ = 80). If a molecule does not readily ionize, it will tend to stay in a non-polar solvent such as cyclohexane (ε²⁰ = 2) or octanol (ε²⁰ = 10). In biological terms, pK_a is thus an important concept in determining whether a molecule will be taken up by aqueous tissue components or by lipid membranes. It is also closely related to the concepts of pH (the acidity of a solution) and logP (the partition coefficient between immiscible liquids).

The equilibrium acid ionization constant, K_a, expresses the ratio of concentrations for the reaction:

HA + H₂O → H₃O⁺ + A^-
K_a = [H₃O⁺] [A^-] / [HA]

where, by convention, it is assumed that the concentration of water is constant, and it is absorbed into the K_a definition.

The acid ionization constant varies by orders of magnitude. For example, at 25°C:

acetic acid: K_a = 1.8 x 10^-5
phenol: K_a = 1.0 x 10^-10

It is easier to refer to such extreme numbers on a logarithmic scale and, again by convention, "p" is used to denote the negative logarithm (base 10):

pK_a = -log(K_a)

The K_a values of the compounds above are then easily converted to pK_a values:

acetic acid: pK_a = -log(1.8 x 10^-5) = 4.74
phenol: pK_a = -log(1.0 x 10^-10) = 10.0
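The conversion between K_a and pK_a can be checked numerically. The following is a minimal sketch; the function names `pka_from_ka` and `ka_from_pka` are illustrative and are not part of ACD/pKa.

```python
import math


def pka_from_ka(ka: float) -> float:
    """Return pK_a = -log10(K_a) for a positive ionization constant."""
    if ka <= 0:
        raise ValueError("K_a must be positive")
    return -math.log10(ka)


def ka_from_pka(pka: float) -> float:
    """Inverse conversion: K_a = 10^(-pK_a)."""
    return 10.0 ** (-pka)


# K_a values quoted in the text (25°C)
for name, ka in [("acetic acid", 1.8e-5), ("phenol", 1.0e-10)]:
    print(f"{name}: pK_a = {pka_from_ka(ka):.2f}")
```

Because the scale is logarithmic, a difference of one pK_a unit corresponds to a tenfold difference in K_a.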

There is an essential difference between interpreting the pK_a values for molecules vs. ions. A molecule which loses a proton ionizes:

HA + H₂O → H₃O⁺ + A^-

and so a low pK_a value denotes good aqueous solubility.

An ion which loses a proton, however, de-ionizes:

HB⁺ + H₂O → H₃O⁺ + B

and so a high pK_a value denotes good aqueous solubility.
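The contrast between the two cases can be made concrete with the Henderson–Hasselbalch relation. That relation is not stated in this section; it is brought in here as a standard assumption for a single, isolated ionization center in dilute solution. The sketch below computes the charged fraction for an acid HA (charged form A^-) and for a protonated base HB⁺ (charged form HB⁺). The function names are illustrative only.

```python
def ionized_fraction_acid(pka: float, ph: float) -> float:
    """Fraction present as A^- for HA <-> A^- + H+ (assumes one isolated site)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def ionized_fraction_base(pka: float, ph: float) -> float:
    """Fraction present as HB+ for HB+ <-> B + H+ (assumes one isolated site)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Illustrative comparison at pH 7.4: a low-pK_a acid and a high-pK_a base
# are both predominantly charged, which is the point made in the text.
print(f"acid, pK_a 4.74: {ionized_fraction_acid(4.74, 7.4):.3f} ionized")
print(f"base, pK_a 9.00: {ionized_fraction_base(9.00, 7.4):.3f} ionized")
```

At pH = pK_a both functions return 0.5, the point at which the charged and uncharged forms are equally populated.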

Note that there is no intrinsic reason to rule out pK_a values less than 0 or greater than 14. For example, sulfuric acid, H₂SO₄, has a negative pK_a for the loss of its first proton:

H₂SO₄ → HSO₄^- + H⁺ (pK_a < 0)

although experiments can normally measure pK_a values only between 1 and 13.

Ionization Centers

The pK_a determination depends on the presence of heteroatoms such as oxygen or nitrogen. Although in principle a pK_a value could be calculated for any atomic center, including carbon, in practice the extrapolation is poor for systems which have a very low amount of ionization. For example, the C–H bonds in methane have such highly covalent character that

CH₄ + H₂O → CH₃^- + H₃O⁺

has a vanishingly small probability of occurring. Some C–H bonds do have measurable ionic character, and these are calculated by ACD/pKa. For example, the C–H bond of the methylene group at the 2-position in 1,3-cyclopentanedione is highly polarized; its pK_a is predicted to be about 8.9.

Normally, however, a heteroatom is part of the ionization center, and ACD/pKa is designed to test for the presence of heteroatoms which are capable of forming bonds with sufficient ionic character to have measurable pK_a values, thus enabling reasonable prediction of pK_a for related compounds.

Statistical Factor

The approximate calculation of constants includes a statistical factor, which accounts for identical protonation sites. Leading authorities define the statistical factor as follows:

"When a polybasic acid has n groups, each of which has an equal probability of losing a proton, the observed pK_a will be less by (log n) than the pK_a of a closely related monobasic acid. This "statistical effect" arises because there are n equivalent ways of losing a proton but only one site to which the proton can be restored. Similarly, for second proton loss, the correction becomes log((n – 1) / 2), then log((n – 2) / 3), and so on. Thus, for a molecule such as butanedioic acid (HOOC–CH₂–CH₂–COOH), which has two identical acidic groups, loss of a proton from either group leads to the same monoanion. The consequence is that the first ionization constant, K_a1, for the dibasic acid is twice as large as that for the closely related monobasic acid, that is, the observed pK_a1 is 0.3 (= log 2) units less than would be expected from a consideration of factors other than probability. Conversely, the monoanion has only one ionizable proton whereas the dianion has two identical sites for proton addition, so that the second ionization step, pK_a2, appears to be weaker by a factor of two, and the observed pK_a2 to be greater by 0.3 than anticipated. Similarly, for a base with n basic centers, the measured pK_a ["apparent pK_a" in ACD/pKa] of greatest magnitude, pK_aN, will be greater than anticipated by log n, and so on."

D. D. Perrin, Boyd Dempsey and E. P. Serjeant, pKa Prediction for Organic Acids and Bases, 1981, pp.16–17.
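The corrections in the quotation follow one pattern: for the k-th proton lost from n equivalent groups, the observed pK_a is lowered by log((n – k + 1) / k). The sketch below implements only that arithmetic. The intrinsic pK_a of 4.5 is a hypothetical placeholder, not a value from the ACD/pKa database, and the function names are illustrative.

```python
import math


def statistical_correction(n_sites: int, step: int) -> float:
    """Amount by which the observed pK_a is lowered for the given
    proton-loss step (1-based) among n equivalent sites."""
    if not 1 <= step <= n_sites:
        raise ValueError("step must lie between 1 and n_sites")
    return math.log10((n_sites - step + 1) / step)


def observed_pka(intrinsic_pka: float, n_sites: int, step: int) -> float:
    """Observed pK_a after applying the statistical correction."""
    return intrinsic_pka - statistical_correction(n_sites, step)


# Diacid with two identical groups (n = 2), hypothetical intrinsic pK_a
intrinsic = 4.5
for step in (1, 2):
    shift = observed_pka(intrinsic, 2, step) - intrinsic
    print(f"step {step}: observed pK_a shifts by {shift:+.2f}")
```

For n = 2 the two shifts are equal and opposite (about 0.3 log units), matching the butanedioic acid discussion in the quotation.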

Experimental Measurement of pK_a

When comparing calculated pK_a values with experimentally determined data, it is wise to bear in mind how these measurements are carried out.

The determination of pK_a is based on pH measurements for a series of mixtures of the acid and its salt. For pK_a values in the range 2–12, this is frequently done by titrimetric methods. The pH is converted to proton molality, and then K_a is determined by measuring (or estimating) the activity coefficients of species in solution. Note that the temperature, ionic strength, and reference solutions used in these determinations can influence the measured pK_a substantially. For example, benzoic acid was determined to have a pK_a of 4.2 by one experimental group and 4.0 by another.

Another standard method is the spectrophotometric determination of pK_a. This is particularly recommended for very small quantities of sample, or for poorly soluble samples. A refinement of this method requires an estimate of the spectra for each form from the data. The pK_a values are determined by nonlinear curve fitting, assuming good initial estimates can be chosen. In theory, any kind of spectral data can be used (UV-Vis, IR, NMR, etc.), provided that the pH of the solution in which the spectrum was obtained can be measured. A plot of absorbance versus pH will show asymptotes at the absorbance of the conjugate acid and base forms of the molecule. Each wavelength gives different asymptotes, but the same inflection point. Data at enough wavelengths will generate the spectra of the conjugate acid and base forms, even if they cannot be measured experimentally, say, for molecules with pK_a outside of the range 2–12. The (common) inflection point is the pK_a. For molecules with multiple ionization sites, a sum of S-shaped curves that need to be deconvolved is obtained. Without good initial estimates, the calculations can be tedious. The better the initial estimate, the faster the convergence. ACD/pKa can provide good initial estimates for these calculations.
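The single-site case can be sketched as follows. The model assumes one ionization site and one wavelength, so the absorbance is a weighted average of the acid and base asymptotes. A simple grid scan over trial pK_a values stands in for a proper nonlinear optimizer, and the data are synthetic and noise-free (generated from the model itself, not measured). All names are illustrative.

```python
def base_fraction(ph, pka):
    """Fraction of the conjugate-base form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def fit_asymptotes(ph_values, absorbances, pka):
    """For a fixed trial pK_a, solve the 2x2 linear least-squares problem
    for the acid and base asymptotes. Returns (a_acid, a_base, sse)."""
    saa = sab = sbb = say = sby = 0.0
    for ph, y in zip(ph_values, absorbances):
        fb = base_fraction(ph, pka)
        fa = 1.0 - fb
        saa += fa * fa
        sab += fa * fb
        sbb += fb * fb
        say += fa * y
        sby += fb * y
    det = saa * sbb - sab * sab
    a_acid = (say * sbb - sby * sab) / det
    a_base = (sby * saa - say * sab) / det
    sse = 0.0
    for ph, y in zip(ph_values, absorbances):
        fb = base_fraction(ph, pka)
        sse += (a_acid * (1.0 - fb) + a_base * fb - y) ** 2
    return a_acid, a_base, sse


def fit_pka(ph_values, absorbances, step=0.01):
    """Scan trial pK_a values across the measured pH range and keep the
    one with the smallest sum of squared residuals."""
    lo, hi = min(ph_values), max(ph_values)
    best = None
    for i in range(int(round((hi - lo) / step)) + 1):
        pka = lo + i * step
        a_acid, a_base, sse = fit_asymptotes(ph_values, absorbances, pka)
        if best is None or sse < best[3]:
            best = (pka, a_acid, a_base, sse)
    return best


# Synthetic titration: assumed pK_a 7.20, asymptotes 0.10 (acid) and 0.90 (base)
ph_values = [4.0 + 0.5 * i for i in range(13)]
absorbances = [0.10 + 0.80 * base_fraction(ph, 7.20) for ph in ph_values]
pka, a_acid, a_base, sse = fit_pka(ph_values, absorbances)
print(f"fitted pK_a = {pka:.2f}, asymptotes = {a_acid:.2f}, {a_base:.2f}")
```

With real, noisy data and several ionization sites, the scan would normally be replaced by an iterative optimizer started from a good initial estimate, which is the role the text assigns to predicted pK_a values.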

Just as there are aspects of experimental design which affect the accuracy of a pK_a determination, there are also aspects of the physical solution which can lead to apparent disagreement between the calculated and measured pK_a. For example, one factor which may cause a discrepancy between calculated and experimentally measured pK_a values is the presence of a non-negligible tautomeric ratio. ACD/Percepta automatically checks for tautomers when a structure is entered in the Prediction module Workspace. To check for tautomers in the Spreadsheet Workspace, choose the Check Tautomers command from the Utilities menu.

Database of Experimental pK_a Values

The internal database contains 15,924 structures with more than 31,000 experimental values under different temperatures and ionic strengths in purely aqueous solutions. In ACD/Percepta, the database is directly accessible and searchable as Databases\pKa data source, and each experimental value is provided with a reference to the original literature. No pKa values in organic solvents or aqueous-organic mixtures are included.

Description of ACD/pKa GALAS Algorithm

Estimation of ionization constants using this algorithm is a multi-step procedure. It involves estimating pK_a microconstants for all possible ionization centers in a hypothetical state of an uncharged molecule ("fundamental microconstants"), applying numerous corrections to these initial pK_a values according to the surroundings of the reaction center, and calculating the charge influences of ionized groups on the neighbouring ionization centers. The calculation routine uses a database of 4,600 ionization centers, a set of ca. 500 interaction constants, and four interaction calculation methods for different types of interactions, producing a full range of microconstants from which pK_a macroconstants are obtained. This allows simulation of the complete distribution plot of all protonation states of the molecule at different pH conditions. For example, the complete simulated ionization profile for the cysteine molecule is illustrated in the following figure:

¹Experimental pK_a values obtained from The Merck Index (see full citation below).
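How microconstants combine into macroconstants can be illustrated for the simplest case of two ionization centers, A and B. The sketch uses the standard textbook relations (the first macroconstant is the sum of the two first-step microconstants, and the product of the two macroconstants equals the product of the microconstants along either path). It is not a description of the internal GALAS routine, and the micro pK_a values are hypothetical.

```python
import math


def macro_from_micro(pk_a: float, pk_b: float, pk_ab: float):
    """Combine microconstants for a two-site molecule.

    pk_a  : micro pK_a for losing the proton at site A first
    pk_b  : micro pK_a for losing the proton at site B first
    pk_ab : micro pK_a for losing the proton at site B after A is ionized
    Returns (macro pK_a1, macro pK_a2, fraction of stage 1 going via A).
    """
    k_a = 10.0 ** (-pk_a)
    k_b = 10.0 ** (-pk_b)
    k1 = k_a + k_b                      # both microstages feed stage 1
    pk1 = -math.log10(k1)
    pk2 = pk_a + pk_ab - pk1            # K1 * K2 = k_a * k_ab
    return pk1, pk2, k_a / k1


# Hypothetical microconstants, chosen only for illustration
pk1, pk2, share_a = macro_from_micro(pk_a=8.5, pk_b=8.9, pk_ab=10.4)
print(f"macro pK_a1 = {pk1:.2f}, macro pK_a2 = {pk2:.2f}")
print(f"microstage A contributes {100 * share_a:.0f}% of stage 1")
```

The returned fraction corresponds to the kind of per-microstage percentage contribution that the GALAS interface reports for each ionization stage.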

ACD/pKa GALAS algorithm is based on a training set containing 17,593 compounds (>20,000 ionization centers) obtained from various articles in peer-reviewed scientific journals and well-known reference books:

The Merck Index. An Encyclopedia of Chemicals, Drugs, and Biologicals, O'Neil, M.J., Smith, A., Heckelman, P.E., Budavari, S., Eds. 13th Edition, Merck & Co., Inc., Whitehouse Station, NJ, 2001
Therapeutic Drugs, Dollery, C., Ed. 2nd Edition, Churchill Livingstone, New York, NY, 1999
Clarke's Isolation and Identification of Drugs, Moffat, A.C., Jackson, J.V., Moss, M.S., Widdop, B., Eds. 2nd Edition, The Pharmaceutical Press, London, 1986

A specific feature of this algorithm is the graphical/tabular representation of the obtained predictions in the form of the pH dependency of:

Net molecular charge
Distribution of protonation states
Average charge of each ionization centre

Description of ACD/pKa Classic Algorithm

This algorithm uses Hammett-type equations and electronic substituent constants (σ) to predict pK_a values for ionizable groups. Effects considered by the software include tautomeric equilibria, covalent hydration, and resonance effects in α,β-unsaturated systems.

Hammett-type equations: every ionizable group is characterized by several Hammett-type equations that have been parameterized to cover the most popular ionizable functional groups.

Sigma constants: the internal training set contains >3,000 derived experimental electronic constants. When the required substituent constant is not available from the experimental database, one of four algorithms is used to describe electronic effect transmission through the molecular system.

This method of pK_a calculation mimics the experimental situation by "adding" protons to the molecule in the order the molecule would normally be protonated in solution. For example, performing the calculation for a neutral glycine molecule H₂N–CH₂–COOH will give two values: 9.64 and 2.43. These values are calculated for the actual ionization equilibria:

H₃N⁺–CH₂–COO^- → H₂N–CH₂–COO^- + H⁺ (pK_a = 9.64)
H₃N⁺–CH₂–COOH → H₃N⁺–CH₂–COO^- + H⁺ (pK_a = 2.43)
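Given the two glycine values above, the populations of the cationic, zwitterionic, and anionic forms at any pH follow from the usual stepwise-equilibrium expressions. The sketch below is a generic calculation under that assumption, not the routine ACD/pKa uses for its plots; the function names are illustrative.

```python
def state_fractions(ph, pkas):
    """Fractions of the protonation states at a given pH, ordered from
    the most protonated to the least protonated form."""
    h = 10.0 ** (-ph)
    weights = [1.0]  # relative population of the fully protonated form
    for pka in sorted(pkas):
        # each successive proton loss multiplies the weight by K_a / [H+]
        weights.append(weights[-1] * 10.0 ** (-pka) / h)
    total = sum(weights)
    return [w / total for w in weights]


def net_charge(ph, pkas, max_charge):
    """Average charge, where max_charge is the charge of the fully
    protonated form and each lost proton lowers the charge by one."""
    fractions = state_fractions(ph, pkas)
    return sum(f * (max_charge - lost) for lost, f in enumerate(fractions))


GLYCINE_PKAS = [2.43, 9.64]  # calculated values quoted in the text
for ph in (1.7, 4.6, 6.5, 7.4):
    cation, zwitterion, anion = state_fractions(ph, GLYCINE_PKAS)
    charge = net_charge(ph, GLYCINE_PKAS, max_charge=1)
    print(f"pH {ph}: cation {cation:.3f}, zwitterion {zwitterion:.3f}, "
          f"anion {anion:.3f}, net charge {charge:+.3f}")
```

The net charge passes through zero midway between the two pK_a values, where the cation and anion populations are equal.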

The internal training set of ACD/pKa Classic algorithm contains 15,932 molecules representing >30,000 pK_a values.

Specific features of this particular algorithm are as follows:

A detailed calculation protocol on how the prediction has been carried out is provided for each molecule (including Hammett-type equations, substituent constants, and literature references where available).
To improve prediction accuracy and make the model relevant to in-house chemical space or a particular project, the ACD/pKa Classic prediction model can be trained with user-provided experimental data. Training is user-friendly and may be switched on or off, or particular training sets may be used for different predictions, putting full control in your hands.

Further sections of this document provide more detailed information regarding the various aspects of ACD/pKa Classic algorithm.

Database of Hammett-type Equations

The Hammett-type equations used in ACD/pKa calculations have been parameterized to cover over 1,500 combinations of over 650 of the most popular ionizable functional groups. Each functional group has been characterized by several equations involving different types of substituent constants in order to achieve the most accurate calculation. All equations for a given functional group have been ranked according to their reliability (number of correlated structures, correlation coefficient and standard deviation) and reliability of available substituent constants. For example, the following ranking has been used for calculating pK_a values of para-substituted quinolines:

pK_a = 5.009 – 5.058*σ_I – 4.363*σ_R⁺ : n = 10, r = 0.9989, sd = 0.13
pK_a = 4.874 – 4.561*σ_I – 5.63*σ_R : n = 10, r = 0.9878, sd = 0.46
pK_a = 5.179 – 5.318*σ_Para : n = 9, r = 0.9878, sd = 0.42
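A ranked list like this can be applied by taking the highest-ranked equation for which all required substituent constants are available. That selection rule is an assumption made for illustration, since the text states only that the equations are ranked. The coefficients below are the three equations quoted above; the σ values passed in are hypothetical and are not taken from the ACD/pKa database.

```python
# Equations in rank order: intercept plus one slope per sigma constant.
EQUATIONS = [
    {"intercept": 5.009, "slopes": {"sigma_I": -5.058, "sigma_R+": -4.363}},
    {"intercept": 4.874, "slopes": {"sigma_I": -4.561, "sigma_R": -5.63}},
    {"intercept": 5.179, "slopes": {"sigma_Para": -5.318}},
]


def predict_pka(sigmas):
    """Return (rank, pK_a) from the first equation whose sigma constants
    are all present in `sigmas` (assumed selection rule)."""
    for rank, eq in enumerate(EQUATIONS, start=1):
        if all(name in sigmas for name in eq["slopes"]):
            pka = eq["intercept"] + sum(
                slope * sigmas[name] for name, slope in eq["slopes"].items()
            )
            return rank, pka
    raise LookupError("no equation can be evaluated with the given constants")


# Hypothetical substituent constants for illustration only
rank, pka = predict_pka({"sigma_I": 0.30, "sigma_R+": -0.20})
print(f"equation {rank}: pK_a = {pka:.2f}")

rank, pka = predict_pka({"sigma_Para": 0.10})
print(f"equation {rank}: pK_a = {pka:.2f}")
```

Falling back to a lower-ranked equation trades accuracy for coverage: the third equation needs only σ_Para but has a larger standard deviation than the first.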

Database of Electronic Substituent Constants (σ)

There are many variants of the original electronic substituent constant, σ. The ACD/pKa database contains constants for over 1,200 substituents with over 3,000 carefully derived experimental electronic constants. The following table summarizes the number of constant values present in the database.

Sigma	Number in Database
σ_I	592
σ^* (Taft)	265
σ_R	453
σ_R^–	157
σ_R⁺	143
σ_Para	585
σ_Meta	431
σ_Para^–	142
σ_Para⁺	135
σ_Phosph (P-Acids)	68
σ_Ortho (Benzoic acid)	41
σ_Ortho (Phenol)	37
σ_Ortho (Aniline)	30
σ_Ortho (Pyridine)	48

Estimation of Electronic Substituent Constants

Although the parameter database contains a wide array of σ values, in some cases no reliable constant is available. When the required substituent constant is not available from the experimental database it can be calculated by one of the algorithms described in this section.

Electronic Effect Transmission through Skeleton

This estimation is based on the following formula:

σ^R–G– = σ^–G– + Σz_I,R,…^–G–∙σ_I,R,…^R– + Σz_I,R,…^–G–∙(σ_I^R–∙σ_R^R–)…,

where all σ_I,R,…^R– are the electronic constants (inductive, resonance, etc.) of substituent R and all z_I,R,…^–G– are the transmission constants of skeleton G. The accuracy of the σ^R–G– calculation is usually better than ±0.05–0.1. The algorithm contains 42 of the most frequently used skeletons G described by 126 such equations:

σ_I: 36, σ_R: 25, σ_R^-: 6, σ_R⁺: 4, σ_Para: 24, σ_Meta: 24, σ_Phosph: 7

For example, the constants calculated for species containing the carbamate functional group are σ_I = 0.45, σ_R = -0.34, σ_R^- = -0.36, σ_R⁺ = -0.38, σ_Para = 0.10, σ_Meta = 0.32, σ_Phosph = 0.0238.

Using these parameters, the pK_a of 2-ammonio-4-thioxohexanedioate calculated by this method is 7.72 (experimental is 7.90).

Secondary Algorithm

If the preceding estimate cannot be made, a back-up method is available, based on the following formula:

σ^R–G– = σ^–G– + z_I^–G–∙σ_I^R–

The accuracy of the σ^R–G– calculation is usually ±0.15–0.20. It is not as good as the first algorithm, but it can be used to calculate the σ_I, σ^*, σ_R and σ_R^- electronic constants for any possible substituents.

For example, the constants σ_I = 0.37 and σ_R = 0.08 are calculated for N-trifluoromethyl-carbamothioic halides.

Transmission through Aliphatic Cycles

This algorithm is based on the modified Exner-Fiedler method. The original Exner-Fiedler method can be used to calculate electronic transmission effects for only a very limited number of aliphatic cycles. The improved ACD/pKa method allows calculation of these effects for any possible aliphatic (poly)cycles.

For example, the calculated transmission factor for variants of bicyclo[1.1.0]butane-1-carboxylic acid is 1.72 (experimental is 1.92).

Transmission through Condensed Polyaromatic Systems

This algorithm is based on the modified Dewar-Grisdale method. The original Dewar-Grisdale method can be used to calculate electronic transmission effects for only a very limited number of condensed polyaromatic systems (Dewar M.J.S., Grisdale P.J., J. Am. Chem. Soc., 1962, 84, 3539). The improved ACD/pKa method allows you to calculate these effects for virtually any polyaromatic system.

For example, the pK_a of the 3-amino-5-hydroxynaphthalene-2,7-disulfonate calculated by this method is 8.64 (the experimentally determined value is 8.54):

Calculation of Steric Effects

In most cases, steric effects have been taken into account by defining the ionization center as an ionizable functional group with a sufficiently large invariable skeleton. In cases where the variable substituents are in close proximity to ionizable groups, steric effects are calculated by modified branching equations. For example, the pK_a values of N-monoalkylanilinium ions are calculated by the following equation:

pK_a = 4.85 + 0.27 x (n_β)^1.84 - 0.08 x (n_γ)^2.36 + 0.01 x (n_δ)^2.36 (sd = 0.2)

where n_β, n_γ and n_δ denote the numbers of atoms in the second, third and fourth spheres of the N-alkyl substituent. The accuracy of the pK_a calculation for N-t-butylanilinium is ±0.1, whereas without this equation it would be ±2!
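The branching equation is straightforward to evaluate. In the sketch below, the sphere counts are supplied by the caller; counting only non-hydrogen atoms (so that N-ethyl has n_β = 1 and N-t-butyl has n_β = 3) is an assumption about how the spheres are populated, as the text does not spell this out. The function name is illustrative.

```python
def n_monoalkylanilinium_pka(n_beta: int, n_gamma: int = 0, n_delta: int = 0) -> float:
    """Evaluate the branching equation quoted in the text (sd = 0.2).

    n_beta, n_gamma, n_delta: numbers of atoms in the second, third and
    fourth spheres of the N-alkyl substituent (assumed: heavy atoms only).
    """
    return (
        4.85
        + 0.27 * n_beta ** 1.84
        - 0.08 * n_gamma ** 2.36
        + 0.01 * n_delta ** 2.36
    )


# Sphere counts under the heavy-atom assumption
for label, counts in [
    ("N-methyl", (0, 0, 0)),
    ("N-ethyl", (1, 0, 0)),
    ("N-isopropyl", (2, 0, 0)),
    ("N-t-butyl", (3, 0, 0)),
]:
    print(f"{label}: pK_a = {n_monoalkylanilinium_pka(*counts):.2f}")
```

The non-integer exponents make the β-sphere term grow faster than linearly with branching, which is how the equation captures the large shift for bulky groups such as t-butyl.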

Calculation of Charge Effects

In most cases, charge effects have been taken into account by including the constant charged substituent in the definition of the ionization center. For example, the pK_a values of carboxy groups in α-amino acids are calculated from the equation characterizing the –CH(NH₃⁺)COOH ionization center. In cases where the charged substituent is variable, its effect is calculated from its distance to the ionization center.

Other Effects

ACD/pKa warns you when other effects may appear which affect the experimentally observed pK_a values. These effects, if not properly taken into account, may cause a large discrepancy between the calculated and experimentally observed pK_a values.

Tautomeric Equilibria

For certain compounds, there is a mixture of two or more structurally distinct species which are in rapid equilibrium. Normally proton transfer is involved in tautomeric equilibria. Some of the most common instances of tautomerism are related to the following forms:

keto-enol;
phenol-keto;
nitroso-oxime;
aliphatic nitro compounds; and
imine-enamine.

If you are calculating pK_a values for species which contain these functional groups, after entering the compound structure in ACD/Percepta you should always choose the appropriate tautomer from the Select Tautomeric Form dialog box that is automatically shown in such cases. For example, 3 tautomeric forms are possible for the hydroxytriazoliumonate species:

Covalent Hydration

If the energy barrier to the addition of water across a double bond is relatively low, this can be a significant complicating factor in the accurate experimental determination of pK_a; thus, ACD/pKa is designed to flag known cases. For example, for pteridine, a pK_a calculation will automatically flag the species on the left as undergoing covalent hydration:

Vinylology

Another complicating factor in the calculation and measurement of pK_a is vinylology. Vinylology occurs due to resonance effects being transmitted through the double bond. In α,β-unsaturated ketones, nitriles, and esters, such as in the following structures

the γ-hydrogen acquires a level of acidity normally held by the position α to the carbonyl group. Due to vinylology, alkylation at the α-position competes with alkylation at the γ-position.

ACD/pKa does not explicitly flag cases of vinylology, although a message about tautomeric forms may appear.

Limitations

The ACD/pKa Classic algorithm will refuse to predict the pKa for structures that:

Contain more than 255 atoms (note that the program refuses to predict pKa for some cyclic compounds with fewer than 255 atoms because it uses a cycle-breaking algorithm that increases the number of atoms)
Do not contain an ionization center
Contain atoms of non-typical valence
Contain atoms other than C, H, O, S, P, N, F, Cl, Br, I, Se, Si, Ge, Pb, Sn, As, and B
Contain two or more fragments
Contain more than 30 ionizable centers
Contain d-block or f-block metal atoms
Contain textual abbreviations which cannot be transformed to structural fragments.

Note: There certainly exist some structures that formally meet the aforementioned limitations, but cannot be calculated with the current algorithm.
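Some of these limits can be screened for before a structure is submitted. The sketch below is a hypothetical pre-filter, not part of ACD/pKa: it takes a plain list of element symbols (hydrogens included) plus fragment and ionization-center counts supplied by the caller, and checks only the atom count, element list, fragment count, and ionizable-center limit. Valence checks, abbreviation expansion, and the cycle-breaking effect on atom count are not modelled.

```python
ALLOWED_ELEMENTS = {
    "C", "H", "O", "S", "P", "N", "F", "Cl", "Br", "I",
    "Se", "Si", "Ge", "Pb", "Sn", "As", "B",
}
MAX_ATOMS = 255
MAX_IONIZABLE_CENTERS = 30


def classic_precheck(elements, n_fragments, n_ionizable_centers):
    """Return a list of reasons a structure would be rejected
    (an empty list means none of the screened limits is violated)."""
    problems = []
    if len(elements) > MAX_ATOMS:
        problems.append(f"more than {MAX_ATOMS} atoms")
    unsupported = sorted(set(elements) - ALLOWED_ELEMENTS)
    if unsupported:
        problems.append("unsupported elements: " + ", ".join(unsupported))
    if n_fragments >= 2:
        problems.append("two or more fragments")
    if n_ionizable_centers == 0:
        problems.append("no ionization center")
    if n_ionizable_centers > MAX_IONIZABLE_CENTERS:
        problems.append(f"more than {MAX_IONIZABLE_CENTERS} ionizable centers")
    return problems


# Acetic acid, C2H4O2: one fragment, one ionizable center
acetic_acid = ["C"] * 2 + ["H"] * 4 + ["O"] * 2
print(classic_precheck(acetic_acid, n_fragments=1, n_ionizable_centers=1))
```

Because the element whitelist contains no d-block or f-block metals, the metal-atom rule is covered by the same check. As the note above says, passing such a screen does not guarantee that a prediction can be made.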

@@ Line 427: / Line 427: @@
 * Contain atoms other than C, H, O, S, P, N, F, Cl, Br, I, Se, Si, Ge, Pb, Sn, As, and B
 * Contain two or more fragments
-* Contain more than 20 ionizable centers
+* Contain more than 30 ionizable centers
 * Contain d-block or f-block metal atoms
 * Contain textual abbreviations which cannot be transformed to structural fragments.
 '''Note:''' There certainly exist some structures that formally meet the aforementioned limitations, but cannot be calculated with the current algorithm.

