A new approach to optimize pH buffers

Buffer solutions are pervasive in chemistry, biochemistry, analytical chemistry, etc. A better understanding of buffer properties and what controls them is susceptible to be of interest in many scientific and technological fields. For instance, linear pH gradients are commonly used in electrophoresis and their optimization rests on numerical optimization of the concentrations of various weak species. It is probably generally assumed that no basic progress could be made on optimization approaches. We introduce here a new strategy to buffer optimization, based on a parametric study of the roots of the first derivative of the buffer index. In this way, it is possible to find mathematically optimal sets of parameters (pKa and concentrations). The method is applied to mixtures of 2, 3 and 4 monovalent species, which represent simple cases that do not call for overly elaborate numerical optimization techniques, but are nevertheless of practical interest in various branches of analytical chemistry.


Introduction
Buffers are of primary importance in many fields, from biology [1] to geochemistry [2] as a scientific notion, but they also are used in technological applications-among many, one can cite water treatment [3], catalytic reduction [4,5], drug solubility [6], control of enzyme activity [7], polarography [8,9,10] and, in analytical chemistry, atomic absorption spectrophotometry [11], chromatography [12,13], electrophoresis [14,15] -in particular the use of immobilized pH gradients in electrophoresis [16,17], binary buffers for cell electrophoresis [18] or as an important parameter influencing peak spreading [14]. In all these practical uses, it is important to obtain as good a control as possible over the buffer index, sometimes across several pH units.
In particular, the search for an even buffer capacity across large pH ranges is primordial for the production of linear pH gradients. Mathematical optimization of such mixtures rests mostly on objective functions minimization, typically through least square methods, by adjusting the concentrations of known buffer substances [19,20,21]. On the other hand, theoretical studies of pH mixtures that lead to real values prediction has mostly been conducted analytically, greatly limiting the situations that could be dealt with as the complexity of the system grows extremely rapidly with the composition [22,23,24,25,26,27]. Such studies are not primarily interested with actual buffers but rather with the mathematical conditions that drive a buffer system. In this way, it is possible, for instance, to find the conditions in terms of dissociation constants (or pK a ) and concentrations that guarantee the largest possible pH range of even buffer index for a two weak acid mixture, provided that one can neglect the contribution of H 3 O + and OH − . One can, analytically, hardly go much further given the degree of the master polynomials that drive the more complicated systems-this is the case also for complexation equilibria where overlapping involving two steps already leads to intricate calculations [28], even though they do not involve formally an equilibrium with the solvent. Hence, it is hopeless to expect analytical solutions that could predict the best chemical properties to produce constant buffer index over large pH ranges-i.e. more than 1 pH unit, roughly, as going further necessarily involves using more than two chemical components. For that matter, it is even hard to guarantee finding the best mathematical conditions to reach the flattest possible index buffer curve for extreme pH, as then the contribution of H 3 O + or OH − cannot be neglected anymore, which renders the system impossible to solve analytically for two weak acid mixtures.
General formulation of buffer capacity β have been proposed (see, for instance, [22,29,30]) and Urbansky and Schock have suggested the use of the derivative of the buffer index β to locate critical points [31]. However, to the best of our knowledge, this suggestion has not been the object of extended study, even though it turns out to bring interesting results for buffer optimization in the framework of a parametric study.
The object of the present paper is to introduce a new approach, resting on a numerical analysis of the roots of the β function, which allows finding the optimal mathematical conditions to produce flat buffer index curves. It has been applied to solutions of 2, 3 and 4 weak acids, which are simple enough to be analyzed without numerical optimization techniques.

Preliminary considerations and notations
The buffer index, β, was first proposed by Van Slyke [32,33] as: where B, in Van Slyke's original paper, was the concentration of strong base in gram equivalent per L, whereas here we will instead use molarity to define β, as is generally done.
In what follows, pK a 's will be numbered pK a1 , pK a2 , pK a3 and pK a4 and we will always consider pK a1 < pK a2 < pK a3 < pK a4 . The associated concentrations may also be numbered C 1 , C 2 , C 3 and C 4 . We will note ∆pK a the difference between the highest pK a values and pK a1 (as it is the smallest one by definition here). We will neglect ionic strength effects, hence activities reduce to concentrations in the equilibrium constants, and we will use the shorthand notation [H 3 O + ] = h.

Buffers of two weak acids
This section deals with solutions of two weak acids with their acidity constant noted pK a1 and pK a2 and concentrations C 1 and C 2 . We will note the average pK a , pK a = (pK a1 + pK a2 )/2. Such mixtures have already been studied at great length. To quote two particularly relevant contributions, one should note that Ricci has studied the presence of a minimum of β in the pH interval between pK a1 and pK a2 , and provided conditions for its existence when neglecting the contribution of H 3 O + and OH − (see pp. 185-191 in [22]). Rilbe ([26]) has shown that this minimum disappears when pK a2 − pK a1 ≤ 1.14 for equimolar mixtures, also when neglecting the contributions of H 3 O + and OH − .
In the present section, we will first study the equimolar case without taking H 3 O + and OH − into consideration in the relationship for β, and find the same result as Rilbe. Then, we study the mixtures of two weak acids of different concentrations and establish for which value of ∆pK a = pK a2 − pK a1 the minimum exists or not and, finally, we study the effect of the contribution of H 3 O + and OH − and show how to determine the optimal values of concentrations and pK a 's in order to obtain a flat buffer index curve. All this will be done using numerical methods, as they can be extended to more complicated mixtures and to the inclusion of H 3 O + and OH − , which is already difficult analytically for a mixture of two weak monovalent species.

Two weak acids with the same concentration
In the case of two weak acids with the same concentration C, the buffer index is given by: and its derivative β : with P 1 = (K a1 + h) and P 2 = (K a2 + h). This derivative is zero when which does not depend on the concentration of the conjugated species as it could be factored out. By finding the roots of this polynomial in, say, the interval pH ∈ [0 : 14], one can determine the number of maxima and minima of the buffer index and the values of the corresponding roots. Figure 1.a represents the pH values for which β = 0 in the case where pK a = 4 as a function of ∆pK a = pK a2 − pK a1 i . The first obvious point is that a single root is always present, that for pH = pK a . A point at ∆pK c a = 1.1439 ii corresponds to the transition from the one to the three roots regime of the buffer index curve and matches the flattest buffer index that can be obtained at pH = 4 using two weak acids [26,25], in this case with pK a1 = 3.428 and pK a2 = 4.5720 (see figure 1.b). For ∆pK a < 1.1439 the single root in the derivative of β corresponds to a maximum, whereas it becomes a minimum surrounded by two maxima for ∆pK a > 1.1439.

Two weak acids with different concentrations
In the case of two weak acids with two different concentrations, the buffer index is given by: with r = C 2 /C 1 . Its derivative writes: We observe a significant change in the behaviour of the roots of β when C 1 = C 2 . Indeed, as shown in figure 2.a, the triple point disappears and is split in two, even if there still exists a 1-and a 3-roots regime, with a transition point for a given value of ∆pK c a that appears dependent upon the ratio r = C 2 /C 1 (see figure 2.b) but not on the magnitude of the concentrations. Here, as r < 1, C 2 varies below C 1 . In that case, the single root regime corresponds to a maximum that is closer to pK a1 than to pK a2 and not anymore equal to pK a . However, it logically goes to 4 when ∆pK a → 0. In the three roots regime, the "middle root" is above pK a and goes to pK a as ∆pK a increases beyond ∆pK c a . The situation is perfectly symmetrical when C 1 is below C 2 .
As shown in figure 2.b, the value of the split triple point, ∆pK c a , increases as the difference in concentrations increases. When this split occurs (i.e. for ∆pK a = ∆pK c a ), the buffer index of the solution presents a rather flat regime around pH = pK a2 , as can be seen in figure 2.c, but pH = pK a1 appears as a clear maximum of the buffer index. Below the critical value, the buffer index presents a single maximum and above that value, two maxima and a minimum are present, roughly and respectively for pH = pK a1 , i We chose that value of pKa for no particular reason as, in the considered situation, this curve is independent from this parameter. At this stage, we could have used reduced variables as proposed by Rilbe [23], however when we will consider the contribution of H 3 O + and OH − , it will not be possible to use such variables anymore so we prefer this homogeneous presentation of all equations studied here. ii A similar case can be made considering a solution of a weak diacid, in which case ∆pK c a = 1.204, see [23,27]. pK a2 and pK a (the exact values of these extrema must be computed as a function of r and present a dependency similar to the ones presented in figure 2.a).

Including the contribution of H 3 O + and OH −
Now, we include the contribution of H 3 O + and OH − . In that case, the buffer index writes: and its derivative is: which cancels out if the numerator cancels, obviously, as above.

Two weak acids with the same concentration
We first consider the case where C 1 = C 2 . As is shown by figure 3, the triple point is broken when the contributions of H 3 O + and OH − are taken into account even if C 1 = C 2 , as long as pK a = 7. This behaviour is symmetrical whether this parameter is above or below 7, and these curves appear very similar to the ones obtained for C 1 = C 2 when neglecting H 3 O + and OH − (see figure 2). The further pK a is from 7, the stronger the divergence from the ∆pK a curve established in the absence of H 3 O + and OH − contributions. Moreover, these curves become dependent upon the concentration of the buffer solutions, as it cannot be factored out: the higher the dilution, the higher ∆pK c a , as in the case of solutions with higher differences in concentrations. Figure 3.c illustrates how the buffer index evolves for values of ∆pK a above, equal or below ∆pK c a in the case where C = 0.1 mol.L −1 and pK a = 3 (here, ∆pK c a = 1.249).

Two weak acids with different concentrations
Now, it becomes possible to compensate for the asymmetry due to the H + or OH − contributions by changing the concentration of either species in order to improve the buffer index in the target pH-area, which is obtained by converging towards a set of concentrations C 1 and C 2 such that the single triple root is restored. In practice, here, we reached that goal by setting the value of C 1 and then optimizing C 2 .
As an example, we treat the case of the solution seen above, with C 1 = 0.1 mol.L −1 and as target pH = 3. In fig. 4.a are represented the roots in function of ∆pK a for C 1 = C 2 = 0.1 mol.L −1 -the starting point-and C 1 = 0.1 mol.L −1 , C 2 = 0.1104 mol.L −1 , when the convergence was satisfying. In that situation, ∆pK c a = 1.099 ≈ 1.1, value that differs significantly from 1.144 found in the absence of the H 3 O + contribution (which is the important one at pH = 3, where OH − can safely be neglected) or the one found for C 1 = C 2 = 0.1 mol.L −1 (∆pK c a = 1.249).  .b shows the dependency of the buffer index with pH for various sets of C 1 , C 2 and ∆pK a . The benefit of selecting the right value of C 2 and ∆pK a = ∆pK c a with C 1 set is obvious, as in that case β turns out to be perfectly flat over a range of almost 1 pH unit, centered around 3.

Three weak acids
We do not know of any detailed study dealing with mixtures of three weak acids in detail. The closer situation we found was that of trivalent species. In his study of the buffer capacity of trivalent protolytes, Rilbe found no value of ∆pK a that allowed a flat dependency of the buffer index [24]. As a matter of fact, this is only possible if the three pK a correspond to monovalent species of different concentrations, as we are about to see, and this is obviously impossible for trivalent species.
We consider here the situation without taking into account the contribution of H 3 O + and OH − , as we want to keep to situations that allow simple optimization. Otherwise, more advanced and fully automated numerical developments would be needed, which could be the object of future studies. Hence, the results established here will be of interest in pH and concentration ranges where these contributions can be neglected.

Isoconcentration of extreme pK a and centered pK a intervals
Here, we take C 1 = C 3 = C = C 2 = C m and r = C/C m . We also note ∆pK a = pK a3 − pK a1 and pK a = pK a2 = (pK a1 + pK a3 )/2. In that case, it comes: and its derivative β : where P 1 , P 2 and P 3 are defined as in the previous section.   5.b). Additionally, one can note that the multiplicity of the root evolves also with r. For large r values, as ∆pK a increases, there actually exists a domain with a single root, then with 3 roots and finally with 5 roots. In contrast, for low values of r, the transition goes straight from a single to a 5 roots situation. The value of r found to be the transition between these two types of behaviour is r ≈ 1.56. Another value of interest turns out to be the one for which the intermediate branches come to touch the pH = 7 solution, observed at r = 1.5. Indeed, figure 5.c shows the evolution of the buffer index for solutions of various values of r and ∆pK a = ∆pK c a . The value of r = 1.5 appears as the transition to a very flat behaviour, when mixtures parametrized with lower values of r present a non flat behaviour even if with a single maximum iii .

Varying concentration and non centered pK a intervals
We now take pK a1 < pK a2 < pK a3 and C l = C h = C m and r l = C l /C m , r h = C h /C m . We also note ∆pK a = pK a3 − pK a1 . In that case, it comes: and its derivative β : where P 1 , P 2 and P 3 are defined as in the previous sections.
iii Here, it is illustrated for the value r = 1 and ∆pK c a = 2.527, more or less corresponding to trivalent species -Rilbe found ∆pK c a = 2.58 for trivalent species, which is rather close [24]. Obviously, these two situations are different and it is no more surprising to find different values of ∆pK c a than it was when comparing mixtures of two monovalent acids with divalent species in section 3.2.1.
We do not present here a general study of this situation as too many parameters are involved for a simple visualisation, however it is still possible to optimize solutions "by hand" as we exemplify below. In these asymmetrical situations, it becomes handy to work also with r ∆pKa = ∆pKa h ∆pKa l , where ∆pK ah = pK a3 − pK a2 and ∆pK al = pK a2 − pK a1 .
Optimizing the parameter of a buffer solution Now, it is possible to optimize a solution of three weak acids with asymmetrical ∆pK a by varying the concentrations of one of the weak acid with the extreme pK a , and, in this case, without the contribution of H 3 O + and OH − . This is interesting to obtain as flat a dependency of β with pH as possible, when using weak acids with irregular pK a intervals.
The same procedure that was used in the case of two weak acids (taking into account the contribution of H 3 O + and OH − , see section 3.2.2) allows performing just such a task. To exemplify that point, figure  6.a displays the roots landscape in the case where pK a2 = 7 and for r ∆pKa = 1 and 1.1, with or without optimization of r l and r h . A first situation is simply the one obtained for pK a = 7 and r = 1.5 (and, here, r ∆pKa = 1 and r l = r h ) in section 4.1. Then the roots evolve in an asymmetrical manner when r ∆pKa = 1.1 is imposed with r l = r h , leading to a skewed plateau of the buffer index. Finally, by optimizing r h , it is possible to recover a single multiple roots and the corresponding β curve appears as flat as the first one. The difference in height is due to the fact that the optimization was obtained by increasing C h , leaving the other two unchanged, hence the overall buffer index increases.

Four weak acids
We could not find any detailed study of mixtures of four weak species in the literature and it turns out that the tools proposed here allow studying this case, provided that we restrict ourselves to "nice" combinations of parameters. Indeed, the four acids mixture represents an interesting and still somewhat simple situation under the following conditions: first, the contribution of H 3 O + and OH − is neglected and, second, we consider a highly symmetrical set of concentrations and pK a 's in the sense defined hereafter.
In accordance with what was said before, we take pK a1 < pK a2 < pK a3 < pK a4 . Additionally, we take C 1 = C 4 , C 2 = C 3 = C and r C = C 1 /C 2 , hence C 1 = r C C. We also note ∆pK a = (pK a4 − pK a1 )/2 and we consider that pK a = pKa 1 +pKa 4 2 = pKa 2 +pKa 3 2 . It then comes pK a1 = pK a − ∆pK a and pK a4 = pK a + ∆pK a . Moreover, we have r ∆pKa = pKa 3 −pKa 2 pKa 4 −pKa 1 , which gives pK a2 = pK a − r ∆pKa ∆pK a and pK a3 = pK a + r ∆pKa ∆pK a . In that case, it comes: where P 1 , P 2 , P 3 and P 4 are defined as in the previous section. It will also be interesting to study the second derivative, which writes: Then, by varying r C and r ∆pKa , it becomes possible to map the properties of all mixtures of four weak acids with the same average pK a . Figure 7 presents the study of the first critical point for these systems by plotting (a) the value of ∆pK c a as a function of r C and r ∆pKa , (b) the dependency of the number of roots of β for ∆pK a > ∆pK c a and (c) the corresponding number of roots of β". The domain plotted was restricted to an "interesting" area, as for larger values of r C and/or r ∆pKa , the behaviour becomes monotonous. Indeed, we observe the existence of a domain on the border of which ∆pK c a values undergo an inflexion, and this border corresponds to a zone of transition of the number of roots from 5 to 3 for ∆pK a immediately above ∆pK c a . The maximum of that area is observed at r c = 1.37 and r ∆pKa = 0.293 and it disappears for r C > 3.12.
Figures 7.c and d display the behaviour of the buffer index for two values of r C . It is clear that exiting the 5 roots domain leads to buffer index curves with a single maximum while at the same time the values of ∆pK c a decreases as r ∆pKa increases in the 3 roots domain, leading to narrower flat region in the buffer index curve. Hence, the optimal solution must be found on the 5-3 boundary. Figure 7.e and f display respectively the buffer index and its derivative along the critical line for ∆pK a = ∆pK c a . At low values of r C , beta actually displays several shoulder that tend to disappear for r C > 1.5. In order to optimize more precisely, it becomes necessary to proceed as above, by examining the root space in detail.
Hence, by iteratively searching over the boundary between the 5 and 3 roots domains, we can find an optimal combination, which is obtained for r C = 1.6217, r ∆pKa = 0.2862, ∆pK c a = 2.572. It corresponds to a situation where branches start to touch in the root space, as in the 3 weak acids case of section 4.1. Hence, for 4 weak acids mixture the optimal combination found here for an interval centered around pH = 7 is pK a1 = 5.714, pK a2 = 6.632, pK a3 = 7.368, pK a4 = 8.286 and r C = 1.6217, as depicted in figure 8.a (root space) and 8.b (corresponding buffer index curves at ∆pK a = ∆pK c a ).

Discussion and conclusion
In this paper, we have presented the interest that lies in studying buffer solutions in the root space of the derivative of the buffer index. It has allowed us to easily find optimal values of concentrations and pK a in the case of 2 and 3 weak acids mixtures, and, with greater difficulty, an optimal point in a symmetrical 4 weak acids case. As we restricted ourselves to simple optimization techniques, only the case with 2 weak acids could be dealt with while including the contribution of H 3 O + and OH − . Going beyond -for instance finding the optimal combination of parameters for mixtures of 3 weak acids at low or high pH target values -would involve numerical multiparameter optimization techniques, as is done in other approaches by combining stochastic and steepest descent methods [19], which goes beyond the scope of the present study. The question that will face such an enterprise will be that of the proper optimization technique as well as the proper target function(s). This will be the object of upcoming work. But it should be underlined that the flatness of buffer index curves obtained here appears encouraging for optimization methods based on the approach proposed here, as it finds mathematically well defined optima. Another route of investigation would be to optimize the dynamics buffer capacity [33], however in that case an additional parameter would have to be taken into account as the dilution effects depend on the concentration of the added solution.
In a more practical light, the results and techniques presented in this study allow defining optimal buffer solutions which might or might not be feasible when considering the actual pK a values of monovalent species. However, finding combinations of real weak acids that come as close as possible to such optima will lead to solutions with improved buffer-index stability over a wider range of pH and can serve as a guide for buffer preparation that do not aim at pH ranges wider than 3 pH units.
Moreover, this method could obviously be extended to situations including the contribution of polyprotic species, for instance in order to find specific optimal points in the presence of a given polyacid whose pK a 's are known and set, while trying to flatten the dependency of β with pH by including one or more monovalent species.