Site correlations, capacitance, and polarizability from protein protonisation fluctuations

We generalize the Kirkwood-Shumaker theory of protonisation fluctuation for an anisotropic distribution of dissociable charges on a globular protein. The fluctuations of the total charge and the total dipole moment, in contrast to their average values, depend on the same proton occupancy correlator, thus exhibiting a similar dependence also on the solution pH. This has important consequences for the Kirkwood-Shumaker interaction and its dependence on the bathing solution conditions.


I. INTRODUCTION
Electrostatic interactions are ubiquitous in proteinaceous systems and make an essential contribution to protein structure, folding, binding, and condensation [1].Proteins differ in a fundamental way from many other charged colloidal systems because their charge is not fixed, but depends through the dissociation equilibrium of constitutive AAs on local molecular geometry, dielectric inhomogeneities, vicinal charges, and solution composition.As soon as any of these parameters is changed, the protein charge responds in a process referred to as charge regulation (CR), through which the same protein acquires a different charge.CR effects are significant and have been shown to qualitatively change the phase diagram of protein solutions, including a liquid-liquid phase separation based on CR [2], while self-assembled protein structures appear to exhibit asymmetric constructs stabilized due to global charge redistribution [3].
Understanding the charging mechanism of dissociable AAs has a long history, starting with Linderstrøm-Lang of the Carlsberg Laboratory [4] and followed by the early general studies of the acid-base equilibria in polyelectrolytes by Lifson [5] and Marcus [6].Protonisation/deprotonisation dissociation equilibria were introduced into protein electrostatics in the seminal works of Kirkwood and Shumaker [7,8] and Tanford and Kirkwood [9], and later developed further through detailed simulation models of protein behaviour in biomolecular solutions [3,10,11].The connection between the protonisation/deprotonisation dissociation equilibria and Poisson-Boltzmann electrostatics was developed by Ninham and Parsegian [12], who introduced the CR mechanism as a self-consistent relationship between the local electrostatic potential and the dissociated state of chargeable surface groups described by the Langmuir isotherm, which has been recently extended to any adsorption isotherm [13].CR in the context of proteins [14] and its development was authoritatively reviewed by Borkovec et al. [15].
An important consequence of the CR mechanism is the fact that the charging state of a protein is an annealed variable and can therefore exhibit thermal fluctuations, a theoretical prediction [7] borne out by experiments [16].These thermal protonisation/deprotonisation fluctuations engender fluctuations of both the protein charge as well as the protein dipole moment that in turn lead to capacitance and polarizability of the protein.The charge and the dipole moment are then together at the origin of the fluctuation interactions usually referred to as the Kirkwood-Shumaker (KS) forces [8].They have been quantified in detailed simulations of protein-protein interactions by Lund and Jönsson [10,14] and Barroso da Silva and Dias [17].Upgrading the original KS approach, the fluctuation interactions were cast into a more appropriate theoretical framework of monopole and dipole Casimir-type thermal fluctuation interactions [18,19] or within the zero frequency Lifshitz term in the general theory of van der Waals interactions [20,21].
Apart from their effect on the interactions, calculations of the protonisation fluctuation contribution to protein capacitance and polarizability are scarce, which is why we embark in this work upon a more detailed analysis of their properties.Specifically, based on previously introduced methodology [22,23] we will investigate CR effects of the protein net charge and net dipole moment as well as the corresponding charge fluctuations, quantified by the capacitance response function, and the dipole fluctuations, quantified by the dielectric response function.Our work generalizes some aspects of the seminal approach of Kirkwood and Shumaker [7] and takes explicitly into account the electrostatic interactions on the weak-coupling level in the calculations of the response functions.

II. CHARGE REGULATION OF A CHARGED DIELECTRIC SPHERE
We will treat globular proteins as impermeable dielectric spheres with radius R, dielectric constant ε ∞ , and carrying N charges q k , k = 1, . . ., N , located at positions r k , |r k | = R.This gives rise to a surface charge density on the protein, where Ω = (ϑ, ϕ) comprises the azimuthal and polar angle in spherical coordinates and γ k is the great circle distance between Ω and Ω k .We have associated with each charge a finite size λ k through a normal distribution on the sphere [24]; when λ → ∞, this reduces to the Dirac delta function.For simplicity, we will assume that λ k = λ ∀k, and we set λ = 100, since the exact choice of this parameter does not significantly influence the results [23].In Eq. ( 1), we have already made a distinction between the charges based on whether they acquire a positive charge, q + k = eη k , or negative charge, q − k = e(η k − 1); η k ∈ [0, 1] determines the protonation/deprotonation state of each charge.The surface charge density can be expanded in terms of spherical harmonics Y lm (Ω), with the expansion coefficients given by σ(lm) = 4π k q k g l (λ k )Y * lm (Ω k ) [23], where we have defined g l (λ) = λ i l (λ)/sinh λ and i l (x) = π/2x I l+1/2 (x) are the modified spherical Bessel functions of the first kind.

A. Free energy of the system
In the linearized Debye-Hückel (DH) approximation, the electrostatic free energy of a charged, impermeable dielectric sphere is [23] (2) where κ = 2en 0 /ε w ε 0 k B T is the inverse Debye screening length, n 0 is the monovalent salt concentration, ε = ε ∞ /ε w , and ε w = 80 is the dielectric constant of water.In Eq. (2), C(l, x) is the multipole coupling function as described in the Supplementary Information (SI) and in Fig. S1 and it is important to note that it exhibits no exponential screening, so that the different multipoles in the electrostatic energy are coupled algebraically.
The electrostatic energy in Eq. ( 2) differs from and simplifies the well-known Kirkwood expression [25][26][27] in two ways: (i) the charges in the protein interior are not dissociated and the position of the charges is thus confined to the surface of the dielectric discontinuity; and (ii) the charges are not point-like but have a finite angular size described through a normal distribution on the sphere, quantified by the parameter λ [24].In addition, Kirkwood and Shumaker [7] consider the electrostatic interactions either only through the electrostatic self-energy or neglect them altogether.
The total free energy of the system is then composed of the electrostatic (ES) and charge regulation (CR) components and can be cast into the form [23] βF where βα i = ln 10(pH − pK (i)  a ) and pK a are the AA dissociation constants.For the CR part of the free energy we have chosen the form corresponding to the Langmuir-Davies dissociation isotherm (related to the Henderson-Hasselbalch isotherm) that is typically used to model the CR process in proteins [28].Other forms, such as the Frumkin-Fowler-Guggenheim isotherm, would correspond to more detailed models of the CR process [29].The thermodynamic equilibrium solutions for η k and the corresponding charges q k are then obtained from an implicit solution of and the equilibrium free energy is thus given by In this way, we can express the total free energy of the system through its electrostatic part and its derivatives, and the form of Eq. ( 5) is universal for any form of F ES .

B. Linearized CR approximation
Since the electrostatic free energy is based on the DH approximation, valid for small electrostatic potentials and thus necessarily for small electrostatic interactions, we can expand Eq. ( 4) in terms of the electrostatic contribution to obtain [23] where the bare charge q (0) i and the bare capacitance c (0) i of the i-th moiety are defined in the SI.In this context, we use the term "bare" to denote that the electrostatic contribution to these quantities has not been taken into account.From Eq. ( 2), we obtain the derivative of the free energy with respect to the chargeable moieties: with the electrostatic coupling matrix defined as Here, B = βe 2 0 /4πε w ε 0 is the Bjerrum length in water and cos γ kt = cos ϑ k cos ϑ t + sin ϑ k sin ϑ t cos(ϕ k − ϕ t ) is the cosine of the spherical distance on the sphere.Note that the magnitude of ξ ik is determined by the ratio of κ B and κR.The charges in the linearized CR approximation that follow from Eq. ( 6) are then obtained by solving a linear system of equations [23] k giving the charges q k as a function of the capacitance and the electrostatic coupling matrix.

III. PROTEIN PROTONISATION FLUCTUATIONS
A. Fluctuations around the charge regulated thermal average Fluctuations around the thermodynamic equilibrium solution [Eq.(3)] are obtained as ∂q k ∂q t δq k δq t + . . ., (10) where δq i = q i − q i is the deviation from the mean-field solution.We introduce the proton occupancy correlator δq k δq t as the average over fluctuations around the equilibrium value as To determine the proton occupancy correlator matrix δq k δq t , we note that where we have used ∂ q = (1/e)∂ η .In practice, we use a special case of the Woodbury matrix identity to determine the inverse [30].The proton occupancy correlator matrix describes the correlations between the protonisation sites on the protein.From Eq. ( 12), it follows that proton occupancy fluctuations are anisotropic due to the electrostatic interaction coupling matrix ξ kp [Eq.( 8)].

B. Charge and dipole moment fluctuations
Charge and dipole moment fluctuations around the thermal equilibrium values for a charge distribution on a sphere can be defined as δq = R 2 dΩ δσ(Ω) and δp = R 2 dΩ δσ(Ω) r Ω , respectively, where δσ(Ω) = σ(Ω) − σ(Ω) and we have taken into account that the charges are confined to the surface of a sphere [31].To calculate the fluctuations in the total charge of the protein around the mean-field solution, we have due to the orthonormality of spherical harmonics.As g 0 (λ) = 1, it follows that Similarly, the integrals for the fluctuations in the dipole moment vanish for all l = 1, and we obtain In the limit where we have no electrostatic contribution to CR, we have ξ ik = 0 and thus which is the result obtained by Kirkwood and Shumaker [7].Our theory is clearly a generalization of the latter in the sense that it takes into account the electrostatic energy of the fluctuating states.Note again that the derivations presented here are general, and F ES can be substituted with any form of electrostatic free energy of a given system.From the definitions of δq 2 and δp 2 it transpires that the pH dependence in both cases is given solely by the correlator δq k δq t .The conclusion would therefore be that both have a similar dependence on pH, the only difference being in the summation over the arclength cos γ kt in Eq. (15).For a vanishing electrostatic coupling (vanishing ξ kt ), the charge and dipole fluctuations become proportional in the lowest order, where we have assumed for simplicity a Dirac-like distribution on a sphere, λ → ∞.The electrostatic part ξ kt is thus the one that underpins the difference between the pH dependence of q 2 and p 2 .The stronger the electrostatic coupling in Eq. ( 12), the more the fluctuations deviate from locality and exhibit a collective behavior which acts to screen out the details in this dependence.

C. Capacitance and polarizability due to protonisation fluctuations
Since thermodynamic fluctuations in an annealed thermodynamic variable predict the response of the system, it follows that charge and dipole moment fluctuations are generally connected with capacitance and polarizability response functions, respectively.The Johnson-Nyquist fluctuation-dissipation formula, as an example of the general linear response theory, can be cast as [10] where φ is the electrostatic potential.Equation ( 18) provides the connection between the differential capacitance response function C and charge fluctuations [28].It can be derived from thermal fluctuations of the electrostatic potential by the Einstein formula [32].From the linear approximation of Eq. ( 4) the differential capacitance response function can be rewritten as Protonisation fluctuations in addition instigate dipole moment fluctuations, which in turn lead to a correction of the dielectric constant of the protein ε p through a variant of the Kirkwood formula [33,34], derived by considering protonation-induced dipole fluctuation of a spherical cavity of radius R with an isotropic dielectric response ε ∞ [35].The relation between dipole fluctuations and the excess polarizability response function α is then where E 0 is the external electrostatic field.Taking into account the difference between the internal field E and the external field, we can write Here, ε ∞ refers to all high frequency contributions-such as electronic and vibrational-excluding the protonisation fluctuations.As a caveat, configurational fluctuations of the average dipole moment are not included at this level.

D. Fluctuation interactions
The response functions of two proteins, separated by a fixed distance and coupled via a screened electrostatic potential, imply a fluctuation-driven interaction in the standard second order perturbation theory.The dominant contribution comes from monopole charge fluctuations and the subdominant term from the dipole charge fluctuations.For two proteins located at r 1,2 , the excess electrostatic free energy as a functional of the local electrostatic potential configuration can be written as where H 0 [φ(r)] is the electrostatic free energy of the bathing ionic solution that is taken in its approximate DH form, with external charges composed of a charge monopole and dipole (23) The generalized response function can similarly be written as composed of the capacitance C i = C = β δq 2 and the polarizability Writing the partition function of this system in the form of a Hubbard-Stratonovich field functional integral, one remains to the lowest (Gaussian) order in the electrostatic potential fluctuations with free energy of the form βF 2 [q i , r i ] = q 1 q 2 V 0 (r 1 , r 2 ) where we have introduced Ci = C i + ∇α i ∇ .Here, V 0 (r, r ) is the effective electrostatic interaction between two Coulomb charges, equal either to the bare Coulomb interaction for two proteins in a pure solvent, V 0 (r, r ) = 1/4πε w ε 0 |r − r |, or to the DH potential for two proteins in a monovalent electrolyte, V 0 (r, r ) = e −κ|r−r | /4πε w ε 0 |r − r |.For α i → 0, the result of Eq. ( 25) coincides with the KS interaction as derived by Lund and Jönsson [10].It also resembles the recently calculated charge fluctuation Casimir interaction between dielectric spheres [21], but without the CR model.
When the average charge and dipole moment of two identical proteins vanish and in the limit of large screening, κ|r 1 − r 2 | 1, we have in the leading order in the separation distance In this case, the effective strength of the KS interactions is additive in CR-generated capacitance and polarizability of the protein, mediated crucially by the screening length.This finding is distinct from both the original [25] and subsequent treatments of the KS interactions [10].

IV. EXAMPLES: GLOBULAR PROTEINS
We explore the results of our work on four well-studied globular proteins: bovine β-lactoglobulin (PDB:3NPO), hen egg-white lysozyme (PDB:3WUN), hisactophilin (PDB:1HCD), and calmodulin (PDB:3CLN).These four proteins were chosen as charge regulation effects in them have been studied previously both in theory and simulation [10,14,36] and as their shape can be approximated reasonably well by a sphere.In extracting their charge distributions (see the example in Fig. 1a), we follow previously established methods [22,23]; details of the procedure are given in the SI.Using the four proteins as examples, we analyze their average charge and the corresponding charge fluctuations, their average dipole moment and the dipole moment fluctuations, their polarizability and the associated dielectric constant of the protein, and the strength of the generalized KS charge fluctuation interactions.In what follows, the charges q i in the presence of CR are calculated from Eq. ( 6) and charges q (0) i in the absence of CR are obtained from Eq. (S3) in the SI.

A. Charge and dipole moment fluctuations
Figure 2a shows how the total charge on hisactophilin changes as a function of pH when ε ∞ = 4 and n 0 = 100 mM, both in presence and in absence of CR.In this particular case, the inclusion of CR effects does not significantly alter the predicted isoelectric point of the protein, pI ≈ 7.45 (CR) compared to pI ≈ 7.35 (no CR).The largest predicted difference at the same system parameters occurs for β-lactoglobulin, where the isoelectric point increases by more than half a unit in presence of CR, pI ≈ 5.55 (CR) compared to pI ≈ 4.95 (no CR)-see Fig. S4 in the SI.This slightly overestimates the mea-sured values [37,38].We note that even in the absence of CR, certain structural and continuum electrostatic effects are included in the calculation as the dissociation constants of the AAs are determined by PROPKA [39] (see also the SI).
Furthermore, Fig. 2a shows the changes in the square root of the charge fluctuations as a function of pH.We observe two things: charge fluctuations are clearly maximal when the total charge changes the most with pH, and are minimal when the change in the total charge plateaus.This is, of course, a direct consequence of the  theoretical derivation.However, the scale of the charge fluctuations is always fairly small compared to the overall charge on the proteins.
The changes in the dipole moment of hisactophilin as a function of pH are shown in Fig. 2b, again both in presence and in absence of CR. Figure 2b also shows the changes in the square root of dipole moment fluctuations as a function of pH.Here, the fluctuations are often maximal when the dipole moment is minimal and vice versa, although this does not seem to be a general rule.Another observation, stemming directly from our theoretical calculations, is that the pH dependence of the dipole moment fluctuations is essentially the same as the dependence of total charge fluctuations, which can be seen from comparing panels (a) and (b) of Fig. 2.This is a simple consequence of the fact that the total charge as well as the total dipole moment fluctuations depend on the same proton occupancy correlator δq k δq t (see the examples in Fig. 1 and Figs.S2 and S3 in the SI), which displays positive values for the diagonal terms (self energy) and negative values (anticorrelation) for off-diagonal terms.The intermittency in the diagonal components is due to full neutralization of some charged groups at those solution conditions.These observations are characteristic of all four proteins studied, which can be seen from Figs. S4 and S5 in the SI.
External parameters-namely, salt concentration and protein background dielectric constant-influence the fluctuations to various degrees.In general, changes in either of the two do not modify the pH dependence of the fluctuations to a significant extent, as can be seen from Figs. S6 and S7 in the SI on the examples of βlactoglobulin and lysozyme, respectively.Changes in salt concentration tend to lead to somewhat larger changes in the magnitude of the fluctuations compared to changes in protein dielectric constant (Fig. S6), although the difference depends on the protein in question (Fig. S7).Moreover, while charge fluctuations depend only on the dimensionless parameter κR (through the electrostatic coupling  (27)] for all four proteins studied.System parameters are ε∞ = 4 and n0 = 100 mM.matrix ξ ik ), dipole fluctuations are in comparison more sensitive to the choice of the protein radius, as they feature an explicit quadratic dependence on R in the lowest order [Eq.( 17)].

B. Capacitance, polarizability, and thermal fluctuation interactions
Equations ( 20) and ( 21) allow us to determine the correction to the dielectric constant of the proteins due to the dipole component of the protonisation fluctuations.Namely, we have that This correction is shown in Fig. 3 for all four proteins studied here and for the background protein dielectric constant ε ∞ = 4 and salt concentration n 0 = 100 mM.The correction to the dielectric constant is very large and exceeds ε p 10 for most of the pH range studied.This trend persists even at higher values of ε ∞ or lower values of n 0 (Fig. S8 in the SI).
Capacitance and polarizability of the protein also additively combine in the dominant term in the separation dependence of the fluctuation interaction between two identical proteins, which differs from the original KS form.The relative contribution of the two is further weighted by the screening length of the bathing solution.Figure 4 shows the scaled strength of the fluctuation interactions 1+κ 2 δp 2 / δq 2 [see Eq. (26)] on the example of calmodulin for different values of ε ∞ and n 0 .Regardless of these two parameters, polarizability of the protein appears to be the dominant contribution to the interaction.The relative contributions of charge and dipole fluctuations are modified by changes in ε p and n 0 , somewhat more so by the latter.And while the relative contributions of polarizability and capacitance to the fluctuations differ from protein to protein (Fig. S9 in the SI), the former is always dominant.

V. CONCLUSIONS
Based upon our previous work [23], we generalized the KS theory of protonisation fluctuation [7] by calculating the electrostatic and CR free energy contributions due to anisotropic distribution of dissociable charges on a sphere circumscribed to a protein.This allowed us to implement and solve for the weak-coupling CR boundary condition as well as quadratic fluctuations around it.We show that fluctuations in both the total charge and the total dipole moment depend on the same proton occupancy correlator δq k δq t .As a result, both the capacitance (proportional to the charge fluctuations) as well as the polarizability (proportional to the dipole fluctuations) of the protein depend in a similar fashion on the solution pH through the proton occupancy correlator.This is surprising, as these two lowest multipoles, i.e., monopole and dipole, describe two linearly independent terms in the spherical harmonic expansion of the electrostatic interaction energy; in fact, it is known that their equilibrium values, q and p, show vastly different pH behavior [22].We also show that the strength of the KS fluctuation interaction in the asymptotic regime of protein-protein separation [10] is modified and depends additively on the capacitance and polarizability of the two interacting proteins.As a consequence, the strength of the KS interaction acquires a strong pH and salt dependence.
The model we presented here makes several simplifying assumptions, such as linearization of both the electrostatic and charge regulation contributions to the free energy as well as treating the proteins as spherical objects.With this, we also assume that any conformational changes do not contribute in a major fashion, which should be a valid approximation for proteins, unlike, for instance, in the case of polyelectrolyte chains [40,41] or protein-RNA complexation [42].Nonetheless, this effect should be investigated further also in proteins, and recent advances in Monte Carlo and molecular dynamics simulations that include efficient computations of charge regulation effects should be of great aid in this [3,[43][44][45].Constant-pH molecular dynamics simulations of protein protonisation behaviour would also allow to explore the question of protein protonisation fluctuations fully at the appropriate level without any linearization assumptions while providing a good comparison for the predictions of our model.

Figure 1 .
Figure 1.(a) An example of a globular protein, hisactophilin (PDB:1HCD), consisting of N = 60 charged AAs with different pKa values.Coloured in red and in blue are the AAs which can become positively or negatively charged, respectively.The charge distribution of the protein can be mapped onto its circumscribed sphere with radius R = 2.44 nm.(b)-(d) Elements of the proton occupancy correlator matrix δq k δqt of hisactophilin at pH = 4 (a), 7 (b), and 10 (c).System parameters are ε∞ = 4 and n0 = 100 mM.For easier discernibility, we show the square root of the elements of the matrix, and the sign of the original matrix elements is preserved.Charge indices k, t refer to the dissociable AAs of the protein.

Figure 2 .
Figure 2. (a) Total charge Q (b) total dipole moment p in presence (full line) and absence (dashed line) of CR as a function of pH (left y-axes).Right y-axes of panels (a) and (b) show the square root of charge fluctuations δq 2 and dipole fluctuations δp 2 , respectively.Shown for hisactophilin (PDB:1HCD) with ε∞ = 4 and n0 = 100 mM.

2 ε∞ = 4 ,Figure 4 .
Figure 4. Relative contributions of charge fluctuations and dipole fluctuations to the fluctuation interactions between two identical proteins [Eq.(26)].Shown on the example of calmodulin (PDB:3CLN) for different values of protein dielectric constant ε∞ and salt concentration n0.