The Performance of Different Water Models on the Structure and Function of Cytochrome P450 Enzymes

: Modelling approaches and modern simulations to investigate the


Introduction:
2][13][14][15] In fact, the water molecule starts playing a role from the first step of CYP450 itself where it vacates the sixth coordination place for the substrate binding. 16In the proceeding steps of the catalytic cycle it also plays a role as a proton donor during the formation of the Cpd 0 and Cpd I. 17,18 Furthermore, the presence of a water molecule can significantly lower the transition state barrier in CYP450s. 13With an ever-increasing use of computational chemistry tools to study the mechanism and functions of CYP450, accurate modeling of the water molecules in the CYP450s has become an important challenge for a computational biochemist.In the last four decades, several water models have been proposed 0][21] Although, all these water models represent similar interaction site, they show significant differences in the bulk properties. 22Very recently, Onufriev's group developed a new water model, OPC which differs significantly in the charges and the bond length. 23This new water model resulting into 3-charge, 4-point rigid water model, reproduces a comprehensive set of bulk properties significantly more accurate than commonly used water models (see ref). 23igure 1.Structure and parameters (bond length, angle, charges) for different water models.
In the present study, using SPC/E, TIP3P, and OPC water models, we have highlighted the effect of different water models on the active site orientations and reactivity of CYP450 enzymes.
For so doing, we have chosen CYP450BM3, 24 CYP450BSβ, 25 , and CYP450OleT 26 as model systems since they catalyze similar fatty acids but possess entirely different reactivity patterns.For instance, in the CYP450BM3 the fatty acid enters the active site via its hydrophobic tail part which lies above the heme plane.The presence of an alkyl substrate tail and F87 residue in the heme cavity makes the reaction center of CYP450BM3 hydrophobic (cf. Figure 2a).In contrast, the fatty acid enters the heme cavity in CYP450BSβ and CYP450OleT via its carboxylate end and interacts with the guanidinium group of R242/245 (CYP450BSβ/CYP450OleT) close to the active site, 27 (cf.Figure 2b), which produces a more polar environment vis-à-vis CYP450BM3.
Therefore, due to the different polarities of the active site in these enzymes, they could be an ideal system to benchmark the performance of water for similar substrates but of different reactivity.

System setup
The study focuses on three CYP450 enzymes; CYP450BM3, CYP450OleT, and CYP450BSβ.
The initial coordinates for these enzymes were taken from the crystal structures with PDB IDs: 1JPZ (CYP450BM3), 24 4L4O (CYP450OleT) 26 , and 1IZO (CYP450BSβ). 25These structures were cocrystalized with the following substrates: N-Palmitoglycine (NPG) in CYP450BM3, Icosanoic acid (DCR) in CYP450OleT, and Palmitoloeic acid (PAM) in CYP450BSβ.Missing hydrogen atoms and some heavy atoms were added by the leap module of Amber 20.The force fields for the heme moiety were taken from the already published parameters for the Cpd I state. 28The Amber ff19SB force field was employed for all the enzymatic systems. 29The associated partial atomic charges and missing parameters for the substrates were generated by applying the restraint electrostatic potential (RESP) method of QM calculated charges at the HF/6-31G(d) level of theory. 30,31The corresponding parameters for substrates were generated using Generalized Amber Forcefields (GAFF2) in the Antechamber module of Amber 20.A few Na+ ions were added to the protein surfaces to neutralize the total charge of the system.Finally, the resulting systems were solvated in an octahedral box of three different water models: TIP3P, OPC, and SPC/E, each extended up to a minimum cut-off of 10 Å from the protein boundary.The catalytic cycle of CYP450 has many states.However, here we focus on the Cpd I state which is the ultimate oxidant species that leads to catalyze the substrate.

MD Simulation.
After proper parametrization of the system, to remove bad contacts, minimization was performed in two stages using a combination of steepest descent (5000 steps) and conjugate gradient (5000 steps) methods.In the first stage water position and conformations are relaxed keeping the protein fixed.Thereafter the whole complex was minimized.Subsequently, the system is gently annealed up to 300K under the NVT ensemble for 50ps.After that, 1ns of density equilibration was performed under an NPT ensemble at a target temperature of 300K and pressure 1 atm by using Langevin thermostat 32 and Berendsen barostat 33 with a collision frequency of 2ps and pressure relaxation time of 1ps.This 1 ns density equilibration is a weakly restrained MD simulation in which the system is slowly released to achieve uniform density after heating under periodic boundary conditions.Then after we remove all the restraints applied before and the system gets equilibrated for 3 ns to get a well-settled pressure and temperature for chemical and conformational analyses.Thereafter, a productive MD simulation was performed using the Monte Carlo barostat 34 for 100 ns for each complex system.We used three different replicas starting from different initial velocities each for 300 ns.Therefore, we performed a total of 300 ns simulations including all three replicas for each system.During all the MD simulations, covalent bonds containing hydrogens were constrained using the SHAKE 35 algorithm, and Particle Mesh Ewald (PME) 36 method was used to treat long-range electrostatic interactions with the cut-off set as 10 Å.All MD simulation was performed with the GPU version of the AMBER 20 package. 37The MD trajectory analysis was done with the CPPTRAJ 38 module of AMBER20.The visualization of the MD trajectories was performed by VMD 39 .All MD simulations were re-run for 3 different replicas starting from different initial velocities (see SI).

QM/MM Methodology
Hydrogen atom transfer (HAT) reactions were calculated using QM/MM calculations for the representative snapshots from MD simulations.In CYP450BM3, the QM-region involves substrate NPG and heme, whereas, the QM region of peroxygenases CYP450OleT and CYP450BSβ include truncated heme-porphyrin, truncated substrates, and R245/R242, H85/Q85 and two water molecules.The active region in QM/MM calculations in all the systems involves the protein residues and water molecules present within the cutoff of 8Å from the active oxidant heme.The atoms in the selected "active region" (mainly from the MM part) interact with the QM zone through electrostatic and van der Waals interactions and the corresponding polarization effects were considered in the subsequent QM/MM calculations.All QM/MM calculations were performed with ChemShell, 40,41 by combining the Turbomole 42,43 for the QM part, and DL_POLY 44 for the MM part.The MM part was described using ff19SB forcefield.To account for the polarizing effect of the protein environment on the QM region, an electronic embedding scheme was used.
Hydrogen link atoms with the charge shift model were employed for treating QM/MM boundary.
During QM/MM geometry optimizations, the QM region was treated using the hybrid UB3LYP functional with two basis sets, B1 and B2, where B1 stands for def2-SVP and B2 stands for def2-TZVP.The B1 basis set was used for geometry optimization, potential energy surface scanning, and frequency calculations.The energies were further corrected with the Grimme dispersion correction.All of the QM/MM transition states were located by relaxed potential energy surface (PES) scans followed by full TS optimizations using the P-RFO 45 optimizer implemented in the HDLC code.The energies were further corrected with the large all-electron basis set, designated as def2-TZVP.The zero-point energies (ZPE) were calculated for all the species and the corresponding final energies were reported as UB3LYP/B2-D3+ZPE.Since s=1/2 (doublet) and s=3/2 (quartet) spin state of Cpd I generally exhibit similar reactivities in CYP450 enzymes.For the sake of comparison to the already reported mechanistic studies, we have performed the calculation in the same spin state i.e., s=1/2 (doublet) used in earlier studies. 13

Results and Discussion
We benchmarked the effect of different water models based on: a) Molecular Dynamics Simulations to study the structural effects and b) QM/MM calculations to study the effect on reactivity of Compound I (Cpd I) for three CYP450 members.The structural effects were probed using RMSD, RMSF, SASA, dipole moment, etc., while reactivity patterns were studied by the hybrid QM/MM calculations for the structure generated by three solvent models.The performance of different water models were ranked on the qualitative comparison depending on above parameters.In CYP450 chemistry, it is well established that the 'water occupancies' close to the heme site is crucial for the catalytic activity of the P450 enzymes.In fact, it is well known that destroying the water channel through the acid-alcohol pairs (for example Glu267-Thr268 in P450BM3) can inhibit the enzyme function.Therefore, we have used the population of water molecules close to the heme site as our standard parameter to measure the relative performance of each water model.In this way-a simulated system with more ordered and populated water molecules has been ranked better and vice versa.

Effect on Structural Parameters:
RMSD and RMSF: RMSD is commonly used by computational chemists to validate the convergence and the stability of the simulated complex while RMSF is widely used to study the thermal fluctuations of atoms/residues during the simulations.To study whether the different water models have any effect on the convergence and the stability of the simulated complex, we, therefore, analyzed the RMSD and RMSF of all three water models for CYP450BM3, CYP450BSβ, and CYP450OleT.A comparison of these RMSDs and RMSFs is shown in Figure 3.As can be seen from Figure 3a, the three water models for all three CYP450s have similar convergence and they show very small overall deviations.However, all three enzymes show marginally fewer RMSD deviations with OPC/E water model relative to TIP3P and OPC models.The RMSF of all three water models for three CYP450 enzymes is shown in Figure 3b.Similar to RMSDs, we can see that all three models qualitatively show similar thermal fluctuations, however, the residue-wise thermal fluctuations are slightly higher in the TIP3P model relative to OPC and SPC/E.

Populations of Water and H-bonds:
In CYP450s, it is well known that the heme propionate forms a well-organized chain connected through the H-bonds, therefore, we performed water population analysis (SI) and water-heme H-bond analysis around 5 Å of Heme for different water models for all three complexes.Interestingly, among all the water models, the TIP3P water model has the maximum water count for CYP450BM3 while the TIP3P water count observed to be less for the CYP450BSβ, and CYP450OleT.At the same time, the OPC water model exhibits greater number of water molecules in case of CYP450OleT and CYP450BSβ.We note that the hydrophobicity of the active site is higher in CYP450BM3 due to the Phe87 and substrate hydrophobic tail while, in contrast, the active site in CYP450OleT and CYP450BSβ are more polar due to charged end of the substrate and an Arg245/Arg242 residues.Therefore, it could be anticipated that due to the more accurate distribution of point charges in OPC which mimic the optimal electrostatics of water models and the hydrophilic nature of active sites (in CYP450OleT and CYP450BSβ) supports the electrostatic interactions like H-bonds.Moreover, for CYP450BM3 where the active site is hydrophobic in nature, higher water populations (SI) and the H-bonds were observed for TIP3P water model.Therefore, we may conclude that TIP3P and SPC/E could be more suitable models for the hydrophobic active site while the OPC water model is more suitable for the polar active sites.Dipole moment: In a recent study, we have shown that the local electric field of the CYP450 enzyme could dictate the specific functions, and the dipole moment induced by the enzymatic local electric field could significantly catalyze the reaction by lowering the activation energy. 46erefore, it is intriguing to study how the different water models affect the electrostatics of the active site residues.For so doing, we selected the 11 snapshots each at the interval of 10 ns (0ns, 10ns, 20ns, 30ns, -----100ns) and calculated the dipole moment due to the water molecules present near the 5 Å of the active site heme fragment.Table 1 represents the average dipole moment for each model for all three enzymes.As can be seen, the dipole moment due to the water molecules is higher in the case of TIP3P for CYP450BM3 only, while OPC shows higher dipole moment for both peroxygenases, i.e., CYP450OleT and CYP450BSβ.The rationale for these different dipole moments could again be correlated with the hydrophobicity and the polarity of the active sites in all three enzymes.

Effect on the Conformational Properties:
In the above sections, we have seen the effect of different water models on some structural parameters.In the proceeding section, we will see if the water models also affect the conformation of the substrate and active site residues.

Conformation of CYP450BM3: CYP450BM3 catalyzes the hydroxylation of long-chain fatty acid
and the conformational changes responsible for its functions have already been explained with MD simulation using the TIP3P water model. 47Therefore, we compared the key conformational features e.g., the orientation of Phe87 and substrate dynamics using different water models.It is well established that the Phe87 flips from parallel to perpendicular plane relative to the heme plane, therefore we compared the propensity of the rotation of Phe87 in different water models.
Interestingly, the propensity to switch to a perpendicular plane was significantly higher in the OPC model relative to SPC/E and TIP3P.Similarly, the substrate can easily slide close to the heme site in the MD simulation with the OPC model while in other water models it requires more MD sampling to access the heme site.
Figure 6.Representative MD snapshots in case of CYP450BM3 system for three different water models.

Conformation of CYP450 peroxygenases i.e., CYP450OleT and CYP450BSβ:
As discussed earlier, the active site of the peroxygenases is relatively polar (with respect to CYP450BM3) due to carboxylate tail and an Arginine close to heme site.During the MD Simulations of both peroxygenases, the water occupancy was almost similar for all three water models (TIP3P, OPC, and SPC/E) which are gated by propionate side chain as shown in Figure 7. Though the aqueducts are rather similar to all the water modeled systems, we have found a significant difference in the behavior of the fatty acid substrate.During the MD with TIP3P and SPC/E modeled aqueous medium, the substrate strongly binds to the Arg245 residue through its carboxylate end and shows significant rigidity during the simulations in both peroxygenases.In contrast, during the MD simulations with the OPC modeled aqueous medium, the substrate slightly changes conformation over the course of time in both peroxygenases.Initially, the His363/H361 (CYP450OleT/CYP450BSβ) restricts the water accessibility, and the substrate interacts strongly through its carboxylate end with the guanidinium group of Arg245/Arg242.As the simulation progresses, the H363/H361 flips its orientation and allows an easy approach of water to the heme.
Due to the change in the water influx, the interaction of the substrate with Arg245 weakens and it reclines towards the heme.Therefore, comparing the three water models in three different CYP450 enzymes we may conclude that the role of the water model becomes imperative in the polar environment and the OPC model may have a significant impact on the conformation in polar active sites.

Effect on Reactivity
In the previous section, we have seen that the water model may affect the substrate orientation via altered water influx which, in turn, may affect the activation energy of the reaction.
We, therefore, performed QM/MM calculations for representative snapshots from the MD trajectories.Since the hydrogen atom transfer (HAT) from the substrate to Cpd I is the key step in the substrate oxidation, we performed the Potential Energy Surface (PES) scan for the HAT step for each enzyme for different water models.TIP3P water model acts more efficiently, while in case of monooxygenase, OPC water model acts more efficiently.On the other hand, the SPC/E water model is the least efficient for peroxygenase and TIP3P is least efficient in case of monooxygenase among all three water models.However, it is important to note that we were unable to fully implement the OPC water model in Chemshell due to the unavailable parameter of the fourth site of OPC water model.Therefore, we believe if the complete implementation of the OPC water model in QM/MM will be done then the OPC model could get better results in the field of activation energy barrier calculation due to the implementation of a better charge model.

Effect of Water Model on the Enzymatic Local electric Field
It is well established that the enzymatic local electric field (LEF) can be a descriptor of the enzymatic activity, 48-51 therefore, we probed the effect of different water models on the LEF of the enzyme.The LEF of the enzyme was calculated on the oxo moiety of heme "O" and in the direction Fe=O axis using TITAN 52 Programs of QM/MM optimized RC, TS, and IM geometries.As can be seen from Figure 12, the LEF remains constant despite using the different water models i.e., OPC and SPC/E in RC, TS, and IM, however, there is a notable difference in the dipole moments of the reactive species.We found an increase in the dipole moment of 0.13 debye from RC to TS using the OPC water model for CYP450OleT, while in the case of the SPC/E water model this increment in dipole moment is 0.05 debye only.
Interestingly in the case of CYP450BSβ, this dipole moment trend is completely opposite.
In both the water models (OPC and SPC/E) dipole moment due to the waters of the QM zone, get decreased from RC to TS.This decrement of dipole moment is 0.13 debye and 0.31 debye in OPC and SPC/E water models, respectively.Considering the stabilization energy ΔELEF = 4.8 (LEF) (μ) cos θ (where LEF is the local electric field, μ is the dipole moment and θ is the angle between LEF vector and μ vector), the change in dipole moment can significantly affect the total stabilization of the state.In a positively charged polar environment of CYP450OleT due to the OPC water model, an increase in the dipole moment stabilizes the TS and we get lower activation energy.On the other hand, a decrease in the dipole moment of TS in CYP450BSβ destabilizes the TS vis-à-vis CYP450OleT, and therefore we see relatively high activation energy (cf.Table 2).
In summary: The effectiveness of various water models on three different enzyme systems has been evaluated using computational methods while taking into account various characteristics.We have calculated several properties with the classical MD simulation tool and all the outcomes has been summarized in the table given below.As can be seen that the use of different water models creates no difference (N.D) on the RMSD and RMSF parameters of the enzymes whereas significant differences was observed for other properties mentioned in the Table 3.In summary, for the peroxygenases CYP450OleT and CYP450BSβ OPC water models performs well whereas for the CYP450BM3 better results were observed for the TIP3P water model.Effect of the different water models on the reactivity is describes in section 3.3 and Table 2.

Conclusions
In the present study, we have comprehensively studied the impact of three widely used water models on the structure and reactivity of three CYP450 enzymes using MD simulations and hybrid QM/MM calculations.MD simulations show that the three water models, TIP3P, SPC/E, and OPC do not have much impact on the RMSD and RMSF values, however, the population of water and dipole moment depends upon the polarity of the active sites of the CYP450 enzymes.For instance, for hydrophobic active sites such as CYP450BM3, the TIP3P, and SPC/E water models have better water populations while for the polar active sites such as CYP450OleT and CYP450BSβ the OPC model performs well.Due to altered water count, the local dynamics in the active site such as Phe87 and substrate orientations in CYP450BM3 were found to be slightly different in OPC and it requires less conformational sampling with respect to TIP3P and SPC/E water models.The QM/MM calculations show that the TIP3P and OPC water models perform better than the SPC/E water model, although an accurate implementation of the fourth site of the OPC water model was not possible in the Chemshell program.In nutshell, TIP3P and SPC/E water models may perform better than the OPC model in the hydrophobic site while the OPC water model is recommended for the polar environments.

Data and Software Availability:
For Table of Contents Only.

Figure 2 .
Figure 2. Different active site environments in two different families of CYP450.(a) The presence of an alkyl substrate tail and F87 above the heme creates a hydrophobic environment around the heme in bacterial CYP450BM3.(b) In CYP450 peroxygenases, the presence of carboxylate end of the substrate and two polar residues like Arg245 and His85 make the heme environment hydrophilic.

Figure 5 .
Figure 5. Hydrogen bonds calculated between active oxidant heme and the solvent water molecules present near 5Å.The graph shown at the top represents the number of H-bonds in CYP450BM3 simulated with the three different water models.Below one signifies the same for CYP450OleT and CYP450BSβ.

Figure 7 .
Figure 7. Origin of water molecules approaching to heme.Note that flipping of H363 opens the water aqueduct and accumulation of waters in the active site affects the substrate orientation in CYP450OleT.

Figure 8 .
Figure 8. Representative snapshots for the CYP450BSβ simulated in TIP3P, OPC, and SPC/E water model.

,
CYP450OleT and CYP450BSβ: We started the QM/MM calculations with the geometry optimization of the reactant complex (RC), Cpd I followed by PES scanning of the optimized geometry.The reaction coordinate of the HAT reaction along with the key stationary points is shown in Figure 8 9.

Figure 9 .
Figure 9. (a,b) Optimized geometries of RC, TS, and IM during the HAT in CYP450OleT in case of OPC and SPC/E water model respectively.(c,d) Charge and spin density values for CYP450OleT in OPC and SPC/E water model respectively.(e) Energy profile obtained during the PES scanning by QM/MM at the B3LYP-D3/def2-TZVP level of theory.Energies reported have the unit, kcal/mol.Note that the OPC water model used here exhibits the direction of OPC and structural properties of TIP3P.The energy profile for this step shows a barrier of 10.02 and 11.27 kcal/mol for OPC and SPC/E

Figure 10 .
Figure 10.(a,b) Optimized geometries of RC, TS, and IM during the HAT in P450BSβ in the case of OPC and SPC/E water model.(c,d) Charge and spin density values for CYP450BSβ in OPC and SPC/E water model respectively.(e) Energy profile obtained during the PES scanning by QM/MM at the B3LYP-D3/def2-TZVP level of theory.Energies reported have the unit kcal/mol.Note that the OPC water model used here exhibits the direction of OPC and structural properties of TIP3P.

Figure 11 .
Figure 11.(a,b) Optimized geometries of RC, TS, and IM during HAT process in CYP450BM3 in the case of OPC and SPC/E water model respectively.(c,d) Charge and spin density values for CYP450BM3 in OPC and SPC/E water model respectively.(e) Energy profile obtained during the PES scanning by QM/MM at the B3LYP/def2-TZVP level of theory.Energies reported have the unit kcal/mol.

Figure 12 .
Figure 12.(a) QM zone of CYP450OleT and CYP450BSβ.(b) Gaussian convention for positive field and dipole moment vectors along Z-axis.(c,d) Dipole moment and local electric field calculation for the RC, TS, and IM in CYP450OleT and CYP450BSβ respectively, in the case of the OPC and SPC/E water model.Note that L.E.F has been calculated along the Fe=O axis at the point "O".

Table 2 .
Hydrogen atom transfer barrier for different water models in CYP450OleT, CYP450BSβ and CYP450BM3.

Table 3 .
Evaluation of different water models for CYP450OleT, CYP450BSβ and CYP450BM3.Note that the water models have been enlisted based on the best performance for various properties.Here N.D represents No difference.