Fast Quantitative Validation of 3D Models of Low-Affinity Protein–Ligand Complexes by STD NMR Spectroscopy

Low-affinity protein–ligand interactions are important for many biological processes, including cell communication, signal transduction, and immune responses. Structural characterization of these complexes is also critical for the development of new drugs through fragment-based drug discovery (FBDD), but it is challenging due to the low affinity of fragments for the binding site. Saturation transfer difference (STD) NMR spectroscopy has revolutionized the study of low-affinity receptor–ligand interactions, enabling both binding detection and structural characterization. Comparison of relaxation and exchange matrix calculations with 1H STD NMR experimental data is essential for the validation of 3D structures of protein–ligand complexes. In this work, we present a new approach based on the calculation of a reduced relaxation matrix, in combination with funnel metadynamics MD simulations, that allows very fast generation of experimentally STD-NMR-validated 3D structures of low-affinity protein–ligand complexes.

We have also studied the human monoclonal antibody 2G12 in complex with a tetramannoside (PDB code: 6MSY), which is of therapeutic interest due to its role as a broadly neutralizing antibody against human immunodeficiency virus (HIV). 1 For the RedMat calculation, we used a protein rotational correlation time of 43.1 ns, estimated with HYDRONMR, 2 and a dissociation constant of 1000 μM. The ligand and protein concentrations were 8000 μM and 25 μM, respectively, according to the experimental conditions. 3 Figure S6 shows a superposition of the X-ray crystal structure and three MD trajectory frames of the 2G12-tetramannoside complex (panel b), and the comparison of experimental STD 0 values (blue bars) with those calculated using RedMat (red bars; panel a). An NOE R-factor of 0.18, using a cutoff of 18 Å, was obtained for the X-ray structure, indicating a good fit between the experimental and theoretical STD 0 values. We next switched to the dynamics mode by analyzing a 100 ns unbiased MD simulation of the complex. The RedMat analysis of the resulting MD trajectory showed an average NOE R-factor of 0.18 (with a low standard deviation of 0.02), indicating that the complex remained relatively stable throughout the simulation, generating a dynamic ensemble fully compatible with the experimental STD NMR data (Fig. S6d).

Figure S6. a) Comparison between the calculated (from the X-ray structure; red bars) and experimental (blue bars) relative STD 0 factors (binding epitope mapping) of the non-exchangeable protons of a tetramannoside bound to the anti-HIV 2G12 monoclonal antibody (PDB 6MSY). An NOE R-factor of 0.18 was obtained using a cutoff of 18 Å, hence showing a very good agreement between the crystal and solution state structures of the complex. A 2D sketch of the ligand is also shown, with the labels associated with each sugar ring shown next to it. b) Superposition of 3 frames of the MD simulation (protein in wheat-colored cartoon, ligand in cyan sticks) and the X-ray structure (protein in wheat-colored cartoon, ligand in orange sticks) of the complex. The protein residues within 12 Å from the ligand, shown as wheat-colored lines, were included in the calculation of the theoretical STD 0 values. c) Evolution of the root mean squared deviation (RMSD) of the tetramannoside ligand (all atoms considered except the protons) with respect to the protein binding site (residues within 6 Å from the ligand). d) Evolution of the NOE R-factor of the tetramannoside ligand over the 100 ns MD simulation.

It is noteworthy that a significant increase in ligand RMSD with respect to the binding site occurs between 52 and 77 ns (Fig. S6c), due to a conformational change of the ligand (1-3) glycosidic bond. This leads to a reorientation of the mannose residue at the reducing end (D), which experimentally shows very small STD 0 values as it is the most solvent-exposed ring of the tetramannoside ligand (Figure S6a). For this reason, such a ligand conformational change leads to only a very small increase in the NOE R-factor (Fig. S6d). This highlights that significant changes in ligand RMSD along an MD simulation do not necessarily correlate with significant changes in the agreement with the experimental binding epitope mapping when the conformational changes affect regions of the ligand with low experimental STD values. It also emphasizes the great potential of combining MD simulations and RedMat analysis to generate and validate 3D ensemble structures of flexible protein-ligand complexes that best represent the experimental data in solution.
Finally, we also studied the complex of the GH94 laminaribiose phosphorylase enzyme from Paenibacillus sp. YM1 (PsLBP) with the ligand α-glucopyranose-1-phosphate (PDB code: 6GH2). 4 The PsLBP reverse reaction (synthesis of glycosidic linkages) is of high importance because it can be used as an alternative route for enzymatic glycosylation using sugar-1-phosphates as donor substrates. 5,6 For the RedMat calculation, we used a protein rotational correlation time of 68.5 ns, estimated with HYDRONMR 2 (GH94 has a molecular weight of ≈ 102 kDa), and a dissociation constant of 2000 μM. The concentrations of ligand and protein were 5000 μM and 50 μM, respectively, according to the experimental conditions. 4 Figure S7 shows the superposition of the X-ray crystal structure and three MD trajectory frames of the α-Glc-1-phosphate-GH94 laminaribiose phosphorylase complex (panel b). We also compared the results obtained using RedMat and CORCEMA-ST (red and grey bars, respectively; panel a). The NOE R-factor of the X-ray model was 0.16 and 0.24 as calculated by RedMat and CORCEMA-ST, respectively (using a cutoff of 15 Å), hence in good agreement with the STD NMR epitope mapping. With regard to the MD simulation, the ligand remained stable in the binding site for the first 75 ns (Fig. S7c), showing an average NOE R-factor of 0.18 (standard deviation of 0.04) (Fig. S7d).
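To put the concentrations and dissociation constants quoted above into perspective, the following minimal sketch computes the fraction of protein and ligand in the bound state for both complexes. It assumes a simple 1:1 binding equilibrium solved with the standard quadratic expression; this model and the function name `bound_complex_conc` are illustrative assumptions and are not taken from the RedMat implementation.

```python
import math

def bound_complex_conc(p_total, l_total, kd):
    """Equilibrium concentration of a 1:1 complex PL (same units as the inputs).

    Assumes simple 1:1 binding: Kd = [P][L] / [PL].
    """
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / 2.0

# (protein, ligand, Kd) in micromolar, as quoted in the text
systems = {
    "2G12 / tetramannoside": (25.0, 8000.0, 1000.0),
    "PsLBP / Glc-1-phosphate": (50.0, 5000.0, 2000.0),
}

for name, (p, l, kd) in systems.items():
    pl = bound_complex_conc(p, l, kd)
    print(f"{name}: [PL] = {pl:.1f} uM, "
          f"protein bound = {pl / p:.3f}, ligand bound = {pl / l:.4f}")
```

Under this assumption, most of the protein is occupied while only a very small fraction of the ligand is bound at any time, which is the usual regime for STD NMR experiments with a large ligand excess.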

RedMat and Corcema
For both BRD4 complexes, CORCEMA-ST yields poor agreement with experiment and is clearly outperformed by RedMat. However, we should note that only the experimental binding epitope was reported for these two complexes, but not the experimental STD NMR build-up curves, which hinders the direct comparison with CORCEMA-ST simulated curves. In addition, we should emphasise that CORCEMA-ST relies on calculating the whole build-up of saturation transfer onto the ligand protons along all saturation times and, hence, it is highly dependent on several parameters (kinetics of exchange, global and local correlation times, proton longitudinal relaxation times, etc.) which can introduce bias. In that sense, our results indicate that, when CORCEMA-ST performs very well (very low NOE R-factor), RedMat reproduces the predicted outcomes very well (ESI Fig. S7). On the other hand, when CORCEMA-ST results are not very "reliable" (i.e. either the NOE R-factor is above 0.3 or the whole experimental build-up curves are not available for direct comparison with CORCEMA build-up curves), RedMat outperforms CORCEMA-ST (ESI Fig. S8). However, in the latter case we should be cautious and carry out a detailed comparison with the experimental build-up curves.

(ii) For simplicity, we advise removing all exchangeable ligand protons (e.g. OH, NH, SH) from the PDB file. Also, note that any proton for which no experimental STD 0 value is input is not considered in the calculation and, hence, is not shown in the results window.

Raw Data
(iii) When preparing the ligand(s) PDB file, please be aware that some molecular visualization software (e.g. Maestro 7) includes the formal charge of the atom in the element column (last column) of the PDB file. For RedMat to read this file, please remove any numbers and symbols in this column apart from the element character.
(iv) Tens of ligand docking poses can be uploaded together in the same ligand PDB file to perform the RedMat calculation for all of them in a single run. To do so, the coordinates of each ligand pose in the PDB file must appear between the starting and ending MODEL (MODEL_NUMBER) and ENDMDL lines, respectively.
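The two file-preparation steps above (cleaning the element column and wrapping each pose in MODEL/ENDMDL records) can be sketched as follows. This is only an illustration under stated assumptions: it relies on the standard fixed-column PDB layout (element in columns 77-78, charge in columns 79-80), and the helper names `clean_element_column` and `merge_poses` are hypothetical and not part of RedMat.

```python
import re

def clean_element_column(pdb_line):
    """Keep only the element letters in columns 77-78 and blank the charge field (79-80)."""
    if not pdb_line.startswith(("ATOM", "HETATM")):
        return pdb_line
    line = pdb_line.rstrip("\n").ljust(80)
    # Drop digits and +/- signs that some programs write next to the element symbol
    element = re.sub(r"[^A-Za-z]", "", line[76:80])[:2]
    return line[:76] + element.rjust(2) + "  "

def merge_poses(poses):
    """Wrap each pose (a list of PDB lines) in MODEL/ENDMDL records."""
    out = []
    for number, pose in enumerate(poses, start=1):
        out.append(f"MODEL     {number:4d}")
        out.extend(clean_element_column(line) for line in pose
                   if line.startswith(("ATOM", "HETATM")))
        out.append("ENDMDL")
    return "\n".join(out) + "\n"

# Hypothetical ligand atom with a formal charge written into the last column
demo = ("HETATM    1  N1  LIG A   1      11.104   6.134  -6.504  1.00  0.00"
        .ljust(76) + " N1+")
print(merge_poses([[demo], [demo]]))
```

The sketch keeps only ATOM and HETATM records inside each model; any CONECT or header records present in the original pose files would need to be handled separately.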

Molecular Dynamics Calculations
Same as in Fig. 11a, but two additional files must be uploaded (Fig. S12): (i) an AMBER topology file of the protein-ligand complex (.prmtop extension), and (ii) a trajectory file of the protein-ligand complex only (i.e. no water and no ions) in standard 10-column AMBER format (.mdcrd extension).

Set Relative Experimental STD Initial Slopes (STD 0 )
In this step, the relative (in %) experimental initial slopes (STD 0 ) must be set manually for each ligand proton for which the STD factor could be determined (i.e. the STD binding epitope) (Fig. S13).
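As an illustration of how the relative values entered in this step might be prepared, the sketch below converts per-proton build-up fit parameters into relative initial slopes. It assumes the commonly used mono-exponential build-up model, STD(t) = STD_max (1 - exp(-k_sat t)), whose initial slope is STD_max × k_sat, and normalises the largest slope to 100%. This section does not state how the slopes were obtained, so the model, the function name `relative_std0`, and the numbers are all hypothetical.

```python
def relative_std0(fit_params):
    """Return relative initial slopes (%) from per-proton (STD_max, k_sat) pairs.

    Initial slope of STD(t) = STD_max * (1 - exp(-k_sat * t)) is STD_max * k_sat.
    """
    slopes = {h: std_max * k_sat for h, (std_max, k_sat) in fit_params.items()}
    top = max(slopes.values())
    return {h: 100.0 * s / top for h, s in slopes.items()}

# Hypothetical fitted values: STD_max in %, k_sat in 1/s
fits = {"H1": (12.0, 0.50), "H2": (8.0, 0.60), "H3": (3.0, 0.40)}
for proton, value in relative_std0(fits).items():
    print(f"{proton}: {value:.0f}%")
```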

Set Irradiated Protons
Here, set the protein saturation frequency employed in your experiments. Three options are allowed: methyl protons (irradiation around 0.5 ppm), aromatic protons (irradiation around 7.2 ppm), and a SHIFTX (http://www.shiftx2.ca/) csv file containing the residues of your protein that are predicted to be irradiated at the experimental irradiation frequency (Fig. S14). For best RedMat performance, we advise using the SHIFTX option.

Set Parameters
The most important parameters to set here are the Complex Correlation Time (in ps), which can be calculated with HydroNMR, 2 and the cutoff distance (in Å), which defines the region around the ligand within which directly and indirectly irradiated protein protons are considered during the RedMat calculation. We advise RedMat users to screen a range of cutoff values, typically from 10 to 14 Å, and select the results in best agreement with the experimental STD NMR binding epitope. The spectrometer frequency, the ligand and receptor concentrations, and the complex dissociation constant have minimal influence on the results (Fig. S15).
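The cutoff screening described above can be sketched as follows. The example assumes the NOE R-factor definition commonly used in the CORCEMA-ST literature with unit weights, R = sqrt(Σ(STD_exp - STD_calc)² / Σ STD_exp²); this section does not spell out the formula used by RedMat, so the definition, the function name `noe_r_factor`, and all numerical values are illustrative assumptions.

```python
import math

def noe_r_factor(experimental, calculated):
    """R = sqrt( sum (exp - calc)^2 / sum exp^2 ) over protons present in both inputs."""
    shared = [h for h in experimental if h in calculated]
    num = sum((experimental[h] - calculated[h]) ** 2 for h in shared)
    den = sum(experimental[h] ** 2 for h in shared)
    return math.sqrt(num / den)

# Hypothetical relative STD0 values (%) for three ligand protons
experimental = {"H1": 100.0, "H2": 80.0, "H3": 20.0}

# Hypothetical calculated values obtained with three cutoff distances (in Angstrom)
calculated_by_cutoff = {
    10.0: {"H1": 100.0, "H2": 60.0, "H3": 35.0},
    12.0: {"H1": 100.0, "H2": 75.0, "H3": 25.0},
    14.0: {"H1": 100.0, "H2": 70.0, "H3": 30.0},
}

scores = {cutoff: noe_r_factor(experimental, calc)
          for cutoff, calc in calculated_by_cutoff.items()}
best_cutoff = min(scores, key=scores.get)

for cutoff, score in sorted(scores.items()):
    print(f"cutoff {cutoff:4.1f} A: NOE R-factor = {score:.3f}")
print(f"best agreement at {best_cutoff} A")
```

Protons without an experimental value are simply skipped, mirroring the statement that such protons are not considered in the calculation.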

RedMat submission window, where some parameters can be set by the user.

Results window
The results window shows the average NOE R-factor for the whole ligand and for each individual proton. In the RedMat Docking mode, if more than one ligand docking pose/model is contained in the input ligand PDB file, the average NOE R-factor for each model is shown. In the RedMat MD mode, plots showing the evolution of the average NOE R-factor of the whole ligand (Fig. S16) and of each individual proton over the frames of the MD trajectory are also produced.
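For readers who wish to post-process the per-frame values themselves, the following minimal sketch smooths a series of NOE R-factors with a trailing moving average and flags the frames below the 0.30 limit used in this work as an indicator of good agreement. The function name `moving_average`, the window handling for the first frames, and the per-frame values are hypothetical; the averaging actually implemented in RedMat may differ.

```python
def moving_average(values, window):
    """Trailing moving average; early frames use as many points as are available."""
    out = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / min(i + 1, window))
    return out

# Hypothetical NOE R-factors for consecutive MD frames
r_factors = [0.40, 0.35, 0.28, 0.22, 0.20, 0.19, 0.21, 0.18]

smoothed = moving_average(r_factors, window=3)
good_frames = [i for i, r in enumerate(smoothed) if r < 0.30]

print("smoothed:", [round(r, 3) for r in smoothed])
print("frames with moving-average R-factor below 0.30:", good_frames)
```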

A final note about the potential limitations of the Reduced Relaxation Matrix Approach
The relaxation matrix, although essential for NOE calculations, comes with some well-known limitations. [...][10] This approximation might limit the accuracy of the structures during refinement steps. On the other hand, the inconsistency between predicted and experimental NOEs might be due to simple motional averaging. For this reason, it is sometimes necessary to obtain a relaxation matrix from an ensemble of structures. 11 Moreover, due to the previously mentioned approximations, the relaxation matrix might not be able to define a global minimum of the conformations of a ligand-protein system. In these cases, repeated simulated annealing and system convergence criteria should be considered together with relaxation matrix calculations. Similarly, simulated annealing, long MD simulations or replica exchange techniques should be considered for those systems where an agreement between experimental and calculated NOEs is not reached in the starting structure. Finally, when only a few ligand protons are available or their peaks overlap, the relaxation matrix calculations might be useful only for a qualitative evaluation of the protein-ligand structure, so these cases remain very challenging.

Figure S1. Superposition of the ligand poses obtained from docking calculations with low NOE R-factor and moderately high ligand RMSD (with respect to the X-ray orientation) and the crystal structure. (a) Complex between the 2,7-anhydro Neu5Ac ligand and RgNanH-GH33 (PDB code 4X4A). Docking poses within a ligand RMSD range of 3.2-4.2 and NOE R-factor range of 0.18-0.22 are shown as magenta sticks, while the ligand crystallographic orientation is shown as cyan sticks. (b-c) Complexes between two pyridazine ligands (Ligand 1 in (b), Ligand 2 in (c)) and BRD4 (PDB codes 5M3A in (b) and 5M39 in (c)). Docking poses within a ligand RMSD range of 2.0-4.0 and NOE R-factor range of 0.12-0.19 are shown as magenta sticks, while the ligand crystallographic orientation is shown as cyan sticks.

Figure S2. Snapshot of one of the association events between the 2,7-anhydro Neu5Ac ligand (shown as cyan sticks) and the RgNanH-GH33 protein during a funnel-MD trajectory (see also video 1 provided as supporting information). Convergence towards the ligand crystallographic orientation (shown as yellow sticks) is observed, which corresponds to frames of low RMSD and NOE R-factor values (as shown in Fig. 1e,f in the main text).

Figure S3. RedMat analysis of the 2,7-anhydro-Neu5Ac binding to RgNanH-GH33 (PDB 4X4A) during a 100-nanosecond classical MD simulation. (a) Superposition of 3 frames of the MD simulation (protein in wheat-colored cartoon, ligand in cyan sticks) and the X-ray structure (ligand in yellow sticks) of the complex. The protein residues within 12 Å from the ligand, shown as wheat-colored lines, were included in the calculation of theoretical STD 0 values. (b) Evolution of the RMSD of 2,7-anhydro-Neu5Ac (all atoms except the protons considered), with respect to the protein binding site (residues within 6 Å from the ligand considered), over the MD simulation. (c) Evolution of the NOE R-factor of 2,7-anhydro-Neu5Ac over the MD simulation.

Figure S4. (a-b) 2D plot representing the NOE R-factor vs the ligand RMSD for the docking poses obtained for Ligand 1 (a) and Ligand 2 (b) binding to BRD4. The docking score of each pose is indicated by the color code shown in the legend. The data point corresponding to the X-ray structure is highlighted with an arrow and the docking poses resembling the crystallographic orientation are indicated with a dashed circle. (c-d) Evolution of the root mean squared deviation (RMSD) for Ligand 1 (c) and Ligand 2 (d) (all atoms except the protons considered) with respect to the protein binding site of BRD4 (residues within 6 Å from the ligand considered). The fragments of the trajectories where the ligands adopt an X-ray-type orientation are highlighted with a dashed circle.

Figure S5. Snapshots of one of the association events between Ligand 1 (left) and Ligand 2 (right) (shown as cyan sticks) with the BRD4 protein during funnel-MD simulations (see also videos 2 and 3 provided as supporting information). Convergence towards the ligand crystallographic orientation (shown as yellow sticks) is observed for both ligands, corresponding to frames of low RMSD and NOE R-factor values, as shown in Fig. 2c and 2f of the main text for Ligand 1 and Ligand 2, respectively.

Figure S7. a) Comparison between the calculated (for the X-ray structure 6GH2; RedMat, red bars; CORCEMA-ST, gray bars) and experimental (blue bars) relative STD 0 factors (binding epitope mapping) of the non-exchangeable protons of α-Glc-1-phosphate bound to PsLBP. NOE R-factors of 0.16 and 0.24 were obtained from RedMat and CORCEMA-ST theoretical calculations, respectively, using a cutoff of 15 Å. A 2D sketch of the ligand is also shown, with the labels associated with each sugar ring shown next to it. b) Superposition of 3 frames of the MD simulation (protein in wheat-colored cartoon, ligand in cyan sticks) and the X-ray structure (protein in wheat-colored cartoon, ligand in orange sticks) of the complex. The protein residues within 12 Å from the ligand, shown as wheat-colored lines, were included in the calculation of theoretical STD 0 values. c) Evolution of the root mean squared deviation (RMSD) of the α-Glc-1-phosphate ligand (all atoms except the protons considered) with respect to the protein binding site (residues within 6 Å from the ligand). d) Evolution of the NOE R-factor of α-Glc-1-phosphate over the 100 ns MD simulation.

Figure S8. Comparison between calculated (RedMat, red bars; CORCEMA-ST, gray bars) and experimental (blue bars) relative STD 0 factors (binding epitope mapping) of the non-exchangeable protons of Ligand 1 (a) and Ligand 2 (b) binding to BRD4 (PDB codes 5M3A and 5M39, respectively). For Ligand 1 (a), NOE R-factors of 0.17 and 0.62 were obtained from RedMat and CORCEMA-ST calculations, respectively. For Ligand 2 (b), NOE R-factors of 0.13 and 0.65 were obtained from RedMat and CORCEMA-ST calculations, respectively. In all cases, a distance cutoff of 10 Å was used for the calculations.

Fig. S15. Menu to select the experimental irradiation frequency employed. Three options are allowed: methyl protons (irradiation around 0.5 ppm), aromatic protons (irradiation around 7.2 ppm), and a SHIFTX csv file containing the residues of your protein that are predicted to be irradiated at the experimental irradiation frequency (the format of this file is shown in the inset table).

Fig. S17. RedMat results window for the analysis of an MD trajectory. (a-b) The evolution of the NOE R-factor of the ligand (grey line and dots) and the moving average NOE R-factor (red line and dots) over the last 20 (a) and 5 (b) frames are plotted in the output. The moving average window (upper right corner, dashed rectangle) can be set by the user. The dashed line drawn at an NOE R-factor of 0.30 indicates an upper limit for considering a 3D protein-ligand model to be in good agreement with the experimental ligand binding epitope by STD NMR initial slopes. (c-d) Evolution of the NOE R-factor of an individual ligand proton (grey line and dots) and the moving average NOE R-factor (red line and dots) using a window of 20 (c) and 5 (d) frames.

Step-by-step user guide
1. Sign up / Log in
[...] adjusted based on the last atom and residue numbers of the protein. If any atom or residue numbers of the protein and ligand overlap, an error will pop up. To easily adjust atom and residue numbers, we recommend using pdb-tools (https://wenmr.science.uu.nl/pdbtools/).
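The renumbering requirement can be illustrated with the short sketch below, which shifts the ligand atom serial numbers and residue numbers so that they continue after the last values found in the protein file. It is not the pdb-tools interface; the function name `renumber_ligand` is hypothetical, the standard fixed-column PDB layout is assumed (serial in columns 7-11, residue number in columns 23-26), and CONECT records are left untouched.

```python
def renumber_ligand(protein_lines, ligand_lines):
    """Shift ligand atom serials and residue numbers to follow the protein's last ones."""
    atoms = [l for l in protein_lines if l.startswith(("ATOM", "HETATM"))]
    last_serial = max(int(l[6:11]) for l in atoms)
    last_resseq = max(int(l[22:26]) for l in atoms)

    out, res_map, serial = [], {}, last_serial
    for line in ligand_lines:
        if not line.startswith(("ATOM", "HETATM")):
            out.append(line)  # non-coordinate records are passed through unchanged
            continue
        serial += 1
        key = (line[21], line[22:27])  # chain ID + residue number + insertion code
        if key not in res_map:
            res_map[key] = last_resseq + len(res_map) + 1
        out.append(f"{line[:6]}{serial:5d}{line[11:22]}{res_map[key]:4d}{line[26:]}")
    return out

# Hypothetical minimal protein (3 atoms, 2 residues) and ligand (2 atoms, 1 residue)
protein = [f"ATOM  {s:5d}  CA  ALA A{r:4d}".ljust(80) for s, r in [(1, 1), (2, 1), (3, 2)]]
ligand = [f"HETATM{s:5d}  C1  LIG A{r:4d}".ljust(80) for s, r in [(1, 1), (2, 1)]]

for line in renumber_ligand(protein, ligand):
    print(line.rstrip())
```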