Binding pockets in proteins induced by mechanical stress

We report the first observation of a pocket that opens as a result of a mechanical force applied to an Ig-like domain from the cardiac muscle. This previously unseen mechanism of pocket formation is revealed by molecular dynamics simulations under force. Preliminary investigations show that this ‘mechano-pocket’ is potentially druggable and could be found in other domains from the same fold family, suggesting the existence of a general mechanism of pocket formation under mechanical stress. Small molecules tend to bind to pocket-shaped regions of the protein surface. Indeed, shape complementarity can promote ligand binding by optimising protein-ligand interactions and shielding them from solvent. The recent literature has highlighted the dynamic character of pockets, showing that they can significantly change their shape or even open and close as a result of protein dynamics. Different examples of transient or cryptic binding pockets have been described, which could be observed only by taking into account the intrinsic motions of the protein. In this work we explored the possibility that the formation of new pockets can be induced by a mechanical stress. Indeed, proteins can be subjected to mechanical forces during several types of biological processes such as muscle contraction, cell adhesion and cytokinesis. However, most of the previous work has focused on mechanical unfolding, while force-induced structural modifications in folded proteins have not been extensively studied. These modifications can become particularly important if the force applied under physiological conditions is not large enough to induce unfolding but can affect the structure and dynamics of the protein. In particular, force-driven modifications of a binding site can have a dramatic impact on the protein behaviour. For example, the tightening of a binding pocket induced by forces applied to a β-sandwich protein has been shown to be at the basis of force-enhanced catch-bonds involved in cell adhesion. But can mechanical stress create new pockets? To answer this question, we studied the effect of external forces on the molecular surface of an Ig-like domain from the cardiac muscle, namely the C1 domain of human cardiac Myosin Binding Protein-C (MyBP-C). While a complete understanding of its function is still missing, MyBP-C is considered to be a key regulator of muscle contraction. As part of the cardiac sarcomere and possibly crosslinking thin and thick filaments, MyBP-C is considered to be subjected to mechanical strain during muscle contraction. In order to describe possible force-induced modifications of the protein surface with atomistic resolution, we adopted a computational approach. In particular, we monitored the formation of pockets on the molecular surface of the protein during molecular dynamics (MD) simulations performed in the presence of an external force to mimic a mechanical stress (Methods). Force was applied by using a linear potential on the distance between the Cα atoms of the Nand C-terminal residues in constant force (CF) steered MD simulations. Multiple 100-ns trajectories (CFX in Table S1) were generated at increasing force magnitudes X up to 450 pN. Larger forces induced unfolding so they were not further considered (Supplementary Methods). Control simulations without external forces (CF0) were also performed.

Small molecules tend to bind to pocket-shaped regions of the protein surface 1 . Indeed, shape complementarity can promote ligand binding by optimising protein-ligand interactions and shielding them from solvent. The recent literature has highlighted the dynamic character of pockets, showing that they can significantly change their shape or even open and close as a result of protein dynamics 1b . Different examples of transient or cryptic binding pockets have been described, which could be observed only by taking into account the intrinsic motions of the protein 2 . In this work we explored the possibility that the formation of new pockets can be induced by a mechanical stress. Indeed, proteins can be subjected to mechanical forces during several types of biological processes such as muscle contraction 3 , cell adhesion 4 and cytokinesis 5 . However, most of the previous work has focused on mechanical unfolding 6 , while force-induced structural modifications in folded proteins have not been extensively studied. These modifications can become particularly important if the force applied under physiological conditions is not large enough to induce unfolding but can affect the structure and dynamics of the protein. In particu-lar, force-driven modifications of a binding site can have a dramatic impact on the protein behaviour. For example, the tightening of a binding pocket induced by forces applied to a β-sandwich protein has been shown to be at the basis of force-enhanced catch-bonds involved in cell adhesion 4,7 . But can mechanical stress create new pockets?
To answer this question, we studied the effect of external forces on the molecular surface of an Ig-like domain from the cardiac muscle, namely the C1 domain 8 of human cardiac Myosin Binding Protein-C (MyBP-C). While a complete understanding of its function is still missing, MyBP-C is considered to be a key regulator of muscle contraction 9 . As part of the cardiac sarcomere and possibly crosslinking thin and thick filaments, MyBP-C is considered to be subjected to mechanical strain during muscle contraction 10 . In order to describe possible force-induced modifications of the protein surface with atomistic resolution, we adopted a computational approach. In particular, we monitored the formation of pockets on the molecular surface of the protein during molecular dynamics (MD) simulations performed in the presence of an external force to mimic a mechanical stress (Methods). Force was applied by using a linear potential on the distance between the C α atoms of the N-and C-terminal residues in constant force (CF) steered MD simulations. Multiple 100-ns trajectories (CFX in Table S1) were generated at increasing force magnitudes X up to 450 pN. Larger forces induced unfolding so they were not further considered (Supplementary Methods). Control simulations without external forces (CF0) were also performed.  The formation of pockets during the simulations was detected using MDpocket 11 . 3D maps of pocket frequency values were generated (Figures 1 and S1), providing the fraction of time a given point in space is found to be part of a pocket (Supplementary Methods). Different transient pockets were found in the simulations without external force (blue surfaces in Figure 1), with three of them (p1-3) having frequency ≥ 0.3 in all the replicas (blue bars in Figure S2). The two largest pockets (p1 and p3) were also identified in a single-frame analysis of the starting X-ray structure ( Figure S3). Remarkably, a new pocket (pF) appeared in the two simulations with the largest force (CF375 and CF450, pink in Figures 1 and S2). Correspondingly, the volume distributions for pF showed a strong force dependence, while no significant variations were found for the p1-3 distributions across the simulations ( Figure S4). This behaviour was observed in both sets of replicas. The force-dependent pF pocket is sandwiched between the two β-sheets that compose the C1 domain, near the N-terminal end (Figures 1 and S5). The entrance to the pocket ( Figure S5A and Table S2) is composed of residues from the A and G strands as well as the N-terminal tail, while its bottom ( Figure S5B) is formed by mostly hydrophobic residues from strands B (176-178), C (189 and 191), E (222) and F (238-241). The mechanism of formation of pF involves a large rearrangement of the F157 residue in the A strand (orange in Figure 2). Indeed, while in the native structure this resi-due is almost completely buried (Figure 2A), under mechanical stress its side chain is pulled out of the protein core and relocated to the rim of the pF pocket ( Figure  2B). This reorganisation is made possible by breaking selected interactions between the backbone of Nterminus and A-strand residues (D152 to F157) on one side and the side chains of G-strand residues (K246 and D248) on the other ( Figures 3A and S6). In particular, the N-term-K246, L156-D248 and F157-D248 hydrogen bonds, which are stable in the CF0 simulation (blue in Figure 3C), are lost early in the CF450 simulation (orange) due to a force-induced straightening the Nterminal tail ( Figure 3A, structures 1 and 2). This results in a looser structure around F157, which can eventually flip over ( Figure 3A, structures 2 and 3). Interestingly, these changes remain mostly local to the pF region, so that the overall structure is not significantly affected ( Figure S7). In particular, the A'-G hydrogen bonds, which have been found to protect Ig domains from mechanical unfolding 12 , are stable throughout all our simulations ( Figure S7). The progression in the degree of formation of the pF pocket was monitored by using a single geometric parameter or collective variable (CV pF in Figure 3B), which measures the distance between the F157 side chain and the bottom of the pocket (Methods). A good correlation was found between the CV pF values and the probability of finding an open-pF state ( Figure S8). A comparison of the CV pF time evolution shows that CV pF plateaued to values > 0.8 nm in all the high-force simulations (green, orange and red shades in Figure S6). Cor-respondingly, the solvent accessible surface of key residues in the pocket was stably above the value in control CF0 runs for most of the simulation in three out of four cases ( Figure S9), indicating that once it is formed the pocket stays open. An intermediate state (0.6 nm < CV pF < 0.8 nm) was visited in all cases, where F157 is partially detached from the bottom of the pocket ( Figure 3A, structure 2). CV pF was stably below 0.6 nm in both control simulations (blue shades in Figure S6). Even if lower than the values used in unfolding simulations, the force magnitudes necessary to open the pF pocket within the simulated timescale are higher than the estimated physiological forces in the muscle 13 . Observing a possible pocket opening at lower forces would require much longer simulation times. Therefore, we decided to adopt a more efficient way to test if lower forces have an effect on the opening of pF. In this approach, we added a bias to induce the pocket formation in the presence and absence of the force and we compared the energy needed to open the pocket in the two cases. To drive the opening of pF, a harmonic steering potential was applied to the CV pF variable defined above to induce a linear increase of CV pF to a target value of 1 nm (Methods). Two sets of 10 replicas were run, one in the absence (SCV-CF0) and the other in the presence of an external constant force of 187.5 pN (SCV-CF187.5) between the N-and C-terminus. This force magnitude, closer to physiological values, was selected since no pF opening was observed in our CF187.5 simulations performed without the steering potential on CV pF ( Figures  S1 and S2). The formation of the pF pocket, as measured by the solvent accessible surface of representative residues, was observed in all the replicas of both sets (Figures S10 and S11), validating the choice of CV pF as steering variable. The work associated with the CV pF steering potential was calculated for each simulation (Supplementary Methods) and the resulting distributions were compared (Figure 4). The SCV-CF187.5 work values (orange) were significantly lower than the SCV-CF0 ones (blue), with distribution medians of 23 and 70 kJ/mol, respectively. This indicates that the presence of the external force facilitates the opening of the pocket. This is further confirmed by the fact that the strength of the interactions that oppose the pocket opening is strongly correlated with the work values ( Figure S12). The wider SCV-CF187.5 distributions of both work and hydrogen bond occurrence values also suggests that the application of the force is making the structures at the beginning of the steering more heterogeneous ( Figure S12 and Table S3), with some of the structures preserving a partially formed hydrogen bonding network in the pocket region even in the presence of the force.  N-terminal (151-160 region) and G-strand residues (246-250) that are broken during the opening of the pF pocket. The CV pF distance between the centre of mass of the F157 side chain (orange sphere) and the centre of mass of the residues at the bottom of the pF pocket (green sphere) is represented as a green dashed line. (B) Time evolution of the CV pF variable for CF0 (blue) and CF450 (orange) simulations. (C) Hydrogen-bond existence maps for CF0 (blue) and CF450 (orange) simulations. The plot show the presence (colour) or absence (white) of at least one hydrogen bond between the specified residues for each simulation frame. N-term indicates residues 151 to 155. In order to test the druggability of the pF pocket, the fragment-based method FTMap 14 was used to detect ligand-binding hotspots on the surface of the domain. FTMap uses small-molecule probes to identify the regions of a protein surface that are more likely to bind drug-like ligands. In particular, regions binding different probes are considered as binding hotspots, with binding affinity proportional to the number of low-energy binding poses of the probes (probe clusters). When FTMap was run on open-pF conformations (Supplementary Methods), a hotspot was found in the pF pocket in 3 out of 4 analysed structures ( Figure 2C and Table S4). All the 16 probe molecules were found to be able to bind pF in at least one binding pose. In two of the structures ( Figure S13), the number of probe clusters in the pF pocket was equal to or over the FTMap druggability threshold (16) 14 , with another possible partner hotspot nearby (grey sticks in Figure S13) that could be used to design larger compounds targeting both hotspots at the same time. In the pF pocket, probes were found to form several polar contacts (cyan dashed lines) with the nearby polar and charged side chains (in particular D248 and S250) and backbone atoms (V158, M159 and R177). Our findings thus indicate the existence of a potentially druggable binding pocket that opens when the C1 domain of MyBP-C is subjected to a mechanical stress. MyBP-C is a multi-domain protein composed by different Ig-like and fibronectin domains. Similarly to titin, single molecule AFM experiments suggest a mechanical hierarchy among the different regions of MyBP-C 10, 15 . While it is not possible to determine the mechanical stability of single domains of MyBP-C based on the currently available experimental data, a comparison of force-extension curves of C1-C2 fragments with the full length ones shows that their mechanical stability is lower than the protein average 10 . This suggests that, under physiological forces, C1 might be among the first domains to be subjected to mechanical strain in the protein.
Moreover, the modulation of the mechanical extensibil-ity of the C1-C2 linker by phosphorylation has been suggested to regulate the binding of MyBP-C to actin and/or myosin 15b , indicating that the mechanical properties of this region of the protein may have a strong functional role. A natural question is whether the behaviour observed here might be shared by other proteins with the same fold (immunoglobulin I-set domains). As detailed above, the opening of the pF pocket occurs when selected interactions are broken and the bulky hydrophobic amino acid F157 is pulled out of the pocket. An analysis of the Iset sequence alignment (Pfam ID: PF07679, 66,871 sequences) shows that the position corresponding to F157 in MyBP-C C1 (X157) is occupied by another hydrophobic residue in almost all the non-gap cases (99.9%), with F (64.6%) and I (25.4%) being the most represented (Table S5). Inspection of selected 3D structures (Supplementary Methods) shows that the X157 residue is always buried, with a residue relative solvent accessibility of 10 ± 6% (Table S6). These data support the idea that mechano-sensing binding pockets could be found also in other domains of the I-set family. Moreover, the low level of sequence conservation of the charged and polar residues lining the pF pocket ( Figure S14) indicates that it might be possible to design small molecules that can selectively target specific domains even if the pocket opening mechanism in the family is similar. The variability in the pocket composition is reflected by the fraction of polar and charged residues lining the pocket, which ranges from 28 to 63% across the different proteins of the Pfam3D subset (Supplementary Methods), with an average of 45%. It is important to note that the mechanical stability of Iglike domains can be affected by the formation of disulphide bonds, which can have a different effect according to their position in the domain 16 . A direct influence of disulphide bonds on the pocket formation mechanism observed here seems unlikely, due to the very low occurrence of co-localised cysteine residues that could staple together the A-and G-strand regions around the pocket (Supplementary Methods). However, their formation in other parts of the protein could contribute to the modulation of hydrogen bonding interactions in the pocket region via long range effects. The present findings imply that when assessing the druggability of a protein that can be subject to mechanical stress under physiological conditions, it is necessary to take into account possible effects of a mechanical force on the molecular surface of the protein. Indeed, the presence of 'mechano-pockets' could be exploited to target proteins previously considered undruggable or to target them under specific conditions of mechanical stress. Further studies focused on a quantitative determination of the kinetics of pocket opening and ligand binding will be needed to determine the binding mechanism 1b, 17 and possible effects of the timedependence of the mechanical stress during muscle contraction. Our results could be used to open new avenues in drug design, in particular for the treatment of cardiac and skeletal muscle diseases. Indeed, several pathogenic mutations associated with inherited myopathies and cardiomyopathies have been found in sarcomeric cytoskeletal proteins, which are rich in Ig-like domains 18 . The MyBP-C C1 domain itself contains different hypertrophic cardiomyopathy mutations 19 . In some of these cases, targeting the damaged mutant with a drug could rescue the protein function 20 , following strategies similar to those recently applied for the reactivation of cancer mutants of the tumour suppressor protein p53 2a . Drugs that target the mechanical pocket could be used to increase the stability of the protein when this has been compromised by the mutation. Moreover, the fact that the pocket opens only in the presence of a force could be exploited to develop selective drugs that target proteins under specific conditions of mechanical stress.

METHODS
Methods and protocols are summarised here, full details can be found in the Supplementary Methods.

General MD simulation parameters
The X-ray structure of the C1 domain of human cardiac MyBP-C (PDB ID: 3CX2, UniProt ID: Q14896) was used to generate the initial system. All molecular dynamics (MD) simulations were performed using GROMACS 4.6.7 21 . Periodic boundary conditions were applied, using the Particle Mesh Ewald method for the calculation of electrostatic interactions. The protein was described using the AMBER99SB*-ILDN force field 22 and the TIP3P 23 model was used for water.

Constant force simulations (CFX)
The protein structure was placed in a rectangular box, with the vector connecting the N-and C-terminal Cα atoms aligned along the x axis and a minimum distance of 1.0 nm between the protein and the walls of the box. The box was then extended by 1.2 nm along the x axis to allow for possible structural modifications due to the force. The protein was solvated with TIP3P water mole-cules. After temperature and pressure equilibration (Supplementary Methods), a constant force was applied for 100 ns using the pull code implemented in GROMACS 24 . The position of the N-terminal Cα atom (D151) was restrained using a harmonic potential with a force constant of 2500 kJ/mol/nm 2 . The external force was applied using a linear potential along the x component of the distance vector between the C-terminal (E258) and N-terminal (D151) Cα atoms. Force magnitudes ranged from 187.5 to 450 pN (Table S1). Two replicas were run for each force magnitude to check for reproducibility. Control simulations were run at 0 pN. Constant force simulations with steering of CV pF (SCV-CFX) SCV-CFX simulations were performed by combining the application of an external force with the steering of the collective variable CV pF to induce the opening of the pF pocket. The CV pF variable was defined as the distance from the centre of mass of the F157 side-chain to the bottom of the pocket (centre of mass of the C239, E240, V241, D248, C249 and S250 Cα atoms). A harmonic steering potential was applied on CV pF . The reference value in the potential was linearly increased from the starting value of CV pF to a target value of 1.0 nm in 45 ns, using a force constant of 5000 kJ/mol/nm 2 . The restraint was then kept fixed at 1.0 nm for 5 ns. Two sets of 10 replicas were performed (Table S1), one in the absence (SCV-CF0) and the other in the presence (SCV-CF187.5) of an external force, for a total of 1 µs of simulation time. All SCV-CFX simulations were performed with the PLUMED 2.2.5 plugin 25 coupled with GROMACS 4.6.7.

Analysis
The opening of binding pockets during the simulations was detected using the MDpocket 11 software (standalone version). Ligand-binding hotspots were identified running the FTMap web server 14 on selected protein structures. Secondary structures and solvent accessible surfaces were determined with DSSP 26 . The sequence analysis of the I-set family of Ig domains was performed on a multiple sequence alignment from Pfam 31.0 27 (Pfam ID: PF07679). A subset composed by I-set sequences with known 3D structures was generated using the PDB/Pfam cross-mapping available from the Pfam website (Supplementary Methods).

ASSOCIATED CONTENT Supporting Information.
The Supporting Information is available free of charge on the ACS Publications website.
Experimental procedures, tables and figures reporting number and type of simulations, structural and sequence details of the pockets, evolution of the secondary structure of the protein and of the solvent accessible surface of key residues, FTMap hotspots, correlations between work values and hydrogen bond interactions. (PDF).