Shedding Light on the Inhibitory Mechanisms of SARS-CoV-1/CoV-2 Spike Proteins by ACE2-Designed Peptides

Angiotensin-converting enzyme 2 (ACE2) is the host cellular receptor that locks onto the surface spike protein of the 2002 SARS coronavirus (SARS-CoV-1) and of the novel, highly transmissible and deadly 2019 SARS-CoV-2, responsible for the COVID-19 pandemic. One strategy to avoid the virus infection is to design peptides by extracting the human ACE2 peptidase domain α1-helix, which would bind to the coronavirus surface protein, preventing the virus entry into the host cells. The natural α1-helix peptide has a stronger affinity to SARS-CoV-2 than to SARS-CoV-1. Another peptide was designed by joining α1 with the second portion of ACE2 that is far in the peptidase sequence yet grafted in the spike protein interface with ACE2. Previous studies have shown that, among several α1-based peptides, the hybrid peptidic scaffold is the one with the highest/strongest affinity for SARS-CoV-1, which is comparable to the full-length ACE2 affinity. In this work, binding and folding dynamics of the natural and designed ACE2-based peptides were simulated by the well-known coarse-grained structure-based model, with the computed thermodynamic quantities correlating with the experimental binding affinity data. Furthermore, theoretical kinetic analysis of native contact formation revealed the distinction between these processes in the presence of the different binding partners SARS-CoV-1 and SARS-CoV-2 spike domains. Additionally, our results indicate the existence of a two-state folding mechanism for the designed peptide en route to bind to the spike proteins, in contrast to a downhill mechanism for the natural α1-helix peptides. The presented low-cost simulation protocol demonstrated its efficiency in evaluating binding affinities and identifying the mechanisms involved in the neutralization of spike-ACE2 interaction by designed peptides. Finally, the protocol can be used as a computer-based screening of more potent designed peptides by experimentalists searching for new therapeutics against COVID-19.


■ INTRODUCTION
Coronaviruses (CoV) represent a diverse family of singlestranded, positive-sense RNA viruses capable of infecting humans and animals. 1 Historically, coronaviruses have infected humans causing mild cold; however, highly pathogenic strains have emerged over the past two decades. 2−5 In 2002−2003, severe acute respiratory syndrome coronavirus (SARS-CoV-1) infected over 8000 people worldwide, with approximately 800 deaths. In 2012, Middle East respiratory syndrome coronavirus (MERS-CoV), first identified in Saudi Arabia, caused 2500 confirmed cases and a death rate of 36%. 5 Even though a new outbreak of SARS-CoV-1 was expected by 2004, SARS-CoV-1 remained absent from human circulation until December 2019. Compared to SARS-CoV-1 and MERS-CoV, the new strain of coronaviruses (SARS-CoV-2, previously known as 2019-nCoV) spreads more efficiently. According to the World Health Organization (WHO) report, by January 24th of 2021, more than 97.4 million people have been infected worldwide, including more than 2.1 million deaths. Sequencing of the SARS-CoV-2 genome indicates that the 30 kb RNA encodes four structural proteins: spike (S), envelope (E), membrane (M), and nucleocapsid (N); 16 nonstructural proteins (Nsp1-16); and 9 putative accessory factors. 6,7 The trimeric glycosylated spike protein is the key used by both SARS-CoV-1 and SARS-CoV-2 viruses to enter host cells. During viral infection, the trimeric protein is cleaved into S1 and S2 subunits. S1 contains the receptor-binding domain (RBD), which recognizes the peptidase domain (PD) of angiotensinconverting enzyme 2 (ACE2), and S2 is responsible for membrane fusion 3,8,9 ( Figures 1A,B, and S1).
Theoretical models serve as a powerful predictive approach in aiding the design of new treatments to block viral infection. 10 In this sense, we propose that models with the foundation in the energy landscape theory might contribute to attack this problem. The energy landscape theory was initially developed to study protein folding 11−14 and it has been a valuable framework in revealing many biological mechanisms related to folding and many other biomolecular functions, such as functional conformation dynamics, domain swapping, and binding mechanisms. 15−24 The structural formation of biomolecules has been described in this framework as a complex process that takes place in a multidimensional configurational space. 25 Folding, binding, and conformational rearrangements are depicted in terms of diffusion of configurational search down a funnel-like energy landscape, in direction to the minimum energy structure. 26,27 Also, natural protein sequences with funnel-like landscapes have energetic roughness that is relatively small compared to the global native state and the funnel slope is steep enough to overcome the local traps imposed by the roughness so that the biological function occurs. 11 Biological function is controlled by the association of biomolecules, an essential event for cellular signaling and communication. Understanding the mechanisms involved in molecular recognition processes and the correspondent proteins' roles are key targets for therapeutic interventions. 28 The classical binding mechanism called "lock-and-key" treats two binding biomolecules as rigid entities in their docking process. 29 Proteins are rather flexible, involving conformational changes; therefore, two other coupling mechanisms can be proposed. One is the "conformational selection", where a conformational change of one binding partner occurs before binding to the other. The second is the "induced-fit", where the conformational change occurs after binding. 30 These two flexible mechanisms are reasonable for small molecules that dock to proteins on a time-scale faster than the protein conformational change. However, this time-scale separation is not always clear for protein−protein or protein−peptide binding, which is induced during folding events of at least one of the involved biomolecules. 31,32 The "binding-induced folding" association mechanisms can be cooperative"coupled binding−folding" or noncoopera-tive"binding prior folding" and "binding after folding". 32 These binding−folding scenarios have been more investigated in the association of intrinsically disordered proteins (IDPs) in which their global conformational change (or folding) is always associated with their partners. 33 The coupling kinetics of binding and folding has been studied by experiments, demonstrating a broad range of binding−folding scenarios. 34−47 In this manuscript, natural and designed peptide binders known for having inhibitory action against coronavirus were selected by blocking the binding of spike-RBD to ACE2-PD ( Figure 1C). Spike protein/gene is the preferred target site in SARS/MERS vaccine development, with many patents already deposited. 48 The objective is to determine the structural binding mechanisms and affinities of inhibitory peptides by characterizing their thermodynamic and kinetic profiles. This strategy could save lab work by computationally selecting new designs of peptides with a stronger affinity to SARS-CoV-2 than SARS-CoV-1. Also, virus surveillance will take place after the first generation of vaccines and drugs and there will be a continuous search for upgrades. Thus, new tools that are computationally less expensive can come in hand to select potential candidates to inhibit viral entry into human cells. At the beginning of the COVID-19 outbreak, many studies were performed to understand the structural aspects of the interaction between the virus and potential inhibitors and antibodies of spike-mediated infection 4,49−60 in a manner similar to the first SARS-CoV epidemic. 61−66 However, the dynamic aspects must be taken into account when designing new biomolecules, 4,66−69 and traditional molecular dynamics might be computationally expensive to screen a combinatorial variation of even small peptides, yet efficient to energyminimized designed structures. 67,68,70−75 In this perspective, here we present a relatively low demand protocol to computationally characterize the difference in the binding thermodynamics and mechanisms of the peptides designed to inhibit the surface spike proteins of the old and new coronaviruses. The protocol is based on structure-based model (SBM) simulations, which has its foundation in the energy landscape theory, 76 already applied to study the spike prefusion dynamics. 77 ■ RESULTS AND DISCUSSION Selected Peptide Inhibitors have Increased Hydrogen Bonds. The spike protein located on the surface of the coronavirus binds to the cellular receptor ACE2 of the host to deliver the virus genetic code. 78 Peptide binders are suitable therapeutics to interrupt the spike/ACE2 interaction by specifically binding to the interface region spike-RBD/ACE2-PD, preventing human infection. 79−82 Also, synthetic small biomolecules, such as aptamers of nucleic acid, bioconjugates of peptides, and polysaccharides, have been studied as protein blockers for coronavirus and other virus-related diseases. 56,83−87 Reports have shown that viral spike protein subunit vaccines produced higher neutralizing antibody titers and more complete protection than live-attenuated SARS-CoV-1, full-length spike protein, and DNA-based spike protein vaccines. 48 Protein vaccines targeting spike-RBD account for half of the vaccine patents.
Nanotechnology-enabled approaches like nanoparticles conjugated with the protein blockers enhance specificity/ efficiency when used as delivery systems to inactivate SARS-CoV-2 in patients. 88,89 The computational design of peptides can contribute as an important step for speeding up n a n o m e d i c i n e -b a s e d a p p r o a c h e s f o r C O V I D -19, 55,68,71−73,90−93 and characterizing their dynamics is a natural next step for the selection of potential candidates. For these reasons, combining in silico with experimental analyses narrows the large possible candidates among thousands of pharmaceutically active substances to be later employed in clinical trials. 88 Three peptides constructed from the N-terminal amino acid residues of ACE2-PD were selected: two natural segments of α 1 -helix (residues 22−44 and 22−57) and another peptide comprised of α 1 -helix (residues 22−44) linked by a glycine (G) with the loop segment 351−357; hereafter coined as peptide inhibitors 1, 2, and 3, respectively. The two natural peptides 1 and 2 exhibited a modest antiviral activity for SARS-CoV-1 RBD to human ACE2 with IC 50 of about 50 and 6 μM, respectively, implying that 2 had a stronger affinity to RBD. 94 The artificial peptide 3 exhibited an even more potent antiviral activity of about 0.1 μM, implying that it had a much stronger binding affinity for SARS-CoV-1 than peptides 1 and 2. Related to the novel coronavirus, it was reported that the natural fragment composed of residues 21−43 of the same human ACE2 α 1 -helix can strongly bind to SARS-CoV-2 RBD with a micromolar affinity (dissociation constant, K D = 1.3 μM) 68 that is comparable to the full-length ACE2 binding to SARS-CoV-2 RBD. 4 It was hypothesized by Huang et al. that the artificial peptide 3 may also bind stronger than peptide 21−43 (similar to peptide 1) due to the high similarity among SARS-CoV-1 and 2 RBD interfaces with ACE2-PD 55 ( Figures  S1 and S2).
Peptides 1 and 2 binding modes with SARS-CoV-1 and 2 RBDs were constructed from the complex structures deposited in the Protein Data Bank to be used in this work (PDB codes 2ajf 61 and 6m0j, 51 respectively). Peptide 3 bound to SARS-CoV-2 RBD was obtained from the computational design of Huang et al., 55 and CoV1+3 was prepared by superposition to the CoV2+3 and to the full-length ACE2-PD, followed by local energy minimization. Figure 2 shows the structures of the peptides bound to the spike-RBMs, representing the protein− peptide binding simulations of this study. Also, the sequence alignment of both spike-RBDs is presented in Figure S1, where the high similarity of the sequences is evident. Figure S1 also shows the sequence similarity among the three peptide inhibitors, residue binding contacts, and secondary structures.
Interfaces and interactions formed by SARS-CoV-1/SARS-CoV-2 and the inhibitors were analyzed using the protein interfaces, surfaces, and assemblies (PISA) server of the European Bioinformatics Institute (EBI), which predicts the most likely interfaces, based on size and energetic considerations. 95 Analysis of the complex's interfaces shows that SARS-CoV-1 and inhibitor 1 form a weak complex with only two hydrogen bonds (Y41/T486 and D38/Y436). This result agrees with our MD simulations, where the complex SARS-CoV-1 + 1 shows high flexibility. Nevertheless, Figure 2A shows that the addition of extra amino acids increases the binding efficiency of the inhibitors 2 and 3 toward the SARS-CoV-1 spike protein. The addition of 13 amino acids leads to the appearance of three extra hydrogen bonds in the SARS-CoV-1 + 2 complex (2 × Q42/Y436 and T487/Y41). Intriguingly, the inhibitor formed by the synthetic peptide of ACE2 boosted the number of hydrogen bonds to 7 (T41/ Y486, K353/Y481, K353/G482, K353/G488, E37/Y491, and D38/Y436). It may be recalled that the majority of these interactions are also present in the crystallographic structure of the complex formed by SARS-CoV-1-RBD/ACE2. 54 It is worth mentioning that our analysis is in agreement with the experiments performed by Han and col., where increasing the size of the peptide caused an increase in the hydrogen bonds. 94 Examination of Figure 2B indicates that SARS-CoV-2 binds all of the inhibitors with a higher number of hydrogen bonds than SARS-CoV-1. Furthermore, inhibitors 1 and 2 showed a similar interaction mode toward the RBD-ACE2 domain, presenting 9 hydrogen bonds (Q24/A475, Q42/Y449, Q42/ G446, D30/K417, E35/Q493, E37/Y505, D38/Y449, D38/ Q498, and D38/Y449) and a salt bridge formed by residues D30 and K417. Here, it is important to highlight the hydrogen bond and salt bridge formed by residues D30 and K417, which have been featured as one of the most important for the affinity improvement of the viral protein toward human ACE2. 54 Similar to SARS-CoV-1, the number of hydrogen bonds suggests that the synthetic peptide binds the SARS-CoV-2 protein with the highest number of hydrogen bonds ( Figure  2B, right corner). The PISA web server identified 13 hydrogen bonds in the interface of the complex SARS-CoV-2 + 3 (Q24/ A475, Y41/T500, Y41/N501, Q42/G446, Q42/Y449, D30/ K417, E35/Q493, E37/Y505, D38/Q498, D38/Y449, K353/ Q498, K353/G498, and K353/G502) and a salt bridge formed by D30 and K417, suggesting that sequence 3 might form the most stable complex and the most efficient inhibitor among the three sequences studied here. The designed 3 binding sites overlap with the binding sites of the de novo miniprotein inhibitors that showed picomolar to nanomolar affinity and blocked SARS-CoV-2 infection. 82 Local frustration in the structures of SARS-CoV-1 and 2 complexed with the inhibitors 1, 2, and 3 was recognized using the Frustratometer Web Server. 96,97 Mutational and configurational frustration indexes were calculated for each molecular complex. Table 1 shows the percentage of minimally, neutral, and highly frustrated interface contacts between SARS-CoV-1 and 2 and the inhibitors 1, 2, and 3. The complexes formed by SARS-CoV-1 and 2 with the inhibitor 3 presented a high increase in the number of interface contacts when compared to the inhibitor 1. A slight increase could also be seen in the number of interface contacts from the complexes formed by SARS-CoV-1 and 2 with the inhibitor 2. Furthermore, a lower percentage of the interface contacts between SARS-CoV-1 and 2 (specially SARS-CoV-2) and the inhibitor 3 was inclined to be highly frustrated, unfavorable, in comparison with the other complexes (see Figures S3 and S4).
Designed Peptide Folds via Two-State Mechanism after Binding to Spike-RBDs. Protein folding and binding are intertwined processes closely connected to the cellular function. Both processes are similar at the thermodynamic level: the dynamics are mapped as diffusion over an ensemble of partially structured states, accompanied by a simultaneous reduction of the configurational entropy and the free energy due to solvent exclusion, hydrogen bonds, and salt bridge formations. 98 In addition, folding/binding kinetics are influenced by the thermodynamic free-energy barrier that separates the unfolded/unbound and the folded/bound states. 99 The thermodynamic barriers for the real proteins were reasonably described by SBMs in the α-carbon (C α ) representation of the protein. 100,101 However, side-chain packing can play an important role in the native contact interactions presented in the transition state of coupled protein folding and binding reaction, 37 and this can be achieved computationally when all atoms (except nonpolar hydrogen) are included in the pure C α -SBM. 102,103 In either C α and allatom cases, and whether or not it is a Go ̅ -model, protein topology and native contacts determine binding mechanisms in coupled binding−folding that are reflected by funneled energy landscapes, even if intermediates occur. 100,103−105 The folding process of the selected peptides, coupled to the binding on the SARS-CoV-1 and 2 spike protein RBDs (CoV1 and CoV2, respectively), was simulated to characterize their thermodynamic quantities. The SBM, with all heavy (nonhydrogen) atom representation, was chosen to represent the system topology with default parameters. Figure S5A shows the residue contact map of three out of six simulated protein dimer interfaces CoV1+2, CoV2+1, and CoV2+3, showing just the intermolecular contacts between two chains in the native structure defined as Q bind . For each dimer, the model also accounts for the number of intramolecular contacts of the spike-RBD (Q spike ) and the peptide (Q peptide ). Q total is the sum of the intramolecular and intermolecular native contacts (Q spike , Q peptide , and Q bind ) of each protein−peptide simulated system.
The computational SBM implemented here does not have a direct connection with the real temperatures. However, differences between quantities extracted from the theoretical temperatures have been proved to be proportional to the experimental temperatures, such as the Φ-value analysis that is based on the free-energy variation or folding rates of mutant/ wild-type proteins. 106 The SBM is a topology-based model; thus, it is assumed that slight variations on the reduced temperature around the vicinity of the C v (T) up/down shape do not abruptly change the dimerization mechanism. Also, as experimentalists choose the quantity half-maximal inhibitory concentration (IC 50 ) to compare different inhibitory assays, we . Free energy (F) of simulated ACE2 peptide binding to SARS-CoV-1 (red) and SARS-CoV-2 (orange) spike-RBD proteins. Onedimensional F profiles are shown as a function of reaction coordinates: a fraction of native (A) peptide−protein chain interface contacts (Q bind ), (B) inter plus intrachain contacts (Q total ), and (C) peptide contacts (Q peptide ). (D, E) show two-dimensional F profiles as a function of native contacts Q bind and Q peptide . In (A), the designed peptide ACE22-44G351-357 (inhibitor 3) binds to the SARS-CoV-2 protein with a higher F barrier when compared with the α 1 -helix peptide ACE22-57 (peptide 2) binding to the SARS-CoV-1 spike domain. (C) shows that fragment 3 folds through a two-state mechanism upon binding to SARS-CoV-2 RBD, in contrast to a one-state (downhill) folding mechanism of 2 binding to the CoV1 surface. In addition, the coupled binding−folding processes are more cooperative to the designed 3 (E) than the natural 2 (D), despite the fact that the crossing barrier for 3 is higher and bumpier. F profiles are in units of k B T bind , with T bind being the respective system's binding temperature.
Journal of Chemical Information and Modeling pubs.acs.org/jcim Article choose T = T bind as the temperature to compare all of the simulated dimers. T bind was defined as the temperature where C v is maximum for each system that separates the transition between the states on (bound, low temperatures) and off (unbound, high temperatures). At this temperature, bound/ unbound states are maximally sampled and the computed statistical quantities are better estimated. Each dimer presented a broad yet a defined peak in the specific heat curve as a function of temperature (C v (T) in Figure S5B), determined by the thermodynamic analysis of the full set of simulated runs. In general, protein−peptide dimers of CoV1 showed higher T bind than the respective peptide binding and folding to CoV2, although they are comparable. Another thermodynamic result is related to the designed fragment 3 that showed a higher peak in C v for both coronavirus spike proteins compared to the natural peptides. This could be related to the insertion of the coil region on the spike-RBMs in addition to the α-helix, which may increase the energy/temperature necessary for transition.
The thermodynamic analysis of the free-energy profiles (F) as a function of selected reaction coordinates at the binding transition temperature (T bind ) was also computed. To improve the manuscript clarity, analyses were focused on the comparison of one natural with the designed peptide in complex with CoV1 and CoV2 and CoV1+2 and CoV2+3, respectively. The results of all of the six peptide complexes are presented in the Supporting Information section for the sake of completeness. Peptide 2, the longest selected α-helix peptide, was chosen to be compared with fragment 3 and complexed with CoV1 and CoV2. Their implications can also be discussed along with the different coronavirus spike proteins. CoV1 and CoV2 are very similar protein homologues in the sequence and structure. Inhibitors 2 (natural peptide) and 3 (designed fragment) differ after residue Ser44 in the C-terminal portion of the human ACE2-based peptides (Figures S1 and S2). Figure 3 shows F as a function of reaction coordinates based on native contacts (Qs) for the selected protein−inhibitor dimers CoV1+2 and CoV2+3. F as a function of Qs and position-based reaction coordinates radius of gyration (R g ), root-mean-square deviation (rmsd), and distance between the center of mass of the chains (d) for all of the simulated protein dimers are presented in Figure S6 for completion. The defined fraction of native binding contacts (Q bind ) was monitored for the coronavirus spike protein dimers, and the two-state on/off free-energy profiles are shown in Figure 3A for CoV1+2 and CoV2+3 and in Figure S6E for all dimers at T bind . A broad bound state (around Q bind = 0.7) is separated from a sharp unbound state (at Q bind = 0) by an F barrier around Q bind = 0.1. The natural peptide complex CoV1+2 presented the lowest binding F barrier in Q bind of around 9k B T compared with the other complexes that resulted in a barrier between 12 to 15k B T. The peptides complexed with CoV2 presented, on average, a slightly higher binding F(Q bind ) barrier in relation to the CoV1 complexes. The same is also true for F profiles as a function of Q total ( Figure 3B), Q peptide ( Figure 3C), and the position-based reaction coordinates ( Figure S6, right panels). In Figure 3B, the majority of the native contacts of Q total are related to the spike-RBDs (Q spike ), over 1600, in contrast to the total number of binding contacts of the two chains and the intramolecular contacts of the peptides (Q bind and Q peptide , respectively), both in the order of 100. Thus, in Figure 3B, Q spike is oscillating in its folded state and this is the reason why the bound−unbound behavior is presented for values higher than Q total = 0.8. The two states of F in Figure 3B contain the unbound/unfolded state of the peptide on the left well and the bound/folded state on the right well, representing that dimer association and peptide/spike folding occur without distinction, in contrast to Figure 3A,C, where binding and folding are dissociated by the coordinates. Higher F barriers for the dimers were found along with the reaction coordinate Q bind (9−15k B T), suggesting that over one-dimensional landscapes, the rate-limiting step in RBD-peptide dimerization is the formation of native binding contacts among the chains interface at the transition state of this coordinate. The dimerization rate limit is followed by the position-based coordinates (4−12k B T in Figure S6).
It is interesting to observe in Figures 3C and S6A that during the dimerization processes, two distinct folding scenarios emerged for the peptides over F along their one-dimensional folding reaction coordinate: one-state (downhill) and two-state peptide folding mechanisms. It can be observed that natural peptides fold via a completely downhill scenario, i.e., there is no thermodynamic barrier in F(Q peptide ). The natural peptides 1 and 2 manifested a downhill-like mechanism as they bind to CoV1 and CoV2, except for CoV2+1 that showed a small signal of two-state behavior in the coupled folding−binding two-dimensional landscapes, discussed in the next section. The downhill-like switches to a two-state folding scenario as the joined peptide 3 binds to CoV1 and CoV2 RBDs ( Figure S7A) with a folding barrier of 1−2k B T at the raised transition state (Q peptide ≈ 0.5) between the unfolded (Q peptide ≈ 0.3) and folded (Q peptide ≈ 0.7) states. Insertion of the loop sequence G351−357 on the natural peptide 1 to form the designed 3 created many additional inter and intrachain contacts (see Figure S5A), which may lead to this two-state folder. Enlarging the natural peptide from residues 22−44 to 22−57 only increases the size of the α-helix peptide and did not substantially increase the binding native contacts of the dimers as the addition of loop G351−357 ( Figure S5A) did, and this could be one of the reasons of this change in the folding scenario. Another possibility could be that the information of peptide folding might not only be on the peptide sequence but also templated on the sequence of the binding partner as coupling proceeds, 107 although the partner sequence changes from SARS-CoV-1 to SARS-CoV-2 spike-RBD without affecting protein topology. This is a common feature of IDPs, and a large fraction of proteins undergo a disorder-toorder transition when recognized (templated) by their physiological partners. 107 However, some IDPs have their folding−binding pathways encoded within their sequence and are not templated by partner proteins with the same topology. 108 The one-dimensional F profiles in Q bind and Q peptide were combined to produce the higher two-dimensional F for the dimers CoV1+2 and CoV2+3, shown in Figure 3D,E, respectively. F(Q bind , Q peptide ) represents the coupled binding−folding pathways that a given dimer causes during the experiments. As shown in Figure 3D,E, projections over these two-dimensions also raised one thermodynamic barrier (transition state) for each dimer dynamics, with CoV2+3 having a higher and rougher energetic barrier than CoV1+2. The recovered energetic barriers were on the same order of magnitude as the one-dimensional F barrier in Q bind . In addition, the one-and two-state folding aspects of the two peptides are presented in the F(Q bind , Q peptide ) projection on Q bind . In Figure 3D, the downhill-folder natural peptide binds and folds almost without distinction of the coupled dynamics.
Journal of Chemical Information and Modeling pubs.acs.org/jcim Article In contrast, Figure 3E shows that fragment 3 binds to spike after folding, recovering a two-state kind of coupled mechanism. 3 binding−folding pathways are more cooperative, although there is a higher and funneled energetic barrier to cross. Once bound, the natural peptide can fold and unfold more freely than the fragment, an aspect that will be thoroughly discussed in the next section. Highly dynamic "fuzzy" complexes, indicating a malleable binding−folding pathway, were previously reported for IDPs complexed with stable proteins by the experiments temperature-jump 22 and multinuclear relaxation dispersion NMR. 40 For another IDPprotein dimer, it was reported that a free-energy barrier shows up separating the self-assembly process. 104 Two-State Designed Peptide is Less Flexible on the Spike-RBMs. The coupled dynamics of peptide folding and binding to spike-RBD was investigated by analyzing two-dimensional free-energy profiles. The binding was captured as the one-dimensional F(Q bind ), as shown in Figure 3A. Folding was captured by Q peptide in Figure 3C. The composition of Q bind and Q peptide to form a two-dimensional F resulted in a landscape with a not so clear valley at the unbound state since this state has 0 intersubunit contacts ( Figure 3A), resulting in a sharp valley located at the boundary of Figure 3 (right panel, not clear to inspect) and another broad valley at the center, as previously described in many works. 21,103 After combining all of the analyzed reaction coordinates of Figure S6, it was found that the combinations presented in Figures 4, S7, and S8 presented two broader valleys at the bound and unbound states. These combinations enabled more binding−folding mechanism analyses in the course of the sampled states. Figure 4 shows two-dimensional free-energy landscapes for the two protein−peptide dimers CoV1+2 and CoV2+3 at the In the right panel, the rmsf for the designed peptide in the encounter complex (E) state is also shown (dash line). Representative simulated structures in these states (gray) are aligned with the native dimer structure (colored). F profiles are shown as a function of reaction coordinates of the native (B) peptide (Q peptide ) and protein (Q spike ) contacts and (C) inter plus intramolecular contacts (Q total ) and Q peptide . The designed peptide ACE22-44G351−357 (inhibitor 3) folds via a twostate mechanism (monitored by Q peptide ) upon binding to the SARS-CoV-2 RBD, which is oscillating in the protein native state (monitored by Q spike ); in contrast to a downhill folding mechanism of ACE22-57 (inhibitor 2) binding to the SARS-CoV-1 protein. Also, the rmsf of CoV1+2 is higher than CoV2+3. F profiles are in units of k B T bind , with T bind being the respective system binding temperature.
Journal of Chemical Information and Modeling pubs.acs.org/jcim Article defined transition temperature, T bind . Figure 4B presents F as a function of the reaction coordinates Q peptide and Q spike , clearly displaying one valley for CoV1+2 and two wells for CoV2+3. Projection of F in Q spike ( Figure S6C) shows one defined valley in F for both systems with the fluctuations and sampling along Q spike of the protein monomer being more rigid in CoV2. This is confirmed by the root-mean-square fluctuation (rmsf) analyses of the simulated protein dimers in Figure S10. Figure  S10 shows the rmsf of each residue, according to the primary structure of CoV1/CoV2 proteins and the ACE2-based peptides. Interestingly, the overall flexibility of the complex CoV1-2 is the highest compared to the complexes formed by CoV1 with inhibitors 1 and 3 and the three CoV2 dimers. These differences are even more evident in Figure S10D, where the rmsfs are incorporated in the complex structure as tube radius. The binding of the three inhibitors 1, 2, and 3 makes the overall structure of CoV2 spike-RBD less flexible, which is an indication of a tighter binding toward the novel coronavirus protein. In addition, peptide fluctuations are higher in CoV1 than in CoV2, as represented by the peptide rmsfs in Figures 4A and S10. This rigidity of the SARS-CoV-2 spike receptor-binding motif has already been reported by simulations and experiments, and it was hypothesized as one of the reasons for its enhanced infectivity. 59,60,67,74 In Figure 4B, it is the projection of F in Q peptide that differentiates the one-to-two wells in F(Q spike , Q peptide ) landscapes between the two dimers. The downhill and twostate folding mechanisms of the peptides along the Q peptide coordinate are again evident for CoV1+2 and CoV2+3, respectively. For the two-state fragment folder in CoV2+3, there is an F barrier separating the two valleys on the order of 2k B T, which is the double of the free-energy barrier of the onedimensional F(Q peptide ) ( Figure 3C). In this high order projection of F in Figure 4B, conformational rearrangements of the spike protein along Q spike are also taken into account combined with folding in Q peptide to overcome the barrier. Figure 4 shows one valley marked with the letter "N" standing for the "Native" state in which both protein and peptide are folded and bound. "E" denotes the "Encounter" complex state in which the dimer can be partially bound or unbound, the protein can be folded yet slightly loose and open, and the peptide is unfolded but partially structured ( Figure 4A). This two-state dimerization mechanism emerges due to the temperature selected to plot F(T bind ) that samples these mentioned states without separating them in the classic three or more dimerization states reported for bigger dimers. 21,103 C v has a wide peak around T bind in Figure S5B, suggesting a smooth transition where these states coexist. The completely unbound and unfolded state for the peptide was seen in higher simulated temperature data. Peptides are small compared with spike-RBDs, and the peptide dynamics are faster in sampling states at T bind as shown in a small part of the representative time trajectory in Figure S7B,C. Dimers of CoV2 with natural peptides 1 and 2 also presented this two-state dimerization profiles in F(Q spike , Q peptide ), however, with lower barrier and elevated E well minima comparing with N valleys ( Figure  S9A). The existence of more than two states at the C v (T bind ) peak does not guarantee that these wells will have the same depth in Figures 4, S8, and S9 at T bind . Figure 4C presents F(Q total , Q peptide ) landscapes at T bind for CoV1+2 and CoV2+3 with one and two global wells, respectively. Q total contains Q spike , despite being the largest amount of native contacts, and its value fluctuates around a narrow valley where spike-RBD is folded (Q spike minimum in Figure S6C). Q total also contains Q bind and Q peptide , and these reaction coordinates have a smooth change in F at the transition temperature T bind (for example, Figures S6E and  S7A, respectively). Apart from Q peptide that monitors the peptide folding mechanism separately, as shown in Figure 4C, we would say that the increase in Q total is proportional to the increase in Q bind , representing the minimal path of the diagonal. With that being stated, the CoV1 dimer increases Q total concomitantly with the increase in Q peptide , suggesting a mechanism that binding and folding of the peptide occurs concomitantly in Figure 4C. The E state is marked in Figure  4C as a reference for the discussion of the coupled binding− folding mechanism of the CoV1 dimer, although it is not a deep minimum valley per se. Figure S7B shows that Q peptide changes in a manner similar to Q bind during the selected few steps of CoV1+2 molecular dynamics, again suggesting the peptide coupled binding−folding mechanism.
CoV2 dimer dynamics over F(Q total , Q peptide ) shows a different minimal path diverging from the diagonal that separates E and N states. The off-diagonal path suggests that the folding of the designed fragment 3 (same for the natural 1 in Figure S9B) is coupled to binding, yet slightly different from 2 with CoV1. The separation F barrier of about 1k B T for the fragment folding is small compared to the thermal fluctuations that could also be around k B T; 109 thus, it does not impose an obligatory step to the fragment being completely folded to bind to the monomer. Yet, it seems that the peptide should be partially structured in the native conformation, reflected by the first increase in Q peptide , and then binding to CoV2 occurs, shown by the second increase in Q total . The off-diagonal path suggests a mechanism of folding after binding, although folding occurs by crossing an energetic barrier. Figure S7C shows that Q peptide still changes around Q peptide = 0.3 after the peptide is unbound at step 32, although Q bind remains equal to zero, again suggesting the folding-upon-binding mechanism of the fragment. CoV2+3 dimer forms after transposing the F(Q total , Q peptide ) barrier of over 2k B T that occurs after at least 30% of the designed native contacts being formed at T bind .
It is important to emphasize that our findings indicate that the limiting rate for the dimer assembling of the two coronaviruses was the thermodynamic barrier on the onedimensional Q bind coordinate (9−15k B T) and the energetic barrier along Q peptide that segregates one-to-two-state peptide folding; therefore, binding is the critical step. Figures S8C and  S9C show the two-dimensional F of the simulated dimers as a function of the distance between the center of mass between the two chains (d spike−peptide ) and Q peptide . F(d spike−peptide , Q peptide ) shows an unbound (U) state in which the distance between the protein and the peptide is bigger than the native/ bound (N) and the encounter (E) complex states, both located at almost the same d spike−peptide ≈ 2 nm. It can be observed that U are wide wells approximately aligned with the E state in which both have a low level of nativeness for the peptide, i.e., the peptide was found to be unfolded at U and E. Despite the fact that there is a difference between the two natural and the designed peptides en route to the N well. There is no barrier between E and N for the natural peptides; only the fragment has to overcome an energetic barrier to fold, even though the three peptides are limited by a free-energy barrier to reach the spike proteins at d spike−peptide ≈ 2 nm. Thus, this is an indication that the fragment has a more cooperative coupled binding− folding dynamics than the natural peptides. These two coupled Journal of Chemical Information and Modeling pubs.acs.org/jcim Article events determine the kinetics and the thermodynamics of the studied protein dimers and it is feasible to be characterized by all-atom SBM simulations. Designed Peptide has Binding Contacts Tightend by the Secondary Loop Site. The structural evolution content of the folding-upon-binding mechanism of the spike-ACE2based dimers was analyzed by building average contact formation probability maps. The probability map of an atom i to form native contacts along with a reaction coordinate Q was computed (p(Q, i) given by eq S3). In this probability map, each pixel corresponds to the average frequency of an atom belonging to a residue to form their atom−atom native contacts in a specific point of the reaction coordinate. p(Q, i) is computed over a long time trajectory at a constant temperature run. The simulations contain about 1600 atoms belonging to the two chains of each protein−peptide dimer. Thus, it was convenient to show in the y-label the residue index instead of the original atom numbering, which produced irregularly distributed labels, yet, human-readable. Figure 5 shows the probability map of the peptide atoms to form native contacts along the binding ( Figure 5A) and folding ( Figure 5B) reaction coordinates for the dimers CoV1+2 (left panels) and CoV2+3 (right panels), both at T bind . Figure 5B also shows the free-energy profile for the binding coordinate (F(Q bind )), with the defined transition state (TS) region shaded as the gray area. In addition, the probability maps for  Figures S11 and S12, respectively. Q total and Q spike contain thousands of native contacts with similar probability maps, and this was the reason to show only p(Q spike , i) in Figures S11C and S12C. The spike protein is relatively well folded (red maps in Figures S11C and S12C) during the binding and folding contact formation of the peptides, which are on the order of a hundred. The peptides are α-helix-based and this linearity in their structure and sequence are reflected in the high similarity between the two portions of Q bind probability maps of the proteins (top half) and peptides (bottom half) in Figures S11A and S12A. At the unbound state (blue strip at Q bind = 0 in Figures 5A,  S11A, and S12A), the almost folded spike proteins approach the peptides that are completely unfolded and disordered, represented by the blue strip region at Q peptide = 0 on the maps in Figures 5B, S11B, and S12B. The sharp transition to the bound state starts when there are between 4 and 10 binding native contacts formed for the peptides, which is the beginning of the TS in Figure 5A. At TS along Q bind , many nonspecific protein−peptide contacts are formed with a probability of up to 50%, except for the constructed 3 that also showed some binding contacts made with a probability of above 80% at some positions in TS (red dots at TS on the maps). Many other protein−peptide native interface contacts start to be made with high probability in red as binding proceeds by crossing TS along Q bind .
The natural peptides 1 and 2 have similar binding contacts being formed with the spike-RBMs ( Figure S5A). Figure 5A shows that CoV1 first casts the natural peptide 2 by binding the protein residue Asn479 to the peptide residue Hist34 at Q bind ≈ 0, followed by Tyr442 binding to peptide residues Lys31 and Glu35. CoV1 proceeds by binding residues Asn473 and Cys474 to the peptide residues Thr27 and Phe28. These two interface regions are highly hydrophilic on both surfaces, protein and peptide, as is shown in Figure S13 (left panels). Natural peptide residues over Tyr41 do not contribute to binding significantly. Previous experiments showed that CoV1 residues Asn479 and Thr487 were associated with the recognition of the human ACE2 receptor in addition to residues Tyr442, Leu472, Asp479, and Asp480 that allowed interspecies infection. 110 Residue Thr487 is a CoV1 residue important for binding of the designed 3 loop segment, as shown in Figure S11A (right panel). The corresponding CoV1 residues Asn479 and Thr487 in CoV2 are the polar residues Gln493 and Asn501, respectively, in which they also play a key role in the ACE2 recognition by CoV2. 111 The designed peptide 3 has a different sequence for the native binding contact formation revealed by the probability maps. Figure 5A shows that fragment 3 binds at the transition state to CoV2 by many interface contacts from its N-terminal portion attempting to follow the same sequence of events of the natural peptides. However, after crossing TS in the F(Q bind ) profile, 3 unbinds some of its N-terminal contacts and binds its C-terminal loop segment to CoV2. In Figure 5A, CoV2 binds to fragment 3 after TS by residue Asn501 to peptide residue Lys353, followed by Gly354, Asp355, and Arg357. CoV2 residues Gln498 to Tyr505 also contribute to these binding contacts to the loop segment. This is connected with the recent finding that, among the novel spike mutations, SARS-CoV-2 has the maximum van der Waals interaction between residues CoV2-Asp491 and ACE2-Lys353. 70 This partial interface formation proceeds by CoV2 residue Gln498 connecting to peptide residue Gln42. The loop segment of 3, present in the wild-type ACE2-PD as a secondary binding site, creates many additional intermolecular native contacts with the spike-RBD tightening up the protein−peptide interaction. Figure S13 shows how this partial interface, created by the turn residues Gln498, Pro499, and Asp501 of CoV2, is more hydrophilic than the similar (mutated) residues in CoV1 ( Figures S1C and S2). Also, the Trp500 surroundings in CoV2 showed to be more hydrophobic than in the same position in CoV1, as is pointed by the upper orange arrow in Figure S13. In CoV1, the first probable binding residue was Asn479 that is mutated by Gln493, which maintained the hydrophilic nature of this site in both proteins. However, other mutations surrounding Gln493 in CoV2 created a more hydrophobic site than in CoV1. This increase in hydrophobicity caused by mutations on this CoV2 site could be another reason for the late capture of fragment 3 residues His34 and Glu35 during the binding process, despite the large number of binding attempts at TS. This observation suggests room for fragment 3 binding affinity improvement. These two mechanisms described by the natural and designed peptides are similar in CoV1 and CoV2 with a difference that many binding contacts were made earlier in CoV2 ( Figure S12A) than in CoV1 ( Figure S11A).
The peptide binding processes are accompanied by the folding of their linear structures, which can be seen in the probability maps. Figure 5B shows the probability of a peptide residue to form intramolecular native contacts at a specific position of the folding coordinate, Q peptide . Figure 5B presents p(Q peptide , i) for dimers CoV1+2 (left panel) and CoV2+3 (right panel), both at T bind , in addition to the probability maps of the three peptides simulated with SARS-CoV-1 and 2 spike-RBDs presented in Figures S11B and S12B, respectively. It was discussed in the previous sections by inspecting one-and twodimensional F profiles that, upon binding, the natural peptides 1 and 2 fold via the downhill mechanism and the designed fragment folds via the two-state mechanism. These two folding mechanisms were again represented by F(Q peptide ) in Figure  5B, with the transition state shaded as the gray area only for the two-state-folder fragment. The probability maps of Q bind resemble those of Q peptide in Figures 5, S11, and S12. This is a signal that as a peptide atom/residue is captured (binding), on average, it assumes its native conformation (by folding) that is represented by the red pixels in Figure 5B.
An unstructured protein might have a greater capture radius for a specific binding site than the folded state with its restricted conformational freedom. Therefore, the unfolded state binds weakly, followed by folding, as the protein approaches the binding site, and the binding rate is enhanced. This scenario has been hypothesized over a couple of decades as the "fly-casting" mechanism and it has been studied by SBMs in which the thermodynamic barriers for the real proteins were reasonably described by the model in the αcarbon (C α ) and all-atom representations of proteins. 100−103 Thus, folding-upon-binding, similarly to the proposed flycasting mechanism, might be one of the possible ways of interaction between the spike-RBDs and the peptides.
In Figure 5B, the α-helix peptide 2 starts the downhill folding (at Q peptide ≈ 0.1) by their N-terminal residues in a similar manner as the residues bind to CoV1, as discussed in Figure 5A. The downhill F profile has its minimum at Q peptide ≈ 0.5, meaning that around 50% of the peptide native contacts were not formed most of the time during the simulation at T bind . The probability map of peptide 2 projected on F shows that residues over Tyr41 are not yet formed at the free-energy lowest level, suggesting that the C-terminal region of the natural peptide is more flexible, unbound, and loose in CoV1. However, folding of the natural peptides in the presence of CoV2 starts in residues closer to the C-terminal domain, as shown in Figure S12B, in addition to the fact that they are less flexible than in CoV1. Figure 5B shows that the folding of fragment 3 in the presence of CoV2 also starts from its Cterminal loop residues that have many of their intramolecular native contacts formed at TS of F(Q peptide ). In addition, few 3 residues close to the N-terminal sequence are not formed at the folded state in Q peptide ≈ 0.8. Regarding 3 with CoV1, the peptide native contacts from the N-terminal residues are formed before TS, similar to the natural peptides shown in Figure S11B. Nevertheless, some contacts are unmade after TS and a less specific native contact formation proceeds as the folding occurs, also leaving some N-terminal residues unfolded at the folded well.
In general terms, the three peptides (1, 2, and 3) in a disordered state are cast (fly-casting) by the two spike-RBDs CoV1/CoV2 with the hydrophilic Asn479/Gln493 residues being the bait of the exposed hydrophilic region close to the peptide residue His34. This precise requirement of a small partner search diameter is necessary for the α-based peptides to dock into the protein binding site interface. The capping loop CoV1/CoV2 residues Thr487/Asn501 play an essential role in the association of the ACE2-based secondary binding site, the 3 loop region. The inserted loop residues of 3 increase the binding contacts in the two spike-RBDs, a major hydrophilic region. These polar contacts formed near the casting and the loop residues correlate with the hydrogen bonds formed between the proteins and the peptides. Figure 2 shows that CoV2 has more hydrogen bonds formed with the peptides than CoV1, which are the new hydrophilic contacts that emerged in CoV2 by residue mutations that occurred in CoV1. It was reported by binding simulations of the full-length ACE2-RBD of SARS-CoV-1 and 2 that these differences in the hydrophobicity and hydrogen bond network of the two dimers govern the enhanced binding affinity of the novel coronavirus. 74 Also, polar residue mutations in SARS-CoV-2 result in a greater electrostatic complementarity than that of the SARS-CoV-1 complex, 74 although the simulated electrostatic and pH-dependent free-energy affinities of both dimers were similar. 112 It is suggested that this similar, yet enhanced, binding energy of the SARS-CoV-2 spike protein is not due to a single mutant but because of the sophisticated structural changes induced by all mutations together. 70 This has implications for antibody therapeutics and vaccines that might be effective in combating different SARS-CoV-2 isolates affected by common mutations. 113 Peptide Binding Affinity is Higher for SARS-CoV-2 Spike-RBD. Association rates and binding affinity constants have been investigated extensively for interactions between proteins that can be folded or partially folded and for complexes where one or more partner is initially disordered. 114 Solvent and temperature dependence of the association rates indicate that kinetics of a coupled folding and binding reaction is limited by the free-energy barrier or transition state. 115 The experimentally observed binding process of the binding− folding reaction was consistent with molecular dynamics simulations of the coarse-grained C α -SBM performed previously, 39 which was similar to the all-atom SBM simulations of this study. The molecular binding affinity is related to the equilibrium dissociation constant (K D = 1/K A ) that measures the propensity of a dimer in reversibly separating its two components. Therefore, a low dissociation constant is related to higher biding affinity and vice-versa. One can measure K D from the ratio between the kinetic dissociation (k off ) and association (k on ) rate constants (K D = k off /k on ). 39 From the thermodynamic point of view, K D can also be Figure 6. (A) Equilibrium dissociation constant (K D ) as a function of temperature for two simulated dimers: CoV1+2 (red) and CoV2+3 (orange). K D was calculated using eq 1 (circles) and the line represents a simple linear fit to the data that helps to guide the eye. CoV2+3 has lower dissociation constants for increasing T than CoV1+2, implying higher binding affinity in temperatures around T bind . (B) Free energy (F) as a function of the distance between the center of mass of the protein and the peptide at T bind . At T = T bind in (A), the CoV2+3 equilibrium K D is about one order of magnitude smaller than CoV1+2, which is a reflection of the higher ΔF stability between bound/unbound states in (B). Temperature is normalized by the respective dimer binding temperature (T/T bind ) and F profiles are in units of k B T bind . associated with the Gibbs free-energy change during the dissociation/binding events. The stability of a protein−ligand binding can be determined by the magnitude of the free-energy variation upon binding (ΔG) that is connected to K D by 116 From the thermodynamic stability given by eq 1, it is possible to measure the equilibrium K D and to infer the biomolecular binding affinity of the simulated protein−peptide dimers. The equilibrium dissociation constant was evaluated for a range of temperatures around T bind using the Helmholtz free-energy stability (ΔF) in eq 1, which is obtained in the simulations by the Weighted Histogram Analysis Method (WHAM). 117 F was evaluated as a function of the distance (d) between the center of mass of the spike-RBDs and the peptides ( Figure S6d), differing from the previous sections that analyzed F of native contact-based coordinates ( Figure S6, left panels). We observed that the coordinate Q bind is suitable for mapping the thermodynamics of the contact formation mechanism; however, Q bind can give imprecise information about the association. For instance, one of the dimer's components could be slightly rotated from the native bound conformation resulting in a low Q bind , but in a very close distance between the protein−ligand. Figure 6A shows an increase of the equilibrium dissociation constant (K D ) with temperature for the dimers CoV1+2 and CoV2+3. At T = T bind , Figure 6B presents the F(d) profiles for CoV1+2 and CoV2+3 that resulted in the thermodynamic stabilities between the bound/unbound states of ΔF ∼ 3k B T and ΔF ∼ 4k B T, respectively. This difference in ΔF between the two dimers at T bind is mainly responsible for the difference of about one order of magnitude in K D in Figure 6A. The dissociation constant of CoV2+3 is lower than the K D of CoV1+2, implying that the binding affinity is higher. This trend is also observed for the other equilibrium K D reported in Table 2, in which it predicts that the binding affinity of the three peptides with the novel spike-RBD CoV2 is higher than with SARS-CoV-1 RBD.
The analysis of binding events can be a rather difficult task. It is strongly dependent on choosing the perfect setup and experimental conditions, and it is also advisable to measure binding interactions on multiple platforms for cross-validation. Two label-free binding measurement platforms are commonly used: the surface plasmon resonance (SPR) and biolayer interference (BLI). 118 These two platforms were recently employed by the experimentalists to analyze the binding of optimized ACE2-based peptides and antibodies targeting the full spike protein (or its RBD) of the novel coronavirus in which K D ranged from picomolar to nanomolar affinities. 119 The full ACE2 enzyme dissociation constant of the SARS-CoV-2 spike protein is reported to be on the higher end, the nanomolar binding affinity, 4 though that of SARS-CoV-1 were reported by different BLI experiments to be similar 50,120 or to have an up to 20-fold higher K D , 4 meaning lowered affinity to CoV1. The de novo designed ACE2-based peptide and miniprotein affinities to CoV2 were also reported to have comparable or lower affinity to CoV1. 68,82,93,121−124 This is an indication that ACE2-derived peptides might be, in some cases, more sensitive to CoV2 than to CoV1, that is, qualitatively in line with the theoretical binding affinities estimated in Table 2. The natural peptides 1 and 2 and the designed 3 are based on the original sequence of the human ACE2 that is shown to be suboptimal for binding CoV2 in the nanomolar range. This opens the opportunity to redesign these peptides to significantly improve their binding affinity up to the picomolar range, inhibiting SARS-CoV-2 and its mutations from entering human cells and hindering its fast transmission.

■ CONCLUSIONS
Many cellular activities are determined by the recognition of at least two flexible biomolecules. In this manuscript, two sets of dimers relating to SARS-CoV-1 and 2 spike-RBDs were simulated (CoV1 and CoV2), in which each RBD was simulated with three peptide inhibitors of increasing sequence length based on the α 1 -helix of ACE2-PD, two naturals (1 and 2) and one designed (3) by including a secondary loop sequence. The thermodynamic analysis showed that the three peptides have different binding and folding free-energy barriers in the presence of CoV1 and CoV2 and, for some coordinates, even a different transition state position. Kinetic analysis of binding/folding native contact formation also revealed different patterns between the two sets of dimers. In these two sets, the three peptides have the same residue sequence; therefore, the difference in the binding partner sequence seems to have an important role in the binding affinity and folding of the peptides. This difference in the binding−folding process of the binder that is templated by the protein partner was already reported for some IDPs; 107 however, it is not a universal feature. 108 In the case of the coronavirus dimers of this study, just folding by itself separated the natural and designed peptides into two classes of mechanisms; thus, the coupled binding-upon-folding mechanisms were dictated by both protein−inhibitor residue sequences. Figure 7 summarizes the two binding−folding mechanisms of the studied protein−peptide dimers. Figure 7 shows that it starts with the conformational rearrangements of the superficially "open" spike-RBMs catching the peptides at the encounter complex step and then the binding occurs. The RBM surface accommodates the peptide at the same time that folding proceeds. This flexible binding−folding mechanism is more cooperative for the designed peptide, as shown in Figure  7B, due to the free-energy barrier for folding that 3 has to cross to reach the native/bound state ( Figure 7A). Free-energy barriers for binding over the native contact-based and coordinate-based reaction coordinates were similar when comparing SARS-CoV-1 spike protein dimers with SARS-CoV-2, although CoV2 dimers resulted in increased energetic barriers. Our results are in alignment with recent computer simulations showing that electrostatic and van der Waals binding energies are similar, yet higher, for the SARS-CoV-2 spike protein bound to ACE2-based peptide inhibitors, fulllength ACE2-PD, and monoclonal antibodies. 72,74,112 It is also  50 Concluding, the applied computational protocol was successful in determining binding free-energies and main residues for the binding kinetics of the natural and designed peptides in complex with both CoV1 and CoV2. The enhanced binding energy of the designed fragment was stated to be related to the increasing contacts of the secondary loop region, which switched the folding mechanism to two-state-like, in contrast to the downhill-folder natural peptides. In addition, the inserted loop sequence conferred to 3 a tight bounding with the two spike-RBDs, explained by the increased hydrogen bonds and polar contacts. It is important to remember that the theoretical protocol is a simplified all-atom model based on the protein−peptide experimental structure with no electrostatic potential term and without solutes included, and they are implicit on the functional Hamiltonian. These assumptions make the model fast to run when compared with traditional molecular dynamics simulations, which allows many different inhibitor peptides to be evaluated against the novel coronavirus protein targets. Notwithstanding, the model is extremely versatile and these features or many others could be included to evaluate their influence on the association dynamics. The design of new potent peptides and biomolecules that inhibit novel coronaviruses can be advantageous to the development of diagnosis, vaccines, and epidemic surveillance in the years to follow. 125 ■ METHODS Simulation and Analysis Details. Protein dimers were simulated with all heavy (nonhydrogen) atoms with the structure-based model (SBM), 126 also known as the Go ̅model. 23 SBMs define the energy minimum of the Hamiltonian function as the native conformation; thus, SBM is a topologybased model. 76 The simulation protocol was based on the previous work 26 with GROMACS suite 5.1.4. 127 Input files for the SBM simulations were prepared with SMOG2 version 2.3β with standard options for the all-atom model. 128 Disulfide bridges between the two cysteine residues (SS-bond) were included as a regular covalent SBM bond, which represents an SS-bond oxidized state. 129 Native structures were analyzed by FoldX 130 to minimize residue side-chain energy.
For each system simulated, the thermodynamic analysis was performed over 40 temperature runs from lower temperatures (fully folded and on states) to higher temperatures (off states), including the transition temperature between the on and off states (T bind ), by the weighted histogram analysis method (WHAM) 117 with a Python script, the PyWham package. 131 The WHAM computes the microcanonical density of states that is used to build the free-energy profiles. T b was defined as the temperature related to the specific heat peak between binding states for each protein. The number/fraction of native contacts within a given structure (Q) was used to evaluate nativeness as the reaction coordinate, which describes how similar is a given structure with respect to the initial native structure. 132 Root-mean-square fluctuation (rmsf), the radius of gyration (R g ), root-mean-square deviation (rmsd), and distance between the two chains (d) were computed with the analysis package contained in the GROMACS suite. Each simulation was executed over 1 μs with integration steps of 2 fs and snapshots of 4 ps. The all-heavy-atom SBM simulations were executed using periodic boundary conditions with no constraint on the distance between the two molecules. Enough sampling for the statistical calculations of bound/unbound states was achieved at T bind and temperatures in its vicinity.
Sequence alignment and secondary structure elements were identified using the ENDscript 2 web server. 133 Interfaces and interactions formed by CoV1/CoV2 and the inhibitors were analyzed using the PISA (protein interfaces, surfaces, and assemblies) server of the European Bioinformatics Institute (EBI) (http://www.ebi.ac.uk/msd-srv/prot_int/cgi-bin/ piserver). 95 Protein-energy frustration was obtained by submitting the native conformation to the AWSEM-MD Frustratometer 96,97 web server (http://frustratometer.qb.fcen. uba.ar). Protein structures were visualized with PyMOL (https://pymol.org), VMD, 134 and UCSF Chimera 135 softwares. Two-and three-dimensional curves were plotted using Grace (https://plasma-gate.weizmann.ac.il/Grace), Gnuplot (http://www.gnuplot.info), and Matplotlib 136 packages. The contact probability maps were generated using the code a v a i l a b l e a t h t t p s : / / g i t h u b . c o m / r o n a l d o l a b / pyContactProbability. Figure 7. Scheme of the binding and folding dynamics of the studied dimers with the natural peptides 1 and 2 and the designed fragment 3. (A) Schematic free energy as a function of a joint binding and folding reaction coordinate. (B) Schematic diagram projecting the association mechanisms in two-dimensional binding and folding reaction coordinates. The natural and designed peptides bind prior to folding, although the designed fragment has a more cooperative binding−folding mechanism due to its two-state folding mechanism.
■ ASSOCIATED CONTENT
Extended theory of the local frustration analysis, the structure-based model potential and settings used in the simulations, and the contact probability maps for all cases along the reaction coordinate; supporting figures regarding the sequence and structure alignments, local frustration, contact map probabilities, specific heat (C v (T)), free-energy profiles (F), root-mean-square fluctuation (rmsf), and hydrophobicity surfaces (PDF)