Atomistic Picture of Opening–Closing Dynamics of DNA Holliday Junction Obtained by Molecular Simulations

Holliday junction (HJ) is a noncanonical four-way DNA structure with a prominent role in DNA repair, recombination, and DNA nanotechnology. By rearranging its four arms, HJ can adopt either closed or open state. With enzymes typically recognizing only a single state, acquiring detailed knowledge of the rearrangement process is an important step toward fully understanding the biological function of HJs. Here, we carried out standard all-atom molecular dynamics (MD) simulations of the spontaneous opening–closing transitions, which revealed complex conformational transitions of HJs with an involvement of previously unconsidered “half-closed” intermediates. Detailed free-energy landscapes of the transitions were obtained by sophisticated enhanced sampling simulations. Because the force field overstabilizes the closed conformation of HJs, we developed a system-specific modification which for the first time allows the observation of spontaneous opening–closing HJ transitions in unbiased MD simulations and opens the possibilities for more accurate HJ computational studies of biological processes and nanomaterials.


Details of the molecular dynamics (MD) simulation preparations
The standard and enhanced sampling MD simulations were carried out with AMBER18 1 and GROMACS-v2018 2 , respectively. Each system was prepared in Leap module of AMBER with different ion models 3,4 and potassium concentrations (Table S1). CUfix 5 and the phosphate parameters from Steinbrecher et.al 6 were applied in ParmED. Hydrogen mass repartitioning was applied to allow a longer integration time step (0.004 ps) in simulations. 7 Afterwards, each system went through a series of minimizations and equilibrations. The initial minimization of the system was done while applying a 25 kcal/mol/Å 2 positional restraint on the solute atoms. The systems were then heated up from 100 K to 300 K while using the same positional restraint. Afterwards, alternating cycles of minimization and equilibration were implemented for five rounds. In each round, the positional restraint placed on the solute changed from 5 kcal/mol/Å 2 to 1 kcal/mol/Å 2 in decrements of 1 kcal/mol/Å 2 . The positional restraint for the final equilibration was 0.5 kcal/mol/Å 2 . For each minimization, we ran 500 steps of steepest descent followed by 500 steps of conjugated gradient minimization. The equilibrations were run for 50 picoseconds each.

The use of HBfix in HJ simulations
HBfix is a structure-specific potential which can be used to adjust stability of selected H-bonds. In case of HJ, we used it to prevent the fraying of the terminal base pairs which is a common phenomenon in simulations of nucleic acids. 8,9 A 2 kcal/mol potential HBfix per H-bond was applied to increase the stability of terminal base pairs in all four HJ arms. This includes the branch point base pairs since they are also termini when the HJ opens and can become frayed at that point. The fraying of branch point base pairs in the open state could represent a genuine attempt for branch migration. Even though we only simulated immobile HJs where the migration cannot succeed, the system can still make such attempts during random thermal fluctuations. Although intriguing, exploration of such states in the context of the present study would have been very challenging in terms of sampling. We plan to explore the mechanism of migration as well as behaviour of migrating HJs in future studies. Note that the HBfix potential in WT-MetaD-HREX simulations was 4 kcal/mol per H-bond instead of 2 kcal/mol as the latter was not enough to keep branch point base pairs from breaking under the effects of MetaD biases, as revealed by the initial calculations (not shown).

Protocol for scaling the LJ interactions of the branch point nucleotides in HJ
The LJ interactions among the branch point nucleotides were modified by scaling the ε parameters (LJ well depth) among the atomic pairs, including the nucleobases, the sugars and the phosphate groups, but excepting the atoms directly involved in the Watson-Crick base pairing H-bonds, i.e. atoms N4, H41, N3, O2 (cytosine), O4, N3, H3 (thymine), N6, H61, N1 (adenine) and O6, N1, H1, N2, H21 (guanine). This approach is similar to the Stafix potential recently proposed by our group which reduces the excessive intramolecular interactions in RNA simulations. 10 The main difference is that Stafix 10 avoids scaling of all the potential H-bond interactions whereas here, we avoid only those involved in the Watson-Crick base pairing H-bonds which occur in the B-DNA duplexes. The ε parameters were scaled by λ ranging from 0.5 to 0.9 until we identified the ideal λ which allows both opening and closing transitions in the standard MD simulation. The ideal λ was determined in two rounds of searching, where the first round was a rough search with the step size of 0.1. Once identifying the approximate λ where transitions in both directions could be observed, we then ran a second round with a more precise search step size of 0.025 to obtain the ideal λ value.

Reference simulations of B-DNA (Dickerson dodecamer) with scaled LJ interactions
We ran additional standard simulations of the Dickerson dodecamer (PDB ID:1BNA, resolution 1.9 Å) 11 to better identify potential side effects of the scaling protocol utilized to reduce the LJ interactions of the branch point nucleotides in HJ (see above). Depending on whether the scaling was used and whether we applied HBfix to stabilize the B-DNA terminal base pairs, four systems of B-DNA simulations were run (Table S1). We used the same preparation and simulation protocol as for the HJ simulations (see above and main text Methods). For all the scaled systems, the scaling factor λ=0.7 was applied for the entire DNA helix. In all simulations, the DNA duplex remained stable, with the helical parameters and base pair stability almost identical in both the scaled and non-scaled simulations ( Figure S7 and Figure S8) 12 . In simulations where we did not stabilize the terminal base pairs with HBfix, we observed terminal base pair fraying, regardless of whether the scaling was applied. In other words, the terminal base pair fraying was not significantly promoted by the scaling, even if the scaling was applied throughout the DNA helix. Therefore, we suggest that the scaling protocol we applied for the HJ systems does not deteriorate the B-DNA structure compared to the original force field. Furthermore, the same protocol could be transferable to simulations of some other DNA systems where reduction of the LJ interactions (mainly stacking) would be desirable.

Determination of the scaling factors in replicas of the WT-MetaD-HREX simulations
The same scaling protocol we used to weaken the LJ interactions among the branch point nucleotides in standard MD was also used to modify the Hamiltonian in the individual replicas of our WT-MetaD-HREX simulations (see main text Methods and the section above). We determined the λs ladder by iteratively running 10-ns-long two-replica simulations, attempting 1000 exchanges during the simulation time until we obtained λ with appropriate exchange rate between the neighbouring replicas (accepted exchange rates were between 22% and 28%). This calculation was repeated with each new obtained λ until we derived a replica ladder where the λ ranged from 1 in the lowest replica to 0.55 in the highest replica. In the end, six replicas were suggested by this protocol and their λ values are shown in Table S6. The replica exchange behaviour is also visualized in Figure S9, supporting our chosen replica ladder.

Preparation of the Isomer II starting structure for WT-MetaD-HREX
In the WT-MetaD-HREXs, we used the closed state HJ as the starting structure. However, we did not have available closed state isomer II structure of 64-nt J1 as the standard MD simulations mainly involved the shorter 48-nt J1. Therefore, to obtain it, we used GROMACS+Plumed 2,13,14 to run Steered MD (SMD) simulation 15 starting from 64-nt-open0 structure ( Figure S4) in which we forced the junction to close into isomer II. Just as in the WT-MetaD-HREX, the coordination numbers sI-IV and sII-III (without exponential factor) were used as the collective variables in SMD. We forced the sum of sI-IV and sII-III CVs to move from 0 to 180 within 10 ns using a dynamic harmonic restraint with a force constant of 0.04 kJ/mol. The SMD trajectory was visually inspected, and the isomer II structure was selected from one of the frames at the end of the trajectory. The selected structure was then equilibrated by running a standard MD simulation for 5 ns and the final structure of this simulation was then chosen as the starting structure of the WT-MetaD-HREX isomer II-open space sampling.

Free energy analysis of WT-MetaD-HREX
The trajectories obtained in WT-MetaD-HREX simulations were analyzed by reweighting. We obtained the MetaD biases in gaussian hills file for each replica and reweighted them into the nonexponential CV space. The weight of each frame was computed assuming a static bias V(s) computed at the end of the simulation. 16 The static bias was obtained from a time-average of the last 300 ns so as to reduce fluctuations. 17 The bias was then used to compute weights with eq.1, where k B is Boltzmann constant and T is temperature. The new weights were then used to estimate the probability density as a function of different CV (non-exponential CV) with a Kernel-density estimator in eq.2 18 , where ' denotes the CV and is the kernel centered at the current value with a bandwidth . ( -( ), ) ( ) In the end, the histogram was converted to the free-energy profile with eq.3. The free energies of discrete states (open/closed/half-closed IL/IR/IIL/IIR) were then calculated using eq.4.
The state free energies were further used to predict the population ratios of the two isomers and the open state in each replica, which was derived from the ΔG isomer I/II-open by using eq.5. The population ratios (P) were defined as or , and their values in each replica were ( + ) ( + ) plotted as black dots into the Figure 6 and Figure 8. The ΔG isomer I/II-open dependence on λ was further estimated by linear fitting (eq.6) with negligible error. By combining the results of eq.5 and 6, we obtained the relation between λs and the population of the closed state in form of a sigmoid function, shown in Figure 6 and Figure 8.

Errors estimation and convergence of WT-MetaD-HREX
Application of WT-MetaD-HREX flattened the free-energy landscapes of HJ in the CVs spaces with biases so that the junction could freely diffuse between the closed and open states. In order to measure the statistical errors and evaluate the convergence of our WT-MetaD-HREX, we first did the bootstrapping analysis, and then, our WT-MetaD-HREXs were extended for 300 ns without adding new MetaD bias in all the independent runs in isomer I-open and isomer II-open spaces, including the simulations with different ion concentrations (abbreviated as WT-MetaD-HREXs-EXT).
The bootstrapping was done independently in all replicas for all WT-MetaD-HREX setups, estimating the standard deviation of the free-energy difference between each HJ substate and the open state. In each replica, we divided the trajectory uniformly into 20 blocks and rebuilt the trajectory by randomly picking 20 blocks with replacements. Based on the rebuilt trajectory, we calculated the free energies of the substates with the weight of each frame using the protocol described in the previous section. The process of trajectory rebuilding, and free energies calculations was iterated 200 times. In the end, we obtained 200 free-energy samples and derived the standard deviations of free-energy differences for all substates against the open state from the samples. The standard deviations of the free energies (except the open state which was used as the reference) for all 6 replicas in each WT-MetaD-HREX run were plotted as the error bars in Figure 5 C,D,E,F and Figure 7. The statistical errors in WT-MetaD-HREXs are not significant. However, we also note that the true uncertainty in RE simulations is known to be larger than indicated by the statistical errors and needs to be assessed based on entirely independent simulation runs. [19][20][21] For WT-MetaD-HREXs-EXT simulations, the previously accumulated biases in all 6 replicas were included as a static bias potential and no new biases were added. The HREX protocol was kept the same. We first verified that the accumulated biases would enable the closing-opening transitions in the WT-MetaD-HREXs-EXT by tracking the coordination number in each continuous trajectory ( Figure S10 and Figure S11), which indicates convergence of the previous WT-MetaD-HREX runs. Similar result was observed for the WT-MetaD-HREXs-EXTs in high-salt and low-salt conditions. The HJ transitions were detectable although not in all trajectories. Especially in the first WT-MetaD-HREXs-EXT run of isomer II-open system, HJ in trajectory 0 remained open throughout the simulation. The same was also observed in the trajectory 4 of the second run. The other trajectories showed transitions, but their frequency was rather low compared to the previous WT-MetaD-HREXs. We explain this by the complicated and large-scale conformational change required for the transitions, leading to a slow diffusion in the coordination number space. Meanwhile, the filled free-energy landscapes are also not exactly flat, which is the result of statistical and systematic errors in WT-MetaD-HREX and the complexity of HJ dynamics. Regarding the latter, it should be noted that HJ has more CVs which are regularly studied and are orthogonal to the branch point base pairs coordination numbers, e.g. J twist and J roll . 22,23 Our 1 µs WT-MetaD-HREXs might not be sufficient to consider the converged sampling of other CVs, which could result in deviated entropy in the free energy calculation. In any cases, we still suggest that our results represent the general free-energy landscapes in the coordination number CV space with only minor deviation, which is supported by the sampled transitions in most of the continuous trajectories in WT-MetaD-HREXs-EXT.

Pure Hamiltonian Replica Exchange and WT-MetaDynamics simulations
We also initially tried the pure HREX as well as WT-MetaDynamics (WT-MetaD) methodologies for our simulations. However, neither of them led to convergence which was the reason why we finally applied the sophisticated combined WT-MetaD-HREX protocol (see the main text).
Regarding the pure HREX, we ran two independent simulations starting from open and closed isomer I states of J1 using the same λs ladder for the 6 replicas and the same replica exchange protocol as we did in subsequent WT-MetaD-HREX. Each HREX was run for 1 µs. As shown in Figure S13, the individual trajectories swiftly separated into two groups which rarely or no longer at all exchanged across the replica ladder. This started right after first few opening/closing transitions which then led to the closed and open HJs permanently occupying the lower and higher replicas, respectively. In the HREX started from closed state isomer I, this happened around 200 ns when two trajectories showing open state HJs almost permanently occupied replicas 5 and 6. Similarly, the HREX started from the open HJ showed one closing event (to isomer I) at about 50 ns with the closed HJ subsequently occupying the replica 0 for the rest of the simulation. We attribute this phenomenon to the relatively large potential energy gaps between the two states, leading to permanent separation between replicas as no further exchange attempts were accepted. This resulted in almost no round trips for the continuous trajectories.
Likewise, the pure WT-MetaD simulations were set up with the same protocol we later used for the WT-MetaD-HREX in the isomer I-open substates space (see main text Methods). The pure WT-MetaD simulations included only a single replica and the scaling protocol we designed for increasing the opening-closing transitions was not applied. Two independent WT-MetaD runs were done for 1 µs each and the free energy profiles obtained. Unlike the WT-MetaD-HREX, the two WT-MetaD runs led to significantly different free energy profiles ( Figure S14). In particular, the second run indicated the open HJ state to be more stable than the closed isomer I by ~12.8 kcal/mol, completely contradicting the conclusion we obtained from the standard simulations and the converged WT-MetaD-HREXs. The first WT-MetaD run showed the closed isomer I is more stable than open state by 4.1 kcal/mol, closer to the results of the WT-MetaD-HREX simulations in the same CV space. We also note the statistical errors estimated by our bootstrapping protocol are far lower than the difference of ΔG isomer I-open from the two individual WT-MetaD runs. This indicates the pure WT-MetaD protocol was incapable of efficiently sampling the highly complex conformational space of HJ opening-closing transitions, leading to the lack of convergence. Nevertheless, it is noteworthy the WT-MetaD still identified the local free energy minima corresponding to the two half-closed intermediates ( Figure S14A, left), confirming these intermediates are not artificially produced by our scaling protocol. In other words, same HJ opening-closing pathways (main text Figure 4) were observed when utilizing CV-based or scaling-based methodologies (or their combination), suggesting that neither of them is biasing the results and that the half-closed intermediates could be realistic.

Side-effects observed in simulations of HJ with short helical arms
In order to reduce computational demands, we also attempted simulations of the J1 system in which we shortened the length of its helical arms to three base pairs each (so called "short HJ", system "J1/short/LMCU/λ0.75"). Although the size reduction of the junction indeed led to significantly faster computational speed compared to larger HJ systems, we observed side effects in form of the spurious inter-helical interactions ( Figure S3). These unexpected contacts stabilized the closed state while simultaneously promoting the occurrence of the parallel HJ conformation which is not relevant for our study of the opening-closing transitions. To avoid both issues, we strongly suggest that the length of each HJ arm in simulations should always be at least six and ideally as many as eight base pairs. Tables   Table S1. List of standard MD simulations.
When not specified, standard net-neutral conditions were applied. The systems with scaling of the branch point base pairs are indicated with "λ" followed by the scaling factor applied (e.g. λ0.75 indicates 0.75 scaling). b The starting structures referred in the parentheses are visualized in Figure S4.
c HBfix was not used for the branch point base pairs. d "Long" and "huge" indicate that 64-nt HJ structure and significantly larger solvent box with the distance between the HJ and the box boundary at least 58 Å (so that net-neutral c(K + ) = 20 mM) were used, respectively. e "1BNA" system was prepared from PDB structure 1BNA (Dickerson dodecamer) and used as a control to verify possible side effects of the scaling on standard B-DNA structure. The terms "λ0.7" and "HBfix" refer to the systems with scaling (λ=0.7) on the entire B-DNA helix and with the application of HBfix on the terminal base pairing H-bonds, respectively.

Figure S13. Position of each continuous trajectory along the replica ladder in the two HREX simulations started from isomer I (A) or open (B) state.
The replica spaces were permanently separated into two groups early in the simulations, with open and isomer I states populated in higher and lower replicas, respectively, and with virtually no exchanges between the two groups. The method is thus not robust enough to sample the HJ system.