Tau Protein Binding Modes in Alzheimer’s Disease for Cationic Luminescent Ligands

The bi-thiophene-vinylene-benzothiazole (bTVBT4) ligand developed for Alzheimer’s disease (AD)-specific detection of amyloid tau has been studied by a combination of several theoretical methods and experimental spectroscopies. With reference to the cryo-EM tau structure of the tau protofilament (Nature2017, 547, 18528678775), a periodic model system of the fibril was created, and the interactions between this fibril and bTVBT4 were studied with nonbiased molecular dynamics simulations. Several binding sites and binding modes were identified and analyzed, and the results for the most prevailing fibril site and ligand modes are presented. A key validation of the simulation work is provided by the favorable comparison of the theoretical and experimental absorption spectra of bTVBT4 in solution and bound to the protein. It is conclusively shown that the ligand–protein binding occurs at the hydrophobic pocket defined by the residues Ile360, Thr361, and His362. This binding site is not accessible in the Pick’s disease (PiD) fold, and fluorescence imaging of bTVBT4-stained brain tissue samples from patients diagnosed with AD and PiD provides strong support for the proposed tau binding site.

, both basis set yield very similar values for absorption maxima for the S 1 ← S 0 transition. For the most stable conformers cis and trans (conformer 1 and 2 in the Fig. S1b), the ground state geometries was re-optimized at the B3LYP/aug-cc-pVDZ and dipole moment were calculated at the CAM-B3LYP(100%)/augcc-pVDZ method, keeping gauge-origin at the center of nuclear charge. The Fig. S2 shows the dipole moment vectors of trans conformer in the ground state (µ S 0 ), excited state (µ S 1 ) and their difference (µ diff ). The magnitudes of µ S 0 , µ S 1 , and µ diff are 5.54 D, 2.08 D, and 4.74 D, respectively. The dipole moment of the first excited state is significantly lower as compared to the ground state. This decrease in dipole moment can also be seen from the NPA charge analysis shown in Fig. 1.

S6
For generating the vibronic emission spectra, a hybrid scheme between the independentmode displaced harmonic oscillator (IMDHO) model 11,12 and anharmonic treatment via vibrational configuration interaction (VCI) was employed. For the IMDHO we applied an in-house Python code and with half-widths at half-maximum (HWHM) of 200 cm −1 . For the anharmonic Franck-Condon factor calculations, we employed MidasCpp. 13 For the calculation of Franck-Condon factors in MidasCpp, see also ref. 14. In the construction of the potential energy surfaces (PESs), we made use of a multi-state 15 adaptive approach for grid point selection. 16 For expansion of the PES, we applied polynomials up to 8th order. No mode-mode couplings have been accounted for in the PES. The vibrational self consistent field (VSCF) calculations underlying the VCI calculations were performed using a b-spline basis. 17 30 of the VSCF one-mode functions were considered in the following VCI calculations, which accounted for configurations with a maximum of three simulaneously excited modes (VCI [3]).

Molecular mechanics force field parameters of bTVBT4
Force field (FF) parametrization was carried out in two parts. In the first part, initial parameters were generated from the General Amber Force Field (GAFF). 18,19 In the second part, equilibrium bond distances, angles, and important dihedral potential were corrected based on optimized ground state geometry and dihedral potential calculated from DFT method.

Generation of initial parameters
To start with, rudimentary FF parameters of bTVBT4 were obtained from GAFF, following the steps outlined in an Amber tutorial, 20 summarized below.

• Step 1: Geometry Optimization
The initial geometry of bTVBT4 molecule is optimized in the Gaussian program by employing the B3LYP functional and 6-31+G(d,p) Pople's basis set.
• Step 2: Restrained Electrostatic Potential (RESP) 21 charge calculation and defining S7 atom types In Amber FFs, all residues (building blocks) on which macromolecules are built are restricted to integer charges. Having integer charges on building blocks help in transferring electrostatic parameters to macromolecules. RESP charges assure that rotationally degenerate atoms, such as hydrogens of methyl and methylene have equivalent charges. Determining the RESP charges and atom type involves; A) Calculating the electrostatic potential (ESP) at HF/6-31G* on an already optimized geometry in gas phase. B) Fitting charges at each nucleus by minimizing the sum squared error between calculated ESP from charges and ESP obtain from step 2A. This is achieved with the help of Antechamber (part of Amber tools). 18,19 • Step 3: Parameter check and building a parameter library Once the atom types are defined, a parameter library for bTVBT4 was generated using LEaP (part of Amber tools).

• Step 4: Format conversion
The topology obtained from the above procedure is in Amber format, and since we will be using Gromacs 22-25 for molecular dyanmics (MD), the topology is converted from Amber to Gromacs format using the ACPYPE 26 Python module.
Correcting bond distances, bond angles and dihedral angles During step 3 described above, the parmchk program (part of Amber tools) has created suggestions for some dihedrals which was not found in the GAFF database. We are interested in determining the dynamics of bTVBT4 in different environments and calculating an absorption spectrum on frames sampled from the dynamic trajectory. Hence, it is important to accurately describe bond distances, bond angles, and dihedral angles so that absorption spectra calculated on DFT and Molecular Mechanics (MM) optimized geometry give same result. In order to obtain more suitable GAFF parameters and validate the FF, the following steps were taken: S8 • Step 1: Dihedral scanning A relaxed dihedral scan for each of the dihedral angles defined in Fig. S1(a) was performed using B3LYP/6-31+G(d,p) as well as Molecular Mechanics (MM) with the GAFF. Comparing the two (DFT and MM) potential energy curves in Fig. S4 (illustrating the SCCS dihedral angle), we see that the initial topology that was generated is not an adequate description of SCCS potential. Due to the artificially high barrier represented in GAFF, an interconversion between the cis and trans conformation of bTVBT4 gets hindered during the MD simulation. Similarly, it is important to represent other important dihedral angles SCCC, NCCC, and CCCC accurately, in order to sample conformation at the 300 K. Figure S4: Dihedral potential of SCCS (marked in Fig. S1(a) of bTVBT4) from GAFF (blue), DFT (orange), and corrected FF (grey).
• Step 2: Modifying equilibrium bond distance and bond angle.
Bond distances and bond angles are crucial for absorption spectra, so all equilibrium bond distances and bond angles from GAFF were replaced with values obtained from DFT optimized geometries. All remaining parameters were kept the same as obtained from GAFF.
• Step 3: Excluding the energy contribution from dihedral of interest For the required dihedral, first, a contribution of torsion energy is excluded from total S9 MM energy by setting appropriate torsion parameters to zero in a FF file. Using this modified topology, MM based relaxed dihedral scan (called MM zero) is obtained.
Next, this MM zero energy profile is subtracted from DFT energy profile to get a missing energy contribution which will be added to torsion parameters of dihedral of interest. For the SCCS dihedral, the difference between DFT and MM zero curve is shown in Fig. S5 with red colour. Next task is to fit this curve with the Ryckaert-Bellemans function Eq. (1)). Figure S5: Curves shows the difference between potentials for DFT and MM zero (where all parameters that define SCCS are substituted to zero).
• Step 4: Fitting to DFT potential The curve obtained by taking the difference between DFT and MM zero potential is fitted with Ryckaert-Bellemans function (Eq. (1)). This function has previously been used to fit dihedrals involved in bi-thiophene based ligands. 27 Curve fitting was done using the Scipy 28 Python module. The fitted curve is shown in blue in Fig. S5.
• Step 5: Adding corrected parameters and scanning the dihedral angle Add corrected parameters for dihedral which was fitted and a relaxed dihedral scan S10 was performed using this modified topology. The result obtained in this step for the SCCS dihedral is shown in Fig. S4 (grey curve). The same procedure was employed for correcting all dihedrals (SCCC, NCCC, and CCCC, illustrated in Fig. S1a) iteratively.
Note: So far in the process of fitting dihedral potential, except for the step 2 (correcting the bonds and angles based on DFT optimized geometry) we followed the procedure which has been previously described in Ref. 29. Figure S6: Fitting of important dihedrals (marked in Fig. S1(a) of bTVBT4 to the DFT potential. The dihedral scans are performed at B3LYP/6-31+G(d,p).

• Step 6: Validation
After correcting all important dihedrals, the final dihedral scan is shown in Fig. S6.
In all the cases, the dihedral potential curve is in good agreement with DFT. For the energy barrier above 40 kJ/mol, the dihedral potential of the CCCN dihedral deviate from DFT potential, however this will not be the problem in current study as simulations are performed at room temperature (300 K). Next, if we compare the S11 relative energies of cis and trans obtained using different methods (DFT and the reparametrized FF), it is in good agreement (Table S1). This can also be seen from Boltzmann distribution shown in Table S2.

Functional benchmark for absorption spectra
Generally, changing the DFT functional induce a global shift in transition energies and does not influence the absorption spectrum profile. In order to have computational transition en-S12 ergies close to experimental values, a choice of functional was based on benchmark calculation shown in Table S4.

Molecular dynamics simulation
All classical MD simulations were performed using GROMACS (version 19. in water, the isotropic condition was applied. Periodic boundary conditions were applied to all three directions of the simulated box. Electrostatic interactions were simulated with the Particle Mesh Ewald (PME) 32 approach using a long-range cutoff of 1.5 nm. The cutoff distance of Lennard-Jones (LJ) interactions was also equal to 1.5 nm. The MD simulation time step was 2 fs with a pair-list update period of 10 steps. All H-bond lengths were kept S13 constant using the LINCS routine. 33

Molecular dynamics simulation of bTVTB4 in vacuum
The ligand bTVBT4 (with one Cl − counter ion) was first simulated in vacuum, and cis and trans conformations were sampled during a 4 µs long MD simulation. Percentages of cis and trans conformations of bTVBT4 are presented in Table S5. Initially, the Cl − ion was positioned at 24 nm from bTVBT4. It moves closer (< 4Å) to bTVBT4 during the trajectory, and starts interacting with bTVBT4 from about 98 ns and stays close to the ligand during the remainder of the simulation. Due to this fact, the expected percentage of cis and trans conformations from Boltzmann distribution is not achieved. One bTVBT4 molecule in trans conformer was solvated by a water solvent box of 5.2 × 5.2 × 5.2 nm 3 . One Cl − ion was added to neutralize the system. Next, the equilibration step was performed with the first energy minimization of the whole system, followed by a short 100 ps N P T simulation to stabilize the pressure of the system. Subsequently, we propagated the system in an N V T ensemble for the production run with accumulated time of 4 µs from 4 independent simulations.
The converged dihedral distribution over snapshots extracted from 4 µs data is shown in  to be equally populated in water due to an increase in interaction energy with water. The decomposition of interaction energy shows that the LJ interaction energy is approximately equal for both the conformers, as presented in Table S6. However, the Coulomb interaction of cis conformer in water is more stable than trans conformer by 6.2 kJ/mol. This increase in Coulomb interaction energy is attributed to interaction between thiophene moiety of cis and water.

Polarizable embedding and spectra in solution
In order to include the effect of the water environment while calculating the spectra of bTVBT4, we employed the method of Polarizable Embedding (PE) 34,35 on the snapshots extracted from the MD simulations. PE is based on a hybrid quantum-classical computational approach in which the total system is split into three regions: core, polarizable, and S15 non-polarizable. The core region having only a bTVBT4 molecule is calculated in the quantum chemical description. The polarizable region is described with Ahlström charges and isotropic polarizabilities, 36 while the non-polarizable region is described with just TIP3P charges. The shell thickness of 15Å polarizable and 5Å non-polarizable regions are considered to be reasonable to account for the effect of water. The Fig. S8 shows the test calculation of transition wavelength as a function of shell thickness of polarizable (red), non-polarizable (blue), and combination of both (green).  Table S7. One snapshot from the set of 100 trans frames that had a peak at the average maxima at around 480 nm was selected for further studies on the absorption spectra. On this particular snapshot, the QM region was increased to also encompass 5 water molecules in addition to the bTVBT4 molecule. The effect of this was only a negligible shift in the absorption spectrum (Table S8). However, increasing QM region further to accommodate 14 water molecules, the absorption maxima shifted by 5 nm from 480 nm (average value) towards the experimental value (470 nm). Nevertheless, this improvement comes at a large computational cost and therefore, we decided against including water molecules in the QM region.

Creating a tau fibril models
From the PDB ID 5O3l, first, coordinates of one protofilament out of ten were extracted.
Next, by taking the 2 1 screw symmetry of the first protofilament, another protofilament was generated to form a monomer. The consecutive monomers were obtained by rotating a previous monomer by 0.97 degrees and translated it by 5Å. This operation was continued until the last monomer had a twist of 180 degrees with respect to the first monomer. It is important to ensure a perfect fit between the last monomer and the first monomer of an adjacent periodic image of the simulation box. In this way a long fibril was modeled, which approximately resemble the experimentally found tau fibril while decreasing the computational cost by half. Employing the same strategy, tau models having adjacent monomer twist of 0.8, 0.9, 1.0, ad 1.2 were also constructed to find a fibril model with least stress induce due to forced periodicity. S18

MD simulation of full tau fibril
For each of the tau model created, simulated annealing was performed to slowly warm the system under N P T conditions. The heating of system from 150 K to 300 K was achieved with 14 annealing points, at the temperature step size of 25 K. The temperature of the system was gradually increased every 200 ps, and at each step, it was simulated for 2 ns except for the last step of 300 K where it was simulated for 13 ns.
It was observed that tau fibril generated with the adjacent monomer twist of 0.97 degrees had fewer kinks compared to the other tau models with different adjacent monomer twits (data not shown). The dynamics of the total fibril length along its axis is shown in the of 300 K shows that we have reached an equilibrium structure from 14 ns onward. In the equilibrium structure of the tau model, the average distance between two adjacent monomers was decreased from 5Å to 4.8Å.

Molecular dynamics of full periodic model of tau with 60 bTVBT4 molecules
Starting from the thermally stable structure of the tau fibril model, two kinds of system were built to increase the sampling in MD simulation. In the first system, the simulation box contained the tau fibril with 60 bTVBT4 molecules inside the tau cavity (Fig. S10a), and in the second system, 60 bTVBT4 outside the tau cavity (Fig. S10b). Each ligand was placed at random position and in random orientation.  Initially, in the process of equilibration, both kinds of systems were energy minimized, followed by short 100 ps N V T simulation with position restrain on bTVBT4 and tau fibril to stabilize the temperature. Next 100 ps short N P T simulation to stabilize the pressure. Finally, the MD simulation production run was carried out in N V T ensemble, maintaining the temperature at 300 K. For the 60 bTVBT4 molecules inside tau cavity case, five independent simulations were simulated for 40 ns each, whereas for the 60 bTVBT4 molecules outside tau cavity case, four independent simulations were setup and simulated for approximately 35 ns each.
Binding mode code assignment for Site A : Site A was distinguished as the strongest binding site at which bTVBT4 binds in various modes (see Fig. 3). These modes were characterized with respect to position of the ethyl group, bi-thiophene group of bTVBT4 and amino acid residues Ile360, and His362 of the tau fibril that bTVBT4 interact with.
Modes are defined by four letters assigned based on the following criteria: Fix the position of tau such that Ile360 is on top and His362 on the bottom side.
• Based on relative position of ethyl group with Ile360 and His362 assign "U" (ethyl pointing towards Ile360) or "D" (ethyl pointing towards His362) • ethyl group is at left "L" or right "R" (keep position of Ile360 and His362 fixed as mention above) • Bi-thiophene moiety in cis "C" or trans "T" conformation.
• CH 3 group on ethyl, inside "I" (pointing towards tau protein) or outside "O" (away from tau protein) Considering all possible combination, in total there are 16 modes. The numeric and alphabetic codes for all 16 modes are defined in the Table S10. Frequency of interconversion between "I" , "O" and "C" , "T" is grater as compared to "U" , "D" and "L" , "R". Hence, when only the first two letters of alphabetic code are considered, we define it as major mode and when all four letters are considered we defined it as minor modes (see Fig. S11).    The majority of the bTVBT4 population is found at site A at which it binds in 16 different minor modes (or 4 major modes) as defined in Table S10. The average total interaction energy (LJ+Coulomb) between bTVBT4 molecules and tau fibril calculated for each of the minor modes from the five independent MD simulations are shown in Table S10. Often interconversion from one minor mode to another is observed in the binding pocket during the trajectory, mainly between minor modes of type XXT X ← → XXCX and XXXI ← → XXXO, where X represent any other letter codes for minor modes as defined in Table S10.
The Fig. S12 shows all the interconversion between the minor modes. The average of the S23 total interaction energy over all minor modes is calculated to be −153 ± 19 kJ/mol. The bTVBT4 interaction with the site A residues account for 75% of the average total interaction energy when it interacts with the whole tau, as shown in Table S11.  with site A residues with protein bTVBT4 interaction −114 ± 17 −153 ± 19

Smaller binding site model
The smaller binding site models were generated from the system of 60 bTVBT4 molecules in the cavity of the tau fibril by cropping out the four major binding modes. Each smaller binding site model consist of one bTVBT4 molecule and ten protein chains at the side where the ligand was bound (see Fig. S13). Next, for each model, a simulation box of size 15 nm x 9 nm x 9 nm was created and filled with water molecules and counter anions Cl − . Further, after the energy minimization and equilibration process of a system, the MD production Figure S14: Time spent (ns) in each minor mode (diagonal) and number of exchanges between the minor modes (non-diagonal) from row to column. Characterized based on four major modes. Data obtained from two trajectories simulated for 250 ns for each major mode.   Figure S16: The PMF calculation for three pulling trajectory (green, red, blue) and its average (black) for each major modes.

PMF calculation details
We performed potential of mean force (PMF) calculations to obtain the height of the atomistic free-energy barrier for the bTVBT4 to bind at site A. It was achieved by using an umbrella sampling technique and a weighted histogram analysis method. Starting from the last snapshot of each major mode (short models) equilibrium MD simulation, bTVBT4 has pulled away from the binding site A to outwards. The center of mass (COM) pulling was applied between two groups Thr360 (middle residue at site A) and bTVBT4. The pulling rate of 0.001 nm/ps combined with a spring constant of 3000 kJ/mol was used. The ligand was pulled out from major modes of binding site A for about 5 nm. From these pulling trajectories, snapshots were taken to generate the starting configurations for the umbrella sampling windows. In each window, short 100 ps N V T followed by 2 ns N V T simulation was performed. Finally, analysis of results obtained from umbrella sampling was performed with the weighted histogram analysis method (WHAM). 38,39 Note that, for each major mode, three pulling simulations followed by umbrella sampling were conducted to get the average PMF curve finally (Fig. S16).

Absorption spectra calculation of bTVBT4 at the binding site A
To calculate absorption spectra, 300 uncorrelated snapshots were chosen based on the percentage of time spent in each minor mode and the major mode's Boltzmann percentage.
Boltzmann percentage for major modes was calculated based on their respective binding energies obtained from PMF. Table S12 shows the number of snapshots chosen for each minor mode.
For each snapshot, absorption spectra were calculated by employing the polarizable embedding method. The total system was split into two regions: core and polarizable region.
The core region having only a bTVBT4 molecule is treated in the quantum chemical description. The polar region consists of water molecules, counter anions (Cl − ), and amino acid residues within 20Å of bTVBT4. For water, Ahlström charges and isotropic polarizabilities were used. Standard charge and isotropic polarizability for Cl − atoms were taken from the library in the PyFrame module. 40 For the amino acid residues of tau protein, the average of charges and isotropic polarizabilities over 10 tau protofilament is used. The charges and isotropic polarizabilities were determined using the LoProp program. 41,42 The absorption calculations were carried out for the ten lowest excited states at the CAM-B3LYP(100%) and aug-cc-pVDZ basis set using the Dalton program. Subsequently, for each snapshot, the calculated absorption spectrum was Gaussian convoluted with standard deviation of 0.15 eV. The final spectral profile was obtained by taking an average over all the snapshots under consideration. S29

Planarity calculation
The planarity parameter used for the assessment of the molecular structure of bTVBT4 is adopted from Ref. 43

and reads
where θ i are the dihedral angles in the molecule, and N is the number of dihedral angles for which the planarity is calculated. For the bTVBT4 case, the planarity was calculated over four dihedrals (marked in Fig. S1 (a)). Fig. S17 shows the correlation between planarity and Transition energy (eV) bTVBT4 in water bTVBT4 in Tau Figure S17: Correlation between planarity (defined in Eq. 2) vs. transition energy for the first excited state of bTVBT4. In total, 200 points are shown for bTVBT4 in water (orange), and 300 points shown for bTVBT4 in the tau fibril environment of binding site A (green) , corresponding to the snapshots used in spectrum calculation. The average value of planarity for bTVBT4 in water is 3.43 ± 0.20 and in tau is 3.49 ± 0.18.

Qualitative picture of interactions between bTVBT4 and siteA
The intramolecular and intermolecular interactions are accounted within the Columbic term (electrostatic interactions) and Lennard-Jones function (van der Waals interactions) in the forcefield. Fig. S18 shows RESP charges mapped on site A and bTVBT4 atoms. The strongest Columbic interaction is observed between the imidazole ring of His362 (having negative charge cloud) and the bTVBT4.  Data were analyzed using SPCImage version 3.9.4. Emission spectra from bTVBT4 bound to tau pathology in AD were recorded between 561 nm to 687 nm using an inverted LSM S32 780 confocal microscope (Carl Zeiss, Oberkochen, Germany) and excitation spectra were collected with the same microscope system using a tunable In Tune laser and the excitation wavelength was scanned between 490 to 600 nm having the emission fixed at 612 nm.
When bound to tau aggregates in AD, bTVBT4 displayed strikingly longer decay times, 1.7 to 2.4 ns, than for the ligand in different solvents, verifying that bTVBT4 adopts a distinct conformation when bound to the aggregates. Figure S19: Intensity-weighted mean lifetime distributions of bTVBT4-stained tau deposits in AD brain tissue sections. The fluorescence lifetimes were collected with excitation at 535 nm.

Spectroscopy
Steady-state absorption spectra were recorded using a Shimadzu UV-1601PC spectrophotometer. Measurements were performed with 10 mm quartz cuvettes (Hellma Precision).
Steady-state photoluminescence measurements were carried out employing a PTI Quanta- Data acquisition and basic data-handling of steady state luminescence data were carried out with the Felix Data Analysis software and further processed and presented using Origin Pro.
Time-correlated single photon counting (TC-SPC) was used to register and analyze decay S33 traces of bTVBT4 in a selection of solvents using an IBH system as described recently in Ref. 44. In the experiments a picosecond laser diode at 469 nm was used for excitation and the emission was recorded at 600 nm with a 16 nm slit. The blue shift of the absorption spectra of bTVBT4 in more polar solvent might be due to a negative solvatochromic effect associated with an excited state that is less polar than the ground state. In such a situation the ground state becomes more effectively relaxed in the reaction field of the solvent and the ground-to-excited state energy difference thus becomes larger upon introducing more polar solvent. Interestingly, there is almost no, or at least very small, differences in the emission spectra with respect to solvent polarity, corroborating the notion that the excited state is less polar.
Attempted lifetime measurements of bTVBT4 at a concentration of 2.5 µM in selected solvents showed very rapid decays, at the limit of the resolution of our TC-SPC system.
Typically, for polar protic solvents such as PBS, methanol, and ethanol, the lifetimes were in the order of 10-20 ps (±10 ps) and for non-protic solvents such as acetonitrile, CHCl 3 , and DMF, around 30 ps (±5 ps). Two representative decay traces are shown in Fig. S21.  Table S13. Figure S21: Representative TC-SPC decays of bTVBT4 in EtOH and DMF (2.5 mM) along with attempted re-convolution fits (dashed lines). "Prompt" denotes the system response acquired from the scatter of the pure solvent at the excitation wavelength.