Discovery of High-Affinity Amyloid Ligands Using a Ligand-Based Virtual Screening Pipeline

Fibrillar protein aggregates are characteristic of neurodegenerative diseases but represent difficult targets for ligand design, because limited structural information about the binding sites is available. Ligand-based virtual screening has been used to develop a computational method for the selection of new ligands for Aβ(1–42) fibrils, and five new ligands have been experimentally confirmed as nanomolar affinity binders. A database of ligands for Aβ(1–42) fibrils was assembled from the literature and used to train models for the prediction of dissociation constants based on chemical structure. The virtual screening pipeline consists of three steps: a molecular property filter based on charge, molecular weight, and logP; a machine learning model based on simple chemical descriptors; and machine learning models that use field points as a 3D description of shape and surface properties in the Forge software. The three-step pipeline was used to virtually screen 698 million compounds from the ZINC15 database. From the top 100 compounds with the highest predicted affinities, 46 compounds were experimentally investigated by using a thioflavin T fluorescence displacement assay. Five new Aβ(1–42) ligands with dissociation constants in the range 20–600 nM and novel structures were identified, demonstrating the power of this ligand-based approach for discovering new structurally unique, high-affinity amyloid ligands. The experimental hit rate using this virtual screening approach was 10.9%.

For ligands where multiple binding constants were reported, the average of these measurements were used. The log of all binding constants in M -1 were taken and the log(Kd/M) values recorded. A binding class was assigned to each ligand based on this log(Kd/M) value; class 0 for log(Kd/M) ≤ -8, class 1 for -8 < log(Kd/M) ≤ -7, class 2 for -7 < log(Kd/M) ≤ -6, and class 3 for -6 < log(Kd/M). Figure S1. Aβ  ligands which target the FBH binding site.

3D Conformation Hunt
Conformational hunts for all ligands in the FBH dataset were performed using the "Accurate but Slow" calculation method in Forge. For each ligand, 200 conformations were generated with a gradient cutoff of 0.100 kcal/mol/A for conformer minimisation. Duplicate conformers were filtered at an RMS of 0.50 A, and an energy window of 3.00 kcal/mol were used. No coulombic or attractive vdW forces were considered. Field points were then generated by Forge using the XED (eXtended Electron Distribution) molecular mechanics forcefield. Positive electrostatic, negative electrostatic, van der Waals, and hydrophobic field points were calculated. The XED force field uses several monopoles to generate a multipole description of the underlying force fields.
All classification models were scored using the balanced accuracy, which is defined as the average of recall for each class. For a given class, the recall is defined in Equation S3 as where TP is the number of true positives, and FN is the number of false negatives. For a multiclassification problem, the balanced accuracy is defined in Equation S4 as Eq.S4 where Recalli is the recall for the i th class out of a total n classes.

Nested Cross-Validation
Nested cross-validation (CV) was used for hyperparameter optimisation and model evaluation. Both the outer and inner CV procedures used random k-fold cross validation (k = 5).
The outer CV loop uses a random k-fold cross validation (k = 5) procedure. Within each loop the inner CV loop is called using the outer CV training set and model hyperparameters are optimised. The outer test set is then used to evaluate model accuracy. Final model accuracy is the average model accuracy over the k folds.
The inner CV loop uses a k-fold cross validation (k = 5) procedure to evaluate model accuracy while optimising hyperparameters using Optuna. 7 Optuna suggests model hyperparameter values from a user-defined window which are then used to fit the inner CV training set. The inner CV test set is then used to evaluate model accuracy. Final model accuracy is the average model accuracy over the k folds. This procedure was repeated 200 times using new hyperparameter values suggested by Optuna with each iteration. The hyperparameters which produced the model with the highest accuracy were then passed to the outer CV loop for model evaluation.

Fitting the Final Model
To develop final models, a support vector machine regressor (rbf kernel) and random forest regressor were fitted by passing the entire dataset into the inner cross-validation procedure. Figure S2. Model development using Forge. 1) High-affinity reference ligands were selected with representative structures. 2) A diverse set of conformations were generated for each reference ligand. Conformations of each ligand were aligned to one another based on similarity in their calculated field points and shape. Blue field points describe regions of negative electrostatic potential, red field points describe regions of positive electrostatic potential, orange field points describe regions with high hydrophobicity, and yellow field points describe van der Waals interactions. 3) Conformations of all other ligands in the dataset were generated and aligned to the reference template. 4) Quantitative structure activity relationship (QSAR) models were developed from the relationship between the field points and the experimentally measured log(Kd/M) for each ligand.

Reference Template Generation
The FieldTemplater module of Forge (Cresset Inc., UK) was used to generate reference templates for ligand alignment. The ligands 5, 9, 10, and 11 were chosen as reference molecules due to (a) their low nanomolar binding affinities, and (b) their structural core being representative of a large part of the known dataset ( Figure S3). A conformational hunt of these ligands was performed as above, combined with an alignment step that was scored by a 50:50 consideration of shape similarity and fieldpoint alignment between ligands. Templates were generated with a minimum of three molecules, a maximum number of 100 comparisons between pairs, a maximum score delta of 0.10 per pair, a minimum link density of 0.80, and duplicate templates were filtered at an RMS of 0.7 A. Two templates were generated using this procedure: Template 1 was comprised of ligands 5, 9, and 10 in their lowest energy conformation with a field similarity score of 0.710, a shape similarity score of 0.806, and an overall similarity score of 0.758. Template 2 was comprised of ligands 5, 9, 10, and 11, with all ligands in their lowest energy conformation aside from ligand 11 which occupied it's 17 th -lowest energy conformation. This template had a field similarity score of 0.626, a shape similarity score of 0.749, and an overall similarity score of 0.687. Figure S3. Structures of the reference ligands used for Forge model development.

Compound Alignment and Partitioning
Compound alignment to each reference template was performed using a "Normal" calculation method in Forge. A maximum scoring method was used to align ligands to the templates containing multiple reference compounds. Alignment was scored by a 50/50 weighting of shape similarity and field point alignment. Once alignments were generated they were manually verified. Ligands that failed to align were discarded from the dataset. The remaining ligands were then partitioned into training and test set compounds in approximately a 4:1 ratio (Template 1: 169 training and 43 test set compounds; Template 2: 176 training and 46 test set compounds). This partitioning was activitystratified and performed manually to ensure each structural class was represented in both datasets.

Model Development
Five models were created for each template.

Field QSAR Model
Field quantitative structure-activity relationship (QSAR) models were created using a "Normal" calculation method in Forge. A maximum of 20 model components were allowed, with a sample point minimum distance of 1.0 A and 50 Y scrambles. Leave-one-out CV was used for model evaluation.

Relevance Vector Machine (RVM)
RVM models were built with a sample point minimum distance of 1.0 A. For cross-validation, 5 folds were used in k-fold CV to calculate a cross-validated coefficient of determination (q 2 ). A maximum of 50 optimiser iterations were allowed with a time limit of 60 min for global optimisation, and a gamma between 10 -5 and 10 -1 allowed. Both electrostatic and volume fields were used for model development.

Feature Importance
The feature importance of each descriptor included in the final random forest model was computed ( Figure S4). Figure S4. The feature importance of each descriptor included in the random forest regressor model. Figure S5. Variance in the model coefficients of Field QSAR models for each template.

ZINC15 Database and Molecular Property Filter
Tranches of compounds with 3D conformations were downloaded from the ZINC15 database corresponding to molecules that (a) were neutrally charged, (b) had a molecular weight between 200 and 500 Da, and (c) had clogP values between 3.5 and 5.5. 8 These tranches corresponded to 63 million ligands. Chemical descriptors were calculated for these compounds as described above.

Predicting Binding Affinities
The log(Kpred/M) of the 63 million ligands from the ZINC15 database were predicted using the final support vector machine regressor (rbf kernel) model. The top 10,000 ligands were then screened using Forge 3D models.
Four 3D models developed using Forge were used: the Field QSAR and RF models from Template 1, and the Field QSAR and SVM models from Template 2. Each of these models were used to predict the binding affinity of the 10,000 ligands passed from the chemical descriptor model. These 3D models predicted a broader range of -log(Kd/M) values compared to the SVM Chemical Descriptor model ( Figure S6). The Field QSAR models predict a wider distribution of -log(Kd/M) compared to the RF and SVM models. The Field QSAR models calculated a "Distance to the model" score. This score is a measure of how well the field points of the screened ligand are represented in the trained model. If the screened ligand has many field points that were not found in the data used to create the model, then the model will be unreliable for predicting ligand activity. Only ligands with a "Good" or "Excellent" distance to model score were therefore considered further.
The predicted affinity for the screened ligands were averaged between the two molecules for each template (ie. The predicted ligand affinity from the Template 1 Field QSAR and Template 1 RF models were averaged, and the predicted ligand affinity from the Template 2 Field QSAR and Template 2 SVM models were averaged). The two resultant rankings were then score-fused, and the top 25 predicted ligands from each template were purchased from Enamine (Ukraine) for screening. Of these ligands, 46 were successfully synthesised by Enamine (Ukraine) ( Figure S7). Figure S7. The 46 compounds purchased for experimental screening.

Calculating Similarity
RDKit fingerprints were calculated for E163, E197, E363, E570, and E704, and the FBH ligand database. Tanimoto and Dice similarity coefficients were then calculated between each of E163, E197, E363, E570, and E704, and the FBH ligand database. [9][10][11][12] Plots of the calculated similarity coefficients are shown in Figure S8. The most similar pairs are shown in Figure S9, with calculated similarity coefficients in Table S1. a) b) Figure S8. The calculated (a) Tanimoto similarity coefficients and (b) Dice similarity coefficients between E163, E197, E363, E570, and E704, and each ligand in the FBH database. RDKit fingerprints were used for similarity calculations. Figure S9. The chemical structures of the ligands in the FBH database with the highest Tanimoto and Dice similarity coefficients to each of the novel ligands discovered. The ligands with the highest Tanimoto coefficients also had the highest Dice coefficients.

Materials and Instrumentation
All solvents and chemicals were obtained from commercial sources and used without further purification unless otherwise stated. Protein LoBind (Eppendorf) microtubes were used for preparing and storing all solutions containing Aβ . Low retention pipette tips were used for all fluid handling. Ligands used for experimental screening were purchased from Enamine (Ukraine).

Aβ(1-42) Aggregation
Preparation and aggregation of Aβ  was performed based on standard literature procedures. 13,14 Monomeric Aβ(1-42) (1 mg: Rockland Immunochemicals, Limerick, PA, Lot #31541) was dissolved in 2 mM NaOH (1 mg/mL) on ice and gently agitated until the lyophilisate was wetted. The monomer was then left for 3 min to dissolve and a homogeneous appearance of the solution was achieved. The solution was then sonicated for 1 min, added to an Amicon Ultra-15 Centrifugal filter (30 kDa MWCO), and centrifuged (30 min, 4000 x g, 4 °C). The filter was washed with 2 mM NaOH (1 mL) and centrifuged (30 min, 4000 x g, 4 °C). The combined filtrates were kept on ice and gently rotated to homogenise the mixture. The absorbance at 280 nm was then measured (ε = 1440 M -1 cm -1 ) to determine the concentration of the solution (165 µM). The solution was then added to 2xPBS (2 mL, pH 7.1) to afford a final stock solution in 1xPBS (pH 7.4). This solution was then incubated at 37 °C for 72 h with agitation using a magnetic stir bar. Aliquots were than taken and stored at -78 °C until use.

Transmission Electron Microscopy
Nanoscale morphologies of fibril samples were observed by TEM using a Thermo Scientific (FEI Company) Talos F200X G2 microscope operating at 200 kV ( Figure S12). Images were recorded with a Ceta 4k x 4k CMOS camera. For sample preparation, TEM grids (continuous carbon film on 300 mesh Cu) were glow discharged using a Quorom Technologies GloQube at 25 mA for 60 s. A 2 µL sample of fibril in 1xPBS (1 µM Aβ ) was placed on a freshly glow-discharged grid, and after 1 min was carefully removed by blotting with filter paper. The sample was negatively stained using 2.0 µL of 2% (w/v) uranyl acetate solution in ethanol for 30 s. The grid was blotted and dried in air for 10 min at room temperature before use.

Circular Dichroism
CD spectra of Aβ  in 1xPBS (pH 7.4) were recorded with a Chirascan CD1 Spectrometer (Applied Photonics Ltd.) equipped with a Series 800 Temperature Controller (Alpha Omega Instruments) ( Figure  S13). Far-ultraviolet measurements (190 nm -250 nm) of Aβ(1-42) (1.0 μM) in 1xPBS (pH 7.4) were recorded at 25 °C with a 10 mm optical pathlength, a time-per-point of 1.0 s, and a wavelength step of 0.1 nm. CD spectra were averaged over six scans. Data were baseline corrected by subtracting the complete buffer spectrum of 1xPBS (pH 7.4) averaged over six scans. Applied Photophysics Pro-Data Chirascan software was then used to smooth the data using Savitsky-Golay smoothing and a window size of eight, then used to convert the data to molar ellipticity.  Fluorescence anisotropy experiments were performed using 10 nm excitation and emission slits, a scan rate of 120 nm/min, a data interval of 1.0 nm, an averaging time of 0.5 s, and medium PMT voltage at 25°C.
Two fluorescent coumarins, E265 and E736, were identified from VS. The coumarin E265 exhibited similar excitation and emission wavelengths to ThT, while the fluorescence spectra of E736 was more blue shifted (Table S1). Neither coumarin experienced a significant enhancement in fluorescence upon addition to Aβ(1-42) fibrils, nor a significant solvatochromic shift.

Ligand Dilution Series
Ligand dilution series were performed by repeating the above procedures in the absence of Aβ(1-42) fibrils. Figure S17: Dilution series of ThT into 1xPBS (pH 7.4, 25 °C). Spectra were recorded using λex = 440 nm and monitoring emission at λem = 483 nm. The experimental measurements are shown as points (error bars represent the 95% confidence interval calculated from at least three independent experiments).The line of best fit has a slope of ( . ± . ) × and a y-intercept of 0.       . Spectra were recorded using λex = 333 nm and monitoring emission at λem = 415 nm. Experimental measurements are shown as points (error bars represent the 95% confidence interval calculated from at least three independent experiments).

Data Fitting
Fluorescence spectra were analysed using a Microsoft Excel spreadsheet prepared by Professor Christopher Hunter. This spreadsheet fitted the measured fluorescence intensity at a fixed wavelength to a 1:1 binding isotherm using purpose-written VBA macros employing two algorithms, COGS and Simplex.