Linker-Dependent Folding Rationalizes PROTAC Cell Permeability

Proteolysis-targeting chimeras (PROTACs) must be cell permeable to reach their target proteins. This is challenging as the bivalent structure of PROTACs puts them in chemical space at, or beyond, the outer limits of oral druggable space. We used NMR spectroscopy and molecular dynamics (MD) simulations independently to gain insights into the origin of the differences in cell permeability displayed by three flexible cereblon PROTACs having closely related structures. Both methods revealed that the propensity of the PROTACs to adopt folded conformations with a low solvent-accessible 3D polar surface area in an apolar environment is correlated to high cell permeability. The chemical nature and the flexibility of the linker were essential for the PROTACs to populate folded conformations stabilized by intramolecular hydrogen bonds, π–π interactions, and van der Waals interactions. We conclude that MD simulations may be used for the prospective ranking of cell permeability in the design of cereblon PROTACs.


1H NMR chemical shift assignment
The assignments of PROTACs 1-3 were derived from 1H, 13C, TOCSY, NOESY, HSQC and HMBC NMR spectra (Figures S11-S29), recorded at -35 °C on an 800 MHz Bruker Avance III HD NMR spectrometer equipped with a TCI cryogenic probe, using CDCl3 as the solvent. 1H NMR chemical shifts are listed in Tables S2-S4.
Figure S7: Structure and numbering of 2
Figure S8: Structure and numbering of 3

Monte Carlo molecular mechanics (MCMM) conformational search
The

Identification and characterization of solution ensembles using the NAMFIS algorithm
Solution ensembles were determined by fitting the experimentally measured distances and coupling constants to those back-calculated from computationally predicted conformations using the NAMFIS algorithm. Tables S11-S13.
Noncovalent interaction analysis (NCI) plots were generated for the solution ensembles of PROTACs 1 and 2 as reported. 4 NCI isosurfaces were obtained from a cube of electron density values as implemented in the open-source program Jmol. 5 The isosurfaces in Figures S11 and S12 were accessed using Rzepa's web implementation. 6
Figure S11. NCI plot analysis of PROTAC S,S-1.
Figure S12. NCI plot analysis of PROTAC S,S-2.
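The fitting idea behind the ensemble determination can be illustrated with a minimal sketch. It is not the NAMFIS implementation: it assumes r^-6 population averaging for the back-calculated distances (the usual convention for NOE-derived distances, not stated in this section), ignores coupling constants, and handles only two conformers by a simple grid scan. All function names and numerical values are hypothetical.

```python
def ensemble_distance(populations, distances):
    """Effective distance for one proton pair, assuming r^-6 population averaging."""
    if abs(sum(populations) - 1.0) > 1e-6:
        raise ValueError("populations must sum to 1")
    averaged = sum(p * r ** -6 for p, r in zip(populations, distances))
    return averaged ** (-1.0 / 6.0)


def sum_squared_error(populations, conformer_distances, experimental):
    """Misfit between back-calculated and experimental distances.

    conformer_distances[k][i] is the distance of proton pair k in conformer i.
    """
    return sum(
        (ensemble_distance(populations, row) - exp) ** 2
        for row, exp in zip(conformer_distances, experimental)
    )


def fit_two_state(conformer_distances, experimental, steps=100):
    """Grid scan over the population of conformer 1 in a two-conformer ensemble."""
    best_k = min(
        range(steps + 1),
        key=lambda k: sum_squared_error(
            [k / steps, 1.0 - k / steps], conformer_distances, experimental
        ),
    )
    return best_k / steps


# Hypothetical distances (in angstrom) for two proton pairs in two conformers.
conformer_distances = [[2.0, 4.0], [3.5, 2.5]]
# Synthetic "experimental" distances generated from a 30:70 ensemble.
experimental = [ensemble_distance([0.3, 0.7], row) for row in conformer_distances]
population_1 = fit_two_state(conformer_distances, experimental)
```

Because of the r^-6 weighting, short distances dominate the ensemble average, so a small population of a folded conformer can account for a strong NOE.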

Akaike information criterion (AIC) analyses
In addition to the NAMFIS analysis, the two isomers were compared by performing Akaike information criterion (AIC) analyses. 7 AIC is a statistical method used to compare how well different models describe reality. A summary of the AIC results for the isomers of PROTACs 1 and 2 is given below (Table S14). The best model has the lowest AIC value, an ERi of 1 and an LERi of 0. The LERi difference is interpreted as 'weak' for 0−0.5, 'substantial' for 0.5−1, 'strong' for 1−2 and 'decisive' for >2.
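The comparison can be sketched as follows. The section does not define ERi or LERi, so this sketch assumes the common convention in which the evidence ratio is ERi = exp(Δi/2), with Δi the AIC difference to the best model, and LERi = log10(ERi). This reproduces ERi = 1 and LERi = 0 for the best model. The AIC values, model names and the treatment of values lying exactly on a category boundary are hypothetical.

```python
import math


def evidence_ratios(aic_by_model):
    """Return {model: (delta_aic, er, ler)} relative to the lowest-AIC model.

    Assumes ER = exp(delta/2) and LER = log10(ER); not taken from the source.
    """
    best = min(aic_by_model.values())
    results = {}
    for name, aic in aic_by_model.items():
        delta = aic - best
        er = math.exp(delta / 2.0)
        results[name] = (delta, er, math.log10(er))
    return results


def interpret_ler(ler):
    """Map an LER difference onto the categories quoted in the text."""
    if ler > 2:
        return "decisive"
    if ler > 1:
        return "strong"
    if ler > 0.5:
        return "substantial"
    return "weak"


# Hypothetical AIC values for two isomer models.
results = evidence_ratios({"isomer_A": 100.0, "isomer_B": 104.6})
label_b = interpret_ler(results["isomer_B"][2])
```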

Variable temperature NMR studies
The variable temperature NMR studies for PROTACs 1-3 were conducted by recording 1H NMR spectra at different temperatures on a 500 MHz Bruker Avance III HD NMR spectrometer equipped with a Z150347_0001 (CP TXO 500S2 C/ N-H-D-05 Z) probe. 1H spectra were recorded in two different solvents (DMSO-d6 and CDCl3) with a relaxation delay of 0.7 s, 16 scans and 32768 points in the direct dimension.
The amide temperature coefficients (ΔδNH/ΔT, ppb K⁻¹) were obtained from (δT,high − δT,low)/(Thigh − Tlow). A ΔδNH/ΔT value < 3 indicates a strong intramolecular hydrogen bond, a value between 3 and 5 indicates that the amide proton is in equilibrium between a solvent-exposed state and an intramolecularly hydrogen-bonded state, and a value > 5 indicates that the amide proton is solvent exposed. The results are summarized in Tables S17-S19. 8
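As a worked illustration of the formula and thresholds above, the following minimal sketch computes a coefficient from two chemical shifts and classifies it. It assumes that shifts are given in ppm, temperatures in K, and that the thresholds apply to the magnitude of the coefficient (amide shifts typically move upfield on heating, giving negative values). The function names and example shifts are hypothetical.

```python
def temperature_coefficient(delta_low_ppm, delta_high_ppm, t_low_k, t_high_k):
    """Return the temperature coefficient in ppb/K from shifts at two temperatures."""
    if t_high_k == t_low_k:
        raise ValueError("temperatures must differ")
    # Factor 1000 converts ppm to ppb.
    return (delta_high_ppm - delta_low_ppm) * 1000.0 / (t_high_k - t_low_k)


def classify_amide(coefficient_ppb_per_k):
    """Classify an amide proton using the thresholds quoted in the text.

    Assumption: the thresholds are applied to the absolute value.
    """
    magnitude = abs(coefficient_ppb_per_k)
    if magnitude < 3:
        return "intramolecular hydrogen bond"
    if magnitude <= 5:
        return "equilibrium"
    return "solvent exposed"


# Hypothetical amide shifts: 8.00 ppm at 278 K and 7.84 ppm at 318 K.
coefficient = temperature_coefficient(8.00, 7.84, 278.0, 318.0)
label = classify_amide(coefficient)
```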

Conformation-dependent molecular property space
Figure S36. Radius of gyration (Rgyr) time series for PROTACs 1-3. Averaged (black lines) over the three independent replicates; gray shading represents standard deviations.

Conformation subset selection
To further investigate the folding nature of the conformations in each of the five clusters, a subset of 26 conformations from each cluster was chosen using the Diverse Subset tool from the Molecular Operating Environment suite. 9 Principal components (PC1 and PC2) from a PCA were chosen as optional descriptors during the subset selection. Each conformation was manually analysed and classified into one of the following categories: folded, semi-folded or linear (Figure S43).
Figure S43. Schematic illustration of the three conformer classes.
Table S22. Summary of conformation classification in the five clusters of PROTACs 1-5.
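The idea of picking a diverse subset in PC1/PC2 space can be sketched with a greedy max-min (farthest-point) selection. This is a stand-in technique chosen for illustration; the algorithm used by the Diverse Subset tool is not described in this section and may differ. The function name, starting index and coordinates are hypothetical.

```python
import math


def maxmin_subset(points, n_select, start=0):
    """Greedy max-min selection: repeatedly add the point farthest from the chosen set.

    points is a list of (PC1, PC2) tuples; returns indices in selection order.
    """
    selected = [start]
    chosen = {start}
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < min(n_select, len(points)):
        candidates = [i for i in range(len(points)) if i not in chosen]
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        chosen.add(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected


# Hypothetical PC1/PC2 coordinates for four conformations.
coords = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.0, 5.0)]
picked = maxmin_subset(coords, 3)
```

In this toy example the near-duplicate at (0.1, 0.0) is skipped in favour of the two distant points, which is the behaviour wanted when a small subset should span the conformational space of a cluster.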