Identification and Detection of a Peptide Biomarker and Its Enantiomer by Nanopore

To date, no fast, low-cost, and direct technique exists to identify and detect protein/peptide enantiomers, because their mass and charge are identical. Such a technique is needed because L- and D-protein enantiomers have different biological activities due to their unique conformations. Enantiomers have diagnostic potential for several diseases and for monitoring normal bodily functions, but they have yet to be exploited. This work uses an aerolysin nanopore and electrical detection to identify the vasopressin enantiomers L-AVP and D-AVP, which are associated with different biological processes and pathologies. We show that they can be identified according to their conformations, in either native or reducing conditions, using their specific electrical signatures. To improve their identification, we used principal component analysis to define the most relevant electrical parameters. Finally, we used a Monte Carlo prediction to assign each event type to the L- or D-AVP enantiomer.


■ INTRODUCTION
L-Amino acid enantiomers are the predominant forms found in animal tissues.−4 Altered regulation or production of these isomers can have profound effects on organisms. For example, during the natural aging process, there is a decline in the level of D-serine. 5 Conversely, in pathological aging conditions, such as Alzheimer's disease, there is excessive activation of serine racemase, resulting in hyperstimulation of the NMDA receptor due to an excess of D-Ser. 5 Notably, D-Arg levels are decreased. 6 Additionally, several D-amino-acid-containing peptides (DAACPs) have been identified in various pathological conditions associated with diseases or natural aging processes, as evidenced by the presence of D-β-Asp-containing peptides in elastic fibers of sun-damaged skin, 7 D-Asp in αA-crystallin from the lenses of individuals with cataracts, 8 and the β-amyloid peptide of Alzheimer's patients. 9 These DAACPs are mainly formed by post-translational modification of L-amino acids, either by enzymatic racemization of L-amino acids into their D-form or spontaneously, albeit very slowly, within long-lived proteins and tissues. 1 It has been established that oxidative stress and free radicals can induce such modifications in proteins. 10,11 Notably, the alteration of a specific residue from the L- to the D-configuration often modifies the biological activity of the peptide, and such modified peptides could be developed as therapeutic molecules. 12 Modifications of peptide functional properties are likely due to a change in the higher-order structure of the protein. 13,14 The significant increase of DAACPs discovered in diseases has encouraged researchers to consider them as potential biomarkers, 1 and any new technique capable of detecting, quantifying, or finding new DAACPs, even at low concentrations, would represent a major advance in early diagnosis.
As mentioned above, there is a clear need to develop fast, low-cost, and direct techniques to identify chiral amino acids in a polypeptide chain without chemical or enzymatic pretreatment. Because of the difference between L- and D-amino acids, we expect the polypeptide chains to adopt different conformations.−17 To the best of our knowledge, the classical methods used for the detection of L- and D-amino acids, such as mass spectrometry, liquid chromatography, or enzymatic assays, do not allow the simultaneous detection of proteins, their conformation, and their chiral amino-acid composition. 14
The nanopore electrical detection technique offers a unique multiscale analytical tool for peptide and protein enantiomer biomarker detection.−40 Machine learning is now used to validate data ranging from the sequencing of individual amino acids 37 to the identification of biomolecules 41,42 and biomarkers. 43 While the first study on detecting individual chiral amino acids (Trp, Phe, Tyr, Cys, and Asp) using an engineered Cu2+-phenanthroline alpha-hemolysin channel was published in 2012 by Bayley's group, 18 only a few studies have been published since. Two other groups showed the ability to discriminate individual L- and D-amino acids using cyclodextrin adapters 44 inside a covalent organic framework (His) or an alpha-hemolysin mutant 45 (Phe, Trp, Tyr). Mubarak et al. developed a nanopipette system to detect single amino-acid enantiomers (Tyr, Trp, and Phe) using a polymeric conical nanopore functionalized with BSA. 46 They used current rectification to detect each kind of enantiomer. The first example of detecting chiral amino acids within a polypeptide biomarker was described by Luchian's group, who identified the presence of histidine enantiomers using Cu2+ chiral recognition. 47 A recent study used mutant OmpF to control, through its lateral electric field, the side-chain orientation of a β-amyloid peptide inside the nanopore. 48 The authors were able to discriminate between D-Ser and D-Asp isoforms and mutants. Finally, Maglia's group studied the detection of D-Ala and D-Leu in enkephalin peptides using FraC and CytK mutants. 49
Until now, no studies have been conducted to detect enantiomers within a biologically relevant peptide that has secondary and tertiary structural elements such as disulfide bonds. Furthermore, most previous studies needed an adapter or several pore mutations to detect chiral amino acids in the polypeptide sequence. To prove the ability to sense enantiomers containing secondary and tertiary structural elements under native conditions, we used vasopressin as a model peptide. Vasopressin (L-Arg-AVP, hereafter L-AVP), a hormone produced in mammals, is involved in the regulation of water balance and blood pressure and also influences social behavior, memory, and the cardiovascular system through its interaction with the V1a, V1b, and V2 receptors. 50 Vasopressin quantification is performed to assess hyper- or hyposecretion pathologies. In the case of diabetes insipidus, 51 quantifying vasopressin levels makes it possible to determine the cause of the disease, which can be AVP receptor insensitivity or a decrease in hormone production. Notably, the preferred technique for detecting L-AVP in biomedical analyses has shifted to HPLC-MS/MS, sidelining traditional immunological techniques, partly because of the peptide's low molecular weight. This choice, however, requires essential preliminary steps for extracting and purifying the peptide of interest. The vasopressin nonapeptide possesses two cysteine residues at positions 1 and 6, forming a constrained peptide loop structure via a disulfide bridge. NMR analysis has revealed saddle and open conformations within the cyclic part of the molecule, while the noncyclic part demonstrates greater flexibility. 50 The D-Arg-AVP (hereafter D-AVP) enantiomer is a synthetic derivative of L-AVP, primarily employed for research purposes. 52 Conversely, the N-terminally deaminated form of D-AVP, known as desmopressin, is used for therapeutic applications. Indeed, desmopressin selectively binds to a single receptor, V2, unlike the natural hormone L-AVP, limiting its physiological effects to water retention. Including D-Arg in the peptide chain makes it more resistant to proteolysis and extends the molecule's lifetime in the bloodstream. 53
This work shows that a wild-type (WT) aerolysin nanopore can discriminate, at the single-molecule level, enantiomeric peptides containing L-Arg8 or D-Arg8 in the native peptide chain. L- and D-AVP can also be identified in a mixture. Interestingly, we can detect multiple AVP conformations, such as the open and saddle states already observed by NMR. Furthermore, we show that in the presence of a reducing agent, several conformations observed in native conditions disappear, and the nanopore is still sensitive enough to identify both L- and D-forms in single and mixed experiments. By changing the relative ratio of each component in the mixture, we demonstrate the identification of each enantiomeric state according to its blockade level or volume. To identify each event population and attribute it to a conformation, we used a PCA approach to first determine the best electrical parameters for discrimination. We also developed a Monte Carlo prediction approach to identify each enantiomer conformation.

■ RESULTS AND DISCUSSION
Electrical Enantiomer Characterization in Native Conditions. L- and D-AVP are two analogous peptides of 9 amino acids differing by a single chiral amino acid, Arg8 (Figure 1a,b). A chiral object cannot be superposed on its mirror image. Therefore, the main difference between L- and D-AVP is the orientation of the side chain of Arg8 (Figure 1b). The two peptides form a disulfide bond between Cys1 and Cys6, creating a looped peptide structure (Figure 1a). In addition to the disulfide bond, L-AVP can possess one or two beta-turns stabilized by hydrogen bonds: 50 the first, within the loop created by the disulfide bond, creates the saddle conformation, and the other is formed across Pro7. L- and D-AVP structures are similar, 54 so we can hypothesize that D-AVP can also form these two beta-turns. Discriminating L- and D-AVP with a WT aerolysin nanopore would show that a slight conformational difference caused by a single chiral amino acid can be detected in structurally complex peptides. By reducing the disulfide bond, we aim to determine whether we can detect a conformational difference between the native and reduced states and to identify the best conditions for discriminating between L- and D-AVP.
Using L- and D-AVP as model peptides, we aim to determine whether a wild-type aerolysin nanopore can discriminate and identify a single amino acid enantiomer in both native and reducing conditions (Figure 1c). We performed nanopore experiments with L- and D-AVP, independently or in a mix, in the presence or absence of tris(2-carboxyethyl)phosphine (TCEP), a reducing agent. A WT aerolysin nanopore is inserted into a lipid bilayer separating two compartments (cis and trans) filled with an electrolyte solution (4 M KCl, 25 mM Tris, pH 7.5) (Figure 1c). A potential difference is applied using two electrodes, one in each compartment. Ions in the electrolyte solution flow through the pore, and we measure an open-pore ionic current (I_0) of 250 pA at 110 mV in our experimental conditions. In the presence of an analyte at the pore entry, we observe current blockades (I_b), characterized by a blockade level (ΔI_b = I_0 − I_b) and by a dwell time (T_t) (Figure 1d). In native conditions, we observe two types of events for both L- and D-AVP: Type I, with short dwell times of ∼440 and ∼390 μs, respectively, and a lower blockade current of ∼60 and ∼70 pA; and Type II, with longer dwell times of ∼820 μs for L-AVP and ∼1140 μs for D-AVP and a higher blockade current of ∼150 pA (Figure 1e) (Supporting Information S1, S2). In the presence of TCEP, we observe only one type of event for each peptide, characterized by long dwell times and an intermediate blockade current of ∼130 pA (D-AVP) and ∼120 pA (L-AVP) (Figure 1f). Interestingly, for these individual events, we can already observe a difference in the average blockade current between L- and D-AVP in either native or reducing conditions (Figure 1e and f, respectively).
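The blockade quantities defined in this paragraph reduce to simple arithmetic. The minimal Python sketch below computes the blockade amplitude and the normalized blockade fraction from the rounded currents quoted above, reading "blockade current" as the residual current I_b. The function names are illustrative assumptions and are not part of the original analysis code.

```python
def blockade_level(i0_pa: float, ib_pa: float) -> float:
    """Return the blockade amplitude Delta I_b = I_0 - I_b, in pA."""
    return i0_pa - ib_pa


def normalized_blockade(i0_pa: float, ib_pa: float) -> float:
    """Return the dimensionless blockade fraction Delta I_b / I_0."""
    if i0_pa <= 0:
        raise ValueError("open-pore current must be positive")
    return blockade_level(i0_pa, ib_pa) / i0_pa


if __name__ == "__main__":
    i0 = 250.0  # approximate open-pore current at 110 mV (pA), from the text
    # Approximate residual currents quoted in the text (pA)
    examples = {"Type I": 70.0, "Type II": 150.0, "D-AVP + TCEP": 130.0}
    for name, ib in examples.items():
        print(f"{name}: Delta I_b / I_0 = {normalized_blockade(i0, ib):.2f}")
```

With these rounded inputs, the fractions come out at 0.72, 0.40, and 0.48, close to the fitted normalized blockade levels reported for the corresponding populations.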
For experiments in native conditions, the characteristic parameters of dwell time and average blockade level for each event were extracted from the current traces and plotted as a bidimensional cloud (Figure 2a,b,d,e,g,h,j,k).−57 We observed the two main types of events, characterized by at least two defined normalized blockade levels: 0.43 ± 0.01 (Type IIa) and 0.72 ± 0.01 (Type I) for L-AVP, and 0.39 ± 0.01 (Type IIa) and 0.71 ± 0.01 (Type I) for D-AVP (Figure 2c,f) (Supporting Information S1). These results show that we can discriminate and identify L- and D-AVP in native conditions using the Type II events and their average current blockade. Unfortunately, the less frequent Type I events of L- and D-AVP cannot be discriminated. These events could be attributed to the entry of the cycle formed by the disulfide bond. Because of the volume of this cycle, we can suppose that its entry into the pore is entropically unfavorable. The same volume could explain why these events show the highest blockade level. Furthermore, we observe a discrete subgroup in the Type II population that blocks the pore slightly more (Figure 2b,c,e,f). For this population (Type IIb), we measure a blockade level of 0.48 ± 0.01 for L-AVP and 0.43 ± 0.01 for D-AVP (Supporting Information S1). Blockade levels depend on the volume of the chain interacting with or passing through the pore. 36,49,58,59 Several studies showed that L-AVP can adopt multiple conformations. NMR studies 50,60 showed that L-AVP can be in either a "saddle" or an "open" conformation with a ratio of 70:30, respectively (Supporting Information S5). The two populations with similar blockade levels (Types IIa and IIb) could be due to this conformational change, with the most probable population (Type IIa) being the saddle conformation and the less probable population (Type IIb) being the open conformation. These two conformations have different molecular volumes. 50 To better discriminate and identify each population, we perform a semisupervised classification later in the paper to assign types of events to specific conformations.
Electrical Enantiomer Characterization in Reducing Conditions. In experiments repeated in the presence of 5 mM TCEP, the bidimensional clouds for dwell time and blockade level display a single population of events with an average current blockade of 0.53 ± 0.01 for L-AVP and 0.49 ± 0.01 for D-AVP (Figure 2h,k; Supporting Information S1, S4). This single population in the presence of a reducing agent shows that a change or loss of conformation has indeed occurred. We confirmed that this single population is not due to TCEP affecting the detection of the peptides (Supporting Information S6). By comparison with the previous experiments in native conditions, we can attribute Type I and Type II events to the disulfide bond. In the presence of the reducing agent, we observe that for both peptides the current blockade for Type II events increases, from 0.39 ± 0.01 for D-AVP to 0.49 ± 0.01 for D-AVP+TCEP and from 0.43 ± 0.01 for L-AVP to 0.53 ± 0.01 for L-AVP+TCEP (Figure 2i,j) (Supporting Information S1). The reduction of the disulfide bond releases the constraint on the peptide conformation, conferring more flexibility. To confirm these results, we analyzed and compared the event frequency in native and reducing conditions (Supporting Information S7). At similar peptide concentrations and the same applied voltage (110 mV), the event frequency increased drastically after treatment with TCEP: from 19.1 ± 1.0 Hz in native conditions to 153.3 ± 8.5 Hz in reducing conditions for L-AVP, and from 28.1 ± 0.9 Hz to 107.2 ± 3.6 Hz for D-AVP. This result indicates a reduced energy barrier for the entry of TCEP-treated peptides into the nanopore. The reduction of the disulfide bond allows the peptide chain to adopt different conformations, which makes the peptide more flexible. This interpretation is consistent with the increase in blockade level between the native and the reduced peptides (Figure 2c,f,i,l).
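The event frequencies above were obtained in this work by fitting inter-event time distributions (Supporting Information S7). As a simplified illustration only, and assuming that events arrive as a Poisson process, the rate can also be estimated as the reciprocal of the mean inter-event time. The sketch below applies this swapped-in estimator to simulated arrival times; the function names and the simulated data are assumptions, not the authors' pipeline.

```python
import random
from statistics import fmean


def event_frequency_hz(event_times_s):
    """Estimate event frequency (Hz) as 1 / mean inter-event time.

    This is the maximum-likelihood rate for exponentially distributed
    inter-event times; it stands in for the histogram fit used in the paper.
    """
    times = sorted(event_times_s)
    if len(times) < 2:
        raise ValueError("need at least two events")
    intervals = [t1 - t0 for t0, t1 in zip(times, times[1:])]
    return 1.0 / fmean(intervals)


def simulate_event_times(rate_hz, n_events, seed=0):
    """Simulate Poisson event start times (s) for a given rate (hypothetical data)."""
    rng = random.Random(seed)
    t, times = 0.0, []
    for _ in range(n_events):
        t += rng.expovariate(rate_hz)
        times.append(t)
    return times


if __name__ == "__main__":
    # Rates taken from the text for L-AVP at 110 mV; the traces are simulated.
    native = simulate_event_times(19.1, 5000, seed=1)
    reduced = simulate_event_times(153.3, 5000, seed=2)
    f_native = event_frequency_hz(native)
    f_reduced = event_frequency_hz(reduced)
    print(f"native: {f_native:.1f} Hz, reduced: {f_reduced:.1f} Hz, "
          f"fold change: {f_reduced / f_native:.1f}")
```

For reference, the reported frequencies correspond to roughly an 8-fold increase for L-AVP (153.3/19.1) and a 3.8-fold increase for D-AVP (107.2/28.1) upon reduction.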
Enantiomer Discrimination in a Mix. Since we observed a significant difference in blockade level between L- and D-AVP in native and reducing conditions, we performed equimolar mixes in both conditions to determine (1) whether we can discriminate them in a mix and (2) which experimental conditions (native or reducing conditions, applied voltage) are best for their discrimination.
We compared the superposition of the histograms of each independent experiment and the mixes (Figure 3a,b,g,h). In the absence of TCEP, we focused on the Type II events that allowed discrimination between the peptides. We varied the applied voltage (Supporting Information S8) and salt concentration (Supporting Information S9) to determine the best experimental conditions, with data at 50 and 110 mV in 4 M KCl displayed in Figure 3. At 50 mV, the blockade level histograms of L- and D-AVP in native or reducing conditions overlap (Figure 3a,c,e,g,i). Indeed, in a mix, we could not discriminate between the peptide populations at this voltage. By increasing the voltage to 110 mV, the blockade levels for each peptide were better resolved (Figure 3b,d,f,h,j).
We measured the most probable mean blockade level for both mixes (Supporting Information S10). In native conditions, blockade levels of 0.39 ± 0.01 and 0.43 ± 0.01 were measured, while with TCEP they were found to be 0.49 ± 0.01 and 0.53 ± 0.01 (Figure 3d,j) (Supporting Information S11), confirming that we can discriminate L- and D-AVP in an equimolar mix in both conditions. To verify that each observed population can be attributed to L- or D-AVP, we changed the concentration ratio to 75% L-AVP and 25% D-AVP (Figure 3e,f). We observed a decrease in the number of events for the population attributed to D-AVP compared to L-AVP. To further confirm this, we changed the ratio to 25% L-AVP and 75% D-AVP and showed the same ability to discriminate between the peptides (Supporting Information S12). Here, we demonstrated that L- and D-AVP, two peptides differing by only one chiral amino acid, can be discriminated and identified in a mix, in either their native or reduced state, using a WT aerolysin nanopore at 110 mV.
Identification by Principal Component Analysis and Monte Carlo Approaches. To accurately identify each enantiomeric peptide and its different conformations, we performed a principal component analysis (PCA). In the experimental data analysis above, we considered only the blockade level ΔI_b to characterize each blockade.−43 We can also consider four other parameters defined in Figure 4a: the maximum and minimum blockades (ΔI_bmax and ΔI_bmin, respectively), the duration (T_t), and the standard deviation (σ). We removed the blockades characterized by a standard deviation smaller than 1 pA, which were due to bumping, or by a blockade level smaller than 0.2 (see above). These data are plotted in Figure 4b for the results obtained with 10 μM D-AVP and show two main clusters: a small one characterized by a high blockade level of around 0.7 and a large one around 0.4.
To take the five parameters into account together, it is necessary to perform a PCA to reduce the number of relevant parameters. We then consider the two most relevant parameters: the first and second principal components (PC1 and PC2, respectively; Figure 4e). We observe two clusters similar to those in Figure 4b: a small cluster and a large one. We are particularly interested in the second principal component (PC2) and its Gaussian distribution for the larger population, which is similar to that of the blockade levels (Figure 2c,f). The two peptides have very similar structures, so the corresponding distributions are very close. These two distributions overlap, making classical clustering analyses, which rely on well-separated domains, impossible (Supporting Information S13).
To overcome this difficulty, we considered the results obtained from experiments with individual peptides. As a first approximation, each distribution is fitted by a Gaussian centered at μ_D (respectively μ_L) with a standard deviation of σ_D (respectively σ_L) for the D-AVP (respectively L-AVP) peptides. Blockades belonging to the domain [μ − 3σ, μ + 3σ] are attributed to the corresponding peptide (D-AVP or L-AVP), allowing the classification of 99% of blockades (Figure 4e,f and Supporting Information S13). Fitting the histogram of the main PC2 population for D-AVP in this way gives μ_D = −0.039 and σ_D = 0.043 (Figure 4e). The classification is performed by taking into account all of the blockades in the range [μ_D − 3σ_D, μ_D + 3σ_D]. These data are colored in orange. The remaining data are not labeled and are shown in cyan (Figure 4f).
We followed the same approach with the L-AVP experimental data using the correlation matrix previously calculated from the D-AVP data. We observe a similar behavior: a PC2 distribution fitted by a Gaussian function as before, where μ_L = 0.047 and σ_L = 0.035 are the fitted values for L-AVP. All the blockades in the range [μ_L − 3σ_L, μ_L + 3σ_L] are attributed to L-AVP peptides (in green). The other blockades are unassigned (in cyan) (Supporting Information S13).
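The two ±3σ bands can be written out explicitly from the fitted values quoted above. The short Python sketch below computes both bands and their intersection; the `Band` type and the helper names are illustrative assumptions.

```python
from typing import NamedTuple, Optional


class Band(NamedTuple):
    low: float
    high: float


def three_sigma_band(mu: float, sigma: float, k: float = 3.0) -> Band:
    """Return the interval [mu - k*sigma, mu + k*sigma]."""
    return Band(mu - k * sigma, mu + k * sigma)


def overlap(a: Band, b: Band) -> Optional[Band]:
    """Return the intersection of two bands, or None if they are disjoint."""
    low, high = max(a.low, b.low), min(a.high, b.high)
    return Band(low, high) if low < high else None


def in_band(x: float, band: Band) -> bool:
    """True if a PC2 value falls inside the band (inclusive)."""
    return band.low <= x <= band.high


if __name__ == "__main__":
    band_d = three_sigma_band(-0.039, 0.043)  # D-AVP fit from the text
    band_l = three_sigma_band(0.047, 0.035)   # L-AVP fit from the text
    print("D-AVP band:", band_d)
    print("L-AVP band:", band_l)
    print("overlap:   ", overlap(band_d, band_l))
```

The D-AVP band spans about [−0.168, 0.090] and the L-AVP band about [−0.058, 0.152], so they share the interval [−0.058, 0.090] on PC2. Band membership alone therefore cannot separate the two peptides in a mixture.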
We used these criteria to discriminate between D-AVP and L-AVP in a mixture. First, we combined the data from the D-AVP and L-AVP current traces, labeled as described above, to define the training data (Figure 5a). The PCA is computed using the correlation matrix previously calculated from the D-AVP data. The PC2 distribution shows the two Gaussian peaks previously observed for each peptide (Figure 5b). Because the two distributions overlap, a logistic regression approach to defining clusters is not suitable (Supporting Information S14). We therefore followed a Monte Carlo approach to discriminate the two peptides and label them (Figure 5c), and then compared our prediction with the original labeling of D-AVP and L-AVP by calculating a confusion matrix (Figure 5e). The success rates are 71% for D-AVP and 75% for L-AVP recognition. The false-positive rates are similar for both peptides (around 22%) because of the overlap between the two distributions.
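The text does not spell out the Monte Carlo procedure, so the sketch below is only one plausible reading and should not be taken as the authors' implementation. It assumes that each blockade falling inside at least one ±3σ band is assigned at random to D-AVP or L-AVP, with probabilities proportional to the two fitted Gaussian densities at its PC2 value, and that blockades outside both bands stay unlabeled. The label codes, function names, and synthetic PC2 values are all assumptions.

```python
import math
import random

MU_D, SIGMA_D = -0.039, 0.043  # fitted PC2 parameters for D-AVP (from the text)
MU_L, SIGMA_L = 0.047, 0.035   # fitted PC2 parameters for L-AVP (from the text)
D_AVP, L_AVP, UNLABELED = 1, 2, 3  # illustrative label codes


def gaussian_pdf(x, mu, sigma):
    """Normal probability density at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def monte_carlo_label(pc2, rng):
    """Randomly assign one blockade to D-AVP or L-AVP, or leave it unlabeled."""
    in_d = abs(pc2 - MU_D) <= 3 * SIGMA_D
    in_l = abs(pc2 - MU_L) <= 3 * SIGMA_L
    if not (in_d or in_l):
        return UNLABELED
    # A peptide can only be drawn if the value lies inside its own band.
    w_d = gaussian_pdf(pc2, MU_D, SIGMA_D) if in_d else 0.0
    w_l = gaussian_pdf(pc2, MU_L, SIGMA_L) if in_l else 0.0
    return D_AVP if rng.random() < w_d / (w_d + w_l) else L_AVP


def confusion_matrix(true_labels, predicted, classes=(D_AVP, L_AVP)):
    """Rows: true class; columns: predicted class (unlabeled events are skipped)."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted):
        if p in index:
            matrix[index[t]][index[p]] += 1
    return matrix


if __name__ == "__main__":
    rng = random.Random(0)
    n = 2000  # synthetic blockades per peptide, drawn from the fitted Gaussians
    pc2 = ([rng.gauss(MU_D, SIGMA_D) for _ in range(n)]
           + [rng.gauss(MU_L, SIGMA_L) for _ in range(n)])
    truth = [D_AVP] * n + [L_AVP] * n
    predicted = [monte_carlo_label(x, rng) for x in pc2]
    for name, row in zip(("D-AVP", "L-AVP"), confusion_matrix(truth, predicted)):
        print(name, [round(v / sum(row), 2) for v in row])
```

Because the synthetic data are ideal Gaussians, the printed rates are not expected to match the 71% and 75% reported for the experimental data. The sketch only illustrates why overlapping distributions cap the achievable success rate.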
We applied this approach to data collected from an equimolar mixture (10 μM D-AVP and L-AVP). The PC2 distribution (Figure 5f) is similar to the one observed for the combined data from D-AVP and L-AVP in Figure 5b. We used the same criteria as before to label the blockades: 1 = D-AVP (orange), 2 = L-AVP (green), and 3 = unlabeled data (cyan). Figure 5g shows the scatter plot of the first two principal components according to this labeling, which leads to the blockade trace represented in Figure 5h.
We applied this method to nonequimolar mixtures (2.5 μM L-AVP/7.5 μM D-AVP and 7.5 μM L-AVP/2.5 μM D-AVP) and calculated the ratio of the number of L-AVP blockades to the number of D-AVP blockades (Supporting Information S15). The correlation between this ratio and the relative composition of the mixtures (1:3, 1:1, and 3:1 L-AVP:D-AVP, respectively) shows the accuracy of our approach.
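The ratio described above is a simple count over the assigned labels. The minimal Python sketch below shows the calculation, assuming the 1 = D-AVP, 2 = L-AVP, 3 = unlabeled coding used for the equimolar mixture; the function names and the example label list are hypothetical.

```python
from collections import Counter

D_AVP, L_AVP, UNLABELED = 1, 2, 3  # label codes used for the mixture analysis


def l_to_d_ratio(labels):
    """Return (number of L-AVP blockades) / (number of D-AVP blockades)."""
    counts = Counter(labels)
    if counts[D_AVP] == 0:
        raise ValueError("no D-AVP blockades in this data set")
    return counts[L_AVP] / counts[D_AVP]


def expected_ratio(conc_l_um, conc_d_um):
    """Concentration ratio L-AVP:D-AVP of the prepared mixture."""
    return conc_l_um / conc_d_um


if __name__ == "__main__":
    labels = [L_AVP] * 30 + [D_AVP] * 90 + [UNLABELED] * 5  # made-up labels
    print(f"measured L/D ratio: {l_to_d_ratio(labels):.2f}")
    for conc_l, conc_d in ((2.5, 7.5), (10.0, 10.0), (7.5, 2.5)):
        print(f"{conc_l}/{conc_d} uM -> expected {expected_ratio(conc_l, conc_d):.2f}")
```

A count ratio need not equal the concentration ratio exactly: the event frequencies of the two peptides differ at equal concentration in native conditions, so the quantity of interest is the correlation between the two ratios rather than their equality.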
Using five parameters provides a more comprehensive view of the blockade phenomena and allows the determination of the most relevant parameters. The correlation matrix calculated from the D-AVP data was systematically used for each analysis. The scatter plots of the first two principal components show a structure with two domains (one large and one small), comparable to the classic representations of blockade level versus duration, for both D-AVP and L-AVP peptides.
In the case of data collected from a mixture, we followed a Monte Carlo approach to assign the blockades to one peptide or the other (Figure 5f). Our approach was validated using the combined data sets from experiments performed with the L-AVP and D-AVP peptides separately. The corresponding confusion matrix shows an average success rate of 73%, which is limited by the overlap between the two distributions. This result demonstrates the relevance of this approach for the statistical analysis of peptides by nanopores.
Furthermore, the blockade level distributions (Figure 2c,f) highlight a shoulder attributed to the open conformation. 50 The PC2 distributions clearly show two symmetrical shoulders for D-AVP (Figure 4e), whereas there is only one for L-AVP (Supporting Information S13). The two peptides could therefore also be discriminated from the shape of their PC2 distributions. Principal component analyses are very promising for further analysis of the different conformations of peptides at the single-molecule level.

■ CONCLUSION
The world's population is increasing, and people worldwide are living longer. The degradation of the environment also impacts health. In this context, we expect an increase in chronic diseases, cancers, and neurodegenerative diseases. 40 To answer this health challenge, we need a powerful, sensitive, and specific tool to perform early disease detection. Many molecules, including established peptide and protein clinical biomarkers and yet-to-be-identified biomarkers, remain underutilized or unexploited for clinical applications. This is due to the absence of techniques capable of discerning different conformations, post-translational modifications, and D/L amino acids. This study paves the way toward identifying, characterizing, and quantifying D-amino acids, whether free or contained in peptides or proteins, which will be crucial for the early detection of human diseases. From a chemical and pharmaceutical point of view, this technique would also allow the scientific community to discriminate isomers 61 or follow the conversion of molecule chirality 62 and explain nonenzymatic racemization pathways. 63 While this study focuses on a specific peptide model, we expect that this technique can be extended to other enantiomeric peptides (lanthipeptide and Aβ peptides), as shown in two recent publications using different protein nanopores such as OmpF, 48 CytK, and FraC. 49
We showed that we can detect and discriminate biologically relevant peptides differing by a single amino-acid enantiomer, L- or D-Arg. The peptides can be identified individually or in a mixture. We detected two different peptide conformations, saddle (Type IIa) and open (Type IIb), which had already been described by NMR. Owing to their unique conformations, we could identify each peptide using two electrical parameters: dwell time and current blockade. In the presence of a reducing agent, we no longer detected the conformational variants observed under native conditions; nevertheless, we could still identify L- and D-AVP, which adopt different conformations. We also used a PCA approach to confirm our experimental data analysis and define the best electrical parameters to discriminate each population of events. This method is promising and will make it possible to study the conformations of biomarkers in more detail at the single-molecule level.
Protein Production. Proaerolysin was produced by Dreampore SAS (Cergy, France) as described previously. 21 Briefly, C-terminally His-tagged protein was expressed into the periplasm of BL21 Ros 2 cells (MilliporeSigma) and harvested by osmotic shock, before further purification by nickel affinity and buffer exchange chromatography (Cytiva, Marlborough, MA, USA) in standard buffers containing 350 mM NaCl. Purified protein was stored at 4 °C until further use, when it was activated using trypsin-bound beads (Thermo Scientific, Waltham, MA, USA). It was then used in nanopore experiments at a 0.5−1 nM final concentration of aerolysin monomers.

Peptides
Nanopore Experiments. Nanopore experiments were performed using a vertical planar lipid bilayer setup (Warner Instruments, Hamden, CT, USA) with a 150 μm aperture, as described previously. 57 A 10 mg/mL stock of diphytanoylphosphatidylcholine (DPhPC, Avanti Polar Lipids, Alabaster, AL, USA), dissolved in decane, was used to create a planar lipid bilayer separating compartments containing 1 mL of electrolyte solution (4 M KCl and 25 mM Tris, pH 7.4). Once a single aerolysin pore was inserted, peptide analytes (individually or in a mix) were introduced at different concentrations before a 110 or 50 mV voltage differential was applied through Ag/AgCl electrodes.
Data Acquisition and Analysis. Electrical measurements were performed using an Axopatch 200B and a Digidata 1440 digitizer. Data were recorded at 250 kHz (4 μs sampling interval) and filtered at 5 kHz using Clampex software (Axon Instruments, Union City, CA, USA). Three-minute recordings taken at 50 or 110 mV were cleaned and analyzed with Igor Pro (Wavemetrics, Portland, OR, USA), using in-house algorithms to detect events and extract their characteristic parameters (current baseline level I_0; average (I_b), maximum (I_bmax), and minimum (I_bmin) current blockade level within the event; event noise, i.e., the standard deviation of the current values within the event (I_bs); dwell time; and blockade time location in the data set). The average open pore current (I_0) and standard deviation (σ) for each recording were determined statistically, 64 and a threshold of I_0 − 7σ was used to define events for the target peptides in the characterization experiments (Figures 2, 3). For low-frequency events, extracted parameters from multiple recordings were concatenated to provide sufficient information for robust statistics. To remove the bias introduced by short bumping events with a high blockade fraction, only events with dwell times equal to or greater than 200 μs were used for further analysis. This is a reasonable approach given the overall distribution of average blockade fraction vs dwell time data (Supporting Information S2, S3), the relative length of event dwell times, and the use of a 5 kHz filter. Histograms of the average blockade fraction for each event (ΔI_b/I_0, where ΔI_b = I_0 − I_b; binned at 0.005) were fit with Gaussian or bi-Gaussian functions to determine the average blockade fraction for each population of events. Semilog histograms of the number of events against dwell time, with 30 bins per decade, were fit with an exponential function to determine the most probable dwell time (Supporting Information S2). Semilog distributions of interevent time, binned at 2 ms for global frequency and 4 ms for population frequency, were fit with single-exponential functions to determine the event frequency (Supporting Information S7). Each experiment contained at least 1700 events of more than 200 μs in dwell time. Populations were selected according to their blockade level distribution to determine each mean dwell time value. For Type I (L- and D-AVP), the chosen points were between 0.7 and 0.8 in blockade level, and for Type II, between 0.35 and 0.55. In the presence of TCEP, points were selected between 0.4 and 0.8. For each fitted value, the average and standard deviation from three independent fits were used to account for fit robustness.
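The event definition described above (a threshold at I_0 − 7σ, a 4 μs sampling interval, and a minimum dwell time of 200 μs) can be illustrated with a short threshold-crossing detector. The authors used in-house Igor Pro algorithms that are not reproduced here, so the Python sketch below is a simplified stand-in: the function name, the dictionary keys, and the choice to compute statistics over every sample below the threshold are assumptions.

```python
from statistics import fmean, pstdev


def detect_events(current_pa, i0_pa, sigma_pa, n_sigma=7.0,
                  dt_us=4.0, min_dwell_us=200.0):
    """Find blockade events in a current trace by threshold crossing.

    An event is a run of consecutive samples below i0 - n_sigma * sigma.
    Events shorter than min_dwell_us are discarded (bumping events).
    """
    trace = list(current_pa)
    threshold = i0_pa - n_sigma * sigma_pa
    events, start = [], None
    # A final baseline sample acts as a sentinel that closes a trailing event.
    for i, value in enumerate(trace + [i0_pa]):
        below = value < threshold
        if below and start is None:
            start = i
        elif not below and start is not None:
            segment = trace[start:i]
            dwell_us = len(segment) * dt_us
            if dwell_us >= min_dwell_us:
                ib_mean = fmean(segment)
                events.append({
                    "start_us": start * dt_us,
                    "dwell_us": dwell_us,
                    "ib_mean_pa": ib_mean,
                    "ib_min_pa": min(segment),
                    "ib_max_pa": max(segment),
                    "ib_std_pa": pstdev(segment),
                    "blockade_fraction": (i0_pa - ib_mean) / i0_pa,
                })
            start = None
    return events


if __name__ == "__main__":
    # Synthetic trace: one 240 us blockade and one 40 us bump (values in pA).
    trace = [250.0] * 100 + [150.0] * 60 + [250.0] * 100 + [70.0] * 10 + [250.0] * 50
    for event in detect_events(trace, i0_pa=250.0, sigma_pa=2.0):
        print(event)
```

In this synthetic trace only the 60-sample blockade is kept, since the 10-sample bump lasts 40 μs and falls below the dwell-time cutoff.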
Semisupervised Clustering Using Principal Component Analysis and Logistic Regression Classification. The five parameters (T_t, ΔI_bmin, ΔI_bmax, ΔI_b, and I_bs) were determined (Figure 5b) using ΔI_bmin = I_0 − I_bmax, ΔI_bmax = I_0 − I_bmin, and ΔI_b = I_0 − I_b, respectively. PCA was performed using the PCA module of the scikit-learn library in Python. 65,66 Briefly, the covariance matrix of the five previously normalized parameters was calculated, and its eigenvalues and eigenvectors were determined. The data are projected onto the eigenvectors, and only the eigenvectors with the largest eigenvalues (PC1 and PC2) are used for further analysis. The calculations with L-AVP or mixtures were performed using the correlation matrix already calculated with D-AVP. The logistic regression classification 67 was performed using the corresponding module in the scikit-learn Python library. The training is performed with 20% of the data set composed by concatenating data from the D-AVP and L-AVP experiments (without or with TCEP). We can also follow a Monte Carlo type approach. We first used the histograms of PC2 obtained with each D-AVP or L-AVP peptide. These histograms are fitted by distinct Gaussian distributions centered at μ with a standard deviation σ. We used these two Gaussian distributions to classify the blockades of a mixture according to a Monte Carlo approach. This method is evaluated using the data set obtained by concatenating the data previously acquired with the D-AVP and L-AVP peptides. We only take into account the blockades included in the band [μ − 3σ, μ + 3σ] of each Gaussian distribution. The predictions are compared to the true initial labels to evaluate this method and calculate the corresponding confusion matrix using the corresponding module of the scikit-learn library in Python.
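The key point of the PCA step is that the normalization and the principal axes are computed once from the D-AVP data and then reused for L-AVP and mixture data. The sketch below makes this explicit with NumPy rather than the scikit-learn PCA module used in the paper; the class name, the synthetic five-column data, and the use of sample statistics (ddof=1) are assumptions made for illustration.

```python
import numpy as np


class ReferencePCA:
    """PCA whose scaling and axes are fixed by a reference data set."""

    def fit(self, reference):
        x = np.asarray(reference, dtype=float)
        self.mean_ = x.mean(axis=0)
        self.std_ = x.std(axis=0, ddof=1)
        z = (x - self.mean_) / self.std_
        # Covariance of z-scored columns is the correlation matrix.
        corr = np.cov(z, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(corr)  # returned in ascending order
        order = np.argsort(eigvals)[::-1]        # largest eigenvalue first
        self.eigenvalues_ = eigvals[order]
        self.components_ = eigvecs[:, order]     # columns are PC1, PC2, ...
        return self

    def transform(self, data, n_components=2):
        """Project data using the reference mean, scale, and eigenvectors."""
        z = (np.asarray(data, dtype=float) - self.mean_) / self.std_
        return z @ self.components_[:, :n_components]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for the five event parameters (rows are events).
    d_avp = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))
    l_avp = d_avp[:200] + rng.normal(scale=0.1, size=(200, 5))
    model = ReferencePCA().fit(d_avp)   # axes defined by D-AVP only
    scores_l = model.transform(l_avp)   # L-AVP projected on the same axes
    print("explained variance ratio:", model.eigenvalues_[:2] / 5.0)
    print("L-AVP scores shape:", scores_l.shape)
```

Fixing the transform on one reference data set keeps PC2 values from different experiments on a common axis, which is what allows the fitted μ and σ of each peptide to be compared. Note that the sign of each eigenvector is arbitrary, so PC scores may be mirrored between implementations.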
Additional experimental details and results as well as tables regrouping population characteristics in each condition (PDF)

Figure 1. (a) Isomeric representation of L- and D-AVP (chemical structures created with ChemDraw). (b) Chemical representation of the chirality of vasopressin Arg8. (c) Representation of experimental conditions for the analysis of L-AVP (green) and D-AVP (orange) using a wild-type aerolysin nanopore (ribbon representation of PDB 5JZT using ChimeraX), inserted in a lipid bilayer in 4 M KCl and 25 mM Tris, pH 7.5. With an applied voltage, Cl− and K+ ions go through the pore toward the oppositely charged electrodes, creating an ionic current (I_0). (Created using BioRender.com.) (d) Example of an event showing the open pore current (I_0) and blockade current (I_b) resulting in a blockade level (DI_b) over a dwell time (T_t), characteristic of a peptide at 110 mV. (e) Example of the most representative types of events of L-AVP (green) and D-AVP (orange) at 110 mV. (f) Example of the most representative type of event of L-AVP (blue) and D-AVP (red) in the presence of 5 mM TCEP at 110 mV. (Image generated using BioRender.com.)

Figure 2. Experimental results of the independent analysis of 10 μM L- and D-AVP with or without 5 mM TCEP using an aerolysin nanopore in 4 M KCl and 25 mM Tris, pH 7.5. (a, d, g, j) Examples of current traces (filtered at 5 kHz) for each independent experimental condition at 110 mV: L-AVP (green), D-AVP (orange), L-AVP+TCEP (blue), and D-AVP+TCEP (red). Open pore currents: I_0 = 243.82 ± 1.77 pA for L-AVP (a), I_0 = 246.78 ± 2.04 pA for D-AVP (d), I_0 = 259.22 ± 3.33 pA for L-AVP+TCEP (g), and I_0 = 250.16 ± 3.46 pA for D-AVP+TCEP (j). (b, e, h, k) Scatter plots of the normalized average blockade level (DI_b) against the dwell time for each event longer than 200 μs with a blockade level between 0.2 and 1.0, representing the interacting population of each peptide in the absence and presence of TCEP. (Raw scatter plots can be found in Supporting Information S3 and S4.) (b, e) Scatter plots showing populations of events for L- and D-AVP in the absence of TCEP: Type I between 0.6 and 0.8 blockade level and Type II between 0.38 and 0.6 blockade level. (c, f, i, l) Histograms of the number of events as a function of normalized blockade level, fitted with a Gaussian function (black lines) for Type I and a bi-Gaussian function for Type II to determine the most probable blockade levels. L-AVP: Type I, 0.72 ± 0.01; Type IIa, 0.43 ± 0.01 for the population with the most events; Type IIb, 0.48 ± 0.01. D-AVP: Type I, 0.71 ± 0.01; Type IIa, 0.39 ± 0.01; Type IIb, 0.43 ± 0.01. With TCEP, the blockade levels were fitted with a single Gaussian: L-AVP+TCEP, 0.53 ± 0.01; D-AVP+TCEP, 0.49 ± 0.01. Data shown are from a single recording, with fitted values given as the mean and standard deviation of three independent fits. N_L-AVP = 2812 events; N_D-AVP = 3961 events; N_L-AVP+TCEP = 13383 events; N_D-AVP+TCEP = 7644 events. The number of events was calculated by selecting, according to blockade level, the population with a dwell time longer than 200 μs. (Image generated using BioRender.com.)

Figure 4. Principal component analysis (PCA). (a) Parameters used to characterize each blockade: minimum, average, and maximum blockades (DI_bmin, DI_b, DI_bmax), standard deviation (σ), and duration (T_t). (b) Scatter plot of the average blockade of each D-AVP blockade according to its duration. (c) PCA strategy to decrease the number of parameters. (d) Second principal component (PC2) according to the first one (PC1). (e) PC2 histogram fitted by a Gaussian distribution (μ_D = 0.0474, σ_D = 0.0348, describing the most probable value and the standard deviation of the distribution, respectively). (f) Selection of the relevant blockades in the range [μ_D ± σ_D] in 10 μM D-AVP using an aerolysin nanopore in 4 M KCl and 25 mM Tris, pH 7.5, V = 110 mV.

Figure 5. Prediction of D- and L-AVP mixtures. (a) Combined data from blockade traces obtained for D-AVP (orange) or L-AVP (green). The gray events correspond to those with σ < 1 pA (10 μM D- or L-AVP using an aerolysin nanopore in 4 M KCl and 25 mM Tris, pH 7.5, V = 110 mV). (b) Distribution of the second principal component (PC2), fitted by bi-Gaussian (gray) or Gaussian (orange or green) curves. (c) Scatter plot of the two principal components. The areas colored in orange or green correspond to the bands centered at μ_D (μ_L) with a width of ±3σ_D (±3σ_L), labeled according to the nature of each isomer. (d) Evaluation of the Monte Carlo prediction: predicted blockade trace according to four types of labeling, obtained from the data used in (a). (e) Confusion matrix calculated from the Monte Carlo prediction. (f) Equimolar mixture of 10 μM D- and L-AVP: distribution of the second principal component (PC2), fitted by bi-Gaussian (gray) or Gaussian (orange or green) curves. ΔV = 110 mV. (g) Scatter plot of the two principal components. The areas colored in orange or green are calculated according to a Monte Carlo algorithm. (h) Blockade trace according to four types of labeling: σ < 1 (gray), D-AVP (orange), L-AVP (green), and not attributed (cyan).
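One possible reading of the Monte Carlo assignment in panels (d) and (e) is sketched below. The text does not spell out the exact rule. Here it is assumed that each blockade inside at least one ±3σ band is assigned to D- or L-AVP by a uniform random draw weighted by the two Gaussian densities at its PC2 value, and that blockades outside both bands are left not attributed. μ_D and σ_D are taken from Figure 4e. μ_L, σ_L, the synthetic PC2 values, and the function name are hypothetical.

```python
import numpy as np
from scipy.stats import norm
from sklearn.metrics import confusion_matrix


def monte_carlo_assign(pc2, mu_d, sigma_d, mu_l, sigma_l, rng):
    """Label each PC2 value 'D', 'L', or 'NA' (assumed random-draw rule)."""
    pc2 = np.asarray(pc2, dtype=float)
    # Keep only blockades inside the [mu - 3 sigma, mu + 3 sigma] band of a Gaussian.
    in_d = np.abs(pc2 - mu_d) <= 3.0 * sigma_d
    in_l = np.abs(pc2 - mu_l) <= 3.0 * sigma_l
    p_d = np.where(in_d, norm.pdf(pc2, mu_d, sigma_d), 0.0)
    p_l = np.where(in_l, norm.pdf(pc2, mu_l, sigma_l), 0.0)
    total = p_d + p_l
    attributed = total > 0.0
    # Probability of drawing 'D' is the relative weight of the D-AVP density.
    prob_d = np.zeros_like(pc2)
    prob_d[attributed] = p_d[attributed] / total[attributed]
    draws = rng.random(pc2.shape)
    labels = np.full(pc2.shape, "NA", dtype="<U2")
    labels[attributed & (draws < prob_d)] = "D"
    labels[attributed & (draws >= prob_d)] = "L"
    return labels


rng = np.random.default_rng(1)
mu_d, sigma_d = 0.0474, 0.0348  # values reported in Figure 4e
mu_l, sigma_l = 0.20, 0.04      # hypothetical values for illustration only

# Synthetic "concatenated" data set with known true labels.
pc2_values = np.concatenate(
    [rng.normal(mu_d, sigma_d, 3000), rng.normal(mu_l, sigma_l, 3000)]
)
true_labels = np.array(["D"] * 3000 + ["L"] * 3000)
predicted = monte_carlo_assign(pc2_values, mu_d, sigma_d, mu_l, sigma_l, rng)

# Rows: true label; columns: predicted label, in the order D, L, NA.
cm = confusion_matrix(true_labels, predicted, labels=["D", "L", "NA"])
fraction_correct = np.trace(cm) / cm.sum()
print(cm)
```

Because the assignment is a random draw rather than a hard threshold, blockades in the overlap between the two bands are shared between the two labels in proportion to the fitted densities.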