Effect of the Southeast Asian Ovalocytosis Deletion on the Conformational Dynamics of Signal-Anchor Transmembrane Segment 1 of Red Cell Anion Exchanger 1 (AE1, Band 3, or SLC4A1)

The first transmembrane (TM1) helix in the red cell anion exchanger (AE1, Band 3, or SLC4A1) acts as an internal signal anchor that binds the signal recognition particle and directs the nascent polypeptide chain to the endoplasmic reticulum (ER) membrane where it moves from the translocon laterally into the lipid bilayer. The sequence N-terminal to TM1 forms an amphipathic helix that lies at the membrane interface and is connected to TM1 by a bend at Pro403. Southeast Asian ovalocytosis (SAO) is a red cell abnormality caused by a nine-amino acid deletion (Ala400–Ala408) at the N-terminus of TM1. Here we demonstrate, by extensive (∼4.5 μs) molecular dynamics simulations of TM1 in a model 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine membrane, that the isolated TM1 peptide is highly dynamic and samples the structure of TM1 seen in the crystal structure of the membrane domain of AE1. The SAO deletion not only removes the proline-induced bend but also causes part of the amphipathic helix, as well as the C-terminus of the peptide, to be “pulled in” to the hydrophobic phase of the bilayer. The SAO peptide only very infrequently samples conformations that resemble the structure of TM1 in AE1, demonstrating the disruptive effect the SAO deletion has on AE1 folding. These results provide a precise molecular view of the disposition and dynamics of wild-type and SAO TM1 in a lipid bilayer, an important early biosynthetic intermediate in the insertion of AE1 into the ER membrane, and extend earlier results of cell-free translation experiments.

The anion exchanger AE1 (also called Band 3 or SLC4A1) is an abundant glycoprotein in the plasma membrane of the red cell where it mediates the electro-neutral exchange of chloride and bicarbonate ions. 1−3 The human protein contains 911 amino acids with a single N-glycosylation site at Asn642. The protein consists of two domains: the amino-terminal cytosolic domain that provides a link to the underlying cytoskeleton 4,5 and the carboxyl-terminal membrane domain that spans the lipid bilayer 14 times 6−11 and is responsible for the anion transport function. 12 A structure of the membrane domain of human AE1 has been recently elucidated by X-ray crystallography ( Figure 1A) that confirms the 14-TM model. 13 Southeast Asian ovalocytosis (SAO) is an inherited condition caused by a nine-amino acid deletion [residues Ala400−Ala408 ( Figure 1B)] in the boundary between the cytosolic domain and the first transmembrane segment (TM1) of AE1. 14−16 SAO Band 3 is incapable of mediating anion transport 14,17 and is unable to bind anion transport inhibitors such as stilbene disulfonates. 18,19 Individuals with SAO are heterozygotes, and their red cells contain both SAO (45%) and normal (55%) AE1. Their anion transport activity is severely reduced (∼50% of normal 20 ) because of the presence of nonfunctional SAO AE1. AE1 is a dimer, and the presence of SAO AE1 in the heterodimer affects the structure, transport, and inhibitor binding properties of the normal subunit. 20−22 SAO AE1 exhibits a circular dichroism spectrum similar to that of normal AE1, suggesting that there is little difference in secondary structure. 18,19 Interestingly, differential thermal calorimetry studies of SAO membranes have shown that SAO AE1 does not undergo the thermal transition exhibited by normal AE1. 18 These studies indicate that although the TM helices of SAO AE1 are formed, they are not packed together properly.
As convincingly shown by the recent crystal structure, 13 the dimer interface consists of a four-helix bundle consisting of TM5 and -6 from the gate domain and does not involve TM1 directly, which is located in the core domain. It is therefore plausible that the gate domain is folded correctly, allowing dimer formation, while the core domain is disordered, and is akin to a molten globule.
The level of expression of SAO AE1 in transfected HEK and MDCK cells is lower than that of normal AE1, and the misfolded protein is retained in the endoplasmic reticulum (ER) where it is subjected to more rapid degradation. 23 SAO AE1 can, however, form heterodimers with normal AE1, facilitating its trafficking to the cell surface. 23,24 SAO AE1 can also be transported to the cell surface in K562 cells that express glycophorin A. 25 Glycophorin A interacts with AE1 in the ER 26 and is known to facilitate the trafficking of AE1 to the cell surface. 27−30 Arg61 in glycophorin A is proposed to interact directly with Glu658 in AE1 to create the Wright (Wr) blood group antigen. 31 The abundant expression of SAO AE1 in red cells indicates that trafficking of the protein to the plasma membrane in red cell precursors is not severely impaired likely because of its interaction with glycophorin A and heterodimer formation. TM1 in AE1 acts as a signal-anchor sequence to target the nascent chain to the ER and mediates cotranslational insertion of the growing polypeptide into the membrane. 32−34 During AE1 biosynthesis, TM1 can move laterally into the lipid bilayer in its proper orientation only after the synthesis of TM2 and the short loop connecting it to TM1 ( Figure 1C). Scanning N-glycosylation mutagenesis and cell-free translation experiments 35 have localized the C-terminal end of the hydrophobic span of TM1 to Phe423 during AE1 biosynthesis. 9 Stable integration of TM2 into the membrane to act as a stop-transfer sequence requires the signal sequence properties of TM1; however, an overly long intervening sequence allowed translocation of TM2 into the ER lumen. 36,37 N-Glycosylation sites engineered into TM2 could be N-glycosylated in the cell-free translation system, leading to the suggestion that TM2 and TM3 form a re-entrant loop that is transiently exposed to the ER lumen during biosynthesis but folds into the protein at a later stage. 
9 Thus, TM1 may not insert into the lipid bilayer until after the synthesis of TM4 ( Figure 1C), which acts as an efficient stop-transfer sequence. 36 Regardless of the point at which TM1 moves out of the translocon, it is located within the lipid bilayer before the majority of AE1 synthesis is completed.

Figure 1. (B) Sequence alignment between the sequences of the wild type and the SAO mutant in the region of TM1. The SAO AE1 mutant is missing nine residues (Ala400−Ala408) from the N-terminus of transmembrane segment 1 (TM1). The numbering refers to the wild-type sequence. The three proline residues are highlighted, as are two key aspartate residues. (C) Scanning N-glycosylation mutagenesis experiments 35 suggest TM1 acts as a signal-anchor sequence during AE1 biosynthesis. TM1 moves laterally into the lipid bilayer after synthesis of TM2 or after synthesis of TM2 and -3, which are transiently in the endoplasmic reticulum lumen followed by TM4 that acts as a stop-transfer sequence. (D) Structure of this sequence in the AE1 crystal structure, consisting of a cytosolic helix H1 connected by a sharp bend to the TM1 helix that begins at Pro403 and terminates at Thr431 with a slight kink at Pro419. (E) The ensemble of 21 wild-type structures of residues 389−430 resolved by nuclear magnetic resonance (labeled WT-NMR) has a high degree of structural variability, especially in their N- and C-terminal regions. For the sake of clarity, they have been fitted onto the structure of the wild-type transmembrane region using the Cα atoms of residues 403−418. (F) The nuclear magnetic resonance ensemble of 21 structures of SAO mutant residues 389−430 (SAO-NMR) has less structural variability, consisting predominantly of a single helical segment. For the sake of clarity, they have been fitted onto the structure of the wild-type transmembrane region using the Cα atoms of residues 409−417. Model helices of the (G) Arg384−Lys430 wild-type sequence (WT-HEL1) and (H) the SAO deletion (SAO-HEL1) as constructed by PyMol. The kinks are due to the proline residues at positions 403 and 419.

Biochemistry Article
Indeed, TM1 expressed alone can act as a signal anchor, target the ER, and insert into the membrane in its proper orientation. 36,38 Hydrophobicity plot analyses suggest that TM1 begins at Val405 and extends to Phe423, creating a 19-amino acid hydrophobic segment long enough to span the hydrophobic core of a lipid bilayer as an α-helix. 8 In the crystal structure of AE1, the TM1 helix begins at Pro403 and ends at Thr431. 13 Ensembles of structures of both wild-type TM1 [residues 389−430, labeled WT-NMR ( Figure 1E)] and the SAO deletion mutant [SAO-NMR ( Figure 1F)] in a 1:1 (v/v) chloroform/methanol solvent mixture have been resolved by nuclear magnetic resonance (NMR). 39 These structures are missing the RDIRR sequence at the N-terminus that, because of its charged nature, is likely to lie at the interface between the lipids and water.
The hydrophobic length of TM1 was found to be critical for efficient membrane insertion of AE1, and the SAO deletion compromised this function. 38 The same region of SAO AE1 was poorly N-glycosylated, showing that TM1 was impaired in its ability to function as a signal-anchor sequence. 9,40 Scanning N-glycosylation of AE1 and SAO AE1 containing an insertion in EC loop 2 to allow efficient N-glycosylation showed that the positions of the C-terminal ends of TM1 in SAO AE1 in the ER membrane were the same as in wild-type AE1. 40 This suggests that TM1 in SAO AE1 assumes a transmembrane disposition by pulling in polar residues from the cytosolic side into the membrane.
In this paper, we studied the effect of the SAO deletion on the structure and dynamics of TM1 of AE1 in a model membrane using molecular dynamics (MD) simulations. Our approach was to simulate the dynamics of a peptide corresponding to residues 384−430, which includes the hydrophobic core of TM1, in a phospholipid bilayer, and thereby to study the behavior of TM1 in the ER membrane during the biosynthesis of AE1. All simulations were repeated with the SAO mutant to determine the effect of this deletion on the interaction of the shortened TM1 of AE1 with the lipid bilayer. Our results provide the first view of the segmental dynamics of signal-anchor TM1 in a lipid bilayer and the effect of a deletion mutation linked to a red cell shape change that disrupts the proper folding of AE1.

■ METHODS
Model helices of sequences Arg384−Lys430 and Arg389−Lys430 were constructed using PyMol. The ensembles of the NMR structures 39 for the sequence Arg389−Lys430 were downloaded from the Protein Data Bank (PDB entries 1BNK and 1BNX, respectively). The relevant structure was first converted to a MARTINI coarse-grained representation 41 by version 2.4 of the martinize Python script 42 with an elastic network applied between all beads that were within 0.5 and 0.9 nm of one another. The force constant was 500 kJ mol −1 nm −2 . The resulting coarse-grained peptide was then placed at the center of a box with dimensions of 0.95 nm × 0.95 nm × 0.70 nm. The energy of the system was then minimized for 10 steps using version 4.6.x of the GROMACS molecular dynamics package, 43 and 180 coarse-grained 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) lipids were randomly added, ensuring no clashes. A simple POPC lipid bilayer was chosen because POPC is a major component of erythrocyte membranes. No additional lipid species were added because the lateral diffusion of lipids is a slow process and hence would not be resolved by the molecular dynamics simulations performed here. The z dimension of the box was increased to 1.1 nm, and all the coordinates were moved up by 0.2 nm, thereby leaving a 0.2 nm gap above and below the lipids. A large number of water beads (5000) were then added, again randomly. The energy of the system was again minimized using the steepest descent algorithm until machine precision was reached. Then the dynamics of the system was simulated for 10 ns using an integration time step of 20 fs. This was found to be sufficient to allow the lipid bilayer to form. The gaps at the top and bottom of the box introduced a bias ensuring the bilayer always formed in the x−y plane, simplifying subsequent analysis. Standard parameters for a MARTINI MD simulation were used.
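The elastic-network step described above can be illustrated with a minimal sketch. It assumes bead positions are given in nanometres and simply pairs every two beads whose separation falls between 0.5 and 0.9 nm, assigning each pair a harmonic restraint with the quoted force constant. The function names and the toy coordinates are hypothetical, and the martinize script itself may select beads differently (for example, by bead type or sequence separation).

```python
import math
from itertools import combinations

FORCE_CONSTANT = 500.0  # kJ mol^-1 nm^-2, value quoted in the text


def elastic_network_pairs(beads, lower=0.5, upper=0.9):
    """Return (i, j, distance) for bead pairs separated by lower..upper nm.

    `beads` is a sequence of (x, y, z) positions in nm. Pairing every
    bead with every other bead is an assumption made for illustration.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(beads), 2):
        d = math.dist(a, b)
        if lower <= d <= upper:
            pairs.append((i, j, d))
    return pairs


def harmonic_energy(distance, reference, k=FORCE_CONSTANT):
    """Harmonic restraint energy 0.5 * k * (d - d0)^2 in kJ/mol."""
    return 0.5 * k * (distance - reference) ** 2


# Toy chain of four beads spaced 0.38 nm apart along x (hypothetical data).
toy_beads = [(0.0, 0.0, 0.0), (0.38, 0.0, 0.0), (0.76, 0.0, 0.0), (1.14, 0.0, 0.0)]
toy_pairs = elastic_network_pairs(toy_beads)
```

In this toy chain only the second-neighbour pairs fall inside the distance window, which shows how the lower bound excludes directly bonded beads and the upper bound limits the network to local contacts.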
A Verlet cutoff scheme was employed, while van der Waals interactions were cut off at 1.2 nm with a switching function applied from 0.9 nm. Electrostatic forces were calculated using the reaction-field method with a cutoff of 1.5 nm and a relative dielectric constant of 15. The dielectric constant beyond the cutoff was set to infinity. A Berendsen thermostat applied separately to the lipids, protein, and solvent with a relaxation time of 1.0 ps was used to maintain the temperature at 310 K. The pressure was held at 1.0 bar using a Berendsen barostat applied semi-isotropically with a relaxation time of 2.0 ps and a compressibility of 3 × 10 −4 bar −1 . The final frame from this self-assembly simulation 44 was then converted back to an atomistic representation, 45 with the protein having neutral termini and Asp396 and Asp399 protonated as required. This conversion procedure occasionally failed because of steric clashes between the protein and lipids ( Table 1). The GROMOS53a6 atomistic force field was used. 46 A short 0.1 ns molecular dynamics simulation with the position of the protein restrained was run before a 10 ns unrestrained molecular dynamics simulation. Both simulations used an integration time step of 2 fs with the lengths of all bonds involving a hydrogen constrained using the LINCS algorithm. A Verlet cutoff scheme was used, and electrostatic forces were calculated using the particle mesh Ewald method using a real space cutoff of 1.2 nm. van der Waals forces were cut off at 1.2 nm. The temperature was maintained at 310 K using a Langevin thermostat with a relaxation time of 2 ps. Finally, the pressure was held at 1 bar by a Berendsen barostat applied semi-isotropically with a relaxation time of 1 ps and a compressibility of 4.46 × 10 −5 bar −1 . Table 1 describes how many simulations were run. Fifty repeats of each of the model helices were tried, and three repeats of each of the 21 structures in the NMR ensemble, making a total of 63, were also run. Simulations were not included in the final analysis either because they failed to complete the pipeline, usually because the conversion back to atomistic coordinates was not successful, or because the sequence did not adopt a transmembrane orientation. A transmembrane orientation was defined as the sequence having Cα atoms 1.4 nm above and below the midplane of the bilayer at the end of the self-assembly process.

Table 1. … and, to allow comparison, a second model helix covering Arg389−Lys430 (HEL2). Fifty simulations of each of the model helices were run, while each of the structures in the NMR ensembles was repeated three times, making a total of 63 simulations. A proportion of simulations fail to reach completion. Of these, a further subset does not have the protein in a transmembrane orientation and is therefore discarded from subsequent analysis. Because it appears that the SAO deletion mutation causes Asp396 and Asp399 to enter the lipid bilayer, we assume that these residues are protonated and therefore neutral. To check the effect of this assumption, all the SAO deletion mutation simulations were repeated with Asp396 and Asp399 in their default charged state. These simulations are marked with asterisks.
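The transmembrane-orientation criterion used to filter simulations can be expressed as a short check. The sketch below assumes that the bilayer normal lies along z, that Cα heights and the midplane are given in nanometres, and that "1.4 nm above and below" means at least one Cα atom at least 1.4 nm on each side of the midplane. That reading, and the function name, are assumptions rather than details confirmed by the text.

```python
def is_transmembrane(ca_z, midplane_z, cutoff=1.4):
    """Return True if Cα atoms reach at least `cutoff` nm on both sides.

    `ca_z` holds the z coordinate (nm) of every Cα atom in the peptide,
    and `midplane_z` is the z coordinate of the bilayer midplane.
    """
    above = any(z - midplane_z >= cutoff for z in ca_z)
    below = any(midplane_z - z >= cutoff for z in ca_z)
    return above and below


# Hypothetical examples: one peptide spanning the bilayer, one lying
# entirely in the upper leaflet.
spanning = is_transmembrane([-2.0, -1.0, 0.0, 1.0, 2.0], midplane_z=0.0)
interfacial = is_transmembrane([0.5, 1.0, 1.5, 2.0], midplane_z=0.0)
```

A peptide that sits entirely in one leaflet fails the check even when it is deeply buried, which is the behaviour the filter needs in order to discard runs that self-assembled with the sequence lying along the interface.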
Between 44 and 88% of simulations satisfied the criteria described above (Table 1). These were then analyzed as follows. First the sequence was divided into segments, as defined in Figures 2−4. For each frame of the trajectory, the upper and lower leaflets and the midplane of the membrane were defined using the phosphate atoms of the lipids. Then the helicity of each segment was determined using the STRIDE algorithm. 47 The helical axis of the segment was calculated by finding the first eigenvector of the backbone heavy atoms. It is defined as pointing toward the C-terminus. The tilt angle can then be calculated using linear algebra. Next the depth of the segment is calculated by subtracting the membrane midplane from the center of mass of the segment. All atoms within 0.6 nm of each residue were examined to determine the local environment, such as the accessibility to water. The depth of each residue, relative to the membrane midplane, was also calculated. All this analysis was performed in Python using the MDAnalysis 48 module. Graphs were plotted using gnuplot, and all images were rendered using VMD.
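The tilt and depth calculations described above can be sketched in plain Python. The example assumes that the "first eigenvector" is the principal axis of the segment's backbone coordinates, that the bilayer normal is the +z axis, and that the depth uses the unweighted centroid rather than a mass-weighted center. These are illustrative choices, and the function names are hypothetical. The principal axis is found by power iteration seeded with the N-to-C vector, so it points toward the C-terminus.

```python
import math


def centroid(coords):
    """Unweighted mean position of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(p[k] for p in coords) / n for k in range(3))


def principal_axis(coords, n_iter=100):
    """Unit vector along the largest-variance direction, N- to C-terminus.

    Assumes the first and last points differ and that the N-to-C vector
    is not perpendicular to the principal axis.
    """
    c = centroid(coords)
    centered = [tuple(p[k] - c[k] for k in range(3)) for p in coords]
    # 3x3 scatter matrix of the centered coordinates.
    cov = [[sum(p[a] * p[b] for p in centered) for b in range(3)]
           for a in range(3)]
    n_to_c = [coords[-1][k] - coords[0][k] for k in range(3)]
    norm = math.sqrt(sum(x * x for x in n_to_c))
    v = [x / norm for x in n_to_c]
    for _ in range(n_iter):  # power iteration
        w = [sum(cov[a][b] * v[b] for b in range(3)) for a in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    if sum(a * b for a, b in zip(v, n_to_c)) < 0:
        v = [-x for x in v]
    return tuple(v)


def tilt_angle_deg(coords, normal=(0.0, 0.0, 1.0)):
    """Angle in degrees (0 to 180) between the helix axis and the normal."""
    axis = principal_axis(coords)
    cos_angle = sum(a * n for a, n in zip(axis, normal))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def segment_depth(coords, midplane_z):
    """Signed height (nm) of the segment centroid above the midplane."""
    return centroid(coords)[2] - midplane_z
```

Because the axis is directed, a segment running back toward the starting leaflet gives a tilt above 90°, which is how averages such as the 108° reported for H1 in the NMR-seeded runs can arise.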

■ RESULTS
A single transmembrane helix, such as TM1 of AE1, is likely to be more dynamic on its own in a lipid bilayer, as will be the case during biosynthesis, than when confined within the full transmembrane protein. This variation in conformation sampled by the sequence Arg384−Lys430, which includes TM1, makes it well-suited to being studied by molecular dynamics simulation. Although there are two ensembles of NMR structures for the wild-type sequence [Arg389−Lys430, labeled WT-NMR ( Figure 1E and Table 1)] and the corresponding SAO deletion [SAO-NMR ( Figure 1F)] that we used in our simulations, it is likely that at least some members of these ensembles are not representative of native conformational states due to the nonphysiological solvent mixture of chloroform and methanol [1:1 (v/v)]. 39 We therefore also constructed two model α-helices, one with the extended RDIRR sequence at the N-terminus [WT-HEL1 ( Figure 1G)] and one without the extension (WT-HEL2). The latter has the same sequence as the NMR structure [WT-NMR ( Figure 1E)], facilitating direct comparison. The slight kinks in the model helices are introduced by the three proline residues that cannot participate in backbone hydrogen bonding.
Together with the ensembles of NMR structures (21 each for the WT-NMR and SAO-NMR peptides), these structures form a set of putative initial conformations for the wild-type sequence. SAO deletion variants of the longer [SAO-HEL1 ( Figure 1H)] and shorter (SAO-HEL2) helices as well as the NMR ensemble (SAO-NMR) of structures were also considered. Because the deletion of residues Ala400−Ala408 is likely to result in Asp396 and Asp399 entering the hydrophobic core of the lipid bilayer, we have assumed that these residues are protonated in the SAO mutant. To check the effect of this assumption, we repeated all the SAO deletion simulations with both these residues in their default, charged state; these are labeled SAO-HEL1*, SAO-HEL2*, and SAO-NMR*.
In the remainder of the paper, we shall focus on the behavior of the sequence Arg384−Lys430 assuming it adopts an initial helical conformation [WT-HEL1 (Table 1)] and the corresponding SAO deletion with both Asp396 and Asp399 protonated (SAO-HEL1). Where appropriate, we compared to the results for the wild-type sequence and SAO deletion mutant that were modeled as an α-helix without the N-terminal RDIRR motif (WT-HEL2 and SAO-HEL2) or the ensemble of NMR structures (WT-NMR and SAO-NMR). Because it is likely that an individual simulation could become trapped in a metastable conformation, we ran a large number of simulations and analyzed their statistical behavior. Fifty simulations of either α-helix were run (Table 1). Because each NMR ensemble contains 21 structures, three repeats of each were run, making 63 simulations for each ensemble. Overall, therefore, 489 simulations were run, each 10 ns long, making in total 4.46 μs of dynamics.
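The run counts quoted above follow from simple bookkeeping, sketched below with illustrative variable names that are not taken from the paper. The tally covers three starting sets (two model helices and one NMR ensemble) for each of three variants (wild type, SAO with protonated aspartates, and SAO with charged aspartates). The final figure is the aggregate time that would result if every attempted run completed its 10 ns. Because Methods notes that a proportion of simulations fail to reach completion, it should be read as an upper bound rather than as the reported total.

```python
MODEL_HELIX_REPEATS = 50      # repeats attempted per model helix
NMR_STRUCTURES = 21           # structures in each NMR ensemble
NMR_REPEATS = 3               # repeats attempted per NMR structure
RUN_LENGTH_NS = 10            # length of each atomistic simulation

# Attempted runs per starting set, for a single variant.
attempted_per_variant = {
    "HEL1": MODEL_HELIX_REPEATS,
    "HEL2": MODEL_HELIX_REPEATS,
    "NMR": NMR_STRUCTURES * NMR_REPEATS,
}

# Wild type, SAO (Asp396/Asp399 protonated), SAO* (both charged).
variants = ("WT", "SAO", "SAO*")

runs_per_variant = sum(attempted_per_variant.values())
attempted_total = runs_per_variant * len(variants)

# Upper bound on aggregate sampling, assuming every attempt completed.
upper_bound_us = attempted_total * RUN_LENGTH_NS / 1000.0
```

Keeping the attempted count separate from the number of completed, transmembrane runs makes it easier to relate the totals here to the per-ensemble success rates listed in Table 1.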
We started by assuming that the sequence spans the membrane once, and therefore, the first step was to embed the different structures in our set of putative conformations into a membrane, in this case a simple POPC lipid bilayer as described in Methods. To allow each conformation to relax, the first half of each trajectory was then discarded and the resulting data set analyzed. Our rationale is that repeating and analyzing many short simulations of the sequence Arg384−Lys430 is likely to better sample the dynamics than running a few much longer simulations. 49
The Sequence R384−K430 Is, on Average, Helical in the Membrane but Is Highly Dynamic. Twenty-two of the initial 50 WT-HEL1 simulations were successfully embedded and simulated for 10 ns in a transmembrane orientation in a POPC lipid bilayer. Examining the average helicity of this ensemble (Figure 2A) shows that, as one might expect, the sequence is mainly helical with the termini being less helical, in agreement with the NMR studies. The three prolines (Pro391, Pro403, and Pro419) all approximately mark the start of a local region of increased helicity. To simplify the task of analyzing the ensembles of simulations, we defined three segments (Figure 2A) that are mainly helical and start at each of these proline residues (H1, TM1a, and TM1b). The center of mass (COM) of the first segment (H1) tends, on average, to be found at the interface of the lipid bilayer and the cytoplasm; we define the interface by the position of the phosphate atoms in the POPC lipids. It adopts an angle of 66° relative to the bilayer normal and is 81% helical, on average, and therefore can be described as an interfacial helix, which is not surprising considering its amphipathic nature. The next two segments form a kinked helix spanning the bilayer and are therefore labeled TM1a and TM1b. TM1a is longer and more helical, and its COM is approximately at the center of the bilayer. It is slightly tilted, making an angle of 27°, on average, with the bilayer normal. TM1b is much shorter and less helical, and its COM is close to the extracellular side of the membrane. Like H1, it is tilted away from the membrane normal, making an angle of 60° on average. TM1a and TM1b therefore form a kinked transmembrane helix.
As one might expect for an isolated transmembrane sequence, this average conformation hides a considerable degree of dynamics. If we characterize each segment by a coordinate consisting of its tilt and COM depth, then the resulting density plot ( Figure 2D) suggests that the depth of the COM of all three segments varies by ∼1 nm and H1, TM1a, and TM1b explore a wide range of tilt angles (0−100°, 0−50°, and 45−100°, respectively). Taking the average and dynamical descriptions together gives us a more complete picture of the behavior of the sequence Arg384−Lys430 in a lipid bilayer. A similar image emerges if we examine the simulations of the shorter model helix [WT-HEL2 ( Figure S1)] or the ensemble of NMR conformations [WT-NMR ( Figure  S2)]. There are, however, some notable differences. Removing the RDIRR sequence from the model helix appears to allow the H1 segment to explore a wider range of conformations ( Figure  S1) and reduces its helicity. This is consistent with the charged RDIRR sequence interacting with the lipids and thereby restricting the dynamics of the H1 segment. The behavior of the flanking segments, H1 and TM1b, is different when the simulations of the ensemble of wild-type NMR structures are analyzed ( Figure S2). The helicity of both segments is reduced, and they explore a far wider range of tilt angles, resulting in an average tilt angle of 108°for the H1 segment. These differences are likely due to bias introduced by some of the conformations present in the NMR ensemble: in several of these, the N-and C-termini are bent around sufficiently that they will be initially embedded within the lipid bilayer. Although this conformation is probably unstable, it is highly likely that the simulations are not long enough to allow the termini to escape the bilayer (see, for example, Figure S2B), and hence, the overall behavior is biased. We attribute this behavior of the peptide in the NMR experiment to the nonphysiological solvent mixture used.
The SAO Deletion Results in a Helical, Dynamical Transmembrane Peptide. Now let us consider how the SAO-HEL1 ensemble behaves. Like the wild-type sequence, it is predominantly helical ( Figure 3A) with fraying at both termini. Crucially, one of the prolines (Pro403) is missing in the mutant, leading to two rather than three segments, starting at Pro391 and Pro419. The first segment, which we call TM1a′, is composed of the residues that make up H1 in the wild-type sequence and the second half of the TM1a sequence. It retains a helicity of 94% ( Figure 3C), and its COM is 0.8 nm below the midplane of the lipid bilayer, on average. TM1a′ is more tilted than TM1a in the wild-type sequence, making an angle of 34°, on average, with the bilayer normal. The second segment, TM1b, comprises the same residues as the wild-type sequence and behaves similarly with one exception: while it has an average helicity of 76% and an average tilt angle of 60°, the COM is only 0.9 nm, on average, above the bilayer midplane, a shift of 0.4 nm closer to the center of the bilayer. Like that of the wild-type sequence, this average behavior hides a high degree of dynamics. The vertical position of the COMs of both segments varies by up to 1.5 nm, while the TM1a′ segment explores tilt angles in the range of 10−55° and the TM1b segment, like the wild type, a wider range of tilt angles (10−130°). A similar picture is observed when we analyze either the ensemble of simulations started from the shorter model helix [SAO-HEL2 ( Figure S3)] or the ensemble of simulations started from the NMR structures [SAO-NMR ( Figure S4)]. Interestingly, the TM1b segment is both more helical and less dynamic when the RDIRR sequence is absent ( Figure S3D), suggesting that the latter binds more strongly to the bilayer, as one might expect; therefore, when it is present, it anchors the sequence, leaving the TM1b segment more free, whereas when it is absent, the TM1b segment instead interacts more closely with the bilayer. This analysis assumes that Asp396 and Asp399 in TM1a′ are protonated and therefore neutral. If they are deprotonated, the overall effect is to reduce the helicity in the vicinity of these two residues (Figures S5−S7) and pull both the TM1a′ and TM1b segments toward the cytoplasmic side of the membrane.

Figure 4. (A) To facilitate comparison, the wild-type sequence was divided into four sequences: an interfacial amphipathic segment (H1) and a short transmembrane segment (TM1b) that are also present in the SAO mutant and two central transmembrane segments (SAO and ΔTM1a). The SAO segment consists of the nine residues that are deleted in the mutant, and the ΔTM1a segment is the TM1 segment without the six residues at the N-terminus. The colors defined here are used throughout this figure. Illustrative snapshots, to-scale schematics of the average conformations, with the lengths, tilts, and helicity of each segment labeled, and density plots showing the variation in the depth and tilt angle of all segments for (B) the wild-type sequence and (C) the SAO deletion mutant. Note that the H1 segment is pulled into the lipid bilayer in the SAO peptide, as well as the TM1b segment, to compensate for the nine-amino acid deletion.
The SAO Deletion Causes the N-and C-Termini To Be Pulled into the Membrane by Five and Four Residues, Respectively. Overall, a picture is emerging from the simulations of the SAO deletion causing the C-terminus to be pulled into the lipid bilayer. To gain a more detailed view, we need to make a direct comparison between the wild-type sequence and the SAO deletion mutant. To achieve this, let us define segments that are the same in both sequences ( Figure  4A). The wild-type sequence is then described by the behavior of four segments: H1 as before, then the SAO sequence, the remainder of TM1a, which we call ΔTM1a, and TM1b, also as described above. The SAO deletion mutant is therefore identical, except the nine-residue SAO segment is missing. Repeating exactly the same analysis as before and considering first the average and then the dynamical behavior show that the average effect of the deletion is for the H1 segment to be pulled into the lipid bilayer from the cytoplasmic side by 0.6 nm on average ( Figure 4B,C) and for the ΔTM1a and TM1b segments to be pulled into the lipid bilayer from the extracellular side by 0.5 and 0.4 nm, respectively. The average conformations of the ΔTM1a and TM1b segments are not significantly altered; however, the H1 segment becomes slightly more helical, and the tilt angle decreases from 66°to 40°. All segments remain highly dynamic, with those firmly embedded in the lipid bilayer (SAO and ΔTM1a) displaying a variation in tilt angles smaller than those of the other segments (H1 and TM1b). Similar trends are seen when the simulations starting from the shorter model helix or the ensembles of NMR structures are analyzed (Figures S8 and S9). 
Our overall picture is now more nuanced: the SAO deletion causes both ends of the protein to be pulled into the lipid bilayer by approximately the same amount, causing the N-terminus to become more helical, thereby forming an approximately continuous helical region with the ΔTM1a segment, as shown in Figure 3.
To more precisely determine how much each end of the sequence moves in response to the SAO deletion, we have calculated the average distance of each residue relative to the midplane of the lipid bilayer over all the simulations in each ensemble ( Figure 5A). This shows that, on average, the N-terminus is pulled in by five residues and the C-terminus is pulled in by four residues. Repeating this analysis for the simulations seeded by either the shorter model helix or the ensemble of NMR structures yields similar but not identical patterns; the former suggests the N- and C-termini are pulled in by six and three residues, respectively, while the latter suggests that the eight residues of the N-terminus are pulled into the bilayer ( Figure S10). We treat the latter result with caution because of the previously noted problems with trapped structures and equilibration. Taken together, these results suggest that while both termini are pulled into the bilayer, the effect is stronger at the N-terminus than at the C-terminus. This conclusion assumes both Asp396 and Asp399 are protonated; repeating the analysis with both residues charged alters the behavior. For either model helix, the N-terminus is now only pulled into the bilayer by three residues on average ( Figure S11), while the C-terminus is pulled in by six residues. As described above, the effect is different for the ensemble of NMR structures where the N- and C-termini are pulled in by six and three residues, respectively.

Figure 5. Structure of the transmembrane segment of the SAO deletion mutant that is different from the structure of wild-type TM1. (A) Average distance from the central plane of the lipid bilayer for each residue in both the wild-type and SAO deletion mutant sequences. This analysis suggests that deleting the nine residues causes the N-terminus to shift toward the extracellular side by five residues and the C-terminus to shift toward the intracellular side by four residues. (B) Shifting the sequence by these amounts allows us to find the sequence in the SAO mutant equivalent to the TM1 segment in the wild type. (C) Comparing the ensemble of structures generated by the simulations with the structure of TM1 found in the full AE1 structure shows that the equivalent sequence in the SAO mutant both is more different on average, with higher root-mean-square deviation values, and rarely samples conformations similar to that seen in the experimental structure of the transmembrane part of AE1.
Unlike That of the Wild-Type Sequence, the Transmembrane Helix of the SAO Deletion Mutant Rarely Resembles TM1 in the Structure of AE1. Using this result, we can align residues in both the wild-type and SAO deletion mutant sequences based on their depth in the membrane ( Figure 5B). This shows that, for example, Tyr413 in the SAO mutant is found at the same depth as Val409 in the wild-type sequence, allowing us to identify the residues in the SAO deletion mutant equivalent in terms of their depth in the bilayer to those of the TM1a segment of the wild type. If we consider the biosynthesis of AE1 for a moment, then TM1 will be the first transmembrane helix ejected from the translocon into the membrane ( Figure 1C). As our simulations suggest, it is likely that TM1 on its own is highly dynamic once it moves into the lipid bilayer. It is, however, reasonable to assume that TM1 samples, perhaps only occasionally, the conformation it ultimately adopts in the folded structure of the whole AE1 transmembrane protein. If this were not true, it would imply either that the structure adopted by TM1 in the whole protein has a high energy, and is therefore not favored, or that the local environment around TM1 in the folded protein is significantly different from what it experiences immediately after synthesis. Consistent with this idea, we find that a few (8%) of the structures sampled by TM1 in our wild-type simulations are very similar to the conformation of the same residues in the full AE1 structure, as defined by having a Cα root-mean-square deviation of <0.05 nm. The conformations adopted by the equivalent residues in the SAO deletion mutant are on average more different and also very rarely (0.2%) sample conformations similar to that adopted by TM1 in the full AE1 structure. This hints at the disruptive effect of deleting residues Ala400−Ala408 on the biosynthesis and proper folding of AE1.
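The comparison with the crystal structure reduces to counting how often a sampled conformation falls within a Cα root-mean-square deviation cutoff of the reference. The sketch below assumes the frames have already been superimposed on the reference (no fitting is performed here) and that coordinates are in nanometres. The function names are hypothetical, and the threshold is left as a parameter rather than fixed to the value used in the paper.

```python
import math


def ca_rmsd(frame, reference):
    """RMSD (nm) between two equal-length lists of pre-aligned Cα positions."""
    if len(frame) != len(reference) or not frame:
        raise ValueError("frame and reference must be non-empty and equal in length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(frame, reference))
    return math.sqrt(squared / len(frame))


def fraction_similar(rmsd_values, threshold):
    """Fraction of frames whose RMSD is strictly below `threshold`."""
    if not rmsd_values:
        raise ValueError("no RMSD values supplied")
    return sum(1 for r in rmsd_values if r < threshold) / len(rmsd_values)


# Hypothetical example: three aligned frames compared with a reference.
reference = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.15), (0.0, 0.0, 0.30)]
frames = [
    reference,
    [(0.3, 0.0, 0.0), (0.3, 0.0, 0.15), (0.3, 0.0, 0.30)],
    [(0.0, 0.4, 0.0), (0.0, 0.4, 0.15), (0.0, 0.4, 0.30)],
]
rmsds = [ca_rmsd(f, reference) for f in frames]
similar = fraction_similar(rmsds, threshold=0.1)
```

Pooling the RMSD values from all retained trajectories of an ensemble before calling `fraction_similar` would give a single percentage per ensemble, comparable in form to the wild-type and SAO figures quoted above.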

■ DISCUSSION
The red blood cells of heterozygotes with SAO have an abnormal shape and exhibit an ∼50% decrease in their level of anion transport because of the presence of nonfunctional SAO AE1. SAO results from a nine-residue deletion at the N-terminal end of TM1 of AE1, which removes a bend in the protein chain resulting in a misfolded protein (Figure 1B). TM1 acts as a signal-anchor sequence targeting the nascent chain to the ER, and therefore, it is important to consider the effect of this mutation on the biosynthesis of AE1 where TM1 is initially isolated in the membrane before the distal transmembrane segments are ejected from the translocon and inserted into the membrane. Indeed, cell-free translation experiments 9,36 have indicated that TM2 and -3 translocate into the ER lumen with TM4 acting as a stop-transfer sequence. Here we have shown by extensive MD simulations that the wild-type sequence Arg384−Lys430, which includes TM1, is predominantly helical in conformation and can be characterized by three helical segments: an amphipathic helix (H1) that lies at the interface between the lipid bilayer and the cytoplasm and a transmembrane helix composed of two helical segments that begins at Pro403 and is kinked at a conserved proline residue (Pro419). This average view hides, however, considerable dynamics, as expected for a single helix in a lipid bilayer. The same overall picture is recovered if we seed the simulations with structures resolved by NMR experiments 39 or start with a model of the sequence as a classical α-helix (Figures 2, S1, and S2).
Removing the nine residues that cause SAO, Ala400−Ala408, from near the N-terminal end of TM1 not only alters its average conformation but also causes the N- and C-termini to be pulled into the lipid bilayer to accommodate this deletion (Figures 4, S8, and S9). We estimate this effect is more pronounced at the N-terminus with at least five residues entering the membrane (Figures 5 and S10). As a consequence, the sequence adopts a single, kinked transmembrane helix, but with no interfacial H1 helix (Figures 3, S3, and S4). Starting the simulations from either a model of the sequence as a helix or structures resolved by NMR experiments 39 leads to similar conclusions. This result assumes that both aspartic acid residues in the H1 helix that are pulled into the bilayer become protonated. Leaving both these N-terminal residues in their default charged state still results in the sequence folding into a single, kinked transmembrane helix (Figures S5−S7); however, as expected, only three residues are pulled into the bilayer from the N-terminus (Figure S11), while six residues are pulled in from the C-terminus.
In all cases, the simulations produce a transmembrane helix that is highly dynamic. This helix ultimately becomes TM1 in the transmembrane region of AE1. 13 By comparing the ensemble of transmembrane structures seen in our simulations and the recent structure of the transmembrane region of AE1, 13 we have shown that, although dynamic, the wild-type sequence occasionally samples conformations very similar to that seen in the folded structure of AE1. It is possible, therefore, to see how this sequence can become incorporated into the fully folded structure of AE1. The SAO deletion, however, leads to an ensemble of transmembrane structures that samples conformations similar to TM1 in AE1 less frequently, and consequently, one can see how this mutation could lead to packing defects in AE1 and hence to its physiological effects. We have noted that in the crystal structure of the transmembrane region of AE1, the helical portion of TM1 finishes at Thr431, around two helical turns later than we have seen in our simulations. This may be because the local environment around TM1 in the full structure causes that region to adopt a helical conformation.
To validate these results, let us compare the simulations to the results of N-glycosylation scanning mutagenesis experiments by considering the average water accessibility of each residue (Figure 6). If we consider the wild-type sequence, then, as expected, the transmembrane SAO and ΔTM1a segments are only marginally accessible to water. Overall, as we move farther from the central transmembrane region, the water accessibility increases, as expected. The "sawtooth" pattern with peaks every three or four residues seen in the N-terminus is consistent with the formation of amphipathic helices lying at the interface between the lipid bilayer and the solvent, as seen in the behavior of the H1 segment (Figure 2). The fact that a similar pattern with a smaller magnitude is seen in the TM1b segment is consistent both with this sequence forming a helix less often and with it being more buried in the bilayer (Figure 2). N-Glycosylation is a cotranslational event that occurs on the luminal side of the ER membrane while the nascent polypeptide chain is located within the translocon. N-Glycosylation acceptor sites must be located a minimum of 12 and 14 residues (12 + 14 rule) from the proximal and distal ends, respectively, of the hydrophobic regions of transmembrane segments to be efficiently N-glycosylated in the lumen of the ER. 8 Scanning N-glycosylation mutagenesis can therefore be used to predict the position of the lumen ends of transmembrane segments during biosynthesis. N-Glycosylation experiments 40 using cell-free translation and transfected HEK cells with an insertion in the short loop between TM1 and -2 in AE1 to facilitate N-glycosylation identified Pro419 as the end of the hydrophobic region of TM1 in both wild-type and SAO AE1. This indicates that the position of TM1 in the translocon is the same in wild-type and SAO AE1. The efficiency of N-glycosylation of the SAO constructs was always lower than that of the wild-type protein, suggesting that the SAO deletion impairs the signal-anchor function of TM1.
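The 12 + 14 rule described above lends itself to a short illustration. The sketch below assumes one particular counting convention, in which distances are measured from the last hydrophobic residue of the preceding transmembrane segment and to the first hydrophobic residue of the following one; this convention, the function names, and the residue numbers in the usage note are hypothetical and serve only to show why a short loop may contain no efficiently N-glycosylated position unless it is lengthened by an insertion.

```python
def satisfies_12_14_rule(site, prev_tm_end, next_tm_start,
                         min_proximal=12, min_distal=14):
    """Check an N-glycosylation acceptor site against the 12 + 14 rule.

    site: residue number of the acceptor site in a luminal loop
    prev_tm_end: last hydrophobic residue of the preceding TM segment
    next_tm_start: first hydrophobic residue of the following TM segment
    The counting convention used here is an assumption for illustration.
    """
    from_proximal = site - prev_tm_end
    from_distal = next_tm_start - site
    return from_proximal >= min_proximal and from_distal >= min_distal

def glycosylatable_positions(prev_tm_end, next_tm_start):
    """List the loop positions that satisfy the rule (possibly none)."""
    return [site for site in range(prev_tm_end + 1, next_tm_start)
            if satisfies_12_14_rule(site, prev_tm_end, next_tm_start)]
```

With hypothetical boundaries, `glycosylatable_positions(100, 110)` is empty because the loop is too short, whereas `glycosylatable_positions(100, 140)` returns positions 112 through 126.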
Subsequent N-glycosylation scanning experiments using transfected HEK cells and constructs without the insertion revealed that introduced N-glycosylation sites within TM2 and -3 in AE1 could be N-glycosylated. This led to the suggestion that TM2 and -3 are translocated into the ER lumen during biosynthesis and folded into the protein as re-entrant loops. These experiments found that the luminal end of TM1 was located at Phe423, one helical turn more distal than Pro419, the estimate made with the insertion constructs. Interestingly, residues Pro419 and Phe423 are the positions of the first two peaks in water accessibility on the extracellular side of the protein, regardless of the initial structure(s) used in the simulation (Figures 6 and S12). This is evidence that our simulations are accurately modeling the dynamics of this peptide in a simple lipid bilayer, placing these residues on the same side of a helix at the membrane interface region. In our simulations of the sequence Arg384−Lys430 in AE1 SAO, because TM1b is pulled into the lipid bilayer (Figure 4B) by the deletion, the sawtooth pattern is missing and it is not until Gly428 that the water accessibility is >10%. We note that in the crystal structure of AE1 the helical portion of TM1 begins at Pro403 and ends at Thr431, 13 around two helical turns later than where we have assumed TM1b finishes, well into the aqueous phase. In addition, TM1 is buried within the protein structure with little exposure to the lipid bilayer. Thus, it is not surprising that the shortened TM1 segment in AE1 SAO cannot assume a native conformation, resulting in disruption of the packing of TM segments.
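The statement that Pro419 and Phe423 lie on the same side of a helix follows from simple helical-wheel arithmetic, sketched below. The calculation assumes an ideal α-helix with 3.6 residues per turn (100° per residue); because Pro419 is the site of a kink, the real geometry will deviate from this idealization, and the function name is hypothetical.

```python
def angular_offset_deg(res_i, res_j, deg_per_residue=100.0):
    """Smallest angle between two residues on an ideal helical wheel.

    Assumes an ideal alpha-helix (100 degrees of rotation per residue);
    the returned value lies between 0 and 180 degrees.
    """
    delta = (abs(res_j - res_i) * deg_per_residue) % 360.0
    return min(delta, 360.0 - delta)
```

Under this idealization, `angular_offset_deg(419, 423)` gives 40°, so residues four positions apart face roughly the same direction, whereas `angular_offset_deg(419, 421)` gives 160°, placing residues two positions apart on nearly opposite faces.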
It is notable but not surprising that the three proline residues in the sequence we have considered are important in determining the secondary structure. Because of its inability to form backbone hydrogen bonds with the residue four positions toward the N-terminus, proline is acknowledged as "breaking" α-helices. We observed that local increases in the helicity tended to coincide with a proline residue, consistent with this suggestion. These three prolines are conserved, 1 and it has been suggested for a long time that proline residues not only are enriched in the transmembrane helices of transport proteins 50 but also can play important roles in the functioning of membrane proteins. For example, it has been suggested that straightening the kinks introduced into a transmembrane helix by a proline is a way of storing energy (as strain) in membrane proteins, such as transporters, that can be released later in the functional cycle. 51 Like their soluble cousins, proline residues in membrane proteins can also induce helix initiation. This notion is supported by our simulations in which the sequences distal to proline residues are highly helical.

We have made several assumptions throughout that it is important to clarify. The first is that the structure of the polypeptide, including the first transmembrane segment TM1, of AE1 can be described as a helix, and the second is that it is preferable to run a large number of short simulations rather than a few long simulations. Given that we obtain similar results when the simulations are seeded with structures determined by NMR experiments, 39 these assumptions appear to be reasonable. We have also assumed that this polypeptide is inserted into a transmembrane orientation by the translocon; again this is realistic given the experimental data. 32−34 Finally, we have assumed that a simple POPC lipid bilayer is a good mimic of the environment experienced by a single helix that has just been ejected from the translocon into the ER membrane.

Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10

Notes
The authors declare no competing financial interest.

Figure 6. Water accessibility of the wild-type and SAO mutant peptides. A residue is defined as being accessible to water if at least one water oxygen atom is found within 0.6 nm of the residue; the values shown are averaged over all the simulations. The sawtooth pattern is due to the amphipathic nature of the H1 helix in the flanking segments, which lie approximately perpendicular to the membrane normal with one side facing water and the other facing the lipid bilayer. This positions polar and/or charged residues Arg387, Arg388, Tyr392, Asp396, and Asp399 facing water and nonpolar Ile386, Pro391, Leu394, and Ile397 facing the lipid bilayer. Considering the C-terminal end of TM1, our simulations predict that the first residues that are accessible to water are Pro419 and Phe423.
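The water accessibility defined in the caption of Figure 6 can be expressed as a minimal sketch. It assumes that "within 0.6 nm of the residue" means within 0.6 nm of any atom of that residue, and it ignores periodic boundary conditions for brevity; both choices, together with the function name and the input layout, are assumptions made for illustration rather than a description of the analysis code used here.

```python
import numpy as np

def water_accessibility(residue_atoms, water_oxygens, cutoff_nm=0.6):
    """Fraction of frames in which a residue is accessible to water.

    residue_atoms: sequence over frames of (n_atoms, 3) arrays (nm)
    water_oxygens: sequence over frames of (n_waters, 3) arrays (nm)
    A frame counts as accessible if at least one water oxygen lies within
    cutoff_nm of any atom of the residue (assumed definition; periodic
    images are not considered in this simplified version).
    """
    accessible = []
    for res, wat in zip(residue_atoms, water_oxygens):
        res = np.asarray(res, dtype=float)
        wat = np.asarray(wat, dtype=float)
        # All atom-to-water-oxygen distances for this frame.
        dists = np.linalg.norm(res[:, None, :] - wat[None, :, :], axis=-1)
        accessible.append(bool((dists <= cutoff_nm).any()))
    return float(np.mean(accessible))
```

Evaluating this function for every residue and pooling the frames from all simulations gives a per-residue profile of the kind plotted in Figure 6.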
