Coupled Mutations-Enabled Glycerol Transportation in an Aquaporin Z Mutant

Aquaporins are transmembrane channel proteins whose key function is the transport of water or other small substrates. Escherichia coli Aqp Z transports only water molecules, whereas Glp F is permeable to glycerol. It is intriguing to explore whether glycerol permeability can be induced in Aqp Z by targeted mutations. Aqp Z mutants with mutated selectivity filter (SF) residues exhibit poor permeability for both glycerol and water. To address the complexity of protein systems, pair correlation information from protein sequence analyses is instructive for identifying residues that are coupled by coevolution and motion. In this study, we analyze the correlation between residues and unravel the clustering patterns of coupled residues, beyond the SF residues, in aquaglyceroporins (AQGPs). We propose introducing the identified coupled motifs into the aquaporin Aqp Z to confer glycerol permeability. These residues are located in the vicinity of the SF region, the C-loop, and the M6–M7 linkage domain. An all-atom replica exchange molecular dynamics simulation shows a significant enlargement of the SF pore size in the proposed Aqp Z mutant, which is critical for facilitating considerable glycerol passage, as characterized in the calculated free-energy landscapes. Clearly, the hidden connections among residues play crucial roles in water/glycerol selectivity. In contrast, a scheme based on single-site mutations may even lead to undesirable effects in AQGPs, such as the blocking of water transport by an aromatic π-stacked gate. As demonstrated in this work, rational mutagenesis guided by pair correlation analysis provides a feasible strategy for modulating protein function.


Section 1. Cross reference table
The aquaporin sequences for the SCA analysis were obtained using BLAST searches against a non-redundant protein database. Fourteen aquaporin sequences, each representing a unique aquaporin type, were selected from the database. A PSI-BLAST [1] search (e < 0.001) was run for each of these 14 sequences to generate a group of more than 3000 homologous sequences for each type. After removing sequences with high identity (>90%) and keeping only bacterial proteins annotated as either "water transporter" or "glycerol facilitator", 305 sequences were obtained. A multiple sequence alignment was conducted on these bacterial aquaporin sequences. The aligned sequences had 192 columns (positions) after removing columns containing more than 30% gaps. For convenience, the equivalent residues at each position in AQGPs and AQPs are represented by their corresponding residues in Glp F and Aqp Z.
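The column-filtering step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `drop_gappy_columns`, the choice of `-` and `.` as gap characters, and the representation of the alignment as a list of equal-length strings are all assumptions made for illustration. "More than 30% gaps" is read here as a strict inequality.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.30, gap_chars="-."):
    """Remove alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment: list of aligned sequences (strings of equal length).
    Hypothetical helper; the gap characters are an assumption.
    """
    if not alignment:
        return []
    n_seqs = len(alignment)
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    # Keep a column only if its gap fraction is at or below the threshold.
    keep = [
        col
        for col in range(width)
        if sum(seq[col] in gap_chars for seq in alignment) / n_seqs
        <= max_gap_fraction
    ]
    return ["".join(seq[col] for col in keep) for seq in alignment]


if __name__ == "__main__":
    toy = ["AC-G", "A--G", "ACTG", "A-TG"]
    # Columns 1 and 2 each contain 50% gaps and are dropped.
    print(drop_gappy_columns(toy))
```

Applied to the real alignment, a filter of this kind would leave the 192 positions used in the SCA analysis.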

Section 2. General network analysis of aquaporin results
By connecting positions according to their SCA scores, we generated a network (referred to as the SCA network) of bacterial aquaporin proteins, covering both AQPs and AQGPs. The multiple sequence alignment gave a set of sequences with 192 columns each. SCA therefore yielded a 192 × 192 matrix with 18528 ((192 × 192 + 192)/2) non-redundant pairs (Fig. S1), in which the SCA scores are represented by a color gradient from high to low.

Figure S1 The 192 × 192 matrix of SCA scores. The dot at (j, k) represents the SCA score of the two positions j and k. The value increases evenly along the color scale shown on the right.
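The count of 18528 follows from taking the upper triangle of a symmetric 192 × 192 matrix, diagonal included. The short Python sketch below checks this arithmetic; the function names are hypothetical and serve only to illustrate the counting.

```python
def count_nonredundant_pairs(n_positions):
    """Number of unordered pairs (j, k) with j <= k, i.e. n(n + 1)/2."""
    return n_positions * (n_positions + 1) // 2


def iter_nonredundant_pairs(n_positions):
    """Yield each (j, k) of the upper triangle, diagonal included."""
    for j in range(n_positions):
        for k in range(j, n_positions):
            yield (j, k)


if __name__ == "__main__":
    n = 192
    print(count_nonredundant_pairs(n))               # closed-form count
    print(sum(1 for _ in iter_nonredundant_pairs(n)))  # explicit enumeration
```

Both the closed form and the explicit enumeration agree with (192 × 192 + 192)/2.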

Section 3. Comparison between SF residues' bond angles in mAqpZ & Glp F
The average value of each bond angle in the SF residues of mAqpZ and Glp F was computed (Fig. S2). The results show that the local molecular configuration in the SF region of mAqpZ is very similar to that of Glp F.

Figure S2 The average value of each bond angle in the SF residues of mAqpZ and Glp F
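As an illustration of how an average bond angle of this kind could be obtained from simulation frames, the following Python sketch computes the angle at a central atom from three Cartesian coordinates and averages it over frames. It is a sketch under stated assumptions, not the authors' analysis code: the function names and the frame format (a list of three (x, y, z) tuples per frame) are hypothetical, and a plain arithmetic mean is assumed.

```python
import math


def bond_angle(a, b, c):
    """Angle in degrees at atom b formed by the bonds b-a and b-c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cos_theta))


def mean_bond_angle(frames):
    """Arithmetic mean of one bond angle over frames of (a, b, c) coordinates."""
    angles = [bond_angle(a, b, c) for a, b, c in frames]
    return sum(angles) / len(angles)


if __name__ == "__main__":
    frames = [
        ((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)),   # right angle
        ((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)),  # straight line
    ]
    print(mean_bond_angle(frames))
```

Comparing such per-angle averages between two proteins gives a simple measure of how similar their local geometries are.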

Section 4. Amino acids distributed on and near the channels of Glp F, Aqp Z and mAqpZ
As reported in a previous study [2], the electrostatic profile in the ar/R region of AQGPs is more negatively charged than that of AQPs. The electrostatic status of the proteins is considered to be correlated with glycerol permeability. In this work, we use Pore-walker [3] to determine the pore-lining residues of Glp F, Aqp Z and mAqp Z. From the results, the residues in the vicinity of the SF, or ar/R region, are presented in Fig. S3. Aqp Z exhibits a more positively charged profile in the SF region, with two positively charged residues, Arg and His. In contrast, Glp F has a more negatively charged profile, with a positively charged Arg and a negatively charged Asp. Glu152, although not a pore-lining residue, is located very close to the lumen and contributes to the negatively charged profile of Glp F. In mAqp Z, a negatively charged electrostatic profile is also observed. With a positively charged Arg and two negatively charged residues, Glu and Asp, the electrostatic profile of mAqp Z is more negative than that of Glp F. Glycerol transport through mAqp Z is therefore expected to be favored.

Figure S3 Amino acids distributed on and near the channels of Glp F, mAqp Z and wtAqp Z. Acidic residues are colored in blue, basic residues in red, polar residues in purple, and non-polar ones in black. Glu152, although not a pore-lining residue, is located very close to the lumen and is shown above Ala in the Glp F sequence.

Section 5. Hydrogen bond acceptors in mAqp Z
Hydrogen bond interactions are considered to be important for both water and glycerol conduction [4][5]. Here, we analyze the structure of the pore-lining residues in the vicinity of the SF (Fig. S4). These residues act as H-bond acceptors that facilitate the transport of both water and glycerol. For instance, Glu138 and Asp190 can form H-bonds with glycerol through the carboxyl groups on their side chains, making the channel more conductive to glycerol molecules.