Sequence-Based Prediction of Protein–Carbohydrate Binding Sites Using Support Vector Machines
    † ‡ School of Information and Communication Technology and Institute for Glycomics, Griffith University, Parklands Drive, Southport, Queensland 4215, Australia

    Journal of Chemical Information and Modeling

    Cite this: J. Chem. Inf. Model. 2016, 56, 10, 2115–2122
    https://doi.org/10.1021/acs.jcim.6b00320
    Published September 13, 2016
    Copyright © 2016 American Chemical Society

    Abstract


    Carbohydrate-binding proteins play significant roles in many diseases, including cancer. Here, we established a machine-learning-based method, SPRINT-CBH (sequence-based prediction of residue-level interaction sites of carbohydrates), that predicts carbohydrate-binding sites in proteins using support vector machines (SVMs). We found that integrating evolution-derived sequence profiles with additional sequence information and predicted solvent-accessible surface area leads to a reasonably accurate, robust, and predictive method, with an area under the receiver operating characteristic curve (AUC) of 0.78 and 0.77 and a Matthews correlation coefficient (MCC) of 0.34 and 0.29 for 10-fold cross-validation and the independent test, respectively, without balancing binding and nonbinding residues. The quality of the method is further demonstrated by statistically significantly more binding residues being predicted for carbohydrate-binding proteins than for presumptive nonbinding proteins in the human proteome, and by the bias of rare alleles toward predicted carbohydrate-binding sites for nonsynonymous mutations from the 1000 Genomes Project. SPRINT-CBH is available as an online server at http://sparks-lab.org/server/SPRINT-CBH.
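
    The two metrics quoted in the abstract, AUC and the Matthews correlation coefficient (MCC), are both commonly used when binding residues are far rarer than nonbinding ones. The following is a minimal sketch of how each can be computed from per-residue labels and scores. It is not the authors' evaluation code: the function names and the toy labels and scores are hypothetical and chosen only for illustration.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = binding residue, 0 = nonbinding residue.
toy_labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.2, 0.4, 0.1, 0.35, 0.3, 0.05, 0.6, 0.15, 0.25]

# Threshold the scores (0.5 is an arbitrary illustrative cutoff) to get counts for MCC.
threshold = 0.5
preds = [1 if s >= threshold else 0 for s in toy_scores]
tp = sum(1 for y, p in zip(toy_labels, preds) if y == 1 and p == 1)
tn = sum(1 for y, p in zip(toy_labels, preds) if y == 0 and p == 0)
fp = sum(1 for y, p in zip(toy_labels, preds) if y == 0 and p == 1)
fn = sum(1 for y, p in zip(toy_labels, preds) if y == 1 and p == 0)

print("AUC:", roc_auc(toy_labels, toy_scores))
print("MCC:", matthews_corrcoef(tp, tn, fp, fn))
```

    AUC is threshold-free, whereas MCC depends on the cutoff used to turn scores into binding/nonbinding calls, which is one reason the two are usually reported together on unbalanced residue sets.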


