
Procrustes Cross-Validation—A Bridge between Cross-Validation and Independent Validation Sets

Cite this: Anal. Chem. 2020, 92, 17, 11842–11850
Publication Date (Web): August 10, 2020
Copyright © 2020 American Chemical Society


    In this paper, we propose a new approach for the validation of chemometric models. It is based on the k-fold cross-validation algorithm but, in contrast to conventional cross-validation, it makes it possible to create a new dataset that carries the sampling uncertainty estimated by the cross-validation procedure. This dataset, called a pseudo-validation set, can be used in the same way as an independent test set, making it possible to compute residual distances, explained variance, scores, and other results that cannot be obtained with conventional cross-validation. The paper describes the theoretical details of the proposed approach and its implementation, and presents experimental results obtained using simulated and real chemical datasets.
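The idea in the abstract can be illustrated with a rough sketch in Python with NumPy. This is not the authors' algorithm: their rotation is the subject of Section S1 of the Supporting Information, which is not reproduced here. The sketch assumes a PCA model, global mean centring, systematic fold assignment, and a simple block-wise orthogonal Procrustes rotation; all function names (`pca_loadings`, `complement`, `align`, `pseudo_validation_set`) are hypothetical. For each fold, a local model is fitted without the held-out rows, and those rows are then rotated so that their distances to the global model match their distances to the local model.

```python
import numpy as np

def pca_loadings(X, n_comp):
    """Loadings (p x n_comp) of a PCA model fitted to already-centred X."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:n_comp].T

def complement(P):
    """Orthonormal basis of the subspace orthogonal to the columns of P."""
    Q, _ = np.linalg.qr(P, mode="complete")
    return Q[:, P.shape[1]:]

def align(A, B):
    """Orthogonal Procrustes: the orthogonal Q minimising ||A Q - B||_F."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

def pseudo_validation_set(X, n_comp, n_folds):
    """Illustrative pseudo-validation set (simplified, hypothetical variant)."""
    mean = X.mean(axis=0)
    Xc = X - mean                          # assumption: global centring only
    P = pca_loadings(Xc, n_comp)           # global model
    N = complement(P)
    X_pv = np.empty_like(Xc)
    folds = np.arange(len(Xc)) % n_folds   # assumption: systematic folds
    for k in range(n_folds):
        held_out = folds == k
        P_k = pca_loadings(Xc[~held_out], n_comp)  # local model without fold k
        N_k = complement(P_k)
        # Orthogonal map sending the local subspace onto the global one and
        # the local residual subspace onto the global residual subspace.
        R = P_k @ align(P_k, P) @ P.T + N_k @ align(N_k, N) @ N.T
        X_pv[held_out] = Xc[held_out] @ R
    return X_pv + mean

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))               # synthetic data for illustration
X_pv = pseudo_validation_set(X, n_comp=2, n_folds=4)
```

Because `R` is orthogonal, each rotated row keeps its length, and its score and residual distances to the global model equal those of the original row to its local model. The resulting matrix can be projected onto the global model like an ordinary test set, which is the property the abstract describes.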


    Supporting Information

    The Supporting Information is available free of charge at

    • (Section S1) Computing the rotation matrix between two latent variable subspaces, (Figure S2) DoF and PRESS plots for the olives, and (Figure S3) PCA distance plots for A = 3 and 5 (PDF)
