Early Diagnosis: End-to-End CNN–LSTM Models for Mass Spectrometry Data Classification

  • Khawla Seddiki
    Centre de Recherche du CHU de Québec-Université Laval, Québec City, Québec G1V 4G2, Canada
    Univ. Lille, Inserm, CHU Lille, U1192-Protéomique Réponse Inflammatoire Spectrométrie de Masse-PRISM, Lille F-59000, France
  • Frédéric Precioso
    Université Côte d’Azur, CNRS, INRIA, I3S, Sophia Antipolis 06900, France
  • Melissa Sanabria
    Université Côte d’Azur, CNRS, INRIA, I3S, Sophia Antipolis 06900, France
  • Michel Salzet
    Univ. Lille, Inserm, CHU Lille, U1192-Protéomique Réponse Inflammatoire Spectrométrie de Masse-PRISM, Lille F-59000, France
  • Isabelle Fournier
    Univ. Lille, Inserm, CHU Lille, U1192-Protéomique Réponse Inflammatoire Spectrométrie de Masse-PRISM, Lille F-59000, France
  • Arnaud Droit*
    Centre de Recherche du CHU de Québec-Université Laval, Québec City, Québec G1V 4G2, Canada
    *Email: [email protected]

Analytical Chemistry

Cite this: Anal. Chem. 2023, 95, 36, 13431–13437
https://doi.org/10.1021/acs.analchem.3c00613
Published August 25, 2023

Copyright © 2023 The Authors. Published by American Chemical Society. This publication is licensed under CC-BY-NC-ND 4.0.

Abstract


Liquid chromatography–mass spectrometry (LC–MS) is a powerful method for cell profiling. LC–MS technology is a tool of choice for cancer research since it provides molecular fingerprints of the analyzed tissues. However, the ubiquitous presence of noise, peak shifts between acquisitions, and the huge amount of information owing to the high dimensionality of the data make rapid and accurate cancer diagnosis a challenging task. Deep learning (DL) models are not only effective classifiers but are also well suited to jointly learning feature representations and classification tasks. This is particularly relevant when they are applied to raw LC–MS data, as it avoids the need for costly preprocessing and complicated feature selection. In this study, we propose a new end-to-end DL methodology that addresses all of the above challenges at once while preserving the high potential of LC–MS data. Our DL model, a combination of a convolutional neural network (CNN) and a long short-term memory (LSTM) network, is designed for early discrimination between tumoral and normal tissues. The CNN significantly reduces the high dimensionality of the data while learning spatially relevant features; the LSTM enables our model to capture temporal patterns. We show that our model outperforms not only benchmark models but also state-of-the-art models developed on the same data. Our framework is a promising strategy for improving early cancer detection during a diagnostic process.


Introduction


Liquid chromatography–mass spectrometry (LC–MS) is a powerful analytical method for the fast and precise assessment of samples through the determination of their molecular composition. The application of LC–MS technology in cancer studies has seen a major increase, and advances in this domain promote its use for diagnostic purposes by giving accurate tumor results. (1) However, mass spectra can be generated by a variety of instruments, with a variety of ionization technologies, at different resolutions. This multiplicity of protocols makes the development of effective and reproducible pipelines the main challenge facing efforts to deduce diagnostic patterns. Furthermore, noise, acquisition artifacts, and the high dimensionality of the data make analysis more difficult, thus requiring efficient computational tools.
Deep learning (DL) is a subset of the broader family of artificial intelligence methods. In recent years, DL approaches have been successfully applied in various bioinformatics problems with promising results. (2) A key feature driving the success of DL approaches is the representation learning concept: models are designed to automatically learn representations from input and transform them into meaningful outputs through a series of successive layers of increasing abstraction. (3,4) Recent advances in the speed and sensitivity of MS data collection have opened new avenues of application. (5)
In the MS data classification context, several approaches have been proposed for cancer diagnosis. Conventional step-by-step machine learning (ML) algorithms are the most common approaches for the discrimination of tumor samples from normal ones. Methods such as support vector machines (SVM) (6) and random forests (RF) (7) have been used and compared. (8−10) However, building models with these algorithms requires preprocessing to remove non-sample-related data artifacts. This preprocessing includes many steps, such as denoising, baseline correction, normalization, peak detection, and spectrum alignment. (11) Preprocessing is time-consuming and therefore makes ML algorithms unsuitable for rapid clinical analysis. In addition, the combination of preprocessing and feature selection engineering can be a source of errors. Moreover, preprocessing may be effective for one dataset but not for new ones generated from different instruments or with different settings. (12) End-to-end DL algorithms represent an attractive approach, offering various advantages over conventional ML algorithms. Seddiki et al. (13) showed that DL algorithms can effectively learn discriminative features from raw MS data, thereby removing the need for preprocessing and feature engineering steps before classification.
In recent years, there has been increasing interest in the problem of early time series classification. Early classification techniques have been successfully applied to many time-critical problems related to medical diagnosis. The key idea is to determine whether a time series can be classified with sufficient accuracy after seeing only some data points, without waiting for the full-length series. (14) This can be seen as a multiobjective problem in which the accuracy and the earliness of the class predictions must be optimized simultaneously. (15) Higher accuracy can be achieved by waiting for more data points, but opportunities for early interventions might be missed. (14) Classifying without waiting for all of the data to arrive allows action in time-critical applications in which interventions are possible. For instance, early classification during an infection diagnosis process is crucial and enables doctors to design appropriate treatment strategies at the early stages of the disease. However, building efficient classifiers able to make early predictions is far from trivial.
Convolutional neural networks (CNNs) are a type of DL architecture. A typical CNN includes convolutional layers, which learn spatially invariant features from an input and store them in feature maps; pooling operators, which extract the most prominent structures; and fully connected layers for classification. (16) Recently, CNNs have been successfully employed for one-dimensional (1D) and two-dimensional MS-based clinical diagnosis. (17,18) Long short-term memory (LSTM) networks (19) are another of the most successful DL architectures. LSTM networks have become a prominent technique in various fields related to time series data. (20) Recently, LSTMs have gained enormous interest in healthcare domains such as electronic health record (EHR) (21) and clinical notes analysis. (22) Reported applications of LSTMs to MS data classification are very limited. Zanjani et al. (23) investigated LSTMs for MS imaging classification. Liu et al. (24) used an LSTM network for substance detection using a time-step parameter specific to a proton-transfer spectrometer. An LSTM-RNN network was proposed by Zhang et al. (25) to classify chemical substances by exploiting the evolution of mass-to-charge ratio (m/z) values.
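Since reported LSTM applications to MS data are sparse, it may help to recall what one LSTM step actually computes. The sketch below is a minimal single-cell forward pass in plain NumPy; the study's own models are built in Keras, so the weight shapes, random values, and gate ordering here are illustrative assumptions only, not the authors' configuration.

```python
import numpy as np

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. Gate order in the stacked weights:
    input, forget, cell candidate, output."""
    n = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    i = sigmoid(z[:n])           # input gate: how much new information enters
    f = sigmoid(z[n:2 * n])      # forget gate: how much old cell state is kept
    g = np.tanh(z[2 * n:3 * n])  # candidate cell state
    o = sigmoid(z[3 * n:])       # output gate
    c = f * c_prev + i * g       # updated cell state (long-term memory)
    h = o * np.tanh(c)           # updated hidden state (short-term output)
    return h, c

rng = np.random.default_rng(0)
n_in, hidden, T = 8, 4, 6
W = rng.normal(scale=0.1, size=(4 * hidden, n_in))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
for _ in range(T):  # iterate over time steps (here: RT points)
    h, c = lstm_step(rng.normal(size=n_in), h, c, W, U, b)
print(h.shape)  # (4,)
```

The cell state `c` is what lets the network carry information across many time steps, which is precisely the temporal-pattern capture the study relies on when RT points act as the time axis.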
In this study, we propose a new two-stage CNN–LSTM model for early MS data classification. We evaluate our model on two public datasets containing tumorous and healthy samples. Our model has the advantage of accurately classifying samples while handling the high-dimensionality issue. Performance evaluations are conducted to demonstrate the strengths of our model. First, we compare our model with a baseline LSTM model. Then, we present a comparison of our model with a baseline hybrid CNN–LSTM model. Finally, we compare our model to state-of-the-art models tested on the same datasets used in this study. In all comparisons, our two-stage CNN–LSTM model outperforms both baseline and state-of-the-art models. The main contributions of this paper can be summarized as follows. First, unlike previous studies, this study overcomes the barrier of dimensionality by using a CNN model as a feature extractor to create embedded data representations; these representations are subsequently fed to an LSTM model to generate class predictions. Our model takes advantage at once of the CNN's spatial pattern extraction and the LSTM's capability to capture temporal patterns. Second, unlike the usual classification procedure of extracting the retention time (RT) only for the molecules of interest, we formulate the classification task as a time series classification framework in which all RT points act as time steps. Finally, we develop a powerful tool for early cancer detection during the diagnosis process. To our knowledge, this could be the first study reporting the early classification of MS data using DL models.

Materials and Methods

Click to copy section linkSection link copied!

Datasets

We use two publicly available MS datasets to evaluate our proposed approach. The first one is from Jiang et al.’s study. (26) It contains 112 paired tumor and non-tumor tissues of early-stage hepatocellular carcinoma. Molecules range from 300 to 1400 Da and are acquired over a 78 min RT gradient. The second dataset is from Bifarin et al.’s study. (27) It contains 82 renal cell carcinoma (RCC) and 174 control samples obtained from urine samples. Molecules range from 70 to 1060 Da and are acquired in positive and negative modes over a 10 min RT gradient.

Experimental Design

All analyses in the present study are performed on data that have not undergone any preprocessing steps. We import the datasets and bin them along the m/z and RT dimensions as described in the Matrix Construction section of the Supporting Information. Then, we scale intensity values linearly between 0 and 1. Model hyperparameter optimization is described in the Supporting Information. The performance of the classifiers is measured by three metrics: accuracy, sensitivity, and specificity. To evaluate the statistical significance of the differences in accuracy between models, we perform t-tests (p < 0.001) over 10 independent iterations.
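The scaling and binning steps above can be sketched as follows. The array sizes and the group-of-five RT binning are toy assumptions chosen for illustration; the study's exact binning scheme is the one described in the Supporting Information's Matrix Construction section.

```python
import numpy as np

# Toy stand-in for one acquisition: rows = RT scans, columns = m/z bins.
rng = np.random.default_rng(1)
raw = rng.exponential(scale=100.0, size=(20, 50))

# Linear (min-max) scaling of intensities to [0, 1].
lo, hi = raw.min(), raw.max()
scaled = (raw - lo) / (hi - lo)

# RT binning sketched as summing groups of 5 consecutive scans
# (e.g. collapsing 2 s scans into a 10 s window).
binned = scaled.reshape(4, 5, 50).sum(axis=1)
print(scaled.min(), scaled.max(), binned.shape)  # 0.0 1.0 (4, 50)
```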

Our Two-Stage CNN–LSTM Models

A key challenge in MS analysis arises from the high dimensionality of the data. This problem is known as the curse of dimensionality. It typically degrades the effectiveness of classification algorithms when the considered mass range is large, which is especially the case for untargeted studies. To overcome this phenomenon, we propose a two-stage CNN–LSTM model. First, we reduce the high dimensionality of the data while preserving the relevant features through a CNN embedding model. To this end, we map the input mass spectra into a 1000-dimensional space with the first CNN dense layer of 1000 neurons. This embedding reduces the number of features from 9680 and 9920 m/z bins for the hepatic and renal data, respectively, to 1000. Then, we feed these learned representations to LSTM models. The workflow of our proposed approach is described in Supporting Information Figure S.1.
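Shape-wise, the two stages chain together as below. A random, untrained linear projection stands in here for the trained CNN's dense layer, purely to show the dimensionality reduction; the dimensions are the hepatic values quoted in the text.

```python
import numpy as np

# Hepatic data at 10 s RT binning: 468 RT steps x 9680 m/z bins (values from the text).
rng = np.random.default_rng(2)
n_rt, n_mz, emb_dim = 468, 9680, 1000

spectra = rng.random((n_rt, n_mz)).astype(np.float32)  # one sample, intensities in [0, 1]

# Stage 1 stand-in: the trained CNN's first dense layer maps each RT step's
# spectrum into a 1000-dimensional embedding. A random projection with a
# ReLU is used here only to show the shape reduction, not learned features.
W_emb = rng.normal(scale=0.01, size=(n_mz, emb_dim)).astype(np.float32)
embedded = np.maximum(spectra @ W_emb, 0.0)
print(embedded.shape)  # (468, 1000)

# Stage 2: the LSTM then consumes `embedded` as a length-468 sequence of
# 1000-dimensional feature vectors, one per RT time step.
```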

Comparison of Our Models with LSTM Models

Several studies reported that LSTM networks suffer from high dimensionality. (28) The aim of this experiment is to assess LSTM performance in a context where data are high dimensional. We compare our model performances to baseline LSTM models applied to raw data (without embedding). To make a fair comparison with our two-stage CNN–LSTM models, we adopt the same LSTM hyperparameters’ optimization process.

Comparison of Our Models with Hybrid CNN–LSTM Models

Many studies have reported an increase in accuracy, as well as better generalization performance, due to the feature extraction capability of hybrid CNN–LSTM models. (29) In this experiment, we compare our model performances to baseline hybrid CNN–LSTM models applied to raw data. The aim is to assess the contribution of convolutional layers to LSTM models. For each of the RT binning windows, three CNN architectures are tested: a two-layer CNN and two single-layer CNNs. The three architectures and their details are given in Supporting Information Figure S.2. We combine each of the three CNNs with the best-performing LSTM model.

Comparison of Our Models with State-of-the-Art Models

The aim of this experiment is to compare our model performances to state-of-the-art models applied to the same MS data used in this study. We chose two studies from two proteomic reference journals. The first study (30) proposed a CNN model to classify the hepatic data. The authors compared their CNN to 5 ML models, namely, PCA-SVM, SVM, gradient boosting decision tree, logistic regression, and RF. They started by detecting precursor ions and their extracted ion chromatograms. Then, they selected the most common precursors across all samples. After that, they filtered the selected precursors using an SVM to keep only the best discriminating ones. Finally, they trained the six aforementioned classifiers to distinguish tumor samples from normal ones. The second study (27) proposed an ML approach to classify RCC samples and to identify potential biomarkers from MS and nuclear magnetic resonance (NMR) data. The authors tested four ML classification models, namely, RF, k-nearest neighbors (k-NN), linear kernel SVM, and radial basis function kernel SVM. They first used propensity score matching to make the healthy and RCC groups comparable with respect to sample covariates. This resulted in the selection of 31 healthy and 31 RCC samples to form the training cohort. A total of 7097 features detected with LC–MS (4623 from positive mode and 2474 from negative mode) and 50 features quantified with NMR were used. Then, they compared a variety of feature selection strategies before testing the four aforementioned ML models to predict RCC status in the test cohort composed of 143 healthy and 51 RCC samples.

Early Classification with Our Two-Stage CNN–LSTM Models

Once we identify the best-performing two-stage CNN–LSTM model, we investigate its performance in achieving early data classification. The hepatic RT series has a length of 468, 936, and 1560 when the RT binning window is 10, 5, and 3 s, respectively. The renal RT series has a length of 61, 121, and 301 when the RT binning window is 10, 5, and 2 s, respectively. We first divide each of the RT series into smaller segments following the principle of overlapping chunk-by-chunk evaluation. We make predictions on each segment with the best two-stage CNN–LSTM model. The first hepatic data segment has a length of 50 values. The first renal data segment has a length of 10 values. Then, we progressively add 50 RT points for the hepatic data and 10 RT points for the renal data until reaching the total RT length.
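The overlapping chunk-by-chunk evaluation can be expressed as a list of growing prefix segments. Reading "progressively add 50 (or 10) RT points" as prefixes anchored at the start of the series is our interpretation of the text; the helper below is illustrative under that assumption.

```python
def early_segments(series_len, first_len, step):
    """Growing prefix segments for chunk-by-chunk early evaluation:
    [0:first_len], [0:first_len+step], ..., [0:series_len]."""
    ends = list(range(first_len, series_len, step)) + [series_len]
    return [(0, e) for e in dict.fromkeys(ends)]  # dedupe, keep order

# Hepatic data at 10 s binning: 468 RT points, first segment of 50, step of 50.
print(early_segments(468, 50, 50))
# Renal data at 10 s binning: 61 RT points, first segment of 10, step of 10.
print(early_segments(61, 10, 10))
```

A prediction is then made on each prefix with the best two-stage model, so classification can stop as soon as the accuracy on an early prefix is deemed sufficient.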

Results


In this section, we show results for the hepatic data and for the positive- and negative-mode renal data. All accuracy, sensitivity, and specificity metrics for each dataset are reported in the Supporting Information.

Our Two-Stage CNN–LSTM Models Performances

In this experiment, we begin by producing a latent representation of the MS spectra. Multiple 1D-CNN models are trained on the concatenated matrices to classify samples as healthy or cancerous. For the hepatic dataset, the best model for the RT binning at 10 and 5 s is a two-convolutional-layer model, and the best model for the RT binning at 3 s is a one-convolutional-layer model. For the renal dataset, the best model for the RT binning at 10, 5, and 2 s is a one-convolutional-layer model. A three-layer model suffers from overfitting (data not shown). All 1D-CNN architectures used for the subsequent analysis, as well as their classification accuracies, are detailed in the Supporting Information (Figure S.3 and Table S.1).
Figure 1 reports the classification accuracies of different LSTM architectures at the three RT binning times of the embedded hepatic and renal datasets. We find that LSTM models are very efficient at all three binning times of the hepatic data. Increasing the number of neurons shows no noticeable improvement, and increasing the depth of the network does not increase the accuracy, as performances are roughly close for single- and double-layer models. The best accuracy score (0.96 ± 0.02) is obtained with a single layer of 30 neurons on the 10 s RT binning (Supporting Information Table S.2). Comparable results are found on the renal data. Model accuracies are very close, and there is no significant difference between one- and two-layer models. Adding neurons does not increase the accuracy further. The best accuracy scores (0.95 ± 0.01 to 0.98 ± 0.02) are obtained with multiple architectures at the three RT binning times (Supporting Information Table S.2). These results reveal the potential of our two-stage CNN–LSTM classification model. Its main advantage is that the CNN network extracts the most discriminating spatial features, while the LSTM network captures the temporal pattern information.

Figure 1

Figure 1. Accuracies of LSTM models on embedded (a) hepatic, (b) positive renal, and (c) negative renal datasets at three RT binning times (10, 5, and 3 or 2 s). (30) means one LSTM layer of 30 neurons, and (30,30) means two LSTM layers of 30 neurons each.

Comparison of Our Models with LSTM Models Performances

Figure 2 reports the classification accuracies of different LSTM architectures at the three RT binning times of the raw hepatic and renal datasets. For the hepatic data, we observe that LSTM models do not perform well enough in most classification tasks. Increasing the number of LSTM neurons slightly decreases the accuracy at all three binning times, and accuracies also gradually deteriorate as the binning time decreases. Double-layer models perform substantially better than single-layer models. The best accuracy score (0.87 ± 0.04) is obtained with a double-layer architecture of 30 and 60 neurons on the 10 s RT binning (Supporting Information Table S3). The same findings are encountered in the positive renal data: accuracy decreases when the number of neurons increases and when the binning time decreases. The best accuracy score (0.83 ± 0.03) is obtained with a double-layer architecture of 30 neurons on the 10 s RT binning. For the negative renal data, the best accuracy score (0.78 ± 0.04) is obtained with a double-layer architecture of 60 neurons on the 10 and 5 s RT binning (Supporting Information Table S3). Compared to the previous classification results on hepatic and renal data with our two-stage model, we find a clear performance superiority of the LSTM model when it is preceded by a CNN embedding step.

Figure 2

Figure 2. Accuracies of LSTM models on raw (a) hepatic, (b) positive renal, and (c) negative renal datasets at three RT binning times (10, 5, and 3 or 2 s). (30) means one LSTM layer of 30 neurons, and (30,30) means two LSTM layers of 30 neurons each.

Comparison of Our Models with Hybrid CNN–LSTM Models Performances

As previously mentioned, we do not evaluate all CNN–LSTM combinations. We select the best-performing LSTM architecture from Figure 2 to evaluate the performance of the hybrid CNN–LSTM models. When several LSTM models have similar performances, we choose the simplest one, as it requires fewer parameters and less computational resources. Figure 3 lists the classification accuracies of CNN–LSTM models at the three RT binning times of the raw hepatic and renal datasets.

Figure 3

Figure 3. Accuracies of hybrid CNN–LSTM models on raw (a) hepatic, (b) positive renal, and (c) negative renal datasets at three RT binning times (10, 5, and 3 or 2 s). For hepatic and positive renal data, M1: Conv(32,21)(16,11)-LSTM(30,30); M2: Conv(32,21)-LSTM(30,30); and M3: Conv(32,11)-LSTM(30,30). For negative renal data, M1: Conv(32,21)(16,11)-LSTM(60) (10 s)/LSTM(60,60) (5 s)/LSTM(120) (2 s); M2: Conv(32,21)-LSTM(60) (10 s)/LSTM(60,60) (5 s)/LSTM(120) (2 s); and M3: Conv(32,11)-LSTM(60) (10 s)/LSTM(60,60) (5 s)/LSTM(120) (2 s). Conv(32,21) means 32 kernels of size 21 in the convolutional layer. LSTM(30,30) means two LSTM layers of 30 neurons each.

Results indicate that hybrid CNN–LSTM models perform considerably better than LSTM models for all hepatic RT binnings. The best improvement in performance over the LSTM models is 3% for 10 s binning, 7% for 5 s binning, and 10% for 3 s binning. The best accuracy score (0.90 ± 0.04) is obtained with an LSTM double layer of 30 neurons combined with two convolutional layers on the 10 s RT binning (Supporting Information Table S4). We note the same accuracy improvements for the renal dataset. The best improvements in performance over the LSTM models are 6 and 9% for 10 s binning, 7 and 10% for 5 s binning, and 5 and 6% for 2 s binning for the positive and negative renal data, respectively. The best positive renal accuracy score (0.88 ± 0.04) is obtained with an LSTM double layer of 30 neurons combined with one convolutional layer on the 10 s RT binning. The best negative renal accuracy score (0.86 ± 0.02) is obtained with an LSTM double layer of 60 neurons combined with one or two convolutional layers on the 5 s RT binning (Supporting Information Table S4). These results show that the feature extraction of the convolutional part adds significant class discrimination value. It confirms that convolutional layers can extract relevant spatial features while filtering noise from the input data. Nevertheless, because of the high dimensionality of the data, these models do not reach the accuracy of our two-stage CNN–LSTM model. Results from the LSTM and hybrid CNN–LSTM performances strengthen our previous hypothesis that these models suffer from the high dimensionality of the data. This is especially the case when the considered mass range is large, requiring a dimensionality reduction phase prior to classification.

Comparison of Our Models with State-of-the-Art Models

In this section, we report the classification results from two independent studies that use the same datasets. The best classification model from the first study for the hepatic data was a CNN model with an accuracy score of 0.94 ± 0.01. On the same data, we achieve an accuracy score of 0.96 ± 0.02 when using our two-stage CNN–LSTM model on data binned at 10 s. It should be pointed out that the classification accuracy in that study was obtained on the training data, unlike the usual procedure of reporting the accuracy score on an independent test set, as in our study.
After a feature selection step in the second study, the authors identified a first panel of 10 metabolites. The best classification result on this panel was obtained with a k-NN model, with an accuracy of 0.87. Eight of the 10 selected metabolites were in lower relative abundance in RCC samples versus controls. The authors then identified a second panel containing 5 metabolites with higher relative abundance in the RCC samples versus controls. The best classification result on this panel was obtained with a k-NN model, with an accuracy score of 0.81. A third metabolite panel was formed to include only annotated features from the two previous panels. The most accurate model was a linear SVM, with an accuracy score of 0.88. Our two-stage CNN–LSTM models achieve accuracy scores of 0.98 ± 0.01 and 0.97 ± 0.01 on this dataset at the three RT binning times. On top of that, our model is designed to classify raw MS data, while the first model requires several filtering steps to classify only the most discriminating signal, and the second model requires sophisticated preprocessing and feature selection steps.

Early Classification Performances with Our Two-Stage CNN–LSTM Models

In this section, we investigate the application of our two-stage CNN–LSTM models to early MS data classification. Our goal is to achieve good performance on early spectra chunks without waiting for the full-length RT. To this end, we divide the RT series into small segments of length 50 and 10 for the hepatic and renal data, respectively. We make predictions on each segment with the best-performing architecture from Figure 1. Figure 4 illustrates the accuracies of our model on the hepatic and renal datasets at the three RT binning times. All accuracies are reported in Supporting Information Tables S.5–S.13.

Figure 4

Figure 4. Early classification accuracies of our two-stage CNN–LSTM model on (a) hepatic, (b) positive renal, and (c) negative renal datasets at three RT binning times (10, 5, and 3 or 2 s).

There are several interesting outcomes in these results. For the hepatic data, we observe that for the 10 s RT binning, the performance is equally good regardless of the segment used. For the 5 s RT binning, the model starts to be effective from 100 RT points. A similar situation occurs for the 3 s RT binning, where the model is only efficient starting from 200 RT points. This may be due to the low number and low intensities of molecules that elute at the beginning of the acquisition, making it difficult to distinguish between the two classes. From these results, we can propose an early detection threshold for the hepatic dataset from the first seconds when the chosen binning time is 10 s. For the renal data, the performance is equally good regardless of the segment used, whatever the binning time. We can therefore propose a unified early detection threshold for the renal dataset from the first seconds, whatever the considered binning time.

Discussion


LSTM networks have become increasingly popular in research problems that require modeling sequential data. However, their applicability to MS studies is not well investigated. Many challenges remain in fully exploiting LSTMs, owing mainly to the high dimensionality of the inputs. In the present study, we proposed a novel end-to-end CNN–LSTM framework for the early discrimination of tumor from non-tumor samples. We developed an accurate classification model by combining CNN and LSTM networks. Trained on two independent public MS datasets, our model not only outperformed baseline and state-of-the-art models but was also able to achieve early data classification. The goal of early classification is to correctly predict the label of a time series before it is fully observed. We showed that only the first seconds of the whole time series were sufficient to accurately classify the data. Our model has proven effective on datasets with both a short (10 min) and a long (78 min) RT gradient. The advantages of our proposed method are as follows. The embedding step that precedes classification allows us to avoid the curse of dimensionality: the embedding filters noise so that the remaining variance corresponds to relevant biological differences. By combining a CNN network for embedding and an LSTM network for classification, our model benefits from their respective advantages: the CNN excels at extracting spatial feature information, while the LSTM captures the temporal patterns. In addition, our model was able to classify raw MS data without the need for preprocessing steps, thus decreasing time, expertise, and computational costs. Using raw MS data has the potential to contribute significantly to the development of a diagnosis workflow for the rapid and reliable detection of cancers.
Although we have focused our study on tumor versus healthy classification for cancer diagnosis, we believe that our methodology would be applicable to other challenging MS classification tasks, such as the diagnosis of cancer subtypes or of other diseases.

Conclusions


Our new classification model significantly widens the real-life feasibility of MS decision-making procedures. There are several interesting future directions. One direction is to extend our framework with an end-to-end stopping rule able to detect at which time point the MS acquisition can be stopped. Another interesting research line could focus on the interpretation of the classification models to identify which features influence the classification decision. These features could be used as new diagnostic or therapeutic targets.

Supporting Information


The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.analchem.3c00613.

  • Additional results and figures including matrix construction, our two-stage CNN–LSTM classification model, hybrid CNN–LSTM models architectures, 1D-CNN embedding models, classification performances and early classification performances with our two-stage CNN-LSTM model (PDF)


Author Information


  • Corresponding Author
    • Arnaud Droit - Centre de Recherche du CHU de Québec-Université Laval, Québec City, Québec G1V 4G2, Canada; Email: [email protected]
  • Authors
    • Khawla Seddiki - Centre de Recherche du CHU de Québec-Université Laval, Québec City, Québec G1V 4G2, Canada; Univ. Lille, Inserm, CHU Lille, U1192-Protéomique Réponse Inflammatoire Spectrométrie de Masse-PRISM, Lille F-59000, France
    • Frédéric Precioso - Université Côte d’Azur, CNRS, INRIA, I3S, Sophia Antipolis 06900, France
    • Melissa Sanabria - Université Côte d’Azur, CNRS, INRIA, I3S, Sophia Antipolis 06900, France
    • Michel Salzet - Univ. Lille, Inserm, CHU Lille, U1192-Protéomique Réponse Inflammatoire Spectrométrie de Masse-PRISM, Lille F-59000, France; ORCID: https://orcid.org/0000-0003-4318-0817
    • Isabelle Fournier - Univ. Lille, Inserm, CHU Lille, U1192-Protéomique Réponse Inflammatoire Spectrométrie de Masse-PRISM, Lille F-59000, France; ORCID: https://orcid.org/0000-0003-1096-5044
  • Author Contributions

    K.S. conducted all the experiments and wrote the paper. K.S., F.P., M.S., and A.D. designed the methodology and analyzed the results. A.D., I.F., and M.S. obtained funds for the project. A.D. supervised the study. All authors approved the final version of the manuscript.

  • Notes
    The authors declare no competing financial interest.

The code that supports the findings of this study is available at https://github.com/KhawlaSeddiki/MS_CNN-LSTM. The Keras package (version 2.7.0, Python version 3.6.8) was used in a GPU Linux environment.

Acknowledgments


This research was supported by funding from the Fonds de recherche du Québec-Santé (FRQS), Ministère de l’Enseignement Supérieur, de la Recherche et de l’Innovation (MESRI), Institut National de la Santé et de la Recherche Médicale (Inserm), Agence Nationale de la Recherche (ANR), SATT Nord, Institut National du Cancer (INCA), and Université de Lille, with the support of the Service de coopération et d’action culturelle du Consulat général de France à Québec.

References


This article references 30 other publications.

  1. Feider, C. L.; Krieger, A.; DeHoog, R. J.; Eberlin, L. S. Ambient Ionization Mass Spectrometry: Recent Developments and Applications. Anal. Chem. 2019, 91, 4266–4290. DOI: 10.1021/acs.analchem.9b00807
  2. Berrar, D.; Dubitzky, W. Deep learning in bioinformatics and biomedicine. Briefings Bioinf. 2021, 22, 1513–1514. DOI: 10.1093/bib/bbab087
  3. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. DOI: 10.1038/nature14539
  4. Chollet, F. Deep Learning with Python; Simon & Schuster, 2021.
  5. Wen, B.; Zeng, W.-F.; Liao, Y.; Shi, Z.; Savage, S. R.; Jiang, W.; Zhang, B. Deep Learning in Proteomics. Proteomics 2020, 20, 1900335. DOI: 10.1002/pmic.201900335
  6. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297.
  7. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. DOI: 10.1023/a:1010933404324
  8. Wu, B.; Abbott, T.; Fishman, D.; McMurray, W.; Mor, G.; Stone, K.; Ward, D.; Williams, K.; Zhao, H. Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data. Bioinformatics 2003, 19, 1636–1643. DOI: 10.1093/bioinformatics/btg210
  9. Gredell, D. A.; Schroeder, A. R.; Belk, K. E.; Broeckling, C. D.; Heuberger, A. L.; Kim, S.-Y.; King, D. A.; Shackelford, S. D.; Sharp, J. L.; Wheeler, T. L. Comparison of Machine Learning Algorithms for Predictive Modeling of Beef Attributes Using Rapid Evaporative Ionization Mass Spectrometry (REIMS) Data. Sci. Rep. 2019, 9, 5721. DOI: 10.1038/s41598-019-40927-6
  10. Datta, S.; DePadilla, L. M. Feature selection and machine learning with mass spectrometry data for distinguishing cancer and non-cancer samples. Stat. Methodol. 2006, 3, 79–92. DOI: 10.1016/j.stamet.2005.09.006
  11. Cannataro, M.; Guzzi, P. H.; Mazza, T.; Veltri, P. ACM SIGIR 2005, 112, 17–26.
  12. Engel, J.; Gerretzen, J.; Szymańska, E.; Jansen, J. J.; Downey, G.; Blanchet, L.; Buydens, L. M. TrAC, Trends Anal. Chem. 2013, 50, 96–106. DOI: 10.1016/j.trac.2013.04.015
  13. Seddiki, K.; Saudemont, P.; Precioso, F.; Ogrinc, N.; Wisztorski, M.; Salzet, M.; Fournier, I.; Droit, A. Nat. Commun. 2020, 11, 5595. DOI: 10.1038/s41467-020-19354-z
  14. Gupta, A.; Gupta, H. P.; Biswas, B.; Dutta, T. IEEE Transactions on Artificial Intelligence; IEEE, 2020; Vol. 1, pp 47–61.
  15. Mori, U.; Mendiburu, A.; Dasgupta, S.; Lozano, J. A. Early Classification of Time Series by Simultaneously Optimizing the Accuracy and Earliness. IEEE Transact. Neural Networks Learn. Syst. 2018, 29, 4569–4578. DOI: 10.1109/tnnls.2017.2764939
  16. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE, 2015; pp 1–9.
  17. Papagiannopoulou, C.; Parchen, R.; Rubbens, P.; Waegeman, W. Fast Pathogen Identification Using Single-Cell Matrix-Assisted Laser Desorption/Ionization-Aerosol Time-of-Flight Mass Spectrometry Data and Deep Learning Methods. Anal. Chem. 2020, 92, 7523–7531. DOI: 10.1021/acs.analchem.9b05806
  18. Skarysz, A.; Salman, D.; Eddleston, M.; Sykora, M.; Hunsicker, E.; Nailon, W. H.; Darnley, K.; McLaren, D. B.; Thomas, C.; Soltoggio, A. arXiv preprint arXiv:2006.01772, 2020.
  19. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. DOI: 10.1162/neco.1997.9.8.1735
  20. Van Houdt, G.; Mosquera, C.; Nápoles, G. A review on the long short-term memory model. Artif. Intell. Rev. 2020, 53, 5929–5955. DOI: 10.1007/s10462-020-09838-1
  21. Shickel, B.; Tighe, P. J.; Bihorac, A.; Rashidi, P. Deep EHR: A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record (EHR) Analysis. IEEE J. Biomed. Health Inf. 2018, 22, 1589–1604. DOI: 10.1109/jbhi.2017.2767063
  22. Nguyen, P.; Tran, T.; Wickramasinghe, N.; Venkatesh, S. Deepr: A Convolutional Net for Medical Records; arXiv.org, 2016.
  23. Zanjani, F. G.; Panteli, A.; Zinger, S.; van der Sommen, F.; Tan, T.; Balluff, B.; Vos, D. N.; Ellis, S. R.; Heeren, R. M.; Lucas, M. IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019); 2019; pp 674–678.
  24. Liu, J.; Zhang, J.; Luo, Y.; Yang, S.; Wang, J.; Fu, Q. Mass Spectral Substance Detections Using Long Short-Term Memory Networks. IEEE Access 2019, 7, 10734–10744. DOI: 10.1109/access.2019.2891548
  25. Zhang, J.; Liu, J.; Luo, Y.; Fu, Q.; Bi, J.; Qiu, S.; Cao, Y.; Ding, X. IEEE 17th International Conference on Communication Technology (ICCT); 2017; pp 1994–1997.
  26. Jiang, Y.; Sun, A.; Zhao, Y.; Ying, W.; Sun, H.; Yang, X.; Xing, B.; Sun, W.; Ren, L.; Hu, B. Proteomics identifies new therapeutic targets of early-stage hepatocellular carcinoma. Nature 2019, 567, 257–261. DOI: 10.1038/s41586-019-0987-8
  27. Bifarin, O. O.; Gaul, D. A.; Sah, S.; Arnold, R. S.; Ogan, K.; Master, V. A.; Roberts, D. L.; Bergquist, S. H.; Petros, J. A.; Fernandez, F. M. Machine Learning-Enabled Renal Cell Carcinoma Status Prediction Using Multiplatform Urine-Based Metabolomics. J. Proteome Res. 2021, 20, 3629–3641. DOI: 10.1021/acs.jproteome.1c00213
  28. Su, Z.; Xie, H.; Han, L. Multi-Factor RFG-LSTM Algorithm for Stock Sequence Predicting. Comput. Econ. 2021, 57, 1041–1058. DOI: 10.1007/s10614-020-10008-2
  29. Ma, F.; Zhang, J.; Chen, W.; Liang, W.; Yang, W. Discrete Dynamics in Nature and Society; Hindawi, 2020.
  30. Dong, H.; Liu, Y.; Zeng, W.-F.; Shu, K.; Zhu, Y.; Chang, C. A Deep Learning-Based Tumor Classifier Directly Using MS Raw Data. Proteomics 2020, 20, 1900344. DOI: 10.1002/pmic.201900344

Cited By


This article is cited by 12 publications.

  1. Chenyu Yang, He Chen, Yun Wu, Xiangguo Shen, Hongchun Liu, Taotao Liu, Xizhong Shen, Ruyi Xue, Nianrong Sun, Chunhui Deng. Deep Learning-Enabled Rapid Metabolic Decoding of Small Extracellular Vesicles via Dual-Use Mass Spectroscopy Chip Array. Analytical Chemistry 2025, 97 (1) , 271-280. https://doi.org/10.1021/acs.analchem.4c04106
  2. Xin-Yu Lu, Hao-Ping Wu, Hao Ma, Hui Li, Jia Li, Yan-Ti Liu, Zheng-Yan Pan, Yi Xie, Lei Wang, Bin Ren, Guo-Kun Liu. Deep Learning-Assisted Spectrum–Structure Correlation: State-of-the-Art and Perspectives. Analytical Chemistry 2024, 96 (20) , 7959-7975. https://doi.org/10.1021/acs.analchem.4c01639
  3. Andre Nicolle, Sili Deng, Matthias Ihme, Nursulu Kuzhagaliyeva, Emad Al Ibrahim, Aamir Farooq. Mixtures Recomposition by Neural Nets: A Multidisciplinary Overview. Journal of Chemical Information and Modeling 2024, 64 (3) , 597-620. https://doi.org/10.1021/acs.jcim.3c01633
  4. Yuxia Zheng, Limei Yin, Heera Jayan, Shuiquan Jiang, Hesham R. El-Seedi, Xiaobo Zou, Zhiming Guo. In situ self-cleaning PAN/Cu2O@Ag/Au@Ag flexible SERS sensor coupled with chemometrics for quantitative detection of thiram residues on apples. Food Chemistry 2025, 473 , 143032. https://doi.org/10.1016/j.foodchem.2025.143032
  5. Yijiang Liu, Feifan Zhang, Yifei Ge, Qiao Liu, Siyu He, Xiaotao Shen. Application of LLMs/Transformer-Based Models for Metabolite Annotation in Metabolomics. Health and Metabolism 2025, , 7. https://doi.org/10.53941/hm.2025.100014
  6. Blessina Preethi R, Berin Shalu S, Saranya Nair M, Vergin Raja Sarobin M. EE-SAMS: An adaptive, SNN based energy-efficient data aggregation framework for agrovoltaic monitoring systems. Results in Engineering 2025, 25 , 104053. https://doi.org/10.1016/j.rineng.2025.104053
  7. Tara Menon Pattilachan, Maria Christodoulou, Sharona Ross. Diagnosis to dissection: AI’s role in early detection and surgical intervention for gastric cancer. Journal of Robotic Surgery 2024, 18 (1) https://doi.org/10.1007/s11701-024-02005-6
  8. Olga Kapustina, Polina Burmakina, Nina Gubina, Nikita Serov, Vladimir Vinogradov. User-friendly and industry-integrated AI for medicinal chemists and pharmaceuticals. Artificial Intelligence Chemistry 2024, 2 (2) , 100072. https://doi.org/10.1016/j.aichem.2024.100072
  9. Huanhuan Li, Wei Sheng, Selorm Yao-Say Solomon Adade, Xorlali Nunekpeku, Quansheng Chen. Investigation of heat-induced pork batter quality detection and change mechanisms using Raman spectroscopy coupled with deep learning algorithms. Food Chemistry 2024, 461 , 140798. https://doi.org/10.1016/j.foodchem.2024.140798
  10. Muhammad Shalahuddin Al Ja’farawy, Vo Thi Nhat Linh, Jun-Yeong Yang, Chaewon Mun, Seunghun Lee, Sung-Gyu Park, In Woong Han, Samjin Choi, Min-Young Lee, Dong-Ho Kim, Ho Sang Jung. Whole urine-based multiple cancer diagnosis and metabolite profiling using 3D evolutionary gold nanoarchitecture combined with machine learning-assisted SERS. Sensors and Actuators B: Chemical 2024, 412 , 135828. https://doi.org/10.1016/j.snb.2024.135828
  11. Ushas A K, Dhanya M B, Jerrin Thomas Panachakel, Deepak S, Viji H, Smitha V Thampi, R Satheesh Thampi. Peak Pattern Classification in Mass Spectrometer Data using Deep Learning Techniques. 2024, 1-5. https://doi.org/10.1109/ICECCC61767.2024.10593905
  12. Ming Wei. Performance comparison of multiple neural networks for brain tumor classification. 2024, 12-20. https://doi.org/10.1109/ICBEA62825.2024.00012



  • Figures

    Figure 1. Accuracies of LSTM models on embedded (a) hepatic, (b) positive renal, and (c) negative renal datasets at three RT binning times (10, 5, and 3 or 2 s). (30) denotes one LSTM layer of 30 neurons; (30,30) denotes two LSTM layers of 30 neurons each.

    Figure 2. Accuracies of LSTM models on raw (a) hepatic, (b) positive renal, and (c) negative renal datasets at three RT binning times (10, 5, and 3 or 2 s). (30) denotes one LSTM layer of 30 neurons; (30,30) denotes two LSTM layers of 30 neurons each.

    Figure 3. Accuracies of hybrid CNN–LSTM models on raw (a) hepatic, (b) positive renal, and (c) negative renal datasets at three RT binning times (10, 5, and 3 or 2 s). For hepatic and positive renal data, M1: Conv(32,21)(16,11)-LSTM(30,30); M2: Conv(32,21)-LSTM(30,30); M3: Conv(32,11)-LSTM(30,30). For negative renal data, M1: Conv(32,21)(16,11)-LSTM(60)(10 s)/LSTM(60,60)(5 s)/LSTM(120)(2 s); M2: Conv(32,21)-LSTM(60)(10 s)/LSTM(60,60)(5 s)/LSTM(120)(2 s); M3: Conv(32,11)-LSTM(60)(10 s)/LSTM(60,60)(5 s)/LSTM(120)(2 s). Conv(32,21) denotes 32 kernels of size 21 in the convolutional layer; LSTM(30,30) denotes two LSTM layers of 30 neurons each.

    Figure 4. Early classification accuracies of our two-stage CNN–LSTM model on (a) hepatic, (b) positive renal, and (c) negative renal datasets at three RT binning times (10, 5, and 3 or 2 s).
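    The Conv(kernels, size)-LSTM(neurons, …) shorthand in the captions above can be made concrete with a small helper that tracks how a 1D convolutional front end shrinks a spectrum before it reaches the LSTM stage. This is an illustrative sketch only: the valid padding, unit stride, and pool size used below are assumptions for demonstration, not settings reported in the paper, and only the kernel sizes affect the sequence length (the kernel counts, e.g. 32 and 16, set the channel dimension instead).

    ```python
    def conv1d_out_len(n_in, kernel, stride=1, padding=0):
        """Output length of a 1D convolution (valid padding by default)."""
        return (n_in + 2 * padding - kernel) // stride + 1

    def maxpool1d_out_len(n_in, pool, stride=None):
        """Output length of 1D max pooling (stride defaults to the pool size)."""
        stride = pool if stride is None else stride
        return (n_in - pool) // stride + 1

    def cnn_front_end_len(n_in, conv_specs, pool=2):
        """Sequence length after stacked Conv1D + MaxPool1D blocks.

        conv_specs mirrors the captions' notation, e.g. [(32, 21), (16, 11)]
        for Conv(32,21)(16,11).
        """
        n = n_in
        for _, kernel in conv_specs:
            n = conv1d_out_len(n, kernel)
            n = maxpool1d_out_len(n, pool)
        return n

    # e.g. a hypothetical 10000-bin spectrum through Conv(32,21)(16,11)
    print(cnn_front_end_len(10000, [(32, 21), (16, 11)]))
    ```

    Running the example shows the spectrum shortened to roughly a quarter of its original length, which illustrates the dimensionality reduction the convolutional stage provides before the LSTM layers model the retention-time sequence.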
