Repeated double cross-validation applied to the PCA-LDA classification of SERS spectra: a case study with serum samples from hepatocellular carcinoma patients.

Gurian E(1), Di Silvestre A(1), Mitri E(1), Pascut D(2), Tiribelli C(2), Giuffrè M(2)(3), Crocè LS(2)(3), Sergo V(1)(4), Bonifacio A(5).
Author information:
(1)Raman Spectroscopy Lab, Dipartimento di Ingegneria e Architettura (DIA), University of Trieste, via Valerio 6, 34127, Trieste, TS, Italy.
(2)Fondazione Italiana Fegato - ONLUS, Area Science Park, SS14, km163.5, 34149, Basovizza, Trieste, TS, Italy.
(3)Department of Medical Sciences, University of Trieste, Strada di Fiume, 447, 34129, Trieste, Italy.
(4)Faculty of Health Sciences, University of Macau, Macau, SAR, People's Republic of China.
(5)Raman Spectroscopy Lab, Dipartimento di Ingegneria e Architettura (DIA), University of Trieste, via Valerio 6, 34127, Trieste, TS, Italy.

Abstract

Intense label-free surface-enhanced Raman scattering (SERS) spectra of serum samples were rapidly obtained on Ag plasmonic paper substrates upon 785 nm excitation. Spectra from hepatocellular carcinoma (HCC) patients differed consistently from those of the control group. In particular, uric acid was relatively more abundant in patients, whereas hypoxanthine, ergothioneine, and glutathione were relatively more abundant in the control group. A repeated double cross-validation (RDCV) strategy was applied to optimize and validate principal component analysis-linear discriminant analysis (PCA-LDA) models. An analysis of the RDCV results indicated that a PCA-LDA model using up to the first four principal components achieves good classification performance (average accuracy of 81%). The analysis also allowed confidence intervals to be calculated for the figures of merit and the principal components used by the LDA to be interpreted in terms of metabolites, confirming that bands of uric acid, hypoxanthine, ergothioneine, and glutathione were indeed used by the PCA-LDA algorithm to classify the spectra.
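
To illustrate the general idea of repeated double cross-validation for a PCA-LDA classifier, a minimal sketch with scikit-learn is given here. It is not the authors' implementation: the function name `repeated_double_cv`, the fold counts, the number of repetitions, the candidate range of principal components, the synthetic data standing in for spectra, and the percentile-based interval are all assumptions chosen for illustration. The inner loop selects the number of principal components, the outer loop estimates accuracy on samples never used for that selection, and the whole procedure is repeated with different random splits.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline


def repeated_double_cv(X, y, n_repeats=5, n_outer=5, n_inner=4, max_pcs=6, seed=0):
    """Hypothetical helper: return (accuracy per repetition, PCs chosen per outer fold)."""
    accuracies, chosen_pcs = [], []
    grid = {"pca__n_components": list(range(1, max_pcs + 1))}
    for rep in range(n_repeats):
        # New random splits for every repetition (assumed seeding scheme).
        outer = StratifiedKFold(n_splits=n_outer, shuffle=True, random_state=seed + rep)
        inner = StratifiedKFold(n_splits=n_inner, shuffle=True, random_state=seed + rep)
        y_pred = np.empty_like(y)
        for train_idx, test_idx in outer.split(X, y):
            # PCA is refit inside every fold, so the test samples never
            # influence the principal components or the LDA boundary.
            model = Pipeline([("pca", PCA()), ("lda", LinearDiscriminantAnalysis())])
            # Inner loop: choose the number of PCs using training data only.
            search = GridSearchCV(model, grid, cv=inner, scoring="accuracy")
            search.fit(X[train_idx], y[train_idx])
            # Outer loop: predict the held-out samples with the refit best model.
            y_pred[test_idx] = search.predict(X[test_idx])
            chosen_pcs.append(search.best_params_["pca__n_components"])
        accuracies.append(float(np.mean(y_pred == y)))
    return np.array(accuracies), np.array(chosen_pcs)


if __name__ == "__main__":
    # Synthetic two-class data as a stand-in for preprocessed spectra (assumption).
    X, y = make_classification(n_samples=80, n_features=200, n_informative=5,
                               n_redundant=0, class_sep=2.0, random_state=0)
    acc, pcs = repeated_double_cv(X, y)
    low, high = np.percentile(acc, [2.5, 97.5])  # simple percentile interval
    print(f"mean accuracy: {acc.mean():.2f} (2.5-97.5 percentile: {low:.2f}-{high:.2f})")
    print("PC counts selected:", np.bincount(pcs, minlength=7)[1:])
```

The spread of accuracies across repetitions gives a rough interval for the figure of merit, and the distribution of selected component counts shows how stable the model complexity is across splits.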