Construction and Validation of a Prognostic Gene-Based Model for Overall Survival Prediction in Hepatocellular Carcinoma Using an Integrated Statistical and Bioinformatic Approach.


Dessie EY(1), Tu SJ(2), Chiang HS(2), Tsai JJP(1), Chang YS(2), Chang JG(2), Ng KL(1)(3)(4).
Author information:
(1)Department of Bioinformatics and Medical Engineering, Asia University, No. 500, Lioufeng Rd., Wufeng, Taichung 41354, Taiwan.
(2)Department of Laboratory Medicine and Center for Precision Medicine, China Medical University and Hospital, No. 2, Yude Rd., North District, Taichung 404332, Taiwan.
(3)Department of Medical Research, China Medical University Hospital, China Medical University, No. 2, Yude Rd., North Dist., Taichung 404332, Taiwan.
(4)Center for Artificial Intelligence and Precision Medicine Research, Asia University, No. 500, Lioufeng Rd., Wufeng, Taichung 41354, Taiwan.


Hepatocellular carcinoma (HCC) is one of the most common lethal cancers worldwide and is often associated with late diagnosis and poor survival outcomes. Growing evidence demonstrates that gene-based prognostic models can be used to identify high-risk HCC patients. Our study therefore aimed to construct a novel model for predicting the prognosis of HCC patients. We used a multivariate Cox regression model with three penalized approaches, namely the least absolute shrinkage and selection operator (Lasso), adaptive Lasso, and elastic net, to select informative prognosis-related genes. Best subset regression was then used to identify the best prognostic gene signature. A gene-based prognostic risk score was constructed from the Cox coefficients of the genes in the signature. The model was evaluated by Kaplan-Meier (KM) and receiver operating characteristic (ROC) curve analyses. A novel four-gene signature associated with prognosis was identified, and the risk score was constructed from it. The risk score efficiently stratified patients, identifying a high-risk group with poor prognosis. Time-dependent ROC analysis showed that the risk model performed well, with areas under the curve (AUC) of 0.780, 0.732, and 0.733 for 1-, 2-, and 3-year prognosis prediction, respectively, in The Cancer Genome Atlas (TCGA) dataset. Moreover, the risk score showed high diagnostic performance in distinguishing HCC from normal samples. The prognostic and diagnostic performance of the risk score was verified in external validation datasets. Functional enrichment analysis indicated that the four-gene signature and its co-expressed genes are involved in metabolic and cell cycle pathways. Overall, we developed a novel gene-based prognostic model to predict high-risk HCC patients, and we hope that our findings provide insight into the role of the four-gene signature in HCC and aid risk classification.
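
The risk score described above is a linear combination of gene expression values weighted by their Cox coefficients. The following is a minimal sketch of that calculation, not the authors' implementation. The gene names, coefficients, and expression values are hypothetical placeholders, since the abstract does not report them, and the median cutoff used to define the high-risk group is an assumption made for illustration.

```python
from statistics import median

# Hypothetical Cox coefficients for a four-gene signature.
# Placeholder names and values; they are NOT taken from the study.
COEFS = {"GENE_A": 0.40, "GENE_B": 0.25, "GENE_C": -0.30, "GENE_D": 0.15}


def risk_score(expression, coefs=COEFS):
    """Sum of (Cox coefficient * expression) over the signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(patients, coefs=COEFS):
    """Score every patient and split the cohort at the median score.

    The median cutoff is an assumption; the abstract does not state
    which threshold defines the high-risk group.
    """
    scores = {pid: risk_score(expr, coefs) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: "high" if s > cutoff else "low" for pid, s in scores.items()}
    return scores, groups


# Made-up expression profiles for four illustrative patients.
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.5, "GENE_D": 1.0},
    "P2": {"GENE_A": 0.5, "GENE_B": 0.2, "GENE_C": 2.0, "GENE_D": 0.1},
    "P3": {"GENE_A": 1.0, "GENE_B": 1.0, "GENE_C": 1.0, "GENE_D": 1.0},
    "P4": {"GENE_A": 3.0, "GENE_B": 2.0, "GENE_C": 0.1, "GENE_D": 2.0},
}

scores, groups = stratify(patients)
for pid in patients:
    print(pid, round(scores[pid], 3), groups[pid])
```

In practice the coefficients would come from the fitted multivariate Cox model, and the resulting groups would be compared with KM curves and a log-rank test.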