Fu-SulfPred: Identification of Protein S-sulfenylation Sites by Fusing Forests via Chou's General PseAAC.

Affiliation

College of Science, Dalian Maritime University, Dalian 116026, PR China.

Abstract

Protein S-sulfenylation is an essential post-translational modification (PTM) that provides critical information for understanding the molecular mechanisms of cell signal transduction, stress response and the regulation of cellular functions. Recent advances in computational methods have contributed to the detection of protein S-sulfenylation sites. However, when these methods are applied, the performance of identifying S-sulfenylation sites can be affected by class imbalance in the training datasets. In this study, we designed the Fu-SulfPred model, which uses a stratified structure of three kinds of decision trees to identify potential protein S-sulfenylation sites, together with reconstruction of the training datasets and a sample rescaling technique. Experimental results showed that the Fu-SulfPred model achieved Matthews correlation coefficient (MCC) values of 0.5437, 0.3736 and 0.6809 on three independent test datasets, respectively, all of which exceeded the MCC values of the S-SulfPred model. The Fu-SulfPred model provides a promising scheme for the identification of protein S-sulfenylation sites and other post-translational modifications.
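
For readers unfamiliar with the Matthews coefficient used as the evaluation metric above, the following is a minimal Python sketch of how it is computed from confusion-matrix counts and why it is informative under class imbalance. It is an illustration only, not the authors' code: the function names and the example counts are hypothetical, and returning 0.0 when the denominator vanishes is a common convention assumed here.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn/fp/fn are true positives, true negatives, false positives
    and false negatives. Returns 0.0 when the denominator is zero
    (an assumed convention for degenerate confusion matrices).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical imbalanced test set: 10 modified sites, 90 unmodified sites.
# A classifier that labels every site as negative looks accurate,
# but its MCC shows that it has no predictive value.
always_negative = dict(tp=0, tn=90, fp=0, fn=10)
acc_trivial = accuracy(**always_negative)
mcc_trivial = matthews_corrcoef(**always_negative)
```

In this hypothetical case the trivial classifier reaches an accuracy of 0.9 while its MCC is 0.0, which is why a metric such as MCC is preferred when positive and negative samples are unevenly represented.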

Keywords

Decision trees, Feature extraction, Forest, Prediction, S-Sulfenylation, Sampling