A novel conjoint triad auto covariance (CTAC) coding method for predicting protein-protein interaction based on amino acid sequence.

Affiliation

Institute of Technical Biology & Agriculture Engineering, Chinese Academy of Sciences, Science Island, HeFei City, AnHui Province 230031, China; Institute of Intelligent Machine, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Science Island, HeFei City, AnHui Province 230031, China; University of Science and Technology of China, Hefei City, Anhui Province 230026, China. Electronic address: [Email]

Abstract

Protein-protein interactions (PPIs) play a crucial role in the life-sustaining activities of organisms. Although various methods for predicting PPIs have been developed over the past decades, their robustness and prediction accuracy still need improvement. In this paper, we propose a new sequence-based approach that combines a deep neural network (DNN) with conjoint triad auto covariance (CTAC) coding to predict PPIs more effectively. The CTAC coding method combines the advantages of conjoint triad and auto covariance, and can therefore extract more PPI information from the amino acid sequence. On a human dataset, the resulting model, DNNCTAC, achieved an accuracy of 98.37%, a recall of 99.41%, an area under the curve (AUC) of 99.24% and a loss of 22.7%. These results indicate that DNNCTAC can significantly improve the accuracy of PPI prediction, and it is expected to be a useful complement to future proteomics research. The source code and all datasets are available at https://github.com/smalltalkman/hppi-tensorflow.
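The two ingredients that CTAC draws on can be sketched in a few lines of Python. The abstract does not specify how CTAC fuses them, so the block below is only an illustration under stated assumptions: the seven-class amino acid grouping is the one commonly used for conjoint triad coding, the auto covariance is the standard lagged covariance of a numeric sequence, and `ctac_like_features` is a hypothetical combination (triad frequencies concatenated with the auto covariance of per-class indicator sequences), not the authors' published definition.

```python
# Seven amino acid classes commonly used for conjoint triad coding
# (assumed here; the abstract does not list them).
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CLASSES) for aa in group}


def to_classes(seq):
    """Map a protein sequence to class indices, skipping unknown residues."""
    return [AA_TO_CLASS[aa] for aa in seq.upper() if aa in AA_TO_CLASS]


def conjoint_triad(seq):
    """Normalized frequencies of the 7 * 7 * 7 = 343 class triads."""
    classes = to_classes(seq)
    counts = [0] * 343
    for a, b, c in zip(classes, classes[1:], classes[2:]):
        counts[a * 49 + b * 7 + c] += 1
    total = sum(counts)
    return [x / total for x in counts] if total else [0.0] * 343


def auto_covariance(values, max_lag):
    """Lagged covariance of a numeric sequence for lags 1..max_lag."""
    n = len(values)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    mean = sum(values) / n
    centered = [v - mean for v in values]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(1, max_lag + 1)
    ]


def ctac_like_features(seq, max_lag=2):
    """Hypothetical fusion: triad frequencies plus the auto covariance
    of each class's 0/1 indicator sequence (an assumption, not CTAC itself)."""
    classes = to_classes(seq)
    features = conjoint_triad(seq)
    for k in range(len(CLASSES)):
        indicator = [1.0 if c == k else 0.0 for c in classes]
        features.extend(auto_covariance(indicator, max_lag))
    return features


if __name__ == "__main__":
    vec = ctac_like_features("AGVILFPRKDEC", max_lag=2)
    print(len(vec))  # 343 triad features + 7 classes * 2 lags
```

In a pairwise setting, the feature vectors of the two proteins would typically be concatenated before being passed to the classifier; that step, too, is an assumption rather than something stated in the abstract.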

Keywords

Conjoint triad auto covariance, Deep neural networks, Protein-protein interaction