A novel structural position-specific scoring matrix for the prediction of protein secondary structures

Dapeng Li, Tonghua Li, Peisheng Cong, Wenwei Xiong, Jiangming Sun

Research output: Contribution to journal › Article › peer-review

Abstract

Motivation: The precise prediction of protein secondary structure is of key importance for the prediction of 3D structure and biological function. Although the development of many excellent methods over the last few decades has allowed the achievement of prediction accuracies of up to 80%, progress seems to have reached a bottleneck, and further improvements in accuracy have proven difficult.

Results: We propose for the first time a structural position-specific scoring matrix (SPSSM), and establish an unprecedented database of 9 million sequences and their SPSSMs. This database, when combined with a purpose-designed BLAST tool, provides a novel prediction tool: SPSSMPred. When the SPSSMPred was validated on a large dataset (10 814 entries), the Q3 accuracy of the protein secondary structure prediction was 93.4%. Our approach was tested on the two latest EVA sets; accuracies of 82.7 and 82.0% were achieved, far higher than can be achieved using other predictors. For further evaluation, we tested our approach on newly determined sequences (141 entries), and obtained an accuracy of 89.6%. For a set of low-homology proteins (40 entries), the SPSSMPred still achieved a Q3 value of 84.6%.
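The abstract reports results as Q3 accuracy without defining it. In its usual sense, Q3 is the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. A minimal sketch of that calculation follows; the function name `q3_accuracy` and the toy strings are illustrative assumptions, not taken from the paper or from SPSSMPred.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label (H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the predicted state equals the observed state.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical toy strings, not data from the paper.
observed = "CCHHHHEEEC"
predicted = "CCHHHCEEEC"
print(q3_accuracy(predicted, observed))  # 9 of 10 residues agree -> 90.0
```

How per-protein values are pooled over a dataset (per residue or per chain) is not stated in the abstract, so the sketch scores a single pair of strings only.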

Original language: English
Pages (from-to): 32-39
Journal: Bioinformatics
Volume: 28
Issue number: 1
Publication status: Published - 2012 Jan
Externally published: Yes

Free keywords

  • Machine learning
