TY - JOUR
T1 - Structure-based neural network protein–carbohydrate interaction predictions at the residue level
AU - Canner, Samuel W.
AU - Shanker, Sudhanshu
AU - Gray, Jeffrey J.
N1 - Publisher Copyright: Copyright © 2023 Canner, Shanker and Gray.
PY - 2023
Y1 - 2023
N2 - Carbohydrates dynamically and transiently interact with proteins for cell–cell recognition, cellular differentiation, immune response, and many other cellular processes. Despite the molecular importance of these interactions, there are currently few reliable computational tools to predict potential carbohydrate-binding sites on any given protein. Here, we present two deep learning (DL) models named CArbohydrate–Protein interaction Site IdentiFier (CAPSIF) that predict non-covalent carbohydrate-binding sites on proteins: (1) a 3D-UNet voxel-based neural network model (CAPSIF:V) and (2) an equivariant graph neural network model (CAPSIF:G). While both models outperform previous surrogate methods used for carbohydrate-binding site prediction, CAPSIF:V performs better than CAPSIF:G, achieving test Dice scores of 0.597 and 0.543 and test set Matthews correlation coefficients (MCCs) of 0.599 and 0.538, respectively. We further tested CAPSIF:V on AlphaFold2-predicted protein structures. CAPSIF:V performed equivalently on both experimentally determined structures and AlphaFold2-predicted structures. Finally, we demonstrate how CAPSIF models can be used in conjunction with local glycan-docking protocols, such as GlycanDock, to predict bound protein–carbohydrate structures.
AB - Carbohydrates dynamically and transiently interact with proteins for cell–cell recognition, cellular differentiation, immune response, and many other cellular processes. Despite the molecular importance of these interactions, there are currently few reliable computational tools to predict potential carbohydrate-binding sites on any given protein. Here, we present two deep learning (DL) models named CArbohydrate–Protein interaction Site IdentiFier (CAPSIF) that predict non-covalent carbohydrate-binding sites on proteins: (1) a 3D-UNet voxel-based neural network model (CAPSIF:V) and (2) an equivariant graph neural network model (CAPSIF:G). While both models outperform previous surrogate methods used for carbohydrate-binding site prediction, CAPSIF:V performs better than CAPSIF:G, achieving test Dice scores of 0.597 and 0.543 and test set Matthews correlation coefficients (MCCs) of 0.599 and 0.538, respectively. We further tested CAPSIF:V on AlphaFold2-predicted protein structures. CAPSIF:V performed equivalently on both experimentally determined structures and AlphaFold2-predicted structures. Finally, we demonstrate how CAPSIF models can be used in conjunction with local glycan-docking protocols, such as GlycanDock, to predict bound protein–carbohydrate structures.
KW - deep learning
KW - glycan binding
KW - neural networks
KW - oligosaccharide binding
KW - protein–carbohydrate binding
KW - site prediction
UR - http://www.scopus.com/inward/record.url?scp=85171572437&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85171572437&partnerID=8YFLogxK
U2 - 10.3389/fbinf.2023.1186531
DO - 10.3389/fbinf.2023.1186531
M3 - Article
C2 - 37409346
AN - SCOPUS:85171572437
SN - 2673-7647
VL - 3
JO - Frontiers in Bioinformatics
JF - Frontiers in Bioinformatics
M1 - 1186531
ER -