Abstract
Characterizing the structural dynamics of proteins with heterogeneous conformational landscapes is crucial to understanding complex biomolecular processes. To this end, dimensionality reduction algorithms are used to produce low-dimensional embeddings of the high-dimensional conformational phase space. However, identifying a compact and informative set of input features for the embedding remains an ongoing challenge. Here, we propose to use Residue Interaction Networks (RINs) and their centrality measures, established tools that provide a graph-theoretical view of molecular structure. Specifically, we combine the closeness centrality, which captures global features of the protein conformation at residue-wise resolution, with EncoderMap, a dimensionality reduction algorithm that hybridizes a neural-network autoencoder with a multidimensional-scaling-like approach. We find that the resulting low-dimensional embedding is a meaningful visualization of the residue interaction landscape that resolves structural details of the protein behavior while retaining global interpretability. This feature-based graph embedding of temporal protein graphs makes it possible to apply the general descriptive power of RIN formalisms to the analysis of protein simulations of complex processes, such as protein folding and multi-domain interactions, without requiring protein-specific input. We demonstrate this on simulations of the fast-folding protein Trp-Cage and the multi-domain signalling protein FAT10. Owing to its generality and modularity, the presented approach can easily be transferred to other protein systems.
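To make the residue-wise input features concrete, the following is a minimal sketch of how one closeness-centrality vector per simulation frame could be computed. It rests on several assumptions that are not taken from this work: the RIN is approximated by a simple distance-cutoff contact graph over one coordinate per residue, the 8.0 cutoff is an arbitrary placeholder, all function names are hypothetical, and disconnected graphs are handled by scaling with the fraction of reachable residues. The subsequent EncoderMap embedding step is not shown.

```python
from collections import deque
from math import dist


def contact_graph(coords, cutoff):
    """Build an adjacency list linking residues closer than `cutoff`.

    Assumption: a plain distance cutoff stands in for the RIN definition.
    """
    n = len(coords)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if dist(coords[i], coords[j]) < cutoff:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def closeness_centrality(adj):
    """Closeness centrality of every node in an unweighted graph.

    For each node, a breadth-first search gives the hop distance to every
    reachable node. The centrality is (reachable / sum of distances),
    scaled by (reachable / (n - 1)) so that disconnected graphs are handled.
    """
    n = len(adj)
    values = []
    for source in range(n):
        hops = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        reachable = len(hops) - 1
        total = sum(hops.values())
        if reachable == 0:
            values.append(0.0)  # isolated residue
        else:
            values.append((reachable / total) * (reachable / (n - 1)))
    return values


def trajectory_features(frames, cutoff=8.0):
    """One closeness vector per frame; hypothetical default cutoff."""
    return [closeness_centrality(contact_graph(f, cutoff)) for f in frames]
```

Each row returned by `trajectory_features` has one entry per residue, so a trajectory of T frames and N residues yields a T x N feature matrix that could then be passed to a dimensionality reduction method.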