Abstract
The Protein Data Bank contains more than 223,000 three-dimensional biostructures and is growing at a rate of nearly 10% per year. The lack of a tool that distinguishes apo from holo structures, and covalent from non-covalent ligand-protein complexes, makes it difficult to manage large numbers of structures. To address this issue, we present PDB-CAT, a user-friendly tool that facilitates the categorization and extraction of key information from PDBx/mmCIF files. PDB-CAT classifies a group of protein structures, based on their ligands, into three categories: apo, covalently bonded, and non-covalently bonded. In addition to this classification, the program can check for mutations in the protein sequence by comparing it to a reference sequence. Its output clearly defines every entity present in each entry to facilitate decision-making. PDB-CAT is now available on GitHub (https://github.com/URV-cheminformatics/PDB-CAT).
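
The three-way classification and the mutation check described above can be illustrated with a minimal Python sketch. This is not PDB-CAT's actual code: the `Entry` record, its field names, and the functions `classify` and `find_mutations` are hypothetical, and the sketch assumes that ligand identifiers and covalent protein-ligand links have already been extracted from each PDBx/mmCIF file. It also assumes that the observed and reference sequences are pre-aligned and of equal length.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical, simplified view of one PDBx/mmCIF entry."""
    entry_id: str
    sequence: str                                   # one-letter protein sequence
    ligands: list = field(default_factory=list)     # ligand identifiers found in the entry
    covalent_ligands: set = field(default_factory=set)  # ligands covalently linked to the protein


def classify(entry):
    """Return 'apo', 'covalent', or 'non-covalent' for one entry."""
    if not entry.ligands:
        return "apo"
    # A single covalently bonded ligand is enough to label the entry covalent.
    if any(lig in entry.covalent_ligands for lig in entry.ligands):
        return "covalent"
    return "non-covalent"


def find_mutations(sequence, reference):
    """List (position, reference residue, observed residue) for each mismatch.

    Positions are 1-based. Assumes both sequences are already aligned and
    have the same length; no alignment is performed here.
    """
    if len(sequence) != len(reference):
        raise ValueError("sequences must be pre-aligned and of equal length")
    return [
        (pos, ref, obs)
        for pos, (ref, obs) in enumerate(zip(reference, sequence), start=1)
        if ref != obs
    ]


if __name__ == "__main__":
    # Toy entries with made-up identifiers, for illustration only.
    entries = [
        Entry("E1", "MKT"),
        Entry("E2", "MKT", ["LIG"]),
        Entry("E3", "MAT", ["LIG"], {"LIG"}),
    ]
    for e in entries:
        print(e.entry_id, classify(e), find_mutations(e.sequence, "MKT"))
```

In this sketch an entry with at least one covalently bonded ligand is labelled covalent even if other ligands are non-covalently bound; that precedence is a design choice made for the illustration.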