These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.

Automatic Cavity Identification and Decomposition into Subpockets with CAVIAR

revised on 15.09.2020, 09:34 and posted on 16.09.2020, 07:57 by Jean-Rémy Marchand, Bernard Pirard, Peter Ertl, Finton Sirockin

Motivation. The detection of small-molecule binding sites in proteins is central to structure-based drug design and chemical biology. Many tools have been developed over the last 40 years, but few of them are still available in 2020, open-source, and suitable for the analysis of large databases or for integration into automated workflows. No software can characterize subpockets, a pivotal concept in fragment-based drug design, solely from the information in the protein structure.

Results. CAVIAR is a new open-source tool for protein cavity identification and rationalization, supporting PDB and mmCIF files as well as DCD trajectories from molecular dynamics simulations. The protein structure serves as input for automatic cavity detection and computation of properties, including ligandability. A subcavity segmentation algorithm decomposes binding sites into subpockets without requiring the presence of a ligand. The defined subpockets mimic the empirical definitions of subpockets in medicinal chemistry projects. A tool like CAVIAR may be valuable to support chemical biology, medicinal chemistry and ligand identification efforts. Our analysis of the PDB shows that liganded cavities tend to be bigger, more hydrophobic and more complex than apo cavities. Moreover, in line with the paradigm of fragment-based drug design, the binding affinity scales relatively well with the number of subcavities filled by the ligand. Compounds binding to more than three subcavities are mostly in the nanomolar or better range of affinities to their target.
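
To make the idea of ligand-free cavity detection concrete, the following is a minimal sketch of a generic grid-based buriedness scan, in the spirit of classical geometric pocket finders. It is not taken from CAVIAR and does not claim to reproduce its algorithm or API. The voxel representation, the function names (`buriedness`, `find_cavities`, `toy_protein`) and the thresholds are all assumptions chosen for illustration.

```python
from collections import deque
from itertools import product

# Six axis-aligned scan directions (+x, -x, +y, -y, +z, -z).
DIRECTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def buriedness(point, protein, size, max_steps):
    """Count the scan directions in which a protein voxel is hit within max_steps."""
    hits = 0
    for dx, dy, dz in DIRECTIONS:
        x, y, z = point
        for _ in range(max_steps):
            x, y, z = x + dx, y + dy, z + dz
            if not (0 <= x < size and 0 <= y < size and 0 <= z < size):
                break  # left the grid without meeting protein
            if (x, y, z) in protein:
                hits += 1
                break
    return hits


def find_cavities(protein, size, min_buriedness=5, max_steps=6):
    """Return clusters (sets of voxels) of buried, protein-free grid points.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    buried = {
        p for p in product(range(size), repeat=3)
        if p not in protein
        and buriedness(p, protein, size, max_steps) >= min_buriedness
    }
    cavities, seen = [], set()
    for start in sorted(buried):
        if start in seen:
            continue
        # Breadth-first search groups face-adjacent buried voxels into one cavity.
        cluster, queue = set(), deque([start])
        seen.add(start)
        while queue:
            x, y, z = queue.popleft()
            cluster.add((x, y, z))
            for dx, dy, dz in DIRECTIONS:
                neighbor = (x + dx, y + dy, z + dz)
                if neighbor in buried and neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        cavities.append(cluster)
    return cavities


def toy_protein(size=7):
    """Hypothetical test shape: a solid block with a narrow well open at the top."""
    block = set(product(range(1, size - 1), repeat=3))
    well = {(3, 3, z) for z in range(3, 6)}  # z = 5 is the top layer of the block
    return block - well


cavities = find_cavities(toy_protein(), size=7)
print(len(cavities), sorted(cavities[0]))
```

On this toy shape, the three well voxels are enclosed in five of the six directions and are grouped into a single cavity, while the solvent voxels surrounding the block are rejected. A real tool would start from atomic coordinates and radii rather than a hand-built voxel set, and would need a further step to split a detected cavity into subpockets.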

Availability and implementation. Installation notes, a user manual and support for CAVIAR are available online. The CAVIAR GUI and the CAVIAR command-line tool are available on GitHub, and a conda package is hosted on Anaconda Cloud. The software suite is free, and all of the source code is available under a permissive MIT license. The lists of PDB files used for validation, as well as the results of subpocket decomposition with CAVIAR and DoGSite, are hosted on GitHub.



Declaration of Conflict of Interest

All authors are employed by Novartis AG

Version Notes

Version 2 - shortened the main text to make it more readable