Advanced Database Mining of Efficient Biocatalysts by Sequence and Structure Bioinformatics and Microfluidics


Next-generation sequencing doubles the size of genomic databases every 2.5 years. The accumulation of sequence data provides a unique opportunity to identify interesting biocatalysts directly in the databases without tedious and time-consuming engineering. Herein, we present a pipeline integrating sequence and structural bioinformatics with microfluidic enzymology for bioprospecting of efficient and robust haloalkane dehalogenases. The bioinformatic part identified 2,905 putative dehalogenases and prioritized a “small-but-smart” set of 45 genes, yielding 40 active enzymes, 24 of which were biochemically characterized by microfluidic enzymology techniques. Combining microfluidics with modern global data analysis provided valuable mechanistic insights into the high catalytic efficiency of selected enzymes. Overall, we have doubled the dehalogenation “toolbox” characterized over three decades, yielding biocatalysts that surpass the efficiency of currently available wild-type and engineered enzymes. This pipeline is generally applicable to other enzyme families and can accelerate the identification of efficient biocatalysts for industrial use.

Version notes

Updated version of the manuscript.



Supplementary material

Vasina_SI.pdf