Protein-ligand docking programs are indispensable tools in structure-based drug design for predicting the binding pose of a ligand in its receptor protein. In this paper, we evaluate the performance of grey wolf optimization (GWO) in protein-ligand docking. Two versions of the GWO docking program, the original GWO and a modified GWO with random walk, were implemented on the basis of AutoDock Vina. Our rigid docking experiments show that the GWO programs have enhanced exploration capability, which leads to a significant speedup in the search while maintaining binding pose prediction accuracy comparable to that of AutoDock Vina. For flexible receptor docking, the GWO methods are competitive with AutoDockFR in pose ranking but achieve lower success rates. Successful redocking of all the flexible cases to their holo structures reveals that an inaccurate scoring function and the lack of proper treatment of the receptor backbone are the major causes of docking failures.
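
For readers unfamiliar with GWO, the following is a minimal sketch of the canonical grey wolf position update as it is commonly described in the optimization literature. It is not the docking implementation evaluated in this paper: it omits the random walk modification, and it uses a toy sphere function in place of a docking scoring function. The names `sphere` and `gwo_minimize`, the population size, the iteration count, and the bounds are all illustrative assumptions.

```python
import random


def sphere(x):
    """Toy objective standing in for a docking score (lower is better)."""
    return sum(v * v for v in x)


def gwo_minimize(objective, dim, bounds, n_wolves=20, n_iters=200, seed=0):
    """Canonical GWO sketch: each wolf moves toward the three best wolves."""
    rng = random.Random(seed)
    lo, hi = bounds
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    best_pos, best_val = None, float("inf")

    for t in range(n_iters):
        # The three best wolves (alpha, beta, delta) guide the pack.
        leaders = [list(w) for w in sorted(wolves, key=objective)[:3]]
        val = objective(leaders[0])
        if val < best_val:
            best_pos, best_val = list(leaders[0]), val

        # The coefficient a decays linearly from 2 toward 0, shifting the
        # search from exploration to exploitation.
        a = 2.0 * (1.0 - t / n_iters)

        new_wolves = []
        for w in wolves:
            new_pos = []
            for d in range(dim):
                estimate = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a  # step scale in [-a, a)
                    C = 2.0 * rng.random()          # leader weight in [0, 2)
                    D = abs(C * leader[d] - w[d])   # distance to the leader
                    estimate += leader[d] - A * D
                # Average the three leader-guided estimates, clamp to bounds.
                new_pos.append(min(hi, max(lo, estimate / 3.0)))
            new_wolves.append(new_pos)
        wolves = new_wolves

    # Account for the final population, which the loop has not yet scored.
    final = min(wolves, key=objective)
    final_val = objective(final)
    if final_val < best_val:
        best_pos, best_val = list(final), final_val
    return best_pos, best_val


if __name__ == "__main__":
    pos, val = gwo_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
    print(pos, val)
```

In a docking setting, the position vector would instead encode the ligand's translation, orientation, and torsion angles, and the objective would be the scoring function; how those are encoded is specific to the docking program and is not shown here.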