An Efficient RI-MP2 Algorithm for Distributed Many-GPU Architectures

Calum Snowdon; Giuseppe Maria Junior Barca

doi:10.26434/chemrxiv-2024-9091h-v2

Theoretical and Computational Chemistry

19 September 2024, Version 2

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Second-order Møller-Plesset perturbation theory (MP2) using the Resolution of the Identity approximation (RI-MP2) is a widely used method for computing molecular energies beyond the Hartree-Fock mean-field approximation. However, its high computational cost and the lack of efficient algorithms for modern supercomputing architectures limit its applicability to large molecules. In this paper, we present the first distributed-memory many-GPU RI-MP2 algorithm explicitly designed to utilize hundreds of GPU accelerators for every step of the computation. Our novel algorithm achieves near-peak performance on GPU-based supercomputers through two developments: a distributed-memory algorithm for forming the RI-MP2 intermediate tensors with zero inter-node communication, except for a single O(N^2) asynchronous broadcast, and a distributed-memory algorithm for the O(N^5) energy reduction step, capable of sustaining near-peak performance on clusters with several hundred GPUs. Comparative analysis shows that our implementation outperforms state-of-the-art quantum chemistry software by over 3.5 times in speed while achieving an eightfold reduction in computational power consumption. In benchmarks on the Perlmutter supercomputer, our algorithm achieves 11.8 PFLOP/s (83% of peak performance), performing the RI-MP2 energy calculation on a 314-water cluster with 7,850 primary and 30,144 auxiliary basis functions in 4 minutes on 180 nodes with 720 A100 GPUs. This performance represents a substantial improvement over traditional CPU-based methods, demonstrating the significant time-to-solution and power consumption benefits of leveraging modern GPU-accelerated computing environments for quantum chemistry calculations.
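To make the O(N^5) energy reduction step concrete, the following is a minimal single-process NumPy sketch of the standard textbook closed-shell RI-MP2 correlation energy. It is not the distributed many-GPU algorithm of the paper. The function name `ri_mp2_energy`, the layout `B[P, i, a]` of the three-index intermediate tensor, and the toy orbital energies are assumptions made for illustration only. The sketch shows that each occupied pair (i, j) costs one matrix multiplication over the auxiliary index, which is where the fifth-order scaling comes from.

```python
import numpy as np


def ri_mp2_energy(B, e_occ, e_vir):
    """Closed-shell RI-MP2 correlation energy (textbook formula, illustrative only).

    B     : array of shape (n_aux, n_occ, n_vir), assumed three-index intermediate
    e_occ : occupied orbital energies, shape (n_occ,)
    e_vir : virtual orbital energies, shape (n_vir,)
    """
    n_occ = e_occ.shape[0]
    # Part of the denominator that depends only on the virtual indices a, b.
    e_ab = e_vir[:, None] + e_vir[None, :]
    energy = 0.0
    for i in range(n_occ):
        for j in range(n_occ):
            # (ia|jb) ~= sum_P B[P, i, a] * B[P, j, b]: one GEMM per (i, j) pair.
            v_ij = B[:, i, :].T @ B[:, j, :]
            denom = e_occ[i] + e_occ[j] - e_ab
            # v_ij.T holds (ib|ja), the exchange-like contribution.
            energy += np.sum(v_ij * (2.0 * v_ij - v_ij.T) / denom)
    return energy


# Toy usage with random data (hypothetical sizes, not a real molecule).
rng = np.random.default_rng(0)
B_toy = rng.standard_normal((7, 3, 4))
e_occ_toy = np.array([-1.0, -0.8, -0.5])
e_vir_toy = np.array([0.3, 0.6, 0.9, 1.2])
print(ri_mp2_energy(B_toy, e_occ_toy, e_vir_toy))
```

In a sketch like this, the (i, j) pairs are independent of one another, which is what makes the reduction step a natural candidate for distribution across many accelerators; how the paper partitions and schedules that work is not reproduced here.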

Keywords

GPU

MP2

RI-MP2

Graphics Processing Units

High-Performance Computing

Quantum Chemistry

Version History

Sep 19, 2024 Version 2

Sep 17, 2024 Version 1

Version Notes

Some changes to the introduction and references.

License

The content is available under CC BY NC ND 4.0

Author’s competing interest statement

Giuseppe Barca is co-founder at QDX Technologies

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
