Single-precision open-shell CCSD and CCSD(T) calculations on graphics processing units
It has been shown that coupled-cluster calculations with single-precision data can provide correlation energies with an insignificant loss of accuracy. In this work, we employed consumer GPUs to accelerate open-shell spin-unrestricted CCSD and CCSD(T) calculations based on single-precision data. Several open-shell molecules are used to benchmark the acceleration achieved with GPUs. In CCSD calculations, good acceleration on consumer GPUs is achieved for molecules whose two-electron integrals can all be stored in host memory. For larger molecules, on the other hand, I/O operations take up a large share of the run time and the GPU speedup is less pronounced. Good acceleration can usually be obtained when calculating the (T) correction on GPUs, since matrix contractions are always more costly than the other operations. For systems with fewer than four hundred basis functions, our single-precision GPU code provides a speedup of 4 to 14 times for CCSD calculations and 12 to 20 times for the (T) correction compared with double-precision CPU codes on hardware of the same level.