I am trying to use ScaLAPACK to speed up the SCF procedure, but the wall times with and without ScaLAPACK are nearly the same. As I understand it, the .SCALAPACK keyword should generally speed up matrix operations, provided LSDalton was linked against the appropriate MKL libraries. Is that correct, or are there cases in your experience where you would advise against using .SCALAPACK?
I noticed that when using .SCALAPACK, the following message gets repeatedly printed to stdout during the SCF procedure:
FALLBACK: mat_write_to_disk
Some technical details:
I compiled LSDalton 2018.0 with the Intel 2018 compilers and the following setup command:
./setup --mpi --omp --csr --scalapack --extra-fc-flags="-O3 -xCORE-AVX2" --extra-cc-flags="-O3 -xCORE-AVX2" --extra-cxx-flags="-xHost -O3 -xCORE-AVX2"
I would greatly appreciate any suggestions, and I am happy to provide further information if needed.
Sebastian