Tests time out after build

dwh1d17
Posts: 5
Joined: 23 Jul 2020, 11:40
First name(s): David
Last name(s): Hempston
Affiliation: University of Southampton
Country: United Kingdom

Tests time out after build

Post by dwh1d17 » 23 Jul 2020, 12:40

Installing on a RHEL 6.10 x86 machine. Standard build, but using OpenMPI.

mpicc --version
gcc (GCC) 6.1.0

Code: Select all

git clone --recursive https://gitlab.com/dalton/dalton.git
cd dalton/
git checkout Dalton2018.0
git submodule update
Configured the build -- this configured Dalton and created the build directory:

Code: Select all

./setup --fc=mpif90 --cc=mpicc --mpi --mkl=sequential --prefix=/local/software/dalton/2018.0
cd build/
make -j 4 
The build seemed to work fine, with no errors. Now to test the build, first creating a scratch directory for this purpose:

Code: Select all

mkdir /scratch/hpc/dalton
export DALTON_TMPDIR=/scratch/hpc/dalton
export DALTON_NUM_MPI_PROCS=4
make test
Only the first test seems to pass, and then they just start timing out.

1/496 Test #1: dft_ac_grac ...................... Passed 3.06 sec
Start 2: dft_b3lyp_cart
2/496 Test #2: dft_b3lyp_cart ...................***Timeout 1200.16 sec
Start 3: dft_b3lyp_magsus_nosym
3/496 Test #3: dft_b3lyp_magsus_nosym ...........***Timeout 1200.17 sec
Start 4: dft_b3lyp_molhes_nosym
4/496 Test #4: dft_b3lyp_molhes_nosym ...........***Timeout 1200.14 sec
Start 5: dft_b3lyp_nosym
5/496 Test #5: dft_b3lyp_nosym ..................***Timeout 1200.16 sec


Does anyone have any ideas? Is this a supported build method?

kind regards,
David H

magnus
Posts: 519
Joined: 27 Jun 2013, 16:32
First name(s): Jógvan Magnus
Middle name(s): Haugaard
Last name(s): Olsen
Affiliation: Hylleraas Centre, UiT The Arctic University of Norway
Country: Norway

Re: Tests time out after build

Post by magnus » 23 Jul 2020, 13:17

Yes, it is supported. Could you provide the CMake output from when you run the setup command? There should be a file called setup_cmake_output in your build directory.


Re: Tests time out after build

Post by dwh1d17 » 27 Jul 2020, 14:22

-- The Fortran compiler identification is GNU 6.1.0
-- The C compiler identification is GNU 6.1.0
-- The CXX compiler identification is GNU 6.1.0
-- Check for working Fortran compiler: /local/software/openmpi/3.1.3/gcc/bin/mpif90
-- Check for working Fortran compiler: /local/software/openmpi/3.1.3/gcc/bin/mpif90 -- works
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Checking whether /local/software/openmpi/3.1.3/gcc/bin/mpif90 supports Fortran 90
-- Checking whether /local/software/openmpi/3.1.3/gcc/bin/mpif90 supports Fortran 90 -- yes
-- Check for working C compiler: /local/software/openmpi/3.1.3/gcc/bin/mpicc
-- Check for working C compiler: /local/software/openmpi/3.1.3/gcc/bin/mpicc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /local/software/gcc/6.1.0/bin/g++
-- Check for working CXX compiler: /local/software/gcc/6.1.0/bin/g++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Math lib search order is MKL;ESSL;OPENBLAS;ATLAS;ACML;SYSTEM_NATIVE
-- You can select a specific type by defining for instance -D BLAS_TYPE=ATLAS or -D LAPACK_TYPE=ACML
-- or by redefining MATH_LIB_SEARCH_ORDER
-- Found MPI_C: /local/software/openmpi/3.1.3/gcc/lib/libmpi.so
-- Found MPI_CXX: /local/software/openmpi/3.1.3/gcc/lib/libmpi.so
-- Found MPI_Fortran: /local/software/openmpi/3.1.3/gcc/lib/libmpi_usempif08.so;/local/software/openmpi/3.1.3/gcc/lib/libmpi_usempi_ignore_tkr.so;/local/software/openmpi/3.1.3/gcc/lib/libmpi_mpifh.so;/local/software/openmpi/3.1.3/gcc/lib/libmpi.so
-- Performing Test MPI_COMPATIBLE
-- Performing Test MPI_COMPATIBLE - Success
-- Performing Test MPI_F90_I4
-- Performing Test MPI_F90_I4 - Success
-- Performing Test MPI_F90_I8
-- Performing Test MPI_F90_I8 - Failed
-- Performing Test ENABLE_MPI3_FEATURES
-- Performing Test ENABLE_MPI3_FEATURES - Success
-- Found Git: /usr/bin/git
-- Polarizable Continuum Model via PCMSolver DISABLED
-- Configuring done
-- Generating done
-- Build files have been written to: /home/local/software/dalton/2018/dalton/build
I should note that I didn't use the --mkl=sequential flag in the end because it caused gfortran errors during the build.


Re: Tests time out after build

Post by magnus » 27 Jul 2020, 14:33

Yes, --mkl is for Intel compilers only.

The CMake output looks OK, except that I would expect a message about which BLAS and LAPACK it found. Anyway, it would not have been able to finish the build without them.
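
You could double-check what it picked up by grepping the CMake output in the build directory, e.g.:

Code: Select all

grep -i "blas\|lapack" setup_cmake_output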

It would be good to know whether or not it is related to MPI, so could you try running serially (by not setting DALTON_NUM_MPI_PROCS) and using "ctest -L essential --output-on-failure" instead of "make test"?
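
That is, something like this from the build directory:

Code: Select all

unset DALTON_NUM_MPI_PROCS
ctest -L essential --output-on-failure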


Re: Tests time out after build

Post by magnus » 27 Jul 2020, 14:39

I just noticed that you didn't specify "--cxx=mpicxx" in the setup command. I'm not sure whether that causes problems, but it's perhaps worth a try.
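
E.g., based on your original setup line (without the MKL flag, since you dropped that), something like:

Code: Select all

./setup --fc=mpif90 --cc=mpicc --cxx=mpicxx --mpi --prefix=/local/software/dalton/2018.0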


Re: Tests time out after build

Post by dwh1d17 » 28 Jul 2020, 11:33

The serial tests all pass fine, and I tried adding "--cxx=mpicxx", but there was no improvement.

We have ATLAS and BLAS installed via RPM, so I'm guessing the installer is able to find them in the default location.

Code: Select all

|11:29:26| [dwh1d17@cyan02 lib64]$ rpm -aq | grep "blas\|atlas"
atlas-3.8.4-2.el6.x86_64
blas-3.2.1-5.el6.x86_64
I tried using OpenMPI v2.0.2 and this seems to be working much better. If I were to reinstall OpenMPI, would it be better to use v3.1.6 or the newer v4.0.4?


Re: Tests time out after build

Post by dwh1d17 » 28 Jul 2020, 14:41

Just to update: the tests have now finished using OpenMPI 2, and only the benchmark tests failed.

98% tests passed, 10 tests failed out of 496

Label Time Summary:
aosoppa = 61.51 sec (18 tests)
benchmark = 10.55 sec (10 tests)
cc = 34.01 sec (87 tests)
cc3 = 11.96 sec (31 tests)
ccr12 = 32.51 sec (68 tests)
cholesky = 4.77 sec (10 tests)
dalton = 4082.18 sec (475 tests)
dft = 609.43 sec (45 tests)
dpt = 3.09 sec (8 tests)
energy = 284.70 sec (19 tests)
essential = 105.42 sec (117 tests)
fde = 67.25 sec (2 tests)
gen1int = 29.74 sec (2 tests)
geo = 742.53 sec (29 tests)
long = 700.49 sec (18 tests)
mcscf = 61.14 sec (3 tests)
medium = 1552.28 sec (75 tests)
mp2r12 = 9.00 sec (17 tests)
multistep = 328.40 sec (9 tests)
numder = 20.52 sec (5 tests)
pcm = 397.13 sec (17 tests)
peqm = 85.67 sec (24 tests)
prop = 583.30 sec (44 tests)
qfit = 2.04 sec (3 tests)
qm3 = 100.10 sec (25 tests)
qmmm = 118.43 sec (8 tests)
rsp = 899.58 sec (79 tests)
runtest = 2871.17 sec (204 tests)
short = 1572.58 sec (373 tests)
soppa = 89.22 sec (18 tests)
unknown = 0.75 sec (1 test)
verylong = 198.26 sec (18 tests)
walk = 176.27 sec (8 tests)
weekly = 14.80 sec (21 tests)

Total Test time (real) = 4097.49 sec

The following tests FAILED:
487 - benchmark_eri_adz (Failed)
488 - benchmark_eri_adzs (Failed)
489 - benchmark_eri_atzs (Failed)
490 - benchmark_eri_r12 (Failed)
491 - benchmark_eri_r12xl (Failed)
492 - benchmark_her_adz (Failed)
493 - benchmark_her_adzs (Failed)
494 - benchmark_her_atzs (Failed)
495 - benchmark_her_r12 (Failed)
496 - benchmark_her_r12xl (Failed)
Errors while running CTest
make: *** [test] Error 8


Re: Tests time out after build

Post by magnus » 28 Jul 2020, 19:33

Great! The benchmark tests are known to fail so I wouldn't worry about those failing.
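
If you want a summary without them, you can exclude that label when running the tests, e.g.:

Code: Select all

ctest -LE benchmark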

I thought that we used OpenMPI v3 in our CI but it turns out that we do not. So while I'd expect it to work in general, I cannot say for sure. We use OpenMPI v1.8, v2.1, and v4.0, so those should work.


Re: Tests time out after build

Post by dwh1d17 » 29 Jul 2020, 10:29

Using OpenMPI v4.0 seems to be the best solution. I built it using GCC 8.2 and now get the following errors:
The following tests FAILED:
51 - energy_stex (Failed)
487 - benchmark_eri_adz (Failed)
488 - benchmark_eri_adzs (Failed)
489 - benchmark_eri_atzs (Failed)
490 - benchmark_eri_r12 (Failed)
491 - benchmark_eri_r12xl (Failed)
492 - benchmark_her_adz (Failed)
493 - benchmark_her_adzs (Failed)
494 - benchmark_her_atzs (Failed)
495 - benchmark_her_r12 (Failed)
496 - benchmark_her_r12xl (Failed)
Errors while running CTest
The only other warning is the following, from linking:
/usr/bin/ld: warning: libgfortran.so.3, needed by /usr/lib64/libblas.so.3.2.1, may conflict with libgfortran.so.5
Which versions of BLAS and GCC do you test against?


Re: Tests time out after build

Post by magnus » 29 Jul 2020, 11:34

At the moment we test GCC 5-10, but only one minor version per major version. We use Fedora Docker images for this. In all cases we use OpenBLAS together with the system-native LAPACK; I'm not sure which versions, though.

The warning you get looks like it could be related to the fact that BLAS was compiled with an older version of GCC (the one that comes with your system), whereas you used a more recent GCC to compile Dalton.
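
You could check which libgfortran the system BLAS actually pulls in with something like:

Code: Select all

ldd /usr/lib64/libblas.so.3.2.1 | grep gfortran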
