Problems with parallel Dalton and DFT
-
- Posts: 7
- Joined: 11 Mar 2014, 13:41
- First name(s): Luca
- Last name(s): De Vico
- Affiliation: Copenhagen University
- Country: Denmark
Problems with parallel Dalton and DFT
Dear Dalton,
I receive an error whenever I try to run any DFT calculation in parallel. Let me give you the details of my installation.
Dalton 2013.2
Intel compilers + MKL vers. 13.0.1.117
Open MPI vers. 1.6.5
cmake vers 2.8.9
Dalton was compiled using:
./setup --mpi --mkl=parallel --int64 --fc=mpif90 --cc=mpicc --cxx=mpicxx --type=debug
I used type=debug to try and obtain further info on the error.
I use the following input files:
test.dal
**DALTON INPUT
.RUN WAVE
**WAVE FUNCTIONS
.DFT
B3LYP
**END OF DALTON INPUT
test.mol
BASIS
cc-pVDZ
this is a comment line
this is a comment line
1
10. 1
Ne 0.0 0.0 0.0
If I run this calculation in a serial environment (that is, using the code compiled with MPI support but on 1 cpu on 1 node), it successfully completes.
If I try to run this calculation on 2 cpus I get the following not so informative error:
Error in /users/software/kemi/openmpi-1.6.5/bin/mpiexec -mca plm_rsh_disable_llspawn 1 -np 2 /users/software/kemi/Luca/DALTON-2013.2-Source-parallel/build/dalton.x, exit code 15
and the following error from the queue:
[node036.xxx:21850] *** An error occurred in MPI_Recv
[node036.xxx:21850] *** on communicator MPI_COMM_WORLD
[node036.xxx:21850] *** MPI_ERR_TRUNCATE: message truncated
[node036.xxx:21850] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
The Dalton output stops at:
Automatic occupation of symmetries with 10 electrons.
Iter Total energy Error norm Delta(E) SCF occupation
-----------------------------------------------------------------------------
1 Screening settings (-IFTHRS, DIFDEN) -5 F
without any further message. It appears to me that there is some problem of I/O connected with OpenMPI, but I'm not able to understand exactly what. Moreover, it seems to be strongly related to DFT, since I tried many other tests from the test suite and only those involving DFT failed.
Any idea?
Thanks for any help, have a nice day
/Luca
-
- Posts: 1210
- Joined: 26 Aug 2013, 13:22
- First name(s): Radovan
- Last name(s): Bast
- Affiliation: none
- Country: Germany
Re: Problems with parallel Dalton and DFT
hi Luca,
this looks to me like an integer type mismatch between the C code
(DFT) and the Fortran code (rest of Dalton). unfortunately i cannot investigate this further
at this stage but i will file a ticket in the bug tracker to verify it.
the easy workaround is to not use 64-bit integers. do you need them?
best regards,
radovan
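As an illustration of the suspected mechanism, here is a minimal sketch in plain Python (no MPI involved) of how an integer-width mismatch can produce a "message truncated" error: one side sends N integers of 8 bytes each, while the other side has posted a buffer sized for N integers of 4 bytes each. The helper names `pack_ints` and `receive` are hypothetical and only mimic the size check; they are not part of Dalton or Open MPI, and this is not a confirmed diagnosis of the Dalton bug.

```python
import struct

# struct format codes for fixed-width signed integers (standard sizes with "<")
_FORMATS = {4: "i", 8: "q"}


def pack_ints(values, int_bytes):
    """Pack integers as little-endian signed ints of the given width (4 or 8 bytes)."""
    fmt = "<%d%s" % (len(values), _FORMATS[int_bytes])
    return struct.pack(fmt, *values)


def receive(message, count, int_bytes):
    """Mimic a receive that posted room for `count` integers of `int_bytes` each.

    Raises BufferError when the incoming message is longer than the posted
    buffer, which is the situation MPI reports as MPI_ERR_TRUNCATE.
    """
    capacity = count * int_bytes
    if len(message) > capacity:
        raise BufferError(
            "message truncated: got %d bytes, buffer holds %d"
            % (len(message), capacity)
        )
    n = len(message) // int_bytes
    return list(struct.unpack("<%d%s" % (n, _FORMATS[int_bytes]), message))


# Sender uses 8-byte integers (as with an --int64 Fortran build).
message = pack_ints([1, 2, 3], 8)

# A receiver that agrees on the width gets the data back.
same_width = receive(message, 3, 8)

# A receiver that assumes 4-byte integers has a buffer half the needed size.
try:
    receive(message, 3, 4)
    mismatch_error = None
except BufferError as exc:
    mismatch_error = str(exc)
```

In a real run the same disagreement would have to sit between the two sides of an MPI send/receive pair, for example if one language passes 64-bit counts and the other 32-bit ones.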
-
- Posts: 7
- Joined: 11 Mar 2014, 13:41
- First name(s): Luca
- Last name(s): De Vico
- Affiliation: Copenhagen University
- Country: Denmark
Re: Problems with parallel Dalton and DFT
Hi Radovan,
Unfortunately yes, we would like to have 64bit integers. Is there any keyword I can add to setup in order to have a more complete error output or any way to increase the print level?
Thanks!
/Luca
-
- Posts: 7
- Joined: 11 Mar 2014, 13:41
- First name(s): Luca
- Last name(s): De Vico
- Affiliation: Copenhagen University
- Country: Denmark
Re: Problems with parallel Dalton and DFT
Hi Radovan,
I tried to compile Dalton in parallel but without --int64. Unfortunately I receive the same error as before. However, I did not change the OpenMPI libraries, which are still compiled with 64bit integers.
/Luca
-
- Posts: 1210
- Joined: 26 Aug 2013, 13:22
- First name(s): Radovan
- Last name(s): Bast
- Affiliation: none
- Country: Germany
Re: Problems with parallel Dalton and DFT
hi Luca,
for 32-bit integer Dalton you should use 32-bit integer OpenMPI.
unfortunately i am a bit busy right now but i will try to reproduce the problem with 64-bit integers
next week. you have sent good information to reproduce it but i just cannot do it right now.
best wishes,
radovan
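One way to check whether an Open MPI installation matches the integer width of the Dalton build is to look at the Fortran integer size it reports. The sketch below assumes that the output of `ompi_info --all` contains a line of the form `Fort integer size: 4` (or `8`); that label is an assumption about the output format and may differ between Open MPI versions. The function names are hypothetical, and the text is passed in as a string so that the parsing can be tried without an MPI installation.

```python
import re


def fortran_integer_size(ompi_info_text):
    """Return the Fortran integer size in bytes reported in the given text.

    Assumes a line like 'Fort integer size: 4'. Returns None if no such
    line is found.
    """
    match = re.search(r"Fort integer size:\s*(\d+)", ompi_info_text)
    return int(match.group(1)) if match else None


def matches_dalton_build(ompi_info_text, dalton_int64):
    """Compare the MPI library's Fortran integer size with the Dalton build.

    dalton_int64 is True when Dalton was configured with --int64.
    Returns True/False, or None if the size could not be determined.
    """
    size = fortran_integer_size(ompi_info_text)
    if size is None:
        return None
    expected = 8 if dalton_int64 else 4
    return size == expected


# Illustrative input only, not captured from a real installation.
sample = "  Fort integer size: 8\n  Fort logical size: 8\n"
sample_size = fortran_integer_size(sample)
ok_with_int64 = matches_dalton_build(sample, dalton_int64=True)
ok_without_int64 = matches_dalton_build(sample, dalton_int64=False)
```

To use it on a real system, the text could be obtained by running `ompi_info --all` and passing its standard output to `fortran_integer_size`.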
-
- Posts: 7
- Joined: 11 Mar 2014, 13:41
- First name(s): Luca
- Last name(s): De Vico
- Affiliation: Copenhagen University
- Country: Denmark
Re: Problems with parallel Dalton and DFT
No problem and no hurry. I'll also try to compile a 32-bit version of Open MPI.
Best!
/Luca
-
- Posts: 5
- Joined: 24 Jul 2019, 19:51
- First name(s): Juan
- Middle name(s): Jose
- Last name(s): Aucar
- Affiliation: AFA
- Country: Argentina
Re: Problems with parallel Dalton and DFT
Hi everyone,
I'd like to know whether this issue was ever resolved.
I have the same problem: I get the error described in this thread whenever I try to run parallel DFT calculations with Dalton compiled with 64-bit integers.
I tried compiling the Dalton code with omp (as suggested here) but the error persists. By the way, I'm using the 2018 version.
Thanks in advance,
Juan
- magnus
- Posts: 524
- Joined: 27 Jun 2013, 16:32
- First name(s): Jógvan Magnus
- Middle name(s): Haugaard
- Last name(s): Olsen
- Affiliation: Aarhus University
- Country: Denmark
Re: Problems with parallel Dalton and DFT
Unfortunately there are still issues with 64-bit integers. However, I think they are rarely needed for DFT calculations in Dalton. What kind of issue are you having with a 32-bit integer compilation?
-
- Posts: 5
- Joined: 24 Jul 2019, 19:51
- First name(s): Juan
- Middle name(s): Jose
- Last name(s): Aucar
- Affiliation: AFA
- Country: Argentina
Re: Problems with parallel Dalton and DFT
Thanks for the reply. I don't really need that kind of calculation. I was just running some tests on my 64-bit integer build, and when I hit this error I thought I had done something wrong during the installation.
I now assume that this is simply an unresolved issue with 64-bit integer builds. Thanks!