DSCF error termination at 'Dispersion Energy Correction'

Saruti
Posts: 3
Joined: 29 Oct 2020, 14:26
First name(s): Zeer
Last name(s): Sirimatayanant
Affiliation: Wroclaw University of Science and Technology
Country: Poland

Post by Saruti » 01 Dec 2020, 12:27

Hello Dalton users!

I'm a newbie in Dalton and this is technically my first calculation (I've practiced retracing other researchers' procedures, but this is the first time I've tried running my own setup), and I've been having issues with dscf calculations in QM/MM. Ultimately, I would like to produce OPA/TPA absorption spectra, so I need to create a molecular orbital file (SIRIUS.rst) first, hence I'm running dscf.

Unfortunately, my dscf run has been terminating with an error at around the 'Dispersion Energy Correction' stage. I apologize for copy/pasting the Dalton.e* file below, but the forum wouldn't let me upload the error file. In any case, I don't think it has anything to do with my compilers/MPI processors, mainly because when I was retracing other people's procedures, I came upon a similar error when I included too many atoms in my MM region (multipoles/polarizabilities) or when I didn't allocate enough RAM to the calculation.

For this calculation I requested ~100 GB of RAM over 2 nodes and 12 cores. The PBS queuing system reported that I only used 11.14 GB of 97.66 GB, so I honestly don't think it's a RAM issue either. However, I do think I might have made a mistake in generating my molecule.inp or potential.inp file. I've checked them many times but I can't tell where the mistake is.

Could anyone give me any suggestions? Are my assumptions about my errors even in the right ballpark?

/opt/pbs/lib/python/altair/pbs/v1/_base_types.py:1436: DeprecationWarning: object.__new__() takes no parameters
return object.__new__(cls, value, is_entity)
/opt/pbs/lib/python/altair/pbs/v1/_svr_types.py:259: DeprecationWarning: object.__new__() takes no parameters
return object.__new__(cls, value)
/opt/pbs/lib/python/altair/pbs/v1/_base_types.py:793: DeprecationWarning: object.__init__() takes no parameters
super(pbs_str,self).__init__(value)
/opt/pbs/lib/python/altair/pbs/v1/_base_types.py:767: DeprecationWarning: object.__init__() takes no parameters
super(pbs_int,self).__init__(value)
/opt/pbs/lib/python/altair/pbs/v1/_svr_types.py:379: DeprecationWarning: object.__new__() takes no parameters
return object.__new__(cls, value)
binutils/2.25 load complete.
intel/13.1 load complete.
'openmpi/1.8.4-intel13.1' load complete.
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
libmpi.so.1 00007F1737FC4AA8 Unknown Unknown Unknown
libopen-pal.so.6 00007F1736BA7B7A Unknown Unknown Unknown
libmpi.so.1 00007F1737FC24AE Unknown Unknown Unknown
libmpi.so.1 00007F1737E7BD0C Unknown Unknown Unknown
libmpi_mpifh.so.2 00007F17383B3B4B Unknown Unknown Unknown
dalton.x 0000000001310724 Unknown Unknown Unknown
dalton.x 000000000128C0BA Unknown Unknown Unknown
dalton.x 00000000008DF62E Unknown Unknown Unknown
dalton.x 0000000000423AFF Unknown Unknown Unknown
dalton.x 00000000008A7236 Unknown Unknown Unknown
dalton.x 00000000016F164B Unknown Unknown Unknown
dalton.x 0000000001711179 Unknown Unknown Unknown
dalton.x 00000000016E1D1A Unknown Unknown Unknown
dalton.x 00000000016E08B0 Unknown Unknown Unknown
dalton.x 00000000016DD051 Unknown Unknown Unknown
dalton.x 000000000041DCDD Unknown Unknown Unknown
dalton.x 0000000000419D98 Unknown Unknown Unknown
dalton.x 0000000000413577 Unknown Unknown Unknown
dalton.x 000000000040C8EC Unknown Unknown Unknown
libc.so.6 00007F1737A61D20 Unknown Unknown Unknown
dalton.x 000000000040C7E9 Unknown Unknown Unknown
cp: cannot stat `DALTON_MOLECULE_POTENTIAL.tar.gz': No such file or directory
Attachments
DALTON.out (6.48 KiB)
output.log (262.25 KiB)

reinholdt
Posts: 6
Joined: 10 Apr 2016, 21:01
First name(s): Peter
Last name(s): Reinholdt
Affiliation: University of Southern Denmark
Country: Denmark

Post by reinholdt » 01 Dec 2020, 14:46

This is certainly interesting. Could you also attach the potential file you used?
I cannot test with PE until I have it, but I think the remaining part of the calculation is able to proceed through the SCF iterations on my machine.

Some other comments:
- The calculation will run a bit faster if you use a smaller shell of PE-ECPs around your QM molecule. The large shell is the reason why a long time is spent in ONEDRV.
- With a zero charge on your molecule, you get an open shell – is this intended?
- The D3 dispersion correction will be a bit strange; I think it will also include contributions from all the atoms you placed MM ECPs on.
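
To illustrate the open-shell remark: a closed-shell SCF needs an even number of electrons, and the electron count is the sum of the atomic numbers minus the net charge. Below is a minimal Python sketch of that parity check. The function names are hypothetical, and the acetate fragment is a made-up toy example, not the QM region from this thread.

```python
def electron_count(atomic_numbers, charge):
    """Total electrons = sum of nuclear charges minus the net molecular charge."""
    return sum(atomic_numbers) - charge


def is_open_shell(atomic_numbers, charge):
    """An odd electron count cannot be closed shell (at least one unpaired electron)."""
    return electron_count(atomic_numbers, charge) % 2 == 1


# Toy example (NOT the system from this thread): acetate, CH3COO
acetate = [6, 1, 1, 1, 6, 8, 8]  # C, H, H, H, C, O, O

print(is_open_shell(acetate, charge=0))   # neutral: 31 electrons, odd
print(is_open_shell(acetate, charge=-1))  # anion: 32 electrons, even
```

Running the same check on the atoms of the real QM region with the intended charge would show whether a wrong charge in the molecule file is what forces an open shell.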

Post by Saruti » 02 Dec 2020, 11:52

Hello reinholdt, and thank you for your reply!

I've attached the potential file in 3 parts (it was ~4 MB and the maximum upload size here is 2 MB).

>>With a zero charge on your molecule, you get an open shell – is this intended?
Yes, I realized later that my system's net charge was actually -1. I don't use pyframe to produce all the files in one go but rather in steps, and I made a mistake in one of my formatting scripts!

>>The calculation will run a bit faster if you use a smaller shell of PE-ECPs around your QM molecule. This is the reason why a long time is spent in ONEDRV.
I chose a cutoff of about 11 Å from the QM region. I was told that such a large region would be on the safer side because I'm not sure how much the protein's conformation would change.

On the other hand, a friend from my university told me that I may have partitioned the QM/MM region wrongly because I cut through amine/peptide bonds. I will also try to run this calculation again with a different partition and will post an update when I can!
Attachments
POT3.inp (1.78 MiB)
POT2.inp (1.55 MiB)
POT1.inp (576.15 KiB)

Post by reinholdt » 02 Dec 2020, 13:43

With the potential file you posted, I was able to successfully pass the section where you had the crash.

Can you try recompiling Dalton with debug flags (either a --type debug build or a standard release build with --extra-fc-flags='-g') and run again, so that we can resolve the symbols in the traceback?
Do you have the possibility of trying another compiler/MPI combination?

Otherwise, maybe there is a problem with the PBS setup? Can you try running on just a single node?

Post by Saruti » 20 Dec 2020, 19:21

Hello Reinholdt!

I'm sorry for the delay, but I was caught up in other duties. I have rerun the calculation with a debug build and attached the entire directory for you to view. I have also tried running a similar setup for TPA (quadratic response) on a single node (but 12 cores). I received this setup from a colleague who graduated recently and I don't remember making any modifications to it, so I think the setup itself is fine. Nevertheless, this calculation also failed at the dispersion energy correction. So maybe something really is wrong with my build! Since I'm quite new at this, I had been leaning towards a bad QM/MM partition.
Attachments
debug.zip (783.76 KiB)

Post by reinholdt » 20 Dec 2020, 19:40

Yes, well, the error happens in the PE parts of the code,
but perhaps that is also the first parallel section?
With the debug build we can see the traceback:

Image              PC                Routine            Line        Source             
dalton.x           00000000034FA6B6  Unknown               Unknown  Unknown
dalton.x           00000000033672E1  pelib_mpi_mp_mpi_         208  pelib_mpi.F90
dalton.x           000000000335174C  pelib_mp_pelib_sl        1950  pelib.F90
dalton.x           00000000031046E3  pelib_interface_m         996  pelib_interface.F90
dalton.x           0000000000427853  dalton_nodedriver         237  dalpar.F
dalton.x           00000000004161EA  MAIN__                    665  dalton.F
dalton.x           000000000040D81E  Unknown               Unknown  Unknown
libc.so.6          00007F9EB93ACD20  Unknown               Unknown  Unknown
dalton.x           000000000040D729  Unknown               Unknown  Unknown

and error messages:

forrtl: severe (408): fort: (8): Attempt to fetch from allocatable variable CRDS when it is not allocated

However, I don't think this necessarily has to do with PE; it may instead be a parallelization problem.
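
As a reading aid for tracebacks like the one above: the frames are listed innermost first, so the first row with a resolved source file is usually where to start looking. A minimal Python sketch of that idea follows. The `Frame` type and both function names are hypothetical helpers, and the parser assumes the five-column layout shown in this post (Image, PC, Routine, Line, Source).

```python
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    image: str
    pc: str
    routine: str
    line: Optional[int]
    source: Optional[str]


def parse_frames(traceback_text):
    """Parse 'Image PC Routine Line Source' rows into Frame tuples."""
    frames = []
    for row in traceback_text.strip().splitlines():
        fields = row.split()
        # Skip the header row and anything that is not a 5-column frame.
        if len(fields) != 5 or fields[0] == "Image":
            continue
        image, pc, routine, line, source = fields
        frames.append(Frame(
            image, pc, routine,
            int(line) if line.isdigit() else None,
            None if source == "Unknown" else source,
        ))
    return frames


def innermost_resolved(frames):
    """Return the first (innermost) frame with a known source file, if any."""
    for frame in frames:
        if frame.source is not None:
            return frame
    return None


# Excerpt of the debug-build traceback from this post.
TRACEBACK = """
Image              PC                Routine            Line        Source
dalton.x           00000000034FA6B6  Unknown               Unknown  Unknown
dalton.x           00000000033672E1  pelib_mpi_mp_mpi_         208  pelib_mpi.F90
dalton.x           000000000335174C  pelib_mp_pelib_sl        1950  pelib.F90
dalton.x           00000000004161EA  MAIN__                    665  dalton.F
"""

frame = innermost_resolved(parse_frames(TRACEBACK))
print(frame.source, frame.line)
```

For the excerpt above this points at line 208 of pelib_mpi.F90, which is the MPI communication layer of the PE library rather than the PE physics itself.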

Does your job run without PE?
Does your job run in serial (no MPI)?

Sometimes I have seen similar problems when using the wrong mpirun executable.
Maybe there is a mismatch between the loaded modules?
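
One way to check for such a mismatch is to compare the MPI libraries that dalton.x is linked against (as reported by `ldd dalton.x`) with the location of the `mpirun` found on the PATH. The Python sketch below only illustrates the comparison. The function names are hypothetical, the sample paths are made up, and the "same install prefix" rule is a heuristic that assumes a conventional `<prefix>/bin` and `<prefix>/lib` layout.

```python
import os


def mpi_libs_from_ldd(ldd_output):
    """Return resolved paths of libraries whose name contains 'mpi' from `ldd` output."""
    paths = []
    for row in ldd_output.splitlines():
        # Typical ldd row: "libmpi.so.1 => /path/lib/libmpi.so.1 (0x...)"
        if "=>" not in row:
            continue
        name, _, rest = row.partition("=>")
        target = rest.split()
        if "mpi" in name and target and target[0].startswith("/"):
            paths.append(target[0])
    return paths


def same_install_prefix(mpirun_path, lib_path):
    """Heuristic: <prefix>/bin/mpirun and <prefix>/lib/libmpi.so share <prefix>."""
    run_prefix = os.path.dirname(os.path.dirname(mpirun_path))
    lib_prefix = os.path.dirname(os.path.dirname(lib_path))
    return run_prefix == lib_prefix


# Made-up sample; on a real system use the output of `ldd dalton.x`
# and the path returned by shutil.which("mpirun").
SAMPLE_LDD = """
    libmpi_mpifh.so.2 => /opt/openmpi/1.8.4/lib/libmpi_mpifh.so.2 (0x00007f0000000000)
    libmpi.so.1 => /opt/openmpi/1.8.4/lib/libmpi.so.1 (0x00007f0000100000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f0000200000)
"""

libs = mpi_libs_from_ldd(SAMPLE_LDD)
for mpirun in ("/opt/openmpi/1.8.4/bin/mpirun", "/usr/bin/mpirun"):
    ok = all(same_install_prefix(mpirun, lib) for lib in libs)
    print(mpirun, "matches" if ok else "MISMATCH")
```

A reported mismatch is only a hint that the job script may be picking up a different MPI installation than the one Dalton was built with; it is not proof of the cause.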
