memory problems using SOPPA (NMR couplings)


moe
Posts: 8
Joined: 01 Jun 2016, 10:37
First name(s): Marc-Olivier
Last name(s): Ebert
Affiliation: ETH Zurich
Country: Switzerland

memory problems using SOPPA (NMR couplings)

Post by moe » 01 Jun 2016, 10:58

Dear Dalton community,

I want to calculate spin-spin coupling constants using SOPPA. My test molecule is fluoropropene and I want to use aug-cc-pVTZ as basis set. All my runs end with the same error message that I get shortly after the start of ABACUS (a set of typical output files is attached to this post):

Symmetry -> DSO by analytical integration.

Integral transformation: Total CPU and WALL times (sec) 2679.560 2695.624

MEMGET ERROR, insufficient free space for next allocation
( Need: 72965, available (LFREE): 8631 )


The things I have found out by now are (also by reading the posts in this forum):

1. I can only use one processor, since neither the MP2 nor the SOPPA portion runs MPI-parallel. I do get a small gain (ca. 20%) if I use two processors and parallel threading via -omp 2 (linked to the parallel version of OpenBLAS), but this does not help with my problem.

2. I have to specify the amount of (static) working memory explicitly via the -mb option, and the maximum size of the work array that Dalton can use is 16 GB.

I have also asked lsf to assign more than 16 GB of memory to my process (still keeping -mb 16384) or increased the amount of scratch memory available to the job. But this does not make a difference, the jobs always run out of memory. So I guess the 16 GB limitation of Dalton itself is the problem.

Any suggestions what I can do? Changing to a smaller basis set helps (tried) but this is not what I want. Writing some integrals to disk, changing SOPPA default parameters?

Best regards,

Oli
Attachments
com_cisfluoropropene.out (102.82 KiB)
cisfluoropropene-lsfo.txt, lsf output (6.67 KiB)

frj
Posts: 11
Joined: 09 May 2014, 08:37
First name(s): Frank
Last name(s): Jensen
Affiliation: Aarhus University
Country: Denmark

Re: memory problems using SOPPA (NMR couplings)

Post by frj » 01 Jun 2016, 19:02

I can't help you with the specific problem, but would like to point out that standard basis sets, like aug-cc-pVTZ, are not suitable for calculating spin-spin coupling constants.
I suggest using specialized basis sets, like pcJ-n. pcJ-1 is much smaller than aug-cc-pVTZ, likely gives much lower basis set errors, and may perhaps also solve your memory problem. pcJ-2 is comparable in size to aug-cc-pVTZ, and likely produces results close to the basis set limit for the particular method.


Re: memory problems using SOPPA (NMR couplings)

Post by moe » 01 Jun 2016, 19:51

Thanks for your suggestion! I will certainly try that. The fluoropropene was intended as a test case to get a feeling for the performance of the SOPPA calculation. My initial plan for the real molecule (a bit bigger) was to use aug-cc-pVTZ-J for the atoms participating in the coupling of interest and aug-cc-pVTZ for the rest (hence the aug-cc-pVTZ in the test). In the meantime I have turned to cc-pVTZ for the "passive" atoms (results still pending).

sauer
Posts: 45
Joined: 27 Aug 2013, 16:37
First name(s): Stephan P. A.
Last name(s): Sauer
Affiliation: Department of Chemistry, University of Copenhagen
Country: Denmark

Re: memory problems using SOPPA (NMR couplings)

Post by sauer » 01 Jun 2016, 21:17

Dear Oli,

I have a few comments:
a) Yes, Frank's pcJ basis sets would be an alternative to our aug-cc-pVTZ-J. It depends on how large the PSO or SD terms are. If the coupling is dominated by the FC term, then go for aug-cc-pVTZ-J; if PSO and SD are important, then you had better use the larger pcJ-2.
b) The SOPPA coupling constant program is not yet parallel. We are working on that, and I hope that by the end of this year we will have an MPI-parallelised version. Until then you cannot use MPI.
c) You can certainly use much more memory with Dalton - we do. But I think you have to compile it in the 64-bit version in order to do so.
d) Your problem is not in the SOPPA code, but actually in the integral transformation code. And it is a bit strange, I have to admit, because the program seems to want only very little more memory than is available.
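
To put a rough number on point d), here is a small sketch that converts the figures from the MEMGET ERROR message into a shortfall. It assumes the Need and LFREE counts are 8-byte words of the work array; the error message itself gives no unit, so that is an assumption, and the helper name `shortfall` is made up for illustration.

```python
WORD_BYTES = 8  # assumption: one work-array word is an 8-byte real


def shortfall(need_words, lfree_words, word_bytes=WORD_BYTES):
    """Return (missing words, missing KiB) for a failed allocation."""
    missing = need_words - lfree_words
    return missing, missing * word_bytes / 1024


# Figures copied from the MEMGET ERROR in the first post
missing_words, missing_kib = shortfall(72965, 8631)
print(missing_words, round(missing_kib, 1))
```

Under that assumption the request was short by roughly half a megabyte, which is tiny compared with a 16 GB work array.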

So, I will try to run your job on our machine here in Copenhagen and let you know later, what happens here.

Best wishes
Stephan


Re: memory problems using SOPPA (NMR couplings)

Post by moe » 01 Jun 2016, 21:56

Dear Stephan,

Thanks for your help! The coupling I am interested in should clearly be dominated by FC (according to DFT calculations). From reading the posts here it seemed to me that compiling a 64 bit version would be quite tricky. So I am looking forward to the results of your test in Copenhagen. Maybe there is an easier solution to my memory problem, especially as it seems that I am fighting the symptoms of some other issue.

Best regards,

Oli


Re: memory problems using SOPPA (NMR couplings)

Post by sauer » 01 Jun 2016, 23:28

Dear Oli,

I had no problem running your input on one of our machines. See the attached output; it even ran without 64-bit integers. Although I did not test it, I would expect that it would also have worked with less than the 16 GB of RAM. But as you might notice from the output, I was not using the official release but the developers' master version, as I have not installed the 2016 release version yet. Secondly, I did not use an MPI-compiled version.

My conclusion is that either there is a problem in the release version or, more likely, that something went wrong with your installation, or that it is a problem with running an MPI-compiled version of the program. Did you run all the tests in the test suite, including all SOPPA tests?

I recommend making a second build without MPI and then trying again. If that does not work, then post it again, but under the heading "Problems with the new integral transformation", because then you will get the attention of the people who know more about this than I do.

Finally, I can see that this coupling is FC dominated, so the aug-cc-pVTZ-J basis set will be just fine.

Best wishes
Stephan
Attachments
fpropen_fpropen.out (109.06 KiB)

taylor
Posts: 545
Joined: 15 Oct 2013, 05:37
First name(s): Peter
Middle name(s): Robert
Last name(s): Taylor
Affiliation: Tianjin University
Country: China

Re: memory problems using SOPPA (NMR couplings)

Post by taylor » 02 Jun 2016, 00:26

Could you post the complete output of your job, not just the logfile? But as an immediate response, your integral file (AOTWOINT) is pretty large, about 8GB, and this may create problems in some addressing situations. But I cannot say more without more information.

I am not an expert on nuclear spin-spin coupling calculations, but I have to say I cannot conceive of why anyone would favour cc-J type basis sets over Frank Jensen's pc-J sets. I would always go with the pc-J sets.

Best regards
Pete


Re: memory problems using SOPPA (NMR couplings)

Post by moe » 02 Jun 2016, 20:15

@Stephan: Thank you very much for your help!
@Pete: Thank you. I am not quite sure what you mean by output. The DALTON.* files in the tar file?


Re: memory problems using SOPPA (NMR couplings)

Post by taylor » 02 Jun 2016, 20:49

If you run Dalton using the dalton runscript in bin, there is an option

-o <filename>

that can be used to ensure the script writes the job output back to the directory of job submission while the job is actually running. This is described/discussed in the manual. The output file is (I think --- it's years, no, decades! since I tried to deal with this!!) DALTON.OUT in the scratch directory in which Dalton runs. There are multiple problems with this, and the script option is a result of Kenneth Ruud tackling those problems in San Diego close to twenty years ago.

First, depending on what the compiler/libraries/operating system do about flushing buffers and closing files, it is difficult to be sure that if the job dies, the output file is "up to date" --- it may be that data didn't get written to it properly when the job ended.

Second, there is an issue with networked file systems (such as a computer where the user filesystem is NFS-mounted across multiple nodes, which is almost universal today) and when and how files are updated.

Third, many resource management systems on multinode compute facilities (software like SLURM, TORQUE, PBSPro, LSF) are set up by the sysadmins so that the scratch directory created for the user is automatically deleted at the end of the job, irrespective of whether the job ends successfully or not. (This is not a feature/bug of those resource management systems; it is usually a conscious choice, often for good reason, by the site running the system(s).) If you are running on your own system and have set things up so such directories are preserved, then it's fine because you can go in and ferret out the output file. But if the operating system has deleted your scratch directory in the meantime, well, then it's lost.

So in most situations users should use the dalton runscript with the -o option so that the directory from which they initiated the job always has an output file that is completely up to date. If the job aborts, most likely this file (and not always the very last lines, in cases where things go wrong earlier in the calculation!) will contain the most useful information to understand and correct the problem. And that is what it is essential for users to post.
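
As a rough illustration of the advice above, the snippet below only assembles a runscript command line that combines -o with the -mb option mentioned earlier in the thread; it does not run anything. The job name `myjob` and output name `myjob.out` are placeholders, the function name is invented, and the exact argument order your installed runscript accepts should be checked against its own help text.

```python
import shlex


def dalton_command(dal_input, output_file, memory_mb):
    """Build (but do not run) a dalton runscript command line."""
    return ["dalton", "-mb", str(memory_mb), "-o", output_file, dal_input]


# Placeholder names; 16384 is the -mb value quoted in the first post
cmd = dalton_command("myjob", "myjob.out", 16384)
print(shlex.join(cmd))
```

Keeping the command as a list makes it easy to pass to a job-submission wrapper later without worrying about shell quoting.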

Best regards
Pete


Re: memory problems using SOPPA (NMR couplings)

Post by moe » 02 Jun 2016, 21:11

Dear Pete,

Thanks for the clarification. I have run the job using the -ow option which I thought would give me the DALTON.OUT file and just rename it (according to the manual). This is the file I have attached to my original post. I am afraid this is all I have left (we use lsf here, and the scratch is deleted by now). So the DALTON.OUT file I get with -o is not the same file just with another name?

Best regards,

Oli


Re: memory problems using SOPPA (NMR couplings)

Post by sauer » 02 Jun 2016, 21:15

Dear Oli,

I guess Pete did not see the output file, which you had attached.

Did you manage to run the test suite of Dalton?

Best wishes and let me know, whether you need more help
Stephan


Re: memory problems using SOPPA (NMR couplings)

Post by moe » 02 Jun 2016, 21:34

Dear Stephan,

I did not compile the program myself. But according to the person in our cluster support group who helped me with it, the tests ran through OK. It seems I will need some time to sort things out...

Many thanks and best regards,

Oli


Re: memory problems using SOPPA (NMR couplings)

Post by taylor » 04 Jun 2016, 06:25

Belatedly, my apologies for having missed that you had posted the output file. In my own defence, it did not appear as an attachment in my browser window, and even after reloading the page it does not show in the attachment list but elsewhere on the page just as a link! This may be a browser problem, or a forum (website) problem, but it's certainly not your problem!

One last comment, if your cis-fluoropropene is only for testing and your "real molecule" is a bit bigger, you will end up with a pretty big calculation sticking with aug-cc-pVTZ...

Best regards
Pete

xiongyan21
Posts: 184
Joined: 24 Sep 2014, 08:36
First name(s): yan
Last name(s): xiong
Affiliation: CENTRAL CHINA NORMAL UNIVERSITY
Country: China

Re: memory problems using SOPPA (NMR couplings)

Post by xiongyan21 » 06 Jun 2016, 03:33

Dear Prof. Sauer
The calculation cannot finish using Dalton2016.1, whether MPI is employed or not.
A sequential run also runs into the memory problem, although 16 GB of memory is used.

* Work memory size : 2048000000 = 15.259 gigabytes.
...

Only the spin-spin couplings between the following nuclei will be calculated:
-----------------------------------------------------------------------------

1 2



Changes of defaults for SPIN-S:
-------------------------------


NSETUP: MO transformation level too low or no MO integral file found.
NSETUP: generating MO 2-el. integral file. Transformation level: 10


Integral transformation: Total CPU and WALL times (sec) 2569.961 2615.575

Center of mass dipole origin : 0.000000 -0.000000 0.000000

Center of mass gauge origin : 0.000000 -0.000000 0.000000


.------------------------------------------------.
| Starting in Static Property Section (ABACUS) - |
`------------------------------------------------'



Date and time (Darwin) : Mon Jun 6 07:18:17 2016
Host name :

Symmetry -> DSO by analytical integration.

Integral transformation: Total CPU and WALL times (sec) 2676.252 2877.802

MEMGET ERROR, insufficient free space for next allocation
( Need: 72965, available (LFREE): 27181 )


QTRACE dump of internal trace stack

========================
level module
========================
9 MEMGET2
8 N_NXTH2M
7 DRCACD
6 DRCCTL
5 ABARSP
4 ABACTL
3 ABACUS
2 DALTON
1 DALTON main
========================


--- SEVERE ERROR, PROGRAM WILL BE ABORTED ---
Date and time (Darwin) : Mon Jun 6 08:07:01 2016
Host name :
@ MPI MASTER, node no.: 0
@ Reason: MEMGET ERROR, insufficient work space in memory

Total CPU time used in DALTON: 1 hour 55 minutes 42 seconds
Total wall time used in DALTON: 2 hours 0 minutes 15 seconds


QTRACE dump of internal trace stack

========================
level module
========================
9 MEMGET2
8 N_NXTH2M
7 DRCACD
6 DRCCTL
5 ABARSP
4 ABACTL
3 ABACUS
2 DALTON
1 DALTON main
========================



It seems the MP2 and later steps are much faster if an AMD CPU is used, as in your case.
A parallel run gets stuck during the MP2 calculation.
Decreasing the transformation level may let the calculation pass.
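
For reference, the "Work memory size" line in the output above can be reproduced with a short calculation. The sketch assumes 8-byte words, that "gigabytes" in the output means 2^30 bytes, and that the -mb value counts units of 10^6 bytes; all three are inferred from the numbers in this thread rather than taken from the manual, and the function names are made up.

```python
WORD_BYTES = 8  # assumption: 8-byte work-array words


def words_from_mb(mb):
    """Words obtained from an -mb value, assuming 1 MB = 10**6 bytes."""
    return mb * 10**6 // WORD_BYTES


def words_to_gib(words):
    """Size of a word count in units of 2**30 bytes."""
    return words * WORD_BYTES / 2**30


words = words_from_mb(16384)  # -mb 16384, as in the first post
print(words, f"{words_to_gib(words):.3f}")
```

With these assumptions the word count and the gigabyte figure both agree with the quoted output line.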



Very Best Regards!
Last edited by xiongyan21 on 08 Jun 2016, 06:44, edited 2 times in total.


Re: memory problems using SOPPA (NMR couplings)

Post by sauer » 06 Jun 2016, 09:31

Hi again,

ok. Give me a bit of time and I will install 2016.1 on our computers and run the job again with this version. This will tell us, whether there is a problem with 2016 or with your installation.

Best regards
Stephan


Re: memory problems using SOPPA (NMR couplings)

Post by moe » 06 Jun 2016, 09:45

Dear all,

Some news from my side: I now have a recompiled version of Dalton using 64-bit integers and no MPI. If I compare output from SOPPA runs that DO work (smaller molecules, smaller basis sets) in both the 32-bit and the 64-bit integer versions, there is no difference. So the new version seems to work (although there was a floating point exception error in one of the standard tests). I am now able to allocate more than 16 GB of working memory (which the program also acknowledges).

But there are still problems with my (big) cisfluoropropene calculation (complete output attached). The error message is now:

Symmetry -> DSO by analytical integration.

Integral transformation: Total CPU and WALL times (sec) 2640.130 2647.441

Sorting integrals to Dirac format: Total CPU and WALL times (sec) 1.610 1.617

MEMCHK ERROR, not a valid memget id in work(kalloc-1)
Text from calling routine : RSPOLI.h2m+d (called from MEMREL)
KFIRST,KALLOC,IALLOC = 1 357638 11
found memory checks: 4554529536261458283 ( value as real*8: 0.319509643204E-03 )
expected : 1234567890

In the lsf standard error I find the following error message:

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

But I am not sure if this is directly related to the error in the test suite. To me it seems to be again a memory issue related to integral transformation and/or sorting...

Best regards,

Oli
Attachments
cisfluoropropene-lsfe.txt (675 Bytes)
cisfluoropropene-lsfo.txt (6.57 KiB)
com_cisfluoropropene.out (104.6 KiB)


Re: memory problems using SOPPA (NMR couplings)

Post by taylor » 06 Jun 2016, 09:57

The SIGFPE error itself is likely misleading, but it is an unavoidable Linux "feature" when a code issues "CALL ABORT" to abort the calculation. In this case what is significant is that there is a memory check error. The program, when it allocates arrays within an overall large scratch memory array (in the computing jargon this is a stack-based allocation scheme), puts a special bit-sequence at the start and end of each array. This can then subsequently be checked --- if the special bit-sequence has been corrupted then the array has been addressed incorrectly, which would be a bug in the code (or, less likely, but not impossible, a bug in the code generated by the compiler rather than in the code itself). So the error you are seeing is reflected in the memory check message, the code "knows" something has gone wrong, and calls the system abort routine. That in turn generates a SIGFPE at the system level, but that is beyond Dalton's control.
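
The guard-word idea described above can be sketched with a toy allocator. This is only a model under stated assumptions: the class and method names echo MEMGET/MEMREL but are invented, the tag value is the "expected" number printed in the MEMCHK message, and the real layout of Dalton's guard words is not described in this thread.

```python
GUARD = 1234567890  # the "expected" value printed in the MEMCHK message


class WorkStack:
    """Toy stack allocator over one fixed work array, with guard words."""

    def __init__(self, size):
        self.work = [0.0] * size
        self.top = 0       # index of the first free word
        self.blocks = []   # (start, length) of live blocks, oldest first

    @property
    def lfree(self):
        """Number of words still free above the top of the stack."""
        return len(self.work) - self.top

    def memget(self, n):
        """Allocate n payload words plus one guard word on each side."""
        need = n + 2
        if need > self.lfree:
            raise MemoryError(f"Need: {need}, available (LFREE): {self.lfree}")
        start = self.top + 1
        self.work[start - 1] = GUARD
        self.work[start + n] = GUARD
        self.blocks.append((start, n))
        self.top += need
        return start

    def memrel(self):
        """Release the most recent block after checking both guard words."""
        start, n = self.blocks.pop()
        if self.work[start - 1] != GUARD or self.work[start + n] != GUARD:
            raise RuntimeError("MEMCHK ERROR: guard word overwritten")
        self.top = start - 1


stack = WorkStack(16)
a = stack.memget(4)        # payload occupies indices a .. a+3
stack.work[a + 4] = 0.5    # write one word past the payload, onto the guard
try:
    stack.memrel()
except RuntimeError as err:
    print(err)
```

The out-of-bounds write goes unnoticed when it happens and is only detected when the block is released, which matches the way the error in the post above shows up in a call from MEMREL rather than at the point of the faulty write.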

Best regards
Pete


Re: memory problems using SOPPA (NMR couplings)

Post by sauer » 06 Jun 2016, 15:26

Hi Oli, Hans Jørgen and Peter,

I have now run the input with both the Dalton2016.1 version and the GIT master version. It works with the GIT master version (with 16 GB and with 8GB), but it does not work with the official DALTON2016.1.

So it is not that this job would need an unreasonable amount of memory; rather, I think that there is a problem in DRCACD or N_NXTH2M in DALTON2016.1. And that is a part of the code I know nothing about, unfortunately.

Best wishes
Stephan
Attachments
fpropen_fpropen-GIT-16GB.out, with GIT version and 16 GB (109.06 KiB)
fpropen_fpropen-GIT-8GB.out, with GIT version and 8 GB (109.08 KiB)
fpropen_fpropen-2016.out, with Dalton 2016 (103.04 KiB)


Re: memory problems using SOPPA (NMR couplings)

Post by moe » 07 Jun 2016, 16:34

Dear Stephan,

Thanks for your help. Would it be possible to have access to an older version of the parts of the code you are mentioning?

Best regards,

Oli


Re: memory problems using SOPPA (NMR couplings)

Post by sauer » 07 Jun 2016, 19:27

Well, you could try to download and install Dalton 2015. But give me a day and I will try to run your input with my local DALTON 2015 installation in order to see, whether the problem also occurs with that version.

Best wishes
Stephan


Re: memory problems using SOPPA (NMR couplings)

Post by sauer » 07 Jun 2016, 22:36

Yes, on my computer it works fine with the Dalton 2015 version and 8 GB. So try to get hold of Dalton 2015.

Best wishes
Stephan

m775b097
Posts: 18
Joined: 26 Apr 2016, 03:55
First name(s): Matthew
Last name(s): Barclay
Affiliation: University of Kansas
Country: United States

Re: memory problems using SOPPA (NMR couplings)

Post by m775b097 » 11 Jul 2016, 20:30

Hello everyone,

I was wondering if I could get in on this (in-progress?) discussion, as I ran into a similar problem when attempting to compute the excited-state polarizabilities of a somewhat large molecule.

I've attached the failed .out file, but suffice to say I end with the same result:

 MEMGET ERROR, insufficient free space for next allocation
               ( Need:    273803, available (LFREE):     18283 )

Mind, this is after increasing WRKMEM to ~10 GB. The default (~488 MB) failed similarly (the only difference being that the reported LFREE was 93383).

This is actually something I've been trying to understand for a while, and I must first confess that, other than what's in the manual, I know pathetically little about exactly how the memory storage works in DALTON [or with computers in general].

But specifically, if it's not too basic a question, would someone mind helping me understand the nature of LFREE?
I assume that's the free space in the allotted WRKMEM, but based on my outputs, there doesn't seem to be any consistent correlation between how much -mw I ask for and how much LFREE is reported, so I must be misunderstanding something.

And have we confirmed what the issue is yet?
Is it really just a matter of ineffective compilation? (my version also utilizes MPI)

Thanks so much,
~Matt
Attachments
wrkmem4_dplus1.out (115.27 KiB)


Re: memory problems using SOPPA (NMR couplings)

Post by taylor » 11 Jul 2016, 21:00

Well, the code (compiled with 32-bit addressing) can address 16GB (per task if you are running parallel), so provided you have a machine where this much memory is available you could in principle specify that much memory per task using WRKMEM and your queueing system. Well, not quite that much because you have to accommodate the code itself, of course. Assuming you have not enabled "store integrals in memory" a WRKMEM of 15GB should be no problem. Just checking: you are running on a system that has either no limits or you are specifying to the queueing system (SLURM, TORQUE, PBS, whatever) enough memory to accommodate your Dalton WRKMEM request?
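
The 16GB figure is what one would expect if the work array is indexed with signed 32-bit integers and each element occupies 8 bytes. That reading is an assumption here, not something stated in the post; the short check below just does the arithmetic and compares it with the -mb 16384 setting used earlier in the thread.

```python
MAX_WORDS = 2**31 - 1   # largest signed 32-bit integer
WORD_BYTES = 8          # assumption: 8-byte work-array elements

limit_gib = MAX_WORDS * WORD_BYTES / 2**30
print(f"addressable work array: just under {round(limit_gib)} GiB")

# Word count reported in this thread for a 16 GB work-memory request
requested_words = 2_048_000_000
print("fits in a 32-bit index:", requested_words <= MAX_WORDS)
```

On this reading, a request much above the word count shown would overflow a 32-bit index, which is consistent with the remark elsewhere in the thread that going beyond 16 GB needs a build with 64-bit integers.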

This seems to me to be a big but not outrageously big calculation, and I am a bit surprised it feels it has run out of memory. That said, as I said in my earlier posting, Dalton is sufficiently old-school that it uses a stack-based maximum-fixed-at-runtime allocation scheme, and this is just as subject to odd coding failures as failing to deallocate arrays is in heap-based allocation systems like modern Fortran. One thing that might be worth trying, to see if there really is a bug, would be to run this job as an SCF calculation rather than using DFT. This might at least provide some clues as to what is going wrong.

But at the end of the day, I'm afraid where you are encountering problems is not my code, and I don't have any better ideas than flailing around looking at what might work and what (like your current calculation) definitely doesn't...

Best regards
Pete

kennethruud
Posts: 252
Joined: 27 Aug 2013, 16:42
First name(s): Kenneth
Last name(s): Ruud
Affiliation: UiT The Arctic University of Norway
Country: Norway

Re: memory problems using SOPPA (NMR couplings)

Post by kennethruud » 11 Jul 2016, 21:28

Dear Matt,

As Peter says, it might very well be a bug for this particular calculation setup. However, I note you run your calculation writing two-electron integrals to disk, and the crash happens when transforming these integrals into the MO basis. Could I recommend rerunning the calculation in integral-direct mode, or even better, as a parallel calculation? I think this should do the trick.


Best regards,

Kenneth


Re: memory problems using SOPPA (NMR couplings)

Post by taylor » 11 Jul 2016, 21:35

Yes, I hadn't really paid attention to this aspect and certainly running .DIRECT or .PARALLEL is the first thing to try!

Best regards
Pete
