MEMGET ERROR
-
- Posts: 28
- Joined: 30 May 2019, 09:01
- First name(s): hong
- Last name(s): wang
- Affiliation: Nanjing Tech University
- Country: China
MEMGET ERROR
Dear all:
I encountered a problem when I used Dalton2015.0 for a phosphorescence calculation: "Reason: MEMGET ERROR, insufficient work space in memory". I used 32 cores and 15000 MB. I don't know what went wrong or how to fix it. Does anyone have any ideas? Can you help me?
Attached are the input and output files, and the content in the image is the content in the run.sh file.
Looking forward to your reply. Thank you very much!
- Attachments
- 4-3C2.out (208.76 KiB)
- C2.dal (170 Bytes)
- 8RACGI3I]I1HE6EZA92Q0Z6.png (5.99 KiB)
Last edited by wanghong on 28 Jun 2019, 07:47, edited 1 time in total.
-
- Posts: 395
- Joined: 27 Jun 2013, 18:44
- First name(s): Hans Jørgen
- Middle name(s): Aagaard
- Last name(s): Jensen
- Affiliation: University of Southern Denmark
- Country: Denmark
Re: MEMGET ERROR
Your output shows that the calculation only used 488 MB and not 15000 MB.
PS. I recommend upgrading to Dalton2018, but that is unrelated to your memory problem.
-
- Posts: 28
- Joined: 30 May 2019, 09:01
- First name(s): hong
- Last name(s): wang
- Affiliation: Nanjing Tech University
- Country: China
Re: MEMGET ERROR
Thank you very much for your reply. Besides downloading the new version of Dalton, do you know how to solve my current problem?
I also noticed that no matter how much memory I request, the run only uses around 480 MB. Is this a system setting? What do I need to do?
-
- Posts: 395
- Joined: 27 Jun 2013, 18:44
- First name(s): Hans Jørgen
- Middle name(s): Aagaard
- Last name(s): Jensen
- Affiliation: University of Southern Denmark
- Country: Denmark
Re: MEMGET ERROR
I looked at your output file again. I notice that you have asked for 150 GB of memory with 32-bit integers. With 32-bit integers you can ask for at most 15 GB of memory, corresponding to approx. 2 gigawords in double precision; otherwise you get integer overflow in the addressing of the internal work array. So either you must reduce to at most 15 GB or compile with 64-bit integers. (I believe we have programmed Dalton2018 to provide more information about the problem, and not just silently use the default because the specified work memory could not be represented with 32-bit integers.)
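The overflow can be checked with simple arithmetic. The sketch below assumes (consistent with the output shown later in this thread, where 64000000 words = 488.28 MB) that Dalton addresses its work array in 8-byte double-precision words, so the word count must fit in a signed 32-bit integer:

```shell
#!/bin/sh
# Max value of a signed 32-bit integer = max addressable word count.
INT32_MAX=2147483647

# 15000 MB of work memory expressed in 8-byte words:
WORDS_15GB=$((15000 * 1024 * 1024 / 8))          # 1966080000 -> fits
# 150 GB of work memory expressed in 8-byte words:
WORDS_150GB=$((150 * 1024 * 1024 * 1024 / 8))    # 20132659200 -> overflows

echo "15000 MB = $WORDS_15GB words (fits 32-bit: $((WORDS_15GB <= INT32_MAX)))"
echo "150 GB   = $WORDS_150GB words (fits 32-bit: $((WORDS_150GB <= INT32_MAX)))"
```

This is why 15 GB is about the largest request that works with a 32-bit-integer build, while 150 GB wraps around and Dalton silently falls back to its default.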
-
- Posts: 28
- Joined: 30 May 2019, 09:01
- First name(s): hong
- Last name(s): wang
- Affiliation: Nanjing Tech University
- Country: China
Re: MEMGET ERROR
Thank you very much for your reply. I encountered the following problem when updating Dalton, and I don't know how to solve it. Could you give me some advice?
[root@jiewei6 ~]# cmake --version
cmake version 3.14.4
CMake suite maintained and supported by Kitware (kitware.com/cmake).
[root@jiewei6 ~]# git clone --recursive https://gitlab.com/dalton/dalton.git
Cloning into 'dalton'...
fatal: unable to access 'https://gitlab.com/dalton/dalton.git/': Failed connect to gitlab.com:443; Connection timed out
I would appreciate it if you could give me some help.
Looking forward to your reply.
- magnus
- Posts: 524
- Joined: 27 Jun 2013, 16:32
- First name(s): Jógvan Magnus
- Middle name(s): Haugaard
- Last name(s): Olsen
- Affiliation: Aarhus University
- Country: Denmark
Re: MEMGET ERROR
There was a major Google Cloud outage that affected GitLab among many others, so perhaps that is why it failed. Can you try again?
-
- Posts: 28
- Joined: 30 May 2019, 09:01
- First name(s): hong
- Last name(s): wang
- Affiliation: Nanjing Tech University
- Country: China
Re: MEMGET ERROR
No, not yet. I guess it's the network. I'm trying to figure out how to solve this problem.
Thank you very much for your reply.
-
- Posts: 1210
- Joined: 26 Aug 2013, 13:22
- First name(s): Radovan
- Last name(s): Bast
- Affiliation: none
- Country: Germany
Re: MEMGET ERROR
Were you able to clone and build the code in the meantime? Anything I can help with?
-
- Posts: 5
- Joined: 24 Jul 2019, 19:51
- First name(s): Juan
- Middle name(s): Jose
- Last name(s): Aucar
- Affiliation: AFA
- Country: Argentina
Re: MEMGET ERROR
Hi wanghong
I have recently had a problem similar to the one you describe here (by the way, I don't know if you have already solved it).
Have you tried running the calculation on fewer than 32 cores?
To clarify something I've read in the answers, I'd like to say here that the output line "* Work memory size : 64000000 = 488.28 megabytes." refers only to the work memory size assigned to each core (the 15000 MB distributed equally among them all).
-
- Posts: 395
- Joined: 27 Jun 2013, 18:44
- First name(s): Hans Jørgen
- Middle name(s): Aagaard
- Last name(s): Jensen
- Affiliation: University of Southern Denmark
- Country: Denmark
Re: MEMGET ERROR
Some clarifying comments (I hope):
- the 488.28 MB (64 megawords) is the default for each MPI master and worker
- "dalton -mb 15000" or "dalton -gb 15" will allocate 15 GB of work memory on each MPI master and worker. That is, if you run MPI on 32 cores with shared memory, you will use 32 × 15 GB of the shared memory = 480 GB. So you should have 512 GB to use that option.
- "dalton -mb 15000 -nb 2000" will allocate 15 GB of work memory on the MPI master and 2 GB on each MPI worker; for MPI on 32 cores, 1×15 + 31×2 GB = 77 GB. For most applications (LUCITA excluded) 2 GB is enough for each MPI worker, but the master can often benefit from more memory.
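The master/worker memory arithmetic can be sanity-checked in the job script before submitting. The variables below (NPROCS, MASTER_GB, WORKER_GB) are illustrative shell names, not Dalton options:

```shell
#!/bin/sh
# Estimate the total Dalton work memory footprint of an MPI run
# (1 master + NPROCS-1 workers), matching "dalton -mb 15000 -nb 2000".
NPROCS=32        # total MPI processes
MASTER_GB=15     # master allocation, i.e. -mb 15000
WORKER_GB=2      # per-worker allocation, i.e. -nb 2000

TOTAL_GB=$((MASTER_GB + (NPROCS - 1) * WORKER_GB))
echo "Estimated total work memory: ${TOTAL_GB} GB"   # 15 + 31*2 = 77 GB
```

Comparing TOTAL_GB against the node's physical memory before submitting avoids the 32 × 15 GB = 480 GB surprise described above.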
-
- Posts: 5
- Joined: 24 Jul 2019, 19:51
- First name(s): Juan
- Middle name(s): Jose
- Last name(s): Aucar
- Affiliation: AFA
- Country: Argentina
Re: MEMGET ERROR
Thanks for the comments, Hans Jørgen. I was confused about the -mb option.
-
- Posts: 28
- Joined: 30 May 2019, 09:01
- First name(s): hong
- Last name(s): wang
- Affiliation: Nanjing Tech University
- Country: China
Re: MEMGET ERROR
Hello, engineer:
I use a 56-core, 250 GB server; the contents of run.sh are as follows. However, when I calculate 30 states, I always run out of memory. Why? How can I solve this problem? Please help me.
The parameter in the rspprp.h file is now maxlbl = 100000, and the parameter in the infohso.h file is mxphos = 110.
#!/bin/sh
export PATH=/home/DALTON/build:$PATH
export DALTON_TMPDIR=/tmp/DALTON
export DALTON_LAUNCHER="mpirun -np 56"
dalton -gb 2 -noarch
-
- Posts: 395
- Joined: 27 Jun 2013, 18:44
- First name(s): Hans Jørgen
- Middle name(s): Aagaard
- Last name(s): Jensen
- Affiliation: University of Southern Denmark
- Country: Denmark
Re: MEMGET ERROR
It fails with insufficient memory because Dalton uses the default of ca. 0.5 GB per process, i.e. about 26 GB in total in your case. I do not understand this if you used "dalton -gb 2"; did you forget the -gb 2? However, most memory is needed on the master. I would therefore suggest something like "dalton -mb 15000 -nb 2000" for 15 GB on the master and 2 GB on each worker.
Hans Jørgen Aa. Jensen, professor in computational chemistry
-
- Posts: 28
- Joined: 30 May 2019, 09:01
- First name(s): hong
- Last name(s): wang
- Affiliation: Nanjing Tech University
- Country: China
Re: MEMGET ERROR
Professor:
Thank you very much for your reply.
When I changed the contents of run.sh, no matter how I set -mb or -nb, the output file shows the same work memory size: 64000000 = 488.28 megabytes.
What's wrong with my settings? Looking forward to your reply.
- Attachments
- FM-1.out (139.07 KiB)
- run.sh.png (7.75 KiB)
-
- Posts: 28
- Joined: 30 May 2019, 09:01
- First name(s): hong
- Last name(s): wang
- Affiliation: Nanjing Tech University
- Country: China
Re: MEMGET ERROR
I seem to have found out why I don't have enough work memory: the last line in the script file is not recognized, which leaves the run with the insufficient default. Is there a problem with my script file, professor?
-
- Posts: 395
- Joined: 27 Jun 2013, 18:44
- First name(s): Hans Jørgen
- Middle name(s): Aagaard
- Last name(s): Jensen
- Affiliation: University of Southern Denmark
- Country: Denmark
Re: MEMGET ERROR
Very strange. Maybe you could replace "dalton ..." with "bash -x -v dalton ... >& run.log", then the run.log should show what happens.
-
- Posts: 600
- Joined: 15 Oct 2013, 05:37
- First name(s): Peter
- Middle name(s): Robert
- Last name(s): Taylor
- Affiliation: Tianjin University
- Country: China
Re: MEMGET ERROR
How are you executing this script (i.e., do you run via a queueing system)? Many "interactive" shells, including runs under nohup, have built-in limits on the resources that can be requested (see, e.g., the ulimit command). But I admit this does not seem very likely here, because your run has memory that is an "exact" number of 64-bit words (64000000), which is not a very convincing default value for something specified in megabytes. Still, either the operating system or a queueing system, if you use one, might be causing the issue.
We run under SLURM. Because old habits die hard, I use the -mw parameter, which specifies memory in 64-bit words (not megawords), as (e.g.)
/home/taylor/src/2018/dalton/build_new/dalton \
-mw 2000000000 -o ${SLURM_JOB_NAME}.lis -dal MKcas -mol MKstart
and in the output this gives
* Work memory size : 2000000000 = 14.901 gigabytes.
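That conversion is easy to reproduce: 2000000000 words at 8 bytes per 64-bit word, divided by 1024³ bytes per gigabyte, gives the value Dalton reports:

```shell
#!/bin/sh
# Convert a -mw word count to gigabytes (8 bytes per 64-bit word).
WORDS=2000000000
awk -v w="$WORDS" 'BEGIN { printf "%.3f gigabytes\n", w * 8 / (1024^3) }'
# prints: 14.901 gigabytes
```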
Best regards
Pete
-
- Posts: 28
- Joined: 30 May 2019, 09:01
- First name(s): hong
- Last name(s): wang
- Affiliation: Nanjing Tech University
- Country: China
Re: MEMGET ERROR
I am deeply grateful for your assistance.
I have solved the difficulty. It is not clear whether the .bashrc file affected the permissions or whether there were other issues. When I abandoned the run.sh file and instead attached the work memory settings to the command when I submitted the calculation, they were recognized and the job ran normally.