
Check running status or run queue

Posted: 28 Oct 2015, 06:15
by ankit7540
Hi all,
Is it possible to check the status of submitted calculation jobs (for example, whether they have stopped or are still running) and to inspect the job queue with a script or similar?
(I am new to Dalton.)
Ankit

Re: Check running status or run queue

Posted: 28 Oct 2015, 09:43
by bast
Dear Ankit,
most probably yes, but this is not a question directly related to Dalton.
It depends on the cluster where you run, so you should ask your local
support. We cannot know how your cluster is set up.
Good luck,
radovan

Re: Check running status or run queue

Posted: 28 Oct 2015, 10:59
by lyzhao
Hi Ankit,
I guess you are looking for a PBS script. Here is an example:

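#!/bin/bash
## Job name
#PBS -N dalton1
## Join the standard error and the standard output into one output file
#PBS -j oe
## Export the submitting shell's environment variables to the job
#PBS -V
## Request one node with 16 processors per node
#PBS -l nodes=1:ppn=16

## Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"

JOB=job.dal          # Dalton input file
MEM="-mb 8000"       # memory in MB
SCR="-t ./"          # scratch directory
NPROC="-N 16"        # number of MPI processes
PROG=/home/lyzhao/calcsoft/Dalton/dalton

$PROG $MEM $SCR $NPROC $JOB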

Re: Check running status or run queue

Posted: 28 Oct 2015, 18:21
by taylor
I won't post a SLURM script right away, because if it's not needed it would just cause confusion. All you may need is a PBS-type script (which will also work for TORQUE, MOAB, and other closed- and open-source queueing systems), and lyzhao has provided you with that. But some sites use SLURM, which, although I don't like it, we used at the computer centre I ran in Melbourne, because it is the only queueing system available on IBM BlueGene/Q. If your site uses SLURM, let me know and I can dig out similar example scripts for SLURM.

As Radovan says, how to check status etc. depends a bit on your system and system configuration. If it is some type of cluster system or compute farm with filesystems mounted from a central server, the -o option to the dalton runscript may be of help, so that the output file is constantly updated as the calculation runs. Commands like qstat on a PBS-type queueing system will tell you what you, and perhaps other people, have running. If you want to monitor performance, the command "top" will tell you what is running, but you have to be able to log on to the particular machine(s) your job is running on, and some systems do not allow this. There are many other tools that can be used to monitor performance and progress, but we will need more information about your setup before we can offer more detailed advice.
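
For illustration, the commands mentioned above could be wrapped roughly as follows. This is only a sketch under assumptions: it presumes a PBS/TORQUE-style qstat, GNU stat (as found on most Linux systems), and an output file called job.out, which is a made-up name. The function names are also invented for this example.

```shell
#!/bin/bash
# Sketch only: function names and the file name job.out are illustrative.

# Show your own jobs if a PBS-type qstat is installed on this machine.
show_my_jobs() {
    if command -v qstat >/dev/null 2>&1; then
        qstat -u "$USER"
    else
        echo "qstat not found: probably no PBS-type queueing system here"
    fi
}

# Print how many seconds ago a file was last modified (GNU stat assumed).
# A large value for an output file may mean the job has stopped, or that
# it is in a long step that does not print anything.
seconds_since_update() {
    local file="$1"
    local now mtime
    now=$(date +%s)
    mtime=$(stat -c %Y "$file") || return 1
    echo $(( now - mtime ))
}

# Example usage:
#   show_my_jobs
#   seconds_since_update job.out
#   tail -n 20 job.out
```

The age of the output file is only a hint, not proof that a job has died, so it is best combined with a look at the last lines of the output.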

Best regards
Pete

Re: Check running status or run queue

Posted: 29 Oct 2015, 09:17
by ankit7540
Dear Mr. Lyzhao and Mr. Taylor,
Thank you very much for your quick response.
Sorry for my late reply. I use DALTON on my two lab workstations and I do not use any queueing application. I just give the ( ./dalton --- ) command, either over ssh (PuTTY) or in person, and then I leave. I want to check from a remote location whether the calculation has finished. I can always ssh to my computers. Usually I check whether the dalton process is still running. So I asked whether there is a better way to find out, or some built-in functionality in Dalton.

(Alternatively, I can write a script to check whether the dalton process is running!)
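
A script of the kind mentioned here could look roughly like the sketch below. It is not a Dalton feature; it only uses standard shell tools. The idea, which is an assumption of this example rather than anything from the thread, is to record the process ID in a file when the job is started, and to test that ID later from another ssh session. The function names and the PID-file and log-file conventions are made up for illustration.

```shell
#!/bin/bash
# Sketch only: function names and the PID-file convention are illustrative.

# Start a command in the background, immune to hangup (so it survives
# logging out of ssh), and record its process ID in a file.
# Usage: start_and_record PIDFILE LOGFILE COMMAND [ARGS...]
start_and_record() {
    local pidfile="$1"
    local logfile="$2"
    shift 2
    nohup "$@" > "$logfile" 2>&1 &
    echo $! > "$pidfile"
}

# Report (via exit status) whether the recorded process is still alive.
# kill -0 sends no signal; it only tests that the process exists.
is_still_running() {
    local pidfile="$1"
    local pid
    pid=$(cat "$pidfile" 2>/dev/null) || return 1
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Example usage (the dalton arguments are placeholders):
#   start_and_record job.pid job.log ./dalton job.dal
#   ...later, from another ssh session in the same directory...
#   if is_still_running job.pid; then echo "still running"; else echo "finished"; fi
```

Process IDs can eventually be reused by the system, so a check like this is only reliable for a job started fairly recently. Without a PID file, something like "pgrep -l dalton" gives a quick list of matching processes.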

Thank you again. :-)

Re: Check running status or run queue

Posted: 29 Oct 2015, 09:35
by bast
Two ideas: you can follow the output with "tail -f your_output.out",
or you can set up a script which launches dalton and sends you an email
once it has finished. There are many solutions, but they are all
in Linux territory and not in Dalton territory.
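
The second idea might be sketched as follows. It assumes that a working "mail" command is set up on the workstation, which is not a given; if it is missing, the sketch just prints the message. The function names and the address in the usage comment are hypothetical.

```shell
#!/bin/bash
# Sketch only: function names and the mail setup are assumptions.

# Send a one-line notification. Uses the "mail" command if available,
# otherwise falls back to printing the message on standard error.
notify() {
    local subject="$1"
    local address="$2"
    if command -v mail >/dev/null 2>&1; then
        echo "$subject" | mail -s "$subject" "$address"
    else
        echo "$subject" >&2
    fi
}

# Run a command, wait for it to finish, then report its exit status.
# Usage: run_and_notify ADDRESS COMMAND [ARGS...]
run_and_notify() {
    local address="$1"
    shift
    "$@"
    local rc=$?
    notify "Job '$*' finished with exit status $rc" "$address"
    return $rc
}

# Example usage (address and dalton arguments are placeholders):
#   run_and_notify you@example.com ./dalton job.dal
```

When launching this over ssh and then logging out, it would still need to be started with nohup or inside something like screen so that it keeps running.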

Re: Check running status or run queue

Posted: 30 Oct 2015, 13:47
by taylor
Perhaps I don't understand the issue, but what more do you need than ssh'ing into your machines and doing a "ps" to see if there is a Dalton process running (or a "top" if you want performance information)? I agree with Radovan that whatever the question, the answer is more with Linux than with Dalton, but I still don't understand what more information you need/want about a running (or presumably finished) calculation?

Incidentally, it is perfectly possible to download the open-source TORQUE queueing system and build it yourself if you would prefer to have a batch system rather than just using the "dalton" runscript. SLURM is also available for download and building, but I have to say I think SLURM is overkill for a relatively small setup. At home on our compute cluster we use TORQUE, for example.

Best regards
Pete