[Halld-offline] Slurm jobs

Alexander Austregesilo aaustreg at jlab.org
Mon Feb 11 15:47:54 EST 2019

Dear Colleagues,

Recent changes in the JLab batch farm system required an update to the 
launch scripts which were introduced during the last GlueX software 


In particular, you have to svn update the following file:


The update makes sure that the directory for the log files exists 
before the job is registered with swif, which is a requirement of the new 
scheduler, slurm.
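
For anyone maintaining their own launch scripts, the same check can be sketched in a few lines of Python. This is only an illustration of the idea and not the code in the updated script: the function name ensure_log_dirs and the example file names are hypothetical, and the demo writes to a temporary directory rather than a real farm path.

```python
import os
import tempfile


def ensure_log_dirs(stdout_path, stderr_path):
    """Create the parent directories of the stdout/stderr log files.

    Hypothetical helper: call it before registering the job with swif.
    It only creates directories; the log files themselves are left for
    the scheduler to write.
    """
    for log_file in (stdout_path, stderr_path):
        parent = os.path.dirname(os.path.abspath(log_file))
        # exist_ok=True makes the call safe to repeat for every job.
        os.makedirs(parent, exist_ok=True)


if __name__ == "__main__":
    # Demo in a throwaway directory (assumed layout, not a farm path).
    base = tempfile.mkdtemp()
    out = os.path.join(base, "log", "run001", "job.out")
    err = os.path.join(base, "log", "run001", "job.err")
    ensure_log_dirs(out, err)
    print(os.path.isdir(os.path.dirname(out)))
```

Creating the directory unconditionally with exist_ok=True avoids a separate existence test and is harmless when the directory is already there.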

Please find more information about the changes below.

Best regards,


On 2/11/2019 3:44 PM, Ying Chen wrote:
> Dear halld users,
> Sorry for this late announcement of the change to our farm cluster.
> Since last week, we have quietly moved 20% of swif jobs to our new slurm 
> cluster.
> Some of the jobs failed on slurm because the log directory did not exist.
> The error message looks like this:
> Status: FAILED (Job failed due to invalid stdour/stderr)
> This is because auger-slurm does not manage (save, copy, and trim) the 
> stdout and stderr log files for the job; instead, it passes their 
> location to slurm, so slurm writes the messages to these files directly. 
> This change has a real benefit: the log files are available to the user 
> as soon as the job starts.
> By default, the job .out and .err files will be under 
> /lustre/expphy/farm_out/<user>,
> with the name pattern JOB_NAME-AUGER_JOB_ID-HOSTNAME.out and
> JOB_NAME-AUGER_JOB_ID-HOSTNAME.err. Auger will create this directory
> for any user who runs a job on slurm for the first time. But if you use 
> the <Stdout> and <Stderr> tags to change the log file location, it is the 
> user's responsibility to make sure the log directory exists; otherwise 
> the job will fail with 'invalid stdour/stderr'.
> Please check this document for detailed information on auger-slurm.
> https://scicomp.jlab.org/docs/auger_slurm
> The auger-slurm commands are installed under 
> /site/scicomp/auger-slurm/bin (will be linked to /site/bin in the 
> future) and can be accessed from ifarm or any host ...
> Let me (or Chris) know if you have any questions.
> Thanks
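
The default log location and name pattern described in the announcement above can be written out as a small Python sketch, which may help when searching for the output of a given job. The function name default_log_paths and the example user, job name, job ID, and host name are made up for illustration; only the root directory and the JOB_NAME-AUGER_JOB_ID-HOSTNAME pattern come from the announcement.

```python
from pathlib import PurePosixPath

# Default log root quoted in the announcement.
FARM_OUT_ROOT = PurePosixPath("/lustre/expphy/farm_out")


def default_log_paths(user, job_name, auger_job_id, hostname):
    """Return the (stdout, stderr) paths implied by the default pattern.

    Hypothetical helper: it only assembles strings according to
    JOB_NAME-AUGER_JOB_ID-HOSTNAME.{out,err} and does not query Auger.
    """
    stem = f"{job_name}-{auger_job_id}-{hostname}"
    base = FARM_OUT_ROOT / user
    return str(base / f"{stem}.out"), str(base / f"{stem}.err")


# Example with made-up values:
out_path, err_path = default_log_paths("jdoe", "recon", 12345, "farm1801")
print(out_path)  # /lustre/expphy/farm_out/jdoe/recon-12345-farm1801.out
print(err_path)  # /lustre/expphy/farm_out/jdoe/recon-12345-farm1801.err
```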
