[Clas12_software] jlab batch job memory requests
Nathan Baltzell
baltzell at jlab.org
Tue Feb 7 11:05:05 EST 2023
FYI Everyone,
We've raised the memory efficiency of batch job requests at CLAS12 collaboration and software meetings in previous years. When many jobs request far more memory than they actually use, the farm can sit unnecessarily idle and throughput drops significantly for everyone.
*** And now Scicomp has a larger initiative to improve farm efficiency, which includes contacting people running memory-inefficient jobs and potentially throttling their jobs if no action is taken. ***
You can check metrics of your batch jobs at:
https://scicomp.jlab.org
Under 'Slurm Jobs' (left sidebar) there is a search feature at 'Jobs Query' (top), along with 'Recent Jobs' (top) and 'Memory Efficiency' (top).
Before launching a large number of a new type of job, you can measure how much memory the jobs use, for example by submitting a couple of jobs and checking that website, or by running the job interactively and checking htop, ps, or other system utilities. Then set your SLURM/SWIF job memory request accordingly.
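As one possible way to do the interactive measurement, here is a minimal Python sketch that runs a command as a child process and reports its peak resident memory through the standard-library `resource` module. The helper names `peak_memory_mb` and `suggested_request_mb`, the 20% headroom, and the rounding to 100 MB are illustrative assumptions, not farm policy or a Scicomp recommendation. It also assumes Linux, where `ru_maxrss` is reported in kilobytes, and it reports the largest single child process rather than the sum over processes running concurrently.

```python
import math
import resource
import subprocess
import sys


def peak_memory_mb(command):
    """Run `command` to completion and return the peak resident memory, in MB,
    of the largest child process waited for so far (hypothetical helper)."""
    subprocess.run(command, check=True)
    # Assumption: Linux, where ru_maxrss is in kilobytes (macOS reports bytes).
    peak_kb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return peak_kb / 1024.0


def suggested_request_mb(peak_mb, headroom=0.2, round_to_mb=100):
    """Pad the measured peak and round up to a multiple of `round_to_mb`.
    The default headroom and rounding are illustrative assumptions."""
    padded = peak_mb * (1.0 + headroom)
    return int(math.ceil(padded / round_to_mb) * round_to_mb)


# Example: a trivial Python child process stands in for a real job command.
peak = peak_memory_mb([sys.executable, "-c", "x = [0] * 10_000_000"])
print(f"peak: {peak:.0f} MB, suggested request: {suggested_request_mb(peak)} MB")
```

Replace the stand-in command with the job's actual command line, and treat the result as a starting point to compare against the metrics on the website rather than a definitive number.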
Note, standard CLAS12 simulation jobs (gemc plus recon-util) require less than 1.7 GB of memory.
-Nathan