[Halld-offline] Stability of DSelector on batch farms

Alexander Austregesilo aaustreg at jlab.org
Tue Nov 24 11:16:42 EST 2020


Dear Colleagues,

By default, the DSelector uses $HOME/.proof as its working directory.
When running multiple PROOF sessions in parallel, for example on a batch
farm, every thread of every job writes all of its intermediate files to
the same file server, which can cause stability problems. In the worst
case, we have seen jobs stop silently, resulting in incomplete output
trees.

To set the PROOF 'sandbox' to the local directory on the farm node, the
corresponding ROOT environment variable must be set BEFORE starting the
PROOF session. This can be done in a ROOT macro by issuing the following
command:

gEnv->SetValue("ProofLite.Sandbox", "$PWD/.proof/");

The well-intended DPROOFLiteManager::Set_Sandbox("./") mechanism does
NOT work, due to the internal workings of PROOF.
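
For macros that should not rely on $PWD being defined or expanded, the
sandbox path can also be built explicitly. The following is only a
minimal sketch, assuming a C++17 compiler; MakeSandboxPath is a
hypothetical helper name and is not part of ROOT or of the DSelector
code:

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper (not part of ROOT or the DSelector libraries):
// returns the absolute path of a ".proof/" directory inside `base`.
// By default `base` is the current working directory, which on a farm
// node is assumed to be the job's local working directory.
std::string MakeSandboxPath(
    const std::filesystem::path& base = std::filesystem::current_path())
{
    // Make the path absolute so it no longer depends on later
    // directory changes, then append the sandbox directory name.
    std::filesystem::path sandbox = std::filesystem::absolute(base) / ".proof";

    // Keep the trailing slash used in the gEnv setting quoted above.
    return sandbox.string() + "/";
}
```

In a ROOT macro, the result would then be passed to the same setting as
above, again before the PROOF session is started:
gEnv->SetValue("ProofLite.Sandbox", MakeSandboxPath().c_str());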


I have updated the launch scripts in the SVN repository, in particular:

https://halldsvn.jlab.org/repos/trunk/scripts/monitoring/root_analysis/Run_Selector.C


This fix improves the stability of the DSelector jobs by orders of
magnitude. Thanks to Naomi for drawing my attention to this problem.

Best regards,

Alex
