[Halld-offline] Stability of DSelector on batch farms
Alexander Austregesilo
aaustreg at jlab.org
Tue Nov 24 11:16:42 EST 2020
Dear Colleagues,
By default, the DSelector uses $HOME/.proof as its working directory.
When multiple PROOF sessions run in parallel, for example on a batch
farm, every thread of every job writes all of its intermediate files to
the same file server, which can cause stability problems. In the worst
case, we have seen jobs stop silently, resulting in incomplete output
trees.
To move the PROOF 'sandbox' to the local directory on the farm node,
the corresponding ROOT environment variable must be set BEFORE the
PROOF session is started. This can be done in a ROOT macro by issuing
the following command:
gEnv->SetValue("ProofLite.Sandbox", "$PWD/.proof/");
The well-intended DPROOFLiteManager::Set_Sandbox("./") mechanism does
NOT work due to the internal workings of PROOF.
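
For illustration, the ordering described above could look roughly like the
macro below. This is only a sketch under stated assumptions, not the content
of Run_Selector.C: the names SandboxDir and RunSelectorSketch are
hypothetical, the session is opened with plain ROOT calls (TProof::Open and
TChain::Process) rather than through DPROOFLiteManager, and the include guard
exists only so that the file also compiles where the ROOT headers are absent.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build "<base>/.proof/" for a job-local directory.
std::string SandboxDir(const std::string& base) {
  assert(!base.empty() && "base directory must not be empty");
  std::string dir = base;
  if (dir.back() != '/') dir += '/';   // avoid a missing or doubled slash
  return dir + ".proof/";
}

// The ROOT-dependent part is compiled only where the ROOT headers are found.
#if defined(__has_include)
#if __has_include("TEnv.h")
#include "TChain.h"
#include "TEnv.h"
#include "TProof.h"
#include "TString.h"
#include "TSystem.h"

// Hypothetical launcher: all arguments are placeholders chosen by the caller.
void RunSelectorSketch(const char* inputFile, const char* treeName,
                       const char* selector, int nWorkers = 4) {
  // 1) Point the PROOF-Lite sandbox at the job's working directory.
  //    This has to happen BEFORE the PROOF session is opened.
  const std::string sandbox = SandboxDir(gSystem->WorkingDirectory());
  gEnv->SetValue("ProofLite.Sandbox", sandbox.c_str());

  // 2) Only now start the PROOF-Lite session.
  TProof* proof = TProof::Open(TString::Format("workers=%d", nWorkers).Data());
  if (!proof) return;  // session could not be started

  // 3) Process the tree through PROOF with the given selector.
  TChain chain(treeName);
  chain.Add(inputFile);
  chain.SetProof();
  chain.Process(selector);
}
#endif
#endif
```

The helper only makes the path explicit; passing "$PWD/.proof/" directly, as
in the command above, serves the same purpose. The essential point is that
step 1 comes before step 2.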
I have updated the launch scripts in the svn repository, in particular:
https://halldsvn.jlab.org/repos/trunk/scripts/monitoring/root_analysis/Run_Selector.C
This fix improves the stability of the DSelector jobs by orders of
magnitude. Thanks to Naomi for drawing my attention to this problem.
Best regards,
Alex