[Halld-offline] Stability of DSelector on batch farms
    Alexander Austregesilo 
    aaustreg at jlab.org
       
    Tue Nov 24 11:16:42 EST 2020
    
    
  
Dear Colleagues,
By default, the DSelector uses $HOME/.proof as its working directory. When 
running multiple PROOF sessions in parallel, for example on a batch 
farm, every thread of every job writes all of its intermediate files to 
the same file server, which can create stability problems. In the 
worst case, we have seen jobs stop silently, resulting in incomplete 
output trees.
To point the PROOF 'sandbox' at a local directory on the farm node, 
the corresponding ROOT environment setting must be set BEFORE starting 
the PROOF session. This can be done in a ROOT macro by issuing the 
following command:
gEnv->SetValue("ProofLite.Sandbox", "$PWD/.proof/");
The well-intended DPROOFLiteManager::Set_Sandbox("./") mechanism does 
NOT work due to the internal workings of PROOF.
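For anyone adapting their own launch macro, here is a minimal sketch of how the setting could be wrapped in a helper. The function names SandboxPath and ConfigureSandbox are hypothetical and are not part of DSelector or of the updated scripts. Instead of passing the literal "$PWD/.proof/", the sketch resolves PWD itself and falls back to "." when PWD is unset; that fallback is my assumption. The ROOT call is only compiled when TEnv.h is available.

```cpp
// Sketch only: SandboxPath() and ConfigureSandbox() are hypothetical helpers.
#include <cstdlib>
#include <string>

#if __has_include("TEnv.h")
#include "TEnv.h"            // provides gEnv when built or interpreted with ROOT
#define SKETCH_HAVE_ROOT 1
#endif

// Build "<working dir>/.proof/" from the PWD environment variable.
// Assumption: fall back to the relative path "./.proof/" if PWD is not set.
std::string SandboxPath()
{
    const char* pwd = std::getenv("PWD");
    std::string dir = (pwd && *pwd) ? pwd : ".";
    if (dir.back() != '/')
        dir += '/';
    return dir + ".proof/";
}

// Call this BEFORE the PROOF session is started; setting it later has no effect
// on a session that is already open.
void ConfigureSandbox()
{
#ifdef SKETCH_HAVE_ROOT
    gEnv->SetValue("ProofLite.Sandbox", SandboxPath().c_str());
#endif
}
```

In a launch macro, ConfigureSandbox() would be the first call, ahead of anything that opens the PROOF session.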
I have updated the launch scripts in the SVN repository, in particular
https://halldsvn.jlab.org/repos/trunk/scripts/monitoring/root_analysis/Run_Selector.C
This fix improves the stability of the DSelector jobs by orders of 
magnitude. Thanks to Naomi for drawing my attention to this problem.
Best regards,
Alex