[Halld-offline] MCwrapper-bot issues
Thomas Britton
tbritton at jlab.org
Fri Aug 2 15:12:35 EDT 2019
Dear collaborators,
Over the last two weeks MCwrapper-bot has struggled to produce MC. There were two causes. The first affected the use of our Singularity container. Once that was remedied, all submitted jobs began failing immediately at UConn. Since UConn typically makes up a large chunk of our OSG cycles, the vast majority of submitted jobs failed. There is a mechanism to "hold" jobs that fail more than a set number of times, and almost all jobs in all submitted projects triggered this condition. Being out of town, I was unable to manually override this switch, and the decision was eventually made to blacklist UConn until the problem was remedied. This allowed jobs to run, although at a lower rate and only when I manually prompted them. Richard, Edgar, and I investigated, and the problems seem to have been resolved (thanks to a patch by Edgar) as of late Thursday afternoon. This delayed many projects by more than a week. Many projects have now finished, and a couple are still slowly working their way through.
Sorry for the delay.
Thomas