[Jlab-scicomp-briefs] Scientific Computing Notes - brief outage this morning and upcoming maintenance

Bryan Hess bhess at jlab.org
Mon May 11 08:33:50 EDT 2020


File Server Emergency Reboot - 5/11/2020

This morning one of the Lustre file servers was rebooted to clear a problem that caused some farm jobs to fail. Farm jobs with files stored on the affected server began to stall on Sunday afternoon. The system is now running normally again, and impacted farm nodes are being brought back online.




Upcoming Planned Maintenance - 5/26/2020

There will be a planned maintenance period for the farm on Tuesday, May 26th, starting at 8am, to upgrade Lustre clients, perform file system maintenance, and patch/reboot infrastructure machines. Farm jobs will be paused by 8am on that day. New jobs will be released to run as soon as the work is completed. A reminder of this work will be sent closer to the date. Please contact us with any concerns or questions.