[lqcd-users] December 17 Compute Cluster Power Outage
Wesley Moore
wmoore at jlab.org
Wed Dec 17 17:30:00 EST 2025
Dear Users,
Services have been partially restored, and recovery efforts are ongoing.
Current status / in-progress work:
* qcd16p and qcd18p will remain down while support services are restored.
* ifarm will be patched and rebooted at 10:00 PM.
* Farm node recovery is ongoing and will continue as systems come back online.
We will provide additional updates as progress continues and services are fully restored.
Thank you for your patience.
Best regards,
Wesley, on behalf of the Scientific Computing Operations Team
________________________________
From: Wesley Moore <wmoore at jlab.org>
Sent: Monday, December 15, 2025 1:26 PM
To: Wesley Moore via Jlab-scicomp-briefs <jlab-scicomp-briefs at jlab.org>; lqcd-users at jlab.org <lqcd-users at jlab.org>; epsci at jlab.org <epsci at jlab.org>; Wesley Moore <wmoore at jlab.org>
Subject: Re: December 17 Compute Cluster Power Outage
Dear Users,
This is a gentle reminder of the planned compute cluster outage on December 17, beginning at 6:00 AM, to support contractor work installing new PDUs. Please see the previous email for details.
Please plan accordingly and ensure any critical work is completed ahead of the outage. As always, we will prioritize restoring core production services as quickly as possible once power is restored.
Thank you for your understanding.
Best regards,
Wesley, on behalf of the Scientific Computing Operations Team
________________________________
From: Jlab-scicomp-briefs <jlab-scicomp-briefs-bounces at jlab.org> on behalf of Wesley Moore via Jlab-scicomp-briefs <jlab-scicomp-briefs at jlab.org>
Sent: Friday, December 12, 2025 9:53 AM
To: Wesley Moore via Jlab-scicomp-briefs <jlab-scicomp-briefs at jlab.org>; lqcd-users at jlab.org <lqcd-users at jlab.org>; epsci at jlab.org <epsci at jlab.org>
Subject: [Jlab-scicomp-briefs] December 17 Compute Cluster Power Outage
Dear Users,
On December 17th, there will be a power outage affecting all compute clusters to allow a contractor to install new PDUs. The outage window will begin at 6:00 AM. Due to the nature of the work, we can only provide a general timeframe, and we expect the interruption to last most of the day. Full recovery time is still to be determined.
As always, we will prioritize restoring core production services first so jobs can resume as quickly as possible.
Affected services include, but are not limited to:
* General compute cluster availability (Farm & QCD)
* JupyterHub
* Tape library
Systems expected to be unaffected by the outage:
* Interactives (ifarm & qcdi)
* Lustre filesystems
* Work disk filesystems
* code.jlab.org and associated runners
To reduce the impact on users, we will keep Maintenance Day (December 16th) very light to help ensure no user-facing disruptions to running jobs ahead of the outage.
We will provide updates as more information becomes available.
Best regards,
Wesley, on behalf of the Scientific Computing Operations Team