[Sbs_software] FW: [Halla_running] Planned Lustre upgrade/outage Aug 19-21. Upcoming changes to /cache.
Andrew Puckett
puckett at jlab.org
Thu Aug 8 17:48:14 EDT 2024
FYI, this is coming up soon.
From: Halla_running <halla_running-bounces at jlab.org> on behalf of Ole Hansen via Halla_running <halla_running at jlab.org>
Reply-To: Ole Hansen <ole at jlab.org>
Date: Thursday, August 8, 2024 at 5:46 PM
To: "halla_running at jlab.org" <halla_running at jlab.org>
Subject: [Halla_running] Planned Lustre upgrade/outage Aug 19-21. Upcoming changes to /cache.
Hello everyone,
I wanted to give you a heads-up about two upcoming changes to the JLab farm systems:
(1) The Computer Center is planning to upgrade the farm's Lustre file system later this month. This will require several days of downtime. Specifically, the farm, ifarm, SWIF, /cache, /volatile, and tape storage services will be offline from August 19, 9am, to August 21, 9am.
To help with this migration, there are several things you can do:
- Don't submit farm jobs that are likely to run past August 19, 9am. While these jobs should automatically be queued to run later, it's best to avoid such conflicts as much as possible.
- Consolidate or delete small files (< 100 MB) on the Lustre disks. Consider using "tar" to pack directories full of small files. Better yet, consider moving such directories to /work. Disk performance for small files is significantly better on /work than on Lustre.
- During the week before the upgrade, limit writing large files to /cache and /volatile, except via jcache. This will shorten the final sync between the old and new filesystems.
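As a rough illustration of the small-file cleanup suggested above, here is a minimal Python sketch that lists files below the 100 MB threshold and packs a directory into a single tar archive. The function names are made up for this example, and reading "100 MB" as decimal megabytes is an assumption; it uses only the Python standard library, not any JLab tool.

```python
import os
import tarfile
from pathlib import Path

# "< 100 MB" from the announcement, read here as decimal megabytes (assumption).
SMALL_FILE_LIMIT = 100 * 1000 * 1000


def find_small_files(root, limit=SMALL_FILE_LIMIT):
    """Return sorted (path, size) pairs for regular files under root smaller than limit bytes."""
    small = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # Skip symlinks and anything that is not a regular file.
            if path.is_symlink() or not path.is_file():
                continue
            size = path.stat().st_size
            if size < limit:
                small.append((path, size))
    return sorted(small)


def pack_directory(src_dir, archive_path):
    """Pack src_dir into an uncompressed tar archive; return the names of the archived files."""
    src_dir = Path(src_dir)
    with tarfile.open(archive_path, "w") as tar:
        # arcname keeps paths inside the archive relative to the directory name.
        tar.add(src_dir, arcname=src_dir.name)
    # Re-open the archive to confirm what was actually written.
    with tarfile.open(archive_path, "r") as tar:
        return [member.name for member in tar.getmembers() if member.isfile()]
```

One would call find_small_files() on a project directory to see which subdirectories are worth packing, then pack_directory() on each of them, writing the archive outside the directory being packed. The original files should only be deleted after checking the returned list of archived names.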
This upgrade will more than double the available Lustre disk space. All project disk quotas will be scaled up proportionally. (This applies to /cache and /volatile only, not to /work.)
After the upgrade, you may want to check if all your files were transferred correctly. The old filesystem will continue to be available under /lustre19 for at least one month.
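One possible way to do such a check is sketched below in Python: it compares a directory on the old filesystem with its counterpart on the new one by relative path and size, and optionally by SHA-256 checksum. The exact directory layout under /lustre19 is not given in this announcement, so both roots are passed in as arguments; the function names are hypothetical.

```python
import hashlib
import os
from pathlib import Path


def file_index(root):
    """Map each regular file's path (relative to root) to its size in bytes."""
    root = Path(root)
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_file() and not path.is_symlink():
                index[path.relative_to(root).as_posix()] = path.stat().st_size
    return index


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(old_root, new_root, verify_content=False):
    """Report files that are missing from new_root or differ from old_root."""
    old_index = file_index(old_root)
    new_index = file_index(new_root)
    common = sorted(set(old_index) & set(new_index))
    missing = sorted(set(old_index) - set(new_index))
    size_mismatch = [rel for rel in common if old_index[rel] != new_index[rel]]
    content_mismatch = []
    if verify_content:
        # Checksums read every byte, so this is slow on large trees.
        for rel in common:
            if rel in size_mismatch:
                continue
            if sha256_of(Path(old_root) / rel) != sha256_of(Path(new_root) / rel):
                content_mismatch.append(rel)
    return {
        "missing": missing,
        "size_mismatch": size_mismatch,
        "content_mismatch": content_mismatch,
    }
```

The size-only comparison is cheap and catches missing or truncated files; the checksum pass is left optional because it has to read both copies in full.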
(2) Another important change that may affect your farm job setups: /cache will be made read-only sometime later this year (exact date TBD). This will resolve various issues that affect the write-through cache, such as handling of duplicate files and out-of-sync problems with tape. After this change, only the tape software (Jasmine) will be allowed to write to /cache. Please ensure that your SWIF scripts do not put job output on /cache, but send it to /volatile instead. Use jput to save files from /volatile to tape. (Output to /work will not require any changes.)
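To help find job scripts that may need updating, a simple scan along the following lines could flag every non-comment line that mentions a path under /cache. This is only a heuristic sketch: it cannot tell reads from writes, so each hit still needs a manual look (reading inputs from /cache is unaffected by the change). The function name and the sample paths in the usage note are invented for illustration.

```python
import re

# Matches "/cache" only as a top-level path component, e.g. "/cache/..." but
# not "/work/mycache/..." or "/volatile/x/cache/...".
CACHE_PATH = re.compile(r"(?<![\w/.-])/cache(?![\w-])")


def find_cache_references(script_text):
    """Return (line_number, line) for each non-comment line that mentions /cache."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        stripped = line.strip()
        # Ignore blank lines and shell-style comment lines (including the shebang).
        if not stripped or stripped.startswith("#"):
            continue
        if CACHE_PATH.search(line):
            hits.append((number, line.rstrip()))
    return hits
```

For a script containing the lines "cp result.root /cache/halla/out/result.root" and "INPUT=/cache/halla/raw/run1.dat", both would be reported; only the first is an output that would need to move to /volatile.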
Please send any questions or concerns to me.
Cheers,
Ole