[Halla_running] Planned Lustre upgrade/outage Aug 19-21. Upcoming changes to /cache.

Ole Hansen ole at jlab.org
Thu Aug 8 17:46:01 EDT 2024


Hello everyone,

I wanted to give you a heads-up about two upcoming changes to the JLab 
farm systems:

(1) The Computer Center is planning to upgrade the farm's Lustre file 
system later this month. This will require several days of downtime. 
Specifically, the farm, ifarm, SWIF, /cache, /volatile, and tape storage 
services will be offline from August 19, 9am, to August 21, 9am.

To help with this migration, there are several things you can do:

- Don't submit farm jobs that are likely to run past August 19, 9am. 
While these jobs should automatically be queued to run later, it's best 
to avoid such conflicts as much as possible.

- Consolidate or delete small files (< 100 MB) on the Lustre disks. 
Consider using "tar" to pack directories full of small files. Better 
yet, consider moving such directories to /work. Disk performance for 
small files is significantly better on /work than on Lustre.

- During the week before the upgrade, limit writing large files to 
/cache and /volatile, except via jcache. This will reduce the time 
needed for the final sync of the old and new filesystems.
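
As an illustration of the small-file cleanup suggested above, here is a 
minimal Python sketch, not an official tool. It lists regular files below 
the 100 MB threshold and packs a directory into a single uncompressed tar 
archive. The function names and the decimal reading of "100 MB" are 
assumptions; the originals are left in place, so delete them only after 
checking the archive.

```python
import os
import tarfile

# Assumption: "100 MB" is read as 100 * 10^6 bytes.
SMALL_FILE_LIMIT = 100 * 1000 * 1000


def find_small_files(root, limit=SMALL_FILE_LIMIT):
    """Yield (path, size) for regular files under root smaller than limit bytes."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            if size < limit:
                yield path, size


def pack_directory(directory, archive_path):
    """Pack directory into an uncompressed tar archive and return its path.

    The original files are not removed.
    """
    top = os.path.basename(os.path.normpath(directory))
    with tarfile.open(archive_path, "w") as tar:
        tar.add(directory, arcname=top)
    return archive_path
```

Write the archive outside the directory being packed, so that it does not 
end up inside itself.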

This upgrade will more than double the available Lustre disk space. All 
project disk quotas will be scaled up proportionally. (This applies to 
/cache and /volatile only, not to /work.)

After the upgrade, you may want to check if all your files were 
transferred correctly. The old filesystem will continue to be available 
under /lustre19 for at least one month.
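
One possible way to do such a check is sketched below in Python. It walks 
a directory on the old filesystem and reports files that are missing or 
differ in size (optionally in SHA-256 checksum) in the corresponding new 
directory. The function names are hypothetical, and old_root and new_root 
are placeholders: the exact directory layout under /lustre19 is not 
specified here, so supply the matching pair of paths yourself. Checksums 
read every byte of both copies and can be slow for large trees.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(old_root, new_root, checksum=False):
    """Compare files under old_root with their counterparts under new_root.

    Returns (missing, mismatched): sorted lists of paths relative to
    old_root. Sizes are always compared; contents only if checksum=True.
    """
    missing, mismatched = [], []
    for dirpath, _dirnames, filenames in os.walk(old_root):
        for name in filenames:
            old_path = os.path.join(dirpath, name)
            rel = os.path.relpath(old_path, old_root)
            new_path = os.path.join(new_root, rel)
            if not os.path.isfile(new_path):
                missing.append(rel)
            elif os.path.getsize(old_path) != os.path.getsize(new_path):
                mismatched.append(rel)
            elif checksum and file_digest(old_path) != file_digest(new_path):
                mismatched.append(rel)
    return sorted(missing), sorted(mismatched)
```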


(2) Another important change that may affect your farm job setups: 
/cache will be made read-only sometime later this year (exact date TBD). 
This will resolve various issues that affect the write-through cache, 
such as handling of duplicate files and out-of-sync problems with tape. 
After this change, only the tape software (Jasmine) will be allowed to 
write to /cache. Please ensure that your SWIF scripts do not put job 
output on /cache, but send it to /volatile instead. Use jput to save 
files from /volatile to tape. (Output to /work will not require any 
changes.)
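
To get a quick overview of which job outputs would be affected, a small 
helper like the following Python sketch could be used. It only flags 
paths that lie under /cache; how you extract the output paths from your 
SWIF scripts is up to you, and the function names and the example paths 
are hypothetical.

```python
import posixpath


def is_under(path, top):
    """Return True if path equals top or lies below it (purely lexical check)."""
    path = posixpath.normpath(path)
    top = posixpath.normpath(top)
    return path == top or path.startswith(top + "/")


def outputs_on_cache(output_paths, cache_root="/cache"):
    """Return the output paths that point into cache_root."""
    return [p for p in output_paths if is_under(p, cache_root)]


if __name__ == "__main__":
    # Hypothetical example paths, for illustration only.
    outputs = ["/cache/halla/myexp/run1.root", "/volatile/halla/myexp/run1.root"]
    for path in outputs_on_cache(outputs):
        print("writes to /cache, move to /volatile:", path)
```

The check is lexical only and does not resolve symbolic links.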


Please send any questions or concerns to me.

Cheers,

Ole
