[Halld-offline] info on volatile, cache, and Hall D

Mark Ito marki at jlab.org
Wed May 26 10:07:02 EDT 2021


Folks,

I was talking to Ying Chen of SciComp yesterday. A couple of points came 
up that I think I should pass along.

  * *Our use of our **volatile space** is very light*. There are two
    classes of deletions performed by the deletion algorithm
    <https://scicomp.jlab.org/docs/volatile_disk_pool>: one by access
    age of files and the other by modification age. The access-age
    threshold is six months; a file that has not been accessed for
    longer than that will be deleted. The modification-age criteria
    are more complicated and depend on the level of usage. A small
    disk footprint allows a group to evade this class of deletions.
    It turns out we have been evading it since July of 2018. So for
    years now, unread files have been living for six months, and
    files that have been read at least occasionally since that time
    are still there. This is not bad, but we could be using
    volatile more heavily while still maintaining useful file lifetimes.
    You can see usage for all Halls here
    <https://scicomptest.jlab.org/scicomp/volatileDisk>.
      o I should note that those who have been using volatile seem to
        be cleaning up after themselves (the downward steps in this plot
        <https://halldweb.jlab.org/disk_management/volatile_disk_2.png>)
        once the data are no longer needed or have been archived. This
        practice really extends file lifetimes.
  * *There are many "small" files on the cache disk*, meaning files
    smaller than 1 MB, when judged against the count of such files
    <https://scicomptest.jlab.org/scicomp/cacheDisk/project> generated
    by the other Halls. This is not too bad either, but...
      o Reading and writing small files can put a high load on Lustre
        because of the high overhead for any file access.
      o Small files are not archived to tape as one would normally
        expect, because of the high overhead for writing to tape. So
        these files are at risk of loss in case of hardware failure.
      o Small files are not tracked in the database used to manage the
        disk, so it is harder to track where they are and who is
        generating them.
      o We need to identify the use cases that are generating these files.
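
As a possible starting point for finding where small files come from, and
for spotting files that are approaching the access-age threshold, here is a
minimal Python sketch. It is not a SciComp tool and does not reproduce the
actual deletion algorithm. The function name scan_tree, the reading of
"1 MB" as 1 MiB, and the approximation of "six months" as 180 days are all
assumptions made for illustration.

```python
import os
import stat
import time
from collections import Counter

SMALL_FILE_BYTES = 1024 * 1024  # assumption: "1 MB" taken as 1 MiB
ACCESS_AGE_DAYS = 180           # assumption: "six months" approximated as 180 days


def scan_tree(root, now=None):
    """Walk root and return (small_counts, stale_files).

    small_counts: Counter keyed by (owner uid, directory) giving the number
                  of regular files smaller than SMALL_FILE_BYTES
    stale_files:  list of regular files whose last access time is older
                  than ACCESS_AGE_DAYS
    """
    now = time.time() if now is None else now
    cutoff = now - ACCESS_AGE_DAYS * 86400
    small_counts = Counter()
    stale_files = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, and other non-regular files
            if st.st_size < SMALL_FILE_BYTES:
                small_counts[(st.st_uid, dirpath)] += 1
            if st.st_atime < cutoff:
                stale_files.append(path)
    return small_counts, stale_files
```

Calling scan_tree on a directory and looking at small_counts.most_common(10)
would show which owner and directory combinations hold the most small files.
Note that a scan like this stats every file, so it adds metadata load of its
own on Lustre; it is probably best pointed at one subdirectory at a time.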

   -- Mark

