<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p>Folks,</p>
<p>I was talking to Ying Chen of SciComp yesterday. A couple of
points came up that I think I should pass along.</p>
<ul>
<li><b>Our use of our </b><b>volatile space</b><b> is very light</b>.
There are two classes of deletions performed by the <a moz-do-not-send="true" href="https://scicomp.jlab.org/docs/volatile_disk_pool">deletion
algorithm</a>: one by access age of files and the other by
modification age. The access-age threshold is six months: a file
that has not been read for longer than that will be deleted. The
modification-age criteria are more complicated, but they depend on
the level of usage. A small disk footprint will allow a group to evade this
class of deletions. It turns out we have been evading it since July
of 2018. So for years now, unread files have lasted six months,
and any file that has been read at least occasionally since then
is still there. This is not bad, but we could be using volatile more
heavily while still maintaining useful file lifetimes. You can
see usage for all Halls <a moz-do-not-send="true" href="https://scicomptest.jlab.org/scicomp/volatileDisk">here</a>.</li>
<ul>
<li>I should note that those who have been using volatile seem
to be cleaning up after themselves (the downward steps in <a moz-do-not-send="true" href="https://halldweb.jlab.org/disk_management/volatile_disk_2.png">this
plot</a>) once the data are no longer needed or have been
archived. This practice really extends file lifetimes.<br>
</li>
</ul>
<li><b>There are many "small" files on the cache disk</b>, meaning
those smaller than 1 MB, compared with the <a moz-do-not-send="true" href="https://scicomptest.jlab.org/scicomp/cacheDisk/project">count
of such files</a> generated by the other Halls. This is not
too bad either, but...</li>
<ul>
<li>Reading and writing small files can put a high load on
Lustre because of the high overhead for any file access.</li>
<li>Small files are not archived to tape as one would normally
expect, because of the high overhead for writing to tape. So
these files are at risk of loss in case of hardware failure.<br>
</li>
<li>Small files are not tracked in the database used to manage
the disk, so it is harder to track where they are and who is
generating them.</li>
<li>We need to identify the use cases that are generating these
files.</li>
</ul>
</ul>
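<p>For anyone who wants a first look at their own directories, here
is a rough sketch in Python of the kind of scan that could help with
both points above. It walks a directory tree and tallies, per owner
UID, the files smaller than 1 MB and the files not read for about
six months. The function name <code>scan_tree</code>, the decimal
definition of 1 MB, and the 182-day cutoff are my own assumptions for
illustration; this is not how SciComp's deletion algorithm or disk
database actually does its bookkeeping.</p>

```python
import os
import stat
import time
from collections import Counter

# Assumption: "small" means under 1 MB, taken here as 1,000,000 bytes.
SMALL_FILE_BYTES = 1_000_000
# Assumption: "six months" approximated as 182 days.
ACCESS_AGE_DAYS = 182


def scan_tree(top, now=None):
    """Walk `top` and count, per owner UID, small files and long-unread files.

    Returns two Counters keyed by UID: (small_files, unread_files).
    """
    now = time.time() if now is None else now
    cutoff = now - ACCESS_AGE_DAYS * 86400
    small = Counter()
    unread = Counter()
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # only regular files are counted
            if st.st_size < SMALL_FILE_BYTES:
                small[st.st_uid] += 1
            if st.st_atime < cutoff:
                unread[st.st_uid] += 1
    return small, unread
```

<p>Calling <code>small, unread = scan_tree("/path/to/your/directory")</code>
(a placeholder path) and then <code>small.most_common(10)</code> would
list the UIDs owning the most small files. A full walk itself touches
every file's metadata, so it is best run on a subdirectory rather than
on a whole disk.</p>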
<p> -- Mark</p>
</body>
</html>