[Halld-offline] Fwd: FYI disk fileservers

Mark Ito marki at jlab.org
Fri Nov 3 14:55:22 EDT 2017


from Graham Heyes, explaining the recent history of disk use and disk 
procurements...



-------- Forwarded Message --------
Subject: 	FYI disk fileservers
Date: 	Thu, 2 Nov 2017 12:53:04 -0400
From: 	Graham Heyes <heyes at jlab.org>
To: 	Mark Ito <marki at jlab.org>, Ole Hansen <ole at jlab.org>, Brad Sawatzky 
<brads at jlab.org>, Harut Avakian <avakian at jlab.org>



Someone outside yesterday's meeting had a question about disk space, so I 
wrote down the story so far. I thought it was a useful enough summary 
that I would send it to you in case you would like to share it with your 
people too.

Last year we identified the “work” filesystem as an area in need of 
improvement. By its very nature a “work” filesystem can contain a large 
number of small files that churn as people run jobs, compile code, etc. 
Unfortunately the Lustre-based system is not very efficient in this 
mode. What users see as the “work” filesystem is a virtual space carved 
out of a larger filesystem that also provides /cache and /volatile for 
the farm, plus space paid for by LQCD. The non-work areas are machine 
managed, using algorithms that automatically free space by migrating old 
or little-used files to tape. Conversely, /work is “human managed” using 
quotas. A problem with this is that growth in the work area comes at the 
expense of /cache and /volatile, which gradually shrink. Also, accessing 
many small files simultaneously in /work hurts performance for /cache, 
/volatile and the LQCD users.

The solution to the problem was to buy a new file server specifically 
designed for use as “work”. Due to funding constraints in FY17, we bought 
a system that was not fully populated with drives and controllers, with 
the plan to expand it at a later date. This minimal system was designed 
to meet the needs of FY18 as understood at the time. Halls A/C offered to 
add funds of their own to aid the procurement in return for a “larger 
slice of the pie”. Normally I buy disk out of my budget, but I was glad 
of the one-time help. I spoke with Rolf and we agreed that, for many 
reasons, we do not want the halls buying their own disk to add to the 
farm. The model is that the halls provide requirements and Rolf pays, 
via me, for IT to procure something that meets those requirements. One 
reason for this is that we typically add disk space in large chunks to 
keep the cost per terabyte down, so we don’t want to buy piecemeal.

Here is a summary of where we are:

    Currently on Lustre: work = 170 TB, cache = 400 TB, volatile = 165 TB,
    for a total of 735 TB.

    Note that ENP only bought 690 TB; the 45 TB difference is “on
    loan” from LQCD.

    The FY17 procurement is a high-performance server optimized to be
    /work, with 144 TB usable.
    We asked all the halls to temporarily reduce their use of /work to
    fit on the new server. After some pushback, we are allowing some
    rarely used data to stay on Lustre and are commencing the move to
    the new server.

    The FY18 procurement, which will be installed in January, adds 216 TB.

    So, after January we will have 360 TB of high-performance /work on
    the new server, compared with 170 TB of /work now under Lustre:
    roughly double the space, on higher-performance hardware.
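
As a quick recap of the arithmetic above, here is a small Python tally of 
the capacity figures quoted in this message. The numbers are taken 
directly from the summary; this is illustrative only, not an official 
capacity plan.

# Tally of the figures quoted above (illustrative only).
lustre = {"work": 170, "cache": 400, "volatile": 165}  # TB, current Lustre carve-out
total = sum(lustre.values())
print(total)          # 735 TB, versus the 690 TB ENP bought -> 45 TB on loan from LQCD
new_work = 144 + 216  # TB: FY17 server plus the FY18 January expansion
print(new_work)       # 360 TB of /work, roughly double the current 170 TB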


The space freed up on Lustre will be added to /cache and /volatile. As 
data processing ramps up, whether on or off site, we will need more 
/cache to stage the data being processed. Based on requirement 
projections, I expect this to exceed what we own, so I have asked Chip to 
prepare another REQ to add 250 TB to /cache and /volatile, which will 
expand them to almost 1 PB.
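
A rough check of the “almost 1 PB” figure, assuming the 170 TB of /work 
freed on Lustre is folded into /cache and /volatile along with the new 
250 TB (a sketch based on the numbers above, not an official breakdown):

# Rough check of the "almost 1 PB" figure (assumes the freed /work space
# on Lustre is folded into /cache and /volatile; illustrative only).
cache, volatile, freed_work, new_req = 400, 165, 170, 250  # TB
print(cache + volatile + freed_work + new_req)             # 985 TB, just under 1 PB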

I am looking closely at the disk requirements that we presented at the 
last computing review and am asking everyone to update them in light of 
additional experience in recent months. In particular, I want to be sure 
that we really do need to add the 250 TB of cache and, if we do, whether 
it will be enough for the long term or whether we will need more.

I hope that this is useful!

Regards,
Graham



