[Clas_offline] Fwd: ENP consumption of disk space under /work

Harut Avakian avakian at jlab.org
Thu Jun 1 11:01:24 EDT 2017




Dear All,

As you can see from the e-mail below, keeping all our work disk space
requires some additional funding.

Option 3 will inevitably impact farm operations, since it removes ~20%
of the space from Lustre.

We can also choose something between options 1) and 3).

Please review the content of /work/clas and move at least 75% of it
to either /cache or /volatile.

The current Hall-B usage includes:

550G    hallb/bonus
1.5T    hallb/clase1
3.6T    hallb/clase1-6
3.3T    hallb/clase1dvcs
2.8T    hallb/clase1dvcs2
987G    hallb/clase1f
1.8T    hallb/clase2
1.6G    hallb/clase5
413G    hallb/clase6
2.2T    hallb/claseg1
3.9T    hallb/claseg1dvcs
1.2T    hallb/claseg3
4.1T    hallb/claseg4
2.7T    hallb/claseg5
1.7T    hallb/claseg6
367G    hallb/clas-farm-output
734G    hallb/clasg10
601G    hallb/clasg11
8.1T    hallb/clasg12
2.4T    hallb/clasg13
2.4T    hallb/clasg14
28G    hallb/clasg3
5.8G    hallb/clasg7
269G    hallb/clasg8
1.2T    hallb/clasg9
1.3T    hallb/clashps
1.8T    hallb/clas-production
5.6T    hallb/clas-production2
1.4T    hallb/clas-production3
12T    hallb/hps
13T    hallb/prad
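
For anyone who wants to check the totals, here is a minimal Python sketch
that sums a listing like the one above and shows how much would remain
after moving 75%. It is only an illustration, not a scicomp tool: the
function names are made up, and it assumes the sizes are `du -h` style
with 1024-based suffixes.

```python
import re

# Multipliers for du-style suffixes (assumption: `du -h`, powers of 1024).
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a size such as '1.5T' or '550G' to bytes."""
    match = re.fullmatch(r"([0-9.]+)([KMGT])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def summarize(listing):
    """Return (total_bytes, entries sorted largest first) for a listing."""
    entries = []
    for line in listing.splitlines():
        if not line.strip():
            continue  # skip blank lines
        size, path = line.split(None, 1)
        entries.append((parse_size(size), path.strip()))
    entries.sort(reverse=True)
    total = sum(size for size, _ in entries)
    return total, entries


if __name__ == "__main__":
    # Three of the directories from the list above, as sample input.
    sample = "8.1T    hallb/clasg12\n12T    hallb/hps\n13T    hallb/prad\n"
    total, entries = summarize(sample)
    tib = 1024**4
    print(f"total: {total / tib:.1f}T")
    print(f"left after moving 75%: {0.25 * total / tib:.1f}T")
    for size, path in entries:
        print(f"{size / tib:6.2f}T  {path}")
```

Pasting the full list into `sample` gives the largest directories first,
which are the natural places to start cleaning up.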


Regards,

Harut

P.S. We have had crashes a few times, and they may happen again in the
future, so keeping important files in /work is not recommended.
The lists of lost files are in /site/scicomp/lostfiles.txt and
/site/scicomp/lostfiles-jan-2017.txt



-------- Forwarded Message --------
Subject: 	ENP consumption of disk space under /work
Date: 	Wed, 31 May 2017 10:35:51 -0400
From: 	Chip Watson <watson at jlab.org>
To: 	Sandy Philpott <philpott at jlab.org>, Graham Heyes <heyes at jlab.org>, 
Ole Hansen <ole at jlab.org>, Harut Avakian <avakian at jlab.org>, Brad 
Sawatzky <brads at jlab.org>, Mark M. Ito <marki at jlab.org>



All,

As I have started on the procurement of the new /work file server, I
have discovered that Physics' use of /work has grown unrestrained over
the last year or two.

"Unrestrained" because there is no way under Lustre to restrain it
except via a very unfriendly Lustre quota system.  Because we leave some
quota headroom to accommodate large swings in usage for each hall for
cache and volatile, /work continues to grow.

Total /work has now reached 260 TB, several times larger than I was
anticipating.  This constitutes more than 25% of Physics' share of
Lustre, compared to LQCD which uses less than 5% of its disk space on
the un-managed /work.

It would cost Physics an extra $25K (total $35K - $40K) to treat the 260
TB as a requirement.

There are 3 paths forward:

(1) Physics cuts its use of /work by a factor of 4-5.
(2) Physics increases funding to $40K.
(3) We pull a server out of Lustre, decreasing Physics' share of the
system, and use that as half of the new active-active pair, beefing it
up with SSDs and perhaps additional memory; this would actually shrink
Physics' near-term costs, but puts higher pressure on the file system
for the farm.

The decision is clearly Physics', but I do need a VERY FAST response to
this question, as I need to move quickly now for LQCD's needs.

Hall D + GlueX, 96 TB
CLAS + CLAS12, 98 TB
Hall C, 35 TB
Hall A, <unknown, still scanning>

Email, call (x7101), or drop by today 1:30-3:00 p.m. for discussion.

thanks,
Chip
