[Frost] Fwd: [Clas_offline] Fwd: ENP consumption of disk space under /work
Eugene Pasyuk
pasyuk at jlab.org
Fri Jun 9 16:59:41 EDT 2017
Hi Franz,
Sure, that's why I asked people to tell me what they are using NOW. If there is something no one claims, I'll delete it.
And, yes, please clean up this old stuff. All the monitoring stuff from the cooking should already be on tape and could be retrieved if needed.
-Eugene
----- Original Message -----
> From: "Franz Klein" <fklein at jlab.org>
> To: "Eugene Pasyuk" <pasyuk at jlab.org>
> Cc: "frost" <frost at jlab.org>
> Sent: Friday, June 9, 2017 4:49:05 PM
> Subject: Re: [Frost] Fwd: [Clas_offline] Fwd: ENP consumption of disk space under /work
> Eugene,
> I think you should look a bit more into the directory trees.
> clasg9 owns quite a lot, partly from the calibration and cooking phases, but also
> because users ran their code as 'clasg9' (e.g. the g9b/ directory by Ross: 148GB).
>
> If there is no complaint, I will clean up at least all old (pass0) stuff plus
> log & histogram files in g9frost/ plus previous_constants/ plus sc_calib/
>
> Cheers,
> Franz
>
> ----- Original Message -----
> From: "Eugene Pasyuk" <pasyuk at jlab.org>
> To: "frost" <frost at jlab.org>
> Sent: Friday, June 9, 2017 4:14:50 PM
> Subject: [Frost] Fwd: [Clas_offline] Fwd: ENP consumption of disk space under
> /work
>
> Folks,
>
> The general work disk has grown to an unexpected size. The computer center wants us to
> shrink it. See messages below.
> clasg9 is not the biggest offender. Out of almost 100 TB used by Hall B we have
> only 1.2 TB. However, we should keep only the things that are used on a regular
> basis. The work disk is not meant to be used for permanent storage and it is
> not backed up.
> Here is the list of the directories there with ownership:
>
> -rw-r--r-- 1 6348 451 Nov 25 2016 76runs.list
> drwxrwsr-x 3 clasg9 4096 Jun 1 2010 TAG
> drwxr-sr-x 7 natalie 4096 Aug 6 2013 bin
> -rw-rw-r-- 1 clasg9 1361489920 Oct 17 2016 clasg9_g9a.tar
> drwxrwsr-x 3 clasg9 4096 Feb 15 00:49 clasweb_monitor
> drwxrwsr-x 4 clasg9 4096 Mar 10 2010 cooklinks
> drwxr-sr-x 2 crede 4096 Aug 16 2014 crede
> drwxrwsr-x 2 clasg9 4096 Oct 27 2011 dcalign
> drwxrwsr-x 3 dschott 4096 Jul 31 2014 dschott
> drwxr-sr-x 5 dugger 4096 May 26 10:54 dugger
> drwxr-s--- 2 fklein 4096 Apr 19 11:30 fklein
> drwxrwsr-x 3 clasg9 4096 Oct 19 2012 g9a
> drwxrwsr-x 3 clasg9 4096 Oct 21 2012 g9b
> -rwxr-xr-x 1 6890 474 Nov 8 2010 g9b_sub_script.pl
> drwxrwsr-x 8 clasg9 4096 Dec 4 2012 g9frost
> drwxr-sr-x 3 6659 4096 Jan 31 2011 mcandrew
> drwxrwsr-x 3 natalie 36864 May 4 10:43 natalie
> drwxrwsr-x 4 clasg9 4096 Aug 14 2014 pass1
> drwxr-sr-x 4 6890 4096 May 19 2010 previous_constants
> -rwxrwxrwx 1 clasg9 15539 Nov 11 2010 runs_g9b_pass0
> -rwxrwxrwx 1 6890 15563 Nov 8 2010 runs_g9b_pass0~
> drwxrwsrwx 2 clasg9 20480 Feb 15 01:14 sc_calib
> drwxr-sr-x 2 sfegan 73728 Feb 15 00:49 sfegan
> drwxrwsr-x 3 supark 4096 Aug 8 2012 skpark
> drwxr-sr-x 7 7216 4096 Jan 21 2016 yuqing
> drwxrwsr-x 8 clasg9 4096 Sep 19 2015 zana
>
> Here is the space taken by these directories:
>
> 129K 76runs.list
> 6.9M TAG
> 191M bin
> 1.3G clasg9_g9a.tar
> 363M clasweb_monitor
> 137M cooklinks
> 4.0K crede
> 261K dcalign
> 1.6G dschott
> 4.2G dugger
> 1.9G fklein
> 2.8G g9a
> 148G g9b
> 129K g9b_sub_script.pl
> 578G g9frost
> 8.0K mcandrew
> 131G natalie
> 18G pass1
> 121G previous_constants
> 129K runs_g9b_pass0
> 129K runs_g9b_pass0~
> 53G sc_calib
> 34G sfegan
> 112M skpark
> 33G yuqing
> 4.7G zana
>
> Most of the files have not been used for ages.
> Please review and let me know what is being used NOW and should be kept.
> Unclaimed files will be deleted.
> If there is something that you don't use now but plan to use in the distant future and
> that is not easily recreatable, consider putting it on tape.
> Please do it ASAP.
>
> Thanks,
>
> -Eugene
>
>
> From: "Harut Avagyan" <avakian at jlab.org>
> To: "clas offline" <clas_offline at jlab.org>
> Sent: Thursday, June 1, 2017 11:01:24 AM
> Subject: [Clas_offline] Fwd: ENP consumption of disk space under /work
>
> Dear All,
>
> As you can see from the e-mail below, keeping all our work disk space requires
> some additional funding.
>
>
> Option 3 will inevitably impact farm operations, since it removes ~20% of the
> space from Lustre.
>
> We can also choose something between options 1) and 3).
>
>
> Please review the content and move at least 75% of what is in /work/clas to
> either /cache or /volatile.
>
>
> The current Hall-B usage includes:
>
>
> 550G hallb/bonus
> 1.5T hallb/clase1
> 3.6T hallb/clase1-6
> 3.3T hallb/clase1dvcs
> 2.8T hallb/clase1dvcs2
> 987G hallb/clase1f
> 1.8T hallb/clase2
> 1.6G hallb/clase5
> 413G hallb/clase6
> 2.2T hallb/claseg1
> 3.9T hallb/claseg1dvcs
> 1.2T hallb/claseg3
> 4.1T hallb/claseg4
> 2.7T hallb/claseg5
> 1.7T hallb/claseg6
> 367G hallb/clas-farm-output
> 734G hallb/clasg10
> 601G hallb/clasg11
> 8.1T hallb/clasg12
> 2.4T hallb/clasg13
> 2.4T hallb/clasg14
> 28G hallb/clasg3
> 5.8G hallb/clasg7
> 269G hallb/clasg8
> 1.2T hallb/clasg9
> 1.3T hallb/clashps
> 1.8T hallb/clas-production
> 5.6T hallb/clas-production2
> 1.4T hallb/clas-production3
> 12T hallb/hps
> 13T hallb/prad
>
> Regards, Harut
>
> P.S. We have had crashes a few times, and they may happen again in the future,
> so keeping important files in /work is not recommended.
> You can see the list of lost files in /site/scicomp/lostfiles.txt and
> /site/scicomp/lostfiles-jan-2017.txt
>
>
>
> -------- Forwarded Message --------
> Subject: ENP consumption of disk space under /work
> Date: Wed, 31 May 2017 10:35:51 -0400
> From: Chip Watson <watson at jlab.org>
>
> To: Sandy Philpott <philpott at jlab.org>, Graham Heyes <heyes at jlab.org>, Ole
> Hansen <ole at jlab.org>, Harut Avakian <avakian at jlab.org>, Brad Sawatzky
> <brads at jlab.org>, Mark M. Ito <marki at jlab.org>
>
>
> All,
>
> As I have started on the procurement of the new /work file server, I
> have discovered that Physics' use of /work has grown unrestrained over
> the last year or two.
>
> "Unrestrained" because there is no way under Lustre to restrain it
> except via a very unfriendly Lustre quota system. As we leave some
> quota headroom to accommodate large swings in usage for each hall for
> cache and volatile, then /work continues to grow.
>
> Total /work has now reached 260 TB, several times larger than I was
> anticipating. This constitutes more than 25% of Physics' share of
> Lustre, compared to LQCD which uses less than 5% of its disk space on
> the un-managed /work.
>
> It would cost Physics an extra $25K (total $35K - $40K) to treat the 260
> TB as a requirement.
>
> There are 3 paths forward:
>
> (1) Physics cuts its use of /work by a factor of 4-5.
> (2) Physics increases funding to $40K.
> (3) We pull a server out of Lustre, decreasing Physics' share of the
> system, and use that as half of the new active-active pair, beefing it
> up with SSDs and perhaps additional memory; this would actually shrink
> Physics' near-term costs, but puts higher pressure on the file system for
> the farm.
>
> The decision is clearly Physics', but I do need a VERY FAST response to
> this question, as I need to move quickly now for LQCD's needs.
>
> Hall D + GlueX, 96 TB
> CLAS + CLAS12, 98 TB
> Hall C, 35 TB
> Hall A <unknown, still scanning>
>
> Email, call (x7101), or drop by today 1:30-3:00 p.m. for discussion.
>
> thanks,
> Chip
>
> _______________________________________________
> Clas_offline mailing list
> Clas_offline at jlab.org
> https://mailman.jlab.org/mailman/listinfo/clas_offline
>
> _______________________________________________
> Frost mailing list
> Frost at jlab.org
> https://mailman.jlab.org/mailman/listinfo/frost
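
For anyone reviewing the clasg9 area, the per-directory sizes quoted in Eugene's message can be totaled and ranked with a short script. The sketch below is not part of the original thread. It assumes the sizes are du-style human-readable values in binary units (K = 1024 bytes and so on), and the names parse_size, parse_listing and largest are made up for illustration. Because each quoted size is already rounded, the sum comes to about 1.1 TB, slightly under the 1.2T reported for hallb/clasg9.

```python
# Sizes copied from the quoted listing of the clasg9 work area.
LISTING = """\
129K 76runs.list
6.9M TAG
191M bin
1.3G clasg9_g9a.tar
363M clasweb_monitor
137M cooklinks
4.0K crede
261K dcalign
1.6G dschott
4.2G dugger
1.9G fklein
2.8G g9a
148G g9b
129K g9b_sub_script.pl
578G g9frost
8.0K mcandrew
131G natalie
18G pass1
121G previous_constants
129K runs_g9b_pass0
129K runs_g9b_pass0~
53G sc_calib
34G sfegan
112M skpark
33G yuqing
4.7G zana
"""

# Assumption: binary multiples, as printed by `du -h`.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a size such as '148G' or '6.9M' to a number of bytes."""
    number, unit = text[:-1], text[-1].upper()
    return float(number) * UNITS[unit]


def parse_listing(listing):
    """Return {name: size_in_bytes} for each 'SIZE NAME' line."""
    entries = {}
    for line in listing.splitlines():
        if not line.strip():
            continue
        size, name = line.split(None, 1)
        entries[name.strip()] = parse_size(size)
    return entries


def largest(entries, n=5):
    """Return the n largest (name, bytes) pairs, biggest first."""
    return sorted(entries.items(), key=lambda kv: kv[1], reverse=True)[:n]


entries = parse_listing(LISTING)
total_tb = sum(entries.values()) / UNITS["T"]
print(f"total: {total_tb:.2f} TB in {len(entries)} entries")
for name, size in largest(entries):
    print(f"{size / UNITS['G']:8.1f}G  {name}")
```

The five largest entries (g9frost, g9b, natalie, previous_constants, sc_calib) account for most of the space, so those are the places where cleanup or a move to tape would have the most effect.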