[Halld-offline] Fwd: high Hall D farm utilization; need to adjust

Mark Ito marki at jlab.org
Tue Mar 8 16:41:34 EST 2016


Curtis,

I think you are about right on the percentages.

I think that everything we are running right now is something we have 
discussed and decided on, so in that sense it all makes sense. The 
concerning thing is our apparent low utilization of the exclusive nodes.

We have more data now; the days of completing a launch over all of the 
data in a weekend may be behind us, especially since there will always 
be other competing projects going on. The competition is starting now 
because, hey, we have more data.

   -- Mark

On 03/08/2016 04:17 PM, Curtis A. Meyer wrote:
> Hi Mark,
>
>   yes, it would be useful to have this in words. But my interpretation
> is that halld is currently using about 54% of the farm: about 30% goes to
> the gxproj accounts, and 24% to other Hall D processes.
>
>   My understanding in discussions in early February was that this was
> about the time that halld would take over the farm, so it looks like we
> were right.
>
>    That said, we should try to determine what we are running on the
> farm right now, and see if some of it does not make sense. We should
> probably try to prioritize what we are doing.
>
> Curtis
> ---------
> Curtis A. Meyer
> MCS Associate Dean for Faculty and Graduate Affairs
> Professor of Physics, Carnegie Mellon University
> Pittsburgh, PA 15213
> Wean:    (412) 268-2745
> Doherty: (412) 268-3090
> Fax:     (412) 681-0648
> curtis.meyer at cmu.edu
> http://www.curtismeyer.com/
>
>
>
>> On Mar 8, 2016, at 3:45 PM, Mark Ito <marki at jlab.org> wrote:
>>
>> The latest on the farm from Sandy Philpott, FYI.
>>
>>
>> -------- Forwarded Message --------
>> Subject: 	high Hall D farm utilization; need to adjust
>> Date: 	Tue, 8 Mar 2016 14:11:19 -0500 (EST)
>> From: 	Sandy Philpott <philpott at jlab.org>
>> To: 	Mark Ito <marki at jlab.org>
>> CC: 	Graham Heyes <heyes at jlab.org>, Chip Watson <watson at jlab.org>
>>
>>
>>
>> Hi Mark,
>>
>> Following on the discussion this morning of the production gxproj accounts, here's the current usage
>> for them individually, and for Hall D overall:
>>   
>> --------
>> FairShare Information
>> Depth: 7 intervals   Interval Length: 1:00:00:00   Decay Rate: 0.70
>>
>> FSInterval        %     Target       0       1       2       3       4       5       6
>> FSWeight       ------- -------  1.0000  0.7000  0.4900  0.3430  0.2401  0.1681  0.1176
>> TotalUsage      100.00 ------- 91050.9 155705.8 163137.0 151037.6 127158.9 165079.1 182521.6
>>
>> gxproj5          12.27   5.00+   18.33   12.12   10.29   23.87 ------- ------- -------
>> gxproj4*          2.09   5.00+    5.33    0.62    0.38    1.14    4.69    0.77    2.52
>> gxproj3           7.95   5.00+    2.35    7.24    9.58    6.80   34.40    2.34    1.75
>> gxproj1           7.26   5.00+    6.76    9.88    8.06    4.46    3.63    7.25    5.01
>>   
>> hallb*            0.24  10.00     0.33    0.12    0.05    0.14    0.10    1.24    0.30
>> theory            0.79   5.00     1.11    0.52    0.45    0.36    2.74    0.56    0.62
>> hallc             6.82  10.00     6.34    5.29    2.66   10.59    4.14   12.45   19.58
>> clas*            29.47  40.00    33.77   32.08   23.66   26.63   21.91   45.37   16.60
>> halld*           53.97  30.00    47.01   54.23   66.67   52.07   56.21   34.95   60.92
>> halla*            4.34  10.00     4.66    5.10    2.41    4.01    7.05    5.38    1.98
>> accelerator       4.37   5.00     6.77    2.66    4.11    6.21    7.85    0.05 -------
>> --------
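>> (A brief aside on reading the table above: the overall "%" column appears to
>> be a decay-weighted average over the seven one-day intervals, with each
>> interval's usage discounted by the 0.70 decay rate shown in the header. This
>> is an assumption based on how Maui/Moab-style fairshare is typically
>> computed, not something stated in the message, but it reproduces the
>> gxproj5 figure. A minimal sketch:)

```python
# Sketch of how the "%" column in the FairShare table can be reproduced.
# Assumption: overall % = (decay-weighted usage) / (decay-weighted total),
# as in Maui/Moab-style fairshare with decay rate 0.70 over 7 intervals.
decay = 0.70
weights = [decay ** i for i in range(7)]  # 1.0000, 0.7000, ..., 0.1176

# TotalUsage row: absolute usage in each of the 7 one-day intervals
total = [91050.9, 155705.8, 163137.0, 151037.6, 127158.9, 165079.1, 182521.6]

def fairshare_pct(interval_pcts):
    """Decay-weighted share of total usage, in percent."""
    used = sum(p / 100.0 * t * w
               for p, t, w in zip(interval_pcts, total, weights))
    avail = sum(t * w for t, w in zip(total, weights))
    return 100.0 * used / avail

# gxproj5's per-interval percentages, reading "-------" as zero usage
gxproj5 = [18.33, 12.12, 10.29, 23.87, 0.0, 0.0, 0.0]
print(round(fairshare_pct(gxproj5), 2))  # → 12.27, matching the table
```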
>>
>> So while gxproj4 is under its target (the serial jobs, yes?), gxproj5 and all of halld is well over.
>>
>> Note that the fairshare settings are still set the same as the past several years:
>>    A 10%, B 50%, C 10%, D 30%
>> and there are also lots of pending jobs by Accelerator and Theory guests.
>>
>> This leads to the next important point -- with Hall D so much over its fairshare, and with
>> the resources configured heavily in Hall D's favor with the exclusive and multicore nodes,
>> and with the inefficiency of the many-core jobs, SciComp needs to make some adjustments to make
>> more resources available to the other halls and their serial jobs.
>>
>> In that light, I'll be reconfiguring many nodes back to non-exclusive and away from the exclusive
>> and 12 core multicore queues. This will help use the farm and particularly farm14 nodes more efficiently.
>>
>> I've copied Chip and Graham, so they're aware.  Please check with Physics if adjustments should be made
>> to the 10/50/10/30 fairshare config.  In the meantime, I'll reconfigure nodes to get back a bit closer
>> to this agreement; the other halls have noticed their long queue wait times. (Oldest farm jobs were
>> submitted Feb. 17).
>>
>> Note that HPC has been paying back its debt to the farm over the last month, so all halls are
>> getting to take advantage of the farm's increased size. We can continue to run with this loan
>> payback for the near future.
>>
>> Regards,
>> Sandy
>>
>>
>> _______________________________________________
>> Halld-offline mailing list
>> Halld-offline at jlab.org
>> https://mailman.jlab.org/mailman/listinfo/halld-offline
>


