[Halld-offline] Fwd: high Hall D farm utilization; need to adjust
Mark Ito
marki at jlab.org
Tue Mar 8 15:45:51 EST 2016
The latest on the farm from Sandy Philpott, FYI.
-------- Forwarded Message --------
Subject: high Hall D farm utilization; need to adjust
Date: Tue, 8 Mar 2016 14:11:19 -0500 (EST)
From: Sandy Philpott <philpott at jlab.org>
To: Mark Ito <marki at jlab.org>
CC: Graham Heyes <heyes at jlab.org>, Chip Watson <watson at jlab.org>
Hi Mark,
Following on the discussion this morning of the production gxproj accounts, here's the current usage
for them individually, and for Hall D overall:
--------
FairShare Information
Depth: 7 intervals   Interval Length: 1:00:00:00   Decay Rate: 0.70
FSInterval        %  Target         0        1        2        3        4        5        6
FSWeight    ------- -------     1.0000   0.7000   0.4900   0.3430   0.2401   0.1681   0.1176
TotalUsage   100.00 -------    91050.9 155705.8 163137.0 151037.6 127158.9 165079.1 182521.6
gxproj5       12.27    5.00+    18.33    12.12    10.29    23.87  -------  -------  -------
gxproj4*       2.09    5.00+     5.33     0.62     0.38     1.14     4.69     0.77     2.52
gxproj3        7.95    5.00+     2.35     7.24     9.58     6.80    34.40     2.34     1.75
gxproj1        7.26    5.00+     6.76     9.88     8.06     4.46     3.63     7.25     5.01
hallb*         0.24   10.00      0.33     0.12     0.05     0.14     0.10     1.24     0.30
theory         0.79    5.00      1.11     0.52     0.45     0.36     2.74     0.56     0.62
hallc          6.82   10.00      6.34     5.29     2.66    10.59     4.14    12.45    19.58
clas*         29.47   40.00     33.77    32.08    23.66    26.63    21.91    45.37    16.60
halld*        53.97   30.00     47.01    54.23    66.67    52.07    56.21    34.95    60.92
halla*         4.34   10.00      4.66     5.10     2.41     4.01     7.05     5.38     1.98
accelerator    4.37    5.00      6.77     2.66     4.11     6.21     7.85     0.05  -------
--------
So while gxproj4 (the serial jobs, yes?) is under its target at 2.09%, gxproj5 (12.27% against a 5% target) and halld as a whole (53.97% against 30%) are well over.
Note that the fairshare settings are still set the same as the past several years:
A 10%, B 50%, C 10%, D 30%
and there are also lots of pending jobs by Accelerator and Theory guests.
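For anyone who wants to check the numbers: the "%" column in the table above appears to be a decay-weighted average of the per-interval usage, with interval i weighted by 0.70^i and each interval's percentage scaled by that interval's TotalUsage. The sketch below is only an illustration under that assumption, not the scheduler's actual code; the function name `fairshare_percent` and the treatment of "-------" entries as zero usage are my own choices. With the values copied from the table it should come out close to the reported 53.97 for halld and 12.27 for gxproj5.

```python
# Per-interval totals from the TotalUsage row (interval 0 is the most recent).
TOTAL_USAGE = [91050.9, 155705.8, 163137.0, 151037.6, 127158.9, 165079.1, 182521.6]

# Per-interval usage percentages copied from the table.
# None stands for a "-------" entry (assumed here to mean no recorded usage).
HALLD = [47.01, 54.23, 66.67, 52.07, 56.21, 34.95, 60.92]
GXPROJ5 = [18.33, 12.12, 10.29, 23.87, None, None, None]


def fairshare_percent(interval_pcts, interval_totals, decay=0.70):
    """Decay-weighted usage percentage across fairshare intervals.

    Hypothetical helper: weights each interval i by decay**i, converts the
    per-interval percentage back to absolute usage, and divides by the
    weighted total usage.
    """
    weights = [decay ** i for i in range(len(interval_totals))]
    weighted_total = sum(w * t for w, t in zip(weights, interval_totals))
    weighted_usage = sum(
        w * t * (p if p is not None else 0.0) / 100.0
        for w, t, p in zip(weights, interval_totals, interval_pcts)
    )
    return 100.0 * weighted_usage / weighted_total


if __name__ == "__main__":
    print(f"halld   {fairshare_percent(HALLD, TOTAL_USAGE):.2f}")
    print(f"gxproj5 {fairshare_percent(GXPROJ5, TOTAL_USAGE):.2f}")
```

Because the weights fall off as 0.70^i, yesterday's usage counts for about 70% as much as today's, and usage from six intervals back for only about 12%.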
This leads to the next important point -- with Hall D so much over its fairshare, and with
the resources configured heavily in Hall D's favor with the exclusive and multicore nodes,
and with the inefficiency of the many-core jobs, SciComp needs to make some adjustments to make
more resources available to the other halls and their serial jobs.
In that light, I'll be reconfiguring many nodes back to non-exclusive, moving them out of the exclusive
and 12-core multicore queues. This will help use the farm, and particularly the farm14 nodes, more efficiently.
I've copied Chip and Graham, so they're aware. Please check with Physics if adjustments should be made
to the 10/50/10/30 fairshare config. In the meantime, I'll reconfigure nodes to get back a bit closer
to this agreement; the other halls have noticed their long queue wait times. (The oldest farm jobs were
submitted Feb. 17.)
Note that HPC has been paying back its debt to the farm over the last month, so all halls are
getting to take advantage of the farm's increased size. We can continue to run with this loan
payback for the near future.
Regards,
Sandy