[Halld-offline] Fwd: Re: CCPR 116932 UPDATE (Fair-share core hours)

Mark Ito marki at jlab.org
Wed May 11 08:34:52 EDT 2016


FYI


-------- Forwarded Message --------
Subject: 	Re: CCPR 116932 UPDATE (Fair-share core hours)
Date: 	Mon, 9 May 2016 15:58:56 -0400 (EDT)
From: 	Sandy Philpott <philpott at jlab.org>
To: 	Paul Mattione <pmatt at jlab.org>
CC: 	ccpr reply <ccpr_reply at jlab.org>, chen at jlab.org, kelvin at jlab.org, 
letta at jlab.org, David Rackley <rackley at jlab.org>, seitz at jlab.org, 
ychen at jlab.org, strosahl at jlab.org, larrieu at jlab.org, scicomp 
<scicomp at jlab.org>, Mark M. Ito <marki at jlab.org>



Let me try this again since that last message mixed physical and hyperthreaded core numbers...

You have to ask for all 48 cores on the farm14 nodes to get the whole node (per the March Physics meeting and subsequent announcements via news and jlab-scicomp-briefs email).  The idea at that meeting was to fit three 16-core jobs onto a 48-core farm14 node, and four 16-core jobs onto a new farm16 node that will likely have 32 physical cores, or 64 with hyperthreading on.
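
A rough sketch of the packing arithmetic described above, assuming a node's schedulable slots equal its hyperthreaded core count. The function name is hypothetical and this is not the scheduler's actual logic; the farm16 figure is the "likely" 64 quoted above, not a confirmed spec.

```python
def jobs_per_node(node_slots: int, cores_per_job: int) -> int:
    """Number of whole jobs of a given size that fit on one node.

    Assumes (hypothetically) that every hyperthreaded core is one slot
    and that jobs cannot be split across nodes.
    """
    if cores_per_job <= 0:
        raise ValueError("cores_per_job must be positive")
    return node_slots // cores_per_job


farm14 = jobs_per_node(48, 16)  # 48 hyperthreaded cores per farm14 node
farm16 = jobs_per_node(64, 16)  # assumed 64 hyperthreaded cores per farm16 node
print(farm14, farm16)  # 3 4
```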

----- Original Message -----
From: "Paul Mattione" <pmatt at jlab.org>
To: "Sandy Philpott" <philpott at jlab.org>
Cc: "ccpr reply" <ccpr_reply at jlab.org>, chen at jlab.org, kelvin at jlab.org, letta at jlab.org, "David Rackley" <rackley at jlab.org>, seitz at jlab.org, ychen at jlab.org, strosahl at jlab.org, larrieu at jlab.org, "scicomp" <scicomp at jlab.org>, "Mark M. Ito" <marki at jlab.org>
Sent: Monday, May 9, 2016 3:20:56 PM
Subject: Re: CCPR 116932 UPDATE (Fair-share core hours)

Hmm, I thought that requesting 24 cores guaranteed you access to the whole node?  Or has that changed?  I’m losing track ...

From my original email, then:

>> A) If I request 24 cores, and my job takes 10 hours, this should be 240 fair-share core hours, correct?
>> B) If I request 48 cores, and my job takes 8 hours (reduced scaling), how many fair-share core hours would this be?  384?
>>

Then it would be to our disadvantage to ever use more than the number of physical cores, because it would cost us more fair-share core-hours to do the same amount of work.  If we had the whole node, then the other CPUs would just sit idle, wasted.

What am I getting wrong?

  - Paul
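
The comparison in cases A and B above can be sketched as follows, assuming fair-share is charged as requested cores (slots) times wall-clock hours. That charging rule is an assumption taken from the figures in the question, not a confirmed description of how Maui accounts usage, and the function name is hypothetical.

```python
def fair_share_core_hours(requested_cores: int, wall_hours: float) -> float:
    """Assumed charge: requested slots multiplied by wall-clock hours."""
    return requested_cores * wall_hours


case_a = fair_share_core_hours(24, 10)  # physical cores only, 10 hours
case_b = fair_share_core_hours(48, 8)   # all hyperthreads, 8 hours
print(case_a, case_b, case_b / case_a)  # 240 384 1.6
```

Under this assumption, case B finishes 20% sooner but is charged 60% more for the same work, which is the disadvantage raised in the message above.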

On May 9, 2016, at 3:16 PM, Sandy Philpott <philpott at jlab.org> wrote:

> Hi Paul,
>
> Note that if your job uses 48 threads, you need to submit the job requesting all 48 cores.
>
> Sandy
>
> ----- Original Message -----
> From: "Paul Mattione" <pmatt at jlab.org>
> To: "ccpr reply" <ccpr_reply at jlab.org>
> Cc: chen at jlab.org, kelvin at jlab.org, philpott at jlab.org, letta at jlab.org, "David Rackley" <rackley at jlab.org>, seitz at jlab.org, ychen at jlab.org, strosahl at jlab.org, larrieu at jlab.org
> Sent: Monday, May 9, 2016 11:36:24 AM
> Subject: Re: CCPR 116932 UPDATE (Fair-share core hours)
>
> So, let’s suppose I submit a job to a 24-core node, run it with 48 threads, and it takes a wall-time of 2 hours to finish.  If I submit this job requesting 24 cores or more, it will land on the node and run.
>
> Now, if I requested 24 cores, does it charge 24 * 2 = 48 core hours to fair-share?  Or does it charge the CPU time of 48 * 2 = 96 core hours?
>
> - Paul
>
> On May 9, 2016, at 10:26 AM, ccpr_reply at jlab.org wrote:
>> Mod Date:   2016/05/09
>> Mod Time:   10:26:01
>> Mod User:   chen
>> Current State:   COMPLETE
>> --------------------------------------------------------------
>> You raised a very good question. Currently, the Maui scheduler does not do
>> performance-based fair-share, i.e. Maui treats all core hours the
>> same. In fact, we use the Maui scheduler to do fair-share on job slots
>> instead of cores. So your conclusion is correct about not submitting a
>> job requesting more than 24 cores. This is our way to encourage users
>> to use exclusive mode: one can request 24 cores to get a machine
>> exclusively, and then one can run any number of threads on the
>> machine. By requesting an exclusive node with the minimal required
>> threads, one will accumulate fewer fair-share core (slot) hours on average.
>> --------------------------------------------------------------		
>> Here is a copy of your Original Request:
>> 	
>> Email:     pmatt at jlab.org
>> Staff:	   chen
>> Category:  FARM CLUSTER
>> Subject:   Fair-share core hours
>> Submitted: 5/5/2016 11:19 AM
>> 			
>> For the fair-share usage determination, are all core hours treated the same?  Because performance (i.e. work per core-hour) is best if there are few enough jobs on the node that you don't need to use the hyperthreads. For example, let's say I have an entire 24-core (48-hyperthread) node to myself:
>>
>> A) If I request 24 cores, and my job takes 10 hours, this should be 240 fair-share core hours, correct?
>> B) If I request 48 cores, and my job takes 8 hours (reduced scaling), how many fair-share core hours would this be?  384?
>>
>> If so, you can see why it might be disadvantageous to ever submit anything requesting > 24 cores ...
>>
>> - Paul


