[Halld-offline] can GlueX use idle cores at JLab?
Mark Ito
marki at jlab.org
Thu Apr 2 12:39:15 EDT 2015
Also, the 48 hour running time is to get a big output file. We would like
to have 20 GB, but this will give us 10 GB. There may be cleverer ways...
On 04/02/2015 09:54 AM, Sandy Philpott wrote:
> Paul,
>
> Thanks - the job insight is helpful!
>
> We'll pull the farm12/13 nodes, all with 16 cores and 32 GB, back from HPC loan -- those will be a better match for these GlueX jobs. We can also pull in HPC's 9q nodes with 8 cores and 24 GB RAM if they're also useful.
>
> Regards,
> Sandy
>
> ----- Original Message -----
> From: "Paul Mattione" <pmatt at jlab.org>
> To: "Sandy Philpott" <philpott at jlab.org>
> Sent: Thursday, April 2, 2015 9:43:11 AM
> Subject: Re: [Halld-offline] can GlueX use idle cores at JLab?
>
> Not really … the memory usage is not something we have much control over; that’s just how much memory the simulation needs.
>
> For the job length … these jobs are for our “Data Challenge 3,” where we wanted to test tape -> job throughput in a more uniform way than what we've been able to do so far with our experimental data. So we’re generating a bunch of large (10 GB per file), uniformly-sized MC data to test with in order to model the hit that good experimental data will give. So it’s not very convenient for us to make the jobs smaller.
>
> Technically these are Mark Ito’s jobs, so you should direct further inquiries to him. I was just faster to the draw on the response.
>
> - Paul
>
> On Apr 2, 2015, at 9:38 AM, Sandy Philpott <philpott at jlab.org> wrote:
>
>> Thanks Paul -- Can they be shortened to 24 hour walltime and <= 1.4 GB, to fit better onto the hardware and play nice with other jobs? Sandy
>>
>> ----- Original Message -----
>> From: "Paul Mattione" <pmatt at jlab.org>
>> To: "Sandy Philpott" <philpott at jlab.org>
>> Cc: halld-offline at jlab.org
>> Sent: Thursday, April 2, 2015 9:10:16 AM
>> Subject: Re: [Halld-offline] can GlueX use idle cores at JLab?
>>
>> These are simulation jobs, which we can currently only run single-threaded. I think when we switch to GEANT4 we can do multithreaded simulations, but that's many months (at least) away.
>>
>> - Paul
>>
>> ----- Original Message -----
>> From: "Sandy Philpott" <philpott at jlab.org>
>> To: halld-offline at jlab.org
>> Sent: Thursday, April 2, 2015 8:44:01 AM
>> Subject: Re: [Halld-offline] can GlueX use idle cores at JLab?
>>
>> Hi again,
>>
>> Thanks - I see lots of jobs came in... but the memory requirements aren't a good match with the Haswells. Attached is the visual impact of the gxproj2 jobs able to use only 14 of the 24 cores on the farm14 nodes -- these are all long-running jobs but are a bit of a bad fit to the hardware. Can the jobs be adjusted to run multi-core, or use less memory per single job? The goal is 1400 MB or below on these nodes, as they have 32 GB RAM. Can the jobs be shortened to the default 24 hour walltime to be better farm users? This would allow other jobs to get slots without having to wait the full >2 days for these to finish.
>>
>> Input / feedback welcome,
>> Sandy
>>
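A rough sketch of the memory arithmetic behind the "14 of the 24 cores" observation above. It assumes that memory is the only limit besides core count, that "32 GB" means 32 * 1024 MB, and that all of it is schedulable; none of this is stated in the thread, and the function and constant names are illustrative only.

```python
# Sketch only: how per-job memory limits the number of single-threaded
# jobs that fit on one farm14 node (24 cores, 32 GB per the message above).

NODE_MEM_MB = 32 * 1024   # assumption: 32 GB expressed as 32 * 1024 MB
CORES = 24                # farm14 cores per node, from the thread


def slots_per_node(node_mem_mb: int, cores: int, job_mem_mb: int) -> int:
    """Jobs that can run at once: limited by cores or by memory, whichever is smaller."""
    return min(cores, node_mem_mb // job_mem_mb)


# Memory share per core if every core is to be kept busy (about 1365 MB,
# close to the 1400 MB goal quoted in the message).
per_core_share_mb = NODE_MEM_MB / CORES

# Largest per-job request that still allows 14 jobs per node (2340 MB).
max_job_mem_for_14_mb = NODE_MEM_MB // 14

if __name__ == "__main__":
    print(f"per-core share: {per_core_share_mb:.0f} MB")
    print(f"slots at {max_job_mem_for_14_mb} MB per job:",
          slots_per_node(NODE_MEM_MB, CORES, max_job_mem_for_14_mb))
```

Under these assumptions, 14 busy cores per node would be consistent with jobs requesting roughly 2.2 to 2.3 GB each; the actual request size is not given in the thread.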
>>
>> From: "Mark Ito" <marki at jlab.org>
>> To: halld-offline at jlab.org
>> Sent: Tuesday, March 31, 2015 4:41:18 PM
>> Subject: [Halld-offline] Fwd: can GlueX use idle cores at JLab?
>>
>> from Sandy Philpott
>>
>>
>> -------- Forwarded Message --------
>> Subject: can GlueX use idle cores at JLab?
>> Date: Tue, 31 Mar 2015 11:52:19 -0400 (EDT)
>> From: Sandy Philpott <philpott at jlab.org>
>> To: halld-offline at jlab.org
>> CC: Heyes Graham <heyes at jlab.org> , Chip Watson <watson at jlab.org> , Mark Ito <marki at jlab.org> , David Lawrence <davidl at jlab.org>
>>
>> Hello GlueX,
>>
>> The newest Haswell farm14 nodes of 2400 cores at JLab have been mostly idle since their installation last fall. That's much of 1.7 M Haswell core-hours each month that are largely unused, or almost 5 M core hours of idle time so far.
>>
>> Could Hall D keep simulation jobs in the queue and running indefinitely, rather than just during times of the data challenges? Are there other jobs to run? Otherwise, many of the available computing cycles in the farm for Experimental Physics are falling on the floor rather than being used.
>>
>> Feedback and perspective welcome,
>> Sandy
>>
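As a quick check on the core-hour figures in the message above, here is a minimal sketch. It assumes a 30-day month and round-the-clock availability of all 2400 cores; the names are illustrative and not from the thread.

```python
# Sketch only: core-hours available from the farm14 nodes.

FARM14_CORES = 2400        # from the message above
HOURS_PER_DAY = 24


def core_hours(cores: int, days: int) -> int:
    """Core-hours delivered by `cores` cores running continuously for `days` days."""
    return cores * HOURS_PER_DAY * days


monthly = core_hours(FARM14_CORES, 30)      # assumption: 30-day month
months_of_idle = 5_000_000 / monthly        # the "almost 5 M" idle figure

if __name__ == "__main__":
    print(f"core-hours per 30-day month: {monthly:,}")
    print(f"5 M idle core-hours is about {months_of_idle:.1f} months of full capacity")
```

With these assumptions the monthly figure comes out at about 1.7 M core-hours, matching the message, and 5 M idle core-hours corresponds to a little under three months of the full 2400 cores.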
>>
>>
>> _______________________________________________
>> Halld-offline mailing list
>> Halld-offline at jlab.org
>> https://mailman.jlab.org/mailman/listinfo/halld-offline
>>