[Clas12_software] Fwd: class12-2 jobs performing lots of small i/o
Andrew Puckett
puckett at jlab.org
Mon Jul 2 20:18:16 EDT 2018
The much bigger problem, which I thought the farm admins would have noticed first, appears to be that a single-core job is requesting a 9 GB memory allocation. That is wildly inefficient given the batch farm architecture of approximately 1.5 GB per core. Unless I’ve misunderstood something, or unless the job is actually multithreaded even though it appears not to be, each job with those parameters will grab six cores while using only one, leaving five cores idle ***per job***. While I am by no means an expert, I thought the CLAS12 software framework was supposed to be fully multi-thread capable? I only bring this up as someone with a vested interest in the efficient use of the JLab scientific computing facilities...
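The core-reservation arithmetic above can be sketched as follows. This is a minimal illustration, not the farm's actual scheduler logic: the function name and model are assumptions, while the 9 GB request and ~1.5 GB-per-core figure come from the message.

```python
import math

def effective_cores(mem_req_gb, mem_per_core_gb=1.5, cores_req=1):
    """Cores a job effectively takes out of the pool when the scheduler
    reserves whole cores to satisfy a memory request."""
    cores_for_memory = math.ceil(mem_req_gb / mem_per_core_gb)
    return max(cores_req, cores_for_memory)

# The job in this thread: 1 core requested, 9 GB of memory.
reserved = effective_cores(mem_req_gb=9)  # ceil(9 / 1.5) = 6 cores reserved
idle = reserved - 1                       # 5 cores held but unused
print(reserved, idle)                     # prints: 6 5
```

Under this model, each such single-threaded job ties up six cores' worth of the farm while computing on one.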
puckett.physics.uconn.edu
> On Jul 2, 2018, at 7:03 PM, Francois-Xavier Girod <fxgirod at jlab.org> wrote:
>
> The I/O for those jobs is defined per CC guidelines; there is no “small” I/O. The HIPO files are about 5 GB, each produced by merging 10 EVIO files. I think the CC needs better diagnostics.
>
>> On Tue, Jul 3, 2018 at 7:56 AM Harout Avakian <avakian at jlab.org> wrote:
>> FYI
>>
>> I understood that was fixed. FX, could you please check what the problem is?
>> Harut
>>
>> -------- Forwarded Message --------
>> Subject: class12-2 jobs performing lots of small i/o
>> Date: Mon, 2 Jul 2018 08:58:04 -0400 (EDT)
>> From: Kurt Strosahl <strosahl at jlab.org>
>> To: Harut Avagyan <avakian at jlab.org>
>> CC: sciops <sciops at jlab.org>
>>
>> Harut,
>>
>> There are a large number of clas12 jobs running through the farm under the user clas12-2; these jobs are performing lots of small I/O.
>>
>> An example of one of these jobs is:
>>
>> Job Index: 55141495
>> User Name: clas12-2
>> Job Name: R4013_13
>> Project: clas12
>> Queue: prod64
>> Hostname: farm12021
>> CPU Req: 1 centos7 core requested
>> MemoryReq: 9 GB
>> Status: ACTIVE
>>
>> You can see the small I/O here: https://scicomp.jlab.org/scicomp/index.html#/lustre/users
>>
>> w/r,
>> Kurt J. Strosahl
>> System Administrator: Lustre, HPC
>> Scientific Computing Group, Thomas Jefferson National Accelerator Facility
>> _______________________________________________
>> Clas12_software mailing list
>> Clas12_software at jlab.org
>> https://mailman.jlab.org/mailman/listinfo/clas12_software