<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto">Hi FX, <div><br></div><div>That would seem to be inconsistent with this complaint I got a few years ago from Sandy Philpott about “excessive” memory usage of my single-threaded GEANT4 simulation jobs: </div><div><br></div><div>“<span style="background-color: rgba(255, 255, 255, 0);">Hi Andrew,<br><br>We're trying to understand the 8GB memory requirement of your SBS farm jobs... The farm nodes are configured at 32 GB RAM for 24 cores, so 1.5 GB per core. Since your jobs request so much memory, it blocks other jobs' access to the systems. Why are these jobs such an exception to the farm job norm -- why do they need so much memory for just a single core job? Can your code run on multiple cores and use memory more efficiently?<br><br>All insight helpful and appreciated, as other users' jobs are backed up in the queue although many cores sit idle.<br><br>Regards,<br>Sandy”</span></div><div><br></div><div>What makes CLAS12 decoding jobs different in this regard? Has something changed since then? This question should probably be looked into at a higher level. If these decoding jobs are de facto forcing five of the six cores they reserve per job to sit idle, that would be a significant issue for scientific computing for all four halls.</div><div><br></div><div>Best regards,</div><div>Andrew<br><br><div id="AppleMailSignature"><span style="background-color: rgba(255, 255, 255, 0);"><a href="http://puckett.physics.uconn.edu">puckett.physics.uconn.edu</a></span></div><div><br>On Jul 2, 2018, at 8:23 PM, Francois-Xavier Girod <<a href="mailto:fxgirod@jlab.org">fxgirod@jlab.org</a>> wrote:<br><br></div><blockquote type="cite"><div><div dir="ltr">The decoding is not multithreaded. Decoding several evio files merged into one hipo file currently does require this much memory; however, this is largely due to the CC insisting on monitoring virtual memory allocation. 
None of this is a problem; those are all features.<br><div><br></div><div>That being said, I do not think those jobs grab 6 cores. They need and use only one.</div></div><br><div class="gmail_quote"><div dir="ltr">On Tue, Jul 3, 2018 at 9:18 AM Andrew Puckett <<a href="mailto:puckett@jlab.org">puckett@jlab.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="auto">The much bigger problem, which I thought the farm admins would’ve noticed first, appears to be that a single-core job is requesting a 9 GB memory allocation, which is wildly inefficient given the batch farm architecture of approximately 1.5 GB per core. Unless I’ve misunderstood something, or unless the job is actually multithreaded even though it appears not to be, each job with those parameters will grab six cores while using only one, causing five cores to sit idle ***per job***. While I am by no means an expert, I thought that the CLAS12 software framework was supposed to be fully multithread-capable? I only bring it up as someone with a vested interest in the efficient use of the JLab scientific computing facilities...<br><br><div id="m_-8296534818626272229AppleMailSignature"><span style="background-color:rgba(255,255,255,0)"><a href="http://puckett.physics.uconn.edu" target="_blank">puckett.physics.uconn.edu</a></span></div><div><br>On Jul 2, 2018, at 7:03 PM, Francois-Xavier Girod <<a href="mailto:fxgirod@jlab.org" target="_blank">fxgirod@jlab.org</a>> wrote:<br><br></div><blockquote type="cite"><div><div><div dir="auto">The I/O for those jobs is defined per CC guidelines; there is no “small” I/O. The hipo files are about 5 GB, merging 10 evio files together. I think the CC needs better diagnostics. 
</div></div><div><br><div class="gmail_quote"><div dir="ltr">On Tue, Jul 3, 2018 at 7:56 AM Harout Avakian <<a href="mailto:avakian@jlab.org" target="_blank">avakian@jlab.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<p>FYI</p>
<p>I understood that was fixed. FX, could you please check what
the problem is?<br>
</p>
<div class="m_-8296534818626272229m_-5789663251165737513moz-forward-container">Harut<br>
<br>
-------- Forwarded Message --------
<table class="m_-8296534818626272229m_-5789663251165737513moz-email-headers-table" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<th valign="BASELINE" align="RIGHT" nowrap="">Subject:
</th>
<td>class12-2 jobs performing lots of small i/o</td>
</tr>
<tr>
<th valign="BASELINE" align="RIGHT" nowrap="">Date: </th>
<td>Mon, 2 Jul 2018 08:58:04 -0400 (EDT)</td>
</tr>
<tr>
<th valign="BASELINE" align="RIGHT" nowrap="">From: </th>
<td>Kurt Strosahl <a class="m_-8296534818626272229m_-5789663251165737513moz-txt-link-rfc2396E" href="mailto:strosahl@jlab.org" target="_blank"><strosahl@jlab.org></a></td>
</tr>
<tr>
<th valign="BASELINE" align="RIGHT" nowrap="">To: </th>
<td>Harut Avagyan <a class="m_-8296534818626272229m_-5789663251165737513moz-txt-link-rfc2396E" href="mailto:avakian@jlab.org" target="_blank"><avakian@jlab.org></a></td>
</tr>
<tr>
<th valign="BASELINE" align="RIGHT" nowrap="">CC: </th>
<td>sciops <a class="m_-8296534818626272229m_-5789663251165737513moz-txt-link-rfc2396E" href="mailto:sciops@jlab.org" target="_blank"><sciops@jlab.org></a></td>
</tr>
</tbody>
</table>
<br>
<br>
<pre>Harut,
There are a large number of clas12 jobs running through the farm under user clas12-2; these jobs are performing lots of small i/o.
An example of one of these jobs is:
Job Index: 55141495
User Name: clas12-2
Job Name: R4013_13
Project: clas12
Queue: prod64
Hostname: farm12021
CPU Req: 1 centos7 core requested
MemoryReq: 9 GB
Status: ACTIVE
You can see the small i/o by looking: <a class="m_-8296534818626272229m_-5789663251165737513moz-txt-link-freetext" href="https://scicomp.jlab.org/scicomp/index.html#/lustre/users" target="_blank">https://scicomp.jlab.org/scicomp/index.html#/lustre/users</a>
w/r,
Kurt J. Strosahl
System Administrator: Lustre, HPC
Scientific Computing Group, Thomas Jefferson National Accelerator Facility
</pre>
</div>
</div>
_______________________________________________<br>
Clas12_software mailing list<br>
<a href="mailto:Clas12_software@jlab.org" target="_blank">Clas12_software@jlab.org</a><br>
<a href="https://mailman.jlab.org/mailman/listinfo/clas12_software" rel="noreferrer" target="_blank">https://mailman.jlab.org/mailman/listinfo/clas12_software</a></blockquote></div></div>
</div></blockquote></div></blockquote></div>
</div></blockquote></div></body></html>