<div dir="ltr"><div>Hi all,</div><div><br></div><div>I've been running sets of test jobs on our cluster over the weekend to make sure that we're ready for the data challenge. I've started keeping notes here:</div>
<div><br></div><div><a href="https://halldweb1.jlab.org/wiki/index.php/NU_DC2_Tests">https://halldweb1.jlab.org/wiki/index.php/NU_DC2_Tests</a><br></div><div><br></div><div>The short version of the story so far is that with jobs of 10K events each, I'm getting a success rate above 50%, though I'm not running with the hard memory limits that seemed to be causing some of the crashes Mark mentioned on Friday. The failed jobs mainly die at REST creation. Increasing the JANA thread timeout limit has helped, but the remaining failures seem consistent with either certain events taking too long to process or some site-specific bottlenecks.</div>
<div><br></div><div><br></div><div>Cheers,</div><div>Sean</div><div><br></div><div><br></div>-- <br>Dr. Sean Dobbs<br>Department of Physics &amp; Astronomy<br>Northwestern University<br>phone: 847-467-2826
</div>