[Halld] Monitoring launch ver 18 @ NERSC

David Lawrence davidl at jlab.org
Wed Oct 24 10:20:12 EDT 2018


Hi All,

  Here is an update on monitoring launch ver18 of the RunPeriod-2018-01 data @ NERSC:

Most of the jobs (>98%) have finished. A few remaining issues mean we are
approaching 100% completion only asymptotically. One issue in particular has to do with jobs
ending improperly. Because we ran on KNL (much better availability), the jobs took
9 or 10 hours to complete, which makes for a one-day debug cycle.

At this point, I think anyone interested should start looking at the outputs rather than
wait (if you haven’t already started). As a reminder, the files can be found in:

/mss/halld/offline_monitoring/RunPeriod-2018-01/ver18

I will continue to work on finishing off these last lingering jobs. I have also been working
with SciComp to resolve the transport issues. I hope to start another similarly sized
campaign next week so we can get to the point where we are confident enough to
start a recon launch at NERSC by the end of the year.


Regards,
-David

P.S. For those really interested, I’ve started documenting some of the details of running
at NERSC here:

https://halldweb.jlab.org/wiki/index.php/HOWTO_Execute_a_Launch_using_NERSC

-------------------------------------------------------------
David Lawrence Ph.D.
Staff Scientist, Thomas Jefferson National Accelerator Facility
Newport News, VA
davidl at jlab.org
(757) 269-5567 W
(757) 746-6697 C

