[Halld-offline] Offline Software Meeting Minutes, August 7, 2018

Mark Ito marki at jlab.org
Tue Aug 7 20:19:56 EDT 2018


Folks,

Find the minutes below and here 
<https://halldweb.jlab.org/wiki/index.php/GlueX_Offline_Meeting,_August_7,_2018>.

   -- Mark

___________________


  Minutes, GlueX Offline Meeting, August 7, 2018

Present:

  * *CMU:* Curtis Meyer
  * *FIU:* Mahmoud Kamel
  * *FSU:* Sean Dobbs
  * *JLab:* Alex Austregesilo, Thomas Britton, Mark Dalton, Stuart
    Fegan, Mark Ito (chair), David Lawrence, Justin Stevens, Beni Zihlmann

The chairman neglected to hit the record button on BlueJeans.


      Announcements

 1. *reconstruction launch version set:
    version_recon-2017_01-ver03_jlab.xml
    <https://mailman.jlab.org/pipermail/halld-offline/2018-August/003306.html>*.
    The tag of sim-recon used in the reconstruction has been built on
    five platforms.
 2. Status of Recon Launch: Alex A.
      * We are using the QCD12 boxes. The farm18 nodes are shown on the
        SciComp webpage, but are not in active use yet.
      * There are 700-800 jobs running simultaneously.
      * There will be 300 to 400 more when the farm18 nodes are activated.
      * We are 85% done with 2016 data; it will be done in 2 or 3 days.
      * Spring 2017 reconstruction should take 15 to 20 days.
      * There is a possible problem with cache disk space if we run more
        jobs simultaneously: our pin quota is used up.
      * David remarked that since we are copying the raw data files to
        the local disk first, they could be unpinned as soon as they are
        copied.


      Review of minutes from the July 24 meeting

We went over the minutes 
<https://halldweb.jlab.org/wiki/index.php/GlueX_Offline_Meeting,_July_24,_2018#Minutes>. 



        NERSC Update

David gave us an update.

  * Chris Larrieu is back from vacation and has addressed some swif2 issues.
  * Test of reconstruction of one run with 220 files ran into a
    20-job-at-a-time limit imposed by swif2. The limit is motivated by
    having only 1 TB of disk space at NERSC. More space than that is
    needed to keep the pipe full.
  * David has consulted with a Brookhaven physicist who has been working
    with more space.
  * Reserving an entire node is possible, but you have to "pay" in
    advance for the time and it may be hard to get credit back for
    failed jobs.
  * David plans to move to a 20 TB "cache" disk (with a file lifetime
    limit).
  * The plan is to try a monitoring launch over Spring 2018 data first.
  * Sean asked about what software tag was going to be used. He
    cautioned that there is CDC reconstruction code that should be added
    to augment the code being used for the current reconstruction launch.
  * Alex cautioned that the monitoring launch uses many more plugins
    than are used in reconstruction launches. More memory may be required.


      Splitting up Sim-Recon: Aftermath

Mark led us through the announcement of the split 
<https://mailman.jlab.org/pipermail/halld-offline/2018-July/003292.html> 
performed on Monday, July 30, and a wiki page he wrote 
<https://halldweb.jlab.org/wiki/index.php/Converting_sim-recon_tags_and_branches_to_the_split_repositories> 
describing how to recover branches and tags from the sim-recon 
repository when using the new halld_recon and halld_sim repositories.

Items that still need to be addressed:

 1. The use of the HALLD_MY directory needs to be revisited with the
    split repositories.
 2. A procedure for recovering tagged versions of sim-recon and
    deploying them in the split repositories needs to be developed.
 3. The automatic builds triggered by pull requests need to be
    implemented on the new repositories.


      HDGeant4 issues

We reviewed the recent pull requests from Richard Jones fixing separate 
issues in the FDC simulation, one in HDGeant (GEANT 3) and the other in 
HDGeant4. See his comment, submitted today, on HDGeant4 Issue #54 
<https://github.com/JeffersonLab/HDGeant4/issues/54>. Corresponding pull 
requests to the halld_sim and hdgeant4 repositories have been merged to 
their respective master branches.


      Review of recent pull requests

The title of Pull request #1180 
<https://github.com/JeffersonLab/sim-recon/pull/1180> from David served 
as a reminder to upbraid us for adding frustration to his workflow. The 
issue is respect (or rather disrespect) for a mechanism for building 
sim-recon (at the time of the request) without all of the packages we 
build, i.e., a mechanism for having optional packages. Whether an 
optional package is included in the build is signaled by the presence or 
absence of the home environment variable for that package. When 
collaborators do not respect this convention, David is stuck either 
building the suddenly non-optional package or coding the mechanism in 
himself.
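
The convention described above can be illustrated with a minimal Python 
sketch (SCons build scripts are Python). This is not the actual SBMS 
code; the variable name EXAMPLEPKG_HOME, the function names, and the 
include/lib directory layout are all assumptions made for illustration.

```python
import os


def package_home(var_name, environ=None):
    """Return the package home directory, or None if the home variable
    is unset or empty (meaning the optional package is not wanted)."""
    env = os.environ if environ is None else environ
    value = env.get(var_name, "").strip()
    return value or None


def configure_optional(build_flags, var_name, environ=None):
    """Add include and library paths for an optional package only when
    its home variable is set. Returns True if the package was enabled."""
    home = package_home(var_name, environ)
    if home is None:
        # Package is optional and absent: leave the build untouched.
        return False
    # Assumed layout: <home>/include and <home>/lib (hypothetical).
    build_flags.setdefault("CPPPATH", []).append(os.path.join(home, "include"))
    build_flags.setdefault("LIBPATH", []).append(os.path.join(home, "lib"))
    return True
```

Code that depends on the optional package would then be guarded by the 
return value, so that a build without the home variable set still 
succeeds.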

David has looked into the idea of having build "flavors": configurations 
of the build with optional packages explicitly identified. That takes 
the configuration out of the shell environment. In general, he thinks 
that we may be due for refactoring the SCons build system (SBMS) in any 
case.
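
One way to picture the "flavors" idea is a table that names the optional 
packages for each flavor explicitly, independent of the shell 
environment. The sketch below is hypothetical: the flavor names, the 
package names, and the function are placeholders, not anything David 
proposed in detail.

```python
# Hypothetical flavor table: each flavor lists its optional packages.
FLAVORS = {
    "full": {"optional_a", "optional_b"},
    "minimal": set(),
}


def packages_for(flavor, required=("core",)):
    """Return the sorted list of packages to build for a given flavor."""
    if flavor not in FLAVORS:
        raise ValueError("unknown build flavor: %s" % flavor)
    return sorted(set(required) | FLAVORS[flavor])
```

With a table like this, the set of packages in a build is determined by 
the chosen flavor alone, rather than by which environment variables 
happen to be set.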


      Review of recent discussion on the GlueX Software Help List

We looked at recent posts 
<https://groups.google.com/forum/#%21forum/gluex-software>.

  * None of those present, other than Mark, have experienced the halld
    web authentication error (401).
  * The cause of the g++ internal compiler error, on random source code
    files, during the single-threaded build of hdgeant4 on the ifarm
    machines and on no others is still a mystery.

  * This page was last modified on 7 August 2018, at 20:16.
