[Halld-offline] Software Meeting Minutes, August 18, 2020

Mark Ito marki at jlab.org
Tue Aug 18 18:52:12 EDT 2020


Please find the minutes here 
and below.

   -- Mark

    GlueX Software Meeting, August 18, 2020, Minutes

Present: Sean Dobbs, Mark Ito (chair), Igal Jaegle, Naomi Jarvis, Justin 
Stevens, Nilanga Wickramaarachchi, Beni Zihlmann

There is a recording of this meeting 
<https://bluejeans.com/s/kAV6kSo3kEe/> on the BlueJeans site. Use your 
JLab credentials to authenticate.


 1. Draft of DSelector documentation
    Sean noted that the latest version of the document is on Overleaf.
    Naomi is the owner. Please contact her if you would like to
    review or contribute to the document.
      * Naomi, Justin, and Beni noted that JLab now has an Overleaf
        license of some sort. Mark will contact David Lawrence about
        the details.
        [Added in press: Mark forwarded the response he got from David
        to the aforementioned folks.]
      * Naomi reported on an issue with "Zombie ProofLite servers". Her
          o We were having a problem with zombie proofserv.exe processes
            left behind on our cluster nodes after the jobs completed;
            they were not cleaned up by Slurm. Every so often, one of
            many worker threads would fail to initialize properly, as
            seen in the note on how many threads had gone parallel. The
            job output was unaffected, although presumably it took
            fractionally longer than it should have done, but after
            hundreds of DSelector jobs had been run, there were a large
            number of zombie processes left behind.
          o It turned out to be sort of a network issue, which caused
            TProof to time out (after only 5 seconds) on some of the
            forked threads. Since it timed out, it skipped that thread
            and did not destroy it when ROOT exited. Setting a longer
            timeout solved the problem.
          o This can be done by appending the following line to
            ${HOME}/.rootrc: "ProofLite.StartupTimeOut 1800", which
            gives a very generous timeout allowance of 30 minutes.
 2. New version set: 4.24.1
    Mark pointed out:
     1. This version set has the Python-3 build changes incorporated.
     2. A new container was created to track the new RPM required for
        HDGeant4 (tirpc-devel).
     3. The patch version set solves the "hang on first event" problem
        of HDGeant4.
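
The .rootrc change described in item 1 above could be applied on many accounts or nodes with a small script. The following is only a sketch, not something presented at the meeting: the setting name and value are the ones quoted in the minutes, while the function name ensure_rootrc_setting and the choice to overwrite an existing entry for the same key are assumptions made for illustration.

```python
from pathlib import Path

KEY = "ProofLite.StartupTimeOut"  # setting name as quoted in the minutes
VALUE = "1800"                    # seconds, i.e. 30 minutes


def ensure_rootrc_setting(rootrc: Path, key: str = KEY, value: str = VALUE) -> bool:
    """Make sure `rootrc` contains the line `key value`.

    Returns True if the file was created or modified, False if it
    already had the wanted line. (Hypothetical helper, not part of ROOT.)
    """
    wanted = f"{key} {value}"
    lines = rootrc.read_text().splitlines() if rootrc.exists() else []

    new_lines = []
    found = False
    for line in lines:
        tokens = line.split()
        # Treat "Key value" and "Key: value" as the same setting.
        if tokens and tokens[0].rstrip(":") == key:
            if not found:
                new_lines.append(wanted)  # keep a single, updated entry
            found = True
        else:
            new_lines.append(line)
    if not found:
        new_lines.append(wanted)

    if new_lines == lines:
        return False
    rootrc.write_text("\n".join(new_lines) + "\n")
    return True
```

A call such as ensure_rootrc_setting(Path.home() / ".rootrc") would update the current user's file. Running it a second time leaves the file untouched, so it should be safe to include in a job set-up step.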

      Review of Minutes from the Last Software Meeting

We went over the minutes from the August 4 meeting.

  * Since the last meeting, no one has noticed the slow wiki access on
    halldweb.jlab.org that we discussed. The only known change was the
    doubling of the period between refreshes of the MCwrapper web page
    that Thomas Britton reported last time.
  * The dE/dx theta correction for the CDC seems happy in its new home,
    merged onto the master branch of halld_recon.
  * The Python 3 compatible build that Mark described was not as
    universal as he had hoped. It continues to work for both CentOS 7
    and Fedora 32, but Ubuntu 20 and CentOS 8 did not work without
    modifications to the scheme. The problem is the way different
    distributions treat the interpretations (or lack thereof) of Python,
    Python 2, Python 3, and of SCons, SCons 2, SCons 3. Turns out that
    each distribution is a special case. This has delayed deployment of
    a CentOS 8 container.
  * Naomi reported a new feature that Theo Larrieu added to the
    electronic logbook that allows selection of multiple images to be
    uploaded at one time. This has worked for her. Mark will check to
    see if Mark Dalton has used the feature.
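
As an illustration of the distribution differences mentioned in the Python 3 item above, a build script could probe which interpreter and SCons command names are actually on the PATH before choosing one. This is a sketch only and is not the scheme Mark described: the candidate names in CANDIDATES are assumptions (they vary by distribution and by installed packages), and probe_commands and first_available are hypothetical helpers.

```python
import shutil
from typing import Callable, Dict, Iterable, Optional

# Assumed candidate names; which of these exist differs between distributions.
CANDIDATES = ("python", "python2", "python3", "scons", "scons-2", "scons-3")


def probe_commands(
    names: Iterable[str] = CANDIDATES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Optional[str]]:
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: which(name) for name in names}


def first_available(
    names: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first name in `names` that is found on PATH, else None."""
    for name in names:
        if which(name) is not None:
            return name
    return None
```

For example, first_available(("python3", "python")) prefers an explicit Python 3 interpreter and falls back to whatever a bare python resolves to. Printing probe_commands() on each platform gives a quick view of how the distributions differ.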

      Report from the Last HDGeant4 Meeting

We went over the minutes from the meeting on August 11 
without significant comment.

      Report from SciComp Meeting, August 6

Mark showed a slide 
summarizing items from the meeting. The big message is that Scientific 
Computing is moving to substantial support of the OSG at JLab, including 
contributing compute resources. Details of the plan are not known at 
present, but they are working on getting all of the component pieces 

      Restoration of Execution Tests for Pull Request Builds

Sean described bringing back execution of binaries, 
e.g., hd_root, as part of the automatic pull request test procedure. 
This has been broken for many moons. In the process, Mark improved 
the environment set-up procedure. We need to do similar tests for 
hdgeant4 next. Sean will look into it.

      Review of recent issues and pull requests

Sean discussed halld_recon Pull Request #432 
<https://github.com/JeffersonLab/halld_recon/pull/432>, "Suppress 
geometry-related warning messages". At present, merging is waiting on a 
new version of JANA to appear in a future version set. This is on Mark's 

      Action Item Review

 1. Ask David about JLab's Overleaf license (Mark, done)
 2. Create an FAQ on the zombie ProofLite solution (Mark)
 3. Ask Mark D. about multiple file uploads with the electronic logbook
    (Mark)
 4. HDGeant4 run-time testing for pull requests (Sean)
 5. Create a version set with a new version of JANA sourced from GitHub

  * This page was last modified on 18 August 2020, at 18:50.
