<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Folks,</p>
<p>Please find the minutes <a moz-do-not-send="true"
href="https://halldweb.jlab.org/wiki/index.php/GlueX_Software_Meeting,_April_27,_2021#Minutes">here</a>
and below.</p>
<p> -- Mark</p>
<p> ______________________________________________</p>
<p>
</p>
<div id="globalWrapper">
<div id="column-content">
<div id="content" class="mw-body" role="main">
<h2 id="firstHeading" class="firstHeading" lang="en"><span
dir="auto">GlueX Software Meeting, April 27, 2021, </span><span
class="mw-headline" id="Minutes">Minutes</span></h2>
<div id="bodyContent" class="mw-body-content">
<div id="mw-content-text" dir="ltr" class="mw-content-ltr"
lang="en">
<p>Present: Alexander Austregesilo, Edmundo Barriga,
Thomas Britton, Sean Dobbs, Mark Ito (chair), Igal
Jaegle, Naomi Jarvis, Simon Taylor, Nilanga
Wickramaarachchi, Jon Zarling, Beni Zihlmann
</p>
<p>There is a <a rel="nofollow" class="external text"
href="https://bluejeans.com/s/kxr2tk1kaQB/">recording
of this meeting</a>. Log into the <a rel="nofollow"
class="external text"
href="https://jlab.bluejeans.com">BlueJeans site</a>
first to gain access (use your JLab credentials).
</p>
<h3><span class="mw-headline" id="Announcements">Announcements</span></h3>
<ol>
<li> <a rel="nofollow" class="external text"
href="https://mailman.jlab.org/pipermail/halld-offline/2021-April/008511.html">version
set 4.37.1 and mcwrapper v2.5.2</a>. Thomas
described the changes in the latest version of
MCwrapper. Luminosity is now used to normalize the
number of events to produce for each run number
requested.</li>
<li> New: <a
href="https://halldweb.jlab.org/wiki/index.php/HOWTO_copy_a_file_from_the_ifarm_to_home"
title="HOWTO copy a file from the ifarm to home">HOWTO
copy a file from the ifarm to home</a>. Mark pointed
us to the new HOWTO. Sean told us that one could do
the same thing from <a class="moz-txt-link-abbreviated" href="ftp://ftp.jlab.org">ftp.jlab.org</a> without having to set
up an ssh tunnel. Mark will make the appropriate
adjustments to the documentation.</li>
<li> To come: <a
href="https://halldweb.jlab.org/wiki/index.php/HOWTO_use_AmpTools_on_the_JLab_farm_GPUs"
title="HOWTO use AmpTools on the JLab farm GPUs">HOWTO
use AmpTools on the JLab farm GPUs</a>. Alex
described his HOWTO (still under construction).</li>
<li> <a rel="nofollow" class="external text"
href="https://mailman.jlab.org/pipermail/halld-offline/2021-April/008517.html">work
disk full again</a>. Mark described the current work
disk crunch, including <a rel="nofollow"
class="external text"
href="https://halldweb.jlab.org/doc-public/DocDB/ShowDocument?docid=5082">plots
of recent usage history</a>. More clean-up will be
needed until the arrival of new work disk servers this
summer.</li>
<li> <a rel="nofollow" class="external text"
href="https://mailman.jlab.org/pipermail/halld-offline/2021-April/008523.html">RedHat-6-era
builds on group disk slated for deletion</a>. Mark
reminded us that the deletion of these builds has been
carried out.</li>
<li> <a rel="nofollow" class="external text"
href="https://mailman.jlab.org/pipermail/halld-offline/2021-April/008516.html">Bug
fix release of halld_recon: restore the REST version
number</a>. Mark reviewed the reason for the new
version sets.</li>
</ol>
<h3><span class="mw-headline"
id="Review_of_Minutes_from_the_Last_Software_Meeting">Review
of Minutes from the Last Software Meeting</span></h3>
<p>We went over the <a
href="https://halldweb.jlab.org/wiki/index.php/GlueX_Software_Meeting,_March_30,_2021#Minutes"
title="GlueX Software Meeting, March 30, 2021">minutes
from the meeting on March 30th</a>.
</p>
<ul>
<li> It turns out that there is no
pull-request-triggered test for HDGeant4. Mark has
volunteered to set one up à la the method Sean
instituted for halld_recon and halld_sim.</li>
<li> Some significant progress has been made on
releasing CCDB 2.0.
<ul>
<li> The unit tests for CCDB 1.0 have been broken
for some time. Mark and Dmitry Romanov found and
fixed a problem, related to cache access, with the
fetch of constants in the form
map&lt;string, string&gt;. This problem is likely
present in the CCDB 2.0 branch as well.</li>
<li> Dmitry has started on reviving the MySQL
interface for CCDB 2.0.</li>
<li> Dmitry has moved us to a new workflow for CCDB
pull requests.
<ul>
<li> Developers will fork the JeffersonLab/ccdb
repository to their personal accounts and work
on branches created there as they see fit.</li>
<li> When a change is ready, they will submit a
pull request back to the JeffersonLab/ccdb
repository for merging.</li>
<li> This workflow is common outside Hall D. For
example, Hall C uses it, as do many groups
outside the Lab. We may consider using it
within Hall D as well. It makes it easier to
put up safeguards against spurious errors from
inadvertent or faulty commits, and to support
any code-review mechanism we may want to have.
It also solves the problem of the confusing
proliferation of branches in the main
repository that we have seen. We could move to
it with no structural changes to the
repositories themselves.</li>
<li> Sean pointed out that such a workflow might
require minor changes to the
automatic-pull-request-triggered tests.</li>
</ul>
</li>
</ul>
</li>
</ul>
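<p>As a concrete illustration of the fork-based workflow, here is a
minimal sketch in shell. The branch name
<code>revive-mysql-interface</code> is made up for the example, and
two local bare repositories stand in for the GitHub copies of
JeffersonLab/ccdb, so the sketch runs entirely offline.</p>

```shell
# Local simulation of the fork-based pull-request workflow.
# "upstream.git" plays the JeffersonLab/ccdb repository and
# "fork.git" plays the developer's personal fork.
set -e
tmp=$(mktemp -d)
cd "$tmp"

# The main (upstream) repository, seeded with one commit.
git init --bare --quiet upstream.git
git clone --quiet upstream.git seed
git -C seed -c user.name=dev -c user.email=dev@example.com \
    commit --allow-empty -m "initial import"
git -C seed push --quiet origin HEAD

# The developer "forks" upstream, clones the fork, and keeps
# upstream as a second remote for staying up to date.
git clone --quiet --bare upstream.git fork.git
git clone --quiet fork.git work
git -C work remote add upstream "$tmp/upstream.git"

# Work happens on a topic branch in the fork, never in upstream.
git -C work checkout --quiet -b revive-mysql-interface
git -C work -c user.name=dev -c user.email=dev@example.com \
    commit --allow-empty -m "revive the MySQL interface"
git -C work push --quiet origin revive-mysql-interface

# A pull request would now be opened from the fork's branch back
# to upstream; here we just confirm the branch landed in the fork.
git -C fork.git branch --list 'revive*'
```

<p>In practice the fork is created once with GitHub's "Fork" button and
the pull request is opened through the web UI; only the clone, branch,
and push steps are run locally.</p>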
<h3><span class="mw-headline"
id="Minutes_from_the_Last_HDGeant4_Meeting">Minutes
from the Last HDGeant4 Meeting</span></h3>
<p>We went over the <a
href="https://halldweb.jlab.org/wiki/index.php/HDGeant4_Meeting,_April_6,_2021#Minutes"
title="HDGeant4 Meeting, April 6, 2021">minutes from
the HDGeant4 meeting on April 6th</a>. Sean noted that
the overall focus of the HDGeant4 group is to compare
Monte Carlo with data and, using the two simulation
engines at our disposal, G3 and G4, try to drill down to
see where differences arise at a basic physical level in
HDGeant4, and then adjust the model to get agreement
with data. This approach is preferred over one where
empirical correction factors are imposed as an
after-burner on the simulation.
</p>
<h3><span class="mw-headline"
id="Report_from_the_April_20th_SciComp_Meeting">Report
from the April 20th SciComp Meeting</span></h3>
<p>Mark presented <a rel="nofollow" class="external text"
href="https://halldweb.jlab.org/doc-public/DocDB/ShowDocument?docid=5081">slides</a>,
the first two reproducing Bryan Hess's agenda for
the meeting and the third summarizing some of the
discussion. Please see his slides for the details.
</p>
<p>Sean asked if we could prioritize recovery of certain
files over others. Mark will ask.
</p>
<h4><span class="mw-headline"
id="Handling_of_Recon_Launch_Output_from_Off-site">Handling
of Recon Launch Output from Off-site</span></h4>
<p>Alex raised the issue of disk use when bringing results
of reconstruction launches, performed off-site, back to
JLab. All data land on volatile and, after reprocessing,
get written to cache and from there to tape. He is
worried about this procedure for two reasons:
</p>
<ol>
<li> Data on volatile is subject to deletion (oldest
files get deleted first) and we do not want to lose
launch output to the disk cleaner.</li>
<li> The array of problems we have long seen with
Lustre disks; both volatile and cache are Lustre
systems.</li>
</ol>
<p>Mark showed a plot indicating that the amount of data we
have on volatile has been well under the deletion level
for months now. His claim was that premature deletion from
volatile has not been a problem for quite a while. Alex
did not think that the graph was accurate; it showed too
little variation in usage level given that Alex knows
there has been significant activity on the disk, an
argument that Mark found convincing. Mark will have to
check on the source of his data. That aside, disk usage
in this context should be reviewed.
</p>
<h4><span class="mw-headline"
id="Consolidation_of_Skim_Files_on_to_Fewer_Tapes">Consolidation
of Skim Files onto Fewer Tapes</span></h3>
<p>Sean has noticed that at times reprocessing skimmed
data can take a long time due to retrieval times of
files from tape. He suspects that this is because the
files are scattered on many tapes and so a large number
of tape mounts and file skips are needed to get all of
the data. He proposed a project where, for certain
skims, we rewrite the data onto a smaller number of
tapes.
</p>
<p>Mark had some comments:
</p>
<ul>
<li> We should only start such a project on skims for
which there is a reasonable expectation that
retrieval will be repeated in the future. The
consolidation step itself involves reading and writing
all of the files of interest, and so those files have
to be read at least a couple of times after
consolidation before the exercise shows a net gain.</li>
<li> The way we write data to tape, by putting skim
files on the write-through cache over several weeks,
guarantees that the files will be scattered across
different tapes. With the write-through cache we would
do better to buffer data on disk until a significant
fraction of one tape has been accumulated and then
manually trigger the write to tape.</li>
<li> It is possible to set up tape "volume sets" (sets
of specific physical tapes) in advance in the tape
library and then direct selected data types to
specific volume sets. The tapes in the volume sets
will then be dense in the data types so directed. This
is already done for raw data, and there is no
structural impediment to doing it for other types of
data. This approach has the advantage that there is
no need to develop software to make it happen.</li>
</ul>
<p>Something does have to be done on this front. Sean and
Mark will discuss the issue further.
</p>
<h3><span class="mw-headline"
id="ROOTWriter_and_DSelector_Updates">ROOTWriter and
DSelector Updates</span></h3>
<p>Jon presented a list of ideas and improvements for our
data analysis software. See <a rel="nofollow"
class="external text"
href="https://halldweb.jlab.org/wiki-private/index.php/ROOTWriter_and_DSelectorUpdates2021">his
wiki page</a> for the complete list.
</p>
<p>The items and subsequent discussion were in two broad
classes:
</p>
<ul>
<li> How we use the ROOT toolkit: Are there more
efficient practices? Are there features we don't
exploit but should?</li>
<li> How we analyze the data: Are there new features in
the Analysis Library that we should develop? Should
the contents of the REST format be expanded? Are there
things we do in Analysis that should be done in
reconstruction or vice-versa?</li>
</ul>
<p>One thing that came up was our use of TLorentzVector.
Jon has seen others use a smaller (member-data-wise)
class. Alex pointed out that the current ROOT
documentation has marked this class as deprecated. Yet
our use of TLorentzVector is ubiquitous. Several
expressed interest in looking into this more closely.
</p>
<p>Jon encouraged us to think about where we might want to
expend effort. This will likely come up again at a
future meeting.
</p>
<h3><span class="mw-headline"
id="Production_of_Missing_Random_Trigger_Files">Production
of Missing Random Trigger Files</span></h3>
<p>Sean reported that he and Peter Pauli are very close to
filling in all of the gaps in the random trigger file
coverage for Fall 2018. Peter may give a presentation on
this work at a future meeting.
</p>
<h3><span class="mw-headline" id="Action_Item_Review">Action
Item Review</span></h3>
<ol>
<li> Set up pull-request-triggered tests for HDGeant4.
(Mark)</li>
<li> Modify the documentation to feature <a class="moz-txt-link-abbreviated" href="ftp://ftp.jlab.org">ftp.jlab.org</a>.
(Mark)</li>
<li> Prioritize specific tapes to be recovered. (Mark)</li>
<li> Review disk usage when repatriating recon launch
data. (Alex, Mark)</li>
<li> Check input data for volatile usage plot. (Mark)</li>
<li> Make a plan for structuring tape writing for
efficient file retrieval. (Sean, Mark)</li>
<li> Look into how we use TLorentzVector. (Alex,
Simon, Jon)</li>
<li> Think about Jon's list of improvements. (all)</li>
</ol>
</div>
<br>
</div>
</div>
</div>
</div>
</body>
</html>