[Halld-offline] Offline Software Meeting Minutes, August 7, 2018
Mark Ito
marki at jlab.org
Tue Aug 7 20:19:56 EDT 2018
Folks,
Find the minutes below and here
<https://halldweb.jlab.org/wiki/index.php/GlueX_Offline_Meeting,_August_7,_2018>.
-- Mark
___________________
Minutes, GlueX Offline Meeting, August 7, 2018
Present:
* CMU: Curtis Meyer
* FIU: Mahmoud Kamel
* FSU: Sean Dobbs
* JLab: Alex Austregesilo, Thomas Britton, Mark Dalton, Stuart
Fegan, Mark Ito (chair), David Lawrence, Justin Stevens, Beni Zihlmann
The chairman neglected to hit the record button on BlueJeans.
Announcements
1. Reconstruction launch version set:
version_recon-2017_01-ver03_jlab.xml
<https://mailman.jlab.org/pipermail/halld-offline/2018-August/003306.html>.
The tag of sim-recon used in the reconstruction has been built on
five platforms.
2. Status of Recon Launch: Alex A.
* We are using the QCD12 boxes. The farm18 nodes are shown on
the SciComp webpage, but are not in active use yet.
* There are 700-800 jobs running simultaneously.
* There will be 300 to 400 more when the farm18 nodes are activated.
* We are 85% done with 2016 data; it will be done in 2 or 3 days.
* Spring 2017 reconstruction should take 15 to 20 days.
* There is a possible problem with cache disk space if we run more
jobs simultaneously: our pin quota is used up.
* David remarked that since we are copying the raw data to the
local disk first, they could be unpinned as soon as they are copied.
Review of minutes from the July 24 meeting
We went over the minutes
<https://halldweb.jlab.org/wiki/index.php/GlueX_Offline_Meeting,_July_24,_2018#Minutes>.
NERSC Update
David gave us an update.
* Chris Larrieu is back from vacation and has addressed some swif2 issues.
* Test of reconstruction of one run with 220 files ran into a
20-job-at-a-time limit imposed by swif2. The limit is motivated by
having only 1 TB of disk space at NERSC. More space than that is
needed to keep the pipe full.
* David has consulted with a Brookhaven physicist who has been working
with more space.
* Reserving an entire node is possible, but you have to "pay" in
advance for the time and it may be hard to get credit back for
failed jobs.
* David plans to move to a 20 TB "cache" disk (with a file lifetime
limit).
* The plan is to try a monitoring launch over Spring 2018 data first.
* Sean asked about what software tag was going to be used. He
cautioned that there is CDC reconstruction code that should be added
to augment the code being used for the current reconstruction launch.
* Alex cautioned that the monitoring launch uses many more plugins
than are used in reconstruction launches. More memory may be required.
Splitting up Sim-Recon: Aftermath
Mark led us through the announcement of the split
<https://mailman.jlab.org/pipermail/halld-offline/2018-July/003292.html>
performed Monday, July 30 and a wiki page he wrote
<https://halldweb.jlab.org/wiki/index.php/Converting_sim-recon_tags_and_branches_to_the_split_repositories>
describing how to recover branches and tags from the sim-recon
repository when using the new halld_recon and halld_sim repositories.
Items that still need to be addressed:
1. The use of the HALLD_MY directory needs to be revisited with the
split repositories.
2. A procedure for recovering tagged versions of sim-recon and
deploying them in the split repositories needs to be developed.
3. The automatic builds triggered by pull requests need to be
implemented on the new repositories.
HDGeant4 issues
We reviewed the recent pull requests from Richard Jones fixing separate
issues in the FDC simulation, one in HDGeant (GEANT 3) and the other in
HDGeant4. See his comment, submitted today, on HDGeant4 Issue #54
<https://github.com/JeffersonLab/HDGeant4/issues/54>. Corresponding pull
requests to the halld_sim and hdgeant4 repositories have been merged to
their respective master branches.
Review of recent pull requests
The title of Pull request #1180
<https://github.com/JeffersonLab/sim-recon/pull/1180> from David served
as a reminder to upbraid us for adding frustration to his workflow. The
issue is respect (or rather disrespect) for a mechanism for building
sim-recon (at the time of the request) without all of the packages we
build, i.e., a mechanism for having optional packages. Whether a
package is optional or not is signaled by the absence or presence of the
home environment variable for the package. When collaborators do not
respect this convention, David is stuck either building the suddenly
non-optional package or coding the mechanism in himself.
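
As a rough illustration of the convention described above, the following is a minimal sketch in Python (the language of SCons build scripts). It is not taken from SBMS; the package name, the variable name EXAMPLE_PKG_HOME, and the helper function are hypothetical, and the only assumption carried over from the minutes is that a package is built in when its home environment variable is set.

```python
import os

# Hypothetical mapping from optional package name to its home variable.
# These names are placeholders for illustration, not the real SBMS list.
OPTIONAL_PACKAGE_HOME_VARS = {
    "example_pkg": "EXAMPLE_PKG_HOME",
}


def enabled_optional_packages(environ=None, home_vars=OPTIONAL_PACKAGE_HOME_VARS):
    """Return {package: home_dir} for packages whose home variable is set."""
    if environ is None:
        environ = os.environ
    enabled = {}
    for package, var in home_vars.items():
        home = environ.get(var)
        if home:  # unset or empty: build without this package
            enabled[package] = home
    return enabled
```

Code that depends on such a package would then be guarded by a check like `"example_pkg" in enabled_optional_packages()`, so that the build still succeeds when the variable is absent.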
David has looked into the idea of having build "flavors": configurations
of the build with optional packages explicitly identified. That takes
the configuration out of the shell environment. More generally, he thinks
that we may be due for refactoring the SCons build system (SBMS) in any
case.
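
One way to picture the "flavors" idea is as a named table of configurations held in the build scripts themselves. The sketch below is only an assumption about what such a table might look like; the flavor names, package names, and the helper function are all hypothetical and do not reflect any design David has actually proposed.

```python
# Hypothetical flavor table: each flavor explicitly lists the optional
# packages it includes, independent of the shell environment.
FLAVORS = {
    "full": {"optional_packages": ["example_pkg_a", "example_pkg_b"]},
    "minimal": {"optional_packages": []},
}


def packages_for_flavor(name, flavors=FLAVORS):
    """Return a copy of the optional package list for the named flavor."""
    try:
        return list(flavors[name]["optional_packages"])
    except KeyError:
        raise ValueError(f"unknown build flavor: {name!r}") from None
```

With a table like this, the set of optional packages is chosen by naming a flavor rather than by which environment variables happen to be defined.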
Review of recent discussion on the GlueX Software Help List
We looked at recent posts
<https://groups.google.com/forum/#%21forum/gluex-software>.
* None of those present, other than Mark, have experienced the halld
web authentication error (401).
* The cause of the g++ internal compiler error, on random source code
files, during the single-threaded build of hdgeant4 on the ifarm
machines and on no others is still a mystery.
Retrieved from
"https://halldweb.jlab.org/wiki/index.php?title=GlueX_Offline_Meeting,_August_7,_2018&oldid=88570"
* This page was last modified on 7 August 2018, at 20:16.