[Halld-offline] Data Challenge Meeting Minutes, March 14, 2014
Mark Ito
marki at jlab.org
Fri Mar 14 20:20:54 EDT 2014
Challengistas,
Find the minutes below and at
https://halldweb1.jlab.org/wiki/index.php/GlueX_Data_Challenge_Meeting,_March_14,_2014#Minutes
-- Mark
___________________________________________________________________
GlueX Data Challenge Meeting, March 14, 2014
Minutes
Present:
* CMU: Paul Mattione, Curtis Meyer
* FSU: Volker Crede, Priyashree Roy, Aristeidis Tsaris
* IU: Kei Moriya
* JLab: Sergey Furletov, Mark Ito (chair), David Lawrence, Sandy
Philpott, Dmitry Romanov, Simon Taylor, Beni Zihlmann
* MIT: Justin Stevens
* NU: Sean Dobbs
* UConn: Richard Jones, James McIntyre, Brendan Pratt
Note that we used BlueJeans for telecommunications for this meeting.
It was not terrible.
Random Number Seeds Procedure
Curtis described [44]his proposal for a way to handle random number
seeds to allow reproducing results even in a multi-threaded
environment.
Mark showed [45]David's message describing the syntax for setting the
random number seed for mcsmear on the command line. He also mentioned
that David has implemented a switch for mcsmear to have it use the two
hdgeant seeds of the incoming event, along with the number 137, as the
three random number seeds for each event in mcsmear, roughly consistent
with Curtis's proposal. This should be ported to the branch.
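As a rough illustration of the per-event seeding described above, here is a
minimal Python sketch. It is not the mcsmear code: the helper names, the way
the three seeds are packed into one integer, and the made-up events are all
assumptions for illustration only. It shows why seeding each event from its
own two hdgeant seeds plus the constant 137 makes the smearing independent of
processing order, and hence of thread scheduling.

```python
import random

MCSMEAR_CONSTANT = 137  # third seed named in the minutes


def seed_triplet(hdgeant_seed1, hdgeant_seed2):
    """Return the three seeds for one event: the two hdgeant seeds plus 137."""
    return (hdgeant_seed1, hdgeant_seed2, MCSMEAR_CONSTANT)


def event_rng(triplet):
    """Build a generator seeded only by the event's triplet (hypothetical helper)."""
    s1, s2, s3 = triplet
    # Pack three 32-bit values into one integer seed. This packing is an
    # illustrative choice, not what mcsmear does internally.
    combined = ((s1 & 0xFFFFFFFF) << 64) | ((s2 & 0xFFFFFFFF) << 32) | (s3 & 0xFFFFFFFF)
    return random.Random(combined)


def smear(value, sigma, triplet):
    """Gaussian-smear one value using the event's own generator."""
    return event_rng(triplet).gauss(value, sigma)


# Made-up events: (event number, hdgeant seed 1, hdgeant seed 2, true value)
events = [(1, 12345, 67890, 1.50), (2, 22222, 33333, 0.75), (3, 98765, 43210, 2.10)]

# Process the same events in two different orders, as different threads might.
forward = {n: smear(v, 0.05, seed_triplet(a, b)) for n, a, b, v in events}
backward = {n: smear(v, 0.05, seed_triplet(a, b)) for n, a, b, v in reversed(events)}

print(forward == backward)  # True: each event's result depends only on its seeds
```

With a single generator shared across events, the result for a given event
would instead depend on how many random numbers had been drawn before it.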
Mark reminded us of the [46]seed procedure we used last year.
Mark proposed that we follow the same procedure as last year, only this
time use a fixed seed triplet for mcsmear. No further code development
would be needed.
Richard thought we should go ahead and fully implement Curtis's scheme.
He also volunteered to make the needed modifications to bggen, hdgeant,
and the data model to support it. We agreed to have him go ahead and make
the changes for this challenge.
Short File Issue
Mark ran 2000 jobs yesterday, 1000 events each, with compression turned
back on, and did not see any short files. Richard thought that that was
not enough to see the problem with the current code (100 jobs of
100,000 events each might give 2 short files). This is much less than
we saw a couple of weeks ago.
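Richard's point can be made roughly quantitative. The sketch below assumes,
purely for illustration, that short files occur at a fixed rate per generated
event and follow Poisson statistics; neither assumption was stated at the
meeting. Under those assumptions, his estimate of 2 short files in 100 jobs of
100,000 events implies an expectation well below one for Mark's test.

```python
import math

# Richard's estimate from the minutes: ~2 short files in 100 jobs x 100,000 events.
ref_events = 100 * 100_000
ref_short_files = 2

# Mark's test: 2000 jobs x 1000 events.
test_events = 2000 * 1000

# Assumption (not from the minutes): the short-file rate scales with event count.
expected = ref_short_files * test_events / ref_events
p_none = math.exp(-expected)  # Poisson probability of seeing zero short files

print(f"expected short files: {expected:.2f}")       # 0.40
print(f"probability of seeing none: {p_none:.2f}")   # 0.67
```

If the failure rate instead depends on file size or job length, the expected
count for the test with short jobs would be different, most likely lower still.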
Richard proposed that if a fix is not found soon, we run without
compression. It would simplify the job handling procedure and a short
file might be correlated with some other less obvious type of data
corruption. Justin and Mark both reported that this increases the REST
file size by about a factor of two. Richard will continue to work on a
fix, but we will not hold up the start for it.
Mark promised to copy David's compression on/off switch to the trunk
from the branch.
Non-Reproducible Results
There has been a lot of activity around this topic. For details find
the email traffic [47]here.
Simon checked in a fix for t0 variation in wire-based tracking
yesterday.
Mark reported non-reproducibility even after this fix. Sean has not
seen problems but did not do a lot of trials. Paul has seen differences
in multiple trials on the same smeared data, and found that those
results are of only two classes with all results within a class
identical. Paul has traced the cause of the difference to a single event;
he has been looking at a particular file he distributed some time ago.
Beni confirms that this event is the culprit.
Work continues on this issue.
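One way to sort repeated trials into classes of identical results, as in
Paul's observation above, is to hash each output file and group by digest.
The Python sketch below is an assumption about how such a check could be done,
not the procedure anyone at the meeting used; the helper names and the
stand-in files are hypothetical, and it treats "identical" as byte-identical.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def group_identical(paths):
    """Group files into classes of byte-identical content."""
    classes = defaultdict(list)
    for p in paths:
        classes[file_digest(p)].append(p)
    return list(classes.values())


# Demo with stand-in files; real use would pass the output file of each trial.
with tempfile.TemporaryDirectory() as d:
    contents = [b"result A", b"result B", b"result A", b"result A", b"result B"]
    paths = []
    for i, data in enumerate(contents):
        p = os.path.join(d, f"trial_{i}.out")
        with open(p, "wb") as f:
            f.write(data)
        paths.append(p)
    classes = group_identical(paths)
    class_sizes = sorted(len(c) for c in classes)
    print(len(classes), class_sizes)  # 2 [2, 3]
```

Output files that embed timestamps or other run-specific metadata would need
a content-level comparison instead of a byte-level hash.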
Running Jobs at FSU
Aristeidis summarized recent running on the cluster at FSU.
He contrasted two sets of running conditions, one with the [48]branch
as of last Friday and the other with the [49]2.4 tag. Memory usage is
low and stable with the branch, but the REST files are of two different
sizes. With the tag there is about 10% memory variation job to job, but
all files are the same size. The files are also a factor of 2 bigger
[no compression?].
Nodes at JLab
Mark described Sandy's summary of the [50]current activity and future
potential for addition of nodes to the physics batch farm for our data
challenge. One of several numbers: we have 911,000 core-hours "in the
bank" due to the "loan" of nodes from the physics batch farm to the
LQCD cluster over the past few months.
Electromagnetic Background Update
Kei presented slides describing the current state of his [51]studies on
using the data challenge software configuration for the pπ^+π^- and
pπ^+π^-π^0 final states. Slides covered:
* file sizes
* CPU and virtual memory usage
* reaction selection
* properties of reconstructed events
* momentum vs. polar angle for protons
* momentum vs. polar angle for π^0s
* effects of kinematic-fit-confidence-level cuts on event spectra and
yield
Run number assignments
Mark pointed out recent changes to the [52]run conditions page.
He proposed that we reduce the fraction of high EM events that we aim for
from 15% to 5%, and raise the fraction of 1×10^7 events from 70% to 80%. We
thought
that that was reasonable.
Schedule
We agreed on noon, Thursday March 20 as the deadline for code and
configuration changes. Processing at the various sites can then proceed
as local schedules permit. In the meantime:
* email exchanges on relevant work are encouraged
* we will discuss status at Wednesday's offline meeting, following
Paul's suggestion
* we will hold open the possibility of an ad hoc meeting if needed
* Curtis asked that sites send CPU resource estimates to Mark by
Monday for compilation
Action Items
1. Implement random-number-seed-storage design in bggen, hdgeant, and
HDDM. Richard
2. Fix the reproducibility issue. All
3. Investigate the short REST file issue further. Richard
4. Send resource info to Mark. All site managers/honchos/cognoscenti
5. Confirm observation of differences in repeated runs with the 2.5
tag. Mark
6. Put compression on/off switch on the trunk. Mark
7. Bring David's seed-mcsmear-from-hdgeant-seed scheme to branch. Mark
8. Update the conditions page to reflect the reduction in high EM
event fraction. Mark
References
44. https://mailman.jlab.org/pipermail/halld-offline/2014-March/001561.html
45. https://mailman.jlab.org/pipermail/halld-offline/2014-March/001563.html
46. https://halldweb1.jlab.org/wiki/index.php/Random_number_seeds_for_Data_Challenge_1
47. https://mailman.jlab.org/pipermail/halld-offline/2014-March/thread.html#start
48. http://hadron.physics.fsu.edu/~aristeidis/dc2_3_11.pdf
49. http://hadron.physics.fsu.edu/~aristeidis/dc2_3_14.pdf
50. https://halldweb1.jlab.org/wiki/index.php/Nodes_at_JLab_for_Data_Challenge_2
51. https://halldweb1.jlab.org/wiki/images/4/4c/2014-03-14-DC2.pdf
52. https://halldweb1.jlab.org/data_challenge/02/conditions/data_challenge_2.html
--
Mark M. Ito, Jefferson Lab, marki at jlab.org, (757)269-5295