[Halld-offline] Offline Software Meeting Minutes, March 19, 2014
Mark Ito
marki at jlab.org
Wed Mar 19 19:14:06 EDT 2014
Folks,
Please find the minutes below and at
https://halldweb1.jlab.org/wiki/index.php/GlueX_Offline_Meeting,_March_19,_2014#Minutes
.
-- Mark
_______________________________________________________________________
GlueX Offline Meeting, March 19, 2014
Minutes
Present:
* CMU: Paul Mattione
* FSU: Aristeidis Tsaris
* IU: Kei Moriya
* JLab: Mark Ito (chair), Dmitry Romanov, Simon Taylor
* MIT: Justin Stevens
* NU: Sean Dobbs
* UConn: Alex Barnes
Review of Minutes from the Last Meeting
We looked at the minutes of the February 5th meeting. In particular, we
reviewed the features of Gagik's Tagged File System in preparation for
comparing and contrasting it with EventStore.
Porting EventStore to GlueX
Sean led us through what he has learned (and re-learned) about
EventStore. See his slides [31] for details. They covered:
* [Introduction to] EventStore
* Example Invocation
* Architecture for CLEO and GlueX
* CLEO Data Model and EventStore
* Data Life Cycle [stages of event sample refinement]
* Metadata [run selection criteria]
* Roadmap [for implementation]
* CLEO Data Lifecycle [as example]
* Other Things [version tags, metadata criteria for GlueX]
* Can It Work with the Grid?
There are still some questions about the mechanism used to index events
within a data file and about how much of the code can be ported to
GlueX. Use with the grid would likely take some development. We
encouraged Sean to proceed with his plan to implement a simple example
to understand the possible issues in detail.
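As a purely illustrative sketch of one way such an event index could be
organized, the C++ fragment below maps (run, event) pairs to a file and
byte offset. The type and field names here are assumptions for
discussion, not the actual EventStore schema.

    // Hypothetical per-event index entry; the real EventStore layout may differ.
    #include <cstdint>
    #include <map>
    #include <utility>

    struct EventLocation {
        uint32_t fileId;      // which data file holds the event
        uint64_t byteOffset;  // where the event record starts in that file
    };

    // Index keyed by (run, event): given run/event numbers selected by a
    // metadata query, look up where to read the event from disk.
    using EventIndex = std::map<std::pair<uint32_t, uint64_t>, EventLocation>;

    bool findEvent(const EventIndex& index, uint32_t run, uint64_t event,
                   EventLocation& loc)
    {
        auto it = index.find({run, event});
        if (it == index.end()) return false;
        loc = it->second;
        return true;
    }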
Data Challenge Meeting Report, March 14
We took only a cursory look at the minutes [32] from the last data
challenge meeting, since the remaining agenda items for this meeting
dealt with the open issues from that meeting.
Fix to Compression of REST-Formatted Files
Richard Jones checked in a change to fix the "short file" problem,
where writing to the output REST file would stop in the middle of
reconstruction even though event processing continued. There was
indeed a bug in the xstream library when compression was enabled, as
David and Simon had originally suspected.
Mark ran 5,000 jobs of 10,000 events each and saw no short files. (Note
that this also means that none of the jobs crashed for any other
reason.) We declared this problem fixed.
New Random Number Seed Scheme
Richard also checked in a change to the random number seed procedure.
Now in hdgeant, each incoming event is checked to see if it contains
seed information. If so, hdgeant's random number generator is reseeded
with data derived from the incoming information. This is essentially
the plan Curtis proposed at the last data challenge meeting. It gives
reproducible results even in a multi-threaded program where individual
events may go to different threads in repeated runs on the same input
data.
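As an illustration of the idea only (hdgeant itself is GEANT3-based and
its actual code differs), a minimal C++ sketch of per-event reseeding
might look like the following; the event structure and field names are
hypothetical.

    // Sketch of per-event reseeding, for illustration only.
    #include <cstdint>
    #include <optional>
    #include <random>

    struct Event {
        // Seed information carried with the input event, if any.
        std::optional<uint64_t> seed;
        // ... other event data ...
    };

    std::mt19937_64 rng;  // generator used while processing the event

    // Before processing each event, reseed the generator from the seed
    // carried with the event. Repeated runs then give identical results
    // for each event, regardless of which thread processes it.
    void processEvent(const Event& evt)
    {
        if (evt.seed) {
            rng.seed(*evt.seed);
        }
        // ... simulate/smear the event using rng ...
    }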
Richard also made some modifications to a similar scheme that David had
implemented in mcsmear, this time with mcsmear taking its cue from
information written out by hdgeant.
We noted that we did not see a change come in to modify bggen to write
seed information in its output events. That is necessary to fully
implement Curtis's proposal. We will ask Richard about this.
We decided that we can run this data challenge without this change to
bggen. David pointed out that since hdgeant is single-threaded at
present, an event-by-event seed is not necessary to ensure
reproducibility.
Non-Reproducible Reconstruction
There has been a lot of work on this problem since the last data
challenge meeting, but no fix has been found. The incidence appears
to be less than one part per thousand of events, and the differences
seen between repeated analyses are slight. These differences are also
limited to individual events; for the most part subsequent events in
the file give identical results. We decided that this problem should
not impact analysis of the resulting data and that we can live with it
for this data challenge.
Data Challenge Status: are we ready to freeze?
We confirmed that the deadline for changes/improvements is noon
Thursday, March 20 (tomorrow), but at present there is nothing
stopping us from freezing on the current version of the trunk.
Paul reminded us to make sure that the configuration files we are using
at the various sites are those checked into
trunk/data_challenge/02/conditions, and described at
https://halldweb1.jlab.org/data_challenge/02/conditions/data_challenge_2.html
.
He recently fixed the beam photon energy range to be consistent with
what was agreed on, and updated the particle.dat file used by bggen to
its most recent version.
Other Data Challenge Issues
Sean asked about two aspects that we have not quite nailed down.
1. Monitoring: what are we planning to do to monitor data integrity?
2. Data Distribution: how do we plan to distribute the resulting data
to data analyzers?
Although ideally these issues would have been dealt with by now, we
noted that we could start processing as long as we address them
early in the course of production. We agreed to discuss them further at
the...
Next Data Challenge Meeting
We will meet on Friday at 11:00 am, as the current schedule calls for.
Note that by that time the code and configuration files will have been
frozen, so in principle production may have started at the sites. This
meeting will be a chance to evaluate how things are going.
References
31. https://halldweb1.jlab.org/wiki/images/0/08/EventStore_GlueX.pdf
32. https://halldweb1.jlab.org/wiki/index.php/GlueX_Data_Challenge_Meeting,_March_14,_2014#Minutes
33. https://halldweb1.jlab.org/data_challenge/02/conditions/data_challenge_2.html
--
Mark M. Ito, Jefferson Lab, marki at jlab.org, (757)269-5295