[Halld-offline] Data Challenge Meeting Minutes, February 7, 2014
Mark Ito
marki at jlab.org
Mon Feb 10 13:35:37 EST 2014
People,
Find the minutes below and at:
https://halldweb1.jlab.org/wiki/index.php/GlueX_Data_Challenge_Meeting,_February_7,_2014#Minutes
.
-- Mark
_________________________________________________________________
GlueX Data Challenge Meeting, February 7, 2014
Minutes
Present:
* CMU: Curtis Meyer, Paul Mattione
* FSU: Volker Crede, Aristeidis Tsaris
* IU: Kei Moriya
* JLab: Eugene Chudakov, Mark Ito (chair), Sandy Philpott, Simon
Taylor, Beni Zihlmann
* MIT: Justin Stevens
* NWU: Sean Dobbs
* UConn: Richard Jones
Agreed upon Parameters
We reviewed the parameters listed in the [25]agenda above. There was no
discussion and no changes were proposed.
Status of Preparations
JLab CC is ready; what do we need to tell them?
Sandy gave a report on preparations by JLab Scientific Computing.
* The current farm is at about 1200 cores. The farm is typically at
1400 cores. For the 25% level test we will need 1250 cores. The
plan is to bring another 1000 cores over from LQCD to keep the farm
generally usable.
* One option is to end our data challenge at a power outage that is
planned for some time during the next 6 to 8 weeks.
* We estimate that we will need a full complement of nodes for about
2 weeks.
* Lattice nodes are available to us because Physics has been lending
32 16-core nodes to the LQCD farm all of December and January.
* It looks like the SRM capability is not a pressing problem for
running this challenge at JLab, but does need to be addressed in
the medium term. This issue will be discussed at the upcoming
collaboration meeting.
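As a rough check of the core counts quoted in this report, the
arithmetic can be sketched as below. The variable names are
illustrative only. The sketch assumes that the cores borrowed from LQCD
simply add to the current farm size; that is an assumption made here,
not a statement from Scientific Computing.

```python
# Core counts quoted in the report (all approximate).
current_farm_cores = 1200    # farm size at the time of the meeting
challenge_cores = 1250       # needed for the 25% level test
borrowed_lqcd_cores = 1000   # planned transfer from LQCD

# Physics nodes lent to the LQCD farm in December and January.
lent_nodes = 32
cores_per_node = 16
lent_cores = lent_nodes * cores_per_node

# Assumption: the borrowed cores simply add to the current farm.
total_cores = current_farm_cores + borrowed_lqcd_cores
cores_left_for_others = total_cores - challenge_cores

print(f"cores lent to LQCD:           {lent_cores}")
print(f"total cores during challenge: {total_cores}")
print(f"cores left for other users:   {cores_left_for_others}")
```

Under that assumption the farm would have about 2200 cores during the
challenge, leaving roughly 950 for other users.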
Update on phi=0 geometry issues in CDC? Simon/Richard
Simon sent a plot to Richard illustrating the problem. Beni has also
done a study with lead CDC straws and geantinos in which he observed
disappearing straws under certain circumstances. Richard has been able
to reproduce the problem; it appears to involve the transition in the
wrap-around from 359 going back to 0 degrees. He is studying the
problem now.
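For readers unfamiliar with this class of problem, the sketch below
shows the generic pitfall of comparing azimuthal angles across the
359-to-0 degree boundary. It is not taken from the GlueX geometry or
simulation code and is not a diagnosis of the CDC issue; the function
names are hypothetical and chosen only for illustration.

```python
def wrap_deg(phi):
    """Map any angle in degrees into the range [0, 360)."""
    return phi % 360.0


def delta_phi_deg(phi1, phi2):
    """Signed difference phi1 - phi2 in degrees, folded into [-180, 180)."""
    return (phi1 - phi2 + 180.0) % 360.0 - 180.0


def in_phi_range(phi, phi_min, phi_max):
    """True if phi lies in the sector running from phi_min up to phi_max.

    Sectors that cross 0 degrees (for example 359 to 1) are handled
    explicitly; a naive 'phi_min <= phi <= phi_max' test misses them.
    """
    phi, lo, hi = wrap_deg(phi), wrap_deg(phi_min), wrap_deg(phi_max)
    if lo <= hi:
        return lo <= phi <= hi
    # The sector wraps around through 0 degrees.
    return phi >= lo or phi <= hi


# A naive comparison fails for a sector that straddles phi = 0.
naive = 359.0 <= 0.5 <= 1.0
wrapped = in_phi_range(0.5, 359.0, 1.0)
print(naive, wrapped)
```

Here the naive test rejects an angle of 0.5 degrees for the sector from
359 to 1 degrees, while the wrap-aware test accepts it.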
Random number seeds procedure?
No progress to report.
Electromagnetic Backgrounds update
Kei showed [26]slides detailing studies he has done on the computing
resources needed to generate EM background under various conditions. He
tried two beam rates, corresponding to 10^7 and 10^8, and two time
intervals, 400 ns and 800 ns. In addition to these four combinations he
ran with no beam background at all. For the comparison between no
background and 10^7 with an 800 ns gate, the output file size increases
by about 10% but the execution time for the job (including bggen,
hdgeant, mcsmear, and hd_root) nearly triples. The increase is mainly
in hdgeant, as expected. He also did comparisons of the number of
neutral showers, and deposited energy in the BCAL and FCAL separately.
Paul was running at CMU with the recently defined versions (see below)
and was seeing a third of his 10,000-event[?] jobs crash in
reconstruction.
Mark is also seeing crashes with these versions at JLab even though
there is no EM background in his jobs. He showed a [27]wiki page where
the success rate was only about 20% for 50,000-event runs with
parameters unchanged from the first data challenge.
We had expected that seg faults in the reconstruction had been largely
eliminated in recent weeks, so these crashes were a surprise to many of
us.
All aspects of the code and how it is configured need to be examined.
Check on event genealogy
No progress to report.
Preparations of standard distribution/scripts
Mark prepared tagged versions of HDDS and sim-recon yesterday and put
up the [28]webpage for distribution of all relevant software versions
and run-time configuration files. These were mentioned in the
discussion above.
Report on data management practices
Sean went over a quick survey he has done on various data management
systems. Richard had asked him to do so at the last meeting. See
[29]his slides for details. He described two systems in detail:
1. a custom database with xrootd/SRM
2. DIRAC toolkit
In general he found documentation less than comprehensive. Also the
ease of modifying systems for our purposes was not clear. We probably
should postpone adoption of such a system until after this data
challenge.
Open Stack Overview
Justin described an effort at MIT to operate a cluster where users
instantiate a virtual machine of their choice with their desired
software stack when running on a node. See [30]his wiki page for
details. The cluster used to develop this system currently stands at 22
blades with 8 cores each. The project is looking for users. Justin
asked the group if this was something that we would like to pursue; he
volunteered to implement the data challenge jobs as a demo and as an
additional production site if things go well. We all thought it was a
great idea.
Workflow Tools at JLab
Mark reported that he talked to Chris Larrieu of SciComp and they are
not ready to deploy a system in time for this data challenge. Curtis
suggested that we invite Chris to future data challenge meetings.
Proposed Schedule
We endorsed the schedule that Curtis proposed, reproduced below:
* Launch of Data Challenge Thursday, February 27, 2014 (est.).
* Test jobs going successfully by Tuesday February 25.
* Distribution ready by Monday February 24.
Action Items
1. Look at the phi = 0 efficiency hole. -> Richard
2. Understand random number seed saving and retrieval. -> Mark
3. Test new genealogy scheme. -> Kei
4. Settle on a realistic gate time for EM background. -> Kei
5. Invite Chris Larrieu to future data challenge meetings. -> Mark
6. Track the source of crashes in the reconstruction. -> All
References
25. https://halldweb1.jlab.org/wiki/index.php/GlueX_Data_Challenge_Meeting,_February_7,_2014#Agenda
26. https://halldweb1.jlab.org/wiki/images/3/30/2014-02-07-DataChallenge2_EMrates.pdf
27. https://halldweb1.jlab.org/wiki/index.php/Quick_Look_at_DC_2
28. https://halldweb1.jlab.org/data_challenge/02/conditions/data_challenge_2.html
29. https://halldweb1.jlab.org/wiki/images/6/6a/DataChallenge-20140207.pdf
30. https://halldweb1.jlab.org/wiki/index.php/Openstack_at_MIT_Overview
--
Mark M. Ito, Jefferson Lab, marki at jlab.org, (757)269-5295