[Halld-offline] Data Challenge Meeting, April 4, 2014
Mark Ito
marki at jlab.org
Sat Apr 5 18:59:30 EDT 2014
Folks,
Find the minutes below and at
https://halldweb1.jlab.org/wiki/index.php/GlueX_Data_Challenge_Meeting,_April_4,_2014#Minutes
.
-- Mark
_____________________________________________
GlueX Data Challenge Meeting, April 4, 2014
Minutes
Present:
* CMU: Paul Mattione, Curtis Meyer
* FSU: Volker Crede, Priyashree Roy, Aristeidis Tsaris
* IU: Kei Moriya
* JLab: Mark Ito (chair), Sandy Philpott, Simon Taylor
* MIT: Justin Stevens
* NU: Sean Dobbs
* UConn: Richard Jones
Announcements
* Sean went through [30]his email describing a Python script to
compare monitoring_hists with standard distributions. Justin had
tried it with success. Justin also noted that a look at the thrown
beam photon energy plot gives a quick check that the number of
events in the job is correct. The script is now linked from the
[31]conditions page.
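The two checks mentioned above (the event count read off the thrown beam photon energy plot, and a comparison against a standard distribution) can be sketched in a few lines. This is not Sean's script, whose contents are not described in these minutes. It is a minimal illustration that assumes the histogram is available as a plain list of bin contents, that each event contributes exactly one entry, and that nothing falls in underflow or overflow. The function names are hypothetical.

```python
def event_count_ok(bin_contents, expected_events):
    """Return True if the histogram holds exactly one entry per expected event.

    Assumption: one thrown beam photon per event, no under/overflow entries.
    """
    return sum(bin_contents) == expected_events


def chi2_per_bin(observed, reference):
    """Shape comparison: Pearson chi-square per used bin.

    The reference histogram is scaled to the observed total first, so only
    the shape is compared, not the normalization.
    """
    if len(observed) != len(reference):
        raise ValueError("histograms must have the same binning")
    n_obs, n_ref = sum(observed), sum(reference)
    if n_obs == 0 or n_ref == 0:
        raise ValueError("empty histogram")
    scale = n_obs / n_ref
    chi2, used = 0.0, 0
    for obs, ref in zip(observed, reference):
        expected = ref * scale
        if expected > 0:  # skip bins where the reference predicts nothing
            chi2 += (obs - expected) ** 2 / expected
            used += 1
    return chi2 / used
```

A value of `chi2_per_bin` near or below 1 would suggest the shapes agree within statistical fluctuations. Any threshold for flagging a job is a choice left to the user.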
Data Challenge 2 Event Tally Board
We took a quick look at the [32]tally board. We are currently up to 1.5
gigaevents.
OSG Update
Richard brought us up to speed on the OSG effort.
We are still in amber mode, as opposed to green, on the OSG, and some
throttling of our jobs is being done. Still, we have seen a peak of
10,000 cores devoted to this data challenge, as shown on a [33]recent
graph of running jobs for GlueX from the [34]UConn OSG/GlueX status
site. Richard also showed us a [35]plot of "idle" jobs, i.e., those
queued for running. It shows an effect where jobs are accepted and
then quickly fail at some sites where the installed run-time libraries
are incompatible with our software stack. Richard is going through and
eliminating these types of problems. When we have a configuration
consistent with all contributing sites, we will ask to be flipped to
green. That should happen over the next few days.
UConn is contributing about 400 cores to the OSG currently.
Northwestern is contributing about 250 cores.
A rough estimate puts the event count from the OSG so far at about
that of all of the other sites combined.
We agreed that until production turns on fully on the OSG, we will
continue in our current mode at the other sites and re-evaluate plans
when the situation with the OSG changes.
Site Round-Up
We had brief reports on production at the various sites.
JLab
Mark showed the latest [36]plot of number of running jobs versus time.
Sandy gave a clarification of the core count on the batch farm. We now
have roughly 3400 cores in total, of which the Hall D share is 2400. The
nominal core count for the farm as a whole is 1400; the surplus comes
from the loan from LQCD.
MIT
Justin took us through his [37]wiki page.
The activity at MIT related to this data challenge, both there and on
FutureGrid, will be reported at the OSG all-hands meeting.
CMU
Paul reports that all is well in Pittsburgh, with 20 million events
produced in the last two days. Production continues smoothly.
FSU
Aristeidis reports that production continues, with 5 million events
produced since Wednesday. He suspects that progress should be faster
and is looking into possible impediments.
Core-Hour Credit at Jefferson Lab
We discussed how to spend our credit with the LQCD farm. Currently we
have used about one third of the million core-hours. SciComp has told
us that a late summer data challenge would mesh with a seasonal slack
period on the LQCD cluster, and that at that time a fresh loan to us
would be easy to do. They are fine with us burning it all for this data
challenge. Curtis thought that a tape-resident-data driven challenge
should be started sooner, in which case the credit might come in handy.
We will have to revisit this issue as the picture becomes clearer.
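As a rough illustration of the numbers above, the remaining credit and how long it would last can be worked out as follows. The minutes do not say how the credit is charged. This sketch assumes it is drawn down continuously by the loaned cores, and takes the loan to be the difference between the farm's current total (3400) and its nominal size (1400). The function names are hypothetical.

```python
TOTAL_CREDIT = 1_000_000    # core-hours, from the minutes
USED_FRACTION = 1 / 3       # "about one third" used so far
LOANED_CORES = 3400 - 1400  # assumption: the whole surplus is the LQCD loan


def remaining_core_hours(total, used_fraction):
    """Core-hours left after the given fraction has been spent."""
    return total * (1 - used_fraction)


def days_until_exhausted(core_hours, cores):
    """Days of continuous running on `cores` cores that `core_hours` buys."""
    return core_hours / cores / 24


remaining = remaining_core_hours(TOTAL_CREDIT, USED_FRACTION)
days_left = days_until_exhausted(remaining, LOANED_CORES)
print(f"{remaining:,.0f} core-hours left, about {days_left:.1f} days")
```

Under these assumptions, roughly 667,000 core-hours remain, which is about two weeks of continuous running on the loaned cores.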
References
30. https://mailman.jlab.org/pipermail/halld-offline/2014-April/001641.html
31. https://halldweb1.jlab.org/data_challenge/02/conditions/data_challenge_2.html
32. https://docs.google.com/spreadsheets/d/1qvF9B-76gr8NdsTKsO17jqL0qc5OXqK46JluvXnJ98k/edit?usp=sharing
33. https://halldweb1.jlab.org/wiki/images/8/86/Osg_running_2014-04-05.png
34. http://gryphn.phys.uconn.edu/vofrontend/monitor/frontendStatus.html
35. https://halldweb1.jlab.org/wiki/images/7/74/Osg_idle_2014-04-05.png
36. https://halldweb1.jlab.org/wiki/images/3/3c/Jobs_gluex_04-04.png
37. https://halldweb1.jlab.org/wiki/index.php/MIT/FutureGrid_Data_Challenge_2_Production#Update_4.2F14.2F14
--
Mark M. Ito, Jefferson Lab, marki at jlab.org, (757)269-5295