[Halld-offline] Data Challenge Meeting, April 4, 2014

Mark Ito marki at jlab.org
Sat Apr 5 18:59:30 EDT 2014


Folks,

Find the minutes below and at

https://halldweb1.jlab.org/wiki/index.php/GlueX_Data_Challenge_Meeting,_April_4,_2014#Minutes

   -- Mark
_____________________________________________

GlueX Data Challenge Meeting, April 4, 2014
Minutes

    Present:
      * CMU: Paul Mattione, Curtis Meyer
      * FSU: Volker Crede, Priyashree Roy, Aristeidis Tsaris
      * IU: Kei Moriya
      * JLab: Mark Ito (chair), Sandy Philpott, Simon Taylor
      * MIT: Justin Stevens
      * NU: Sean Dobbs
      * UConn: Richard Jones

Announcements

      * Sean went through [30]his email describing a Python script to
        compare monitoring_hists with standard distributions. Justin had
        tried it with success. Justin also noted that a look at the thrown
        beam photon energy plot gives a quick check that the number of
        events in the job is correct. The script is now linked from the
        [31]conditions page.
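
    The two checks mentioned above can be sketched as follows. This is
    not Sean's script, only a minimal illustration. It assumes each
    histogram is available as a plain list of per-bin counts, that the
    thrown beam photon energy histogram gets exactly one entry per event
    with nothing in underflow or overflow, and that a simple Pearson-style
    chi-square per bin is an acceptable shape comparison. All function
    names and the threshold value are hypothetical.

```python
# Minimal sketch; names and threshold are hypothetical, not from the actual script.

def histogram_entries(bin_contents):
    """Total number of entries, given per-bin counts."""
    return sum(bin_contents)


def chi2_per_bin(observed, reference):
    """Compare shapes: scale the reference to the observed total, then
    return a Pearson-style chi-square divided by the number of bins used.
    Statistical uncertainty on the reference itself is ignored."""
    if len(observed) != len(reference):
        raise ValueError("histograms must have the same binning")
    n_obs = sum(observed)
    n_ref = sum(reference)
    if n_obs == 0 or n_ref == 0:
        raise ValueError("empty histogram")
    scale = n_obs / n_ref
    chi2 = 0.0
    bins_used = 0
    for obs, ref in zip(observed, reference):
        expected = ref * scale
        if expected > 0:  # skip bins where the reference is empty
            chi2 += (obs - expected) ** 2 / expected
            bins_used += 1
    return chi2 / bins_used


def check_job(thrown_energy_bins, reference_bins, expected_events,
              max_chi2_per_bin=2.0):
    """Return both checks for one job: event count and histogram shape."""
    n_events = histogram_entries(thrown_energy_bins)
    shape = chi2_per_bin(thrown_energy_bins, reference_bins)
    return {
        "event_count_ok": n_events == expected_events,
        "shape_ok": shape <= max_chi2_per_bin,
    }
```

    For example, check_job(bins, standard_bins, expected_events=n), where
    n is the number of events requested for the job, would flag a job that
    stopped early even if the shape of the distribution looks fine.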

Data Challenge 2 Event Tally Board

    We took a quick look at the [32]tally board; we are currently up to
    1.5 gigaevents.

OSG Update

    Richard brought us up to speed on the OSG effort.

    We are still in amber mode, as opposed to green, on the OSG, and some
    throttling of our jobs is being done. Still, we have seen a peak of
    10,000 cores devoted to this data challenge, as shown on a [33]recent
    graph of running jobs for GlueX from the [34]UConn OSG/GlueX status
    site. Richard also showed us a [35]plot of "idle" jobs, i.e., those
    queued for running. The plot shows an effect where jobs are accepted
    and then quickly fail at some sites where the installed run-time
    libraries are incompatible with our software stack. Richard is going
    through and eliminating these types of problems. When we have a
    configuration consistent with all contributing sites, we will ask to
    be flipped to green. That should happen over the next few days.

    UConn is contributing about 400 cores to the OSG currently.
    Northwestern is contributing about 250 cores.

    A rough estimate already puts the event count from the OSG at about
    that of all of the other sites combined thus far.

    We agreed that until production turns on fully on the OSG, we will
    continue in our current mode at the other sites and re-evaluate plans
    when the situation with the OSG changes.

Site Round-Up

    We had brief reports on production at the various sites.

JLab

    Mark showed the latest [36]plot of number of running jobs versus time.

    Sandy clarified the core count on the batch farm. We now have roughly
    3400 cores in total, of which the Hall D share is 2400. The nominal
    core count for the farm as a whole is 1400; the surplus comes from the
    loan from LQCD.

MIT

    Justin took us through his [37]wiki page.

    The activity at MIT related to this data challenge, both there and on
    FutureGrid, will be reported at the OSG all-hands meeting.

CMU

    Paul reports that all is well in Pittsburgh, with 20 million events
    produced in the last two days. Production continues smoothly.

FSU

    Aristeidis reports that production continues, with 5 million events
    since Wednesday. He suspects that progress should be faster and is
    looking into possible impediments.

Core-Hour Credit at Jefferson Lab

    We discussed how to spend our credit with the LQCD farm. Currently we
    have used about one third of the million core-hours. SciComp has told
    us that a late summer data challenge would mesh with a seasonal slack
    period on the LQCD cluster, and that at that time a fresh loan to us
    would be easy to do. They are fine with us burning it all for this
    data challenge. Curtis thought that a tape-resident-data driven
    challenge should be started sooner, in which case the credit might
    come in handy. We will have to revisit this issue as the picture
    becomes clearer.
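
    As a rough aid to that discussion, the remaining credit and how long
    it would last can be estimated as below. This is only a sketch. It
    takes "about one third" literally, and it assumes that the loaned
    cores are the roughly 2000-core surplus quoted in the JLab report
    (3400 total minus 1400 nominal) and that they are kept fully busy.
    The function names are hypothetical.

```python
# Back-of-the-envelope sketch; inputs are approximate and the steady-usage
# assumption is ours, not something stated by SciComp.

def remaining_core_hours(total_credit, fraction_used):
    """Core-hours left after a given fraction of the credit is spent."""
    return total_credit * (1.0 - fraction_used)


def days_until_exhausted(core_hours_left, cores_in_use):
    """Days the credit lasts if cores_in_use run around the clock."""
    return core_hours_left / cores_in_use / 24.0


total_credit = 1_000_000       # core-hours, from the minutes
fraction_used = 1.0 / 3.0      # "about one third", taken literally
loaned_cores = 3400 - 1400     # assumed: surplus over the nominal farm size

left = remaining_core_hours(total_credit, fraction_used)
days = days_until_exhausted(left, loaned_cores)
print(f"about {left:,.0f} core-hours left, about {days:.0f} days at {loaned_cores} cores")
```

    Under these assumptions roughly two thirds of a million core-hours
    remain, which is about two weeks of running on the loaned cores.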


References

   30. https://mailman.jlab.org/pipermail/halld-offline/2014-April/001641.html
   31. https://halldweb1.jlab.org/data_challenge/02/conditions/data_challenge_2.html
   32. https://docs.google.com/spreadsheets/d/1qvF9B-76gr8NdsTKsO17jqL0qc5OXqK46JluvXnJ98k/edit?usp=sharing
   33. https://halldweb1.jlab.org/wiki/images/8/86/Osg_running_2014-04-05.png
   34. http://gryphn.phys.uconn.edu/vofrontend/monitor/frontendStatus.html
   35. https://halldweb1.jlab.org/wiki/images/7/74/Osg_idle_2014-04-05.png
   36. https://halldweb1.jlab.org/wiki/images/3/3c/Jobs_gluex_04-04.png
   37. https://halldweb1.jlab.org/wiki/index.php/MIT/FutureGrid_Data_Challenge_2_Production#Update_4.2F14.2F14

-- 
Mark M. Ito, Jefferson Lab, marki at jlab.org, (757)269-5295



