<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<br>
-PJANA:BATCH_MODE=1<br>
<div class="moz-cite-prefix">On 12/6/12 1:14 AM, Richard Jones
wrote:<br>
</div>
<blockquote cite="mid:50C037BB.8090703@uconn.edu" type="cite">Mark,
<br>
<br>
The only way to know if we are ready is to start production and
find out what hits the fan. I pushed out a small batch this
morning just to test the framework. Things have changed on
several sites since Igor last ran on the grid (software areas have
moved, policies have changed here and there) so I am tweaking
things to get them to run everywhere, but events are flowing.
<br>
<br>
I started production this morning at 7:30 with a batch of 1000
jobs. As of 18:00 hours this afternoon all but a few stragglers
were done, and 46 million rest events (920 files of 50k events
each) have landed on the srm at UConn. Log files all look good.
It seems that the software is more robust than I feared. Not a
single job has crapped out due to memory ceiling violations so
far. No two of the rest files have exactly the same size, which means
that we didn't goof up with our random numbers in the most obvious
way.
<br>
<br>
I stopped production at 18:00 to implement changes in a couple of
scripts, and just launched again, this time a batch of 500 million
events. That should keep the pipe full until the weekend. Then
it is time to do the real challenge -- 5 billion events in one
batch job!
<br>
<br>
One small change I wish we had made is to get rid of the ticker
tape of how many events have been processed so far. This is nice
for interactive running, but in batch mode it fills up the log
file with very little purpose, and makes browsing the log file a
pain. A quick glance at JApplication did not show an obvious way
to turn it off with a command line option, so I created a cron job
that scans the log files as they come back from the grid and
compresses out the ticker tape lines. Saves an order of magnitude
in space in my log directory.
<br>
<br>
-Richard J.
<br>
<br>
<br>
On 12/5/2012 4:13 PM, Paul Mattione wrote:
<br>
<blockquote type="cite">Things are fine on my end. I still have
an ~8% failure rate though (due to RAM limitations). 13.2
million events and counting.
<br>
<br>
- Paul
<br>
<br>
On Dec 5, 2012, at 4:02 PM, Mark M. Ito wrote:
<br>
<br>
<blockquote type="cite">OK. Have not heard from Richard but
that's OK...
<br>
<br>
I was wondering if there are any decisions left or technical
questions or concerns at this point. Otherwise I think we can
start. I'm assuming that Richard is done with the changes to
the branch and Paul thinks that they are fine. I'll make a tag
by tomorrow morning if I do not hear that that is a bad idea.
<br>
<br>
If either of you thinks we really do need a conference call,
let me know and I will set one up.
<br>
<br>
On 12/05/2012 02:01 PM, Mark M. Ito wrote:
<br>
<blockquote type="cite">Richard and Paul,
<br>
<br>
Can we have a quick, touch-base-like phone meeting on the
data challenge status today? Say at 4 pm. Or suggest a time.
We can use the ReadyTalk thing.
<br>
<br>
-- Mark
<br>
<br>
</blockquote>
-- <br>
Mark M. Ito
<br>
Jefferson Lab
<br>
<a class="moz-txt-link-abbreviated" href="mailto:marki@jlab.org">marki@jlab.org</a>
<br>
(757)269-5295
<br>
</blockquote>
</blockquote>
<br>
<br>
<br>
<br>
<pre wrap="">_______________________________________________
Halld-offline mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Halld-offline@jlab.org">Halld-offline@jlab.org</a>
<a class="moz-txt-link-freetext" href="https://mailman.jlab.org/mailman/listinfo/halld-offline">https://mailman.jlab.org/mailman/listinfo/halld-offline</a></pre>
</blockquote>
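<br>
The file-size sanity check described in the quoted message (no two rest files having exactly the same size) could be automated roughly as follows. This is only a sketch under assumptions: the ".hddm" suffix, the directory argument, and the function names are hypothetical and not taken from the production scripts.<br>

```python
import os
from collections import defaultdict


def group_by_size(sizes):
    """Return {size: [names]} for every size shared by more than one file.

    `sizes` maps file name to size in bytes. An empty result means all
    sizes are distinct.
    """
    groups = defaultdict(list)
    for name, size in sizes.items():
        groups[size].append(name)
    return {size: sorted(names) for size, names in groups.items() if len(names) > 1}


def file_sizes(directory, suffix=".hddm"):
    """Collect sizes of the output files in `directory`.

    The suffix is an assumption; change it to match the real file names.
    """
    return {
        entry.name: entry.stat().st_size
        for entry in os.scandir(directory)
        if entry.is_file() and entry.name.endswith(suffix)
    }
```

Calling group_by_size(file_sizes(some_directory)) and getting an empty dict back corresponds to the observation in the message. Two files of equal size would not prove a repeated random seed, but they would be the first ones worth comparing byte for byte.<br>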
<br>
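For logs that have already come back with the ticker in them, the clean-up pass described in the quoted message (a cron job that drops the ticker lines) might look something like the sketch below. The pattern is an assumption: it treats any line containing "events processed" as a ticker line, so it should be adjusted to whatever the real log lines look like. The function names are hypothetical as well.<br>

```python
import re

# Assumed ticker signature; adjust to match the actual log lines.
TICKER = re.compile(r"\bevents processed\b")


def strip_ticker(lines, pattern=TICKER):
    """Return the lines that do not match the ticker pattern."""
    return [line for line in lines if not pattern.search(line)]


def clean_log(path, pattern=TICKER):
    """Rewrite the log file at `path` in place without its ticker lines."""
    with open(path, errors="replace") as f:
        kept = strip_ticker(f, pattern)
    with open(path, "w") as f:
        f.writelines(kept)
```

A cron entry could then loop over newly returned log files and call clean_log on each one. Rewriting in place is the simplest choice here; writing to a temporary file and renaming it would be safer if a job could still be appending to the log.<br>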
</body>
</html>