<div dir="ltr">Hello all,<div><br></div><div>While nobody wants to see the cost for MC generation go up relative to what we saw in the first data challenge, I want to emphasize that we cannot simply wish away the background. It is going to be a feature of the real data that we need to analyze, and to be ready for that we need to generate events with realistic background. I would urge us to do generation this time around with a realistic gate and bg rate for what we expect to see in phase-1 physics running, and take whatever hit we need to in terms of the number of events we can simulate. I cannot think of any study that is going to fail because we generated a factor 2-3 fewer events, but there are plenty of things that can go wrong if we fail to simulate realistic backgrounds in the detector. I see no advantage to generating even a part of the DC sample with less than 10^7 g/s background rates and the full 1us gate.</div>
<div><br></div><div>With more time and effort, we could probably come up with a way to generate a library of background events and overlay random entries from the library onto the bggen events at the desired rate. This is not trivial to do, because the superposition must be carried out at the hits-accumulation level inside Geant, before hit aggregation, but given the computing cost it is likely the way to go. Meanwhile, I think we should press ahead with the DC and take whatever hit we need to in CPU to get where we need to be. Remember that during DC #1 our processing efficiency on the grid was only about 50% overall, so if we get that number up to near 95% we can process with the background turned on at about the same rate as we did then.</div>
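<div><br></div><div>To make the overlay idea concrete, here is a rough sketch of the kind of mixing step I have in mind. None of these types or function names exist in hdgeant today -- they are placeholders -- and the real thing would have to live inside the hits-accumulation code so that the merged hits are aggregated and truncated exactly as if signal and background had been traced together. But it shows the shape of it:</div><div><pre>
// Sketch only: hypothetical types and names, not the hdgeant API.
// Overlay randomly chosen entries from a pre-generated background library
// onto the signal hits of one bggen event, before hit aggregation.
#include <cstddef>
#include <random>
#include <vector>

struct Hit {
  int    channel;   // detector channel id
  double t_ns;      // hit time relative to the trigger, in ns
  double dE;        // energy deposition
};

struct BgEvent {              // one pre-simulated background interaction
  std::vector<Hit> hits;
};

void overlayBackground(std::vector<Hit>& eventHits,
                       const std::vector<BgEvent>& library,
                       double rate_GHz, double gate_lo_ns, double gate_hi_ns,
                       std::mt19937_64& rng)
{
  if (library.empty()) return;

  // rate (GHz) x gate (ns) = mean number of background photons in the gate
  const double gate_ns = gate_hi_ns - gate_lo_ns;
  std::poisson_distribution<int> nBg(rate_GHz * gate_ns);
  std::uniform_int_distribution<std::size_t> pick(0, library.size() - 1);
  std::uniform_real_distribution<double> t0(gate_lo_ns, gate_hi_ns);

  const int n = nBg(rng);
  for (int i = 0; i < n; ++i) {
    const BgEvent& bg = library[pick(rng)];
    const double offset = t0(rng);      // place this interaction in the gate
    for (Hit h : bg.hits) {
      h.t_ns += offset;                 // shift all of its hits coherently
      eventHits.push_back(h);
    }
  }
  // The real implementation must then re-run the per-channel aggregation
  // (merging hits within the double-pulse resolution, applying thresholds)
  // so that overlapping signal and background deposits pile up the way
  // they would have if simulated in one pass.
}
</pre></div><div><br></div><div>The saving is that the background photons are traced through the geometry once, when the library is built, instead of once per signal event.</div>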
<div><br></div><div>-Richard J.</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Sun, Feb 9, 2014 at 11:49 PM, Kei Moriya <<a href="mailto:kmoriya@indiana.edu">kmoriya@indiana.edu</a>> wrote:<br>
<blockquote class="gmail_quote"><br>
To follow up on the discussion at the Data Challenge meeting<br>
regarding memory usage for bggen jobs, I ran over 10 runs for<br>
each configuration of EM background level and had the cluster report<br>
back the maximum vmem used (see attached plot).<br>
<br>
Note that the spikes in max vmem happened most frequently for<br>
standard EM background (the dashed lines are the average for<br>
each EM background type, but they are influenced strongly by<br>
the outliers), while the set with high rate and long gate time<br>
has no large spikes.<br>
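<br>
(As an aside: because the mean is pulled so strongly by a few spikes,<br>
the median of the max vmem values might give a more representative<br>
dashed line. A quick standalone sketch of the difference -- the numbers<br>
here are made up for illustration, not our measured values:)<br>
<pre>
// Standalone sketch: a few vmem spikes pull the mean far more than the median.
#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
  // Illustrative max-vmem values in GB (invented, not the measured numbers):
  std::vector<double> vmem = {1.9, 2.0, 2.1, 2.0, 2.2, 1.9, 2.1, 2.0, 6.5, 7.0};

  double mean = 0.0;
  for (double v : vmem) mean += v;
  mean /= vmem.size();

  std::sort(vmem.begin(), vmem.end());
  // even number of entries here, so average the two middle values
  double median = 0.5 * (vmem[vmem.size() / 2 - 1] + vmem[vmem.size() / 2]);

  std::printf("mean = %.2f GB, median = %.2f GB\n", mean, median);
  return 0;
}
</pre>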
<br>
The jobs were set up so that for each run, the original thrown hddm<br>
file was the same for each EM background set, and the RNDM variable<br>
specified in control.in for hdgeant was always the same. Does<br>
anybody know whether the simulation of the detector hits is done<br>
independently of the EM background, or whether the EM background offsets<br>
the random number sequence for each hit?<br>
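<br>
(To spell out what I mean by "offsets the random number sequence":<br>
if the background photons and the detector-hit simulation draw from one<br>
shared generator, turning the background on changes every subsequent<br>
draw for the signal hits too, whereas with independent streams the<br>
signal hits stay identical. A toy illustration of the two cases --<br>
not how hdgeant/GEANT actually manages its seeds:)<br>
<pre>
// Toy illustration of a shared vs. independent random-number stream.
// Not how hdgeant/GEANT3 actually handles RNDM -- just the concept.
#include <cstdio>
#include <random>

int main() {
  std::uniform_real_distribution<double> u(0.0, 1.0);

  // One shared stream: an extra background draw shifts the signal sequence.
  std::mt19937 shared(12345);
  (void)u(shared);                        // pretend this draw served a bg photon
  double signalSmearA = u(shared);        // the signal hit now gets a different number

  // Independent streams: the signal sequence is untouched by the background.
  std::mt19937 signalStream(12345);
  std::mt19937 bgStream(67890);
  (void)u(bgStream);                      // background uses its own stream
  double signalSmearB = u(signalStream);  // identical whether or not bg is on

  std::printf("shared: %.6f   independent: %.6f\n", signalSmearA, signalSmearB);
  return 0;
}
</pre>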
<br>
I also monitored the time it took for each job type to finish.<br>
The results are:<br>
<pre>
type                    time (hrs)
no EM bg                 1 - 1.5
standard                 2 - 3.5
high rate                5 - 5.5
long gate                3 - 3.5
high rate, long gate    10 - 15
</pre>
<br>
Most of the time is being spent on hdgeant. I don't know why<br>
some jobs take 50% more time than others. The increase in<br>
processing time is a much larger penalty than the small increase<br>
we see in the final REST file sizes.<br>
<br>
I've also started looking into the number of tracks, track quality,<br>
and photon timing information.<br>
<br>
Kei<br>
<br>
<br>
</blockquote></div><br></div>