[Halld-offline] non-reproducibility study
Paul Mattione
pmatt at jlab.org
Tue Mar 11 17:32:10 EDT 2014
Beni, Sean, and I did a bunch of studies over the weekend; I checked in some changes to the trunk that improve things, but there are still issues. What we've done/determined so far:
1) Fixed bugs: Variables were sometimes left uninitialized because functions (such as DReferenceTrajectory::DistToRT()) returned early.
2) Fixed bugs: Some calls to DReferenceTrajectory methods weren't checking that the return value was "NOERROR".
3) It looks like everything through wire-based tracking is OK, except DTrackWireBased::t0() is non-deterministic (the rest of the tracking parameters are fine). This was discovered by saving the DTrackWireBased results to REST instead of DTrackTimeBased in our local test builds, then running hddm-xml and diff on the output (I agree this is the best way to go).
4) It looks like the track-BCAL/FCAL/TOF/SC matching is now deterministic (it wasn't before), although there are still occasional differences when running multithreaded. Perhaps this is due to the problems in 3)??
5) It looks like time-based tracking still has issues, which are worse when running multithreaded. Perhaps these are related to 3)??
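For anyone repeating the comparison in 3), here is a minimal sketch of the "dump to XML, then diff" check in C++. It assumes the two runs have already been dumped to text; FirstDifferingLine is a hypothetical helper, not something in sim-recon, and it only reports where two dumps first diverge.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of sim-recon): returns the 1-based number of
// the first line at which the two streams differ, or 0 if they match line by
// line. A stream that runs out of lines before the other counts as a difference.
std::size_t FirstDifferingLine(std::istream& a, std::istream& b) {
    std::string line_a, line_b;
    std::size_t n = 0;
    while (true) {
        const bool got_a = static_cast<bool>(std::getline(a, line_a));
        const bool got_b = static_cast<bool>(std::getline(b, line_b));
        ++n;
        if (!got_a && !got_b) return 0;  // both ended with no mismatch
        if (got_a != got_b || line_a != line_b) return n;
    }
}
```

To use it on real output, open a std::ifstream on each of the two XML dumps and pass them in; a non-zero result says which line to look at first.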
There are other potential problems, but it's hard to tell for certain. For instance, some of these bugs seem to behave differently when compiled with different versions of gcc.
I tried to fix all the bugs I could find that were outside of the Kalman Filter, but it's hard to tell if I got them all. We got Simon up to date so he's looking at things now.
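To illustrate the two kinds of bug in 1) and 2), here is a minimal C++ sketch. It is not the actual DReferenceTrajectory code: Status, BAD_INPUT, DistToPoint, and MatchWithinCut are made-up names, and only the "NOERROR" value comes from the real code. The pattern is to give the output a defined value before any early return, and to have the caller check the status before using the result.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Stand-in status code; BAD_INPUT is a hypothetical error value.
enum Status { NOERROR = 0, BAD_INPUT };

// Hypothetical distance routine. The output is set to a defined sentinel
// (NaN) up front, so an early return never leaves it uninitialized.
Status DistToPoint(double x, double x0, double* dist) {
    *dist = std::numeric_limits<double>::quiet_NaN();
    if (!std::isfinite(x) || !std::isfinite(x0)) return BAD_INPUT;  // early exit
    *dist = std::fabs(x - x0);
    return NOERROR;
}

// Hypothetical caller. The distance is only used (and only written to
// dist_out) when the routine reports NOERROR.
bool MatchWithinCut(double x, double x0, double cut, double* dist_out) {
    double dist = 0.0;
    if (DistToPoint(x, x0, &dist) != NOERROR) return false;
    *dist_out = dist;
    return dist < cut;
}
```

Without the status check, a failed call would feed whatever happened to be in dist into the cut, which is the sort of thing that can change from run to run or between compilers.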
- Paul
On Mar 11, 2014, at 5:14 PM, Mark Ito wrote:
> Folks,
>
> Ran 1000 short jobs using the latest DC2 tags.
>
> tags/hdds-dc-2.1
> tags/sim-recon-dc-2.4
>
> No EM background, 1000 events per job, saving the smeared raw data
> (hdgeant_smeared.hddm). Then I ran the same hd_root again against the
> saved smeared raw data.
>
> Doing a diff on the dana_rest output:
>
> differ = 994
> identical = 6
>
> I then dumped the hddm into xml form and diffed the output for a dozen
> pairs. Find the results at
>
> https://halldweb1.jlab.org/talks/2014-1Q/diffhddm.txt
>
> You can see that the differences are not that big. You can tell that the
> same event was analyzed in iteration 1 and iteration 2. Close but no
> cigar. I could look at more files if that is of interest to anyone.
>
> The rest files and the root files are at JLab in
>
> /volatile/halld/home/gluex/proj/dc_02_1
> /volatile/halld/home/gluex/proj/dc_02_2
>
> Each of these directories has a rest directory and a hd_root directory.
> 1 and 2 indicate the original run and the re-run respectively, of
> course. Compression was off and there are no short rest files.
>
> Hope this helps someone. Not sure what my next step should be.
> Suggestions welcome.
>
> -- Mark
> _______________________________________________
> Halld-offline mailing list
> Halld-offline at jlab.org
> https://mailman.jlab.org/mailman/listinfo/halld-offline