[Halld-offline] Source of Inconsistency (This Case Fixed)
Curtis A. Meyer
cmeyer at cmu.edu
Thu Mar 20 07:59:39 EDT 2014
Quite a catch! Curtis
---------
Curtis A. Meyer MCS Associate Dean for Faculty and Graduate Affairs
Wean: (412) 268-2745 Professor of Physics
Doherty: (412) 268-3090 Carnegie Mellon University
Fax: (412) 681-0648 Pittsburgh, PA 15213
curtis.meyer at cmu.edu http://www.curtismeyer.com/
On Mar 19, 2014, at 10:50 PM, Paul Mattione <pmatt at jlab.org> wrote:
> I've found the source of the inconsistency we've been seeing: it's ultimately due to the fact that the associated-object map JObject::associated uses the object pointer as the key for the map. This means that the objects are stored in the map in a random order (they are stored in the order returned by the less-than-operator, which for pointers uses the (random) memory address). In the long run we may want to consider a modification to this, but for now, whenever you're getting associated objects, you need to be very careful that your results are not dependent on the order of the objects.
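A minimal sketch of the ordering effect described above, assuming a plain std::map keyed by object pointer. The Hit struct and namesInMapOrder() are hypothetical stand-ins, not the real JObject code. Iteration follows the addresses of the objects, not the order in which they were inserted.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an associated object; not the real DFDCPseudo class.
struct Hit {
    std::string name;
};

// Returns the hit names in the order a pointer-keyed map iterates over them.
std::vector<std::string> namesInMapOrder(const std::vector<const Hit*>& inserted) {
    // The key is the object's address, so std::less<const Hit*> decides the order.
    std::map<const Hit*, int> associated;
    for (const Hit* h : inserted) associated[h] = 1;

    std::vector<std::string> out;
    for (const auto& kv : associated) out.push_back(kv.first->name);
    return out;
}
```

If the hits live wherever the allocator happened to place them, that address order can differ from run to run, which is why any code consuming the result should not depend on it.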
>
> Fortunately, this is normally not the case. However, in DTrackCandidate_factory::MatchMethod4(), when the DFDCPseudo* hits associated with the DTrackCandidate* object are grabbed, their random order is a problem. In this case, there is a sort routine applied to these hits, FDCHitSortByLayerincreasing, which normally removes the randomness. However, when the hits are on the exact same layer and wire, they remain in a random order with respect to each other. There is a similar sort routine for the CDC, which has the same problem.
>
> These FDC hits are added along with CDC hits to a DHelicalFit object which is used to try to link the hits together and get an improved estimate for the tracking parameters. Inside the fit function, DHelicalFit::FitLineRiemann(), there is another sort routine that sorts the hits by hit-z (DHFProjection_cmp). This still doesn't fix the problem though, since although the FDC hits on the same wire have different x & y, they have the same z. Thus when the linear-regression fit is performed, and the distances between the xy-projections are taken between hits, you get inconsistent results.
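As a rough illustration of why a sort on z alone cannot separate two hits on the same wire, here is a sketch with a hypothetical SketchProjection type (not the real DHelicalFit projection class), assuming the beamline runs along the z axis through the origin. A z-only comparator treats the two hits as equivalent, so std::sort may leave them in either order. Adding the distance of the xy-projection from the beamline as a second key gives them a defined order.

```cpp
#include <cmath>
#include <tuple>

// Hypothetical projected hit; field names are illustrative only.
struct SketchProjection {
    double x, y, z;
};

// Compares on z only: hits at the same z are equivalent as far as std::sort is concerned.
bool zOnlyLess(const SketchProjection& a, const SketchProjection& b) {
    return a.z < b.z;
}

// Distance of the xy-projection from the beamline (assumed here to be the z axis).
double radius(const SketchProjection& p) {
    return std::hypot(p.x, p.y);
}

// Compares on z first, then on distance from the beamline to break ties in z.
bool zThenRadiusLess(const SketchProjection& a, const SketchProjection& b) {
    return std::make_tuple(a.z, radius(a)) < std::make_tuple(b.z, radius(b));
}
```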
>
> I've checked in modifications to these sort routines so that the results are more deterministic in case of ties (tie-breakers: in DTrackCandidate_factory, sort CDC and FDC hits by energy, and in DHelicalFit, sort by the projection distance from the beamline). All 32 of my single-threaded jobs analyzing this 5k-event file give identical results. This bug did not result from uninitialized values, memory errors, reading past the end of an array/vector, etc., and so was very difficult to track down. It hasn't been a fun few weeks, but hopefully most of the bugs have been stamped out now.
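The energy tie-breaker for the candidate-factory sort could look roughly like the following. This is a sketch only: SketchHit, hitLessWithTieBreak() and sortedEnergies() are hypothetical names, and the actual checked-in comparator may differ. Whatever order the hit pointers arrive in, the sorted result is the same as long as no two hits share layer, wire and energy.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical simplified hit; field names are illustrative only.
struct SketchHit {
    int layer;
    int wire;
    double energy;
};

// Order by layer, then wire, then energy, so hits sharing a layer and wire
// still have a well-defined relative order.
bool hitLessWithTieBreak(const SketchHit* a, const SketchHit* b) {
    return std::tie(a->layer, a->wire, a->energy) <
           std::tie(b->layer, b->wire, b->energy);
}

// Sorts a copy of the pointer list and returns the energies in sorted order.
std::vector<double> sortedEnergies(std::vector<const SketchHit*> hits) {
    std::sort(hits.begin(), hits.end(), hitLessWithTieBreak);
    std::vector<double> out;
    for (const SketchHit* h : hits) out.push_back(h->energy);
    return out;
}
```

Two hits with identical layer, wire and energy would still compare as equivalent, which matches the "more deterministic" wording above rather than a guarantee.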
>
> - Paul