[Halld-offline] Fwd: r7587 - trunk/sim-recon/src/libraries/TRACKING
David Lawrence
davidl at jlab.org
Thu Mar 17 15:24:53 EDT 2011
Hi Richard,
Let's put this on the agenda (and by us, I mean Mark) for the
offline meeting next week.
Note that we currently have around 150k lines of code (+30k for JANA).
With 5 brave volunteers, they would need to review about 30k lines each.
In the 150k lines, we have about 290 uses of STL sort.
Regards,
-David
On 3/17/11 2:47 PM, Richard Jones wrote:
> David and Curtis,
>
> I agree with David's alternative fix to the SortIntersections()
> code. The point of my example was to clarify what was causing the
> problem, not to recommend the best fix. Maurizio Ungaro, who also saw
> my post, suggested an elegant alternative: replace the "double"
> declarations of the local variables that hold intermediate results in
> SortIntersections() with "long double". This should work on both i686
> and x86_64 hardware, and neatly avoids the entropy problem by keeping
> the full 80-bit extended-precision value when it is stored to memory
> (the storage size is actually padded to 96 bits for -m32 and 128 bits
> for -m64).
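
A minimal sketch of the "long double" suggestion above, not the actual SortIntersections() code. The struct Hit2D and the function SmallerRadius are hypothetical names introduced only for illustration. The idea is that the intermediate results are declared long double, so a value spilled to memory is not rounded down to 64 bits. On platforms where long double has the same representation as double, this changes nothing.

```cpp
#include <limits>

// Hypothetical stand-in for the objects being sorted.
struct Hit2D {
    double x, y;
};

// The standard guarantees long double is at least as precise as double.
static_assert(std::numeric_limits<long double>::digits >=
                  std::numeric_limits<double>::digits,
              "long double must not lose precision relative to double");

// Returns true if a lies at a smaller transverse radius than b.
// Comparing squared radii avoids the sqrt and preserves the ordering.
bool SmallerRadius(const Hit2D& a, const Hit2D& b) {
    // Intermediates are long double, so storing them to memory
    // does not truncate them to a 64-bit double.
    const long double ax = a.x, ay = a.y;
    const long double bx = b.x, by = b.y;
    const long double ra2 = ax * ax + ay * ay;
    const long double rb2 = bx * bx + by * by;
    return ra2 < rb2;
}
```

A comparator written this way could be passed to std::sort as usual, for example std::sort(v.begin(), v.end(), SmallerRadius).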
>
> More generally I agree with Curtis that this is a general issue, not
> just about STL sort. We need to look at all instances in the code
> where comparisons are being made between doubles, and make sure that
> fuzzy compares are being done, or the logic is tolerant of a little
> entropy. Maybe we can divide up the work between us. Between 4 or 5
> of us, we could do this before the May collaboration meeting.
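
One possible shape for the "fuzzy compare" mentioned above, as a sketch only. The function name AlmostEqual and the default tolerances are assumptions, not anything taken from the sim-recon code. Note that a fuzzy equality is not transitive, so it is suited to equality tests and cuts, not to use directly as the ordering passed to std::sort, which requires a strict weak ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Returns true if a and b agree to within a relative tolerance,
// with an absolute floor so that values near zero still compare equal.
// Both tolerances are illustrative defaults, not tuned values.
bool AlmostEqual(double a, double b,
                 double rel_tol = 1e-9, double abs_tol = 1e-12) {
    assert(rel_tol >= 0.0 && abs_tol >= 0.0);
    const double diff = std::fabs(a - b);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(abs_tol, rel_tol * scale);
}
```

With these defaults, AlmostEqual(0.1 + 0.2, 0.3) is true even though the two sides differ in the last bit, while AlmostEqual(1.0, 1.001) is false.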
>
> -Richard J.
>
>
>
>
>
> On 3/17/2011 12:20 PM, David Lawrence wrote:
>> Hi Curtis,
>>
>> I have created a new issue in Mantis to review the existing STL
>> sort calls in our reconstruction to ensure they do not have the
>> potential of having a similar bug.
>>
>> https://halldnew.jlab.org/mantisbt/view.php?id=50
>>
>> Regards,
>> -David
>>
>> On 3/17/11 12:14 PM, Curtis A. Meyer wrote:
>> Hi David -
>>
>> does it make sense to open a "general ticket" in Mantis about
>> the more general effects of this issue?
>>
>> Curtis
>> On 3/17/11 11:23 AM, David Lawrence wrote:
>>
>> Hi All,
>>
>> I've just committed a fix to the seg. fault/hang problem based
>> on Richard's analysis. This is slightly different than the fix
>> Richard suggested. It avoids the 80bit/64bit comparison issue by
>> pre-calculating the values to be compared rather than doing it in the
>> sort algorithm itself. This should also speed things up a little
>> since DVector3::Perp() is not being called repeatedly during the sort
>> for the same object.
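
The pre-calculation approach described above can be sketched as follows. This is an illustration under stated assumptions, not the committed code: the struct Point and the function SortByPerp are hypothetical, and Point::Perp() merely stands in for DVector3::Perp(). Each key is computed once and stored as a 64-bit double before sorting, so every comparison sees the same value for the same object.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a point with a transverse radius.
struct Point {
    double x, y, z;
    double Perp() const { return std::sqrt(x * x + y * y); }
};

// Sort points by transverse radius, computing each key exactly once.
std::vector<Point> SortByPerp(const std::vector<Point>& points) {
    // Keys are stored in memory as doubles before any comparison is made.
    std::vector<std::pair<double, std::size_t>> keyed;
    keyed.reserve(points.size());
    for (std::size_t i = 0; i < points.size(); ++i)
        keyed.emplace_back(points[i].Perp(), i);

    // pair's operator< compares the stored key first, then the index.
    std::sort(keyed.begin(), keyed.end());

    std::vector<Point> sorted;
    sorted.reserve(points.size());
    for (const auto& k : keyed) sorted.push_back(points[k.second]);
    assert(sorted.size() == points.size());
    return sorted;
}
```

Besides giving consistent comparisons, this calls Perp() once per object instead of once per comparison.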
>>
>> I have been able to run through my one reliably-problematic
>> event using the new code without any problem. If anyone notices an
>> issue, please let me know.
>>
>> This problem has been marked as resolved in Mantis.
>>
>> Regards,
>> -David
>>
>> -------- Original Message --------
>> Subject: r7587 - trunk/sim-recon/src/libraries/TRACKING
>> Date: Thu, 17 Mar 2011 11:16:40 -0400
>> From: Hall-D.SVN.Repository at jlab.org
>> To: davidl at jlab.org, brash at pcs.cnu.edu, wolin at jlab.org,
>> zisis at uregina.ca, mashephe at indiana.edu, remitche at indiana.edu,
>> zihlmann at jlab.org, somov at jlab.org, staylor at jlab.org
>>
>>
>>
>> Author: davidl
>> Date: 2011-03-17 11:16:39 -0400 (Thu, 17 Mar 2011)
>> New Revision: 7587
>>
>> Modified:
>>
>> trunk/sim-recon/src/libraries/TRACKING/DTrackCandidate_factory_CDC.cc
>> Log:
>> This is a fix for the seg. fault/hang problem that has been plaguing us
>> for the last ~4 months. It precalculates the values used in the
>> comparison in the SortIntersections routine to avoid issues with values
>> calculated with 80bit precision being compared with values having been
>> copied to and from a 64bit register. See the report on the GlueX wiki
>> here:
>>
>> http://www.jlab.org/Hall-D/software/wiki/index.php/Diagnosing_segmentation_faults_in_reconstruction_software
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Halld-offline mailing list
>> Halld-offline at jlab.org
>> https://mailman.jlab.org/mailman/listinfo/halld-offline
>>
>>
>>
>> --
>> Prof. Curtis A. Meyer Department of Physics
>> Phone: (412) 268-2745 Carnegie Mellon University
>> Fax: (412) 681-0648 Pittsburgh PA 15213-3890
>> cmeyer at ernest.phys.cmu.edu
>> http://www.curtismeyer.com/
>>
>>
>
>
>