[Halld-offline] diagnosis of cause for segfaults in DTrackCandidate_factory_CDC::FindThetaZRegression()
Beni Zihlmann
zihlmann at jlab.org
Thu Mar 17 08:36:57 EDT 2011
Hi Richard,
I have one question about this issue. If I understand it correctly, it
means that there is a potential problem every time I do an explicit
calculation inside an if statement and compare the result with a
variable from memory. Is that right?
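To make the question concrete, here is a minimal sketch of the pattern I have in mind. The function names are made up for illustration and are not taken from the reconstruction code. The assumption is an x87 build (for example -m32 without SSE math), where a freshly computed product may still carry extended register precision while the stored value has already been rounded to a 64-bit double.

```cpp
#include <cassert>

// Hypothetical helper (not from the GlueX code): push a value through a
// 64-bit memory slot so any extra x87 register precision is rounded away.
inline double as_stored_double(double x) {
    volatile double slot = x;  // volatile forces an actual store and reload
    return slot;
}

// Fragile pattern: in an x87 build, a * b may still sit in an 80-bit
// register here, while `stored` was rounded to 64 bits when it was
// written to memory, so the two sides can disagree in the last bits.
inline bool fragile_equal(double a, double b, double stored) {
    return a * b == stored;
}

// More defensive variant: round the fresh result the same way as the
// stored one before comparing.
inline bool defensive_equal(double a, double b, double stored) {
    return as_stored_double(a * b) == stored;
}

// Small self-check of the defensive variant.
inline void check_defensive_equal() {
    const double a = 1.0 / 3.0;
    const double b = 3.0;
    const double stored = as_stored_double(a * b);
    assert(defensive_equal(a, b, stored));
}
```

Whether fragile_equal() actually misbehaves depends on the compiler, the flags and the optimization level, so this is only meant to show the shape of the comparison I am asking about, not to reproduce the segfault.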
cheers,
Beni
> Dear colleagues,
>
> I have reproduced and diagnosed the segfaults that take place in the
> current GlueX reconstruction code, when compiled for the i686
> platform. Note that they also occur on 64-bit hardware when running
> the 32-bit executable, so it is not just a 32-bit issue. The
> explanation is a bit too long for email, so I have written it up in
> the form of a wiki page. Please see it at the following URL.
>
> http://www.jlab.org/Hall-D/software/wiki/index.php/Diagnosing_segmentation_faults_in_reconstruction_software#How_to_diagnose_this_kind_of_error
>
>
> In that wiki page, I also explain why this should not be considered to
> be a compiler optimization bug, but rather a bug in our user code, in
> the context of x87 math. That, in spite of the fact that recompiling
> with -O0 seemed to solve it! In fact, turning off optimization is not
> a reliable solution, and the current bug probably will break out again
> in -O0 code in the near future, as g++ continues to evolve. What is
> more, in considering the impact of this bug, the segfault is really
> only the tip of the iceberg. I would expect this problem to be
> happening much more often in -m32 builds, but only showing up as
> segfaults in the (rare?) case that the memory between the valid data
> and the end of the valid data segment contains all zeros. In what
> might be the more normal occurrence of this bug, we could be getting
> bogus results from the tracking and not know it. In other words, the
> segfault is your friend.
>
> Besides this, there is the more serious issue of how robust the rest
> of the code is against what I might call the "x87 entropy problem"
> with randomly fluctuating least-significant bits in doubles. This
> probably warrants a broader discussion, beyond the resolution of this
> particular bug.
>
> -Richard J.