[Halld-offline] Fwd: non-reproducibility study

Beni Zihlmann zihlmann at jlab.org
Fri Mar 14 08:44:06 EDT 2014


Hi Paul,
Yes, I see something very similar. I ran single-threaded in two groups
of jobs, one group using hd_root and the other using hd_dump, both with
my plugin. The hd_root group is consistent within itself, and the
hd_dump group agrees with the hd_root group except for one run. And yes,
it's event 1339!

Event  #Tracks  #CDCHits  #FDCHits  #+tracks  #-tracks  #Neutrals
------------------------------------------------------------------
1336         8        51       169         6         2         11
1337         9        68       491         7         2          7
1338        16        81       817        10         6          6
1339        21        43      1398        11        10         17   >>>>> 1339 21 43 1398 11 10 18
1340        14        42      1067         8         6         16


The number of neutrals differs (17 vs. 18)!
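
For anyone who wants to repeat this comparison mechanically, here is a
minimal Python sketch. It is not the plugin's real output format: it
assumes each run's summary has been saved as whitespace-separated lines
holding the event number and the six counts listed above. The names
parse_summary and diff_summaries and the column labels are made up for
illustration.

```python
from typing import Dict, List, Tuple

# Column labels are illustrative; they mirror the six counts in the table.
COLUMNS = ("Tracks", "CDCHits", "FDCHits", "PosTracks", "NegTracks", "Neutrals")


def parse_summary(text: str) -> Dict[int, Tuple[int, ...]]:
    """Parse 'event n1 ... n6' lines into {event: counts}; skip other lines."""
    rows: Dict[int, Tuple[int, ...]] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS) + 1 or not all(f.isdigit() for f in fields):
            continue  # header, separator, or blank line
        rows[int(fields[0])] = tuple(int(f) for f in fields[1:])
    return rows


def diff_summaries(
    a: Dict[int, Tuple[int, ...]], b: Dict[int, Tuple[int, ...]]
) -> List[Tuple[int, str, int, int]]:
    """Return (event, column, value_a, value_b) for each differing count.

    Only events present in both runs are compared.
    """
    diffs = []
    for event in sorted(set(a) & set(b)):
        for name, va, vb in zip(COLUMNS, a[event], b[event]):
            if va != vb:
                diffs.append((event, name, va, vb))
    return diffs


# The five events from the table, and the one odd run for event 1339.
RUN_A = """\
1336 8 51 169 6 2 11
1337 9 68 491 7 2 7
1338 16 81 817 10 6 6
1339 21 43 1398 11 10 17
1340 14 42 1067 8 6 16
"""
RUN_B = RUN_A.replace("1339 21 43 1398 11 10 17", "1339 21 43 1398 11 10 18")

if __name__ == "__main__":
    for event, name, va, vb in diff_summaries(parse_summary(RUN_A), parse_summary(RUN_B)):
        print(f"event {event}: {name} {va} != {vb}")
```

Comparing column by column, rather than diffing whole lines, points
straight at the quantity that changed.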

cheers,
Beni

> I ran 32 single-threaded jobs with the new software, and I see nearly 
> identical results, but not quite.  Half of the REST files have one 
> identical file size (and identical contents), and the other half have 
> a different identical file size (and identical contents).  This is 
> true whether I run with saving the time-based tracking results or the 
> wire-based tracking results to REST.  The attached "diff" files show 
> the difference for each case (via hddm-xml).
>
> This difference is isolated to one event: #1339 in the file I linked 
> everyone to a week ago.  In fact, while three wire-based and 
> time-based tracks are listed, they all have the same candidate id: 2.
>
> We've got it cornered Simon!  Now let's finish it off!
>
>  - Paul
>
> On Mar 13, 2014, at 6:56 PM, Sean Dobbs wrote:
>
>>
>>
>> Hi all,
>>
>> I checked out and built a clean version with the new tags and am now 
>> seeing consistent results when running with one thread and 4 threads.
>>
>>
>> ---Sean
>>
>>
>> On Thu, Mar 13, 2014 at 2:37 PM, Mark Ito <marki at jlab.org> wrote:
>>
>>     Still seeing differences of the same ilk as previously reported.
>>
>>     On 03/13/2014 02:31 PM, Mark Ito wrote:
>>     > I've re-tagged to reflect this change: tags/sim-recon-2.5 .
>>     >
>>     > On 03/13/2014 02:12 PM, Simon Taylor wrote:
>>     >> I have checked in some changes to the tracking code that appear to
>>     >> address the valgrind errors mentioned below.
>>     >>
>>     >> Simon
>>     >>
>>     >> On 03/12/2014 03:59 PM, Matthew Shepherd wrote:
>>     >>> Having just spent many frustrating hours hunting down my own
>>     >>> separate non-deterministic bug, I was motivated to run hd_dump
>>     >>> -DTrackWireBased through valgrind.
>>     >>>
>>     >>> The error below seems suspicious and could result in
>>     >>> non-deterministic behaviour, although valgrind is known to
>>     >>> generate "errors" where there are none.  I didn't have time to
>>     >>> look at the code since I have to run to another meeting, but
>>     >>> thought I would pass it on.
>>     >>>
>>     >>> Matt
>>     >>>
>>     >>>
>>     >>> ==7443== Conditional jump or move depends on uninitialised value(s)
>>     >>> ==7443==    at 0x88F769: DTrackFitterKalmanSIMD::KalmanForwardCDC(double, DMatrix5x1&, DMatrix5x5&, double&, unsigned int&) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>     >>> ==7443==    by 0x8926C4: DTrackFitterKalmanSIMD::ForwardCDCFit(DMatrix5x1 const&, DMatrix5x5 const&) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>     >>> ==7443==    by 0x89762A: DTrackFitterKalmanSIMD::KalmanLoop() (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>     >>> ==7443==    by 0x898315: DTrackFitterKalmanSIMD::FitTrack() (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>     >>> ==7443==    by 0x84B8CB: DTrackFitter::FindHitsAndFitTrack(DKinematicData const&, DReferenceTrajectory const*, jana::JEventLoop*, double, int, double, DetectorSystem_t) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>     >>> ==7443==    by 0x8C25D7: DTrackWireBased_factory::DoFit(unsigned int, DTrackCandidate const*, DReferenceTrajectory*, jana::JEventLoop*, double) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>     >>> ==7443==    by 0x8C4803: DTrackWireBased_factory::evnt(jana::JEventLoop*, int) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>     >>> ==7443==    by 0x6C8F38: jana::JFactory<DTrackWireBased>::Get(std::vector<DTrackWireBased const*, std::allocator<DTrackWireBased const*> >&) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>     >>> ==7443==    by 0x6C97EC: jana::JFactory<DTrackWireBased>* jana::JEventLoop::GetFromFactory<DTrackWireBased>(std::vector<DTrackWireBased const*, std::allocator<DTrackWireBased const*> >&, char const*, jana::JEventLoop::data_source_t&, bool) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>     >>> ==7443==    by 0x6C9A84: jana::JFactory<DTrackWireBased>* jana::JEventLoop::Get<DTrackWireBased>(std::vector<DTrackWireBased const*, std::allocator<DTrackWireBased const*> >&, char const*, bool) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>     >>> ==7443==    by 0x6CA153: jana::JFactory<DTrackWireBased>::GetNrows() (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>     >>> ==7443==    by 0x571A39: MyProcessor::evnt(jana::JEventLoop*, int) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>     >>>
>>     >>>
>>     >> _______________________________________________
>>     >> Halld-offline mailing list
>>     >> Halld-offline at jlab.org
>>     >> https://mailman.jlab.org/mailman/listinfo/halld-offline
>>
>>     --
>>     Mark M. Ito, Jefferson Lab, marki at jlab.org, (757)269-5295
>>
>>
>>
>>
>>
>>
>> -- 
>> Sean Dobbs
>> Department of Physics & Astronomy
>> Northwestern University
>> phone: 847-467-2826
>
>
>

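The file-level check Paul describes in the quoted message (32 jobs whose
REST files fall into two groups with identical contents) can be
automated by grouping the output files by content digest. This is a
minimal Python sketch and not the procedure anyone in the thread
actually used: it assumes the outputs are ordinary files on disk, and
the function names file_digest and group_identical are hypothetical.

```python
import hashlib
import sys
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def group_identical(paths: Iterable[Path]) -> List[List[Path]]:
    """Group files with byte-identical contents, largest group first."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for p in paths:
        groups[file_digest(p)].append(p)
    return sorted((sorted(g) for g in groups.values()), key=len, reverse=True)


if __name__ == "__main__":
    # Pass the output files of all jobs on the command line.
    for group in group_identical(Path(a) for a in sys.argv[1:]):
        print(len(group), "identical:", ", ".join(str(p) for p in group))
```

If every job were reproducible this would print a single group; two
groups of equal size would match the half-and-half split reported
above. Identical digests only establish that the files are
byte-identical. Finding which event differs still needs a content-level
diff such as the hddm-xml comparison mentioned in the thread.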