[Halld-offline] Fwd: non-reproducibility study
Paul Mattione
pmatt at jlab.org
Fri Mar 14 11:44:41 EDT 2014
The difference is for:
Event #1339, Track Candidate #2, DTrackCandidate pz & z0.
We still need to find which track candidate factory this is coming from.
- Paul
On Mar 14, 2014, at 8:44 AM, Beni Zihlmann wrote:
> Hi Paul,
> Yeah! I see a very similar thing. I run single-threaded, but with two different groups of runs:
> once using hd_root and once using hd_dump, both with my plugin. The group
> using hd_root is consistent within itself, and the group using hd_dump agrees with
> the hd_root group except for one run! And yes, it's event 1339!
>
> Event  #Tracks  #CDCHits  #FDCHits  #+tracks  #-tracks  #Neutrals
> -----------------------------------------------------------------
>  1336        8        51       169         6         2         11
>  1337        9        68       491         7         2          7
>  1338       16        81       817        10         6          6
>  1339       21        43      1398        11        10         17  >>>>> 1339 21 43 1398 11 10 18
>  1340       14        42      1067         8         6         16
>
>
> different # of Neutrals!
>
> cheers,
> Beni
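
A comparison like the one in the table above can be automated. The following is a minimal Python sketch, assuming each run writes a plain whitespace-separated summary with the event number in the first column; the function names and the sample rows (copied from the table) are illustrative and not part of hd_root, hd_dump, or any plugin.

```python
def parse_summary(text):
    """Map event number -> tuple of counts, skipping header and separator lines."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep only lines made up entirely of integers (the data rows).
        if not fields or not all(f.isdigit() for f in fields):
            continue
        event, *counts = (int(f) for f in fields)
        rows[event] = tuple(counts)
    return rows


def differing_events(a, b):
    """Return sorted event numbers whose counts differ or that appear in only one run."""
    ra, rb = parse_summary(a), parse_summary(b)
    return sorted(e for e in ra.keys() | rb.keys() if ra.get(e) != rb.get(e))


# Sample rows taken from the table; run_b carries the differing #Neutrals value.
run_a = """\
1338 16 81 817 10 6 6
1339 21 43 1398 11 10 17
1340 14 42 1067 8 6 16
"""
run_b = run_a.replace("1339 21 43 1398 11 10 17", "1339 21 43 1398 11 10 18")

print(differing_events(run_a, run_b))  # prints [1339]
```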
>
>> I ran 32 single-threaded jobs with the new software, and I see nearly identical results, but not quite. Half of the REST files share one file size (and have identical contents), and the other half share a different file size (and also have identical contents). This is true whether I save the time-based tracking results or the wire-based tracking results to REST. The attached "diff" files show the difference for each case (via hddm-xml).
>>
>> This difference is isolated to one event: #1339 in the file I linked everyone to a week ago. In fact, while three wire-based and time-based tracks are listed, they all have the same candidate id: 2.
>>
>> We've got it cornered Simon! Now let's finish it off!
>>
>> - Paul
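
Grouping the output files of many jobs by content, as described above, can be sketched in a few lines of Python. This is only an illustration under assumptions: the file names (job0.hddm and so on) and the helper group_by_content are hypothetical, and the demo uses tiny stand-in files rather than real REST output.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def group_by_content(paths):
    """Group files with byte-identical contents, largest group first."""
    groups = defaultdict(list)
    for p in paths:
        # Files with the same SHA-256 digest are treated as identical.
        digest = hashlib.sha256(Path(p).read_bytes()).hexdigest()
        groups[digest].append(str(p))
    return sorted(groups.values(), key=len, reverse=True)


# Demo with stand-in files: three identical outputs and one that differs.
with tempfile.TemporaryDirectory() as tmp:
    files = []
    for i, payload in enumerate([b"A", b"A", b"B", b"A"]):
        f = Path(tmp) / f"job{i}.hddm"  # hypothetical file name
        f.write_bytes(payload)
        files.append(f)
    demo_groups = group_by_content(files)
    print([len(g) for g in demo_groups])  # prints [3, 1]
```

Taking one file from each group and diffing their XML dumps is then enough to see what differs between the groups.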
>>
>> On Mar 13, 2014, at 6:56 PM, Sean Dobbs wrote:
>>
>>>
>>>
>>> Hi all,
>>>
>>> I checked out and built a clean version with the new tags and am now seeing consistent results when running with one thread and 4 threads.
>>>
>>>
>>> ---Sean
>>>
>>>
>>> On Thu, Mar 13, 2014 at 2:37 PM, Mark Ito <marki at jlab.org> wrote:
>>> Still seeing differences of the same ilk as previously reported.
>>>
>>> On 03/13/2014 02:31 PM, Mark Ito wrote:
>>> > I've re-tagged to reflect this change: tags/sim-recon-2.5 .
>>> >
>>> > On 03/13/2014 02:12 PM, Simon Taylor wrote:
>>> >> I have checked in some changes to the tracking code that appear to
>>> >> address the valgrind errors mentioned below.
>>> >>
>>> >> Simon
>>> >>
>>> >> On 03/12/2014 03:59 PM, Matthew Shepherd wrote:
>>> >>> Having just spent many frustrating hours hunting down my own separate non-deterministic bug, I was motivated to run hd_dump -DTrackWireBased through valgrind.
>>> >>>
>>> >>> The error below seems suspicious and could result in non-deterministic behaviour, although valgrind is known to generate "errors" where there are none. I didn't have time to look at the code since I have to run to another meeting, but thought I would pass it on.
>>> >>>
>>> >>> Matt
>>> >>>
>>> >>>
>>> >>> ==7443== Conditional jump or move depends on uninitialised value(s)
>>> >>> ==7443== at 0x88F769: DTrackFitterKalmanSIMD::KalmanForwardCDC(double, DMatrix5x1&, DMatrix5x5&, double&, unsigned int&) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443== by 0x8926C4: DTrackFitterKalmanSIMD::ForwardCDCFit(DMatrix5x1 const&, DMatrix5x5 const&) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443== by 0x89762A: DTrackFitterKalmanSIMD::KalmanLoop() (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443== by 0x898315: DTrackFitterKalmanSIMD::FitTrack() (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443== by 0x84B8CB: DTrackFitter::FindHitsAndFitTrack(DKinematicData const&, DReferenceTrajectory const*, jana::JEventLoop*, double, int, double, DetectorSystem_t) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443== by 0x8C25D7: DTrackWireBased_factory::DoFit(unsigned int, DTrackCandidate const*, DReferenceTrajectory*, jana::JEventLoop*, double) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443== by 0x8C4803: DTrackWireBased_factory::evnt(jana::JEventLoop*, int) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443== by 0x6C8F38: jana::JFactory<DTrackWireBased>::Get(std::vector<DTrackWireBased const*, std::allocator<DTrackWireBased const*> >&) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443== by 0x6C97EC: jana::JFactory<DTrackWireBased>* jana::JEventLoop::GetFromFactory<DTrackWireBased>(std::vector<DTrackWireBased const*, std::allocator<DTrackWireBased const*> >&, char const*, jana::JEventLoop::data_source_t&, bool) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443== by 0x6C9A84: jana::JFactory<DTrackWireBased>* jana::JEventLoop::Get<DTrackWireBased>(std::vector<DTrackWireBased const*, std::allocator<DTrackWireBased const*> >&, char const*, bool) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443== by 0x6CA153: jana::JFactory<DTrackWireBased>::GetNrows() (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443== by 0x571A39: MyProcessor::evnt(jana::JEventLoop*, int) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>>
>>> >>>
>>> >> _______________________________________________
>>> >> Halld-offline mailing list
>>> >> Halld-offline at jlab.org
>>> >> https://mailman.jlab.org/mailman/listinfo/halld-offline
>>>
>>> --
>>> Mark M. Ito, Jefferson Lab, marki at jlab.org, (757)269-5295
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Sean Dobbs
>>> Department of Physics & Astronomy
>>> Northwestern University
>>> phone: 847-467-2826