[Halld-offline] Fwd: non-reproducibility study

Paul Mattione pmatt at jlab.org
Fri Mar 14 11:44:41 EDT 2014


The difference is for:

Event #1339, Track Candidate #2, DTrackCandidate pz & z0.

We still need to find which track candidate factory this is coming from.

 - Paul
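
For anyone repeating this kind of check, here is a minimal sketch of the comparison, not the procedure actually used in this thread. It groups output files by content hash (to see how many distinct outputs a set of jobs produced) and then lists the lines that differ between two text dumps. The function names are hypothetical, and it assumes the outputs have already been dumped to text (for example via hddm-xml) before the line comparison.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_digest(paths):
    """Map SHA-256 digest -> sorted list of paths whose contents are identical."""
    groups = defaultdict(list)
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        groups[digest].append(str(path))
    return {digest: sorted(members) for digest, members in groups.items()}


def differing_lines(text_a, text_b):
    """Return (line_number, line_a, line_b) for every position where the dumps differ.

    Line numbers are 1-based. A missing line (one dump is shorter) is reported as None.
    This is a positional comparison, so it assumes both dumps list events in the same order.
    """
    lines_a = text_a.splitlines()
    lines_b = text_b.splitlines()
    diffs = []
    for i in range(max(len(lines_a), len(lines_b))):
        a = lines_a[i] if i < len(lines_a) else None
        b = lines_b[i] if i < len(lines_b) else None
        if a != b:
            diffs.append((i + 1, a, b))
    return diffs
```

If group_by_digest returns more than one group for jobs that should be identical, passing one representative dump from each group to differing_lines narrows the discrepancy down to specific lines, and from there to a specific event.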

On Mar 14, 2014, at 8:44 AM, Beni Zihlmann wrote:

> Hi Paul,
> yea! I see a very similar thing. I run single-threaded but with two different groups:
> once using hd_root and once using hd_dump, both with my plugin. The group
> using hd_root is consistent within itself, and the group with hd_dump agrees with
> the group of hd_root except for one run! And yes, it's event 1339!
> 
> Event  #Tracks  #CDCHits  #FDCHits  #+tracks  #-tracks  #Neutrals
> -----------------------------------------------------------------
> 1336         8        51       169         6         2         11
> 1337         9        68       491         7         2          7
> 1338        16        81       817        10         6          6
> 1339        21        43      1398        11        10         17     >>>>> 1339    21  43  1398  11  10   18
> 1340        14        42      1067         8         6         16
> 
> 
> different # of Neutrals!
> 
> cheers,
> Beni
> 
>> I ran 32 single-threaded jobs with the new software, and I see nearly identical results, but not quite.  Half of the REST files have one identical file size (and identical contents), and the other half have a different identical file size (and identical contents).  This is true whether I run with saving the time-based tracking results or the wire-based tracking results to REST.  The attached "diff" files show the difference for each case (via hddm-xml).  
>> 
>> This difference is isolated to one event: #1339 in the file I linked everyone to a week ago.  In fact, while three wire-based and time-based tracks are listed, they all have the same candidate id: 2.  
>> 
>> We've got it cornered Simon!  Now let's finish it off! 
>> 
>>  - Paul
>> 
>> On Mar 13, 2014, at 6:56 PM, Sean Dobbs wrote:
>> 
>>> 
>>> 
>>> Hi all,
>>> 
>>> I checked out and built a clean version with the new tags and am now seeing consistent results when running with one thread and 4 threads.
>>> 
>>> 
>>> ---Sean
>>> 
>>> 
>>> On Thu, Mar 13, 2014 at 2:37 PM, Mark Ito <marki at jlab.org> wrote:
>>> Still seeing differences of the same ilk as previously reported.
>>> 
>>> On 03/13/2014 02:31 PM, Mark Ito wrote:
>>> > I've re-tagged to reflect this change: tags/sim-recon-2.5 .
>>> >
>>> > On 03/13/2014 02:12 PM, Simon Taylor wrote:
>>> >> I have checked in some changes to the tracking code that appear to
>>> >> address the valgrind errors mentioned below.
>>> >>
>>> >> Simon
>>> >>
>>> >> On 03/12/2014 03:59 PM, Matthew Shepherd wrote:
>>> >>> Having just spent many frustrating hours hunting down my own separate non-deterministic bug I was motivated to run hd_dump -DTrackWireBased through valgrind.
>>> >>>
>>> >>> The error below seems suspicious and could result in non-deterministic behaviour, although valgrind is known to generate "errors" where there are none.  I didn't have time to look at the code since I have to run to another meeting, but thought I would pass it on.
>>> >>>
>>> >>> Matt
>>> >>>
>>> >>>
>>> >>> ==7443== Conditional jump or move depends on uninitialised value(s)
>>> >>> ==7443==    at 0x88F769: DTrackFitterKalmanSIMD::KalmanForwardCDC(double, DMatrix5x1&, DMatrix5x5&, double&, unsigned int&) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443==    by 0x8926C4: DTrackFitterKalmanSIMD::ForwardCDCFit(DMatrix5x1 const&, DMatrix5x5 const&) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443==    by 0x89762A: DTrackFitterKalmanSIMD::KalmanLoop() (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443==    by 0x898315: DTrackFitterKalmanSIMD::FitTrack() (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443==    by 0x84B8CB: DTrackFitter::FindHitsAndFitTrack(DKinematicData const&, DReferenceTrajectory const*, jana::JEventLoop*, double, int, double, DetectorSystem_t) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443==    by 0x8C25D7: DTrackWireBased_factory::DoFit(unsigned int, DTrackCandidate const*, DReferenceTrajectory*, jana::JEventLoop*, double) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443==    by 0x8C4803: DTrackWireBased_factory::evnt(jana::JEventLoop*, int) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443==    by 0x6C8F38: jana::JFactory<DTrackWireBased>::Get(std::vector<DTrackWireBased const*, std::allocator<DTrackWireBased const*> >&) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443==    by 0x6C97EC: jana::JFactory<DTrackWireBased>* jana::JEventLoop::GetFromFactory<DTrackWireBased>(std::vector<DTrackWireBased const*, std::allocator<DTrackWireBased const*> >&, char const*, jana::JEventLoop::data_source_t&, bool) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443==    by 0x6C9A84: jana::JFactory<DTrackWireBased>* jana::JEventLoop::Get<DTrackWireBased>(std::vector<DTrackWireBased const*, std::allocator<DTrackWireBased const*> >&, char const*, bool) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443==    by 0x6CA153: jana::JFactory<DTrackWireBased>::GetNrows() (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>> ==7443==    by 0x571A39: MyProcessor::evnt(jana::JEventLoop*, int) (in /home/fs1/mashephe/gluex/my_src/bin/Linux_CentOS6-x86_64-gcc4.4.6/hd_dump)
>>> >>>
>>> >>>
>>> >> _______________________________________________
>>> >> Halld-offline mailing list
>>> >> Halld-offline at jlab.org
>>> >> https://mailman.jlab.org/mailman/listinfo/halld-offline
>>> 
>>> --
>>> Mark M. Ito, Jefferson Lab, marki at jlab.org, (757)269-5295
>>> 
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Sean Dobbs
>>> Department of Physics & Astronomy 
>>> Northwestern University
>>> phone: 847-467-2826
>> 
>> 
>> 
> 
