[Halld-offline] problems with reading recent data with many threads

David Lawrence davidl at jlab.org
Sat Dec 17 02:20:23 EST 2016


Hi Sean,

  I’ve done a preliminary investigation and it appears there really is a corrupted event
in the middle of the file. It is the 8th “event” in an EVIO block (not a block as in entangled
events, but one level up). The two header words for this particular physics event seem
corrupted: the length claims >600MB, which is bogus, and a header word of 0x41444331
also doesn’t make sense. I can see a data bank for ROC 51 (FDC1) at about the right position
relative to the corrupted header based on the previous event. I can also see another valid
physics event header in the right location given the average event size for the file. The
corrupted header means those 20 events are basically lost. More problematic, though, is
that it throws the parser completely off so that it cannot continue. If we deem the
inaccessible data valuable enough, I could possibly implement some type of recovery
mechanism that searches for what looks like the next valid physics event header.
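
For illustration, here is a minimal Python sketch of that kind of resynchronization
scan. It is not the HDEVIO implementation. It assumes the usual EVIO bank layout, where
the first header word is a length in 32-bit words that excludes itself and the tag sits
in the upper 16 bits of the second word. The tag values in ASSUMED_PHYSICS_TAGS, the
MAX_EVENT_WORDS bound, and all function names are hypothetical placeholders. The small
helper at the top only renders a word as characters; it shows that 0x41444331 is the
byte sequence for the text "ADC1", though whether that points to the source of the
stray word is not something the sketch can tell.

```python
import struct

# Assumption: tags this sketch treats as "physics event". Hypothetical values;
# substitute whatever the real parser accepts.
ASSUMED_PHYSICS_TAGS = {0xFF50, 0xFF70}
# Assumption: sanity bound on event length in 32-bit words (about 20 MB).
MAX_EVENT_WORDS = 5_000_000


def word_as_ascii(word):
    """Render a 32-bit word as 4 characters, most significant byte first."""
    raw = struct.pack(">I", word)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


def looks_like_header(words, i):
    """True if words[i] and words[i + 1] pass the assumed header checks."""
    if i + 1 >= len(words):
        return False
    length, second = words[i], words[i + 1]
    if length < 1 or length > MAX_EVENT_WORDS:
        return False
    if i + 1 + length > len(words):  # the bank must fit inside the block
        return False
    return (second >> 16) in ASSUMED_PHYSICS_TAGS


def find_next_header(words, start):
    """Return the index of the next plausible header at or after start, or None.

    A candidate is accepted only if the bank it describes ends exactly at the
    end of the block or is followed by another plausible header.
    """
    for i in range(start, len(words) - 1):
        if not looks_like_header(words, i):
            continue
        nxt = i + 1 + words[i]
        if nxt == len(words) or looks_like_header(words, nxt):
            return i
    return None


# Made-up block: one good event, one corrupted header, then two good events.
good1 = [3, 0xFF500101, 1, 2]
corrupt = [0x09000000, 0x41444331, 7, 7, 7]  # invented length word, ~600 MB
good2 = [2, 0xFF500101, 9]
good3 = [1, 0xFF500101]
block = good1 + corrupt + good2 + good3

resync_index = find_next_header(block, len(good1))
```

The chained check (a candidate must be followed by another plausible header) is there
to reduce false matches on payload words that happen to resemble a header. It is
conservative: a good event sitting directly in front of a second corrupted header
would be skipped as well.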

The biggest issue is why the corruption is there in the first place. This part of the formatting
should be done by the event builder so I don’t believe it can be the front end module 
firmware or the ROL. I do know there has been some experimenting in the last few days 
with the new EMU sockets feature in CODA. The fact that it’s a new feature makes it a suspect.

We can discuss it more at the RC meeting in the morning.

Regards,
-David


> On Dec 16, 2016, at 6:31 PM, David Lawrence <davidl at jlab.org> wrote:
> 
> Hi Sean,
> 
>  Nathan pointed me to this earlier this morning but I haven’t had a chance to track it down.
> It looks to be only an infrequent event, most likely something that was added in sync
> events. I do not believe there is any corruption of the data overall. I will look into it
> this evening to see if I can track it down.
> 
> Regards,
> -David
> 
> 
>> On Dec 16, 2016, at 6:08 PM, Sean Dobbs <s-dobbs at northwestern.edu> wrote:
>> 
>> Hi Offliners,
>> 
>> I've run into some crashes reading some of the raw data files taken last night.  I've been analyzing run 21943 with the latest sim-recon, and have noticed this problem with files hd_rawdata_021943_001.evio and hd_rawdata_021943_002.evio
>> 
>> Strangely, the crashes seem to occur when I run with many (>20) threads.  I tried recreating the problem in single-threaded mode, but it did not occur.  I haven't done a scan to see at which point the problem happens, but it does seem to occur consistently at larger thread counts.  A snippet containing the error is below.  Has anyone else seen anything like this?
>> 
>> Cheers,
>> Sean
>> 
>> libraries/DAQ/HDEVIO.cc:873 Uknown tag: 4144ad)  1.5kHz  (avg.: 1.6kHz)       
>> libraries/DAQ/HDEVIO.cc:873 Uknown tag: 41b
>> libraries/DAQ/HDEVIO.cc:873 Uknown tag: 98eb
>> libraries/DAQ/HDEVIO.cc:873 Uknown tag: 0
>> libraries/DAQ/HDEVIO.cc:873 Uknown tag: 8bc0
>> JANA ERROR>>
>> JANA ERROR>>?JException:    code = 0    text = EVIO bank length word is zero in swap_bank!
>> JANA ERROR>>
>> JANA ERROR>>File: libraries/DAQ/swap_bank.cc,   Line: 35
>> JANA ERROR>>nts processed  (531.6k events read)  480.0Hz  (avg.: 1.6kHz)     
>> JANA ERROR>>?JException:    code = 0    text = EVIO bank length word is zero in swap_bank!
>> JANA ERROR>>
>> JANA ERROR>>File: libraries/DAQ/swap_bank.cc,   Line: 35
>> JANA ERROR>>nts processed  (531.7k events read)  0.0Hz  (avg.: 1.6kHz)       
>> JANA ERROR>>?JException:    code = 0    text = WARNING: unknown bank type (0x43)
>> JANA ERROR>>
>> JANA ERROR>>File: libraries/DAQ/swap_bank.cc,   Line: 91
>> JANA ERROR>> didn't sleep  (531.7k events read)  0.0Hz  (avg.: 1.6kHz)     
>> 
>> _______________________________________________
>> Halld-offline mailing list
>> Halld-offline at jlab.org
>> https://mailman.jlab.org/mailman/listinfo/halld-offline
> 
> 




