[Clas12_software] Critical bug in 5a.1.0
Nathan Baltzell
baltzell at jlab.org
Fri Mar 2 20:51:40 EST 2018
Will be good to give people a reference model. I based EB’s CCDB access on ECAL's.
-Nathan
> On Mar 2, 2018, at 8:40 PM, Gagik Gavalian <gavalian at jlab.org> wrote:
>
> We have tools for accessing the database, which are safe. People are advised to use them to avoid problems, but this is not enforced.
>
> Gagik
>
> Sent from my iPhone
>
> On Mar 2, 2018, at 8:31 PM, Francois-Xavier Girod <fxgirod at jlab.org> wrote:
>
>> Dear Vardan
>>
>> Can we agree on the same file to test? It may be easier to start from a raw file and decode it so I might use a recent run from the "online" directory, or might even copy fresh data
>>
>> I can run this on one of the clara machines tonight
>>
>> Best regards
>> FX
>>
>> On Fri, Mar 2, 2018 at 8:13 PM, Vardan Gyurjyan <gurjyan at jlab.org> wrote:
>> Hi Sylvester,
>> I also notice that the FTCAL and FTHODO services make many DB accesses during initialization. I would expect to see a single access to the database followed by a single DB disconnect.
>> -Vardan
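
The access pattern Vardan expects (one connection at initialization, every table read through it, then one disconnect) can be sketched as follows. This is only an illustration under stated assumptions: ConstantsSource, open(), loadAll() and the table paths are hypothetical names invented for the sketch, not COATJAVA or CCDB classes, and the fake source only counts how often it is opened and closed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class ConstantsCacheSketch {

    /** Hypothetical stand-in for a database connection; not a COATJAVA or CCDB class. */
    interface ConstantsSource extends AutoCloseable {
        double[] readTable(String path);
        @Override void close();
    }

    /** Counters so the sketch can show how many connects and disconnects happened. */
    public static final AtomicInteger OPENS = new AtomicInteger();
    public static final AtomicInteger CLOSES = new AtomicInteger();

    /** Opens a fake source that returns dummy values for any table path. */
    static ConstantsSource open() {
        OPENS.incrementAndGet();
        return new ConstantsSource() {
            @Override public double[] readTable(String path) { return new double[] {1.0, 2.0}; }
            @Override public void close() { CLOSES.incrementAndGet(); }
        };
    }

    /** Reads every requested table through one connection, then disconnects once. */
    public static Map<String, double[]> loadAll(String... paths) {
        Map<String, double[]> cache = new HashMap<>();
        try (ConstantsSource source = open()) {
            for (String path : paths) {
                cache.put(path, source.readTable(path));
            }
        }
        return cache;
    }

    public static void main(String[] args) {
        // Table paths are placeholders, not real CCDB paths.
        Map<String, double[]> cache = loadAll("/hypothetical/table_a", "/hypothetical/table_b");
        System.out.println("tables=" + cache.size()
                + " opens=" + OPENS.get() + " closes=" + CLOSES.get());
    }
}
```

After initialization the engine would read only from the in-memory map during event processing, so the number of database connections does not grow with the number of tables or events.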
>>
>>
>> Sent from my iPhone
>>
>> On Mar 2, 2018, at 7:50 PM, Sylvester J. Joosten <tuf42480 at temple.edu> wrote:
>>
>>> Hi Rafaella, hi Vardan,
>>>
>>> In case this is useful: I recognize this error in the context of accessing tables from CCDB from COATJAVA. It happens when the check
>>>
>>> if(this.entries.hasItem(index)==false) (org.jlab.utils.groups.IndexedTable)
>>>
>>> fails. entries.hasItem() contains the following checks:
>>>
>>> --> calls IndexedList.hasItem(int... index) (org.jlab.utils.groups.IndexedList)
>>> --> has 2 internal checks:
>>>     1. if(index.length!=this.indexSize)
>>>     2. IndexGenerator.hashCode(index);
>>>
>>> In my experience, this implies that there is some kind of issue or inconsistency when accessing CCDB, at least for this particular run.
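
The two checks described above can be illustrated with a small stand-alone class. This is a sketch of the idea only, not the COATJAVA source: IndexedListSketch and its key() packing scheme are assumptions made for the example, and the real IndexGenerator.hashCode may compute its key differently.

```java
import java.util.HashMap;
import java.util.Map;

public class IndexedListSketch<T> {
    private final int indexSize;
    private final Map<Long, T> items = new HashMap<>();

    public IndexedListSketch(int indexSize) {
        if (indexSize < 1 || indexSize > 4) {
            throw new IllegalArgumentException("indexSize must be between 1 and 4");
        }
        this.indexSize = indexSize;
    }

    /** Packs up to four indices, 16 bits each, into one long key (illustrative scheme). */
    static long key(int... index) {
        long k = 0L;
        for (int i : index) {
            k = (k << 16) | (i & 0xFFFFL);
        }
        return k;
    }

    public void add(T item, int... index) {
        if (index.length != indexSize) {
            throw new IllegalArgumentException("expected " + indexSize + " indices");
        }
        items.put(key(index), item);
    }

    public boolean hasItem(int... index) {
        // Check 1: the caller must supply exactly as many indices as the list was built with.
        if (index.length != indexSize) {
            return false;
        }
        // Check 2: the key computed from the indices must be present.
        return items.containsKey(key(index));
    }

    public static void main(String[] args) {
        IndexedListSketch<Double> table = new IndexedListSketch<>(3);
        table.add(1.5, 1, 2, 3);
        System.out.println(table.hasItem(1, 2, 3)); // true
        System.out.println(table.hasItem(1, 2));    // false: wrong number of indices
        System.out.println(table.hasItem(1, 2, 4)); // false: no entry under this key
    }
}
```

Either failure mode makes hasItem() return false, which is consistent with a table that was never filled or was filled for a different set of indices than the caller requests.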
>>>
>>> Just my 2 cents.
>>> Best,
>>> Sylvester
>>>
>>>
>>>> On Mar 2, 2018, at 7:37 PM, Vardan Gyurjyan <gurjyan at jlab.org> wrote:
>>>>
>>>> Hi Raffaella,
>>>> I do not know exactly what the cause is, but whenever I use one of these services in the data processing chain I get an out-of-memory exception. This exception is not recoverable. As I mentioned, this happens every single time on the clonfarm0 node. I have never seen this error on the farm machines, though. Maybe FX can comment. I am getting these errors on data over the ET as well as on decoded files from FX’s decoded files directory.
>>>> Vardan
>>>> Sent from my iPhone
>>>>
>>>> On Mar 2, 2018, at 4:53 PM, Raffaella De Vita <Raffaella.Devita at ge.infn.it> wrote:
>>>>
>>>>> Hi Vardan,
>>>>> I'm the author of those services. The only recent change was a modification of a hardcoded constant and, after that was done, FX cooked several files for FT studies. Nothing else has changed in more than a month. Anyway, I will try to reproduce the problem and debug it. What data are you processing?
>>>>> Regards,
>>>>> Raffaella
>>>>>
>>>>> Vardan Gyurjyan wrote:
>>>>>> Hi FX,
>>>>>>
>>>>>> Since I am not sure who the author of the FTCAL and FTHODO engines is, I am cc-ing this email to clas12.
>>>>>> There is a critical bug introduced in these service engines' code for the 5a.1.0 release. It is 100% reproducible on the clonfarm0 node (the online reconstruction node), where the JVM crashes with an out-of-memory exception. The reconstruction chain without these services functions properly.
>>>>>> These service engines also print, for every event, warning messages such as “ [IndexedTable] ---> error.. entry does not exist” (I consider them warnings, since if this were a real error the processing should be stopped). Anyway, this is a serious bug that can result in a large number of job failures on the farm.
>>>>>>
>>>>>> -vardan
>>>>>> --------------------------------------------------
>>>>>> Vardan H. Gyurjyan, Ph.D.
>>>>>> Staff Scientist
>>>>>> Thomas Jefferson Accelerator Facility
>>>>>> Newport News, VA, 23606
>>>>>> E-mail: gurjyan at jlab.org
>>>>>> 757-269-5879 (JLAB)
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Clas12_software mailing list
>>>>>>
>>>>>> Clas12_software at jlab.org
>>>>>> https://mailman.jlab.org/mailman/listinfo/clas12_software
>>>>>
>>>
>>