[Clas12_software] Critical bug in 5a.1.0
Francois-Xavier Girod
fxgirod at jlab.org
Fri Mar 2 20:31:26 EST 2018
Dear Vardan,
Can we agree on the same file to test? It may be easier to start from a raw
file and decode it, so I might use a recent run from the "online" directory,
or I might even copy fresh data.
I can run this on one of the clara machines tonight.
Best regards
FX
On Fri, Mar 2, 2018 at 8:13 PM, Vardan Gyurjyan <gurjyan at jlab.org> wrote:
> Hi Sylvester,
> I also noticed that the FTCAL and FTHODO services make many DB accesses
> during initialization. I expect to see a single access to the database
> followed by a single DB disconnect.
> -Vardan
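
The expectation above (one database access per initialization, then a disconnect) could be met by caching the loaded tables per run. The following is only a minimal sketch under stated assumptions: `RunConstantsCache`, `forRun`, `loadCount`, and the loader function are hypothetical names and do not come from COATJAVA or CLARA. The loader stands in for whatever code connects, reads all tables for a run, and disconnects.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not COATJAVA or CLARA code: load calibration tables
// once per run number and serve every later request from memory.
class RunConstantsCache {
    // Assumed loader: connects, reads all tables for a run, then disconnects.
    private final Function<Integer, Map<String, double[]>> loader;
    private final Map<Integer, Map<String, double[]>> byRun = new HashMap<>();
    private int loads = 0;

    RunConstantsCache(Function<Integer, Map<String, double[]>> loader) {
        this.loader = loader;
    }

    // Returns the tables for a run; the loader is invoked only on a cache miss.
    synchronized Map<String, double[]> forRun(int run) {
        Map<String, double[]> cached = byRun.get(run);
        if (cached == null) {
            cached = loader.apply(run);
            byRun.put(run, cached);
            loads++;
        }
        return cached;
    }

    // Number of times the loader was actually called (one per distinct run).
    synchronized int loadCount() {
        return loads;
    }
}
```

With such a cache, a service that calls `forRun(run)` for every event would still touch the database only once per distinct run number, which is easy to verify by checking `loadCount()`.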
>
>
> On Mar 2, 2018, at 7:50 PM, Sylvester J. Joosten <tuf42480 at temple.edu>
> wrote:
>
> Hi Raffaella, hi Vardan,
>
> In case this is useful: I recognize this error from accessing CCDB tables
> in COATJAVA. This happens when the check
>
> if(this.entries.hasItem(index)==false) (org.jlab.utils.groups.IndexedTable)
>
> fails. entries.hasItem() performs the following checks:
>
> --> calls IndexedList.hasItem(int... index) (org.jlab.utils.groups.IndexedList)
> --> has 2 internal checks:
> 1. if(index.length!=this.indexSize)
> 2. IndexGenerator.hashCode(index);
>
> In my experience, this implies that there is some kind of issue or
> inconsistency when accessing CCDB, at least for this particular run.
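
The two checks listed above can be illustrated with a small stand-in class. This is a sketch and not the COATJAVA source: `SketchIndexedList` is a hypothetical name, and the 16-bits-per-index packing in `hashCode` is an assumption made only so the example is self-contained. It shows how a lookup can report "no entry" either because the number of indices is wrong or because no item was stored under the computed key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an index-keyed container; NOT the COATJAVA class.
class SketchIndexedList<T> {
    private final int indexSize;                        // indices per key, e.g. 3
    private final Map<Long, T> items = new HashMap<>();

    SketchIndexedList(int indexSize) {
        if (indexSize < 1 || indexSize > 4) {
            throw new IllegalArgumentException("indexSize must be 1..4");
        }
        this.indexSize = indexSize;
    }

    // Assumed packing scheme: each index occupies 16 bits of a long key.
    static long hashCode(int... index) {
        long key = 0L;
        for (int i : index) {
            key = (key << 16) | (i & 0xFFFFL);
        }
        return key;
    }

    void add(T item, int... index) {
        if (index.length != indexSize) {
            throw new IllegalArgumentException("expected " + indexSize + " indices");
        }
        items.put(hashCode(index), item);
    }

    boolean hasItem(int... index) {
        if (index.length != indexSize) {
            return false;                               // check 1: wrong number of indices
        }
        return items.containsKey(hashCode(index));      // check 2: key not present
    }
}
```

In this sketch, a table that was never filled for a given run, or filled with a different index layout, makes `hasItem` return false for every lookup, which would match an error message printed on every event.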
>
> Just my 2 cents.
> Best,
> Sylvester
>
>
> On Mar 2, 2018, at 7:37 PM, Vardan Gyurjyan <gurjyan at jlab.org> wrote:
>
> Hi Raffaella,
> I do not know exactly what the cause is, but whenever I use one of these
> services in the data processing chain I get an out-of-memory exception. This
> exception is not recoverable. As I mentioned, this happens every single time
> on the clonfarm0 node. I have never seen this error on the farm machines,
> though. Maybe FX can comment. I am getting these errors on data over the ET
> as well as on decoded files from FX’s decoded-files directory.
> Vardan
>
> On Mar 2, 2018, at 4:53 PM, Raffaella De Vita <Raffaella.Devita at ge.infn.it>
> wrote:
>
> Hi Vardan,
> I'm the author of those services. The only recent change was a
> modification of a hardcoded constant and, after that was done, FX
> cooked several files for FT studies. Nothing else has changed in more than
> a month. Anyway, I will try to reproduce the problem and debug it. What
> data are you processing?
> Regards,
> Raffaella
>
> Vardan Gyurjyan wrote:
>
> Hi FX,
>
> Since I am not sure who the author of the FTCAL and FTHODO engines is, I am
> cc-ing this email to clas12.
> There is a critical bug introduced in these service engines’ code for the
> 5a.1.0 release. It is 100% reproducible on the clonfarm0 node (the online
> reconstruction node), where the JVM crashes with an out-of-memory exception.
> The reconstruction chain without these services functions properly.
> These service engines also print warning messages for every event,
> such as “[IndexedTable] ---> error.. entry does not exist” (I consider
> them warnings, since if this were a real error the processing should be
> stopped). Anyway, this is a serious bug that can result in a large number of
> job failures on the farm.
>
> -vardan
> --------------------------------------------------
> Vardan H. Gyurjyan, Ph.D.
> Staff Scientist
> Thomas Jefferson National Accelerator Facility
> Newport News, VA, 23606
> E-mail: gurjyan at jlab.org
> 757-269-5879 (JLAB)
>
>
>
> _______________________________________________
> Clas12_software mailing list
> Clas12_software at jlab.org
> https://mailman.jlab.org/mailman/listinfo/clas12_software
>
>