[G12] Simulation failure
Michael C. Kunkel
mkunkel at jlab.org
Fri Aug 29 23:30:42 EDT 2014
Greetings,
Question: what is the difference between running a1c interactively and
running a1c on a farm node, besides the obvious place of computation?
I ask because I can run simulation jobs interactively with success,
but the same job always fails on the farm.
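
One way to narrow this down might be to dump the environment in both places
(for example `env > interactive.env` in the interactive shell and
`env > farm.env` at the start of the farm job script) and compare the two
dumps. The sketch below is only an illustration under that assumption: the
file names and the helper names parse_env and diff_env are hypothetical and
not part of any CLAS tool.

```python
def parse_env(text):
    """Parse `env`-style output (one NAME=value per line) into a dict."""
    env = {}
    for line in text.splitlines():
        # Split on the first '=' only, so values may themselves contain '='.
        name, sep, value = line.partition("=")
        if sep and name:  # skip blank lines and lines without NAME=
            env[name] = value
    return env


def diff_env(interactive, farm):
    """Return (only_interactive, only_farm, changed) for two env dicts."""
    only_interactive = sorted(set(interactive) - set(farm))
    only_farm = sorted(set(farm) - set(interactive))
    changed = {
        name: (interactive[name], farm[name])
        for name in sorted(set(interactive) & set(farm))
        if interactive[name] != farm[name]
    }
    return only_interactive, only_farm, changed


def compare_files(interactive_path, farm_path):
    """Read two saved `env` dumps (hypothetical file names) and diff them."""
    with open(interactive_path) as f:
        interactive = parse_env(f.read())
    with open(farm_path) as f:
        farm = parse_env(f.read())
    return diff_env(interactive, farm)
```

Calling compare_files("interactive.env", "farm.env") would then list the
variables set in only one place and those whose values differ, which could
point at a different database host, run index, or parameter path.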
BR
MK
On 8/29/14 6:13 PM, Michael C. Kunkel wrote:
> Greetings,
>
> After looking in a different place (the .out file), it seems that the
> issue is again that run 10 is being called.
> calib_connect: RunIndex_table = "calib_user.RunIndexg12_mk"
> calib_connect: using server=clasdb.jlab.org (default)
> calib_connect: using latest constants
> calib_connect: connecting as user=clasuser
> set_sc_version_constants for run 10: flag=2
> Abort
>
> This is the same crash I get when I ask the standard build to run with
> run 10. This was not the case several weeks ago.
>
> BR
> MK
>
>
> On 8/29/14 5:53 PM, Michael C. Kunkel wrote:
>> Greetings,
>>
>> I did not check which farm nodes they were because I was not the only one
>> whose jobs crashed. Another member had 2 jobs complete while 8 jobs crashed
>> in the a1c stage. I will verify the nodes though; it might give some insight.
>>
>> As to Paolone's statement, I was also able to run a simulation a few
>> weeks ago. I reran that same simulation and it crashed, meaning that
>> less than a month ago the same simulation jobs passed while now they
>> crash. All job comparisons were done identically, i.e. same random
>> seed, ffread, build, environment, etc. The only difference was when
>> they were submitted. I remember you (Paolone) saying you had issues
>> with pion simulation. When you fixed it, did you touch anything in the
>> build? I ask because your email is vague on this; you say
>> "I haven't touched my build at all since then".
>>
>> So does that mean the build was touched at some point when you were
>> fixing it?
>>
>> BR
>> MK
>>
>>
>> On 8/29/14 4:51 PM, Eugene Pasyuk wrote:
>>> MK, did you check on which farm nodes your jobs crashed? Sometimes one farm node goes bad and all jobs crash on it.
>>>
>>> -Eugene
>>>
>>> ----- Original Message -----
>>>> From: "Michael Paolone"<mpaolone at jlab.org>
>>>> To: "Michael C. Kunkel"<mkunkel at jlab.org>
>>>> Cc: "g12"<g12 at jlab.org>
>>>> Sent: Friday, August 29, 2014 4:37:23 PM
>>>> Subject: Re: [G12] Simulation failure
>>>>
>>>> Just FYI, I was able to run a simulation just a week ago with no errors.
>>>> Also, I haven't touched my build at all since then. Since there is likely
>>>> more than just a difference in build between Will's and MK's simulation
>>>> (i.e. database, environment, ffread, gpp params, etc.), the problem could
>>>> be elsewhere. One easy way to check would be for Will to run everything
>>>> the same except with my a1c, or MK run everything the same except with the
>>>> group build.
>>>>
>>>> -Michael
>>>>
>>>>
>>>>> Thanks Will,
>>>>>
>>>>> So it appears something has changed pertaining to Paolone's build.
>>>>>
>>>>> BR
>>>>> MK
>>>>>
>>>>>
>>>>> On 8/29/14 3:40 PM, Will Phelps wrote:
>>>>>> Hey MK,
>>>>>> I submitted ~500 jobs (5 million events) during that meeting using the
>>>>>> standard clas build and run 56855 and it looks like everything passed.
>>>>>> -Will
>>>>>>
>>>>>> On Aug 29, 2014, at 3:31 PM, Michael C. Kunkel<mkunkel at jlab.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Greetings,
>>>>>>>
>>>>>>> In the last meeting I spoke of the simulation issue I was seeing with
>>>>>>> Harsh. I have run 10 jobs that I know ran 1 month ago, and all fail with
>>>>>>> the exact a1c error that was shown during the meeting.
>>>>>>>
>>>>>>> What shall we do next? Is it possible for Paolone to rebuild his a1c?
>>>>>>>
>>>>>>> BR
>>>>>>> MK
>>>>>>> _______________________________________________
>>>>>>> G12 mailing list
>>>>>>> G12 at jlab.org
>>>>>>> https://mailman.jlab.org/mailman/listinfo/g12