[Halld-offline] Gluex osg production environment ready for testing
David Lawrence
davidl at jlab.org
Tue Jun 27 08:16:26 EDT 2017
Hi Richard,
I know I’ve seen some similar error messages in the past that I think were related to
detached plugins freeing objects in global memory. I’m not sure that is what’s
going on here, though.
The error message doesn’t seem like a lot to go on. Is it possible to provide the input
file and exact command that caused the crash? I can try reproducing the issue with
the same code version and then check if the problem still exists with the most up-to-date
sim-recon. That is, unless someone else already has an idea of what this error is.
Regards,
-David
> On Jun 26, 2017, at 12:51 PM, Richard Jones <richard.t.jones at uconn.edu> wrote:
>
> Hello Mark and all,
>
> After a number of interruptions, I finally finished the roll-out of a container-based osg production framework for Gluex simulations. Please have a look at the git project linked below. It contains two scripts that give everything you need to try it. This is bound to be somewhat rough at first, but let me know how it goes.
>
> https://github.com/rjones30/gluex-osg-jobscripts.git
>
> The sim-recon build that is configured for osg production right now is based on version.2.11.xml from the http://halldweb.jlab.org/dist area, which looks like the latest stable release. Still, it is not very stable. This is probably because of the fast development over the past few months of calibrations for real data, and the lack of corresponding exercising of the code on Monte Carlo. A minimal checkout and run of the distro job script submits a job with 40 slices of 250 events each generated with bggen, simulated with hdgeant4, passed through hd_root for REST production, and through hd_root again for plugins analysis. When I tested it, only one of the 40 jobs crashed:
>
> *** Error in `hd_root': free(): invalid pointer: 0x00002baa1c394ff8 ***
>
> with 37 completing successfully, and 2 others bombing out of hd_root for other reasons. At this point, I would like to pass things back to the reconstruction/analysis gurus. I could work on diagnosing and fixing these bugs, but as I discovered with the "hd_root hanging" bug last week, there is a good chance that some of these bugs are known and fixed in some dev branch of the sim-recon tree or jana repo.
>
> Please have a look and try things out. Let me know if/when you run into issues. I will update the HOWTO on the wiki once we get a little further into the learning curve. Right now, you can rely on the README in the git project linked above for docs.
>
> Richard