[Halld-offline] Software Meeting Minutes, February 5, 2019

Wed Feb 6 14:21:16 EST 2019

I think I am using the same set of German scripts as Richard suggests, 
someone named Vogt?

On 2/6/19 12:34 PM, Shepherd, Matthew wrote:
>> On Feb 6, 2019, at 8:09 AM, Mark Ito <marki at jlab.org> wrote:
>>
>> Thomas reported that HDGeant[3] crashes when run on a gluex_install build on Ubuntu on Windows. The error looks like
>>
>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
>> LOCB/LOCF: address 0x7ffa129ad700 exceeds the 32 bit address space
>> or is not in the data segments
>> This may result in program crash or incorrect results
>> Therefore we will stop here
>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
>>
>> No investigation of the cause has been started thus far. A candidate for an issue, perhaps?
> This is the symptom that Mac users have seen for at least 6-7 years.
>
> Below is an email thread sent to this list back in 2012.  The proposed solution is likely no longer relevant, but the cause may be similar.
>
>> Begin forwarded message:
>>
>> From: Richard Jones <richard.t.jones at uconn.edu>
>> Subject: Re: [Halld-offline] hdgeant still broken on 64-bit OS X systems?
>> Date: December 16, 2012 at 1:08:25 PM EST
>> To: <halld-offline at jlab.org>
>>
>> Matt,
>>
>> This is a common problem seen in trying to port cernlib to 64-bit. The problem is deep in the heart of cernlib, where there exists code from ages ago that was designed to circumvent inefficiencies in subroutine calling way back in the days of the VAX and the CDC.  In those days, it was expensive to call a subroutine because you had to construct a stack frame, push all of the register contents into the stack frame, then load up the argument list, then do the call.  The return was just the same in reverse.  So if the subroutine did something simple, the overhead was significant. These things called jumpad, jumpst, jumpx0, jumpx1, ... are all low-overhead ways of calling subroutines that take shortcuts with the push/restore of context.  Unfortunately, these things made certain assumptions about the length of a pointer, which changes in x86_64.  This LOCF error happens when you get it wrong.
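
As a rough illustration of the pointer-length assumption described above, the sketch below tests whether an address fits in a 32-bit word, using the two addresses from the LOCB/LOCF messages in this thread. It is not cernlib code; the names fits_in_32_bits and truncate_to_32_bits are hypothetical and exist only for this example.

```python
# Illustrative only: this is not cernlib code. It mimics the kind of check
# that rejects an address too large to be stored in a 32-bit integer.

MAX_32BIT_ADDRESS = 0xFFFFFFFF  # largest value an unsigned 32-bit word can hold


def fits_in_32_bits(address: int) -> bool:
    """Return True if the address can be stored in 32 bits without truncation."""
    return 0 <= address <= MAX_32BIT_ADDRESS


def truncate_to_32_bits(address: int) -> int:
    """Return what is left of an address if only the low 32 bits are kept."""
    return address & MAX_32BIT_ADDRESS


if __name__ == "__main__":
    # Addresses copied from the two LOCB/LOCF messages quoted in this thread.
    for address in (0x7FFA129AD700, 0x7FFF5074435C):
        print(hex(address), fits_in_32_bits(address), hex(truncate_to_32_bits(address)))
```

Both addresses sit high in the 64-bit address space, so keeping only the low 32 bits would silently point somewhere else, which is presumably why the library stops rather than continuing.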
>>
>> It takes quite some work to get it right.   Rather than fix it myself, I went on a hunt and found one guy (a German grad student, as I recall) that did it for x86_64 hardware and I downloaded his mod of 2005.  This is what I use.  I build everything from sources, so with a little luck it may also work for you.
>>
>> The quick way to go would be to download the "gridmake" utility and just be bold and try to make a hdgeant data file.  For example,
>>
>> ./gridmake -f gridmake.xml hdgeant_1.hddm
>>
>> It tries to recognize all software packages (e.g. xerces, cernlib, root, jana, amptools, ...) that you need, and downloads and builds them if they are not already present on your system in a recognizable form.  It builds everything under a "packages" directory that it creates in the current working directory where you run gridmake.  For this, all you need is the gridmake perl script and the gridmake.xml file.  Both are available from here:
>>
>> http://zeus.phys.uconn.edu/halld/gridwork/
>>
>> You also need a grid certificate and have the environment variable X509_USER_PROXY point to a proxy made from your grid cert.  Your certificate DN needs to be on file in our Gluex grid database for the build to work.
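
A minimal pre-flight sketch for the proxy requirement above, assuming only what the message states: that X509_USER_PROXY must be set and must point to a proxy file. The function name proxy_problem is hypothetical. The sketch does not validate the proxy itself and cannot tell whether the certificate DN is on file in the grid database.

```python
import os


def proxy_problem(environ=os.environ):
    """Return a short description of what is wrong with the proxy setup, or None."""
    path = environ.get("X509_USER_PROXY")
    if not path:
        return "X509_USER_PROXY is not set"
    if not os.path.isfile(path):
        return "X509_USER_PROXY points to a missing file: " + path
    return None  # variable is set and the file exists; nothing else is checked


if __name__ == "__main__":
    problem = proxy_problem()
    print(problem if problem else "proxy file found; ready to try gridmake")
```

Running this before gridmake only catches the simplest setup mistakes; an expired or unregistered proxy would still fail later in the build.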
>>
>> -Richard J.
>>
>>
>> On 12/16/2012 9:00 AM, Matthew Shepherd wrote:
>>> Hi all,
>>>
>>> I upgraded my MacBook recently to the latest version of OS X.  I was able to get the full GlueX source to compile with a little work.  Basic details:
>>>
>>> Install command line tools from "Preferences" in Xcode
>>>
>>> Install fink
>>> use fink to install cernlib and geant3
>>>
>>> set /cern/pro to point to /sw  (can probably change env also, but this is standard path)
>>>
>>> inside of /sw/lib set these symlinks:
>>>
>>> libgcc_s.dylib -> gcc4.7/lib/libgcc_s.1.dylib
>>> libgfortran.a -> gcc4.7/lib/libgfortran.a
>>> libquadmath.a -> gcc4.7/lib/libquadmath.a
>>>
>>> I then updated everything and did a fresh build.
>>>
>>> Now when I run hdgeant I get the following message:
>>>
>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
>>> LOCB/LOCF: address 0x7fff5074435c exceeds the 32 bit address space
>>> or is not in the data segments
>>> This may result in program crash or incorrect results
>>> Therefore we will stop here
>>> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
>>>
>>> While I don't understand it, it seems to be an artifact of trying to run a 64-bit version of hdgeant.  Is there a way to fix this?  Do we know if the problem is in our code or in cernlib?
>>>
>>> I'm just using cernlib provided by fink.  There is some discussion of patching and recompiling cernlib here:
>>>
>>> http://www-jlc.kek.jp/~fujiik/macosx/10.7.X/HEPonX/memo/CERNLIBonX.html
>>>
>>> However, the author seems to imply that the fink distributions have the patches applied.  I have not tried to patch and compile my own copy of cernlib.
>>>
>>> It is probably not worth a major development effort to fix this.  I just thought if someone knew of a trivial fix, I could do it.  After all, I've been running hdgeant elsewhere and copying files to my laptop for years now.
>>>
>>> Matt
>>>