[Halld-offline] Going back to the old BMS_OSNAME [Re: Changing BMS_OSNAME]

David Lawrence davidl at jlab.org
Thu Jun 9 14:44:16 EDT 2022


Ouch! This is unfortunate and not something that could have easily been foreseen.
Kudos to those who caught the problem early and had a backout plan ready just in case.

I wonder if one lesson we should take here is that maybe we should be moving
more towards a “container only” world and give up on supporting native OS
on the ifarm(?). I’m not 100% convinced myself, but it would position us closer
to a single set of binaries that can be run anywhere.

Regards,
-David

-------------------------------------------------------------
David Lawrence Ph.D.
Staff Scientist - - EPSCI Group Lead
Thomas Jefferson National Accelerator Facility
Newport News, VA
davidl at jlab.org
(757) 269-5567 W
(757) 746-6697 C


On Jun 9, 2022, at 1:58 PM, Mark Ito <marki at jlab.org> wrote:


Folks,

Alex A. discovered a problem with the new BMS_OSNAME scheme that is definitely there, but whose solution is unclear at present. It has to do with building on the native CentOS 7 of the ifarm using libraries from the build in the CentOS 7 container. Going forward, the new scheme depends on this combination working, so this is a non-trivial problem.

As a result, we are going to drop back to the old BMS_OSNAME scheme, i.e., revert Build Scripts back to version 2.34. This has already been done at JLab.

Note that any conversions you may have performed on your directories using change_bms_osname.sh do not need to be reversed to work with the reverted/old scheme. There will be soft links that you created in the conversion to service the new scheme, but those links are completely ignored in the old scheme.

TL;DR

Before this change was proposed, tests were performed where the software was run on the ifarm, using both binaries and shared libraries from the container build. The reverse scenario was also tested, i.e., running in the container using software built on the native ifarm. The newly discovered problem scenario is where new binaries are built on ifarm against libraries from the container and then run on the ifarm. So not all mixtures appear to work. It is not known why.

Also, this implies a lot of possible combinations to test. Indeed if we need to keep track of working and non-working combos, that kind of defeats the advantage of having a single build for the ifarm and container.
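To give a feel for the size of the test matrix, here is a minimal sketch that enumerates the combinations. It assumes three independent choices, each either ifarm or container: where the binary was built, where its libraries were built, and where it is run. That reading of the scenarios above is an assumption, and list_combos is a hypothetical helper, not part of Build Scripts.

```shell
#!/bin/sh
# Sketch only: the three axes (build, libs, run) are an assumed reading of
# the scenarios described in this message, not an official test plan.
list_combos() {
    for build in ifarm container; do
        for libs in ifarm container; do
            for run in ifarm container; do
                # One line per build/library/run combination.
                echo "build=$build libs=$libs run=$run"
            done
        done
    done
}

list_combos
```

Under these assumptions the newly discovered failure corresponds to the line with build=ifarm, libs=container, run=ifarm.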

Since we do not have a solution in hand we think it is better to cut our losses now while we work on understanding the problem. We will likely try again in the intermediate term when a revised strategy has been developed.

Build Scripts versions 3.0 and 3.1 have been deleted from the GitHub repo.

Sorry for the confusion.

  -- Mark

On 6/8/22 5:00 PM, Mark Ito wrote:

Folks,

Elton discovered a bug in the change_bms_osname.sh script. The shebang <https://en.wikipedia.org/wiki/Shebang_(Unix)> was malformed! That went unnoticed during testing. It would have affected you if you are running the tcsh shell.

The problem is fixed in Build Scripts 3.1 <https://github.com/JeffersonLab/build_scripts/releases/tag/3.1>, which is now installed at JLab.

  -- Mark

On 6/8/22 11:55 AM, Mark Ito wrote:

The change went through at 8 this morning as planned. If you see problems, please let me know.

On 6/7/22 4:27 PM, Mark Ito wrote:

Reminder: the definition of BMS_OSNAME will change at JLab tomorrow morning.

On 6/6/22 3:10 PM, Mark Ito wrote:

Folks,

We are changing the value of the BMS_OSNAME environment variable. Some user-owned directories will need patching to work with the new scheme. A script to do the patching will be available (see below). This action was discussed and endorsed at recent Software Meetings.

The change occurs when we update from Build Scripts version 2.34 to version 3.0. This will occur at JLab at 8:00 am, on Wednesday, June 8.

Note: there is a test version of the new Build Scripts checked out at /group/halld/Software/build_scripts-3.0 for those who would like to test the new arrangement.

This change will affect you if you have any private builds of certain GlueX packages (e.g., halld_recon, halld_sim) or if you have personal plugins in your HALLD_MY directory. Also, people who maintain their own GLUEX_TOP directories will have to apply a patch. You will know that you have been affected if you suddenly start getting file-not-found errors.

The script to patch your directories is $BUILD_SCRIPTS/change_bms_osname.sh. The usage message is

usage: change_bms_osname.sh <target>

where target is one of

  "jlab": convert the /group/halld/Software/builds tree at JLab (script must be
          executed from a directory named "builds")
  "gluex_top": convert a GLUEX_TOP tree (GLUEX_TOP must be defined in the
               environment)
  "halld_my": convert a HALLD_MY tree (HALLD_MY must be defined in the
              environment)
  "jana": convert a jana tree (JANA_HOME must be defined in the environment)
  "hdds": convert a hdds tree (HDDS_HOME must be defined in the environment)
  "halld_recon": convert a halld_recon tree (HALLD_RECON_HOME must be defined in
                 the environment)
  "halld_sim": convert a halld_sim tree (HALLD_SIM_HOME must be defined in the
               environment)
  "gluex_root_analysis": convert a gluex_root_analysis tree (ROOT_ANALYSIS_HOME
                         must be defined in the environment)



So, for example, if you want to patch your private build of halld_sim, the command is

$BUILD_SCRIPTS/change_bms_osname.sh halld_sim

The script uses the definition of HALLD_SIM_HOME that it finds in your environment to identify the directory to be patched.
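Since the script relies on that variable, it may be worth checking that it is set before patching. The following is a minimal sketch of such a check; check_target_env is a hypothetical helper written for this illustration and is not part of Build Scripts.

```shell
#!/bin/sh
# Sketch only: check_target_env is a hypothetical helper, not part of
# Build Scripts. It verifies that the named variable is set and points
# to an existing directory.
check_target_env() {
    # $1 is the variable name, e.g. HALLD_SIM_HOME
    eval "value=\${$1:-}"
    if [ -z "$value" ]; then
        echo "$1 is not defined in the environment"
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "$1 does not point to an existing directory: $value"
        return 1
    fi
    echo "$1=$value"
}
```

One possible use is to chain it with the patch command, so that the patch only runs when the check passes:

check_target_env HALLD_SIM_HOME && $BUILD_SCRIPTS/change_bms_osname.sh halld_sim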

Basically, since BMS_OSNAME appears in the directory structure for our packages, links have to be added, from new name to old name, to keep the built package viable after the change.
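As an illustration of the idea only (this is not the code of change_bms_osname.sh), the sketch below builds a throwaway directory tree and adds a link from a new name to an old name. OLD_NAME and NEW_NAME are hypothetical placeholders, not real BMS_OSNAME values; the actual values are in the attached table.

```shell
#!/bin/sh
# Sketch only: OLD_NAME and NEW_NAME are hypothetical placeholders, not
# real BMS_OSNAME values. This is not the logic of change_bms_osname.sh.
set -eu
OLD_NAME="old_osname_example"
NEW_NAME="new_osname_example"

# Stand-in for a package tree that holds a per-OS build directory.
pkg=$(mktemp -d)
mkdir -p "$pkg/$OLD_NAME/bin"

# Add a link from the new name to the old name, so that a path built
# from the new BMS_OSNAME resolves to the existing build.
ln -s "$OLD_NAME" "$pkg/$NEW_NAME"

# The existing build is now reachable under the new name as well.
if [ -d "$pkg/$NEW_NAME/bin" ]; then
    echo "new name resolves to the existing build"
fi
```

A relative link target is used here so that the link keeps working if the whole tree is moved; whether the real script does the same is not stated in this message.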

A table of old and new BMS_OSNAME definitions is attached.

  -- Mark



_______________________________________________
Halld-offline mailing list
Halld-offline at jlab.org
https://mailman.jlab.org/mailman/listinfo/halld-offline

