[Halld-offline] CCDB server problems for farm jobs over the past week

Mark Ito marki at jlab.org
Thu May 13 10:21:28 EDT 2021


Things look nominal right now but over the past week, there have been 
problems with overloading of one or the other of the mysql servers used 
for farm jobs. We believe that this has been due to jobs using an old, 
buggy version of CCDB. This version was included in the version sets we 
used for launches we performed over a year ago and in some cases users 
are using those version sets to ensure compatibility with those 
launches. Other cases come from users using private version sets with 
the old CCDB version embedded. Many of you will recall the problems that 
that version created for us.

The problem only occurs for jobs that check *all* of the following boxes:

  * Using a version set with CCDB 1.06.06.
      o CCDB 1.06.07 is the current standard.
  * Using a mysql server for the CCDB.
      o Using an SQLite version seems to work. Running on the OSG uses
        SQLite and therefore seems to be OK.
  * Running many hundreds of jobs simultaneously using the same mysql
    server.
      o Like on the JLab farm, say.

To repeat: you need an AND of these conditions to have issues. If any 
one of them is avoided, there should not be a problem, so avoiding any 
single condition is a possible work-around.
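
For anyone who wants to check a job configuration before submitting, the
AND of the three conditions can be sketched as below. This is only an
illustration: the function name, its arguments, the example connection
strings, and the threshold of 200 jobs (a stand-in for "many hundreds")
are all assumptions and not part of CCDB or any farm tool.

```python
# Hypothetical helper; names and the job-count threshold are illustrative only.
BUGGY_CCDB_VERSION = "1.06.06"   # the version named in this message
MANY_JOBS_THRESHOLD = 200        # assumption: stand-in for "many hundreds"


def at_risk(ccdb_version: str, connection: str, concurrent_jobs: int) -> bool:
    """Return True only if all three conditions hold at the same time."""
    uses_buggy_ccdb = ccdb_version == BUGGY_CCDB_VERSION
    uses_mysql = connection.lower().startswith("mysql://")
    many_jobs = concurrent_jobs >= MANY_JOBS_THRESHOLD
    return uses_buggy_ccdb and uses_mysql and many_jobs


if __name__ == "__main__":
    # Placeholder connection strings, not real server addresses.
    print(at_risk("1.06.06", "mysql://user@db.example.org/ccdb", 500))
    print(at_risk("1.06.06", "sqlite:////tmp/ccdb.sqlite", 500))
```

Breaking any one of the three conditions makes the function return
False, which mirrors the work-around options listed above.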

A long-term fix would be to remake version sets without the buggy CCDB. 
I'm looking into feasibility.

   -- Mark

