[Halld-offline] Make CCDB Use SQLite Again (MCUSA)
davidl at jlab.org
Mon Apr 29 15:46:14 EDT 2019
Hi Mark et al.,
It is probably a little late, but I thought I should point out a feature in the DANA library
I put in some time ago to solve the SQLite issue for the online monitoring. If you set the
SQLITE_TO_LOCAL config parameter to a file name on the local disk AND you specify
using an sqlite file in you JANA_CALIB_URL, then it will copy the file first and use the
copy. The benefit of copying is that it is read-only so you don’t have the file lock contention
for a network mounted SQLite file. It looks like you’ve solved this by randomly choosing
from 100 copies of the sqlite which will work too.
I just thought I’d mention it in case you run into a similar situation in the future and this
could be useful.
Also, FWIW: The jobs we run at NERSC all copy the SQLite file from cvmfs to the local
disk at start up to avoid lock contention.
David Lawrence Ph.D.
Staff Scientist, Thomas Jefferson National Accelerator Facility
Newport News, VA
davidl at jlab.org<mailto:davidl at jlab.org>
(757) 269-5567 W
(757) 746-6697 C
On Apr 29, 2019, at 11:57 AM, Mark Ito <marki at jlab.org<mailto:marki at jlab.org>> wrote:
You will recall that we switched the default choice for the database engine for CCDB at JLab from SQLite to MySQL back on March 12 (https://mailman.jlab.org/pipermail/halld-offline/2019-March/003561.html). Since then we have been studying usage patterns and CCDB performance on MySQL. And, not unexpectedly, we have experienced "storms" where the MySQL server is overloaded with CCDB requests from farm jobs and performance is severely degraded. Jobs don't crash but they take a long time to read in constants before starting to analyze events, up to two hours in the worst cases. These have been occurring once every day or two and last for a few hours.
We have learned a lot. The storms are definitely coming from farm jobs when many of them start at the same time. The queries have been analyzed (no big inefficiencies found). These things were invisible to us when we were running with the SQLite default. As a result IT is putting up a new beefier server for us to use for the farm.
Until the new server comes online (a week or two), I am proposing that we switch back to using SQLite as the default for CCDB, but with two differences from how we used SQLite at JLab in the past.
(1) SQLite will be used for the farm only. Interactive users at JLab will continue to use the MySQL server.
(2) Rather than accessing the ccdb.sqlite file on the group disk, farm jobs will access one of 100 copies of ccdb.sqlite, located on the work disk, chosen randomly for each job.
The idea is to separate farm demand for CCDB constants from other functions of our main database server while we wait for the new server that will do just that. Other functions include the writing of calibration constants to the CCDB, RCDB access, and MCwrapper job control functions.
Note again that this change only affects processes using the defaults from the build_scripts-based environment set-up. Those that set their own JANA_CALIB_URL environment variable explicitly will get what they ask for, as always.
The proposed switch-over time is COB today. I will make the change unless I hear objections.
Halld-offline mailing list
Halld-offline at jlab.org<mailto:Halld-offline at jlab.org>
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Halld-offline