[Clas12_calcom] Fwd: OTWG and CLAS12 software chain

Dave Ireland David.Ireland at glasgow.ac.uk
Fri Dec 5 11:50:58 EST 2014


Hi Folks,

Here is the message from Harut that I referred to earlier. As I 
mentioned, there are quite a few points that overlap with Calcom.

I don't think we need to talk about setting up a new committee at the 
moment, but after the CWB review, it might be useful to discuss whether 
there is anything in what Harut describes that is not somehow already 
being addressed.

Best wishes,

Dave




-------- Original Message --------
Subject: OTWG and CLAS12 software chain
Date: Thu, 20 Nov 2014 18:09:12 -0500
From: Harut Avakian <avakian at jlab.org>
To: Dave Ireland <David.Ireland at glasgow.ac.uk>



Dear Dave,

During the collaboration meeting we discussed a little the possible
activities of the OTWG, and I am listing below a few issues to address
in order to improve the software chain and make it easier to use and
maintain. The new OTWG can hopefully become an efficient tool for
controlling the quality of the CLAS12 software. We can collect these
and other issues and have a detailed discussion within the
collaboration on how to proceed in building an efficient CLAS12
software chain.

Best regards,
Harut

--------------------------------------

Some of the ideas below were discussed years ago, but the previous OTWG
was not very effective at developing and enforcing rules, and as a
result it could not prevent the uncontrolled development of data
analysis software, which sometimes hindered efficient use of the
available manpower.

Here are some of the major issues (I can provide many examples from our
experience) that we had to deal with in the 6 GeV era:

1) The digitization of detector responses in MC (gsim), including both
the algorithms and the calibration constants involved, differed from
the reconstruction program (recsis). In some cases even the calibration
programs used different algorithms than the reconstruction.
In particular, most of the gsim digitization, which is crucial for a
realistic MC description of the detector responses, used hardcoded
(mostly idealistic or unrealistic) constants; as a result, tools like
gpp were developed, introducing additional mess.
Some people are still using user-developed "stretching" procedures to
match by hand the gsim and data missing masses, invariant masses, and
other relevant distributions, in particular for cross-section analyses.

2) Calibration constants were completely unprotected from inexperienced
calibrators. The procedure for updating the constants in the main
database was not well defined, so cooking was done from many different
run databases, which run groups used in order to avoid having their
constants in the main RunIndex messed up by other groups.
These proprietary RunIndexes, originally designed mainly for tests,
were outside any external control and not even protected from being
deleted.

3) Calibration procedures did not use any common tools, and it was hard
to control the output and track bugs in code written in many different
languages, ranging from collections of kumacs to sophisticated C++
programs with graphical interfaces.
Experts leaving the field left behind code that was hard to change and
use. If you look at the time dependence of the constants, you will see
that each new run group's calibrators started by completely messing up
the constants and, after many iterations, came back to roughly the same
values. There was no effective control over changes to, or the quality
of, the calibration constant sets.

4) Online and offline reconstruction used different codes for
reconstruction and monitoring, adding duplication of effort.

5) The analysis procedure consumed a lot of manpower, wasted on writing
user routines for fiducial regions and for momentum, energy loss,
alignment, RF, and other corrections.
Things got even worse toward the end, with corrections for the unknown
positions and angles of the target magnets in the polarized-target and
DVCS experiments.

6) The absence of good documentation for reconstruction, calibration,
and quality control triggered uncontrolled rewriting of parts of the
code by some users. Part of the mess was introduced by users making
changes to the code without checking their effect on others.
Photon people not checking the effect of their changes on electron
runs, and vice versa, happened pretty often.

7) We never managed to come up with a standard DST, and as a result
multiple versions of DSTs were floating around, along with non-standard
codes, tailored to the needs of a single run group, that were hard to
maintain and to transfer to other groups.

8) No strict release rules for the software chain were in place.

My suggestion is a set of strict rules controlled by the OTWG, which
can only work if official software release rules become mandatory and
must be signed off by the OTWG, something like an "experiment readiness
review" but for software. With that in place, the software could be
released only after strict checks.


Below are some suggestions (one per item above). I am sure other people
will have more ideas about what could be checked and tested before we
release the software to users and code developers.
Only if the collaboration accepts the rules as mandatory for all code
developers will the work of the new OTWG be effective.

1) The digitization in MC should use the same code and the same
constants as the reconstruction and calibration programs, for all
detectors. Having the set of calibration constants in gemc will
certainly improve the quality of the description and would in addition
allow using the simulation to test the calibration software, as well as
the reconstruction software and the full analysis chain. A sketch of
this idea follows below.
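
As a minimal illustration, here is a sketch in Java; the names
(ConstantsProvider, EcDigitizer, the table paths) are hypothetical,
not actual gemc or reconstruction APIs. The point is that digitization
and reconstruction pull their constants from the same source, so the
two can never silently diverge:

    // Hypothetical sketch: one constants source shared by digitization
    // and reconstruction, replacing hardcoded ideal values.
    interface ConstantsProvider {
        // e.g. table = "/calibration/ec/gain", keyed by sector/layer/component
        double get(String table, int sector, int layer, int component, String column);
    }

    class EcDigitizer {
        private final ConstantsProvider db;
        EcDigitizer(ConstantsProvider db) { this.db = db; }

        // Simulated energy deposit -> ADC, using calibration constants.
        int digitize(double edep, int sector, int layer, int strip) {
            double gain = db.get("/calibration/ec/gain", sector, layer, strip, "gain");
            double ped  = db.get("/calibration/ec/pedestal", sector, layer, strip, "mean");
            return (int) Math.round(edep / gain + ped);
        }
    }

    class EcReconstructor {
        private final ConstantsProvider db;
        EcReconstructor(ConstantsProvider db) { this.db = db; }

        // Exact inverse of the digitization above, with the same constants.
        double energy(int adc, int sector, int layer, int strip) {
            double gain = db.get("/calibration/ec/gain", sector, layer, strip, "gain");
            double ped  = db.get("/calibration/ec/pedestal", sector, layer, strip, "mean");
            return (adc - ped) * gain;
        }
    }

With this layout, updating a constant in the database automatically
changes both the simulated and the reconstructed response.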

2) The original idea was to write temporary sets into user databases
(e.g. calib_user.Runindexe1f); after testing, only the run group
analysis coordinator was allowed to write the constants into the main
RunIndex, and only for their own run periods. A special database
listing the run coordinators with their run_min and run_max was
created. The CLAS6 OTWG, responsible for the offline database,
considered that a dictatorship, and as a result every calibrator was
allowed to write into the main RunIndex, introducing a mess (sometimes
overwriting other run periods' indexes, since it was easy to mistype
the run numbers) that took analyzers days or weeks to untangle.
The official constants should be stored in only a single place, and
official cooking should be done using that official RunIndex. The
corresponding branch should be tagged as passN_run_group (e.g.
pass1_e1, pass2_e1, ...). A sketch of the write guard follows below.
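
As a sketch of the write protection described above (the class and
field names are illustrative, not an existing database API): a commit
to the main RunIndex is accepted only if the requesting user is the
registered coordinator for a run range covering the requested runs.

    // Hypothetical sketch of the RunIndex write guard.
    class RunIndexGuard {
        static class Coordinator {
            final String user; final int runMin; final int runMax;
            Coordinator(String user, int runMin, int runMax) {
                this.user = user; this.runMin = runMin; this.runMax = runMax;
            }
        }

        private final java.util.List<Coordinator> coordinators;
        RunIndexGuard(java.util.List<Coordinator> coordinators) {
            this.coordinators = coordinators;
        }

        // Reject any request touching runs outside the user's own period,
        // so a mistyped run number cannot overwrite another group's index.
        boolean mayWrite(String user, int runMin, int runMax) {
            for (Coordinator c : coordinators) {
                if (c.user.equals(user) && runMin >= c.runMin && runMax <= c.runMax) {
                    return true;
                }
            }
            return false;
        }
    }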

3) Only common tools should be allowed in the calibration software. It
is not enough to write everything in Java; it should be "common tool"
based Java. If the existing tools do not allow certain things, the
library can always be updated.
Calibration codes are already being developed by the detector groups,
which is certainly useful for developing calibration algorithms and for
defining the list of relevant constants, but they should only be
officially released after their compatibility with the "common tool"
environment has been checked. A sketch of what such a contract could
look like follows below.

4) Online and offline reconstruction and monitoring should use the same
code. We currently have a fast 10 Gb connection (usable both for files
and for ET) to the farm in the computer center, allowing us to run our
codes there from the control room. The same monitoring tools, including
histograms and other graphical presentations, should be used in both
cases to check the data quality. One way to arrange this is sketched
below.
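
A sketch of the arrangement (names are illustrative): one monitoring
engine, two event sources, so the histogram-filling code is literally
the same online (ET ring) and offline (file).

    // Hypothetical sketch: the monitor is indifferent to where events
    // come from; only the EventSource implementation differs.
    interface EventSource {
        byte[] nextEvent();   // returns null when no more events
        void close();
    }

    class Monitor {
        // Identical filling code online and offline, so control-room
        // plots match those from offline cooking.
        void run(EventSource source) {
            byte[] event;
            while ((event = source.nextEvent()) != null) {
                fillHistograms(event);
            }
            source.close();
        }
        private void fillHistograms(byte[] event) {
            /* shared histogramming code */
        }
    }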

5) A library for all kinds of corrections should be created by our
software experts, approved by the OTWG, and updated as better software
solutions come along.
Such "common analysis tools" will save a lot of time for analyzers, and
also for reviewers, in dealing with all kinds of analysis cuts, in
particular PID, data quality cuts for different processes, momentum
corrections, fiducial volume definitions, and so on. A sketch of such a
corrections interface is given below.
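
As a sketch of the shared corrections library (names and signatures are
hypothetical, and the bodies are placeholders): analyzers call the
approved implementation instead of maintaining private copies.

    // Hypothetical sketch of OTWG-approved common analysis tools.
    final class AnalysisTools {
        // Approved momentum correction, parametrized per run period.
        static double correctedMomentum(int run, int sector,
                                        double p, double theta, double phi) {
            // ... approved parametrization loaded for this run ...
            return p; // placeholder
        }

        // One fiducial definition for the whole collaboration.
        static boolean inFiducialVolume(int sector, double theta,
                                        double phi, double p) {
            // ... approved fiducial boundaries ...
            return true; // placeholder
        }
    }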

6) Documentation, including comments in the code as well as extended
HTML and PDF documentation, should be available before the release of
any component.
OTWG members should be able to run the whole chain using only the
documentation and obtain the "nominal" plots before the software is
approved for the collaboration.

7) The OTWG, involving the run group analysis coordinators, can also
take responsibility for the development and maintenance of standard
DSTs for CLAS12.
Several processes will require the analysis of data from different run
periods (e.g. unpolarized and polarized, hydrogen and deuterium target
experiments, and so on), and standard DSTs will be crucial for such
studies.

8) The simplest version of the quality control could use standard input
files, one for MC and one for data, for each major CLAS12 configuration
to test the full chain, checking that the reconstruction quality (e.g.
the number of reconstructed events in given bins of momentum and polar
and azimuthal angle) is not worse than for the "standard/nominal set",
and replacing the nominal set if better quality is achieved. Some
physics processes sensitive to resolution (like missing masses in
e'pππ or DVCS) could be included in the "release validation/check"
list.
In the early stages of CLAS6 we had a test data file, and before every
release we ran recsis on it to make sure nothing major was messed up.
That alone saved time in later compilations and quality checks. A
sketch of such a regression check follows below.
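
A minimal sketch of the regression check (the class name and the
tolerance are illustrative): compare the binned reconstructed-event
counts of a candidate release against the stored nominal set, and fail
the release if any bin drops too far.

    // Hypothetical sketch of the release validation described above.
    class ReleaseValidator {
        // counts[iP][iTheta][iPhi] = reconstructed events in that bin
        static boolean notWorseThanNominal(long[][][] candidate,
                                           long[][][] nominal,
                                           double tolerance) {
            for (int i = 0; i < nominal.length; i++)
                for (int j = 0; j < nominal[i].length; j++)
                    for (int k = 0; k < nominal[i][j].length; k++) {
                        // Fail if a bin loses more than `tolerance`
                        // (e.g. 0.05 = 5%) of its nominal yield.
                        if (candidate[i][j][k]
                                < (1.0 - tolerance) * nominal[i][j][k])
                            return false;
                    }
            return true;
        }
    }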






