[Halld-cpp] CPP REST production
Ilya Larin
ilarin at jlab.org
Fri Jun 28 13:39:19 EDT 2024
I had the impression that blocks with bad quality are switched off at the hit-reconstruction stage.
Maybe I am wrong:
FCAL/DFCALHit_factory.cc, lines 178-181:

    // throw away hits from bad or noisy channels
    fcal_quality_state quality =
        static_cast<fcal_quality_state>(block_qualities[digihit->row][digihit->column]);
    if ( (quality==BAD) || (quality==NOISY) ) continue;
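For readers outside the FCAL code, here is a minimal standalone sketch of that filtering step. The enum values, the DigiHit struct, and the filter_hits function are simplified stand-ins for the halld_recon types (in the real factory the quality table comes from the calibration constants), not the actual factory interface:

    #include <vector>

    // Simplified stand-ins for fcal_quality_state and the per-block
    // quality table used in DFCALHit_factory (values are illustrative).
    enum fcal_quality_state { GOOD = 0, BAD = 1, NOISY = 2 };

    struct DigiHit { int row; int column; };

    // Keep only hits whose channel quality is GOOD; BAD and NOISY
    // channels are dropped, mirroring the check quoted above.
    std::vector<DigiHit> filter_hits(const std::vector<DigiHit>& digihits,
                                     const std::vector<std::vector<int>>& block_qualities)
    {
        std::vector<DigiHit> kept;
        for (const auto& hit : digihits) {
            auto quality = static_cast<fcal_quality_state>(
                block_qualities[hit.row][hit.column]);
            if (quality == BAD || quality == NOISY) continue; // drop bad/noisy channels
            kept.push_back(hit);
        }
        return kept;
    }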
Ilya
________________________________
From: Igal Jaegle <ijaegle at jlab.org>
Sent: Friday, June 28, 2024 13:04
To: Ilya Larin <ilarin at jlab.org>; Alexander Austregesilo <aaustreg at jlab.org>; halld-cpp at jlab.org <halld-cpp at jlab.org>
Cc: Sean Dobbs <sdobbs at jlab.org>
Subject: Re: CPP REST production
No problem from my side. Just let me know when they are loaded so that I can start the production a day or two later.
tks ig.
________________________________
From: Ilya Larin <ilarin at jlab.org>
Sent: Friday, June 28, 2024 1:01 PM
To: Alexander Austregesilo <aaustreg at jlab.org>; halld-cpp at jlab.org <halld-cpp at jlab.org>; Igal Jaegle <ijaegle at jlab.org>
Cc: Sean Dobbs <sdobbs at jlab.org>
Subject: Re: CPP REST production
I have a few more tables for dead counters, if you think it would be good to update them.
Ilya
________________________________
From: Halld-cpp <halld-cpp-bounces at jlab.org> on behalf of Igal Jaegle via Halld-cpp <halld-cpp at jlab.org>
Sent: Friday, June 28, 2024 13:00
To: Alexander Austregesilo <aaustreg at jlab.org>; halld-cpp at jlab.org <halld-cpp at jlab.org>
Cc: Sean Dobbs <sdobbs at jlab.org>
Subject: Re: [Halld-cpp] CPP REST production
ok so I have the green light to start the production - ctof & ps evio skims?
tks ig.
________________________________
From: Alexander Austregesilo <aaustreg at jlab.org>
Sent: Friday, June 28, 2024 12:15 PM
To: Igal Jaegle <ijaegle at jlab.org>; halld-cpp at jlab.org <halld-cpp at jlab.org>
Cc: Naomi Jarvis <nsj at cmu.edu>; Sean Dobbs <sdobbs at jlab.org>
Subject: Re: CPP REST production
Excellent. Something must be wrong on my side with the sizes. When I list my directory I get this output, which seems consistent with yours:
-rw-r--r-- 1 aaustreg halld 43688763 Jun 25 15:59 converted_random.hddm
-rw-r--r-- 1 aaustreg halld 1244538131 Jun 25 15:59 dana_rest.hddm
-rw-r--r-- 1 aaustreg halld 16751296 Jun 25 15:59 hd_rawdata_101586_000.BCAL-LED.evio
-rw-r--r-- 1 aaustreg halld 322128 Jun 25 15:59 hd_rawdata_101586_000.CCAL-LED.evio
-rw-r--r-- 1 aaustreg halld 237368676 Jun 25 15:59 hd_rawdata_101586_000.cpp_2c.evio
-rw-r--r-- 1 aaustreg halld 814874380 Jun 25 15:59 hd_rawdata_101586_000.ctof.evio
-rw-r--r-- 1 aaustreg halld 322128 Jun 25 15:59 hd_rawdata_101586_000.DIRC-LED.evio
-rw-r--r-- 1 aaustreg halld 19690620 Jun 25 15:59 hd_rawdata_101586_000.FCAL-LED.evio
-rw-r--r-- 1 aaustreg halld 110784828 Jun 25 15:59 hd_rawdata_101586_000.npp_2g.evio
-rw-r--r-- 1 aaustreg halld 132611736 Jun 25 15:59 hd_rawdata_101586_000.npp_2pi0.evio
-rw-r--r-- 1 aaustreg halld 2804807604 Jun 25 15:59 hd_rawdata_101586_000.ps.evio
-rw-r--r-- 1 aaustreg halld 188496932 Jun 25 15:59 hd_rawdata_101586_000.random.evio
-rw-r--r-- 1 aaustreg halld 314564 Jun 25 15:59 hd_rawdata_101586_000.sync.evio
-rw-r--r-- 1 aaustreg halld 44638598 Jun 25 15:59 hd_root.root
-rw-r--r-- 1 aaustreg halld 1329 Jun 27 14:59 jana_recon_2022_05_ver01.config
-rw-r--r-- 1 aaustreg halld 1343 Jun 26 15:38 jana_recon_2022_05_ver01.config~
-rw-r--r-- 1 aaustreg halld 7594 Jun 25 15:59 syncskim.root
-rw-r--r-- 1 aaustreg halld 7370 Jun 25 15:59 tree_bcal_hadronic_eff.root
-rw-r--r-- 1 aaustreg halld 130996 Jun 25 15:59 tree_fcal_hadronic_eff.root
-rw-r--r-- 1 aaustreg halld 96654396 Jun 25 15:59 tree_PSFlux.root
-rw-r--r-- 1 aaustreg halld 1063333 Jun 25 15:59 tree_tof_eff.root
-rw-r--r-- 1 aaustreg halld 50428784 Jun 25 15:59 tree_TPOL.root
-rw-r--r-- 1 aaustreg halld 8617 Jun 25 15:59 tree_TS_scaler.root
But when I ask the shell for the size of individual files, I get a different result:
ls -s hd_rawdata_101586_000.ps.evio
1743356 hd_rawdata_101586_000.ps.evio
That is super weird, but I don't think any of the lock fixes have anything to do with that. It is a difference between disk allocation and byte count.
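For reference, a quick sketch of how to print both numbers for one file with POSIX stat(); the file name here is just one of the files from the listing above:

    #include <cstdio>
    #include <sys/stat.h>

    // Print the byte count (what "ls -l" shows) next to the disk
    // allocation (what "ls -s" reports; st_blocks is in 512-byte units).
    int main()
    {
        const char* path = "hd_rawdata_101586_000.ps.evio"; // example file from the listing
        struct stat st;
        if (stat(path, &st) != 0) {
            perror("stat");
            return 1;
        }
        std::printf("byte count      : %lld\n", (long long)st.st_size);
        std::printf("allocated bytes : %lld\n", (long long)st.st_blocks * 512LL);
        return 0;
    }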
From my side, I would just update the config file on the group disk to get rid of the ctof and ps skims.
I also noticed that you have tree_sc_eff.root as an output file in the launch_nersc.py script, even though it is not produced by the job. Maybe you want to update this list.
On 6/28/24 00:09, Igal Jaegle wrote:
The second attempt worked: the job took slightly more than 3 h with a 100% success rate, but the skims are almost all larger than the ones produced locally. So there is still an issue.
The files are on volatile and will be moved to cache tomorrow:
/volatile/halld/offsite_prod/RunPeriod-2022-05/recon/ver99-perl/RUN101586/
tks ig.
ls -lrth /volatile/halld/offsite_prod/RunPeriod-2022-05/recon/ver99-perl/RUN101586/FILE000/RUN101586/FILE000/
total 5.2G
-rw-r--r-- 1 gxproj4 halld 784 Jun 25 05:42 tree_sc_eff.root
-rw-r--r-- 1 gxproj4 halld 315K Jun 27 22:13 hd_rawdata_101586_000.CCAL-LED.evio
-rw-r--r-- 1 gxproj4 halld 16M Jun 27 22:13 hd_rawdata_101586_000.BCAL-LED.evio
-rw-r--r-- 1 gxproj4 halld 7.5K Jun 27 22:13 syncskim.root
-rw-r--r-- 1 gxproj4 halld 174K Jun 27 22:13 job_info_101586_000.tgz
-rw-r--r-- 1 gxproj4 halld 227M Jun 27 22:14 hd_rawdata_101586_000.cpp_2c.evio
-rw-r--r-- 1 gxproj4 halld 778M Jun 27 22:14 hd_rawdata_101586_000.ctof.evio
-rw-r--r-- 1 gxproj4 halld 42M Jun 27 22:15 converted_random_101586_000.hddm
-rw-r--r-- 1 gxproj4 halld 49M Jun 27 22:15 tree_TPOL.root
-rw-r--r-- 1 gxproj4 halld 8.5K Jun 27 22:15 tree_TS_scaler.root
-rw-r--r-- 1 gxproj4 halld 1.2G Jun 27 22:15 dana_rest.hddm
-rw-r--r-- 1 gxproj4 halld 93M Jun 27 22:15 tree_PSFlux.root
-rw-r--r-- 1 gxproj4 halld 129K Jun 27 22:15 tree_fcal_hadronic_eff.root
-rw-r--r-- 1 gxproj4 halld 7.2K Jun 27 22:15 tree_bcal_hadronic_eff.root
-rw-r--r-- 1 gxproj4 halld 37M Jun 27 22:15 hd_root.root
-rw-r--r-- 1 gxproj4 halld 308K Jun 27 22:16 hd_rawdata_101586_000.sync.evio
-rw-r--r-- 1 gxproj4 halld 315K Jun 27 22:16 hd_rawdata_101586_000.DIRC-LED.evio
-rw-r--r-- 1 gxproj4 halld 127M Jun 27 22:18 hd_rawdata_101586_000.npp_2pi0.evio
-rw-r--r-- 1 gxproj4 halld 106M Jun 27 22:18 hd_rawdata_101586_000.npp_2g.evio
-rw-r--r-- 1 gxproj4 halld 19M Jun 27 22:18 hd_rawdata_101586_000.FCAL-LED.evio
-rw-r--r-- 1 gxproj4 halld 180M Jun 27 22:21 hd_rawdata_101586_000.random.evio
-rw-r--r-- 1 gxproj4 halld 2.7G Jun 27 22:21 hd_rawdata_101586_000.ps.evio
-rw-r--r-- 1 gxproj4 halld 1.1M Jun 27 22:27 tree_tof_eff.root
________________________________
From: Igal Jaegle <ijaegle at jlab.org>
Sent: Wednesday, June 26, 2024 10:38 AM
To: Alexander Austregesilo <aaustreg at jlab.org>; halld-cpp at jlab.org <halld-cpp at jlab.org>
Cc: Naomi Jarvis <nsj at cmu.edu>; Sean Dobbs <sdobbs at jlab.org>
Subject: Re: CPP REST production
PERLMUTTER is down due to a maintenance day. But tomorrow I can grab the other logs.
tks ig.
________________________________
From: Alexander Austregesilo <aaustreg at jlab.org>
Sent: Wednesday, June 26, 2024 10:32 AM
To: Igal Jaegle <ijaegle at jlab.org>; halld-cpp at jlab.org <halld-cpp at jlab.org>
Cc: Naomi Jarvis <nsj at cmu.edu>; Sean Dobbs <sdobbs at jlab.org>
Subject: Re: CPP REST production
Hi Igal,
This failure rate is pretty bad. Can you point me to the log files of the failed jobs? The messages you sent concerning root dictionaries and ps_counts_thresholds are not fatal; they will not cause failures. You should still confirm with Sasha whether the ps_counts_thresholds are important.
The file /work/halld/home/gxproj4/public/ForAlexAndSean/std.out seems to be cut off. We may want to add this option to the config file to suppress the printout of the number of processed events:
JANA:BATCH_MODE 1
You can find my output files here: /work/halld2/home/aaustreg/Analysis/cpp/REST/
I don't have the logs, but all files were closed at exactly the same time.
Cheers,
Alex
On 6/26/24 08:59, Igal Jaegle wrote:
Alex,
Could you provide the path to your output files and most importantly the logs?
The results of the test on NERSC for the same run are as follows:
108 evio files out of 126 were cooked properly (18 failed), i.e. a ~15% failure rate, which is way too much to proceed with the cooking.
tks ig.
________________________________
From: Alexander Austregesilo <aaustreg at jlab.org>
Sent: Monday, June 24, 2024 6:06 PM
To: halld-cpp at jlab.org <halld-cpp at jlab.org>
Cc: Naomi Jarvis <nsj at cmu.edu>; Igal Jaegle <ijaegle at jlab.org>; Sean Dobbs <sdobbs at jlab.org>
Subject: CPP REST production
Dear Colleagues,
I processed a single file of a typical CPP production run on the Pb target. Here is a list of all files, with their sizes, that will be produced during the REST production:
1.7G hd_rawdata_101586_000.ps.evio
1.2G dana_rest.hddm
495M hd_rawdata_101586_000.ctof.evio
160M hd_rawdata_101586_000.cpp_2c.evio
115M hd_rawdata_101586_000.random.evio
93M tree_PSFlux.root
90M hd_rawdata_101586_000.npp_2pi0.evio
72M hd_rawdata_101586_000.npp_2g.evio
49M tree_TPOL.root
42M converted_random.hddm
41M hd_root.root
16M hd_rawdata_101586_000.FCAL-LED.evio
13M hd_rawdata_101586_000.BCAL-LED.evio
1.1M tree_tof_eff.root
164K tree_fcal_hadronic_eff.root
105K hd_rawdata_101586_000.DIRC-LED.evio
105K hd_rawdata_101586_000.CCAL-LED.evio
94K hd_rawdata_101586_000.sync.evio
24K tree_TS_scaler.root
24K tree_bcal_hadronic_eff.root
24K syncskim.root
The trigger skims for ps (with thick converter) and ctof are quite
large. Do we actually need them or were they maybe already produced
during the calibration stages?
As far as I understand, Igal has started a test of the REST production
at NERSC. We are getting closer to launch!
Cheers,
Alex
--
Alexander Austregesilo
Staff Scientist - Experimental Nuclear Physics
Thomas Jefferson National Accelerator Facility
Newport News, VA
aaustreg at jlab.org
(757) 269-6982