[d2n-analysis-talk] Disk Space Requirements

Diana Parno dparno at cmu.edu
Mon Jan 24 15:33:24 EST 2011


Here's a summary of our space requirements for BigBite and the LHRS.

I have already replayed all four-pass production runs from March,  
three randomly chosen four-pass nitrogen runs, and the one-pass 3He  
and H2 runs. These 134 runs occupy a total of 263.4 GB. (There are  
also four-pass production runs from February, with a very low (~20%)  
electron beam polarization. In principle, we could eventually use  
these too.)

There are 306 five-pass production runs, including three randomly  
chosen five-pass nitrogen runs. It's possible that closer examination  
will eliminate some of these from the list, but we don't know yet. The  
average disk space occupied by a four-pass run is 2.294 GB/run; at
the high end, runs occupy 2.649 GB/run. This is a pretty significant
difference when multiplied by 306 runs. If we take the worst-case  
estimate -- that five-pass production runs occupy an average of 2.649  
GB each -- then we need 810.5 GB to hold all the replayed 5-pass  
production runs at once. I think the average of 2.294 GB/run is a
more accurate estimate for the five-pass runs, since the 2.649 GB/run
figure reflects runs where 6 million events were taken (rather than 5
million). Using the average gives 702.1 GB for the whole five-pass
data set.

Storing replayed root files for the whole BigBite dataset -- four- 
pass, five-pass, one-pass coincidence runs, and nitrogen runs at both  
production energies -- thus comes out to 965.4 GB of disk space.

Our skim process basically doubles this by cloning the primary root  
tree and adding a couple of variables. Storing the complete set of  
replayed and skimmed root files for BigBite would require 1.931 TB.
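As a sanity check, the BigBite arithmetic above can be sketched in a few
lines of Python (the variable names are mine; all the figures come from
this message, and any last-digit differences are just rounding):

```python
# Back-of-the-envelope check of the BigBite disk estimates quoted above.
replayed_so_far_gb = 263.4        # 134 runs already replayed (four-pass,
                                  # nitrogen, one-pass 3He and H2)
five_pass_runs = 306              # five-pass production + nitrogen runs
avg_gb_per_run = 2.294            # average four-pass run (5M events)
worst_gb_per_run = 2.649          # high-end run (6M events)

five_pass_avg_gb = five_pass_runs * avg_gb_per_run      # ~702 GB
five_pass_worst_gb = five_pass_runs * worst_gb_per_run  # ~810 GB

bigbite_replayed_gb = replayed_so_far_gb + five_pass_avg_gb  # ~965.4 GB
bigbite_with_skim_tb = 2 * bigbite_replayed_gb / 1000.0      # ~1.931 TB

print(f"five-pass (average case): {five_pass_avg_gb:.1f} GB")
print(f"five-pass (worst case):   {five_pass_worst_gb:.1f} GB")
print(f"BigBite replayed:         {bigbite_replayed_gb:.1f} GB")
print(f"BigBite replayed + skim:  {bigbite_with_skim_tb:.3f} TB")
```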

What about the LHRS? Dave estimates that most LHRS runs are less than  
a GB, although some runs reach 4 GB. Of course, this depends heavily
on the kinematic point. Let's assume an average size of 1
GB. We further assume that the number of production LHRS runs is  
approximately equal to the number of production BigBite runs (411,  
since the one-pass runs are shared). We need to analyze a nitrogen run  
for each of the fifteen kinematic points, so that brings us up to 420,  
since six nitrogen runs were already included in the 411 figure. That  
gives us approximately 420 GB for the replayed LHRS root files, and  
another 420 GB for the skimmed LHRS root files, for 840 GB in total.
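The LHRS count and the grand total work out the same way (again, the
variable names are mine and the 1 GB/run average is the assumption stated
above):

```python
# Rough LHRS estimate and d2n grand total, using the figures in this message.
avg_lhrs_gb = 1.0           # assumed average LHRS run size
lhrs_runs = 411 + 15 - 6    # production runs, plus one nitrogen run per
                            # kinematic point, minus the six already counted
lhrs_replayed_gb = lhrs_runs * avg_lhrs_gb   # 420 GB replayed
lhrs_total_gb = 2 * lhrs_replayed_gb         # replayed + skimmed: 840 GB

bigbite_total_tb = 1.931    # BigBite replayed + skimmed, from above
grand_total_tb = bigbite_total_tb + lhrs_total_gb / 1000.0  # ~2.771 TB

print(f"LHRS runs: {lhrs_runs}, LHRS total: {lhrs_total_gb:.0f} GB")
print(f"d2n grand total: {grand_total_tb:.3f} TB")
```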

By this rough calculation, the total disk space d2n needs for all of  
its root files is 2.771 TB. We could halve this requirement by purging  
replayed root files in favor of skimmed files, but this would require  
us to re-replay runs (which could take up to a few days) if we  
discover that we've missed something meaningful in the skim. (In  
particular, at present the skim process does not save the THaRunInfo  
information or the EVBBITE tree.) Additional disk space is of course  
needed for code, analysis databases compiled over time (e.g. databases  
of beam trips or of cut performance), and figures. If (as seems  
likely) our runs go through multiple rounds of replays, then we may  
need additional space if we don't want to delete Round-1-replayed  
files before starting on replay Round 2.

Best,
Diana

