[d2n-analysis-talk] Disk Space Requirements
Diana Parno
dparno at cmu.edu
Mon Jan 24 15:33:24 EST 2011
Here's a summary of our space requirements for BigBite and the LHRS.
I have already replayed all four-pass production runs from March,
three randomly chosen four-pass nitrogen runs, and the one-pass 3He
and H2 runs. These 134 runs occupy a total of 263.4 GB. (There are
also four-pass production runs from February, with a very low (~20%)
electron beam polarization. In principle, we could eventually use
these too.)
There are 306 five-pass production runs, including three randomly
chosen five-pass nitrogen runs. It's possible that closer examination
will eliminate some of these from the list, but we don't know yet. The
average disk space occupied by a replayed four-pass run is 2.294 GB;
the largest is 2.649 GB. This difference becomes significant when
multiplied by 306 runs. If we take the worst-case
estimate -- that five-pass production runs occupy an average of 2.649
GB each -- then we need 810.5 GB to hold all the replayed five-pass
production runs at once. I think the average figure of 2.294 GB/run is
a more accurate estimate for the five-pass runs, since the 2.649 GB/run
figure reflects runs where 6 million events were taken (rather than 5
million). The average figure gives 702.1 GB for the whole five-pass
data set.
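As a quick check, here is the arithmetic behind those two estimates
(a back-of-the-envelope sketch in Python; small differences from the
figures above just reflect rounding of the per-run averages):

    # Estimated disk space for replayed five-pass production runs,
    # using the measured four-pass per-run sizes as a proxy.
    n_runs = 306
    avg_gb = 2.294   # average replayed four-pass run
    max_gb = 2.649   # largest four-pass run (6M events rather than 5M)

    print("worst case: %.1f GB" % (n_runs * max_gb))   # ~810.6 GB
    print("average:    %.1f GB" % (n_runs * avg_gb))   # ~702.0 GB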
Storing replayed root files for the whole BigBite dataset -- four-
pass, five-pass, one-pass coincidence runs, and nitrogen runs at both
production energies -- thus comes out to 965.4 GB of disk space.
Our skim process basically doubles this by cloning the primary root
tree and adding a couple of variables. Storing the complete set of
replayed and skimmed root files for BigBite would require 1.931 TB.
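To make the bookkeeping explicit, here is the same sum as a short
sketch (all numbers taken from above; the factor of two is the skim
cloning the primary tree):

    # Total BigBite volume: already-replayed runs plus the estimated
    # five-pass set, then doubled to account for the skimmed copies.
    replayed_so_far_gb = 263.4   # four-pass, nitrogen, and one-pass runs
    five_pass_gb       = 702.1   # average-based estimate above

    bigbite_replay_gb = replayed_so_far_gb + five_pass_gb   # ~965 GB
    bigbite_total_gb  = 2 * bigbite_replay_gb                # ~1.93 TB
    print(bigbite_replay_gb, bigbite_total_gb)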
What about the LHRS? Dave estimates that most LHRS runs are less than
a GB, although some runs make it up to 4 GB. Of course, this is
heavily dependent on the kinematic point. Let's assume an average size of 1
GB. We further assume that the number of production LHRS runs is
approximately equal to the number of production BigBite runs (411,
since the one-pass runs are shared). We need to analyze a nitrogen run
for each of the fifteen kinematic points, so that brings us up to 420,
since six nitrogen runs were already included in the 411 figure. That
gives us approximately 420 GB for the replayed LHRS root files, and
another 420 GB for the skimmed LHRS root files, for 840 GB in total.
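The LHRS estimate works out the same way (the 1 GB/run average is an
assumption, as noted above):

    # LHRS estimate: one LHRS run per BigBite production run, plus one
    # nitrogen run per kinematic point (six already counted in the 411).
    lhrs_runs = 411 + 15 - 6       # = 420
    avg_gb    = 1.0                # assumed average replayed LHRS run

    lhrs_replay_gb = lhrs_runs * avg_gb    # ~420 GB
    lhrs_total_gb  = 2 * lhrs_replay_gb    # ~840 GB with skims
    print(lhrs_replay_gb, lhrs_total_gb)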
By this rough calculation, the total disk space d2n needs for all of
its root files is 2.771 TB. We could halve this requirement by purging
replayed root files in favor of skimmed files, but this would require
us to re-replay runs (which could take up to a few days) if we
discover that we've missed something meaningful in the skim. (In
particular, at present the skim process does not save the THaRunInfo
information or the EVBBITE tree.) Additional disk space is of course
needed for code, analysis databases compiled over time (e.g. databases
of beam trips or of cut performance), and figures. If (as seems
likely) our runs go through multiple rounds of replays, then we may
need additional space if we don't want to delete the Round 1 replayed
files before starting on Round 2.
Best,
Diana