[Halld-offline] reducing file sizes
Blake Leverington
leverinb at uregina.ca
Wed Oct 7 16:34:01 EDT 2009
Hi Richard,
Ok, I followed your instructions and everything went smoothly, so thank
you. I ended up with a 70% reduction in my file sizes, 165MB down to
51MB. I've attached my edited hddm header file; could you take a quick
look to see if there is anything else I could cut out? I think I took
out everything unwanted, but I'm not familiar enough with the hddm
formatting to be sure. All I require is photon reconstruction in the
FCAL and BCAL and the truth information.
Cheers,
-Blake
Richard Jones wrote:
> Blake,
>
> Yes, this is just the kind of thing that hddm was built to be able to
> do easily. To make it simple for you to try out, I just created a
> short utility program hddmcp.c in the HDGeant svn directory. To use
> it, follow these steps.
>
> 1. Find a big hddm file that contains way more data sections than
> you want to see.
> 2. Using "head -n" (or your favorite text editor) pick the xml text
> header from the beginning of the file (starting with <HDDM> and
> ending with </HDDM>) and save it in a new file. Let's call it
> xxx.hddm and save it in a new working directory, such as
> HDGeant/work.
> 3. Edit your xxx.hddm file, stripping out all of the sections that
> you don't want to see in the output file. The only restriction
> is that you have to remove complete sections, so that there are
> no unmatched tags. Anything will do, provided that the xml
> remains valid and the top-level tag is <HDDM>...</HDDM>
> 4. From inside the new work directory, run the command "hddm-c
> xxx.hddm" to generate the hddm_s.h and hddm_s.c libraries.
> 5. Copy hddmcp.c from the HDGeant svn directory [you must have done
> svn update recently to see it] into your work directory and
> compile it there, as in "gcc -I $HALLD_HOME/include hddmcp.c -o
> hddmcp hddm_s.c"
> 6. Run it, as in "./hddmcp my_gargantuan_hddm_file.hddm
> my_new_trimmed_hddm_file.hddm"
>
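Step 2 above can be sketched as a small shell helper. This is only a sketch: it assumes the XML header sits on its own lines at the very start of the file and that the closing </HDDM> tag is followed by a newline. The function name extract_hddm_header is hypothetical and is not part of the hddm tools.

```shell
# Sketch of step 2: print the XML header of an hddm file, from the line
# containing <HDDM through the line containing </HDDM>, then stop reading
# so the binary payload that follows is never processed.
# extract_hddm_header is a hypothetical helper name (an assumption).
extract_hddm_header() {
    # LC_ALL=C makes sed treat the input as plain bytes.
    LC_ALL=C sed -n -e '/<HDDM/,/<\/HDDM>/p' -e '/<\/HDDM>/q' "$1"
}

# Usage, followed by steps 3 to 6 exactly as quoted in the list above:
#   extract_hddm_header my_gargantuan_hddm_file.hddm > xxx.hddm
#   (edit xxx.hddm, removing only complete sections)
#   hddm-c xxx.hddm
#   gcc -I $HALLD_HOME/include hddmcp.c -o hddmcp hddm_s.c
#   ./hddmcp my_gargantuan_hddm_file.hddm my_new_trimmed_hddm_file.hddm
```

Compared with "head -n", this avoids having to count the header lines by hand, at the cost of relying on the layout assumptions stated above.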
> Just for fun, I tried it on a recent hddm file I produced with
> hdgeant, and I stripped out everything except the Monte Carlo and the
> barrelEMcal sections. The size of the hddm file went from 34 MB to
> 0.5 MB.
>
> -Richard Jones
>
> Blake Leverington wrote:
>> Hi guys,
>>
>> This is just sort of a general query. I'm looking at different ways of
>> reducing file size and cpu times for a rather large PYTHIA data
>> set (2.5x10^8 events). Aside from simply filtering out events, I was
>> hoping I could cut out some of the information that HDGEANT records,
>> since I am only using information from the calorimeters. The rest of the
>> detectors are effectively dead material for my analysis. Any thoughts on
>> whether this is feasible, possible or just plain ineffective in reducing
>> my file size?
>>
>> Cheers,
>> -Blake
>>
>>
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: hddmedited.hddm
URL: <https://mailman.jlab.org/pipermail/halld-offline/attachments/20091007/2bc1f37a/attachment.ksh>