[Halld-offline] reducing file sizes

Blake Leverington leverinb at uregina.ca
Wed Oct 7 16:34:01 EDT 2009


Hi Richard,

OK, I followed your instructions and everything went smoothly, so thank
you. I ended up with a roughly 70% reduction in my file sizes, 165MB down
to 51MB. I've attached my edited hddm header file; could you take a quick
look to see if there is anything else I could cut out? I think I took
out everything unwanted, but I'm not familiar enough with the hddm
formatting to be sure. All I require is photon reconstruction in the FCAL
and BCAL and the truth information.

Cheers,
-Blake

Richard Jones wrote:
> Blake,
>
> Yes, this is just the kind of thing that hddm was built to be able to 
> do easily.  To make it simple for you to try out, I just created a 
> short utility program hddmcp.c in the HDGeant svn directory.  To use 
> it, follow these steps.
>
>    1. Find a big hddm file that contains way more data sections than
>       you want to see.
>    2. Using "head -n" (or your favorite text editor) pick the xml text
>       header from the beginning of the file (starting with <HDDM> and
>       ending with </HDDM>) and save it in a new file.  Let's call it
>       xxx.hddm and save it in a new working directory, such as
>       HDGeant/work.
>    3. Edit your xxx.hddm file, stripping out all of the sections that
>       you don't want to see in the output file.  The only restriction
>       is that you have to remove complete sections, so that there are
>       no unmatched tags.  Anything will do, provided that the xml
>       remains valid and the top-level tag is <HDDM>...</HDDM>
>    4. From inside the new work directory, run the command "hddm-c
>       xxx.hddm" to generate the hddm_s.h and hddm_s.c libraries.
>    5. Copy hddmcp.c from the HDGeant svn directory [you must have done
>       svn update recently to see it] into your work directory and
>       compile it there, as in "gcc -I $HALLD_HOME/include hddmcp.c -o
>       hddmcp hddm_s.c"
>    6. Run it, as in "./hddmcp my_gargantuan_hddm_file.hddm
>       my_new_trimmed_hddm_file.hddm"
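
Steps 2 and 3 above can also be scripted. What follows is a minimal Python sketch, not part of the hddm tools. It assumes the file begins with a plain-text XML header that ends in </HDDM>, as described in step 2, and that the header fits in the first 1 MiB of the file. The function names extract_header and remaining_tags are made up for illustration. The first pulls out the header; the second checks that an edited template is still well-formed XML with HDDM as its top-level tag and lists the tags that remain.

```python
import xml.etree.ElementTree as ET

END_TAG = b"</HDDM>"


def extract_header(data: bytes) -> str:
    """Return the XML text header (<HDDM ...> through </HDDM>) found in data."""
    start = data.find(b"<HDDM")
    end = data.find(END_TAG)
    if start < 0 or end < start:
        raise ValueError("no <HDDM>...</HDDM> header found")
    return data[start:end + len(END_TAG)].decode("utf-8") + "\n"


def remaining_tags(template: str) -> list:
    """Parse an edited template and return the sorted tag names it contains.

    ET.fromstring raises ParseError if any tag is unmatched, which is the
    restriction mentioned in step 3.
    """
    root = ET.fromstring(template)

    def local(tag):
        # Drop a "{namespace}" prefix if the parser added one.
        return tag.rsplit("}", 1)[-1]

    if local(root.tag) != "HDDM":
        raise ValueError("top-level tag must be <HDDM>")
    return sorted({local(el.tag) for el in root.iter()})


if __name__ == "__main__":
    import sys

    src, dst = sys.argv[1], sys.argv[2]
    with open(src, "rb") as f:
        head = f.read(1 << 20)  # assumption: header fits in the first 1 MiB
    text = extract_header(head)
    with open(dst, "w") as out:
        out.write(text)
    print("\n".join(remaining_tags(text)))
```

Saved under a name such as hddm_header.py (hypothetical), it could be run as "python3 hddm_header.py my_gargantuan_hddm_file.hddm xxx.hddm" to produce the template for editing. Calling remaining_tags on the edited text before running hddm-c gives a quick check that no unmatched tags were left behind.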
>
> Just for fun, I just tried it on a recent hddm file I produced with 
> hdgeant, and I stripped out everything except the Monte Carlo and the 
> barrelEMcal sections.  The size of the hddm file went from 34 MB to 
> 0.5 MB. 
>
> -Richard Jones
>
> Blake Leverington wrote:
>> Hi guys,
>>
>> This is just sort of a general query. I'm looking at different ways of
>> reducing file size and CPU time for a rather large PYTHIA data
>> set (2.5x10^8 events). Aside from simply filtering out events, I was
>> hoping I could cut out some of the information that HDGEANT records,
>> since I am only using information from the calorimeters. The rest of the
>> detectors are effectively dead material for my analysis. Any thoughts on
>> whether this is feasible, possible or just plain ineffective in reducing
>> my file size?
>>
>> Cheers,
>> -Blake
>>
>>   
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Halld-offline mailing list
> Halld-offline at jlab.org
> https://mailman.jlab.org/mailman/listinfo/halld-offline
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: hddmedited.hddm
URL: <https://mailman.jlab.org/pipermail/halld-offline/attachments/20091007/2bc1f37a/attachment.ksh>

