[Halld-offline] Comparing hddm and evio file sizes, with and w/o gzip
Elliott Wolin
wolin at jlab.org
Tue Mar 2 15:52:55 EST 2010
Hi,
Using the danaevio plugin, which writes dana objects to evio files, I
compared hddm and evio file sizes w/ and w/o gzip. I used two files
generated by David L, 100 events each of smeared single pi and
multi-track events. The DANA objects written out in evio are basically
containers for the hddm data read in.
_No gzip_
single pi: 2.9M evio 3.1M hddm
multi trk: 23M evio 24M hddm
The ratio of file sizes is:
single pi: 1.07 hddm/evio
multi trk: 1.04 hddm/evio
This is not surprising since, for the most part, the same bytes are
written out, just in a different order.
_With gzip (default flags)_
single pi: 967k evio 1.4M hddm
multi trk: 6.6M evio 9.8M hddm
This gives gzip compression ratios:
single pi: 3.0 evio 2.2 hddm
multi trk: 3.5 evio 2.4 hddm
The ratio of gzipped file sizes is:
single pi: 1.45 hddm/evio
multi trk: 1.48 hddm/evio
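For anyone who wants to check the arithmetic, here is a small Python sketch that recomputes the ratios from the sizes quoted above. It assumes 967k means 0.967M (decimal), which is the reading that matches the quoted figures; the dictionary keys and the function name are just illustrative.

```python
# File sizes in MB as quoted in this message.
# Assumption: "967k" is read as 0.967M.
sizes = {
    "single pi": {"evio": 2.9, "hddm": 3.1, "evio_gz": 0.967, "hddm_gz": 1.4},
    "multi trk": {"evio": 23.0, "hddm": 24.0, "evio_gz": 6.6, "hddm_gz": 9.8},
}


def ratios(s):
    """Return the four ratios discussed in the text for one sample."""
    return {
        "hddm/evio": s["hddm"] / s["evio"],
        "gzip ratio evio": s["evio"] / s["evio_gz"],
        "gzip ratio hddm": s["hddm"] / s["hddm_gz"],
        "hddm/evio gzipped": s["hddm_gz"] / s["evio_gz"],
    }


results = {name: ratios(s) for name, s in sizes.items()}

for name, r in results.items():
    print(name, {key: round(value, 2) for key, value in r.items()})
```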
Note that evio files compress better. I speculate this is because data
is grouped differently in the two formats. In evio files, the values of
a given quantity are grouped together for all tracks, e.g. all track
IDs are sequential, all track vertex x values are sequential, etc.
Thus it is more likely that there are long runs of the same or similar
numbers in evio files, allowing gzip to compress more effectively.
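The grouping effect can be illustrated with a toy Python sketch. It does not read or write real hddm or evio files. It packs the same synthetic track data (hypothetical fields: track ID, particle type, vertex x/y/z shared by all tracks in an event) once track by track and once quantity by quantity, then compresses both with zlib at level 6, which uses the same deflate algorithm as gzip with default flags. All names and values here are assumptions made for illustration.

```python
import random
import struct
import zlib


def make_events(n_events=200, n_tracks=20, seed=0):
    """Synthetic events: each track is (id, type, vx, vy, vz)."""
    rng = random.Random(seed)
    events = []
    for _ in range(n_events):
        # All tracks in an event share one vertex (toy assumption).
        vertex = (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(50, 80))
        tracks = [(i, rng.choice((8, 9, 14)), *vertex) for i in range(n_tracks)]
        events.append(tracks)
    return events


def pack_row_wise(events):
    """One record per track: id, type, vx, vy, vz, then the next track."""
    out = bytearray()
    for tracks in events:
        for trk in tracks:
            out += struct.pack("<iifff", *trk)
    return bytes(out)


def pack_column_wise(events):
    """Per event: all ids, then all types, then all vx, all vy, all vz."""
    out = bytearray()
    for tracks in events:
        ids, types, xs, ys, zs = zip(*tracks)
        n = len(tracks)
        out += struct.pack(f"<{n}i", *ids)
        out += struct.pack(f"<{n}i", *types)
        for column in (xs, ys, zs):
            out += struct.pack(f"<{n}f", *column)
    return bytes(out)


events = make_events()
row = pack_row_wise(events)
col = pack_column_wise(events)

# Same bytes, different order, so the uncompressed sizes are identical.
row_gz = len(zlib.compress(row, 6))
col_gz = len(zlib.compress(col, 6))
print("uncompressed:", len(row), len(col))
print("compressed:  ", row_gz, "(row-wise)", col_gz, "(column-wise)")
```

With this toy data the column-wise buffer should come out smaller after compression, but the size of the effect depends entirely on the made-up field values, so it says nothing quantitative about the real formats.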
Also note that the evio package includes auto-gunzip on input, and I am
about to add an auto-gzip option for output.
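As an illustration of what auto-gunzip on input can look like in general, here is a minimal Python sketch that checks for the two-byte gzip magic number (0x1f 0x8b) and decompresses transparently when it is present. This is not the evio implementation, whose mechanism is not described here; the function name open_maybe_gzipped is hypothetical.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_maybe_gzipped(path):
    """Open path for binary reading, decompressing if it is gzipped.

    Hypothetical helper for illustration only; not part of evio.
    """
    with open(path, "rb") as f:
        magic = f.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rb")
    return open(path, "rb")
```

A caller would then read from the returned file object the same way whether or not the file on disk was compressed.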
Sincerely,
Elliott
================================================================================
Those raised in a morally relative or neutral environment will hold
no truths to be self-evident.
Elliott Wolin
Staff Physicist, Jefferson Lab
12000 Jefferson Ave
Suite 8 MS 12A1
Newport News, VA 23606
757-269-7365
================================================================================