[Halld-offline] Comparing hddm and evio file sizes, with and w/o gzip

Elliott Wolin wolin at jlab.org
Tue Mar 2 15:52:55 EST 2010


Hi,

Using the danaevio plugin, which writes DANA objects to evio files, I 
compared hddm and evio file sizes with and without gzip.  I used two 
files generated by David L, 100 events each, of smeared single pi and 
multi-track events.  The DANA objects written out in evio are basically 
containers for the hddm data read in.

_No gzip_

  single pi:   2.9M evio    3.1M hddm
  multi trk:   23M  evio    24M  hddm

The ratio of file sizes is:

  single pi:    1.07  hddm/evio
  multi trk:    1.04  hddm/evio

This is not surprising, since for the most part the same bytes are 
written out, just in a different order.


_With gzip (default flags)_

  single pi:    967k evio     1.4M hddm
  multi trk:    6.6M evio     9.8M hddm

This gives gzip compression ratios (uncompressed size / gzipped size):

  single pi:   3.0 evio     2.2 hddm
  multi trk:   3.5 evio     2.4 hddm

The ratio of gzipped file sizes is:

  single pi:     1.45 hddm/evio
  multi trk:     1.48 hddm/evio
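
For anyone who wants to re-derive the ratios, here is a minimal sketch in 
Python.  It assumes the rounded sizes quoted above, with M taken as 1e6 
bytes and k as 1e3 bytes; the exact byte counts are not given in this 
post, and the names `sizes` and `ratios` are just for illustration.

```python
# Rounded file sizes from the tables above, in bytes.
# Assumption: M = 1e6 and k = 1e3 (exact byte counts are not in the post).
sizes = {
    "single pi": {"evio": 2.9e6, "hddm": 3.1e6, "evio_gz": 967e3, "hddm_gz": 1.4e6},
    "multi trk": {"evio": 23e6, "hddm": 24e6, "evio_gz": 6.6e6, "hddm_gz": 9.8e6},
}


def ratios(s):
    """Return the four ratios quoted in the post for one sample."""
    return {
        "hddm/evio": s["hddm"] / s["evio"],
        "evio gzip ratio": s["evio"] / s["evio_gz"],
        "hddm gzip ratio": s["hddm"] / s["hddm_gz"],
        "hddm/evio gzipped": s["hddm_gz"] / s["evio_gz"],
    }


for name, s in sizes.items():
    print(name, {k: round(v, 2) for k, v in ratios(s).items()})
```

Because the inputs are already rounded, the results agree with the 
tables only to the precision shown there.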


Note that the evio files compress better.  I speculate this is because 
data is grouped differently in the two formats.  In evio files, the 
values of a given quantity are stored together for all tracks, e.g. all 
track IDs are sequential, all track vertex x values are sequential, etc.  
Thus long runs of the same or similar numbers are more likely in evio 
files, allowing gzip to compress more effectively.
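
The grouping effect can be tried on toy data.  The sketch below is only 
an illustration of the speculation above, not a model of the real hddm 
or evio byte layouts: it assumes each track has just an id and a vertex 
(x, y, z), that all tracks in an event share one vertex, and it uses 
Python's zlib (the same DEFLATE algorithm as gzip) at its default level.  
All function names are made up for the example.

```python
import random
import struct
import zlib


def make_events(n_events=200, n_tracks=50, seed=0):
    """Toy events: track ids 1..n_tracks, one shared random vertex per event."""
    rng = random.Random(seed)
    events = []
    for _ in range(n_events):
        vx, vy, vz = (rng.uniform(-1.0, 1.0) for _ in range(3))
        events.append([(i + 1, vx, vy, vz) for i in range(n_tracks)])
    return events


def pack_interleaved(events):
    """Track by track: id, x, y, z, id, x, y, z, ..."""
    out = bytearray()
    for tracks in events:
        for tid, vx, vy, vz in tracks:
            out += struct.pack("<ifff", tid, vx, vy, vz)
    return bytes(out)


def pack_grouped(events):
    """Per event: all ids, then all x values, then all y, then all z."""
    out = bytearray()
    for tracks in events:
        n = len(tracks)
        out += struct.pack(f"<{n}i", *(t[0] for t in tracks))
        for field in (1, 2, 3):
            out += struct.pack(f"<{n}f", *(t[field] for t in tracks))
    return bytes(out)


events = make_events()
interleaved = pack_interleaved(events)
grouped = pack_grouped(events)
interleaved_gz = len(zlib.compress(interleaved))
grouped_gz = len(zlib.compress(grouped))
print("interleaved:", len(interleaved), "->", interleaved_gz)
print("grouped:    ", len(grouped), "->", grouped_gz)
```

Both layouts hold exactly the same values in the same number of bytes; 
only the order differs.  On this toy data the grouped layout comes out 
smaller after compression, because each group is a long run of 
identical or slowly varying values.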

Also note that the evio package includes auto-gunzip on input, and I am 
about to add an auto-gzip option for output.
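
As a generic illustration of what auto-gunzip on input can look like 
(this is not the evio package's code or API, which is not shown in this 
post), a reader can check for the two-byte gzip magic number and pick 
the matching opener.  The function name and file names are hypothetical.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_maybe_gzipped(path):
    """Open path for binary reading, decompressing if it looks gzipped."""
    with open(path, "rb") as f:
        magic = f.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rb")
    return open(path, "rb")


# Small demonstration with temporary files (names are made up).
with tempfile.TemporaryDirectory() as d:
    plain = os.path.join(d, "events.evio")
    zipped = os.path.join(d, "events.evio.gz")
    payload = b"example payload"
    with open(plain, "wb") as f:
        f.write(payload)
    with gzip.open(zipped, "wb") as f:
        f.write(payload)
    with open_maybe_gzipped(plain) as f:
        from_plain = f.read()
    with open_maybe_gzipped(zipped) as f:
        from_gzip = f.read()
```

Checking the content rather than the file extension means a renamed 
file is still read correctly.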


				Sincerely,
					Elliott
 

================================================================================


 Those raised in a morally relative or neutral environment will hold
		    no truths to be self-evident.
				   

Elliott Wolin
Staff Physicist, Jefferson Lab
12000 Jefferson Ave
Suite 8 MS 12A1
Newport News, VA 23606
757-269-7365

================================================================================
