[Halld-offline] [New Logentry] Follow-up Re: mcsmear profiling

Thu Jun 18 21:15:01 EDT 2015

Logentry Text:
--
A brief study was made of the compression options for HDDM output files generated by mcsmear. It used a version of mcsmear containing Tegan's modifications, which remove the time development feature when smearing BCAL hits and so allow the program to run faster. The motivation for this exercise was to see how well mcsmear scales when running with multiple threads, a feature that will be desirable when running large simulation farm jobs with multi-threaded GEANT4 and multi-threaded sim-recon.

Findings are:

1. mcsmear does not scale well with multiple threads since most of its time is spent writing events to the HDDM output stream. This is unavoidable at present since that action must be serialized (see the note on parallel scalability below).

2. HDDM has multiple options for compressing the data stream on the fly that include: no compression, bz2 compression, zlib compression. The default for mcsmear prior to revision 18781 (committed today) was to use bz2 compression. With revision 18781 this was changed to use zlib compression by default.

3. mcsmear runs approximately 3-4 times faster with no compression than with bz2 compression. The cost is approximately 2.3 times larger files (see figure below).

4. Our computing model does not require us to store large amounts of simulated HDDM files on tape. If we do end up writing some simulated events to tape, though, the tape drives have built-in compression, so the tape cost should be the same whether we use compressed or uncompressed files. The cost would be in the bandwidth needed to move the files to/from tape and in local disk storage.
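
The speed/size trade-off in findings 2 and 3 can be explored with a short script. This is only a rough sketch using the zlib and bz2 modules from the Python standard library on a synthetic payload. It does not use HDDM or mcsmear, the function name compare_codecs is made up for illustration, and the numbers it prints should not be expected to match the figure.

```python
import bz2
import time
import zlib


def compare_codecs(payload: bytes) -> dict:
    """Return {codec name: (output size in bytes, seconds spent compressing)}."""
    codecs = {
        "none": lambda data: data,  # pass-through, standing in for "no compression"
        "zlib": zlib.compress,
        "bz2": bz2.compress,
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        out = compress(payload)
        results[name] = (len(out), time.perf_counter() - start)
    return results


if __name__ == "__main__":
    # Synthetic stand-in for an event stream (assumption: NOT real HDDM data).
    payload = b"".join(
        b"hit %06d energy %03d\n" % (i, i % 97) for i in range(200_000)
    )
    for name, (size, seconds) in compare_codecs(payload).items():
        print(f"{name:5s} {size:10d} bytes  {seconds:.3f} s")
```

Running the same comparison on a real HDDM file read as raw bytes would give a more representative picture than the synthetic payload.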

In the figure below, the processing rate and output HDDM file size were measured for 2 different conditions:

- 10k bggen events simulated using hdgeant. mcsmear was run on ifarm1401 (aka ifarm65), a Linux CentOS 6.5 computer.
- 5k CPP signal events (pi+pi-) simulated using CPPsim. mcsmear was run on a MacBook Pro laptop running Mac OS X 10.9.

In both configurations, the single-thread and 4-thread rates were measured.

It is worth noting that the read rates were also measured using hd_ana, but with no reconstruction. These were roughly proportional to the writing rates for the corresponding compression method. In other words, the uncompressed files could be read in faster than the compressed files.

[figure:1]

Note on improving parallel scalability of mcsmear:
It is possible that mcsmear could be made to scale to faster rates by parallelizing HDDM's serialization of data. This would require some modification to HDDM at a lower level to allow individual events to be written to separate buffers first, with the buffers then added to the output stream. Before we go to that trouble, though, further benchmarking will be needed to see if it is at all worthwhile. Uncompressed writing rates from mcsmear for bggen events are currently at ~900 Hz. This is significantly faster than the simulation rate, which is on the order of 10 Hz. We will need to see what rates can be achieved from multi-threaded GEANT4.
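
The separate-buffer idea can be illustrated outside of HDDM. The sketch below is an assumption-laden toy in Python, not HDDM's API or file format: each worker compresses one event into its own private buffer in parallel, and only the final append to the shared stream is done under a lock. The names encode_event, write_events and read_events, and the 4-byte length prefix used to frame each buffer, are all hypothetical.

```python
import io
import struct
import threading
import zlib
from concurrent.futures import ThreadPoolExecutor


def encode_event(event: bytes) -> bytes:
    """Parallel part: compress one event into its own private buffer."""
    body = zlib.compress(event)
    # Length-prefixed frame (hypothetical format, not the HDDM stream layout).
    return struct.pack("<I", len(body)) + body


def write_events(events, stream, workers=4):
    """Encode events concurrently; serialize only the append to the stream."""
    lock = threading.Lock()

    def process(event):
        frame = encode_event(event)  # runs concurrently across workers
        with lock:                   # only this short append is serialized
            stream.write(frame)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(process, events))  # list() surfaces worker exceptions


def read_events(stream):
    """Yield events back from a stream written by write_events."""
    while True:
        header = stream.read(4)
        if not header:
            break
        (size,) = struct.unpack("<I", header)
        yield zlib.decompress(stream.read(size))


if __name__ == "__main__":
    events = [(b"event %d " % i) * 50 for i in range(1000)]
    buffer = io.BytesIO()
    write_events(events, buffer)
    buffer.seek(0)
    print(sum(1 for _ in read_events(buffer)), "events read back")
```

Note that in this sketch the events land in the output in whatever order the workers finish, not in input order, and each event is compressed independently, which generally compresses less well than one continuous stream. Both points would need to be weighed in any real change to HDDM.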

---

This is a plain text email for clients that cannot display HTML.  The full logentry can be found online at https://logbooks.jlab.org/entry/3344045
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mcsmear_compression.png
Type: image/png
Size: 38631 bytes
Desc: not available
URL: <https://mailman.jlab.org/pipermail/halld-offline/attachments/20150618/40404704/attachment-0001.png>