[Halld-online] Proposal for "compressed" EVIO raw event format
Elliott Wolin
wolin at jlab.org
Wed Oct 24 10:32:45 EDT 2012
Hi,
I've been running Dave A's mc2coda package to generate simulated raw
data tapes from Hall D monte carlo data using the "rawevent" JANA
plugin. Dave's package currently generates EVIO records that correspond
to block-mode readout with block size set to one, and it simulates
pulse-integral readout mode in the FADC's. The event data includes all
header/trailer words generated by the front-end boards.
This mode is fairly verbose and includes redundant data. E.g. the
trigger time is repeated for each slot in each crate, although only one
trigger time is needed for the event. Also, each slot generates four
words per event even if that slot has no hits, so a completely empty
crate with 16 slots generates 16 * 4 = 64 words (256 bytes).
Below I propose a new format that eliminates all redundancy and
extraneous information. An initial simulation shows a factor of three
reduction in the average event size, from around 28 kBytes/event to 6.6
kBytes/event. I suspect my format may be too compressed; please
comment. We will discuss this at today's online meeting. Note that
this compression can happen at the crate and/or EB level.
This is a fragment of an event generated by mc2coda corresponding to an
empty crate:
<bank content="bank" data_type="0x10" tag="7" num="1" nchildren="1">
<bank content="uint32" data_type="0x1" tag="0" num="1" nwords="64">
0x808c0801 0x90800001 0x98000064
0x88000004 0x80cc0801
0x90c00001 0x98000064 0x88000004
0x810c0801 0x91000001
0x98000064 0x88000004 0x814c0801
0x91400001 0x98000064
0x88000004 0x818c0801 0x91800001
0x98000064 0x88000004
0x81cc0801 0x91c00001 0x98000064
0x88000004 0x820c0801
0x92000001 0x98000064 0x88000004
0x824c0801 0x92400001
0x98000064 0x88000004 0x834c0801
0x93400001 0x98000064
0x88000004 0x838c0801 0x93800001
0x98000064 0x88000004
0x83cc0801 0x93c00001 0x98000064
0x88000004 0x840c0801
0x94000001 0x98000064 0x88000004
0x844c0801 0x94400001
0x98000064 0x88000004 0x848c0801
0x94800001 0x98000064
0x88000004 0x84cc0801 0x94c00001
0x98000064 0x88000004
0x850c0801 0x95000001 0x98000064 0x88000004
Here each module/slot generates 4 words: block header, event header,
trigger time and block trailer, and includes no data.
This is a fragment from a crate that contains data:
<bank content="bank" data_type="0x10" tag="3" num="1" nchildren="1">
<bank content="uint32" data_type="0x1" tag="0" num="1" nwords="100">
0x80900801 0x90800001 0x98000032
0x88000004 0x80d00801
0x90c00001 0x98000032 0xb8503138
0xc0503138 0xb8603138
0xc0603138 0xb8703138 0xc0703138
0xb8803138 0xc0803138
0xb8903138 0xc0903138 0xb8a03138
0xc0a03138 0xb8b03138
0xc0b03138 0xb8c03138 0xc0c03138
0xb8d03138 0xc0d03138
0xb8e03138 0xc0e03138 0xb8f03138
0xc0f03138 0xb9003138
0xc1003138 0xb9103138 0xc1103138
0xb9203138 0xc1203138
0xb9303138 0xc1303138 0x88000022
0xf8000000 0xf8000000
0x81100801 0x91000001 0x98000032
0x88000004 0x81500801
0x91400001 0x98000032 0xbb103138
0xc3103138 0x88000006
0xf8000000 0xf8000000 0x81900801
0x91800001 0x98000032
0x88000004 0x81d00801 0x91c00001
0x98000032 0x88000004
0x82100801 0x92000001 0x98000032
0x88000004 0x82500801
0x92400001 0x98000032 0x88000004
0x83500801 0x93400001
0x98000032 0x88000004 0x83900801
0x93800001 0x98000032
0x88000004 0x83d00801 0x93c00001
0x98000032 0x88000004
0x84100801 0x94000001 0x98000032
0x88000004 0x84500801
0x94400001 0x98000032 0x88000004
0x84900801 0x94800001
0x98000032 0x88000004 0x84d00801
0x94c00001 0x98000032
0x88000004 0x85100801 0x95000001
0x98000032 0x88000004
</bank>
Note that all this information is needed by the event building system to
ensure all crates are active and aligned as far as trigger times are
concerned.
My straw proposal for "compressed" or "sparsified" raw data format is as
follows. I removed all 4-word empty slots, removed all empty crates,
compressed the 4-word overhead per slot down to a single slot header
word, removed the intermediate bank layer directly below the outermost
bank and changed the outermost bank num from 1 to 2. I haven't found a
place for the trigger time yet, perhaps in the first segment, perhaps in
a new segment. Average reduction is a factor of 3.
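The per-crate part of this sparsification can be sketched in Python as
follows. This is only an illustration, not the code used to produce the
numbers above. The bit positions (type code in bits 30-27, slot in bits
26-22, board type in bits 21-18) and the type codes (0 = block header,
1 = block trailer, 2 = event header, 3 = trigger time, 15 = filler) are
assumptions read off the sample words in the dumps, and the name
sparsify_crate is hypothetical. Every word is assumed to have bit 31 set,
as in the dumps.

```python
# Assumed bit layout, inferred from the sample words (not from a spec).
TYPE_SHIFT, SLOT_SHIFT, BOARD_SHIFT = 27, 22, 18
BLOCK_HEADER, BLOCK_TRAILER = 0, 1
EVENT_HEADER, TRIGGER_TIME, FILLER = 2, 3, 15
OVERHEAD = {EVENT_HEADER, TRIGGER_TIME, FILLER}
SLOT_HEADER = 12  # "user mode" type code used for the new slot header word


def sparsify_crate(words):
    """Return the compressed word list for one crate (hypothetical helper).

    Empty slots are dropped; each non-empty slot becomes one slot header
    word followed by its hit words. An empty crate yields an empty list.
    """
    out = []
    slot = board = None
    hits = []
    for w in words:
        wtype = (w >> TYPE_SHIFT) & 0xF
        if wtype == BLOCK_HEADER:
            # Start of a new slot: remember where the hits came from.
            slot = (w >> SLOT_SHIFT) & 0x1F
            board = (w >> BOARD_SHIFT) & 0xF
            hits = []
        elif wtype == BLOCK_TRAILER:
            if hits:
                if len(hits) > 0xFFFF:
                    raise ValueError("hit count does not fit in 16 bits")
                header = ((1 << 31) | (SLOT_HEADER << TYPE_SHIFT)
                          | (slot << SLOT_SHIFT) | (board << BOARD_SHIFT)
                          | len(hits))
                out.append(header)
                out.extend(hits)
            slot = board = None
            hits = []
        elif wtype not in OVERHEAD and slot is not None:
            # Anything that is not overhead is treated as a hit word.
            hits.append(w)
    return out


if __name__ == "__main__":
    # Two slots taken from the crate dump: one empty, one with a single hit.
    sample = [
        0x81100801, 0x91000001, 0x98000032, 0x88000004,
        0x81500801, 0x91400001, 0x98000032, 0xBB103138,
        0xC3103138, 0x88000006, 0xF8000000, 0xF8000000,
    ]
    print([hex(w) for w in sparsify_crate(sample)])
```

The trigger time words are simply dropped here, since the proposal has
not yet settled where the single per-event trigger time should go.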
These are the first few banks of compressed data:
<bank content="bank" data_type="0x10" tag="65361" num="2" nchildren="40">
<bank content="segment" data_type="0x20" tag="65313" num="63" nchildren="2">
<segment content="uint64" data_type="0xa" tag="1" nwords="2">
0x1 0x64
</segment>
<segment content="uint16" data_type="0x5" tag="1" nwords="2">
0 0
</segment>
</bank>
<bank content="uint32" data_type="0x1" tag="1" num="1" nwords="141">
0xe0900010 0xb8203138 0xc0203138 0xb8403138
0xc0403138
0xb9003138 0xc1003138 0xb9203138 0xc1203138
0xbac03138
0xc2c03138 0xbae03138 0xc2e03138 0xbba03138
0xc3a03138
0xbbb03138 0xc3b03138 0xe0d0000a 0xb8f03138
0xc0f03138
0xb9103138 0xc1103138 0xba103138 0xc2103138
0xba203138
0xc2203138 0xbc703138 0xc4703138 0xe110000a
0xb8e03138
0xc0e03138 0xb9003138 0xc1003138 0xbb403138
0xc3403138
0xbb603138 0xc3603138 0xbc503138 0xc4503138
0xe1500008
0xb8203138 0xc0203138 0xbae03138 0xc2e03138
0xbb003138
0xc3003138 0xbc303138 0xc4303138 0xe1900008
0xbaa03138
0xc2a03138 0xbac03138 0xc2c03138 0xbbe03138
0xc3e03138
0xbc303138 0xc4303138 0xe1d00006 0xbb203138
0xc3203138
0xbb303138 0xc3303138 0xbc503138 0xc4503138
0xe210000a
0xb8203138 0xc0203138 0xb8603138 0xc0603138
0xb8643138
0xc0643138 0xbba03138 0xc3a03138 0xbc103138
0xc4103138
0xe2500006 0xb9103138 0xc1103138 0xb9203138
0xc1203138
0xb9243138 0xc1243138 0xe350000a 0xb8703138
0xc0703138
0xb8e03138 0xc0e03138 0xba603138 0xc2603138
0xba643138
0xc2643138 0xba683138 0xc2683138 0xe390000a
0xb9f03138
0xc1f03138 0xba603138 0xc2603138 0xbc003138
0xc4003138
0xbc043138 0xc4043138 0xbc103138 0xc4103138
0xe3d00002
0xbc103138 0xc4103138 0xe4100006 0xb9903138
0xc1903138
0xb9943138 0xc1943138 0xb9a03138 0xc1a03138
0xe4500008
0xb9f03138 0xc1f03138 0xba303138 0xc2303138
0xbbc03138
0xc3c03138 0xbbf03138 0xc3f03138 0xe4d0000c
0xb8a03138
0xc0a03138 0xb8b03138 0xc0b03138 0xb8e03138
0xc0e03138
0xba603138 0xc2603138 0xba803138 0xc2803138
0xba903138
0xc2903138 0xe5100004 0xbc003138 0xc4003138
0xbc403138
0xc4403138
</bank>
The slot header word uses one of the "user mode" header codes, 12 in
this case. After this the 5-bit slot number appears in the usual place,
followed by 4 bits defining the board type (I used the scheme in
mc2coda). The lower 16 bits contain the number of hit words to
follow. Finally, the bank tag corresponds to the crate/ROC number. I
wonder if I should change the crate bank num from 1 to 2, as I did for
the outermost bank...
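To make the slot header layout concrete, here is a minimal Python sketch
of a reader for one compressed crate bank. The exact bit positions (type
code in bits 30-27, slot in bits 26-22, board type in bits 21-18, hit
count in bits 15-0) are my reading of the sample words above rather than
a formal definition, and the names SlotHeader, decode_slot_header and
iter_slots are hypothetical.

```python
from typing import NamedTuple

SLOT_HEADER = 12  # "user mode" type code carried by the slot header word


class SlotHeader(NamedTuple):
    type_code: int    # bits 30-27 (assumed position)
    slot: int         # bits 26-22, 5-bit slot number (assumed position)
    board_type: int   # bits 21-18, 4-bit board type (assumed position)
    n_hit_words: int  # bits 15-0, number of hit words that follow


def decode_slot_header(word):
    """Unpack one slot header word into its fields."""
    return SlotHeader(
        (word >> 27) & 0xF,
        (word >> 22) & 0x1F,
        (word >> 18) & 0xF,
        word & 0xFFFF,
    )


def iter_slots(words):
    """Yield (header, hit_words) for each slot in a compressed crate bank."""
    i = 0
    while i < len(words):
        hdr = decode_slot_header(words[i])
        if hdr.type_code != SLOT_HEADER:
            raise ValueError(f"expected slot header at word {i}")
        start, end = i + 1, i + 1 + hdr.n_hit_words
        if end > len(words):
            raise ValueError(f"slot {hdr.slot} runs past the end of the bank")
        yield hdr, words[start:end]
        i = end


if __name__ == "__main__":
    # Two consecutive slots copied from the compressed dump above.
    bank = [
        0xE3D00002, 0xBC103138, 0xC4103138,
        0xE4100006, 0xB9903138, 0xC1903138, 0xB9943138,
        0xC1943138, 0xB9A03138, 0xC1A03138,
    ]
    for hdr, hits in iter_slots(bank):
        print(hdr, [hex(w) for w in hits])
```

With this layout the first word of the dump, 0xe0900010, decodes to type
code 12, slot 2, board type 4 and 16 hit words. The 15 slot headers in
the dump carry hit counts that sum to 126, which together with the
headers themselves gives the bank's nwords="141".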
Comments, suggestions, criticisms?
Thanks,
--
Sincerely,
Elliott
================================================================================
Those raised in a morally relative or neutral environment will hold
no truths to be self-evident.
Elliott Wolin
Staff Physicist, Jefferson Lab
12000 Jefferson Ave
Suite 8 MS 12A1
Newport News, VA 23606
757-269-7365
================================================================================