[Halld-offline] REST Data Format
Mark M. Ito
marki at jlab.org
Tue Jul 3 10:21:32 EDT 2012
Paul,
I think expanding the list of data is probably a good idea to make the
output more useful for analysis. The idea would be that decisions that
are appropriate at the analysis level be make-able at the analysis
level. But each byte needs to be examined critically, as we are doing.
I'm not sure about the detector plane intersection points though, since
those are derivable by swimming. I think the CPU time is cheap on the
scale of things and the resulting information from swimming is
potentially much richer.
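To illustrate what "derivable by swimming" means, here is a minimal sketch and not the actual reconstruction code. It assumes a uniform solenoidal field along z, a track starting on the beamline at the origin, and no energy loss, so the path is an exact helix. The real swimmer steps through a field map. The function name, the 2 T field, and the 65 cm radius are illustrative assumptions, not GlueX parameters.

```python
import math

C_CM_PER_NS = 29.9792458          # speed of light in cm/ns
KAPPA = 0.00299792458             # GeV / (T * cm), for unit charge

def swim_to_cylinder(px, py, pz, mass, charge, b_tesla, radius_cm):
    """Hypothetical helper: intersect a helix from the origin with a cylinder.

    Momenta and mass in GeV, charge in units of e (non-zero),
    field along +z in tesla (positive). Returns (x, y, z, t) in cm and ns,
    or None if the track curls up before reaching the cylinder.
    """
    pt = math.hypot(px, py)
    p = math.sqrt(pt * pt + pz * pz)
    r_curv = pt / (KAPPA * abs(charge) * b_tesla)   # helix radius in cm

    # The chord from the origin after turning angle a has length
    # 2 * r_curv * sin(a / 2); it can never exceed the helix diameter.
    if radius_cm > 2.0 * r_curv:
        return None
    turn = 2.0 * math.asin(radius_cm / (2.0 * r_curv))

    # Positive charges rotate clockwise (seen from +z) for a field along +z.
    sign = 1.0 if charge > 0 else -1.0
    chord_dir = math.atan2(py, px) - sign * turn / 2.0
    x = radius_cm * math.cos(chord_dir)
    y = radius_cm * math.sin(chord_dir)

    arc_xy = r_curv * turn            # transverse arc length
    z = arc_xy * pz / pt
    path = arc_xy * p / pt            # full 3D path length
    beta = p / math.sqrt(p * p + mass * mass)
    t = path / (beta * C_CM_PER_NS)   # flight time from the origin
    return (x, y, z, t)

# Example: a 1 GeV/c transverse-momentum pi+ in a 2 T field, 65 cm cylinder.
hit = swim_to_cylinder(1.0, 0.0, 0.5, 0.13957, +1, 2.0, 65.0)
```

Storing only the returned intersection point and time per detector would be the compact alternative discussed in the quoted message; re-running a calculation like this at analysis time trades that disk space for CPU.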
-- Mark
On 07/02/2012 05:05 PM, Paul Mattione wrote:
> Matt Shepherd and I have had a lot more discussion about the data included in the REST format, and we think that the results currently planned for storage are too sparse. We think that we should include all of the relevant reconstructed information from the different detector systems (not just BCAL & FCAL, but also TOF and SC), along with the results from the time-based tracking. This doesn't include hit-level ADCs and TDCs, just things like the final reconstructed hit position, time, and uncertainties in the different systems (not the CDC and FDC hits, of course). We think that this would be the highest level of analysis-independent information that can be provided. This would allow us to later study and tweak the matching between the charged tracks and the hits in the different systems, which otherwise could not be done without reconstructing all of the data again.
>
> The question is then whether or not we also store the results of the matching between the charged tracks and the hits in the different systems, e.g. the DNeutralShower and DChargedTrack objects (DChargedTrack contains things like the TOF hit time projected to the beamline). From an analysis-tools point of view it wouldn't matter: you request JANA to give you the DChargedTrack objects, and it would return them whether they were directly saved in REST or had to be built from the saved DTrackTimeBased (and other) objects.
>
> If we don't store the track-system matching results it would save disk space (e.g. the matching matrix between charged tracks and calorimeter showers is not 1-to-1, but many-to-many), but increase CPU time. To match the charged tracks to the detector system hits we have to swim the track out from the beamline through GlueX. The full results from this swim are far too large to save in REST, but we could save the intersection of each track (in space and time) with each detector system (BCAL, FCAL, etc.) in REST to save time on this step. The track could always be re-swum if necessary, but it shouldn't need to be.
>
> In summary, I (and perhaps Matt) propose that we store the results of the following objects in REST:
>
> DTrackTimeBased (only the relevant quantities (no DReferenceTrajectory, vector<DStartTime_t>, tracking pulls, t0, t1, etc.), plus the intersection with the detector elements (BCAL, TOF, etc.))
> DBCALShower
> DFCALShower
> DTOFPoint
> DSCHit
>
> This may not be necessary for the current data challenge, but I think it's best for our long-term analysis needs. What do you guys think?
>
> - Paul
>
>
> _______________________________________________
> Halld-offline mailing list
> Halld-offline at jlab.org
> https://mailman.jlab.org/mailman/listinfo/halld-offline
--
Mark M. Ito
Jefferson Lab (www.jlab.org)
(757)269-5295