[Halld-offline] Storage and Analysis of Reconstructed Events

Eugene Chudakov gen at jlab.org
Mon Feb 13 10:17:05 EST 2012


Hi Matt,

You wrote:

>This high-level skim included all events that looked like they had some
>hadrons in the final state (not cosmics, Bhabha scattering, etc.).  It
>was a substantial fraction of all events recorded and pretty much any
>event that you would want for a physics analysis.  All of this was on
>disk -- we had several large RAID systems.

And you wrote in the previous mail:

>>> The framework could simultaneously deliver the remainder of the data
>>> that was in the high level reconstruction skim with each event.  It's

Was this "high level reconstruction skim" kept in sequential files, so
that one had to read through all of them to extract the data for the
selected events?

If GlueX has 100 times more events, is it conceivable to keep
the "high level reconstruction skim" on disk?

Eugene


On Mon, 13 Feb 2012, Matthew Shepherd wrote:

>
> Eugene,
>
> On Feb 12, 2012, at 2:22 PM, Eugene Chudakov wrote:
>
>> You wrote:
>>> The framework could simultaneously deliver the remainder of the data
>>> that was in the high level reconstruction skim with each event.  It's
>>> nice because you get a reduced number of events with enhanced
>>> information without the cost of duplicating the initial skim output.
>>
>> I am not sure I get it.  Does the "high level reconstruction skim"
>> sample include all the reconstructed events or a subsample of them? In
>> the former case they can hardly be stored on disk with quick
>> access. In the latter case, how many such "skims" were kept on disk?
>
> This high-level skim included all events that looked like they had some hadrons in the final state (not cosmics, Bhabha scattering, etc.).  It was a substantial fraction of all events recorded and pretty much any event that you would want for a physics analysis.  All of this was on disk -- we had several large RAID systems.
>
> The subsequent skims, the "D tag" skim for example, selected a subset of these events but did not copy them to an additional file.  (The only additional file that was created contained specialized info about D reconstruction.)  The data access system in CLEO allowed random access to events in the original sample, so it was efficient to run over a subsample of events.  There was also little disk penalty for creating additional skims, since you weren't copying events each time.
>
> Using a CLEO-like system in GlueX, one could have many useful skims, e.g., a Primakoff skim that had high energy deposition in the FCAL, or a skim where some particle (eta', omega, etc.) was reconstructed.  This would make a great starting place for analyses, rather than going back and running through all events each time.
>
>> It would be useful to see how many events CLEO had written out, and in
>> which format, for permanent storage, for relatively long-term storage
>> (months), for short-term storage (days), etc. Of course, the storage
>> term is associated with a particular analysis.
>
> I can try to find more details on this.
>
> Matt
>
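The skim-by-reference scheme Matt describes above can be sketched roughly as follows. This is only a minimal illustration under assumed conventions (a fixed-size binary record, a skim stored as a list of byte offsets); the actual CLEO event format and data access library were different. The point it demonstrates is that a skim need only record *where* its events live in the original sample, so selecting a subsample costs almost no extra disk space and reading it back is a set of direct seeks rather than a sequential pass.

```python
# Hypothetical sketch of "skim by reference" (assumed layout, not the
# actual CLEO format): the full reconstruction output stays in one file,
# and a skim is just a list of byte offsets into it, so selected events
# can be read back by random access without copying them.
import struct
import tempfile

RECORD = struct.Struct("<id")  # event number + one reconstructed quantity

# Write a toy "high level reconstruction skim" of 1000 events.
events = tempfile.NamedTemporaryFile(delete=False)
offsets = []
for n in range(1000):
    offsets.append(events.tell())          # remember where each event starts
    events.write(RECORD.pack(n, n * 0.5))
events.close()

# A "D tag"-like skim: keep only the offsets of selected events
# (here an arbitrary toy selection) -- no events are copied.
dtag_offsets = [offsets[n] for n in range(1000) if n % 7 == 0]

# Analysis pass: seek directly to each selected event.
selected = []
with open(events.name, "rb") as f:
    for off in dtag_offsets:
        f.seek(off)
        n, q = RECORD.unpack(f.read(RECORD.size))
        selected.append(n)

print(len(selected), selected[:3])  # 143 [0, 7, 14]
```

Under this scheme a new skim costs only one offset per selected event, which is why, as noted above, many overlapping skims can coexist on disk cheaply.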


