<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    Richard,<br>

    <br>

    Perhaps storing the config information in the file is not such a

    great idea after all (and I agree that even if we were to do that

    the HDDM template at the beginning is a terrible place to put it; I

    was thinking of the special-record approach). I see that if the

    config information is used as anything other than a comment, then

    that info has to be kept and disseminated not only to other

    concurrent threads but also to future incarnations of the data

    (skims in your example).<br>

    <br>

      -- Mark<br>

    <br>

    <div class="moz-cite-prefix">On 07/18/2012 03:09 PM, Richard Jones

      wrote:<br>

    </div>

    <blockquote cite="mid:50070A05.5050400@uconn.edu" type="cite">Hello,

      <br>

      <br>

      I have no objection to storing a string tag for each object,

      representing the GetTag() string from jana.  That can be done

      either on an event-by-event basis or globally.  Event-by-event

      should only be adopted if the analysis can handle the situation

      where tags switch dynamically within a job, or we want to store

      more than one tag (say both default and "KLOE" bcal clusters) and

      let the user decide which to use.  That would require changes to

      the current DEventSourceREST.cc, but would be easy to do.  If tags

      are stored globally, then the hddm system will ensure

      automatically that only streams with the same tag strings get

      merged together as a result of a skim or by hddm-cat.  It would

      also provide a better way for the danarest plugin to decide which

      tag to use for each output object, instead of the provisional way

      I am handling it right now for DBCALShower objects, which David

      points out is incorrect in some cases.

      <br>

      <br>

      As to the idea of flooding the REST file header with analysis

      qualifiers, that is not something that hddm can do right now.  I

      could add the capability, but I question why.  The only function

      of the hddm header, as currently conceived, is to document to the

      hddm toolkit how to unpack the event data and what their meaning

      and relationships are. That is all it does.  It is not a place to

      record random comments like the name of the application that wrote

      the file, or the command line switches.  User code does not

      normally even access the header; it is just handled by the hddm

      library.  So at present, storing runconfig-type information would

      require adding special events to the stream, AND the huge change

      of making hddm streams stateful....

      <br>

      <br>

      Just like root trees, hddm streams are designed to be stateless.  This

      is an important design feature that I am not eager to concede.

      Think about trying to stick config-type information into a root

      tree, and then analyze it with a TSelector on PROOF.  You are

      going to have to do major gymnastics to get that information to

      every analysis session that gets started to run your job. 

      Building single-threaded concepts like this into the analysis

      sounds like we are still working like we did 20 years ago.

      <br>

      <br>

      It was not my original intent to embed metadata about the

      conditions of the production inside the file, because I want later

      to be able to string these events together and create skims.  In

      general I want to avoid "stateful" streams in hddm, relying

      instead on the global keys like runnumber, eventnumber to reference

      database records for this information, similar to how root trees

      work.  By keeping the streams stateless I avoid all kinds of

      ordering and synchronization issues.  A related issue is the "skip

      to event NNN" action, which is very fast in hddm because you don't

      have to read in every event. Imagine a sparse skim, which in the

      limit would consist of one state record for every event.  Do I

      stop and check every time I hit a state record, do a bit-by-bit

      comparison with the current state, and throw an exception on

      incompatible changes?  Much better to check that compatibility

      ahead of time, from a database lookup, wouldn't it?

      <br>

      <br>

      If we want to make sure we don't lose the metadata, why not store

      them in two separate databases, or promote the database to a

      higher data security level?  We are not going to be able to

      analyze a REST file without access to a database (required to

      re-swim reference trajectories).  Playing devil's advocate, why

      are we not storing the magnetic field map in the event file? 

      Without the magnetic field we cannot get back the original

      DTrackTimeBased objects.

      <br>

      <br>

      -Richard J.

      <br>

      <br>

      <br>

      <br>


      <br>

      <pre wrap="">_______________________________________________

Halld-offline mailing list

<a class="moz-txt-link-abbreviated" href="mailto:Halld-offline@jlab.org">Halld-offline@jlab.org</a>

<a class="moz-txt-link-freetext" href="https://mailman.jlab.org/mailman/listinfo/halld-offline">https://mailman.jlab.org/mailman/listinfo/halld-offline</a></pre>

    </blockquote>

    <br>

    <pre class="moz-signature" cols="72">-- 

Mark M. Ito

Jefferson Lab (<a class="moz-txt-link-abbreviated" href="http://www.jlab.org">www.jlab.org</a>)

(757)269-5295</pre>

    <br>

    <br>

  </body>

</html>