<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hi everyone,<br>
<br>
I appreciate everyone taking the initiative to lay out these ideas.
My plan for Thursday is to go through these carefully and try to
organize them, together with what I started to present last week,
into a list of issues and preferences.<br>
<br>
Best,<br>
Seamus<br>
<br>
<div class="moz-cite-prefix">On 03/31/2015 04:01 PM, Thomas K
Hemmick wrote:<br>
</div>
<blockquote
cite="mid:CAAVEoEdZWH7JzYi5+ZCpVQA9q_Ab9mV+a5wcefxAdqji-vLwYg@mail.gmail.com"
type="cite">
<div dir="ltr">Hi everyone
<div><br>
</div>
<div>I like the fact that the scope of the discussion has
shifted to the over-arching issues. I agree both with Zhiwen
that flexibility is critical and with Richard that there is
such a thing as "enough rope to hang yourself". We'll need
judgement to decide the right number of rules, since too many
and too few are both bad.</div>
<div><br>
</div>
<div>Although my favorite means of enforcing rules is via the
compiler (...you WILL implement ALL the pure virtual
methods...), there is typically not enough oomph in that area
to make all our designs self-fulfilling.</div>
<div><br>
</div>
<div>Another way to approach this same discussion is to list the
things that can (will) go wrong if we don't address them at
the architecture level. Here is a short list from me:</div>
<div><br>
</div>
<div>1) The simulation geometry is different from the real
geometry.</div>
<div>2) The offline code applies corrections that fix real data
but ruin simulated data.</div>
<div>3) I just got a whole set of simulated events from
someone's directory, but they can't tell me the details on the
conditions used for the simulations so I cannot be sure if I
can use them.</div>
<div>4) The detector has evolved to use new systems and I need
different codes to look at simulated (and real) events from
these two periods.</div>
<div>5) I've lost track of how the hits relate to the
simulation.</div>
<div>6) Sometimes the hit list for a reconstructed track
includes a few noise hits that were pretty much aligned with
the simulated track. Do I consider this track as
reconstructed or not?</div>
<div>7) The simulation is too good. How should I smear out the
simulated data to match the true detector performance? Is
there a standard architecture for accomplishing this?</div>
<div>8) My friend wrote some code to apply fine
corrections/calibrations to the data. I don't know whether I
should or should not do this to simulated data. </div>
<div>9) Now that we're a couple of years in, I realize that to
analyze my old dataset, I need some hybrid of code:
newer versions in some places, but the old standard stuff in
others. What do I do?</div>
<div>10) Are we (as a collaboration) convinced that all the
recent changes are correct? Do we have a system for tracking
performance instead of just lines of code changed (e.g.
benchmark code that double-checks the impact of offline changes
on simulations by generating performance-metric runs nightly)?</div>
<div><br>
</div>
<div>In my opinion, this long list actually requires only a few
design paradigms (the fewer the better) to address it and
avoid these issues by design. One example of a design
paradigm is that the simulated output is required
(administrative rule) to be self-describing. We then define
what it means to self-describe in terms of the information
content (e.g. events, background, geometry, calibration
assumptions, ...) and designate a (1) SIMPLE, (2) Universal,
(3) Extensible format by which we will incorporate
self-description into our files. Another example of a design
paradigm is portability. We'll have to define portability
(even to your laptop?) and then architect a solution. PHENIX
did poorly in this by not distinguishing between "core
routines" (that open ANY file and then understand/unpack its
standard content) and "all other code" (DAQ, online
monitoring, simulation, reconstruction, analysis). </div>
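<div><br>
</div>
<div>To make the self-describing paradigm concrete, here is a
minimal sketch in Python. Everything in it is an assumption for
illustration only: the JSON container, the block names in
REQUIRED_BLOCKS, and the two function names are hypothetical, and
a real format would still have to be chosen by the collaboration.
The only point is that the reader refuses any file that does not
carry its own conditions.</div>
<pre>
```python
import json

# Hypothetical list of blocks a file must carry to count as self-describing.
REQUIRED_BLOCKS = ("geometry", "calibration", "background", "events")


def write_self_describing(path, geometry, calibration, background, events):
    """Write events together with the conditions used to produce them."""
    payload = {
        "geometry": geometry,
        "calibration": calibration,
        "background": background,
        "events": events,
    }
    with open(path, "w") as handle:
        json.dump(payload, handle)


def read_self_describing(path):
    """Load a file and refuse it if any required block is missing."""
    with open(path) as handle:
        payload = json.load(handle)
    missing = [name for name in REQUIRED_BLOCKS if name not in payload]
    if missing:
        raise ValueError("not self-describing, missing: " + ", ".join(missing))
    return payload
```
</pre>
<div>With a check like this in the "core routines", a set of
simulated events picked up from someone's directory either
documents its own conditions or fails to open at all.</div>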
<div><br>
</div>
<div>Here is an example set of a few rules that can accomplish a
lot:</div>
<div>(1) Although there are many formats for input information
and output information, only one "in memory" format can be
allowed for any single piece of information. Input translates
to this and output translates from this.</div>
<div>(2) All "in memory" formats for geometry, calibration, and
the like will be built with streamers so that these have the
*option* of being included in our output files. The intention
is that an output file containing the streamed geometry and
calibration would be completely self-describing and thereby
require NO reference to external resources (static files,
databases, etc.) when used by others.</div>
<div>(3) The "in memory" formats will explicitly be coded to
allow for schema evolution so that old files are compatible
with new code.</div>
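<div><br>
</div>
<div>Rules (1) and (3) above can be sketched together in a few
lines of Python. This is only an illustration under stated
assumptions, not a streamer implementation: the Calibration class,
its fields, the version numbers, and the default temperature are
all hypothetical. Every stored record is translated into the one
"in memory" form, and old records are upgraded on the way in.</div>
<pre>
```python
from dataclasses import dataclass

CURRENT_SCHEMA = 2  # hypothetical version number of the on-disk layout


@dataclass
class Calibration:
    """The single in-memory form; every reader converts to this."""
    gain: float
    pedestal: float
    temperature_c: float  # field added in hypothetical schema version 2


def from_record(record):
    """Translate a stored record of any known schema version into memory."""
    version = record.get("schema", 1)
    if version == 1:
        # Version 1 records predate the temperature field; use a default.
        return Calibration(record["gain"], record["pedestal"], 20.0)
    if version == 2:
        return Calibration(record["gain"], record["pedestal"],
                           record["temperature_c"])
    raise ValueError("unknown schema version: " + str(version))


def to_record(cal):
    """Translate the in-memory form to the current on-disk schema."""
    return {"schema": CURRENT_SCHEMA, "gain": cal.gain,
            "pedestal": cal.pedestal, "temperature_c": cal.temperature_c}
```
</pre>
<div>Analysis code then only ever sees Calibration objects, so a
new on-disk layout means one new branch in from_record rather
than changes scattered through the reconstruction.</div>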
<div><br>
</div>
<div>Tools like GEMC are, in my opinion, small internal engines
that fit within our over-arching framework, as indicated by
Rakitha. To me, these are all basically the same and I have
literally no preference. The key is how we fit such tools
into an overall picture of the COMPLETE computing task that
will lie before us.</div>
<div><br>
</div>
<div>Cheers!</div>
<div>Tom</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, Mar 31, 2015 at 2:30 PM,
Richard S. Holmes <span dir="ltr">&lt;<a
moz-do-not-send="true" href="mailto:rsholmes@syr.edu"
target="_blank">rsholmes@syr.edu</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra"><span class=""><br>
<div class="gmail_quote">On Tue, Mar 31, 2015 at 2:04
PM, Zhiwen Zhao <span dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:zwzhao@jlab.org" target="_blank">zwzhao@jlab.org</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="overflow:hidden">For the output tree
format, I think we definitely need a tree or a
bank containing the event info.<br>
But the critical thing is that each sub-detector
should be able to define it easily and freely.<br>
They can have very different truth info and
digitization, and the requirements can be
different at<br>
different stages of study.</div>
</blockquote>
</div>
<br>
</span>Certainly. But it would be better to define a
good, general format for detector information which
different detector groups can use as a baseline and, if
needed, modify to suit their needs than to have them
re-invent the wheel for each detector in inconsistent
ways. (At the moment, for instance, different banks use
different names to refer to the same quantity.)</div>
<div class="gmail_extra"><br>
</div>
<div class="gmail_extra">Event, primary, and trajectory
data can, I think, be structured the same for hits in
all detectors.</div>
<div class="gmail_extra"><br>
</div>
<div class="gmail_extra">This reminds me: I'd like to see
some degree of standardization of volume IDs. Right now
it's anarchy.<span class=""><br>
<div><br>
</div>
-- <br>
<div>- Richard S. Holmes<br>
Physics Department<br>
Syracuse University<br>
Syracuse, NY 13244<br>
<a moz-do-not-send="true" href="tel:315-443-5977"
value="+13154435977" target="_blank">315-443-5977</a><br>
</div>
</span></div>
</div>
<br>
_______________________________________________<br>
Halla12_software mailing list<br>
<a moz-do-not-send="true"
href="mailto:Halla12_software@jlab.org">Halla12_software@jlab.org</a><br>
<a moz-do-not-send="true"
href="https://mailman.jlab.org/mailman/listinfo/halla12_software"
target="_blank">https://mailman.jlab.org/mailman/listinfo/halla12_software</a><br>
<br>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Seamus Riordan, Ph.D. <a class="moz-txt-link-abbreviated" href="mailto:seamus.riordan@stonybrook.edu">seamus.riordan@stonybrook.edu</a>
Stony Brook University Office: (631) 632-4069
Research Assistant Professor Fax: (631) 632-8573
A-103 Department of Physics
6999 SUNY
Stony Brook NY 11974-3800
</pre>
</body>
</html>