[Halla12_software] SoLID Software Meeting on Thursday at 2pm 03/26 Room-B101
Seamus Riordan
riordan at jlab.org
Tue Mar 31 16:10:15 EDT 2015
Hi everyone,
I appreciate everyone taking the initiative to lay out these ideas. My
plan for Thursday is to go through these carefully and try to organize
them into a list of issues and preferences with what I started to
present last week.
Best,
Seamus
On 03/31/2015 04:01 PM, Thomas K Hemmick wrote:
> Hi everyone
>
> I like the fact that the scope of the discussion has shifted to the
> over-arching issues. I agree with both Zhiwen that flexibility is
> critical and with Richard that there is such a thing as "enough rope
> to hang yourself". We'll need judgement to decide the right number of
> rules since too many and too few are both bad.
>
> Although my favorite means of enforcing rules is via the compiler
> (...you WILL implement ALL the purely virtual methods...), there is
> typically not enough oomph in that area to make all our designs
> self-fulfilling.
>
> Another way to approach this same discussion is to list the things
> that can (will) go wrong if we don't address them at the architecture
> level. Here is a short list from me:
>
> 1) The simulation geometry is different from the real geometry.
> 2) The offline code applies corrections that fix real data but ruin
> simulated data.
> 3) I just got a whole set of simulated events from someone's
> directory, but they can't tell me the details on the conditions used
> for the simulations so I cannot be sure if I can use them.
> 4) The detector has evolved to use new systems and I need different
> codes to look at simulated (and real) events from these two periods.
> 5) I've lost track of how the hits relate to the simulation.
> 6) Sometimes the hit list for a reconstructed track includes a few
> noise hits that were pretty much aligned with the simulated track. Do
> I consider this track as reconstructed or not?
> 7) The simulation is too good. How should I smear out the simulated
> data to match the true detector performance? Is there a standard
> architecture for accomplishing this?
> 8) My friend wrote some code to apply fine corrections/calibrations
> to the data. I don't know whether I should or should not do this to
> simulated data.
> 9) Now that we're a couple of years in, I realize that to analyze my
> old dataset, I need some form of hybrid code: newer versions in some
> places, but the old standard code in others. What do I do?
> 10) Are we (as a collaboration) convinced that all the recent changes
> are correct? Do we have a system for tracking performance instead of
> just lines of code changed (e.g. benchmark code that double-checks the
> impact of offline changes on simulations via nightly runs that
> generate performance metrics)?
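Item 7 in the list above (smearing simulation to match real detector performance) is commonly handled as a Gaussian convolution of the true simulated quantity. A minimal sketch in C++; the 5%/sqrt(E) resolution below is a made-up placeholder for illustration, not a SoLID number:

```cpp
#include <cmath>
#include <random>
#include <vector>

// Smear true simulated energies with a Gaussian detector response.
// The 5%/sqrt(E) stochastic term is an invented placeholder; a real
// analysis would take the resolution from test-beam or cosmic data.
std::vector<double> smear_energies(const std::vector<double>& trueE,
                                   unsigned seed = 12345) {
    std::mt19937 rng(seed);
    std::vector<double> out;
    out.reserve(trueE.size());
    for (double e : trueE) {
        double sigma = 0.05 * std::sqrt(e);  // resolution at energy e
        std::normal_distribution<double> gauss(e, sigma);
        out.push_back(gauss(rng));
    }
    return out;
}
```

A standard architecture for this would keep the smearing step separate from the event generation, so the same unsmeared truth can be re-smeared as the detector understanding improves.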
>
> In my opinion, this long list actually requires only a few design
> paradigms (the fewer the better) to address it and avoid these issues
> by design. One example of a design paradigm is that the simulated
> output is required (administrative rule) to be self describing. We
> then define what it means to self-describe in terms of the information
> content (e.g. events, background, geometry, calibration assumptions,
> ...) and designate a (1) SIMPLE, (2) Universal, (3) Extensible format
> by which we will incorporate self-description into our files. Another
> example of a design paradigm is portability. We'll have to define
> portability (even to your laptop?) and then architect a solution.
> PHENIX did poorly in this by not distinguishing between "core
> routines" (that open ANY file and then understand/unpack its standard
> content) and "all other code" (daq, online monitoring, simulation,
> reconstruction, analysis).
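The self-description paradigm can be sketched very simply: the writer embeds the configuration it actually used (geometry tag, calibration tag, and so on) in the file itself, and the reader recovers it without consulting any external resource. A toy sketch in plain C++; the key names are invented for illustration, and in practice this would ride on the output format's native metadata support (e.g. ROOT streamers):

```cpp
#include <map>
#include <sstream>
#include <string>

// Toy self-describing "file": a metadata header followed by event data.
// The key names below are illustrative, not a proposed SoLID schema.
std::string write_file(const std::map<std::string, std::string>& meta,
                       const std::string& events) {
    std::ostringstream out;
    out << "#meta " << meta.size() << "\n";
    for (const auto& kv : meta)
        out << kv.first << "=" << kv.second << "\n";
    out << "#events\n" << events;
    return out.str();
}

// The reader needs NO external geometry/calibration resources:
// everything it must know travels inside the file's own header.
std::map<std::string, std::string> read_meta(const std::string& file) {
    std::istringstream in(file);
    std::string tag;
    std::size_t n = 0;
    in >> tag >> n;   // "#meta <count>"
    in.ignore();      // skip the trailing newline
    std::map<std::string, std::string> meta;
    for (std::size_t i = 0; i < n; ++i) {
        std::string line;
        std::getline(in, line);
        auto eq = line.find('=');
        meta[line.substr(0, eq)] = line.substr(eq + 1);
    }
    return meta;
}
```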
>
> Here is an example set of few rules that can accomplish a lot:
> (1) Although there are many formats for input information and output
> information, only one "in memory" format can be allowed for any single
> piece of information. Input translates to this and output translates
> from this.
> (2) All "in memory" formats for geometry, calibration, and the like
> will be built with streamers so that these have the *option* of being
> included in our output files. The intention is that an output file
> containing the streamed geometry and calibration would be completely
> self describing and thereby require NO reference to external resources
> (static files, databases, etc...) when used by others.
> (3) The "in memory" formats will explicitly be coded to allow for
> schema evolution so that old files are compatible with new code.
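Rule (3) in miniature: if every in-memory record carries an explicit version number, a new reader can still consume old files by giving newly added fields default values. ROOT's streamers provide this automatically via its schema-evolution machinery; the hand-rolled sketch below just shows the idea, with invented field names:

```cpp
#include <sstream>
#include <string>

// A detector hit as of schema version 2; version 1 lacked 'time'.
// Field names are illustrative only.
struct Hit {
    int    channel = 0;
    double energy  = 0.0;
    double time    = 0.0;  // added in v2
};

// Serialize with an explicit schema version tag.
std::string write_hit(const Hit& h, int version = 2) {
    std::ostringstream out;
    out << version << ' ' << h.channel << ' ' << h.energy;
    if (version >= 2) out << ' ' << h.time;
    return out.str();
}

// A v2 reader that still understands v1 records: the missing field
// keeps its default, so old files remain usable with new code.
Hit read_hit(const std::string& record) {
    std::istringstream in(record);
    int version = 0;
    Hit h;
    in >> version >> h.channel >> h.energy;
    if (version >= 2) in >> h.time;  // absent in v1: default stays 0.0
    return h;
}
```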
>
> Tools like GEMC are, in my opinion, small internal engines that fit
> within our over-arching framework, as indicated by Rakitha. To me,
> these are all basically the same and I have literally no preference.
> The key is how we fit such tools into an overall picture of the
> COMPLETE computing task that will lie before us.
>
> Cheers!
> Tom
>
> On Tue, Mar 31, 2015 at 2:30 PM, Richard S. Holmes
> <rsholmes at syr.edu> wrote:
>
>
> On Tue, Mar 31, 2015 at 2:04 PM, Zhiwen Zhao
> <zwzhao at jlab.org> wrote:
>
> For the output tree format, I think we definitely need a tree
> or a bank containing the event info.
> But the critical thing is that each sub-detector should be
> able to define it easily and freely.
> They can have very different truth info and digitization, and
> the requirements can differ at different stages of study.
>
>
> Certainly. But it would be better to define a good, general format
> for detector information which different detector groups can use
> as a baseline and, if needed, modify to suit their needs than to
> have them re-invent the wheel for each detector in inconsistent
> ways. (At the moment, for instance, different banks use different
> names to refer to the same quantity.)
>
> Event, primary, and trajectory data can, I think, be structured
> the same for hits in all detectors.
>
> This reminds me: I'd like to see some degree of standardization of
> volume IDs. Right now it's anarchy.
>
> --
> - Richard S. Holmes
> Physics Department
> Syracuse University
> Syracuse, NY 13244
> 315-443-5977
>
> _______________________________________________
> Halla12_software mailing list
> Halla12_software at jlab.org
> https://mailman.jlab.org/mailman/listinfo/halla12_software
>
>
>
>
--
Seamus Riordan, Ph.D. seamus.riordan at stonybrook.edu
Stony Brook University Office: (631) 632-4069
Research Assistant Professor Fax: (631) 632-8573
A-103 Department of Physics
6999 SUNY
Stony Brook NY 11974-3800