[Halld-offline] Data Management Plans

Curtis A. Meyer cmeyer at cmu.edu
Thu Jul 31 12:18:18 EDT 2014


Unfortunately, it is a bit more than just the four-vectors. There also needs to be a release of matching
Monte Carlo data to be able to use the data.

curtis
---------
Curtis A. Meyer			MCS Associate Dean for Faculty and Graduate Affairs
Wean:    (412) 268-2745	Professor of Physics
Doherty: (412) 268-3090	Carnegie Mellon University
Fax:         (412) 681-0648	Pittsburgh, PA 15213
curtis.meyer at cmu.edu	http://www.curtismeyer.com/



On Jul 31, 2014, at 12:02 PM, Chip Watson <watson at jlab.org> wrote:

> Concur, we don't want to generate unnecessary work for ourselves.  I would suggest that just the data in a plot is insufficient.  For example, if what is shown is a histogram (1D) which is the result of statistical analysis of 20GB of four-vectors, then provide the 20GB file so others can try their own cuts on the data.  So probably not the raw data, but probably the N-tuples.  Yes, needs discussion and then specificity to satisfy the directive.
> 
> On 7/31/14 11:58 AM, Matthew Shepherd wrote:
>> I think we need to define what "data" means in this case.
>> NSAC was briefed on this last summer and, as far as my
>> understanding goes, digital data can mean anything from
>> making the plots digitally available in PDF to providing
>> raw four-vectors, software, etc. to do a complete analysis.
>> 
>> I noticed that the third principle of the policy states:
>> 
>> "Not all data need to be shared or preserved. The costs and 
>> benefits of doing so should be considered in data 
>> management planning."
>> 
>> I think a good strategy to start with is to make available
>> in tabular form any data points, error bars, etc. that
>> appear in papers in refereed journals.  
>> This seems to be the optimal cost/benefit point as others 
>> may want to do fits to cross sections,
>> extracted scattering amplitudes, etc. from GlueX data.
>> It is more useful than just a plot, but not as cumbersome
>> as making the raw data available.  I don't think the
>> raw data would be of much use to most people anyway
>> without the insider experience to analyze it.
>> 
>> In this case the DOI would reference tabular data
>> that was relevant for plots in a particular journal article
>> (not terabytes of actual and simulated data).
>> 
>> Of course this needs more discussion and we need
>> to make sure that what we do is compliant with the
>> policy, but I'd argue for thinking about what is most
>> logical and useful first and see if it fits the guidelines
>> rather than trying to glean specific guidance from
>> what is a very general policy.
>> 
>> Matt
> 
> _______________________________________________
> Halld-offline mailing list
> Halld-offline at jlab.org
> https://mailman.jlab.org/mailman/listinfo/halld-offline
