[Halld-offline] Data Management Plans

Matthew Shepherd mashephe at indiana.edu
Thu Jul 31 11:58:33 EDT 2014


I think we need to define what "data" means in this case.
NSAC was briefed on this last summer and, as far as my
understanding goes, digital data can mean anything from
making the plots digitally available in PDF to providing
raw four-vectors, software, etc. to do a complete analysis.

I noticed that the third principle of the policy states:

"Not all data need to be shared or preserved. The costs and 
benefits of doing so should be considered in data 
management planning."

I think a good strategy to start with is to make available
in tabular form any data points, error bars, etc. that
appear in papers in refereed journals.
This seems to be the optimal cost/benefit point, as others
may want to do fits to cross sections,
extracted scattering amplitudes, etc. from GlueX data.
It is more useful than just a plot, but not as cumbersome
as making the raw data available.  I don't think the
raw data would be of much use to most people anyway
without the insider experience to analyze it.

In this case the DOI would reference tabular data
that was relevant for plots in a particular journal article
(not terabytes of actual and simulated data).
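
As a rough illustration of what such a table might look like, here is a
minimal Python sketch that writes data points and error bars to a CSV
file tagged with a DOI and reads them back. The column names, the DOI
string, and the numbers are all placeholders that I made up for the
example; nothing here is an agreed format or a GlueX result.

```python
import csv
import io

# Hypothetical column layout; no format has been agreed on.
COLUMNS = ["x", "y", "stat_err", "syst_err"]


def write_table(points, doi, out):
    """Write data points as CSV, with the DOI recorded in a comment header."""
    out.write(f"# doi: {doi}\n")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(COLUMNS)
    for p in points:
        writer.writerow([p[c] for c in COLUMNS])


def read_table(src):
    """Read the table back as a list of dicts of floats, skipping comments."""
    rows = [line for line in src if not line.startswith("#")]
    return [{k: float(v) for k, v in row.items()} for row in csv.DictReader(rows)]


# Placeholder numbers only, not measured values.
example_points = [
    {"x": 1.0, "y": 10.0, "stat_err": 0.5, "syst_err": 0.2},
    {"x": 2.0, "y": 8.0, "stat_err": 0.4, "syst_err": 0.2},
]

buf = io.StringIO()
write_table(example_points, "10.0000/placeholder", buf)  # placeholder DOI
table_text = buf.getvalue()
round_trip = read_table(io.StringIO(table_text))
```

A plain-text table like this is small enough to sit behind a landing
page, and anyone can load it into a fitting program without our software.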

Of course this needs more discussion and we need
to make sure that what we do is compliant with the
policy, but I'd argue for thinking about what is most
logical and useful first and see if it fits the guidelines
rather than trying to glean specific guidance from
what is a very general policy.

Matt

---------------------------------------------------------------------
Matthew Shepherd, Associate Professor
Department of Physics, Indiana University, Swain West 265
727 East Third Street, Bloomington, IN 47405

Office Phone:  +1 812 856 5808

On Jul 28, 2014, at 3:57 PM, Curtis A. Meyer <cmeyer at cmu.edu> wrote:

> It would be useful if someone could explain this to us. I have read through a number of the pages, but it is
> not real obvious what we will need to do. In particular, does the fact that we give a data set a DOI mean that
> anyone has access to it? Does it mean that we must maintain it in the designated site forever? 
> 
> Curtis
> 
> 
> ---------
> Curtis A. Meyer			MCS Associate Dean for Faculty and Graduate Affairs
> Wean:    (412) 268-2745	Professor of Physics
> Doherty: (412) 268-3090	Carnegie Mellon University
> Fax:         (412) 681-0648	Pittsburgh, PA 15213
> curtis.meyer at cmu.edu	http://www.curtismeyer.com/
> 
> 
> 
> On Jul 28, 2014, at 1:44 PM, Mark M. Ito <marki at jlab.org> wrote:
> 
>> From Chip Watson, FYI
>> 
>> 
>> -------- Original Message --------
>> Subject:	Data Management Plans
>> Date:	Mon, 28 Jul 2014 11:20:18 -0400
>> From:	Chip Watson <watson at jlab.org>
>> To:	Mark Ito <marki at jlab.org>, David Lawrence <davidl at jlab.org>, Graham Heyes <heyes at jlab.org>, Sandy Philpott <sandy.philpott at jlab.org>, Kari Heffner <heffner at jlab.org>
>> 
>> All,
>> 
>> Effective Oct 1, all new proposals for funding to NP will require Data Management Plans (DMP) that meet a set of requirements.  Here is the announcement and base document:
>> 
>>     http://science.energy.gov/funding-opportunities/digital-data-management/
>> 
>> This document references additional documents (tree-like), and the following web pages should all be read:
>> 
>>     NP:    http://science.energy.gov/np/funding-opportunities/digital-data-management/
>>     (this document references JLab's document)
>> 
>>     http://science.energy.gov/funding-opportunities/digital-data-management/suggested-elements-for-a-dmp/
>> 
>>     http://science.energy.gov/funding-opportunities/digital-data-management/faqs/
>> 
>> One component not yet addressed in the JLab / GlueX data management plans is providing unique identifiers to particular data sets, and how to find and download data of interest.  So far we have documented principles, but not details on how to get to the data and how to reference particular data.  I believe that this scope must eventually be addressed in the JLab/IT and /GlueX documents.  DOE is offering a Digital Object Identifier service:
>> 
>>     https://www.osti.gov/elink/aboutDataIDService.jsp
>> 
>> The OSTI web page suggests this could point to a landing page with additional links to data components.  So, we need to plan how we want to manage DOI's and landing pages, and what ability to download data we want to grant to outsiders (non-JLab users, presumably).
>> 
>> Since GlueX is up first, I would like you to take the lead in working through this one.  Everyone on this email distribution can have a role to play and should think about a path forward.  I suggest that we have a joint meeting later this month when everyone is available.   Please reply or reply/all to suggest a replacement if you would like to delegate this to someone else or don't feel you are the appropriate person, and perhaps suggest which week in August would be good for you.
>> 
>> Chip
>> 
>> 
>> 
>> 
>> _______________________________________________
>> Halld-offline mailing list
>> Halld-offline at jlab.org
>> https://mailman.jlab.org/mailman/listinfo/halld-offline