Dataset Identification:
Resource Abstract:
- description: <p>United States agricultural researchers have many options for making their data available online. This
dataset aggregates the primary sources of ag-related data and determines where researchers are likely to deposit their agricultural
data. These data serve as both a current landscape analysis and also as a baseline for future studies of ag research data.</p>
<h3>Purpose</h3> <p>As sources of agricultural data become more numerous and disparate, and collaboration
and open data become more expected if not required, this research provides a landscape inventory of online sources of open
agricultural data.</p> <p>An inventory of current agricultural data sharing options will help assess how the <a
href="https://data.nal.usda.gov">Ag Data Commons</a>, a platform for USDA-funded data cataloging and publication,
can best support data-intensive and multi-disciplinary research. It will also help agricultural librarians assist their researchers
in data management and publication. The goals of this study were to</p> <ul> <li>establish where agricultural
researchers in the United States (land grant and USDA researchers, primarily ARS, NRCS, USFS and other agencies) currently
publish their data, including general research data repositories, domain-specific databases, and the top journals</li>
<li>compare how much data is in institutional vs. domain-specific vs. federal platforms</li> <li>determine
which repositories are recommended by top journals that require or recommend the publication of supporting data</li>
<li>ascertain where researchers not affiliated with funding or initiatives possessing a designated open data repository
can publish data</li> </ul> <h3>Approach</h3> <p>The National Agricultural Library team focused
on Agricultural Research Service (ARS), Natural Resources Conservation Service (NRCS), and United States Forest Service (USFS)
style research data, rather than ag economics, statistics, and social sciences data. To find domain-specific, general, institutional,
and federal agency repositories and databases that are open to US research submissions and have some amount of ag data, resources
including re3data, LibGuides, and ARS lists were analysed. Databases focused primarily on environmental or public health data were not included,
but places where ag grantees would publish data were considered.</p> <h3>Search methods</h3> <p>We
first compiled a list of known domain-specific USDA / ARS datasets / databases that are represented in the Ag Data Commons,
including ARS Image Gallery, ARS Nutrition Databases (sub-components), SoyBase, PeanutBase, National Fungus Collection, i5K
Workspace @ NAL, and GRIN. We then used search engines such as Bing and Google to look for non-USDA / federal ag databases,
using Boolean variations of agricultural data / ag data / scientific data + NOT + USDA (to filter out the federal / USDA results).
Most of these results were domain-specific, though some contained a mix of data subjects.</p> <p>We then used
search engines such as Bing and Google to find top agricultural university repositories using variations of agriculture, ag
data and university to find schools with agriculture programs. Using that list of universities, we searched each university
web site to see if their institution had a repository for their unique, independent research data if not apparent in the initial
web browser search. We found both ag-specific university repositories and general university repositories that housed a portion
of agricultural data. Ag-specific university repositories are included in the list of domain-specific repositories. Results
included Columbia University International Research Institute for Climate and Society, UC Davis Cover Crops Database, etc.
If a general university repository existed, we determined whether that repository could filter to include only data results
after our chosen ag search terms were applied. General university databases that contain ag data included Colorado State University
Digital Collections, University of Michigan ICPSR (Inter-university Consortium for Political and Social Research), and University
of Minnesota DRUM (Digital Repository of the University of Minnesota). We then split out NCBI (National Center for Biotechnology
Information) repositories.</p> <p>Next we searched the internet for open general data repositories using a variety
of search engines, and repositories containing a mix of data, journals, books, and other types of records were tested to determine
whether that repository could filter for data results after search terms were applied. General subject data repositories include
Figshare, Open Science Framework, PANGAEA, Protein Data Bank, and Zenodo.</p> <p>Finally, we compared scholarly
journal suggestions for data repositories against our list to fill in any missing repositories that might contain agricultural
data. We compiled extensive lists of journals in which USDA employees published in 2012 and 2016, combining search results from ARIS,
Scopus, and the Forest Service's TreeSearch with the web sites of the USDA's Economic Research Service (ERS), National Agricultural
Statistics Service (NASS), Natural Resources Conservation Service (NRCS), Food and Nutrition Service (FNS), Rural Development
(RD), and Agricultural Marketing Service (AMS). The top 50 journals' author instructions were consulted to see if they
(a) ask or require submitters to provide supplemental data, or (b) require submitters to submit data to open repositories.</p>
<p>Data are provided for journals based on a 2012 and 2016 study of where USDA employees publish their research studies,
ranked by number of articles. The fields include 2015/2016 Impact Factor, Author guidelines, Supplemental Data?, Supplemental Data
reviewed?, Open Data (Supplemental or in Repository) Required?, and Recommended data repositories, as provided in the online
author guidelines for each of the top 50 journals.</p> <h3>Evaluation</h3> <p>We ran a series of searches
on all resulting general subject databases with the designated search terms. From the results, we noted the total number of
datasets in the repository, the type of resource searched (datasets, data, images, components, etc.), the percentage of the total
database that each term comprised, any database in which a search term comprised at least 1% or at least 5% of the total collection,
and any search term that returned more than 100 or more than 500 results.</p> <p>We compared domain-specific
databases and repositories based on parent organization, type of institution, and whether data submissions were dependent
on conditions such as funding or affiliation of some kind.</p> <h3>Results</h3> <p>A summary of the
major findings from our data review:</p> <ul> <li>Over half of the top 50 ag-related journals from our profile
require or encourage open data for their published authors.</li> <li>Few general repositories are
both large and contain a significant portion of ag data in their collection. GBIF (Global Biodiversity Information Facility),
ICPSR, and ORNL DAAC were among those that had over 500 datasets returned with at least one ag search term and had that result
comprise at least 5% of the total collection.</li> <li>Fewer than one quarter of the domain-specific repositories
and datasets reviewed allow open submission by any researcher regardless of funding or affiliation.</li> </ul>
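The evaluation criteria described in the abstract above (the share of a collection matched by a search term, checked against 1% and 5% thresholds, and result counts above 100 and above 500) can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the procedure the study team actually used: the function name, the field names, and the sample figures are hypothetical and do not come from the dataset.

```python
from dataclasses import dataclass


@dataclass
class TermResult:
    """Summary of one search term run against one repository (hypothetical layout)."""
    term: str
    hits: int
    share_pct: float       # hits as a percentage of the whole collection
    share_ge_1pct: bool    # term comprises at least 1% of the collection
    share_ge_5pct: bool    # term comprises at least 5% of the collection
    hits_gt_100: bool      # term returned more than 100 results
    hits_gt_500: bool      # term returned more than 500 results


def evaluate_term(term: str, hits: int, total_records: int) -> TermResult:
    """Apply the thresholds named in the Evaluation section to one search result."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    share = 100.0 * hits / total_records
    return TermResult(
        term=term,
        hits=hits,
        share_pct=share,
        share_ge_1pct=share >= 1.0,
        share_ge_5pct=share >= 5.0,
        hits_gt_100=hits > 100,
        hits_gt_500=hits > 500,
    )


if __name__ == "__main__":
    # Invented numbers for illustration only; not values from the study.
    for term, hits in [("soil", 600), ("maize", 50)]:
        print(evaluate_term(term, hits, total_records=10_000))
```

Under these assumptions, a repository would meet the "large and significant" criterion reported in the Results when at least one term has both `hits_gt_500` and `share_ge_5pct` set.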
Citation
- Title Inventory of online public databases and repositories holding agricultural data in 2017.
-
- creation Date
2018-03-01T05:58:06.128453
Resource language:
Processing environment:
Metadata date stamp:
2018-08-06T19:39:49Z
Resource Maintenance Information
- maintenance or update frequency:
- notes: This metadata record was generated by an xslt transformation from a dc metadata record; Transform by Stephen M. Richard, based
on a transform by Damian Ulbricht. Run on 2018-08-06T19:39:49Z
Metadata contact
-
pointOfContact
- organisation Name
CINERGI Metadata catalog
-
- Contact information
-
-
- Address
-
- electronic Mail Address cinergi@sdsc.edu
Metadata language
eng
Metadata character set encoding:
utf8
Metadata standard for this record:
ISO 19139 Geographic Information - Metadata - Implementation Specification
standard version:
2007
Metadata record identifier:
urn:dciso:metadataabout:be168ca7-07d5-4795-ae62-6c8e8c4a1cde
Metadata record format is ISO19139 XML (MD_Metadata)