Table 2. Author Checklist Part I. Recommendations for data in publications.
Data mentioned in a publication should:
1. Be available in a shared community repository, so anyone can access it
2. Include basic metadata, so others can search and understand its contents
3. Have a license, so anyone can understand the conditions for reuse of the data
4. Have an associated digital object identifier (DOI) or persistent URL (PURL), so that the data is available permanently
5. Be cited properly in the prose and listed accurately among the references, so readers can identify the datasets unequivocally and data creators can receive credit for their work
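The five recommendations above can be treated as a simple machine-readable check on a dataset record. The sketch below is purely illustrative: the field names, citation format, and the DOI are our own assumptions, not a prescribed schema.

```python
# Illustrative sketch: a dataset record satisfying the Table 2 recommendations.
# Field names and citation format are hypothetical, not a standard schema.

def check_data_record(record):
    """Return the Table 2 recommendations the record does not yet meet."""
    required = ["repository", "metadata", "license", "identifier", "citation"]
    return [field for field in required if not record.get(field)]

def format_citation(record):
    """Build a simple data citation (creator, year, title, repository, DOI)."""
    m = record["metadata"]
    return (f"{m['creator']} ({m['year']}). {m['title']} [Data set]. "
            f"{record['repository']}. {record['identifier']}")

record = {
    "repository": "Zenodo",
    "metadata": {"creator": "Doe, J.", "year": 2017, "title": "Example corpus"},
    "license": "CC BY 4.0",
    "identifier": "https://doi.org/10.5281/zenodo.0000000",  # placeholder DOI
    "citation": True,  # cited in the prose and listed in the references
}

missing = check_data_record(record)      # [] when all recommendations are met
citation = format_citation(record)
```

A record that omits, say, the license or identifier would show up in `missing`, making the checklist mechanically verifiable before submission.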
nal and independent researchers with the pressure to publish, and it is easy to see how this situation can lead to research being documented less rigorously.
However, by following the recommendations given
here, authors can increase the trustworthiness and
reproducibility of research results with relatively little effort. Still, changes cannot be expected from individual researchers alone. The research community, funding sponsors, employers of researchers,
and publishers should each, in their respective roles,
incentivize and reward reproducible research.
Best Practices and Recommendations
The recommendations we introduce are based on
best practices put forward by scientific organizations
such as the Research Data Alliance;1 the Federation of
Earth Science Information Partners;2 DataCite;3 the
National Research Council (2012); the Task Group on
Data Citation Standards and Practices (2013); the
Data Citation Synthesis Group (2014); and scholars
such as Ball and Duke,4 Wilkinson et al. (2016), Stodden et al. (2016), Gil et al. (2016), Nosek et al. (2015),
Starr et al. (2015), Downs et al. (2015), Mooney and
Newton (2012), Goodman et al. (2014), Garijo et al.
(2013), and Altman and King (2007), as well as earth
and space science publishers5 (Hanson et al. 2015).
Strong momentum is building in support of FAIR
practices, that is, to make data findable, accessible,
interoperable, and reusable (Wilkinson et al. 2016).
Our recommendations support FAIR principles and
extend them to promote reproducible research, open
science, and digital scholarship.
Implementing these recommendations requires
extra space in publications. We suggest including this
additional content in appendices that technical
reviewers will not be required to assess but can quickly check. For electronic publications, no space limitations should be imposed on such appendices.
When these recommendations cannot be met, a brief explanation of the reasons should be included. Possible reasons include restricted access (for example, proprietary or sensitive data), ownership by close collaborators who do not wish to disclose certain details, inadequate resources (for example, to house large datasets), or an unreasonable burden on the authors.
We begin with recommendations for data and
source code as the basic ingredients of a computational experiment. Then we describe recommendations to document AI methods and the experiments
themselves. If all recommendations for AI methods
(table 4) are implemented, then the publication
should in theory be R3 (method reproducible), while
if all recommendations for data (table 2) are also
implemented, then the research should be R2 (data
reproducible). Finally, all four sets of recommendations (tables 2–5) must be implemented for the
research to be fully R1 (experiment reproducible).
We will refer to the complete set of 20 recommendations as an author checklist. We provide examples to demonstrate that they are synergistic, and we argue that they can be easily implemented.
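The mapping from recommendation sets to reproducibility degrees described above can be sketched as a small function. This is our own illustrative encoding of the text, not part of the checklist itself; the function name and table numbering follow the article's tables 2–5.

```python
# Illustrative sketch of the degrees described in the text: satisfying the
# AI-method recommendations (table 4) yields R3 (method reproducible);
# also satisfying the data recommendations (table 2) yields R2 (data
# reproducible); satisfying all four sets (tables 2-5) yields R1
# (experiment reproducible). The function itself is hypothetical.

def reproducibility_degree(tables_satisfied):
    """Map the set of fully satisfied recommendation tables (2-5) to a degree."""
    satisfied = set(tables_satisfied)
    if satisfied >= {2, 3, 4, 5}:
        return "R1"  # experiment reproducible
    if satisfied >= {2, 4}:
        return "R2"  # data reproducible
    if 4 in satisfied:
        return "R3"  # method reproducible
    return "not reproducible from the publication alone"
```

For example, a paper that documents its AI methods but not its data would rate R3, while one that also meets all data recommendations would rate R2.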
Recommendations for Data
Table 2 summarizes our recommendations for documenting data, which concern (1) repository use, (2) metadata, (3) licenses, (4) persistent unique identifiers, and (5) citations. These recommendations can be easily implemented if researchers use community data repositories that support recommended best practices.
Data repositories exist for many domains, and as
such they are available to the AI community. Examples of these general repositories are Zenodo,6
figshare,7 and Dataverse.8 These repositories will
automatically assign a DOI to any uploaded data and
will also accept software, figures, movies, and slide