which the source code was written, or until their
draft is published. Many journals, such as Science and Nature, impose such requirements; see Joly et al. (2012) for a review of data retention policies.
Researchers who publish their studies should create and document the additional information we recommend. Documenting and sharing code and data in a way that lets others easily use and cite them gives researchers credit for a larger portion of their research effort. For academic researchers, we advocate that tenure committees give weight to the publication of data and source code when evaluating candidates. Thus, publication velocity need not be reduced; it should simply encompass research products other than papers.
The recommendations we suggest should become part of daily research practice. Irakli Loladze estimates that working reproducibly increases his workload by 30 percent, yet notes: “Reproducibility is like brushing your teeth. It is good for you, but it takes time and effort. Once you learn it, it becomes a habit” (Baker 2016).
Another recommendation for improving the readability and comparability of research papers is to
require structured abstracts, which are commonly
used in medical journals. Structured abstracts can be
used to efficiently communicate a research objective,
the motivation for and process by which an empirical study was conducted, and what results were
achieved. Structured abstracts also require
researchers to structure their own thoughts about
their research. We suggest a five-part structured abstract containing (1) the research motivation, (2) the research objective, (3) the method used to conduct any empirical studies, (4) the results of the research, and (5) the conclusion. This structure enforces a coherent research narrative, which unstructured abstracts do not always achieve. The
abstract for this article is an example of the proposed
structure, while Gundersen and Kjensmo (2018) provide an abstract for empirical research that follows
these recommendations and includes an explicit
description of the hypothesis and an interpretation
of the results.
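To make the proposal concrete, the five parts could be marked up in LaTeX roughly as follows. This is a minimal sketch rather than a prescribed template; the boldface labels and one-line descriptions are illustrative only, and individual venues may prefer their own markup.

\begin{abstract}
\noindent
\textbf{Motivation:} Why the problem matters and to whom. \\
\textbf{Objective:} The research question or hypothesis under study. \\
\textbf{Method:} How any empirical study was designed and conducted. \\
\textbf{Results:} What the study found. \\
\textbf{Conclusion:} What the results mean and what follows from them.
\end{abstract}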
Call to Arms
As a community, we should ensure that the research
we conduct is properly documented. To make AI
research reproducible and more trustworthy, we have proposed best practices that should be adopted by editors and program chairs and incorporated into the review forms of AAAI publication venues.
Publishers should provide extra space to document
and cite data, source code, and empirical study
designs. AAAI leadership should encourage AI
researchers to increase the reproducibility of their
published work. This support could include provid-
ing structured templates to organize appendices and
making available extra space in publications to
accommodate the needed documentation.
For AI research to become open and more reproducible, the research community and publishers
have to establish suitable practices. Authors need to
adopt these practices, disseminate them to colleagues
and students, and help develop mechanisms and
technology to make it easier for others to adopt them.
Our objective with this article is to highlight the
benefits of reproducible science and to propose initial, modest changes that can increase the reproducibility of AI research results. There are many additional actions that could and should be taken, and
we look forward to further dialogue with the AI
research community on how to increase the reproducibility and scientific value of AI publications.
Acknowledgments
This research was funded in part by the National Science Foundation under grant ICER-1440323. This
work has in part been carried out at the Telenor-
NTNU AI Lab, Norwegian University of Science and
Technology, Trondheim, Norway. The recommenda-
tions proposed are based on the Geoscience Paper of
the Future and the Scientific Paper of the Future best
practices developed under that award. Thanks to Sigbjørn Kjensmo for all the effort put into surveying
the state of the art of reproducibility of AI.
References
Altman, M., and King, G. 2007. A Proposed Standard for the Scholarly Citation of Quantitative Data. D-Lib Magazine 13(3/4). doi.org/10.1045/march2007-altman.
Baker, M. 2016. Is There a Reproducibility Crisis? Nature 533(7604): 452–54. doi.org/10.1038/533452a.
Begley, C. G., and Ellis, L. M. 2012. Drug Development:
Raise Standards for Preclinical Cancer Research. Nature
483(7391): 531–33. doi.org/10.1038/483531a.
Bouquet, P.; Serafini, L.; Zanobini, S.; and Benerecetti, M.
2003. An Algorithm for Semantic Coordination. Paper presented at the Second International Semantic Integration
Workshop. Sanibel Island, FL, October 20–23.
Braun, M. L., and Ong, C. S. 2014. Open Science in