subset that is most representative of a model’s inference. Rutgers’ approach allows for explanation of the
inferences of any probabilistic generative and discriminative model, as well as influential DL models (Yang
and Shafto 2017).
Rutgers is also developing a formal theory of human-machine cooperation and supporting interactive
guided explanation of complex compositional models.
Common among these is a core approach of building
from models of human learning to foster explainability,
paired with carefully controlled behavioral experiments to
test the effectiveness of the resulting explanations.
Explanation by Bayesian teaching inputs a data set,
a probabilistic model, and an inference method and
returns a small subset of examples that best explains
the model’s inference. Experiments with unfamiliar
images show that explanations of inferences about
image categories (and about specific images) increase the
accuracy of people’s reasoning about a model (Vong et al.
2018). Experiments with familiar image categories
show that explanations allow users to accurately
calibrate their trust in model predictions.
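To make the mechanics concrete, the following is a minimal sketch of explanation by Bayesian teaching, not the Rutgers implementation: it enumerates candidate subsets of a fixed size and scores each by the posterior that a learner assigns to the target inference after seeing only that subset. The `teach` and `gaussian_posterior` functions, the toy Gaussian learner, and the exhaustive search are all illustrative assumptions.

```python
# Illustrative sketch of explanation by Bayesian teaching (assumed
# interface, not the Rutgers system): given a data set, a probabilistic
# model, and an inference, return the small subset of examples that
# best leads a learner to the model's inference.
import itertools
import numpy as np

def teach(data, labels, query, target_label, posterior, subset_size=2):
    """Search all subsets of `subset_size` examples; return the one that
    maximizes the learner's posterior P(target_label | query, subset)."""
    best_idx, best_score = None, -1.0
    for idx in itertools.combinations(range(len(data)), subset_size):
        post = posterior(data[list(idx)], labels[list(idx)], query)
        score = post.get(target_label, 0.0)
        if score > best_score:
            best_idx, best_score = idx, score
    return best_idx, best_score

def gaussian_posterior(x, y, query, var=1.0):
    """Toy learner: class-conditional Gaussians with a uniform prior
    over whichever classes appear in the teaching subset."""
    classes = np.unique(y)
    lik = np.array([
        np.exp(-0.5 * np.sum((query - x[y == c].mean(axis=0)) ** 2) / var)
        for c in classes
    ])
    return dict(zip(classes, lik / lik.sum()))

rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 1, (10, 2)), rng.normal(3, 1, (10, 2))])
labels = np.array([0] * 10 + [1] * 10)
idx, score = teach(data, labels, query=np.array([2.5, 2.5]),
                   target_label=1, posterior=gaussian_posterior)
print(f"most teachable examples: {idx}, learner posterior: {score:.3f}")
```

In a full system the score would come from the supplied model’s own inference method, and the search over subsets would need to be far more efficient than exhaustive enumeration; the sketch only shows the shape of the input-output contract described above.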
Explanation of complex models is facilitated by
interactive guided explanations. By exploiting compositionality and cooperative modifications of ML
models, Rutgers provides a generic approach to
fostering understanding via guided exploration. Interaction occurs through an interface that exposes
model structure and explains each component with
aspects of the data. The Rutgers approach has been
demonstrated to facilitate understanding of large text
corpora, as assessed by a human’s ability to accurately
summarize the corpus after short, guided explanations.
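As a rough illustration of that interaction pattern, the sketch below exposes the components of a compositional model of a text corpus and explains each with aspects of the data. The use of scikit-learn’s LDA as the compositional model, the toy corpus, and the word and document summaries are assumptions made for illustration, not the Rutgers interface.

```python
# Hedged sketch: expose a compositional model's structure and explain
# each component with aspects of the data (an LDA topic model stands in
# for the compositional model; this is not the Rutgers interface).
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat on the mat", "dogs chase cats in the yard",
        "stocks fell as markets closed", "investors sold shares today"]

vec = CountVectorizer(stop_words="english")
counts = vec.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

words = np.asarray(vec.get_feature_names_out())
doc_topics = lda.transform(counts)
for k, component in enumerate(lda.components_):
    top_words = words[np.argsort(component)[::-1][:3]]
    best_doc = docs[int(np.argmax(doc_topics[:, k]))]
    # Each component is summarized by its most probable words and the
    # document it most strongly accounts for.
    print(f"component {k}: {', '.join(top_words)} | example: {best_doc!r}")
```

A guided explanation would walk the user through such components one at a time; the claim in the text is that short sessions of this kind let people accurately summarize the corpus afterward.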
Rutgers is addressing the data analytics challenge
problem area and has demonstrated its approach on
images, text, combinations of these (for example,
VQA), and structured simulations involving temporal dynamics.
Conclusions and Future Work
DARPA’s XAI program is developing and evaluating a
wide variety of new ML techniques: modified DL
techniques that learn explainable features; methods
that learn more structured, interpretable, causal
models; and model induction techniques that infer an
explainable model from any black-box model. One year
into the XAI program, initial technology demonstrations and results indicate that these three broad strategies merit further investigation and will provide future
developers with design options covering the performance versus explainability trade space. The developer
teams’ XAI systems are being evaluated to assess the
value of the explanations they provide, localizing the
contributions of specific techniques within this trade space.
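Of the three strategies, model induction is the simplest to sketch in a few lines. The following is a minimal, hedged illustration of the general idea rather than any team’s technique: a shallow decision tree (an assumed choice of explainable model) is induced from an arbitrary black-box model by training it on the black box’s own predictions, and its fidelity to the black box is measured separately from accuracy on the true labels.

```python
# Minimal sketch of model induction: infer an explainable surrogate from
# a black-box model by fitting it to the black box's outputs. The random
# forest, depth-3 tree, and synthetic data are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

# Fit the surrogate to the black box's predictions, not the true labels,
# so its rules approximate the black box's decision behavior.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

fidelity = accuracy_score(black_box.predict(X), surrogate.predict(X))
print(f"surrogate fidelity to the black box: {fidelity:.2f}")
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(5)]))
```

The printed tree is the induced explainable model; its fidelity score indicates how faithfully the explanation tracks the black box across the performance versus explainability trade space noted above.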
Acknowledgments
The authors thank the XAI development teams,
specifically their principal investigators, for their
innovative research and contributions to this article:
Trevor Darrell (UCB), Brian Ruttenberg and Avi Pfeffer
(CRA), Song-Chun Zhu (UCLA), Alan Fern (OSU),
Mark Stefik (PARC), Zico Kolter (Carnegie Mellon),
Mohamed Amer and Giedrius Burachas (SRI Interna-
tional), Bill Ferguson (Raytheon BBN), Vibhav Gogate
(UTD), Xia (Ben) Hu (TAMU), Patrick Shafto (Rutgers),
and Robert Hoffman (IHMC). The authors owe a
special thanks to Marisa Carrera for her exceptional
technical support to the XAI program and her ex-
tensive editing skills.
References
Belbute-Peres, F., and Kolter, J. Z. 2017. A Modular Differentiable Rigid Body Physics Engine. Paper presented at the Neural Information Processing Systems Deep Reinforcement Learning Symposium, Long Beach, CA, December 7.
Chakraborty, S.; Tomsett, R.; Raghavendra, R.; Harborne, D.; Alzantot, M.; Cerutti, F.; Srivastava, M.; et al. 2017. Interpretability of Deep Learning Models: A Survey of Results. Presented at the IEEE Smart World Congress 2017 Workshop: DAIS 2017 — Workshop on Distributed Analytics Infrastructure and Algorithms for Multi-Organization Federations, San Francisco, CA, August 4–8. doi.org/10.
Dodge, J.; Penney, S.; Hilderbrand, C.; Anderson, A.; and Burnett, M. 2018. How the Experts Do It: Assessing and Explaining Agent Behaviors in Real-Time Strategy Games. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. New York: Association for Computing Machinery.
Du, M.; Liu, N.; Song, Q.; and Hu, X. 2018. Towards Explanation of DNN-Based Prediction and Guided Feature Inversion. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1358–67. New York: Association for Computing Machinery. doi.org/
Gao, J.; Liu, N.; Lawley, M.; and Hu, X. 2017. An Interpretable Classification Framework for Information Extraction from Online Healthcare Forums. Journal of Healthcare Engineering: 2460174. doi.org/10.1155/2017/2460174
Gogate, V., and Domingos, P. 2016. Probabilistic Theorem Proving. Communications of the ACM 59(7): 107–15. doi.org/10.
Harradon, M.; Druce, J.; and Ruttenberg, B. 2018. Causal Learning and Explanation of Deep Neural Networks via Autoencoded Activations. arXiv preprint arXiv:1802.00541v1 [cs.AI]. Ithaca, NY: Cornell University Library.
Hefny, A.; Marinho, Z.; Sun, W.; Srinivasa, S.; and Gordon, G. 2018. Recurrent Predictive State Policy Networks. In Proceedings of the 35th International Conference on Machine Learning, 1954–63. International Machine Learning Society.
Hendricks, L. A.; Hu, R.; Darrell, T.; and Akata, Z. 2018. Grounding Visual Explanations. Presented at the European Conference on Computer Vision (ECCV), Munich, Germany, September 8–14. doi.org/10.1007/978-3-030-01216-8_17
Hoffman, R.; Miller, T.; Mueller, S. T.; Klein, G.; and Clancey, W. J. 2018. Explaining Explanation, Part 4: A Deep Dive on Deep Nets. IEEE Intelligent Systems 33(3): 87–95. doi.org/10.
Hoffman, R. R., and Klein, G. 2017. Explaining Explanation, Part 1: Theoretical Foundations. IEEE Intelligent Systems 32(3): 68–73.