analytics algorithms and must decide which to report as supporting evidence in their analyses and which to pursue further. These algorithms often produce false alarms that must be pruned and are subject to concept drift. Furthermore, these algorithms often make recommendations that the analyst must assess to determine whether the evidence supports or contradicts their hypotheses. Effective explanations will help analysts confront these challenges.
The autonomy challenge was motivated by the need to effectively manage AI partners. For example, the Department of Defense seeks semiautonomous systems to augment warfighter capabilities. Operators will need to understand how these systems behave so they can determine how and when to best use them in future missions. Effective explanations will better enable such decisions.
For both challenge problem areas, it is critical to measure explanation effectiveness. While it would be convenient if a learned model's explainability could be measured automatically, an XAI system's explanation effectiveness must be assessed according to how its explanations aid human users. This requires human-in-the-loop psychological experiments to measure the user's satisfaction, mental model, task performance, and appropriate trust. DARPA formulated an initial explanation evaluation framework that includes potential measures of explanation effectiveness (figure 5). Exploring and refining this framework is an important part of the XAI program's research agenda.
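As a concrete illustration of how such measures might be recorded and compared, the Python sketch below tallies the four measures named above for each study participant and averages them per system. The class, field names, 0-to-1 scales, and simple averaging are illustrative assumptions for this example only; they are not the schema of DARPA's evaluation framework in figure 5.

    from dataclasses import dataclass
    from statistics import mean

    # Hypothetical record of one participant's results in a
    # human-in-the-loop XAI evaluation. The four fields mirror the
    # measures named in the text; the names and [0, 1] scales are
    # illustrative assumptions.
    @dataclass
    class ExplanationEffectiveness:
        satisfaction: float       # self-reported, e.g., a Likert rating rescaled to [0, 1]
        mental_model: float       # accuracy of the user's predictions of system behavior
        task_performance: float   # success rate on the decision task
        appropriate_trust: float  # agreement between user reliance and actual system reliability

    def summarize(records: list[ExplanationEffectiveness]) -> dict[str, float]:
        """Average each measure across participants for one XAI system."""
        return {
            "satisfaction": mean(r.satisfaction for r in records),
            "mental_model": mean(r.mental_model for r in records),
            "task_performance": mean(r.task_performance for r in records),
            "appropriate_trust": mean(r.appropriate_trust for r in records),
        }

    # Example: scores for a system that provides explanations, to be
    # contrasted with a no-explanation baseline evaluated the same way.
    with_explanations = [
        ExplanationEffectiveness(0.8, 0.7, 0.9, 0.75),
        ExplanationEffectiveness(0.7, 0.6, 0.8, 0.70),
    ]
    print(summarize(with_explanations))

Aggregating per system in this way would let an evaluator compare conditions (for example, with versus without explanations) measure by measure, which is the kind of comparison the human-in-the-loop experiments described above are designed to support.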
The XAI program’s goal, concept, strategies, challenges, and evaluation framework are described in the program’s 2016 broad agency announcement. Figure 6 displays the XAI program’s schedule, which consists of two phases. Phase 1 (18 months) commenced in May 2017 and includes initial technology demonstrations of XAI systems. Phase 2 (30 months) includes a sequence of evaluations against challenge problems selected by the system developers and the XAI evaluator. The first formal evaluations of XAI systems took place during the fall of 2018. This article describes the developer teams’ progress leading up to these evaluations, whose results were presented at an XAI program meeting during the winter of 2019.
Development and Progress
Figure 7 summarizes the 11 XAI Technical Area 1 (TA1) developer teams and the TA2 team [from the Florida Institute for Human and Machine Cognition (IHMC)] that is developing the psychological model of explanation. Three TA1 teams are pursuing both challenge problem areas (autonomy and data analytics), three are working only on autonomy, and five only on data analytics. Per the strategies described in figure 2, the TA1 teams are investigating a diverse range of techniques for developing explainable models and explanation interfaces.
Decision-Making Foundations of XAI
The objective of the IHMC team (which includes
researchers from MacroCognition and Michigan
Technological University) is to develop and evaluate
psychologically plausible models of explanation and
develop actionable concepts, methods, measures, and
metrics for explanatory reasoning. The IHMC team is
Figure 1. Learning Performance Versus Explainability Trade-Off for Several Categories of Learning Techniques.