(3) Identifying new possible directions of transdisciplinary research, for example, delving deeper into the psychological functions of noncompliance, and exploring their transferability to AI.
(4) Promoting richer models of AI in popular culture, to offer a counterpoint to clichéd representations of AI rebellion.
Rebel Agents: Prior Work and Hypothetical Scenarios
Before describing the AI rebellion framework, we discuss prior work and introduce three hypothetical scenarios for illustrating rebellion. In tables 1 and 2, we provide examples of components of the AI rebellion framework using these scenarios, while table 3 relates prior work to the framework.
Rebel Agents in Prior Work
Gregg-Smith and Mayol-Cuevas (2015) describe
cooperative handheld intelligent tools with task-specific knowledge that “refuse” to execute actions
that violate task specifications. For example, in a simulated painting task, if the alter points the tool at a pixel that is not supposed to be painted, the tool can take the initiative to disable its own painting function.
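To make the mechanism concrete, the following minimal sketch shows action-level refusal driven by a task specification. The class, the pixel-set representation, and all method names are illustrative assumptions, not details of Gregg-Smith and Mayol-Cuevas's system.

```python
# A minimal sketch of action-level refusal in a painting task.
# The task-specification representation and method names are
# hypothetical, not taken from Gregg-Smith and Mayol-Cuevas (2015).

class PaintingTool:
    def __init__(self, paintable_pixels):
        # Pixels the task specification permits painting.
        self.paintable_pixels = set(paintable_pixels)
        self.painting_enabled = True

    def on_aim(self, pixel):
        # Disable the paint function whenever the human aims the
        # tool at a pixel outside the task specification.
        self.painting_enabled = pixel in self.paintable_pixels

    def paint(self, pixel):
        if not self.painting_enabled:
            return False  # the tool "refuses" by withholding its function
        # ... actuate the painting mechanism here ...
        return True
```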
Briggs and Scheutz (2015) propose a general
process for embodied AI agents’ refusal to execute
commands due to several categories of reasons:
knowledge, capacity, goal priority and timing, social
role and obligation, and normative permissibility.
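A rejection process of this kind can be pictured as a sequence of category checks, each able to veto a command with a category-specific explanation. The sketch below is only an illustration of the idea; the predicates, their ordering, and the response phrasings are our assumptions, not Briggs and Scheutz's formulation.

```python
# Illustrative sketch of command rejection across reason categories.
# Predicates, ordering, and phrasings are assumptions for illustration.

from typing import Callable, Optional

# Each check returns an explanation string if the command must be
# refused, or None if this category raises no objection.
REFUSAL_CHECKS: list[tuple[str, Callable[[dict], Optional[str]]]] = [
    ("knowledge",      lambda cmd: None if cmd.get("understood") else
                       "I do not know how to do that."),
    ("capacity",       lambda cmd: None if cmd.get("feasible") else
                       "I am not able to do that."),
    ("goal priority",  lambda cmd: None if cmd.get("timely") else
                       "I cannot do that right now."),
    ("social role",    lambda cmd: None if cmd.get("authorized") else
                       "You are not authorized to ask that of me."),
    ("permissibility", lambda cmd: None if cmd.get("permissible") else
                       "That would violate a norm I must uphold."),
]

def respond_to_command(cmd: dict) -> str:
    for _category, check in REFUSAL_CHECKS:
        objection = check(cmd)
        if objection is not None:
            return objection  # refuse, with a category-specific reason
    return "Okay."            # no category objects: comply

# Example: a well-formed, feasible, timely, authorized command that
# is nevertheless normatively impermissible is refused.
print(respond_to_command({"understood": True, "feasible": True,
                          "timely": True, "authorized": True,
                          "permissible": False}))
```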
Briggs, McConnell, and Scheutz (2015) demonstrate how embodied AI agents can convincingly
express, through verbal or nonverbal communication, their reluctance to perform a task. In their
human-robot interaction evaluation scenarios, a
robot protests repeatedly, simulating increasingly
intense emotions, when ordered to topple a tower of
cans that it supposedly just finished building.
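As an illustration only, such escalation could be driven by a counter over repeated orders; the protest phrases and the three-level escalation below are hypothetical, not the behaviors evaluated by Briggs, McConnell, and Scheutz.

```python
# Illustrative sketch of escalating protest when an order is repeated.
# The phrases and escalation levels are hypothetical.

PROTESTS = [
    "Are you sure? I just finished building that.",
    "Please don't make me knock it over.",
    "I really don't want to do this!",
]

class ProtestingRobot:
    def __init__(self):
        self.times_ordered = 0

    def receive_order(self, order: str) -> str:
        self.times_ordered += 1
        level = self.times_ordered - 1
        if level < len(PROTESTS):
            # Express reluctance with increasing simulated intensity.
            return PROTESTS[level]
        return f"Complying reluctantly with: {order}"
```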
Apker, Johnson, and Humphrey (2016) describe
autonomous-vehicle agents that form teams and
receive commands from a centralized operator. Predefined templates are used to determine how an
agent should respond to each command. Contingency behaviors are provided for situations in which
the agent, while monitoring its health, detects faults
(for example, insufficient fuel). In such situations,
the agent will disregard commands and instead execute the appropriate contingency behavior, effectively rebelling. Coman et al. (2017) provide an extensive description of how these agents fit into the AI
rebellion framework.
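A minimal sketch of this override logic follows; the fault names, thresholds, and contingency behaviors are assumptions for illustration, not the predefined templates of Apker, Johnson, and Humphrey.

```python
# Illustrative sketch of health-triggered contingency behavior.
# Fault names, thresholds, and behavior names are assumptions.

def select_behavior(command: str, health: dict) -> str:
    """Return the behavior the agent executes given an operator
    command and its own health readings."""
    if health.get("fuel", 1.0) < 0.15:
        # A detected fault overrides the command: the agent disregards
        # the operator and executes the matching contingency behavior.
        return "return_to_base"
    if health.get("sensor_fault", False):
        return "loiter_and_report"
    return command  # healthy: comply with the operator's command

# Example: low fuel causes the agent to ignore "survey_sector_4".
print(select_behavior("survey_sector_4", {"fuel": 0.08}))
```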
Hiatt, Harrison, and Trafton (2011) propose AI
agents that use theory of mind (that is, the “ability to
infer the beliefs, desires, and intentions of others”),
manifested through mental simulation of “what
human teammates may be thinking,” to determine
whether they should notify a human teammate that
he or she is deviating from expected behavior. The
authors report on an experiment showing that agents
with the proposed capabilities are perceived as “more
natural and intelligent teammates.”
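In outline, the decision reduces to comparing observed behavior against the actions predicted by a mental simulation of the teammate. The sketch below assumes a simplified simulation that outputs a set of expected actions; the function and example are illustrative, not Hiatt, Harrison, and Trafton's model.

```python
# Illustrative sketch of deciding whether to notify a teammate,
# based on a simulated model of what the teammate believes.

def should_notify(observed_action: str,
                  simulated_expected_actions: list[str]) -> bool:
    """Mentally simulate what the human teammate is likely doing;
    notify only if the observed action deviates from every action
    the simulation predicts."""
    return observed_action not in simulated_expected_actions

# Usage: the simulation of the teammate's beliefs predicts they will
# fetch part A, but they reach for part B, so a notification is raised.
if should_notify("fetch_part_B", ["fetch_part_A"]):
    print("Excuse me: I believe you may be deviating from the plan.")
```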
Borenstein and Arkin (2016) explore the idea of
“ethical nudges” through which robots might
attempt to influence humans to adopt ethically
acceptable behavior, through verbal or nonverbal
communication. For example, a robot might nudge
an alter to stop neglecting a child, to refrain from
smoking in a public area, or to donate to charities
and volunteer. The authors discuss the ethical acceptability of creating robots that have this ability, noting that it is arguable whether the design goal to "subtly or directly influence human behavior" is ever ethically acceptable.
Milli et al. (2017) explore the idea that robot disobedience may be beneficial given the imperfect rationality of the human alter. In the context of their model of collaborative human-robot interaction, they show that, when the human alter is not perfectly rational, disobeying direct orders in favor of what the robot infers to be the human's actual preferences improves performance.
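A stylized version of this trade-off weighs the utility of the ordered action against that of the inferred preference, discounted by the robot's confidence in its inference. The utilities, the confidence weighting, and the example are our assumptions, not Milli et al.'s formal model.

```python
# Stylized sketch of disobedience under imperfect human rationality.
# Utilities and the confidence discount are illustrative assumptions.

def act(ordered_action: str,
        inferred_preference: str,
        utility: dict[str, float],
        confidence_in_inference: float) -> str:
    """Obey the order unless the inferred preference has higher
    expected utility, weighted by confidence in the inference."""
    obey_value = utility[ordered_action]
    disobey_value = confidence_in_inference * utility[inferred_preference]
    return inferred_preference if disobey_value > obey_value else ordered_action

# Example: the human orders coffee, but their past behavior suggests
# they actually prefer tea this late in the day.
print(act("coffee", "tea", {"coffee": 0.4, "tea": 0.9}, 0.8))  # -> tea
```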
In addition, an entire agency paradigm, that of goal reasoning, models agents with the potential for rebellion. Goal reasoning agents can reason about and modify the goals they are pursuing in order to react to unexpected events and explore opportunities (Vattam et al. 2013).
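A goal reasoning loop can be sketched as follows; the event and goal names are hypothetical, and real goal reasoning systems (for example, Vattam et al. 2013) are considerably richer.

```python
# Illustrative sketch of a goal reasoning step: the agent may replace
# its current goal in response to events, rather than only replanning
# how to achieve a fixed goal. Names are hypothetical.

def goal_reasoning_step(current_goal: str, events: list[str]) -> str:
    """Return the goal the agent should pursue next, which may
    differ from the goal it was given."""
    if "threat_detected" in events:
        return "evade_threat"       # unexpected event: change goal
    if "survivor_spotted" in events:
        return "assist_survivor"    # opportunity: adopt a new goal
    return current_goal             # otherwise keep the current goal

goal = "patrol_area"
goal = goal_reasoning_step(goal, ["survivor_spotted"])
assert goal == "assist_survivor"
```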
Hypothetical Rebellion Scenarios
The following hypothetical scenarios (furniture
mover, personal assistant, and hiring committee)
have as protagonists AI agents that can become
rebels.
Furniture Mover
A robot mover assists alters in furniture-moving tasks
such as carrying a table (a more complex version of
the system of Agravante et al. [2013]). This is an
example of a two-agent collaborative task with partial information access: each participant can access some information that is unavailable to the other (for example, each participant might be able to see behind the other, but not behind him-, her-, or itself, and the AI agent could, through its sensors, access additional information unavailable to the human). Rebellion could
consist of refusing an action verbally requested or
physically initiated by the alter. This rebellion could
occur because the agent reasons that the action
endangers the alter’s safety, the rebel agent’s safety,
or task execution correctness.
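A minimal sketch of this rebellion condition might take the following form, with hypothetical predicates standing in for the agent's own sensing and task model.

```python
# Illustrative sketch of the furniture mover's rebellion condition.
# The predicates are hypothetical stand-ins for the agent's sensing
# and task model.

def respond_to_action(endangers_alter: bool,
                      endangers_self: bool,
                      breaks_task: bool) -> str:
    """Refuse an action the alter requested or physically initiated
    if it threatens safety or correct task execution."""
    if endangers_alter:
        return "refuse: this would endanger you"
    if endangers_self:
        return "refuse: this would damage me"
    if breaks_task:
        return "refuse: this would violate the task specification"
    return "comply"

# Example: the agent sees, through its own sensors, an obstacle behind
# the human that the human cannot see, and refuses to keep backing up.
print(respond_to_action(endangers_alter=True,
                        endangers_self=False,
                        breaks_task=False))
```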
Personal Assistant
An AI personal assistant can execute various commands, including ordering products from e-commerce websites and assisting the alter in pursuing his
or her health-related goals. The agent’s potential
rebellious behavior includes attempting to dissuade