strengthen and improve existing evaluation
The first part of the workshop revolved around
new algorithms directly tackling some of the AI challenges posed by video game platforms. Michael Bowling (University of Alberta) listed several challenges
that can be addressed using the ALE as an evaluation
platform. Erik Talvitie (Franklin and Marshall College) and Michael Bowling proposed a new simple
feature set for reinforcement learning in visual
domains, designed to capture pairwise, position-invariant, spatial relationships between objects. Marlos Machado (University of Alberta) and colleagues
presented a domain-independent optimistic initialization approach for reinforcement learning. Satinder
Singh (University of Michigan) presented some of his
recent work, showing, for example, how one can generate a real-time player from planning and deep-learning techniques. Nir Lipovetzky (University of
Melbourne) and colleagues discussed how one can
use classical planning algorithms without having a
PDDL model or any prior knowledge of the actions'
effects and goals. Matthew Hausknecht (University of
Texas at Austin) concluded this portion of the workshop with some recent results applying neuroevolution to policy search for the Atari 2600.
The second half of our workshop focused on the
evaluation of generally competent agents. Peter
Stone (University of Texas at Austin) recapitulated
some of the important evaluation lessons learned
from the General Game Playing competition.
Matthew Hausknecht and Peter Stone further spoke
of the dangers of deterministic evaluation. In addition, Marc G. Bellemare (Google DeepMind) provided empirical evidence of the exploitability of determinism. He presented an algorithm, the Brute, which
optimizes a single game trajectory using an open-loop control approach. This led to two rounds of panel discussions where we agreed that deterministic
evaluation takes us away from the goals of reinforcement learning. Also, the workshop participants came
up with a set of evaluation standards to be followed,
such as the essential information to be reported in
future works. Moreover, we discussed the best way to
inject stochasticity into the ALE. The panel was
undoubtedly successful and led to the drafting of a
set of evaluation standards for general competency in
video games. We expect these standards to ease reproducibility and comparability between different approaches.
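One stochasticity-injection scheme of the kind discussed at the workshop is "sticky actions," in which the emulator occasionally repeats the agent's previous action instead of the newly chosen one, so a memorized open-loop action sequence (the kind the Brute exploits) no longer reproduces the same trajectory. The sketch below is only a minimal illustration of the idea, not the actual ALE interface; the `StickyActions` wrapper and `ToyEnv` are hypothetical names.

```python
import random

class StickyActions:
    """Minimal sketch of sticky actions (illustrative, not the ALE API):
    with probability repeat_prob, the environment repeats the agent's
    previous action instead of the one just chosen."""

    def __init__(self, env, repeat_prob=0.25, seed=0):
        self.env = env                  # any object exposing step(action)
        self.repeat_prob = repeat_prob  # "stickiness" probability
        self.prev_action = None
        self.rng = random.Random(seed)

    def step(self, action):
        # Occasionally ignore the chosen action and replay the previous
        # one, breaking the exact determinism that open-loop agents exploit.
        if self.prev_action is not None and self.rng.random() < self.repeat_prob:
            action = self.prev_action
        self.prev_action = action
        return self.env.step(action)

class ToyEnv:
    """Deterministic stand-in environment that just records actions."""
    def __init__(self):
        self.log = []
    def step(self, action):
        self.log.append(action)
        return 0  # dummy reward
```

With `repeat_prob = 0` the wrapper is transparent and replaying a fixed action sequence always yields the same trajectory; with `repeat_prob > 0` the same sequence produces different executed trajectories, which is the property that defeats purely open-loop play.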
We believe the workshop was very successful,
achieving all of its original goals. The organizers are
now writing an article to present the evaluation standards discussed in the workshop. This article will also
introduce a revised Arcade Learning Environment,
which will facilitate the new evaluation standards
agreed upon during the workshop.
Marc G. Bellemare (Google DeepMind), Michael
Bowling (University of Alberta, Canada), Marlos C.
Machado (University of Alberta, Canada), Erik Talvitie (Franklin and Marshall College, USA), and Joel
Veness (Google DeepMind) organized this workshop.
This summary was written by Marlos C. Machado.
The papers of the workshop were published as AAAI
Press Technical Report WS-15-10.
Multiagent Interaction without Prior Coordination
Interaction between agents is the defining attribute
of multiagent systems, encompassing problems such
as planning in a decentralized setting, learning other
agent models, composing teams with high task
performance, and selective resource-bounded
communication and coordination. While there is
significant variety in the methodologies used to solve such
problems, the majority of these methods depend on
some form of prior coordination. For example, learn-
ing algorithms may assume that all agents share a
common learning method or prior beliefs, distributed
optimization methods may assume specific
structural constraints regarding the partition of state
space or cost and rewards, and symbolic methods
often make strong assumptions regarding norms and
protocols. However, in realistic problems, these
assumptions are easily violated. Thus, there is a need
for new models and algorithms that specifically
address the case of ad hoc interactions.
The purpose of this workshop was to discuss the
role of such predefined knowledge and coordination
in multiagent systems, and to provide a venue for
research on novel models and algorithms that specifically address multiagent interaction without prior
coordination (MIPC). There were a total of seven
accepted papers, with topics as diverse as nonparametric Bayesian learning in I-POMDPs, optimal
selection of multirobot coalition formation algorithms, combining the expert and type methodologies for effective interaction, and the RoboCup 2014
SPL drop-in player competition. The presented
research demonstrated that MIPC problems exist in
various flavors and that there are a variety of
approaches to tackle such problems.
We were again privileged to have invited talks by
three distinguished researchers: “Leveraging Expert
Feedback in Recommender Systems” by Pascal
Poupart from the University of Waterloo; “Agent-Human
Interaction without Prior Communication” by
Sarit Kraus from Bar-Ilan University; and “Interactive
POMDPs” by Piotr Gmytrasiewicz from the
University of Illinois at Chicago.
The workshop was chaired by Stefano Albrecht,
Jacob Crandall, and Somchaya Liemhetcharat. This report was written by Stefano Albrecht. The advisory committee consisted of Subramanian
Ramamoorthy, Peter Stone, and Manuela Veloso.
The chairs would like to thank the workshop participants, the invited speakers, the program committee,