86 AI MAGAZINE
came in second, with 11.9 percent of games solved and
293 hints required. Finally, OUT TWIKI solved only
5.9 percent of the games, consuming 197 hints,
though this result followed from the overly long
response times of its complex reasoning engine,
which caused the system to time out frequently.
Interestingly, of the 30 cities correctly guessed by
any of the submitted Guesser agents overall, 24 were
correctly guessed by a single competition entry, and
only one city (Paris) appears in the list of 12 cities
that each of the three Guesser agents most frequently generated as a guess. This result suggests that there
is a high degree of diversity not only in the human
games in our evaluation data, but also in the behaviors of the submitted Guesser agents, which reinforces our confidence that the scenario is indeed one
where diversity awareness is key.
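The overlap analysis above can be illustrated with a small sketch. It counts how many cities were solved by any agent and how many were solved by exactly one agent. The function name, the agent names, and the toy data are all hypothetical and are not the competition results.

```python
from collections import Counter


def solved_by_exactly_one(correct_by_agent):
    """Return (cities solved by any agent, cities solved by exactly one agent).

    correct_by_agent maps an agent name to the collection of cities that
    agent guessed correctly at least once.
    """
    # Count, for each city, how many distinct agents solved it.
    counts = Counter(
        city for cities in correct_by_agent.values() for city in set(cities)
    )
    solved_by_any = set(counts)
    unique = {city for city, n in counts.items() if n == 1}
    return solved_by_any, unique


# Hypothetical toy data (assumption): three agents and the cities each solved.
toy = {
    "agent_a": {"Paris", "London", "Vienna"},
    "agent_b": {"Paris", "Sydney"},
    "agent_c": {"Amsterdam"},
}

solved_by_any, unique = solved_by_exactly_one(toy)
print(len(solved_by_any), len(unique))
```

A high ratio of uniquely solved cities to all solved cities, as in the 24 of 30 reported here, indicates that the agents succeed on largely different games.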
Awards for the winners were presented at The
Taboo Challenge Competition Workshop that took
place on August 29, 2017, in Melbourne as part of the
IJCAI 2017 program, where the participants also had
an opportunity to present the papers they had
submitted alongside their implementations.
Lessons Learned
Despite our best attempts to simplify some of the
elements of the competition, the task turned out to be
much harder than expected. We attribute this to two
factors.
First, human players often solved the game
after just one or two hints, and such hints were often
highly contextual (for example, "terrorist attack" would
immediately suggest a city where such an attack had
taken place most recently). It is easy to see why an
artificial Guesser developed to achieve good performance over a broad range of games would be
unable to match human performance in these
instances, but it should be possible to solve this problem by gathering more game data so that only more
Figure 1. Taboo Challenge Competition Data Collection and Evaluation Process.
[Figure: panels labeled Gamified Data Collection with GUESSence App; Generating Evaluation Dataset and Distributing Training Dataset; Running the Competition (number of correct cities and number of guesses needed); Publication, Presentations, and Awards. Sample game between Describer and Guesser: describer "river", guesser "London"; describer "No. famous pastries", guesser "Vienna"; describer "No. hunchback", guesser "Paris"; describer "YES!"]