an input image, we train a regression neural
network to output the angle of the pendulum in that
image. This is also the first setting where we
learn constraints implicitly.
Constraints
Since the outputs of the regressor over consecutive
frames should form a sine wave, we can provide a
simulator that generates plausible sample trajectories.
We define the structured prediction problem by
concatenating the network outputs for contiguous
images into a single high-dimensional trajectory.
Unlike in the previous experiments, no explicit
formulas are given at any point in the experiment;
the (implicit) constraints are learned by the
discriminator from samples provided by the simulator.
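As a rough illustration of the setup described above, the following sketch shows what a sine-wave trajectory simulator and the concatenation of per-frame outputs might look like. The function names `sample_trajectories` and `stack_predictions`, as well as the amplitude, period, and phase ranges, are assumptions made for this example and are not taken from the experiment.

```python
import numpy as np

def sample_trajectories(n_samples, n_frames, rng=None):
    """Hypothetical black-box simulator: draw sine-wave angle trajectories.

    The parameter ranges below are illustrative assumptions only.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = np.arange(n_frames)                                      # frame indices
    amplitude = rng.uniform(0.2, 1.0, size=(n_samples, 1))       # assumed range
    period = rng.uniform(20.0, 60.0, size=(n_samples, 1))        # frames per swing
    phase = rng.uniform(0.0, 2.0 * np.pi, size=(n_samples, 1))   # random start
    # Broadcasting yields an array of shape (n_samples, n_frames).
    return amplitude * np.sin(2.0 * np.pi * t / period + phase)

def stack_predictions(per_frame_outputs):
    """Concatenate scalar regressor outputs for contiguous frames
    into one trajectory vector."""
    return np.asarray(per_frame_outputs, dtype=float).reshape(-1)

# Samples the discriminator would see as "valid" trajectories.
valid = sample_trajectories(n_samples=64, n_frames=32)
# A candidate trajectory built from (placeholder) per-frame predictions.
candidate = stack_predictions([0.10, 0.18, 0.25, 0.31])
```

In such a setup the discriminator would only ever see arrays like `valid` and `candidate`, so the sine-wave constraint never has to be written down as a formula on the learner's side.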
Evaluation
After 5000 updates, the regressor converges to
relatively stable predictions for each frame. We then
manually label the angle of the ball of the pendulum
in each frame in the test set, and measure the
correlation of the predicted position with the ground
truth label in pixels. We achieve a correlation of 96.3
percent. Example predictions on the test data are
shown in figure 6. In this experiment, while the
formula-driven approach is arguably more appropriate
for this problem, we demonstrate that our model is
nonetheless capable of solving the task by learning
the constraints implicitly through experience with
samples from a black-box simulator.
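The evaluation above can be sketched as a correlation between per-frame predictions and manual labels. The text does not say which correlation measure was used; this example assumes the Pearson coefficient, and the function name and the sample values are hypothetical.

```python
import numpy as np

def pearson_correlation(predicted, labeled):
    """Pearson correlation between predictions and ground-truth labels.

    Assumption: the source reports "correlation" without naming the
    measure; Pearson is used here for illustration.
    """
    predicted = np.asarray(predicted, dtype=float)
    labeled = np.asarray(labeled, dtype=float)
    p = predicted - predicted.mean()   # center the predictions
    g = labeled - labeled.mean()       # center the labels
    return float((p @ g) / np.sqrt((p @ p) * (g @ g)))

# Made-up values, for illustration only.
predictions = [0.12, 0.35, 0.48, 0.31, 0.05]
labels = [14.0, 37.0, 52.0, 30.0, 6.0]
score = pearson_correlation(predictions, labels)
```

The Pearson coefficient is unchanged by shifting or positively rescaling either input, so predictions and labels can be compared this way even when they are on different scales.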
Tracking Two Pendulums Simultaneously
To test the capability of our model to deal with more
complex dynamics, we present synthetic images that
contain two pendulums, and aim to track both of
them. The two pendulums are independent, as
Figure 5. Example Images.
Whenever Peach (blond) shows up, Mario (red) comes around, but not vice versa. Yoshi (green) and Bowser (orange) appear randomly. The
system trains with this high-level knowledge and learns to answer whether each image contains Peach or Mario. The first column contains
example images. The second and third columns show the attended locations for the Peach and Mario networks, respectively.
[Figure 5 panels: example input images labeled "hasPeach=True, hasMario=True" and "hasPeach=False, hasMario=True", each shown with attention maps titled "Mario channel means" and "Peach channel means".]