stealing algorithm on the same observation stream
for each of the synthetic domains. As expected it
shows a clear difference in speedup depending on the
structure of the plans and the grammar used to
describe it. Comparing figure 11 to table 1 also shows
a clear correlation. The LAST-RIGHT and FIRST-MID
domains, which generate only a handful of explanations, have limited speedup, while the FIRST-LEFT
and LAST-MID, which generate tens of thousands of
explanations, exhibit close to linear speedup. This
gives us strong reason to believe that the differences
in the speedup are a result of the differences in the
number of generated explanations and therefore the
number of processors that can be kept busy.
In the previous section, we have shown that the
number of intermediate and final explanations can
vary wildly depending on the structure of the
domain. We now examine properties of all the
domains examined so far, real and synthetic.
This indicates that when more explanations are
possible according to the grammar, more work is
required, therefore more threads can be kept busy,
and a greater speedup is achievable. However, the
converse is also true. Fewer explanations in a domain
means that less work needs to be done, and for small
enough problems there will be no significant gain in
the run time for a parallel implementation. Therefore, to help in real-world deployment, we need to be
able to identify when a parallel implementation is
worth the cost.
To identify this, figure 12 is a second scatter plot
graphing speedup achieved with 12 threads against
the base run time with 1 thread for each of the problem domains. It shows that for runs that take longer
than around 5 seconds, we achieve 10-fold speedup,
very close to the ideal 12-fold speedup, making parallelism worthwhile. For shorter runs, there is much
less benefit to the multithreaded implementation.
Our analysis also suggests that for real-world
domains with plan grammars with predominately
LAST-RIGHT or FIRST-MID structure (where both the
causal structure of the plan and the CCG grammar’s
anchors act to reduce the number of explanations)
parallelism will be less helpful. The amount of work
required will already be reduced by the causal and gramatical constraints. Domains with grammars that do
not use these tools to constrain the number of explanations will see significant benefits from parallelism.
This article has shown that parallelization using a
work-stealing scheduling regime can be usefully
applied to significantly speed up the processing of the
ELEXIR plan-recognition system. The multithreaded
implementation discussed in this article allows us to
0.0 0.5 1.0 1. 5 2.0 2. 5 3.0
Figure 10. Run Time Versus Total Explanations for All Domains.
Figure 11. Speedup of Synthetic Domain Problems
with Increasing Number of Threads.
The data points for the FIRST-LEFT and LAST-MID, as well as the FIRST-MID
and LAST-RIGHT series overlap extremely closely.
0 5 10 15 20 25 30 35
Figure 12. Speedup versus Run Time for All Domains.