36 AI MAGAZINE
First, the classification algorithms tended to cluster
the data differently (figure 4), making it difficult
to consistently draw conclusions.
Second, although CART tended to outperform the
other two approaches in prediction accuracy, it was
less stable than FFT. The variables CART used at the
first three levels of the decision tree to cluster the
data differed when 95 percent of the data were used
for training rather than 65 percent, which was not
the case for FFT.
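One way to make this notion of stability concrete is to compare the sets of variables that two fitted trees use in their first three levels. The sketch below is only an illustration under stated assumptions: the nested-dictionary tree format, the function names, and the two example trees are hypothetical and are not the models fitted in the study.

```python
def variables_in_top_levels(tree, max_depth=3):
    """Collect the split variables used in the first `max_depth` levels."""
    found = set()
    frontier = [(tree, 1)]
    while frontier:
        node, depth = frontier.pop()
        # Skip missing children, leaves (no "var" key), and deeper levels.
        if node is None or "var" not in node or depth > max_depth:
            continue
        found.add(node["var"])
        frontier.append((node.get("left"), depth + 1))
        frontier.append((node.get("right"), depth + 1))
    return found


def split_overlap(tree_a, tree_b, max_depth=3):
    """Jaccard overlap of top-level split variables (1.0 = identical sets)."""
    a = variables_in_top_levels(tree_a, max_depth)
    b = variables_in_top_levels(tree_b, max_depth)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical trees standing in for models trained on 95 percent and
# 65 percent of the data; these are NOT the trees from the study.
tree_95 = {
    "var": "age",
    "left": {"var": "speed"},
    "right": {"var": "display"},
}
tree_65 = {
    "var": "conscientiousness",
    "left": {"var": "age"},
    "right": {"var": "speed"},
}

overlap = split_overlap(tree_95, tree_65)
```

An overlap near 1 would suggest that the top of the tree is insensitive to the training fraction, while a low overlap would correspond to the kind of instability described above.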
Third, the FFT approach could be seen as a more
robust decision model because of its relatively
higher prediction accuracy and stable clusters in
scenarios where training data were limited. For large
training sets, CART tends to outperform the other
approaches in terms of prediction accuracy and model
stability. However, its model becomes less robust as
the number of training data points decreases, although
its predictive accuracy remains relatively high.
Also, the difference in the way these algorithms
work can cause difficulty in the interpretation of the
results. For instance, the way the FFT method trains
a decision tree makes it impossible to compare the
clusters of multiple output classes together in one
single representation. The FFT approach requires
that a designer adopt a one-versus-all strategy, which
means comparing a large number of varied
decision tree representations to account for all the
class labels before arriving at a conclusion. For this
specific example, three times as many decision trees
had to be created for the FFT approach as for the
CART approach, with an added intermediate
classification interpretation step that introduces
ambiguity in the results.
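The bookkeeping behind a one-versus-all strategy can be sketched as follows. This is a minimal illustration, assuming three output classes (consistent with the threefold increase in trees noted above); the class names "A", "B", and "C" and the helper function are hypothetical.

```python
def one_versus_all_targets(labels):
    """Return one binary target vector per class: 1 for that class, 0 otherwise."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


# Hypothetical class labels, not the labels used in the study.
labels = ["A", "B", "C", "A", "C"]
targets = one_versus_all_targets(labels)

# Under one-versus-all, each binary target needs its own binary tree,
# whereas a multiclass learner can use a single tree for all classes.
n_binary_trees = len(targets)
n_multiclass_trees = 1
```

Each binary tree only answers "this class or not," so the per-class outputs still have to be reconciled into a single class decision, which is the intermediate interpretation step mentioned above.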
Given these results, we elected to continue the
analysis with CART as it was the best algorithm in
terms of our four criteria of (1) strong prediction
accuracy, (2) straightforward model interpretability
and explainability, (3) high stability and robustness,
and (4) fast (enough) learning capability.
Contextual Cues from ML Analyses
As discussed previously, the traditional approach
using top-down, hypothesis-driven experimental
methods via an ANOVA led to the conclusion that neither
the displays nor the speed of the car (the vehicle
attributes) had any global effect on decision times and
that the only individual attributes that were
statistically significant were age and conscientiousness
(figure 5a). However, an ML approach lets us frame the
problem differently: a bottom-up, data-driven approach
can first determine which segments of the population
are most affected by the designs in question, and a
hypothesis-driven statistical model (for example, a
within-subjects ANOVA) can then be developed on those
clusters (figure 5b).
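The two-stage idea (segment first, test second) can be illustrated with a small sketch. Several simplifications are assumed here: the data are made up purely for illustration, the age threshold of 50 is hypothetical, and a plain one-way (between-groups) ANOVA F statistic stands in for the within-subjects ANOVA mentioned above.

```python
from collections import defaultdict
from statistics import mean


def one_way_f(groups):
    """One-way ANOVA F statistic for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def f_by_segment(rows, segment_of):
    """Group rows into segments, then compare display conditions in each."""
    by_segment = defaultdict(lambda: defaultdict(list))
    for row in rows:
        by_segment[segment_of(row)][row["display"]].append(row["decision_time"])
    return {seg: one_way_f(list(conds.values()))
            for seg, conds in by_segment.items()}


# Made-up data for illustration only; NOT results from the study.
rows = [
    {"age": age, "display": display, "decision_time": t}
    for age, display, times in [
        (30, "A", [1, 2, 3]), (30, "B", [1, 2, 3]),
        (60, "A", [1, 2, 3]), (60, "B", [3, 4, 5]),
    ]
    for t in times
]

# Hypothetical segmentation rule, e.g. one read off a decision tree split.
result = f_by_segment(rows, lambda r: "older" if r["age"] >= 50 else "younger")
```

In this toy data the display has no effect in the younger segment and a clear effect in the older one, which is the kind of segment-level difference that a single pooled test can dilute.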
This data-driven approach to the problem (figure 5b)
provides us with the flexibility to account for individual
[Figure: decision tree panels with prediction accuracies of 62%, 67%, and 51%. The bold arrows indicate those paths that predict people who depend on the car-mounted display for information.]