We employ an aggregate of term matchers (Murdock et al. 2012a)
to match pairs of assertions. Each
term matcher posts a confidence value on the degree
of match between two assertions based on its own
resource for determining equivalence. For example, a
WordNet-based term matcher considers terms in the
same synset to be equivalent, and a Wikipedia-redirect-based term matcher considers terms with a redirect link between them in Wikipedia to be a match.
The dotted line between Parkinson disease and
Parkinson’s disease in figure 5 is posted by the UMLS-based
term matcher, which considers variants for the same
concept to be equivalent.
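The aggregation of term matchers described above can be sketched as follows. This is a minimal illustration, not the system's implementation: the equivalence tables are tiny hand-written stand-ins for WordNet synsets and UMLS variants, the confidence values are invented, and combining the matchers by taking the maximum confidence is an assumption (the text does not say how the posted confidences are combined). All names (`make_group_matcher`, `aggregate_match`) are hypothetical.

```python
from typing import Callable, FrozenSet, List

# A term matcher takes two terms and returns a confidence in [0, 1].
TermMatcher = Callable[[str, str], float]


def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(term.lower().split())


def make_group_matcher(groups: List[FrozenSet[str]], confidence: float) -> TermMatcher:
    """Build a matcher from equivalence groups (a toy stand-in for a real resource)."""
    def match(a: str, b: str) -> float:
        a, b = normalize(a), normalize(b)
        if a == b:
            return 1.0  # identical terms always match
        for group in groups:
            if a in group and b in group:
                return confidence  # same group in this resource
        return 0.0  # this resource has no evidence of a match
    return match


# Hypothetical resources; real matchers would query WordNet, Wikipedia, or UMLS.
synset_matcher = make_group_matcher([frozenset({"tremor", "shaking"})], 0.8)
variant_matcher = make_group_matcher(
    [frozenset({"parkinson disease", "parkinson's disease"})], 0.9
)
MATCHERS = [synset_matcher, variant_matcher]


def aggregate_match(a: str, b: str, matchers: List[TermMatcher]) -> float:
    """Combine matcher confidences; taking the maximum is an assumed policy."""
    return max((m(a, b) for m in matchers), default=0.0)
```

With these toy tables, `aggregate_match("Parkinson disease", "Parkinson's disease", MATCHERS)` is driven by the variant matcher alone, mirroring the way a single resource can post a match between two assertions.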
Confidence and Belief
Once the assertion graph is constructed, and some
questions and answers are posted, there remains the
problem of confidence estimation. We develop multiple models of inference to address this step.
One approach to the problem of inferring the correct
hypothesis from the assertion graph is probabilistic
inference over a graphical model (Pearl 1988). We refer
to the component that does this as the belief engine.
Although the primary goal of the belief engine is to
infer confidences in hypotheses, it has a secondary goal of inferring belief in unknown nodes that are
not hypotheses. These nodes may be
important intermediate steps toward an answer; when
they are assigned high confidence, the main loop
knows to give them high priority for subquestion asking. Therefore, the belief engine needs to
assign a confidence to each node, not just to hypotheses.
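To make the idea of assigning a confidence to every node concrete, the sketch below runs a single forward noisy-OR pass over a small directed acyclic graph. This is a deliberately simplified stand-in for probabilistic inference over a graphical model, not the belief engine itself; the noisy-OR combination, the edge strengths, and the function name `propagate_beliefs` are all assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (source, target, strength in [0, 1])


def propagate_beliefs(priors: Dict[str, float], edges: List[Edge]) -> Dict[str, float]:
    """Assign a belief to every node using one forward noisy-OR pass.

    Nodes listed in `priors` keep their given belief; every other node
    receives 1 - prod(1 - belief(parent) * strength) over its parents.
    """
    parents: Dict[str, List[Tuple[str, float]]] = {}
    nodes = set(priors)
    for source, target, strength in edges:
        parents.setdefault(target, []).append((source, strength))
        nodes.update((source, target))

    # Map each node to its predecessors so that parents are visited first.
    graph = {n: [s for s, _ in parents.get(n, [])] for n in nodes}
    belief: Dict[str, float] = {}
    for node in TopologicalSorter(graph).static_order():
        if node in priors:
            belief[node] = priors[node]
            continue
        p_false = 1.0
        for source, strength in parents.get(node, []):
            p_false *= 1.0 - belief[source] * strength
        belief[node] = 1.0 - p_false
    return belief


# Hypothetical three-node chain: evidence -> intermediate -> hypothesis.
beliefs = propagate_beliefs(
    priors={"evidence": 1.0},
    edges=[("evidence", "intermediate", 0.8), ("intermediate", "hypothesis", 0.5)],
)
# Rank the non-hypothesis, non-evidence nodes as candidates for subquestions.
candidates = sorted(
    (n for n in beliefs if n not in {"evidence", "hypothesis"}),
    key=beliefs.get,
    reverse=True,
)
```

Because the pass returns a belief for every node, the intermediate node can be ranked alongside the hypothesis, which is the property the main loop would rely on when choosing where to ask subquestions.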
To execute the belief engine, we first make a working copy of the assertion graph that we call the inference graph. A separate graph is used so that we can
make changes without losing information that might
Figure 4. The Emerald.