fied as potential final answers to the punch line question (for example, the most likely diagnoses of a
patient’s problem). In some situations hypotheses
may be provided up front — a physician may have a
list of competing diagnoses and want to explore the
evidence for each. But in general the system needs to
identify these. Hypothesis nodes may be treated differently in later iterations. For instance, we may
attempt to do backward chaining from the hypotheses, asking Watson what things, if they were true of
the patient, would support or refute a hypothesis.
The process may terminate after a fixed number of
iterations or based on some other criterion like confidence in the hypotheses.
While hypothesis identification is part of WatsonPaths, it is not described in detail in this article. In
the system that generates the results we present in
this article, no hypothesis identification is necessary
because the multiple-choice answers are provided.
That system always does one iteration of expansion,
both forward from the identified factors and backward from the hypotheses, before stopping.
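The control flow described above (expand, infer, stop on an iteration or confidence criterion) can be sketched as follows. This is an illustrative stand-in, not the system's actual interface: `ask_watson`, the dict-based hypothesis store, and the noisy-OR combination are all assumptions.

```python
def explore(scenario_factors, hypotheses, ask_watson, max_iterations=1, threshold=0.9):
    """Expand hypotheses from scenario factors; stop after max_iterations
    or once some hypothesis is confident enough.

    hypotheses: dict mapping hypothesis name -> current confidence
    ask_watson: stand-in callable, factor -> {candidate answer: confidence}
    """
    for _ in range(max_iterations):
        # Forward expansion: ask Watson about each identified factor.
        for factor in scenario_factors:
            for name, conf in ask_watson(factor).items():
                # Treat each answer as an independent reason to believe
                # the hypothesis (noisy-OR style combination).
                prior = hypotheses.get(name, 0.0)
                hypotheses[name] = 1 - (1 - prior) * (1 - conf)
        # Backward expansion (asking what would support or refute each
        # hypothesis) is omitted from this sketch.
        if hypotheses and max(hypotheses.values()) >= threshold:
            break
    return dict(sorted(hypotheses.items(), key=lambda kv: -kv[1]))
```

In the multiple-choice setting described above, `hypotheses` would be pre-seeded with the provided answers and `max_iterations` set to 1.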
Hypothesis Confidence Refinement
As described so far, WatsonPaths' confidence in each
hypothesis depends on the strengths of the edges
leading to it, and since our primary relation (edge)
generator is Watson, the hypothesis confidence
depends heavily on the confidence of Watson’s
answers. Having good answer confidence depends on
having a representative set of question/answer pairs
with which to train Watson. The following question
arises: What can we do if we do not have a representative set of question/answer pairs, but we do have
training examples for entire scenarios (for example,
correct diagnoses associated with patient scenarios)?
To leverage the available scenario-level ground truth,
we have built machine-learning techniques to learn
a refinement of Watson’s confidence estimation that
produces better results when applied to the entire
scenario. We describe our techniques in the Learning
over Assertion Graphs section.
Assertion Graph
The core data structure used by WatsonPaths is the
assertion graph. Figure 3 explains this data structure,
along with the visualization that we commonly use
for it. Assertion graphs are defined as follows.
A statement is something that can be true or false
(though its state may not be known). Often we deal
with unstructured statements, which are natural language expressions like “A 63-year-old patient is sent
to the neurologist with a clinical picture of resting
tremor that began 2 years ago.” WatsonPaths also
allows for statements that are structured expressions,
namely, a predicate and arguments. Not all natural
language expressions can have a truth value. For
instance, the string “patient” cannot be true or false;
thus it does not fit into the semantics of an assertion
graph. WatsonPaths is charitable in interpreting
strings as if they had a truth value. For instance, the
default semantics of the string “low hemoglobin” is
the same as “patient has low hemoglobin.”
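The two statement forms, and the default charitable reading of a bare string, can be illustrated with a small sketch. The field names and the `interpret` helper are assumptions for illustration, not the article's actual schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Statement:
    text: str                            # natural language form
    predicate: Optional[str] = None      # set only for structured statements
    arguments: Tuple[str, ...] = ()      # arguments of the predicate

def interpret(finding: str) -> Statement:
    """Default charitable semantics for a bare string: 'low hemoglobin'
    is read as the truth-apt statement 'patient has low hemoglobin'."""
    return Statement(text=f"patient has {finding}")

# An unstructured statement and a structured one:
unstructured = interpret("low hemoglobin")
structured = Statement(text="Parkinson's causes resting tremor",
                       predicate="causes",
                       arguments=("Parkinson's", "resting tremor"))
```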
A relation is a named association between statements. Technically, relations are themselves statements and have a truth value. A relation has a
head, a tail, and a predicate; for instance, in medicine
we may say that “Parkinson’s causes resting tremor”
or “Parkinson’s matches Parkinsonism.” Typically we
are concerned with relations that may provide evidence for the truth of one statement given another.
Although some relations may have special meanings
in the probabilistic inference system, a common
semantics for a relation is the indicative one: “A indicates B” means that the truth of A
provides an independent reason to believe that B is true.
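One natural reading of "independent reasons to believe" is a noisy-OR combination of the contributions arriving over indicative edges. The exact combination rule used by the inference system is not given here; this function is a sketch of that reading.

```python
from math import prod

def combine_indicative(supports):
    """Combine independent indicative supports for a tail statement.

    supports: iterable of (head_confidence, edge_strength) pairs, each
    in [0, 1]. Each pair contributes belief head_confidence * edge_strength,
    and independent contributions combine noisy-OR style.
    """
    return 1 - prod(1 - c * w for c, w in supports)
```

For example, two fully confident heads, each over an edge of strength 0.5, yield a combined belief of 0.75 rather than 1.0.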
An assertion is a claim that some agent makes
about the truth of a statement (including a relation).
The assertion records the name of the agent and a
confidence value. Assertions may also record provenance information that explains how the agent came
to its conclusion. For the Watson question-answering
agent, this includes natural language passages that
provide evidence for the answer.
In the assertion graph, each node represents exactly one statement, and each edge represents exactly
one relation. Nodes and edges may have multiple
assertions attached to them, one for each agent that
has asserted that node or edge to be true.
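The structure just described (one statement per node, one relation per edge, multiple agent assertions per node or edge, with optional provenance) might be sketched as follows; the class and field names are illustrative, since the article does not give the implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Assertion:
    agent: str                  # who made the claim, e.g. "Watson"
    confidence: float           # how strongly the agent believes it
    provenance: List[str] = field(default_factory=list)  # e.g. evidence passages

@dataclass
class AssertionGraph:
    # Node key: statement text. Edge key: (head, predicate, tail).
    nodes: Dict[str, List[Assertion]] = field(default_factory=dict)
    edges: Dict[Tuple[str, str, str], List[Assertion]] = field(default_factory=dict)

    def assert_node(self, statement, assertion):
        """Record an agent's assertion about a statement node."""
        self.nodes.setdefault(statement, []).append(assertion)

    def assert_edge(self, head, predicate, tail, assertion):
        """Record an agent's assertion about a relation edge."""
        self.edges.setdefault((head, predicate, tail), []).append(assertion)
```

Each node or edge thus carries one assertion per agent, and for a question-answering agent the provenance list can hold the supporting passages.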
We often visualize assertion graphs by using a
node’s border width to represent the confidence of
the node, an edge’s width to represent the confidence
of the edge, and an edge’s gray level as the amount of
“belief flow” along that edge. Belief flow is described
later, but essentially it is how much the value of the
head influences the value of the tail. This depends
mostly on the confidences of the assertions on the edge and on its head node.
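The visual encoding described above (border width and edge width for confidence, gray level for belief flow) could be computed with a mapping like the following; the scale factors are invented for illustration.

```python
def node_style(confidence):
    """Map a node's confidence to its border width."""
    return {"border_width": 1 + 4 * confidence}

def edge_style(confidence, belief_flow):
    """Map an edge's confidence to its width and its belief flow to a
    gray level (darker gray = more belief flow)."""
    gray = int(255 * (1 - belief_flow))
    return {"width": 1 + 4 * confidence,
            "color": f"#{gray:02x}{gray:02x}{gray:02x}"}
```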
Scenario Analysis
The goal of scenario analysis is to identify information in the natural language narrative of the problem
scenario that is potentially relevant to solving the
problem. When human experts read the problem
narrative, they are trained to extract concepts that
match a set of semantic types relevant for solving the
problem. In the medical domain, doctors and nurses
identify semantic types like chief complaints, past
medical history, demographics, family and social his-
tory, physical examination findings, labs, and current
medications (Bowen 2006). Experts also generalize
from specific observations in a particular problem
instance to more general terms used in the domain
corpus. An important aspect of this information
extraction is to identify the semantic qualifiers asso-
ciated with the clinical observations (Chang, Bor-
Summer 2017 63