day). As another example, weapons, people, and natural disasters frequently occur in expressions that
describe the cause of injuries or damage, such as a
(grenade/sniper) killed three people or a (bomb/tornado)
caused massive damage.
To address this problem, we added a second layer
of bootstrapping, called meta-bootstrapping. After the
mutual bootstrapping process completed, the learned
lexicon entries were reevaluated and only the five
most trusted entries were retained. The mutual bootstrapping process was then restarted using the five
new category members as additional seed words. The
criterion for reevaluating each lexicon entry was
the number of different patterns that occurred
with the term.
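The nested procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the original implementation: the function names (mutual_bootstrap, meta_bootstrap), the pattern-scoring formula, the choice to count only the patterns selected during the inner run, and the toy extraction data are all hypothetical. Only the overall shape comes from the text: run the inner loop, keep the most trusted new entries (here ranked by how many different patterns extracted them), and restart with those entries added to the seeds.

```python
import math
from collections import defaultdict


def mutual_bootstrap(seeds, extractions, max_patterns=10):
    """Inner loop: alternately pick the best pattern and add its nouns."""
    by_pattern = defaultdict(set)
    for pattern, noun in extractions:
        by_pattern[pattern].add(noun)

    lexicon = set(seeds)
    chosen = []
    for _ in range(max_patterns):
        best, best_score = None, 0.0
        for pattern in sorted(by_pattern):
            if pattern in chosen:
                continue
            nouns = by_pattern[pattern]
            hits = len(nouns & lexicon)
            # Assumed scoring heuristic for illustration only:
            # fraction of known nouns, weighted by log of the hit count.
            score = (hits / len(nouns)) * math.log2(hits + 1)
            if score > best_score:
                best, best_score = pattern, score
        if best is None:
            break  # no remaining pattern extracts a known lexicon entry
        chosen.append(best)
        lexicon |= by_pattern[best]
    return lexicon, chosen, by_pattern


def meta_bootstrap(seeds, extractions, rounds=3, keep=5):
    """Outer loop: retain only the most trusted new entries, then restart."""
    trusted = set(seeds)
    for _ in range(rounds):
        lexicon, chosen, by_pattern = mutual_bootstrap(trusted, extractions)
        candidates = lexicon - trusted
        if not candidates:
            break

        def support(noun):
            # Number of different selected patterns that extracted the noun.
            return sum(1 for p in chosen if noun in by_pattern[p])

        ranked = sorted(candidates, key=lambda n: (-support(n), n))
        trusted |= set(ranked[:keep])
    return trusted


# Hypothetical toy data: (pattern, extracted noun) pairs.
extractions = [
    ("infected with <np>", "cholera"),
    ("infected with <np>", "ebola"),
    ("infected with <np>", "malaria"),
    ("outbreak of <np>", "ebola"),
    ("outbreak of <np>", "cholera"),
    ("outbreak of <np>", "violence"),
    ("died of <np>", "malaria"),
    ("died of <np>", "ebola"),
    ("died of <np>", "starvation"),
]

result = meta_bootstrap({"cholera"}, extractions, rounds=2, keep=1)
print(sorted(result))
```

In this toy run the inner loop alone would admit "violence" and "starvation" through the overly general patterns, whereas the outer loop retains only the entries supported by several patterns. The nesting also makes the cost visible: every outer round repeats the entire inner bootstrapping process.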
The meta-bootstrapping solution was somewhat ad hoc,
and it was expensive because the learning process
now involved nested bootstrapping processes. Since
then, better solutions have been found that induce
semantic dictionaries based on multiple contextual
patterns, rather than a single pattern, and that can
detect semantic drift during bootstrapping.
Nevertheless, the mutual bootstrapping idea at the heart of
this work has proven to be useful in a variety of
subsequent research efforts, including research on
semantic lexicon induction and other tasks, as we
discuss in the next section.
Subsequent Work
In the years since our mutual bootstrapping paper, a
wide variety of related research has appeared. Here,
we present a brief overview of some of the most
closely related work that has emerged, with the
caveat that this summary aims to highlight different
avenues of follow-on work and, as such, is not
intended to be a comprehensive literature survey.
Learning with Multiple Views
An important aspect of the mutual bootstrapping
algorithm is that it uses two facets of the data, the
noun phrases and their contexts, to learn from a
small initial set of seeds. The idea of learning from
multiple knowledge sources also arose contemporaneously in Collins and Singer’s (1999) work on bootstrapped learning for named entity recognition and
in Blum and Mitchell’s (1998) work on cotraining.
[Figure 1: The Mutual Bootstrapping Process. Diagram labels: Patterns; Lexicon (cholera, flu, listeria, measles, tuberculosis); Best Pattern (infected with <np>); Best Nouns (ebola, malaria, plague, pneumonia, tularemia).]
Collins and Singer’s procedure for named entity