toward worse grades in adaptive
inspections. The most important distinction, however, is between restaurants with minor violations (grades A
and B) and those posing considerable
health risks (grade C and worse).
nEmesis uncovers 11 venues in the latter category, whereas control finds
only 7, a 64 percent improvement.
All of our data, suitably anonymized,
is available for further analysis.
CDC studies show that each outbreak averages 17.8 afflicted individuals and 1.1 hospitalizations (CDC
2013). Therefore we estimate that
adaptive inspections saved 71 infections and 4.4 hospitalizations over the
three-month period. Since the Las
Vegas health department performs
more than 35,000 inspections annually, nEmesis can prevent over 9,126 cases of foodborne illness and 557 hospitalizations in Las Vegas alone. This is
likely an underestimate as an adaptive
inspection can catch the restaurant
sooner than a normal inspection. During that time, the venue continues to
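The arithmetic behind the three-month estimate can be sketched as follows. This is an illustrative reconstruction rather than the authors' own code. It assumes that each additional venue graded C or worse found by adaptive inspections (11, versus 7 for control) corresponds to one averted outbreak, and it multiplies that difference by the CDC per-outbreak averages quoted above. The function and constant names are hypothetical.

```python
# Figures quoted in the text.
ADAPTIVE_SERIOUS_VENUES = 11         # grade C or worse, found by nEmesis
CONTROL_SERIOUS_VENUES = 7           # grade C or worse, found by control
CASES_PER_OUTBREAK = 17.8            # CDC (2013) average afflicted individuals
HOSPITALIZATIONS_PER_OUTBREAK = 1.1  # CDC (2013) average hospitalizations


def estimate_prevented(adaptive, control, cases_per_outbreak, hosp_per_outbreak):
    """Return (extra venues, infections averted, hospitalizations averted).

    Assumption (not stated explicitly in the text): every additional
    high-risk venue caught by adaptive inspections averts one outbreak.
    """
    extra = adaptive - control
    return extra, extra * cases_per_outbreak, extra * hosp_per_outbreak


extra, infections, hospitalizations = estimate_prevented(
    ADAPTIVE_SERIOUS_VENUES,
    CONTROL_SERIOUS_VENUES,
    CASES_PER_OUTBREAK,
    HOSPITALIZATIONS_PER_OUTBREAK,
)
print(f"{extra} additional venues: about {infections:.1f} infections "
      f"and {hospitalizations:.1f} hospitalizations averted")
```

Under this assumption the four additional venues give roughly 71.2 infections and 4.4 hospitalizations, which agrees with the rounded three-month figures reported in the text. The annual extrapolation depends on inspection counts not given in this section, so it is not reproduced here.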
Adaptive inspections yield a number
of unexpected benefits. nEmesis alerted SNHD to an unpermitted seafood
establishment. This business was
flagged by nEmesis because it uses a
comprehensive list of food venues
independent of the permit database.
An adaptive inspection also discovered
a food handler working while sick with
an influenza-like disease. Finally, we
observed a reduced number of foodborne illness complaints from the public and subsequent investigations during the experiment. Between January
2, 2015, and March 31, 2015, SNHD
performed 5 foodborne illness investigations. During the same time frame
the previous year, SNHD performed 11
foodborne illness investigations. Over
the last 7 years, SNHD averaged 7.3
investigations during this three-month
time frame. It is likely that nEmesis
alerted the health district to food safety risks faster than traditional complaint channels, prior to an outbreak.
Given the ambiguity of online data,
it may appear hopeless to identify
problematic restaurants fully automatically.
However, we demonstrate that
nEmesis uncovers significantly more
problematic restaurants than current
inspection processes. This work is the
first to directly validate disease predictions
made from social media data. To
date, all research on modeling public
health from online data has measured
accuracy by correlating aggregate estimates
of the number of cases of disease
based on online data with aggregate
estimates based on traditional
data sources (Grassly, Fraser, and Garnett
2005; Brownstein, Wolfe, and
Mandl 2006; Ginsberg et al. 2008;
Golder and Macy 2011; Sadilek et al.
2013). By contrast, each prediction of
our model is verified by an inspection
following a well-founded professional
protocol. Furthermore, we evaluate
nEmesis in a controlled double-blind
experiment, where predictions are verified
on the order of hours.
Finally, this study also showed that
social-media-driven inspections can
discover health violations that could
never be found by traditional protocols, such as unlicensed venues. This
fact indicates that it may be possible to
adapt the nEmesis approach for identifying food safety problems in noncommercial venues, ranging from
school picnics to private parties. Identifying possible sources of foodborne
illness among the public could support more targeted and effective food
safety awareness campaigns.
The success of this study has led the
Southern Nevada Health District to
win a CDC grant to support the further development of nEmesis and its
permanent deployment statewide.
This research was partly funded by
NSF grants 1319378 and 1516340;
NIH grant 5R01GM108337-02; and
the Intel ISTC-PC.
Achrekar, H.; Gandhe, A.; Lazarus, R.; Yu,
S.; and Liu, B. 2012. Twitter Improves Seasonal
Influenza Prediction. In Proceedings of
the Fifth Annual International Conference on
Health Informatics. Setubal, Portugal: Institute
for Systems and Technologies of Information,
Control and Communication.
Anderson, R., and May, R. 1979. Population
Biology of Infectious Diseases: Part I. Nature
Attenberg, J., and Provost, F. 2010. Why
Label When You Can Search?: Alternatives
to Active Learning for Applying Human
Resources to Build Classification Models
Under Extreme Class Imbalance. In
Proceedings of the 16th ACM SIGKDD Conference on
Knowledge Discovery and Data Mining, 423–432.
New York: Association for Computing Machinery.
Brennan, S.; Sadilek, A.; and Kautz, H. 2013.
Towards Understanding Global Spread of
Disease from Everyday Interpersonal Interactions. In Proceedings of the 23rd International Joint Conference on Artificial Intelligence. Menlo Park, CA: AAAI Press.
Broniatowski, D. A., and Dredze, M. 2013.
National and Local Influenza Surveillance
Through Twitter: An Analysis of the 2012–2013
Influenza Epidemic. PLoS ONE 8(12):
e83672. doi: 10.1371/journal.pone.0083672.
Brownstein, J.; Wolfe, C.; and Mandl, K.
2006. Empirical Evidence for the Effect of
Airline Travel on Inter-Regional Influenza
Spread in the United States. PLoS Medicine
3(10): e401. dx.doi.org/10.1371/journal.
Brownstein, J. S.; Freifeld, C. C.; and Madoff,
L. C. 2009. Digital Disease Detection: Harnessing the Web for Public Health Surveillance. New England Journal of Medicine
360(21): 2153–2157.
CDC. 2013. Surveillance for Foodborne Disease Outbreaks United States, 2013: Annual
Report. Technical Report, Centers for Disease
Control and Prevention National Center for
Emerging and Zoonotic Infectious Diseases.
Atlanta, GA: Centers for Disease Control and Prevention.
Chawla, N.; Japkowicz, N.; and Kotcz, A.
2004. Editorial: Special Issue on Learning
from Imbalanced Data Sets. ACM SIGKDD
Explorations Newsletter 6(1): 1–6.
Chen, P.; David, M.; and Kempe, D. 2010.
Better Vaccination Strategies for Better People. In Proceedings of the 11th ACM Conference on Electronic Commerce, 179–188. New
York: Association for Computing Machinery.
Chunara, R.; Andrews, J.; and Brownstein, J.
2012. Social and News Media Enable Estimation of Epidemiological Patterns Early in
the 2010 Haitian Cholera Outbreak. The
American Journal of Tropical Medicine and
Hygiene 86(1): 39–45.
Cortes, C., and Vapnik, V. 1995. Support-Vector Networks. Machine Learning 20(3):
Culotta, A. 2010. Towards Detecting
Influenza Epidemics by Analyzing Twitter
Messages. Paper presented at the First Workshop on Social Media Analytics, July 25–28,
De Choudhury, M.; Gamon, M.; Counts, S.;
and Horvitz, E. 2013. Predicting Depression