approximately 2.5 million checklists collected across
385 thousand unique locations. All models were
trained with 2.25 million checklists made across 360
thousand unique locations, with the remaining held
out for model evaluation. At small spatial scales, the
data density correlates with human population and
travel patterns. At larger spatial scales, observations
are most densely distributed in the United States, where
eBird originated, and sparser in Central and South
America, even in regions of high human density (see the
left panel, figure 1).
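The hold-out described above can be illustrated with a minimal sketch. It assumes the split was made by location, so that no location contributes checklists to both sets; the text implies this but does not state it. The function name, the field names `checklist_id` and `location_id`, and the toy data are all hypothetical.

```python
import random


def split_by_location(checklists, test_fraction, seed=0):
    """Hold out whole locations so no location appears in both sets.

    Assumption: the split is grouped by location, which the text
    implies but does not state.
    """
    locations = sorted({c["location_id"] for c in checklists})
    rng = random.Random(seed)
    rng.shuffle(locations)
    n_test = max(1, round(len(locations) * test_fraction))
    test_locations = set(locations[:n_test])
    train = [c for c in checklists if c["location_id"] not in test_locations]
    test = [c for c in checklists if c["location_id"] in test_locations]
    return train, test


# Toy data: 200 checklists spread evenly over 40 hypothetical locations.
checklists = [{"checklist_id": i, "location_id": i % 40} for i in range(200)]
train, test = split_by_location(checklists, test_fraction=0.1)
```

Grouping by location keeps repeat visits to the same site from appearing on both sides of the split, which would otherwise make the evaluation look better than it is.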
The spatiotemporal distributions presented here
were modeled as a spatial mixture of local temporal
trajectories. We estimated the trajectory within each
stixel using a binary response GAM as the class of
base model. The binary response indicates the presence or absence of a single bird species recorded on a
given search. The logit of the probability of occurrence was modeled as an additive function of the day
of the year and several other factors describing the
effort spent searching for birds. Seasonal variation is
captured by a smooth function of the day-of-year
covariate and fit with a penalized cyclic spline basis.
To account for variation in detection rates, we included
effort covariates for the amount of time spent on
a search, the distance traveled while searching, and
the number of observers in the search party. The time
of day was used to account for diurnal variation
in behavior, such as the higher detectability of birds during
their participation in the dawn chorus (Diefenbach
et al. 2007), which makes species more or less
conspicuous. The ensemble was created by partitioning
the study extent into square stixels measured
in units of degrees latitude and longitude with P =
200. The minimum sample size per base model was γ = 500.
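As a compact restatement of the base model described above, the occurrence probability within a stixel might be written as follows. The notation is hypothetical: y_i is the presence or absence of the species on search i, and the subscripts doy, tod, dur, dist, and obs stand for day of year, time of day, search duration, distance traveled, and number of observers. The effort terms are written as generic functions because the text does not say whether they enter as smooths or as linear terms.

```latex
\operatorname{logit}\,\Pr(y_i = 1)
  = \beta_0
  + f_{\mathrm{doy}}(\mathrm{doy}_i)
  + f_{\mathrm{tod}}(\mathrm{tod}_i)
  + f_{\mathrm{dur}}(\mathrm{dur}_i)
  + f_{\mathrm{dist}}(\mathrm{dist}_i)
  + f_{\mathrm{obs}}(\mathrm{obs}_i)
```

Here f_doy is the smooth seasonal term. The cyclic spline basis constrains it to join smoothly across the end of the year, so the estimated seasonal trajectory has no jump between late December and early January.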
The predictive performance of STEM and
AdaSTEM was compared using distribution estimates for Barn Swallow (Hirundo rustica) in Fink,
Damoulas, and Dave (2013). In these tests, AdaSTEM
outperformed STEM on all measures of predictive
performance in all 12 months of the year. These
results demonstrated the ability of AdaSTEM to take
advantage of the varying eBird observation density
by reducing bias in regions with high data density
and controlling variance in regions with low data
density.
Autumn Migration Estimates
To demonstrate how AdaSTEM can adapt to different
distributional dynamics across a range of extents and
scales, we estimated the distributions for Barn Swallow
(Hirundo rustica), Blackpoll Warbler (Setophaga
striata), and Black-throated Blue Warbler (Setophaga
caerulescens). These three species are all broadly distributed
migratory birds with very different autumn
migration strategies: different distribution locations,
distribution extents, and timing of movement.
To develop rangewide estimates of species’ distributions,
we selected the smallest stixel size necessary
to achieve at least half the maximum ensemble support,
P, across 90 percent of the Western Hemisphere.
Then we used this model to estimate one daily dis-
[Figure 2 image. Panels: Left Panel, Right Panel. Rows: (a), (b). Column labels: Data, AdaSTEM, STEM, Target.]
Figure 2. AdaSTEM Versus STEM Synthetic Data Experiment for
Uniform or Nonuniform Density of Observations and Single- or Multiscale Signal.
Left Panel: For the single-scale function both models perform comparably for both uniform (row a) and nonuniform (row b) data density.
Right Panel: In the presence of multiscale signal AdaSTEM clearly outperforms STEM when the density of observations is sufficient (row b)
to capture the small scale correlation. (Color version of figure presented in electronic version of AI Magazine).