In general, the effective range varies as a joint function of data density and the prevalence of the species.
For example, in regions where the species is correctly predicted to be present with very low probability,
residuals are uniformly small and the range of residual correlation tends to be larger.
For observational data, especially crowdsourced data,
bias assessment is important because biases incurred
during the data collection process may produce
regions where the estimated probabilities of occurrence are systematically high or low compared to the
observed rates of occurrence. It is important to know
where biased regions occur, how big the biased
regions are, and the strength of the bias.
To identify biased regions we interpolated the
residuals from June 24–July 1 across the same extent
as that used to visualize the effective range. Then we
looked for areas where the interpolated residuals
were substantially larger or smaller than expected by
chance alone. This was done by standardizing the
residuals and plotting only those regions where the
interpolated residuals were more than twice as large in magnitude
as their associated standard errors. Figure 4 (center)
shows regions with systematically large residuals.
Regions where the estimated occurrence rates are too
low are shown in red and regions where the estimated occurrence rates are too high are shown in blue.
Most of the contiguous regions of bias shown in figure 4 (center)
are relatively small, with larger regions in
Montana, Nevada, Texas, and Arkansas.
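The thresholding step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used to produce figure 4: the function name flag_biased_cells, the sample values, and the sign convention (residual = observed rate minus estimated rate, so that positive values mean the estimate is too low) are all hypothetical, and the spatial interpolation itself is assumed to have been done already.

```python
import numpy as np

def flag_biased_cells(residual, std_error, threshold=2.0):
    """Standardize interpolated residuals and flag cells beyond the threshold.

    Assumes residual = observed - estimated, so positive values mean the
    estimated occurrence rate is too low (a hypothetical convention).
    """
    residual = np.asarray(residual, dtype=float)
    std_error = np.asarray(std_error, dtype=float)

    # Cells with a non-positive standard error cannot be standardized.
    valid = std_error > 0
    z = np.full(residual.shape, np.nan)
    z[valid] = residual[valid] / std_error[valid]

    # Flag only valid cells whose standardized residual exceeds the threshold.
    biased = np.zeros(residual.shape, dtype=bool)
    biased[valid] = np.abs(z[valid]) > threshold
    return z, biased

# Hypothetical interpolated residuals and standard errors for four grid cells.
residual = np.array([0.06, -0.01, -0.09, 0.02])
std_error = np.array([0.02, 0.02, 0.03, 0.0])

z, biased = flag_biased_cells(residual, std_error)
estimate_too_low = biased & (z > 0)    # would be drawn in red
estimate_too_high = biased & (z < 0)   # would be drawn in blue
```

Cells that are not flagged would simply be left blank on the map, which is why only the contiguous regions of strong bias remain visible.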
Uncertainty estimates are required when making
statistical inference about distributional summaries.
For spatial prioritization we may want to evaluate
whether the difference in expected occurrence rates
between regions is larger than that expected by
chance. To do this we need uncertainty estimates for
the AdaSTEM occurrence rates. These uncertainties
can be approximated based on the variation across
bootstrap replicates. However, because the AdaSTEM
estimator ye(s) is computed as an average across
bootstrap replicates, the standard error σye(s) will
be smaller than the standard deviation across the bootstrap
replicates. If we assume that the bootstrap replicates
are independent, then σye(s) will be smaller by a factor of n(s)^(-1/2). For example, if n(s) = 50 the standard
error of the ensemble estimate will be approximately 14 percent of the standard error of estimates
across individual bootstrap replicates.
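A minimal sketch of this calculation follows, assuming independent replicates and the same number of replicates at every location. The replicate values are synthetic, and the helper names ensemble_standard_error and difference_z are hypothetical. The second helper illustrates one simple way to compare two regions, a z-score for the difference that assumes the two regional estimates are independent; it is not taken from the AdaSTEM analysis.

```python
import numpy as np

def ensemble_standard_error(replicates):
    """Mean, replicate spread, and standard error of the mean per location.

    replicates: array of shape (n_replicates, n_locations).
    Assumes the replicates are independent.
    """
    replicates = np.asarray(replicates, dtype=float)
    n = replicates.shape[0]
    mean = replicates.mean(axis=0)
    # Sample standard deviation across replicates (the conservative estimate).
    sd = replicates.std(axis=0, ddof=1)
    # Standard error of the average shrinks by a factor of n ** -0.5.
    se = sd / np.sqrt(n)
    return mean, sd, se

def difference_z(mean_a, se_a, mean_b, se_b):
    """z-score for the difference of two estimates assumed independent."""
    return (mean_a - mean_b) / np.sqrt(se_a ** 2 + se_b ** 2)

# Synthetic occurrence-rate estimates: 50 replicates at 4 locations.
rng = np.random.default_rng(0)
replicates = rng.normal(loc=0.3, scale=0.05, size=(50, 4))

mean, sd, se = ensemble_standard_error(replicates)
ratio = se / sd  # 50 ** -0.5, roughly 0.14, at every location
```

If the replicates are positively correlated, as bootstrap replicates drawn from the same data often are, the true standard error lies somewhere between se and sd, which is one reason to treat the replicate spread as a conservative bound.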
Figure 4 (right) shows the pointwise standard
errors computed across 50 bootstrap replicates. These
standard errors are conservative, that is, larger than
the actual standard errors for ye(s). Like the spatial
scale and bias diagnostics, the uncertainty estimates
vary jointly with data density and species prevalence.
For example, many data-dense regions have relatively high uncertainties. One reason for this follows
Figure 4. Assessing the Scale and Quality of the June 28 Barn Swallow AdaSTEM Distribution.
Left: Effective range of spatial correlation in kilometers. Center: Interpolated bias estimates shown in units of standard errors. Right:
Pointwise standard errors computed across 50 individual bootstrap replicates. These diagnostics capture how the ecological process
(that is, species occurrence), the data density, and the scale structure of AdaSTEM interact to affect the scale and quality of the estimated distribution. (Color version of figure presented in electronic version of AI Magazine).