Geographically Skewed Recruitment and COVID-19 Seroprevalence Estimates: A Cross-Sectional Serosurveillance Study and Mathematical Modelling Analysis


Tyler Brown, Pablo Martinez de Salazar Munoz, Bridget Bunda, Ellen K. Williams, David Bor, James S. Miller, Amir Mohareb, Julia Thierauf, Wenxin Yang, Julian Villalba, Vivek Naranbai, Wilfredo Garcia Beltran, Tyler E. Miller, Doug Kress, Kristen Stelljes, Keith Johnson, Dan Larremore, Jochen Lennerz, A. John Iafrate, Satchit Balsari, Caroline Buckee, Yonatan Grad


Objectives Convenience sampling is an imperfect but important tool for seroprevalence studies. For COVID-19, local geographic variation in cases or vaccination can confound studies that rely on the geographically skewed recruitment inherent to convenience sampling. The objectives of this study were (1) to quantify how geographically skewed recruitment influences SARS-CoV-2 seroprevalence estimates obtained via convenience sampling and (2) to develop new methods that use Global Positioning System (GPS)-derived foot traffic data to measure and minimise bias and uncertainty due to geographically skewed recruitment.
Design We used data from a local convenience-sampled seroprevalence study to map the geographic distribution of study participants’ reported home locations and compared this to the geographic distribution of reported COVID-19 cases across the study catchment area. Using a numerical simulation, we quantified bias and uncertainty in SARS-CoV-2 seroprevalence estimates obtained under different geographically skewed recruitment scenarios. We used GPS-derived foot traffic data to estimate the geographic distribution of participants for different recruitment locations and used these data to identify recruitment locations that minimise bias and uncertainty in the resulting seroprevalence estimates.
Results The geographic distribution of participants in convenience-sampled seroprevalence surveys can be strongly skewed towards individuals living near the study recruitment location. Uncertainty in seroprevalence estimates increased when neighbourhoods with higher disease burden or larger populations were undersampled. Failure to account for undersampling or oversampling across neighbourhoods also resulted in biased seroprevalence estimates. GPS-derived foot traffic data correlated with the geographic distribution of serosurveillance study participants.
Conclusions Local geographic variation in seropositivity is an important concern in SARS-CoV-2 serosurveillance studies that rely on geographically skewed recruitment strategies. Using GPS-derived foot traffic data to select recruitment sites and recording participants’ home locations can improve study design and interpretation.
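
The effect of geographically skewed recruitment on bias and uncertainty can be illustrated with a small simulation. The sketch below is not the study's model; it is a minimal illustration under stated assumptions: three hypothetical neighbourhoods with made-up populations and seroprevalences, participants drawn according to hypothetical recruitment weights, a perfect assay, and a simple population-share reweighting that uses participants' home neighbourhoods. The function names `true_prevalence` and `simulate_estimates` are illustrative only.

```python
import random
import statistics


def true_prevalence(populations, prevalences):
    """Population-weighted seroprevalence across all neighbourhoods."""
    total = sum(populations)
    return sum(n * p for n, p in zip(populations, prevalences)) / total


def simulate_estimates(populations, prevalences, recruit_weights,
                       n_participants=500, n_sims=500, seed=0):
    """Simulate repeated surveys; return (naive, reweighted) estimate lists.

    recruit_weights[i] is the (hypothetical) chance that a participant
    lives in neighbourhood i, e.g. as implied by distance to the site.
    """
    rng = random.Random(seed)
    k = len(populations)
    total = sum(populations)
    pop_shares = [n / total for n in populations]
    naive, reweighted = [], []
    for _ in range(n_sims):
        # Draw each participant's home neighbourhood from the recruitment skew.
        homes = rng.choices(range(k), weights=recruit_weights, k=n_participants)
        sampled = [0] * k
        positive = [0] * k
        for h in homes:
            sampled[h] += 1
            # Assumption: perfect test, serostatus depends only on neighbourhood.
            if rng.random() < prevalences[h]:
                positive[h] += 1
        # Naive estimate ignores where participants live.
        naive.append(sum(positive) / n_participants)
        # Reweighted estimate: neighbourhood-level prevalence weighted by
        # population share, renormalised over neighbourhoods actually sampled.
        covered = [i for i in range(k) if sampled[i] > 0]
        share_covered = sum(pop_shares[i] for i in covered)
        reweighted.append(
            sum(pop_shares[i] * positive[i] / sampled[i] for i in covered)
            / share_covered
        )
    return naive, reweighted


# Hypothetical inputs: the largest, highest-burden neighbourhood is
# undersampled in the "skewed" scenario.
populations = [10_000, 30_000, 60_000]
prevalences = [0.05, 0.10, 0.30]
scenarios = {
    "skewed": [0.6, 0.3, 0.1],
    "proportional": [0.1, 0.3, 0.6],
}

print(f"true seroprevalence: {true_prevalence(populations, prevalences):.3f}")
for label, weights in scenarios.items():
    naive, reweighted = simulate_estimates(populations, prevalences, weights)
    print(
        f"{label}: naive mean={statistics.mean(naive):.3f}, "
        f"reweighted mean={statistics.mean(reweighted):.3f}, "
        f"reweighted sd={statistics.stdev(reweighted):.3f}"
    )
```

With these made-up inputs, the naive estimate under skewed recruitment should fall well below the true value, whereas the reweighted estimate is approximately unbiased but noisier than under proportional recruitment, because the large, high-burden neighbourhood contributes few participants. This mirrors the qualitative pattern described in the Results, not its numerical findings.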

BMJ Open