Thresholding nonprobability units in combined data for efficient domain estimation
Author(s): Terrance D. Savitsky, Matthew R. Williams, Vladislav Beresovsky, Julie Gershunskaya
Subject(s): Social Sciences, Economy
Published by: Główny Urząd Statystyczny
Keywords: survey sampling; nonprobability sampling; data combining; quasi-randomization; thresholding units; Bayesian hierarchical modeling
Summary/Abstract: Quasi-randomization approaches estimate latent participation probabilities for units from a nonprobability / convenience sample. Estimation of participation probabilities for convenience units allows their combination with units from the randomized survey sample to form a survey-weighted domain estimate. One leverages convenience units for domain estimation under the expectation that estimation precision and bias will improve relative to solely using the survey sample; however, convenience sample units that are very different in their covariate support from the survey sample units may inflate estimation bias or variance. This paper develops a method to threshold or exclude convenience units to minimize the variance of the resulting survey-weighted domain estimator. We compare our thresholding method with other thresholding constructions in a simulation study for two classes of datasets based on the degree of overlap between survey and convenience samples on covariate support. We reveal that excluding convenience units that each express a low probability of appearing in both reference and convenience samples reduces estimation error.
Journal: Statistics in Transition. New Series
- Issue Year: 26/2025
- Issue No: 2
- Page Range: 1-19
- Page Count: 19
- Language: English