Matched Mass Imputation for Survey Data Integration

Jeremy Flood, Sayed A. Mostafa

Research output: Contribution to journal, Article, peer-review

Abstract

Analysis of nonprobability survey samples has gained much attention in recent years because such samples are widely available and response rates are declining in their costly probability-sample counterparts. Still, valid population inference cannot be drawn from nonprobability samples without additional information, which typically takes the form of a smaller survey sample with a shared set of covariates. In this paper, we propose the matched mass imputation (MMI) approach as a means of integrating data from probability and nonprobability samples when common covariates are present in both samples but the variable of interest is available only in the nonprobability sample. The proposed approach borrows strength from the ideas of statistical matching and mass imputation to provide robustness against potential nonignorable bias in the nonprobability sample. Specifically, MMI is a two-step approach: first, a novel application of statistical matching identifies a subset of the nonprobability sample that closely resembles the probability sample; second, mass imputation is performed using these matched units. Our empirical results, from simulations and a real data application, demonstrate the effectiveness of the MMI estimator under nearest-neighbor matching, which almost always outperformed other imputation estimators in the presence of nonignorable bias. We also explore the effectiveness of a bootstrap variance estimation procedure for the proposed MMI estimator.
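
The two-step idea described in the abstract can be sketched roughly as follows. This is an illustrative sketch only and not the authors' implementation: the function name `matched_mass_imputation`, the Euclidean nearest-neighbor rule, the linear outcome model, and the weighted-mean estimator are all assumptions chosen for simplicity, and the paper may use different matching and imputation choices.

```python
import numpy as np


def matched_mass_imputation(x_np, y_np, x_p, w_p):
    """Hypothetical two-step sketch: match, then mass-impute.

    x_np, y_np : covariates and outcome in the nonprobability sample
    x_p, w_p   : covariates and design weights in the probability sample
    Returns the estimated population mean and the matched row indices.
    """
    x_np = np.asarray(x_np, dtype=float)
    y_np = np.asarray(y_np, dtype=float)
    x_p = np.asarray(x_p, dtype=float)
    w_p = np.asarray(w_p, dtype=float)

    # Step 1 (assumed rule): for each probability-sample unit, find its
    # nearest nonprobability unit in Euclidean distance; the distinct
    # matches form the subset that resembles the probability sample.
    sq_dist = ((x_p[:, None, :] - x_np[None, :, :]) ** 2).sum(axis=2)
    matched = np.unique(sq_dist.argmin(axis=1))

    # Step 2 (assumed model): fit a linear outcome model on the matched
    # units only, then impute the outcome for every probability-sample unit.
    design_np = np.column_stack([np.ones(len(matched)), x_np[matched]])
    beta, *_ = np.linalg.lstsq(design_np, y_np[matched], rcond=None)
    design_p = np.column_stack([np.ones(len(x_p)), x_p])
    y_imputed = design_p @ beta

    # Weighted mean of the imputed values using the design weights.
    estimate = float(np.sum(w_p * y_imputed) / np.sum(w_p))
    return estimate, matched
```

Restricting the outcome model to matched units is what distinguishes this sketch from plain mass imputation, which would fit the model on the full nonprobability sample. Variance estimation, such as the bootstrap procedure mentioned in the abstract, is not covered here.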

Original language: English
Pages (from-to): 332-352
Number of pages: 21
Journal: Journal of Data Science
Volume: 23
Issue number: 2
State: Published - Apr 2025

Keywords

  • data integration
  • mass imputation
  • nonignorable missingness
  • nonprobability samples
  • statistical matching
