Estimating response propensities in nonprobability surveys using machine learning weighted models.
Identifiers
URI: https://hdl.handle.net/10481/104747
Author
Ferri García, Ramón; Rueda-Sánchez, Jorge Luis; Rueda García, María del Mar; Cobo Rodríguez, Beatriz
Publisher
Elsevier
Subject
Propensity score adjustment; Design weights; Nonprobability samples
Date
2024
Bibliographic reference
Ferri, Ramón; Rueda, Jorge Luis; Rueda, María; Cobo, Beatriz. Estimating response propensities in nonprobability surveys using machine learning weighted models. Mathematics and Computers in Simulation, 225 (2024), 779-793. https://doi.org/10.1016/j.matcom.2024.06.012.
Sponsorship
This work is part of grant PDC2022-133293-I00 funded by MCIN/AEI/10.13039/501100011033 and the European Union "NextGenerationEU"/PRTR, and partially funded by Consejería de Universidad, Investigación e Innovación (C-EXP-153-UGR23, Andalusia, Spain), Plan Propio de Investigación y Transferencia (PPJIA2023-030, University of Granada) and IMAG-Maria de Maeztu CEX2020-001105-M/AEI/10.13039/501100011033. The second author has an FPI grant from Ministerio de Educación y Ciencia (PRE2022-103200) associated with the aforementioned IMAG-Maria de Maeztu funding. The authors thank Kenneth C. Chu (Statistics Canada) and Jean-François Beaumont (Statistics Canada) for their assessment of the application of the TrIPW algorithm, including the R package used to perform the simulations. Funding for open access charge: Universidad de Granada / CBUA.
Abstract
Propensity Score Adjustment (PSA) is a widely accepted method for reducing selection bias
in nonprobability samples. In this approach, the (unknown) response probability of each
individual in a nonprobability sample is estimated using a reference probability sample. In
this way, the researcher obtains a representation of the target population that reflects the
differences (for a set of auxiliary variables) between the population and the nonprobability
sample, from which response probabilities can be estimated.
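The Python sketch below illustrates this idea under simplified, hypothetical conditions (simulated data, a plain logistic classifier, and made-up variable names); it is not the authors' implementation, only a minimal illustration of estimating propensities by pooling the nonprobability and reference samples and then applying inverse probability weighting.

```python
# Minimal PSA sketch: all data, sample sizes and variable names are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_np, n_ref = 500, 1000

# Auxiliary variables observed in both samples; the study variable y_np is
# available only in the nonprobability sample.
X_np = rng.normal(loc=0.5, size=(n_np, 2))    # nonprobability sample
X_ref = rng.normal(loc=0.0, size=(n_ref, 2))  # reference probability sample
y_np = 2.0 + X_np @ np.array([1.0, -0.5]) + rng.normal(size=n_np)

# Pool both samples; the indicator z marks membership in the nonprobability sample.
X = np.vstack([X_np, X_ref])
z = np.concatenate([np.ones(n_np), np.zeros(n_ref)])

# Model the propensity of belonging to the nonprobability sample.
clf = LogisticRegression().fit(X, z)
pi_hat = clf.predict_proba(X_np)[:, 1]

# Inverse probability weighting estimator of the population mean of y.
w = 1.0 / pi_hat
print(np.sum(w * y_np) / np.sum(w))
```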
Auxiliary probability samples are usually produced by surveys with complex sampling
designs, so the use of design weights is crucial for calculating response probabilities
accurately. When a linear model is used for this task, maximising a pseudo log-likelihood
function that involves the design weights yields consistent estimates for the inverse
probability weighting estimator. However, little is known about how design weights may
benefit the estimates when techniques such as machine learning classifiers are used instead.
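As a rough illustration of how design weights can enter the modelling step, the sketch below (continuing the hypothetical setup above) passes weights of 1 for the nonprobability units and the reference sample's design weights as case weights, both to a logistic model and to a machine learning classifier. This is one common way of approximating a design-weighted fit and is not necessarily the exact pseudo log-likelihood studied in the paper.

```python
# Sketch of using design weights in the propensity modelling step.
# d_ref (reference-sample design weights) and all data are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
n_np, n_ref = 500, 1000
X_np = rng.normal(loc=0.5, size=(n_np, 2))
X_ref = rng.normal(loc=0.0, size=(n_ref, 2))
d_ref = rng.uniform(1.0, 50.0, size=n_ref)   # design weights of the reference units

X = np.vstack([X_np, X_ref])
z = np.concatenate([np.ones(n_np), np.zeros(n_ref)])
case_w = np.concatenate([np.ones(n_np), d_ref])  # weight 1 for nonprobability units

# Linear (logistic) model fitted with the design weights as case weights.
logit = LogisticRegression().fit(X, z, sample_weight=case_w)

# A machine learning classifier can receive the same case weights; how this
# affects the resulting propensities is precisely what the study examines.
gbm = GradientBoostingClassifier().fit(X, z, sample_weight=case_w)

pi_logit = logit.predict_proba(X_np)[:, 1]
pi_gbm = gbm.predict_proba(X_np)[:, 1]
```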
This study investigates the behaviour of Propensity Score Adjustment with machine
learning classifiers, depending on whether weights are used in the modelling step. A
theoretical approximation to the problem is presented, together with a simulation study
highlighting the properties of estimators that use different types of weights in the
propensity modelling step.