A new technique for handling non-probability samples based on model-assisted kernel weighting
Identifiers
URI: https://hdl.handle.net/10481/104748
Author
Cobo Rodríguez, Beatriz; Rueda-Sánchez, Jorge Luis; Ferri García, Ramón; Rueda García, María del Mar
Publisher
Elsevier
Subject
Model-assisted kernel; Kernel weighting; Employment; Confinement period; COVID-19
Date
2024
Bibliographic reference
Cobo, Beatriz (AC); Rueda, Jorge Luis; Ferri, Ramón; Rueda, María. A new technique for handling non-probability samples based on model-assisted kernel weighting. Mathematics and Computers in Simulation, 227 (2025), 272-281. 2024. (4.4, Q1 (2023)). https://doi.org/10.1016/j.matcom.2024.08.009.
Sponsorship
This work is part of grant PDC2022-133293-I00 funded by the MCIN/AEI/10.13039/501100011033, Spain and the European Union “NextGenerationEU”/PRTR and the grant FEDER C-EXP-153-UGR23 funded by the Consejería de Universidad, Investigación e Innovación and by the ERDF Andalusia Program 2021–2027, Spain. The research was also partially supported by IMAG-María de Maeztu CEX2020-001105-M/AEI/10.13039/501100011033 and PPJIA2023-030 of the University of Granada, Spain. Funding for open access charge: Universidad de Granada / CBUA.
Abstract
Non-probability samples are increasingly used because of their low research costs and the
speed with which results are obtained, but these surveys are expected to have strong
selection bias caused by several mechanisms that can eventually lead to unreliable estimates
of the population parameters of interest. The classical methods of statistical inference do
not apply because the probabilities of inclusion in the sample for individual members of the
population are not known. Therefore, in the last few decades, new possibilities of inference
from non-probability sources have appeared.
Statistical theory offers different methods for addressing selection bias based on the
availability of auxiliary information about other variables related to the main variable, which
must have been measured in the non-probability sample. Two important approaches are inverse
probability weighting and mass imputation. Other methods can be regarded as combinations of
these two approaches.
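The two approaches named above can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a single categorical auxiliary variable, a reference probability sample with known design weights, and propensities and outcome models estimated cell by cell. The function names `ipw_mean` and `mass_imputation_mean` and the toy data are hypothetical.

```python
import numpy as np

def ipw_mean(y_np, x_np, x_ref, d_ref):
    """Inverse probability weighting with cell-based propensities (illustrative)."""
    y_np, x_np, x_ref, d_ref = map(np.asarray, (y_np, x_np, x_ref, d_ref))
    weights = np.empty(len(y_np), dtype=float)
    for cell in np.unique(x_np):
        in_cell = x_np == cell
        # Population count of the cell, estimated from the reference sample.
        pop_size = d_ref[x_ref == cell].sum()
        # Estimated probability of taking part in the non-probability sample.
        propensity = in_cell.sum() / pop_size
        weights[in_cell] = 1.0 / propensity
    # Weighted (Hajek-type) mean of the observed outcomes.
    return np.sum(weights * y_np) / np.sum(weights)

def mass_imputation_mean(y_np, x_np, x_ref, d_ref):
    """Mass imputation with a cell-mean outcome model (illustrative)."""
    y_np, x_np, x_ref, d_ref = map(np.asarray, (y_np, x_np, x_ref, d_ref))
    # Outcome model fitted on the non-probability sample: one mean per cell.
    cell_means = {cell: y_np[x_np == cell].mean() for cell in np.unique(x_np)}
    # Predict the outcome for every reference unit (assumes every reference
    # cell also appears in the non-probability sample).
    y_hat = np.array([cell_means[cell] for cell in x_ref])
    return np.sum(d_ref * y_hat) / np.sum(d_ref)

# Toy data: group 0 is over-represented among the volunteers.
x_np = np.array([0, 0, 0, 1])
y_np = np.array([1.0, 1.0, 1.0, 0.0])
x_ref = np.array([0, 1])
d_ref = np.array([50.0, 50.0])

naive = y_np.mean()
ipw = ipw_mean(y_np, x_np, x_ref, d_ref)
mi = mass_imputation_mean(y_np, x_np, x_ref, d_ref)
```

In this toy setting the unweighted mean is pulled toward the over-represented group, while both adjusted estimators recover the balanced population mean. With one categorical covariate and cell-based models the two approaches coincide; with richer models they generally differ.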
This study proposes a new estimation technique for non-probability samples, which we call
model-assisted kernel weighting and which is combined with some machine learning
techniques. The proposed technique is evaluated in a simulation study using data from a
population and drawing samples using designs with varying levels of complexity, in order to
study the relative bias and mean squared error of the estimator under certain conditions. The
results show that the proposed estimator has the smallest relative bias and the smallest
mean squared error across the different sample sizes considered, and that,
in general, the kernel weighting methods reduced bias more than the methods based on inverse
weighting. We also studied the behavior of the estimators using different techniques, such as
generalized linear regression versus machine learning algorithms, but we were not able
to find a method that is the best in all cases. Finally, we study the influence of the density
function used, triangular or standard normal, and conclude that the two work similarly.
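As a rough illustration of how a kernel can turn propensity scores into pseudo-weights, the sketch below implements a generic kernel weighting scheme with the two density functions mentioned above. It is an assumption-laden example, not the model-assisted estimator proposed in the paper: the propensity scores are taken as already estimated by some model, each reference unit's design weight is shared among non-probability units in proportion to the kernel of their score distance, and the names `kernel_weights`, `triangular_kernel`, `normal_kernel`, the bandwidth value, and the data are all hypothetical.

```python
import numpy as np

def triangular_kernel(u):
    """Triangular density on [-1, 1]."""
    return np.maximum(1.0 - np.abs(u), 0.0)

def normal_kernel(u):
    """Standard normal density."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kernel_weights(score_np, score_ref, d_ref, kernel, bandwidth):
    """Distribute reference design weights over non-probability units.

    score_np  : estimated propensity scores of the non-probability units
    score_ref : estimated propensity scores of the reference units
    d_ref     : design weights of the reference units
    """
    score_np = np.asarray(score_np, dtype=float)
    score_ref = np.asarray(score_ref, dtype=float)
    d_ref = np.asarray(d_ref, dtype=float)
    # u[i, j]: scaled score distance between reference unit i and
    # non-probability unit j.
    u = (score_ref[:, None] - score_np[None, :]) / bandwidth
    k = kernel(u)
    row_sums = k.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0.0):
        raise ValueError("a reference unit has no neighbour within the bandwidth")
    # Each reference unit splits its design weight across similar units.
    shares = k / row_sums
    return (d_ref[:, None] * shares).sum(axis=0)

# Hypothetical scores, outcomes, and design weights.
score_np = np.array([0.2, 0.4, 0.6, 0.8])
y_np = np.array([1.0, 1.0, 0.0, 0.0])
score_ref = np.array([0.3, 0.5, 0.7])
d_ref = np.array([10.0, 20.0, 30.0])

w_tri = kernel_weights(score_np, score_ref, d_ref, triangular_kernel, bandwidth=0.5)
w_norm = kernel_weights(score_np, score_ref, d_ref, normal_kernel, bandwidth=0.5)

# Weighted means of the outcome under each kernel.
mean_tri = np.sum(w_tri * y_np) / np.sum(w_tri)
mean_norm = np.sum(w_norm * y_np) / np.sum(w_norm)
```

By construction the pseudo-weights add up to the total of the reference design weights, whichever kernel is used. The triangular kernel ignores units farther than one bandwidth away, whereas the normal kernel gives every unit some weight, so the bandwidth choice matters more for the former.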
A case study using a non-probability sample collected during the COVID-19 lockdown was
conducted to verify the performance of the proposed methodology in practice, obtaining
a better estimate and controlling the value of the variance.