Propensity score adjustment using machine learning classification algorithms to control selection bias in online surveys
Metadata
Publisher
Public Library of Science
Date
2020-04
Bibliographic reference
Ferri-García R, Rueda MdM (2020) Propensity score adjustment using machine learning classification algorithms to control selection bias in online surveys. PLoS ONE 15(4): e0231500. [https://doi.org/10.1371/journal.pone.0231500]
Sponsor
This study was partially supported by Ministerio de Economía y Competitividad, Spain [grant number MTM2015-63609-R] and, in terms of the first author, an FPU grant from the Ministerio de Ciencia, Innovación y Universidades, Spain. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Abstract
Modern survey methods may be subject to non-observable bias, from various sources. Among online surveys, for example, selection bias is prevalent, due to the sampling mechanism commonly used, whereby participants self-select from a subgroup whose characteristics differ from those of the target population. Several techniques have been proposed to tackle this issue. One such is Propensity Score Adjustment (PSA), which is widely used and has been analysed in various studies. The usual method of estimating the propensity score is logistic regression, which requires a reference probability sample in addition to the online nonprobability sample. The predicted propensities can be used for reweighting using various estimators. However, in the online survey context, there are alternatives that might outperform logistic regression regarding propensity estimation. The aim of the present study is to determine the efficiency of some of these alternatives, involving Machine Learning (ML) classification algorithms. PSA is applied in two simulation scenarios, representing situations commonly found in online surveys, using logistic regression and ML models for propensity estimation. The results obtained show that ML algorithms remove selection bias more effectively than logistic regression when used for PSA, but that their efficacy depends largely on the selection mechanism employed and the dimensionality of the data.
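
The PSA workflow summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' code or simulation design: the population, the selection mechanism, the sample sizes, the helper `fit_logistic`, and the weight formula (1 - p) / p are all assumptions chosen for illustration, and that formula is only one of the several reweighting estimators the abstract alludes to. The sketch stacks an online volunteer sample with a reference probability sample, fits a logistic regression for membership in the online sample, and reweights the volunteers by the estimated propensities.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: a single covariate x drives both the outcome y
# and the willingness to volunteer for the online survey.
N = 100_000
x = rng.normal(size=N)
y = 2.0 + x + rng.normal(scale=0.5, size=N)
true_mean = y.mean()

# Online (nonprobability) sample: self-selection probability increases with x.
volunteers = rng.random(N) < 1.0 / (1.0 + np.exp(-(-3.0 + x)))
x_vol, y_vol = x[volunteers], y[volunteers]

# Reference probability sample: simple random sample, so its design
# weights are equal and can be ignored when fitting the model.
ref_idx = rng.choice(N, size=2_000, replace=False)
x_ref = x[ref_idx]


def fit_logistic(X, z, n_iter=25):
    """Fit logistic regression by Newton-Raphson; X must include an intercept."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        gradient = X.T @ (z - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hessian, gradient)
    return beta


# Stack both samples; z = 1 marks the online sample, z = 0 the reference sample.
x_all = np.concatenate([x_vol, x_ref])
X_all = np.column_stack([np.ones_like(x_all), x_all])
z = np.concatenate([np.ones(len(x_vol)), np.zeros(len(x_ref))])
beta = fit_logistic(X_all, z)

# Estimated propensity of each volunteer to belong to the online sample.
X_vol = np.column_stack([np.ones_like(x_vol), x_vol])
propensity = 1.0 / (1.0 + np.exp(-X_vol @ beta))

# Assumed weighting choice: inverse odds of the estimated propensity.
weights = (1.0 - propensity) / propensity

est_naive = y_vol.mean()                      # unweighted volunteer mean
est_psa = np.average(y_vol, weights=weights)  # propensity-weighted mean

print(f"population mean:  {true_mean:.3f}")
print(f"unweighted mean:  {est_naive:.3f}")
print(f"PSA-weighted:     {est_psa:.3f}")
```

In principle, the logistic fit could be replaced by any classifier that returns class probabilities, which is the kind of substitution the study evaluates; the stacking and reweighting steps would stay the same under the assumptions above.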