<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel rdf:about="https://hdl.handle.net/10481/68300">
<title>Grupo: Diseño y análisis estadístico de encuestas por muestreo (FQM365)</title>
<link>https://hdl.handle.net/10481/68300</link>
<description/>
<items>
<rdf:Seq>
<rdf:li rdf:resource="https://hdl.handle.net/10481/104749"/>
<rdf:li rdf:resource="https://hdl.handle.net/10481/104748"/>
<rdf:li rdf:resource="https://hdl.handle.net/10481/104747"/>
<rdf:li rdf:resource="https://hdl.handle.net/10481/104746"/>
<rdf:li rdf:resource="https://hdl.handle.net/10481/104745"/>
</rdf:Seq>
</items>
<dc:date>2026-04-19T16:43:00Z</dc:date>
</channel>
<item rdf:about="https://hdl.handle.net/10481/104749">
<title>Evaluation of available techniques and its combinations to address selection bias in nonprobability surveys</title>
<link>https://hdl.handle.net/10481/104749</link>
<description>Evaluation of available techniques and its combinations to address selection bias in nonprobability surveys
Rueda-Sánchez, Jorge Luis; Ferri García, Ramón; Rueda García, María del Mar; Cobo Rodríguez, Beatriz
New survey methodologies that often produce nonprobability samples have recently become very important. However, estimates from nonprobability samples can be subject to selection bias, caused primarily by lack of coverage and by respondents' ability to decide whether or not to participate in the survey. In such cases, inclusion probabilities can be zero or unknown. When this happens, the estimators normally used in sample surveys are useless, and methods must be employed to reduce this bias. A wide variety of techniques exist for this purpose, depending on the auxiliary information available, but no study has determined which performs best overall. In this paper, we briefly explain most of these methods and conduct an extensive study comparing their performance. We study superpopulation models, which require knowledge of the auxiliary variables for all individuals in the population; linear calibration, which requires the population totals of the covariates; and several techniques that use a reference probability sample, such as propensity score adjustment, propensity-adjusted probability prediction, kernel weighting, statistical matching and doubly robust estimators. In addition, we compare their performance when using linear regression or XGBoost as the predictive model, with or without design weights in estimating the inclusion probabilities, and with or without prior variable selection. The study was performed on five different datasets to determine which technique provides accurate and reliable estimates from nonprobability samples.
</description>
</item>
<item rdf:about="https://hdl.handle.net/10481/104748">
<title>A new technique for handling non-probability samples based on model-assisted kernel weighting</title>
<link>https://hdl.handle.net/10481/104748</link>
<description>A new technique for handling non-probability samples based on model-assisted kernel weighting
Cobo Rodríguez, Beatriz; Rueda-Sánchez, Jorge Luis; Ferri García, Ramón; Rueda García, María del Mar
Non-probability samples are increasingly used for their low research costs and the speed with which results are obtained, but such surveys are expected to suffer from strong selection bias, caused by several mechanisms, that can ultimately lead to unreliable estimates of the population parameters of interest. The classical methods of statistical inference therefore do not apply, because the probabilities of inclusion in the sample are unknown for individual members of the population. Accordingly, in the last few decades, new possibilities for inference from non-probability sources have appeared.
Statistical theory offers different methods for addressing selection bias based on the availability of auxiliary information about other variables related to the main variable, which must have been measured in the non-probability sample. Two important approaches are inverse probability weighting and mass imputation; other methods can be regarded as combinations of these two.
This study proposes a new estimation technique for non-probability samples, which we call model-assisted kernel weighting and which is combined with machine learning techniques. The proposed technique is evaluated in a simulation study using data from a population, drawing samples under designs of varying complexity, to study the relative bias and mean squared error of the estimator under certain conditions. The results show that the proposed estimator has the smallest relative bias and mean squared error across the sample sizes considered, and that, in general, the kernel weighting methods reduced bias more than those based on inverse probability weighting. We also studied the behaviour of the estimators under different techniques, such as generalized linear regression versus machine learning algorithms, but were unable to find a method that is best in all cases. Finally, we studied the influence of the density function used, triangular or standard normal, and conclude that both work similarly.
A case study involving a non-probability sample collected during the COVID-19 lockdown was conducted to verify the real-world performance of the proposed methodology, obtain a better estimate, and control the value of the variance.
</description>
</item>
<item rdf:about="https://hdl.handle.net/10481/104747">
<title>Estimating response propensities in nonprobability surveys using machine learning weighted models.</title>
<link>https://hdl.handle.net/10481/104747</link>
<description>Estimating response propensities in nonprobability surveys using machine learning weighted models.
Ferri García, Ramón; Rueda-Sánchez, Jorge Luis; Rueda García, María del Mar; Cobo Rodríguez, Beatriz
Propensity Score Adjustment (PSA) is a widely accepted method for reducing selection bias in nonprobability samples. In this approach, the (unknown) response probability of each individual in a nonprobability sample is estimated using a reference probability sample. Thus, the researcher obtains a representation of the target population reflecting the differences (for a set of auxiliary variables) between the population and the nonprobability sample, from which response probabilities can be estimated.
Auxiliary probability samples are usually produced by surveys with complex sampling designs, meaning that the use of design weights is crucial for accurately calculating response probabilities. When a linear model is used for this task, maximising a pseudo log-likelihood function that involves the design weights provides consistent estimates for the inverse probability weighting estimator. However, little is known about how design weights may benefit the estimates when techniques such as machine learning classifiers are used.
This study investigates the behaviour of Propensity Score Adjustment with machine learning classifiers, subject to the use of weights in the modelling step. A theoretical approximation to the problem is presented, together with a simulation study highlighting the properties of estimators that use different types of weights in the propensity modelling step.
</description>
</item>
<item rdf:about="https://hdl.handle.net/10481/104746">
<title>Kernel Weighting for blending probability and non-probability survey samples</title>
<link>https://hdl.handle.net/10481/104746</link>
<description>Kernel Weighting for blending probability and non-probability survey samples
Rueda García, María Del Mar; Cobo Rodríguez, Beatriz; Rueda-Sánchez, Jorge Luis; Ferri García, Ramón; Castro Martín, Luis
In this paper we review methods proposed in the literature for combining a nonprobability and a probability sample, with the aim of obtaining an estimator with smaller bias and standard error than the estimators obtainable from the probability sample alone. We propose a new methodology based on the kernel weighting method, and discuss the properties of the new estimator both when there is only selection bias and when coverage and selection biases are both present. We perform an extensive simulation study to better understand the behaviour of the proposed estimator.
</description>
</item>
<item rdf:about="https://hdl.handle.net/10481/104745">
<title>Estimation of the distribution function and quantiles through data integration</title>
<link>https://hdl.handle.net/10481/104745</link>
<description>Estimation of the distribution function and quantiles through data integration
Cobo Rodríguez, Beatriz; Martínez, Sergio; Rueda García, María del Mar
Non-probability sampling is a relatively inexpensive data source, although it requires special treatment because the estimates may suffer from sample selection bias. In this paper, we consider methods for integrating a non-representative volunteer sample into a probability survey. We investigate several approaches to correcting non-probability sample selection bias in the estimation of the distribution function, combining the estimators of the distribution function that correct the selection bias with the design-unbiased estimators based on the probability sample. Our methodology for combining the volunteer and probability samples can be applied to other non-linear parameters. Empirical evidence of the improvements offered by the proposed methodology is provided in simulation settings.
</description>
</item>
</rdf:RDF>
