<rdf:RDF xmlns:rdf="http://www.openarchives.org/OAI/2.0/rdf/" xmlns:ow="http://www.ontoweb.org/ontology/1#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:ds="http://dspace.org/ds/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:doc="http://www.lyncode.com/xoai" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/rdf/ http://www.openarchives.org/OAI/2.0/rdf.xsd">
   <ow:Publication rdf:about="oai:digibug.ugr.es:10481/85892">
      <dc:title>Multi-Label Quantification</dc:title>
      <dc:creator>Moreo Fernández, Alejandro</dc:creator>
      <dc:creator>Aparicio, Manuel Francisco</dc:creator>
      <dc:creator>Sebastiani, Fabrizio</dc:creator>
      <dc:description>The work of A. Moreo and F. Sebastiani has been supported by the SoBigData++ project, funded by the European Commission (Grant 871042) under the H2020 Programme INFRAIA-2019-1, by the AI4Media project, funded by the European Commission (Grant 951911) under the H2020 Programme ICT-48-2020, and by the SoBigData.it and FAIR projects funded by the Italian Ministry of University and Research under the NextGenerationEU program; the authors’ opinions do not necessarily reflect those of the funding agencies. The work of M. Francisco has been supported by the FPI 2017 predoctoral programme, from the Spanish Ministry of Economy and Competitiveness (MINECO), grant BES-2017-081202.</dc:description>
      <dc:description>Quantification, variously called supervised prevalence estimation or learning to quantify, is the supervised learning task of generating predictors of the relative frequencies (a.k.a. prevalence values) of the classes of interest in unlabelled data samples. While many quantification methods have been proposed in the past for binary problems and, to a lesser extent, single-label multiclass problems, the multi-label setting (i.e., the scenario in which the classes of interest are not mutually exclusive) remains by and large unexplored. A straightforward solution to the multi-label quantification problem could simply consist of recasting the problem as a set of independent binary quantification problems. Such a solution is simple but naïve, since the independence assumption upon which it rests is, in most cases, not satisfied. In these cases, knowing the relative frequency of one class could be of help in determining the prevalence of other related classes. We propose the first truly multi-label quantification methods, i.e., methods for inferring estimators of class prevalence values that strive to leverage the stochastic dependencies among the classes of interest in order to predict their relative frequencies more accurately. We show empirical evidence that natively multi-label solutions outperform the naïve approaches by a large margin. The code to reproduce all our experiments is available online.</dc:description>
      <dc:date>2023-11-28T10:49:40Z</dc:date>
      <dc:date>2023-11-28T10:49:40Z</dc:date>
      <dc:date>2023-06</dc:date>
      <dc:type>journal article</dc:type>
      <dc:identifier>Published version: Alejandro Moreo, Manuel Francisco, and Fabrizio Sebastiani. 2023. Multi-Label Quantification. ACM Trans. Knowl. Discov. Data. 18, 1, Article 4 (August 2023), 36 pages. [https://doi.org/10.1145/3606264]</dc:identifier>
      <dc:identifier>https://hdl.handle.net/10481/85892</dc:identifier>
      <dc:identifier>10.1145/3606264</dc:identifier>
      <dc:language>eng</dc:language>
      <dc:relation>info:eu-repo/grantAgreement/EC/H2020/INFRAIA-2019-1/871042</dc:relation>
      <dc:relation>info:eu-repo/grantAgreement/EC/H2020/ICT-48-2020/951911</dc:relation>
      <dc:rights>http://creativecommons.org/licenses/by-nc-nd/4.0/</dc:rights>
      <dc:rights>open access</dc:rights>
      <dc:rights>Attribution-NonCommercial-NoDerivatives 4.0 International</dc:rights>
      <dc:publisher>Association for Computing Machinery</dc:publisher>
   </ow:Publication>
</rdf:RDF>