NOFACE: A new framework for irrelevant content filtering in social media according to credibility and expertise
Metadata
Publisher
Elsevier
Subject
Social media mining; Pre-processing; Credibility; Word embeddings
Date
2022-07-13
Bibliographic reference
J. Angel Diaz-Garcia, M. Dolores Ruiz, Maria J. Martin-Bautista, NOFACE: A new framework for irrelevant content filtering in social media according to credibility and expertise, Expert Systems with Applications, Volume 208, 2022, 118063, ISSN 0957-4174, https://doi.org/10.1016/j.eswa.2022.118063
Sponsor
European Commission 786687; Andalusian government FEDER operative program P18-RT-2947 B-TIC-145-UGR18; University of Granada's internal plan PPJIB2021-04; Spanish Government FPU18/00150
Abstract
Social networks have taken an irreplaceable role in our lives. They are used daily by millions of people
to communicate and inform themselves. This success has also led to a lot of irrelevant content and even
misinformation on social media. In this paper, we propose a user-centred framework to reduce the amount
of irrelevant content in social networks to support further stages of data mining processes. The system also
helps in the reduction of misinformation in social networks, since it selects credible and reputable users. The
system is based on the belief that if a user is credible then their content will be credible. Our proposal uses
word embeddings in a first stage, to create a set of interesting users according to their expertise. After that, in
a later stage, it employs social network metrics to further narrow down the relevant users according to their
credibility in the network. To validate the framework, we tested it on two real Big Data problems on
Twitter: one related to COVID-19 tweets and the other to the last United States elections, held on 3rd November. In both
problems, finding relevant content can be difficult because of the large amount of data published
in recent years. The proposed framework, called NOFACE, reduces the number of irrelevant users posting
about the topic, keeping only those with higher credibility and thus yielding interesting information
about the selected topic. This reduces irrelevant information, thereby mitigating the presence
of misinformation in subsequent data mining applications and improving the results obtained, as
illustrated for the two topics above using clustering, association rules and LDA techniques.
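
The two-stage idea described in the abstract (an expertise filter based on word embeddings, followed by a credibility filter based on social network metrics) can be sketched roughly as follows. This is only an illustrative sketch and not the authors' implementation: the toy three-dimensional vectors, the use of user biographies as input text, the follower/following ratio as the credibility metric, the thresholds, and all function names are assumptions made for the example.

```python
import math

# Toy 3-dimensional "embeddings" (assumption); a real system would load
# pretrained word vectors instead of this hand-made dictionary.
TOY_VECTORS = {
    "virus": (0.9, 0.1, 0.0),
    "epidemiologist": (0.8, 0.2, 0.1),
    "doctor": (0.7, 0.3, 0.1),
    "football": (0.0, 0.1, 0.9),
    "fan": (0.1, 0.2, 0.8),
}


def mean_vector(words):
    """Average the vectors of the known words; None if no word is known."""
    vecs = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not vecs:
        return None
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expertise_filter(users, topic_words, threshold=0.9):
    """Stage 1 (hypothetical): keep users whose bio is close to the topic."""
    topic = mean_vector(topic_words)
    kept = []
    for user in users:
        bio_vec = mean_vector(user["bio"].lower().split())
        if bio_vec is not None and cosine(bio_vec, topic) >= threshold:
            kept.append(user)
    return kept


def credibility_filter(users, min_ratio=1.0):
    """Stage 2 (hypothetical metric): keep users with a high enough
    follower/following ratio. The paper's actual metrics may differ."""
    return [
        u for u in users
        if u["followers"] / max(u["following"], 1) >= min_ratio
    ]


def select_users(users, topic_words):
    """Run the expertise stage first, then the credibility stage."""
    return credibility_filter(expertise_filter(users, topic_words))


users = [
    {"name": "alice", "bio": "Epidemiologist and doctor",
     "followers": 5000, "following": 300},
    {"name": "bob", "bio": "Football fan",
     "followers": 9000, "following": 100},
    {"name": "carol", "bio": "Doctor",
     "followers": 10, "following": 900},
]

experts = [u["name"] for u in expertise_filter(users, ["virus"])]
selected = [u["name"] for u in select_users(users, ["virus"])]
```

In this toy data, the first stage drops the off-topic account regardless of its popularity, and the second stage drops the on-topic account with very few followers, so only content from the remaining user would be passed on to later data mining steps.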