<rdf:RDF xmlns:rdf="http://www.openarchives.org/OAI/2.0/rdf/" xmlns:ow="http://www.ontoweb.org/ontology/1#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:ds="http://dspace.org/ds/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:doc="http://www.lyncode.com/xoai" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/rdf/ http://www.openarchives.org/OAI/2.0/rdf.xsd">
   <ow:Publication rdf:about="oai:digibug.ugr.es:10481/76565">
      <dc:title>Offensive Language Detection in Arabic Social Networks Using Evolutionary-Based Classifiers Learned From Fine-Tuned Embeddings</dc:title>
      <dc:creator>Shannaq, Fatima</dc:creator>
      <dc:creator>Castillo Valdivieso, Pedro Ángel</dc:creator>
      <dc:subject>Arabic harassment dataset</dc:subject>
      <dc:subject>Deep learning</dc:subject>
      <dc:subject>Evolutionary algorithm</dc:subject>
      <dc:subject>Fine-tuned word embedding</dc:subject>
      <dc:subject>Hate speech</dc:subject>
      <dc:subject>Offensive language</dc:subject>
      <dc:subject>Optimization</dc:subject>
      <dc:description>Social networks facilitate communication between people from all over the world.&#xd;
Unfortunately, the excessive use of social networks leads to the rise of antisocial behaviors such as the&#xd;
spread of online offensive language, cyberbullying (CB), and hate speech (HS). Therefore, detecting abusive/offensive&#xd;
language and hate speech has become a crucial part of combating cyberharassment. Manual detection of cyberharassment is&#xd;
cumbersome, slow, and not even feasible in rapidly growing data. In this study, we addressed the challenges&#xd;
of automatic detection of the offensive tweets in the Arabic language. The main contribution of this study is&#xd;
to design and implement an intelligent prediction system encompassing a two-stage optimization approach&#xd;
to identify and classify the offensive from the non-offensive text. In the first stage, the proposed approach&#xd;
fine-tuned the pre-trained word embedding models by training them for several epochs on the training dataset.&#xd;
The embeddings of the vocabulary in the new dataset are trained and added to the old embeddings. In&#xd;
the second stage, it employed a hybrid approach of two classifiers, namely XGBoost and SVM, and a&#xd;
genetic algorithm (GA) to mitigate the drawback of the classifiers in finding the optimal hyperparameter&#xd;
values to run the proposed approach. We tested the proposed approach on Arabic Cyberbullying Corpus&#xd;
(ArCybC), which contains tweets collected from four Twitter domains: gaming, sports, news, and celebrities.&#xd;
The ArCybC dataset has four categories: sexual, racial, intelligence, and appearance. The proposed approach&#xd;
produced superior results, in which the SVM algorithm with the Aravec SkipGram word embedding model&#xd;
achieved an accuracy rate of 88.2% and an F1-score rate of 87.8%.</dc:description>
      <dc:date>2022-09-07T09:30:55Z</dc:date>
      <dc:date>2022-07-14</dc:date>
      <dc:type>journal article</dc:type>
      <dc:identifier>F. Shannaq... [et al.]. "Offensive Language Detection in Arabic Social Networks Using Evolutionary-Based Classifiers Learned From Fine-Tuned Embeddings," in IEEE Access, vol. 10, pp. 75018-75039, 2022, doi: [10.1109/ACCESS.2022.3190960]</dc:identifier>
      <dc:identifier>http://hdl.handle.net/10481/76565</dc:identifier>
      <dc:identifier>10.1109/ACCESS.2022.3190960</dc:identifier>
      <dc:language>eng</dc:language>
      <dc:rights>http://creativecommons.org/licenses/by-nc-nd/4.0/</dc:rights>
      <dc:rights>open access</dc:rights>
      <dc:rights>Attribution-NonCommercial-NoDerivatives 4.0 International</dc:rights>
      <dc:publisher>IEEE</dc:publisher>
   </ow:Publication>
</rdf:RDF>