Mitigating Linguistic Aggression in Group Decision-Making: A Comparative Analysis of AI-Driven Hostility Detection
Metadata
Author
Trillo Vílchez, José Ramón; González-Quesada, Juan Carlos; Cabrerizo Lorite, Francisco Javier; Pérez Gálvez, Ignacio Javier
Publisher
Springer Nature
Subject
Sentiment Analysis; Group Decision-Making; Large language models (LLMs)
Date
2026-03-14
Bibliographic reference
Trillo, J.R.; González-Quesada, J.C.; Cabrerizo, F.J. & Pérez Gálvez, I.J. (2026). Mitigating Linguistic Aggression in Group Decision-Making: A Comparative Analysis of AI-Driven Hostility Detection. Evolving Systems, vol. 17, nº 47. https://doi.org/10.1007/s12530-026-09813-1
Sponsorship
MICIU/AEI/10.13039/501100011033 and ERDF/EU (PID2022-139297OB-I00); Regional Ministry of University, Research and Innovation and the European Union (C-ING-165-UGR23); Universidad de Granada/CBUA
Abstract
Group decision-making is an integral component not only of quotidian interactions but also of strategic deliberations. It is, however, profoundly shaped by the inherent semantic indeterminacy of natural language, an ambiguity that contrasts sharply with the syntactic and semantic precision characteristic of machine-generated language. Furthermore, the conveyance of affective states, such as aggressiveness or elation, via natural language introduces a layer of complexity that can significantly perturb the equilibrium of the group decision-making process. In response to these challenges, we propose an advanced consensus-reaching methodology based on sentiment analysis to quantify and mitigate aggressiveness in discourse. This study conducts a comparative evaluation of three state-of-the-art large language models (Gemini, Copilot, and ChatGPT) for their efficacy in detecting and assessing hostility. By calibrating the influence of individual participants according to their degree of linguistic aggression, the proposed framework attenuates the disproportionate impact of dominant voices, thus fostering a more balanced and equitable deliberative environment. This methodological innovation not only incentivizes the adoption of a more dispassionate and constructive linguistic register but also safeguards the integrity of collective decision-making processes against the distortive effects of undue emotional influence. Across five repeated evaluations per comment, ChatGPT and Gemini exhibited less than 5% variance, while Copilot showed approximately 8–12%; in all cases, hostility-aware weighting reduced the most aggressive expert's influence by approximately 27–29%, yielding robust group rankings. These mechanisms improve consensus quality by reducing bias from aggressive discourse, and they are expected to foster higher group satisfaction through perceived fairness in deliberation. Potential improvements include benchmarking against gold standards, extending to multilingual and multimodal contexts, and enhancing transparency for end-users.
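The hostility-aware weighting described in the abstract can be sketched as follows. This is a minimal illustration only: the inverse-aggression rule, the function names, and the example numbers are assumptions for exposition, not the formulation published in the paper.

```python
# Illustrative sketch of hostility-aware expert weighting.
# Assumption: each expert has an aggression score in [0, 1] (e.g. from an
# LLM-based sentiment analysis), and influence decreases with aggression.
# The actual aggregation model in the paper may differ.

def hostility_aware_weights(aggression_scores):
    """Map per-expert aggression scores in [0, 1] to normalized weights.

    More aggressive experts receive proportionally less influence;
    the weights sum to 1.
    """
    raw = [1.0 - s for s in aggression_scores]
    total = sum(raw)
    return [r / total for r in raw]

def weighted_group_score(expert_ratings, weights):
    """Aggregate per-expert ratings of one alternative into a group score."""
    return sum(r * w for r, w in zip(expert_ratings, weights))

# Example: three experts; the third is the most aggressive and is
# therefore down-weighted in the group aggregation.
scores = [0.1, 0.2, 0.8]
weights = hostility_aware_weights(scores)
group = weighted_group_score([7.0, 6.0, 2.0], weights)
```

In this hypothetical setup, the most aggressive expert's weight drops well below an equal-weight share, which is the qualitative effect the abstract reports (a reduced influence for the most aggressive participant while group rankings remain stable).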





