Multi-Objective Optimization for Gender Bias Mitigation in Word Embeddings Applied to Hate Speech Detection

Title: Multi-Objective Optimization for Gender Bias Mitigation in Word Embeddings Applied to Hate Speech Detection

Authors: Gustavo Augusto Pires, Michel Bessani, João Pedro A.F. Campos & Luisa Marques Laboissiere

Abstract: Word embeddings have become core components of modern natural language processing (NLP) systems because they represent words as vectors. However, these representations often reflect and perpetuate societal biases present in the training data, particularly gender bias. To address this issue, recent research has proposed multi-objective optimization strategies that mitigate bias while preserving the semantic integrity of the embeddings. In this study, we investigate the practical impact of such optimized embeddings on hate speech detection, one of the most socially sensitive NLP tasks. We apply embeddings optimized for fairness and semantic correlation to two benchmark datasets, training neural classifiers under identical conditions. Our results show that bias-reduced and semantically tuned embeddings maintain, and in some cases slightly improve, classification performance compared to the original representations. These findings demonstrate that it is possible to reduce gender bias in word embeddings without compromising their downstream effectiveness. This work contributes to the development of fairer and more reliable NLP systems suitable for deployment in real-world moderation and decision-making contexts.

Keywords: Gender bias; word embeddings; multi-objective optimization; hate speech detection.

Pages: 6

DOI: 10.21528/CBIC2025-1191917

Paper (PDF): CBIC_2025_paper1191917.pdf

BibTeX file:
CBIC_2025_1191917.bib