Detecting Fraudulent Campaigns on a Social Media Platform with Text-Based Machine Learning

Title: Detecting Fraudulent Campaigns on a Social Media Platform with Text-Based Machine Learning

Authors: Fernando G Ferreira, Thiago Ciodaro, Felipe Fink Grael, Vitor do Carmo, Danielle de Pinho Mello, Alékis de Carvalho Moreira, João Gabriel Haddad, Bruno Mauricio Martins, Débora Gomes Salles & Rose Marie Santini de Oliveira

Abstract: The growing prevalence of fraudulent digital ads, which frequently exploit institutional imagery and microtargeting strategies, is compounded by constantly evolving patterns designed to evade detection. This scenario underscores the need for machine learning approaches based on natural language processing to identify and classify this content effectively. This study proposes a methodological framework for detecting and analyzing fraudulent advertisements on digital platforms, relying exclusively on textual content. A Support Vector Machine classifier was developed using TF-IDF features and trained on a manually annotated dataset. The model achieved a mean F2 score of 0.93 under nested cross-validation, prioritizing recall to reduce the risk of false negatives in a human-in-the-loop annotation workflow. The model was applied at acquisition time to more than 40,000 ads, of which 77% were confirmed as fraudulent. These campaigns often impersonated public figures and exploited themes such as health, finance, and public programs. A post hoc interpretability analysis was performed to identify the linguistic markers most indicative of fraud.
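
The abstract reports an F2 score, the F-beta measure with beta = 2, which weights recall more heavily than precision. The sketch below illustrates that weighting only. It is not the authors' evaluation code, and the function name `fbeta` and the confusion counts are hypothetical values chosen for illustration, not results from the paper.

```python
def fbeta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta from confusion counts; beta > 1 weights recall above precision."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical classifiers evaluated on 100 fraudulent ads (illustrative only).
# A: catches more fraud (recall 0.90) at the cost of more false alarms.
f2_high_recall = fbeta(tp=90, fp=30, fn=10)
# B: raises fewer false alarms but misses more fraud (recall 0.75).
f2_high_precision = fbeta(tp=75, fp=10, fn=25)

# The same counts scored with beta = 1 (the usual F1) for comparison.
f1_high_recall = fbeta(tp=90, fp=30, fn=10, beta=1.0)
f1_high_precision = fbeta(tp=75, fp=10, fn=25, beta=1.0)

print(f"F2: A={f2_high_recall:.3f}  B={f2_high_precision:.3f}")
print(f"F1: A={f1_high_recall:.3f}  B={f1_high_precision:.3f}")
```

With these made-up counts, the two classifiers score almost the same under F1, while F2 clearly favors the one that misses fewer fraudulent ads. That matches the abstract's stated goal of reducing false negatives before human review.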

Keywords: Fraud Detection; SVM; Digital Advertising; Natural Language Processing; Data Science Application.

Pages: 8

DOI: 10.21528/CBIC2025-1191948

Paper (PDF): CBIC_2025_paper1191948.pdf

BibTeX file: CBIC_2025_1191948.bib