Projects | Uthayasanker Thayasivam

UNMASKING HATE: AN INTEGRATED APPROACH TO DETECTING HATE SPEECH IN SOCIAL MEDIA


Description


This project tackles online hate speech with a detection framework that combines dual contrastive learning, emotion profiling, and author profiling. Built on fine-tuned language models such as BERTweet, RoBERTa, and TimeLMs, the approach aims to identify both direct and indirect hate speech across diverse social and political contexts.

Motivation and Research Objective

The surge in hate speech, especially after notable events such as Elon Musk’s acquisition of Twitter, highlights the urgent need for real-time, adaptable detection tools. Our objective is to develop a robust, interpretable hate speech detection system that incorporates contextual, emotional, and author-based features, with the aim of improving model accuracy and fostering safer, more inclusive online platforms.

Results and Impact


Our experiments confirm that fine-tuning modern language models such as BERTweet, TimeLMs, and RoBERTa significantly improves hate speech detection, especially on balanced datasets such as HatEval. Integrating emotion and author profiling boosted performance further: the TimeLMs + Emotion + Author configuration reached up to 81.31% accuracy and an F1 score of 81.22%.
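For readers less familiar with the two reported metrics, the sketch below shows how accuracy and F1 can be computed from gold and predicted labels. The source does not say whether the reported F1 is macro-averaged or computed for the hate class only; macro averaging is assumed here, and the labels in the example are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """Harmonic mean of precision and recall for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Illustrative labels only (1 = hate, 0 = not hate); not data from the project.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

print(round(accuracy(y_true, y_pred), 4))  # 0.75
print(round(macro_f1(y_true, y_pred), 4))  # 0.7333
```

Macro averaging weights each class equally, which is why F1 is often reported next to accuracy when one class is rarer than the other.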

However, results varied on imbalanced datasets such as Davidson. While author profiling and emotion analysis improved detection for some models, others showed reduced performance due to class imbalance. To address this, we applied data augmentation techniques such as back-translation through pivot languages (French, German, Japanese, and others), which introduced diversity into the hate speech samples and improved model generalization.
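The back-translation step can be sketched as follows. This is a minimal illustration, not the project's pipeline: the `translate` callable, the `augment_class` helper, and the label names are hypothetical, and a toy translator stands in for a real machine translation system so the example runs offline. It assumes only hate-labelled samples are augmented.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical translator signature: (text, source_lang, target_lang) -> text.
Translate = Callable[[str, str, str], str]
Sample = Tuple[str, str]  # (text, label)


def back_translate(text: str, pivot: str, translate: Translate,
                   source: str = "en") -> str:
    """Translate source -> pivot -> source to obtain a paraphrase."""
    return translate(translate(text, source, pivot), pivot, source)


def augment_class(samples: Sequence[Sample], target_label: str,
                  pivots: Sequence[str], translate: Translate) -> List[Sample]:
    """Append back-translated paraphrases of every sample with target_label."""
    augmented = list(samples)
    seen = {text for text, _ in samples}
    for text, label in samples:
        if label != target_label:
            continue
        for pivot in pivots:
            paraphrase = back_translate(text, pivot, translate)
            if paraphrase not in seen:  # skip round trips that changed nothing
                seen.add(paraphrase)
                augmented.append((paraphrase, label))
    return augmented


# Toy stand-in for a real translation model (assumption, for illustration only).
SYNONYMS = {"fr": "terrible", "de": "dreadful", "ja": "horrible"}


def toy_translate(text: str, src: str, tgt: str) -> str:
    if tgt != "en":
        return f"{tgt}:{text}"        # pretend to translate into the pivot
    body = text.split(":", 1)[1]      # drop the pivot tag on the way back
    return body.replace("awful", SYNONYMS[src])


samples = [
    ("those people are awful", "hate"),
    ("nice weather today", "none"),
    ("lovely match last night", "none"),
]
augmented = augment_class(samples, "hate", ["fr", "de", "ja"], toy_translate)
label_counts = Counter(label for _, label in augmented)
print(label_counts)  # Counter({'hate': 4, 'none': 2})
```

Each pivot language yields at most one extra paraphrase per sample, so the number of pivots controls how far the augmented class grows.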

Additionally, we built classification pipelines using BERT and RoBERTa embeddings combined with traditional ML classifiers (SVC, XGBoost, etc.) to evaluate robustness. Among these, SVC with BERT embeddings performed best, especially when trained on back-translated datasets.
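A pipeline of this kind might look like the sketch below, which uses the Hugging Face Transformers and scikit-learn libraries. Several details are assumptions rather than the project's actual settings: the `bert-base-uncased` checkpoint, mean pooling over the last hidden layer, feature standardization, and the linear kernel. The helper names `embed_texts` and `train_svc` are hypothetical. The demo at the bottom uses synthetic 768-dimensional vectors in place of real embeddings so that it runs without downloading a model.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def embed_texts(texts, model_name="bert-base-uncased"):
    """Mean-pooled last-layer embeddings (pooling choice is an assumption)."""
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()
    with torch.no_grad():
        batch = tokenizer(list(texts), padding=True, truncation=True,
                          return_tensors="pt")
        hidden = model(**batch).last_hidden_state           # (batch, seq, dim)
        mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # ignore padding
    return pooled.numpy()


def train_svc(embeddings, labels):
    """Fit a support vector classifier on fixed sentence embeddings."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
    clf.fit(embeddings, labels)
    return clf


# Synthetic stand-in for embeddings: two Gaussian clusters, one per class.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (40, 768)),
               rng.normal(1.0, 1.0, (40, 768))])
y = np.array([0] * 40 + [1] * 40)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)
clf = train_svc(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {test_accuracy:.2f}")
```

With real data, `embed_texts(tweets)` would replace the synthetic matrix `X`. Keeping the encoder frozen and training only the classifier makes it cheap to swap in other classifiers, such as XGBoost, over the same features.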

Overall, our integrated approach shows strong potential for building adaptable, accurate, and real-time hate speech detection systems, especially in diverse and evolving online environments.

Team members

Krishan Chavinda 

Department of Computer Science and Engineering,
University of Moratuwa, Sri Lanka.


Pasan S. Kalansooriya 

Department of Computer Science and Engineering,
University of Moratuwa, Sri Lanka.


Thushalya Weerasuriya

Department of Computer Science and Engineering,
University of Moratuwa, Sri Lanka.