Multilingual Speech Emotion Recognition

#Machine Learning #NLP #Speech Recognition #Multilingual #Tamil #Dravidian Languages

Overview

Speech carries not only linguistic content but also paralinguistic cues such as emotion and speaker identity. Speech Emotion Recognition (SER) systems have made significant progress in high-resource, monolingual settings, but their applicability in multilingual contexts, especially for low-resource languages, remains limited. Dravidian languages such as Tamil, Telugu, Kannada, and Malayalam are widely spoken yet severely underrepresented in SER research. This gap restricts the development of inclusive emotion-aware systems, particularly in regions where these languages are dominant.

Motivation & Objectives

The primary motivation behind this project is to bridge the resource and performance gap in SER for underrepresented languages, with a focus on Tamil. This is addressed through three key research objectives:

**Dataset Creation** – To build a high-quality Tamil emotional speech dataset that includes core emotion classes like Happy, Sad, Angry, Fear, and Neutral.

**Multilingual Benchmarking** – To conduct a large-scale survey of SER datasets across the world's 70 most spoken languages, identifying usable emotional speech corpora in 29 languages.

**Model Development** – To design and train a multilingual SER model that supports Tamil and other Dravidian languages, while maintaining efficiency and cross-lingual generalization suitable for real-world, resource-constrained applications.

Impact & Results

Performance

The experimental results indicate that KuralNet, the multilingual SER model developed in this project, generalizes well across languages and performs particularly well on low-resource Dravidian languages. Across the 13 supported languages, KuralNet achieved the highest macro F1-scores and weighted accuracy in Tamil, Kannada, and Malayalam, outperforming established baselines such as Emotion2Vec-Large, XGBoost, and Random Forest. The most notable gain was in Kannada, where KuralNet exceeded the macro F1-score of Emotion2Vec-Large by 0.55, which suggests it can learn robust emotional patterns even when annotated data is limited.
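To make the two reported metrics concrete, the following is a minimal sketch of how macro F1 and weighted accuracy can be computed from predicted emotion labels. It is not the project's evaluation code: the function names and the toy labels are illustrative, and it assumes that "weighted accuracy" follows the common SER convention of overall accuracy (the fraction of utterances classified correctly).

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (every class counts equally)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def weighted_accuracy(y_true, y_pred):
    """Assumed definition: overall accuracy across all utterances."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy example with hypothetical labels, not data from the project
y_true = ["happy", "happy", "sad", "angry"]
y_pred = ["happy", "sad", "sad", "angry"]
print(round(macro_f1(y_true, y_pred), 4))
print(weighted_accuracy(y_true, y_pred))
```

Because macro F1 averages over classes rather than utterances, a rare emotion class influences the score as much as a frequent one, which is why it is a useful complement to accuracy on imbalanced emotional speech corpora.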

Real-world Impact

The outcomes of this work have significant implications for both academic research and real-world applications.

**For AI Researchers and Developers** – KuralNet provides a scalable baseline for multilingual SER and sets a new standard for performance in Dravidian and other low-resource languages.

**For Industry Applications** – By supporting languages like Tamil, Kannada, and Malayalam, KuralNet can be integrated into call centers, mental health tools, language learning platforms, and virtual assistants, enhancing their emotional intelligence and inclusivity.
