Praveenasarma Baskarakurukkal

Student

I am a final-year Computer Science and Engineering undergraduate at the University of Moratuwa with a strong interest in backend development and a growing passion for Machine Learning and Artificial Intelligence. For our final-year Research and Development Project, my team is working on Semantic Similarity and Uncertainty Quantification (UQ), aiming to develop a model that not only compares the meaning of two texts but also estimates the confidence of its predictions.

I am currently exploring a knowledge graph-based approach to capturing semantic relationships and estimating uncertainty in textual data. This direction is under investigation as a potential enhancement to traditional transformer-based similarity models. Our goal is to build interpretable and trustworthy AI systems, with potential applications in domains such as legal text analysis and intelligent search.

"Not just what the model thinks but how sure it is"

Brief Description of Project

This project focuses on developing a semantic similarity model that scores how closely two textual inputs match in meaning while also estimating the confidence of its predictions. Traditional models typically output a similarity score without indicating how confident they are in it, an important limitation for high-stakes or mission-critical applications.

To address this, the project incorporates uncertainty quantification techniques to enhance the reliability and transparency of semantic similarity models. It explores advanced methods such as Monte Carlo Dropout, Bayesian Neural Networks, and information-theoretic metrics to capture and quantify prediction uncertainty.
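As a rough illustration of the Monte Carlo Dropout idea mentioned above, the sketch below keeps dropout active at inference time, runs several stochastic forward passes, and reports the mean similarity together with its standard deviation as an uncertainty estimate. It is not the project's actual model: the function name `mc_dropout_similarity`, the single `tanh` projection layer, the random `weights` matrix, and the default parameter values are all assumptions made purely for illustration.

```python
import numpy as np

def mc_dropout_similarity(u, v, weights, drop_prob=0.1, n_samples=100, seed=0):
    """Toy Monte Carlo Dropout estimate of similarity and its spread.

    u, v    : 1-D embedding vectors for the two texts (hypothetical inputs)
    weights : 2-D matrix standing in for one trained projection layer
    Returns (mean cosine similarity, standard deviation across passes).
    """
    rng = np.random.default_rng(seed)
    keep = 1.0 - drop_prob
    scores = np.empty(n_samples)
    for t in range(n_samples):
        # Sample a fresh dropout mask for every pass; dropout stays "on".
        mask = rng.random(weights.shape[1]) < keep
        # Apply the same sampled mask to both texts (one sampled model per pass),
        # with the usual inverted-dropout rescaling by 1 / keep.
        hu = np.tanh(u @ weights) * mask / keep
        hv = np.tanh(v @ weights) * mask / keep
        denom = np.linalg.norm(hu) * np.linalg.norm(hv)
        # Cosine similarity of the two masked representations.
        scores[t] = (hu @ hv) / denom if denom > 0 else 0.0
    # The mean is the prediction; the spread is a simple uncertainty signal.
    return scores.mean(), scores.std()

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W = rng.normal(size=(8, 16))   # stand-in for trained weights
    a = rng.normal(size=8)         # stand-in embedding of text A
    b = rng.normal(size=8)         # stand-in embedding of text B
    mean_sim, std_sim = mc_dropout_similarity(a, b, W)
    print(f"similarity = {mean_sim:.3f} +/- {std_sim:.3f}")
```

A larger spread across passes suggests the score is sensitive to which units are dropped, which can be read as lower confidence. In a real transformer-based model the same pattern would apply to the network's own dropout layers rather than to a single random projection.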

Models will be evaluated using widely recognized benchmark datasets, including Semantic Textual Similarity (STS) and Quora Question Pairs (QQP).

Objectives

  • Analyze current methods for semantic similarity and uncertainty quantification
  • Identify suitable architectures for combining similarity scoring with confidence estimation
  • Investigate the use of knowledge graph-based techniques for deeper contextual understanding
  • Develop a working prototype that outputs both similarity scores and associated uncertainty
  • Compare the proposed approach against baseline models to assess effectiveness

Impact

This project aims to advance the development of more transparent and trustworthy AI systems for applications where accuracy and confidence are critical, such as legal, healthcare, and decision support systems.