Sangaran Thevarasa

Final Year Undergraduate

BSc Computer Science & Engineering
"Extending Context Length of Encoder Transformers for Enhanced Entity Extraction in Legal Contracts"

About

I am Sangaran Thevarasa, a final-year undergraduate in the Department of Computer Science and Engineering at the University of Moratuwa. My main interests are software development, data science and machine learning. I am motivated and dedicated, always eager to expand my knowledge and contribute to technological advancements in my field.

Projects

This project develops techniques to extend the context window of encoder-based transformer models, such as RoBERTa, in order to improve the accuracy of entity extraction from lengthy legal documents. Standard models have a fixed context length, so a long contract cannot be processed in a single pass. The project addresses this limitation with methods such as positional embedding interpolation and custom attention mechanisms. The goal is higher precision and recall in identifying legal entities, giving more reliable and comprehensive contract analysis.
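
As a rough illustration of the positional embedding interpolation idea mentioned above, the sketch below stretches a learned position table to a longer length by linear interpolation between neighbouring rows. It is a minimal example under stated assumptions, not the project's actual implementation: the function name `interpolate_position_embeddings` is hypothetical, the table is a plain list of vectors, and details of real checkpoints (for example, rows reserved for padding) are ignored.

```python
from typing import List


def interpolate_position_embeddings(
    table: List[List[float]], new_len: int
) -> List[List[float]]:
    """Linearly interpolate an (old_len x dim) position table to new_len rows.

    Hypothetical helper for illustration only. The first and last rows of the
    original table are preserved; rows in between are blended from neighbours.
    """
    old_len = len(table)
    if old_len < 2 or new_len < 2:
        raise ValueError("both the old and new lengths must be at least 2")

    # Each new position i maps to a fractional position in the old table.
    scale = (old_len - 1) / (new_len - 1)
    stretched = []
    for i in range(new_len):
        pos = i * scale
        # Clamp so that lower + 1 is always a valid row index.
        lower = min(int(pos), old_len - 2)
        frac = pos - lower
        row = [
            (1.0 - frac) * a + frac * b
            for a, b in zip(table[lower], table[lower + 1])
        ]
        stretched.append(row)
    return stretched


# Toy example: a 4-position table with 2-dimensional embeddings, stretched to 7.
toy_table = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
longer_table = interpolate_position_embeddings(toy_table, 7)
```

In practice the stretched table would replace the model's original position embeddings and the model would then be fine-tuned on long documents, since interpolated positions are only an initialisation.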

Research Areas

Machine Learning • NLP • Data Science