This project develops techniques for extending the context window of encoder-based transformer models, such as RoBERTa, to improve the accuracy of entity extraction from lengthy legal documents. Standard models have a fixed context length and therefore struggle with long contracts; the project addresses this limitation with methods such as positional embedding interpolation and custom attention mechanisms. The goal is higher precision and recall in identifying legal entities, making contract analysis more reliable and comprehensive.
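To make positional embedding interpolation concrete, the following is a minimal sketch and not the project's actual implementation. It assumes the learned position embeddings are available as a plain NumPy array of shape (old_len, dim) and stretches that table to a longer length by linear interpolation. The function name `interpolate_position_embeddings` is hypothetical, and checkpoint-specific details (for example, rows reserved for padding) are deliberately left out.

```python
import numpy as np


def interpolate_position_embeddings(table: np.ndarray, new_len: int) -> np.ndarray:
    """Linearly resample an (old_len, dim) embedding table to (new_len, dim).

    Hypothetical helper for illustration only; a real model would also need
    its position ids and maximum length updated to match the new table.
    """
    old_len, _ = table.shape
    if old_len < 2 or new_len < 2:
        raise ValueError("both the old and new lengths must be at least 2")

    # Place each new position at a fractional coordinate inside the original range.
    src = np.linspace(0.0, old_len - 1, num=new_len)

    # Neighbouring original rows and the blend weight between them.
    lo = np.floor(src).astype(int)
    hi = np.minimum(lo + 1, old_len - 1)
    frac = (src - lo)[:, None]

    return (1.0 - frac) * table[lo] + frac * table[hi]


# Usage: stretch a toy 4-position table to 7 positions.
toy_table = np.arange(8, dtype=float).reshape(4, 2)
longer_table = interpolate_position_embeddings(toy_table, 7)
```

The first and last rows of the original table are preserved, and the new rows in between are weighted averages of their two nearest original neighbours. A model extended this way would normally still need fine-tuning on long inputs.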
Sangaran Thevarasa
I am Sangaran Thevarasa, an undergraduate student in the Department of Computer Science and Engineering at the University of Moratuwa. I am passionate about exploring many areas of computer science, with a strong interest in software development, data science, and machine learning. I am motivated and dedicated, always eager to expand my knowledge and contribute to technological advances in my field.