This project develops techniques to extend the context window of encoder-based transformer models, such as RoBERTa, in order to improve entity extraction from lengthy legal documents. Standard models have a fixed context length (512 tokens for RoBERTa) and struggle with long contracts; the project addresses this limitation through methods such as positional embedding interpolation and custom attention mechanisms. The goal is higher precision and recall in identifying legal entities, enabling more reliable and comprehensive contract analysis.
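As a rough illustration of one of these ideas, the sketch below linearly interpolates RoBERTa's learned positional embeddings so the encoder accepts sequences longer than its original 512-token limit. It is a minimal example using PyTorch and Hugging Face Transformers, not the project's actual code; the function name extend_position_embeddings and the target length of 2050 positions are illustrative assumptions.

```python
import torch
import torch.nn.functional as F
from transformers import RobertaModel

def extend_position_embeddings(model, new_max_positions):
    """Illustrative sketch: linearly interpolate RoBERTa's learned positional
    embeddings so the model accepts longer input sequences."""
    old_weight = model.embeddings.position_embeddings.weight.data  # (old_max, hidden)
    old_max, hidden = old_weight.shape
    # RoBERTa reserves the first two rows for its padding offset; keep them unchanged.
    prefix, learned = old_weight[:2], old_weight[2:]
    # Interpolate along the position axis: (1, hidden, old_len) -> (1, hidden, new_len).
    interpolated = F.interpolate(
        learned.T.unsqueeze(0),
        size=new_max_positions - 2,
        mode="linear",
        align_corners=True,
    ).squeeze(0).T                                                 # (new_max - 2, hidden)
    new_weight = torch.cat([prefix, interpolated], dim=0)
    # Swap in the larger embedding table and update the config.
    new_embedding = torch.nn.Embedding(new_max_positions, hidden)
    new_embedding.weight.data.copy_(new_weight)
    model.embeddings.position_embeddings = new_embedding
    model.config.max_position_embeddings = new_max_positions
    # Extend the cached position_ids buffer if this Transformers version keeps one.
    if hasattr(model.embeddings, "position_ids"):
        model.embeddings.position_ids = torch.arange(new_max_positions).unsqueeze(0)
    return model

model = RobertaModel.from_pretrained("roberta-base")
model = extend_position_embeddings(model, new_max_positions=2050)  # ~2048 usable tokens
```

In practice the tokenizer's model_max_length must be raised to match, and the interpolated embeddings are typically fine-tuned on long documents before evaluation, since the interpolated positions start out only as approximations of positions the model never saw during pretraining.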
Pairavi Thanancheyan
I am Pairavi Thanancheyan, currently pursuing a degree in Computer Science and Engineering at the University of Moratuwa. With hands-on experience in data science projects and a passion for exploring AI and machine learning, I value teamwork and continuous improvement, and I strive to make a positive impact through my work.