Tamil language corpus consisting of articles from Wikipedia and Tamil daily news. The dataset is split into train and test sets for ease of use in building machine learning models.
License - CC BY 4.0
Authors - Vanagamudi and Gaurov
Language - Tamil
Reference - https://github.com/vanangamudi/tamil-lm2
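
Since the corpus comes as separate train and test splits, a first step in most workflows is to read each split and check its size. The following is a minimal sketch only: it assumes each split is a plain UTF-8 text file with one passage per line, and the file names train.txt and test.txt are hypothetical placeholders, not confirmed by the dataset. The helper names load_split and corpus_stats are also illustrative.

```python
from pathlib import Path


def load_split(path):
    """Read one split as a list of non-empty lines, decoded as UTF-8."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def corpus_stats(lines):
    """Return (number of lines, number of whitespace-separated tokens)."""
    return len(lines), sum(len(line.split()) for line in lines)


if __name__ == "__main__":
    # Hypothetical file names; replace with the actual paths in the dataset.
    for name in ("train.txt", "test.txt"):
        if Path(name).exists():
            print(name, corpus_stats(load_split(name)))
```

Whitespace splitting is only a rough token count for Tamil; a subword tokenizer would likely be more appropriate for language modelling.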